Larry Steinle

September 8, 2013

Coding On a Thread and a Prayer

Filed under: C,Security,VS.Net,Web — Larry Steinle @ 12:28 am

Threading is a powerful concept that is growing more critical for application development. Modern computers aren’t always faster than computers from even four years ago. Instead, manufacturers focus on packing more CPU cores into a smaller package, with the idea that a workload split across many cores will run an order of magnitude faster than it would on the fastest single core. This performance gain can only be realized when an application is intentionally designed to take advantage of the multiple cores. As you begin working with threads, however, you will quickly realize that you must design your applications on a thread and a prayer!

Today’s article will be a little different from previous articles I have written. Instead of explaining what a thread is and how to use it, I will review the issues I have encountered while working with multiple threads and how I resolved them. I trust that today’s article will help others who have run into these same issues while developing multi-threaded applications.

General Strategy

Not all tasks can be easily divided up into an asynchronous threading model. At this time I use threading sparingly, only when performance gains are absolutely critical and when an activity lends itself to the threading model. For example, copying folders and files is a perfect candidate for a multi-threaded operation because it doesn’t matter when a file is copied. I simply want all the files copied as quickly as possible. It was this simple task that led to a world of hurt!

Creating An ASP.Net Website to Copy Files As Quickly As Possible

One of my goals for this week included creating a website to copy several gigabytes of data from one server to another as quickly as possible. I thought, “This can’t be too bad! Simply spin off a new thread for each file and copy more than one file at the same time!” I discovered the System.Threading.Tasks namespace and fell in love with the “new” Parallel.ForEach loop. Better yet, Microsoft’s article “How to: Write a Simple Parallel.ForEach Loop” focused on using this method for managing files!

What I loved about the Parallel class was that I could first make sure that all the folders existed on the target machine before I even began to copy the files. I felt that this style would be more elegant, allowing me to create more refined, reusable functions. Thus I began with my prototype:

using System;
using System.IO;
using System.Security;
using System.Security.Principal;
using System.Threading.Tasks;

public static class AsynchronousCopier
{
  //Copy path with current security credentials
  public static void CopyPath(string sourcePath, string targetPath)
  {
    CreateFolders(sourcePath, targetPath);
    CopyFiles(sourcePath, targetPath);
  }

  //Copy path using specified security credentials
  public static void CopyPath(string sourcePath, string targetPath, string userName, string password)
  {
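    AppDomain.CurrentDomain.SetPrincipalPolicy(PrincipalPolicy.WindowsPrincipal);
    //NOTE: WindowsIdentity(string, string) takes a user principal name and an authentication type, not a password.
    //Logging on with a password requires a token from the Win32 LogonUser function passed to WindowsIdentity(IntPtr).
    WindowsIdentity identity = new WindowsIdentity(userName, password);
    WindowsImpersonationContext context = identity.Impersonate();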

    CreateFolders(sourcePath, targetPath);
    CopyFiles(sourcePath, targetPath);

    context.Undo();
  }

  public static void CreateFolders(string sourcePath, string targetPath)
  {
    //The recursive overload of GetDirectories requires a search pattern; "*" matches every folder.
    Parallel.ForEach(Directory.GetDirectories(sourcePath, "*", SearchOption.AllDirectories), sourceFolder => {
      string targetFolder = sourceFolder.Replace(sourcePath, targetPath);
      if (!Directory.Exists(targetFolder)) Directory.CreateDirectory(targetFolder);
    });
  }

  public static void CopyFiles(string sourcePath, string targetPath)
  {
    //The recursive overload of GetFiles requires a search pattern; "*" matches every file.
    Parallel.ForEach(Directory.GetFiles(sourcePath, "*", SearchOption.AllDirectories), sourceFile => {
      string targetFile = sourceFile.Replace(sourcePath, targetPath);
      File.Copy(sourceFile, targetFile, true);
      File.SetAttributes(targetFile, FileAttributes.Normal);
    });
  }
}

In my prototype webpage I simply made the following call:

using System.Threading;

protected void CopyBtn_Click(Object sender, EventArgs e)
{
  //Spin off the copying action to a new thread so that the user doesn't have to wait for the post-back.
  Thread copyThread = new Thread(CopyPath);
  copyThread.Start();
}

public void CopyPath()
{
  AsynchronousCopier.CopyPath(txtSource.Text, txtTarget.Text, txtUserName.Text, txtPassword.Text);
}

Then the problems began…

Deadlock

The first time I clicked on the CopyBtn everything seemed to be working just fine. But very quickly my machine began to grow sluggish. When I opened the Task Manager I watched as the screen stopped refreshing, apps locked up, and even the Windows 7 taskbar and the Task Manager itself froze. My only option was a cold reboot.

On my next run I opened the Task Manager before clicking the CopyBtn. I noticed that the number of executing threads slowly climbed by 2,000, at which point the machine locked up again. Figuring that the problem was the number of threads being created, I set MaxDegreeOfParallelism to restrict my app to only twice as many threads as my machine has processors. But the problem persisted.
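To see what MaxDegreeOfParallelism does in isolation, here is a minimal sketch that measures how many loop bodies run at the same moment. The ThrottleDemo class, its PeakConcurrency method, and the Thread.Sleep stand-in for a file copy are hypothetical names chosen for illustration; they are not part of the copier.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Runs 'iterations' short loop bodies and returns the largest number
    // that were ever executing at the same moment.
    public static int PeakConcurrency(int iterations, int maxDegreeOfParallelism)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };
        int running = 0;
        int peak = 0;

        Parallel.For(0, iterations, options, i =>
        {
            int now = Interlocked.Increment(ref running);

            // Record a new peak with a compare-and-swap loop so that no update is lost.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen) { }

            Thread.Sleep(5); // hypothetical stand-in for a blocking file copy
            Interlocked.Decrement(ref running);
        });

        return peak;
    }
}
```

Calling ThrottleDemo.PeakConcurrency(50, 2) should never report more than 2, because the option caps the number of concurrent loop bodies. It limits the loop only; it does nothing about threads created elsewhere in the application.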

Realizing that errors from threads don’t necessarily make it back to the main thread, I added error handling around each routine just in case an error was being thrown. When I ran the function again, error messages finally spilled into the debugger window, and the computer no longer locked up.

Just to be sure of the solution I removed the MaxDegreeOfParallelism code and the deadlock returned. Apparently I needed a combination of error handling and thread throttling to have a reliable solution.

using System;
using System.IO;
using System.Security;
using System.Security.Principal;
using System.Threading.Tasks;

public static class AsynchronousCopier
{
  //Copy path with current security credentials
  public static void CopyPath(string sourcePath, string targetPath)
  {
    CreateFolders(sourcePath, targetPath);
    CopyFiles(sourcePath, targetPath);
  }

  //Copy path using specified security credentials
  public static void CopyPath(string sourcePath, string targetPath, string userName, string password)
  {
    AppDomain.CurrentDomain.SetPrincipalPolicy(PrincipalPolicy.WindowsPrincipal);
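    //NOTE: WindowsIdentity(string, string) takes a user principal name and an authentication type, not a password.
    //Logging on with a password requires a token from the Win32 LogonUser function passed to WindowsIdentity(IntPtr).
    WindowsIdentity identity = new WindowsIdentity(userName, password);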
    WindowsImpersonationContext context = identity.Impersonate();

    CreateFolders(sourcePath, targetPath);
    CopyFiles(sourcePath, targetPath);

    context.Undo();
  }

  public static void CreateFolders(string sourcePath, string targetPath)
  {
    //Maximum of 4 threads per CPU appears to be most efficient algorithm and stops deadlock issues.
    var options = new ParallelOptions();
    options.MaxDegreeOfParallelism = Environment.ProcessorCount * 4;

    //The recursive overload of GetDirectories requires a search pattern; "*" matches every folder.
    Parallel.ForEach(Directory.GetDirectories(sourcePath, "*", SearchOption.AllDirectories), options, sourceFolder => {
      string targetFolder = sourceFolder.Replace(sourcePath, targetPath);

      try
      {
        if (!Directory.Exists(targetFolder)) Directory.CreateDirectory(targetFolder);
      }
      catch (Exception ex)
      {
        //Log the folder that failed, not the root target path.
        System.Diagnostics.Debug.WriteLine("Creating Folder: " + targetFolder);
        System.Diagnostics.Debug.WriteLine(ex.Message);
      }
    });
  }

  public static void CopyFiles(string sourcePath, string targetPath)
  {
    //Maximum of 4 threads per CPU appears to be most efficient algorithm and stops deadlock issues.
    var options = new ParallelOptions();
    options.MaxDegreeOfParallelism = Environment.ProcessorCount * 4;

    //The recursive overload of GetFiles requires a search pattern; "*" matches every file.
    Parallel.ForEach(Directory.GetFiles(sourcePath, "*", SearchOption.AllDirectories), options, sourceFile => {
      string targetFile = sourceFile.Replace(sourcePath, targetPath);

      try
      {
        File.Copy(sourceFile, targetFile, true);
        File.SetAttributes(targetFile, FileAttributes.Normal);
      }
      catch (Exception ex)
      {
        System.Diagnostics.Debug.WriteLine("Copying File: " + targetFile);
        System.Diagnostics.Debug.WriteLine(ex.Message);
      }
    });
  }
}
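The Debug.WriteLine calls in the listing above only surface errors while a debugger is attached. A minimal sketch of one alternative follows, assuming you would rather hand the failures back to the caller: each loop body catches its own exception and stores it in a thread-safe queue. The ErrorCollectingLoop class and its RunAll method are hypothetical names for illustration, not part of the copier.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ErrorCollectingLoop
{
    // Runs 'work' for every item in parallel and returns the items that failed
    // together with their exceptions, instead of writing them to the debugger.
    public static List<KeyValuePair<string, Exception>> RunAll(
        IEnumerable<string> items, Action<string> work, int maxThreads)
    {
        var failures = new ConcurrentQueue<KeyValuePair<string, Exception>>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.ForEach(items, options, item =>
        {
            try
            {
                work(item);
            }
            catch (Exception ex)
            {
                // ConcurrentQueue is safe to write to from several threads at once.
                failures.Enqueue(new KeyValuePair<string, Exception>(item, ex));
            }
        });

        return new List<KeyValuePair<string, Exception>>(failures);
    }
}
```

A caller could pass the list of source files as items and a lambda that wraps File.Copy as work, then log or display whatever comes back once the loop has finished.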

While my machine no longer locked up, I noticed that whatever the maximum CPU utilization was during the run, it remained at that level after all processing had completed. Wondering if perhaps this was due to the new architecture of the Task threading model, I replaced my “normal” thread call in the button with a task thread call.

using System.Threading.Tasks;

protected void CopyBtn_Click(Object sender, EventArgs e)
{
  //Spin off the copying action to a task so that the user doesn't have to wait for the post-back.
  Task.Factory.StartNew(CopyPath);
  //For an even better way to start the task refer to:
  //http://msdn.microsoft.com/en-us/library/system.web.ui.pageasynctask.aspx
}

public void CopyPath()
{
  AsynchronousCopier.CopyPath(txtSource.Text, txtTarget.Text, txtUserName.Text, txtPassword.Text);
}

Sure enough! The next time I ran the application, the CPU returned to its expected readings once processing had completed. Now I had an operational computer, but what problems caused the deadlock to begin with?

What Do You Mean I Don’t Have Rights???!!!!!

To my surprise I found that occasionally I would get an UnauthorizedAccessException. I was suspicious that my authenticated security credentials were being dropped. But it didn’t make any sense for the impersonation context to get lost when it was correctly set before calling any of the threads!

After some painful googling I finally stumbled upon an excellent description of the problem in an article titled “ASP.Net Impersonation and Parallel.ForEach Issue”. Rob Seder explained, “if you use ASP.NET with impersonation, and you also use Parallel.ForEach, threads that run on other cores, lose the execution context.” In an MSDN forum thread titled “Thread.CurrentPrincipal corrupts current SynchronizationContext”, a service tech explained that the loss of security context was a known issue that would be resolved in the next version of the Common Language Runtime (CLR 4.5).

Unbelievable!
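Why would the failure be only occasional? A loose analogy, not the actual impersonation mechanism: state attached to one thread is invisible to other threads, and a parallel loop typically runs some of its iterations on the calling thread and the rest on worker threads. The sketch below uses a [ThreadStatic] field as a stand-in for such per-thread state. The PerThreadStateDemo class and its Run method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PerThreadStateDemo
{
    // [ThreadStatic] gives every thread its own copy of the field, which makes it
    // a stand-in for any state that belongs to one thread and does not flow to others.
    [ThreadStatic]
    private static string currentUser;

    // Reports how many iterations could see the value set before the loop (sawUser)
    // and how many iterations ran on the calling thread (ranOnCaller).
    public static void Run(int iterations, out int sawUser, out int ranOnCaller)
    {
        currentUser = "impersonated-user"; // set on the calling thread only
        int callerId = Thread.CurrentThread.ManagedThreadId;
        int saw = 0;
        int onCaller = 0;

        Parallel.For(0, iterations, i =>
        {
            Thread.Sleep(1); // give other threads a chance to pick up work
            if (currentUser != null) Interlocked.Increment(ref saw);
            if (Thread.CurrentThread.ManagedThreadId == callerId) Interlocked.Increment(ref onCaller);
        });

        currentUser = null; // clean up the calling thread's copy
        sawUser = saw;
        ranOnCaller = onCaller;
    }
}
```

The two counts always match: only the iterations that happen to run on the calling thread see the value. On a multi-core machine that is usually a fraction of the total, which resembles the pattern of some file copies succeeding while others are refused.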

As suggested by Mr. Seder, I added impersonation code inside the Parallel.ForEach body that runs only when needed. In the future, when CLR 4.5 is installed on my server, the extra authentication code will no longer be needed and will automatically stop being called.

using System;
using System.IO;
using System.Security;
using System.Security.Principal;
using System.Threading;
using System.Threading.Tasks;
using System.Web;

public static class AsynchronousCopier
{
  //Copy path with current security credentials
  public static void CopyPath(string sourcePath, string targetPath)
  {
    CreateFolders(sourcePath, targetPath);
    CopyFiles(sourcePath, targetPath);
  }

  //Copy path using specified security credentials
  public static void CopyPath(string sourcePath, string targetPath, string userName, string password)
  {
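    AppDomain.CurrentDomain.SetPrincipalPolicy(PrincipalPolicy.WindowsPrincipal);
    //NOTE: WindowsIdentity(string, string) takes a user principal name and an authentication type, not a password.
    //Logging on with a password requires a token from the Win32 LogonUser function passed to WindowsIdentity(IntPtr).
    WindowsIdentity identity = new WindowsIdentity(userName, password);
    WindowsImpersonationContext context = identity.Impersonate();

    try
    {
      //This code is required to address known bug in CLR 4.0. It is only called when necessary.
      //Once CLR 4.5 is installed on the machine the bug is resolved even for CLR 4.0 framework.
      //Thread.CurrentPrincipal expects an IPrincipal, so wrap the identity in a WindowsPrincipal.
      Thread.CurrentPrincipal = new WindowsPrincipal(identity);

      CreateFolders(sourcePath, targetPath);
      CopyFiles(sourcePath, targetPath);
    }
    finally
    {
      //Always revert the impersonation, even if a copy step throws.
      context.Undo();
    }
  }

  public static void CreateFolders(string sourcePath, string targetPath)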
  {
    //Maximum of 4 threads per CPU appears to be most efficient algorithm and stops deadlock issues.
    var options = new ParallelOptions();
    options.MaxDegreeOfParallelism = Environment.ProcessorCount * 4;

    //This code is required to address known bug in CLR 4.0. It is only called when necessary.
    //Once CLR 4.5 is installed on the machine the bug is resolved even for CLR 4.0 framework.
    WindowsIdentity userIdentity;
    if (HttpContext.Current != null && HttpContext.Current.User != null && HttpContext.Current.User.Identity != null)
        userIdentity = (WindowsIdentity)HttpContext.Current.User.Identity;
    else
        userIdentity = (WindowsIdentity)Thread.CurrentPrincipal.Identity;
    SynchronizationContext.SetSynchronizationContext(SynchronizationContext.Current);

    //The recursive overload of GetDirectories requires a search pattern; "*" matches every folder.
    Parallel.ForEach(Directory.GetDirectories(sourcePath, "*", SearchOption.AllDirectories), options, sourceFolder => {
      string targetFolder = sourceFolder.Replace(sourcePath, targetPath);

      //This code is required to address known bug in CLR 4.0. It is only called when necessary.
      //Once CLR 4.5 is installed on the machine the bug is resolved even for CLR 4.0 framework.
      WindowsImpersonationContext userContext = SynchronizationContext.Current == null ? userIdentity.Impersonate() : null;

      try
      {
        if (!Directory.Exists(targetFolder)) Directory.CreateDirectory(targetFolder);
      }
      catch (Exception ex)
      {
        //Log the folder that failed, not the root target path.
        System.Diagnostics.Debug.WriteLine("Creating Folder: " + targetFolder);
        System.Diagnostics.Debug.WriteLine(ex.Message);
      }
      finally
      {
        if (userContext != null) userContext.Undo();
      }
    });
  }

  public static void CopyFiles(string sourcePath, string targetPath)
  {
    //Maximum of 4 threads per CPU appears to be most efficient algorithm and stops deadlock issues.
    var options = new ParallelOptions();
    options.MaxDegreeOfParallelism = Environment.ProcessorCount * 4;

    //This code is required to address known bug in CLR 4.0. It is only called when necessary.
    //Once CLR 4.5 is installed on the machine the bug is resolved even for CLR 4.0 framework.
    WindowsIdentity userIdentity;
    if (HttpContext.Current != null && HttpContext.Current.User != null && HttpContext.Current.User.Identity != null)
        userIdentity = (WindowsIdentity)HttpContext.Current.User.Identity;
    else
        userIdentity = (WindowsIdentity)Thread.CurrentPrincipal.Identity;
    SynchronizationContext.SetSynchronizationContext(SynchronizationContext.Current);

    //The recursive overload of GetFiles requires a search pattern; "*" matches every file.
    Parallel.ForEach(Directory.GetFiles(sourcePath, "*", SearchOption.AllDirectories), options, sourceFile => {
      string targetFile = sourceFile.Replace(sourcePath, targetPath);

      //This code is required to address known bug in CLR 4.0. It is only called when necessary.
      //Once CLR 4.5 is installed on the machine the bug is resolved even for CLR 4.0 framework.
      WindowsImpersonationContext userContext = SynchronizationContext.Current == null ? userIdentity.Impersonate() : null;

      try
      {
        File.Copy(sourceFile, targetFile, true);
        File.SetAttributes(targetFile, FileAttributes.Normal);
      }
      catch (Exception ex)
      {
        System.Diagnostics.Debug.WriteLine("Copying File: " + targetFile);
        System.Diagnostics.Debug.WriteLine(ex.Message);
      }
      finally
      {
        if (userContext != null) userContext.Undo();
      }
    });
  }
}

Having finally arrived at a reasonable solution to my threading problems, I was pleased to watch the file copy time drop from 15 minutes to 4 minutes.

Summary

It was a painful experience hunting down all the threading issues. While multi-threaded applications enjoy increased performance on modern computers, that performance comes at the cost of an architecture that is harder to diagnose, debug and support. I prefer to use threads sparingly, but I am forced to recognize that the future will only require them even more. At least with the new System.Threading.Tasks namespace it is much easier to add more complex threading capabilities.

I would be very interested in any stories you may have on your own multi-threading journey. Please comment below!

Happy coding!
