I was working on some code for a personal project when I needed to generate checksums for a large number of files. First off, let me say that I have already solved this problem using System.Threading.Tasks.Parallel (.NET, C#), which behaves as I would expect. What I expected from async/await was several checksums running simultaneously as Tasks, given a list of tasks, without necessarily being processed in order. In other words, if I put a small file (10 MB perhaps) last and a 5 GB file first, the last one should finish first, because it takes significantly less time to process.
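For comparison, a Parallel-based version of this kind of job might look roughly like the sketch below. It is not the exact code from the project: the class name `ParallelChecksums` and the helper `GetChecksumSync` are hypothetical, and the file paths are assumed to be passed on the command line.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;

static class ParallelChecksums
{
    // Hypothetical helper: hashes one file synchronously and returns the hex digest.
    public static string GetChecksumSync(string file)
    {
        using (FileStream stream = File.OpenRead(file))
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", string.Empty);
        }
    }

    static void Main(string[] args)
    {
        // Parallel.ForEach spreads the files over worker threads, so a small
        // file can finish long before a large one that was listed earlier.
        Parallel.ForEach(args, file =>
        {
            string hash = GetChecksumSync(file);
            Console.WriteLine($"{file} -> {hash}");
        });
    }
}
```

Because each file is hashed on its own worker thread, the output order follows completion time rather than the order of the input list.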
Here is a very simple example using async and await:
    static async void MainAsync()
    {
        await GetChecksum(1, @"E:\Files\ISO\5gbfile.iso");
        await GetChecksum(2, @"E:\Files\ISO\4gbfile.iso");
        await GetChecksum(3, @"E:\Files\ISO\3gbfile.iso");
        await GetChecksum(4, @"E:\Files\ISO\10mbfile.iso");
    }
And the GetChecksum function:
    static async Task<string> GetChecksum(int index, string file)
    {
        using (FileStream stream = File.OpenRead(file))
        {
            SHA256Managed sha = new SHA256Managed();
            Task<byte[]> checksum = sha.ComputeHashAsync(stream, 1200000);
            var ret = await checksum;
            System.Console.WriteLine($"{index} -> {file}");
            var hash = BitConverter.ToString(ret).Replace("-", String.Empty);
            System.Console.WriteLine($" ::{hash}");
            return hash;
        }
    }
This MSDN article (https://msdn.microsoft.com/en-us/library/hh696703.aspx) states:
The method creates and starts three tasks of type Task<TResult>, where TResult is an integer. As each task finishes, DisplayResults displays the task's URL and the length of the downloaded contents. Because the tasks are running asynchronously, the order in which the results appear might differ from the order in which they were declared.
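The pattern that passage describes, creating and starting every task before awaiting any of them, can be sketched as follows. This is only an illustration under assumptions: `FakeChecksum` is a hypothetical stand-in that uses `Task.Delay` to simulate hashing time instead of reading real files, and the delays are arbitrary.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class StartThenAwait
{
    // Hypothetical stand-in for a checksum: the delay simulates the hashing time.
    static async Task<int> FakeChecksum(int index, int milliseconds, List<int> finished)
    {
        await Task.Delay(milliseconds);
        lock (finished) { finished.Add(index); } // record completion order
        Console.WriteLine($"{index} finished");
        return index;
    }

    public static async Task<List<int>> RunAsync()
    {
        var finished = new List<int>();

        // Calling each method starts its work; nothing is awaited yet.
        Task<int> t1 = FakeChecksum(1, 400, finished);
        Task<int> t2 = FakeChecksum(2, 300, finished);
        Task<int> t3 = FakeChecksum(3, 200, finished);
        Task<int> t4 = FakeChecksum(4, 10, finished);

        // Only now wait, and for all of them at once.
        await Task.WhenAll(t1, t2, t3, t4);
        return finished;
    }

    static void Main()
    {
        List<int> order = RunAsync().GetAwaiter().GetResult();
        Console.WriteLine("Completion order: " + string.Join(", ", order));
    }
}
```

With these delays the shortest task would normally be reported first, since all four are already in flight when the first wait happens.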
However, that is not what I see with my MainAsync example: each call finishes in the order it was made. I realize this example does not use parallel processing, which I assume would confine it to a single processor, but given that the last file takes 2 seconds to process and the first takes 2 minutes, I would still expect the smallest one to finish first.
Can somebody explain this behavior? I just want to understand what's going on behind the scenes with async and await when they are used like this.