Make LINQ Aggregate asynchronous

I often use LINQ in my code. Well, put another way: I can’t live without LINQ in my daily work. One of my favorite methods is Aggregate. Applied wisely, it can save you from writing explicit loops, chains naturally into other LINQ methods, and at the same time keeps your code readable and well structured. Aggregate is similar to the reduce and fold functions, which are the hammer and anvil of the functional programming toolbox.

When you use Entity Framework, it provides you with async extension methods like ToListAsync(), ToArrayAsync(), and SingleAsync(). But what if you want asynchronous behavior from the LINQ Aggregate method? You will not find an async version in the existing framework (at the time of writing I’m using .NET Core 3.1 and C# 8.0). So let me give you a real-world example of a case where you could find this really useful.

Let’s say you need to fetch all distinct values for multiple columns from a database in order to build a multi-selection filter.

Let’s also assume you use SQL Server, as it is one of the most common choices. To keep it simple, I will show an example using the Dapper micro-ORM.

The function could look like this:

public List<MultiSelectionModel> GetMultiSelectionFilterValues(string[] dataFields) {
  var results = new List<MultiSelectionModel>();

  var query = dataFields.Aggregate(new StringBuilder(), (acc, field) =>{
    return acc.AppendLine($"SELECT [{field}] FROM Table GROUP BY [{field}];");
  });

  using var connection = new SqlConnection(this.connectionString);
  connection.Open();

  using(var multi = connection.QueryMultiple(query.ToString())) {
    results.AddRange(dataFields.Aggregate(
     new List<MultiSelectionModel>(), (acc, field) =>{
      acc.Add(new MultiSelectionModel {
        DataField = field,
        Values = multi.Read(),
      });

      return acc;
    }));
  }

  return results;
}

The function receives an array of data fields (columns) for which we need to fetch distinct values for the multi-selection filter, and returns a list of multi-selection models. The model is just a simple data structure defined as:

public class MultiSelectionModel
{
    public string DataField { get; set; }
    public IEnumerable<dynamic> Values { get; set; }
}

On lines 4-6 you can see how the Aggregate method is applied to build a SELECT query that fetches distinct values for the provided columns. I use GROUP BY in this example, but you can use DISTINCT to the same effect, although there is a difference in performance between DISTINCT and GROUP BY for more complex queries, which is excellently explained in this article. Lines 13-21 highlight the main logic of the function, where we read each result set with multi.Read() and assign the distinct values for each data field to the resulting model. In both cases the following Aggregate extension is used:

public static TAccumulate Aggregate<TSource, TAccumulate>(
	this IEnumerable<TSource> source,
	TAccumulate seed,
	Func<TAccumulate, TSource, TAccumulate> func
)

In the first case we provided a StringBuilder as the seed parameter. The second parameter is a function which receives the accumulator and an element from the source and returns the accumulator, which is the StringBuilder in our case. In the second case we used a List<MultiSelectionModel> as the seed, so the final list is accumulated in that collection.
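To make the seed and accumulator roles concrete, here is a minimal standalone sketch of the seeded overload. The class name AggregateSeedDemo and the method BuildQuery are made up for illustration, and "Table" is the same placeholder table name used above. It uses Append with an explicit "\n" instead of AppendLine so the result does not depend on the platform's line ending.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper, for illustration only.
public static class AggregateSeedDemo
{
    // Folds the column names into one batch of SELECT statements.
    public static string BuildQuery(IEnumerable<string> fields) =>
        fields
            .Aggregate(
                new StringBuilder(),   // seed: the initial accumulator
                // func: receives the accumulator and the next element,
                // and returns the accumulator for the next step
                (acc, field) => acc.Append($"SELECT [{field}] FROM Table GROUP BY [{field}];\n"))
            .ToString();
}
```

Calling AggregateSeedDemo.BuildQuery(new[] { "Country", "City" }) should produce two statements, one per column, in input order.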

So that works. You can stop reading now and go for a couple of 🍺 with fellows…

Oh, you’re still here 😏. You know, curiosity killed the cat. But we’re different animals, so let’s move on. As you may have noticed, in the first example we used what Dapper calls a multi-result query: it executes multiple queries within the same command and maps the results. The good news is that it also has an async version. The bad news is that our Aggregate does not. Should we go back to the good old foreach loop for mapping the results of the query execution, then? No way!

So how could we implement a version of GetMultiSelectionFilterValues that is async all the way down? Well, let’s rewrite it the way we would like to see it:

public async Task<List<MultiSelectionModel>> GetMultiSelectionFilterValuesAsync(string[] dataFields) {
  var results = new List<MultiSelectionModel>();

  var query = dataFields.Aggregate(new StringBuilder(), (acc, field) =>{
    return acc.AppendLine($"SELECT [{field}] FROM Table GROUP BY [{field}];");
  });

  using var connection = new SqlConnection(this.connectionString);
  await connection.OpenAsync();

  using(var multi = await connection.QueryMultipleAsync(query.ToString())) {
    results.AddRange(await dataFields.AggregateAsync(
     new List<MultiSelectionModel>(), async (acc, field) =>{
      acc.Add(new MultiSelectionModel {
        DataField = field,
        Values = await multi.ReadAsync(),
      });

      return acc;
    }));
  }

  return results;
}

Much better now, isn’t it? The changes are the async/await keywords and the Async method variants. This is a fully asynchronous Aggregate call now. Of course you want to know where I got this async extension 😀. Here are the extension methods I came up with to make it work:

public static class AsyncExtensions {
	public static Task<TSource> AggregateAsync<TSource>(
	this IEnumerable<TSource> source, Func<TSource, TSource, Task<TSource>> func) {
		if (source == null) {
			throw new ArgumentNullException(nameof(source));
		}

		if (func == null) {
			throw new ArgumentNullException(nameof(func));
		}

		return source.AggregateInternalAsync(func);
	}

	public static Task<TAccumulate> AggregateAsync<TSource, TAccumulate>(
	this IEnumerable<TSource> source, TAccumulate seed, Func<TAccumulate, TSource, Task<TAccumulate>> func) {
		if (source == null) {
			throw new ArgumentNullException(nameof(source));
		}

		if (func == null) {
			throw new ArgumentNullException(nameof(func));
		}

		return source.AggregateInternalAsync(seed, func);
	}

	private static async Task<TSource> AggregateInternalAsync<TSource>(
	this IEnumerable<TSource> source, Func<TSource, TSource, Task<TSource>> func) {
		using var e = source.GetEnumerator();

		if (!e.MoveNext()) {
			throw new InvalidOperationException("Sequence contains no elements");
		}

		var result = e.Current;
		while (e.MoveNext()) {
			result = await func(result, e.Current).ConfigureAwait(false);
		}

		return result;
	}

	private static async Task<TAccumulate> AggregateInternalAsync<TSource, TAccumulate>(
	this IEnumerable<TSource> source, TAccumulate seed, Func<TAccumulate, TSource, Task<TAccumulate>> func) {
		var result = seed;
		foreach(var element in source) {
			result = await func(result, element).ConfigureAwait(false);
		}

		return result;
	}
}

I did this for two of the three existing Aggregate overloads. The last one you can implement yourself if you need it. It will be a good exercise for understanding how Aggregate works behind the scenes.
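If you want something to compare your attempt against, here is one possible sketch of the remaining overload, the one that takes a result selector. It mirrors the signature of the synchronous LINQ Aggregate with a resultSelector. The class name AsyncAggregateSketch is made up for illustration, and keeping the result selector synchronous is a design assumption on my part, not the only reasonable choice.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical sketch of the third overload; not part of any framework.
public static class AsyncAggregateSketch
{
    public static Task<TResult> AggregateAsync<TSource, TAccumulate, TResult>(
        this IEnumerable<TSource> source,
        TAccumulate seed,
        Func<TAccumulate, TSource, Task<TAccumulate>> func,
        Func<TAccumulate, TResult> resultSelector)
    {
        // Validate eagerly, before any asynchronous work starts.
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (func == null) throw new ArgumentNullException(nameof(func));
        if (resultSelector == null) throw new ArgumentNullException(nameof(resultSelector));

        return Core();

        async Task<TResult> Core()
        {
            var result = seed;
            foreach (var element in source)
            {
                // Each step awaits the previous one, so elements are
                // processed strictly in order, never in parallel.
                result = await func(result, element).ConfigureAwait(false);
            }

            // Project the final accumulator into the result type.
            return resultSelector(result);
        }
    }
}
```

For example, awaiting new[] { 1, 2, 3 }.AggregateAsync(0, (acc, x) => Task.FromResult(acc + x), sum => $"total={sum}") should give the string "total=6".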

Stay tuned and have fun.

Make your C# code cleaner with functional approach

Since LINQ was introduced in .NET Framework 3.5, the way we write code has changed a lot. Not only in the context of database queries with LINQ to SQL or LINQ to Entities, but also in day-to-day work manipulating collections and all kinds of transformations. Powerful language constructs like implicitly typed variables, anonymous types, lambda expressions, and object initializers gave us tools for writing more robust and concise code.

It was a big step towards a functional approach to solving engineering tasks: expressing your intent declaratively instead of through the sequential statements of the imperative paradigm.

Functional programming is a huge topic and a mind shift for .NET developers who have been writing their code in C# for a long time. If you are new to the topic (like me), you probably don’t want to get into all those scary-sounding things like functors, applicatives, or monads right now (a discussion for other posts). So let’s see how applying a functional approach could make your code cleaner here and now with our beloved C#.

For the sake of example we will solve the very simple FizzBuzz kata in C#. Then I will show you how it looks in F#. If you don’t know what a kata is, it is just a fancy way of saying puzzle or coding task. The word kata came to us from the world of martial arts, particularly karate. FizzBuzz is a simple coding task where you need to solve the following problem:

Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.

So, first we will start with a naïve implementation in C#:

void Main()
{
    for(var i = 1; i <= 100; i++)
    {
        if(i % (3 * 5) == 0)
            Console.WriteLine("FizzBuzz");
        else if(i % 3 == 0)
            Console.WriteLine("Fizz");
        else if(i % 5 == 0)
            Console.WriteLine("Buzz");
        else
            Console.WriteLine(i);
    }   
}

And here’s the output:

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
...

So far so good. I told you, it’s a piece of cake. Okay, how can we improve this code? Let’s use the power and beauty of LINQ:

void Main()
{
	var result = Enumerable
		.Range(1, 100)
		.Select(x => {
			switch(x)
			{
				case var n when n % (3 * 5) == 0: return "FizzBuzz";
				case var n when n % 3 == 0: return "Fizz";
				case var n when n % 5 == 0: return "Buzz";
				default: return x.ToString();
			}			
		})
		.Aggregate((x, y) => x + Environment.NewLine + y);
	
	Console.WriteLine(result);
}

We use the static helper Range on Enumerable to generate a sequence from 1 to 100. Then we use the Select method to map each number in that range to a string, which is either one of the FizzBuzz words or the number itself. Here we rely on a very powerful concept: pattern matching, which has been available since C# 7.0. This variation uses the var pattern with a when clause to specify the condition. The last method in the chain is Aggregate. It is one of the most interesting methods in LINQ, because you can use it as a functional replacement for loops in your code base. In this example we concatenate the elements of the sequence, separated by new lines, producing a single string as the result.
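To see what "a functional replacement for loops" means in practice, here is a small sketch that puts the unseeded Aggregate call next to the explicit loop it stands in for. The class and method names (AggregateVsLoop, JoinWithLoop, JoinWithAggregate) are made up for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helpers, for illustration only.
public static class AggregateVsLoop
{
    // The imperative version: start from the first element and
    // fold in the remaining ones, one at a time.
    public static string JoinWithLoop(string[] items, string separator)
    {
        if (items.Length == 0)
            throw new InvalidOperationException("Sequence contains no elements");

        var result = items[0];
        for (var i = 1; i < items.Length; i++)
        {
            result = result + separator + items[i];
        }

        return result;
    }

    // The same fold expressed with Aggregate; like the loop above,
    // it throws InvalidOperationException on an empty sequence.
    public static string JoinWithAggregate(string[] items, string separator) =>
        items.Aggregate((x, y) => x + separator + y);
}
```

Both methods return the same string for the same input. For long sequences, string.Join or a StringBuilder avoids the repeated string copying that both of these versions do, so treat this as a demonstration of the fold rather than a recommended way to join strings.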

In C# 8.0 pattern matching was extended and improved. We can re-write our code like this:

public static string FizzBuzz(int n) =>
    (n % 3, n % 5) switch
    {
        (0, 0) => "FizzBuzz",
        (0, _) => "Fizz",
        (_, 0) => "Buzz",
        (_, _) => $"{n}"
    };

static void Main(string[] args)
{
    foreach (var n in Enumerable.Range(1, 100))
    {
        Console.WriteLine(FizzBuzz(n));
    }
}

This syntax is much closer to how pattern matching is applied in functional languages. The _ is called a discard: it means we are not interested in the value in that position. What we used here is called a tuple pattern.

  • When the remainders of dividing by 3 and by 5 are both 0, we print “FizzBuzz”.
  • When the remainder of dividing by 3 is 0 and we are not interested in the remainder for 5, we print “Fizz”.
  • When the remainder of dividing by 5 is 0 and we are not interested in the remainder for 3, we print “Buzz”.
  • For all other cases we just print the value of n.

Remember, in pattern matching order matters: the first matching arm wins and evaluation stops there.
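A small sketch of why the order matters. It uses when guards instead of the tuple pattern, because the compiler does not analyze guard conditions and therefore accepts the misordered version without complaint. The class and method names (PatternOrderDemo, Correct, Misordered) are made up for illustration.

```csharp
using System;

// Hypothetical demo, for illustration only.
public static class PatternOrderDemo
{
    // Most specific condition first: multiples of 15 reach the first arm.
    public static string Correct(int n) => n switch
    {
        _ when n % 15 == 0 => "FizzBuzz",
        _ when n % 3 == 0 => "Fizz",
        _ when n % 5 == 0 => "Buzz",
        _ => n.ToString()
    };

    // Same arms, different order: every multiple of 15 is also a
    // multiple of 3, so the "FizzBuzz" arm can never be reached.
    public static string Misordered(int n) => n switch
    {
        _ when n % 3 == 0 => "Fizz",
        _ when n % 15 == 0 => "FizzBuzz",
        _ when n % 5 == 0 => "Buzz",
        _ => n.ToString()
    };
}
```

With this sketch, Correct(15) returns "FizzBuzz" while Misordered(15) returns "Fizz", even though both methods contain exactly the same arms.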

Finally, let’s look at the F# implementation of the kata:

let fizzBuzz list  = 
    list |> List.map (fun x -> 
        match (x % 3, x % 5) with
        | (0, 0) -> "FizzBuzz"
        | (0, _) -> "Fizz"
        | (_, 0) -> "Buzz"
        | _ -> string x
    )

fizzBuzz [1..100] |> List.iter (fun x -> printfn "%s" x)

You can see that this sample is very similar to the previous one with C# 8.0 pattern matching. And this should not surprise you, because the C# team is introducing more and more functional constructs in the language with each version, taking all the good parts from F#.