.NET, C#, Programming

C# 8.0 pattern matching in action

Let’s revisit definition of the pattern matching from the Wiki:

In computer sciencepattern matching is the act of checking a given sequence of tokens for the presence of the constituents of some pattern. In contrast to pattern recognition, the match usually has to be exact: “either it will or will not be a match.” The patterns generally have the form of either sequences or tree structures. Uses of pattern matching include outputting the locations (if any) of a pattern within a token sequence, to output some component of the matched pattern, and to substitute the matching pattern with some other token sequence

Pattern matching

Putting it into the human language means that instead of comparing expressions by exact values (think of if/else/switch statements), you literally match by patterns or shape of the data. Pattern could be constant value, tuple, variable, etc. For full definition please refer to the documentation. Initially, pattern matching was introduced in C# 7. It was shipped with basic capabilities to recognize const, type and var patterns. The language was extended with is and when keywords. One of the last pieces to make it work was introduction of discards for deconstruction of tuples and objects. Combining all these together you was able to use pattern matching in if and switch expressions:

if (o is null) Console.WriteLine("o is null");
if (o is string s && s.Trim() != string.Empty)
        Console.WriteLine("whoah, o is not null");
switch (o)
{
    case double n when n == 0.0: return 42;
    case string s when s == string.Empty: return 42;
    case int n when n == 0: return 42;
    case bool _: return 42;
    default: return -1;
}

C# 8 extended pattern matching with switch expressions and three new ways of expressing a pattern: positional, property and tuple. Again, I will refer to full documentation for the details.

In this post I would like to show a real-world example of using pattern matching power. Let’s say you want to build an SQL-query based on list of filters you receive as an input from HTTP request. For example we would like to get a list of all shurikens filtered by shape and material:

http://www.ninja.org/shuriken?filters[shape]='star,stick'&filters[material]='kugi-gata'

We need a model to which we can map this request with list of filters:

public class ShurikenQuery
{
    [BindProperty(Name = "filters", SupportsGet = true)]
    public IDictionary<string, string> ShurikenFilters { get; set; }
}

Now, let’s write a function which builds SQL-query string based on provided filters:

private static string BuildFilterExpression(ShurikenQuery query)
{
    if (query is null)
        throw new ArgumentNullException(nameof(query));

    const char Delimiter = ',';

    var expression = query.ShurikenFilters?.Aggregate(new StringBuilder(), (acc, ms) =>
    {
        var key = ms.Key;
        var value = ms.Value;
        var exp = (key, value) switch
        {
            (_, null) => $"[{key}] = ''",
            (_, var val) when val.Contains(Delimiter) =>
                @$"[{key}] IN ({string.Join(',', val
                    .Replace("'", string.Empty)
                    .Replace("\"", string.Empty)
                    .Split(Delimiter).Select(x => $"'{x}'"))})",
            (_, _) => $"[{key}] = '{value}'"
        };
        return exp != null ? acc.AppendLine($" AND {exp}") : acc;
    });
    return expression?.ToString() ?? string.Empty;
}

Let’s break this code down. On line 8 we use LINQ Aggregate function which do the main work of building a filter string. We want to iterate over each KeyValuePair in dictionary and based on the data shape in it create a string which represents expression which could be provided for SQL WHERE clause.

On line 10 and 11 we extract key and value for KeyValuePair in own variables just for convenience.

Lines between 12-21 are of main interest to us – that’s where all magic happens. On line 12 we wrapped key and value in a tuple and use switch expression to start pattern match. On line 14 we check if value variable is null (_ in key position means discard – it’s when we don’t care what actual value is). If that’s a case we produce string like [Shape]=”. On line 15-19 again, we not interested what in key position, but now we assign value to a dedicated variable val we can work with. Next, we check if this value contains filter with multiple values (like in case filters[shape]=’star,stick’) and split it in separate values, removing ” and ‘ on the way we go. We want to translate this into SQL IN operator, so string after processing this pattern looks like this: [Shape] IN (‘star’, ‘stick’). Last pattern matches for remaining cases of single value filters (like filters[material]=’kugi-gata’) which produce following string: [Material] = ‘kugi-gata’. Line 23 applies AND to string we built and accumulating result in a StringBuilder variable we provided in initial run of Aggregate function on line 8.

If we would put (_, _) as a first line in a switch expression other patterns will be not evaluated, because (_, _) will catch all values

Be aware that in pattern matching the order is everything

Finally, the resulting string we return looks like this:

 AND [Shape] IN ('star','stick') AND [Material] = 'kugi-gata'

And that’s a valid string for SQL WHERE clause. As you can see pattern matching is a real deal and could be used in a lot of cases for parsing expressions. With C# 9 pattern matching will be extended even more with relational and logical patterns.

Hope you enjoyed. Stay tuned and happy coding.

.NET, async, LINQ, Programming

Make LINQ Aggregate asynchronous

I often use LINQ in my code. Well, put it in another way: I can’t live without using LINQ in my daily work. One of the my favorite methods is Aggregate. Applying it wisely could save you from having explicit loops, naturally chain into other LINQ methods and at the same time keep your code readable and well-structured. Aggregate is similar to reduce and fold functions which is hammer and anvil of functional programming tooling.

When you use Entity Framework it provides you with async extensions methods like ToListAsync(), ToArrayAsync(), SingleAsync(). But what if you want to achieve asynchronous behavior using LINQ Aggregate method? You will not find async extension in existing framework (on the moment of writing this article I’m using .NET Core 3.1 and C# 8.0). But let me give you a real-world example of the case when you could find this really useful.

Let’s say you need to fetch from database all distinct values for multiple columns in order to build multi-selection filter like this:

Let’s also assume you use SQL Server as it is most common one. For keeping it simple I will show you example with using Dapper micro-ORM.

The function could look like this:

public List<MultiSelectionModel> GetMultiSelectionFilterValues(string[] dataFields) {
  var results = new List<MultiSelectionModel>();

  var query = dataFields.Aggregate(new StringBuilder(), (acc, field) =>{
    return acc.AppendLine($ "SELECT [{field}] FROM Table GROUP BY [{field}];");
  });

  using var connection = new SqlConnection(this.connectionString);
  connection.Open();

  using(var multi = connection.QueryMultiple(query.ToString())) {
    results.AddRange(dataFields.Aggregate(
     new List<MultiSelectionModel>(), (acc, field) =>{
      acc.Add(new MultiSelectionModel {
        DataField = field,
        Values = multi.Read(),
      });

      return acc;
    }));
  }

  return results;
}

The function receives as input parameter array of data fields (columns) for which we need to fetch distinct values for multi-selection filter and returns a list of multi-selection model which is just simple data structure defined as:

public class MultiSelectionModel
{
    public string DataField { get; set; }
    public IEnumerable<dynamic> Values { get; set; }
}

On lines 4-6 you see how Aggregate method applied for building a SELECT query for fetching distinct values for provided columns. I uses GROUP BY in this example, but you can use DISTINCT with same effect, although there difference in performance between distinct and group by for more complex queries which is excellently explained in this article. Lines 13-21 highlights the main logic of the function where we actually querying database with multi.Read() and assign results with distinct values for each data field in resulting model. In both cases following Aggregate extension used:

public static TAccumulate Aggregate<TSource, TAccumulate>(
	this IEnumerable<TSource> source,
	TAccumulate seed,
	Func<TAccumulate, TSource, TAccumulate> func
)

In first case as a seed parameter we provided StringBuilder. Second parameter is a function which receives accumulator and element from the source and returns accumulator which is StringBuilder in our case. In second case, as a seed we used List<MultiSelectionModel> which is resulting collection, so that final list is accumulated in that collection.

So that works. You can stop reading now and go for a couple of 🍺 with fellows…

Oh, you still here 😏. You know, curiosity killed the cat. But we different animals, so let’s move on. Well, as you can notice, in the first example we used what is known in Dapper as multi-result result. It executes multiple queries within the same command and map results. The good news is that it also has async version. The bad news is that our Aggregate does not have async version. Should we go back to old good for-each loop for mapping results from query execution then? No way!

So how could we implement all the way down async version of GetMultiSelectionFilterValues? Well, let’s re-write it how we would like to see it:

public async Task<List<MultiSelectionModel>> GetMultiSelectionFilterValuesAsync(string[] dataFields) {
  var results = new List<MultiSelectionModel>();

  var query = dataFields.Aggregate(new StringBuilder(), (acc, field) =>{
    return acc.AppendLine($ "SELECT [{field}] FROM Table GROUP BY [{field}];");
  });

  using var connection = new SqlConnection(this.connectionString);
  connection.Open();

  using(var multi = await connection.QueryMultipleAsync(query.ToString())) {
    results.AddRange(await dataFields.AggregateAsync(
     new List<MultiSelectionModel>(), async (acc, field) =>{
      acc.Add(new MultiSelectionModel {
        DataField = field,
        Values = await multi.ReadAsync(),
      });

      return acc;
    }));
  }

  return results;
}

Much better now, isn’t it? I’ve highlighted the changes. This is fully asynchronous Aggregate method now. Of course you wish to know where did I get this async extension 😀? Here the extension methods I come up with to make it work:

public static class AsyncExtensions {
	public static Task<TSource> AggregateAsync<TSource>(
	this IEnumerable<TSource> source, Func<TSource, TSource, Task<TSource>> func) {
		if (source == null) {
			throw new ArgumentNullException(nameof(source));
		}

		if (func == null) {
			throw new ArgumentNullException(nameof(func));
		}

		return source.AggregateInternalAsync(func);
	}

	public static Task<TAccumulate> AggregateAsync<TSource,
	TAccumulate>(
	this IEnumerable<TSource> source, TAccumulate seed, Func<TAccumulate, TSource, Task<TAccumulate>> func) {
		if (source == null) {
			throw new ArgumentNullException(nameof(source));
		}

		if (func == null) {
			throw new ArgumentNullException(nameof(func));
		}

		return source.AggregateInternalAsync(seed, func);
	}

	private static async Task<TSource> AggregateInternalAsync <TSource> (
	this IEnumerable <TSource> source, Func<TSource, TSource, Task<TSource>> func) {
		using
		var e = source.GetEnumerator();

		if (!e.MoveNext()) {
			throw new InvalidOperationException("Sequence contains no elements");
		}

		var result = e.Current;
		while (e.MoveNext()) {
			result = await func(result, e.Current).ConfigureAwait(false);
		}

		return result;
	}

	private static async Task<TAccumulate> AggregateInternalAsync<TSource,	TAccumulate>(
	this IEnumerable<TSource> source, TAccumulate seed, Func<TAccumulate, TSource, Task<TAccumulate>> func) {
		var result = seed;
		foreach(var element in source) {
			result = await func(result, element);
		}

		return result;
	}
}

I did it for two of three existing Aggregate overloads. The last one you can implement yourself if you need it. It will be good exercise for you to understand how aggregate works behind the scenes.

Stay tuned and have fun.