Simplifying [Theory] test data with Xunit.Combinatorial : Andrew Lock

August 06, 2024

Below post content copied from Andrew Lock | .NET Escapades

In this post I show how you can simplify your xUnit [Theory] tests using the Xunit.Combinatorial package. This package has a bunch of features that can make it easier to generate the test data you need; you can auto-generate parameters, generate all parameter combinations, or randomly generate values.

Creating parameterised tests in xUnit with `[InlineData]`, `[ClassData]`, and `[MemberData]`

Before we dig into Xunit.Combinatorial, I'll give a quick overview of the two different types of test in xUnit, and the main approaches for supplying the test data to those tests.

To this day, one of the most popular posts I've written is "Creating parameterised tests in xUnit with [InlineData], [ClassData], and [MemberData]". If you're new to xUnit you might want to read that and its follow up post "Creating strongly typed xUnit theory test data with TheoryData".

There are two types of test in xUnit:

[Fact] tests are parameterless methods.
[Theory] tests are parameterised methods.

Applying one of the above attributes to a method turns it into an xUnit test. For [Theory] tests, you also need to provide a data source, so that xUnit knows what parameters to pass to the method when it calls it.

There are three built-in approaches to doing this:

[InlineData]: multiple [InlineData] attributes are applied to the method, each providing the full set of parameters for a single execution of the method.
[MemberData]: a reference to a static method, property, or field which returns an IEnumerable<object[]>, where each element is the full set of parameters for a single execution of the method. The member can alternatively return a TheoryData instance.
[ClassData]: a reference to a class which implements IEnumerable<object[]>, where each element is the full set of parameters for a single execution of the method. The class can alternatively be derived from TheoryData.

The following example shows the first two approaches for supplying all the possible values to a [Theory] test, where the test has two bool values.

public class MyTests
{
    [Theory]
    [InlineData(false, false)] // Each instance specifies the values for all parameters
    [InlineData(false, true)]
    [InlineData(true, false)]
    [InlineData(true, true)]
    public void MyInlineDataTest(bool val1, bool val2)
    {
    }

    // Generate the data - we could have listed out all the values
    // but I often have code like the following when I want to 
    // test all the combinations of various cases
    public static IEnumerable<object[]> MyData
        =>  from val1 in new[] { true, false }
            from val2 in new[] { true, false }
            select new object[] { val1, val2 };

    [Theory]
    [MemberData(nameof(MyData))]
    public void MyMemberDataTest(bool val1, bool val2)
    {
    }
}

As you can see in the above code, whether we want to list out all the values manually (as in [InlineData], but we could have done it for [MemberData] too) or generate the values (as I did in this case for [MemberData]) it's quite a lot of code dedicated to just generating some bool values.
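For completeness, the third approach, [ClassData], might look something like the following. This is a minimal sketch that derives from TheoryData<bool, bool>; the BoolPairs and MyClassDataTests names are hypothetical and only there for illustration.

```csharp
using Xunit;

// Hypothetical data class: deriving from TheoryData<bool, bool> gives
// strongly typed rows instead of raw object[] arrays.
public class BoolPairs : TheoryData<bool, bool>
{
    public BoolPairs()
    {
        foreach (var val1 in new[] { false, true })
        foreach (var val2 in new[] { false, true })
        {
            // Each row is the full set of parameters for one execution
            Add(val1, val2);
        }
    }
}

public class MyClassDataTests
{
    [Theory]
    [ClassData(typeof(BoolPairs))]
    public void MyClassDataTest(bool val1, bool val2)
    {
    }
}
```

It's the same four rows again, and still a whole extra class just to enumerate two bool values.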

This is where Xunit.Combinatorial shines…

Using Xunit.Combinatorial to auto-generate values

Test parameters like the ones I've shown above are the bread and butter of Xunit.Combinatorial, a project from Andrew Arnott on the Visual Studio Platform team. For the rest of this post I walk through some of the features it provides.

Add Xunit.Combinatorial to your project by running:

dotnet add package Xunit.Combinatorial

At the time of writing, this adds version 1.6.24 to the project. We can now simplify the MyTests class to use the [CombinatorialData] attribute instead of [InlineData] or [MemberData]:

public class MyTests
{
    [Theory, CombinatorialData]
    public void MyInlineDataTest(bool val1, bool val2)
    {
    }

    [Theory, CombinatorialData]
    public void MyMemberDataTest(bool val1, bool val2)
    {
    }
}

That's it! The net result is exactly the same; we run all 4 permutations for val1 and val2 for both tests, but we've gone from 22 lines down to 12 lines!

The result of executing the test class above is that each test is executed with the same data as before

Ok, ok, I cheated a bit by putting the attributes on the same line, but that's something you feasibly can do now without things getting long and ugly!

Let's dig in a bit further. Generating all the bool values is obviously easy and feasible, but what about other data? The short answer is that there are only 5 main types that are supported by Xunit.Combinatorial in this "automatic" mode:

bool: As you've already seen, this generates data for true and false.
bool?: This includes null as a value, giving null, true, false.
int: There's obviously a lot of potential ints, so only 0 and 1 are used.
int?: As for bool?, this adds null as a value, giving null, 0, 1.
Enum: If you use an enum, all the values returned by Enum.GetNames<T>() are used.

If you try to use a parameter type that is not one of these, Xunit.Combinatorial throws a NotSupportedException, which will likely break your test execution. This part of the library could do with a bit more love really; it would be nice for the error to say why the type isn't supported, and/or for the package to include an analyzer that points it out.
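As a small sketch of the automatic mode with two of the supported types, the following test mixes an enum and a bool?. The Transport enum and the test names are hypothetical; based on the rules above I'd expect 3 enum values combined with 3 bool? values, so 9 executions.

```csharp
using System;
using Xunit;

// Hypothetical enum, used purely for illustration
public enum Transport { Http, Grpc, NamedPipes }

public class AutoValueTests
{
    // Transport should contribute its 3 members, and bool? should contribute
    // null, true, and false, giving 3 x 3 = 9 combinations
    [Theory, CombinatorialData]
    public void MyAutoValuesTest(Transport transport, bool? useCompression)
    {
        Assert.True(Enum.IsDefined(typeof(Transport), transport));
    }
}
```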

This may seem quite limiting up front. What if I want to test more than 0 and 1 with my int parameter? Or I have double or string parameters? Do I have to fall back to the built-in attributes? Luckily, no: Xunit.Combinatorial provides a way to specify all the values for a given parameter.

Using custom defined values and ranges

The following example shows how to use a combination of the auto-generated bool parameter, specific non-default values for the int parameter, and an otherwise-unsupported double parameter:

public class MyTests
{
    [Theory, CombinatorialData]
    public void MyCombinatorialTest(
        bool val1, // use all the automatic values (true, false)
        [CombinatorialValues(1, -1)] int val2, // use the provided int values
        [CombinatorialValues(0.0, -1.2, 2.5)] double val3) // use the provided double values
    {
    }
}

[CombinatorialData] combines all these values to execute the test a total of 12 times:

val1: False, val2: -1, val3: -1.2
val1: False, val2: -1, val3: 0
val1: False, val2: -1, val3: 2.5
val1: False, val2: 1, val3: -1.2
val1: False, val2: 1, val3: 0
val1: False, val2: 1, val3: 2.5
val1: True, val2: -1, val3: -1.2
val1: True, val2: -1, val3: 0
val1: True, val2: -1, val3: 2.5
val1: True, val2: 1, val3: -1.2
val1: True, val2: 1, val3: 0
val1: True, val2: 1, val3: 2.5

Tests like this, where a large number of parameters means a large number of combinations, are where [CombinatorialData] really shines.
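To see where the 12 comes from, you can reproduce the matrix by hand. The following is a plain LINQ cross join over the same values, not how Xunit.Combinatorial is implemented; the CombinationCount class is a hypothetical helper for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: rebuilds the test matrix from the example above
public static class CombinationCount
{
    public static List<(bool, int, double)> AllRows()
    {
        bool[] val1Values = { false, true };      // automatic bool values
        int[] val2Values = { 1, -1 };             // from [CombinatorialValues]
        double[] val3Values = { 0.0, -1.2, 2.5 }; // from [CombinatorialValues]

        // Cross join: one row per (val1, val2, val3) combination
        return (from v1 in val1Values
                from v2 in val2Values
                from v3 in val3Values
                select (v1, v2, v3)).ToList();
    }
}
```

CombinationCount.AllRows().Count is 2 x 2 x 3 = 12. The number of rows is the product of the number of values per parameter, which is why the matrix grows so quickly as you add parameters.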

If you want to test a range of ints you can use the [CombinatorialRange] attribute. This takes a from parameter (the first value) and a count (the number of values to generate), for example:

public class MyTests
{
    [Theory, CombinatorialData]
    public void MyCombinatorialTest([CombinatorialRange(from: 10, count: 5)] int val1)
    {
        // val1: 10
        // val1: 11
        // val1: 12
        // val1: 13
        // val1: 14
    }
}

Alternatively you can provide from, to, and step, in which case the from value is generated and incremented by step until it is greater than to:

public class MyTests
{
    [Theory, CombinatorialData]
    public void MyCombinatorialTest([CombinatorialRange(from: 10, to: 20, step: 3)] int val1)
    {
        // val1: 10
        // val1: 13
        // val1: 16
        // val1: 19
    }
}
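If the two overloads are hard to keep straight, it may help to see the described behaviour written out as plain loops. This is only a sketch of the semantics described above, not the library's implementation; RangeSketch and its method names are hypothetical, and I've used start/end instead of from/to as parameter names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper mirroring the two [CombinatorialRange] forms described above
public static class RangeSketch
{
    // Equivalent of (from, count): 'count' consecutive values starting at 'start'
    public static IEnumerable<int> FromCount(int start, int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return start + i;
        }
    }

    // Equivalent of (from, to, step): start, start + step, ... until the value exceeds 'end'.
    // This sketch only handles positive steps.
    public static IEnumerable<int> FromToStep(int start, int end, int step)
    {
        if (step <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(step));
        }

        for (int value = start; value <= end; value += step)
        {
            yield return value;
        }
    }
}
```

RangeSketch.FromCount(10, 5) yields 10, 11, 12, 13, 14 and RangeSketch.FromToStep(10, 20, 3) yields 10, 13, 16, 19, matching the comments in the two tests above.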

Personally I'm not a fan of the design choice to use count in one case and to in the other. Consistency, whichever was chosen, would have been preferable in my opinion. For that reason, I recommend explicitly using the parameter names as I have in the examples above, to avoid ambiguity.

[CombinatorialValues] and [CombinatorialRange] work well when you only have a small number of values, but as the number of values gets larger, you may think that [MemberData] is looking more appealing. Fear not, Xunit.Combinatorial has you covered!

Using `[CombinatorialMemberData]` to generate values for a single parameter

In some cases, placing all the values for a parameter inline in an attribute may not be desirable, while in other cases it may not even be possible. For these situations, Xunit.Combinatorial has an attribute similar to xUnit's built-in [MemberData]: [CombinatorialMemberData].

public class MyTests
{
    // Members must be static methods, properties, or fields
    public static IEnumerable<Uri> GetUris
        => [new("http://localhost"), new("https://localhost")]; 

    [Theory, CombinatorialData]
    public void MyCombinatorialTest(
        [CombinatorialMemberData(nameof(GetUris))] Uri uri, // reference the member as a string
        [CombinatorialValues(8080, 8081)] int port)
    {
        // uri: http://localhost/, port: 8080
        // uri: http://localhost/, port: 8081
        // uri: https://localhost/, port: 8080
        // uri: https://localhost/, port: 8081
    }
}

[CombinatorialMemberData] is used in almost the same way as [MemberData], except instead of being applied to a test method and returning IEnumerable<object[]> with all the values for a test run, [CombinatorialMemberData] is applied to a single test parameter, and specifies all the possible values of the parameter. Xunit.Combinatorial then combines this with each of the other parameter values to generate the complete set of test data.

One thing to bear in mind is that the CombinatorialMemberData member is invoked once per test method. When you have multiple parameters in your tests, that means the same object will be used in multiple test runs. More concretely, in the example above, there are 4 executions of the test, but only 2 unique Uri instances.
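If the shared instances are a problem, for example because the test needs to mutate the object, one option is to supply immutable values and build the object inside the test. The following is a sketch of that idea; the FreshInstanceTests and BaseAddresses names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using Xunit;

public class FreshInstanceTests
{
    // Supply immutable strings instead of pre-built objects
    public static IEnumerable<string> BaseAddresses
        => new[] { "http://localhost", "https://localhost" };

    [Theory, CombinatorialData]
    public void MyCombinatorialTest(
        [CombinatorialMemberData(nameof(BaseAddresses))] string baseAddress,
        [CombinatorialValues(8080, 8081)] int port)
    {
        // The mutable UriBuilder is created inside the test, so every
        // execution gets its own instance and can safely modify it
        var builder = new UriBuilder(baseAddress) { Port = port };

        Assert.Equal(port, builder.Uri.Port);
    }
}
```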

Just like [MemberData], [CombinatorialMemberData] allows you to specify that a member is on a different type:

public class MyTests
{
    [Theory, CombinatorialData]
    public void MyCombinatorialTest( // Use Method 👇         on type 👇
        [CombinatorialMemberData(nameof(Data.GetPrimes), MemberType = typeof(Data))] int prime)
    {
    }

    class Data
    {
        public static IEnumerable<int> GetPrimes => [2, 3, 5, 7, 11, 13];
    }
}

It also lets you provide arguments that should be passed to the member when retrieving the values:

public class MyTests
{
    // Data generation function
    public static IEnumerable<int> GetPrimes(bool include1)
        => include1 ? [1, 2, 3, 5, 7] : [2, 3, 5, 7];

    [Theory, CombinatorialData]
    public void MyCombinatorialTest2(
        [CombinatorialMemberData(nameof(GetPrimes), true)] int prime)
    {                // Method to call ☝ passing in ☝
        // prime: 1 // 👈 include1 was true, so we have this value
        // prime: 2
        // prime: 3
        // prime: 5
        // prime: 7
    }
}

With all these attributes, you should be able to specify any combination you like.

Generating random data with `[CombinatorialRandomData]`

Sometimes you just want to test some random values. In those cases you could use [CombinatorialMemberData] and use Random.Shared.Next() to generate the values, or you could use the built-in support of [CombinatorialRandomData]. This attribute has 4 properties, each of which is optional:

Count: The number of values to generate. Defaults to 5.
Minimum: The minimum value (inclusive) that can be generated. Defaults to 0.
Maximum: The maximum value (inclusive) that can be generated. Defaults to int.MaxValue - 1.
Seed: The seed to use for random number generation. Defaults to not providing a seed, so different values are generated each time.

You can specify as many or as few of these values as you like, for example:

public class MyTests
{
    [Theory, CombinatorialData]
    public void MyCombinatorialTest2(
        [CombinatorialRandomData(Minimum = 10, Maximum = 20)] int value)
    {
        // value: 10
        // value: 12
        // value: 14
        // value: 18
        // value: 19
    }
}

The values are always unique, but be aware that if you specify a very narrow range of possible values, the generator may throw an exception trying to satisfy the constraints.
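Random inputs can make a failure hard to reproduce, so it may be worth setting Seed when you use this attribute. The sketch below assumes that a fixed seed gives the same values on every run, which is what the description of the Seed property implies; the class, test, and constant names are hypothetical.

```csharp
using Xunit;

public class SeededRandomTests
{
    // Shared between the attribute and the assertion so they can't drift apart
    public const int Min = 1;
    public const int Max = 100;

    [Theory, CombinatorialData]
    public void MySeededTest(
        [CombinatorialRandomData(Count = 3, Minimum = Min, Maximum = Max, Seed = 42)] int value)
    {
        // Minimum and Maximum are both inclusive
        Assert.InRange(value, Min, Max);
    }
}
```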

Reducing the number of combinations

The final feature I'd like to look at is the "pairwise" support, which is a way to reduce your test matrix, while still exploring important points in the test parameter space. This is based on several observations:

As the number of parameters increases, the number of test cases increases dramatically if testing all combinations.
Exhaustive testing of all combinations often isn't necessary to reveal bugs.
Many bugs are triggered by a combination of just two values.

Let's take a concrete example. The following test has 4 bool parameters. The full combinatorial matrix consists of $2^{4} = 16$ combinations, as shown below:

public class MyTests
{
    [Theory, CombinatorialData]
    public void MyTest(bool isSecure, bool isRemote, bool isNew, bool isReturn)
    {
        // isSecure: False, isRemote: False, isNew: False, isReturn: False
        // isSecure: False, isRemote: False, isNew: False, isReturn: True
        // isSecure: False, isRemote: False, isNew: True,  isReturn: False
        // isSecure: False, isRemote: False, isNew: True,  isReturn: True
        // isSecure: False, isRemote: True,  isNew: False, isReturn: False
        // isSecure: False, isRemote: True,  isNew: False, isReturn: True
        // isSecure: False, isRemote: True,  isNew: True,  isReturn: False
        // isSecure: False, isRemote: True,  isNew: True,  isReturn: True
        // isSecure: True,  isRemote: False, isNew: False, isReturn: False
        // isSecure: True,  isRemote: False, isNew: False, isReturn: True
        // isSecure: True,  isRemote: False, isNew: True,  isReturn: False
        // isSecure: True,  isRemote: False, isNew: True,  isReturn: True
        // isSecure: True,  isRemote: True,  isNew: False, isReturn: False
        // isSecure: True,  isRemote: True,  isNew: False, isReturn: True
        // isSecure: True,  isRemote: True,  isNew: True,  isReturn: False
        // isSecure: True,  isRemote: True,  isNew: True,  isReturn: True
    }
}

However, if we switch from CombinatorialData to PairwiseData instead, we can dramatically reduce the number of tests we execute:

public class MyTests
{
    [Theory, PairwiseData] // 👈 Using pairwise instead of [CombinatorialData]
    public void MyTest(bool isSecure, bool isRemote, bool isNew, bool isReturn)
    {
        // isSecure: False, isRemote: False, isNew: True,  isReturn: True
        // isSecure: False, isRemote: True,  isNew: False, isReturn: False
        // isSecure: True,  isRemote: False, isNew: False, isReturn: True
        // isSecure: True,  isRemote: False, isNew: True,  isReturn: False
        // isSecure: True,  isRemote: True,  isNew: False, isReturn: True
        // isSecure: True,  isRemote: True,  isNew: True,  isReturn: True
    }
}

This strategy dramatically reduces the number of test cases from 16 down to 6. However, if you look at each pair of parameters, isSecure and isRemote for example, you can see that we're still testing all 4 possible combinations.
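You don't have to take that on trust. The following sketch checks the six rows listed above and confirms that every pair of parameters still sees all four true/false combinations. It only verifies the coverage property; it is not the algorithm the library uses to pick the rows, and PairwiseCheck is a hypothetical helper.

```csharp
using System.Linq;

public static class PairwiseCheck
{
    // The six rows from the [PairwiseData] example: isSecure, isRemote, isNew, isReturn
    public static readonly bool[][] Rows =
    {
        new[] { false, false, true,  true  },
        new[] { false, true,  false, false },
        new[] { true,  false, false, true  },
        new[] { true,  false, true,  false },
        new[] { true,  true,  false, true  },
        new[] { true,  true,  true,  true  },
    };

    // Returns true when every pair of columns contains all four (bool, bool) combinations
    public static bool CoversAllPairs(bool[][] rows)
    {
        int columns = rows[0].Length;
        for (int a = 0; a < columns; a++)
        {
            for (int b = a + 1; b < columns; b++)
            {
                int seen = rows.Select(r => (r[a], r[b])).Distinct().Count();
                if (seen != 4)
                {
                    return false;
                }
            }
        }

        return true;
    }
}
```

PairwiseCheck.CoversAllPairs(PairwiseCheck.Rows) returns true. Drop the last row and it returns false, because isRemote and isNew are then never both true.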

If your tests are long-running then using [PairwiseData] to reduce your overall execution time while ensuring you're testing important cases may be a good trade-off. On the other hand, if your tests are fast unit tests, then you may be better off sticking with [CombinatorialData] as it may be easier to understand exactly which parameters are causing the issues when you get failures.

Limitations

I'm really looking forward to trying out Xunit.Combinatorial in the Datadog .NET repository, as I think there's a bunch of places it would tidy things up and reduce verbosity. Nevertheless, there are a few limitations that I'll need to bear in mind:

As mentioned previously, you can't control the lifetime of parameters created using [CombinatorialMemberData]; they will always be shared across test runs if you have multiple parameters in a test. To avoid flakiness, it's important not to mutate the parameters in the test.
There's currently no mechanism to exclude specific combinations. If you explicitly don't want to test certain combinations, you either can't use Xunit.Combinatorial, or you'll have to use some other mechanism to skip the combination.
[CombinatorialRange] can only be used with int and uint. If you want to use it with double/float/long or some other value, you're out of luck.
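As a rough sketch of how you might live with the last two limitations, the following test returns early for an unwanted combination and uses [CombinatorialMemberData] as a stand-in for a double range. The "insecure remote" rule and all the names here are hypothetical. Note that the early return means the combination is reported as passed rather than skipped.

```csharp
using System.Collections.Generic;
using System.Linq;
using Xunit;

public class WorkaroundTests
{
    // Stand-in for a double "range": 0, 0.25, 0.5, 0.75, 1
    public static IEnumerable<double> Fractions
        => Enumerable.Range(0, 5).Select(i => i * 0.25);

    [Theory, CombinatorialData]
    public void MyTest(
        bool isSecure,
        bool isRemote,
        [CombinatorialMemberData(nameof(Fractions))] double fraction)
    {
        // Hypothetical rule: insecure remote connections aren't a supported scenario,
        // so bail out early. The test runner still counts this as a pass.
        if (!isSecure && isRemote)
        {
            return;
        }

        Assert.InRange(fraction, 0.0, 1.0);
    }
}
```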

These seem like relatively easy limitations to live with, and I'm looking forward to trying it out!

Summary

In this post I showed how you can simplify your xUnit [Theory] tests using the Xunit.Combinatorial package. The built-in [InlineData] and [MemberData] attributes require that you specify all the parameters for a test run. If you want to specify all the permutations for a set of parameters, that may be a lot of data to specify. In contrast, [CombinatorialData] has you specify all the possible values for each parameter separately, and generates all the test runs for you. In many cases, particularly [Theory] tests with many parameters, this can significantly simplify your test definition code.
