Home

PrettyTests extends Julia's basic unit-testing functionality by providing drop-in replacements for Test.@test with more informative error messages.

The inspiration for the package comes from Python and NumPy asserts, which customize their error messages depending on the type of test being performed; for example, by showing the differences between two sets that should be equal or the number of elements that differ in two arrays.

PrettyTests macros are designed to (a) provide clear and concise error messages tailored to specific situations, and (b) conform with the standard Test library interface, so that they can fit into any testing workflow. This guide walks through several examples:

@test_sets for set-like comparisons
@test_all for vectorized tests
Test integrations
- Broken/skipped tests
- Working with test sets

The package requires Julia 1.7 or higher, and can be installed in the usual way: type ] to enter the package manager, followed by add PrettyTests.

`@test_sets` for set-like comparisons

Set equality

The @test_sets macro is used to compare two set-like objects. It accepts expressions of the form @test_sets L <op> R, where op is an infix set comparison operator and L and R are collections, broadly defined.

In the simplest example, one could test for set equality with the (overloaded) == operator:

julia> a, b = [2, 1, 1], [1, 2];
julia> @test_sets a == b
Test Passed

This is equivalent to the more verbose @test issetequal(a, b). It is also more informative in the case of failure:

julia> @test_sets a == 2:4
Test Failed at REPL[1]:2
  Expression: a == 2:4
   Evaluated: L and R are not equal.
              L ∖ R has 1 element:  [1]
              R ∖ L has 2 elements: [3, 4]

The failed test message lists exactly how many and which elements were in the set differences L \ R and R \ L, which should have been empty in a passing test.
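The equivalence with issetequal and the two reported differences can be reproduced with Base functions alone. The following is a minimal sketch of that logic, not the macro's actual implementation; the names only_in_L and only_in_R are chosen here for illustration.

```julia
# Set-like comparison using only Base functions (PrettyTests is not required).
a = [2, 1, 1]
R = 2:4

# Duplicates and ordering are ignored, as in `@test_sets a == b`.
equal_ab = issetequal(a, [1, 2])

# The two set differences that a failing equality test reports.
only_in_L = setdiff(a, R)   # elements of L that are missing from R
only_in_R = setdiff(R, a)   # elements of R that are missing from L

println(equal_ab)
println(only_in_L)
println(only_in_R)
```

Both differences are empty exactly when issetequal returns true, which is why a passing test has nothing to report.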

Note how the collections interpreted as L and R are color-coded (using ANSI color escape codes) so that they can be easily identified if the expressions are long:

julia> variable_with_long_name = 1:3;
julia> function_with_long_name = () -> 4:9;
julia> @test_sets variable_with_long_name ∪ Set(4:6) == function_with_long_name()
Test Failed at REPL[3]:2
  Expression: variable_with_long_name ∪ Set(4:6) == function_with_long_name()
   Evaluated: L and R are not equal.
              L ∖ R has 3 elements: [1, 2, 3]
              R ∖ L has 3 elements: [7, 8, 9]

Disable color output

To disable colored subexpressions in failure messages use disable_failure_styling().

The symbol ∅ (typed as \emptyset<tab>) can be used as shorthand for Set() in any set expression:

julia> @test_sets Set() == ∅
Test Passed
julia> @test_sets [1,1] == ∅
Test Failed at REPL[2]:2
  Expression: [1, 1] == ∅
   Evaluated: L and R are not equal.
              L ∖ R has 1 element:  [1]
              R ∖ L has 0 elements: []

Because the macro internally expands the input expression to an issetequal call (and uses setdiff to print the differences), it works very flexibly with general collections and iterables:

julia> @test_sets Dict() == Set()
Test Passed
julia> @test_sets "baabaa" == "abc"
Test Failed at REPL[2]:2
  Expression: "baabaa" == "abc"
   Evaluated: L and R are not equal.
              L ∖ R has 0 elements: []
              R ∖ L has 1 element:  ['c']

Subsets

Set comparisons beyond equality are also supported, with modified error messages. For example, as in Base Julia, the expression L ⊆ R is equivalent to issubset(L, R):

julia> @test_sets "baabaa" ⊆ "abc"
Test Passed
julia> @test_sets (3, 1, 2, 3) ⊆ (1, 2)
Test Failed at REPL[2]:2
  Expression: (3, 1, 2, 3) ⊆ (1, 2)
   Evaluated: L is not a subset of R.
              L ∖ R has 1 element:  [3]

Note how, in this case, the failure displays only the set difference L \ R and omits the irrelevant R \ L.
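The subset check and its single reported difference correspond to two Base calls. This is a sketch using plain vectors, under the assumption that the macro relies on issubset and setdiff as described above; the name extra is illustrative.

```julia
# Subset comparison with Base functions only.
L = [3, 1, 2, 3]
R = [1, 2]

is_sub = issubset(L, R)   # same result as `L ⊆ R`
extra = setdiff(L, R)     # the elements that make the subset test fail

println(is_sub)
println(extra)
```

Only L ∖ R matters here: elements of R that are absent from L do not affect whether L is a subset of R.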

Disjointness

The form L ∩ R == ∅ is equivalent to isdisjoint(L, R). In the case of failure, the macro displays the non-empty intersection L ∩ R, as computed by intersect:

julia> @test_sets (1, 2, 3) ∩ (4, 5, 6) == ∅
Test Passed
julia> @test_sets "baabaa" ∩ "abc" == ∅
Test Failed at REPL[2]:2
  Expression: "baabaa" ∩ "abc" == ∅
   Evaluated: L and R are not disjoint.
              L ∩ R has 2 elements: ['b', 'a']
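The disjointness check can likewise be sketched with isdisjoint and intersect from Base. The strings are collected into character vectors here purely to keep the example conservative; that step is a choice made for this sketch, not something the macro is stated to do.

```julia
# Disjointness with Base functions only.
L = collect("baabaa")   # Vector{Char}
R = collect("abc")

disjoint = isdisjoint(L, R)   # false, because 'a' and 'b' are shared
shared = intersect(L, R)      # the non-empty intersection shown on failure

println(disjoint)
println(shared)
```

The shared elements keep the order in which they first appear in L, which matches the ['b', 'a'] shown in the failure message above.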

Shorthand disjointness syntax

Though slightly abusive in terms of notation, the macro will also accept L ∩ R and L || R as shorthands for isdisjoint(L, R):

julia> @test_sets "baabaa" ∩ "moooo"
Test Passed
julia> @test_sets (1,2) || (3,4)
Test Passed

`@test_all` for vectorized tests

Basic usage

The @test_all macro is used for "vectorized" @tests. The name derives from the fact that @test_all ex will (mostly) behave like @test all(ex):

julia> a = [1, 2, 3, 4];
julia> @test all(a .< 5)
Test Passed
julia> @test_all a .< 5
Test Passed

There is one important difference: @test_all does not short-circuit when it encounters the first false value. It evaluates the full expression and checks that each element is not false, printing errors for each "individual" failure:

julia> @test_all a .< 2
Test Failed at REPL[1]:2
  Expression: all(a .< 2)
   Evaluated: false
    Argument: 4-element BitVector, 3 failures: 
              [2]: 2 < 2 ===> false
              [3]: 3 < 2 ===> false
              [4]: 4 < 2 ===> false

The failure message can be parsed as follows:

The expression all(a .< 2) evaluated to false
The argument to all() was a 4-element BitVector
There were 3 failures, i.e. elements of the argument that were false
These occurred at indices [2], [3] and [4]

Like @test, the macro performed some introspection to show an unvectorized (and color-coded) form of the expression for each individual failure. For example, the failure at index [4] was because a[4] = 4, and 4 < 2 evaluated to false.
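The information in such a message can be recovered by hand from the broadcast result. The sketch below uses only Base functions and hard-codes the comparison in the printed text; it illustrates what is being reported, not how the macro generates it.

```julia
# Locate every failing element of a vectorized comparison.
a = [1, 2, 3, 4]
mask = a .< 2                # 4-element BitVector
failing = findall(!, mask)   # indices where the comparison was false

println(length(mask), "-element ", typeof(mask), ", ", length(failing), " failures:")
for i in failing
    println("[", i, "]: ", a[i], " < 2 ===> ", mask[i])
end
```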

Why not `@testset` for ...?

One could achieve a similar effect to @test_all by using the @testset for syntax built into Test. The test @test_all a .< 2 is basically equivalent to:

@testset for i in eachindex(a)
    @test a[i] < 2
end

When the iteration is relatively simple, @test_all should be preferred for its conciseness. It avoids the need for explicit indexing, e.g. a[i], conforming with Julia's intuitive broadcasting semantics.

More importantly, all relevant information about the test failures (i.e. how many and which indices failed) is printed in a single, concise Test.Fail result rather than multiple, redundant messages in nested test sets.

The introspection goes quite a bit deeper than what @test supports, handling pretty complicated expressions:

julia> x, y, str, func = 2, 4.0, "baa", arg -> arg > 0;
julia> @test_all (x .< 2) .| isnan.(y) .& .!occursin.(r"a|b", str) .| func(-1)
Test Failed at REPL[2]:2
  Expression: all((x .< 2) .| isnan.(y) .& .!occursin.(r"a|b", str) .| func(-1))
   Evaluated: false
    Argument: (2 < 2) | isnan(4.0) & !occursin(r"a|b", "baa") | false ===> false

Note also how, since ex evaluated to a scalar in this case, the failure message omitted the summary/indexing and printed just the single failure under Argument:.

Disable color output

To disable colored subexpressions in failure messages use disable_failure_styling().

Introspection mechanics

To create individual failure messages, the @test_all parser recursively dives through the Abstract Syntax Tree (AST) of the input expression and creates/combines Python-like format strings for any of the following "displayable" forms:

:comparisons or :calls with vectorized comparison operators, e.g. .==, .≈, .∈, etc.
:calls to the vectorized negation operator .!
:calls to vectorized bitwise logical operators, e.g. .&, .|, .⊻, .⊽
:. (broadcast dot) calls to certain common functions, e.g. isnan, contains, occursin, etc.

Any (sub-)expressions that do not fall into one of these categories are escaped and collectively broadcast, so that elements can be splatted into the format string at each failing index.

Note: Unvectorized forms are not considered displayable by the parser. This is to avoid certain ambiguities with broadcasting under the current implementation. This may be changed in the future.

Example 1

julia> x, y = 2, 1;
julia> @test_all (x .< y) .& (x < y)
Test Failed at REPL[2]:2
  Expression: all((x .< y) .& (x < y))
   Evaluated: false
    Argument: (2 < 1) & false ===> false

In this example, the parser first receives the top-level expression (x .< y) .& (x < y), which it knows to display as $f1 & $f2 in unvectorized form. The sub-format strings f1 and f2 must then be determined by recursively parsing the expressions on either side of .&.

On the left side, the sub-expression x .< y is also displayable as ($f11 < $f12) with format strings f11 and f12 given by further recursion. At this level, the parser hits the base case, since neither x nor y is a displayable form. The two expressions are escaped and used as the first and second broadcast arguments, while the corresponding format strings {1:s} and {2:s} are passed back up the recursion to create f1 as ({1:s} < {2:s}).

On the right side, x < y is not displayable (since it is unvectorized) and is therefore escaped as a whole to make the third broadcast argument. The corresponding format string {3:s} is passed back up the recursion, and used as f2.

By the end, the parser has created the format string ({1:s} < {2:s}) & {3:s}, with three corresponding expressions x, y, and x < y. Evaluating and collectively broadcasting the latter results in the scalar 3-tuple (2, 1, false), which matches the dimension of the evaluated expression (false). Since this is a failure, the 3-tuple is splatted into the format string to create the part of the message that reads (2 < 1) & false.
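The final step of this walkthrough can be imitated in a few lines. In the sketch below, the hypothetical function render stands in for the format string ({1:s} < {2:s}) & {3:s}; the package's real formatting machinery is not shown in this guide and is not reproduced here.

```julia
# Imitate the "broadcast the escaped arguments, then splat" step for Example 1.
x, y = 2, 1

# Hypothetical stand-in for the format string ({1:s} < {2:s}) & {3:s}.
render(a, b, c) = "($a < $b) & $c"

# The three escaped expressions are broadcast together; all are scalars,
# so the result is a single 3-tuple rather than an array of tuples.
args = broadcast(tuple, x, y, x < y)

# The evaluated test expression, also a scalar.
result = (x .< y) .& (x < y)

# On failure, splat the tuple into the "format string".
message = result ? "" : render(args...) * " ===> $result"
println(message)
```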

Example 2

julia> x, y = [5 6; 7 8], [5 6];
julia> @test_all x .== y
Test Failed at REPL[2]:2
  Expression: all(x .== y)
   Evaluated: false
    Argument: 2×2 BitMatrix, 2 failures: 
              [2,1]: 7 == 5 ===> false
              [2,2]: 8 == 6 ===> false

Here, the top-level expression x .== y is displayable, while the two sub-expressions x and y are not. The parser creates a format string {1:s} == {2:s} with corresponding expressions x and y.

After evaluating and broadcasting, the arguments create a 2×2 matrix of 2-tuples to go with the 2×2 BitMatrix result. The latter has two false elements at indices [2,1] and [2,2], corresponding to the 2-tuples (7, 5) and (8, 6). Splatting each of these into the format string creates the parts of the message that read 7 == 5 and 8 == 6.
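The same idea extends to arrays: broadcasting tuple over the escaped arguments yields one tuple per element of the result. This is a sketch using Base functions only, with the message layout written out by hand for illustration.

```julia
# Imitate the broadcast-and-splat step for Example 2.
x, y = [5 6; 7 8], [5 6]

result = x .== y                # 2×2 BitMatrix
args = broadcast(tuple, x, y)   # 2×2 Matrix of 2-tuples, same shape as result
failing = findall(!, result)    # CartesianIndex of each false element

# One line per failure, built from the tuple stored at the failing index.
lines = ["[$(I[1]),$(I[2])]: $(args[I][1]) == $(args[I][2]) ===> $(result[I])"
         for I in failing]
foreach(println, lines)
```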

More complicated broadcasting

Expressions that involve more complicated broadcasting behaviour are naturally formatted. For example, if the expression evaluates to a higher-dimensional array (e.g. BitMatrix), individual failures are identified by their CartesianIndex:

julia> @test_all [1 0] .== [1 0; 0 1]
Test Failed at REPL[1]:2
  Expression: all([1 0] .== [1 0; 0 1])
   Evaluated: false
    Argument: 2×2 BitMatrix, 2 failures: 
              [2,1]: 1 == 0 ===> false
              [2,2]: 0 == 1 ===> false
julia> @test_all occursin.([r"a|b" "oo"], ["moo", "baa"])
Test Failed at REPL[2]:2
  Expression: all(occursin.([r"a|b" "oo"], ["moo", "baa"]))
   Evaluated: false
    Argument: 2×2 BitMatrix, 2 failures: 
              [1,1]: occursin(r"a|b", "moo") ===> false
              [2,2]: occursin("oo", "baa") ===> false

Ref can be used to avoid broadcasting certain elements:

julia> vals = [1,2,3];
julia> @test_all 1:5 .∈ Ref(vals)
Test Failed at REPL[2]:2
  Expression: all(1:5 .∈ Ref(vals))
   Evaluated: false
    Argument: 5-element BitVector, 2 failures: 
              [4]: 4 ∈ [1, 2, 3] ===> false
              [5]: 5 ∈ [1, 2, 3] ===> false
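The effect of Ref is standard Julia broadcasting behaviour and can be checked without the macro. A short sketch with Base functions:

```julia
# Ref shields `vals` from broadcasting, so each element of 1:5 is tested
# for membership in the whole vector.
vals = [1, 2, 3]
mask = in.(1:5, Ref(vals))   # same as `1:5 .∈ Ref(vals)`
failing = findall(!, mask)

println(mask)
println(failing)
```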

Keyword splicing

Like @test, @test_all will accept trailing keyword arguments that will be spliced into ex if it is a function call (possibly . vectorized). This is primarily useful to make vectorized approximate comparisons more readable:

julia> v = [3, π, 4];
julia> @test_all v .≈ 3.14 atol=0.15
Test Failed at REPL[2]:2
  Expression: all(.≈(v, 3.14, atol=0.15))
   Evaluated: false
    Argument: 3-element BitVector, 1 failure: 
              [3]: 4.0 ≈ 3.14 (atol=0.15) ===> false
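Written out by hand, the spliced call is an ordinary broadcast of isapprox with a keyword argument. The following sketch uses only Base functions:

```julia
# The spliced form of `v .≈ 3.14 atol=0.15`, written explicitly.
v = [3, π, 4]                            # promoted to Vector{Float64}
mask = isapprox.(v, 3.14; atol = 0.15)   # keyword applies to every element
failing = findall(!, mask)

println(mask)
println(failing)
```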

Splicing works with any callable function, even when it is wrapped in a negation:

julia> iszero_mod(x; p=2) = x % p == 0;
julia> @test_all .!iszero_mod.(1:3) p = 3
Test Failed at REPL[2]:2
  Expression: all(.!iszero_mod.(1:3, p = 3))
   Evaluated: false
    Argument: 3-element BitVector, 1 failure: 
              [3]: !true ===> false

General iterables

Paralleling its namesake, @test_all works with general iterables (as long as they also define length):

struct IsEven vals end
Base.iterate(x::IsEven, i=1) = i > length(x.vals) ? nothing : (iseven(x.vals[i]), i+1);
Base.length(x::IsEven) = length(x.vals)

julia> @test_all IsEven(1:4)
Test Failed at REPL[1]:2
  Expression: all(IsEven(1:4))
   Evaluated: false
    Argument: IsEven, 2 failures

If they also define keys and a corresponding getindex, failures will be printed by index:

Base.keys(x::IsEven) = keys(x.vals)
Base.getindex(x::IsEven, args...) = getindex(x.vals, args...)

julia> @test_all IsEven(1:4)
Test Failed at REPL[1]:2
  Expression: all(IsEven(1:4))
   Evaluated: false
    Argument: IsEven, 2 failures: 
              [1]: false ===> 1
              [3]: false ===> 3

Short-circuiting and iterables

Since @test_all ex does not short-circuit at the first false value, it may behave differently than @test all(ex) in certain edge cases, notably when iterating over ex has side-effects.

Consider the same IsEven iterable as above, but with an assertion that each value is non-negative:

function Base.iterate(x::IsEven, i=1)
    i > length(x.vals) && return nothing
    @assert x.vals[i] >= 0
    iseven(x.vals[i]), i+1
end
x = IsEven([1, 0, -1])

Evaluating @test all(x) will return a Test.Fail, since the evaluation of all(x) short-circuits after the first iteration and returns false:

julia> @test all(x)
Test Failed at REPL[1]:2
  Expression: all(x)

Conversely, @test_all x will return a Test.Error because it evaluates all iterations and thus triggers the assertion error on the third iteration:

julia> @test_all x
Error During Test at REPL[1]:2
  Test threw exception
  Expression: all(x)
  AssertionError: x.vals[i] >= 0
  Stacktrace: [...]
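The difference between short-circuiting and full evaluation can be observed directly by counting calls. In this sketch, check and calls are hypothetical helpers introduced for illustration, and map stands in for "evaluate every element"; it does not use the macro itself.

```julia
# Count how many elements are actually inspected.
calls = Ref(0)
check(v) = (calls[] += 1; iseven(v))   # side effect: increments the counter
vals = [1, 0, 2]

short = all(check, vals)   # stops at the first false value
calls_short = calls[]

calls[] = 0
full = map(check, vals)    # visits every element
calls_full = calls[]

println(calls_short, " call(s) with all, ", calls_full, " call(s) with full evaluation")
```

Any side effect or exception that would occur after the first false value is therefore only visible under full evaluation.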

`Missing` values

The only other major difference between @test all(ex) and @test_all ex is in how they deal with missing values. Recall that, in the presence of missing values, all() will return false if any non-missing value is false, or missing if all non-missing values are true.

Within an @test, the former will return a Test.Fail result, whereas the latter will return a Test.Error, pointing out that the return value was non-Boolean:

julia> @test all([1, missing] .== 2) # [false, missing] ===> false
Test Failed at REPL[1]:2
  Expression: all([1, missing] .== 2)
julia> @test all([2, missing] .== 2) # [true, missing] ===> missing
Error During Test at REPL[2]:2
  Expression evaluated to non-Boolean
  Expression: all([2, missing] .== 2)
       Value: missing

In the respective cases, @test_all will show the result of evaluating all(ex) (false or missing), but always returns a Test.Fail result showing individual elements that were missing along with the ones that were false:

julia> @test_all [1, missing] .== 2
Test Failed at REPL[1]:2
  Expression: all([1, missing] .== 2)
   Evaluated: false
    Argument: 2-element Vector{Union{Missing, Bool}}, 1 missing and 1 failure: 
              [1]: 1 == 2 ===> false
              [2]: missing == 2 ===> missing
julia> @test_all [2, missing] .== 2
Test Failed at REPL[2]:2
  Expression: all([2, missing] .== 2)
   Evaluated: missing
    Argument: 2-element Vector{Union{Missing, Bool}}, 1 missing: 
              [2]: missing == 2 ===> missing
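The three-valued behaviour of all() that underlies this section can be checked in Base Julia. The sketch below also counts the missing and false elements separately, which is the breakdown shown in the messages above; the variable names are illustrative.

```julia
# Three-valued logic of all() in the presence of missing values.
ex1 = [1, missing] .== 2   # [false, missing]
ex2 = [2, missing] .== 2   # [true, missing]

r1 = all(ex1)   # false: a definite false value decides the result
r2 = all(ex2)   # missing: no false value, but one value is unknown

# Per-element breakdown of the first case.
n_missing = count(ismissing, ex1)
n_false = count(x -> x === false, ex1)

println(r1, " ", r2)
println(n_missing, " missing and ", n_false, " failure")
```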

Non-Boolean values

Finally, the macro will also produce a customized Test.Error result if the evaluated argument contains any non-Boolean, non-missing values. Where all() would short-circuit and throw a Core.TypeError on the first non-Boolean value, @test_all identifies the indices of all non-Boolean, non-missing values:

julia> @test_all [true, false, 42, "a", missing]
Error During Test at REPL[1]:2
  Test threw exception
  Expression: all([true, false, 42, "a", missing])
   TypeError: non-boolean used in boolean context
    Argument: 5-element Vector{Any} with 2 non-Boolean values:
              [3]: 42 ===> Int64
              [4]: "a" ===> String
  Stacktrace: [...]
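Locating the offending elements amounts to filtering on type. A minimal sketch with Base functions, assuming the same input as above (the names bad and types are illustrative):

```julia
# Find values that are neither Bool nor missing, along with their types.
vals = Any[true, false, 42, "a", missing]

bad = findall(x -> !(x isa Bool) && !ismissing(x), vals)
types = [typeof(vals[i]) for i in bad]

for (i, T) in zip(bad, types)
    println("[", i, "]: ", repr(vals[i]), " ===> ", T)
end
```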

`Test` integrations

A core feature of PrettyTests is that its macros integrate seamlessly with Julia's standard unit-testing framework. They return one of the standard Test.Result objects defined therein, namely:

Test.Pass if the test expression evaluates to true.
Test.Fail if it evaluates to false (or missing in the case of @test_all).
Test.Error if the expression cannot be evaluated.
Test.Broken if the test is marked as broken or skipped (see below).

Broken/skipped tests

The macros also support the skip and broken keywords, with identical behavior to @test:

julia> @test_sets 1 ⊆ 2 skip=true
Test Broken
  Skipped: 1 ⊆ 2
julia> @test_all 1 .== 2 broken=true
Test Broken
  Expression: all(1 .== 2)
julia> @test_all 1 .== 1 broken=true
Error During Test at REPL[3]:2
 Unexpected Pass
 Expression: all(1 .== 1)
 Got correct result, please change to @test if no longer broken.

Working with test sets

The macros will also record the result in the test set returned by Test.get_testset(). This means that they will play nicely with testing workflows that use @testset:

julia> @testset "MyTestSet" begin
           a = [1, 2]
           @test_all a .== 1:2
           @test_all a .< 1:2 broken=true
           @test_sets a ⊆ 1:2
           @test_sets a == 1:3 skip=true
       end;
Test Summary: | Pass  Broken  Total  Time
MyTestSet     |    2       2      4  0.4s
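For comparison, the same test set can be written with plain @test, using the equivalences stated earlier in this guide (all for @test_all, issubset and issetequal for @test_sets). This sketch needs only the Test standard library and should produce the same pass and broken counts, though without the tailored failure messages.

```julia
using Test

# Plain-@test counterpart of the test set above.
ts = @testset "MyTestSet" begin
    a = [1, 2]
    @test all(a .== 1:2)
    @test all(a .< 1:2) broken=true
    @test issubset(a, 1:2)
    @test issetequal(a, 1:3) skip=true
end;
```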