Home
PrettyTests extends Julia's basic unit-testing functionality by providing drop-in replacements for Test.@test
with more informative error messages.
The inspiration for the package comes from Python and NumPy asserts, which customize their error messages depending on the type of test being performed; for example, by showing the differences between two sets that should be equal, or the number of elements that differ between two arrays.
PrettyTests
macros are designed to (a) provide clear and concise error messages tailored to specific situations, and (b) conform with the standard Test
library interface, so that they can fit into any testing workflow. This guide walks through several examples.
The package requires Julia 1.7
or higher, and can be installed in the usual way: type ]
to enter the package manager, followed by add PrettyTests
.
@test_sets for set-like comparisons
Set equality
The @test_sets
macro is used to compare two set-like objects. It accepts expressions of the form @test_sets L <op> R
, where op
is an infix set comparison operator and L
and R
are collections, broadly defined.
In the simplest example, one could test for set equality with the (overloaded) ==
operator:
julia> a, b = [2, 1, 1], [1, 2];
julia> @test_sets a == b
Test Passed
This is equivalent to the more verbose @test issetequal(a, b)
. It is also more informative in the case of failure:
julia> @test_sets a == 2:4
Test Failed at REPL[1]:2
  Expression: a == 2:4
   Evaluated: L and R are not equal.
              L ∖ R has 1 element: [1]
              R ∖ L has 2 elements: [3, 4]
The failed test message lists exactly how many and which elements were in the set differences L \ R
and R \ L
, which should have been empty in a passing test.
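The quantities in this message can be reproduced by hand with Base functions. The sketch below only illustrates what the message reports; it is not the package's implementation, and it relies on nothing beyond issetequal and setdiff.

```julia
a, b = [2, 1, 1], 2:4

# Equality as sets: duplicates and ordering are ignored.
equal = issetequal(a, b)

# The two set differences listed in the failure message.
l_minus_r = setdiff(a, b)   # elements of L that are missing from R
r_minus_l = setdiff(b, a)   # elements of R that are missing from L

println("equal: ", equal)
println("L ∖ R has ", length(l_minus_r), " element(s): ", l_minus_r)
println("R ∖ L has ", length(r_minus_l), " element(s): ", r_minus_l)
```

Both differences are empty exactly when the two collections are equal as sets.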
Note how the collections interpreted as L
and R
are color-coded (using ANSI color escape codes) so that they can be easily identified if the expressions are long:
julia> variable_with_long_name = 1:3;
julia> function_with_long_name = () -> 4:9;
julia> @test_sets variable_with_long_name ∪ Set(4:6) == function_with_long_name()
Test Failed at REPL[3]:2
  Expression: variable_with_long_name ∪ Set(4:6) == function_with_long_name()
   Evaluated: L and R are not equal.
              L ∖ R has 3 elements: [1, 2, 3]
              R ∖ L has 3 elements: [7, 8, 9]
To disable colored subexpressions in failure messages use disable_failure_styling()
.
The symbol ∅
(typed as \emptyset<tab>
) can be used as shorthand for Set()
in any set expression:
julia> @test_sets Set() == ∅
Test Passed
julia> @test_sets [1,1] == ∅
Test Failed at REPL[2]:2
  Expression: [1, 1] == ∅
   Evaluated: L and R are not equal.
              L ∖ R has 1 element: [1]
              R ∖ L has 0 elements: []
Because the macro internally expands the input expression to an issetequal
call (and uses setdiff
to print the differences), it works very flexibly with general collections and iterables:
julia> @test_sets Dict() == Set()
Test Passed
julia> @test_sets "baabaa" == "abc"
Test Failed at REPL[2]:2
  Expression: "baabaa" == "abc"
   Evaluated: L and R are not equal.
              L ∖ R has 0 elements: []
              R ∖ L has 1 element: ['c']
Subsets
Set comparisons beyond equality are also supported, with modified error messages. For example, as in Base Julia, the expression L ⊆ R
is equivalent to issubset(L, R)
:
julia> @test_sets "baabaa" ⊆ "abc"
Test Passed
julia> @test_sets (3, 1, 2, 3) ⊆ (1, 2)
Test Failed at REPL[2]:2
  Expression: (3, 1, 2, 3) ⊆ (1, 2)
   Evaluated: L is not a subset of R.
              L ∖ R has 1 element: [3]
Note how, in this case, the failure displays only the set difference L \ R
and omits the irrelevant R \ L
.
Disjointness
The form L ∩ R == ∅
is equivalent to isdisjoint
(L, R)
. In the case of failure, the macro displays the non-empty intersection L ∩ R
, as computed by intersect:
julia> @test_sets (1, 2, 3) ∩ (4, 5, 6) == ∅
Test Passed
julia> @test_sets "baabaa" ∩ "abc" == ∅
Test Failed at REPL[2]:2
  Expression: "baabaa" ∩ "abc" == ∅
   Evaluated: L and R are not disjoint.
              L ∩ R has 2 elements: ['b', 'a']
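For reference, the same check can be written out with Base functions. This is a minimal sketch of what the message reports rather than the macro's actual expansion; the strings are collected into character vectors purely to keep the example explicit.

```julia
l, r = collect("baabaa"), collect("abc")

# The test passes only when the two collections share no elements.
disjoint = isdisjoint(l, r)

# On failure, the shared elements are what gets reported.
common = intersect(l, r)

println("disjoint: ", disjoint)
println("L ∩ R has ", length(common), " element(s): ", common)
```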
Though slightly abusive in terms of notation, the macro will also accept L ∩ R
and L || R
as shorthands for isdisjoint(L, R)
:
julia> @test_sets "baabaa" ∩ "moooo"
Test Passed
julia> @test_sets (1,2) || (3,4)
Test Passed
@test_all for vectorized tests
Basic usage
The @test_all
macro is used for "vectorized" @test
s. The name derives from the fact that @test_all ex
will (mostly) behave like @test all(ex)
:
julia> a = [1, 2, 3, 4];
julia> @test all(a .< 5)
Test Passed
julia> @test_all a .< 5
Test Passed
There is one important difference: @test_all
does not short-circuit when it encounters the first false
value. It evaluates the full expression and checks that each element is not false
, printing errors for each "individual" failure:
julia> @test_all a .< 2
Test Failed at REPL[1]:2
  Expression: all(a .< 2)
   Evaluated: false
    Argument: 4-element BitVector, 3 failures:
              [2]: 2 < 2 ===> false
              [3]: 3 < 2 ===> false
              [4]: 4 < 2 ===> false
The failure message can be parsed as follows:
- The expression all(a .< 2) evaluated to false
- The argument to all() was a 4-element BitVector
- There were 3 failures, i.e. elements of the argument that were false
- These occurred at indices [2], [3] and [4]
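The pieces of this message can be recovered by hand in plain Julia. The following sketch is not how the macro is implemented; it only shows, using Base functions, where the summary and the per-index lines come from.

```julia
a = [1, 2, 3, 4]

# Evaluate the whole broadcast; nothing short-circuits here.
result = a .< 2

# Indices at which the evaluated argument is false.
failed = findall(!, result)

println(length(result), "-element ", typeof(result), ", ", length(failed), " failures:")
for i in failed
    # Unvectorized form of the expression at index i.
    println("[$i]: $(a[i]) < 2 ===> ", result[i])
end
```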
Like @test
, the macro performed some introspection to show an unvectorized (and color-coded) form of the expression for each individual failure. For example, the failure at index [4]
was because a[4] = 4
, and 4 < 2
evaluated to false
.
One could achieve a similar effect to @test_all
by using the @testset for
syntax built in to Test
. The test @test_all a .< 2
is basically equivalent to:
@testset for i in eachindex(a)
    @test a[i] < 2
end
When the iteration is relatively simple, @test_all
should be preferred for its conciseness. It avoids the need for explicit indexing, e.g. a[i]
, conforming with Julia's intuitive broadcasting semantics.
More importantly, all relevant information about the test failures (i.e. how many and which indices failed) is printed in a single, concise Test.Fail
result rather than multiple, redundant messages in nested test sets.
The introspection goes quite a bit deeper than what @test
supports, handling pretty complicated expressions:
julia> x, y, str, func = 2, 4.0, "baa", arg -> arg > 0;
julia> @test_all (x .< 2) .| isnan.(y) .& .!occursin.(r"a|b", str) .| func(-1)
Test Failed at REPL[2]:2
  Expression: all((x .< 2) .| isnan.(y) .& .!occursin.(r"a|b", str) .| func(-1))
   Evaluated: false
    Argument: (2 < 2) | isnan(4.0) & !occursin(r"a|b", "baa") | false ===> false
Note also how, since ex
evaluated to a scalar in this case, the failure message omitted the summary/indexing and printed just the single failure under Argument:
.
To disable colored subexpressions in failure messages use disable_failure_styling()
.
Introspection mechanics
To create individual failure messages, the @test_all
parser recursively dives through the Abstract Syntax Tree (AST) of the input expression and creates/combines Python
-like format strings for any of the following "displayable" forms:
- :comparisons or :calls with vectorized comparison operators, e.g. .==, .≈, .∈, etc.
- :calls to the vectorized negation operator .!
- :calls to vectorized bitwise logical operators, e.g. .&, .|, .⊻, .⊽
- :. (broadcast dot) calls to certain common functions, e.g. isnan, contains, occursin, etc.
Any (sub-)expressions that do not fall into one of these categories are escaped and collectively broadcast
, so that elements can be splatted into the format string at each failing index.
Note: Unvectorized forms are not considered displayable by the parser. This is to avoid certain ambiguities with broadcasting under the current implementation. This may be changed in future.
Example 1
julia> x, y = 2, 1;
julia> @test_all (x .< y) .& (x < y)
Test Failed at REPL[2]:2
  Expression: all((x .< y) .& (x < y))
   Evaluated: false
    Argument: (2 < 1) & false ===> false
In this example, the parser first receives the top-level expression (x .< y) .& (x < y)
, which it knows to display as $f1 & $f2
in unvectorized form. The sub-format strings f1
and f2
must then be determined by recursively parsing the expressions on either side of .&
.
On the left side, the sub-expression x .< y
is also displayable as ($f11 < $f12)
with format strings f11
and f12
given by further recursion. At this level, the parser hits the base case, since neither x
nor y
is a displayable form. The two expressions are escaped and used as the first and second broadcast arguments, while the corresponding format strings {1:s}
and {2:s}
are passed back up the recursion to create f1
as ({1:s} < {2:s})
.
On the right side, x < y
is not displayable (since it is unvectorized) and is therefore escaped as a whole to make the third broadcast argument. The corresponding format string {3:s}
is passed back up the recursion, and used as f2
.
By the end, the parser has created the format string ({1:s} < {2:s}) & {3:s}
, with three corresponding expressions x
, y
, and x < y
. Evaluating and collectively broadcasting the latter results in the scalar 3-tuple (2, 1, false)
, which matches the dimension of the evaluated expression (false
). Since this is a failure, the 3-tuple is splatted into the format string to create the part of the message that reads (2 < 1) & false
.
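The recursion described in this example can be imitated with a toy parser. The sketch below is an assumption-laden illustration, not the package's code: the names toy_format! and fill_format are hypothetical, only four dotted operators are recognized, the parenthesization rule is simplified, and the placeholders are written {1} rather than {1:s}.

```julia
# Dotted operators this toy parser treats as "displayable" (a tiny, assumed subset).
const LOGICAL = (Symbol(".&"), Symbol(".|"))
const COMPARE = (Symbol(".<"), Symbol(".=="))

# Walk the AST, returning a format string and collecting escaped sub-expressions in `args`.
function toy_format!(args, ex)
    if ex isa Expr && ex.head === :call && length(ex.args) == 3
        op = ex.args[1]
        if op in LOGICAL || op in COMPARE
            lhs = toy_format!(args, ex.args[2])
            rhs = toy_format!(args, ex.args[3])
            s = string(lhs, " ", string(op)[2:end], " ", rhs)  # drop the leading dot
            return op in COMPARE ? "($s)" : s
        end
    end
    # Base case: not displayable, so it becomes the next broadcast argument.
    push!(args, ex)
    return "{$(length(args))}"
end

# Substitute evaluated values into the numbered placeholders.
function fill_format(fmt, vals)
    for (i, v) in enumerate(vals)
        fmt = replace(fmt, "{$i}" => string(v))
    end
    return fmt
end

args = Any[]
fmt = toy_format!(args, :((x .< y) .& (x < y)))

x, y = 2, 1
vals = broadcast(tuple, x, y, x < y)   # all-scalar inputs give a single tuple
msg = fill_format(fmt, vals)
println(fmt, "  with  ", args, "  gives  ", msg)
```

The unvectorized x < y ends up as a single escaped argument, which is why it appears in the message only as its evaluated value.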
Example 2
julia> x, y = [5 6; 7 8], [5 6];
julia> @test_all x .== y
Test Failed at REPL[2]:2
  Expression: all(x .== y)
   Evaluated: false
    Argument: 2×2 BitMatrix, 2 failures:
              [2,1]: 7 == 5 ===> false
              [2,2]: 8 == 6 ===> false
Here, the top-level expression x .== y
is displayable, while the two sub-expressions x
and y
are not. The parser creates a format string {1:s} == {2:s}
with corresponding expressions x
and y
.
After evaluating and broadcasting, the arguments create a 2×2
matrix of 2-tuples to go with the 2×2 BitMatrix
result. The latter has two false
elements at indices [2,1]
and [2,2]
, corresponding to the 2-tuples (7, 5)
and (8, 6)
. Splatting each of these into the format string creates the parts of the message that read 7 == 5
and 8 == 6
.
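The "matrix of tuples" step can be reproduced with Base broadcasting. This is a sketch of the idea under the assumption that the escaped arguments are combined with broadcast(tuple, ...); the package may do this differently internally.

```julia
x, y = [5 6; 7 8], [5 6]

result = x .== y                # 2×2 BitMatrix
args = broadcast(tuple, x, y)   # 2×2 matrix of 2-tuples, same shape as result

messages = String[]
for I in findall(!, result)     # CartesianIndex of each false element
    lhs, rhs = args[I]
    idx = join(Tuple(I), ",")
    push!(messages, "[$idx]: $lhs == $rhs ===> $(result[I])")
end
foreach(println, messages)
```

Because the tuples and the result share one broadcast shape, the index of a failure can be used to look up the values that produced it.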
More complicated broadcasting
Expressions that involve more complicated broadcasting behaviour are naturally formatted. For example, if the expression evaluates to a higher-dimensional array (e.g. BitMatrix
), individual failures are identified by their CartesianIndex
:
julia> @test_all [1 0] .== [1 0; 0 1]
Test Failed at REPL[1]:2
  Expression: all([1 0] .== [1 0; 0 1])
   Evaluated: false
    Argument: 2×2 BitMatrix, 2 failures:
              [2,1]: 1 == 0 ===> false
              [2,2]: 0 == 1 ===> false
julia> @test_all occursin.([r"a|b" "oo"], ["moo", "baa"])
Test Failed at REPL[2]:2
  Expression: all(occursin.([r"a|b" "oo"], ["moo", "baa"]))
   Evaluated: false
    Argument: 2×2 BitMatrix, 2 failures:
              [1,1]: occursin(r"a|b", "moo") ===> false
              [2,2]: occursin("oo", "baa") ===> false
Ref
can be used to avoid broadcasting certain elements:
julia> vals = [1,2,3];
julia> @test_all 1:5 .∈ Ref(vals)
Test Failed at REPL[2]:2
  Expression: all(1:5 .∈ Ref(vals))
   Evaluated: false
    Argument: 5-element BitVector, 2 failures:
              [4]: 4 ∈ [1, 2, 3] ===> false
              [5]: 5 ∈ [1, 2, 3] ===> false
Keyword splicing
Like @test
, @test_all
will accept trailing keyword arguments that will be spliced into ex if it is a function call (possibly dot-vectorized). This is primarily useful to make vectorized approximate comparisons more readable:
julia> v = [3, π, 4];
julia> @test_all v .≈ 3.14 atol=0.15
Test Failed at REPL[2]:2
  Expression: all(.≈(v, 3.14, atol=0.15))
   Evaluated: false
    Argument: 3-element BitVector, 1 failure:
              [3]: 4.0 ≈ 3.14 (atol=0.15) ===> false
Splicing works with any callable function, including if it is wrapped in a negation:
julia> iszero_mod(x; p=2) = x % p == 0;
julia> @test_all .!iszero_mod.(1:3) p = 3
Test Failed at REPL[2]:2
  Expression: all(.!iszero_mod.(1:3, p = 3))
   Evaluated: false
    Argument: 3-element BitVector, 1 failure:
              [3]: !true ===> false
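To make the splicing concrete, the approximate comparison from the first example in this section can be written out by hand as a broadcast call to isapprox (the function behind ≈) with the keyword passed explicitly. This is a plain-Julia sketch of the equivalent call, not the macro's expansion.

```julia
v = [3, π, 4]

# Hand-written equivalent of `v .≈ 3.14` with `atol=0.15` spliced in.
result = isapprox.(v, 3.14; atol=0.15)

# Only the last element is farther than 0.15 from 3.14.
failed = findall(!, result)
println(result, " fails at ", failed)
```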
General iterables
Paralleling its namesake, @test_all
works with general iterables (as long as they also define length
):
struct IsEven
    vals
end
Base.iterate(x::IsEven, i=1) = i > length(x.vals) ? nothing : (iseven(x.vals[i]), i+1);
Base.length(x::IsEven) = length(x.vals)
julia> @test_all IsEven(1:4)
Test Failed at REPL[1]:2
  Expression: all(IsEven(1:4))
   Evaluated: false
    Argument: IsEven, 2 failures
If they also define keys
and a corresponding getindex
, failures will be printed by index:
Base.keys(x::IsEven) = keys(x.vals)
Base.getindex(x::IsEven, args...) = getindex(x.vals, args...)
julia> @test_all IsEven(1:4)
Test Failed at REPL[1]:2
  Expression: all(IsEven(1:4))
   Evaluated: false
    Argument: IsEven, 2 failures:
              [1]: false ===> 1
              [3]: false ===> 3
Since @test_all ex
does not short-circuit at the first false
value, it may behave differently than @test all(ex)
in certain edge cases, notably when iterating over ex
has side-effects.
Consider the same IsEven
iterable as above, but with an assertion that each value is non-negative:
function Base.iterate(x::IsEven, i=1)
    i > length(x.vals) && return nothing
    @assert x.vals[i] >= 0
    iseven(x.vals[i]), i+1
end
x = IsEven([1, 0, -1])
Evaluating @test all(x)
will return a Test.Fail
, since the evaluation of all(x)
short-circuits after the first iteration and returns false
:
julia> @test all(x)
Test Failed at REPL[1]:2
  Expression: all(x)
Conversely, @test_all x
will return a Test.Error
because it evaluates all iterations and thus triggers the assertion error on the third iteration:
julia> @test_all x
Error During Test at REPL[1]:2
  Test threw exception
  Expression: all(x)
  AssertionError: x.vals[i] >= 0
  Stacktrace: [...]
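The difference between short-circuiting and full evaluation is easy to observe in plain Julia by counting how often a predicate runs. The sketch below uses a hypothetical counting predicate named check; it illustrates the two evaluation strategies and does not use the package.

```julia
calls = Ref(0)
check(v) = (calls[] += 1; v < 2)   # side effect: counts each evaluation

a = [1, 2, 3, 4]

# all(f, itr) stops as soon as one element is false.
all(check, a)
short_circuit_calls = calls[]

# Materializing every element first mirrors evaluating the full expression.
calls[] = 0
all(map(check, a))
full_calls = calls[]

println("short-circuit: ", short_circuit_calls, " calls, full: ", full_calls, " calls")
```

Any side effect or exception tied to later elements is therefore only observed under full evaluation.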
Missing values
The only other major difference between @test all(ex)
and @test_all ex
is in how they deal with missing values. Recall that, in the presence of missing values, all()
will return false
if any non-missing value is false
, or missing
if all non-missing values are true
.
Within an @test, a false result gives a Test.Fail, whereas a missing result gives a Test.Error pointing out that the return value was non-Boolean:
julia> @test all([1, missing] .== 2) # [false, missing] ===> false
Test Failed at REPL[1]:2
  Expression: all([1, missing] .== 2)
julia> @test all([2, missing] .== 2) # [true, missing] ===> missing
Error During Test at REPL[2]:2
  Expression evaluated to non-Boolean
  Expression: all([2, missing] .== 2)
       Value: missing
In the respective cases, @test_all
will show the result of evaluating all(ex)
(false
or missing
), but always returns a Test.Fail
result showing individual elements that were missing
along with the ones that were false
:
julia> @test_all [1, missing] .== 2
Test Failed at REPL[1]:2
  Expression: all([1, missing] .== 2)
   Evaluated: false
    Argument: 2-element Vector{Union{Missing, Bool}}, 1 missing and 1 failure:
              [1]: 1 == 2 ===> false
              [2]: missing == 2 ===> missing
julia> @test_all [2, missing] .== 2
Test Failed at REPL[2]:2
  Expression: all([2, missing] .== 2)
   Evaluated: missing
    Argument: 2-element Vector{Union{Missing, Bool}}, 1 missing:
              [2]: missing == 2 ===> missing
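The three-valued behavior of all() and the counts shown in these messages can be checked directly in Base Julia. In this sketch, summarize is a hypothetical helper introduced only for illustration.

```julia
r1 = [1, missing] .== 2   # false and missing
r2 = [2, missing] .== 2   # true and missing

# all() returns false if any element is false, otherwise missing if any is missing.
all_r1 = all(r1)
all_r2 = all(r2)

# Count the missing and the strictly false elements separately.
summarize(r) = (n_missing = count(ismissing, r), n_false = count(v -> v === false, r))

println(all_r1, " ", summarize(r1))
println(all_r2, " ", summarize(r2))
```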
Non-Boolean values
Finally, the macro will also produce a customized Test.Error
result if the evaluated argument contains any non-Boolean, non-missing values. Where all()
would short-circuit and throw a Core.TypeError
on the first non-Boolean value, @test_all
identifies the indices of all non-Boolean, non-missing values:
julia> @test_all [true, false, 42, "a", missing]
Error During Test at REPL[1]:2
  Test threw exception
  Expression: all([true, false, 42, "a", missing])
  TypeError: non-boolean used in boolean context
    Argument: 5-element Vector{Any} with 2 non-Boolean values:
              [3]: 42 ===> Int64
              [4]: "a" ===> String
  Stacktrace: [...]
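Locating the offending elements is straightforward in plain Julia. The sketch below shows one way to find values that are neither Bool nor missing; it is an illustration with Base functions, not the macro's internals.

```julia
vals = Any[true, false, 42, "a", missing]

# Indices of values that are neither Bool nor missing.
bad = findall(v -> !(v isa Bool) && !ismissing(v), vals)

for i in bad
    println("[$i]: ", repr(vals[i]), " ===> ", typeof(vals[i]))
end
```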
Test integrations
A core feature of PrettyTests
is that macros integrate seamlessly with Julia's standard unit-testing framework. They return one of the standard Test.Result
objects defined therein, namely:
- Test.Pass if the test expression evaluates to true.
- Test.Fail if it evaluates to false (or missing in the case of @test_all).
- Test.Error if the expression cannot be evaluated.
- Test.Broken if the test is marked as broken or skipped (see below).
Broken/skipped tests
They also support skip
and broken
keywords, with identical behavior to @test
:
julia> @test_sets 1 ⊆ 2 skip=true
Test Broken
  Skipped: 1 ⊆ 2
julia> @test_all 1 .== 2 broken=true
Test Broken
  Expression: all(1 .== 2)
julia> @test_all 1 .== 1 broken=true
Error During Test at REPL[3]:2
 Unexpected Pass
 Expression: all(1 .== 1)
 Got correct result, please change to @test if no longer broken.
Working with test sets
The macros will also record
the result in the test set returned by Test.get_testset()
. This means that they will play nicely with testing workflows that use @testset
:
julia> @testset "MyTestSet" begin
           a = [1, 2]
           @test_all a .== 1:2
           @test_all a .< 1:2 broken=true
           @test_sets a ⊆ 1:2
           @test_sets a == 1:3 skip=true
       end;
Test Summary: | Pass  Broken  Total  Time
MyTestSet     |    2       2      4  0.4s