We have the following test cases available:
We render every flag, plus a reference image for the test case, as a 100px by 100px image.
Then we do some fancy math: a pixel-wise, color-respecting comparison of similarity between each flag and the reference.
This is then presented as a scientific discovery.
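The exact math is not spelled out here, so the following is only a minimal sketch of what such a pixel-wise, color-aware comparison could look like. It assumes images are plain 100x100 grids of (R, G, B) tuples and scores them by mean Euclidean distance in RGB space; the names `similarity` and `solid_image` are hypothetical and not taken from the project.

```python
import math

SIZE = 100  # images are 100px by 100px

# Largest possible distance between two RGB colors (black vs. white).
MAX_DIST = math.dist((0, 0, 0), (255, 255, 255))


def solid_image(color, size=SIZE):
    """Build a size x size image filled with one RGB color (illustrative helper)."""
    return [[color for _ in range(size)] for _ in range(size)]


def similarity(image_a, image_b):
    """Return a score in [0, 1]; 1.0 means every pixel has the same color.

    Assumption: similarity is 1 minus the mean per-pixel RGB distance,
    normalized by the black-to-white distance. The real project may
    weight colors differently.
    """
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same height")
    total = 0.0
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        if len(row_a) != len(row_b):
            raise ValueError("images must have the same width")
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += math.dist(pixel_a, pixel_b)  # color distance of one pixel pair
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return 1.0 - total / (count * MAX_DIST)


if __name__ == "__main__":
    reference = solid_image((255, 0, 0))
    flag = solid_image((200, 0, 0))
    print(f"similarity: {similarity(flag, reference):.3f}")
```

Ranking all flags for a test case would then just mean calling `similarity` once per flag against the reference and sorting by the score.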
An official netzunfall.de project