As hinted yesterday, this year’s postcard contest was also assessed independently by an impartial third party. I don’t know her name, so we’ll call her Machine Brain. I also don’t know what she looks like, but it’s probably something like this:
Machine Brain is an AI, specifically a ChatGPT-powered AI that used image analysis to evaluate our works. After a few failures (where the AI had trouble understanding the request), Bernard managed to refine a way for the AI to assess and judge our two works week by week alongside the five official judges.
The multi-step process was complex, but essentially Machine Brain judged each pair twice (to avoid left/right bias) and gave a score out of 2 (one point for each assessment). Each assessment was based on evaluating the paintings in four categories: colour, composition, detail and theme.
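For the curious, here is a minimal sketch of how that two-pass scoring could work. It is not Bernard's actual code. I'm assuming the second pass simply swaps which painting is on the left, and `judge_once` is a hypothetical stand-in for the real call to the AI, returning either "left" or "right".

```python
from typing import Callable, Dict

# Hypothetical judge: takes (left_image, right_image), returns "left" or "right".
# In the real experiment this would be the request sent to the AI.
JudgeFn = Callable[[str, str], str]


def judge_pair(image_a: str, image_b: str, judge_once: JudgeFn) -> Dict[str, int]:
    """Judge a pair twice, swapping sides, and award one point per pass."""
    points = {image_a: 0, image_b: 0}

    # Pass 1: A on the left, B on the right.
    first = judge_once(image_a, image_b)
    points[image_a if first == "left" else image_b] += 1

    # Pass 2 (assumed): same pair with the positions swapped.
    second = judge_once(image_b, image_a)
    points[image_b if second == "left" else image_a] += 1

    return points


def always_right(left: str, right: str) -> str:
    """A deliberately biased judge that always prefers the right-hand image."""
    return "right"


def prefers_bernard(left: str, right: str) -> str:
    """A consistent judge that picks Bernard's painting wherever it sits."""
    return "left" if left == "bernard.jpg" else "right"


print(judge_pair("bernard.jpg", "mine.jpg", always_right))
print(judge_pair("bernard.jpg", "mine.jpg", prefers_bernard))
```

A judge that only ever favours one side ends up splitting the points 1 to 1, which is why a split decision is treated as a tie, while a judge with a real preference awards both points to the same painting.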
To be expressly clear: the contest results were decided by the five official judges (and are in the previous post). AI judging was an experiment to see how closely (if at all) a machine could reproduce the opinion of five individuals. Before I get to its results, let’s go through some examples of its evaluations. In every case the text I quote below is verbatim what Machine Brain ‘said’ after she judged the paintings together:
The above was the first country, China, which the judges gave to Bernard (on the left). The AI disagreed and gave me the win with this explanation:
The left image uses simpler, less varied colors and has a less dynamic composition compared to the right. The right image has a richer color palette and a more interesting composition with the mountain shapes and bridge. Both images fit the theme of ‘china’ well, depicting the Great Wall on the left and a traditional Chinese landscape on the right, but the right captures a more quintessentially Chinese aesthetic with its mountain scenery.
Here’s Egypt, which I (on the left) very slightly won. The AI said this one was a tie, with this comment:
The left image has a more realistic color palette but lacks vibrancy, while the right image uses bold colors that catch the eye. The composition on the left is more cluttered and less focused than the right, which has a clear subject and horizon line. Both images connect to the Egypt theme, but the right image’s inclusion of the camel silhouette and pyramid gives it a slight edge in adhering to the theme.
The above reads like a win for Bernard, and it was. But as I said, the AI judged twice, and the second time it gave me the win, so overall it determined Egypt was tied.
For Japan it agreed with the judges and gave me a clear win:
The left image uses a bright palette but lacks some depth in color variation. The right image shows a bit more complexity in its use of color, creating a vibrant scene. Compositionally, the right image shows a more dynamic and engaging use of space with the torii gate framing Mount Fuji, while the left image is more simplistic and less engaging. Detail on the right is better executed with some attention to textures, though both show a naive style. Thematically, both images clearly reference Japan with Mount Fuji and cherry blossoms, but the right image adds the iconic torii gate, enhancing the theme.
And for Antarctica it also agreed with the judges and gave the win to Bernard’s depiction (on the left):
The left image has a brighter color palette, suggesting the reflection of sunlight on snow, fitting for the Antarctic theme, while the right has a more muted palette, but less variation in color intensity. The composition on the left effectively conveys depth with penguins as a focal point, whereas the right has a good central focus with the whale tail but lacks the same sense of space. Both images capture elements of Antarctica but the detailed depiction of wildlife and landscape on the left aligns slightly better with the theme.
These are amazingly nuanced comments from a machine, aren’t they?
Interestingly, Bernard also wanted to see what an AI could create if it entered the contest as an artist. This proved challenging, since it was difficult to get the AI to account for the small canvas or the inexperience of Bernard and myself. The idea was abandoned, but here are some examples of what various AI image generators created:
The above was ‘acrylic painting of Egypt by someone with only a weeks experience’.
The above was similar for Netherlands.
And that was ‘an amateur painting of France on a small canvas’. The entire image above is AI produced, including the background and brush.
On to the results! To summarize:
The five official judges gave me six wins (France, The Netherlands, Egypt, Japan, India and Australia) to Bernard’s four (China, Italy, Antarctica and Atlantis). The final tally was therefore 6 to 4 in my favour.
The ‘Machine Brain’ AI judge gave me six wins (China, France, The Netherlands, Japan, India and Atlantis), Bernard one (Antarctica) and tied three others (Italy, Egypt and Australia). Awarding 0.5 points for a tie therefore makes the AI tally 7.5 to 2.5 my way.
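If you'd like to check the arithmetic, here is a small sketch that rebuilds both tallies from the country lists above, counting 0.5 points each for a tie. The variable names and the "author" label (that's me) are my own invention for illustration. It also counts the countries where the AI picked the same outright winner as the judges.

```python
from typing import Dict

# Winner per country, copied from the two summaries above.
JUDGE_RESULTS = {
    "China": "Bernard", "France": "author", "The Netherlands": "author",
    "Italy": "Bernard", "Egypt": "author", "Japan": "author",
    "India": "author", "Australia": "author", "Antarctica": "Bernard",
    "Atlantis": "Bernard",
}
AI_RESULTS = {
    "China": "author", "France": "author", "The Netherlands": "author",
    "Italy": "tie", "Egypt": "tie", "Japan": "author",
    "India": "author", "Australia": "tie", "Antarctica": "Bernard",
    "Atlantis": "author",
}


def tally(results: Dict[str, str]) -> Dict[str, float]:
    """One point per win, half a point to each painter for a tie."""
    totals = {"author": 0.0, "Bernard": 0.0}
    for winner in results.values():
        if winner == "tie":
            totals["author"] += 0.5
            totals["Bernard"] += 0.5
        else:
            totals[winner] += 1.0
    return totals


def agreement(a: Dict[str, str], b: Dict[str, str]) -> int:
    """Number of countries where both sets of results name the same winner."""
    return sum(1 for country in a if a[country] == b[country])


print(tally(JUDGE_RESULTS))  # 6 to 4, as stated above
print(tally(AI_RESULTS))     # 7.5 to 2.5, as stated above
print(agreement(JUDGE_RESULTS, AI_RESULTS), "of", len(JUDGE_RESULTS))
```

Note that a tie never counts as agreement here, since the official judges always named a winner.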
The final judgement is up to you: did the AI agree with the official judges or not? And if not, who was the more accurate?