Leaderboard

Top Teams & Performances

🥇 CAD Team 2: 40.00%, 2 / 5 errors found, 00:09 time
🥈 CAD Team 1: 20.00%, 1 / 5 errors found, 00:11 time
🥉 CAD Team 2: 20.00%, 1 / 5 errors found, 00:17 time

Rank | Team | Accuracy | Errors Found | Errors Missed | False Positives | Total Time | Date
#1 | CAD Team 2 | 40.00% | 2 / 5 | 3 | 1 | 00:09 | Feb 5, 2026
#2 | CAD Team 1 | 20.00% | 1 / 5 | 4 | 2 | 00:11 | Feb 5, 2026
#3 | CAD Team 2 | 20.00% | 1 / 5 | 4 | 1 | 00:17 | Feb 5, 2026
#4 | CAD Team 1 | 20.00% | 1 / 5 | 4 | 3 | 00:18 | Feb 5, 2026
#5 | CAD Team 1 | 0.00% | 0 / 5 | 5 | 3 | 00:06 | Feb 5, 2026
#5 | CAD Team 1 | 0.00% | 0 / 5 | 5 | 3 | 00:06 | Feb 5, 2026
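
The ordering of the leaderboard rows is consistent with ranking by accuracy (errors found divided by total errors), with ties broken by shorter total time and identical results sharing a rank. The Python sketch below reproduces that ordering under this assumed rule. The names `Score` and `rank_scores` are hypothetical and are not taken from the application's own code, which may rank differently.

```python
from dataclasses import dataclass


@dataclass
class Score:
    team: str
    found: int            # errors correctly identified
    total: int            # errors planted in the game
    false_positives: int
    seconds: int          # total time in seconds

    @property
    def accuracy(self) -> float:
        # Accuracy as a percentage of planted errors that were found.
        return 100.0 * self.found / self.total if self.total else 0.0


def rank_scores(scores):
    """Return (rank, score) pairs.

    Assumed rule: higher accuracy first, then shorter time;
    entries with equal accuracy and time share the same rank.
    """
    ordered = sorted(scores, key=lambda s: (-s.accuracy, s.seconds))
    ranked = []
    for i, s in enumerate(ordered):
        prev = ordered[i - 1] if i > 0 else None
        if prev and (prev.accuracy, prev.seconds) == (s.accuracy, s.seconds):
            rank = ranked[-1][0]   # tie: reuse the previous rank
        else:
            rank = i + 1
        ranked.append((rank, s))
    return ranked


# The six results listed on the leaderboard.
SCORES = [
    Score("CAD Team 1", 0, 5, 3, 6),
    Score("CAD Team 1", 1, 5, 3, 18),
    Score("CAD Team 2", 1, 5, 1, 17),
    Score("CAD Team 1", 0, 5, 3, 6),
    Score("CAD Team 2", 2, 5, 1, 9),
    Score("CAD Team 1", 1, 5, 2, 11),
]
RANKED = rank_scores(SCORES)

for rank, s in RANKED:
    print(f"#{rank} {s.team} {s.accuracy:.2f}% {s.found} / {s.total} {s.seconds}s")
```

Under this rule, false positives do not affect the rank: the second-place entry has more false positives than the third-place entry but a shorter time.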

AI Performance Insights

AI Analysis Debug Info:

Scores count: 6
API Key defined: Yes
Analysis var: Empty

Error Log:
- Started processing
- API Key check: PASS
- Scores check: PASS (6 scores)
- *** ENTERED IF BLOCK ***
- Time: 09:05:14
- URL: v1beta/models/gemini-1.5-flash...
- curl_exec completed
- HTTP: 404
- ✗ Failed. Resp: {
  "error": {
    "code": 404,
    "message": "models/gemini-1.5-flash is not found for API version

Error Pattern Analysis

Aggregate data from all games showing which errors are hardest to detect

Term swap: 0% (0 found / 20 missed, 20 total)
Sign flip: 40% (2 found / 3 missed, 5 total)
Number tweak: 60% (3 found / 2 missed, 5 total)

Error Type Detection Rate Total Cases Commonly Missed Examples Commonly Found Examples
Term swap
0%
0 / 20
(20 total)
  • Expenses (was: Revenue)
  • Decrease (was: increase)
  • Non-Operating (was: operating)
None
Sign flip
40%
2 / 3
(5 total)
  • (2026). (was: 2026.)
  • (2026). (was: 2026.)
  • (2026). (was: 2026.)
  • (2026). (was: 2026.)
  • (2027) (was: 2027)
Number tweak
60%
3 / 2
(5 total)
  • 26-28% (was: 27-28%)
  • 2,169 (was: 2027)
  • Nectar389 (was: Nectar360)
  • 29-28% (was: 27-28%)
  • Nectar356 (was: Nectar360)
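
The detection rates in this section follow from dividing the errors found by the total cases for each error type. The Python sketch below recomputes them from the found and missed counts shown and lists the hardest error type first. The function name `detection_rates` and the input shape are assumptions for illustration, not the application's actual implementation.

```python
from collections import defaultdict


def detection_rates(observations):
    """Compute per-type detection rates.

    observations: iterable of (error_type, was_found) pairs,
    one pair per planted error across all games.
    Returns (error_type, percent) pairs, lowest rate first.
    """
    found = defaultdict(int)
    total = defaultdict(int)
    for error_type, was_found in observations:
        total[error_type] += 1
        if was_found:
            found[error_type] += 1
    rates = {t: round(100 * found[t] / total[t]) for t in total}
    # Lowest detection rate first, i.e. hardest to detect.
    return sorted(rates.items(), key=lambda kv: kv[1])


# Found / missed counts as shown in the section.
COUNTS = {
    "Term swap": (0, 20),
    "Sign flip": (2, 3),
    "Number tweak": (3, 2),
}

# Expand the counts into one observation per planted error.
OBSERVATIONS = [
    (name, flag)
    for name, (n_found, n_missed) in COUNTS.items()
    for flag in [True] * n_found + [False] * n_missed
]

RATES = detection_rates(OBSERVATIONS)
print(RATES)
# [('Term swap', 0), ('Sign flip', 40), ('Number tweak', 60)]
```

The 30 observations match the six games on the leaderboard at five planted errors each.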