Hot on the web: AI stupid meter

September 18, 2025

This is a free promotional post for a very clever initiative: monitoring when AI goes stupid.

Tired of AI Models Acting Up? Meet aistupidlevel.info

Ever notice how your favorite AI model (Claude, GPT, Gemini, Grok) performs flawlessly one day, then “breaks” the next, giving you refusal after refusal? You’re not alone. We’ve all seen the inconsistent performance, and even leading labs like Anthropic admit models can change over time.

That’s why someone awesome built aistupidlevel.info – an open-source tool that runs 140+ coding, debugging, and optimization tasks hourly against various models. It meticulously measures performance across correctness, stability, refusals, latency, and more, all displayed on a live leaderboard.

What makes it a game-changer?

  • Spot “Bad” AI Days: See exactly when a model has a temporary “off” period with increased refusals or slower results.
  • Real-time Pricing: Understand the true cost of using each model. Some cheap models can burn your budget with 10+ iterations just to get it right!
  • Benchmark Your Own APIs: Run the exact same tests and compare your results against our public leaderboard.
  • Completely Open Source: The code, tests, and scoring are all on GitHub. Contribute or even build your own benchmarks!
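To give a feel for how signals like correctness, refusals, and latency might be folded into a single leaderboard number, here is a minimal sketch. The weighting below is a hypothetical illustration, not the project's actual scoring formula (which is on its GitHub):

```python
from dataclasses import dataclass

@dataclass
class RunResult:
    """Outcome of one benchmark task against a model."""
    correct: bool      # did the model solve the task?
    refused: bool      # did it refuse to answer?
    latency_s: float   # wall-clock response time in seconds

def stupid_score(results: list[RunResult], max_latency_s: float = 30.0) -> float:
    """Fold correctness, refusal rate, and latency into a 0-100 score
    (higher = better). Weights here are illustrative guesses."""
    n = len(results)
    correctness = sum(r.correct for r in results) / n
    refusal_rate = sum(r.refused for r in results) / n
    avg_latency = sum(r.latency_s for r in results) / n
    speed = max(0.0, 1.0 - avg_latency / max_latency_s)
    # Hypothetical weighting: accuracy dominates, refusals are
    # penalized, and low latency earns a small speed bonus.
    return 100 * (0.6 * correctness + 0.25 * (1 - refusal_rate) + 0.15 * speed)

# Three sample runs: two solved, one refusal.
runs = [
    RunResult(correct=True,  refused=False, latency_s=3.0),
    RunResult(correct=False, refused=True,  latency_s=1.0),
    RunResult(correct=True,  refused=False, latency_s=5.0),
]
print(round(stupid_score(runs), 1))
```

Tracking such a score hourly is what lets the leaderboard surface a model's "off" days: a sudden spike in refusals or latency drags the number down immediately.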

Stop guessing about AI performance. Get the data.
