Large Language Models Benchmarks

Anthropic sets AI performance records with new Mythos 5, Fable 5 frontier models

Anthropic PBC today introduced Claude Mythos 5 and Claude Fable 5, two large language models that it says outperform the ...

Forbes

Can Quantum-Inspired AI Compete With Today’s Large Language Models?

As large language models (LLMs) continue their rapid evolution and domination of the generative AI landscape, a quieter evolution is unfolding at the edge of two emerging domains: quantum computing ...

VentureBeat

Beyond generic benchmarks: How Yourbench lets enterprises evaluate AI models against actual data

Every AI model release inevitably includes charts touting how it outperformed its competitors in this benchmark test or that evaluation matrix. However, these benchmarks often test for general ...

1 month ago

DeepSeek previews new AI model that ‘closes the gap’ with frontier models

DeepSeek says both models are more efficient and performant than DeepSeek V3.2 due to architectural improvements, and have almost "closed the gap" with current leading models, both open and closed, on ...

Quanta Magazine

To Make Language Models Work Better, Researchers Sidestep Language

Language isn’t always necessary. While it certainly helps in getting across certain ideas, some neuroscientists have argued that many forms of human thought and reasoning don’t require the medium of ...

Futurism

Large Language Models Will Never Be Intelligent, Expert Says

Are tech companies on the verge of creating thinking machines with their tremendous AI models, as top executives claim they are? Not according to one expert. We humans tend to associate language with ...

Computer Weekly

Large language models provide unreliable answers about public services, Open Data Institute finds

Popular large language models (LLMs) are unable to provide reliable information about key public services such as health, taxes and benefits, the Open Data Institute (ODI) has found. Drawing on more ...

The Economist

Top AI models underperform in languages other than English

TO GET THE most accurate answer from a large language model, make sure to prompt it in the right language. An English-speaking user asking a world-leading model what to do about swollen legs late in ...
