Encoder and Decoder Model

Google Gemma 4 12B Brings Multimodal AI to 16GB Laptops, Free Under Apache 2.0

Google Gemma 4 12B, released June 3, is an open-weight multimodal model that processes text, images, audio, and video in a ...


Context compression finally works in production: new research cuts LLM input 16x without the accuracy hit

LCLMs compress LLM context before decode — 8.8x faster at 16x compression, beating every KV cache method tested. Open-sourced by NYU and Columbia.

VentureBeat

A look under the hood of transformers, the engine driving AI model evolution

Today, virtually every cutting-edge AI product and model uses a transformer architecture. Large language models (LLMs) such as GPT-4o, LLaMA, Gemini and Claude are all transformer-based, and other AI ...


