Research 📌 MED
2025-03-12

Long Context Goes Standard: 1M+ Tokens Across Top Models

In 2025, 1-million-token context windows have become standard for frontier models, enabling analysis of entire codebases, books, and research corpora in a single prompt.

#Long Context #Context Window #Multimodal
Release 🔥 HIGH
2025-03-10

Claude Opus 4.5 Released with Extended Thinking

Anthropic released Claude Opus 4.5 featuring extended thinking mode for complex multi-step reasoning tasks, along with improved coding and analysis capabilities.

#Claude #Anthropic #Extended Thinking #Reasoning
Related model: claude-opus-4-5
Research 🔥 HIGH
2025-03-08

AI Agents Perform Complex Multi-Day Tasks Autonomously

Latest AI agent frameworks built on Claude Opus 4.5 and GPT-4o demonstrate the ability to complete complex multi-day software and research tasks with minimal human intervention.

#AI Agents #Autonomy #Agentic AI #Claude
Related models: claude-opus-4-5, gpt-4o
Benchmark 🔥 HIGH
2025-03-05

Gemini 2.5 Pro Tops LMSYS Arena Leaderboard

Google's Gemini 2.5 Pro achieved the highest Elo rating on the LMSYS Chatbot Arena, surpassing GPT-4o and Claude models in human preference evaluations.

#Gemini #Google #LMSYS #Leaderboard
Related model: gemini-2-5-pro
Business 📌 MED
2025-03-03

AI Token Usage Surpasses 10 Trillion Tokens Per Day

Global AI API usage has crossed 10 trillion tokens per day, driven by enterprise adoption and agentic workflows that chain multiple AI calls for complex tasks.

#Usage #Scale #Enterprise #Agentic AI
Research 📌 MED
2025-03-01

Multimodal AI Becomes Standard: Video Understanding Goes Mainstream

Major AI models including Gemini 2.5 Pro and GPT-4o now offer robust video understanding capabilities, enabling new use cases in content analysis, education, and accessibility.

#Multimodal #Video #Vision AI #GPT-4o
Related models: gemini-2-5-pro, gpt-4o
Business 📌 MED
2025-02-28

Google Launches Gemini for Workspace with Deep Integration

Google launched Gemini for Workspace with deep integration across Gmail, Docs, Sheets, and Meet, powered by Gemini 2.5 Pro for enhanced productivity features.

#Google #Workspace #Productivity #Gemini
Release 🔥 HIGH
2025-02-27

GPT-4.5 Brings Improved Emotional Intelligence

OpenAI released GPT-4.5 with enhanced emotional understanding and interpersonal capabilities, showing improvements in nuanced conversation and creative writing tasks.

#GPT-4.5 #OpenAI #Emotional Intelligence
Related model: gpt-4-5
Policy 🔥 HIGH
2025-02-25

National AI Safety Institutes Coordinate Global Standards

AI safety institutes from the US, UK, EU, Japan, and South Korea are coordinating on shared evaluation frameworks and minimum safety standards for frontier AI models.

#AI Safety #Policy #International #Regulation
Research 📌 MED
2025-02-22

AI Coding Assistants Handle 30% of Production Code

Major tech companies report that AI coding assistants now generate or significantly assist in writing 30% of production code, with adoption accelerating in 2025.

#Coding #GitHub #Developer Tools #Productivity
Research 📌 MED
2025-02-20

Open Source Models Close Gap with Proprietary APIs

The gap between open-source models (Llama 4, DeepSeek, Mistral) and closed proprietary APIs continues to narrow, with open models now competitive on most practical tasks.

#Open Source #Competition #Llama #DeepSeek
Business 🔥 HIGH
2025-02-18

AI Model Pricing Wars: GPT-4o mini at $0.15/M Tokens

Fierce competition has driven small/efficient model API prices to unprecedented lows, with GPT-4o mini, Claude Haiku, and Gemini Flash all competing below $0.30/M input tokens.

#Pricing #GPT-4o mini #Competition #Efficiency
Release 🔥 HIGH
2025-02-15

Meta Llama 4 Achieves Native Multimodal Capabilities

Meta released the Llama 4 series with Scout and Maverick variants, featuring native image and video understanding. Maverick claimed top performance on several multimodal benchmarks.

#Llama 4 #Meta #Multimodal #Open Source
Benchmark 📌 MED
2025-02-12

SWE-bench Verified Becomes Primary Coding Benchmark

The AI research community increasingly uses SWE-bench Verified as the gold standard for measuring practical coding ability, with real GitHub issues replacing synthetic problems.

#SWE-bench #Benchmarks #Coding #Research
Release 📌 MED
2025-02-10

Grok 3 Released with 1 Million Token Context

xAI released Grok 3 with a 1 million token context window, real-time X integration, and significant improvements on coding and math benchmarks.

#Grok 3 #xAI #Long Context #Real-time
Related model: grok-3
Business 🔥 HIGH
2025-02-05

AI Model APIs See 40% Price Reduction Industry-Wide

Following DeepSeek's disruption, major AI providers including OpenAI, Anthropic, and Google slashed API pricing by 40-70%, making AI more accessible for developers.

#Pricing #API #Competition #Accessibility
Policy 🔥 HIGH
2025-02-01

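EU AI Act's First Obligations Take Effect

The first obligations under the European Union's AI Act have taken effect: a ban on AI practices deemed to pose unacceptable risk, along with AI literacy requirements. The Act's requirements for high-risk systems deployed in critical sectors, including transparency, human oversight, and robust testing, phase in later.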

#EU AI Act #Regulation #Policy #Compliance
Research 🔥 HIGH
2025-01-30

China's AI Models Achieve Global Competitiveness

Chinese AI models including DeepSeek V3, Qwen Max, and Kimi k1.5 have demonstrated globally competitive performance, signaling a shift in the AI development landscape.

#China #DeepSeek #Qwen #Global Competition
Release 📌 MED
2025-01-28

Mistral Large 2 Excels at Multilingual Enterprise Tasks

Mistral AI released Large 2 with significantly improved multilingual support across 20+ languages, achieving top performance for European enterprise deployments.

#Mistral #Multilingual #Enterprise #European AI
Related model: mistral-large-2
Benchmark 🔥 HIGH
2025-01-25

Claude 3.7 Sonnet Sets New SOTA on SWE-bench Verified

Anthropic's Claude 3.7 Sonnet achieved 70.3% on SWE-bench Verified using a custom scaffold, establishing a new state-of-the-art for autonomous software engineering tasks.

#Claude #SWE-bench #Coding #SOTA
Related model: claude-opus-4-5
Business 📌 MED
2025-01-22

Anthropic Raises $3.5B Series E at $61.5B Valuation

Anthropic completed a $3.5 billion Series E funding round led by Lightspeed Venture Partners, valuing the company at $61.5 billion and reinforcing its position as one of OpenAI's main competitors.

#Anthropic #Funding #Valuation #Investment
Release 🔥 HIGH
2025-01-20

DeepSeek V3 Outperforms GPT-4o at 1/10th the Cost

DeepSeek released V3, a 671B-parameter mixture-of-experts (MoE) model that matches or beats GPT-4o on most benchmarks while being significantly cheaper to run via API.

#DeepSeek #V3 #MoE #Cost Efficiency
Related model: deepseek-v3
Release 🔥 HIGH
2025-01-18

DeepSeek R1 Open-Source Release Sparks Global Adoption

DeepSeek's decision to open-source R1 weights led to rapid adoption by researchers and enterprises, with hundreds of fine-tuned variants appearing within weeks.

#DeepSeek #R1 #Open Source #Community
Related model: deepseek-r1
Business 📌 MED
2025-01-13

Microsoft Invests $80B in AI Data Center Infrastructure

Microsoft announced plans to invest about $80 billion in AI data centers in its 2025 fiscal year, with more than half dedicated to US-based infrastructure to support growing AI model demand.

#Microsoft #Infrastructure #Investment #Data Centers
Benchmark 🔥 HIGH
2025-01-10

OpenAI o3 Achieves 88% on ARC-AGI

OpenAI's o3 model achieved an unprecedented 87.5% score on the original ARC-AGI benchmark (ARC-AGI-1) with high compute settings, far exceeding previous SOTA and approaching human-level performance.

#o3 #OpenAI #ARC-AGI #Reasoning
Related model: o3