Token - AI Infrastructure Intelligence Search

AMD Other 2026-07-28

AMD Unveils 6th Gen EPYC Venice, MI400 GPUs, Helios Rack for AI Inference

AMD launches 6th Gen EPYC Venice (2nm) and MI400 series GPUs, claiming a 34x token throughput improvement for the MI455X. The Helios rack solution integrates 72 MI455X GPUs with 18 EPYC CPUs via Pensando networking and ROCm software, and AMD says it offers 30% more inference tokens per dollar than competitors. Adopters include OpenAI, Meta, and others.

Huawei Other 2026-07-28

DeepSeek Claims China Will Phase Out NVIDIA in One Year with Huawei Ascend 950 and Self-Developed Tile Language

DeepSeek founder claims China will phase out NVIDIA within a year, with Huawei Ascend 950 SuperPod matching GB200/GB300 performance. DeepSeek is developing its own inference chips and Tile Language to reduce CUDA dependency, marking a critical shift from import substitution to an independent AI compute ecosystem.

Cloudflare Other 2026-07-27

Cloudflare and OpenAI Launch Research Collaboration, Introducing AI Crawler Filtering and Agentic Internet Controls

...

AMD Other 2026-07-27

AMD Launches 6th Gen EPYC Venice Processors and Helios Rack-Scale AI Solution

...

Other Other 2026-07-27

Alibaba Cloud Unveils Agent-Native Suite and Open-Source SAIL Stack to Rival CUDA

At WAIC 2026, Alibaba Cloud launched its Agent-Native cloud suite (AgentLoop, AgentTeams, AgentRun, TokenWorks), open-sourced the T-Head SAIL AI software stack, and unveiled the 2.4T-parameter Qwen 3.8-Max-Preview model and Zhenwu M890 supernode, marking a comprehensive push into agent-native cloud and open-source AI ecosystem.

AMD Other 2026-07-26

AMD Launches Helios Rack-Scale AI Platform with MI400 GPUs, Targeting Inference TCO

At Advancing AI 2026, AMD unveiled the Helios rack-scale platform, which integrates 72 MI455X GPUs and 18 EPYC Venice CPUs per rack and delivers 2.9 exaflops of FP4 inference compute with 31TB of HBM4 memory. The MI430X offers 288 TFLOPS of FP64 for HPC. AMD claims up to 30% more inference tokens per dollar vs. competitors.

Anthropic Other 2026-07-26

Anthropic Claude Opus 5 Goes GA on AWS Bedrock with 0% Prompt Injection

Anthropic launched Claude Opus 5 on AWS Bedrock across 4 regions and on Claude Platform. Auto Mode achieves 0% prompt injection in 129 browser agent tests, refuting OpenAI's claim. Priced at $5/$25 per M tokens, it offers leading performance at half the cost of Fable 5.

Microsoft Other 2026-07-26

Microsoft Launches MAI Models, Slashes GPU Costs 89%, Reducing OpenAI Dependency

Microsoft unveiled MAI-Image-2.5-Pro and MAI-Voice-2-Flash on Azure Foundry, achieving 96.8% text rendering accuracy at 8K and reducing GPU costs by 84-89% vs GPT. Integrated across Bing, PowerPoint, and Dynamics 365, the models mark a strategic shift away from OpenAI dependency. Also, NVIDIA Jetson heads to the moon for edge AI.

AMD Other 2026-07-26

AMD Helios Enters Production: 12-Stack HBM4 Outmuscles NVIDIA, UALoE Opens AI Network

AMD's second-generation Helios rack-scale AI server enters full production, featuring 72 MI455X GPUs with Samsung's exclusive 12-stack HBM4 (31TB per rack). Compared to NVIDIA's 8-stack design, it offers 50% more memory and 30% lower token cost. Microsoft Azure commits to large-scale deployment, solidifying hyperscaler dual-vendor strategy.

Anthropic Other 2026-07-26

Anthropic Launches Claude Opus 5 at Half Price, Deep AWS Integration Shifts Control

Anthropic releases Claude Opus 5 with pricing unchanged from Opus 4.8 but performance approaching Fable 5, effectively halving cost. AWS announces Claude Platform GA, deeply integrating Anthropic API into AWS IAM/billing/management, shifting control from standalone API to cloud platform.

NVIDIA Other 2026-07-25

NVIDIA and Wistron Launch US-Based GB300 Production, Shifting AI Manufacturing Landscape

NVIDIA partnered with Wistron to open a $700M factory in Texas, achieving L6 integration of the GB300 Grace Blackwell Ultra system. The first US-made unit contains 1.5M parts, weighs 2 tons, and costs $4M. This milestone accelerates US AI capex localization from 5% to 30%, reshaping the global AI supply chain.

NVIDIA Other 2026-07-24

NVIDIA Showcases Vera Rubin and AI Production Tools at SIGGRAPH 2026

...

Google Other 2026-07-24

Google Begins Gemini 4 Pre-training with 4M+ Context and Monthly Releases

Alphabet confirms start of largest pre-training run for Gemini 4, featuring 4M+ context and native multimodality with near-monthly releases. 2026 capex raised to $195-205B, Google Cloud Q2 up 82%, signaling full-stack AI acceleration.

AMD Other 2026-07-24

AMD Helios Rack Challenges NVIDIA NVLink with Open UALoE Interconnect

At Advancing AI 2026, AMD launched the Helios rack with 72 MI455X GPUs, 18 EPYC Venice CPUs, and Pensando networking, claiming 30% more inference tokens per dollar than NVIDIA NVL72. It introduced the open UALoE interconnect to break NVLink lock-in, partnering with Cerebras, Cisco, and major AI firms.

Microsoft Other 2026-07-24

AMD Helios Full-Stack AI Server Deployed on Azure, UALoE Open Standard Challenges NVLink

AMD and Microsoft Azure announce large-scale deployment of Helios full-stack AI servers, featuring 72 MI455X GPUs, 18 EPYC Venice CPUs, and Pensando DPUs, with 31TB HBM4 and 1.7PB/s memory bandwidth. Two new Azure VM series target Agentic AI and semiconductor design, marking Microsoft's multi-vendor strategy and challenging NVIDIA's dominance.

Microsoft Other 2026-07-24

Microsoft Launches MAI-Image-2.5-Pro and MAI-Voice-2-Flash, Deepening In-House AI Ecosystem

Microsoft introduces MAI-Image-2.5-Pro and MAI-Voice-2-Flash, proprietary models that now power Bing, PowerPoint, OneDrive, and Dynamics 365 in place of third-party models. Microsoft claims up to 84% GPU cost reduction and 2x faster voice, signaling a strategic shift to in-house AI.

NVIDIA Other 2026-07-23

NVIDIA-OpenAI $100B Partnership: 10GW Vera Rubin AI Factories Reshape Ecosystem

NVIDIA and OpenAI announce a strategic partnership to deploy at least 10GW of NVIDIA systems using the Vera Rubin platform (Rubin GPU, Vera CPU, HBM4, NVLink 6). NVIDIA will invest up to $100B. First facilities go online in H2 2026, powering OpenAI's next-gen models, marking the era of multi-GW AI factories.

Anthropic Other 2026-07-22

Anthropic Launches Claude Fable 5 with Classifier Routing for Sensitive Domains

On July 22, 2026, Anthropic released Claude Fable 5, a public version of its Mythos-class architecture, priced at $10/$50 per million tokens. It includes a classifier that automatically routes sensitive requests (cybersecurity, bio/chem, model distillation) back to Opus 4.8, establishing a tiered access governance model.

NVIDIA Other 2026-07-22

NVIDIA Reveals Vera Rubin GPU and Vera CPU: 3360B Transistors, 88-Core Olympus, 10x Agentic AI Efficiency

NVIDIA fully discloses Vera Rubin GPU and Vera CPU specifications. The GPU features 3360B transistors, HBM4 288GB, and 10x agentic AI efficiency over Blackwell. The CPU has 88 custom Olympus cores, delivering 2.2x faster agentic AI performance than Intel Sapphire Rapids. This solidifies NVIDIA's full-stack strategy against x86 incumbents.

Meta Other 2026-07-22

Meta Develops Switchboard AI Model Router to Control Inference Costs and Ecosystem

Meta's internal incubator AAI Labs is developing Switchboard, an AI model router that analyzes task complexity and routes to the most suitable model to minimize inference costs. Initially applied to internal AI coding agents, it may later become a commercial service, positioning Meta as a control layer for AI inference.
