AI Infrastructure - AI Infrastructure Intelligence Search

Microsoft Other 2026-07-12

Microsoft Takes Over OpenAI's Arctic Data Center, Seizing AI Compute Control

Microsoft leases a data center in Norway's Arctic Circle from Nscale, deploying 30,000 NVIDIA Vera Rubin GPUs, filling the gap left by OpenAI's retreat. OpenAI slashes its 2030 infrastructure budget from $140B to $60B. Microsoft surpasses OpenAI in AI compute capacity and gains geographical redundancy.

Meta Other 2026-07-12

Meta Invests $9.17B in Canada AI Data Center; Iris AI Chip Mass Production Kicks Off MTIA Roadmap

Meta announced a $9.17B AI data center in Canada with 1GW capacity, and its first in-house AI chip, Iris, will enter mass production in September, kicking off the four-generation MTIA roadmap. Meta targets 14GW of compute by 2027 and plans 6-month chip iterations to challenge NVIDIA's annual cadence and reduce GPU dependency.

Huawei Other 2026-07-10

Huawei Ascend 10K-Card Cluster Goes Live, UnifiedBus Protocol Pools All Resources

Huawei launched an Ascend 10,000-card AI cluster in Shaoguan, Guangdong, and showcased the Atlas 950 SuperPoD with its proprietary UnifiedBus interconnect supporting 8,192 NPUs at 16.3 PB/s. Huawei Cloud also entered the Gartner 2026 Cloud AI Infrastructure Leaders quadrant, reinforcing its push for a self-contained AI ecosystem.

Amazon Other 2026-07-07

AWS Boosts Trainium3 ASIC Shipments, Accelerating Custom AI Chip Ecosystem Against NVIDIA

Amazon AWS has notified its supply chain to increase Q3 2026 shipments of Trainium3-based ASIC servers by 20-30%. This reflects growing confidence in its custom AI chips and a strategic push to reduce reliance on NVIDIA GPUs. AWS also partnered with OpenAI to develop a Stateful Runtime Environment on Bedrock.

NVIDIA Other 2026-07-06

NVIDIA Kyber NVL144 Delayed to 2028: Midplane PCB Manufacturing Becomes AI Scaling Bottleneck

SemiAnalysis reports that NVIDIA's Kyber NVL144 has been delayed by more than 12 months, to 2028, because of manufacturing challenges with its 78-layer orthogonal backplane. The interim NVL72x2 solution has been cancelled due to operational burdens, and the 4-die Rubin Ultra has also been scrapped, leaving a gap in NVIDIA's scaling roadmap.

AMD Other 2026-07-04

AMD Notifies AIB Partners of Roughly 10% Price Increase on GPU Die and GDDR Bundle Kits

...

Meta Other 2026-07-03

Meta Admits AI Agent Stagnation, Plans to Sell Compute to Challenge Cloud Triopoly

Meta CEO Mark Zuckerberg admits that AI agent development is behind schedule, pushing the ROI timeline to 3-6 months. Concurrently, Meta plans to sell AI compute and model access externally, directly challenging the cloud oligopoly of AWS, Azure, and GCP and signaling a pivot from building AI infrastructure for internal use to operating as a commercial cloud provider.

Amazon Other 2026-07-02

AWS Invests $1B in AI Unit: Field Engineers Lock In Customers, Reshaping Cloud Ecosystem

AWS announces a $1B investment in a new AI unit whose thousands of field engineers will be embedded directly in customers' business, R&D, and security teams. It promises full AI system delivery within weeks, along with self-sustaining ops teams. This first-of-its-kind hyperscaler service aims to deepen customer lock-in through labor-intensive deployment.

Google Other 2026-06-29

Google Caps Meta's Gemini Access: AI Compute Bottleneck Reshapes Cloud Ecosystem

Google restricts Meta's access to the Gemini API because of a compute capacity shortage, delaying Meta's AI projects. The move reveals that even with custom TPUs and massive data centers, Google cannot meet surging demand, forcing the industry to reassess AI compute allocation and supply chain resilience.

Samsung Electronics Other 2026-06-29

Samsung and SK Hynix Announce $300B Investment to Dominate AI Memory and Foundry

Samsung and SK Hynix announce a 10-year, 1,000 trillion won investment plan to expand HBM4 production, improve 3nm GAA yield, and build new AI chip fabs. This aims to cement their HBM duopoly and close the gap with TSMC in advanced foundry, reshaping global AI infrastructure supply chain costs.

OpenAI Other 2026-06-26

OpenAI and Broadcom Tape Out First Inference ASIC Jalapeño in 9 Months, Targeting NVIDIA Dominance

OpenAI and Broadcom unveil Jalapeño, their first custom inference ASIC, fabricated on TSMC 3nm and optimized for Transformer models. Targeting a 50% inference cost reduction, it taped out in 9 months and is slated for deployment in gigawatt-scale data centers by late 2026, marking OpenAI's strategic pivot to full-stack AI infrastructure and a direct challenge to NVIDIA's inference hegemony.

Qualcomm Other 2026-06-26

Qualcomm Acquires Modular for $3.9B, Open-Sources Mojo to Break CUDA Lock-In

Qualcomm acquires Modular for $3.9B in stock and open-sources Mojo, a Python-compatible systems language. Mojo targets CUDA dependency, aiming to provide a high-performance alternative for AI developers. This move strengthens Qualcomm's AI inference chip software stack and edge AI competitiveness.

Intel Other 2026-06-23

Intel at Computex 2026: CPU as Agentic AI Orchestrator, x86 Reclaims Inference Control

At Computex 2026, Intel unveiled the 288-core Xeon 6+ (Intel 18A) and 3rd-gen Core Ultra, claiming that agentic AI shifts the CPU:GPU ratio from 1:8 to 1:1. Partnering with SambaNova and Foxconn on rack-scale inference systems, Intel repositions the CPU as the orchestrator for multi-step AI reasoning, aiming to reclaim control from GPU-centric architectures.

Cloudflare Other 2026-06-22

Cloudflare AI Gateway 2.0: Edge Control Plane Captures AI Inference Routing and Security

Cloudflare launches AI Gateway 2.0, featuring smart routing across 50+ model providers (with a claimed 30% cost reduction), Workers AI edge inference (<10ms latency), an NVIDIA GPU acceleration partnership, and an expanded AI firewall. This shifts the AI traffic control plane from centralized clouds to the edge network.

Hewlett Packard Enterprise Other 2026-06-22

HPE ProLiant DL394 Gen12 with NVIDIA Vera CPU: ARM Takes on x86 in AI

HPE unveils the ProLiant DL394 Gen12 server, powered by the NVIDIA Vera CPU, at Computex 2026, with shipments starting in fall 2026. Vera, NVIDIA's Arm-based datacenter CPU and successor to Grace, is in mass production and delivers 1.8x the AI workload performance of x86. Early customers include OpenAI, Anthropic, xAI, and others. HPE continues to offer GreenLake as-a-service while also providing Intel Xeon 6+ options.

Apple Other 2026-06-22

Apple Expands Private Cloud Compute to Google Cloud with NVIDIA Confidential GPUs

At WWDC 2026, Apple expands Private Cloud Compute (PCC) to Google Cloud, leveraging NVIDIA GPU Confidential Computing for secure AI inference. This marks a strategic shift from Apple-owned data centers to third-party cloud and comes alongside M6 Neural Engine performance gains.

Intel Other 2026-06-22

Intel Launches Xeon 6+ with 288 Cores, Reclaims AI Control Plane

At Computex 2026, Intel unveils Xeon 6+ (288 E-cores, 576MB L3, 18A process), the Ethernet 800 E835 controller (200GbE), and its next-gen GPU, Crescent Island. It also announces partnerships with SambaNova and Foxconn for rack-scale AI. The strategy positions Xeon as the control plane for agentic AI.

Microsoft Azure Other 2026-06-21

Microsoft Azure Debuts Blackwell Ultra AI Supercomputer, Training-as-a-Service Reshapes Ecosystem

Microsoft Azure launched an AI supercomputer cluster powered by NVIDIA Blackwell Ultra GPUs, delivering over 200 exaflops of AI compute. It introduced AI Training as a Service for on-demand model training and partnered with OpenAI to deploy GPT-6 training clusters by 2027. Liquid cooling achieves a PUE of 1.08, positioning Azure as the premier cloud for trillion-parameter models.

NVIDIA Other 2026-06-18

NVIDIA Acquires Kumo AI for $400M: Expanding from GPU Compute to Structured Data Prediction

NVIDIA acquires Kumo AI for over $400M, adding graph neural network and time series analysis for enterprise predictions like churn and inventory optimization. This extends NVIDIA from GPU compute into enterprise data intelligence, complementing HPE partnerships for AI factory solutions, Vera CPU architecture, and agentic AI validated designs.

AMD Other 2026-06-16

AMD Critical RCE Vulnerability Disclosed After 124 Days, Sparks AI Infrastructure Security Crisis

Security researcher mr.bruh publicly disclosed a critical remote code execution (RCE) vulnerability in AMD processors after 124 days without a fix, with AMD refusing a $10,000 bounty. The flaw affects AI servers running AMD EPYC CPUs and Instinct accelerators and has been likened to a Log4j moment for AI infrastructure, forcing enterprises to reassess chip-level security response and supply chain risk.
