TPU - AI Infrastructure Intelligence Search

Other Other 2026-07-12

WhiteFiber and DriveNets Achieve 111.2 Tbps Cross-DC AI Fabric, Breaking Power Constraints

WhiteFiber announces Project Redwood, which combines DriveNets' Ethernet AI fabric (FSE, VOQ, deep buffers), WEKA storage, and NVIDIA H200 GPUs to achieve 111.2 Tbps of bandwidth and 0.9 ms latency over 83 km of dark fiber, treating two geographically separated GPU clusters as a single logical supercluster. Commercialization is planned for Q3 2026.

Microsoft Other 2026-07-12

Microsoft Takes Over OpenAI's Arctic Data Center, Seizing AI Compute Control

Microsoft leases a data center in Norway's Arctic Circle from Nscale, deploying 30,000 NVIDIA Vera Rubin GPUs, filling the gap left by OpenAI's retreat. OpenAI slashes its 2030 infrastructure budget from $140B to $60B. Microsoft surpasses OpenAI in AI compute capacity and gains geographical redundancy.

Anthropic Other 2026-07-12

Anthropic Locks 3.5GW TPU Compute with Broadcom, Signaling Shift to Custom AI ASICs

Broadcom's Q2 FY2026 filing reveals a 3.5GW TPU compute deal with Anthropic starting 2027. This marks a strategic shift from general-purpose GPUs to custom ASICs for AI workloads, with OpenAI and Meta making similar multi-GW commitments, signaling a fundamental change in AI infrastructure.

Amazon Other 2026-07-10

AWS Sells Trainium 3 Externally, Challenging NVIDIA's AI Training Chip Dominance

AWS begins external sales of its Trainium 3 AI training chip, fabricated on TSMC 3nm process, delivering 2.52 PFLOPS per chip. Early customers include Anthropic and Uber. This move directly challenges NVIDIA's dominance and marks AWS's strategic shift from cloud provider to chip vendor.

AMD Other 2026-07-10

Towards Feature Complete Triton Support in JAX-Triton - ROCm Blogs

...

NVIDIA Other 2026-07-07

NVIDIA Denies Kyber NVL144 Delay, But 78-Layer PCB Bottleneck Exposes AI Hardware Physics Limit

NVIDIA officially denies reports of Kyber NVL144 rack delay to 2028, but SemiAnalysis revelations about a 78-layer ultra-high-density PCB midplane bottleneck and Rubin Ultra cancellation expose hard physical limits in signal integrity and manufacturing, opening a strategic window for AMD and Google.

NVIDIA Other 2026-07-06

NVIDIA Kyber NVL144 Delayed to 2028: Midplane PCB Manufacturing Becomes AI Scaling Bottleneck

SemiAnalysis reveals that NVIDIA's Kyber NVL144 is delayed by more than 12 months, to 2028, due to manufacturing challenges with its 78-layer Orthogonal Backplane. The interim NVL72x2 solution is cancelled due to operational burdens, and the 4-die Rubin Ultra is also scrapped, leaving a product gap in NVIDIA's scaling roadmap.

Anthropic Other 2026-07-06

Anthropic Starts Custom AI Chip Development, Talks Samsung 2nm, Aims for Compute Independence

Anthropic has initiated its own AI chip development and is in talks with Samsung for 2nm foundry services. The move aims to reduce reliance on NVIDIA GPUs, optimize inference costs, and strengthen its technology moat ahead of a potential IPO. It joins OpenAI, Google, and others in the custom ASIC race, signaling a shift from software to hardware competition.

TSMC Other 2026-07-01

Etched Unveils Sohu Transformer ASIC: Claims 20x H100 Inference Throughput, Challenging NVIDIA's Grip

AI chip startup Etched emerges from stealth with Sohu, a Transformer-specific ASIC on TSMC N4P with 144GB HBM3E. By hardwiring attention mechanisms, it claims 20x throughput and 140x price-performance vs. H100 on Llama 70B. With $800M total funding and first racks shipping this summer, it directly challenges NVIDIA's inference dominance.

Amazon Other 2026-06-30

AWS and Google Open Custom AI Chips for External Sales, ASIC Shipment Growth Surpasses GPU, TCO Inflection Point Reached

In Q2 2026, AWS Trainium and Google TPU are commercialized externally for the first time. Custom ASIC shipment growth of 44.6% surpasses GPU's 16.1%. ASIC TCO advantage reaches 40-65% for large-scale inference; Midjourney cut monthly compute cost from $2.1M to $0.7M after migrating to TPU. This marks a structural inflection point in AI compute.

Google Other 2026-06-29

Google Caps Meta's Gemini Access: AI Compute Bottleneck Reshapes Cloud Ecosystem

Google restricts Meta's access to Gemini API due to compute capacity shortage, delaying Meta's AI projects. This reveals that even with custom TPUs and massive data centers, Google cannot meet surging demand, forcing the industry to reassess AI compute allocation and supply chain resilience.

OpenAI Other 2026-06-26

OpenAI and Broadcom Tape Out First Inference ASIC Jalapeño in 9 Months, Targeting NVIDIA Dominance

OpenAI and Broadcom unveil Jalapeño, their first custom inference ASIC, fabricated on TSMC 3nm and optimized for Transformer models. Targeting a 50% inference cost reduction, it taped out in 9 months and is slated for deployment in gigawatt-scale data centers by late 2026, marking OpenAI's strategic pivot to full-stack AI infrastructure and a direct challenge to NVIDIA's inference hegemony.

Huawei Other 2026-06-25

Huawei Pushes Token-Based Billing at MWC Shanghai 2026: Shifting Carrier Monetization from Bytes to AI Inference Value

At MWC Shanghai 2026, Huawei urged carriers to shift from byte-based to token-based billing for AI workloads, showcasing a 372% token throughput improvement in long-sequence inference via its AI Inference Acceleration Solution. It also highlighted the Upper-6 GHz band as critical for AI wearables requiring 20 Mbps uplink, aiming to reposition 5G-A networks as AI compute delivery infrastructure.

Anthropic Other 2026-06-25

Anthropic Alleges Largest AI Distillation Attack by Alibaba-Linked Operators, Exposing API Security Gaps

Anthropic alerted U.S. senators that Alibaba-linked operators conducted the largest known distillation attack, generating 28.8 million model exchanges via 25,000 fraudulent accounts to harvest Claude's frontier capabilities. The incident exposes a critical vulnerability in AI API security, forcing a rethinking of inference endpoint protection and usage monitoring.

Google Cloud Other 2026-06-25

Google Cloud Multi-Agent Architecture Shifts Control from Human to Autonomous Verification

Google Cloud introduces agent-scale data management with multi-agent verification to reduce human oversight, and deploys six Gemini agents with Nokia for autonomous network operations. Separately, Amazon plans to commercialize Trainium chips, intensifying AI hardware competition against Google TPU and NVIDIA GPU.

OpenAI Other 2026-06-25

OpenAI and Broadcom Unveil Jalapeño Inference ASIC, Reshaping AI Hardware Landscape

OpenAI, in collaboration with Broadcom, has developed Jalapeño, a custom LLM inference accelerator. The chip uses a multi-chip module with HBM3E memory and achieved tape-out in just nine months. Designed for OpenAI's model stack, it aims to reduce inference costs and dependency on NVIDIA GPUs, with initial deployment planned for late 2026.

MediaTek Other 2026-06-23

MediaTek Lands Exclusive Google TPU v9 Inference Upgrade Triggerfish with 2x SRAM

Google plans a TPU v9 inference upgrade, Triggerfish, exclusively fabbed by MediaTek. It features 2-3x on-chip SRAM, HBM4E DRAM, and a simulation die for local management. Production starts in late 2027, with a lifecycle volume of 1-2M units and a unit price about 30% higher than Humufish.

MediaTek Other 2026-06-23

Google TPU v9 Switches to MediaTek, Breaking Broadcom's AI ASIC Monopoly

Google moves its TPU v9 Humufish design and integration contract from Broadcom to MediaTek, which handles I/O chip design and packaging. Combined with a split-foundry strategy (TSMC N2 compute, Samsung 2nm I/O), this marks a systematic effort to build a multi-vendor, multi-node supply chain, directly dismantling Broadcom's dominance in custom AI ASICs.

ARM Other 2026-06-23

Arm servers capture >45% data center revenue, x86 ecosystem under AI-driven assault

IDC reports that Q1 2026 global server revenue hit a record $122.6B, with Arm-based servers capturing a >45% share (x86 at 52%). Accelerated servers (GPU/ASIC/FPGA) generated >70% of revenue. NVIDIA's Grace CPU (NVL72) and hyperscaler custom Arm chips drive the shift; x86 still leads in unit volume but faces supply constraints.

ASML Other 2026-06-23

ASML CEO Validates Musk's Terafab, Reshaping AI Chip Supply Chain

ASML's CEO publicly acknowledges tracking Elon Musk's planned terawatt-scale AI supercomputer, Terafab, comparing it to Korean DRAM megaprojects. This signals that the sole EUV lithography supplier is allocating capacity, potentially transforming the AI chip supply chain and vertical integration.
