Agentic AI - AI Infrastructure Intelligence Search

NVIDIA Other 2026-07-08

NVIDIA Rigel Core: Single-Threaded CPU as the New Control Plane for Agentic AI

NVIDIA unveils Rosa CPU architecture with custom Rigel core (Arm v9.2), targeting single-threaded performance for Agentic AI workloads, paired with Feynman GPU (1.6nm, 50 PFLOPS) in 2028. This shifts CPU design from core-count scaling to serial-latency optimization, directly challenging AMD EPYC and Intel Xeon dominance.

NVIDIA Other 2026-07-07

NVIDIA Vera CPU: Max Single-Threaded Performance at Scale for Agentic AI

NVIDIA launches Vera CPU, a max single-threaded CPU at scale for agentic AI. With Olympus cores delivering 1.8x sustained per-core performance over x86, 1.2TB/s LPDDR5X bandwidth, and 3.4TB/s core-to-core bandwidth, Vera integrates into NVIDIA's unified AI factory architecture, aiming to lock users into its ecosystem.

NVIDIA Other 2026-07-07

AI Innovators Adopt NVIDIA Vera — Why Max Single-Threaded CPU at Scale Matters

...

Huawei Other 2026-06-25

Huawei Pushes Token-Based Billing at MWC Shanghai 2026: Shifting Carrier Monetization from Bytes to AI Inference Value

At MWC Shanghai 2026, Huawei urged carriers to shift from byte-based to token-based billing for AI workloads, showcasing a 372% token throughput improvement in long-sequence inference via its AI Inference Acceleration Solution. It also highlighted the Upper-6 GHz band as critical for AI wearables requiring 20 Mbps uplink, aiming to reposition 5G-A networks as AI compute delivery infrastructure.

OpenAI Other 2026-06-25

Oracle Defense Ecosystem Cohort 3: Offline AI on Roving Edge Devices Goes Operational

Oracle announced the third cohort of its Defense Ecosystem at the Brussels summit, adding 10 companies. Concurrently, Whitespace's Saga AI system deployed on Oracle Roving Edge Devices during Royal Navy's Operation HIGHMAST, running classified AI workloads completely offline, proving sovereign edge AI is operational.

NVIDIA Other 2026-06-25

Qualcomm Dragonfly: 250-core CPU, HBC memory, UALink interconnects target AI inference TCO

Qualcomm unveils full data center portfolio: Dragonfly C1000 250-core Oryon CPU (>5GHz, PCIe Gen7, CXL), HBC near-memory compute (133TB/s Gen1, 18x-54x effective BW), AI300 inference accelerator (UALink/ESUN scale-up), and 800G/1.6T connectivity. Multi-year Meta CPU deal. Commercial sampling 2027-2028. Targets inference TCO with tokens-per-watt leadership.

OpenAI Other 2026-06-25

OpenAI and Broadcom Unveil Jalapeno Inference ASIC, Reshaping AI Hardware Landscape

OpenAI, in collaboration with Broadcom, has developed Jalapeno, a custom LLM inference accelerator. The chip uses a multi-chip module with HBM3E memory and achieved tape-out in just nine months. Designed for OpenAI's model stack, it aims to reduce inference costs and dependency on NVIDIA GPUs, with initial deployment planned for late 2026.

NVIDIA Other 2026-06-24

NVIDIA and AWS Default GPU Vector Search with cuVS, G7 Instances Deliver 4.6x Inference

NVIDIA and AWS collaborate to embed cuVS as default GPU-accelerated vector search in OpenSearch Serverless, delivering 10x faster indexing at 1/4 cost. New EC2 G7 instances with RTX PRO 4500 Blackwell GPUs achieve up to 4.6x inference performance. AWS achieves GB300 Exemplar Cloud status for training.

Nokia Other 2026-06-24

Nokia, Amazon Web Services expand collaboration to deliver autonomous networks built for the AI era

...

NVIDIA Other 2026-06-23

NVIDIA Launches Agent Toolkit: Nemotron Models, OpenShell Runtime for Specialized AI Agents

NVIDIA unveils Agent Toolkit, an open modular foundation with Nemotron models, NemoClaw blueprints, and OpenShell runtime, enabling enterprises to build secure, specialized AI agents. It targets life sciences, cybersecurity, and industrial workflows, aiming to turn frontier models into domain-specific digital coworkers.

NVIDIA Other 2026-06-23

NVIDIA's AI Agents and Digital Twins Reshape Telecom Network Control Plane

At DTW Ignite 2026, NVIDIA showcases its AI agent platform integrating NeMo synthetic data, NemoClaw secure runtime, OpenShell sandbox, and RTX PRO 6000-accelerated digital twins, aiming for autonomous telecom operations. Partners include SoftBank, Amdocs, NTT DATA, etc., moving from task automation to full autonomy.

ARM Other 2026-06-23

Arm servers capture >45% data center revenue, x86 ecosystem under AI-driven assault

IDC reports Q1 2026 global server revenue hit a record $122.6B, with Arm-based servers capturing >45% share (x86 at 52%). Accelerated servers (GPU/ASIC/FPGA) generated >70% revenue. Nvidia's Grace CPU (NVL72) and hyperscaler custom Arm chips drive the shift; x86 still leads in unit volume but faces supply constraints.

Intel Other 2026-06-23

Intel at Computex 2026: CPU as Agentic AI Orchestrator, x86 Reclaims Inference Control

At Computex 2026, Intel unveiled the 288-core Xeon 6+ (Intel 18A) and 3rd-gen Core Ultra, claiming Agentic AI shifts CPU:GPU ratio from 1:8 to 1:1. Partnering with SambaNova and Foxconn for rack-scale inference systems, Intel repositions the CPU as the orchestrator for multi-step AI reasoning, aiming to reclaim control from GPU-centric architectures.

NVIDIA Other 2026-06-22

Dell PowerEdge XE8812: Liquid-Cooled Density Trap with NVIDIA Vera Rubin NVL4

Dell launches PowerEdge XE8812 with NVIDIA Vera Rubin NVL4, delivering 144 GPUs per rack, 300kW+ power, and 100% direct liquid cooling. It offers a generational leap in memory and compute density for HPC and AI, but deeply locks users into Dell's PowerRack, iDRAC, and ORv3 ecosystem from chip to rack.

NVIDIA Other 2026-06-22

NVIDIA JUPITER Validates Grace Hopper: Exascale Science Goes Production

Europe's first exascale supercomputer JUPITER, powered by NVIDIA Grace Hopper Superchips and Quantum-X800 InfiniBand, achieves breakthroughs in brain mapping at cellular scale, 1km-resolution climate simulation, 6G AI, and 50-qubit quantum simulation, proving exascale is production-ready.

Intel Other 2026-06-22

Intel Launches Xeon 6+ with 288 Cores, Reclaims AI Control Plane

Intel unveils Xeon 6+ (288 E-cores, 576MB L3, 18A process), Ethernet 800 E835 controller (200GbE), and next-gen GPU Crescent Island at Computex 2026. Partnerships with SambaNova and Foxconn for rack-scale AI. Strategy: Xeon as the control plane for Agentic AI.

Amazon Other 2026-06-21

AWS Seizes Agent Control Plane with MCP Gateway and AgentCore

AWS launches managed web search for Bedrock AgentCore, autonomous agents in Amazon Quick, subagent MicroVM orchestration with LangChain, and MCP Gateway, shifting enterprise AI agents from prototypes to governed infrastructure with cloud-native control planes and execution isolation.

Cisco Other 2026-06-19

Cisco Acquires WideField: Injecting Identity Session Intel into Splunk’s Agentic SOC to Win the AI Agent Security Control Plane

Cisco announces intent to acquire WideField Security to embed identity and session intelligence into Splunk's Agentic SOC. The move targets the new security risks from AI agents and non-human identities operating at machine speed, using deterministic data pipelines and session-level signals for evidence-backed autonomous response, strengthening the trust layer within the Cisco Data Fabric.

AMD Other 2026-06-18

AMD MEXT Acquisition Turns NAND Flash into DRAM-Class Memory, Halving AI Inference Cost

AMD acquires MEXT, whose technology makes cheap NAND flash behave like expensive DRAM, doubling to quadrupling usable memory capacity while halving costs. This targets inference and agentic AI memory bottlenecks. AMD also signs a 30MW AI compute deployment deal with Rackspace, rolling out from 2026 to 2028.

Microsoft Other 2026-06-18

Microsoft Shifts Copilot Cowork to Usage-Based Pricing, Eyes DeepSeek for Cost-Efficiency

Microsoft transitions Copilot Cowork to usage-based billing (Copilot Credits) and considers integrating fine-tuned DeepSeek V4 or open-source models as low-cost alternatives, hosted on Azure. This move addresses high costs from intensive usage and signals a multi-model strategy.

Reports

Filter