
Nvidia, the world's largest artificial intelligence chip company, is expected to unveil a dedicated inference chip at its flagship technology conference GTC 2026, industry observers say. The new chip could reshape the AI semiconductor competitive landscape as rivals Google and Amazon have developed their own inference-specialized AI chips.
According to Nvidia, GTC 2026 will be held from March 16 to 19 in San Jose, California. Approximately 30,000 participants from 190 countries are expected to attend more than 1,000 sessions, both online and in person.
Industry watchers believe Nvidia is likely to reveal a new inference-focused chip that could shift the AI competition paradigm. The Financial Times reported Monday, citing sources, that Nvidia plans to unveil a new chip focused on inference rather than model training at this year's GTC.
The inference chip would be Nvidia's first product following its $20 billion acquisition of Groq, the largest deal in the company's history. Late last year, Nvidia absorbed the inference chip developer by acquiring its core technology and talent. Founded in 2016 by engineers who developed Google's Tensor Processing Unit, Groq has been developing Language Processing Units designed to accelerate AI processing.
Analysts say Nvidia's new chip could be a game-changer as major tech giants accelerate development of their own inference-specialized AI chips. While Nvidia has dominated AI data centers with its graphics processing units, the company faces challenges as the market shifts toward agentic AI. In agentic AI applications, inference matters more than training, and GPUs have been criticized as inefficient for inference because of their high cost and power consumption. The dedicated inference chip expected to be unveiled could address these weaknesses. The FT said Nvidia's new chip "will be central to a product lineup designed to fend off rivals' challenges and address emerging AI demand."
Whether Nvidia will reveal its next-generation GPU succeeding Rubin is also drawing attention. Last year, Nvidia announced a GPU development roadmap: Rubin in 2026, Rubin Ultra in 2027, and Feynman in 2028. Given that previous GTC events have offered hints about successor models, features of Feynman could be previewed this time.
Some observers expect Nvidia may also unveil a central processing unit optimized for agentic AI. A CPU-only server rack capable of developing AI agents without GPUs could be displayed. While AI accelerators typically combine CPUs and GPUs—such as Grace-Blackwell or Vera-Rubin configurations—Nvidia appears to be developing CPU-centric data center servers reflecting the elevated importance of CPUs in the agentic AI era. This could pose a significant threat to Intel and AMD, which have dominated the server CPU market.
Competition between Nvidia's memory chip suppliers Samsung Electronics and SK Hynix is also drawing keen interest. SK Group Chairman Chey Tae-won is reportedly planning to attend GTC 2026 and may meet with Nvidia CEO Jensen Huang. Samsung Electronics Vice President Song Yong-ho of the DS Division will present on "The Future of Semiconductor Manufacturing through AI." The two companies are expected to compete fiercely for dominance in the market for HBM4, the memory chips to be installed in Rubin.




