Efficient Inference Engine

Predibase Inference Engine Offers a Cost Effective, Scalable Serving Stack for Specialized AI Models

Predibase, the developer platform for productionizing open source AI, is debuting the Predibase Inference Engine, a comprehensive solution for deploying fine-tuned small language models (SLMs) quickly ...

InfoQ

Google’s TurboQuant Compression May Support Faster Inference, Same Accuracy on Less Capable Hardware

Google Research unveiled TurboQuant, a novel quantization algorithm that compresses large language models’ Key-Value caches by up to 6x. With 3.5-bit compression, near-zero accuracy loss, and no ...

Sail Research raises $80M to optimize long-horizon AI agents

Inference startup Sail Research Inc. today announced that it has raised $80 million in funding at a $450 million valuation.

Why Enterprises Are Rethinking Infrastructure for the AI Inference Era

Enterprise conversations around artificial intelligence are beginning to shift noticeably. For the past few years, much of ...

Barron's

Nvidia Announces New Inference Engine Called Dynamo

Inference, what happens after you prompt an AI model like ChatGPT, has taken on more salience now that traditional model scaling has stalled. To get better responses, model makers like OpenAI and ...

The Age Of Tokenomics: Why Enterprise AI Success Depends On An AI Data Platform

As AI adoption accelerates, organizations will increasingly measure AI success not by model size, but by the economics of ...

VentureBeat

Pipeshift cuts GPU usage for AI inferences 75% with modular inference engine

DeepSeek’s release of R1 this week was a ...

VentureBeat

The team behind continuous batching says your idle GPUs should be running inference, not sitting dark

Every GPU cluster has dead time. Training jobs finish, workloads shift and hardware sits dark while power and cooling costs keep running. For neocloud operators, those empty cycles are lost margin.

IEN.eu

Industrial Computers and Boards with Intel Core Series 3 Processors

Advantech announced the integration of Intel Core Series 3 processors into its next-generation portfolio of industrial-grade ...

Semiconductor Engineering

What’s The Best Way To Sell An Inference Engine?

The burgeoning AI market has seen innumerable startups funded on the strength of their ideas about building faster, lower-power, and/or lower-cost AI inference engines. Part of the go-to-market ...
