As vision-centric large language models move on-device, performance measured in raw TOPS is no longer enough. Architectures need to be built around real workloads, memory behavior, and sustained utilization, especially at the edge. Vision LLMs…
The Coming Breakup Between AI and the Cloud
For a decade, cloud AI has felt inevitable. It powers our voice assistants, photo libraries, recommendation engines, and a growing list of “smart” features we barely notice anymore. Yet beneath the convenience is a fragile…
Expedera’s Packet-Based AI Processing Architecture: An Introduction
Most NPUs available today are not actually optimized for AI processing. Rather, they are variations of former CPU, GPU, or DSP designs. Every neural network has varying processing and memory requirements and offers unique processing…
Peeling Back the Layers of TOPS/W
Paul Karazuba, Expedera’s vice president of Marketing, was a guest on a recent episode of The Circuit, a semiconductor industry-focused podcast, where the discussion centered on NPUs. In that podcast, there was much discussion around…
Considerations For Accelerating On-Device Stable Diffusion Models
Stable Diffusion is a generative AI model that serves as a critical test for NPU design. One of the more powerful, and visually stunning, advances in generative AI has been the development of Stable Diffusion models. These models are used for…