Area-optimized AI Inference
The Expedera Origin™ E1 is a family of Artificial Intelligence (AI) processing cores, each individually optimized for a subset of neural networks commonly used in home appliances, edge nodes, and other small consumer devices. The E1 family also includes LittleNPU support for always-sensing cameras found in smartphones, smart doorbells, and security cameras. Products like these require optimized AI inference that minimizes power consumption, silicon area, and system cost. The Origin E1 saves area and system power while optimizing performance and reducing latency, through careful attention to processor utilization and by eliminating the need for external memory. The E1 cores offer 1 TOPS performance for a variety of networks.
Optimized for Neural Networks
While many general-purpose Neural Processing Units (NPUs) exist, a one-size-fits-all solution is rarely the most efficient. By optimizing the E1 for specific neural networks, Expedera can significantly reduce NPU area and power, which is essential in cost- and power-constrained devices. The Origin E1 family supports combinations of common neural networks, including ResNet 50 V1, EfficientNet, NanoDet, Tiny YOLOv3, MobileNet V1, MobileNet SSD, BERT, CenterNet, U-Net, and many others.
Always-sensing NPU Support
Like always-listening audio applications, always-sensing cameras enable a more natural and seamless user experience. However, camera data raises quality, richness, and privacy concerns that require specialized AI processing. OEMs are turning to specialized “LittleNPU” AI processors to process always-sensing data. Expedera’s E1 family has been optimized to run the high-quality neural networks used by leading OEMs in always-sensing applications at low power (often as low as 10-20mW). It keeps all camera data within the LittleNPU subsystem, working hand in hand with device security implementations to safeguard user data. Expedera’s LittleNPU solutions are available today.
Native Execution: a New NPU Paradigm
Typical AI accelerators, often repurposed CPUs (Central Processing Units) or GPUs (Graphics Processing Units), rely on a complex software stack that converts a neural network into a long sequence of basic instructions. Execution of these instructions tends to be inefficient, with processor utilization as low as 20 to 40%. Taking a new approach, Expedera designed Origin specifically as an NPU that executes the neural network directly using metadata, achieving sustained utilization averaging 80%. The metadata indicates the function of each layer (such as convolution or pooling) and other important details, such as the size and shape of the convolution. No changes to your trained neural networks are required, and there is no perceivable reduction in model accuracy. This approach greatly simplifies the software, and Expedera provides a robust stack based on Apache TVM. In addition, Expedera’s native execution eases the adoption of new models and reduces time to market.
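To make the idea of layer metadata more concrete, here is a minimal Python sketch of what a per-layer descriptor could look like. It is purely illustrative and is not Expedera's actual metadata format: the `LayerMeta` class, its field names, and the `out_shape` and `macs` helpers are all hypothetical, and the shape arithmetic assumes unpadded ("valid") convolution and pooling.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LayerMeta:
    """Hypothetical per-layer descriptor (not Expedera's real format)."""
    op: str                          # layer function, e.g. "conv" or "pool"
    in_shape: Tuple[int, int, int]   # input tensor as (height, width, channels)
    kernel: Tuple[int, int]          # kernel or window size as (kh, kw)
    stride: int                      # same stride in both spatial dimensions
    out_channels: int = 0            # only meaningful for "conv"


def out_shape(meta: LayerMeta) -> Tuple[int, int, int]:
    """Output shape assuming no padding ("valid" mode)."""
    h, w, c = meta.in_shape
    kh, kw = meta.kernel
    oh = (h - kh) // meta.stride + 1
    ow = (w - kw) // meta.stride + 1
    # Convolution changes the channel count; pooling preserves it.
    oc = meta.out_channels if meta.op == "conv" else c
    return (oh, ow, oc)


def macs(meta: LayerMeta) -> int:
    """Multiply-accumulate count for a conv layer; 0 for other ops."""
    if meta.op != "conv":
        return 0
    oh, ow, oc = out_shape(meta)
    kh, kw = meta.kernel
    return oh * ow * oc * kh * kw * meta.in_shape[2]


# A two-layer toy network described only by metadata.
conv1 = LayerMeta(op="conv", in_shape=(224, 224, 3), kernel=(3, 3),
                  stride=2, out_channels=32)
pool1 = LayerMeta(op="pool", in_shape=out_shape(conv1), kernel=(2, 2),
                  stride=2)

for layer in (conv1, pool1):
    print(layer.op, out_shape(layer), macs(layer))
```

The point of the sketch is that a handful of fields per layer is enough to derive shapes and work estimates without expanding the network into a long stream of low-level instructions.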
Market-leading Power Efficiency
Understanding the comparative power efficiencies of NPUs can be complicated. Ours isn’t: Expedera’s Origin E1 family provides up to a market-leading 18 TOPS/W, assuming a TSMC 7nm process running ResNet50 at INT8 precision throughout with a 1GHz system clock. No sparsity, compression, or pruning is applied, though all are supported and may further increase power efficiency. Origin has repeatedly been cited as the most power-efficient NPU available by third parties and customers alike. In practice, E1 power consumption typically averages 55mW or less.
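As a rough sanity check, the quoted figures can be related with simple arithmetic: power is throughput divided by efficiency. The short Python sketch below uses the 1 TOPS and 18 TOPS/W numbers stated on this page; the `power_mw` helper is a hypothetical name introduced here, and the result is a back-of-the-envelope estimate under the stated conditions, not a measured value.

```python
def power_mw(throughput_tops: float, efficiency_tops_per_watt: float) -> float:
    """Estimate power in milliwatts from throughput and efficiency."""
    watts = throughput_tops / efficiency_tops_per_watt
    return watts * 1000.0


# 1 TOPS delivered at up to 18 TOPS/W (figures quoted on this page).
estimate = power_mw(1.0, 18.0)
print(f"Estimated power: {estimate:.1f} mW")
```

The estimate lands in the mid-50 mW range, in line with the typical average cited above. Actual consumption depends on the network, process node, clock, and configuration.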
Silicon-Proven and Deployed in Millions of Consumer Products
Choosing the right AI processor can ‘make or break’ a design. The Origin architecture is silicon-proven in leading-edge process nodes and has shipped in millions of consumer devices worldwide.
- Power efficiency of up to 18 TOPS/W
- Capable of processing real-time HD video and images on-chip
- Advanced activation memory management
- Low latency
- Tunable for specific workloads
- Hardware scheduler for NN
- Support for standard NN functions including Convolution, Deconvolution, FC, Activations, Reshape, Concat, Elementwise, Pooling, Softmax, Bilinear
- Processes models as trained, with no need for software optimizations
- Use familiar open-source platforms like TFLite
- Delivered as soft IP: portable to any process
Industry-leading performance and power efficiency (up to 18 TOPS/W).
Architected for the compute requirements of a specific neural network.
Drastically reduces memory requirements, with no off-chip DRAM required.
Run trained models unchanged, without the need for hardware-dependent optimizations.
Deterministic, real-time performance.
Improved performance for your workloads, while still running a breadth of models.
Simple software stack.
Achieve the same accuracy as your trained model.
Simplifies deployment to end customers.
- Efficiency: industry-leading efficiency of up to 18 TOPS/W delivers more processing at lower power consumption
- Simplicity: eliminates complicated compilers, easing design complexity, reducing cost, and speeding time-to-market
- Configurability: independently configurable building blocks allow for design optimization and right-sized deployments
- Predictability: deterministic performance and quality of service (QoS)
- Deployability: best-in-market TOPS/mm² assures ideal processing/chip size designs