
Accelerator IP for AI Designs
An AI accelerator platform that scales for any application
Expedera's unified compute pipeline architecture IP enables highly efficient hardware scheduling and advanced memory management to achieve unsurpassed end-to-end low-latency performance. The architecture is mathematically proven to use the minimum amount of memory for neural network (NN) execution. Our Origin™ deep learning accelerator IP enables system designers to meet a full range of low-latency, power-efficiency, and high-performance requirements while minimizing die area, reducing DRAM access, improving bandwidth, and saving power. Because NNs generate a tremendous amount of intermediate data, minimizing memory use allows high-resolution NN processing, such as 4K/8K video, to run in real time on-chip.

Origin delivers best-in-class performance in a compact IP footprint, is easily integrated, readily scalable, and can be customized to your unique needs. For power-constrained applications, such as headsets and smart speakers, we offer our TimbreAI T3, an ultra-low-power AI inference engine designed for audio noise reduction.
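To make the intermediate-data point concrete, the rough estimate below sizes a single activation tensor for one conv layer at 4K resolution; the channel count and int8 precision are illustrative assumptions, not Expedera figures.

```python
# Rough estimate of intermediate activation memory for one conv layer
# operating on a 4K frame. Channel count and precision are illustrative
# assumptions, not Expedera specifications.

def activation_bytes(width, height, channels, bytes_per_value=1):
    """Size of a single feature map (activation tensor) in bytes."""
    return width * height * channels * bytes_per_value

# 4K frame, 64 output channels, int8 activations (all assumed values)
size = activation_bytes(3840, 2160, 64, bytes_per_value=1)
print(f"One 4K feature map: {size / 1e6:.0f} MB")  # ~531 MB
```

Even one layer's output can run to hundreds of megabytes at this resolution, which is why minimizing intermediate memory is central to keeping high-resolution inference on-chip.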
Expedera's Origin architecture allows designers to run their trained neural networks unchanged, with no need for hardware-specific optimizations.
Origin enables AI hardware and software engineers to achieve greater accuracy and predictable performance. Its native packet execution architecture supports a simplified software environment, reducing design complexity and easing integration.
Origin achieves sustained performance of up to 128 TOPS with typical utilization rates of 70-90%, measured in silicon running common AI workloads such as ResNet. This best-in-class performance and utilization lets users run AI models with less power than alternative solutions: in third-party testing, Expedera's architecture has been silicon-proven at efficiencies of 18 TOPS/W. Origin excels at image-related tasks such as computer vision, image classification, and object detection, and also handles NLP (Natural Language Processing) tasks such as machine translation, sentence classification, and text generation. Origin offers deterministic performance and on-chip execution with the smallest memory footprint, and it scales from edge solutions with little or no DRAM bandwidth to high-performance applications such as autonomous driving and the cloud, without the software bloat that other solutions require. Simply put, the Origin product family is the ideal solution for engineers who need a single DLA architecture that readily scales across application requirements while maintaining optimal processing, power, and area.
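As a quick illustration of how these figures relate, the sketch below computes sustained throughput and implied power draw from the peak TOPS, utilization range, and TOPS/W numbers quoted above; the combinations shown are worked examples, not additional benchmark results.

```python
# Illustrative arithmetic using the figures quoted above: peak TOPS,
# measured utilization, and silicon-proven efficiency. The combinations
# below are examples, not additional benchmark data.

def effective_tops(peak_tops, utilization):
    """Sustained throughput given peak TOPS and a utilization fraction."""
    return peak_tops * utilization

def power_watts(sustained_tops, tops_per_watt):
    """Implied power draw at a given efficiency."""
    return sustained_tops / tops_per_watt

peak = 128.0                 # top-end Origin configuration (from the text)
for util in (0.70, 0.90):    # measured utilization range (from the text)
    tops = effective_tops(peak, util)
    watts = power_watts(tops, 18.0)  # 18 TOPS/W third-party result
    print(f"util {util:.0%}: {tops:.1f} effective TOPS, ~{watts:.1f} W")
```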
Expedera's four Origin Neural Engine IP product families can be tuned to fit the requirements of any AI application. Learn more about Expedera’s Origin E1, Origin E2, Origin E6, and Origin E8 below.
Origin Line of Products

Origin E1
The Expedera Origin™ E1 NPU (Neural Processing Unit) is a series of artificial intelligence processing cores, each individually optimized for a class of neural networks commonly found in edge, smartphone, and consumer devices. By tailoring Origin E1 IP to specific neural networks, Expedera provides an NPU that consumes the lowest possible silicon area and external bandwidth while delivering optimal performance and utilization.
- Architected to match the compute requirements of a specific neural network
- Minimal to no off-chip memory requirements

Origin E2
Origin E2 is designed for single-job AI applications in power-sensitive devices such as mobile phones and edge nodes. By using on-chip memory only, Origin E2 eliminates the need for external DRAM access, saving system power while increasing performance, reducing latency, and shrinking system BOM costs (a sizing sketch follows the list below). Origin E2 is tunable for specific workloads, providing an optimal performance profile for unique application requirements.
- Runtime processing without external DRAM access requirements
- Tailored for application workloads, with support for up to 20 TOPS
- Highly efficient engine using less than 1W
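The sketch below illustrates the sizing question behind DRAM-less execution: do a model's weights plus its peak live activations fit in on-chip memory? The SRAM budget and model figures are made-up illustrations, not Origin E2 specifications.

```python
# A hypothetical sizing check for DRAM-less execution. The budget and
# model figures below are made-up illustrations, not Origin E2 specs.

ONCHIP_SRAM_BYTES = 8 * 1024 * 1024   # assumed 8 MB on-chip budget

def fits_on_chip(weight_bytes, peak_activation_bytes,
                 budget=ONCHIP_SRAM_BYTES):
    """True if the whole working set stays in on-chip memory."""
    return weight_bytes + peak_activation_bytes <= budget

# Example: a 4M-parameter int8 model with a 2 MB activation peak
print(fits_on_chip(weight_bytes=4_000_000,
                   peak_activation_bytes=2 * 1024 * 1024))  # True
```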

Origin E6
The Origin E6 is optimized for power and performance. It runs the breadth of neural network (NN) models for AI applications in smartphones, tablets, and edge servers. Advanced memory management ensures sustained DRAM bandwidth and optimal total system performance.
- Runs NN models built from common operators, including CONV, DECONV, FC, max pool, avg pool, global pool, and reshape (see the sketch after this list)
- Enables L3 cache or DRAM access during runtime, minimizing transfers
- Dual job support, with up to 32 TOPS performance
- Highly efficient engine using less than 2W
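For concreteness, here is a generic PyTorch sketch of a tiny network built only from the operator classes listed above. It shows the kind of graph such an NPU consumes; it is not Expedera's toolchain or API.

```python
# A generic PyTorch sketch covering the operator classes listed for
# Origin E6 (CONV, DECONV, FC, max/avg/global pooling, reshape).
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)    # CONV
        self.deconv = nn.ConvTranspose2d(16, 16, kernel_size=2,
                                         stride=2)                # DECONV
        self.maxpool = nn.MaxPool2d(2)                            # Max pool
        self.avgpool = nn.AvgPool2d(2)                            # Avg pool
        self.gpool = nn.AdaptiveAvgPool2d(1)                      # Global pool
        self.fc = nn.Linear(16, num_classes)                      # FC

    def forward(self, x):
        x = self.maxpool(self.conv(x))
        x = self.avgpool(self.deconv(x))
        x = self.gpool(x)
        x = x.reshape(x.size(0), -1)                              # Reshape
        return self.fc(x)

print(TinyNet()(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 10])
```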

Origin E8
The Origin E8 is a high-performance deep learning accelerator targeted at high-TOPS applications such as automotive/ADAS and data centers. Origin E8 supports multiple concurrent jobs for better utilization of hardware resources and reduced system costs. Its highly efficient neural network engine allows designers to build products that use passive cooling, further reducing system costs.
- Runs NN models built from operators including CONV, DECONV, FC, max pool, avg pool, global pool, reshape, and others
- Drastically reduces DRAM requirements, cutting BOM costs
- Highly efficient engine scaling up to 128 TOPS

TimbreAI T3
The TimbreAI T3 is an ultra-low-power AI inference engine designed for audio noise reduction in power-constrained applications such as headsets and smart speakers.
- Runs NN models including RNN, LSTM, GRU
- Ultra-low power consumption (<300 μW) enables long battery life
- 3.2 GOPS for wearable applications, such as headset active noise reduction (see the budget sketch after this list)
- Tiny silicon footprint allows smaller, more cost-efficient designs
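To put 3.2 GOPS in context, the back-of-the-envelope sketch below converts it into an operations budget per audio frame; the sample rate and frame length are common audio-pipeline assumptions, not TimbreAI specifications.

```python
# Back-of-the-envelope compute budget for always-on audio: how many
# operations per audio frame does 3.2 GOPS allow? The frame length is
# a common audio-pipeline assumption, not a TimbreAI specification.

GOPS = 3.2               # from the product description above
frame_ms = 10            # assumed frame (hop) length

frames_per_second = 1000 / frame_ms
ops_per_frame = (GOPS * 1e9) / frames_per_second
print(f"~{ops_per_frame / 1e6:.0f}M ops per {frame_ms} ms frame")  # ~32M
```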
