Graduation Project - Optimization techniques for efficient deployment of transformer-based models on edge devices
Eindhoven
NXP Semiconductors
The Automotive System Innovations team is part of the Chief Technology Office (CTO) department of NXP Semiconductors. We drive system-level innovation for the automotive businesses in applications such as highly automated and safe driving, audio, radar systems, in-vehicle networking, artificial intelligence, battery management systems, and mobile robotics. We foresee that artificial intelligence, in the form of embedded neural networks, will often provide a significant part of the ‘smartness’ in the products for these markets.
In this Graduation Project assignment, we want to investigate Transformer optimization techniques that reduce memory footprint and latency while sustaining ML task performance when these models are deployed on our hardware. Transformers have been shown to be effective in many different domains, such as computer vision, audio, time series modelling and natural language processing. However, Transformers typically have a large number of parameters and a high compute cost, so optimizing them for efficient inference on Edge AI devices is of great importance. There are a number of different approaches for optimizing Transformers, such as Quantization, (one-shot) Pruning, Token Sparsification, Speculative Decoding, Knowledge Distillation and Efficient Attention architectures. We are therefore interested in researching and exploring these techniques to identify the benefits, challenges and opportunities that each of them brings to efficient Transformer optimization, and to determine how they can be combined with other methods to derive maximally optimized models with little or no accuracy degradation.
Tags: Architecture Computer Vision Machine Learning NLP Radar Robotics Transformers