Sr Principal Software Engineer, Quantization (AI2324)
San Jose, California, United States
Full Time Senior-level / Expert USD 240K - 305K
SiMa.ai
Introducing the first Machine Learning SoC (MLSoC™) platform, purpose-built to let you effortlessly scale and deploy ML at the embedded edge.
Job Title: Sr Principal Software Engineer, Quantization
Job Location: San Jose, CA
Job Number: AI2324
Job Description: SiMa.ai is seeking an outstanding researcher in efficient deep learning to join the MLSoC Platform Architecture team. We are passionate about pushing the boundaries of Edge AI with power-efficient inference. We are particularly interested in post-training quantization (PTQ) and pruning techniques applied to CNN- and Transformer-based neural networks, for inference primarily on an int8 Machine Learning Accelerator (MLA) and on a mixed-precision MLA. You will work with an amazing team of engineers that pushes the boundaries, and your contributions will have a chance to create a real impact in our products.
Sr. Principal Engineer Key Responsibilities (including but not limited to):
- Research, design and implement novel methods to improve PTQ techniques for both int8 and mixed-precision (int8 + bf16) quantization.
- Collaborate with other team members to understand the limitations of our Machine Learning Accelerator and adapt your strategy based on their input.
- Prototype PTQ techniques using fake quantization in PyTorch, and modify internal tools to implement quantized operators for accuracy verification.
- Understand state-of-the-art research in PTQ and apply it to CNN- and Transformer-based neural networks.
- Help define timeline and deliverables and be accountable for them.
- PhD in electrical engineering or computer science with 6+ years of research experience in numerical methods and tools for efficient neural network inference.
- Proficient in techniques like HAWQ-V2 and RL-based methods for mixed-precision quantization.
- Proficient in state-of-the-art PTQ techniques like Optimal Brain Compression for LLMs.
- Proficient with PyTorch or other quantization exploration frameworks like Model Compression Toolkit.
- Excellent programming skills in C++, Python.
- Co-authored internal technical presentations, research papers, and disclosures/patents on key technical topics.
- Noteworthy multidisciplinary technical contributions made in collaboration with cross-functional teams.
Categories:
Deep Learning Jobs
Engineering Jobs
Tags: Architecture Classification Computer Science Deep Learning Engineering LLMs Machine Learning PhD Python PyTorch Research
Region:
North America
Country:
United States