SDE I - Systems, Runtime, and ML Infrastructure (AWS Custom Silicon), Annapurna Labs
Seattle, Washington, USA
Amazon.com
At AWS, we're pioneering the future of cloud computing and AI acceleration through innovative hardware-software co-design. Our teams within Annapurna Labs and AWS AI are creating the foundation for next-generation cloud infrastructure that powers thousands of customers worldwide, from cutting-edge startups to global enterprises.
We operate at an unprecedented scale, designing custom silicon chips, advanced networking solutions, and ML accelerators that were unimaginable just a few years ago.
Our work spans from the lowest levels of hardware abstraction to high-performance distributed training systems, creating unique opportunities for early-career engineers to make a significant impact across multiple domains.
Key job responsibilities
- Develop and optimize software for custom hardware and ML infrastructure
- Collaborate with hardware teams to understand and leverage chip architecture
- Implement and improve networking, runtime, and system-level software
- Assist in building and maintaining tools for profiling, monitoring, and debugging ML workloads
- Contribute to the development of open-source ML frameworks and infrastructure projects
- Participate in code reviews and implement best practices for software development
- Learn and apply new technologies to solve complex engineering challenges
About the team
Candidates will be routed to specific teams based on their interests and our current needs during the application process:
- The Elastic Network Adapter (ENA) team revolutionizes EC2 core networking, enabling enhanced networking capabilities across AWS's most critical compute instances. Here, you'll work with networking protocols and high-performance drivers that power millions of cloud workloads.
- Our AWS Neuron SDK team develops the complete software stack for custom ML accelerators (Inferentia and Trainium), democratizing access to AI infrastructure. This team bridges the gap between popular ML frameworks and custom hardware.
- The Machine Learning Server Software team maintains and optimizes the world's most advanced ML servers, focusing on system-level software that ensures peak performance of AI workloads. While we don't work directly on ML algorithms, we build the critical infrastructure that makes ML possible at scale.
- The SoC Hardware Abstraction Layer (HAL) team works at the intersection of hardware and software, developing the crucial middleware that manages our custom silicon chips. This team ensures our innovative hardware designs translate into reliable, high-performance solutions.
Basic Qualifications
- To qualify, applicants should have earned (or expect to earn) a Bachelor’s or Master’s degree between December 2022 and September 2025.
- Strong programming skills in C/C++ or Python, with solid understanding of data structures and algorithms
- Understanding of computer architecture, operating systems, and Linux environments
- Internship or project experience related to systems programming, networking, or ML
Preferred Qualifications
- Familiarity with version control systems (e.g., Git) and software development methodologies
- Knowledge of ML concepts or frameworks (e.g., PyTorch, TensorFlow)
- Interest in open-source development or contributions to technical communities
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.
Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company’s reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $99,500/year in our lowest geographic market up to $200,000/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.
Categories:
Machine Learning Jobs
Research Jobs
Tags: Architecture AWS EC2 Engineering Git Linux Machine Learning ML infrastructure Open Source Python PyTorch TensorFlow
Perks/benefits: Career development Equity / stock options
Region:
North America
Country:
United States