Sr. SDE (L6), ML Ops

Vancouver, British Columbia, CAN

Amazon.com

The AWS Infrastructure Services (AIS) team is the backbone of AWS, managing the design, planning, delivery, and operation of our global infrastructure. Essentially, we’re the ones who keep the cloud running. Within AIS, the Science team takes on the exciting challenge of using big data and machine learning to optimize power and cooling, the most critical resources in our data centers. In short, we ensure maximum efficiency while preventing overheating and power outages. Our work helps shape future data center designs and drives exceptional cost savings for AWS customers.

As a Software Engineer on the AIS Science team, you will collaborate with scientists, program managers, and data engineers to build, operationalize, and scale machine learning workflows and platform services. By modeling power and cooling load across AWS's global data centers, your work will directly affect how server demand is placed.

You will play a critical role in building infrastructure meant to support all phases of ML models, from R&D to production, including model retraining and iteration. Our team tackles complex challenges in data processing, model hosting, and metric monitoring. As our responsibilities grow and the number of models we manage increases, we’re seeking an innovative senior engineer with a passion for data, machine learning, and MLOps to join our mission-driven team!

If you're passionate about machine learning and model operations, enjoy working in a collaborative and dynamic team that values work-life balance, and want to make a lasting impact on AWS infrastructure worldwide, this is your opportunity. Come join us on this exciting journey!


Key job responsibilities
In this role, you will leverage your engineering background and expertise in ML to lead the development of platforms for deploying, productionizing, and scaling machine learning models, with a focus on variant retraining and ongoing model monitoring.

A day in the life
- Lead the design and implementation of a stable, efficient training and inference infrastructure that scales to support a variety of machine learning models.
- Collaborate with tenured applied scientists and data engineers to develop improved training and inference infrastructure that accelerates innovation and promotes best-practice model scoring and model monitoring.
- Quickly learn the ins and outs of the distributed workflows behind AWS infrastructure’s rack planning and forecasting, and engineer solutions to make these systems more robust, fault-tolerant, and efficient across input and output orgs.

Basic Qualifications


- 5+ years of non-internship professional software development experience
- 5+ years of programming experience with at least one programming language
- 5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience as a mentor or tech lead, or leading an engineering team

Preferred Qualifications

- 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Master's degree in machine learning or equivalent
- Experience developing state-of-the-art, best-practice MLOps tooling and frameworks

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, disability, age, or other legally protected status. If you would like to request an accommodation, please notify your Recruiter.

The base salary for this position ranges from $150,700/year up to $251,700/year. Salary is based on a number of factors and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. Applicants should apply via our internal or external career site.

Tags: Architecture, AWS, Big Data, Engineering, Machine Learning, ML models, MLOps, R, R&D, SDLC, Testing

Perks/benefits: Career development, Equity / stock options

Region: North America
Country: Canada
