Senior / Staff Machine Learning Infra Engineer
Toronto, CAN, San Francisco, CA & Remote - US & Canada
Full Time Senior-level / Expert USD 158K - 269K
With offices in Toronto, San Francisco, and Dallas, Waabi is growing quickly and looking for diverse, innovative and collaborative candidates who want to impact the world in a positive way. To learn more visit: www.waabi.ai
You will...
- Work alongside a team of multidisciplinary Engineers and Research Scientists using an AI-first approach to enable safe self-driving at scale.
- Collaborate with cross-functional teams in the company to understand the growing needs and pain points in cloud usage.
- Propose cloud strategies around compute and data usage for training and simulation workloads.
- Design and implement scalable and resilient cloud infrastructure optimized for long-term reliability and adaptability.
- Devise and promote best practices for cloud usage in training and simulation environments, and oversee cloud strategies and usage across the whole company.
Qualifications:
- BS, MS, or PhD in Computer Science or a similar technical field of study, or equivalent practical experience.
- 5+ years of relevant industry experience.
- Experience in reading and developing production-quality software.
- Deep understanding of cloud compute and data storage for distributed training and inference workloads.
- Familiarity with the Python, Go, Rust, or C++ ecosystems.
- Experience working with public cloud platforms (AWS preferred).
- Experience with infrastructure-as-code systems (Terraform preferred).
- Experience in job scheduling and resource allocation.
- Experience with containers and container orchestration (e.g., Docker, ECS, Kubernetes).
- Experience and a high level of comfort working with Linux systems.
- Experience with building platform services that enable other teams to do their best work.
- Open-minded and collaborative team player with the willingness to help others.
- Passionate about self-driving technologies, solving hard problems, and creating innovative solutions.
- Experience working in an Agile/Scrum environment.
Bonus/nice to have:
- Experience with on-premises servers, network equipment, and scale-out storage systems.
- Experience with CI/CD pipelines and release management.
- Experience with common ML tools, workflows, and frameworks (e.g., Kubeflow or MLflow).
- Understanding of system performance tuning at the software, hardware, and network levels.
- Good understanding of GPUs and accelerators in ML training and inference use cases.
The US yearly salary range for this role is: $158,000 - $269,000 USD in addition to competitive perks & benefits. Waabi (US) Inc.’s yearly salary ranges are determined based on several factors in accordance with the Company’s compensation practices. The salary base range is reflective of the minimum and maximum target for new hire salaries for the position across all US locations. Note: The Company provides additional compensation for employees in this role, including equity incentive awards and an annual performance bonus.
Perks/Benefits:
- Competitive compensation and equity awards.
- Health and Wellness benefits encompassing Medical, Dental and Vision coverage (for full-time employees only).
- Unlimited Vacation.
- Flexible hours and Work from Home support.
- Daily drinks, snacks and catered meals (when in office).
- Regularly scheduled team building activities and social events, on-site, off-site, and virtual.
- As we grow, this list continues to evolve!
Waabi is a technology start-up building technologies to transform the way the world moves. Join our talented team to be a part of the future and to make an impact!
Waabi is an equal opportunity employer. We celebrate diversity and are committed to creating a supportive, inclusive, and accessible workplace for all our employees. We seek applicants of all backgrounds and identities, across race, color, ethnicity, national origin or ancestry, age, citizenship, religion, sex, sexual orientation, gender identity or expression, military or veteran status, marital status, pregnancy or parental status, caregiver status, disability, or any other characteristic protected by law. We make workplace accommodations for qualified individuals with disabilities as required by applicable law. If you need a reasonable accommodation to participate in the job application or interview process, please let our recruiting team know.
Tags: Agile AWS CI/CD Computer Science Docker ECS Kubeflow Kubernetes Linux Machine Learning MLFlow PhD Pipelines Python Research Rust Scrum Terraform
Perks/benefits: Career development Competitive pay Equity / stock options Flex hours Flex vacation Gear Health care Lunch / meals Salary bonus Snacks / Drinks Startup environment Team events Unlimited paid time off