Engineering Manager, PyTorch - AI Acceleration

Bellevue, WA | Menlo Park, CA | New York City | San Francisco, CA

AI Acceleration is an org within PyTorch. It's responsible for making PyTorch run performantly and reliably on new hardware from external vendors (NVIDIA, AMD, etc.) and Meta's own AI chips.

We are looking for an engineering manager to support internal GPU enablement efforts - using our team's expertise to make GPU inference and training more efficient, reliable, scalable, and simple.

The ideal candidate has strong technical skills - GPU / ML Systems knowledge is preferred, though not required. We work closely with internal product groups, so cross-functional (XFN) collaboration is a key part of the job.

We have assembled some of the best engineers in the industry to solve some of the world's most foundational problems at a scale that frankly defies comprehension. This is a chance to support these engineers and at the same time learn about industry-leading trends in high-performance computation.

Engineering Manager, PyTorch - AI Acceleration Responsibilities
  • Grow a team of domain experts within AI Acceleration.
  • Communicate, collaborate, and build relationships with clients and peer teams to facilitate cross-functional projects.
  • Operate strategically and tactically. Develop vision and strategy, and help set direction for the team.
  • Remain up-to-date on ongoing software development activities in the team, help work through technical challenges, and be involved in design decisions.
Minimum Qualifications
  • 2+ years of experience in managing a team of HPC/GPU engineers of varied skill levels.
  • Demonstrated experience recruiting, building, structuring, and leading technical organizations, including performance management.
  • Experience with cross-functional collaboration with product ML or AI framework teams.
  • GPU/CPU optimization skills.
Preferred Qualifications
  • Knowledge of ML frameworks like PyTorch, TensorFlow, ONNX, MXNet, etc.
  • Experience with different programming models for high-performance computations, e.g. GPU CUDA programming or OpenCL or OpenMP programming.
  • Experience with ML Systems - GPUs, kernels, compiler, model-specific transformations, quantization, communication optimizations, etc.
About Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today, beyond the constraints of screens, the limits of distance, and even the rules of physics.

Meta is committed to providing reasonable support (called accommodations) in our recruiting processes for candidates with disabilities, long term conditions, mental health conditions or sincerely held religious beliefs, or who are neurodivergent or require pregnancy-related support. If you need support, please reach out to accommodations-ext@fb.com.

$177,000/year to $251,000/year + bonus + equity + benefits

Individual pay is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base salary, Meta offers benefits. Learn more about benefits at Meta.

Tags: CUDA Engineering GPU HPC Machine Learning MXNet ONNX OpenMP Physics PyTorch TensorFlow VR

Perks/benefits: Career development Equity Flex vacation Health care Salary bonus Team events

Region: North America
Country: United States