Software Engineer - Systems ML - PyTorch

Bellevue, WA | Menlo Park, CA | New York, NY

Meta

Giving people the power to build community and bring the world closer together

In this role, you will be a member of the PyTorch Core Systems team. The PyTorch team develops the open source software stack powering AI models and systems. The Systems team builds and optimizes high-performance software to train and serve AI architectures. You will work closely with AI researchers to analyze deep learning models and optimize their performance within PyTorch. You will also partner with researchers to understand recent advances in AI-guided software development and apply them directly to PyTorch code and device optimization.

Examples of projects include:
  • Rewriting core collectives to introduce fault tolerance with RDMA and GPUDirect, allowing training to continue even when nodes fail.
  • Building a custom Python bytecode interpreter so you can capture PyTorch graphs without forcing users to rewrite their Python code.
  • Rewriting PyTorch Distributed from scratch so you can pdb across a training job.
  • Rewriting all of our C++ code so it’s ABI compatible for another 20 years.
  • Fixing performance problems by changing a single register value from 1 to 0.
  • Utilizing AI systems to optimize PyTorch compiler passes.
Software Engineer - Systems ML - PyTorch Responsibilities
  • Improve PyTorch's state-of-the-art training, post-training, and inference on modern AI hardware accelerators.
  • Develop PyTorch's software stack with a focus on AI frameworks and high-performance kernel development.
  • Tune and optimize the performance of deep learning framework and software components.
  • Collaborate with AI research scientists to accelerate the next generation of deep learning models, such as recommendation systems, generative AI, computer vision, and NLP.
Minimum Qualifications
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Proven C/C++ programming skills
  • Experience in AI framework development or accelerating deep learning models on hardware architectures.
Preferred Qualifications
  • Knowledge of GPU, CPU, or AI hardware accelerator architectures.
  • Experience working with frameworks like PyTorch, Caffe2, TensorFlow, ONNX, TensorRT.
  • OR AI high-performance kernels: Experience with CUDA programming, OpenMP / OpenCL programming, or AI hardware accelerator kernel programming. Experience in accelerating libraries on AI hardware, similar to cuBLAS, cuDNN, CUTLASS, HIP, ROCm, etc.
  • OR AI Compiler: Experience with compiler optimizations such as loop optimizations, vectorization, parallelization, and hardware-specific optimizations such as SIMD. Experience with MLIR, LLVM, IREE, XLA, TVM, or Halide is a plus.
  • OR AI frameworks: Experience in developing training and inference framework components. Experience in system performance optimizations such as runtime analysis of latency, memory bandwidth, I/O access, compute utilization analysis and associated tooling development.
For those who live in or expect to work from California if hired for this position, please click here for additional information.

About Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today: beyond the constraints of screens, the limits of distance, and even the rules of physics.
$85.10/hour to $251,000/year + bonus + equity + benefits

Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.

Equal Employment Opportunity
Meta is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. You may view our Equal Employment Opportunity notice here.
Meta is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, fill out the Accommodations request form.

Tags: Architecture Computer Science Computer Vision CUDA cuDNN Deep Learning Engineering Generative AI GPU Machine Learning NLP ONNX OpenMP Open Source Physics Python PyTorch Research SIMD TensorFlow TensorRT VR

Perks/benefits: Career development Equity / stock options Health care Salary bonus

Region: North America
Country: United States
