Software Engineer, SystemML - Scaling / Performance
Tasks
- Build performance tuners and software benchmarks around NCCL and PyTorch
- Develop the AI framework and training stack for large-scale deep learning models
- Enable reliable distributed machine learning training
- Improve distributed GPU communication reliability and performance
Perks/Benefits
- N/A
Skills/Tech-stack
CUDA | Data-parallel | Distributed Data Parallel | Distributed data | Fully Sharded Data Parallel | GPU Architecture | HPC | Infiniband | NCCL | Pipeline Parallel | PyTorch | RoCE | Tensor Parallel | TensorFlow