Research Engineer - Distributed Training
Tasks
- Write technical blog posts for customers and developers
- Develop open-source distributed training libraries and frameworks
- Lead and participate in research for decentralized training orchestration
- Optimize AI workload performance and costs
- Publish research in top AI conferences
- Stay current with advances in AI/ML infrastructure and identify platform enhancements
Perks/Benefits
- Conferences
- Equity incentives
- Flexible work
- Hackathons
- Learning opportunities
- Quarterly off-sites
- Relocation assistance
- Remote or in-office
- Visa sponsorship
Skills/Tech-stack
AI/ML | AI/ML engineering | CI/CD | Compute Optimization | Data parallelism | DeepSpeed | Distributed Training | Experiment tracking | ML Engineering | MLOps | Memory Optimization | Model Parallelism | MosaicML LLM Foundry | Performance Tuning | Pipeline parallelism | PyTorch distributed | Ray | Scalability | Tensor Parallelism | Versioning