Research Scientist / Engineer – Training Infrastructure
Palo Alto, CA | Remote - International | London, UK
USD 200K-300K (estimate) | Senior-level | Full Time
Tasks
- Build monitoring and debugging tools
- Design distributed training systems
- Implement parallelization techniques
- Optimize training stability and resource utilization
Perks/Benefits
- N/A
Skills/Tech-stack
CUDA | Containerization | Distributed Systems | GPU clusters | Linux | MPI | NCCL | Networking | Orchestration | PyTorch | Scripting
Education
Bachelor of Engineering | Bachelor of Science | Master of Science | PhD
Roles
Engineer | Research Engineer | Research Scientist | Scientist