Research Scientist / Engineer – Training Infrastructure
Palo Alto, CA | Remote - International | London, UK
USD 200K-300K (estimate) | Senior-level | Full Time
Tasks
- Build monitoring and debugging tools
- Design distributed training systems
- Implement parallelization techniques
- Optimize training stability and resource utilization
Perks/Benefits
- N/A
Skills/Tech-stack
CUDA | Containerization | Distributed Systems | GPU clusters | Linux | MPI | NCCL | Networking | Orchestration | PyTorch | Scripting
Education
Bachelor of Engineering | Bachelor of Science | Master of Science | PhD
Roles
Engineer | Research Engineer | Research Scientist | Scientist