Lead Machine Learning Engineer, LLM Infrastructure
California - San Francisco, United States
USD 172K-285K Senior-level Full Time
Tasks
- Build evaluation and deployment systems
- Design LLM post-training infrastructure
- Detect training and model regressions
- Drive feedback-driven model improvement loops
- Ensure reproducibility, versioning, monitoring, and deployment
- Implement experiment management systems
- Manage reward and feedback processing
- Optimize distributed training and inference workloads
- Own scalable training orchestration pipelines
- Run large-scale offline evaluation
- Set up human or AI feedback loops
Perks/Benefits
- 401k
- Employee stock purchasing program
- Life and disability insurance
- Medical/Dental/Vision
- Mental health support
- Paid parental leave
- Time off programs
Skills/Tech-stack
AWS | Cloud platform | Debugging | Deep learning | Distributed Systems | Docker | Experiment Management | Google Cloud | Google Cloud Platform | Kubernetes | LLM | Machine Learning | Model Deployment | Model Evaluation | Monitoring | Preference optimization | Python | RLHF | Reinforcement Learning | Reproducibility | Reward Modeling | Version control
Education
N/A