Sr. Software Development Engineer, MLOps
Tasks
- Architect large-scale data pipelines for robotics datasets
- Build CI/CD pipelines for ML models
- Design scalable ML training infrastructure on Kubernetes
- Develop experiment tracking tooling
- Develop hyperparameter optimization tooling
- Ensure reproducibility for ML workflows
- Establish monitoring, alerting, and observability
- Implement fault-tolerant distributed training
- Manage GPU fleet and optimize cost
- Operationalize ML models into production
Perks/Benefits
- N/A
Skills/Tech-stack
Alerting | Amazon EKS | CI/CD | Checkpointing | Data Ingestion | Data Pipelines | Distributed Systems | Experiment Tracking | Fault-tolerant Systems | GPU Scheduling | Hyperparameter Optimization | Kubernetes | MLOps | Machine Learning | Model Deployment | Monitoring | Observability | Reproducibility
Education
N/A