Sr. Software Development Engineer, MLOps
Tasks
- Architect large-scale data pipelines for robotics datasets
- Build CI/CD pipelines for ML models
- Design scalable ML training infrastructure on Kubernetes
- Develop experiment tracking tooling
- Develop hyperparameter optimization tooling
- Ensure reproducibility for ML workflows
- Establish monitoring, alerting, and observability
- Implement fault-tolerant distributed training
- Manage GPU fleet and optimize cost
- Operationalize ML models into production
Perks/Benefits
- N/A
Skills/Tech-stack
Alerting | Amazon EKS | CI/CD | Checkpointing | Data Ingestion | Data Pipelines | Distributed Systems | Experiment tracking | Fault-tolerant | Fault-tolerant systems | GPU scheduling | Hyperparameter Optimization | Kubernetes | MLOps | Machine Learning | Model Deployment | Monitoring | Observability | Reproducibility
Education
N/A