ML Infrastructure Engineer
Los Angeles, California, United States
USD 145K-165K | Mid-level | Full Time
Tasks
- Automate ML lifecycle workflows for training, validation, and registry
- Build production model deployment and inference systems
- Create infrastructure as code for secure, scalable environments
- Define ML infrastructure roadmap
- Design scalable ML development infrastructure
- Ensure governance, reproducibility, and auditability
- Establish model lifecycle management best practices
- Evaluate and implement model serving frameworks
- Experiment with tooling to improve reproducibility, performance, and developer velocity
- Implement CI/CD workflows for ML automation
- Implement deployment and rollback strategies
- Implement monitoring for model performance, latency, drift, and infrastructure health
- Manage training and inference workloads on AWS and GCP
- Support end-to-end ML workflows with data science and engineering teams
- Translate experimentation needs into production infrastructure solutions
Skills/Tech-stack
AWS | Amazon SageMaker | BigQuery | BigQuery datasets | CI/CD | CloudFormation | CloudWatch | Datadog | Docker | Feature Engineering | Feature Stores | Flask | GCP | GPU Computing | Grafana | Kubernetes | Learning operations | MLflow | Machine Learning | Machine Learning Operations | Model Monitoring | Model Registry | Prometheus | Python | Terraform | Weights and Biases
Education
N/A