Machine Learning Infrastructure Engineer
Tasks
- Build tooling to diagnose cluster issues and hardware failures
- Manage experiments
- Maximize GPU allocation and utilization for serving and training
- Monitor deployments
- Support ML research and product infrastructure
Perks/Benefits
- N/A
Skills/Tech-stack
Cloud Storage | Cloud platform | Compute Engine | Deep learning | GPUs | Google Cloud | Google Cloud Platform | JAX | Kubernetes | PyTorch | TensorFlow
Education
- N/A