Backend Software Engineer (ML Infra)
San Francisco, California, United States
USD 200K-275K (estimate) | Senior-level | Full Time
Tasks
- Build cloud-native infrastructure for machine learning workloads
- Build distributed inference pipelines
- Build distributed training pipelines
- Collaborate with machine learning engineers on training workflows and evaluation needs
- Design and implement backend systems for large-scale machine learning workloads
- Develop internal developer tools for model training, evaluation, and deployment
- Implement monitoring, logging, and observability
- Optimize systems for performance, reliability, and cost efficiency
- Translate machine learning requirements into scalable backend solutions
Skills/Tech-stack
Backend architecture | Containers | Distributed Systems | Docker | Fault-tolerant | Fault-tolerant systems | GPU Workloads | Go | Inference Systems | Kubernetes | Logging | Machine Learning | Model Inference | Model Training | Monitoring | Networking | Observability | Python | Ray | Training pipelines | vLLM