Staff ML Systems Engineer, Distributed Systems
Tasks
- Architect distributed execution systems
- Build reusable pipeline abstractions and libraries
- Design distributed machine learning pipelines
- Develop monitoring, observability, and debugging tooling
- Establish best practices and engineering standards
- Manage data partitioning, memory utilization, and serialization overhead
- Optimize distributed CPU and GPU performance
- Troubleshoot distributed bottlenecks
Perks/Benefits
- N/A
Skills/Tech-stack
CPU | Concurrency | DDP | Dask | Data Locality | Data Partitioning | Data Processing | Deep Learning | DeepSpeed | Distributed Systems | Distributed Training | FSDP | Fault Tolerance | Flink | GPU | Kubernetes | Machine Learning | Memory Management | Model Evaluation | Model Training | Performance Optimization | PyTorch | Python | Ray | Resource Allocation | Scheduling | Serialization | Spark | TensorFlow | Workflow Orchestration
Education
N/A