Staff Machine Learning Engineer, GenAI Platform
Tasks
- Analyze bottlenecks for performance and cost efficiency
- Architect fault-tolerant distributed training systems
- Architect multimodal data ingestion pipelines
- Build evaluation and benchmarking infrastructure
- Build self-serve LLM fine-tuning workflows
- Create automated regression detection and metrics tracking
- Design automated recovery, cluster monitoring, and checkpointing
- Implement production-grade LLM training pipelines
- Implement throughput optimization and dynamic batching
- Mentor engineers and lead MLOps culture
- Optimize inference-heavy evaluation patterns
- Propose and lead LLM platform architecture
Perks/Benefits
- 401k employer match
- Family planning support
- Flexible vacation
- Gender-affirming care
- Healthcare benefits
- Income replacement programs
- Mental health & coaching benefits
- Paid parental leave
- Paid volunteer time off
- Professional development
Skills/Tech-stack
CUDA | DeepSpeed | Distributed Systems | Docker | FSDP | Go | Kubernetes | LLM | MLOps | MLflow | Megatron-LM | Python | Ray | TensorRT-LLM | vLLM
Education
N/A