Software Engineering Manager, LLM Training
USD 170K-277K | Entry-level | Full Time
Tasks
- Architect high-throughput post-training infrastructure for LLMs
- Collaborate with responsible AI teams on compliance and safety
- Contribute to post-training stack frameworks and integrations
- Create an inclusive team environment
- Define performance goals, metrics, and operational excellence
- Develop a multi-modal post-training strategy
- Enable distributed training parallelism strategies
- Implement observability and profiling for training runs
- Lead agentic research and autonomous performance optimization
- Lead and coach engineers
- Manage containerized training environments with golden images
- Optimize customer workloads and platform performance
Skills/Tech-stack
CUDA | Containerization | Data parallelism | Distributed Systems | Docker | Expert parallelism | Fine Tuning | FlashAttention | Hugging Face | Hugging Face Accelerate | Hugging Face Transformers | Human Feedback | Knowledge Distillation | Language Models | Large Language Models | Learning from Human Feedback | Low-precision training | Megatron-LM | Multi-Modal | Multi-modal Models | NCCL | Observability | Pipeline parallelism | Profiling | Pruning | PyTorch | Quantization | Ray | Ray Tune | Reinforcement Learning | Reinforcement Learning from Human Feedback | SGLang | Speculative decoding | Supervised Fine Tuning | Telemetry | Tensor Parallelism | vLLM | VeRL
Education
Bachelor of Engineering | Bachelor of Science | Master of Science | PhD