Research Engineer - LLM Infra training - Seed Infra
San Jose, California, United States
USD 244K-450K | Mid-level | Full Time
Tasks
- Analyze performance bottlenecks and propose optimization methods
- Conduct research and development for large scale LLM training infrastructure
- Design and optimize distributed training strategies
- Implement checkpointing and fault tolerance techniques
- Research and optimize network scheduling and GPU memory management
- Translate research ideas into scalable production AI infrastructure
Perks/Benefits
- N/A
Skills/Tech-stack
Checkpointing | Data Analysis | Distributed Training | Fault Tolerance | GPU memory | GPU memory management | Language Models | Large Language Models | Machine Learning | Memory Management | Network Optimization | Performance optimization | Reinforcement Learning | Scheduling
Education
- N/A