Find jobs in AI/ML, Data Science and Big Data
23 results for GRPO (Skill/Tech stack)
-
Senior AI Research Scientist · USD 139K-221K · DAPO | Fine Tuning | GRPO | Language Models | Large Language Models · Senior-level Full Time · Remote - USA, United States · 1d ago
-
LLM Infra R&D Intern (Agentic RL Track) · CNY 25K-37K · Alerting | Asynchronous programming | Concurrency | Data Retrieval | Data Storage · Entry-level Internship · Shenzhen · 3d ago
-
LLM Post-Training / Agentic Algorithm Engineer · CNY 180K-360K · Agentic RL | DAPO | Distributed Training | Function Calling | GRPO · Entry-level Full Time · Shanghai, Beijing · 3d ago
-
AI Platform Engineer, Training and Inference · USD 150K-225K · ANN indexing | BF16 | DDP | Embeddings | FP8 · Career growth | Learning opportunities · Senior-level Full Time · San Francisco · 3d ago
-
Senior Deep Learning Engineering - Autonomous Vehicles · USD 224K-356K · Computer Vision | DPO | Deep learning | Fine Tuning | GRPO · Benefits | Equity · Senior-level Full Time · US, CA, Santa Clara, United States · 8d ago
-
LLM Algorithm Engineer (Open-Domain Dialogue) · CNY 180K-300K · DPO | Deep learning | DeepSpeed | Distributed Training | Function Calling · Internship · Mid-level Internship · Shanghai · 12d ago
-
Forward Deployed Engineer (Inference & Post-Training) · USD 270K-300K · DPO | GRPO | KV cache | LoRA | Pipeline parallelism · Equity | Health insurance | Remote work flexibility · Senior-level Full Time · San Francisco · 14d ago
-
AI Engineer (m/w/d) · EUR 47K-47K · ArgoCD | Automated testing | Clean Code | Code review | DPO · Company pension | Corporate benefits | Professional development · Senior-level Full Time · Berlin, Berlin, DE · 16d ago
-
Adversarial Networks | Computer Vision | Cross-modal alignment | GRPO | Generative Adversarial Networks · Entry-level Internship · Seattle, Washington, United States · 23d ago
-
Adversarial Robustness | Agent learning | Audio Processing | Computer Vision | Content Moderation · Career growth | Research mentorship · None Full Time · San Jose, California, United States · 23d ago
-
AIGC Detection | Adversarial Learning | Agentic Systems | Cross-modal alignment | GRPO · None Full Time · Seattle, Washington, United States · 23d ago
-
Adversarial Networks | Adversarial Training | Cross-modal alignment | GRPO | Generative Adversarial Networks · Entry-level Internship · San Jose, California, United States · 23d ago
-
Applied Research - Evals & Data · USD 150K-300K · Accelerate | Data Pipelines | Data Versioning | Distributed Systems | Distributed tracing · Conference attendance | Professional development budget | Relocation support | Remote work | Team offsites · Senior-level Full Time · San Francisco · 26d ago
-
Causal Inference | Cross-modal fusion | DPO | Data Modeling | Deep learning · Entry-level Full Time · San Jose, California, United States · 28d ago
-
Causal Inference | Cross-modal fusion | DPO | Data Modeling | Deep learning · Mid-level Full Time · Seattle, Washington, United States · 29d ago
-
Agent systems | Agentic AI | Artificial Intelligence | Benchmarking | Continual Learning · Diversity training | Flexible work options | GPU infrastructure access | International Conference Publishing Support | Paid time off · Entry-level Full Time · Dresden, DE, 01069 · 29d ago
-
Senior Software Engineer, RL Post-Training Frameworks · USD 184K-356K · Actor Based Programming | C# | C++ | Consistency models | DPO · Comprehensive benefits | Equity · Senior-level Full Time · US, CA, Santa Clara, United States · 29d ago
-
Tech Lead, Robotic AI Model · USD 150K-180K · Action Chunking | Action Tokenization | Behavior Cloning | DPO | DeepSpeed · Senior-level Full Time · El Segundo, California, United States · 30d ago
-
Entry-level Internship · Shanghai · 30d ago
-
Agent systems | Attention Mechanism | CPU | Continuous Improvement | DPO · Dental insurance | Employee assistance program | Flexible Paid Vacation | Flexible paid sick leave | Flexible spending account · Senior-level Full Time · Palo Alto, CA · 1mo ago
-
Research Scientist, LLM Evaluation & Post-Training · USD 150K-160K · Benchmarking | Context evaluation | DPO | Data Processing | Error Analysis · Senior-level Full Time · Remote Work (USA), United States · 1mo ago
-
C++ | Deep learning | GPU clusters | GRPO | High Performance · Equity | Healthcare benefits | Paid time off | Retirement benefits · Senior-level Full Time · US, CA, Santa Clara, United States · 1mo ago
-
Tech Lead Manager - MLRE, ML Systems · USD 264K-331K · CUDA | Distributed Systems | Flash Attention | GRPO | Human Feedback · Commuter stipend | Generous PTO | Health, dental and vision coverage | Learning and development stipend | Retirement benefits · Senior-level Full Time · San Francisco, CA; New York, NY · 1mo ago