Find jobs in AI/ML, Data Science and Big Data
101 results for Reinforcement Learning from Human Feedback (Skill/Tech stack)
-
Senior Principal Data Scientist (Fulfilment) | SGD 224K-252K
Decision Processes | DeepSpeed | Distributed Training | Dynamic Models | FSDP
Birthday leave | Flexible work arrangements | Life insurance | Medical insurance | Parental leave
Executive-level | Full Time | Singapore, Singapore | 8h ago
-
Large Model Algorithm Researcher - MiMo | CNY 500K-500K
AI Feedback | Active Learning | C++ | Curriculum learning | Deep learning
Entry-level | Full Time | Beijing | 20h ago
-
Research Scientist, AI Language | USD 170K-251K
A/B | A/B Testing | B testing | Benchmarking | Data Curation
Senior-level | Full Time | Menlo Park, CA | 1d ago
-
Data Science - AI | USD 245K-295K
A/B | A/B Testing | Annotation Guidelines | B testing | Data labeling
401k match | Commuter benefits | Compassionate leave | Family support | Hybrid work
Mid-level | Full Time | San Francisco | 1d ago
-
Senior AI Engineer | INR 2500K-4000K
API Gateway | API Integration | AWS CloudFormation | AWS Lambda | Amazon ECR
Employee assistance program | Flexible work | Free Apps Access | Free Economist subscription | Free Podcasts Access
Senior-level | Full Time | Gurugram R | 2d ago
-
Senior AI Engineer | INR 2500K-4000K
API Gateway | AWS Lambda | Agile | Airflow | Amazon ECR
Annual leave | Employee assistance program | Flexible working | Free access to Economist content | Moving home support
Senior-level | Full Time | Bengaluru, Karnataka, India R | 2d ago
-
IN_Manager_Data Science + Gen AI_ GCC_Advisory_Hyderabad | INR 1500K-2000K
API Integration | Anthropic API | Embeddings | Faiss | Hugging Face
Flexibility programmes | Inclusive benefits | Mentorship | Wellbeing support
Mid-level | Full Time | Hyderabad, India | 2d ago
-
IN_Senior Associate_Data Science + Gen AI_ GCC_Advisory_Gurgaon | INR 3000K-4000K
API Integration | Embeddings | Faiss | Fine Tuning | Hugging Face
Senior-level | Full Time | Gurugram Novus Tower, India | 2d ago
-
Intern - AI Researcher - Large Language Model/Vision-Language Model Algorithms and Post-Training (PhD preferred) | CNY 25K-37K
AI Feedback | Direct Preference Optimization | Efficient Fine Tuning | Fine Tuning | Flax
Entry-level | Internship | Shanghai | 2d ago
-
Strategic Project Lead - Code | USD 200K-250K
Artificial Intelligence | Data Pipelines | Fine Tuning | Human Feedback | LLM Evaluation
Five-day workweek | Flexible working hours | Supportive work culture
Senior-level | Full Time | San Francisco, California, United States | 3d ago
-
Forward Deployed Engineer Lead | LLM Post-training | USD 165K-258K
Data Generation | Data Pipelines | Dataset versioning | Distributed Training | Evaluation methodology
Dental insurance | Disability insurance | Health insurance | Life insurance | Paid time off
Senior-level | Full Time | New York | 3d ago
-
Applied Research - Evals & Data | USD 150K-300K
Accelerate | Data Pipelines | Data Versioning | Distributed Systems | Distributed tracing
Conference attendance | Professional development budget | Relocation support | Remote work | Team offsites
Senior-level | Full Time | San Francisco | 4d ago
-
Senior Director, AI Model LifeCycle | USD 301K-355K
Checkpointing | Dataset versioning | Experiment tracking | Failure recovery | Fine Tuning
401k match | Cell phone stipend | Commuter benefits | Dental insurance | HSA contributions
Senior-level | Full Time | San Francisco, CA - US | 5d ago
-
C++ | Computer Vision | Distributed Training | Efficient Fine Tuning | Fine Tuning
Bonus program | Company benefits | Equity incentive plan
Senior-level | Full Time | Mountain View, CA, USA | 5d ago
-
Adversarial Machine Learning | Automated Red Teaming | Cybersecurity | Guardrails | Human Feedback
Mid-level | Full Time | San Francisco Bay Area, USA | 6d ago
-
Adversarial Machine Learning | Cybersecurity | Guardrails | Human Feedback | Jailbreak detection
Mid-level | Full Time | Oregon, USA | 6d ago
-
Adversarial Machine Learning | Cybersecurity | Guardrails | Human Feedback | Language Models
Mid-level | Full Time | Seattle, USA | 6d ago
-
Adversarial Machine Learning | Cybersecurity | Guardrails | Human Feedback | Jailbreak Taxonomies
Mid-level | Full Time | Boston, USA | 6d ago
-
Adversarial Machine Learning | Automated Red Teaming | Cybersecurity | Guardrails | Human Feedback
Mid-level | Full Time | China | 6d ago
-
Adversarial Machine Learning | Automated Red Teaming | Cybersecurity | Human Feedback | Jailbreak Taxonomies
Mid-level | Full Time | Australia | 6d ago
-
Adversarial Machine Learning | Automated Red Teaming | Cybersecurity | Guardrails | Human Feedback
Mid-level | Full Time | Singapore | 6d ago
-
Adversarial Machine Learning | Automated Red Teaming | Cybersecurity | Ethical AI | Guardrails
Mid-level | Full Time | Hong Kong | 6d ago
-
Lead Data Scientist - AI | SGD 140K-162K
AWS | Azure | Cloud Computing | Computer Vision | Data Preprocessing
Hybrid work
Senior-level | Full Time | Singapore | 6d ago
-
Senior Data Scientist | INR 2520K-3880K
DPO | Deep learning | Document Embeddings | Embedding Chunking | Fine Tuning
Company-matched student loan contribution | Continuing education program | Continuous learning resources | Financial wellness programs | Flexible time off
Senior-level | Full Time | IN - AHMEDABAD, India | 6d ago
-
Python & BI & AI Development Engineer | CNY 180K-360K
Asynchronous programming | C++ | DPO | Data Annotation | Data Processing
In person collaboration flexibility | Work/life balance focus
Mid-level | Full Time | CN004 - Shanghai, China (CN004) | 6d ago
-
Senior Data Scientist | INR 2520K-3880K
Chunking | Deep learning | Direct Preference Optimization | Document Embeddings | Fine Tuning
Continuing education program | Continuous learning | Family-friendly perks | Flexible time off | Health care coverage
Senior-level | Full Time | IN - AHMEDABAD, India | 6d ago
-
AI/ML Applied Data Scientist - Generative AI | USD 121K-208K
Context Management | Deep learning | Deep reinforcement learning | Efficient Fine Tuning | Fine Tuning
Mid-level | Full Time | Newport Beach, CA, US, 92660 | 6d ago
-
ML Postdoc Researcher - LLMs & Generative AI | USD 100K-120K
Cloud deployment | Deep learning | Distributed Training | Generative Modeling | Human Feedback
Company provided laptop and equipment | Opportunities for future full time positions | Remote work culture
Senior-level | Full Time | Seattle, WA R | 6d ago
-
Researcher, Agentic Post-Training | USD 295K-445K
Agent systems | Data Pipelines | Diagnostics | Evals | Function Calling
Senior-level | Full Time | San Francisco | 6d ago
-
Agentic Systems | Architecture Design | Fine Tuning | Generative AI | Human Feedback
Entry-level | Full Time | San Jose, California, United States | 7d ago
-
Senior-level | Full Time | India - Remote R | 7d ago
-
Generative AI Analyst | INR 2500K-3000K
Case Development | Data labeling | Human Feedback | Language Models | Large Language Models
None | Full Time | Asia (Remote), India R | 7d ago
-
Generative AI Analyst | INR 2500K-3000K
Data labeling | Human Feedback | Language Models | Large Language Models | Learning from Human Feedback
Mid-level | Full Time | Asia (Remote), India R | 7d ago
-
Generative AI Analyst | INR 2500K-3000K
Data labeling | Human Feedback | Language Models | Large Language Models | Learning from Human Feedback
None | Full Time | Asia (Remote), India R | 7d ago
-
Generative AI Analyst | INR 2500K-3000K
Data labeling | Deep Neural Networks | Human Feedback | Language Models | Large Language Models
None | Full Time | Asia (Remote), India R | 7d ago
-
Generative AI Analyst | INR 2500K-3000K
Deep Neural Networks | Human Feedback | Labeling | Language Models | Large Language Models
None | Full Time | Asia (Remote), India R | 7d ago
-
Generative AI Analyst | INR 2500K-3000K
Data labeling | Human Feedback | Language Models | Large Language Models | Learning from Human Feedback
Mid-level | Full Time | Asia (Remote), India R | 7d ago
-
Applied Researcher I (AI Foundations, LLM Customization, Finetuning, Reinforcement Learning) | USD 218K-272K
AWS | Data labeling | Dataset curation | Deep learning | Distributed Training
None | Full Time | McLean, VA, United States | 7d ago
-
Sr Staff AI Software Development Engineer | GBP 55K-61K
AWS | Artificial Intelligence | Azure | Databricks | Direct Preference Optimization
Accrued Paid Vacation | Commuter benefits | Dental insurance | Employee assistance program | Employee resource groups
Senior-level | Full Time | Cambridge, United Kingdom R | 7d ago
-
Principal AI Architect | USD 160K-220K
Agent Orchestration | Artificial Intelligence | Data Modeling | Fine Tuning | Human Feedback
Architecture autonomy | Greenfield opportunities | High ownership
Senior-level | Full Time | San Francisco, CA | 7d ago
-
Data Infra Tech Lead | CNY 240K-480K
3D processing | C++ | Cloud processing | Data Governance | Data Lineage
Senior-level | Full Time | Beijing | 7d ago
-
ML Engineer, Post-Training and Evaluation | USD 175K-270K
Data Pipelines | Data Quality | Data quality analysis | Dataset Preparation | Evaluation methodology
Dental insurance | Disability insurance | Health insurance | Life insurance | Paid time off
Mid-level | Full Time | San Francisco | 8d ago
-
Strategic AI Lead | USD 150K-210K
Agile | Confluence | Data Quality | Data Visualization | Data labeling
401k matching | Medical/Dental/Vision
Senior-level | Full Time | New York City, NY (Hybrid); Redwood … R | 8d ago
-
Senior Product Manager, LLM Post-Training & Evaluation | USD 160K-170K
AI Feedback | API Design | Agentic Evaluation | Benchmarking | Context evaluation
Senior-level | Full Time | Remote Work (USA), United States R | 8d ago
-
API Integration | Anthropic API | Embeddings | Faiss | Fine Tuning
Flexibility programmes | Inclusive benefits | Mentorship | Wellbeing support
Senior-level | Full Time | Bengaluru Millenia, India | 8d ago
-
Active Learning | Deep learning | Fine Tuning | Golang | Human Feedback
Remote work flexibility | Workplace accommodation support
Senior-level | Full Time | Mountain View, CALIFORNIA, United States | 8d ago
-
Entry-level | Internship | Shanghai | 8d ago
-
Mid-level | Internship | Shanghai | 8d ago
-
Pioneer Talent Program - Research Data Scientist | PHP 312K-396K
Generative AI | Human Feedback | Information theory | Language Models | Large Language Models
Career growth | Continuous learning | Mentorship | Work from home
Mid-level | Full Time | Asia | 9d ago
-
A/B | A/B Testing | Agentic Workflows | B testing | Fine Tuning
Career development | Coaching and mentoring
Entry-level | Internship Part Time | Shanghai - Phincas 2, China | 9d ago