Find jobs in AI/ML, Data Science and Big Data
141 results for Reinforcement Learning from Human Feedback (Skill/Tech stack)
-
Sr. Machine Learning Engineer, Applied Science · USD 161K-332K · Computer Vision | Diffusion Models | Fine Tuning | Generative Modeling | Human Feedback · Senior-level Full Time · San Francisco, CA, US; Remote, US · R · 1d ago
-
LLM Fine-Tuning Engineer · USD 100K-150K · Adapter-Tuning | Automated Benchmarks | Data Curation | Direct Preference Optimization | Distributed Training · Mid-level Full Time · United States - Remote · R · 2d ago
-
Agentic AI | Amazon Web Services | Autogen | Benchmarking | CI/CD · Income protection benefits | Medical, dental, vision plans | Paid Holidays | Paid family leave | Paid time off · Senior-level Full Time · Boise, ID - Main Site, United … · 2d ago
-
Mid-level Full Time · Tel Aviv-Yafo, Tel Aviv District, IL · 2d ago
-
Domain Adaptation | Human Feedback | Knowledge graphs | Language Models | Language Processing · Conference travel | Professional development · Mid-level Full Time · Abu Dhabi · 2d ago
-
Researcher: Agent Post-Training, API & Power-Users · USD 295K-445K · AI Feedback | Calibrated Reasoning | Data Generation | Deep learning | Error Recovery · Senior-level Full Time · San Francisco · 4d ago
-
Agent systems | Embodied AI | Human Feedback | Language Models | Large Language Models · Mid-level Full Time · Shenzhen, Shanghai · 4d ago
-
Large Language Model Post-Training Algorithm Engineer · CNY 240K-480K · Distributed Training | Docker | Fine Tuning | Human Feedback | Kubernetes · Mid-level Full Time · Shenzhen, Shanghai · 4d ago
-
Data collection | Data shaping | Fine Tuning | Gradient Free Optimization | Human Feedback · Annual travel stipend | Flexible work arrangements | Global remote work | Meal Delivery Stipend | Medical coverage · Senior-level Full Time · India · 4d ago
-
Data Engineering | Fine Tuning | Human Feedback | JAX | Learning from Human Feedback · Meal Delivery Stipend | Medical coverage | Paid time off | Remote work | Team offsites · Senior-level Full Time · Canada · 4d ago
-
AL Research Scientist - Algorithm · INR 2475K-3465K · Data Engineering | Deep learning | Generative AI | Human Feedback | Language Models · Mid-level Full Time · Chennai, IND, India · 5d ago
-
AI Engineer · AED 264K-323K · AI Foundry | AWS Lambda | AWS SageMaker | Agent systems | Artificial Intelligence · Career advancement opportunities | Certification support | Health insurance | Professional development support | Visa sponsorship · Senior-level Contract Full Time · Abu Dhabi, Abu Dhabi, United Arab … · 5d ago
-
LLM Fine-Tuning Engineer · USD 100K-150K · Attention Optimization | DPO | Direct Preference Optimization | Distributed Training | Evaluation · Mid-level Full Time · United States - Remote · R · 5d ago
-
Senior-level Full Time · New York, New York, United States · 5d ago
-
AI/ML Computational Science Specialist · INR 2000K-3500K · A/B | A/B Testing | AI alignment | B testing | Generative AI · Senior-level Full Time · Hyderabad, HDC3B, India · 6d ago
-
LLM Fine-Tuning Engineer · USD 100K-150K · Adapters | Attention Optimization | Cluster operations | Data Generation | DeepSpeed ZeRO · Remote work · Mid-level Full Time · United States - Remote · R · 6d ago
-
LLM Fine-Tuning Engineer · USD 100K-150K · Adapters | Attention Optimization | Benchmarking | Dataset curation | Direct Preference Optimization · Mid-level Full Time · United States - Remote · R · 6d ago
-
Senior Machine Learning Engineer, AI Platform · USD 150K-210K · Artificial Intelligence | Batch Processing | Data Analysis | Data Pipelines | Data Privacy · Senior-level Full Time · Boston, MA · 7d ago
-
Generative AI Analyst · USD 50K-55K · Data labeling | Human Feedback | Language Models | Large Language Models | Learning from Human Feedback · Full Time · California (Bay Area), United States · 8d ago
-
Algorithm Design | Data Processing | Deep learning | Distributed Training | Evaluation metrics · Bonus program | Company benefits program | Equity incentive plan · Senior-level Full Time · Mountain View, CA, United States · 8d ago
-
Bayesian Modeling | Classical Test Theory | Cohen Kappa | Computational Linguistics | Data Pipelines · Senior-level Full Time · Mountain View, CA, USA · 10d ago
-
Machine Learning Engineer - Reinforcement Learning · USD 150K-250K · Data Processing | Deep learning | Distributed Training | Evaluation metrics | Generative Models · Dental insurance | Family leave | Free food and snacks | Health insurance | Life insurance · Senior-level Full Time · Fremont, California, United States · 12d ago
-
Agentic AI Engineer · USD 77K-176K · Agent Orchestration | Asynchronous programming | Autogen | CrewAI | Evaluation · Dependent care | Paid leave | Professional development | Secret clearance | Tuition assistance · Entry-level Full Time · USA, VA, McLean (8283 Greensboro Dr, … · 12d ago
-
Data Curation | Dataset development | Deep learning | Fine Tuning | Generative AI · Health and wellbeing programs | Learning opportunities | Relocation eligible | Travel 10 percent · Full Time · Santa Clara, CA, United States · 12d ago
-
Strategic Projects Lead · USD 75K-110K · Client Communication | Data Analysis | Data Quality | Data labeling | Data pipeline · Senior-level Full Time · San Francisco, California · 12d ago
-
Senior Manager, AI Engineering · USD 223K-358K · Artificial Intelligence | Cost Optimization | Evaluation | Fine Tuning | GPU clusters · 401k match | Adoption reimbursement | Comprehensive onboarding | FSA | Fertility benefits · Senior-level Full Time · US NY Remote, United States · R · 13d ago
-
Human-Robot Interaction Applied Scientist, Fauna · USD 183K-248K · Computer Vision | Facial Expression Recognition | Gaze Estimation | Gesture Recognition | Human Feedback · Senior-level Full Time · New York, New York, USA · 13d ago
-
Creative Writing Generative AI Analyst · USD 48K-54K · Data labeling | Deep Neural Networks | Human Feedback | Language Models | Large Language Models · Full Time · California (Bay Area), United States · 13d ago
-
Multimedia Generative AI Analyst · USD 50K-55K · Data Tagging | Data labeling | Human Feedback | Language Models | Large Language Models · Entry-level Full Time · California (Bay Area), United States · 13d ago
-
Red Teaming | Generative AI Analyst · USD 50K-56K · Data labeling | Human Feedback | Language Models | Large Language Models | Learning from Human Feedback · Entry-level Full Time · California (Bay Area), United States · 13d ago
-
2026 Fall Intern, Computer Vision/AI · USD 96K-126K · Computer Vision | Diffusion Models | Generative AI | Human Feedback | Knowledge Distillation · Entry-level Internship · 665 Clyde Avenue, Mountain View, CA, … · 13d ago
-
Senior Associate, Data Scientist - NLP · USD 123K-168K · AWS | Deep learning | Explainability | Hugging Face | Human Feedback · Senior-level Full Time · McLean, VA, United States · 14d ago
-
Senior Applied AI/ML Scientist · USD 142K-225K · AWS SageMaker | Algorithms | Autoregressive Language Models | Azure Machine Learning | Data Pipelines · Remote work · Senior-level Full Time · Remote · R · 14d ago
-
Tech Lead, LLM & Generative AI (Full Remote - Andorra) · USD 150K-225K · Direct Preference Optimization | Fine Tuning | Huggingface | Human Feedback | Information Retrieval · Co-working space budget | Equipment provided | Fully remote | Health insurance support | Learning budget · Senior-level Full Time · Andorra · R · 14d ago
-
Tech Lead, LLM & Generative AI (Full Remote - Turkey) · TRY 840K-1080K · Classifiers | Data labeling | Direct Preference Optimization | Evaluation | Fine Tuning · Access to AI tools | Annual in-person meetup | Co-working space budget | Equipment budget | Fully remote · Senior-level Full Time · Turkey · R · 14d ago
-
Tech Lead, LLM & Generative AI (Full Remote - Sweden) · SEK 738K-930K · Classification | DPO | Data labeling | Dataset cleaning | Evaluation Frameworks · AI tools access | Annual in-person meetup | Co-working space budget | Equipment provided | Fully remote · Senior-level Full Time · Sweden · R · 14d ago
-
Classifier Training | Content Moderation | DPO | Data cleaning | Data labeling · AI tools access | Co-working space budget | Equipment provided | Fully remote | Health insurance allowance · Senior-level Full Time · Slovakia · R · 14d ago
-
Classification | Context window | Context window management | DPO | Data cleaning · Access to mental health counseling | Co-working space budget | Company equipment provision | Fully remote | Health insurance allowance · Senior-level Full Time · Greece · R · 14d ago
-
Classification | Context window | Context window management | Data labeling | Direct Preference Optimization · Co-working space budget | Fully remote | Health insurance support | Learning budget | Paid time off · Senior-level Full Time · Ireland · R · 14d ago
-
Tech Lead, LLM & Generative AI (Full Remote - Romania) · RON 245K-348K · DPO | Fine Tuning | Huggingface | Human Feedback | Inference Optimization · AI tools access | Annual in-person meetup | Co-working space budget | Company equipment provided | Fully remote · Senior-level Full Time · Romania · R · 14d ago
-
Classifier Training | DPO | Fine Tuning | Huggingface | Human Feedback · AI tools access | Annual in-person meetup | Co-working space budget | Company laptop | Fully remote · Senior-level Full Time · Switzerland · R · 14d ago
-
Tech Lead, LLM & Generative AI (Full Remote - Serbia) · USD 150K-225K · Classification | Context window | Context window optimization | DPO | Data cleaning · Annual in-person meetup | Co-working space budget | Company laptop | Fully remote | Health and wellness support · Senior-level Full Time · Serbia · R · 14d ago
-
Tech Lead, LLM & Generative AI (Full Remote - Malta) · EUR 79K-100K · DPO | Data labeling | Fine Tuning | Huggingface | Human Feedback · AI tools access | Co-working space budget | Equipment provided | Fully remote | Health and wellness support · Senior-level Full Time · Malta · R · 14d ago
-
Tech Lead, LLM & Generative AI (Full Remote - Italy) · EUR 79K-100K · Classification Models | Context Management | DPO | Data cleaning | Data labeling · AI tools access | Annual in-person gathering | Co-working space budget | Equipment provided | Fully remote · Senior-level Full Time · Italy · R · 14d ago
-
DPO | Data labeling | Fine Tuning | Huggingface | Human Feedback · AI tools access | Co-working space budget | Equipment provided | Fully remote | Health and wellness support · Senior-level Full Time · Slovenia · R · 14d ago
-
Tech Lead, LLM & Generative AI (Full Remote - Norway) · NOK 1100K-1250K · Classification | Data labeling | Dataset cleaning | Direct Preference Optimization | Fine Tuning · AI tools access | Annual in-person meetup | Co-working space budget | Company equipment provided | Fully remote · Senior-level Full Time · Norway · R · 14d ago
-
Classification Algorithms | Context window | Context window optimization | DPO | Data labeling · AI tools access | Co-working space budget | Company equipment | Fully remote | Health insurance support · Senior-level Full Time · Montenegro · R · 14d ago
-
Tech Lead, LLM & Generative AI (Full Remote - Poland) · PLN 324K-450K · Classification | Context window | Context window optimization | Data cleaning | Data labeling · AI tools access | Co-working space budget | Equipment budget | Fully remote | Health and wellness support · Senior-level Full Time · Poland · R · 14d ago
-
Classifier Training | Context window | Context window optimization | Data cleaning | Data labeling · Access to AI tools | Co-working space budget | Equipment provided | Fully remote | Health insurance allowance · Senior-level Full Time · Germany · R · 14d ago
-
Context window | Context window optimization | Data labeling | Dataset cleaning | Direct Preference Optimization · 1:1 psychologist sessions | AI tools access | Annual in-person meetup | Co-working space budget | Company-provided equipment · Senior-level Full Time · Croatia · R · 14d ago