Find jobs in AI/ML, Data Science and Big Data
7 results for Reward Optimization (Skill/Tech stack)
-
Actor-critic | Exploration Strategy | Online Reinforcement Learning | Policy Optimization | Policy gradients
Autonomy and innovation | Flexible work culture | Fully remote | Global team collaboration | Research exposure
Mid-level | Full Time | Israel | R | 2d ago
-
Actor-critic | Data Pipelines | Exploration/exploitation | Neural Networks | Online Reinforcement Learning
Autonomy and innovation | Flexible working culture | Fully remote work | Global team collaboration
Mid-level | Full Time | Ireland | R | 2d ago
-
Actor-critic | Benchmarking | Exploration/exploitation | Model Evaluation | Online Reinforcement Learning
Flexible working culture | Fully remote | Global team collaboration
Mid-level | Full Time | Portugal | R | 2d ago
-
Actor-critic | Benchmarking | Experiment design | Exploration/exploitation | Exploration/exploitation tradeoffs
Autonomy and innovation | Flexible working culture | Fully remote | Global team collaboration | Research exposure
Mid-level | Full Time | Netherlands | R | 2d ago
-
Actor-critic | Benchmarking | Deep learning | Exploration Strategy | Multimodal AI
Flexible working culture | Fully remote work | Global team collaboration | Research and innovation opportunities
Mid-level | Full Time | Brazil | R | 2d ago
-
Actor-critic | Experiment design | Exploration/exploitation | Model Evaluation | Multimodal Learning
Flexible working culture | Fully remote work | Global team collaboration | High impact production influence | Work on cutting-edge AI research
Mid-level | Full Time | Germany | R | 2d ago
-
Actor-critic | Experiment design | Exploration/exploitation | Model Evaluation | Multimodal Learning
Autonomy and innovation | Flexible working culture | Fully remote | Global team collaboration
Mid-level | Full Time | India | R | 2d ago