LLM Engineer (Reinforcement Learning)
Tasks
- Design a self-refine training structure
- Develop foundation models integrated with external knowledge and APIs
- Enhance generation accuracy and stability
- Improve LLM training efficiency
- Optimize direct alignment training with PPO, GRPO, and DPO
- Prevent reward hacking
- Train models that select external tools based on instruction types
Perks/Benefits
- N/A
Skills/Tech-stack
DDP | Deep learning | Direct Preference Optimization | Distributed Training | Docker | Fine Tuning | GPU Computing | Horovod | Kubernetes | Language Processing | Natural Language | Natural Language Processing | Parameter efficient fine-tuning | Policy Optimization | Preference optimization | Proximal Policy Optimization | PyTorch | Python | Reinforcement Learning | Slurm | Supervised Fine Tuning