LLM Engineer (LLM Evaluation)
Tasks
- Automate model evaluation workflows
- Build benchmark datasets
- Build end-to-end evaluation workflows
- Define evaluation metrics
- Design LLM evaluation benchmarks
- Design quality validation workflows
- Detect model regression automatically
- Establish evaluation protocols
- Improve model quality based on evaluation results
- Integrate evaluations with ML pipelines
- Maintain reproducible evaluation environments
Perks/Benefits
- N/A
Skills/Tech-stack
Argo Workflows | Asynchronous programming | Benchmarking | Data Monitoring | Datadog | Deep learning | Distributed inference | Evaluation automation | GPU Computing | Kubernetes | Language Models | Language Processing | Large Language Models | MLflow | Machine Learning | Machine Learning Pipeline | Natural Language | Natural Language Processing | Prometheus | Python | Regression Detection | Reproducibility
Education
N/A