Research Engineer (LLM Training and Performance)
GBP 80K-120K (estimate) | Senior-level | Full Time
Tasks
- Build elastic, fault-tolerant training setups
- Define metrics and build performance dashboards
- Design and evaluate model architecture choices
- Implement custom GPU kernels and operations
- Improve data loading, streaming, and throughput
- Improve end-to-end performance for multi-node LLM pipelines
- Optimize memory and performance with parallelism
- Profile training pipeline hotspots
- Run pre-training and post-training methods efficiently
Perks/Benefits
- N/A
Skills/Tech-stack
AOTAutograd | CUDA | CuTe | CUTLASS | Data loaders | DeepSpeed | FP8 | FSDP | FlashAttention | GPU kernel | GPU kernel programming | KV cache | Kernel programming | Kubernetes | MoE | Megatron-Core | Megatron-LM | Mixture of Experts | NCCL | NCCL collectives | NCCL tuning | NeMo | NTK | NVIDIA Collective Communication Library | NVTX | Nsight Compute | Nsight Systems | PagedAttention | Parquet | PyTorch | PyTorch distributed | RoPE | Sharded datasets | Slurm | Streaming data | Streaming data loaders | TFRecord | Tokenization | TorchInductor | torch.compile | Transformer Engine | Triton | vLLM | ZeRO
Education
N/A
Countries
Armenia | Cyprus | Czechia | Germany | Poland | Serbia | Spain | The Netherlands | United Kingdom
States
Yerevan, AM | Pafos, CY | Limassol, CY | Prague, CZ | Berlin, DE | Bavaria, DE | Madrid, ES | England, GB | North Holland, NL | Mazovia, PL | Central Serbia, RS