Research Engineer (LLM Training and Performance)
GBP 80K-120K (estimate) | Senior-level | Full Time
Tasks
- Build elastic, fault-tolerant training setups
- Define metrics and build performance dashboards
- Design and evaluate model architecture choices
- Implement custom GPU kernels and operations
- Improve data loading, streaming, and throughput
- Improve end-to-end performance for multi-node LLM pipelines
- Optimize memory and performance with parallelism
- Profile training pipeline hotspots
- Run pre-training and post-training methods efficiently
Perks/Benefits
- N/A
Skills/Tech-stack
AOTAutograd | CUDA | CuTe | CUTLASS | Data loaders | DeepSpeed | FP8 | FSDP | FlashAttention | GPU Kernel | GPU kernel programming | KV cache | Kernel programming | Kubernetes | MoE | Megatron-Core | Megatron-LM | Mixture of Experts | NCCL | NCCL collectives | NCCL tuning | NeMo | NTK | NVIDIA Collective Communication Library | NVTX | Nsight Compute | Nsight Systems | PagedAttention | Parquet | PyTorch | PyTorch distributed | RoPE | Sharded datasets | Slurm | Streaming Data | Streaming data loaders | TFRecord | Tokenization | TorchInductor | torch.compile | TransformerEngine | Triton | vLLM | ZeRO
Education
N/A
Countries
Armenia | Cyprus | Czechia | Germany | Poland | Serbia | Spain | The Netherlands | United Kingdom
States
Yerevan, AM | Pafos, CY | Limassol, CY | Prague, CZ | Berlin, DE | Bavaria, DE | Madrid, ES | England, GB | North Holland, NL | Mazovia, PL | Central Serbia, RS