Senior ML Infrastructure Engineer
GBP 70K-90K (estimate) | Senior-level | Full Time
Tasks
- Automate cluster lifecycle management
- Benchmark and profile performance bottlenecks
- Build GPU training and inference clusters
- Design high-throughput data paths
- Forecast GPU and storage capacity and cost
- Implement automated security controls
- Implement observability and resilience
- Improve distributed training and inference efficiency
- Operate GPU clusters
- Optimize I/O, caching, and data locality
- Optimize scheduling and isolation
- Set quotas and streamline ML experimentation pipelines
Perks/Benefits
- Electric car scheme
- Enhanced holiday pay
- Hospital Cash Plan
- Income protection
- Life assurance
- Pension
- Perkbox
- Private medical insurance
- Therapy Services
Skills/Tech-stack
Argo CD | Automation | Benchmarking | CI/CD | Caching | Containerization | Data Locality | Distributed Training | GPU | High Throughput | High-Throughput Storage | High-Performance Computing | High-Speed Networking | I/O | I/O Optimization | Inference | Infrastructure as Code | Lustre | MLOps | Machine Learning | Network Performance | Observability | Performance Profiling | Resilience | Scheduling | Security | Terraform
Education
N/A