Machine Learning & Cloud Infra Engineer
Tasks
- Automate training data and infrastructure workflows
- Build and optimize storage and data throughput
- Define SLOs and handle incidents
- Design and operate GPU clusters
- Enable high-performance distributed training
- Implement monitoring, logging, and alerting
- Maintain infrastructure as code and release processes
- Manage secrets, IAM, and secure network boundaries
- Own and evolve ML cloud infrastructure
- Package and deploy workloads with containers and orchestration
- Support model evaluation and model serving
Perks/Benefits
- N/A
Skills/Tech-stack
AWS | Azure | Bash | CI/CD | CUDA | Caching | CircleCI | Cloud Computing | DDP | Distributed Training | Docker | ELK | FSDP | GCP | GPU Computing | GitHub Actions | Grafana | IAM | Kubernetes | Machine Learning | Monitoring | NCCL | NVMe | Networking | Object storage | OpenTelemetry | Profiling | Prometheus | PyTorch | Python | Shared Filesystems | Terraform
Education
N/A