aijobs.net

Senior ML Engineer - Kimchi (LLM Inference Optimization)

United Kingdom R

GBP 110K-141K (estimate) Senior-level Full Time

Found 1d ago
Skills/Tech-stack

Activations quantization | Amazon Web Services | ArgoCD | CUDA | CUDA-adjacent tooling | Checkpointing | Chunked prefill | ClickHouse | Cloud Pub/Sub | Cloud platform | Collective communication | Continuous batching | Distributed Systems | Eviction Policy | FP8 | GitLab CI | Google Cloud | Google Cloud Platform | Google Cloud Pub/Sub | Grafana | INT4 | INT8 | KV cache | KV quantization | Kernel tuning | Kubernetes | KV cache reuse | Loki | Microsoft Azure | Multi-GPU | Multi-node | Network aware placement | Paged Attention | PostgreSQL | Prefix caching | Prometheus | Pub/Sub | PyTorch | Python | Quantization | SGLang | Sharding | Speculative decoding | Tempo | TensorRT-LLM | vLLM | Web Services | Weights quantization

Education

N/A

Roles

Engineer | Learning Engineer | Machine Learning Engineer

Regions

Europe

Countries

United Kingdom
