aijobs.net

LLM Inference Frameworks and Optimization Engineer

San Francisco, Singapore, Amsterdam

USD 160K-230K | Mid-level | Full Time

Found 3d ago
Skills/Tech-stack

C++ | CUDA | CUDA graph | Cluster scheduling | Compiler | Efficient kernels | GPU Cluster | GPU Kernel | GPU Programming | GPU cluster scheduling | GPU kernel optimization | KV cache | Kernel optimization | Mixture of Experts | Model Quantization | Pipeline parallelism | PyTorch | Python | Speculative decoding | TRT-LLM | Tensor Parallelism | TensorRT | Torch compile | Transformer | Triton | Workload Scheduling

Education

N/A

Roles

AI | AI Infrastructure Engineer | Engineer | Inference Engineer | Infrastructure Engineer | LLM Inference Engineer

Regions

Asia/Pacific | Europe | North America

Countries

Singapore | The Netherlands | United States

States

North Holland, NL | California, US

Cities

Singapore, SG | San Francisco, California, US | Amsterdam, North Holland, NL
