aijobs.net

LLM Inference Performance & Evals Engineer

Toronto, Ontario, Canada

CAD 142K-195K (estimate) | Mid-level | Full Time

Found 28d ago
Skills/Tech-stack

Attention Mechanisms | C# | C++ | Compiler optimization | Debugging | Flash Attention | High Performance | High-Performance Computing | KV cache | Kernel development | LLVM | MLIR | Machine Learning | Mixture of Experts | Performance Computing | Performance Profiling | Python | Quantization | Runtime integration | Speculative decoding | Transformers | Triton

Education

N/A

Roles

Engineer | Learning Engineer | Machine Learning Engineer | Performance Engineer | Systems Engineer

Regions

North America

Countries

Canada

States

Ontario, CA

Cities

Toronto, Ontario, CA
