aijobs.net

AI Research Engineer (Kernel & Inference Optimization)

Remote job

USD 201K-332K (estimate) | Senior-level | Full Time

Found 21h ago
Skills/Tech-stack

Diffusion Models | Distributed Inference Systems | Distributed inference | Edge Computing | Expert parallelism | Flash Attention | GPU Kernels | Inference Optimization | Inference Systems | KV cache | Language Processing | Latency optimization | Low Latency | Low Memory | Machine Learning | Memory Optimization | Mobile optimization | Model Architecture | Model Serving | NLP | Natural Language | Natural Language Processing | On-device Inference | Pipeline parallelism | Pruning | Quantization | Response Optimization | Speculative decoding | Tensor Parallelism | Throughput Optimization | Token Response Optimization | Vision Transformers

Education

Bachelor of Engineering | Bachelor of Science | Master of Science | PhD

Roles

AI | AI Research Engineer | Engineer | Learning Engineer | Machine Learning Engineer | Research Engineer

