aijobs.net

Software Engineer, Inference Platform

San Francisco, CA

USD 200K-250K | Mid-level | Full Time

Found 15d ago
Skills/Tech-stack

CUDA | Distributed Systems | Expert parallelism | GPU Compute | GPU Optimization | GPU compute parallelism | GPU memory | GPU memory hierarchies | Go | Inference Engines | JAX | Kubernetes | LLM serving | LLM serving frameworks | Memory hierarchies | Model Deployment | PyTorch | Python | Quantization tooling | Serving frameworks | Speculative decoding | Tensor and expert parallelism | torch.compile | Triton

Education

Bachelor of Science

Roles

Engineer | Software Engineer

Regions

North America

Countries

United States

States

California, US

Cities

San Francisco, California, US

