Senior Deep Learning Framework Communications Engineer
Tasks
- Analyze AI workloads for multi-GPU communication requirements
- Author communication kernels or fused compute-communication kernels
- Collaborate on AI model engineering
- Conduct performance benchmarking on AI clusters
- Design fault-tolerant and elastic distributed solutions
- Improve AI compilers to hide communication latency
- Influence communication library roadmap
- Integrate communication library features in AI frameworks
- Perform AI workload performance characterization
Skills/Tech-stack
C++ | CUDA | CUDA kernels | CuTe | Distributed Training | Distributed Inference | Elasticity | Fault Tolerance | GPUDirect | HPC communication | JAX | MoE | MPI | Mixture of Experts | NCCL | NVSHMEM | Nsight Systems | Performance Benchmarking | PyTorch | PyTorch Profiler | Python | Reinforcement Learning | SGLang | TRT-LLM | Topology Discovery | torch.compile | Triton | vLLM