Senior Deep Learning Frameworks CUDA Software Engineer
Tasks
- Analyze AI workloads and identify optimization opportunities
- Benchmark AI performance on compute clusters
- Design fault-tolerant, elastic distributed solutions
- Develop and maintain production-quality code
- Develop exploratory profiling and runtime tools
- Implement fault-tolerant distributed runtime abstractions
- Improve the compiler-runtime interface for multi-GPU, multi-node scaling
- Integrate CUDA features into AI frameworks
Perks/Benefits
- N/A
Skills/Tech-stack
Autograd | C++ | CUDA | Compiler technology | Computer Architecture | Distributed Systems | Elastic Systems | Fault Tolerance | HPC communication | JAX | MPI | NCCL | NVIDIA Nsight | NVIDIA Nsight Systems | Parallel Computing | Performance Benchmarking | Profiling | PyTorch | Python | TensorRT | torch.compile | UCX | vLLM