Senior Software Engineer - Storage
Tasks
- Build workload orchestration automation
- Collaborate with AI/ML research teams
- Design distributed systems for data, compute, and networking
- Work with security, networking, and platform teams to ensure robustness and compliance
- Improve system reliability, performance, and observability
- Operate multi-region GPU clusters
- Participate in design reviews and system architecture discussions
- Stay current with distributed systems and AI frameworks
Perks/Benefits
- N/A
Skills/Tech-stack
Amazon Web Services | Automation | Azure | BeeGFS | C++ | Cloud Computing | Cloud platform | Data Management | Distributed Systems | GPFS | Go | Google Cloud | Google Cloud Platform | HPC | Infrastructure automation | JAX | Kubernetes | LSF | Lustre | NEMO | Performance optimization | PyTorch | Python | Reliability Engineering | Slurm | Storage | System Observability | Web Services | Workload Orchestration
Education
Bachelor of Engineering | Bachelor of Science | Master of Science | PhD