Principal Deep Learning Communication Architect
Santa Clara, CA, United States
USD 272K-431K | Senior-level | Full Time
Tasks
- Co-design communication primitives with application developers
- Collaborate on hardware and software co-design for networking
- Define technical roadmap for communication libraries
- Design communication primitives and collective algorithms
- Develop analytical models and simulators for system behavior
- Ensure evolution of communication libraries for large language models
- Lead development and scaling for distributed deep learning
- Optimize communication for heterogeneous interconnects
Perks/Benefits
- N/A
Skills/Tech-stack
3D Parallelism | CUDA | Context Parallelism | Data Parallelism | DeepSpeed | Expert Parallelism | InfiniBand | JAX | MPI | Megatron Core | NCCL | NVSHMEM | Pipeline Parallelism | PyTorch | PyTorch Distributed | RDMA | RoCE | SGLang | Tensor Parallelism | TensorRT-LLM | UCC | UCX | vLLM | XLA | ZeRO