Software Engineer, Inference – AMD GPU Enablement
Tasks
- Build, integrate, and tune collective communication libraries for parallel model execution
- Debug and optimize distributed inference workloads
- Design and optimize high-performance GPU kernels for accelerators
- Integrate model serving infrastructure into GPU-backed systems
- Own bring-up, correctness, and performance of the inference stack on AMD hardware
- Validate correctness, performance, and scalability on large GPU clusters
Perks/Benefits
- N/A
Skills/Tech-stack
CUDA | Collective Communication | Distributed Systems | GPU Kernels | HIP | Mixed Precision | Model Parallelism | NCCL | Profiling | RCCL | Tensor Parallelism | Triton
Education
N/A