Software Engineer, Inference – AMD GPU Enablement
Tasks
- Build, integrate, and tune collective communication libraries for parallel model execution
- Debug and optimize distributed inference workloads
- Design and optimize high-performance GPU kernels for accelerators
- Integrate model serving infrastructure into GPU-backed systems
- Own bring-up, correctness, and performance of the inference stack on AMD hardware
- Validate correctness, performance, and scalability on large GPU clusters
Perks/Benefits
- N/A
Skills/Tech-stack
CUDA | Collective communication | Distributed Systems | GPU Kernels | HIP | Mixed Precision | Model Parallelism | NCCL | Profiling | RCCL | Tensor Parallelism | Triton
Education
N/A