Find jobs in AI/ML, Data Science and Big Data
34 results for XLA (Skill/Tech stack)
-
AI Performance Optimization Engineer · USD 100K-150K · Benchmarking | C++ | Communication Primitives | Continuous batching | Distributed Training · Career growth potential | Remote work · Mid-level · Full Time · United States - Remote · 2d ago
-
AI Performance Optimization Engineer · USD 100K-150K · Benchmarking | C++ | CUDA | Continuous batching | Debugging · Mid-level · Full Time · United States - Remote · 2d ago
-
AI Performance Optimization Engineer · USD 100K-150K · C++ | CUDA | Continuous batching | Cutlass | Deep learning · Career growth | Health benefits | Remote work · Mid-level · Full Time · United States - Remote · 2d ago
-
AI Performance Optimization Engineer · USD 100K-150K · Attention Mechanisms | Benchmarking | C++ | CUDA | DeepSpeed · Benefits | Remote work · Mid-level · Full Time · United States - Remote · 2d ago
-
AI Performance Optimization Engineer · USD 100K-150K · C++ | Compiler optimization | Continuous batching | Distributed Training | FSDP · Mid-level · Full Time · United States - Remote · 2d ago
-
AI Performance Optimization Engineer · USD 100K-150K · Attention Mechanisms | Benchmarking | C++ | Continuous batching | Data pipeline · Career growth | Remote work · Mid-level · Full Time · United States - Remote · 5d ago
-
AI Performance Optimization Engineer · USD 100K-150K · Benchmarking | C++ | Compiler optimization | Continuous batching | Cutlass · Benefits | Full-time employment | Remote work · Mid-level · Full Time · United States - Remote · 5d ago
-
AI Performance Optimization Engineer · USD 100K-150K · Benchmarking | C plus plus | CUDA | Continuous batching | Distributed Training · Mid-level · Full Time · United States - Remote · 6d ago
-
AI Performance Optimization Engineer · USD 100K-150K · Access Optimization | Attention Mechanisms | Benchmarking | C++ | Communication Primitives · Mid-level · Full Time · United States - Remote · 6d ago
-
C++ | CUDA | Compute Graphs | Distributed Systems | GPU · Cross-functional collaboration | Engineering autonomy | Flexible working model · Mid-level · Full Time · Gdansk, Poland · 7d ago
-
Staff Compiler Engineer - PyTorch + Kernel DSLPLATE · USD 163K-253K · Autotuning | Collective Primitives | Cost Based Compilation | Custom ISA | Cutlass · 401k | Adoption support stipend | Charitable giving match | Fertility care stipend | Flexible work environment · Senior-level · Full Time · San Jose, California, United States · 7d ago
-
AI compute | AI compute clusters | AI hardware | Chip interconnects | Collective communication · Senior-level · Full Time · Singapore · 8d ago
-
C# | C++ | Deep learning | Domain-specific language | LLVM · Senior-level · Full Time · China, Shanghai · 8d ago
-
Entry-level · Full Time · New York, NY, United States · 8d ago
-
Entry-level · Full Time · New York, NY, United States · 8d ago
-
AI Computing Architect · CNY 240K-480K · Architecture simulation | C# | C++ | CUDA | Computer Architecture · Senior-level · Full Time · China, Shanghai · 12d ago
-
Machine Learning Engineer · USD 145K-180K · AWS Batch | AWS EC2 | AWS Lambda | Airflow | Async Processing · Company paid life insurance | Medical, dental, and vision coverage | Mental well-being resources | Mentorship and resources | Paid parental leave · Senior-level · Full Time · New York, NY, US, 10281 · 13d ago
-
Research Scientist (Visual Generative AI & World Models) · GBP 195K-270K · ATen | C++ | CUDA | Calculus | Deep learning · Dental plan | Employee assistance programme | Employee wellbeing support | Flexible working | Generous annual leave · Senior-level · Full Time · London, UK · 20d ago
-
AI Software Lead – PyTorch & CUDA Runtime (Next-Gen Accelerator) · INR 2475K-3380K · C# | C++ | CUDA | Compiler runtime interaction | Compiler/runtime · Senior-level · Full Time · Bengaluru, KA, India · 23d ago
-
AI SW Stack Deployment Architect · INR 2500K-4500K · API Design | Cloud Computing | Distributed Systems | Edge Computing | Inference Server · Senior-level · Full Time · Bengaluru, KA, India · 23d ago
-
Senior Software Engineer, CUDA Deep Learning Systems · USD 184K-356K · C++ | CUDA | CUDA kernel | CUDA kernel optimization | Computer Architecture · Equity | Health benefits | Paid time off · Senior-level · Full Time · US, CA, Santa Clara, United States · 27d ago
-
Machine Learning Intern/Co-op (Fall, 2026) · CAD 60K-60K · CUDA | Distributed Training | GPU | JAX | MLIR · Co-working stipend | Health and dental benefits | Inclusive culture | Lunch stipend | Parental leave top-up · Entry-level · Internship · Canada · 27d ago
-
Staff Software Engineer, GPU Performance · USD 207K-300K · AMD | CUDA | Code generation | Compiler optimization | Cutlass · Senior-level · Full Time · Sunnyvale, CA, USA; Kirkland, WA, USA · 28d ago
-
Member of Technical Staff, Performance Optimization · USD 175K-220K · CUDA | CUPTI | Distributed Systems | GPU Profiling | Infiniband · Senior-level · Full Time · San Mateo, CA · 28d ago
-
Senior Software Engineer, AI Inference Systems · PLN 292K-507K · Algorithms | C++ | CI/CD | CUDA | CUDA Graphs · Hybrid work · Senior-level · Full Time · Germany, Remote · 1mo ago
-
Senior-level · Full Time · US, CA, Santa Clara, United States · 1mo ago
-
APIs | Compiler infrastructure | Device Drivers | Firmware | Hardware Aware Techniques · Executive-level · Full Time · Herzliya, Israel, IL · 1mo ago
-
Senior Machine Learning Engineer, Runtime and Serving · USD 213K-263K · Benchmarking | Buffer management | C++ | CUDA | Concurrent Systems · Senior-level · Full Time · Mountain View, CA, USA · 1mo ago
-
Senior Deep Learning Compiler Verification Engineer · USD 140K-224K · C++ | Formal verification | Graph optimization | IR lowering | JAX · Comprehensive benefits package | Equity · Senior-level · Full Time · US, CA, Santa Clara, United States · 1mo ago
-
SOC Architect, XProf · USD 147K-211K · C# | C++ | Compiler profiling | Data Analysis | Data Visualization · Senior-level · Full Time · Sunnyvale, CA, USA · 1mo ago
-
Machine Learning Engineer, Runtime & Optimization · USD 213K-263K · C++ | CUDA | Deep learning | JAX | Machine Learning · Company benefits program | Discretionary annual bonus | Equity incentive plan · Senior-level · Full Time · Mountain View, California, USA · 1mo ago
-
Agentic AI | Autonomous Driving | Compiler technology | Curriculum Design | Deep learning · Mid-level · Full Time · US, CA, Santa Clara, United States · 1mo ago
-
ARM Mali | Android | Apple Neural Engine | C++ | CoreML · Commute subsidy | Employee assistance program | Employee resource groups | Employee stock ownership | Generous vacation and personal days · Senior-level · Full Time · Mountain View, CA, USA · 1mo ago
-
Principal Deep Learning Communication Architect · USD 272K-431K · 3D Parallelism | CUDA | Context Parallelism | Data parallelism | DeepSpeed · Senior-level · Full Time · US, CA, Santa Clara, United States · 1mo ago