Find jobs in AI/ML, Data Science and Big Data
40 results for Pipeline parallelism (Skill/Tech stack)
-
Skills: Data parallelism | Diffusion Models | Efficient Attention | Expert parallelism | Flax
Senior-level Full Time
Location: Mountain View, California, United States, New …
Posted: 14h ago
-
LLM Engineer USD 100K-150K
Skills: Adapter methods | DPO | Deep reinforcement learning | Distributed Training | Efficient Attention
Benefits: Benefits | Career growth | Mentorship | Remote work
Mid-level Full Time
Location: United States - Remote
Posted: 1d ago
-
LLM Engineer USD 100K-150K
Skills: DPO | Deep learning | Distributed Training | Efficient Attention | Efficient Fine Tuning
Benefits: Remote work
Mid-level Full Time
Location: United States - Remote
Posted: 1d ago
-
LLM Engineer USD 100K-150K
Skills: Adapter modules | DPO | Distributed Training | Evaluation methodology | FSDP
Mid-level Full Time
Location: United States - Remote
Posted: 4d ago
-
Mid-level Full Time
Location: United States - Remote
Posted: 5d ago
-
LLM Engineer USD 100K-150K
Skills: Adapters | DeepSpeed ZeRO | Direct Preference Optimization | Efficient Attention | FSDP
Mid-level Full Time
Location: United States - Remote
Posted: 5d ago
-
LLM Engineer USD 100K-150K
Skills: Adapters | Attention Optimization | DPO | Distributed Training | Evaluation benchmarks
Mid-level Full Time
Location: United States - Remote
Posted: 5d ago
-
Skills: C++ | CUDA | Cluster scheduling | Compute scheduling | Deep learning
Senior-level Full Time
Location: Israel, Tel Aviv
Posted: 6d ago
-
Skills: Async execution | Batch inference | C++ | CUDA | Caching
Entry-level Full Time
Location: San Francisco, California, United States
Posted: 8d ago
-
LLM Fine-Tuning Engineer USD 100K-150K
Skills: Adapter-Tuning | Dataset curation | Direct Preference Optimization | Distributed Training | Efficient Attention
Mid-level Full Time
Location: United States - Remote
Posted: 8d ago
-
LLM Fine-Tuning Engineer USD 100K-150K
Skills: Adapters | Attention Mechanisms | DPO | Dataset curation | Evaluation benchmarks
Benefits: Benefits | Remote work
Mid-level Full Time
Location: United States - Remote
Posted: 8d ago
-
AI Performance Optimization Engineer USD 100K-150K
Skills: Benchmarking | C++ | CUDA | Continuous batching | Cutlass
Benefits: Remote work
Mid-level Full Time
Location: United States - Remote
Posted: 8d ago
-
AI Performance Optimization Engineer USD 100K-150K
Skills: Access Optimization | Attention Mechanisms | Benchmarking | C plus plus | CPU
Mid-level Full Time
Location: United States - Remote
Posted: 8d ago
-
Skills: Continuous batching | Data parallelism | Deep learning | Distributed Training | Dynamic Memory
Benefits: Computational resources access | Full sponsorship | Hired by Rakuten Asia after completion | Research exchanges
Mid-level Full Time
Location: Crimson House Singapore
Posted: 11d ago
-
Software Engineer, Systems ML USD 141K-208K
Skills: C plus plus | CUDA | Co-design | Compiler optimization | Deep learning
Senior-level Full Time
Location: Bellevue, WA | Menlo Park, CA …
Posted: 12d ago
-
Skills: Attention Mechanisms | Batching | CUDA | Data parallelism | Distributed Systems
Senior-level Full Time
Location: San Jose, California, United States
Posted: 13d ago
-
Skills: Attention Mechanisms | Data parallelism | Deep learning | Distributed Training | Language Models
Senior-level Full Time
Location: San Jose, California, United States
Posted: 13d ago
-
Staff Software Engineer, AI Runtime USD 190K-265K
Skills: Algorithms | Automatic Recovery | Checkpointing | Collective communication | Data Structures
Senior-level Full Time
Location: Mountain View, California; San Francisco, California
Posted: 14d ago
-
LLM Inference Frameworks and Optimization Engineer USD 160K-230K
Skills: C++ | CUDA | CUDA graph | Cluster scheduling | Compiler
Benefits: Equity | Health insurance
Mid-level Full Time
Location: San Francisco, Singapore, Amsterdam
Posted: 14d ago
-
Senior Inference Engineer, AIConfigurator for Dynamo USD 184K-356K
Skills: Batching | Distributed Systems | Expert parallelism | GPU Computing | High Performance
Benefits: Equity | Health benefits | Hybrid work
Senior-level Full Time
Location: US, CA, Santa Clara, United States
Posted: 18d ago
-
Senior Solutions Architect, GPU Cloud GenAI – Infrastructure INR 2200K-5000K
Skills: Ansible | C plus plus | CI/CD | Data parallelism | Device plugin
Senior-level Full Time
Location: India, Mumbai
Posted: 19d ago
-
Skills: Architecture Search | C++ | CUDA | Computer Vision | Deep learning
Senior-level Full Time
Location: US, CA, Santa Clara, United States
Posted: 19d ago
-
Senior Software Engineer, AI Runtime USD 160K-225K
Skills: Algorithms | Checkpointing | Collective communication | Data Structures | Data parallelism
Senior-level Full Time
Location: Mountain View, California; San Francisco, California
Posted: 21d ago
-
Skills: AWS | Azure | Debugging | Distributed Computing | FSDP
Benefits: Company vehicle | Dental insurance | Flexible spending account | Health insurance | Health savings account
Senior-level Full Time
Location: GM Automation - Sunnyvale - GM …
Posted: 25d ago
-
Skills: AWS | Azure | Debugging | Distributed Computing | FSDP
Benefits: Employee assistance program | Flexible spending accounts | Health savings account | Life insurance | Medical, dental & vision coverage
Senior-level Full Time
Location: GM Automation - Sunnyvale - GM …
Posted: 27d ago
-
Staff Compiler Engineer - PyTorch + Kernel DSLPLATE USD 163K-253K
Skills: Autotuning | Collective Primitives | Cost Based Compilation | Custom ISA | Cutlass
Benefits: 401k | Adoption support stipend | Charitable giving match | Fertility care stipend | Flexible work environment
Senior-level Full Time
Location: San Jose, California, United States
Posted: 27d ago
-
Data/AI Engineer Intern SGD 40K-57K
Skills: AI Job Scheduling | Automated testing | C++ | Checkpointing | DeepSpeed
Entry-level Full Time Internship
Location: Singapore-CapitaSky
Posted: 28d ago
-
Senior-level Full Time
Location: Shanghai (上海)
Posted: 28d ago
-
Senior AI Engineer USD 209K-275K
Skills: A/B | A/B Testing | Autoscaling | B testing | Bash
Benefits: Four days in office | Hybrid work arrangement | Telecommuting one day per week
Senior-level Full Time
Location: San Jose (CA), United States
Posted: 1mo ago
-
Engineering Manager, Model Inference USD 220K-270K
Skills: APIs | Attention Mechanism | Batching | Distributed Systems | Docker
Benefits: 401k matching | Commuter benefits | Flexible PTO | Flexible spending accounts | Generous time off
Mid-level Full Time
Location: SF Office
Posted: 1mo ago
-
Skills: Compute Shaders | Diffusion Models | Distributed inference | Edge Computing | Expert parallelism
Benefits: 100 percent remote
Senior-level Full Time
Location: Remote job
Posted: 1mo ago
-
AI Research Engineer (Kernel & Inference Optimization) USD 201K-332K
Skills: Computer Vision | Diffusion Models | Edge Computing | Expert parallelism | Flash Attention
Benefits: Remote work
Senior-level Full Time
Location: Remote job
Posted: 1mo ago
-
AI Research Engineer (Kernel & Inference Optimization) USD 201K-332K
Skills: Compute Shaders | Diffusion Models | Distributed inference | Edge Computing | Expert parallelism
Benefits: English communication support | Remote work
Senior-level Full Time
Location: Remote job
Posted: 1mo ago
-
Skills: Diffusion Models | Distributed Inference Systems | Distributed inference | Expert parallelism | Flash Attention
Benefits: 100 percent remote | Worldwide remote
Senior-level Full Time
Location: Remote job
Posted: 1mo ago
-
AI Research Engineer (Kernel & Inference Optimization) USD 200K-332K
Skills: Compute Shaders | Diffusion Models | Distributed Inference Systems | Distributed inference | Edge Computing
Benefits: Remote work
Senior-level Full Time
Location: Remote job
Posted: 1mo ago
-
Skills: Activation checkpointing | Attention Mechanisms | CUDA | Collective operations | Data parallelism
Senior-level Full Time
Location: Mountain View, California; San Francisco, California
Posted: 1mo ago
-
Senior Software Engineer, CUDA Deep Learning Systems USD 184K-356K
Skills: C++ | CUDA | CUDA kernel | CUDA kernel optimization | Computer Architecture
Benefits: Equity | Health benefits | Paid time off
Senior-level Full Time
Location: US, CA, Santa Clara, United States
Posted: 1mo ago
-
Senior Deep Learning Frameworks CUDA Software Engineer USD 184K-356K
Skills: AI compilers | C++ | CUDA | Distributed machine learning | HPC communication
Senior-level Full Time
Location: US, CA, Santa Clara, United States
Posted: 1mo ago
-
Large Model Training Acceleration Engineer USD 187K-387K
Skills: Benchmarking | Data parallelism | Deep learning | Distributed Training | Distributed inference
Mid-level Full Time
Location: San Jose, California, United States
Posted: 1mo ago
-
Software Engineering Manager, LLM Training USD 170K-277K
Skills: CUDA | Containerization | Context Parallelism | Data I/O | Data parallelism
Entry-level Full Time
Location: Mountain View, CA, United States
Posted: 1mo ago