Find jobs in AI/ML, Data Science and Big Data
60 results for Pipeline parallelism (Skill/Tech stack)
- Staff Software Engineer, AI Runtime · USD 190K-265K · CUDA | Checkpointing | Data parallelism | DeepSpeed | Distributed Systems · Senior-level · Full Time · Mountain View, California; San Francisco, California · 1d ago
- Senior Software Engineer, AI Runtime · USD 160K-225K · Algorithms | Checkpointing | Collective communication | Data Structures | Data parallelism · Senior-level · Full Time · Mountain View, California; San Francisco, California · 1d ago
- LLM Fine-Tuning Engineer · USD 100K-150K · Adapter-Tuning | Automated Benchmarks | Data Curation | Direct Preference Optimization | Distributed Training · Mid-level · Full Time · United States - Remote R · 2d ago
- AI Performance Optimization Engineer · USD 100K-150K · Benchmarking | C++ | Communication Primitives | Continuous batching | Distributed Training · Career growth potential | Remote work · Mid-level · Full Time · United States - Remote R · 2d ago
- AI Performance Optimization Engineer · USD 100K-150K · Benchmarking | C++ | CUDA | Continuous batching | Debugging · Mid-level · Full Time · United States - Remote R · 2d ago
- AI Performance Optimization Engineer · USD 100K-150K · C++ | CUDA | Continuous batching | Cutlass | Deep learning · Career growth | Health benefits | Remote work · Mid-level · Full Time · United States - Remote R · 2d ago
- AI Performance Optimization Engineer · USD 100K-150K · Attention Mechanisms | Benchmarking | C++ | CUDA | DeepSpeed · Benefits | Remote work · Mid-level · Full Time · United States - Remote R · 2d ago
- AI Performance Optimization Engineer · USD 100K-150K · C++ | Compiler optimization | Continuous batching | Distributed Training | FSDP · Mid-level · Full Time · United States - Remote R · 2d ago
- LLM Fine-Tuning Engineer · USD 100K-150K · Attention Optimization | DPO | Direct Preference Optimization | Distributed Training | Evaluation · Mid-level · Full Time · United States - Remote R · 5d ago
- LLM Fine-Tuning Engineer · USD 100K-150K · Adapter methods | DPO | Dataset curation | Distributed Training | Efficient Attention · Mid-level · Full Time · United States - Remote R · 5d ago
- AI Performance Optimization Engineer · USD 100K-150K · Attention Mechanisms | Benchmarking | C++ | Continuous batching | Data pipeline · Career growth | Remote work · Mid-level · Full Time · United States - Remote R · 5d ago
- AWS | Azure | Debugging | Distributed Computing | FSDP · Company vehicle | Dental insurance | Flexible spending account | Health insurance | Health savings account · Senior-level · Full Time · GM Automation - Sunnyvale - GM … · 5d ago
- LLM Fine-Tuning Engineer · USD 100K-150K · Adapters | Attention | DPO | Dataset curation | Distributed Training · Mid-level · Full Time · United States - Remote R · 5d ago
- AI Performance Optimization Engineer · USD 100K-150K · Benchmarking | C++ | Compiler optimization | Continuous batching | Cutlass · Benefits | Full-time employment | Remote work · Mid-level · Full Time · United States - Remote R · 5d ago
- LLM Fine-Tuning Engineer · USD 100K-150K · Adapters | Attention Optimization | Cluster operations | Data Generation | DeepSpeed ZeRO · Remote work · Mid-level · Full Time · United States - Remote R · 6d ago
- LLM Fine-Tuning Engineer · USD 100K-150K · DPO | Efficient Attention | Evaluation | FSDP | GPU clusters · Career growth | Long-term engagement | Remote work | Technical mentorship · Mid-level · Full Time · United States - Remote R · 6d ago
- LLM Fine-Tuning Engineer · USD 100K-150K · Adapters | Attention Optimization | Benchmarking | Dataset curation | Direct Preference Optimization · Mid-level · Full Time · United States - Remote R · 6d ago
- LLM Fine-Tuning Engineer · USD 100K-150K · Adapters | Benchmarking | DPO | Deep learning | Efficient Fine Tuning · Mid-level · Full Time · United States - Remote R · 6d ago
- AI Performance Optimization Engineer · USD 100K-150K · Benchmarking | C plus plus | CUDA | Continuous batching | Distributed Training · Mid-level · Full Time · United States - Remote R · 6d ago
- AI Performance Optimization Engineer · USD 100K-150K · Access Optimization | Attention Mechanisms | Benchmarking | C++ | Communication Primitives · Mid-level · Full Time · United States - Remote R · 6d ago
- AWS | Azure | Debugging | Distributed Computing | FSDP · Employee assistance program | Flexible spending accounts | Health savings account | Life insurance | Medical, dental & vision coverage · Senior-level · Full Time · GM Automation - Sunnyvale - GM … · 7d ago
- Staff Compiler Engineer - PyTorch + Kernel DSLPLATE · USD 163K-253K · Autotuning | Collective Primitives | Cost Based Compilation | Custom ISA | Cutlass · 401k | Adoption support stipend | Charitable giving match | Fertility care stipend | Flexible work environment · Senior-level · Full Time · San Jose, California, United States · 7d ago
- Senior-level · Full Time · Shanghai · 8d ago
- Mid-level · Full Time · Beijing R · 10d ago
- Senior AI Engineer · USD 209K-275K · A/B | A/B Testing | Autoscaling | B testing | Bash · Four days in office | Hybrid work arrangement | Telecommuting one day per week · Senior-level · Full Time · San Jose (CA), United States · 20d ago
- Engineering Manager, Model Inference · USD 220K-270K · APIs | Attention Mechanism | Batching | Distributed Systems | Docker · 401k matching | Commuter benefits | Flexible PTO | Flexible spending accounts | Generous time off · Mid-level · Full Time · SF Office · 20d ago
- Compute Shaders | Diffusion Models | Distributed inference | Edge Computing | Expert parallelism · 100 percent remote · Senior-level · Full Time · Remote job R · 21d ago
- AI Research Engineer (Kernel & Inference Optimization) · USD 201K-332K · Computer Vision | Diffusion Models | Edge Computing | Expert parallelism | Flash Attention · Remote work · Senior-level · Full Time · Remote job R · 21d ago
- AI Research Engineer (Kernel & Inference Optimization) · USD 201K-332K · Compute Shaders | Diffusion Models | Distributed inference | Edge Computing | Expert parallelism · English communication support | Remote work · Senior-level · Full Time · Remote job R · 21d ago
- AI Research Engineer (Kernel & Inference Optimization) · USD 201K-332K · Diffusion Models | Distributed Inference Systems | Distributed inference | Edge Computing | Expert parallelism · Remote work · Senior-level · Full Time · Remote job R · 21d ago
- Diffusion Models | Distributed Inference Systems | Distributed inference | Expert parallelism | Flash Attention · 100 percent remote | Worldwide remote · Senior-level · Full Time · Remote job R · 21d ago
- AI Research Engineer (Kernel & Inference Optimization) · USD 200K-332K · Compute Shaders | Diffusion Models | Distributed Inference Systems | Distributed inference | Edge Computing · Remote work · Senior-level · Full Time · Remote job R · 23d ago
- AI Research Engineer (Kernel & Inference Optimization) · USD 200K-332K · Computer Vision | Deep learning | Diffusion Models | Distributed inference | Edge Computing · Remote work · Senior-level · Full Time · Remote job R · 23d ago
- AI Research Engineer (Kernel & Inference Optimization) · USD 200K-332K · Compute Shaders | Diffusion Models | Distributed Inference Systems | Distributed inference | Edge Computing · Remote work · Senior-level · Full Time · Remote job R · 23d ago
- AI Research Engineer (Kernel & Inference Optimization) · USD 200K-332K · Diffusion Models | Distributed Inference Systems | Distributed inference | Edge Computing | Expert parallelism · Career growth | Collaborative research environment | English communication support | Remote work opportunity · Senior-level · Full Time · Remote job R · 23d ago
- AI Research Engineer (Kernel & Inference Optimization) · USD 200K-332K · Compute Shaders | Custom Compute Shaders | Data Pipelines | Diffusion Models | Distributed Inference Systems · Remote work · Senior-level · Full Time · Remote job R · 23d ago
- Activation checkpointing | Attention Mechanisms | CUDA | Collective operations | Data parallelism · Senior-level · Full Time · Mountain View, California; San Francisco, California · 26d ago
- Senior Software Engineer, CUDA Deep Learning Systems · USD 184K-356K · C++ | CUDA | CUDA kernel | CUDA kernel optimization | Computer Architecture · Equity | Health benefits | Paid time off · Senior-level · Full Time · US, CA, Santa Clara, United States · 27d ago
- Senior Deep Learning Frameworks CUDA Software Engineer · USD 184K-356K · AI compilers | C++ | CUDA | Distributed machine learning | HPC communication · Senior-level · Full Time · US, CA, Santa Clara, United States · 27d ago
- Large Model Training Acceleration Engineer · USD 187K-387K · Benchmarking | Data parallelism | Deep learning | Distributed Training | Distributed inference · Mid-level · Full Time · San Jose, California, United States · 27d ago
- Technical Specialist-Data Engg · INR 1500K-2200K · Ab Initio | Ab Initio GDE | Agile | Co Op | Component Parallelism · Mid-level · Full Time · INDIA - HYDERABAD - BIRLASOFT OFFICE, … · 28d ago
- Software Engineering Manager, LLM Training · USD 170K-277K · CUDA | Containerization | Context Parallelism | Data I/O | Data parallelism · Entry-level · Full Time · Mountain View, CA, United States · 28d ago
- Forward Deployed Engineer (Inference & Post-Training) · USD 270K-300K · DPO | GRPO | KV cache | LoRA | Pipeline parallelism · Equity | Health insurance | Remote work flexibility · Senior-level · Full Time · San Francisco · 1mo ago
- C++ | CUDA | CUDA profiling | Collective communication | Communication Compute Overlap · Senior-level · Full Time · Israel, Tel Aviv R · 1mo ago
- Principal High-Performance LLM Training Engineer · USD 272K-431K · Activation checkpointing | Benchmarking | CUDA | Communication and Computation Overlap | Compilers · Benefits | Equity · Senior-level · Full Time · US, CA, Santa Clara, United States · 1mo ago
- 3D Parallelism | C++ | CUDA | Data parallelism | DeepSpeed · Entry-level · Full Time · Hong Kong · 1mo ago
- 3D Parallelism | C++ | CUDA | Data parallelism | DeepSpeed · Entry-level · Full Time · Singapore · 1mo ago
- C++ | CUDA | Data parallelism | DeepSpeed | Infiniband · Entry-level · Full Time · China · 1mo ago
- C++ | CUDA | Data parallelism | DeepSpeed | Infiniband · Entry-level · Full Time · Boston, USA · 1mo ago
- 3D Parallelism | C++ | CUDA | Data parallelism | DeepSpeed · Entry-level · Full Time · Seattle, USA · 1mo ago