Find jobs in AI/ML, Data Science and Big Data
24 results for Speculative decoding (Skill/Tech stack)
-
AI Platform Engineer; INR 1500K-2000K; Alerting | CUDA | Capacity Planning | Continuous batching | Distributed tracing; Mid-level; Full Time; Bangalore, India; 1d ago
-
ML Research Engineer (Inference); INR 120K-180K; C++ | Deep learning | Generative AI | Hugging Face | Hugging Face Transformers; Entry-level; Full Time; Bengaluru, Karnataka, India; 2d ago
-
Senior Software Engineer, LLM Performance; USD 180K-339K; C++ | CUDA | Cutlass | FlashAttention | FlashInfer; Senior-level; Full Time; SF Bay Area (Hybrid) R; 3d ago
-
Research Engineer - Ads Integrity; USD 136K-205K; AI Safety | AIGC | Deepfake Synthesis | Deepfake detection | Generative AI; Career growth opportunities | Flexible work culture | Opportunities for open-source contributions | Opportunities for publication; Mid-level; Full Time; San Jose, California, United States; 8d ago
-
Senior Member of Technical Staff: ML Systems and Infrastructure; INR 2500K-4000K; Argo Workflows | ArgoCD | CI/CD | CUDA | GitHub Actions; Senior-level; Full Time; Bangalore, India; 8d ago
-
Senior-level; Full Time; Doha Municipality, Doha, Qatar; 9d ago
-
Machine Learning Engineer; CAD 128K-192K; Deep learning | Graph theory | Inference Optimization | LLM Inference | LLM Inference Optimization; Mid-level; Full Time; Toronto - MSO, Canada R; 9d ago
-
Principal Machine Learning Engineer; USD 32K-32K; CI/CD | Cloud Platforms | Containerization | Distributed Training | Docker; Birthday celebrations | Company lunches | Dental insurance | Flexible working hours | Generous holiday allowance; Senior-level; Full Time; London, England, United Kingdom; 10d ago
-
Senior Machine Learning Engineer; USD 32K-32K; Distributed Training | Dynamic batching | Flash Attention | Inference Optimization | Machine Learning; 401k matching | Adoption Assistance | Birthday celebrations | Company lunches | Dental coverage; Senior-level; Full Time; London, England, United Kingdom; 11d ago
-
Continuous batching | Jupyter | KV cache | Low Latency | Machine Learning; Daily meals | Housing subsidy | Medical, dental & vision coverage | Relocation support; Mid-level; Full Time; Cupertino, CA; 12d ago
-
Entry-level; Internship; Beijing; 15d ago
-
Miclaw - Large Model Training and Inference Intern; CNY 25K-37K; Attention Mechanism | C++ | CUDA | Compiler optimization | FlashAttention; Entry-level; Internship; Beijing; 15d ago
-
Entry-level; Internship; Beijing; 15d ago
-
Inference Software Engineer; USD 150K-275K; C++ | CUDA | Continuous batching | Distributed Systems | KV cache; Daily meals | Housing subsidy | Medical, dental & vision coverage | Relocation support; Senior-level; Full Time; Cupertino, CA; 15d ago
-
Machine Learning Research Engineer; USD 150K-275K; CUDA | Deep learning | Distributed Training | Distributed inference | Inference Optimization; Daily meals | Housing subsidy | Medical, dental & vision coverage | Relocation support; Senior-level; Full Time; Cupertino, CA; 15d ago
-
Software Engineering Manager, LLM Training; USD 170K-277K; CUDA | CUDA profiling | Containerization | Context Parallelism | Data I/O; Health and wellness programs | Hybrid work | Time away from work; Entry-level; Full Time; Mountain View, CA, United States; 16d ago
-
Senior Software Engineer II, Inference; USD 165K-242K; Autoscaling | BF16 | C++ | CI/CD | CUDA; 401k match | Employee stock purchase program | Flexible PTO | Flexible spending account | Health savings account; Senior-level; Full Time; Sunnyvale, CA / Bellevue, WA; 16d ago
-
Software Engineering Manager, LLM Training; USD 170K-277K; CUDA | Containerization | Data parallelism | Distributed Systems | Docker; Flexible-hybrid work | Health and wellness programs | Time off; Entry-level; Full Time; Mountain View, CA, United States; 16d ago
-
4-bit | C plus plus | C++14 | C++17 | CI/CD; Senior-level; Full Time; Germany, Munich; 21d ago
-
Member of Technical Staff - Inference; USD 200K-300K; AWS | Ansible | Benchmarks | C++ | CUDA; Competitive compensation | Conference attendance | Equity incentives | Flexible work | Professional development; Senior-level; Full Time; Remote R; 26d ago
-
Software Engineer, Inference Platform; USD 200K-250K; CUDA | Distributed Systems | Expert parallelism | GPU Compute | GPU Optimization; Dental insurance | Equity | Health insurance | PTO policy | Retirement plan; Mid-level; Full Time; San Francisco, CA; 1mo ago
-
Solutions Architect, Inference Deployments; USD 152K-241K; AI Inference | AI inference workloads | Disaggregated inference | GPU Operator | GPU Orchestration; Benefits | Equity; Senior-level; Full Time; US, CA, Santa Clara, United States; 1mo ago
-
Staff Software Engineer, ML Infrastructure; USD 300K-430K; Batching Strategies | Distributed Training | Fault Tolerance | Inference architecture | JAX; Dental | Lunches | Medical | Snacks | Vacation; Senior-level; Full Time; San Francisco; 1mo ago
-
Research Engineer, Core ML; USD 200K-280K; APIs | Backend Development | Deep learning | Distributed Systems | FasterTransformer; Competitive benefits | Health insurance | Startup equity; Senior-level; Full Time; San Francisco; 1mo ago