Find jobs in AI/ML, Data Science and Big Data
27 results for KV cache (Skill/Tech stack)
-
Senior Performance Analyst, Inference
USD 175K-260K
Attention Mechanism | CUDA | Flash Attention | GPU kernel optimization | KV cache
Senior-level Full Time
Sunnyvale, CA
1d ago
-
AI Inference Engineer - Model Optimization & Deployment
USD 205K-303K
Accuracy evaluation | BF16 | C++ | CUDA | CUDA kernels
Senior-level Full Time
Foster City, CA
4d ago
-
Senior Software Engineer, AI Inference
CAD 135K-220K
C++ | Chunked prefill | Continuous batching | Cutlass | Docker
Senior-level Full Time
Canada, Toronto
5d ago
-
AI Platform Engineer
INR 1500K-2000K
Alerting | CUDA | Capacity Planning | Continuous batching | Distributed tracing
Mid-level Full Time
Bangalore, India
6d ago
-
Senior Product Manager, AI Inference - Dynamo
USD 208K-327K
Agentic AI | Artificial Intelligence | Cache Management | Data-driven | Data-driven project management
Senior-level Full Time
US, CA, Santa Clara, United States
6d ago
-
Senior Solutions Architect - KV Cache and AI Storage
CNY 460K-600K
Bluefield | CMX | Caching | Cassandra | Ceph
Senior-level Full Time
China, Beijing
8d ago
-
Solutions Architect - Top AI Labs
CNY 435K-500K
Artificial Intelligence | C++ | Computer Systems | Data Structures | Distributed Computing
Senior-level Full Time
China, Beijing
8d ago
-
Agentic Inference | CUDA | Distributed Training | Docker | GPU Computing
Senior-level Full Time
China, Beijing
11d ago
-
Senior Deep Learning Solution Architect
CNY 367K-490K
C++ | Caching | Computer Architecture | Data Structures | Data transfer
Senior-level Full Time
China, Beijing
11d ago
-
Artifact management | Automation | Backup | Backup and Restore | Benchmarking
Flexible interview process
Senior-level Full Time
Austin, Texas, United States
12d ago
-
Principal Software Engineer - AI/ML (Ireland)
EUR 95K-135K
C++ | CUDA | Distributed Systems | GPU Optimization | Inference Runtime
Senior-level Full Time
Remote Ireland R
13d ago
-
API Gateway | C++ | Cilium | Distributed tracing | Envoy
Medical/Dental/Vision insurance | Paid parental leave | Paid time off | Retirement 401k match
Senior-level Full Time
Boston, United States R
13d ago
-
Batching | CUDA | Decoding Optimization | Deep learning | GPU Performance
In-office collaboration
Entry-level Internship
Milpitas, CA
14d ago
-
Attention Mechanism | C++ | Deep learning | Distributed Systems | Docker
Entry-level Full Time
US, CA, Santa Clara, United States
15d ago
-
Continuous batching | Jupyter | KV cache | Low Latency | Machine Learning
Daily meals | Housing subsidy | Medical, dental & vision coverage | Relocation support
Mid-level Full Time
Cupertino, CA
16d ago
-
Entry-level Internship
Shanghai
18d ago
-
Entry-level Internship
Beijing
19d ago
-
Miclaw - Large Model Training and Inference Intern
CNY 25K-37K
Attention Mechanism | C++ | CUDA | Compiler optimization | FlashAttention
Entry-level Internship
Beijing
19d ago
-
Entry-level Internship
Beijing
19d ago
-
Inference Software Engineer
USD 150K-275K
C++ | CUDA | Continuous batching | Distributed Systems | KV cache
Daily meals | Housing subsidy | Medical, dental & vision coverage | Relocation support
Senior-level Full Time
Cupertino, CA
19d ago
-
Machine Learning Research Engineer
USD 150K-275K
CUDA | Deep learning | Distributed Training | Distributed inference | Inference Optimization
Daily meals | Housing subsidy | Medical, dental & vision coverage | Relocation support
Senior-level Full Time
Cupertino, CA
19d ago
-
Senior Data Scientist
USD 200K-225K
AWS | Airflow | Argo Workflows | ArgoCD | Batching
401k company match | Company paid life insurance | FSA options | Flexible PTO | Free food
Senior-level Full Time
Marina del Rey, CA R
20d ago
-
Senior Software Engineer II, Inference
USD 165K-242K
Autoscaling | BF16 | C++ | CI/CD | CUDA
401k match | Employee stock purchase program | Flexible PTO | Flexible spending account | Health savings account
Senior-level Full Time
Sunnyvale, CA / Bellevue, WA
20d ago
-
Machine Learning Engineer, GenAI, Amazon Connect
USD 168K-227K
AWS | Batching | Cloud Native | Cloud Native Machine Learning | Distributed Systems
Career growth | Flexibility | Inclusive team culture | Mentorship | Work-life balance
Senior-level Full Time
Seattle, Washington, USA
26d ago
-
Senior Engineer 2: Inference Data Plane
USD 167K-209K
AI | Databases | Distributed Systems | GPU Optimization | GRPC
Benefits support | Educational courses | Equity compensation | Flexible time off | Reimbursement for training
Senior-level Full Time
San Francisco R
28d ago
-
Senior Engineer 2: Inference Data Plane
USD 167K-209K
AI | Continuous batching | Data parallelism | Databases | Distributed Systems
Employee assistance program | Equity compensation | Flexible time off | Learning & development resources | Local Employee Meetups
Senior-level Full Time
Austin R
28d ago
-
Senior Machine Learning Engineer - AI Foundation
USD 174K-295K
C++ | CPU | CUDA | Edge accelerators | GPU
Activities | Competitive salary | Computational resources | Cutting-edge technology | Impact on transportation
Senior-level Full Time
Santa Clara, CA
1mo ago