Find jobs in AI/ML, Data Science and Big Data
5 results for Flash Attention (Skill/Tech stack)
- Senior Performance Analyst, Inference
  Salary: USD 175K-260K
  Skills: Attention Mechanism | CUDA | Flash Attention | GPU kernel optimization | KV cache
  Level: Senior-level | Full Time
  Location: Sunnyvale, CA
  Posted: 1d ago
- Principal Machine Learning Engineer
  Salary: USD 32K-32K
  Skills: CI/CD | Cloud Platforms | Containerization | Distributed Training | Docker
  Benefits: Birthday celebrations | Company lunches | Dental insurance | Flexible working hours | Generous holiday allowance
  Level: Senior-level | Full Time
  Location: London, England, United Kingdom
  Posted: 14d ago
- Tech Lead Manager- MLRE, ML Systems
  Salary: USD 264K-331K
  Skills: CUDA | Distributed Systems | Flash Attention | GRPO | Human Feedback
  Benefits: Commuter stipend | Generous PTO | Health, dental and vision coverage | Learning and development stipend | Retirement benefits
  Level: Senior-level | Full Time
  Location: San Francisco, CA; New York, NY
  Posted: 14d ago
- Senior Machine Learning Engineer
  Salary: USD 32K-32K
  Skills: Distributed Training | Dynamic batching | Flash Attention | Inference Optimization | Machine Learning
  Benefits: 401k matching | Adoption Assistance | Birthday celebrations | Company lunches | Dental coverage
  Level: Senior-level | Full Time
  Location: London, England, United Kingdom
  Posted: 15d ago
- Performance Engineer, GPU
  Salary: USD 280K-850K
  Skills: Bandwidth Optimization | CUDA | Cluster Orchestration | Collective communication | Custom Operators
  Benefits: Flexible working hours | Generous vacation | Hybrid work 25 percent | Optional equity donation matching | Parental leave
  Level: Senior-level | Full Time
  Location: San Francisco, CA | New York …
  Posted: 25d ago