Find jobs in AI/ML, Data Science and Big Data
78 results for TensorRT-LLM (Skill/Tech stack)
-
#Hiring | Senior Security Architect | Post-Quantum Cryptography (PQC) | AI/LLM Security | Frisco, TX (Onsite) · USD 167K-246K · Backstage IDP | Confidential Computing | Confidential Computing TEE | Cryptography Agility | EBPF · Senior-level Contract Full Time · Frisco, TX, United States · 1d ago
-
AI Performance Optimization Engineer · USD 100K-150K · C++ | CUDA | Continuous batching | Cutlass | Deep learning · Career growth | Health benefits | Remote work · Mid-level Full Time · United States - Remote · R · 2d ago
-
AWS | Apache Flink | Apache Spark | Azure | C++ · Senior-level Full Time · Santa Clara, CA · 2d ago
-
Mid-level Full Time · Seattle (WA), United States · 2d ago
-
AWQ | AWS | Batching | CPU architecture | CUDA · Senior-level Full Time · Guangzhou, Guangdong, China · 4d ago
-
AI Performance Optimization Engineer · USD 100K-150K · Benchmarking | C++ | Compiler optimization | Continuous batching | Cutlass · Benefits | Full-time employment | Remote work · Mid-level Full Time · United States - Remote · R · 5d ago
-
C++ | Deep learning | Distributed Training | ETL | Go · Senior-level Full Time · Mountain View, California, United States · 5d ago
-
Senior Machine Learning Engineer (Inference Platform) · USD 175K-225K · AWS | Alerting | CI/CD | Continuous batching | Data Processing · Senior-level Full Time · Remote - USA · R · 6d ago
-
Senior-level Full Time · Seoul, Korea · 7d ago
-
Senior-level Full Time · US, CA, Remote, United States · R · 8d ago
-
Artificial Intelligence | Bottleneck analysis | CUDA | Deep learning | Diffusion Models · Benefits | Equity · Senior-level Full Time · US, CA, Santa Clara, United States · 9d ago
-
CUDA | CUDA-X | DevRel | Dynamo | Inference Serving · Mid-level Full Time · US, CA, Santa Clara, United States · 12d ago
-
Senior Solutions Architect - Generative AI · INR 2475K-4500K · Argo | CI/CD | CUDA | Evaluation | FedRAMP · Senior-level Full Time · India, Pune · 13d ago
-
Engineering Manager, Inference Benchmarking, AI Perf · USD 224K-356K · DCGM | Distributed Systems | GPU Telemetry | GPU observability | Helm · Senior-level Full Time · US, CA, Santa Clara, United States · 13d ago
-
Product Manager - AI Inference & Model Serving · USD 165K-275K · AI Inference | Artificial Intelligence | Autoscaling | Cache Management | Continuous batching · Conference attendance | Professional development | Stock options | Training | Workstation provided · Mid-level Full Time · Austin, TX, United States · 13d ago
-
Data Curation | Deep learning | DeepSpeed | Direct Preference Optimization | Evaluation · Senior-level Full Time · Singapore, Singapore · 15d ago
-
AI Engineer - Agent Infra & LLMOps Track (Chengdu) · CNY 180K-360K · Access Control | AutoGPT | CPU isolation | Docker | Firecracker · Full Time · Chengdu · 19d ago
-
Entry-level Full Time · Wuhan · 19d ago
-
Entry-level Full Time · Beijing · 19d ago
-
Intern, AI Engineering · USD 64K-106K · CUDA | CUDA kernel | CUDA kernel development | Hugging Face | Inference Optimization · Entry-level Internship · San Francisco, California · 20d ago
-
Machine Learning Engineer, Distributed vLLM · USD 136K-287K · API Gateway | Cilium | Distributed Systems | Envoy | GPU Profiling · Paid parental leave | Paid time off | Retirement 401k match | Tuition reimbursement · Mid-level Full Time · Boston, United States · R · 20d ago
-
Product Manager - AI Inference & Model Serving · USD 160K-275K · AI Inference | Autoscaling | Cache Management | Cold Start | Cold Start Optimization · Conference attendance | Professional development and training | Stock options | Workstation provided · Mid-level Full Time · Austin, TX, United States · 20d ago
-
Senior AI Engineer · USD 160K-250K · API Design | Agent Orchestration | Agent systems | Audit Logging | Authentication · 401k eligibility | Flexible work environment | Hybrid work option | Paid time off | Parental leave eligibility · Senior-level Full Time · United States (Remote) · R · 20d ago
-
AWQ | AWS | Accelerate | Azure | Batching · Mid-level Full Time · Shenzhen, Guangdong, China · R · 20d ago
-
Sr GenAI Infra Specialist SA, AWS WWSO Startup · USD 153K-228K · AWS | Amazon EC2 | Amazon EKS | Amazon S3 | Cache optimization · Inclusive team culture | Mentorship and career growth | Work-life balance · Senior-level Full Time · New York, New York, USA · 21d ago
-
Solutions Architect - AI Technology Center, Foundation Model Building · KRW 65000K-90000K · AI model | AI model development | CUDA | Debugging | Fine Tuning · Senior-level Full Time · Korea, Seoul, Korea, Republic of · 21d ago
-
Staff Machine Learning Engineer, Voice AI · USD 220K-280K · Audio codecs | Audio signal processing | Batching | CUDA | Deep learning · Health insurance | Startup equity · Senior-level Full Time · San Francisco · 21d ago
-
Staff AI Platform Engineer - Abu Dhabi · USD 139K-300K · Alerting | Azure | CI/CD | Distributed tracing | Docker · Senior-level Full Time · Amman, Amman Governorate, Jordan · 22d ago
-
Inference Engineer - Acceleration · CHF 110K-160K · Admission control | CUDA | Cutlass | FlashAttention | KV cache · Commuting subsidy | Learning and development budget | Offsites and team events | Pension plan | Vacation days · Mid-level Full Time · Zürich, Switzerland · 22d ago
-
Software Engineer, Inference - Multi Modal · USD 295K-555K · Distributed Systems | GPU | High Throughput | Inference | Language Models · Entry-level Full Time · San Francisco · 24d ago
-
Senior-level Full Time · Beijing · 26d ago
-
Member of Technical Staff, AI Engineering · USD 162K-297K · Autogen | BF16 | C++ | CI/CD | CUDA · Income Protection for Illness or Injury | Medical, dental, vision plans | Paid Holidays | Paid family leave | Paid time off · Senior-level Full Time · Boise, ID - Main Site, United … · 27d ago
-
Deep learning | Evaluation Pipelines | GPU Cluster | High Performance | High-Performance Computing · Senior-level Full Time · Israel, Tel Aviv · 28d ago
-
Solutions Architect, Pre-training and Post-training · KRW 65000K-90000K · Artificial Intelligence | Debugging | Deep learning | Fine Tuning | GPU Architecture · Senior-level Full Time · Korea, Seoul, Korea, Republic of · 28d ago
-
AI Platform Engineer · INR 1500K-2500K · Automated Evaluation | CI/CD | CUDA | Continuous Checkpointing | Continuous batching · Mid-level Full Time · Bangalore, India · 28d ago
-
Senior-level Full Time · Israel · 28d ago
-
AI Engineer - Tieto Banktech (m/f/d) · NOK 792K-1075K · AWS | Anthropic | Azure | CI/CD | Docker · Autonomy | Collaborative culture | Hybrid working · Mid-level Full Time · Trondheim, Trøndelag, Norway · 29d ago
-
AI Engineer - Tieto Banktech (m/f/d) · NOK 792K-1075K · AWS | Anthropic | Azure | CI/CD | Cloud platform · Hybrid working · Mid-level Full Time · Trondheim, Trøndelag, Norway · 29d ago
-
AI Engineer - Tieto Banktech (m/f/d) · NOK 792K-1075K · AWS | Agentic Workflows | Anthropic | Azure | CI/CD · Flexible hybrid working · Mid-level Full Time · Fornebu, Akershus, Norway · 29d ago
-
AI Engineer - Tieto Banktech (m/f/d) · NOK 792K-1075K · AWS | Agent Frameworks | Azure | CI/CD | Cloud platform · Autonomy | Flexible hybrid working | Knowledge sharing · Mid-level Full Time · Bergen, Vestland, Norway · 29d ago
-
AI Engineer - Tieto Banktech (m/f/d) · NOK 792K-1075K · AWS | Agent Frameworks | Anthropic API | Azure | CI/CD · Autonomy | Collaborative culture | Flexible hybrid working · Mid-level Full Time · Bergen, Vestland, Norway · 29d ago
-
[Class of 2026 Campus Recruitment] Software Engineer (All Levels) – Large Models and Intelligent Robotics Systems · CNY 240K-480K · C++ | CUDA | DDS | GPU memory | GPU memory management · Full Time · Guangzhou, Shenzhen · 29d ago
-
Solution Architect (AI/LLM Inference) · USD 165K-330K · Artificial Intelligence | Benchmarking | Embeddings | GPU Selection | Image Generation · 401k company match | Fertility and family building stipend | Flexible PTO | Medical/Dental/Vision insurance | Paid parental leave · Senior-level Full Time · San Francisco · 30d ago
-
AWQ | Audio codecs | Audio streaming | Autoscaling | Chunked prefill · 401k matching | Annual offsites | Dental coverage | Employer-paid training | Healthcare benefits · Mid-level Full Time · San Francisco, CA · 1mo ago
-
Automatic Speech Recognition | DeepSpeed | Distributed Training | FSDP | GPU Memory Optimization · 401k matching | Healthcare Dental Vision | Hybrid work | New parent leave | Office Stocked · Mid-level Full Time · San Francisco, CA · 1mo ago
-
Forward Deployed Engineer (Inference & Post-Training) · USD 270K-300K · DPO | GRPO | KV cache | LoRA | Pipeline parallelism · Equity | Health insurance | Remote work flexibility · Senior-level Full Time · San Francisco · 1mo ago
-
Staff Software Engineer, Inference · USD 188K-275K · BF16 | C++ | CUDA | Distributed Systems | FP8 · 401k employer match | Dental insurance | Employee stock purchase program | Flexible PTO | Flexible spending account · Senior-level Full Time · Sunnyvale, CA / Bellevue, WA · 1mo ago
-
Senior-level Full Time · Iraklio, Greece · 1mo ago
-
Audio Inference Engineer, Model Efficiency · USD 165K-300K · C++ | Deep learning | Distributed inference | GPU Programming | Low-level system · Co-working stipend | Health and dental benefits | Inclusive culture | Mental health budget | Parental leave top-up · Mid-level Full Time · New York · 1mo ago
-
Full Time · Guangzhou, Shenzhen · 1mo ago