Sr. AI Inference Systems Engineer
US-California-Palo Alto, United States
USD 120K-225K Senior-level Full Time
Tasks
- Build high-performance inference frameworks
- Design KV cache storage strategies
- Design router architecture
- Design technical roadmaps
- Develop standardized inference optimization schemes
- Evaluate inference architectures for real-time, batch, and streaming workloads
- Lead the resolution of technical bottlenecks in inference optimization
- Mentor team members
- Optimize inference operators for throughput and latency
- Optimize inference pipeline for large models
- Optimize scheduling and memory management
- Productize emerging inference technologies
- Research hardware accelerator inference logic
- Resolve distributed inference communication latency
- Resolve load imbalance in distributed inference
- Track advances in compiler optimization, model compression, and hardware fusion
Perks/Benefits
- 401k
- Dental insurance
- Disability insurance
- Health insurance
- Life insurance
- Paid holidays
- Paid sick leave
- Paid vacation
- Relocation assistance
- Restricted stock units
- Sign-on bonus
- Vision insurance
Skills/Tech-stack
CUDA | Distributed Systems | Hardware Accelerators | Inference Optimization | Instruction set | Instruction set architecture | Intelligent routing | KV cache | Language Models | Large Language Models | Memory Management | Model Compression | Multimodal Models | Parallel Computing | PyTorch | Quantization | Router architecture | Scheduling | TensorFlow | Triton