AI Research Engineer (Kernel & Inference Optimization)
Tasks
- Build and monitor inference tests in simulated and production environments
- Create test datasets and simulation scenarios for deployment
- Design model-serving architectures for low latency and high throughput
- Diagnose serving bottlenecks using performance metrics
- Evaluate model efficiency and iterate on inference algorithms
- Integrate inference frameworks into edge and on-device production pipelines
- Optimize batching and reduce network delays
- Optimize memory usage in inference pipelines
Skills/Tech-stack
Compute Shaders | Custom Compute Shaders | Data Pipelines | Diffusion Models | Distributed Inference Systems | Distributed Inference | Edge Computing | Expert Parallelism | Flash Attention | GPU Kernels | High Throughput | Inference Optimization | Inference Systems | KV Cache | Kernel Optimization | Latency Optimization | Low Latency | Memory Optimization | Metal Shading Language | Mobile Devices | Model Serving | Performance Benchmarking | Pipeline Parallelism | Pruning | Quantization | Shading Language | Speculative Decoding | Tensor Parallelism | Throughput Optimization | Vision Transformers