AI Research Engineer (Kernel & Inference Optimization)
Tasks
- Build and monitor inference tests
- Create test datasets and simulation scenarios
- Deploy inference pipelines
- Design model serving architectures
- Identify and resolve serving bottlenecks
- Integrate inference frameworks into production pipelines
- Optimize inference strategies
- Track performance metrics
Skills/Tech-stack
Computer Vision | Diffusion Models | Edge Computing | Expert parallelism | Flash Attention | GPU Kernels | Inference Optimization | KV cache | Low-level optimization | Machine Learning | Memory Management | Mobile optimization | Model Serving | NLP | Neural Networks | On-device Inference | Pipeline parallelism | Pruning | Quantization | Speculative decoding | Tensor Parallelism | Vision Transformers