Senior Consultant Specialist (Model Hosting/Inference Optimization)
Guangzhou, Guangdong, China
CNY 144K-184K (estimate) | Senior-level | Full Time
Tasks
- Build end-to-end fine-tuning pipelines
- Design and operate model hosting platforms for LLMs, embeddings, speech-to-text (STT), and text-to-speech (TTS)
- Ensure production reliability, scalability, security, and high availability
- Evaluate and integrate inference frameworks for target hardware
- Implement distributed training and hyperparameter tuning for fine-tuning
- Monitor inference health and performance; troubleshoot deployment issues
- Optimize inference for latency, throughput, and cost
- Partner with hardware teams on hardware-specific optimizations
Perks/Benefits
- N/A
Skills/Tech-stack
AWQ | AWS | Accelerate | Benchmarking | CUDA | Distributed Training | Docker | FP8 | GPTQ | Google Cloud | Hugging Face | Hugging Face Transformers | Hyperparameter Tuning | INT4 | KV cache | Kubernetes | LoRA | Microsoft Azure | Monitoring | Python | QLoRA | Quantization | SGLang | TensorRT-LLM | vLLM