Senior Consultant Specialist (Model Hosting/Inference Optimization)
Guangzhou, Guangdong, China
CNY 144K-240K (estimate) | Senior-level | Full Time
Tasks
- Collaborate with hardware teams for performance optimizations
- Design model hosting platforms for LLMs and embeddings
- Ensure production readiness for reliability, scalability, security, and high availability
- Implement fine-tuning pipelines for foundation models
- Integrate and tailor inference frameworks for target hardware
- Monitor inference health and performance
- Optimize inference for latency, throughput, and cost
- Troubleshoot deployment bottlenecks and issues
Perks/Benefits
- N/A
Skills/Tech-stack
AWQ | AWS | Batching | CPU architecture | CUDA | Distributed Training | Docker | FP8 | Fine Tuning | GPTQ | GPU Architecture | Google Cloud | HPC | Hugging Face | Hugging Face Accelerate | Hugging Face Transformers | Hyperparameter Tuning | INT4 | KV cache | Kubernetes | LoRA | Microsoft Azure | Monitoring | Operator optimization | Python | QLoRA | Quantization | SGLang | TensorRT-LLM | vLLM