Co-op Researcher - Multimodal Large Language Model (MLLM) Serving Optimization
Vancouver, British Columbia, Canada
Huawei Technologies Canada Co., Ltd.
Huawei is a leading global provider of information and communications technology (ICT) infrastructure and smart devices. Huawei Canada has an immediate co-op opening for a Researcher.
About the team:
The Intelligent Cloud Infrastructure Lab aims to innovate technologies, algorithms, systems, and platforms for next-generation cloud infrastructure. The lab addresses scalability, performance, and resource utilization challenges in existing cloud services while preparing for future challenges with appropriate technologies and architectures. Additionally, the lab aims to understand industry dynamics and technology trends to create a robust ecosystem.
About the job:
Design, implement, and optimize a high-performance serving platform for MLLMs.
Integrate SOTA open-source serving frameworks such as vLLM, SGLang, or LMDeploy.
Develop techniques for efficient resource utilization and low-latency inference for MLLMs in serverless environments.
Optimize memory usage, scalability, and throughput of the serving platform.
Conduct experiments to evaluate and benchmark MLLM serving performance.
Contribute novel ideas to improve serving efficiency and publish findings when applicable.
Work with cross-functional teams, including researchers and engineers, to ensure seamless deployment and integration of the platform.
Provide technical guidance and support for platform users.
The base salary for this position ranges from $56,000 to $79,000 depending on education, experience and demonstrated expertise.
Requirements
About the ideal candidate:
Bachelor’s degree or higher in Computer Science, Electrical and Computer Engineering (ECE), or a related field.
Strong proficiency in PyTorch and Python, and familiarity with other programming languages as needed.
Experience with one or more SOTA LLM serving frameworks such as vLLM, SGLang, or LMDeploy. Experience with inference optimization for large-scale AI models.
Familiarity with distributed systems, serverless architectures, and cloud computing platforms. Familiarity with multimodal architectures and serving requirements.
Strong analytical and problem-solving abilities.
Previous experience in deploying AI platforms on cloud services.
Excellent communication and teamwork skills.
Tags: Architecture, Computer Science, Distributed Systems, Engineering, LLMs, Open Source, Python, PyTorch, vLLM