Intern Associate Engineer - Large Model Application Platforms

Kingston, Ontario, Canada

Huawei Technologies Canada Co., Ltd.

Huawei is a leading global provider of information and communications technology (ICT) infrastructure and smart devices.

Our team has an immediate 12-month internship opening for an Associate Engineer.

Responsibilities:

  • Research, prototype and build core infrastructure, tooling, and platforms to improve the productivity, quality, and efficiency of engineering and serving foundation model applications.
  • Design, implement, and assess application programming interfaces (APIs), frameworks, and runtime system software for heterogeneous architectures (e.g., GPU, NPU).
  • Support the integration of novel software frameworks on in-house hardware platforms (e.g., performance modeling, analysis of future computing architectures, resource allocation and management, scheduling, fault tolerance and resiliency, communication and shared memory).
  • Meet top industry and academic leaders and experts around the world, collaborate with top researchers and students, consult with Engineering teams across diverse domains, publish research papers in far-reaching and impactful areas, and submit patent applications for novel inventions.

Requirements:

What you’ll bring to the team:

  • Bachelor's, Master's, or PhD degree in Computer Science, Electrical & Computer Engineering, Machine Learning, or a related field.
  • Solid experience with one or more of the following programming languages: Python, C, C++, or Go; familiarity with software development practices (version management, build management, CI/CD, debugging, and profiling).
  • Solid understanding of any of these areas: Machine Learning and/or Deep Learning, Large Model Training and Fine-tuning (e.g., NLP/CV).
  • Experience with mainstream model training and inference frameworks and tools (e.g., PyTorch, TensorFlow, PaddlePaddle, OneFlow, MindSpore, Hugging Face Transformers and Accelerate, DeepSpeed, Megatron, FasterTransformer, Triton Inference Server).
  • Solid understanding of Computer Architecture, Distributed Computing, Parallel Computing, Cloud Native, Operating Systems, and Networks.
  • Experience using frameworks and tools from any of the aforementioned areas (e.g., Spark, Flink, or Ray for distributed computing; Docker or Kubernetes for cloud-native application and framework development).
  • Ability to evaluate, apply, and mature published research to real-world problems on prototype systems; an inquisitive mindset and proven research and communication skills; ability to conduct investigations and experiments independently, interpret experimental data, and present results clearly and concisely.
  • Publications in related top-tier venues (e.g., ICSE, FSE, TSE, ICLR, ICML, NeurIPS, OSDI, SOSP) are an asset.