Machine Learning Runtime Engineer - Internship (PEY 2025)
Toronto, Ontario, Canada
Cerebras Systems
Cerebras is the go-to platform for fast and effortless AI training and inference. Cerebras has developed a radically new chip and system to dramatically accelerate deep learning applications. Our system runs training and inference workloads orders of magnitude faster than contemporary machines, fundamentally changing the way ML researchers work and pursue AI innovation.
We are innovating at every level of the stack – from chip, to microcode, to power delivery and cooling, to new algorithms and network architectures at the cutting edge of ML research. Our fully integrated system delivers unprecedented performance because it is built from the ground up for deep learning workloads.
Cerebras is building a team of exceptional people to work together on big problems. Join us!
About The Role
As a Runtime Engineer, you will directly impact how efficiently deep learning models are trained on our distributed systems hardware, and you will be responsible for enabling next-generation AI applications that require substantial computational capabilities. In this position, you will develop algorithms for execution, acceleration, partitioning, and routing of communication for dataflow graphs on a massively parallel, multi-core architecture.
Specific responsibilities may include:
- Understand the flow of data in a distributed system and characterize performance pain points
- Develop algorithms for allocation of compute, communication, and memory resources
- Measure, analyze, and improve execution of the Runtime software responsible for training large models with massive datasets
- Integrate successful optimizations into the production software stack
- Implement mathematical models in C++ or Python using discrete optimization techniques and standard libraries and packages
Requirements
- Currently enrolled at a university in Computer Science, Computer Engineering, or a related discipline
- Strong proficiency in C/C++
- Familiarity with Python or another scripting language
- The ability to operate at multiple levels of abstraction in the software stack
Preferred
- Knowledge of distributed systems, the memory subsystems of modern computers, and networking solutions
Term Length
- 12-16 months starting May 2025
Please apply to the job with BOTH your resume and transcript (official or unofficial).
Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.