Software Engineer II (GPU Performance)

Mountain View, California, United States

Full Time Mid-level / Intermediate USD 98K - 208K

Microsoft

Explore Microsoft products and services for your home or business. Shop Microsoft 365, Copilot, Teams, Xbox, Windows, Azure, Surface and more.

Posted 1 week ago

The Artificial Intelligence (AI) Frameworks team at Microsoft develops the AI software used to train and deploy the world’s most advanced AI models. We collaborate with our hardware teams and partners to build the software stacks for Microsoft’s next-generation supercomputers and the new Maia-100 AI accelerator. We work closely with ML researchers and developers to optimize and scale out model training and inference. We work directly with OpenAI on the models hosted on the Azure OpenAI service.

We are hiring a Software Engineer II (GPU Performance) to work on GPU (Graphics Processing Unit) performance analysis and optimization. As a member of this team, you will have the opportunity to work on the fundamental abstractions, programming models, runtimes, libraries and APIs that enable large-scale training and inference of models on novel AI hardware.

This is a technical role focused on performance analysis and optimization of machine learning models: it requires hands-on software development skills. We’re looking for someone who has a demonstrated history of solving hard technical problems and is motivated to tackle the hardest problems in building a full end-to-end AI stack. An entrepreneurial approach and ability to take initiative and move fast are essential.

We do not just value differences or different perspectives. We seek them out and invite them in so we can tap into the collective power of everyone in the company. As a result, our customers are better served.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

Software development in C/C++, Python, and in GPU languages such as CUDA, ROCm, or Triton
Work with cutting-edge hardware stacks and a fast-moving software stack to deliver best-in-class inference at optimal cost.
Engage with key partners to understand and implement performance analysis and optimization for state-of-the-art LLMs and other models.
Embody our culture and values

Qualifications

Required Qualifications:

Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C/C++, CUDA, or ROCm
- OR equivalent experience.
1+ years’ practical experience working on applications that use GPUs, and experience optimizing their performance

Other Requirements:

Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C/C++, CUDA, or ROCm
- OR Master's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C/C++, CUDA, or ROCm
- OR equivalent experience.
Experience writing new GPU kernels, going beyond experience of GPU workloads with existing library kernels
Experience in low-level performance analysis and optimization, including proficiency using GPU profiling tools such as NVIDIA Visual Profiler and NVIDIA Nsight Compute
Technical background and solid foundation in software engineering principles and architecture design
Exposure to Deep Neural Network inference and experience in one or more deep learning frameworks such as PyTorch, TensorFlow, or ONNX Runtime
Cross-team collaboration skills and the desire to collaborate in a team of researchers and developers

Software Engineering IC3 - The typical base pay range for this role across the U.S. is USD $98,300 - $193,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $127,200 - $208,800 per year. Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
Microsoft will accept applications for the role until November 29, 2024.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.


Category: Engineering Jobs

Tags: APIs Architecture Azure Computer Science CUDA Deep Learning Engineering GPU LLMs Machine Learning ML models Model training ONNX OpenAI Python PyTorch Security TensorFlow