Manufacturing Test Lead

Mountain View, California, United States

Microsoft

Discover Microsoft products and services for your home or business. Shop Microsoft 365, Copilot, Teams, Xbox, Windows, Azure, Surface, and more.

Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the team behind Microsoft’s expanding Cloud Infrastructure and is responsible for powering Microsoft’s “Intelligent Cloud” mission. SCHIE delivers the core infrastructure and foundational technologies for more than 200 Microsoft online businesses, including Bing, MSN, Office 365, Xbox Live, Teams, OneDrive, and the Microsoft Azure platform, globally with our server and data center infrastructure, security and compliance, operations, globalization, and manageability solutions. Our focus is on smart growth, high efficiency, and delivering a trusted experience to customers and partners worldwide, and we are looking for passionate, high-energy engineers to help achieve that mission.

 

As Microsoft's cloud business continues to grow, the ability to deploy new offerings and hardware infrastructure on time, in high volume, with high quality, and at the lowest cost is of paramount importance. To achieve this goal, the Hardware, Infrastructure Management, and Fundamentals Engineering (HIFE) team is instrumental in defining and delivering operational measures of success for hardware manufacturing, improving the planning process, quality, delivery, scale, and sustainability related to Microsoft cloud hardware. We are looking for seasoned engineers with a dedicated passion for customer-focused solutions, insight, and industry knowledge to envision and implement future technical solutions that will manage and optimize the Cloud infrastructure.

 

We are looking for a Manufacturing Test Lead to join our team. Join us in this exciting AI revolution. Be part of the infrastructure engineering taskforce to fuel this world-changing mission.

 

Responsibilities

  • Define and lead end-to-end manufacturing test strategies for PCBAs, storage enclosures, and rack-level systems.
  • Lead the development of test hardware, software, and firmware to validate the functionality of complex systems including GPUs, CPUs, and liquid-cooled platforms, ensuring alignment with product design and performance goals.
  • Develop test plans and validation metrics for GPU-based platforms (e.g., NVIDIA HGX, GB200), covering bring-up, functional, performance, and stress diagnostics.
  • Integrate AI/ML models to dynamically adjust test coverage based on historical data, product complexity, and risk profiles.
  • Implement AI-driven anomaly detection systems to flag test escapes and reduce false positives in real time.
  • Design and deliver end-to-end test solutions, particularly for advanced liquid cooling technologies, addressing both macro and micro-level thermal transfer challenges (e.g., fluids, pumps, manifolds, connectors).
  • Collaborate across multidisciplinary teams (mechanical, electrical, process, and production engineering) to integrate test strategies early in the product lifecycle and ensure seamless execution during manufacturing.
  • Define and maintain test architecture and core test content, including reusable scripts and test cases, for both blade-level and rack-level systems, ensuring scalability and consistency across product lines.
  • Monitor production yield and test data, identify systemic issues, and drive root cause analysis and corrective actions to improve test coverage, product quality, and manufacturing efficiency.
  • Assess manufacturing test readiness before each NPI (New Product Introduction) build, conducting risk assessments and coordinating mitigation plans with internal and external stakeholders.
  • Initiate early engagement in design phases to identify test coverage gaps, develop new test materials, and establish success metrics to ensure quality and reliability from prototype to production.
  • Ensure comprehensive documentation and verification coverage across all product stages, mapping test cases to customer impact and business value, including coverage ROI and cost rationalization.
  • Drive continuous improvement initiatives by analyzing test system data, eliminating non-value-added processes, and enhancing test effectiveness and efficiency.
  • Engage with CM/ODM partners to understand production capabilities and limitations, ensuring supply chain alignment and consistent delivery of high-quality products.
  • Leverage predictive analytics and machine learning to forecast failure trends and proactively mitigate risks.
  • Evaluate AI readiness of supplier test systems and drive adoption of intelligent test solutions across the ecosystem.
  • Collaborate with data science teams to develop AI tools that support test optimization and decision-making.

Qualifications

Required Qualifications:

  • Bachelor's Degree in Electrical Engineering, Computer Science, Mechanical Engineering (Hardware) or related field AND 8+ years of enterprise computer, consumer electronics design or hyperscale supply chain systems experience
    • OR Master's Degree in Electrical Engineering, Computer Science, Mechanical Engineering (Hardware) or related field AND 4+ years of enterprise computer, consumer electronics design or hyperscale supply chain systems experience
    • OR Doctorate in Electrical Engineering, Computer Science, Mechanical Engineering (Hardware) or related field AND 3+ years of enterprise computer, consumer electronics design or hyperscale supply chain systems experience
    • OR equivalent experience.
  • 8+ years of experience in manufacturing test engineering for compute and hyperscale platforms (e.g., servers, storage, GPUs), with deep expertise in test development and deployment at the building block, server, and rack levels.
  • 3+ years of experience working with data center cooling infrastructure or advanced thermal management solutions, including liquid cooling technologies.
  • 3+ years of experience developing diagnostic tools or utilities for electronic systems in high-volume or contract manufacturing environments.

Other Requirements:

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

 

Microsoft Cloud Background Check: 

This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

 

Preferred Qualifications:

  • Experience in hyperscale or OEM manufacturing environments (e.g., Microsoft Azure, AWS, or similar).
  • Familiarity with hardware diagnostics and test methodologies, including PCIe/NVLink, NVIDIA NVML APIs, sensor telemetry, and stress testing for memory, CPU, and GPU components.
  • Knowledge of rack-level test automation frameworks, including barcode scanning, firmware flashing, and test sequencing.
  • Proficient in telemetry and monitoring systems such as Prometheus and Grafana for real-time test data visualization.
  • Solid understanding of PCBA and enclosure-level design and manufacturing processes.
  • Hands-on experience with software development in languages such as Python, SQL, C#, C++, or Rust.
  • Familiarity with firmware/BIOS development and driver integration for Linux or Windows platforms.
  • Experience with modern software development workflows and version control tools (e.g., Git).
  • Direct experience integrating CPU/GPU test tools into high-volume manufacturing test flows.
  • Proficient in Linux OS, including scripting and command-line operations.
  • Experience with Azure DevOps Services, Power BI, or Power Automate is a plus.
  • Proficient in Microsoft Office applications for documentation and reporting.
  • Ability to navigate ambiguity, translate complex concepts into practical processes, and drive implementation across cross-functional teams.
  • Proficient in Linux-based environments, including scripting, command-line operations, and troubleshooting test systems and storage enclosures.
  • Proven analytical and problem-solving skills and a demonstrated ability to collaborate across multidisciplinary technical teams.
  • 2+ years of hands-on experience in data center or test lab environments.
  • Willingness to travel domestically and internationally, and to support off-hours work as required.

Manufacturing Test Engineering IC5 - The typical base pay range for this role across the U.S. is USD $139,900 - $274,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $188,000 - $304,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay.

Microsoft will accept applications for the role until July 17, 2025.

 

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.  We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.

 

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

 

#azurehwjobs #HIFE #SCHIE 

Category: Leadership Jobs

Tags: APIs Architecture AWS Azure Computer Science Data visualization DevOps Engineering Git GPU Grafana Linux Machine Learning ML models NVLink Power BI Python Rust Security SQL Testing

Perks/benefits: Career development Gear Health care Medical leave

Region: North America
Country: United States
