Senior System Software Manager

San Jose

Etched

Transformers etched into silicon. By burning the transformer architecture into our chips, we're creating the world's most powerful servers for transformer inference.

About Etched

Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents.

Job Summary

Lead System Software development for Etched’s groundbreaking Inference Acceleration Systems. As Senior Manager, System Software, you will guide talented engineering and test teams responsible for the full low-level stack (firmware, drivers, OS, monitoring, test automation). Key responsibilities include attracting and developing world-class talent, defining technical strategy, driving quality execution from silicon bring-up through production, and collaborating across hardware, software, and manufacturing partners to deliver high-performance, reliable ML platforms.

Key responsibilities

  • Team Leadership & Talent Development: Lead, manage, and inspire high-caliber teams of system software developers and system test engineers. 

  • Team Building: Build and scale world-class system software and test teams by attracting, hiring, and retaining top-tier engineering talent.

  • Technical Strategy & Roadmap: Define and drive the technical strategy, architecture, and development roadmap for the entire system software stack (UEFI/BIOS, BMC, RoT, Drivers, OS, and Monitoring).

  • Coach and Mentor: Actively coach and mentor team members, fostering professional growth through challenging assignments, targeted development plans, and continuous feedback. Cultivate a team culture focused on technical excellence and results.

  • Execution & Delivery: Oversee the end-to-end software development lifecycle, including design, implementation, testing, validation, and release.

  • System Validation Leadership: Provide direction and oversight for System Test engineering teams responsible for the validation and qualification of the Etched ML System Software stack.

  • Manufacturing Test Integration: Collaborate closely with Manufacturing Operations, Test Engineering, and external partners (CMs) to deliver system software testing and diagnostics to manufacturing environments.

  • Cross-Functional Collaboration: Partner effectively with ASIC design, hardware platform engineering, and external manufacturing partners to ensure seamless hardware/software integration and address system-level challenges.

  • Golden Image Management: Oversee the creation, maintenance, and release process for validated golden reference container images.

  • Resource Management: Manage project priorities, deadlines, and resources effectively across development and test teams for multiple concurrent projects.

You may be a good fit if you have

  • Proven experience (5+ years) managing software engineering teams, ideally in system software, embedded Linux, server firmware, or system-level validation

  • A strong track record of attracting, developing, mentoring, and retaining top engineering talent, fostering high-performing and diverse teams

  • Deep technical expertise in areas such as BIOS/UEFI, BMC firmware, Root of Trust/Secure Boot, Linux kernel, PCIe device drivers, OS-level services, and system monitoring

  • Solid understanding of server and hardware architecture, including CPU/SoC/ASIC design, memory hierarchies, and PCIe interconnects

  • Experience managing the complete software development lifecycle: requirements, design, implementation, testing, release, and maintenance

  • History of successfully delivering complex system software projects tightly coupled with hardware platforms

  • Excellent leadership, communication, and collaboration skills across functional boundaries

  • Strong problem-solving abilities, especially in debugging complex hardware/software system interactions

  • Familiarity with modern development workflows including Git, CI/CD pipelines, and software engineering best practices

Strong candidates may also have experience with

  • System software development or validation for AI/ML accelerators or custom ASIC/SoC hardware platforms

  • Deploying and managing diagnostics and software tests in high-volume manufacturing environments (e.g., factory test, L10)

  • Working with OpenBMC, Redfish, or other modern BMC firmware stacks and related standards

  • Deep knowledge of system-level security: threat modeling, secure boot, and secure development lifecycle practices

  • Managing container technologies like Docker and Kubernetes at the node/system level

  • Familiarity with working alongside server contract manufacturers in APAC, including logistics and test support coordination

Ideal background

  • Current Development or Validation Managers/Directors (or those managing combined teams) from semiconductor, server hardware, cloud infrastructure, or HPC companies.

  • Senior technical leaders or architects in system software or validation with demonstrated leadership experience, strong mentorship skills, and readiness for management.

  • Managers who have led teams responsible for delivering and qualifying foundational software for complex hardware systems, including enabling manufacturing test.

  • Individuals with a strong track record of building and managing teams focused on firmware, kernel, OS development, and system test, prioritizing technical excellence, quality, and talent development.

Benefits

  • Full medical, dental, and vision packages, with 100% of premiums covered

  • Housing subsidy of $2,000/month for those living within walking distance of the office

  • Daily lunch and dinner in our office

  • Relocation support for those moving to West San Jose

How we’re different

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in West San Jose, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

Region: North America
Country: United States
