AI Platform Engineering Operations Manager
San Jose, United States
Full Time Mid-level / Intermediate USD 143K - 289K
Adobe
Adobe is changing the world through digital experiences. We help our customers create, deliver and optimize content and applications.
Our Company
Changing the world through digital experiences is what Adobe’s all about. We give everyone—from emerging artists to global brands—everything they need to design and deliver exceptional digital experiences! We’re passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen.
We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity. We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!
About the Role
We are looking for an AI Platform Engineering Operations Manager to coordinate the operational excellence of our AI infrastructure. This is an outstanding opportunity to lead all aspects of platform reliability, scalability, observability, and efficiency across training, inference, and data platforms. You will work closely with platform engineering, ML engineering, SRE, research, Adobe product teams, and our vendor partners to ensure seamless, end-to-end ML platform capabilities.
Ideal candidate for this position:
Proven experience in deep technical operations of cloud infrastructure or infrastructure development, preferably in AI or large-scale compute environments.
Proficient in orchestration layer tools like Kubernetes, EKS/AKS/GKE. Experience handling large compute clusters, especially GPUs. Solid understanding of the ML lifecycle.
Demonstrated success in achieving goals through fostering positive relationships with diverse collaborators in multi-functional settings.
Comfortable with data pipelines, BI tools for dashboards, observability tools (e.g., Prometheus, Grafana), and incident tracking systems.
What you’ll do:
Manage capacity planning, scaling, and cost efficiency for meaningful AI training and inference deployments in collaboration with Program and Finance Managers, and CSP vendors.
Track and drive key operational improvements for metrics including GPU utilization efficiency, system uptime, user satisfaction, and NPS.
Design, drive and own day-to-day operations of AI platforms in close partnership with globally dispersed platform engineering and SRE teams.
Drive operational efficiency through a combination of process improvements and leaning heavily into automation and tooling to eliminate toil and improve platform reliability.
Collaborate multi-functionally to gather insights into platform usage patterns and user feedback to develop effective platform policies, improve user support, and influence the platform roadmap.
What you will need to succeed:
Strong attention to detail, combined with the ability to synthesize and communicate strategic insights.
Proactive and adaptable demeanor, with a “no task is too big or too small” approach to problem-solving and execution. Requires minimal direction in an ambiguous context to take action and adapt quickly.
Acute Degree of Ownership and Grit: You do not let go until a problem is solved for good.
Why Join Us?
Help craft the future of AI infrastructure at scale. Work with world-class researchers, engineers, and platform architects. Drive meaningful impact in operational efficiency, performance, and AI innovation. This is your chance to create a difference with a team that's ambitious, exceptionally dedicated, and driven to compete at the highest level. Join us in crafting something truly world-class!
#FireflyGenAI
Our compensation reflects the cost of labor across several U.S. geographic markets, and we pay differently based on those defined markets. The U.S. pay range for this position is $143,700 - $289,900 annually. Pay within this range varies by work location and may also depend on job-related knowledge, skills, and experience. Your recruiter can share more about the specific salary range for the job location during the hiring process.
At Adobe, for sales roles, starting salaries are expressed as total target compensation (TTC = base + commission), and short-term incentives are in the form of sales commission plans. For non-sales roles, starting salaries are expressed as base salary, and short-term incentives are in the form of the Annual Incentive Plan (AIP).
In addition, certain roles may be eligible for long-term incentives in the form of a new hire equity award.
State-Specific Notices:
California:
Fair Chance Ordinances
Adobe will consider qualified applicants with arrest or conviction records for employment in accordance with state and local laws and “fair chance” ordinances.
Colorado:
Application Window Notice
If this role is open to hiring in Colorado (as listed on the job posting), the application window will remain open until at least the date and time stated above in Pacific Time, in compliance with Colorado pay transparency regulations. If this role does not have Colorado listed as a hiring location, no specific application window applies, and the posting may close at any time based on hiring needs.
Massachusetts:
Massachusetts Legal Notice
It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.
Adobe is proud to be an Equal Employment Opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other applicable characteristics protected by law.
Adobe aims to make Adobe.com accessible to any and all users. If you have a disability or special need that requires accommodation to navigate our website or complete the application process, email accommodations@adobe.com or call (408) 536-3015.
Tags: Data pipelines Engineering Finance GPU Grafana Kubernetes Machine Learning ML infrastructure Pipelines Research
Perks/benefits: Equity / stock options Transparency