Engineer, Infrastructure (Kubernetes/AWS) (Open to remote)
New York, NY, US, 10019
Full Time Mid-level / Intermediate USD 140K - 160K
Bertelsmann
Penguin Random House is looking for a skilled ML platform engineer with expertise in AWS and Kubernetes to join our team. While Kubernetes will be a key focus, this role also requires working across our broader cloud systems, such as Databricks and Snowflake, to ensure infrastructure decisions support the full ML lifecycle.
In this role you will work closely with our development, operations, and data science teams to ensure the reliability and scalability of our cloud infrastructure, and to ensure it is well-integrated with ML development patterns. This is a hands-on engineering role focused on creating the foundation that enables fast, compliant, and reliable ML delivery at scale.
Specific responsibilities include:
- Designs, implements, and manages Kubernetes clusters on AWS using tools like Terraform; automates the deployment, scaling, and monitoring of containerized applications.
- Automates deployment and scaling of ML containers and cloud-native services.
- Supports infrastructure integration with platforms such as Databricks and Snowflake.
- Ensures the security and compliance of cloud infrastructure and applications.
- Monitors infrastructure performance and troubleshoots issues across cloud services and orchestration layers.
- Stays up to date with the latest trends and technologies in cloud computing and container orchestration.
Please apply if you meet the following qualifications:
- Proven experience with AWS cloud services, Kubernetes and Docker
- Strong background in Linux systems engineering
- Proficiency in programming languages such as Python, Java, or Go
- Strong understanding of CI/CD tools such as GitLab CI
- Knowledge of infrastructure as code tools like Terraform
- Excellent problem-solving skills and attention to detail
- Strong communication and collaboration skills
- Holds a CKA or CKAD certification from the Cloud Native Computing Foundation
- Comfortable collaborating across DevOps, infrastructure, and ML teams
Preferred Qualifications:
- Experience with monitoring tools like Prometheus, Grafana, Datadog, and Splunk
- Prior exposure to platform reliability, observability, and operational scaling challenges
- Experience with the deployment and scaling of ML models and workloads in production environments
The salary range for this position is $140,000 - $160,000. All positions are currently eligible for an annual profit award or bonus, subject to company results.
Please apply by July 7, 2025 and include your resume and cover letter for consideration. Before applying for any role at Penguin Random House, we recommend you review our applicant resources page and our FAQs page.
Penguin Random House job postings include a good faith compensation range for each open position. The salary range listed is specific to each particular open position and takes into account various factors, including the specifics of the individual role and the candidate's relevant experience and qualifications.
Full-time employees are eligible for our comprehensive benefits program. Our range of benefits includes, but is not limited to, Medical/Prescription drug insurance, Dental, Vision, Health Care/Dependent Care Flexible Spending Account, Health Savings Account, Pre-Tax and Roth 401(k), Short and Long-Term Disability Insurance, Life/AD&D Insurance, Commuter Benefits, Student Loan Repayment Program, Educational Assistance, and generous paid time off.
Penguin Random House is the leading adult and children's publishing house in North America, the United Kingdom and many other regions around the world. In publishing the best books in every genre and subject for all ages, we are committed to quality, excellence in execution, and innovation throughout the entire publishing process: editorial, design, marketing, publicity, sales, production, and distribution. Our vibrant and diverse international community of nearly 300 publishing brands and imprints includes Ballantine Bantam Dell, Berkley, Clarkson Potter, Crown, DK, Doubleday, Dutton, Grosset & Dunlap, Little Golden Books, Knopf, Modern Library, Pantheon, Penguin Books, Penguin Press, Penguin Random House Audio, Penguin Young Readers, Portfolio, Puffin, Putnam, Random House, Random House Children's Books, Riverhead, Ten Speed Press, Viking, and Vintage, among others. More information can be found at http://www.penguinrandomhouse.com/.
Penguin Random House values the array of talents and perspectives that a diverse workforce brings. All qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status.
Company: Penguin Random House LLC
Country: United States of America
State/Region: New York
City: New York
Postal Code: 10019
Job ID: 280897
Tags: AWS CI/CD Databricks DevOps Docker Engineering GitLab Grafana Java Kubernetes Linux Machine Learning ML models Python Security Snowflake Splunk Terraform
Perks/benefits: Flex hours Flexible spending account Flex vacation Health care Insurance Salary bonus Startup environment