Member of Technical Staff - Infrastructure & Data

Remote

Moonvalley AI

The imagination research company building ML video and image models that captivate.

Moonvalley is building the next-generation creative studio, powered by the most capable video and image foundation models in the world. We are creating the platforms where the first generative Super Bowl ads and Oscar-winning movies will be made.

We’re the most pedigreed team in generative AI, with top former DeepMind video researchers leading a research team as deep as any in the industry, product leaders who have built some of the best software products in the world, and an in-house Oscar-nominated movie studio. We’ve also raised $75M from world-class investors including General Catalyst, Bessemer, Khosla Ventures, and Y Combinator.

Role Summary:

We’re looking for an Infrastructure Engineer to shape the backbone of our AI systems as we develop cutting-edge AI models. Joining at an early stage, you'll have the unique opportunity to architect infrastructure at scale, harness thousands of GPUs, tackle challenging data problems, work with top AI talent, and push the boundaries of large-scale model capabilities. We’re looking for people who love dealing with technical complexity, thrive in an innovative and fast-paced environment, and want to shape the future of AI.

What you’ll do:

  • Manage and scale GPU infrastructure (Kubernetes, Terraform / Pulumi).

  • Maintain ETL pipelines (Spark / Ray / Airflow).

  • Oversee the telemetry platform to monitor system health (Datadog, Grafana, W&B).

  • Manage the code platform (GitHub, CI/CD, PyTorch, Python).

  • Track and optimize assets like datasets, checkpoints, and compute resources.

  • Develop tools, documentation, and guidance for the team.

What we’re looking for:

  • Passion for building petabyte-scale systems that enhance efficiency and productivity.

  • Ability to balance quick fixes for urgent needs with long-term, scalable solutions.

  • Strong prioritization skills in a fast-moving, high-impact environment.

  • Comfortable using open-source tools or developing custom solutions when needed.

  • A versatile generalist, eager to learn and adapt to new tools and systems.

Nice to haves:

  • Experience with infrastructure for large-scale AI training.

  • Cluster Engineering: GPU infrastructure, Kubernetes expertise.

  • Data Engineering: Mastery of ETL pipelines.

  • Developer Advocacy: Improving workflows, documentation, and tool adoption.

On our team, we approach our work with a dedication similar to that of Olympic athletes. Expect occasional late nights and weekends devoted to our mission. We understand this level of commitment may not suit everyone, and we communicate this expectation openly.

If you're motivated by deeply technical problems, a seemingly never-ending uphill battle, and the opportunity to build (and own) a generational technology company, we can give you what you're looking for.

All roles at Moonvalley are either fully remote by default or hybrid positions if specified. We meet a few times every year, usually in London, UK or North America (LA, Toronto) as a company.

If you're excited about the opportunity to work on cutting-edge AI technology and help shape the future of media and entertainment, we encourage you to apply. We look forward to hearing from you!

The statements contained in this job description reflect general details as necessary to describe the principal functions of this job, the level of knowledge and skill typically required, and the scope of responsibility. They should not be considered an all-inclusive listing of work requirements. Individuals may perform other duties as assigned, including work in other functional areas to cover absences, to equalize peak work periods, or to otherwise balance organizational work.

Moonvalley AI is proud to be an equal opportunity employer. We are committed to providing accommodations. If you require accommodation, we will work with you to meet your needs.

Please be assured that we'll treat any information you share with us with the utmost care, use it only for recruitment purposes, and never sell it to other companies for marketing purposes. Please review our privacy policy and career privacy policy for further information.
