AI Research Intern - Foundation Models
Austin, TX
Apptronik
Apptronik is building robots for the real world to improve human quality of life and to help solve the ever-increasing labor shortage problem. Our team has been building some of the most advanced robots on the planet for years, dating back to the DARPA Robotics Challenge. We apply our expertise across the full robotics stack to some of the most important and impactful problems our society faces, and expect our products and technology to change the world for the better. We value passion, creativity, and collaboration to help us overcome existing technological barriers in the industry to create truly innovative products.
You will join a team developing state-of-the-art general-purpose robots designed to operate in human spaces and with human tools. These robots are designed to work alongside humans, move through human spaces, and manipulate the world around them.
JOB SUMMARY
As an intern on the Apptronik AI Robotics Foundation Models (AIRfm) team, you will build a new generation of world-class large behavior and reasoning models for multi-resolution decision making in robotics. You will also build production-grade robotics solutions and deliver them to customers.
You will implement core infrastructure and conduct research to build Vision-Language-Action models for Physical AI. You will solve essential problems in curating pre-training data, understanding the effect of post-training data, and, finally, improving models with RL. You will also develop metrics and scaling laws for physical intelligence.
WHAT TO EXPECT
This position is expected to start in person around May/June 2025 and continue through the entire summer (i.e., through Aug/Sep 2025). We ask for a minimum of 10 weeks. Interns should expect to work Monday through Friday, up to 40 hours per week, typically between 9am and 5pm. Specific team norms around working hours will be communicated by your manager. Please consider these expectations before applying.
ESSENTIAL DUTIES AND RESPONSIBILITIES
This role touches all aspects of the multimodal model pipeline. Work may include model training, data pipeline development, evaluation, fine-tuning, and RL training, as well as tooling and support. You will work on systems for training multimodal transformers at scale for robotics:
- Inference optimization and distillation for real-time generation.
- Methods for native multimodal generation in language models.
- Quantitative evals for physical accuracy and intelligence.
- Scaling law science for multimodal pretraining and adding new modalities.
TECH STACK
- Python
- PyTorch, JAX and XLA
- CUDA (C++ and Triton)
EDUCATION and/or EXPERIENCE
- MUST be an MS or PhD student enrolled, during the internship term, in an accredited academic program located in the United States. (New graduates not enrolled in an accredited program for Fall 2025 are ineligible.)
- Pursuing a degree in Computer Science, Machine Learning, Engineering, Robotics, or a related field.
- Demonstrated track record, shown through a portfolio that can consist of code, technical writing (white papers or blog posts), or peer-reviewed papers in Tier-1 Machine Learning, Robotics, and Computer Vision venues.
- Expertise in ML and in fine-tuning large language models.
- Demonstrated experience in deep learning and transformer models.
- Experience in developing and managing large-scale machine learning systems.
- Experience building training codebases for large-scale video or multimodal foundation models.
- Expertise in optimizing the efficiency of distributed training systems and/or inference systems.
- Passion for building production-grade agentic frameworks with cutting-edge technology.
PHYSICAL REQUIREMENTS
- Prolonged periods of sitting at a desk and working on a computer
- Must be able to lift 15 pounds at times
- Vision to read printed materials and a computer screen
- Hearing and speech to communicate
*This is a direct hire. Please, no outside Agency solicitations.
Apptronik provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.