Director of AI Infrastructure
Tasks
- Decide when to burst to cloud vs invest in on-prem capacity
- Develop storage roadmap for throughput and durability
- Direct strategy for Beaker orchestration platform
- Manage GPU compute budget and resource economics
- Optimize job scheduling for hybrid cloud workloads
- Oversee on-prem GPU cluster availability and performance
- Partner with hardware vendors to meet infrastructure demands
- Act as a technical bridge to research teams
Perks/Benefits
- 401k plan
- Annual bonuses
- Commuting support
- Employee assistance program
- Fitness and wellbeing support
- Health savings account
- Long-term incentive plan
- Medical/Dental/Vision
- Paid holidays
- Paid sick leave
- Paid vacation
- Personal days
Skills/Tech-stack
AWS | Beaker | Ceph | containerd | Distributed Systems | Docker | GCP | Go | High-Performance Computing (HPC) | Hybrid Cloud | InfiniBand | kube-scheduler | Kubernetes | Linux | Lustre | NCCL | NVIDIA GPU | Python | Resource Management | Slurm | Weka
Education
Bachelor of Engineering | Bachelor of Science | Master of Science