Member of Technical Staff (Infrastructure): World Models
Tasks
- Allocate GPU and cluster resources
- Build, operate, and scale GPU infrastructure
- Collaborate with researchers and engineers on workload requirements
- Coordinate GPU provider relationships
- Design scheduling that lets inference and training workloads coexist
- Develop automation tooling and observability
- Drive architecture decisions for compute and storage systems
- Improve reliability through incident response
- Monitor GPU utilization and cost
- Participate in on-call rotation
- Set scheduling policy
- Validate infrastructure fit for evolving workloads
Perks/Benefits
- Fully distributed, async-first culture
- Hardware setup of your choice
- Internet stipend
- Meals stipend
- Pension contribution
- Phone stipend
- Private health coverage
Skills/Tech-stack
Automation | Distributed Computing | Distributed Storage | Distributed Systems | GPU infrastructure | Incident Response | Kernel debugging | Kubernetes | Linux | Monitoring | Networking | Observability | Resource allocation | Scheduling | Slurm | Storage
Education
N/A