Member of Technical Staff - Training Platform
Tasks
- Build and maintain Helm charts for training stacks
- Build job submission and live run monitoring surfaces
- Build node-local model caches and checkpoint pipelines
- Build real-time streaming logs and step-level metrics tools
- Design and operate Kubernetes training and inference orchestration
- Develop FastAPI backend services and REST APIs
- Develop Python control plane agents for pod monitoring and cluster sync
- Develop the product UI for the hosted training platform
- Implement scheduling and autoscaling for GPU fleets
- Operate GitOps workflow for infrastructure changes
- Operate observability stack for GPU cluster debugging
Perks/Benefits
- Conference attendance
- Professional development budget
- Relocation support
- Remote work
- Team off-sites
- Visa sponsorship
Skills/Tech-stack
Ansible | DCGM | FastAPI | GPU Operator | GitOps | Grafana | Helm | KEDA | kubectl | Kubernetes | Linux | Loki | Next.js | OpenTelemetry | Prometheus | Python | REST API | React | SQLAlchemy | Tailwind | TanStack Query | Terraform | tRPC | TypeScript
Education
N/A