Head of Supercomputing
Tasks
- Build and scale the supercomputing engineering organization
- Build system services integrating firmware, drivers, kernel, and runtime
- Define reliability targets and operational metrics
- Define supercomputing software architecture
- Develop control plane software for system bring-up
- Establish telemetry and observability infrastructure
- Implement orchestration primitives for devices, nodes, racks, and clusters
- Integrate diagnostics into manufacturing and test environments
- Lead cross-functional infrastructure decisions
- Lead the supercomputing software roadmap
- Own release processes for production deployments
- Own the system software lifecycle from silicon bring-up to production
- Recruit, mentor, and retain systems engineers
Perks/Benefits
- Daily meals
- Dental coverage
- Housing subsidy
- Medical coverage
- Relocation support
- Vision coverage
- Wellness benefits
Skills/Tech-stack
Bring-up | Debugging | Device Drivers | Fleet Management | HPC | Interrupt Handling | Kernel | Memory hierarchy | Networking | Observability | Orchestration | PCIe | Provisioning | RDMA | Runtime | System bring-up | Telemetry
Education
N/A
Roles
Director | Director of Engineering | Engineering | Head | Head of Supercomputing