Principal Software Developer - AI Infra Compute
Tasks
- Build diagnostic services for GPU systems
- Collaborate with networking and data center operations teams
- Debug and resolve customer issues
- Deliver large-scale production systems
- Design distributed GPU infrastructure software
- Develop health monitoring and triage automation
- Implement GPU server firmware
- Manage RoCE- and InfiniBand-based workloads
Perks/Benefits
- 401k match
- Dental insurance
- Disability insurance
- Flexible spending accounts
- Health insurance
- Life insurance
- Paid holidays
- Paid parental leave
- Paid sick leave
- Paid time off
- Vision insurance
Skills/Tech-stack
C++ | Computer networks | Data synchronization | Distributed Systems | Fault Tolerance | Go | High Performance | High-Performance Computing | InfiniBand | Java | Linux | Memcache | MySQL | Operating Systems | Performance Computing | Python | Redis | RoCE | SQL | Shell Scripting | State management