Senior Solutions Architect - AI Factory Deployment
US, CA, Remote, United States
USD 184K-356K | Senior-level | Full Time
Tasks
- Analyze NCCL usage and collectives
- Automate benchmark execution and regression checks
- Build observability dashboards
- Collaborate with cross-functional teams on customer readiness
- Collect and analyze benchmark results
- Create documentation and guidelines
- Debug hanging or underperforming jobs
- Recommend cluster tuning and parallelism strategies
- Run AI and LLM benchmarks
- Set up AI factory environments
- Troubleshoot training failures
- Verify NCCL and distributed training configurations
Skills/Tech-stack
AllReduce | AllToAll | Automation | Bash | Benchmarking | CI/CD | Dashboards | Deep learning | Distributed Training | Linux | Logging | Metrics | NCCL | NVIDIA GPU | Observability | PyTorch | Python | TensorFlow | Tracing
Roles
AI | AI Solutions | AI Solutions Architect | Architect | Solutions Architect