Platform Support Architect
USD 175K-200K | Senior-level | Full Time
Tasks
- Author and maintain support triage runbooks and checklists
- Build hands-on labs and proofs of concept for RAG and agentic AI use cases
- Collaborate to align reference architectures and best practices across teams
- Collect and interpret logs and telemetry; create minimal repros and defect reports
- Define and validate unified diagnostics bundles
- Develop reusable technical assets, implementation guides, and best-practice playbooks
- Diagnose performance bottlenecks in RAG and agentic AI workflows
- Perform end-to-end triage across GPU, NVAIE, vector DB, Kubernetes, Docker, networking, and storage
- Provide NVIDIA AI Enterprise and vector database support for customer environments
- Provide field feedback to product management and engineering on compatibility, upgrade, rollback, and observability needs
Perks/Benefits
- N/A
Skills/Tech-stack
CI/CD | CUDA | Canary Deployment | Ceph | Ceph RBD | Docker | Elasticsearch | Embeddings | GPFS | GPU Operator | Grafana | Helm | Inference Server | Infinia | InfiniBand | Ingestion pipelines | Kubernetes | Linux | Lustre | MLOps | Milvus | NFS | NVIDIA GPU | NVIDIA GPU Operator | NVIDIA NeMo | NVIDIA NIM | Observability | Prometheus | Prompt engineering | RAG | RDMA | Reranking | Retrieval | Rollback | S3 | SMB | TensorRT | Triton Inference | Triton Inference Server | Vector Database
Education
N/A