Senior Solutions Architect (AI/ML)

Santa Clara, California, United States

Full Time, Senior-level / Expert, USD 122K - 227K *

Proximity Works

A global team of coders, designers, product managers, geeks and experts. We solve complex problems and build cutting-edge tech at scale.

Posted 1 month ago

We are looking for a Senior Solutions Architect to design, develop, and scale innovative AI/ML-driven solutions. You will be responsible for architecting highly scalable, low-latency distributed systems optimized for AI/ML workloads. As a key technical leader, you will solve complex challenges, influence next-generation AI/ML infrastructures, and guide cross-functional teams to deliver state-of-the-art solutions for fast-growing startups and enterprise companies.

Be at the forefront of shaping next-generation AI/ML infrastructures, driving solutions for high-impact products across diverse industries. You'll have the opportunity to influence key architectural decisions and enable real-world applications that scale globally, ensuring innovation and efficiency at every step.

Requirements

You'll be responsible for —

Driving end-to-end GenAI architecture and implementation:

Design and deploy multi-agent systems using modern frameworks (LangGraph, CrewAI, AutoGen)
Architect RAG solutions with advanced vector store integration
Implement efficient fine-tuning strategies for foundation models
Develop synthetic data generation pipelines for training and testing

Leading ML infrastructure and deployment:

Design high-performance model serving architectures
Implement distributed training and inference systems
Establish MLOps practices and pipelines
Optimize cloud resource utilization and costs
Set up monitoring and observability solutions

Driving technical excellence and innovation:

Define architectural standards and best practices
Lead technical decision-making for AI/ML initiatives
Ensure scalability and reliability of AI systems
Implement AI governance and security measures
Guide teams on advanced AI concepts and implementations

Overseeing production AI systems:

Manage model deployment and versioning
Implement A/B testing frameworks
Monitor system performance and model drift
Optimize inference latency and throughput
Ensure high availability and fault tolerance

Fostering collaboration and growth:

Mentor engineering teams on AI architecture
Collaborate with stakeholders on technical strategy
Drive innovation in AI/ML solutions
Share knowledge through documentation and training
Lead technical reviews and architecture discussions

You need —

8+ years of experience in software engineering or architecture, including:

4+ years leading cross-functional GenAI/ML teams
Production experience with distributed AI systems
Enterprise-scale AI architecture implementation

To lead and architect enterprise-scale GenAI/ML solutions, focusing on:

Multi-agent orchestration using LangGraph, CrewAI, and AutoGen
Workflow automation with LlamaIndex, LangChain, and LangFlow
Agent coordination using the Letta framework
Integration of specialized agents for reasoning, planning, and execution

To design and implement sophisticated AI architectures incorporating:

Advanced RAG systems using:

Vector databases (Chroma, Weaviate, Pinecone, Milvus)
Hybrid search with BM25 and semantic embeddings
Self-querying and recursive retrieval patterns

Fine-tuning strategies for foundation models:

PEFT methods (LoRA, QLoRA, Adapter-tuning)
Parameter-efficient training approaches
Instruction fine-tuning and RLHF

Multi-agent frameworks integrating:

Tool-use and reasoning chains
Memory systems (short-term and long-term)
Meta-prompting and reflection mechanisms
Agent communication protocols

Expertise in advanced data generation and synthesis:

Synthetic data generation using Argilla and PersonaHub
Privacy-preserving data synthesis
Domain-specific data augmentation
Quality assessment of synthetic data
Data balancing and bias mitigation

To architect high-performance ML serving infrastructure focusing on:

Model serving platforms (BentoML, Ray Serve, Triton)
Real-time processing with Ray, Kafka, and Spark Streaming
Distributed training using Horovod, DeepSpeed, and FSDP
vLLM and TGI for efficient inference
Integration patterns for hybrid cloud-edge deployments

To drive cloud architecture decisions across:

Kubernetes orchestration with Kubeflow and KServe
Serverless ML with AWS Lambda, Azure Functions, Cloud Run
Auto-scaling using HPA, KEDA, and custom metrics
Resource optimization with NVIDIA Triton and TensorRT
MLOps platforms (MLflow, Weights & Biases, DVC)

Benefits

Bonus points for —

Research publications in AI/ML
Open-source project maintenance
Technical blog posts on AI architecture
Conference presentations
AI community leadership

What you get —

Best-in-class salary: We hire only the best, and we pay accordingly.
Proximity Talks: Meet other designers, engineers, and product geeks — and learn from experts in the field.
Keep on learning with a world-class team: Work with the best in the field, challenge yourself constantly, and learn something new every day.

About us —

We are Proximity, a global team of coders, designers, product managers, geeks, and experts. We solve complex problems and build cutting-edge tech at scale.