Lead Architect

Bengaluru, India

Fractal

Fractal is a strategic analytics partner to global Fortune 500 companies & powers every human decision in the enterprise with AI, engineering & design.


It's fun to work in a company where people truly BELIEVE in what they are doing!

We're committed to bringing passion and customer focus to the business.

Role overview:

We’re building a next-gen LLMOps team at Fractal to industrialize GenAI implementation and shape the future of GenAI engineering. This is a hands-on technical leadership role for AI engineers with strong ML and DevOps skills — ideal for those who love building scalable systems from the ground up. You will be designing, deploying, and scaling GenAI and Agentic AI applications with robust lifecycle automation and observability.

Required Qualifications:

  • 10 - 14 years of experience working on ML projects, with a product-building mindset, strong hands-on skills, technical leadership, and experience leading development teams
  • Model development, training, deployment at scale, and performance monitoring for production use cases
  • Strong knowledge of Python, data engineering, FastAPI, and NLP
  • Knowledge of LangChain, LlamaIndex, Langtrace, Langfuse, LLM evaluation, MLflow, and BentoML
  • Hands-on experience with both proprietary and open-source LLMs
  • Experience with LLM fine-tuning, including PEFT/CPT
  • Experience creating agentic AI workflows using frameworks such as CrewAI, LangGraph, AutoGen, and Semantic Kernel
  • Experience in performance optimization, RAG, guardrails, AI governance, prompt engineering, evaluation, and observability
  • Experience deploying GenAI applications at scale for production, on cloud and on-premises, using DevOps practices
  • Experience in DevOps and MLOps
  • Good working knowledge of Kubernetes and Terraform
  • Experience with at least one cloud (AWS / GCP / Azure) for deploying AI services
  • Team player with excellent communication and presentation skills

Must-have skills:

  • Product thinking: ideate, prototype, and scale internal accelerators for LLMOps
  • Architect and build scalable LLMOps platforms for enterprise-grade GenAI systems
  • Design and manage end-to-end LLM pipelines from data ingestion and embedding to evaluation and inference
  • Drive LLM-specific infrastructure: memory management, token control, prompt chaining, and context optimization
  • Lead scalable deployment frameworks for LLMs using Kubernetes and GPU-aware scaling
  • Build agentic AI operations capabilities, including agent evaluation, observability, orchestration, and reflection loops
  • Guardrails & Observability: Implement output filtering, context-aware routing, evaluation harnesses, metrics logging, and incident response
  • Platform Automation for LLMOps: Drive end-to-end automation with Docker, Kubernetes, GitOps, DevOps, Terraform, etc.

Product Thinking: Ideate, prototype, and scale internal accelerators and reusable components for LLMOps

GenAI Engineering: Productionize LLM-powered applications with modular, reusable, and secure patterns

Pipeline Architecture: Create evaluation pipelines — including prompt orchestration, feedback loops, and fine-tuning workflows

Prompt & Model Management: Design systems for versioning, AI governance, automated testing, and prompt quality scoring

Scalable Deployment: Architect cloud-native and hybrid deployment strategies for large-scale inference


Must-Have Technical Skills

  • LLMOps frameworks: LangChain, MLflow, BentoML, Ray, Truss, FastAPI
  • Prompt evaluation and scoring systems: OpenAI Evals, Ragas, Rebuff, Outlines
  • Cloud-native deployment: Kubernetes, Helm, Terraform, Docker, GitOps
  • ML pipelines and feature stores: Airflow, Prefect, Feast
  • Data stack: Spark/Flink, Parquet/Delta, Lakehouse patterns
  • Cloud: Azure ML, GCP Vertex AI, AWS Bedrock/SageMaker
  • Languages: Python (must), Bash, YAML, Terraform HCL (preferred)

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!



