Staff Machine Learning Engineer
Bengaluru, India
Automation Anywhere
The industry's most advanced, most deployed agentic process automation system combines the power of AI, Automation, and RPA to deliver secure end-to-end enterprise agentic automation for mission-critical processes.
About Us
Automation Anywhere is the leader in Agentic Process Automation (APA), transforming how work gets done with AI-powered automation. Its APA system, built on the industry’s first Process Reasoning Engine (PRE) and specialized AI agents, combines process discovery, RPA, end-to-end orchestration, document processing, and analytics—all delivered with enterprise-grade security and governance. Guided by its vision to fuel the future of work, Automation Anywhere helps organizations worldwide boost productivity, accelerate growth, and unleash human potential.
Staff Engineer – Machine Learning
Job Description
We are seeking a highly skilled Staff Machine Learning Engineer with a strong background in Machine Learning and Software Engineering to join our team. In this role, you will collaborate closely with product managers, applied scientists, and engineers to design, implement, and scale ML-powered systems that extract and reason over information from complex, unstructured data sources—such as documents, web screens, and automation workflows. Your work will play a key role in enabling intelligent, context-aware assistants and task automation agents that understand user intent and deliver meaningful outcomes.
Key Responsibilities
Build intelligent agentic systems that combine document understanding, retrieval, and LLM-based reasoning to support context-aware conversational workflows.
Design and implement scalable Agentic RAG pipelines that orchestrate unstructured data preprocessing, vector store integration, and LLM prompting for accurate, grounded responses (see the retrieval sketch after this list).
Develop modular ML components for layout analysis, information extraction, dialogue management, and tool invocation within multi-turn, goal-driven conversations.
Integrate conversational agents with enterprise data sources, APIs, and downstream automation workflows to enable end-to-end task execution.
Craft dynamic prompting and memory strategies to maintain context, reduce hallucination, and improve relevance in long-form or multi-turn queries.
Collaborate cross-functionally with product managers, UX designers, and platform engineers to define agent behavior and ensure seamless user experiences.
Monitor and continuously improve system performance, including response quality, retrieval precision, latency, and task success rate using real-world feedback.
Drive data curation efforts including synthetic generation, annotation workflows, and hard example mining to improve agent robustness and generalization.
Establish best practices for evaluating agent behavior, including human-in-the-loop review processes, regression testing, and safety guardrails.
Contribute to infrastructure and MLOps pipelines that support model experimentation, deployment, and monitoring in production environments.
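For context on the retrieval-augmented flow referenced above, here is a minimal Python sketch. The document chunks, embedding model choice, and the final prompt handling are illustrative assumptions, not a description of this role's actual stack; the LLM call itself is intentionally elided.

```python
# Minimal illustrative RAG retrieval sketch (hypothetical corpus and helper names).
import faiss                                   # vector index
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical document chunks produced by an upstream parsing step.
chunks = [
    "Invoice INV-1042 was issued on 2024-03-01 for $5,400.",
    "Payment terms for INV-1042 are net 30 days.",
    "Purchase order PO-77 covers invoice INV-1042.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # assumed embedding model
embeddings = encoder.encode(chunks).astype("float32")

index = faiss.IndexFlatIP(embeddings.shape[1])      # inner-product index
faiss.normalize_L2(embeddings)                      # cosine similarity via normalized vectors
index.add(embeddings)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = encoder.encode([question]).astype("float32")
    faiss.normalize_L2(q)
    _, ids = index.search(q, k)
    return [chunks[i] for i in ids[0]]

question = "When was invoice INV-1042 issued?"
context = "\n".join(retrieve(question))
# The retrieved context would then be passed to an LLM; the call is omitted
# because the model and provider depend on the deployment.
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
print(prompt)
```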
Qualifications / Expectations:
Bachelor’s or Master’s Degree in Computer Science, Data Science, or a related technical field. Advanced degrees are a plus.
5–7+ years of hands-on experience in building and deploying machine learning models, with a strong focus on document intelligence, NLP, or Generative AI.
Proven experience developing and productionizing ML systems for unstructured data processing, including document parsing, table extraction, and layout analysis.
Proficiency with modern ML frameworks such as PyTorch or TensorFlow, and experience with OCR and document processing tools (e.g., Tesseract, AWS Textract, PDFMiner, LayoutParser).
Strong experience with LLMs and Retrieval-Augmented Generation (RAG) architectures, including practical knowledge of LangChain, LlamaIndex, Haystack, or similar frameworks.
Understanding of knowledge graphs and graph-based information retrieval.
Experience designing and integrating conversational agents with context memory, function/tool calling, and dynamic prompting strategies.
Familiarity with vector stores (e.g., FAISS, Pinecone, Weaviate) and search/retrieval mechanisms for grounding LLM outputs.
Experience building ML pipelines and implementing MLOps best practices to support training, validation, deployment, and monitoring of models at scale.
Hands-on experience with cloud ML services (e.g., AWS SageMaker, Azure ML, Google Vertex AI) for training and inference.
Solid programming skills in Python, with working knowledge of SQL.
Experience with containerization (Docker), orchestration (Kubernetes), and model serving technologies (e.g., Triton Inference Server, ONNX Runtime, TorchServe) in production environments.
Knowledge of model optimization techniques (e.g., quantization, pruning, distillation) to improve inference efficiency on cloud or edge devices (see the quantization sketch after this list).
Strong problem-solving abilities, with a track record of delivering scalable, high-impact ML solutions for complex, real-world problems.
Excellent communication skills and ability to work autonomously in a fast-paced, collaborative environment.
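As one concrete example of the optimization techniques mentioned above, a short sketch of post-training dynamic quantization in PyTorch; the toy model is purely illustrative and stands in for whatever network would actually be served.

```python
# Illustrative post-training dynamic quantization of a small PyTorch model.
import torch
import torch.nn as nn

# Toy model standing in for a real inference network (illustrative only).
model = nn.Sequential(
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)
model.eval()

# Dynamic quantization rewrites Linear layers to use int8 weights,
# reducing model size and typically speeding up CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 256)
with torch.no_grad():
    print(model(x).shape, quantized(x).shape)  # both produce (1, 10) logits
```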
Nice to Have:
Experience fine-tuning large language models (LLMs) and applying GenAI techniques to parse unstructured data.
Experience with dialogue management systems or conversational frameworks (e.g., Rasa, Dialogflow, or custom pipelines) for building intelligent agents.
Experience with distributed training techniques to optimize large-scale model training across multiple GPUs or cloud environments.
Understanding of prompt engineering best practices and few-shot or chain-of-thought prompting for improving agent behavior in GenAI systems.
Background in open-source contributions or research related to document AI, LLM applications, or multi-modal learning.
Familiarity with CI/CD pipelines for ML, automated model versioning, and monitoring tools for performance and drift in production models.
All unsolicited resumes submitted to any @automationanywhere.com email address, whether submitted by an individual or by an agency, will not be eligible for an agency fee.