NLP Engineer
Bengaluru, Karnataka, India
Weekday
At Weekday, we help companies hire engineers who are vouched for by other software engineers. We are enabling engineers to earn passive income by leveraging and monetizing the unused information in their heads about the best people they have worked... This role is for one of Weekday's clients.
Salary range: Rs 30,00,000 - Rs 40,00,000 (i.e., INR 30-40 LPA)
Min Experience: 2 years
Location: Bangalore
JobType: full-time
We are looking for a skilled and driven NLP Engineer to help scale, optimize, and deploy large language model (LLM)-based solutions within the healthcare domain. Your primary focus will be on building and maintaining production-ready, end-to-end NLP systems—covering backend architecture, inference optimization, and efficient model deployment pipelines. While opportunities exist for fine-tuning LLMs for specific use cases, the core responsibility is ensuring these models run efficiently, reliably, and at scale in production environments.
Additionally, you will develop NLP pipelines leveraging pre-trained LLMs and embedding models, including retrieval-augmented generation (RAG) systems and agentic NLP solutions that integrate multiple models and data sources for real-time, context-aware processing.
Key Responsibilities
Production-Grade NLP Systems
- Design and implement scalable, efficient NLP pipelines using LLMs and embedding models.
- Integrate RAG and agentic components to enhance NLP capabilities and adaptability.
Inference Optimization & Deployment
- Optimize model inference performance, reducing latency and improving throughput with frameworks such as vLLM, TensorRT, and Ray.
- Implement best practices for containerization, CI/CD, monitoring, and observability to ensure stable, production-ready deployments.
Occasional Model Adaptation
- Assist with fine-tuning or adapting LLMs for specific healthcare applications, ensuring scalability and efficiency.
Collaboration & Continuous Improvement
- Work closely with NLP researchers, backend engineers, product managers, and frontend developers to build high-quality NLP solutions.
- Participate in code reviews and architectural discussions, and stay current on emerging NLP and LLM optimization techniques.
Requirements (Must-Haves!)
- Bachelor's or Master's degree in Computer Science or a related field.
- 2+ years of experience (or 1+ year with an advanced degree) in building and deploying ML/NLP systems using Python.
- Hands-on experience with NLP frameworks (e.g., spaCy, Hugging Face Transformers, LangChain) and deep learning libraries (e.g., PyTorch).
- Strong background in designing, implementing, and maintaining scalable backend architectures for NLP/LLM-based applications.
- Experience working with large datasets, including data cleaning, preprocessing, and structuring.
- Proficiency in containerization, CI/CD, and version control for production-grade deployments.
- Expertise in LLM inference optimization using frameworks such as vLLM, TensorRT, and Ray.
- Practical knowledge of deploying NLP models in production, including load balancing and latency reduction.
Preferred (Nice-to-Have!)
- Experience in building RAG pipelines and integrating embedding models into NLP workflows.
- Familiarity with agentic systems that leverage multiple models for dynamic, context-aware NLP solutions.
- Knowledge of prompt engineering, model fine-tuning, and large-scale inference optimization for LLMs.
* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰
Tags: Architecture CI/CD Computer Science Deep Learning Engineering LangChain LLMs Machine Learning Model deployment Model inference NLP Pipelines Prompt engineering Python PyTorch RAG spaCy TensorRT Transformers vLLM