Principal Data Scientist
Boston, Massachusetts, United States
At the cutting edge of AI/ML technologies, we empower our clients to harness the value of unstructured data and uncover hidden insights within their enterprise information. As an LLM engineer, you’ll bring deep expertise in LLM/GenAI technologies. In collaboration with our R&D team, product managers, and engineering leads, you’ll prototype, build, test, and scale innovative products powered by GenAI/LLM technologies. You’ll also be instrumental in tuning model hyperparameters, optimizing configurations, and ensuring the highest level of model performance to drive impactful outcomes for our clients.
RESPONSIBILITIES
- Develop LLM solutions on customer data, such as RAG architectures on enterprise knowledge repos, querying structured data with natural language, agents, and content generation.
- Develop end-to-end AI/ML solutions using Python, LLM/GenAI frameworks and tools.
- Develop CI/CD pipelines, containerize LLMs, and deploy them in the cloud or on-premises. Support and maintain all stages of the LLM/ML model lifecycle, including developing training datasets, fine-tuning, testing, deployment pipelines, and ongoing deviation monitoring.
- Design prototypes and POCs to showcase feasibility and value; provide architectural solutions.
- Research, design, build, and train innovative LLM applications to address complex real-world problems.
- Offer technical guidance to clients implementing LLM technologies.
QUALIFICATIONS
- Bachelor’s Degree (final-year students may apply) in Statistics, Applied Mathematics, Computer Science, or a related field
- 3+ years of hands-on experience with Python; 2+ years of experience with command line scripting; 1+ years of experience building and maintaining scalable API solutions
- 2+ years of professional experience with NLP; 1+ years of professional experience with Large Language Model (LLM)/GenAI technology (e.g., OpenAI API, GPT-4, Gemini, Llama, Claude, Amazon Bedrock, LangChain, HuggingFace Transformers, PyTorch); 1+ years of experience with prompt engineering and vector databases
- 2+ years of experience with AWS, GCP, or Microsoft Azure; 2+ years of experience with MLOps and CI/CD pipeline development, containerization, and model deployment in test and production environments
- Team player who can communicate complex LLM capabilities and limitations to non-technical stakeholders.
PREFERRED
- Master’s or Ph.D. in a relevant field
- 7+ years of product engineering and/or data science experience
- Experience with Ruby on Rails, JavaScript, or Flutter; 2+ years of experience with Snowflake or Databricks
- Deep knowledge of the retail domain or industry, with a focus on NLP/LLM
- In-depth understanding of Responsible AI standards and protocols
- Applied research background using frameworks to build LLM prototypes; knowledge of best practices for production LLM development