Lead Data Scientist
Delhi
HighLevel
HighLevel is the all-in-one sales & marketing platform that agencies can white-label and resell to their clients!
Our People
With over 1,500 team members across 15+ countries, we operate in a global, remote-first environment. We are building more than software; we are building a global community rooted in creativity, collaboration, and impact. We take pride in cultivating a culture where innovation thrives, ideas are celebrated, and people come first, no matter where they call home.
Our Impact
As of mid-2025, our platform powers over 1.5 billion messages, helps generate over 200 million leads, and facilitates over 20 million conversations for the more than 2 million businesses we serve each month. Behind those numbers are real people growing their companies, connecting with customers, and making their mark - and we get to help make that happen.
About the Role:
As a Senior Data Scientist, you will design and deploy AI-driven systems that support key business functions like Sales, Customer Success, and Product. You’ll own the end-to-end lifecycle from experimentation to production, applying techniques like predictive modeling, real-time scoring, and AI agent orchestration. Working cross-functionally, you’ll translate data into automation and decision-making tools that drive measurable business outcomes.
Requirements:
- 8+ years in data science, ML, or applied AI roles, ideally within SaaS (B2B or PLG preferred)
- Expert in SQL, Python, and modeling frameworks (e.g. scikit-learn, XGBoost, LightGBM)
- Proven experience building and deploying predictive models in production (churn, conversion, LTV, usage drop-off)
- Experience fine-tuning models with either FFT or LoRA (or variants thereof)
- Strong hands-on experience with OpenAI models, LangChain, and agent orchestration tools
- Demonstrated prompt engineering capability: designing and refining system and task-specific prompts
- Experience implementing retrieval-augmented generation (RAG) using embeddings and vector DBs (Pinecone, FAISS, etc.)
- Experience testing, training, and deploying models/agents via Cloudflare Workers or equivalent serverless environments
- Familiarity with streaming usage data pipelines and real-time behavioral scoring
- Strong storytelling skills: you can articulate technical work to non-technical stakeholders clearly and persuasively
Responsibilities:
- Develop and fine-tune machine learning models using advanced algorithms like gradient boosting (XGBoost, LightGBM) and lightweight neural networks to better predict customer churn, account health decline, upsell opportunities, and trial conversion rates
- Pull data from feature sets across CRM, product usage, support, and NPS. Cleanse and transform data to form a holistic view of account health.
- Build production-grade models to predict churn, account health decline, usage slowness, upsell opportunity, and trial conversion
- Create real-time scoring mechanisms to alert GTM teams about at-risk customers and under-engaged segments
- Use OpenAI models, LangChain (or equivalent), or open-source models to build intelligent assistants, auto-analysis agents, and retrieval-based matchers
- Design prompts and agent flows to answer RevOps questions, generate insight summaries, and automate interventions
- Implement retrieval-augmented generation (RAG) architectures using vector databases (e.g., Pinecone, FAISS)
Tags: APIs Architecture Data pipelines Engineering FAISS LangChain LightGBM LoRA Machine Learning Microservices ML models OpenAI Open Source Pinecone Pipelines Predictive modeling Prompt engineering Python RAG Scikit-learn SQL Streaming Testing XGBoost
Perks/benefits: Career development Team events