Software Engineer, Machine Learning Infrastructure
Palo Alto
About Us:
Hippocratic AI is developing the first safety-focused Large Language Model (LLM) for healthcare. Our mission is to dramatically improve healthcare accessibility and outcomes by bringing deep healthcare expertise to every person. No other technology has the potential for this level of global impact on health.
Why Join Our Team:
Innovative mission: We are creating a safe, healthcare-focused LLM that can transform health outcomes on a global scale.
Visionary leadership: Hippocratic AI was co-founded by CEO Munjal Shah alongside physicians, hospital administrators, healthcare professionals, and AI researchers from top institutions including El Camino Health, Johns Hopkins, Washington University in St. Louis, Stanford, Google, and NVIDIA.
Strategic investors: Raised $137 million from top investors including General Catalyst, Andreessen Horowitz, Premji Invest, SV Angel, NVentures, and Greycroft.
Team and expertise: We are working with top experts in healthcare and artificial intelligence to ensure the safety and efficacy of our technology.
For more information: Visit www.HippocraticAI.com.
We value in-person teamwork and believe the best ideas happen together. Our team is expected to be in the office five days a week in Palo Alto, CA, unless explicitly noted otherwise in the job description.
About the Role:
This role at Hippocratic AI is focused on building and optimizing the data infrastructure that powers our machine learning (ML) operations, including our generative AI models and large language models (LLMs) for healthcare. You will design and scale reliable, data-driven services for ML model training, data processing, and deployment, ensuring our Research Scientists can seamlessly transition from experimentation to production. This role involves working with massive datasets, managing ETL pipelines, and building scalable solutions to handle data ingestion, transformation, and storage across Hippocratic AI’s systems.
Responsibilities:
Build powerful, flexible, and user-friendly data and ML infrastructure that supports all ML Speech and LLM operations across Hippocratic AI.
Design and develop fast, reliable data services for ML model training, ETL pipelines, and deployment, scaling infrastructure across multiple regions.
Create services and libraries that enable ML engineers to move efficiently from data experimentation to production, especially for generative AI models.
Collaborate closely with product teams, data engineers, and research scientists to develop data-focused infrastructure that supports production-ready generative AI and LLM/multimodal models for healthcare applications.
Must-Have:
5-8 years of experience in building software applications for large-scale distributed data systems.
Strong engineering background with experience in infrastructure and/or distributed systems, with proficiency in Python, Java, or similar languages.
Solid experience with ETL processes and data pipeline design, ensuring high-quality data for ML model development and deployment.
Familiarity with the complete software development life cycle, from design and implementation to testing and deployment.
Proven track record in building and maintaining high-availability, low-latency systems with a focus on reliability, testing, and observability.
Pragmatic approach to problem-solving, knowing when to aim for ideal solutions and when to adjust course.
Experience with big data technologies such as Apache Spark for data processing and large-scale data analytics.
Strong sense of curiosity and a collaborative mindset, eager to learn new technologies and share knowledge within the team.
Preferred:
5+ years of experience supporting machine learning and generative AI infrastructure.
Hands-on experience optimizing the end-to-end performance of distributed data systems, particularly for multimodal LLMs and other generative AI applications.
Experience with audio/speech training infrastructure is a plus.
Why You’ll Love Working Here:
At Hippocratic AI, we are revolutionizing the healthcare landscape through cutting-edge technology. We want talented individuals who thrive at the intersection of innovation and impact. You’ll work alongside some of the brightest minds in healthcare and AI to shape the future of healthcare accessibility.
Perks/benefits: Flex hours