Senior Data Engineer (Remote/Hybrid)
Brazil
- Remote-first
SWORD Health
Sword Health’s solutions combine AI and clinical expertise to deliver pain-fighting care without the need for opioids or unnecessary surgeries.
Are you ready to transform healthcare with cutting-edge AI? At Sword Health, we’re harnessing the power of Large Language Models (LLMs) and advanced data engineering to revolutionize how we deliver personalized care. We're looking for a Senior Data Engineer to join our dynamic Predict team, where you'll play a pivotal role in building the data infrastructure that drives our AI and machine learning projects from the ground up.
As a Senior Data Engineer, you'll design, develop, and scale systems that analyze complex health data, enabling AI models to predict, understand, and personalize care for millions of users. Your work will empower our AI to not only understand member behavior across a variety of health and pain conditions but also to continuously improve the care experience through data-driven insights.
What you’ll do:
- Design, build, and maintain ETL/ELT data pipelines to efficiently ingest and process data from multiple sources (e.g., Cloud Storage, BigQuery) to support AI and machine learning models.
- Collaborate with data science and AI teams to develop and implement AI models, leveraging machine learning (ML) and deep learning techniques to analyze musculoskeletal health and pain data, and to improve digital health interventions.
- Architect and optimize data solutions to ensure a seamless, person-level identity across diverse datasets and sources (structured and unstructured data), supporting AI-driven insights.
- Develop and implement data engineering strategies for processing large-scale datasets, including clinical notes, sensor data, and other medical records, using technologies such as PySpark and cloud-based data platforms (e.g., GCP, AWS).
- Drive the data engineering efforts for AI and ML projects, supporting data pipeline design, data preparation, feature engineering, and ensuring high-quality data for model training and productionization.
- Work closely with cross-functional teams (analytics, marketing, clinical operations) to translate business requirements into data-driven solutions, helping develop AI and machine learning products that address real-world healthcare problems.
- Stay at the forefront of emerging technologies in AI, ML, and data engineering, experimenting with new tools, techniques, and approaches to continuously improve data processing and model deployment workflows.
About you:
- Experience in data engineering, with a deep understanding of data pipelines, cloud infrastructure, and big data technologies.
- Expertise in supporting AI/ML models—familiarity with LLMs and their integration into healthcare applications is a huge plus!
- A passion for creating data-driven solutions that enhance the healthcare experience and deliver real-world impact.
- A collaborative mindset, able to work across interdisciplinary teams including AI researchers, data scientists, and healthcare professionals.
- Experience with cloud platforms (AWS, GCP, Azure) and scalable data frameworks (Spark, Kafka, etc.).
- Experience in designing and implementing scalable, efficient data pipelines, with a focus on SQL and modern data modeling techniques.
- Experience in developing data engineering solutions using Python and PySpark, with particular experience in applying these tools to AI and ML tasks.
Bonus points if you have:
- Experience working with healthcare data, including medical claims, EHR, FHIR, or other clinical data formats.
- Proven track record of creating production-ready ML models, including model deployment, monitoring, and retraining.
- Experience with advanced ML tools and frameworks, such as TensorFlow, PyTorch, or scikit-learn, for model development and evaluation.
- Strong ability to propose and implement changes to improve data architecture, scalability, and performance, specifically in the context of AI-driven applications.
- Knowledge of data privacy and governance standards relevant to healthcare, including data security, anonymization, and compliance with healthcare regulations.
Tags: Architecture AWS Azure Big Data BigQuery Data pipelines Deep Learning ELT Engineering ETL Feature engineering GCP Kafka LLMs Machine Learning ML models Model deployment Model training Pipelines Privacy PySpark Python PyTorch Scikit-learn Security Spark SQL TensorFlow Unstructured data
Perks/benefits: Relocation support