Principal Clinical Data Science
US - Boston - MA, United States
Alexion Pharmaceuticals
As Alexion, AstraZeneca Rare Disease, we are delivering life-changing therapies to people living with rare diseases.
This is what you will do:
Alexion is looking for a highly motivated and skilled Principal Clinical Data Scientist to join our growing Data Science team. You will be central to applying leading-edge data science techniques to unlock significant insights from clinical trial data, with a dedicated focus on wearables, echocardiography imaging, real-world data, and omics data, leveraging resources such as the UK Biobank and our internal data lakes.
You will collaborate closely with clinical researchers, imaging specialists, and software engineers to drive innovation and accelerate the development of therapies for rare diseases. This is an exciting opportunity to make a significant impact on patients' lives by applying your expertise in machine learning and AI to complex and meaningful clinical data.
You will be responsible for:
Design, develop, and implement machine learning and deep learning models to analyze data from wearable sensors (e.g., activity trackers, continuous glucose monitors) and echocardiography images.
Develop and validate algorithms for feature extraction, pattern recognition, and predictive modeling using wearable and imaging data.
Integrate and analyze diverse clinical datasets, including electronic health records (EHRs), genomic data, and patient-reported outcomes, alongside wearable and imaging data.
Collaborate with clinical teams to define research questions and identify opportunities for leveraging data science to improve clinical trial design, patient monitoring, and disease understanding in rare diseases.
Develop and maintain robust data pipelines for processing, cleaning, and analyzing large-scale datasets.
Contribute to the development of visualization tools and dashboards to effectively communicate findings to both technical and non-technical audiences.
Stay abreast of the latest advancements in machine learning, AI, and data science, particularly in the context of healthcare and digital biomarkers.
Document methodologies, results, and findings clearly and concisely.
Ensure compliance with relevant regulatory guidelines and data privacy standards.
Process Improvement: Actively contribute to the development of standard processes aimed at enhancing the quality, efficiency, and effectiveness of the Clinical Bioinformatics group.
You will need to have:
Ph.D. degree in Data Science, Biostatistics, Computer Science, Biomedical Engineering, or a related quantitative field.
2-4 years of post-doctoral, hands-on experience applying machine learning and statistical modeling techniques to real-world datasets, preferably within a clinical research or healthcare setting.
Demonstrated experience working with data from wearable devices (e.g., accelerometers, heart rate monitors, sleep trackers) and a strong understanding of signal processing techniques.
Experience in analyzing medical imaging data, particularly echocardiography, including image processing, feature extraction, and the application of computer vision techniques.
Proficiency in programming languages such as Python and R, along with relevant libraries for data manipulation, statistical analysis, and machine learning (e.g., pandas, NumPy, scikit-learn, TensorFlow, PyTorch).
Experience with cloud computing platforms (e.g., AWS, Azure, GCP) and big data technologies (e.g., Spark, Hadoop) is a plus.
Strong understanding of statistical inference, hypothesis testing, and model evaluation metrics.
Excellent problem-solving, critical thinking, and analytical skills.
Strong communication and collaboration skills, with the ability to effectively present complex technical information to diverse audiences.
A strong interest in rare diseases and a desire to contribute to the development of innovative therapies for patients with unmet medical needs.
Technical Skills & Familiarity:
The ideal candidate will be familiar with a combination of the following techniques and concepts:
Machine Learning & AI:
Supervised Learning: Regression (linear, polynomial, etc.), Classification (logistic regression, support vector machines, decision trees, random forests, gradient boosting machines like XGBoost, LightGBM, CatBoost).
Unsupervised Learning: Clustering (k-means, hierarchical clustering, DBSCAN), dimensionality reduction (PCA, t-SNE, UMAP), anomaly detection.
Deep Learning: Convolutional Neural Networks (CNNs) for image analysis, Recurrent Neural Networks (RNNs) and Transformers for time-series data (wearable data), autoencoders.
Model Evaluation & Selection: Cross-validation, hyperparameter tuning, performance metrics (e.g., AUC, F1-score, RMSE), bias-variance trade-off.
Explainable AI (XAI): Techniques for understanding and interpreting machine learning model predictions (e.g., SHAP, LIME).
Time Series Analysis: Feature engineering, forecasting models (e.g., ARIMA, Prophet), dynamic time warping.
Wearable Data Analysis:
Signal Processing: Filtering, noise reduction, feature extraction from raw sensor data (e.g., frequency domain analysis).
Activity Recognition: Developing models to classify different types of physical activity.
Sleep Analysis: Algorithms for sleep stage classification and sleep quality assessment.
Event Detection: Identifying specific events or patterns in wearable sensor data.
Echocardiography Image Analysis:
Image Preprocessing: Noise reduction, normalization, augmentation techniques.
Image Segmentation: Techniques for identifying and delineating cardiac structures.
Object Detection: Identifying key landmarks or regions of interest in echocardiograms.
Quantitative Image Analysis: Extracting clinically relevant measurements from images.
General Data Science & Programming:
Python: Pandas, NumPy, SciPy, scikit-learn, TensorFlow, Keras, PyTorch, Matplotlib, Seaborn.
R: Base R, tidyverse, caret.
SQL: For querying and manipulating databases.
Data Visualization: Creating informative and visually appealing plots and dashboards.
Data Wrangling & Cleaning: Handling missing data, outliers, and data inconsistencies.
Statistical Concepts:
Hypothesis testing, statistical power, confidence intervals.
Regression analysis, survival analysis.
Longitudinal data analysis.
Date Posted
04-Jun-2025
Closing Date
26-Jun-2025
Alexion is proud to be an Equal Employment Opportunity and Affirmative Action employer. We are committed to fostering a culture of belonging where every single person can belong because of their uniqueness. The Company will not make decisions about employment, training, compensation, promotion, and other terms and conditions of employment based on race, color, religion, creed or lack thereof, sex, sexual orientation, age, ancestry, national origin, ethnicity, citizenship status, marital status, pregnancy (including childbirth, breastfeeding, or related medical conditions), parental status (including adoption or surrogacy), military status, protected veteran status, disability, medical condition, gender identity or expression, genetic information, mental illness or other characteristics protected by law. Alexion provides reasonable accommodations to meet the needs of candidates and employees. To begin an interactive dialogue with Alexion regarding an accommodation, please contact accommodations@Alexion.com. Alexion participates in E-Verify.