Data Scientist (PhD)

United States - Remote

Full Time Mid-level / Intermediate USD 103K - 192K *

ASCENDING

Posted 10 hours ago

Location: 100% Remote within the United States

Job Overview:
As a Data Scientist, you will be responsible for managing the complete Model Development Life Cycle (MDLC), from problem definition to model deployment and monitoring. You will work closely with cross-functional teams to deliver machine learning models that support business objectives and drive innovation. The ideal candidate has a strong background in data analysis, feature engineering, and model selection, along with a deep understanding of model deployment and ongoing model maintenance.

Key Responsibilities:

Problem Definition: Collaborate with business stakeholders to define and structure data-driven problems. Translate business objectives into machine learning tasks (e.g., classification, regression, clustering).
Data Collection & Preprocessing: Gather, clean, and preprocess data from multiple sources (e.g., databases, APIs, publicly available datasets). Handle missing data and outliers, and apply normalization techniques.
Exploratory Data Analysis (EDA): Use statistical analysis and data visualization techniques to identify key patterns, trends, and correlations in the data.
Feature Engineering: Create and refine features to improve model performance, applying techniques such as feature extraction, selection, and transformation.
Model Selection & Training: Select the appropriate machine learning models based on the problem at hand (e.g., supervised learning, unsupervised learning, deep learning). Train models using tools like Scikit-learn, TensorFlow, or PyTorch. Evaluate model performance using relevant metrics (e.g., RMSE, accuracy, F1-score, ROC-AUC) and optimize hyperparameters to ensure robustness.
Model Deployment: Deploy models in a production environment using tools like Flask, FastAPI, Docker, and Kubernetes. Ensure scalability and integration with existing systems.
Model Monitoring & Maintenance: Monitor model performance post-deployment, address model drift, and retrain models as needed. Ensure continuous accuracy and relevance of models in real-world scenarios.
Model Interpretation & Communication: Provide clear and actionable insights through model interpretation techniques such as feature importance and SHAP values. Present results to both technical and non-technical stakeholders.

Qualifications:

PhD in Computer Science, Data Science, Statistics, Engineering, or a related field.
3+ years of experience in machine learning, statistical modeling, and data science.
Proficiency in Python and SQL, and experience with libraries such as Pandas, NumPy, Scikit-learn, TensorFlow, and Keras.
Hands-on experience with model deployment tools such as Flask, Docker, Kubernetes, and cloud platforms like AWS, Azure, or Google Cloud.
Strong knowledge of data preprocessing techniques, feature engineering, and exploratory data analysis.
Experience with hyperparameter tuning techniques (e.g., Grid Search, Bayesian Optimization).
Familiarity with model monitoring tools such as MLflow, Prometheus, or Grafana.
Excellent communication skills, with the ability to translate technical results into actionable insights for stakeholders.
Strong problem-solving skills and the ability to work on complex, data-driven projects.