Lead Product Software - Data Science Engineer

IND-Pune-IndiQube Orchid, India

āš ļø We'll shut down after Aug 1st - try foošŸ¦ for all jobs in tech āš ļø

Applications have closed

Wolters Kluwer

Wolters Kluwer is a global provider of professional information, software solutions, and services.


Job Description Summary

Designs, develops, tests, debugs and implements more complex operating systems components, software tools, and utilities with full competency. Coordinates with users to determine requirements. Reviews systems under development and related documentation. Makes more complex modifications to existing software to fit specialized needs and configurations, and maintains program libraries and technical documentation. May coordinate activities of the project team and assist in monitoring project schedules and costs.

Responsibilities

  • Design and implement scalable data pipelines for both ML and non-ML applications

  • Build and maintain data lakes and feature stores, preferably optimized for machine learning

  • Develop ETL processes for complex, high-volume datasets

  • Create and maintain infrastructure for ML model training and deployment

  • Collaborate with data scientists to productionize ML models

  • Implement CI/CD pipelines for ML models

  • Optimize data processing for model training and inference

  • Monitor data systems performance and troubleshoot issues

  • Ensure data quality, integrity, and governance

  • Design real-time data processing solutions for ML applications and other consumer applications

Requirements

  • Bachelor's or master's degree in Computer Science, Engineering, or a related technical field

  • Minimum of 5 years' experience in building data pipelines for both structured and unstructured data.

  • At least 2 years' experience in Azure data pipeline development.

  • Preferably 3 or more years' experience with Hadoop, Azure Databricks, Stream Analytics, Eventhub, Kafka, and Flink.

  • Strong proficiency in Python and SQL

  • Experience with big data technologies (Spark, Hadoop, Kafka)

  • Familiarity with ML frameworks (TensorFlow, PyTorch, scikit-learn)

  • Knowledge of model serving technologies (TensorFlow Serving, MLflow, Kubeflow) will be a plus

  • Experience with one of the cloud platforms (Azure preferred) and their data services. An understanding of ML services is preferred.

  • Understanding of containerization and orchestration (Docker, Kubernetes)

  • Experience with data versioning and ML experiment tracking will be a great addition

  • Knowledge of distributed computing principles

  • Familiarity with DevOps practices and CI/CD pipelines

Preferred Qualifications

  • Bachelor's degree in Computer Science or equivalent practical experience.

  • Experience with Agile/Scrum methodologies.

  • Background in tax and accounting domains is advantageous.

  • Azure Data Engineer certification is beneficial.



Tags: Agile Azure Big Data CI/CD Computer Science Databricks Data pipelines Data quality DevOps Docker Engineering ETL Flink Hadoop Kafka Kubeflow Kubernetes Machine Learning MLFlow ML models Model training Pipelines Python PyTorch Scikit-learn Scrum Spark SQL TensorFlow Unstructured data

Perks/benefits: Career development Team events

Region: Asia/Pacific
Country: India
