Lead Product Software - Data Science Engineer
IND-Pune-IndiQube Orchid, India
Applications have closed
- Remote-first
Wolters Kluwer
Wolters Kluwer is a global provider of professional information, software solutions, and services.
Job Description Summary
Designs, develops, tests, debugs and implements more complex operating systems components, software tools, and utilities with full competency. Coordinates with users to determine requirements. Reviews systems under development and related documentation. Makes more complex modifications to existing software to fit specialized needs and configurations, and maintains program libraries and technical documentation. May coordinate activities of the project team and assist in monitoring project schedules and costs.
Responsibilities
Design and implement scalable data pipelines for both ML and non-ML applications
Build and maintain data lakes and feature stores, preferably optimized for machine learning
Develop ETL processes for complex, high-volume datasets
Create and maintain infrastructure for ML model training and deployment
Collaborate with data scientists to productionize ML models
Implement CI/CD pipelines for ML models
Optimize data processing for model training and inference
Monitor data systems performance and troubleshoot issues
Ensure data quality, integrity, and governance
Design real-time data processing solutions for ML applications and other consumer applications
Requirements
Bachelor's or master's degree in computer science, engineering, or a related technical field
Minimum of 5 years' experience building data pipelines for both structured and unstructured data
At least 2 years' experience in Azure data pipeline development
Preferably 3 or more years' experience with Hadoop, Azure Databricks, Stream Analytics, Event Hubs, Kafka, and Flink
Strong proficiency in Python and SQL
Experience with big data technologies (Spark, Hadoop, Kafka)
Familiarity with ML frameworks (TensorFlow, PyTorch, scikit-learn)
Knowledge of model serving technologies (TensorFlow Serving, MLflow, Kubeflow) is a plus
Experience with one of the cloud platforms (Azure preferred) and its data services; an understanding of its ML services is preferred
Understanding of containerization and orchestration (Docker, Kubernetes)
Experience with data versioning and ML experiment tracking is a plus
Knowledge of distributed computing principles
Familiarity with DevOps practices and CI/CD pipelines
Preferred Qualifications
Bachelor's degree in Computer Science or equivalent practical experience.
Experience with Agile/Scrum methodologies.
Background in tax and accounting domains is advantageous.
Azure Data Engineer certification is beneficial.
Tags: Agile Azure Big Data CI/CD Computer Science Databricks Data pipelines Data quality DevOps Docker Engineering ETL Flink Hadoop Kafka Kubeflow Kubernetes Machine Learning MLflow ML models Model training Pipelines Python PyTorch Scikit-learn Scrum Spark SQL TensorFlow Unstructured data
Perks/benefits: Career development Team events