Data Engineer / Data Scientist

Noida Berger Tower, India

Thales

From Aerospace, Space and Defence to Security & Transportation, Thales helps its customers create a safer world by giving them the tools they need to perform critical tasks.

Location: Noida, India

Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter and much more. More than 30,000 organizations already rely on us to verify the identities of people and things, grant access to digital services, analyze vast quantities of information and encrypt data to make the connected world more secure.

Present in India since 1953, Thales is headquartered in Noida, Uttar Pradesh, and has operational offices and sites spread across Bengaluru, Delhi, Gurugram, Hyderabad, Mumbai and Pune, among others. Over 1,800 employees work with Thales and its joint ventures in India. Since the beginning, Thales has played an essential role in India's growth story by sharing its technologies and expertise in the Defence, Transport, Aerospace, and Digital Identity and Security markets.

Data Engineer Cum Data Scientist

Key Responsibilities:

Data Engineering:

  • Design, build, and maintain scalable and efficient data pipelines on Databricks for machine learning workflows.
  • Optimize data processing workflows to handle large-scale datasets using Spark and Delta Lake (see the sketch after this list).
  • Implement best practices for data versioning, quality, and governance.
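
A minimal sketch of the kind of Databricks pipeline step described above, assuming a hypothetical Delta table of transaction events; the paths and column names are illustrative placeholders, not specifics from this role:

```python
# Minimal illustration of a Databricks/PySpark feature pipeline step.
# Table paths and column names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Read raw events from a Delta table (path is an assumption for illustration).
raw = spark.read.format("delta").load("/mnt/lake/raw/transactions")

# Basic cleaning and aggregation into per-customer features.
features = (
    raw.dropDuplicates(["transaction_id"])
       .filter(F.col("amount") > 0)
       .groupBy("customer_id")
       .agg(
           F.count("*").alias("txn_count"),
           F.sum("amount").alias("total_spend"),
           F.max("event_date").alias("last_txn_date"),
       )
)

# Persist the feature table back to Delta for downstream ML training.
(features.write.format("delta")
         .mode("overwrite")
         .option("overwriteSchema", "true")
         .save("/mnt/lake/features/customer_features"))
```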

Model Development & Deployment:

  • Develop, train, and fine-tune machine learning models such as cross-sell, classification, and segmentation models to meet business objectives.
  • Use MLflow to track experiments, manage the model lifecycle, and ensure reproducibility of results (a tracking and registration sketch follows this list).
  • Register and version models in Databricks Model Registry and maintain model lineage.
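
A rough sketch of how experiment tracking and Model Registry registration might look with MLflow on Databricks; the synthetic dataset, experiment path, model name, and hyperparameters are all assumptions for illustration:

```python
# Illustrative MLflow tracking and Model Registry usage; the experiment path,
# registered model name, and data are hypothetical.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Stand-in for a real feature table with a binary cross-sell label.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

mlflow.set_experiment("/Shared/cross_sell_experiments")

with mlflow.start_run(run_name="rf_baseline"):
    params = {"n_estimators": 200, "max_depth": 8}
    model = RandomForestClassifier(**params).fit(X_train, y_train)

    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    mlflow.log_params(params)
    mlflow.log_metric("test_auc", auc)

    # Log the model artifact and register a new version in the Model Registry.
    mlflow.sklearn.log_model(model, artifact_path="model",
                             registered_model_name="cross_sell_propensity")
```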

Model Productionization:

  • Deploy models to production environments and integrate them into real-time or batch systems (a batch-scoring sketch follows this list).
  • Implement robust CI/CD pipelines for machine learning workflows in Databricks.
  • Monitor model performance in production using metrics and feedback loops.
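
For the batch-integration case, a minimal sketch of loading a registered model and scoring a Delta table on Databricks; the model name, stage, and table paths are assumptions for illustration:

```python
# Illustrative batch scoring with a registered MLflow model on Databricks.
# Model name/stage and table paths are hypothetical.
import mlflow.pyfunc
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Wrap the Production version from the Model Registry as a Spark UDF.
score_udf = mlflow.pyfunc.spark_udf(
    spark,
    model_uri="models:/cross_sell_propensity/Production",
    result_type="double",
)

features = spark.read.format("delta").load("/mnt/lake/features/customer_features")
feature_cols = [c for c in features.columns if c != "customer_id"]

# Append a score column and persist predictions for downstream systems.
scored = features.withColumn("propensity", score_udf(*feature_cols))
(scored.write.format("delta")
       .mode("overwrite")
       .save("/mnt/lake/predictions/cross_sell_scores"))
```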

Innovation, Collaboration & Best Practices:

  • Stay updated with the latest trends and advancements in data engineering and machine learning technologies.
  • Promote best practices in machine learning, including feature engineering, hyperparameter tuning, and model evaluation.
  • Present results and insights from machine learning experiments to non-technical audiences.

Qualifications & Skills:

Must-Have Skills:

  • Experience: 3+ years in data engineering and machine learning roles, with expertise in Databricks.
  • Technical Stack: Strong experience with Databricks, PySpark, SQL, MLflow, Model Registry, Apache Spark, MLOps, and Terraform.
  • Programming: Proficiency in Python for data processing and machine learning.
  • Model Deployment: Hands-on experience with deploying and managing models in production.
  • ML Lifecycle: Deep understanding of the end-to-end machine learning lifecycle, including model evaluation, monitoring, and maintenance.
  • Cloud Platforms: Experience with cloud platforms like GCP for deploying Databricks solutions.

Good-to-Have Skills:

  • Familiarity with generative AI and deep learning frameworks such as TensorFlow or PyTorch.

At Thales we provide CAREERS and not only jobs. With 80,000 employees in 68 countries, our mobility policy enables thousands of employees each year to develop their careers at home and abroad, in their existing areas of expertise or by branching out into new fields. Together we believe that embracing flexibility is a smarter way of working. Great journeys start here, apply now!

Tags: CI/CD Classification Databricks Data pipelines Deep Learning Engineering Feature engineering GCP Generative AI Machine Learning MLFlow ML models MLOps Model deployment Pipelines PySpark Python PyTorch Security Spark SQL TensorFlow Terraform

Perks/benefits: Career development

Region: Asia/Pacific
Country: India
