Middle Data Scientist MLOps
Bucharest Orhideea, Romania
Thales
From Aerospace, Space, Defence to Security & Transportation, Thales helps its customers to create a safer world by giving them the tools they need to perform critical tasks. The people we all rely on to make the world go round – they rely on Thales. And Thales relies on its employees to invent the future: right here, right now.
Present in Romania for over 40 years, Thales is expanding its presence in the country by growing its Digital capabilities and by developing a Group Engineering Competence Centre (ECC). Operating from Bucharest, Thales delivers solutions in a number of core businesses, from ground transportation, space and defence, to security and aeronautics.
Several professional opportunities have arisen. If you are looking for the solidity of a Global Group that is at the forefront of innovation, but with the agility of a human structure that tailors to the personal development of its employees and allows opportunities for evolution in an international environment, then this is the place for you!
Background:
We are seeking a passionate Data Scientist MLOps to join our Engineering Project Dashboard team, which provides KPIs and metrics to monitor the engineering activities of projects' engineering work packages. Customers of the Engineering Dashboard digital services are spread all around the world, lead teams at different levels of granularity, and look for contextual information related to their projects.
Mission:
Our “Data Scientist – MLOps” colleague will define and implement the injection of engineering data into a Data Lake dedicated to Thales engineering data. This job requires processing raw data in possibly huge amounts (terabytes), then analyzing, formatting, and refining it (statistical analysis, normalization and cleaning steps, outlier detection and management). From this data, it is required to understand the problem to be solved and find the right models for extraction. From the models produced, our colleague must be able to compare different models and identify the best-performing ones. For very large data volumes, knowledge of big data techniques and environments (HDFS and related tools) may be required.
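To make the model-comparison part of the mission concrete, here is a minimal illustrative sketch, not taken from the posting: it assumes scikit-learn, a synthetic dataset, and two arbitrary candidate models, and picks the best performer by cross-validated accuracy.

```python
# Hypothetical sketch: comparing candidate models by cross-validated score,
# one way to "identify the best-performing" model. The library (scikit-learn),
# the dataset, and the candidate models are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for refined engineering data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Score each candidate with 5-fold cross-validation and keep the best.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

In practice the comparison would also weigh training cost, interpretability, and robustness on held-out data, not a single score.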
Main responsibilities:
Develop, update, and maintain the project data models, and manage the data sets for the development and operation processes.
Handle vague metrics, decipher inherited projects, and define customer records.
Data Extraction: identify and extract relevant data from various sources, including databases, CSV files, APIs, PDF, and other systems.
Data Transformation: clean, normalize, and transform data to ensure it is in a suitable format for the organization's needs. This may involve data manipulation, joining different datasets, applying statistical functions, and converting data types.
Data Loading: load transformed data into appropriate storage systems.
Data Validation and Quality Assurance: ensure the accuracy and integrity of data throughout all stages of the ETL process. Perform and integrate quality checks and tools to identify and correct errors or discrepancies.
Documentation: create and maintain documentation related to data flows and models, transformations applied, and validation procedures.
Data Analysis: use the loaded data to analyze distributions, visualize patterns, extract valuable insights, generate reports, identify trends, and support data-driven decision-making.
Stay in touch with Group Data Management across its various functions, to ensure alignment with its recommendations and strategies.
Maintain clear and close collaboration with both the development team and the project stakeholders/key users.
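The extract → transform → validate → load responsibilities above can be sketched as a minimal pipeline. This is an illustrative assumption, not the team's actual stack: it uses pandas on an in-memory CSV and an in-memory SQLite database, with invented column names.

```python
# Hypothetical ETL sketch: extract raw CSV data, clean and deduplicate it,
# run basic quality checks, and load it into a storage system.
# Column names and the SQLite target are illustrative assumptions.
import io
import sqlite3
import pandas as pd

raw_csv = io.StringIO(
    "project,hours,status\n"
    "Alpha, 120 ,done\n"
    "Beta,,in progress\n"   # missing hours value
    "Alpha,120,done\n"      # duplicate row
)

# Extract: read the raw data.
df = pd.read_csv(raw_csv)

# Transform: convert hours to numeric, then drop duplicate and incomplete rows.
df["hours"] = pd.to_numeric(df["hours"], errors="coerce")
df = df.drop_duplicates().dropna(subset=["hours"])

# Validate: basic quality checks before loading.
assert df["hours"].ge(0).all()
assert df["project"].notna().all()

# Load: write the cleaned data into a storage system (here, SQLite).
with sqlite3.connect(":memory:") as conn:
    df.to_sql("engineering_kpis", conn, index=False)
    row_count = conn.execute(
        "SELECT COUNT(*) FROM engineering_kpis"
    ).fetchone()[0]
print(row_count)
```

Real pipelines at terabyte scale would replace pandas and SQLite with distributed tooling (e.g. the HDFS ecosystem mentioned above), but the stages stay the same.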
Who you are:
Bachelor's degree in Computer Science, Information Systems, Data Modelling, Data Science, or equivalent relevant experience
High-value skills to tackle specific analytical problems
Proven experience in data analysis and ETL processes and tools
Proven data engineering skills
Very good statistical data analysis skills and attention to detail
Proven data modeling skills (Weka, TensorFlow, Keras, and knowledge of core algorithms)
Good knowledge of relational SQL databases
Good knowledge of non-relational databases (e.g. MongoDB) for handling very large amounts of data
Good communication and relationships with stakeholders and team members
Capable of giving and receiving constructive feedback; able to listen and share
Proficiency in English; French would be a plus
Agile mindset & practices
Perks/benefits: Career development