Data Engineer

Pune, IN

Full Time USD 49K - 91K *

Atos

We design digital solutions from the everyday to the mission critical — in artificial intelligence, hybrid cloud, infrastructure management, decarbonization and employee experience.

Posted 1 week ago

Eviden, part of the Atos Group, with an annual revenue of circa €5 billion, is a global leader in data-driven, trusted and sustainable digital transformation. As a next-generation digital business with leading positions worldwide in digital, cloud, data, advanced computing and security, it brings deep expertise to all industries in more than 47 countries. By uniting unique high-end technologies across the full digital continuum with 47,000 world-class talents, Eviden expands the possibilities of data and technology, now and for generations to come.

Skill Set

Experience in PySpark and Python.
Experience in OLAP systems.
Experience in SQL (should be able to write complex SQL queries).
Experience in orchestration (Apache Airflow is preferred).
Experience in Hadoop (Spark and Hive, including optimization of Spark and Hive applications).
Knowledge of Snowflake (good to have).
Experience in data quality (good to have).
Knowledge of file storage (S3 is good to have).

Role and Responsibilities

1. Data Pipeline Development

Build and maintain scalable, reliable, and efficient ETL (Extract, Transform, Load) pipelines using Python and Airflow.

Automate data ingestion and processing workflows from multiple sources.

2. Data Integration

Integrate and transform data from disparate sources (e.g., APIs, third-party systems, legacy systems).

Handle data standardization, validation, and quality assurance during integration.

3. Big Data Processing

Use big data technologies such as Apache Spark and Snowflake for large-scale data processing.

Write efficient and scalable Python scripts to process and validate the data.

4. Data Governance and Quality

Implement data validation, cleaning, and transformation processes to ensure data accuracy and reliability.

Enforce compliance with data governance policies and standards (e.g., GDPR, HIPAA).

5. Collaboration

Work closely with other teams to understand data requirements.

Collaborate with software engineers to integrate data workflows into applications.

6. Monitoring and Optimization

Monitor the performance of data pipelines and systems.

Debug and optimize data workflows to improve efficiency and reliability.

7. Scripting and Automation

Develop reusable and modular Python scripts for repeated tasks.
Automate workflows for recurring data processing jobs.

8. Documentation and Best Practices

Document pipeline architecture.

Our Offering:

Global cutting-edge IT projects that shape the future of digital and have a positive impact on the environment
Wellbeing programs and work-life balance, with integration and passion-sharing events
Attractive Salary and Company Initiative Benefits
Courses and conferences
Hybrid work culture

Let’s grow together

* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰

Category: Engineering Jobs

Tags: Airflow APIs Architecture Big Data Data governance Data pipelines Data quality ETL Hadoop OLAP Pipelines PySpark Python Security Snowflake Spark SQL