PySpark Developer

Warsaw, Masovian Voivodeship, Poland

Axiom Software Solutions Limited

Axiom Software Solutions is a well-known software consulting company offering business intelligence analysis and DevOps expertise for developers. Trust us for all your software needs.

Description

We are looking for a skilled Data Engineer with expertise in Python, PySpark, and Cloudera to join our team. The ideal candidate will be responsible for developing and optimizing big data pipelines while ensuring efficiency and scalability. Experience with Databricks is a plus. Additionally, familiarity with Git, GitHub, Jira, and Confluence is highly valued for effective collaboration and version control.

Key Responsibilities

- Design, develop, and maintain ETL pipelines using Python and PySpark.

- Work with the Cloudera Hadoop ecosystem to manage and process large-scale datasets.

- Ensure data integrity, performance, and reliability across distributed systems.

- Collaborate with data scientists, analysts, and business stakeholders to deliver data-driven solutions.

- Implement best practices for data governance, security, and performance tuning.

- Use Git and GitHub for version control and efficient code collaboration.

- Track and manage tasks in Jira and document processes in Confluence.

- (Optional) Work with Databricks for cloud-based big data processing.

Required Skills & Experience

- Strong programming skills in Python.

- Hands-on experience with PySpark for distributed data processing.

- Expertise in the Cloudera Hadoop ecosystem (HDFS, Hive, Impala).

- Experience with SQL and working with large datasets.

- Knowledge of Git and GitHub for source code management.

- Experience with Jira for task tracking and Confluence for documentation.

- Strong problem-solving and analytical skills.

Preferred Qualifications

- Basic knowledge of Databricks for cloud-based big data solutions.

- Experience with workflow orchestration tools (e.g., Airflow, Oozie).

- Understanding of cloud platforms (AWS, Azure, or GCP).

- Exposure to Kafka or other real-time streaming technologies.

Category: Engineering Jobs

Tags: Airflow, AWS, Azure, Big Data, Confluence, Databricks, Data governance, Data pipelines, Distributed Systems, ETL, GCP, Git, GitHub, Hadoop, HDFS, Jira, Kafka, Oozie, Pipelines, PySpark, Python, Security, SQL, Streaming

Region: Europe
Country: Poland
