Data Scientist / Data Warehouse Engineer

Dubai (PIL Office), United Arab Emirates

Parsons Corporation

Parsons is a digitally enabled solutions provider and a leader in many diversified markets with a focus on national security, defense, and global infrastructure.


Job Description:

Data Scientist / Data Warehouse Engineer (Unstructured Data Extraction & Processing)

Position Overview

Parsons Corporation is seeking a Data Scientist / Data Warehouse Engineer with a strong focus on extracting and processing unstructured data. The ideal candidate will design, develop, and maintain scalable data pipelines that integrate structured and unstructured data from a variety of sources. The role requires technical expertise in data engineering tools and best practices, as well as strong communication and collaboration skills for working cross-functionally with data analysts, data scientists, and stakeholders. You will join a dedicated, distributed team of scientists, software architects, and software engineers building a Generative Artificial Intelligence (GenAI) enabled capability that expedites the design of infrastructure projects such as highways and bridges.

Key Responsibilities

Unstructured Data Processing

  • Extract, cleanse, and process unstructured data (e.g., text, logs, images) for use in analytics and machine learning.
  • Develop and optimize custom ETL/ELT pipelines to handle complex data formats and large data volumes.
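To illustrate the kind of extraction and cleansing work this covers, here is a minimal, hypothetical sketch using only the Python standard library (the log format and field names are invented for the example):

```python
import json
import re

# Hypothetical raw log lines: unstructured text with embedded fields.
RAW_LOGS = [
    "2024-05-01 12:00:03 ERROR payment-svc timeout after 30s",
    "2024-05-01 12:00:04 INFO  auth-svc login ok",
    "not a log line",
]

LOG_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+)\s+(?P<service>\S+) (?P<message>.*)$"
)

def extract_records(lines):
    """Parse unstructured log lines into structured dicts, dropping
    lines that do not match the expected layout (a cleansing step)."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            records.append(match.groupdict())
    return records

records = extract_records(RAW_LOGS)
print(json.dumps(records, indent=2))
```

In a production pipeline the same parse-and-reject pattern would typically run inside a distributed framework (e.g., a Spark UDF) rather than a plain loop, with rejected lines routed to a quarantine table for inspection.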

Data Pipeline Development

  • Build robust and scalable data pipelines using Apache Spark, Hadoop, or Apache Beam.
  • Automate workflows and schedule data processes using orchestration tools such as Apache Airflow, Prefect, or Luigi.
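Orchestrators such as Airflow, Prefect, and Luigi all model a pipeline the same way: as a directed acyclic graph of tasks, each run only after its upstreams finish. A minimal stdlib sketch of that dependency-ordering idea (task names are hypothetical):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dag = {
    "extract_logs": set(),
    "extract_db": set(),
    "transform": {"extract_logs", "extract_db"},
    "load_warehouse": {"transform"},
}

# static_order() yields a valid execution order: every task appears
# only after all of its upstream dependencies.
order = list(TopologicalSorter(dag).static_order())
print(order)
```

An Airflow DAG expresses the same graph with operators and `>>` dependencies, adding scheduling, retries, and monitoring on top of this ordering.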

Data Warehousing & Storage

  • Design, implement, and maintain modern data warehouse solutions (e.g., Databricks, Snowflake, Redshift, BigQuery).
  • Manage both relational (SQL) and NoSQL databases for structured and unstructured data storage.
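The modeling pattern behind most warehouse designs is a fact table joined to dimension tables. A toy sketch using in-memory SQLite as a stand-in for a warehouse engine (table and column names are invented for the example):

```python
import sqlite3

# In-memory SQLite stands in for a warehouse engine; the fact/dimension
# join pattern is the same one used in Snowflake or BigQuery schemas.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_project (project_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_cost (project_id INTEGER, amount REAL);
    INSERT INTO dim_project VALUES (1, 'highway'), (2, 'bridge');
    INSERT INTO fact_cost VALUES (1, 100.0), (1, 50.0), (2, 75.0);
""")

# Aggregate the fact table by dimension attribute.
rows = conn.execute("""
    SELECT d.name, SUM(f.amount) AS total
    FROM fact_cost f JOIN dim_project d USING (project_id)
    GROUP BY d.name ORDER BY total DESC
""").fetchall()
print(rows)
```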

Cloud Integration

  • Deploy and optimize data solutions on cloud platforms (Azure, AWS, or GCP).
  • Leverage services like Azure Data Factory, AWS Glue, or Google Dataflow for seamless data ingestion and transformation.

Performance Optimization & Troubleshooting

  • Monitor, diagnose, and improve data system performance and reliability.
  • Collaborate with other teams to refine database queries, optimize ETL processes, and ensure data integrity.

Data Governance & Security

  • Implement data quality checks, versioning, and security protocols in compliance with regulations (GDPR, CCPA).
  • Ensure robust access controls and encryption measures for sensitive information.
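A data-quality gate of the kind described above can be as simple as partitioning each batch into valid and rejected records before load. A minimal sketch with hypothetical field names:

```python
# Fields every record must carry; real pipelines would also check
# types, ranges, and referential integrity.
REQUIRED = ("id", "email")

def quality_check(rows):
    """Return (valid, rejected) partitions of a batch of dict records."""
    valid, rejected = [], []
    for row in rows:
        if all(row.get(field) not in (None, "") for field in REQUIRED):
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected

batch = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},  # fails the null check
]
valid, rejected = quality_check(batch)
print(len(valid), len(rejected))
```

Rejected records would typically be written to a quarantine table with the reason for rejection, so data quality is auditable rather than silently dropped.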

Collaboration & Documentation

  • Work closely with cross-functional teams to understand data requirements and deliver solutions.
  • Document workflows, system designs, and troubleshooting procedures to support knowledge sharing and future maintenance.

Required Technical Skills

Programming

  • Proficiency in Python for data processing and automation.
  • Experience with shell scripting (e.g., Bash) is a plus.

Data Processing Frameworks

  • Hands-on experience with Apache Spark, Hadoop, or Apache Beam.
  • Familiarity with ETL/ELT processes and best practices.

Database & Querying

  • Strong understanding of SQL with experience in PostgreSQL, MySQL, or Oracle.
  • Exposure to NoSQL databases like MongoDB, Cassandra, or DynamoDB.

Cloud Platforms

  • Working knowledge of Azure (e.g., Data Factory, Synapse, Data Lake), AWS (e.g., S3, Redshift, Glue), or GCP (BigQuery, Dataflow).

Data Warehousing

  • Experience with Databricks, Snowflake, Redshift, or BigQuery.

Data Pipelines & Orchestration

  • Familiarity with workflow orchestration tools (Airflow, Prefect, Luigi).

Big Data Tools

  • Proficiency working with distributed data systems like HDFS or cloud-native equivalents.

Version Control

  • Skilled in Git for collaborative development and code versioning.

Experience

  • Years of Experience: Minimum 4 years in data engineering, data warehousing, or a related field.
  • Project Exposure: Demonstrated ability to build and optimize scalable data pipelines for both batch and real-time processing.
  • Debugging & Optimization: Proven track record of diagnosing performance issues and optimizing data systems.
  • Data Governance & Security: Experience implementing data privacy regulations and best practices in data quality and access controls.

Soft Skills

Problem-Solving

  • Capable of independently troubleshooting complex data and system issues.

Communication

  • Strong ability to collaborate with data analysts, scientists, and other engineers to translate business requirements into effective data solutions.

Documentation

  • Competent in documenting data workflows, system designs, and troubleshooting steps clearly and concisely.

Team Collaboration

  • Experience working in globally distributed, cross-functional teams, ideally within Agile or similar methodologies.

Education

  • Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or a related field.
  • Equivalent practical experience can compensate for formal education in some cases.

Certifications (Optional but Valuable)

  • AWS Certified Data Analytics – Specialty
  • Google Professional Data Engineer
  • Microsoft Azure Data Engineer Associate
  • Databricks Certified Data Engineer Associate

Additional Considerations

  • Analytical & Statistical Skills: A background in data analysis or data science is highly beneficial for designing effective data models and understanding business insights.
  • Machine Learning Integration: Exposure to integrating machine learning pipelines, especially GenAI technology, for further data-driven intelligence is a plus.
  • Innovative Mindset: Enthusiasm for exploring new tools, frameworks, and methodologies to continually optimize data solutions.

Why Join Us

  • Impactful Role: Shape the architecture and strategy for unstructured data management and analytics, influencing key decisions and driving business value.
  • Collaborative Environment: Work alongside a dynamic team of data professionals, leveraging cutting-edge technologies to solve real-world challenges.
  • Professional Growth: Expand your technical acumen and leadership capabilities in a role that offers continuous learning and development opportunities.

Minimum Clearance Required to Start:

Not Applicable/None

Parsons is an equal opportunity employer committed to diversity in the workplace. Minority/Female/Disabled/Protected Veteran.
