Data Quality Engineer

Home Office: Mexico

Full Time | Mid-level / Intermediate | USD 45K - 83K* (est.)

Inmar Intelligence

Inmar Intelligence leverages data science and technology to power reliable, dynamic solutions for brands, retailers, and healthcare organizations — including incentives and loyalty, media, returns, and healthcare solutions.

Posted 2 hours ago

As a Data Quality Engineer, you will be responsible for building automated, repeatable, efficient processes to perform data quality checks across various data assets. It is your duty to ensure that high-quality data is available to internal stakeholders and customers.

To do this, you will design, develop, and document automated (and sometimes manual) test plans and test cases, writing high-quality, well-structured code. The output of these quality assurance checks should be easy to disseminate to the appropriate audiences so that the proper action can be taken to ensure consistent, high-quality data across the organization. You will take part in requirements gathering, work closely with other teams such as Data Engineering, and communicate and explain technical concepts effectively to developers, product managers, and business partners. You will also provide ongoing support for this data quality framework, along with its continuous improvement and refinement.

MAJOR JOB RESPONSIBILITIES

● Contribute to the design, development, documentation, maintenance and monitoring of a data quality framework
● Build repeatable, automated, efficient data quality checks
● Continuously validate data quality across data pipelines and repositories against data from source systems
● Work across teams for requirements gathering
● Document test plans and test cases
● Execute test cases, track bugs, and document and share results
● Troubleshoot, tune performance, and resolve issues where necessary
● Assist with data quality support tickets and inquiries
● Design Data Quality reports and dashboards for various audiences to analyze and communicate the output of the data quality tests

EDUCATION / QUALIFICATIONS / EXPERIENCE
● B.S. in computer science or an information systems field required, or 5+ years of related work experience
● Strong analytical and critical-thinking skills for solving complex problems
● Strong technical background with a mix of development and automation skills
● Outstanding attention to detail and a record of consistently meeting deadlines
● Exceptional communication and interpersonal skills
● Ability to work within a highly collaborative team, and also as a self-starter who can work independently with little guidance
● Experience in troubleshooting, performance tuning, and optimization
● Proficient in shell scripting, Python, Scala or other programming languages
● Knowledge of Spark/PySpark
● Excellent SQL knowledge, including the ability to read and write SQL queries
● Skilled in Hive (HQL) and HDFS
● Experience working with both unstructured and structured data sets, including flat files, JSON, XML, ORC, Parquet and Avro
● Comfortable working with big data environments and dealing with large diverse data sets
● Proficient in Linux environments
● Familiarity with source code management/versioning tools such as GitHub
● Understanding of CI/CD principles and best practices in data processing
● Experience building data visualization dashboards to capture data quality metrics using tools like Tableau
● Understanding of public cloud technologies such as AWS, GCP and Azure is a plus

We are an Equal Opportunity Employer, including disability/vets.

This position is not eligible for student visa sponsorship, including F-1 OPT or CPT. Candidates must have authorization to work in the U.S. without the need for employer sponsorship now or in the future.

* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰

Category: Engineering Jobs

Tags: Avro AWS Azure Big Data CI/CD Computer Science Data pipelines Data quality Data visualization Engineering GCP GitHub HDFS JSON Linux Parquet Pipelines PySpark Python Scala Shell scripting Spark SQL Tableau XML