Data Engineer (PySpark/Informatica)

Work at Home - Pennsylvania (PA), United States


Company: Highmark Health

Job Description:

JOB SUMMARY

***CANDIDATE MUST BE US Citizen (due to contractual/access requirements)***

We are seeking an integral member of our technical team to support the design, development, and maintenance of the organization's data and programming infrastructure, ensuring the efficient and reliable flow of data across various systems. This role requires a strong understanding of data and ETL architecture, cloud-based data solutions (ideally GCP), and ETL tools such as Informatica, combined with programming languages such as Python/PySpark and a proven track record of designing and implementing advanced data engineering solutions. The ideal candidate will have hands-on experience with the entire software development life cycle (SDLC), heavy coding experience in relevant languages and tools (e.g., Python, PySpark, SQL, Informatica), and a demonstrated ability to design and implement efficient data pipelines.


ESSENTIAL RESPONSIBILITIES

  • Design, develop, and maintain robust data processes and solutions to ensure the efficient movement and transformation of data across multiple systems.

  • Develop and maintain data models, databases, and data warehouses to support business intelligence and analytics needs.

  • Collaborate with stakeholders across IT, product, analytics, and business teams to gather requirements and provide data solutions that meet organizational needs.

  • Monitor work against the production schedule, provide progress updates, and report any issues or technical difficulties to lead developers regularly.

  • Implement and manage data governance practices, ensuring data quality, integrity, and compliance with relevant regulations.

  • Collaborate on the design and implementation of data security measures, including access controls, encryption, and data masking.

  • Perform data analysis and provide insights to support decision-making across various departments.

  • Stay current with industry trends and emerging technologies in data engineering, recommending new tools and best practices as needed.

  • Other duties as assigned or requested.

EXPERIENCE


Required

  • 3 years of experience in the design and analysis of algorithms, data structures, and design patterns used to build and deploy scalable, highly available systems.

  • 3 years of experience in a data engineering, ETL development, or data management role.

  • 3 years of experience in SQL and experience with database technologies (e.g., MySQL, PostgreSQL, MongoDB).

  • 3 years of experience in data warehousing concepts and experience with data warehouse solutions (e.g., Snowflake, Redshift, BigQuery).


Preferred

  • 3 years of experience with data pipeline and workflow management tools (e.g., Apache, GCP tools, Databricks, PySpark).
  • 3 years of experience building ETL and data integration pipelines in Python, PySpark, and Informatica.
  • 3 years of experience working with on-premises databases such as Oracle, Teradata, and DB2.
  • 3 years of experience with cloud platforms (GCP and Azure) and their respective data services.
  • 1 year of experience working with a variety of technology systems, designing solutions, or developing data solutions in healthcare.
  • 3 years of experience in data governance, data quality, and data security best practices and tools.
  • 3 years of experience translating requirements, design mockups, prototypes, or user stories into technical designs.
  • 3 years of experience writing data-related code that is fault-tolerant, efficient, and maintainable.


SKILLS

  • Demonstrated ability to achieve stretch goals in a highly innovative and fast-paced environment

  • Adaptability: Ability to take on diverse tasks and projects, adapting to the evolving needs of the organization

  • Analytical Thinking: Analytical skills with a focus on detail and accuracy

  • Interest and ability to learn other data development technologies/languages as needed

  • Technical Proficiency: Comfortable with a range of data tools and technologies, with a willingness to learn new skills as needed

  • Track record in designing and implementing large-scale data sources

  • Sense of ownership, urgency, and drive

  • Demonstrated passion for user experience and improving usability

  • Team Collaboration: A team player who can work effectively in cross-functional environments

  • Experience and willingness to mentor junior data engineers and help develop their skills and leadership

EDUCATION


Required

  • Bachelor’s degree in Computer Science, Information Systems, Data Science, Computer Engineering, or a related field; or relevant experience and/or education as determined by the company in lieu of a bachelor's degree.


Preferred

  • Master's degree in Computer Science, Information Systems, Data Science, Computer Engineering, or a related field


LICENSES or CERTIFICATIONS


Required

  • None


Preferred

  • None


Language (Other than English):

None

Travel Requirement:

0% - 25%

PHYSICAL, MENTAL DEMANDS and WORKING CONDITIONS

Position Type: Office- or Remote-based

Teaches / trains others: Occasionally

Travel from the office to various work sites or from site to site: Rarely

Works primarily out of the office selling products/services (sales employees): Never

Physical work site required: No

Lifting up to 10 pounds: Constantly

Lifting 10 to 25 pounds: Occasionally

Lifting 25 to 50 pounds: Rarely

Disclaimer: The job description has been designed to indicate the general nature and essential duties and responsibilities of work performed by employees within this job title. It may not contain a comprehensive inventory of all duties, responsibilities, and qualifications required of employees to do this job.

Compliance Requirement: This job adheres to the ethical and legal standards and behavioral expectations as set forth in the code of business conduct and company policies.

As a component of job responsibilities, employees may have access to covered information, cardholder data, or other confidential customer information that must be protected at all times. In connection with this, all employees must comply with both the Health Insurance Portability and Accountability Act of 1996 (HIPAA), as described in the Notice of Privacy Practices and Privacy Policies and Procedures, as well as all data security guidelines established within the Company’s Handbook of Privacy Policies and Practices and Information Security Policy.

Furthermore, it is every employee’s responsibility to comply with the company’s Code of Business Conduct. This includes but is not limited to adherence to applicable federal and state laws, rules, and regulations as well as company policies and training requirements.

Pay Range Minimum:

$67,500.00

Pay Range Maximum:

$126,000.00

Base pay is determined by a variety of factors including a candidate’s qualifications, experience, and expected contributions, as well as internal peer equity, market, and business considerations. The displayed salary range does not reflect any geographic differential Highmark may apply for certain locations based upon comparative markets.

Highmark Health and its affiliates prohibit discrimination against qualified individuals based on their status as protected veterans or individuals with disabilities and prohibit discrimination against all individuals based on any category protected by applicable federal, state, or local law.

We endeavor to make this site accessible to any and all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please contact the email below.

For accommodation requests, please contact HR Services Online at HRServices@highmarkhealth.org

California Consumer Privacy Act Employees, Contractors, and Applicants Notice
