Senior Data Integration Engineer
San Antonio TX, United States
Marathon Petroleum Corporation
At MPC, we’re committed to being a great place to work – one that welcomes new ideas, encourages diverse perspectives, develops our people, and fosters a collaborative team environment.
The Senior Data Integration Engineer is responsible for integration development using the latest SAP S/4HANA, BTP, and cloud-related tools and technologies, and for providing SAP integration architecture across the organization, championing innovation, continuous improvement, and automation. This role focuses on leading integration development and modernization, and on ensuring customized solutions are resilient and follow MPC standards and best practices. The ideal candidate possesses a strong background in application security best practices. Excellent leadership and strong communication skills are also essential.
This position belongs to a family of jobs with increasing responsibility, competency, and skill level. Actual position title and pay grade will be based on the selected candidate’s experience and qualifications.
Key Responsibilities:
- Leads integration projects across different departments or systems.
- Collaborates closely with cloud engineers for cloud-based integrations.
- Collaborates with IT security to ensure data protection during integrations.
- Handles complex API integrations with third-party systems.
- Undertakes the automation of routine and repetitive data tasks.
- Advocates for and upholds data quality standards in integrations.
- Develops advanced data transformation and cleansing strategies and mentors less experienced team members in best practices.
- Participates in vendor selection and contributes to strategic decisions regarding integration tools and platforms.
- Ensures high availability and fault tolerance in integration processes.
- Develops and maintains integration solutions using SAP PO, SAP Integration Suite, and BusinessObjects Data Services.
- Designs and implements data integration workflows and ETL processes to ensure seamless data transformation and movement.
- Optimizes data warehousing, data lakes, and data pipelines to support robust data storage and retrieval.
- Leverages cloud-based services (AWS, Azure, GCP) and integration technologies (e.g., REST APIs, SOAP) for efficient data integration.
- Utilizes strong knowledge of relational and non-relational databases (e.g., MySQL, PostgreSQL, MongoDB, Cassandra) to support diverse data needs.
- Works with various data formats such as JSON, XML, Avro, and Parquet to ensure compatibility and efficiency in data handling (a brief REST-to-Parquet sketch follows this list).
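The sketch referenced above: a minimal Python example of REST-based extraction with Parquet output. The endpoint URL, bearer token, and record layout are placeholders rather than MPC systems, and the snippet assumes the requests, pandas, and pyarrow packages are installed.

```python
# Hypothetical example: pull JSON records from a REST endpoint and land them as Parquet.
# The endpoint URL, token, and record layout are placeholders, not MPC systems.
import requests
import pandas as pd

API_URL = "https://example.com/api/v1/work-orders"  # placeholder endpoint
API_TOKEN = "changeme"                               # placeholder credential

def extract_to_parquet(output_path: str = "work_orders.parquet") -> int:
    """Fetch JSON records over REST and persist them in Parquet format."""
    response = requests.get(
        API_URL,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=30,
    )
    response.raise_for_status()        # surface HTTP errors early
    records = response.json()          # assumes the API returns a JSON array of objects

    df = pd.DataFrame.from_records(records)
    df.to_parquet(output_path, index=False)  # Parquet output requires pyarrow or fastparquet
    return len(df)

if __name__ == "__main__":
    print(f"Wrote {extract_to_parquet()} records")
```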
Qualifications:
- Bachelor's degree in information technology or a related field, or equivalent experience.
- 5+ years of relevant experience is required.
- Experience with SAP Process Orchestration (PO) and SAP Integration Suite is required.
- Experience with API or DevSecOps CI/CD pipeline integration is preferred.
- Experience migrating interfaces from SAP PO to Integration Suite and defining/executing an SAP Clean Core strategy for integrations is preferred.
- Experience with ETL processes and data transformation with SAP Datasphere is preferred (a generic ETL sketch follows this list).
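The generic ETL sketch referenced above, in Python. It is deliberately not SAP Datasphere-specific; the file name, column handling, and SQLite staging target are assumptions chosen to keep the example self-contained.

```python
# Generic ETL illustration: extract a CSV, apply a simple transformation,
# and load the result into a SQLite staging table. Names are placeholders.
import sqlite3
import pandas as pd

def run_etl(source_csv: str = "meter_readings.csv", db_path: str = "staging.db") -> None:
    # Extract: read raw records from a flat file.
    raw = pd.read_csv(source_csv)

    # Transform: standardize column names, drop duplicates, derive a quality flag.
    raw.columns = [c.strip().lower().replace(" ", "_") for c in raw.columns]
    clean = raw.drop_duplicates()
    clean = clean.assign(is_valid=clean.notna().all(axis=1))

    # Load: write the cleansed records to a relational staging table.
    with sqlite3.connect(db_path) as conn:
        clean.to_sql("stg_meter_readings", conn, if_exists="replace", index=False)

if __name__ == "__main__":
    run_etl()
```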
Skills:
- API Development - Proficiency in designing, building, and consuming APIs to integrate services and machine learning models into data pipelines and data platforms, including feature engineering, model deployment, and monitoring.
- Automations - Automations refer to the systematic use of software tools, scripts, and processes to streamline and optimize the management, processing, and analysis of data. These automations aim to reduce manual intervention, minimize errors, and increase efficiency in handling various data-related tasks such as data ingestion, transformation, cleansing, integration, storage, and reporting. Automations in data engineering empower organizations to handle large volumes of data efficiently, reduce operational overhead, and accelerate the delivery of insights and analytics to stakeholders. (A minimal automation sketch follows this skills list.)
- Containerization - Containerization is a form of operating system virtualization in which applications run in isolated user spaces called containers, all sharing the same operating system (OS). Container orchestration automatically provisions, deploys, scales, and manages containerized applications without requiring teams to manage the underlying infrastructure.
- Data Integration - Proficiency in integrating data from various sources, including structured and unstructured data, using technologies such as ETL (Extract, Transform, Load) processes, data pipelines, and data ingestion frameworks.
- Data Pipelines - Data pipelines are a set of processes that enable the flow of data from one or multiple sources to a destination, often involving tasks such as extraction, transformation, and loading (ETL). These pipelines are designed to efficiently and reliably move and process data, ensuring its quality and accessibility for various analytical and operational purposes.
- Data Privacy - Ability to understand and implement practices that ensure the protection and confidential handling of personal and sensitive information. This includes knowledge of relevant laws and regulations (such as GDPR or HIPAA), the ability to design and enforce policies that safeguard data, and the skills to manage data access rights and consent protocols.
- Data Security - Knowledge of data privacy regulations, cybersecurity best practices, and techniques for protecting sensitive information and ensuring compliance.
- General Programming - Applies a computer language to communicate with computers using a set of instructions and to automate the execution of tasks.
- Low-Code/No-Code Development - Low-code/no-code development is an approach to software development that enables the creation of applications with minimal hand-coding, utilizing visual interfaces and pre-built components. It allows individuals with varying levels of technical expertise to participate in the development process, accelerating the application delivery cycle.
- Metadata Management - Proficiency in metadata management solutions to enable efficient data discovery, data lineage tracing, and data asset management.
- NoSQL Databases - NoSQL databases are a type of database management system that provides a flexible and scalable approach to storing and retrieving data, often diverging from the traditional relational database model. Unlike relational databases, NoSQL databases are designed to handle large volumes of unstructured or semi-structured data, offering high performance and horizontal scalability for modern applications. (A brief pymongo sketch follows this skills list.)
- Process Mining - Process mining is a data-driven methodology that leverages event logs and other data sources to analyze and visualize business processes, providing insights into their actual execution, deviations, and performance. It aims to discover, monitor, and improve processes by uncovering patterns, bottlenecks, and inefficiencies through the systematic analysis of event data.
- Process Orchestration - Process orchestration refers to the coordination and management of various tasks, activities, and resources within a workflow or business process. It involves organizing, sequencing, and automating individual tasks or sub-processes to ensure that the overall process operates smoothly and efficiently. Process orchestration typically involves integrating disparate systems, applications, and services to streamline operations and improve collaboration across different parts of an organization. It often employs workflow management tools, automation software, and integration platforms to facilitate communication, data exchange, and decision-making among different components of the process. The goal of process orchestration is to optimize the flow of work, minimize delays and bottlenecks, and enhance overall productivity and performance.
- Serverless Computing - Serverless computing is a cloud computing model where developers can build and run applications without managing the underlying server infrastructure. In this paradigm, the cloud provider automatically handles the scaling, maintenance, and allocation of resources, allowing developers to focus solely on writing code and deploying functions or applications. (A minimal function-handler sketch follows this skills list.)
- SQL DBMS - A SQL Database Management System (DBMS) is a software that facilitates the creation, organization, and management of relational databases. It provides a structured framework for storing, retrieving, and manipulating data using the Structured Query Language (SQL).
- Systems Automation - Systems automation refers to the use of technology and software to perform repetitive tasks, streamline processes, and control the operation of systems without manual intervention. It aims to enhance efficiency, reduce human errors, and optimize resource utilization by automating routine and complex functions within a given system.
- Systems Integration - The process of linking together various IT systems, services, and/or software so that they function together as a whole.
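A minimal sketch for the Automations item above: a routine sweep that validates inbound CSV drops and files them for loading or review. The folder layout and required column set are illustrative assumptions.

```python
# Routine data-task automation (assumed layout: CSV files dropped into an
# inbound folder are validated, then archived or quarantined on each run).
from pathlib import Path
import shutil
import pandas as pd

INBOUND = Path("inbound")
ARCHIVE = Path("archive")
ERRORS = Path("errors")
REQUIRED_COLUMNS = {"asset_id", "reading", "timestamp"}  # placeholder schema

def sweep_inbound() -> None:
    """Validate every CSV in the inbound folder and file it accordingly."""
    for folder in (ARCHIVE, ERRORS):
        folder.mkdir(exist_ok=True)

    for csv_path in sorted(INBOUND.glob("*.csv")):
        df = pd.read_csv(csv_path)
        if REQUIRED_COLUMNS.issubset(df.columns) and not df.empty:
            shutil.move(str(csv_path), str(ARCHIVE / csv_path.name))  # ready to load
        else:
            shutil.move(str(csv_path), str(ERRORS / csv_path.name))   # quarantine for review

if __name__ == "__main__":
    sweep_inbound()
```

A scheduler (cron, a workflow tool, or a CI job) would typically trigger a script like this so the sweep runs without manual intervention.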
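For the NoSQL Databases item, a brief pymongo sketch showing schema-flexible inserts and a simple query. The connection string, database, and collection names are placeholders, and the example assumes a locally running MongoDB instance.

```python
# Schema-flexible storage and retrieval with MongoDB via pymongo.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")  # placeholder connection string
collection = client["integration_demo"]["equipment_events"]

# Documents need not share a rigid schema; fields can vary per record.
collection.insert_one({"asset_id": "P-101", "event": "vibration_alert", "severity": 3})
collection.insert_one({"asset_id": "P-102", "event": "inspection", "notes": "routine"})

# Query by field value and iterate over matching documents.
for doc in collection.find({"severity": {"$gte": 3}}):
    print(doc["asset_id"], doc["event"])
```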
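For the Serverless Computing item, a minimal function-handler sketch in the shape of an AWS Lambda handler; the cloud platform invokes it per event, so no server is provisioned or managed in the code. The event structure shown is an assumption for illustration.

```python
# Minimal serverless-style handler: normalize incoming records and return a summary.
import json

def lambda_handler(event, context):
    """Process one invocation's payload; scaling and runtime are handled by the platform."""
    records = event.get("records", [])  # assumed event shape: {"records": [ {...}, ... ]}
    normalized = [
        {key.lower(): value for key, value in record.items()}  # simple field-name normalization
        for record in records
    ]
    return {
        "statusCode": 200,
        "body": json.dumps({"processed": len(normalized)}),
    }
```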
As an energy industry leader, our career opportunities fuel personal and professional growth.
Location: San Antonio, Texas
Additional locations: Findlay, Ohio
Job Requisition ID: 00015100
Location Address: 19100 Ridgewood Pkwy
Education:
Employee Group: Full time
Employee Subgroup: Regular

Marathon Petroleum Company LP is an Equal Opportunity Employer and gives consideration for employment to qualified applicants without discrimination on the basis of race, color, religion, creed, sex, gender (including pregnancy, childbirth, breastfeeding or related medical conditions), sexual orientation, gender identity, gender expression, reproductive health decision-making, age, mental or physical disability, medical condition or AIDS/HIV status, ancestry, national origin, genetic information, military, veteran status, marital status, citizenship or any other status protected by applicable federal, state, or local laws. If you would like more information about your EEO rights as an applicant, click here.
If you need a reasonable accommodation for any part of the application process at Marathon Petroleum LP, please contact our Human Resources Department at talentacquisition@marathonpetroleum.com. Please specify the reasonable accommodation you are requesting, along with the job posting number in which you may be interested. A Human Resources representative will review your request and contact you to discuss a reasonable accommodation.

Marathon Petroleum offers a total rewards program which includes, but is not limited to, access to health, vision, and dental insurance, paid time off, 401k matching program, paid parental leave, and educational reimbursement. Detailed benefit information is available at https://mympcbenefits.com. The hired candidate will also be eligible for a discretionary company-sponsored annual bonus program.
Equal Opportunity Employer: Veteran / Disability
We will consider all qualified Applicants for employment, including those with arrest or conviction records, in a manner consistent with the requirements of applicable state and local laws. In reviewing criminal history in connection with a conditional offer of employment, Marathon will consider the key responsibilities of the role.