Data Engineer II - PySpark
NY-New York, USA
Full Time Mid-level / Intermediate USD 118K - 196K
Memorial Sloan Kettering Cancer Center
Pay Range
$118,800.00-$196,200.00
Company Overview
The people of Memorial Sloan Kettering Cancer Center (MSK) are united by a singular mission: ending cancer for life. Our specialized care teams provide personalized, compassionate, expert care to patients of all ages. Informed by basic research done at our Sloan Kettering Institute, scientists across MSK collaborate to conduct innovative translational and clinical research that is driving a revolution in our understanding of cancer as a disease and improving the ability to prevent, diagnose, and treat it. MSK is dedicated to training the next generation of scientists and clinicians, who go on to pursue our mission at MSK and around the globe.
Please review important announcements about vaccination requirements and our upcoming EHR implementation by clicking here.
Important Note for MSK Employees:
Your Career Hub profile is submitted to the hiring team as your internal resume. Please be sure your profile is fully complete with your skills, relevant experience and education (if required). Click here to learn more. Please note, this link is only accessible for MSK employees.
Job Description
Data Engineer II
Exciting Opportunity at MSK
We are seeking an experienced data engineer to help drive MSK’s data modernization journey and to be a key player in MSK’s mission to fight cancer. You will leverage your skills with Spark, data lakehouse, and cloud technologies to design, develop, and maintain scalable data infrastructure and data processing systems for a new hybrid and multi-cloud data ecosystem. That ecosystem will serve multiple purposes, including data science and analytics, and will enable us to derive valuable insights from our data.
Position Overview:
- Work closely with stakeholders to understand data requirements and provide technical solutions.
- Develop ELT data pipelines using PySpark coding best practices.
- Use CI/CD and other DataOps best practices.
- Transform data into conformed models and data products.
- Monitor and optimize data processing jobs for performance, ensuring efficient resource utilization, optimal data storage structure and minimal processing time.
- Implement data validation and quality checks to ensure the integrity and reliability of data throughout the processing lifecycle.
- Create and maintain documentation for data pipelines, processes, and best practices.
- Keep abreast of the latest developments in data engineering and related technologies.
Required Skills:
- Strong proficiency with PySpark development and troubleshooting.
- Experience with Databricks and medallion architecture.
- Solid understanding of database design fundamentals.
- Solid understanding of cloud infrastructure concepts including network and security.
- Experience with CI/CD for development lifecycle and testing automation.
- Implementation expertise with AWS technologies, including S3, EC2, Glue, and MWAA.
- Strong analytical, problem-solving, and organizational skills.
- Excellent communication and collaboration skills.
- Minimum of 4 years of working experience as a cloud data engineer.
Core Skills:
- Ability to engage in depth in both business and technical discussions.
- Excellent communication and collaboration skills.
- Strong analytical, problem-solving, and organizational skills.
Additional Information:
- Location: 633 Third Avenue, NY
- Reports to: Director, Data Management
- Schedule: Hybrid, 4 days a month onsite
Pay Range: $118,800 - $196,200
Helpful Links:
- MSK Compensation Philosophy
- Review Our Great Benefits Offerings
Closing
MSK is an equal opportunity and affirmative action employer committed to diversity and inclusion in all aspects of recruiting and employment. All qualified individuals are encouraged to apply and will receive consideration without regard to race, color, gender, gender identity or expression, sexual orientation, national origin, age, religion, creed, disability, veteran status or any other factor which cannot lawfully be used as a basis for an employment decision.
Federal law requires employers to provide reasonable accommodation to qualified individuals with disabilities. Please tell us if you require a reasonable accommodation to apply for a job or to perform your job. Examples of reasonable accommodation include making a change to the application process or work procedures, providing documents in an alternate format, using a sign language interpreter, or using specialized equipment.