Data Engineer

55 Fruit Street Boston (White Building), United States

Mass General Brigham

Mass General Brigham is an integrated healthcare system, uniting great minds to solve the hardest problems in medicine for our communities and the world.


Site: The General Hospital Corporation


 

At Mass General Brigham, we know it takes a surprising range of talented professionals to advance our mission—from doctors, nurses, business people and tech experts, to dedicated researchers and systems analysts. As a not-for-profit organization, Mass General Brigham is committed to supporting patient care, research, teaching, and service to the community.  We place great value on being a diverse, equitable and inclusive organization as we aim to reflect the diversity of the patients we serve.

At Mass General Brigham, we believe a diverse set of backgrounds and lived experiences makes us stronger by challenging our assumptions with new perspectives that can drive revolutionary discoveries and medical innovations in research and patient care. Therefore, we invite and welcome applicants from groups traditionally underrepresented in healthcare to apply, including people of color, people with disabilities, members of the LGBTQ community and/or gender-expansive people, first- and second-generation immigrants, veterans, and people from different socioeconomic backgrounds.


 

Our laboratory applies computational and machine learning methods to investigate the impact of seizures and abnormal brain activity on outcomes in pigs with cortical impact. Our goal is to understand pathological correlates of epilepsy and traumatic brain injury. Analysis of datasets (video-EEG telemetry and intracellular chloride, among others) is central to these efforts.

Specific efforts focus on developing methods to automatically classify the semiology of pigs in video monitoring as they develop epilepsy, and on understanding how any abnormal behaviors relate to time after injury or to changes in seizure frequency. The work will rely particularly on supervised machine learning, including training artificial neural networks with open-source software such as Keras, TensorFlow, DeepLabCut, SimBA, and TREBA, as well as on unsupervised learning methods, heuristics, and other algorithms to learn patterns, fit and extrapolate from models, and process large datasets of video frames.

The person in this role will interact with staff in other labs, such as Sydney Cash’s lab, Kevin Staley’s lab, and Kyle Lillis’ lab.

PRINCIPAL DUTIES AND RESPONSIBILITIES:

The machine learning engineer will work with and mentor a team of researchers in searching for patterns hidden in large data sets for research in neurology. The machine learning engineer will be responsible for data from the electronic data repository, including EEG, video, and peripheral blood biomarkers. The machine learning engineer will develop unique algorithmic approaches for analysis of data and will supervise and mentor a team of research staff. Responsibilities will include:

- Creating or applying methods for automatic classification or regression on large data

- Software development and code management

- Data wrangling of biological, instrumental, or technical data

- Guiding a team on computational tasks and helping oversee research staff

- Problem solving and troubleshooting of technical problems for research staff

- Management of a large physiological database, warehouse, and/or repository

- Development of algorithms and maintaining a software pipeline

- Collaborate and interface with personnel from other research laboratories

- Documenting steps for reproducing results

- Outlining desired milestones for research staff so that objectives can be met

- Generate reports of statistical analysis

- Prepare and submit research manuscripts and abstracts

- Provide weekly updates on data processing, analysis or other research progress

- Present at lab meetings, and at local and national meetings

- Data annotation, storage, and management

- Communicating concepts in a helpful way to those who are not computer scientists

· Excellent analytical and troubleshooting skills.

· Demonstrated knowledge of software development methodologies and software pipeline design.

· Ability to solve complex and large-scale problems to make important contributions to medicine and science.

· Strong software engineering and quantitative background, including knowledge of Python, Unix shells (e.g. Bash, Zsh), deep neural networks of different architectures (convolutional, recurrent, etc.), algorithms (sorting, binary search, etc.), C++ or another compiled language, calculus, basic statistics (hypothesis testing, distributions, regression, etc.), and data visualization.

· Experience with the scientific method and critical thinking

· Detail-oriented and proactive work style

· Strong ethical principles

· Ability to work independently and as part of a team

· Excellent verbal and written communication skills

Additional skills are a plus, including:

Parallel computing, command prompt, working with GPUs, supercomputing, video software (e.g. ffmpeg), SQL, large language models, MATLAB, PHP or other additional languages, hardware knowledge, advanced understanding of kernel, neurobiology knowledge, etc.


 

Job Summary

Summary
The data engineer is responsible for implementing methods to improve data reliability and quality. The role combines raw information from different sources into consistent, machine-readable formats, and develops and tests architectures that enable data extraction and transformation for predictive or prescriptive modeling.

Does this position require Patient Care?
No

Essential Functions
-Design, develop, and implement data pipelines and ETL/ELT code to support business requirements.

-Maintain and optimize various components of the data pipeline architecture.

-Deliver high quality, efficient solutions to meet technical standards and industry best practices.

-Deliver optimal technical solutions for business and operational requirements.

-Participate in team design sessions and contribute options and solutions.

-Produce and support product documentation.

-Participate in ETL Quality circle discussions to explore, discuss, and arrive at efficient solutions and best practices.


 

Qualifications

Education
Bachelor's degree in Computer Science or a related field of study required

Can this role accept experience in lieu of a degree?
Yes

Licenses and Credentials

Experience
- Data warehousing development in large reporting environment(s): 2-3 years required
- Experience developing data pipelines using Snowflake features (Snowpipe, SnowSQL, Snowsight, Data Streams) required
- Hands-on development experience with ETL/ELT tools such as dbt, Fivetran, or Informatica required
- Experience working in an Agile software development environment required

Knowledge, Skills and Abilities
- Working knowledge of cloud computing platforms such as AWS, GCP, or Azure.
- Familiarity with enterprise data warehousing systems a plus.






 

Additional Job Details (if applicable)

Physical Requirements

  • Standing Occasionally (3-33%)
  • Walking Occasionally (3-33%)
  • Sitting Constantly (67-100%)
  • Lifting Occasionally (3-33%) 20lbs - 35lbs
  • Carrying Occasionally (3-33%) 20lbs - 35lbs
  • Pushing Rarely (Less than 2%)
  • Pulling Rarely (Less than 2%)
  • Climbing Rarely (Less than 2%)
  • Balancing Occasionally (3-33%)
  • Stooping Occasionally (3-33%)
  • Kneeling Rarely (Less than 2%)
  • Crouching Rarely (Less than 2%)
  • Crawling Rarely (Less than 2%)
  • Reaching Occasionally (3-33%)
  • Gross Manipulation (Handling) Constantly (67-100%)
  • Fine Manipulation (Fingering) Frequently (34-66%)
  • Feeling Constantly (67-100%)
  • Foot Use Rarely (Less than 2%)
  • Vision - Far Constantly (67-100%)
  • Vision - Near Constantly (67-100%)
  • Talking Constantly (67-100%)
  • Hearing Constantly (67-100%)


 

Remote Type

Onsite


 

Work Location

55 Fruit Street


 

Scheduled Weekly Hours

0


 

Employee Type

Per Diem


 

Work Shift

Day (United States of America)


 

EEO Statement:

The General Hospital Corporation is an Affirmative Action Employer. By embracing diverse skills, perspectives and ideas, we choose to lead. All qualified applicants will receive consideration for employment without regard to race, color, religious creed, national origin, sex, age, gender identity, disability, sexual orientation, military service, genetic information, and/or other status protected under law. We will ensure that all individuals with a disability are provided a reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment.


 

Mass General Brigham Competency Framework

At Mass General Brigham, our competency framework defines what effective leadership “looks like” by specifying which behaviors are most critical for successful performance at each job level. The framework comprises ten competencies (half People-Focused, half Performance-Focused), which are defined by observable and measurable skills and behaviors that contribute to workplace effectiveness and career success. These competencies are used to evaluate performance, make hiring decisions, identify development needs, mobilize employees across our system, and establish a strong talent pipeline.





