Data Scientist

Columbus, IN, United States


We are looking for a talented Data Scientist to join our team specializing in Systems and IT for our Corporate Organization in Columbus, IN.

In this role, you will make an impact in the following ways:

  • Solving Business Problems: By leveraging data science methodologies, you'll provide innovative solutions to complex business challenges, driving strategic decision-making.
  • Developing Algorithms: Creating individual algorithms using statistical methodologies and programming languages will enhance the company's analytical capabilities and improve operational efficiency.
  • Collaborating with Domain Experts: Partnering with domain experts to verify model capabilities ensures that your solutions are practical, accurate, and aligned with business needs.
  • Data Preparation: Implementing statistical techniques to clean, prepare, and profile data will ensure high-quality datasets for deeper analysis, leading to more reliable insights.
  • Communicating Results: Clearly articulating results, methodologies, and learnings to stakeholders and peers will foster understanding and support for data-driven initiatives.
  • Knowledge Sharing: Continuously developing and advancing the team through knowledge sharing and collaboration will build a strong, cohesive, and innovative data science team.
  • Enhancing Data Quality: Your efforts in data preparation and profiling will improve the overall quality of data used in analyses, leading to more accurate and actionable insights.
  • Driving Innovation: By applying advanced statistical techniques and collaborating with experts, you'll drive innovation and contribute to the company's competitive edge.

To be successful in this role you will need the following: 

  • Communicates effectively - Developing and delivering multi-mode communications that convey a clear understanding of the unique needs of different audiences. 
     
  • Customer focus - Building strong customer relationships and delivering customer-centric solutions. 
     
  • Decision quality - Making good and timely decisions that keep the organization moving forward. 
     
  • Manages complexity - Making sense of complex, high quantity, and sometimes contradictory information to effectively solve problems. 
     
  • Tech savvy - Anticipating and adopting innovations in business-building digital and technology applications. 
     
  • Data Mining - Extracts insights from data by identifying relationships and patterns through use of a suite of data exploration and data visualization techniques to understand the underlying structure of the data and enable sound conclusions upon model building. 
     
  • Predictive Modeling - Develops analytical or machine learning models by using appropriate variable transformations, feature selection strategies, imputation strategies, class rebalancing, resampling strategies and quality control measures to generate predictive insights used in solving business questions. 
     
  • Programming - Creates, writes and tests computer code, test scripts, and build scripts using algorithmic analysis and design, industry standards and tools, version control, and build and test automation to meet business, technical, security, governance and compliance requirements. 
     
  • Requirements Analysis - Evaluates relationships and interdependencies between requirements based upon their complexity and value to the business in order to determine feasibility and prioritization. 
     
  • Statistical Modeling - Develops descriptive and explanatory statistical models and simulations for regression, classification, outlier detection, anomaly detection, and time series forecasting, using knowledge of foundational statistics such as null hypothesis significance tests, regression models, generalized linear modeling, time series analysis, rank statistics, probability distribution fitting, survival analysis, etc., to validate hypotheses for any given statistical or business question.
     
  • Problem Solving - Solves problems, and may mentor others on effective problem solving, using a systematic analysis process that leverages industry-standard methodologies to create problem traceability and protect the customer; determines the assignable cause; implements robust, data-based solutions; identifies the systemic root causes and ensures that actions to prevent problem recurrence are implemented.
     
  • Values differences - Recognizing the value that different perspectives and cultures bring to an organization. 

     

Education, Licenses, Certifications: 

  • College, university, or equivalent degree in relevant technical discipline, or relevant equivalent experience required. 
  • This position may require licensing for compliance with export controls or sanctions regulations. 

     

Experience: 
 

  • Relevant experience preferred, such as temporary student employment, internships, co-ops, or other extracurricular team activities.
     
  • Knowledge of the latest technologies and trends in data science is highly preferred and includes: 
    - Background in processing and managing large data sets 
    - Knowledge of big data, open-source, and third-party toolsets
    - Experience in building analytical solutions 
     
  • Experiences in the following are preferred: 
    - Familiarity with analyzing complex business systems, industry requirements, and/or data regulations
    - SQL query language
    - Implementation experience with cloud-based clustered compute
    - Implementing Big Data platform solutions using open-source and third-party tools
    - Microsoft Azure and/or Amazon Web Services environment
    - Experience in Agile software development 
    - Familiarity with validation and testing of machine learning systems 
    - Familiarity with Continuous Integration and Continuous Delivery (CI/CD)

Additional Responsibilities: 

1. Understanding of transformer architecture (encoder, decoder, attention mechanism)
2. Knowledge of, and some experience with, transfer learning
3. Basic knowledge of semantic learning and document databases
4. Search - Information Retrieval concepts and paradigms

Nice to have/know:

1. Proficiency in SQL and Python is a must. Proficiency in PySpark and R is a plus
2. Some knowledge of Databricks or a Hadoop/Spark cluster is a must, while familiarity with various IDEs for R, Python, and SQL is a plus
3. Be a pragmatic programmer who comments code for readability and uses functions for modularity
4. Knowledgeable in end-to-end machine learning implementation, with familiarity with model monitoring frameworks
5. Knowledgeable in bootstrap, descriptive statistics, and inferential statistics

At Cummins, we are an equal opportunity and affirmative action employer dedicated to diversity in the workplace. Our policy is to provide equal employment opportunities to all qualified persons without regard to race, gender, color, disability, national origin, age, union affiliation, sexual orientation, veteran status, citizenship, gender identity and/or expression, or other status protected by law. Visit EEOC.gov to know your rights on workplace discrimination.
