Data Scientist
Durham, NC, US, 27710
Duke University
Duke University:
Duke University was created in 1924 through an indenture of trust by James Buchanan Duke. Today, Duke is regarded as one of America’s leading research universities. Located in Durham, North Carolina, Duke is positioned in the heart of the Research Triangle, which is ranked annually as one of the best places in the country to work and live. Duke has more than 15,000 students who study and conduct research in its 10 undergraduate, graduate and professional schools. With about 40,000 employees, Duke is the third largest private employer in North Carolina, and it now has international programs in more than 150 countries.
The Duke Law Library’s Data Lab is seeking a talented Data Scientist. The Data Lab plays a vital role in supporting the empirical research needs of Law School faculty, students, and staff, providing a broad range of skills to support each stage of the empirical research lifecycle. The Data Scientist will strengthen the Data Lab’s ability to create, manage, and share research-ready data and analysis resources that empower faculty, staff, and students in their work.
Reporting to the Assistant Director for Empirical Research and Data Services (ERDS), and collaborating closely throughout the Law School and Law Library, the Data Scientist will guide law faculty and students in selecting appropriate methodologies and best practices for each phase of empirical research projects. For select initiatives, the Data Scientist will design and implement data preparation workflows tailored to meet project requirements accurately and efficiently. Additionally, the Data Scientist will contribute to ERDS’s mission of promoting data literacy within legal scholarship and practice at Duke Law, while actively engaging with colleagues across the broader Duke University community.
Responsibilities
- Assist faculty and students in selecting appropriate methodologies across the entire lifecycle of an empirical project, including research design, data capture, data management, data cleaning, analysis, presentation of results, replicability, and archival.
- Coordinate and monitor the design, development, modification, and implementation of information technology applications to combine, manipulate, and transform datasets of varying size, scope, and substance into formats that allow researchers to more easily view, interpret, and analyze data.
- Extract, clean, and organize raw information harvested from different electronic and web-based sources—including social media platforms, websites, and PDF documents.
- Use programmatic API calls to automate the collection and management of research data sources.
- Use, apply, and develop AI-powered research tools, using techniques such as machine learning, large language models (LLMs), and natural language processing (NLP).
- Provide support for grant applications and IRB applications related to empirical research projects.
- Supervise and direct the performance of one or more student research assistants.
- Participate in development of long-range planning for and prioritization of new and existing projects.
- Ensure that external and internal regulations and policies governing data management are met, including regulations concerning security, auditability, and privacy.
- Advise students, faculty, and staff on the use of data management concepts and tools—in and/or outside a classroom setting.
- Participate in a lively community of research and data support professionals at Duke University and the broader Research Triangle area.
- Perform other duties as assigned.
Required Qualifications
Education/Training
Work requires knowledge and skills generally acquired through completion of a Bachelor's degree program in business, computer science, mathematics, statistics or social sciences.
Experience
Work requires knowledge of both statistical analysis and interpretation generally acquired through at least two years of relevant experience.
OR AN EQUIVALENT COMBINATION OF RELEVANT EDUCATION AND/OR EXPERIENCE
Skills
- Proficiency in Python and/or R programming languages
- Survey development
- Machine learning
- Large language models (LLMs)
- Natural Language Processing (NLP)
- Statistical analysis
- Excellent oral and written communication skills
- Demonstrated interpersonal and teamwork skills complemented by the ability to take initiative
- Independent active learning
Desired Education
- Advanced degree in computer/data science or an applied social science field
Desired Experience
- 2 to 4 years of progressive data science experience, including data capture, cleaning, and analysis; machine learning and NLP; programming; and database administration (design, implementation, tuning, backup, recovery, modification, and reorganization of relational databases).
- Strong technical skills and problem-solving abilities related to research data management
- Demonstrated experience with web scraping and electronic document parsing
- Working knowledge of HTML, JavaScript, and web design to facilitate entity extraction from web-based sources
- Experience or familiarity with legal information
- Demonstrated experience collaborating with stakeholders on applied data projects
- Applied experience with natural language processing and machine-learning algorithms
- Proficiency with Linux-based operating systems
- Familiarity with metadata standards for archiving and sharing research data
- Demonstrated experience using virtual machine environments to manipulate datasets
- Familiarity with the production and deployment of interactive data applications/dashboards
- Familiarity with both quantitative and qualitative data analysis methods
- Familiarity with emerging standards for research replicability
Job Code: 00001727 DATA ADMINISTRATION ANALYST
Job Level: 12
Duke is an Affirmative Action/Equal Opportunity Employer committed to providing employment opportunity without regard to an individual's age, color, disability, gender, gender expression, gender identity, genetic information, national origin, race, religion, sex, sexual orientation, or veteran status.
Duke aspires to create a community built on collaboration, innovation, creativity, and belonging. Our collective success depends on the robust exchange of ideas—an exchange that is best when the rich diversity of our perspectives, backgrounds, and experiences flourishes. To achieve this exchange, it is essential that all members of the community feel secure and welcome, that the contributions of all individuals are respected, and that all voices are heard. All members of our community have a responsibility to uphold these values.
Essential Physical Job Functions: Certain jobs at Duke University and Duke University Health System may include essential job functions that require specific physical and/or mental abilities. Additional information and provision for requests for reasonable accommodation will be provided by each hiring department.