Jr. HPC System Administrator & Programmer

Hyde Park Campus, United States

University of Chicago

One of the world’s leading research universities, the University of Chicago inspires scholars to pursue field-defining research, while providing a transformative education for students.

Department

Provost Research Computing Center


About the Department

The University of Chicago Research Computing Center (RCC), a unit in the Office of Research, provides high-end research computing resources to researchers at the University of Chicago. It is dedicated to enabling research by providing access to centrally managed High-Performance Computing (HPC), storage, and visualization resources. These resources include hardware, software, high-level scientific and technical user support, and the education and training required to help researchers make full use of modern HPC technology and local and national supercomputing resources. The Office of Research oversees the conduct of sponsored research, research program development, and contract management functions.


Job Summary

The University of Chicago Research Computing Center (RCC) is seeking a qualified Jr. HPC System Administrator & Programmer to join its Systems and Operations Team, which manages and supports an ecosystem of HPC systems and services. The individual in this position will contribute to ongoing efforts to streamline RCC processes, maintain the backend tools of the HPC environment, develop automated workflows to support system administration, and improve the ways in which RCC enables transformational computational research at the University of Chicago. The job duties will primarily include development and maintenance of backend software and deployment automation for the systems in the RCC environment. The Jr. HPC System Administrator & Programmer will also work closely with the application development team in consolidating continuous integration and continuous deployment (CI/CD) approaches and supporting faculty projects. The ideal candidate will possess a strong technical background in programming and HPC and an analytical mind, and will be comfortable working as part of a team.

The job participates in the design of automated, scalable, and rapidly deployable solutions to systems infrastructure and server configuration. Installs, configures, and maintains operating systems, monitoring and alerting systems, utility software, and firewalls. Plans and executes hands-on maintenance for production servers as well as Windows and Linux servers.

This is a hybrid position requiring at least 3 days a week onsite.

Responsibilities

  • Work with moderate guidance to administer simple systems and assist in the administration of larger systems in an HPC environment, including both software and hardware.

  • Install, design, configure and maintain tools and scripts that are used for systems provisioning and configuration management.

  • Develop and maintain system software to automate operations such as management of HPC user accounts and resource allocations (i.e., computing cycles and storage quotas).

  • Maintain and further develop database-backed solutions and software to track and monitor HPC inventory including servers, network devices, compute nodes, and their respective details (specifications, locations, warranty status and renewals, health status, etc.).

  • Design and develop tools to automate tasks such as:

    • Collection of metrics and usage information.

    • Backup of research data to different storage tiers.

    • Identification and application of security patches and upgrades.

    • Execution of benchmarks and creation of a benchmark performance database.

  • Assist with the implementation, integration, administration and maintenance of security and infrastructure monitoring solutions and dashboards by developing tools and scripts, and also by leveraging existing open-source and commercial solutions.

  • Design and develop tools and metrics to assist RCC leadership with visualizing, analyzing and reporting usage information and other system statistics.

  • Assist with deployment, configuration and customization of applications commonly used to support an academic HPC environment such as XDMoD, Open OnDemand, ColdFront, etc.

  • Proactively troubleshoot issues, and respond to complex user support requests.

  • Create and maintain documentation for the tools and solutions developed and for system administration procedures.

  • Work with other internal teams to provide and gather feedback regarding user support and service delivery, and to identify and foster opportunities for improvement.

  • Assist with maintaining a knowledge base of useful systems-related information and standard operating procedures that other internal teams can consult when providing user support.

  • Become involved with mentoring students and interns working in the Systems team.

  • Contribute to developing software, tools and/or platforms for the reproducibility of scientific research.

  • Maintains complex system and network administration functions. Works with moderate guidance to administer simple systems and assists in the administration of larger systems.

  • Ensures integrity by implementing appropriate routine software and hardware solutions. Conducts routine hardware and software audits of workstations, backing up all information.

  • Performs other related work as needed.


Minimum Qualifications

Education:

Minimum requirements include a college or university degree in a related field.


Work Experience:

Minimum requirements include knowledge and skills developed through 2-5 years of work experience in a related job discipline.


Certifications:

---

Preferred Qualifications

Education:

  • Master’s in Computer Science or closely related field.

Experience:

  • Minimum of two years’ experience working with HPC systems or equivalent experience.

Technical Skills or Knowledge:

  • Experience with basic system configuration, fluent use of the command line interface, experience with building and installing software.

  • Experience with Python programming, including various packages for data processing (e.g., NumPy, SciPy, Pandas, Matplotlib).

  • Experience with shell scripting (Bash).

  • Experience with open-source SQL databases (deployment, configuration, modeling, access).

  • Experience with development in a Linux environment, version control using Git, GitLab/GitHub development practices.

  • Experience with container technology (Docker, Kubernetes).

  • Experience with automation and configuration management tools (Ansible, Puppet).

  • Experience implementing automation and monitoring of infrastructure and systems.

  • Experience reading, modifying, and porting existing Perl scripts.

  • Experience in setting up and executing benchmarks in an HPC environment and analyzing their results systematically.

  • Experience in creating and maintaining documentation that describes implemented solutions and standard operating procedures.

Preferred Competencies

  • Excellent interpersonal, verbal, written, and presentation skills.

  • Ability to understand and translate researchers’ scientific goals into technical requirements.

  • Ability to identify and gain expertise in appropriate new technologies and/or software tools.

  • Ability to function as part of an interactive team while demonstrating self-initiative to achieve project goals and the Research Computing Center’s mission.

  • Strong analytical skills, problem-solving ability, attention to detail.

  • Ability to work well with faculty and researchers.

  • Versatile, enthusiastic, and eager to learn new skills.

  • Possess a willingness and ability to support a diverse and inclusive environment.

Application Documents

  • CV or resume (required)

  • Cover letter (preferred)


When applying, the document(s) MUST be uploaded via the My Experience page, in the section titled Application Documents of the application.


Job Family

Information Technology


Role Impact

Individual Contributor


Scheduled Weekly Hours

37.5


Drug Test Required

No


Health Screen Required

No


Motor Vehicle Record Inquiry Required

No


Pay Rate Type

Salary


FLSA Status

Exempt


Pay Range

$83,750.00 - $107,500.00

The included pay rate or range represents the University’s good faith estimate of the possible compensation offer for this role at the time of posting.


Benefits Eligible

Yes

The University of Chicago offers a wide range of benefits programs and resources for eligible employees, including health, retirement, and paid time off. Information about the benefit offerings can be found in the Benefits Guidebook.


Posting Statement

The University of Chicago is an Affirmative Action/Equal Opportunity/Disabled/Veterans Employer and does not discriminate on the basis of race, color, religion, sex, sexual orientation, gender, gender identity, national or ethnic origin, age, status as an individual with a disability, military or veteran status, genetic information, or other protected classes under the law. For additional information please see the University's Notice of Nondiscrimination.

Staff job seekers in need of a reasonable accommodation to complete the application process should call 773-702-5800 or submit a request via the Applicant Inquiry Form.

We seek a diverse pool of applicants who wish to join an academic community that places the highest value on rigorous inquiry and encourages a diversity of perspectives, experiences, groups of individuals, and ideas to inform and stimulate intellectual challenge, engagement, and exchange.

 

All offers of employment are contingent upon a background check that includes a review of conviction history. A conviction does not automatically preclude University employment. Rather, the University considers conviction information on a case-by-case basis and assesses the nature of the offense, the circumstances surrounding it, the proximity in time of the conviction, and its relevance to the position.

The University of Chicago's Annual Security & Fire Safety Report (Report) provides information about University offices and programs that provide safety support, crime and fire statistics, emergency response and communications plans, and other policies and information. The Report can be accessed online at: http://securityreport.uchicago.edu. Paper copies of the Report are available, upon request, from the University of Chicago Police Department, 850 E. 61st Street, Chicago, IL 60637.

Tags: Ansible CI/CD Computer Science Docker Git GitHub GitLab HPC Kubernetes Linux Matplotlib NumPy Open Source Pandas Perl Puppet Python Research SciPy Security Shell scripting SQL Statistics

Perks/benefits: Health care

Region: North America
Country: United States
