Platform & HPC Data Engineer
Herndon, VA
Full Time | Senior-level / Expert | Clearance required | USD 133K - 247K *
Reinventing Geospatial (RGi)
RGi is a leading C5ISR innovator and geospatial expert in National Security. We deliver an Immediate Impact™ for soldiers and analysts.
Clearance: Active Top Secret clearance with willingness and ability to obtain an SCI and CI polygraph
US Citizenship required
As a Platform & HPC Data Engineer you will...
- Design and implement data management systems and architectures for HPC platforms, focusing on optimizing data flow, storage, and access in large-scale computing environments.
- Oversee the configuration, maintenance, and optimization of distributed file systems (e.g., Lustre, IBM Spectrum Scale (GPFS), NFS) and storage solutions used in HPC environments to ensure efficient performance, scalability, and reliability.
- Implement and manage metadata-driven systems for data labeling/tagging. This includes the development of strategies for classifying, indexing, and organizing datasets to enhance data discoverability, access control, and auditing.
- Configure and maintain various storage appliances (e.g., NetApp, Dell EMC, HPE) and integrated storage solutions. Ensure that storage devices are optimized for performance, capacity, and availability within the HPC ecosystem.
- Integrate data storage and management systems with HPC clusters, ensuring seamless data flow between compute nodes and storage appliances. Optimize data pipelines to support high-throughput workloads and minimize bottlenecks in I/O performance.
- Monitor and improve the performance of storage systems, focusing on I/O throughput, latency, and efficient resource allocation. Use performance metrics to guide optimizations across storage appliances and file systems.
- Implement security best practices for data access, protection, and management, ensuring compliance with government regulations and internal data governance policies. Configure encryption, access control, and secure data sharing methods.
- Develop and maintain automation scripts (e.g., using Python, Bash, or Perl) to streamline storage configurations, data labeling/tagging, and system monitoring tasks. Automate processes related to data integration and HPC platform management.
- Work closely with data scientists, HPC administrators, software developers, and other technical staff to support ongoing projects. Provide expertise in troubleshooting data storage issues and ensuring optimal system performance.
- Maintain thorough documentation for storage configurations, file system setups, data labeling/tagging procedures, and performance optimization strategies. Provide regular reports on system health, data management processes, and any improvements made.
Platform & HPC Data Engineer Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field. A Master’s degree or higher is a plus.
- 7+ years of experience in managing data infrastructure in HPC environments, with expertise in file systems, storage appliances, and data workflows.
- Hands-on experience with distributed file systems, including Lustre, IBM Spectrum Scale (GPFS), NFS, and others commonly used in HPC settings.
- Proven experience with storage appliance configuration (e.g., NetApp, Dell EMC, HPE, or similar systems), including performance tuning, capacity management, and reliability.
- Strong experience in implementing data labeling/tagging systems, metadata management, and structuring large datasets for efficient access and compliance.
- Knowledge of high-performance networking technologies (e.g., InfiniBand, RDMA) and their role in data transfer and storage optimization.
- Familiarity with data transfer and access tools and protocols such as GridFTP, rsync, and NFS for large-scale data movement.
Additional Skills We'd Like to See:
- Experience with cloud storage integration or hybrid cloud environments, with knowledge of cloud-native storage solutions and platforms (e.g., AWS S3, Ceph, OpenShift).
- Familiarity with HPC job schedulers (e.g., Slurm, PBS, Torque) and their interaction with data storage systems.
- Understanding of data protection mechanisms, including data replication, backup strategies, and disaster recovery in HPC environments.
- Experience with containerization (Docker, Singularity) in an HPC context for data processing and application deployment.
- Experience with machine learning or data science workflows in HPC environments.
We pride ourselves on giving employees an exceptional life experience, where creativity thrives, and challenges are simply part of the fun. We provide truly excellent benefits, including:
- 100% paid employee healthcare & dental insurance
- Paid parental leave
- 401(k) with matching
- Escalating vacation time
- Referral bonuses
- Tuition reimbursement
- Professional development training
- Free beverages and snacks
- Weekly catered lunches and breakfast on Fridays
Grow to be our next leader:
At RGi, fostering a strong and organic corporate culture is paramount and serves as a compass for the decisions we make and how we operate the company. We believe our culture of camaraderie, innovation, and collaboration reflects the caliber of our employees and their dedication to the mission of providing quality software to our customers. As such, we want our employees to feel empowered to seek growth and leadership opportunities within the company and position us to maintain our culture as we grow. RGi provides opportunities, resources, training, and mentorship to all our employees so they can take control of their careers and become leaders or crucial members of our company. If this is what you are looking for in a company, then you are what we are looking for in an employee.
Reinventing Geospatial, Inc. is an Equal Opportunity Employer committed to hiring and retaining a diverse workforce. We are an Equal Opportunity Employer, making decisions without regard to race, color, religion, sex, national origin, age, veteran status, disability, or any other protected class. U.S. Citizenship is required for all positions.
* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰
Tags: Architecture AWS Computer Science Data governance Data management Data pipelines Docker Engineering HPC InfiniBand Machine Learning Perl Pipelines Python Security
Perks/benefits: Career development Health care Insurance Lunch / meals Parental leave Snacks / Drinks Startup environment