High Performance Computing (HPC) System Engineer TS/SCI CI poly
Springfield, VA
Full Time Senior-level / Expert Clearance required USD 147K - 274K *
TENICA Global Solutions
High Performance Computing (HPC) System Engineer. Must have TS/SCI with CI poly.
Job location: Springfield, VA
Are you looking for an opportunity to combine your technical skills with big picture thinking to make an impact in High Performance Computing and AI solutions within the Intelligence Community? Our clients are building high performance computing (HPC) and accelerated compute environments from the ground up, performing modeling and simulation on GPUs designed for a variety of workloads. You will help create and maintain a DevOps process for these efforts, from basic data collection and preprocessing to assisting with the frameworks used to build and train AI and Machine Learning models within a Research and Development environment. Your ability to translate mission needs into technical solutions makes you an integral part of delivering a customer-focused engineering solution.
As a systems and DevOps engineer on our team, you have the chance to shape the geospatial intelligence mission by being a part of, or leading, a multi-functional accelerated-compute engineering team. Your customer will trust you not only to architect and engineer these environments, but also to evolve them with advanced technology solutions. On our team, you’ll be able to broaden your skillset into areas like DevOps, accelerated compute, GPU processing, and cluster management. Grow your skills by merging systems engineering, on-premises environments, Cloud and virtual architecture, and AI and ML frameworks to create a high-performance environment. Join our team and create the future of accelerated compute in the GEOINT mission.
Empower change with us.
The selected candidate will have:
- Strong experience working on Linux systems
- Experience with building and deploying containerized, GPU-enabled applications in Docker, Singularity, or Kubernetes
- Experience with orchestration and cluster management tools, including Slurm, Mesos, or Moab
- Experience with AI and Machine Learning development tool sets, including Jupyter, Keras, TensorFlow, MPI, OpenMP, OpenCL, or CUDA
- Experience with Lustre and InfiniBand maintenance and troubleshooting, including InfiniBand, fibre, and network plumbing, configuration, and maintenance
- Experience with deploying systems in both on-premises and Cloud environments, including AWS, Azure, or Google
- Experience with server hardware maintenance and troubleshooting
- Experience creating and maintaining system documentation
- Experience with RHEL and CentOS administration and ACE cluster administration for HPC clusters
- Experience with supporting environments for massively parallel computation
- Experience with certification and accreditation of containers
- Experience with programming and implementing scientific and physics M&S algorithms, Big Data, and Data Science
- Experience with optimizing applications to use AI and ML toolsets
* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰
Tags: Architecture AWS Azure Big Data CUDA DevOps Docker Engineering GPU HPC InfiniBand Jupyter Keras Kubernetes Linux Machine Learning OpenMP Physics Research TensorFlow