Mid-Level Data Engineer

Arlington, VA, United States

PeopleTec

Delivering world-class solutions to the Department of Defense and Civilian Federal Sectors from Huntsville, Alabama.

Responsibilities

PeopleTec is currently seeking a Mid-Level Data Engineer to support the DC-area offices of the Chief Digital and AI Office (CDAO) in Falls Church, the Pentagon, Alexandria, and Arlington.

 

Duties Include:

  • Develop a generalized tool to semantically search, summarize, and categorize unstructured data

  • Participate in DoD and government AI/ML Task Forces, connect with others in DoD working on similar capabilities, and share best practices with an LLM community of practice

  • Extend a generalized API deployed to NIPR to semantically search, summarize, and categorize unstructured data and enable others across the Department to use the API within the paradigm of CDAO / Advana 1.2's self-service model

  • Support the installation of the capability on other networks at different classification levels, including SIPRNet and JWICS

  • The capability includes a set of swappable containers with different functions that provide inputs and outputs through an API

  • Develop a methodology to test search performance (with varying levels of prompt engineering)

  • Contribute to and drive a demand signal for a data operations playbook for unstructured data

  • Develop a cost model for semantic search API use cases

  • Develop and document a strategy and implementation plan to ingest and consistently store unstructured data on the Advana platform, following the Bronze/Silver/Gold table paradigm (i.e., raw files in bronze; parsed/transformed data in silver; and cleaned, processed, query-ready data in gold)

  • Develop an approach to address issues arising from maintaining semantic indices associated with document change management and version control for unstructured data, such as when a new manual comes out to replace a previous version

Major Duties/Tasks:

  • Support the configuration and ingestion of designated structured, unstructured, and semi-structured data repositories into capabilities that satisfy mission partner requirements and support a data analytics and DevOps pipeline to drive rapid delivery of functionality to the client;
  • Maintain all operational aspects of data transfers while accounting for the security posture of the underlying infrastructure and of the supported systems and applications, and monitor the health of the environment through a variety of health-tracking capabilities;
  • Automate configuration management, leverage tools, and stay current on data extract, transform, and load (ETL) technologies and services;
  • Work under general guidance, demonstrate an initiative to develop approaches to solutions independently, review architecture, and identify areas for automation, optimization, right-sizing, and cost reduction to support the overall health of the environment;
  • Apply knowledge of data engineering-specific technologies and services, leverage expertise in databases and a variety of approaches to structuring and retrieving data, understand Cloud architectural constructs, and support the establishment and maintenance of Cloud environments programmatically and using vendor consoles;
  • Engage with multiple functional groups to comprehend client challenges, prototype new ideas and new technologies, help to create solutions to drive the next wave of innovation, and design, implement, schedule, test, and deploy full features and components of solutions
  • Maintain an existing collection of web scraping tools used as the initial step of the ETL process
  • Identify and implement scalable and efficient coding solutions

Qualifications

Required Skills/Experience:

  • Experience with Big Data systems, including Apache Spark / Databricks
  • Experience with ETL processes
  • Experience with Amazon Web Services (AWS), Microsoft Azure, or MilCloud 2.0
  • Experience applying DoD Security Technical Implementation Guides (STIGs) and automating that process
  • Experience with multiple coding languages
  • Travel: <10%
  • Must be a U.S. Citizen
  • An active DoD Top Secret clearance with SCI eligibility is required to perform this work. Candidates are required to have an active Top Secret clearance with SCI eligibility upon hire, and the ability to maintain this level of clearance during their employment.

Education Requirements:

  • Bachelor’s degree plus 5-7 years of experience, or a Master’s degree plus 3 years of experience.

Overview

People First. Technology Always.

 

PeopleTec, Inc. is an employee-owned small business founded in Huntsville, AL that provides exceptional customer support by employing and retaining a highly skilled workforce.

 

Culture: The name "PeopleTec" was deliberately chosen to remind us of our core value system - our people. Our company's foundation was built on placing our employees and customers first. With an award-winning atmosphere, we have matured into a company that boasts the best and brightest across multiple technical fields.

 

Career: At PeopleTec, we value your long-term goals. Whether it's through our continuing-education opportunities, our robust training programs, or our "People First" benefits package, PeopleTec truly believes that our best investments are our people.

 

Come Experience It.

 

EEO Statement

 

PeopleTec, Inc. is an Equal Employment Opportunity employer and provides reasonable accommodation for qualified individuals with disabilities and disabled veterans in its job application procedures. If you have any difficulty using our online system and you need an accommodation due to a disability, you may use the following email address, applicationhelp@peopletec.com and/or phone number (256.319.3800) to contact us about your interest in employment with PeopleTec, Inc.

 

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, citizenship, ancestry, marital status, protected veteran status, disability status or any other status protected by federal, state, or local law. PeopleTec, Inc. participates in E-Verify.

Category: Engineering Jobs

Tags: APIs Architecture AWS Azure Big Data Classification Data Analytics Databricks DataOps DevOps Engineering ETL LLMs Machine Learning Prompt engineering Security Spark Unstructured data

Perks/benefits: Health care

Region: North America
Country: United States
