Mid-Level Data Engineer
Arlington, VA, United States
Full Time Mid-level / Intermediate Clearance required USD 107K - 200K *
PeopleTec
Delivering world-class solutions to the Department of Defense and Civilian Federal Sectors from Huntsville, Alabama.
Responsibilities
PeopleTec is currently seeking a Mid-Level Data Engineer to support our DC-area offices of the Chief Digital and AI Office (CDAO) at the Falls Church, Pentagon, Alexandria, and Arlington locations.
Duties Include:
Develop a generalized tool to semantically search, summarize, and categorize unstructured data
Participate in DoD and government AI/ML Task Forces, connect with others in DoD working on similar capabilities, and share best practices with an LLM community of practice
Extend a generalized API deployed to NIPR to semantically search, summarize, and categorize unstructured data and enable others across the Department to use the API within the paradigm of CDAO / Advana 1.2's self-service model
Support the installation of the capability on other networks at different classification levels, including SIPRNet and JWICS
The capability includes a set of swappable containers with different functions that provide inputs and outputs through an API
Develop a methodology to test search performance (with varying levels of prompt engineering)
Contribute to and drive a demand signal for a data operations playbook for unstructured data
Develop a cost model for semantic search API use cases
Develop and document a strategy and implementation plan to ingest and consistently store unstructured data on the Advana platform, following the Bronze/Silver/Gold table paradigm (i.e., raw files in bronze; parsed/transformed data in silver; and cleaned, processed, query-ready data in gold)
Develop an approach to address issues arising from maintaining semantic indices associated with document change management and version control for unstructured data, such as when a new manual comes out to replace a previous version
Major Duties/Tasks:
- Support the configuration and ingestion of designated structured, unstructured, and semi-structured data repositories into capabilities that satisfy mission partner requirements and support a data analytics and DevOps pipeline to drive rapid delivery of functionality to the client;
- Maintain all operational aspects of data transfers while accounting for the security posture of the underlying infrastructure and the systems and applications that are supported and monitoring the health of the environment through a variety of health tracking capabilities;
- Automate configuration management, leverage tools, and stay current on data extract, transform, and load (ETL) technologies and services;
- Work under general guidance, demonstrate an initiative to develop approaches to solutions independently, review architecture, and identify areas for automation, optimization, right-sizing, and cost reduction to support the overall health of the environment;
- Apply comprehension of data engineering-specific technologies and services, leverage expertise in databases and a variety of approaches to structuring and retrieving data, comprehend Cloud architectural constructs, and support the establishment and maintenance of Cloud environments programmatically using vendor consoles;
- Engage with multiple functional groups to comprehend client challenges, prototype new ideas and new technologies, help to create solutions to drive the next wave of innovation, and design, implement, schedule, test, and deploy full features and components of solutions;
- Maintain an existing collection of web scraping tools used as the initial step of the ETL process;
- Identify and implement scalable and efficient coding solutions
Qualifications
Required Skills/Experience:
- Experience with Big Data systems, including Apache Spark / Databricks
- Experience with ETL processes;
- Experience with Amazon Web Services (AWS), Microsoft Azure, or MilCloud 2.0;
- Experience applying DoD Security Technical Implementation Guides (STIGs) and automating that process
- Experience with multiple coding languages
- Travel: <10 %
- Must be a U.S. Citizen
- An active DoD Top Secret clearance with SCI eligibility is required to perform this work. Candidates are required to have an active Top Secret clearance with SCI eligibility upon hire, and the ability to maintain this level of clearance during their employment.
Education Requirements:
- Bachelor’s degree plus 5-7 years of experience, or a Master’s degree plus 3 years of experience.
Overview
People First. Technology Always.
PeopleTec, Inc. is an employee-owned small business founded in Huntsville, AL that provides exceptional customer support by employing and retaining a highly skilled workforce.
Culture: The name "PeopleTec" was deliberately chosen to remind us of our core value system - our people. Our company's foundation was built on placing our employees and customers first. With an award-winning atmosphere, we have matured into a company that boasts the best and brightest across multiple technical fields.
Career: At PeopleTec, we value your long-term goals. Whether it's through our continuing-education opportunities, our robust training programs, or our "People First" benefits package, PeopleTec truly believes that our best investments are our people.
Come Experience It.
EEO Statement
PeopleTec, Inc. is an Equal Employment Opportunity employer and provides reasonable accommodation for qualified individuals with disabilities and disabled veterans in its job application procedures. If you have any difficulty using our online system and you need an accommodation due to a disability, you may use the following email address, applicationhelp@peopletec.com and/or phone number (256.319.3800) to contact us about your interest in employment with PeopleTec, Inc.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, citizenship, ancestry, marital status, protected veteran status, disability status or any other status protected by federal, state, or local law. PeopleTec, Inc. participates in E-Verify.
* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰
Tags: APIs Architecture AWS Azure Big Data Classification Data Analytics Databricks DataOps DevOps Engineering ETL LLMs Machine Learning Prompt engineering Security Spark Unstructured data
Perks/benefits: Health care