Data Pipeline Operations Engineer
Maryland, United States - Remote
SixMap, Inc.
We are seeking a detail-oriented and technically skilled Data Pipeline Operations Engineer to manage and execute our weekly scanning process. This critical role ensures the timely flow of customer data through our research, scanning, and UI ingest pipeline. The ideal candidate combines programming, database, and Linux system administration skills to handle the various steps in the scanning workflow.
SixMap is the leading Automated Cyber Defense Platform for continuous threat exposure management (CTEM) across today’s largest, most complex and dynamic enterprise and government environments. With zero network impact and zero agents, SixMap automatically discovers all Internet-facing assets across IPv4 and IPv6 to deliver the most comprehensive external attack surface visibility. The platform identifies vulnerabilities, correlates proprietary and open-source threat intelligence, and provides actionable insights to defend against imminent threats with supervised proactive response capabilities. The SixMap team brings deep intelligence community expertise and best practices to the defense of both U.S. Federal agencies and Fortune 500 corporations.
Responsibilities
- Manage the weekly scanning process, ensuring customer data progresses through research, scanning, and UI ingest phases according to defined SLAs
- Prepare input files and kick off processes on the scanning cluster via Airflow (see the sketch after this list)
- Monitor and troubleshoot jobs, adjusting parameters like rate files as needed to optimize runtimes
- Perform data ingest into production databases using SQL and Python
- Clear data artifacts and caches between ingest cycles
- Execute post-ingest data refresh routines
- Perform quality checks on ingested data to validate that contractual obligations are met
- Identify process bottlenecks and suggest or implement improvements to the automated tooling to increase speed and reliability
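For context on the Airflow-driven part of this workflow, here is a minimal, hypothetical sketch of a weekly scan DAG. The DAG id, task names, and task bodies are illustrative assumptions, not SixMap's actual pipeline; it only shows how the stages above (input preparation, scanning, ingest, refresh and validation) might be chained, assuming Airflow 2.4+.

```python
# Hypothetical illustration only: the DAG id, stage names, and task bodies are
# assumptions for this sketch, not SixMap's actual pipeline code. Assumes Airflow 2.4+.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def prepare_inputs():
    """Build the per-customer input files consumed by the scanning cluster."""


def run_scan():
    """Kick off scanning jobs on the cluster and wait for them to finish."""


def ingest_results():
    """Load scan output into the production database with SQL/Python."""


def refresh_and_validate():
    """Run post-ingest refresh routines and data quality checks."""


with DAG(
    dag_id="weekly_scan_pipeline",     # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@weekly",                # one run per scanning cycle
    catchup=False,
) as dag:
    stages = [
        PythonOperator(task_id="prepare_inputs", python_callable=prepare_inputs),
        PythonOperator(task_id="run_scan", python_callable=run_scan),
        PythonOperator(task_id="ingest_results", python_callable=ingest_results),
        PythonOperator(task_id="refresh_and_validate", python_callable=refresh_and_validate),
    ]
    # Chain the stages so each phase waits on the previous one.
    for upstream, downstream in zip(stages, stages[1:]):
        upstream >> downstream
```

In practice each stage would be monitored from the Airflow UI and retried or re-parameterized (for example, adjusting scan rate files) when runtimes drift.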
Requirements
- Required Skills:
- Strong Linux command line skills
- Experience with Airflow or similar workflow orchestration tools
- Python programming proficiency
- Advanced SQL knowledge for data ingest, refresh, and validation (a small example follows the requirements list)
- Ability to diagnose and resolve issues with long-running batch processes
- Excellent attention to detail and problem-solving skills
- Strong communication skills to coordinate with other teams
- Flexibility to handle off-hours work when needed to meet SLAs
- Preferred Additional Skills:
- Familiarity with network scanning tools and methodologies
- Experience optimizing database performance
- Scripting skills to automate routine tasks
- Understanding of common network protocols and services
- Knowledge of AWS services like EC2
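As a concrete, entirely hypothetical illustration of the SQL validation skill noted above, the sketch below flags customers with no rows ingested in the current weekly cycle. The customers and scan_results tables, the column names, and the SQLite backend are assumptions made for the example only.

```python
# Hypothetical example: table/column names and the SQLite backend are
# illustrative assumptions, not SixMap's actual schema or database.
import sqlite3

# Flag customers with no scan results ingested in the last 7 days.
VALIDATION_SQL = """
SELECT c.customer_id
FROM customers AS c
LEFT JOIN scan_results AS s
       ON s.customer_id = c.customer_id
      AND s.scanned_at >= DATE('now', '-7 days')
GROUP BY c.customer_id
HAVING COUNT(s.customer_id) = 0;
"""


def find_missing_ingests(db_path: str) -> list[tuple]:
    """Return customer_ids whose weekly ingest appears to be missing."""
    with sqlite3.connect(db_path) as conn:
        return conn.execute(VALIDATION_SQL).fetchall()
```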
Benefits
- Competitive compensation packages, including equity
- Employer-paid medical, dental, vision, disability & life insurance
- 401(k) plans
- Flexible Spending Accounts (health & dependents)
- Unlimited PTO
- Remote Working Options