Senior Data Engineer
Remote
Applications have closed
About Us
eSimplicity is a modern digital services company that works across government, partnering with our clients to improve the health and lives of millions of Americans, ensure the security of all Americans (from soldiers and veterans to kids and the elderly), and defend national interests on the battlefield. Our engineers, designers, and strategists cut through complexity to create intuitive products and services that courageously equip Federal agencies with solutions to transform today for a better tomorrow for all Americans.
Responsibilities:
- Design, develop, and maintain scalable data pipelines using Spark, Hive, and Airflow
- Develop and deploy data processing workflows on the Databricks platform
- Develop API services to facilitate data access and integration
- Create interactive data visualizations and reports using AWS QuickSight
- Build the infrastructure required for optimal extraction, transformation, and loading of data from various data sources using AWS and SQL technologies
- Monitor and optimize the performance of data infrastructure and processes
- Develop data quality and validation jobs
- Assemble large, complex sets of data that meet functional and non-functional business requirements
- Write unit and integration tests for all data processing code
- Work with DevOps engineers on CI, CD, and IaC
- Read specs and translate them into code and design documents
- Perform code reviews and develop processes for improving code quality
- Improve data availability and timeliness by implementing more frequent refreshes, tiered data storage, and optimizations of existing datasets
- Maintain security and privacy for data at rest and while in transit
Required Qualifications:
- 7+ years of hands-on software development experience; 4+ years of data pipeline experience using Python, Java and cloud technologies
- Bachelor's degree in Computer Science, Information Systems, Engineering, Business, or other related scientific or technical discipline
- Experienced in Spark and Hive for big data processing
- Experience building job workflows with the Databricks platform
- Strong understanding of AWS products including S3, Redshift, RDS, EMR, AWS Glue, AWS Glue DataBrew, Jupyter Notebooks, Athena, QuickSight, and Amazon SNS
- Familiar with building processes that support data transformation, workload management, data structures, dependencies, and metadata
- Experienced in data governance processes to ingest (batch, stream), curate, and share data with upstream and downstream data users
- Experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up
- Demonstrated understanding of software and tools, including relational SQL and NoSQL databases such as Postgres and Cassandra; workflow management and pipeline tools such as Airflow, Luigi, and Azkaban; stream-processing systems such as Spark Streaming and Storm; and object-oriented and functional languages including Scala, C++, Java, and Python
- Familiar with DevOps methodologies, including CI/CD pipelines (GitHub Actions) and IaC (Terraform)
- Ability to obtain and maintain a Public Trust; must reside in the United States
- Experience with Agile methodology, using test-driven development
Working Environment:
eSimplicity supports a remote work environment operating within the Eastern time zone so we can work with and respond to our government clients. Expected hours are 9:00 AM to 5:00 PM Eastern unless otherwise directed by your manager.
Occasional travel for training and project meetings is expected, estimated at less than 25% per year.
Benefits:
We offer a highly competitive salary and full healthcare benefits.
Equal Employment Opportunity:
eSimplicity is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, religion, color, national origin, gender, age, status as a protected veteran, sexual orientation, gender identity, or status as a qualified individual with a disability.
Salary Description $127,300 - $140,000