Senior Data Pipeline Developer

Remote - Colombia

Ansira

Fuel growth across your distributed networks with an industry-defining platform designed to synchronize your partner ecosystem.

Overview:

As a Senior Data Pipeline Developer, you will take part in the optimization and enhancement of data pipelines that power customer data management in the Ansira Connect SaaS platform. In this role, you will be part of a cross-functional team developing, maintaining, and improving fast, scalable, and highly complex cloud-native data solutions capable of processing millions of customer records in real time, while ensuring reliability and efficiency across varying data volumes.

The Senior Data Pipeline Developer is a self-starter with a strong desire to learn and work with cloud-native technologies and processes, to improve efficiency along the way, and to make an impact while contributing to cross-functional teams. Your work is all about data and the technology for turning raw customer data into actionable, targetable audiences. You are responsible for optimizing, monitoring, and maintaining fast, secure, and cost-effective data workflows that ingest customer information, including addresses, emails, and custom targeting fields, through real-time streaming pipelines. You'll ensure these pipelines perform efficiently whether processing millions of records or just a hundred, while maintaining data quality through deduplication, address standardization, and email validation.
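
To make that concrete, here is a minimal, hypothetical sketch of the kind of step such a pipeline might contain: a Spring Cloud Stream processor (the functional style that Spring Cloud Data Flow composes into streams) applying email validation and deduplication to inbound customer records. The CustomerRecord type, its fields, and the in-memory key set are illustrative assumptions, not Ansira's actual schema or implementation.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.regex.Pattern;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CustomerIngestProcessor {

    // Naive email shape check; a real pipeline would likely delegate to a
    // dedicated validation service.
    private static final Pattern EMAIL =
            Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");

    // Illustrative only: production deduplication would use shared state
    // (e.g. a keyed store or database), not process-local memory.
    private final Set<String> seenKeys = ConcurrentHashMap.newKeySet();

    // In Spring Cloud Stream's functional model, returning null from the
    // function drops the message; that is how invalid records and
    // duplicates are filtered out here.
    @Bean
    public Function<CustomerRecord, CustomerRecord> validateAndDedupe() {
        return record -> {
            if (record.email() == null
                    || !EMAIL.matcher(record.email()).matches()) {
                return null; // fails email validation
            }
            String key = record.email().toLowerCase() + "|" + record.address();
            return seenKeys.add(key) ? record : null; // duplicate record
        };
    }

    // Minimal stand-in for the real customer schema (requires JDK 16+).
    public record CustomerRecord(String email, String address) {}
}
```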

You are expected to contribute more than just code. You'll be involved in defining how things work, what they do, and why we do that instead of something else. We also expect you to share your knowledge and expertise with everyone else. Your ability to collaborate creatively and execute on team goals will affect scalability and directly contribute to the company's product and the features our team builds. You will collaborate with product, engineering, and other development teams to enhance cloud-native solutions using Spring Cloud Data Flow, Kafka, CockroachDB, and Kubernetes in a dynamic and agile environment.

You will be part of a fun, diverse team that seeks challenges, loves learning and values teamwork. You will have opportunities for learning, mentorship, career growth, and work on high-business impact areas.

Responsibilities:

  • Contribute to the full development life cycle of features and products in our SaaS Platform aiming to meet or exceed customer SLAs.

  • Participate in the design, development and implementation of large-scale distributed systems using cloud-native principles and technologies.

  • Participate in the design, development and implementation of applications and services able to process large volumes of data, focusing on security, scalability, latency, and resiliency.

  • Analyze business requirements and translate them into data processing workflows, including data collection and outbound data transfer.

  • Design, develop and maintain data pipeline processing functionality, either by extending existing systems, or implementing new ones as needed.

  • Extend the Data Warehouse and Data Lakes with data from diverse sources (RDBMS, NoSQL, REST APIs, flat files, streams, time-series data, proprietary formats), applying data processing standards and best practices.

  • Design and implement rigorous data analysis to proactively identify any inconsistencies or data quality issues. Provide recommendations for improvements.

  • Develop a strong understanding of different data sources and strategically implement data flows for robustness and scalability.

  • Identify development needs to improve and streamline operations.

  • Write scalable, performant, readable and tested code following standards and best coding practices.

  • Develop test strategies, use automation frameworks, write unit/functional tests to drive up code coverage and automation metrics.

  • Participate in code reviews and provide meaningful feedback that helps other developers to build better solutions.

  • Present your own designs to other development teams, engineering or stakeholders and review designs of others.

  • Contribute relevant, clean, concise and quality documentation to Ansira's knowledge base to support/increase information sharing within the organization.

  • Learn about Ansira's business, master our development process, culture and code base, then improve it.

  • Establish strong working relationships at all organizational levels and across functional teams.

  • Collaborate with internal and external stakeholders and the product team to gather functional and non-functional requirements and identify business needs.

  • Work closely with product owners and a wide variety of stakeholders to analyze and break down large requirements into small, simple, workable deliverables.

  • Work in a fast-paced environment and deliver incremental value iteratively and continuously.

  • Take responsibility and ownership of product timelines and deliverables.

Qualifications:

  • Bachelor's or Master's degree in computer science, computer engineering, statistics, math, a related field, or equivalent experience.

  • 5+ years of hands-on experience in application development using cloud technologies.

  • 5+ years of hands-on architecture experience with data pipelines and distributed computing engines.

  • 5+ years of hands-on experience developing and running ETL or ELT processes.

  • Expertise in using ETL or ELT to ingest data from diverse sources (RDBMS, NoSQL, REST APIs, flat files, streams, time-series data, proprietary formats).

  • Expertise designing and implementing pluggable, reusable platform components pertinent to data analytics and ingestion technologies.

  • Expertise in consuming web-services (REST, SOAP).

  • Expertise in developing software involving caching, queuing, concurrency, and network programming.

  • Expertise in using Continuous Integration, Continuous Delivery and DevSecOps best practices.

  • Expertise in running workloads in containers (Docker or Kubernetes).

  • Expertise in analyzing production workloads and developing strategies to run data systems with scale and efficiency.

  • Proficiency in SQL/PLSQL, data manipulation, query development, and query optimization.

  • Proficiency in troubleshooting and resolving performance issues at the database and application levels.

  • Proficiency in using flow charts, UML or C4 models.

  • Proficiency in using Unix and command-line tools.

  • Proficiency in Test Driven Development (TDD) or experience with automated testing including unit, functional, stress and load testing.

  • Proficiency in OWASP security principles, accessibility, and security compliance.

  • Competency in data security and data protection strategies.

  • Experience with the entire Software Development Life Cycle (SDLC) and Agile methodologies such as Scrum or Extreme Programming.

  • A passion for solving problems and providing workable solutions while demonstrating the flexibility to learn new technologies that meet business needs.

  • Strong communication skills (English) as well as experience in mentoring and educating your peers.

Preferred Knowledge/Skills:

  • Expertise in one or more programming languages such as Java, PHP, Python, Go, etc. Emphasis on Java (8+) and Python.

  • Expertise in one or more ETL/ELT tools such as Spring Cloud Data Flow, Google Dataflow, Apache Beam, Apache Airflow, etc. Emphasis on Spring Cloud Data Flow (Spring 4+ and Spring Boot 2+).

  • Expertise in one or more version control systems such as Git, SVN, CVS, or Team Foundation Server. Emphasis on Git.

  • Expertise in one or more message-oriented middleware technologies such as RabbitMQ, JMS, Kafka, or Pulsar. Emphasis on Apache Kafka.

  • Proficiency in one or more public cloud providers (AWS, Azure, GCP, etc.). Emphasis on Google Cloud Platform.

  • Proficiency in one or more cloud data warehouse platforms such as BigQuery, Snowflake, Redshift, Cloudera, Azure Data Lake Store, etc. Emphasis on BigQuery.

  • Proficiency in full-stack observability principles (tracing, metrics, logging) and one or more observability tools such as Apache SkyWalking, Prometheus, Grafana, Graylog, and Stackdriver.

  • Competency in one or more RDBMS such as PostgreSQL, MySQL, Oracle, SQL Server, etc. Emphasis on PostgreSQL.

  • Competency in developing queries and stored procedures in SQL, PLSQL, or T-SQL.

  • Fluency in data visualization techniques using tools such as PLX Dashboards, Google Data Studio, Looker, Tableau, or similar technologies. Emphasis on Looker.

  • Fluency in distributed or NoSQL databases such as CockroachDB, MongoDB, Cassandra, Couchbase, DynamoDB, Redis, etc.

  • Understanding of one or more large-scale data processing platforms such as Apache Spark, Apache Storm, Apache Flink, Hadoop, etc.

  • Understanding of cloud object storage such as S3, GCS. Emphasis on GCS.

  • Understanding of HTML and JavaScript.

Perks/benefits: Career development, startup environment
