Big Data Engineer
LATAM
Applications have closed
10Pearls
Welcome to 10Pearls!
We believe in harnessing the power of technology for social good through our core values: Innovate, modernize and accelerate.
This is a fully remote position only available in Peru, Colombia, Honduras, Costa Rica, Mexico, El Salvador, Guatemala, Nicaragua, and the Dominican Republic.
About 10Pearls
We are 10Pearls, an award-winning digital development company, helping companies with product design, development and technology acceleration. We have a culture of innovation, uniquely designed to help companies transform, digitize and scale by leveraging digital technology.
About the Project:
This opportunity involves working on a project for a valued US client. We are looking for a Sr. Data Engineer with a diverse background in data integration to join the Data Services team. Some data is small, some is very large (1 trillion+ rows); some is structured, some is not. The data comes in all kinds of sizes, shapes and formats, and lives in traditional RDBMSs like PostgreSQL, Oracle and SQL Server, MPPs like StarRocks, Vertica, Snowflake and Google BigQuery, and unstructured or key-value stores like MongoDB and Elasticsearch, to name a few. We are looking for individuals who can design solutions for any data problem using the different types of databases and technologies supported within our team. The client uses MPP databases to analyze billions of rows in seconds, and Spark and Iceberg, in batch or streaming mode, to process whatever the data needs are. If you’re ready to step up and take on some new technical challenges at a well-respected company, this is a unique opportunity for you.
Responsibilities
- Implement ETL/ELT processes using various tools and programming languages (Scala, Python) against our MPP databases: StarRocks, Vertica and Snowflake
- As Data Engineers, we are not only developers; we also maintain and administer our MPP ecosystem
- Tune and maximize our hardware potential from OS, network and storage levels
- Work with the Hadoop team and optimize Hive and Iceberg tables
- Run POCs between different table formats
- Contribute to the existing Data Lake and Data Warehouse initiative using Hive, Spark, Iceberg, Presto/Trino
- Analyze business requirements, design and implement required data models
Requirements
- BA/BS in Computer Science or a related field
- 2+ years of experience with MPP databases such as StarRocks, Vertica, Snowflake
- 5+ years of experience with RDBMS databases such as Oracle, MSSQL or PostgreSQL
- 2+ years of experience managing or developing in the Hadoop ecosystem
- Programming background with Scala, Python, Java or C/C++
- Experience with Elasticsearch or ELK stack
- Working knowledge of streaming technologies such as Kafka
- Strong in any of the Linux distributions: RHEL, CentOS or Fedora
- Deep knowledge of shell scripting, scheduling, and monitoring processes on Linux
- Experience working in both OLAP and OLTP environments
- Experience working on-prem, not just cloud environments
Nice to have:
- Working knowledge of data unification and setup using Presto/Trino
- Working knowledge of orchestration tools such as Oozie and Airflow
- Experience with Spark (PySpark, SparkSQL, Spark Streaming, etc.)
- Experience using ETL tools such as Informatica, Talend and/or Pentaho
- Understanding of Healthcare data
- Data Analyst or Business Intelligence experience would be a plus
Some Benefits we offer:
- Work from home
- Flexible Schedules
- Amazing people-oriented organizational culture
- Challenging projects using the latest technologies with clients from the US and Canada
- Technology and Soft Skills Internal Training
- Online Courses from Udemy and Pluralsight
Thank you for applying to this position. We’re more than thrilled to start reviewing your profile and great skills! This is the first step in our selection process, so you will be hearing back from our awesome recruitment team regarding the next steps 😀
10Pearls Team
Tags: Airflow Big Data BigQuery Business Intelligence Computer Science Data warehouse Elasticsearch ELK ELT ETL Hadoop Informatica Java Kafka Linux MongoDB MPP MS SQL OLAP Oozie Oracle Pentaho PostgreSQL PySpark Python RDBMS Scala Shell scripting Snowflake Spark SQL Streaming Talend
Perks/benefits: Flex hours