Data Engineer (Databricks)
Latin America
Factored
Empower your business with top AI engineers in innovation, business analytics, and data science. Scale efficiently with our expert-led AI solutions. We are currently looking for an exceptionally talented Data Engineer to join our team. You will be called on for a wide range of responsibilities, including data aggregation, scraping, validation, transformation, quality, and DevOps administration of both structured and unstructured datasets. Ideally, you will be experienced in optimizing data architecture, building data pipelines, and wrangling data to suit the needs of our algorithms and application functionality. Since you’ll be joining an early-stage startup at the ground level, you’ll need to be a self-starter with a high degree of initiative and accountability. You must be able to wear multiple hats and take on additional responsibility on our growing team. #LI-Remote
What you will be doing:
- Develop and maintain ETL (Extract, Transform, Load) processes using Python.
- Design, build, and optimize large-scale data pipelines on Databricks.
- Write efficient SQL queries to extract, manipulate, and analyze data from various databases.
- Design and develop optimal data processing techniques: automating manual processes, data delivery, data validation and data augmentation.
- Collaborate with stakeholders to understand data needs and translate them into scalable solutions.
- Design and develop API integrations in order to feed different data models.
- Architect and implement new features from scratch, partnering with AI/ML engineers to identify data sources, gaps and dependencies.
- Identify bugs and performance issues across the stack, using performance monitoring and testing tools to ensure data integrity and a quality user experience.
- Build a highly scalable infrastructure using SQL and AWS big data technologies.
- Keep data secure and compliant with international data handling rules.
What you must bring:
- 3 to 5+ years of professional experience shipping high-quality, production-ready code.
- Strong computer science foundations, including data structures and algorithms, operating systems, computer networks, databases, and object-oriented programming.
- Experience with Databricks.
- Experience in Python.
- Experience in setting up data pipelines using relational SQL and NoSQL databases, such as Postgres, Cassandra, or MongoDB.
- Experience with cloud services for handling data infrastructure, such as Snowflake (preferred), Azure, Databricks, Azure Databricks, and/or AWS.
- Experience with orchestration tools such as Airflow.
- Proven success manipulating, processing, and extracting value from large datasets.
- Experience with Big Data tools, including Hadoop, Spark, Kafka, etc.
- Expertise with version control systems, such as Git.
- Strong analytic skills related to working with unstructured datasets.
- Excellent verbal and written communication skills in English.
Nice to have:
- BSc in Computer Science, Mathematics or similar field; Master’s or PhD degree is a plus.
- Experience with real-time scenarios, low-latency systems, and data-intensive environments.
- Experience developing scalable RESTful APIs.
- Experience with consumer applications and data handling.
- Familiarity with data privacy regulations and best practices.
We are a transparent workplace, where EVERYBODY has a voice in building OUR company, and where learning and growth are available to everyone based on their merits, not just on stamps on their resume. As impressive as some of the stamps on our resumes are, we recognize that human talent and passion exist everywhere and come from many backgrounds, so stamps matter much less than results. All of us are dedicated doers and are highly energetic, focusing vehemently on execution because we know that the best learning happens by doing. We recognize that we are creating OUR COMPANY TOGETHER, which is not only a high-performing, fast-growing business, but is also changing the way the world perceives the quality of technical talent in Latin America. We are fueled by the great positive impact we are making in the places where we do business, and are committed to accelerating careers and investing in hundreds (and hopefully thousands) of highly talented data science engineers and data analysts.
In short, our business is about people, so we hire the best people and invest as much as possible in making them fall in love with their work, their learning, and their mission. When not nerding out on data science, we love to make music together, play sports, play games, dance salsa, cook delicious food, brew the best coffee, throw the best parties, and generally have a great time with each other.
Tags: Airflow APIs Architecture AWS Azure Big Data Cassandra Computer Science Databricks Data pipelines DevOps ETL Git Hadoop Kafka Machine Learning Mathematics MongoDB NoSQL OOP PhD Pipelines PostgreSQL Privacy Python Snowflake Spark SQL Testing
Perks/benefits: Career development Startup environment