Mid+ Data Engineer
Colombia
Lean Tech
Lean Solutions Group is a top workforce optimization company. Explore our offshore and nearshore staffing solutions to transform your business operations.
Company Overview:
Lean Tech is a rapidly expanding organization situated in Medellín, Colombia. We pride ourselves on possessing one of the most influential networks within software development and IT services for the entertainment, financial, and logistics sectors. Our corporate projections offer a multitude of opportunities for professionals to elevate their careers and experience substantial growth. Joining our team means engaging with expansive engineering teams across Latin America and the United States, contributing to cutting-edge developments in multiple industries.
Currently, we are seeking a Mid+ Data Engineer to join our team. Here are the challenges that our next warrior will face and the requirements we look for:
Position Title: Mid+ Data Engineer
Location: Remote - LATAM
What you will be doing:
Lean Tech is seeking a Mid+ Data Engineer to work on an exploratory project with a focus on resolving systemic address mismatches for a prominent U.S. telecom company. The primary purpose of this role is to engage in hands-on data cleaning, prototyping, and collaborating with stakeholders to enhance address validation accuracy. This position is uniquely challenging as it emphasizes experimentation and iteration over production deployment, allowing for a dynamic exploration of solutions in an Agile environment. As a key member of the data engineering team, you will evaluate and implement innovative techniques, leveraging strong skills in Python, SQL, and geospatial analysis. Your contributions will directly support the company's commitment to professional growth and diversity.
- Conduct deep data cleaning and root cause analysis on mismatched customer addresses across systems.
- Use SQL and Python to explore, transform, and validate datasets for address accuracy.
- Perform geospatial analysis to validate and align address data utilizing metrics such as latitude and longitude.
- Prototype and test API-based solutions, including third-party validation tools.
- Apply natural language processing techniques to parse and normalize address data.
- Collaborate with stakeholders to investigate data anomalies and iterate on solution development.
- Present findings, methodologies, and iterative improvements within an Agile environment.
- Provide assessments and reports on data health and project progress.
Requirements:
- 3–5 years of experience in data engineering, data analysis, or related roles with a proven track record in designing and implementing data warehousing solutions, preferably in an Enterprise environment.
- Advanced proficiency in Python and SQL for data exploration, transformation, and validation.
- Strong experience in data cleaning and conducting root cause analysis.
- Intermediate experience with geospatial analysis tools or methods.
- Proficiency in using APIs, including developing, consuming, or evaluating third-party APIs, with the ability to read API documentation, make web requests, and parse responses.
- Capability to work with flat files and data extracts from multiple sources.
- Intermediate experience with Agile methodologies, particularly Scrum practices.
- Excellent problem-solving, communication, and time management skills.
- Basic familiarity with natural language processing (NLP) techniques for parsing address data is a plus.
- Exposure to data profiling or data health assessment tools is beneficial.
Nice to have:
- Experience with geospatial tools such as Lightbox data or GE Smallworld
- Knowledge of advanced geocoding techniques and fuzzy logic for data validation
- Familiarity with scripting languages other than Python, such as R or JavaScript, for data manipulation
- Understanding of cloud-based data solutions and platforms, such as AWS or Google Cloud
- Prior experience in telecom data systems or a related field
- Certification in data science or a related discipline
- Strong adaptability and willingness to learn emerging data technologies
- Effective leadership skills and experience in mentoring junior team members
- Geospatial analysis packages/tools like:
- Nominatim
- Geopy
- Placekey
- H3 Indexes
Soft skills:
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration abilities, with the capacity to work effectively across different time zones.
- Time management skills to prioritize tasks and manage multiple projects simultaneously, ensuring timely delivery and high-quality outcomes.
What we offer:
- Join a powerful tech workforce and help us change the world through technology
- Professional development opportunities with international customers
- Collaborative work environment
- Career path and mentorship programs that will lead to new levels.