Data Engineer for AI
Philippines (Remote)
This is a remote position.
What the engineer will actually do:
- P1 | Build and schedule Python parsers that extract structured JSON from PowerPoint, PDF, and Excel documents, then land the data in Databricks Bronze → Silver tables.
- P1 | Develop/maintain simple Auto Loader or Fivetran pipelines for ERP and ticketing systems.
- P2 | Add basic text‑embedding or LLM‑based entity extraction (LangChain or open‑source transformers) to enrich the document feed.
- P3 | Write unit tests and lightweight data‑quality checks (Great Expectations) so parsing errors do not break the pipeline.
- P3 | Produce concise handover docs for our future data architect.
Skill Set:
Must‑have (core):
- 2‑4 years building ETL or ELT pipelines with Databricks or Snowflake (Delta/Parquet, Spark SQL, Airflow or similar).
- Solid Python (pandas, PySpark) and experience parsing Office files with libraries such as python‑pptx, openpyxl, pdfplumber, or PyPDF.
- Basic SQL tuning and ability to work with structured schemas.
- Git and CI/CD familiarity.
Nice‑to‑have:
- Exposure to LangChain, Hugging Face Transformers, or any LLM inference workflow.
- Experience adding embeddings to tables for downstream ML or search.
- Great Expectations or similar data‑quality tooling.
- Familiarity with Unity Catalog or Snowflake RBAC concepts.