Staff Geospatial Data Engineer (Remote)
San Francisco, California
About the Role: We are looking for a Staff Data Engineer to lead the development of cutting-edge data systems backing our products, in service of our mission to restore and protect the planet's forests. As a leader on the DMRV (digital measurement, reporting, and verification) team, you will build, scale, and deploy systems for ingesting, storing, and computing over the data that powers our AI and remote sensing insights, and you will be responsible for delivering those insights to our customers so they can identify and originate the highest-quality nature-based projects.
A typical day includes collaborating across engineering and science teams to understand ingest pathways for new datasets that feed model or algorithm features; writing code for efficient compute, scalable transformations, and algorithms that unlock insights over project data; designing systems for easy data access and experimentation; pair coding with other engineers to raise the bar on our technical work; and roadmapping core improvements to our data, compute, and measurement stack.
We're looking for engineers who find joy in the craft of building but live for seeing the end-to-end impact and want to rally engineers around them. Engineers who push forward initiatives by asking great questions, cutting through ambiguity, and organizing to win. Engineers who are relentlessly detail-oriented, methodical in their approach to understanding trade-offs, and place the highest emphasis on building and building quickly.
Location: This role is remote. However, given the cross-functional communication responsibilities, it is preferred that you be within 3 hours of Pacific time.
About Pachama: Pachama harnesses AI and satellite data to empower companies to confidently invest in nature. Using the latest technological advances, Pachama delivers continuous insight into how forests sequester carbon, protect wildlife and benefit local communities. These insights enable leading companies to find the world's best projects and track their impact over time while also helping land stewards earn an income protecting nature with tools to develop carbon projects and secure funds.
What You Will Help Us With:
- Impact: Empower our interdisciplinary team and customers to derive the insights needed to originate high-quality nature-based projects from our multi-TB datasets by building the ingest pipelines, access layers, and compute that support the geospatial and remote sensing data powering our products.
- Technical leadership and innovation: Lead cross-functional projects, as our data and compute pipelines are core platform assets used across teams. Connect product value with the designs and technologies available to develop a strategy and vision for the data systems we need to build and how we build them, then work with teams to implement that vision.
- Advocating for and mentoring on best practices: Apply best practices to our data pipelines and compute, and mentor teammates to raise the bar across the engineering teams and enable step-level increases in efficiency.
- Hands-on contributions: Code the systems and tools that enable all of engineering and science to produce high-quality insights for forest carbon projects, and optimize methods to run efficiently on large volumes of geospatial and remote sensing data.
Experience & Skills We’re Looking For:
- Geospatial experience, specifically with raster and vector data, the nuances of geospatial data, and common cloud-native geospatial formats (GeoPackage, FlatGeobuf, Cloud-Optimized GeoTIFF). Our tech stack includes Zarr, Rasterio, GeoPandas, and Xarray (see the first sketch after this list).
- Experience leading larger cross-team engineering efforts.
- Experience with data engineering including ingest, storage, orchestration and compute at scale with an ability to apply these skills to new domains like forest science and remote sensing.
- Strong software engineering practices and a background in Python programming, debugging/profiling, version control, and system design.
- Distributed compute: Familiarity with distributed compute technologies and knowledge of distributed systems concepts (such as CPU/GPU interactions and transfers, latency/throughput bottlenecks, and pipelining/multiprocessing). Our tech stack includes Dask and Flyte deployed on Kubernetes and GCP (see the second sketch below).
- Comfort with fast-paced execution and rapid iteration in a startup environment. Excited by product impact.
- Passion for environmental sustainability and a desire to make a meaningful impact on the planet.
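For illustration, here is a minimal sketch of the kind of raster/vector workflow the stack above enables. It assumes nothing about Pachama's actual data: the bucket path, GeoPackage, band names (nir, red), and coordinate conventions below are all hypothetical.

```python
import geopandas as gpd
import xarray as xr

# Vector side: read (hypothetical) project boundaries from a GeoPackage.
projects = gpd.read_file("projects.gpkg")
minx, miny, maxx, maxy = projects.total_bounds

# Raster side: open a (hypothetical) Zarr store of satellite imagery.
# chunks="auto" backs the arrays with Dask, so reads stay lazy and the
# multi-TB dataset is never loaded wholesale into memory.
# Assumes both sources share a CRS; a real pipeline would reproject first.
ds = xr.open_zarr("gs://example-bucket/imagery.zarr", chunks="auto")

# Subset the raster to the projects' bounding box. The reversed y slice
# assumes a north-up raster whose y coordinate is descending.
subset = ds.sel(x=slice(minx, maxx), y=slice(maxy, miny))

# A simple NDVI reduction, computed out-of-core chunk by chunk.
ndvi = (subset["nir"] - subset["red"]) / (subset["nir"] + subset["red"])
print(ndvi.mean(dim=("x", "y")).compute())
```

In practice, Rasterio handles the lower-level I/O, such as the windowed reads that formats like Cloud-Optimized GeoTIFF are designed for.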
Preferred (But Not Required) Qualifications:
- Owned and operated a distributed compute system: you aren't just familiar with distributed workflows but have been responsible for deploying, scaling, overseeing, and maintaining the infrastructure needed to run them (see the sketch after this list).
- Built data pipelines and infrastructure for ML and scientific applications: you have worked with ML and/or science teams previously.
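And a minimal sketch of fanning a chunked computation out over Dask workers; the scheduler address mentioned in the comments is hypothetical, and Client() with no arguments simply spins up an in-process cluster for experimentation.

```python
import dask.array as da
from dask.distributed import Client

# Start a local cluster for experimentation. Against a production
# deployment you would instead pass the scheduler address, e.g.
# Client("tcp://dask-scheduler:8786") for a (hypothetical) service
# running on Kubernetes.
client = Client()

# Each 2048x2048 chunk becomes an independent task, so the reduction
# below fans out across workers without materializing the full array.
x = da.random.random((50_000, 50_000), chunks=(2048, 2048))
print(x.mean().compute())

client.close()
```

In this posting's stack, Flyte sits a level above this, orchestrating when jobs like these run and with what resources.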
Categories: Engineering Jobs, Leadership Jobs
Tags: Data pipelines, Distributed Systems, Engineering, GCP, GPU, Kubernetes, Machine Learning, Pipelines, Python, Zarr
Perks/benefits: Startup environment
Regions: Remote/Anywhere, North America
Country: United States