Data Engineer Intern
Fayetteville, AR
AcreTrader
About Acres
Acres is a map-based land intelligence platform designed to bring transparency to America's largest asset: land. Using public and private market data, Acres aggregates and analyzes over 150 million parcels of land so users can understand and value land with confidence. By providing access to a comprehensive, more accurate compilation of land data, comparable sales, and parcel-level insights, Acres supports fast, informed decision-making. Visit Acres.com to learn more.
We are dedicated to the growth of our platform and the growth of our people. We're looking for high performers who go about their work with a sense of curiosity, responsibility, and a drive to make things better.
*Note: Acres seeks qualified candidates who are eligible to work in the United States. We are unable to provide sponsorship for work authorization at this time.
Data Engineer Intern
Acres is seeking a full-time Data Engineer Intern to join our growing data science team. In this role, you will work closely with our Data Team to build and maintain robust data pipelines for geospatial data. You'll use tools and technologies such as GitHub, Docker, Python, ZSH, Linux, PostgreSQL, OpenSearch, and DuckDB to support the development of industry-standard tools for land analytics. The position offers opportunities for growth in a collaborative, fast-paced startup environment.
The data science team at Acres is innovative, team-oriented, and passionate about creating the default land transaction platform. Your contributions will directly support our mission of setting the industry standard in geospatial tools.
Core Responsibilities
- Build and maintain data pipelines for handling geospatial data.
- Process and manage raster and vector geospatial data using GDAL, rasterio, and other GIS tools.
- Work with geospatial data storage and queries using PostgreSQL and PostGIS (see the sketch after this list).
- Develop solutions for data storage and retrieval in DuckDB, cloud buckets, and other distributed systems.
- Optimize data pipelines for parallelism, performance, and reliability.
- Collaborate with team members via GitHub, Docker, and Linux workflows.
- Research and apply best practices for data processing, ensuring scalability and accuracy.
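To give a concrete flavor of the responsibilities above, here is a minimal, hypothetical sketch of a single pipeline step using rasterio, GeoPandas, and PostGIS. The file names, table name, and connection string are placeholders for illustration only, not a description of the Acres stack.

```python
# A minimal sketch of one geospatial pipeline step: inspect a raster with
# rasterio, then load a vector parcel layer into PostGIS via GeoPandas.
# File names, the table name, and the connection string are hypothetical.
import geopandas as gpd
import rasterio
from sqlalchemy import create_engine

# Raster side: open a GeoTIFF and report its CRS, extent, and a band statistic.
with rasterio.open("elevation.tif") as src:
    band = src.read(1)
    print(src.crs, src.bounds, band.mean())

# Vector side: read parcel geometries and write them to a PostGIS table.
parcels = gpd.read_file("parcels.geojson")
engine = create_engine("postgresql://user:password@localhost:5432/landdb")
parcels.to_postgis("parcels", engine, if_exists="replace", index=False)
```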
Key Competencies
- Proficiency with Python, ZSH, and Linux.
- Familiarity with geospatial data concepts and tools such as GDAL, rasterio, PostGIS, and OpenSearch.
- Experience using GitHub for version control and Docker for containerization.
- Strong attention to detail with an eagerness to learn and adapt.
- Humble attitude, willingness to explore tradeoffs, and curiosity about optimizing pipelines.
- Understanding of pipeline performance considerations, including parallelization, cost-efficiency, and system-level performance (e.g., Linux internals); a brief parallelization sketch follows this list.
- Effective communication, collaboration, and teamwork skills.
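As one illustration of the parallelization point above, here is a minimal sketch using only the Python standard library plus rasterio. The tiles directory and the per-tile helper are hypothetical; the pattern of fanning a CPU-bound step out across processes is what matters.

```python
# A minimal sketch of parallelizing a CPU-bound, per-tile raster step.
# The tiles/ directory and tile_mean helper are hypothetical examples.
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

import rasterio

def tile_mean(path: str) -> tuple[str, float]:
    """Read band 1 of one raster tile and return its mean pixel value."""
    with rasterio.open(path) as src:
        return path, float(src.read(1).mean())

if __name__ == "__main__":
    tiles = [str(p) for p in Path("tiles").glob("*.tif")]
    # Separate processes sidestep the GIL for CPU-bound work; tune max_workers
    # to the machine's core count and the pipeline's memory budget.
    with ProcessPoolExecutor(max_workers=4) as pool:
        for path, mean in pool.map(tile_mean, tiles):
            print(path, mean)
```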
Preferred Qualifications
- Bachelor's or Master's degree in Computer Science, Data Science, GIS, or a related field.
- Prior experience working with geospatial data pipelines or tools.
- Familiarity with cloud computing platforms and services.
- Demonstrated ability to work in agile, fast-paced environments.
- A passion for learning, improving, and understanding the tradeoffs in system design.
- A creative, intuitive approach to solving hard problems that balance space, time, and cost.
Apply now to join us in shaping the future of geospatial data solutions!