Data Engineering Internship (Summer 2026)
London, United Kingdom
Castleton Commodities International
Castleton Commodities International is a leading Global Energy Commodities Merchant and Infrastructure Asset Investor, unlocking value in Energy markets.
Application Deadline: September 14th, 11:59pm EST
Position Overview:
CCI is developing a leading-edge Data Science platform, as staying at the forefront of data management and analytics is essential to our investment strategy. We are looking for a motivated and detail-oriented Data Engineering Intern to join our Global Data Science & Technology team in our London office. The Data Engineering Intern will work closely with our Data Science, Data Engineering, and Commercial teams to build and optimize the data pipelines that power our analytics, forecasting, and investment decision-making processes. This is a hands-on technical internship, ideal for someone who enjoys solving real-world data challenges, especially around ingesting, scraping, and managing large datasets across the commodity markets.
Responsibilities:
Develop and maintain robust data ingestion pipelines from various internal and external sources, including APIs, FTP endpoints, and cloud data providers.
Develop data ingestion and transformation pipelines using Python and SQL, publishing to Snowflake for downstream use in analytics and forecasting tools (a minimal sketch follows this list).
Work on data architecture and data management projects for both new and existing data sources.
Design and implement ETL processes to clean, normalize, and store structured and semi-structured data in Snowflake, our core relational data warehouse.
Analyze data pipeline performance and implement optimizations to improve efficiency and reliability.
Conduct data quality checks and build validation logic to identify anomalies and ensure data integrity for use by commercial trading and analytics teams.
Automate data workflows using Python, SQL, and orchestration tools (e.g., Airflow or similar).
Assist in transitioning legacy datasets and codebases into scalable, cloud-native workflows aligned with our modern data architecture.
Document data sources, pipeline logic, and data models to ensure maintainability and knowledge transfer.
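To give candidates a concrete sense of the day-to-day work, here is a minimal sketch of an ingest-clean-validate-load pipeline in the Python/pandas/SQLAlchemy stack this role uses. Everything named here (the API endpoint, column names, staging table, and connection URL) is a hypothetical placeholder for illustration, not a CCI system:

```python
# etl_sketch.py -- illustrative only; endpoint, schema, and credentials are hypothetical.
import pandas as pd
import requests
from sqlalchemy import create_engine

API_URL = "https://api.example.com/prices"   # hypothetical commodity price feed
TABLE = "commodity_prices_staging"           # hypothetical Snowflake staging table

def extract() -> pd.DataFrame:
    """Pull raw JSON records from the (hypothetical) API."""
    resp = requests.get(API_URL, timeout=30)
    resp.raise_for_status()
    return pd.DataFrame(resp.json())

def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Normalize types and drop obviously bad or duplicated rows."""
    df["trade_date"] = pd.to_datetime(df["trade_date"], errors="coerce")
    df["price"] = pd.to_numeric(df["price"], errors="coerce")
    df = df.dropna(subset=["trade_date", "price"])
    return df.drop_duplicates(subset=["trade_date", "symbol"])

def validate(df: pd.DataFrame) -> None:
    """Simple data-quality gates: fail loudly rather than load bad data."""
    assert not df.empty, "no rows survived cleaning"
    assert (df["price"] > 0).all(), "non-positive prices found"

def load(df: pd.DataFrame) -> None:
    """Append to Snowflake via the snowflake-sqlalchemy dialect (URL is a placeholder)."""
    engine = create_engine("snowflake://<user>:<pass>@<account>/<db>/<schema>")
    df.to_sql(TABLE, engine, if_exists="append", index=False)

if __name__ == "__main__":
    frame = transform(extract())
    validate(frame)
    load(frame)
```

In production, a script like this would typically run as a scheduled Airflow task with retries, alerting, and logging rather than as a standalone program.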
Qualifications:
Currently pursuing a Bachelor’s or higher degree in Computer Science, Engineering, Management Information Systems, or related technical field.
Expected graduation date of Winter 2026 or Spring/Summer 2027.
Strong programming experience in Python (preferred libraries: pandas, NumPy, SQLAlchemy, etc.).
Strong understanding of SQL and experience querying relational databases (Snowflake a plus).
Exposure to or interest in cloud platforms (e.g., AWS, Azure), particularly with cloud data storage and compute.
Familiarity with web scraping frameworks and handling large-scale structured and unstructured data sources.
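As a flavor of the web scraping skill mentioned above, here is a toy requests/BeautifulSoup/pandas example; the URL and page structure are assumptions for illustration, not a real data source:

```python
# scrape_sketch.py -- illustrative only; the URL and page layout are hypothetical.
from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

URL = "https://example.com/market-report"  # hypothetical public report page

def scrape_tables(url: str) -> list[pd.DataFrame]:
    """Fetch a page and parse every HTML <table> into a DataFrame."""
    resp = requests.get(url, timeout=30, headers={"User-Agent": "demo-scraper/0.1"})
    resp.raise_for_status()
    return pd.read_html(StringIO(resp.text))

def scrape_headlines(url: str) -> list[str]:
    """Pull semi-structured text (here, <h2> headings) with BeautifulSoup."""
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    return [h.get_text(strip=True) for h in soup.find_all("h2")]

if __name__ == "__main__":
    for table in scrape_tables(URL):
        print(table.head())
    print(scrape_headlines(URL))
```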
Visit https://www.cci.com/careers/life-at-cci/# to learn more!