Software Engineer Graduate (Data Arch - Data Ecosystem) - 2026 (PhD)
San Jose, California, United States
About the team:
The TikTok Data Ecosystem Team has the vital role of crafting and implementing a storage solution for offline data in TikTok's recommendation system, which caters to more than a billion users. Their primary objectives are to guarantee system reliability, uninterrupted service, and seamless performance. They aim to create a storage and computing infrastructure that can adapt to various data sources within the recommendation system, accommodating diverse storage needs. Their ultimate goal is to deliver efficient, affordable data storage with easy-to-use data management tools for the recommendation, search, and advertising functions.
We are looking for talented individuals to join our team in 2026. As a graduate, you will get unparalleled opportunities to kickstart your career, pursue bold ideas, and explore limitless growth opportunities. Co-create a future driven by your inspiration with TikTok.
Successful candidates must be able to commit to an onboarding date by the end of 2026.
Responsibilities:
1. Design and implement real-time and offline data architecture for large-scale recommendation systems.
2. Build scalable and high-performance streaming Lakehouse systems that power feature pipelines, model training, and real-time inference.
3. Collaborate with ML platform teams to support PyTorch-based model training workflows and design efficient data formats and access patterns for large-scale samples and features.
4. Own core components of our distributed storage and processing stack, from file format to stream compaction to metadata management.
Category:
Engineering Jobs
Tags: Architecture Data management Machine Learning Model training PhD Pipelines PyTorch Streaming
Region:
North America
Country:
United States