Machine Learning Researcher-Search
Singapore, Singapore
Team Introduction:
The Search Team is primarily responsible for research and development (R&D) of search algorithms and architecture for products such as Douyin, Toutiao, and Xigua Video, as well as businesses like E-commerce and Local Services. We leverage cutting-edge machine learning technologies for end-to-end modeling and continuously push for breakthroughs. We also focus on the construction and performance optimization of distributed and machine learning systems, ranging from memory and disk optimization to innovations in index compression and exploration of recall and ranking algorithms, providing students with ample opportunities to grow and develop themselves.
The main areas of work include:
1. Exploring Cutting-Edge NLP Technologies: From basic tasks like word segmentation and Named Entity Recognition (NER) to advanced business functions like text and multimodal pre-training, query analysis, and fundamental relevance modeling, we apply deep learning models throughout the pipeline where every detail presents a challenge.
2. Cross-Modal Matching Technologies: Applying deep learning techniques that combine Computer Vision (CV) and Natural Language Processing (NLP) in search, we aim to achieve powerful semantic understanding and retrieval capabilities for multimodal video search.
3. Large-Scale Streaming Machine Learning Technologies: Using large-scale machine learning to address recommendation challenges in search, making search more personalized and better at understanding user needs.
4. Architecture for Data at the Scale of Hundreds of Billions: Conducting in-depth research and innovation in all aspects, from large-scale offline computing and the performance and scheduling optimization of distributed systems to building high-availability, high-throughput, low-latency online services.
5. Recommendation Technologies: Leveraging ultra-large-scale machine learning to build industry-leading search recommendation systems and continuously explore and innovate in search recommendation technologies.
Specific objectives include:
1. Exploring the integration of large models with ranking algorithms to improve the accuracy of personalized ranking and user experience.
2. Researching generative retrieval algorithms to solve ultra-large-scale retrieval problems in candidate libraries with tens or hundreds of billions of entries.
3. Leveraging large language models (LLMs) to enhance search satisfaction for complex and polysemous queries.
4. Building high-performance, low-resource-consumption large-scale batch-stream integrated retrieval and computing systems to improve resource utilization.
Challenges:
1. Challenges in Personalized Ranking
Traditional ranking algorithms struggle to fully utilize multimodal information (e.g., text, images, video) and have limited model complexity, failing to meet users’ demands for precise and personalized search results.
2. Challenges in Ultra-Large-Scale Retrieval
In retrieval scenarios with candidate libraries containing hundreds of billions of entries, traditional discriminative models face issues such as insufficient model capacity and low indexing efficiency, which creates an urgent need for next-generation retrieval algorithms.
3. Challenges in Complex Query Understanding
User search intents are becoming increasingly complex. Traditional search engines struggle to accurately interpret the semantics of long or complex sentences and polysemous queries, leading to low satisfaction with search results.
4. Challenges in Resource Utilization
The storage-computation separation architecture of search systems results in low resource utilization. Optimizing resource usage while maintaining performance has become a critical issue.
5. Necessity of Large Model-Based Intelligent Search
Introducing large model technologies is essential to address the above challenges. They can significantly enhance the semantic understanding, retrieval efficiency, and resource utilization of search systems, thereby delivering more accurate and efficient search experiences to users.
Details:
1. Research on Large Models for Personalized Ranking
2. Research on Ultra-Large-Scale Generative Retrieval Algorithms
3. Improving Search Satisfaction for Complex Polysemous Queries Based on LLMs
4. High-Performance Large-Scale Batch-Stream Integrated Retrieval and Computing Systems
Involved Research Directions:
1. Large models for ranking
2. Generative retrieval and cross-modal fusion
3. Large language models (LLMs) and complex query understanding
4. High-performance computing and storage architectures