Machine Learning Researcher-Search

Singapore, Singapore

Team Introduction:
The Search Team is primarily responsible for research and development (R&D) of search algorithms and architecture for products such as Douyin, Toutiao, and Xigua Video, as well as businesses like E-commerce and Local Services. We leverage cutting-edge machine learning technologies for end-to-end modeling and continuously push for breakthroughs. We also focus on the construction and performance optimization of distributed and machine learning systems, ranging from memory and disk optimization to innovations in index compression and exploration of recall and ranking algorithms. This work provides students with ample opportunities to grow and develop themselves.

The main areas of work include:
1. Exploring Cutting-Edge NLP Technologies: From basic tasks like word segmentation and Named Entity Recognition (NER) to advanced business functions like text and multimodal pre-training, query analysis, and fundamental relevance modeling, we apply deep learning models throughout the pipeline where every detail presents a challenge.
2. Cross-Modal Matching Technologies: Applying deep learning techniques that combine Computer Vision (CV) and Natural Language Processing (NLP) in search, we aim to achieve powerful semantic understanding and retrieval capabilities for multimodal video search.
3. Large-Scale Streaming Machine Learning Technologies: Using large-scale machine learning to address recommendation challenges in search, making search more personalized and better at understanding user needs.
4. Architecture for Data at the Scale of Hundreds of Billions: Conducting in-depth research and innovation in all aspects, from large-scale offline computing and performance and scheduling optimization of distributed systems to building high-availability, high-throughput, and low-latency online services.
5. Recommendation Technologies: Leveraging ultra-large-scale machine learning to build industry-leading search recommendation systems and continuously explore and innovate in search recommendation technologies.

Specific objectives include:
1. Exploring the integration of large models with ranking algorithms to improve the accuracy of personalized ranking and user experience.
2. Researching generative retrieval algorithms to solve ultra-large-scale retrieval problems in candidate libraries with tens or hundreds of billions of entries.
3. Leveraging large language models (LLMs) to enhance search satisfaction for complex and polysemous queries.
4. Building high-performance, low-resource-consumption large-scale batch-stream integrated retrieval and computing systems to improve resource utilization.

Challenges:
1. Challenges in Personalized Ranking
Traditional ranking algorithms struggle to fully utilize multimodal information (e.g., text, images, video) and have limited model complexity, failing to meet users’ demands for precise and personalized search results.
2. Challenges in Ultra-Large-Scale Retrieval
In retrieval scenarios with candidate libraries containing hundreds of billions of entries, traditional discriminative models face issues such as insufficient model capacity and low indexing efficiency, urgently requiring next-generation retrieval algorithms.
3. Challenges in Complex Query Understanding
User search intents are becoming increasingly complex. Traditional search engines struggle to accurately interpret the semantics of long/complex sentences and polysemous queries, leading to low satisfaction with search results.
4. Challenges in Resource Utilization
The storage-computation separation architecture of search systems results in low resource utilization. Optimizing resource usage while maintaining performance has become a critical issue.
5. Necessity of Large Model-Based Intelligent Search
Introducing large model technologies is essential to address the above challenges. It can significantly enhance the semantic understanding, retrieval efficiency, and resource utilization of search systems, thereby delivering more accurate and efficient search experiences to users.

Details:
1. Research on Large Models for Personalized Ranking
2. Research on Ultra-Large-Scale Generative Retrieval Algorithms
3. Improving Search Satisfaction for Complex Polysemous Queries Based on LLMs
4. High-Performance Large-Scale Batch-Stream Integrated Retrieval and Computing Systems

Involved Research Directions:
1. Large models for ranking
2. Generative retrieval and cross-modal fusion
3. Large language models (LLMs) and complex query understanding
4. High-performance computing and storage architectures
