Junior Research Engineer (Machine Translation)

Warsaw, Poland

Allegro

Allegro - The best prices and a guarantee of safe shopping!


Job Description

A hybrid work model requires 1 day a week in the office (Warsaw).

At Allegro, we're building the future of e-commerce, and language is a key part of that vision. The Machine Learning Research Lab provides cutting-edge AI solutions specifically designed for the unique challenges of e-commerce. Within the Lab, the Language Intelligence area focuses on making international platforms seamless for both buyers and sellers. We're developing an in-house Machine Translation (MT) system that goes beyond generic solutions, accurately handling the nuances of e-commerce language and scaling efficiently. Learn more about our work at https://ml.allegro.tech/.

About the Role:

We are seeking a passionate and talented Research Engineer to join our Machine Translation team. You will be a key contributor to our efforts in building a state-of-the-art MT system specifically tailored for the Allegro e-commerce marketplace. This role offers a unique opportunity to work on challenging research problems, develop cutting-edge MT technology for a wide range of language pairs, and see your work directly impact millions of users. You'll be contributing to making our platform accessible to an audience of non-Polish speakers.

Why is it worth working with us?

  • Impact: As part of the Machine Learning Research team, you will be responsible for bringing research solutions to production at Allegro.

  • Innovation: We're not just using off-the-shelf solutions. You'll be working on novel approaches to machine translation, leveraging state-of-the-art machine learning methods, and contributing to the advancement of the field. This includes exploring and integrating cutting-edge solutions based on commercial LLMs.

  • Scale: You will be responsible for creating production-grade ML models and for supporting the development team in implementing them correctly, so that they meet technical and performance requirements.

  • Collaboration: You'll be part of a team of experienced researchers and engineers, fostering a culture of knowledge sharing and continuous learning. Your support will be needed both at the technical level (e.g., what architecture is appropriate for the domain) and at the best-practices level (e.g., building datasets, modeling, metrics, deploying ML-based solutions to production).

  • Growth: To apply state-of-the-art solutions, you will stay up to date with scientific progress. You will deepen your knowledge by reading the latest papers in your domain and sharing what you learn with the research teams operating at Allegro. You will have the opportunity to participate in scientific conferences where the latest discoveries are presented, developing both your scientific career and Allegro's presence in the scientific community.

In your daily work you will handle the following tasks:

  • Research: Explore, identify, and implement state-of-the-art neural machine translation models and techniques. Conduct thorough literature reviews to find the most promising approaches for specific problems. This includes investigating and evaluating the potential of commercial LLMs (e.g., ChatGPT, Gemini) for machine translation tasks and developing innovative solutions leveraging these technologies.

  • Model Development: Train, evaluate, and fine-tune machine learning models. Develop and refine prompting strategies for LLMs to optimize their performance for e-commerce translation.

  • Data Analysis: Work with large datasets, ensuring data quality and developing strategies for data augmentation and improvement.

  • Collaboration: Work closely with other researchers and software engineering teams to integrate your models into production systems. Participate in the code review process.

  • Continuous Learning: Stay on top of the latest advancements in machine translation and large language models by reading research papers, attending conferences (virtually or in person), and participating in internal seminars. Prepare and deliver presentations.

  • Production Support: Contribute to the preparation of production-grade machine translation models and provide support to the development team during implementation.

  • Quality Focus: Contribute to the development and application of automatic quality estimation models and work with human evaluators to continuously improve translation quality.

What we offer:

  • Well-located office (with fully equipped kitchens and bicycle parking facilities) and excellent working tools (height-adjustable desks, interactive conference rooms)

  • Annual bonus of up to 10% of your gross annual salary (depending on your annual assessment and the company's results)

  • A wide selection of fringe benefits in a cafeteria plan – you choose what you like (e.g. medical, sports or lunch packages, insurance, purchase vouchers)

  • English classes, paid for by us, tailored to the specific nature of your job

  • Working in a team you can always count on — we have on board top-class specialists and experts in their areas of expertise

  • A high degree of autonomy in terms of organizing your team’s work; we encourage you to develop continuously and try out new things

  • Hackathons, team tourism, a training budget and an internal educational platform, MindUp (including training courses on work organization, communication methods, motivation at work, and various technologies and subject-matter issues)

  • If you want to learn more, check it out

We are looking for people who have:

  • Bachelor’s or master's degree in Computer Science, Computational Linguistics, Artificial Intelligence, or a related field. Strong candidates nearing completion of their degree will also be considered

  • Solid understanding of machine learning fundamentals, particularly in the area of Natural Language Processing (NLP)

  • Experience with ML frameworks (e.g., PyTorch, transformers, pandas)

  • Programming proficiency in Python

  • Strong analytical and problem-solving skills

  • Demonstrated experience with prompting techniques and AI engineering for large language models (LLMs). Understanding of how to effectively interact with and leverage the capabilities of commercial LLMs

Bonus Points:

  • Experience with machine translation models

  • Experience with sequence-to-sequence models

  • Familiarity with e-commerce data

  • Contributions to open-source projects or publications in relevant conferences/journals

  • Experience with utilizing LLMs in production environments

  • Prior experience in running large-scale computation on cloud platforms (GCP, Azure)

  • Prior experience in using LLMs for synthetic data generation or for solving business problems

This may also be of interest to you: 

https://ml.allegro.tech/ 

Send in your CV and see why it is #dobrzetubyć (#goodtobehere)


Region: Europe
Country: Poland
