Research Scientist - Acoustic and Multi-Modal Scene Understanding

Cambridge, UK

Meta

Giving people the power to build community and bring the world closer together

At Meta’s Reality Labs Research, our goal is to make world-class consumer virtual, augmented, and mixed reality experiences. Come work alongside industry-leading scientists and engineers to create the technology that makes VR, AR and smart wearables pervasive and universal. Join the adventure of a lifetime as we make science fiction real and change the world. We are a world-class team of researchers and engineers creating the future of augmented and virtual reality, which together will become as universal and essential as smartphones and personal computers are today. And just as personal computers have done over the past 45 years, AR, VR and smart wearables will ultimately change everything about how we work, play, and connect.

We are developing all the technologies needed to enable breakthrough Smartglasses, AR glasses and VR headsets, including optics and displays, computer vision, audio, graphics, brain-computer interfaces, haptic interaction, eye/hand/face/body tracking, perception science, and true telepresence. Some of those will advance much faster than others, but they all need to happen to enable AR and VR that are so compelling that they become an integral part of our lives.

The Audio team within RL Research is looking for an experienced Research Scientist with an in-depth understanding of real-time, efficient signal processing and machine learning on audio and multi-modal signals to join our team. You will conduct core and applied research on technologies that improve listeners’ hearing abilities under challenging listening conditions using wearable computing, working alongside a team of dedicated researchers, developers, and engineers. You will operate at the intersection of egocentric perception, acoustics, computer vision, and signal processing algorithms, with hardware and software co-design.

Responsibilities
  • Design innovative solutions for challenging multi-modal egocentric recognition problems under resource constraints
  • Communicate research results internally and externally in the form of technical reports and scientific publications
  • Work consistently under your own initiative, implementing state-of-the-art models and techniques in PyTorch, TensorFlow, or other platforms, and seeking feedback and input where appropriate
  • Identify, motivate, and execute on medium-to-large hypotheses (each with many tasks) for model improvements through data analysis and domain knowledge, communicating learnings effectively
  • Design, perform, and analyze online and offline experiments with specific, well-thought-out hypotheses in mind
  • Generate reliable, correct training data with great attention to detail
  • Identify and debug common issues in training machine learning models, such as overfitting/underfitting, leakage, and offline/online inconsistency
  • Stay aware of common systems considerations and modeling issues, and factor these into modeling choices
  • Design acoustic or audio-visual models with a small computational footprint for mobile devices and wearables such as smart glasses
Minimum Qualifications
  • A PhD or current postdoctoral assignment in deep learning, machine learning, computer vision, computer science, computer engineering, statistics, or a related field
  • 4+ years of experience developing and implementing signal processing and deep learning algorithms for acoustic and multi-modal detection, recognition, and/or tracking problems
  • 4+ years of experience with scientific programming languages such as Python or C++
  • 3+ years of research-oriented software engineering experience, including fluency with machine learning libraries (e.g., PyTorch, TensorFlow, scikit-learn, Pandas) and libraries for scientific computing (e.g., the SciPy ecosystem)
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
  • Demonstrated experience implementing and evaluating end-to-end prototype learning systems
  • Ability to independently resolve most online and offline issues that affect hypothesis testing
  • Understanding of the model architectures used and their consequences for the hypotheses tested, along with a good applied understanding of computer vision, even if not fully up to date with the state of the art
  • Experience communicating effectively with a broad range of stakeholders and collaborators at different levels
Preferred Qualifications
  • Experience with audio-visual learning, computer vision, source localization and tracking, audio and visual SLAM systems, egocentric multimodal learning, etc.
  • Experience with building low-complexity models on acoustic and multi-modal problems aimed at low-power mobile devices and wearables
  • Experience integrating models under development onto real-time mobile platforms with different levels of compute (on-sensor computation, system-on-chip, low-power island, etc.)
  • Experience with acoustic localization or visual multi-object tracking problems
  • Proven track record of achieving significant results and innovation, as demonstrated by first-authored publications and patents
About Meta

Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today—beyond the constraints of screens, the limits of distance, and even the rules of physics.


Equal Employment Opportunity and Affirmative Action

Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics. You may view our Equal Employment Opportunity notice here.
Meta is committed to providing reasonable support (called accommodations) in our recruiting processes for candidates with disabilities, long term conditions, mental health conditions or sincerely held religious beliefs, or who are neurodivergent or require pregnancy-related support. If you need support, please reach out to accommodations-ext@meta.com.

Tags: Architecture Computer Science Computer Vision Data analysis Deep Learning Engineering Machine Learning ML models Pandas PhD Physics Python PyTorch Research Scikit-learn SciPy SLAM Statistics TensorFlow Testing VR V-SLAM

Perks/benefits: Career development

Region: Europe
Country: United Kingdom
