Senior Deep Learning Scientist, Speech Synthesis

US, CA, Santa Clara

NVIDIA

NVIDIA invented the GPU and drives advances in AI, HPC, gaming, creative design, autonomous vehicles, and robotics.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA is an industry leader with groundbreaking developments in High-Performance Computing, Artificial Intelligence, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. GPU deep learning ignited modern AI, the next era of computing, with the GPU acting as the brain of computers, robots, autonomous cars, and conversational AI that can perceive and understand the world. Today, we are increasingly known as “the AI computing company.” We're looking to grow our company and build our teams with the smartest people in the world. Join us at the forefront of technological advancement.

NVIDIA is looking for Speech Data Scientists to develop the high-impact, high-visibility Speech AI product "Riva" and improve the experience of millions of customers. If you're creative and passionate about solving real-world conversational AI problems, come join our Riva product engineering team. For more details on Riva, see https://developer.nvidia.com/riva

What you’ll be doing:

  • Train speech synthesis mel-spectrogram and vocoder models.

  • Measure and benchmark model performance.

  • Maintain the TTS model evaluation system.

  • Analyze model accuracy and bias, and recommend next steps and improvements.

  • Improve processes for speech data processing, augmentation, and filtering, and for preparing TTS training sets.

  • Build expertise in TTS datasets for training and evaluation.

  • Characterize performance and quality metrics across platforms for various speech AI components.

  • Collaborate with various teams on new product features and improvements of existing products.

  • Participate in developing and reviewing code, design documents, use case reviews, and test plan reviews.

  • Help innovate, identify problems, recommend solutions and perform triage in a collaborative team environment.

What we need to see:

  • Master’s degree (or equivalent experience) or PhD in Computer Science, Electrical Engineering, Artificial Intelligence, Applied Math, Linguistics, or Computational Linguistics.

  • 5+ years of experience.

  • Excellent programming skills in Python.

  • Strong fundamentals in programming, optimization, and software design.

  • Strong knowledge of ML/DL techniques, algorithms, and tools, with exposure to CNNs, RNNs (LSTMs), and Transformers.

  • Knowledge of deep learning applications in speech synthesis, LLMs, and speech-to-speech translation.

  • Hands-on experience with speech technologies such as speech synthesis and voice cloning.

  • Experience training speech models.

  • Experience with the PyTorch deep learning framework.

  • Exposure to basic speech digital signal processing and feature extraction techniques such as FFT, MFCC, mel spectrograms, and neural codecs.

  • General familiarity with version control and code review tools such as Git, Gerrit, and GitLab.

Ways to stand out from the crowd:

  • Native or near-native fluency in a non-English language: Spanish / Mandarin / German / Japanese / Russian / French / UK English / Arabic / Hindi / Korean / Italian / Portuguese.
  • Experience developing multilingual code-switched TTS, voice cloning, and cross-lingual voice cloning.
  • Comfort and motivation working in a fast-paced, highly collaborative, dynamic work environment.
  • Familiarity with GPU-based technologies such as CUDA, cuDNN, and TensorRT.
  • Background in deploying machine learning models on data center, cloud, and embedded systems.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

The base salary range is 180,000 USD - 339,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
