Research Scientist, Speech Translation
Santa Monica, CA
Are you a hands-on Research Scientist with a solid background in Machine Learning who loves solving challenging real-world speech translation problems and is eager to design solutions that will help us revolutionize content creation for films and TV?
Do you want to help revolutionize the way that Film and TV are experienced across the globe? Do you dream of working with the world's best Technologists, Visual Effects Editors, and AI Scientists? Then look no further!
DS Group have partnered with a company with the first AI-based technology that can visually translate a film or TV show into any language, all whilst preserving the original actors' performances, so you can enjoy it without annoying subtitles or poor dubbing! Their recent launch has attracted interest from several global streaming services & film studios, cementing their position as pioneers in a multi-billion-dollar industry.
To take full advantage of this position, they are expanding their Research team in LA with multiple Research Engineers, who will contribute to novel AI-based audio technology deployed at scale and help turn science into unique business products.
Research Science Team
- The team is responsible for fueling next-generation AI (vision, graphics, audio, and NLP) solutions that power the company's AI products.
- Works closely with applied science, dataset science, engineering, and film innovation teams to understand new requirements, feature requests, and limitations, and then researches and develops baseline solutions.
- The team is also responsible for publishing high-quality research and creating intellectual property.
- Generative modeling and modern audio synthesis, including text-to-speech (TTS), speech-to-speech translation (SST), and voice conversion methods.
- Expert knowledge in machine and deep learning, especially for applications in the audio domain.
- Solid background in linear algebra, signal processing, and numerical optimization.
- Excellent coding skills in Python and PyTorch or TensorFlow.
- Outstanding communication skills to collaborate in a team with research scientists and engineers.
- Familiarity with audio identity embedding, style transfer, and multi-language audio synthesis.
- Familiarity with attention and diffusion models.
- Demonstrable research experience with publications in top-tier audio, signal processing, and AI venues and journals such as NeurIPS, Interspeech, ICASSP, or TSP.
- Familiarity with cloud platforms, such as GCP or AWS.
- Ph.D. in deep audio synthesis techniques, visual computing, or related fields.