Required Skills & Competences
Python @ 4, Algorithms @ 4, Data Structures @ 7, Distributed Systems @ 4, Machine Learning @ 4, TensorFlow @ 8, Communication @ 7, Mentoring @ 7, API @ 4, PyTorch @ 8, CUDA @ 4, GPU @ 3
Details
We are looking for a Principal Research Engineer focused on Generative AI inference. The team develops optimized inferencing technologies for generative AI models (language, images, speech) and contributes across the machine learning lifecycle: conceptualization, applied research, engineering for optimized inference, and deployment. You will interact with internal partners, users, and the open-source community to define, analyze, and implement highly optimized algorithms for speech recognition, natural language understanding, image generation, and speech synthesis.
Responsibilities
- Develop new models and algorithms in Speech Recognition, Speech Synthesis, Natural Language Processing, and Deep Learning.
- Architect and implement features in C++, CUDA, and Python.
- Implement new algorithms, carry out performance tuning and analysis, and define APIs and functionality coverage.
- Work across NVIDIA engineering teams to ensure software integrates with the NVIDIA accelerated serving stack.
- Demonstrate good engineering practices and mentor other team members.
Requirements
- Strong understanding of modern techniques in Machine Learning, Deep Neural Networks, Natural Language Processing, or Speech Recognition.
- 12+ years of industry experience working with Deep Learning frameworks (PyTorch or TensorFlow).
- Strong software engineering skills with excellent C++ and Python development experience and meaningful contributions to major open-source projects.
- Experience or familiarity with CUDA and GPU programming for high-performance inference workloads.
- Strong computer science fundamentals: algorithms and data structures, computational complexity, parallel and distributed computing, and system software.
- Strong communication and interpersonal skills; a history of mentoring junior engineers and interns is a plus.
- Bachelor's degree or equivalent experience.
- Desire to constantly grow and learn new things.
Ways to stand out
- Experience architecting or developing large-scale distributed systems for deep learning.
- Knowledge of CPU and/or GPU architecture.
- Strong GPU programming (CUDA) experience.
Compensation & Logistics
- Base salary range: 272,000 USD - 431,250 USD (final base salary determined by location, experience, and peer pay).
- Eligible for equity and benefits.
- Applications accepted at least until July 29, 2025.
Company & Culture
- NVIDIA emphasizes diversity and is an equal opportunity employer. The company values creative, autonomous engineers with a passion for technology.