Required Skills & Competences
Linux, Python, Hiring, Bash, Communication, Mathematics, PyTorch, CUDA
Details
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It is a unique legacy of innovation fueled by great technology and amazing people. Today, NVIDIA is tapping into the unlimited potential of AI to define the next era of computing. GPUs act as the brains of computers, robots, and self-driving cars that can understand the world. Join the team to make a lasting impact on the world.
Responsibilities
- Lead all aspects of implementing performance practices in large-scale infrastructure.
- Deliver powerful tools, methodologies, and flows to validate and improve multiple datacenter products.
- Align next-generation AI workloads with next-generation datacenter designs through early engagement with internal and customer hardware, firmware, software, and platform teams.
- Deliver engineering solutions that provide continuous insights into AI workload performance over evolving environments.
- Decompose high-complexity performance or stability issues into minimal reproduction cases and identify root causes.
- Collaborate with software and firmware teams (BMC, SBIOS, OS, drivers) to develop best-in-class practices and tools.
- Analyze, debug, and resolve critical firmware and software issues to optimize AI workload performance at scale.
Requirements
- 8+ years of experience with accelerated computing for datacenter container computing solutions.
- Proven understanding of accelerated computing software stacks such as CUDA and deep learning frameworks such as PyTorch.
- Experience with modern cloud and container-based enterprise computing architectures.
- Programming and scripting skills in C/C++, Python, and Bash.
- Experience with CPU architecture and Linux-based OSes.
- Understanding of collective communication and AI workload patterns.
- Experience supporting high performance computing or deep learning in academic or engineering research communities.
- Strong verbal and written communication and teamwork skills.
- Strong analytical and action-driven mindset.
- Bachelor’s degree in Engineering, Mathematics, Physics, or Computer Science, or equivalent experience; MS or PhD is desirable.
Ways to Stand Out From the Crowd
- Experience in at-scale deep learning training.
- Skills in deep learning and graph compiler programming.
- Exposure to virtualization techniques and cloud platform solutions.
- Experience with scheduling and resource management systems.
- Experience with large-scale HPC environments.
Compensation and Benefits
- Base salary range: 184,000 USD - 356,500 USD, determined by location, experience, and comparative pay.
- Eligibility for equity and additional benefits.
NVIDIA is committed to equal opportunity employment and values diversity in its hiring and promotion practices.