Vacancy is archived. Applications are no longer accepted.
Required Skills & Competences
Software Development @ 6, Go @ 7, Linux @ 4, Python @ 7, TensorFlow @ 4, Networking @ 4, Debugging @ 4, Project Management @ 4, PyTorch @ 4
Details
A key part of NVIDIA's strength is our sophisticated analysis / debugging tools that empower NVIDIA engineers to improve performance and power efficiency of our products and running applications. We are looking for forward-thinking, hard-working, and creative people to join a multifaceted software team with high standards! This software engineering role involves developing tools for GPU Cluster users and admins.
Responsibilities
- Build internal perf/power profiling and analysis tools and platform for AI workloads at cluster scale
- Build debugging tools for commonly encountered problems in GPU clusters
- Work with our users to build / calibrate perf/power models for next-generation hardware or systems
- Partner with architects to propose new hardware features or improve existing features based on real-world use cases
Requirements
- BS+ in Computer Science or a related field (or equivalent experience) and 5+ years of software development
- Strong software design and implementation ability with Python/Go/C++
- Good understanding of Deep Learning and AI frameworks like PyTorch, TensorFlow, etc.
- Knowledge of AI cluster job scheduling, storage management, and networking management
- Knowledge of the Linux kernel
- Excellent problem-solving skills and project management skills
- Flexibility for working in an evolving environment with changing requirements
Ways to stand out from the crowd:
- Proven experience with continuous profiling & analysis tools/platforms at GPU cluster scale
- Solid experience in large AI job troubleshooting and failure detection/recovery
- Skilled in Deep Learning application performance analysis and optimization
- Knowledgeable in GPU / CPU architecture and application performance or power efficiency analysis
NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most brilliant and talented people in the world working for us and, due to unprecedented growth, our world-class engineering teams are growing fast. If you're a creative and autonomous engineer with real passion for technology, we want to hear from you.