Used Tools & Technologies
Not specified
Required Skills & Competences
Security @ 7, Docker @ 4, ElasticSearch @ 3, Kubernetes @ 4, Redis @ 6, Python @ 4, GCP @ 4, Java @ 4, NoSQL @ 6, RDBMS @ 6, CI/CD @ 4, AWS @ 4, Azure @ 4, Communication @ 7, gRPC @ 7, Networking @ 7, Solr @ 3, Debugging @ 7, API @ 4, HTTP @ 7, Splunk @ 3, Cassandra @ 6, Spring Boot @ 4, Cloud Computing @ 4, GPU @ 4
Details
We are seeking a Principal Cloud Software Engineer to help reshape the future of GPU cloud computing and contribute to projects in Deep Learning and AI. The role focuses on architecting, building, and operating complex PaaS GPU cloud services, driving technical direction, and collaborating across teams to deliver scalable, testable, and customer-centric software.
Responsibilities
- Architect, build, plan, implement, and operate complex PaaS for GPU cloud services.
- Drive the underlying technology stack and implementation methodology.
- Collaborate with customers, UX/UI designers, front-end engineers, and other stakeholders to develop new products and enhance existing features.
- Ensure consistency across modules and/or products within the team.
- Champion development practices that prioritize testing, including advancing test automation and CI/CD.
- Support, maintain, and document software functionality with a customer-centric approach.
Requirements
- BS/MS in Computer Science or equivalent experience; 15+ years of hands-on experience building complex services.
- Strong knowledge of and experience with OOP concepts and design patterns, with in-depth experience designing and composing large-scale back-end systems.
- Expertise in core Java, including Collections API, Streams API, Concurrency, and I/O.
- Proficiency with RDBMS and NoSQL databases (examples cited: Cassandra, DynamoDB, Redis).
- Deep understanding of HTTP REST APIs, gRPC, security, and networking; strong grasp of API development informed by UX/UI/CLI requirements.
- Ability to drive pragmatic technical discussions toward clean, reusable, testable, and extensible solutions.
- Dedication to collaborative development approaches and ability to impact daily operations across teams.
- Strong debugging skills and experience working closely with DevSecOps, SREs, or equivalent roles.
- Strong verbal and written communication skills and a track record of being an outstanding colleague.
Ways to stand out
- Operational experience with large-scale web applications.
- Expertise with Java, Spring Boot, Golang, Gatling, Python, Kubernetes, and Docker.
- Familiarity with InfluxDB, Cassandra, RDS, Elasticsearch, Solr, and Splunk.
- Experience with one or more cloud providers: AWS, GCP, or Azure.
- Ability to excel in dynamic, highly interactive environments and drive team success.
Benefits & Compensation
- Base salary range: 272,000 USD - 425,500 USD (determined by location, experience, and market pay).
- Eligible for equity and additional benefits (see NVIDIA benefits page).
- Full-time role. Applications accepted at least until September 6, 2025.
Additional information
- Employer: NVIDIA, which is committed to diversity and equal opportunity and does not discriminate on the basis of protected characteristics.
- Location specified: Santa Clara, CA, United States.