Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — expert. You can teach others and you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Security @ 3
Kubernetes @ 3
Python @ 3
Distributed Systems @ 3
Hiring @ 3
Communication @ 3
gRPC @ 3
Protobuf @ 2
Rust @ 3
Debugging @ 5
API @ 2
Reporting @ 3
Observability @ 3
AI @ 3
Details
NVIDIA is defining the next era of computing by tapping into the unlimited potential of AI. Joining the OpenShell team offers a unique opportunity to work on a highly advanced platform that enables this future. This core system provides secure, sandboxed runtimes essential for autonomous AI agents. The OpenShell platform is sophisticated, incorporating a control-plane gateway, a privacy-conscious inference router, declarative policy enforcement, and specialized container and VM-based sandbox execution environments.
Responsibilities
- Work across the full stack of a distributed systems platform, from crafting gRPC contracts to building secure sandbox runtimes.
- Implement and harden network security features, including policy enforcement, L4/L7 proxies, and secure inter-service communication using mTLS.
- Develop core platform components such as inference routing, ensuring model provider adapters, credential management, and protocol translation integrate seamlessly with the sandbox and gateway.
- Build reliable configuration and control plane systems that handle state divergence, implement reconciliation loops, and support safe merging and hot-reloading of policies.
- Own the operability experience by creating effective CLI tools, managing release automation, and instrumenting all systems for observability with structured logging and distributed tracing.
Requirements
- Minimum of a Bachelor's degree in Computer Science, Electrical Engineering, or a related technical field, or equivalent experience.
- 8+ years of meaningful experience.
- Proficiency in systems programming, including building and debugging long-running services, async runtimes, and handling OS-level integration.
- Deep knowledge of distributed systems/control planes, including reasoning about state divergence, building reconciliation loops, and designing crash recovery paths.
- Experience with container/sandbox internals, managing isolated workloads, process lifecycle, capabilities, and network namespaces.
- Familiarity with gRPC and Protobuf, including crafting machine-to-machine APIs with clean streaming semantics and version safety.
- Experience operating and extending workloads on Kubernetes, including working with compute drivers, image management, and detailed debugging.
- Ability to secure inter-service communication using mTLS, gateway registration flows, and non-browser identity verification.
- Proficiency in instrumenting systems with structured logging, health checks, and distributed tracing for production observability.
Ways to stand out from the crowd
- Familiarity with virtualization technologies and alternative runtimes, such as microVMs (e.g., libkrun).
- Experience improving operator experience through CLI/TUI development, status reporting, and clear error messages.
- Comfort working at cross-language boundaries, specifically between Rust, Python, protobuf codegen, and shell scripting.
Compensation & Benefits
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits.
Additional information
- Applications for this job will be accepted at least until May 12, 2026.
- This posting is for an existing vacancy.
- NVIDIA uses AI tools in its recruiting processes.
- NVIDIA is an equal opportunity employer and values diversity in hiring and promotion practices.