Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Security @ 7
Leadership @ 7
Communication @ 7
API @ 4
OSS @ 7
LLM @ 7
Compliance @ 4
GPU @ 4
AI @ 4
RAG @ 4
Data Pipelines @ 4
Details
As a Senior Engineering Manager for Agentic Systems & Platform Architecture, you will lead the strategy and execution for NVIDIA’s agentic developer platform—deeply understanding how teams across the company build, evaluate, and improve autonomous agents, and turning those evolving patterns into scalable platform capabilities. You will identify gaps and friction, drive rapid proof-of-concepts on emerging agent constructs and ecosystem tools, and operationalize the best approaches into reusable building blocks, integrations, and governance mechanisms that accelerate developer productivity and agent quality. This platform helps teams safely ship more autonomous systems at NVIDIA scale.
Responsibilities
- Track and deeply understand evolving agent development patterns across NVIDIA and the broader ecosystem.
- Identify gaps and friction in current agent architectures, and translate insights into a platform strategy that boosts developer velocity and agent quality—backed by evaluations, benchmarking, and feedback loops.
- Assess and integrate open source and third-party tools where they add leverage; drive clear build-vs-use decisions.
- Architect and integrate high-performance data pipelines, RAG systems, vector databases, and GPU-optimized training and inference workflows.
- Lead integration of the AI Data Platform into NVIDIA’s on-prem AI Factory, optimizing GPU-to-storage throughput, data locality, and distributed inference performance.
- Establish and enforce robust Agent Governance policies across the platform, covering model/tool usage, data lineage, and adherence to compliance and Responsible AI frameworks.
- Design, implement, and maintain a centralized Agent Safety Toolkit, providing developers with pre-vetted components for input/output guardrails and prompt injection defenses.
- Lead and grow a high-performing team along with a multi-functional community to standardize procedures and scale adoption.
Requirements
- Bachelor’s degree in Computer Science/Engineering or equivalent experience.
- 10+ years overall in software engineering, including 4+ years managing high-performing teams.
- Strong hands-on experience with evolving agent architectures and open-source libraries; deep expertise in LLM/agent architectures—leading POCs and integrating them into real business use cases with measurable adoption/impact.
- Ability to turn fast-moving, ambiguous problem spaces into clear platform strategy, roadmap, and outcomes.
- Proven track record building multi-team developer platforms (APIs/SDKs, reusable components, reference implementations).
- Experience building evaluation/benchmarking systems for agent workflows (metrics, regression, feedback loops).
- Strong judgment integrating OSS/3P tools; clear build-vs-use decision-making and integration strategy.
- Product-minded approach to safety and governance: controls, auditability, monitoring, and risk management.
- Strong leadership and executive communication (engineering, product, security, research).
Ways to stand out
- Experience implementing enterprise-grade governance for agent systems (controls, auditability, monitoring, policy enforcement) in production autonomous workflows.
- Demonstrated wins taking new/open-source agent constructs from POC to production adoption, with clear business impact (cycle time, quality, cost, reliability).
- Built and scaled an agent platform or agent developer experience used by multiple teams (SDKs, templates, reference apps, reusable building blocks).
- Clear point of view and real examples on build-vs-use decisions—when to adopt OSS/3P vs build internal primitives—and how to operationalize the choice.
- Deep experience with agent evaluation at scale (long-horizon tasks, tool correctness, reliability testing, automated regressions, offline/online feedback loops).
Compensation and benefits
- Base salary range: 272,000 USD - 431,250 USD (determined based on location, experience, and comparable roles).
- Eligible for equity and benefits (link provided in posting).
Other details
- Applications accepted at least until February 27, 2026.
- This posting is for an existing vacancy.
- NVIDIA uses AI tools in its recruiting processes.
- NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.