AI Factory Deployment Engineer
at NVIDIA
Santa Clara, United States
USD 152,000-287,500 per year
Used Tools & Technologies
HPC
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9: expert. You can teach others and you know all the pitfalls and tricks;
- 10: exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Software Development @ 3
Automated Testing @ 3
Python @ 3
SQL @ 3
Machine Learning @ 3
Communication @ 3
Networking @ 3
PHP @ 3
Due Diligence @ 3
AI @ 3
Agentic AI @ 3
Details
NVIDIA's AI Factories (data centers) host products across high-performance computing and machine learning applications. This role focuses on controls and monitoring (DCCM) systems for next-generation data center deployments, adapting control system reference designs and standards, and enabling IT-to-OT data integration to support digital twins and AI-enabled applications.
Responsibilities
- Collaborate with product owners and technical leads to identify and collect requirements for next-generation data centers.
- Support global design standards for data center controls and monitoring (DCCM) systems; develop execution strategy and lifecycle management.
- Adapt control system reference designs and standards to AI Factory deployments.
- Participate in control system technical evaluation from site selection due diligence through site turnover to operations, including contractor selection, bid package development, MEP review, control system composition review, RFI responses, submittal/as-built reviews, and commissioning support.
- Provide technical support to data center operations controls engineers.
- Support IT-to-OT data integration enabling digital twins, agentic AI onboarding, coordinated leak detection, and other applications.
- Support standardization in controls engineering quality approval, process control, product evaluation, vendor proposals, product reliability evaluation, automated testing, and software validation.
- Collaborate with cross-functional teams to modify control settings and alarm thresholds to manage data center spaces.
Requirements
- BS in Engineering, Computer Science, or equivalent experience.
- 8+ years of experience with control system design, development, and management on industrial or mission-critical systems.
- Working knowledge of mechanical, electrical, life safety, and IT networking systems associated with critical environments.
- Understanding of OPC-UA and Modbus (TCP & RTU) protocols and how to integrate using these protocols.
- Experience with equipment commissioning, testing, and related activities.
- Experience with startup and configuration of Programmable Logic Controllers (PLCs) and SCADA workstations.
- Strong understanding of Sequence of Operations (SOO) for mechanical system control and ability to create and iterate on SOOs.
- Troubleshooting and problem-solving skills with experience driving root cause analysis on complex projects under pressure.
Ways to stand out
- Experience with MQTT communication protocol, higher-level data strategies, and integration to IT systems.
- Strong understanding of data center commissioning including Level 1 through Integrated Systems Testing.
- Strong understanding of document control and change control processes.
- Experience with Data Center Infrastructure Management (DCIM), EPMS systems, Ignition SCADA software development and deployment, and programming languages such as Python, PHP, and SQL.
- Working knowledge of data center power and cooling solutions, including advanced systems such as liquid cooling.
Compensation and benefits
- Base salary range: 152,000 USD - 253,000 USD for Level 4; 184,000 USD - 287,500 USD for Level 5.
- Eligible for equity and company benefits (link to NVIDIA benefits provided in original posting).
Additional information
- Applications accepted at least until May 4, 2026.
- NVIDIA uses AI tools in its recruiting processes and is an equal opportunity employer committed to diversity and inclusion.