Senior Data Center Deployment Engineer

at Nebius
USD 125,000-180,000 per year
SENIOR
✅ Remote

Used Tools & Technologies

Machine Learning

Required Skills & Competences

Linux @ 4, Leadership @ 4, Communication @ 4, Networking @ 4, Technical Leadership @ 4, Cloud Computing @ 4, GPU @ 4, AI @ 4, HPC @ 4

Details

Nebius is leading a new era in cloud computing to serve the global AI economy. We create the tools and resources our customers need to solve real-world challenges and transform industries, without massive infrastructure costs or the need to build large in-house AI/ML teams. Our employees work at the cutting edge of AI cloud infrastructure alongside experienced and innovative leaders and engineers.

Where we work

Headquartered in Amsterdam and listed on Nasdaq, Nebius has a global footprint with R&D hubs across Europe, North America, and Israel. The team includes over 800 employees with more than 400 engineers across hardware and software engineering and an in-house AI R&D team.

Role description

Nebius operates large-scale, GPU-dense AI infrastructure across mission-critical data center environments. As a Senior Data Center Deployment Engineer, you will own the end-to-end delivery, deployment, and production readiness of next-generation GPU platforms inside our data centers. This role sits at the intersection of hardware, Linux systems, and operational execution. You will lead on-site rack bring-up, validate NVIDIA-based AI systems, coordinate repairs, and ensure GB-series infrastructure moves from installation to fully operational production environments. You will collaborate closely with hardware engineering, networking, and infrastructure teams to deploy and stabilize H200- and B200-based GPU systems at scale.

Responsibilities

  • Lead end-to-end deployment of GB-series racks within data center environments
  • Oversee installation, bring-up, validation, and production readiness of NVIDIA H200- and B200-based servers
  • Troubleshoot complex hardware, firmware, Linux OS, and networking issues
  • Execute structured testing and validation procedures during deployment
  • Develop and maintain basic Linux-based hardware health-check and diagnostic scripts
  • Coordinate on-site hardware repairs, part replacements, and vendor escalations
  • Drive root cause analysis and ensure corrective actions are implemented
  • Manage and prioritize deployment timelines across multiple concurrent rollouts
  • Provide technical leadership and guidance to on-site engineers and technicians
  • Partner with networking and infrastructure teams to ensure seamless integration
  • Document deployment processes, validation standards, and operational runbooks

Requirements

  • Strong hands-on experience deploying and operating data center infrastructure
  • Deep familiarity with GPU-dense systems, ideally NVIDIA H-series platforms
  • Experience working with high-density rack deployments (GB-series or similar)
  • Solid Linux experience, including troubleshooting and scripting
  • Ability to diagnose issues across hardware, OS, firmware, and network layers
  • Experience coordinating field repairs and working directly with hardware vendors
  • Proven experience leading technical teams or overseeing field operations
  • High ownership mindset and ability to operate in production-critical environments
  • Clear communication skills and ability to collaborate across distributed teams

Nice to have

  • Experience deploying AI or HPC clusters at scale
  • Familiarity with automated provisioning or infrastructure lifecycle systems
  • Background in hardware qualification, burn-in testing, or factory validation
  • Experience supporting rapid infrastructure expansion
  • Exposure to ARM-based or heterogeneous compute environments

Working conditions

  • Fully remote position (United States)
  • Collaboration with globally distributed engineering and operations teams

Benefits

  • Health insurance: 100% company-paid medical, dental, and vision coverage for employees and families
  • 401(k) plan: up to 4% company match with immediate vesting
  • Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers
  • Remote work reimbursement: up to $85/month for mobile and internet
  • Disability & life insurance: company-paid short-term, long-term, and life insurance coverage
  • Competitive salary and comprehensive benefits package; opportunities for professional growth and flexible working arrangements

Compensation

We offer a competitive base salary of $125k-$180k per year, plus quarterly performance bonuses.