Senior Software Engineer, Observability

at Nebius
USD 130,000-170,000 per year
SENIOR
✅ Remote ✅ Hybrid

Used Tools & Technologies

Machine Learning

Required Skills & Competences

Go @ 7, Linux @ 4, Python @ 7, Hiring @ 4, Networking @ 4, Debugging @ 4, Cloud Computing @ 4, Observability @ 4, AI @ 4

Details

Nebius is leading a new era in cloud computing for the global AI economy. The company builds cloud infrastructure and tools to help customers solve real-world challenges and run AI/ML workloads without massive infrastructure costs. Headquartered in Amsterdam and listed on Nasdaq, Nebius has R&D hubs across Europe, North America, and Israel, and a team of 800+ employees including 400+ engineers.

Role summary

Nebius is hiring a Senior Software Engineer to design, build, and own the backend systems that power metrics and monitoring for large-scale infrastructure, and to develop a comprehensive infrastructure maintenance platform. The role focuses on production systems, system design, reliability, and close collaboration with hardware, networking, and data center operations teams.

Responsibilities

  • Design and build services and agents that provide deep visibility into large-scale server fleets and data center engineering systems
  • Evolve metrics, aggregation, and alerting pipelines, with a focus on signal quality and reliability
  • Design and operate maintenance and remediation systems that enable safe, predictable fleet-wide changes and keep infrastructure healthy
  • Investigate production incidents hands-on, including on-host Linux debugging, and drive root-cause fixes
  • Collaborate closely with hardware, networking, and data center operations teams to improve reliability

Requirements

  • 5+ years of professional software engineering experience
  • Strong production experience with Python and Go, or the ability to ramp up quickly
  • Solid Linux fundamentals and comfort debugging live systems directly on the host
  • Ability to write reliable, maintainable code and dig into complex, ambiguous problems
  • Experience building and operating production systems at scale

It will be a bonus if you have:

  • Ubuntu experience, including internal tooling and packaging workflows (e.g., building Debian packages)
  • CCNA (Cisco Certified Network Associate) or equivalent networking experience

Benefits

  • Health insurance: 100% company-paid medical, dental, and vision coverage for employees and families
  • 401(k) plan: up to 4% company match with immediate vesting
  • Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers
  • Remote work reimbursement: up to $85/month for mobile and internet
  • Disability & life insurance: company-paid short-term disability, long-term disability, and life insurance coverage
  • Competitive salary and comprehensive benefits package
  • Opportunities for professional growth and flexible working arrangements

Compensation

  • Base salary range: $130,000 - $170,000 per year + quarterly performance bonuses

Additional context

  • Team: Hardware Automation / Observability-focused backend systems
  • Work arrangements: flexible working; remote work reimbursement provided