Used Tools & Technologies
Machine Learning
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level value.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.
Go @ 6
Linux @ 6
Python @ 6
Networking @ 4
Cloud Computing @ 4
GPU @ 4
AI @ 4
InfiniBand @ 4
Details
Why work at Nebius
Nebius is leading a new era in cloud computing to serve the global AI economy. We create tools and resources to help customers solve real-world challenges and transform industries without massive infrastructure costs or the need to build large in-house AI/ML teams. The company is headquartered in Amsterdam, listed on Nasdaq, and has R&D hubs across Europe, North America, and Israel.
The role
We are looking for a Senior Network Engineer to design, build, and operate large-scale, high-performance data center networks supporting GPU-dense AI workloads. You will take end-to-end ownership of service provider–grade and CLOS-based network infrastructure, ensuring reliability, scalability, and predictable performance across distributed environments. This role focuses on production operations, root cause analysis, and continuous improvement of network systems.
Responsibilities
- Design and evolve scalable data center network architectures (CLOS/leaf-spine) for high-throughput, low-latency environments
- Own end-to-end deployment and lifecycle management of routing and switching infrastructure across production environments
- Develop and maintain network design documentation, standards, and operational procedures
- Plan, execute, and validate network infrastructure testing, including vendor evaluation and benchmarking
- Diagnose and resolve complex network issues across the TCP/IPv4/v6 stack in large-scale distributed environments
- Optimize traffic engineering, ECMP, and load balancing strategies
- Implement and operate MPLS-based technologies, including L3 VPNs and segment routing (SR-MPLS, SRv6)
- Collaborate with hardware, systems, and software teams to integrate networking with compute and storage infrastructure
- Automate network operations and workflows to improve reliability, scalability, and operational efficiency
- Ensure high availability and performance of production networks through proactive monitoring and continuous improvement
Requirements
- Expert-level knowledge (CCIE/JNCIE or equivalent) in MPLS, routing, and switching for service provider and data center networks
- Strong experience with Ethernet switching, VXLAN, and modern cloud overlay networking technologies
- Deep expertise in routing protocols including BGP and IS-IS
- Hands-on experience with segment routing (SR-MPLS, SRv6), L3 MPLS VPNs, and ECMP-based traffic balancing
- Proven experience designing and documenting large-scale network architectures
- Experience developing and executing network testing strategies and validating vendor solutions
- Strong troubleshooting skills across the TCP/IPv4/v6 stack in CLOS-based data center environments
- Solid understanding of network hardware architecture, QoS mechanisms, and packet processing pipelines
- Hands-on experience with network equipment from vendors such as Juniper, Arista, Huawei, and Mellanox
- Working proficiency in English
It will be a bonus if you have
- Working proficiency in an additional European language
- Experience with public cloud networking and GPU/InfiniBand environments
- Knowledge of software-defined networking (SDN) overlays in cloud environments
- Proficiency in Python, Go, or other programming languages in Linux environments
- Experience using programming languages for network automation and tooling
Working conditions
- Remote work within the United States
- Work closely with globally distributed infrastructure and engineering teams
- Participation in on-call rotations to support production environments
Key employee benefits
- Comprehensive health insurance
- 401(k) plan with company contribution
- Paid time off and public holidays
- Flexible working hours
- Professional development and certification support
Compensation
$125,000 6080,000 USD per year, depending on experience and qualifications
What we offer
- Competitive salary and comprehensive benefits package
- Opportunities for professional growth within Nebius
- Flexible working arrangements
- A dynamic and collaborative work environment that values initiative and innovation