Required Skills & Competences
- Algorithms @ 3
- Data Structures @ 6
- Distributed Systems @ 3
- Communication @ 3
- Networking @ 3
- Performance Optimization @ 3
- Rust @ 3
Details
This role is for a systems-level engineer specializing in network infrastructure and optimization, with expertise in building and maintaining software that interacts with networks. You will be responsible for writing and maintaining software that interfaces between our accelerators and our high-speed networks. This role requires deep technical knowledge of network protocols, kernel-space and/or user-space networking, and hardware interfaces, as well as the ability to debug and optimize distributed software at the network level.
Responsibilities
- Write and maintain software that interfaces between ML accelerators and high-speed networks.
- Build and optimize networked systems for low-latency, high-throughput ML workloads.
- Diagnose and resolve networking issues in distributed systems (OSI layers 2-4).
- Benchmark and optimize congestion control, collectives, and other networking algorithms.
- Implement and evaluate new collective algorithms and network protocols to improve latency and throughput.
- Work on kernel-space and/or user-space networking stacks and tools; debug kernel-level network latency spikes.
Requirements
- Expert-level proficiency with network protocols and networking concepts.
- Deep kernel networking experience: TCP/IP stack internals, XDP, eBPF, io_uring, and epoll.
- User-space networking experience: DPDK, RDMA, and kernel bypass techniques.
- Understanding of how to build higher-level abstractions such as collectives and RPC.
- Strong skills diagnosing and resolving networking issues in distributed systems, especially OSI layers 2-4.
- Strong programming skills in systems programming (memory management, lock-free data structures, NUMA-aware programming).
- Experience with software, driver, and OS performance optimization tools and techniques.
- Comfort with or desire to learn Rust.
- Bachelor’s degree in a related field or equivalent experience is required.
- Strong candidates often have experience or background in HPC, telecommunications, host networking software, OS/kernel engineering, or embedded systems.
- 5+ years of experience in systems programming or network programming is common for strong candidates.
Nice-to-have / Strong candidates may have
- Understanding of ML accelerators and accelerator drivers.
- Experience designing new network protocols.
- Experience with PCIe and drivers for PCIe devices.
- Expertise in networking algorithms including compression and graph algorithms.
- Experience programming on SmartNICs.
Compensation & Logistics
- Annual base salary: $315,000 - $560,000 USD.
- Total compensation may include equity, benefits, and incentive compensation.
- Education requirement: at least a Bachelor’s degree in a related field or equivalent experience.
- Location-based hybrid policy: staff are expected to be in one of the offices at least 25% of the time (some roles may require more time).
- Visa sponsorship: Anthropic sponsors visas for some roles. When an offer is made, it will make reasonable efforts to obtain a visa and will provide immigration support.
About Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. The team is a mix of researchers, engineers, policy experts, and business leaders working together on large-scale research efforts. The company values collaboration, communication, and impact. Anthropic is a public benefit corporation headquartered in San Francisco and offers competitive compensation, benefits, equity donation matching, generous vacation and parental leave, and flexible working hours.