Senior Software Engineer, Observability and AIOps

at NVIDIA
πŸ“ United States
$164,000-310,500 per year
SENIOR
✅ Remote

Required Skills & Competences

Python @ 4, SQL @ 4, Machine Learning @ 4, Data Science @ 7, Hiring @ 4, Bash @ 4, Data Analysis @ 7, API @ 4

Details

Imagine a world where the network is self-managed and self-healing, and requires minimal manual intervention to sustain business operations. A world where the network learns from past events to recommend actions to users. Or better yet, a network that proactively prevents actions with a high probability of causing disruption. This network is advanced and intelligent: disruptions are minimized, and emerging technology is easily integrated to maintain a first-class service for our business. If that sounds exciting, NVIDIA is looking for a Network Software Engineer to develop a smart network infrastructure.

Responsibilities

  • Lead the design, development, testing, and deployment of an AIOps platform
  • Apply machine learning, deep learning, natural language processing, and other AI techniques to solve network operations challenges such as anomaly detection, root cause analysis, incident management, and automation
  • Improve network operations by defining and measuring AIOps metrics such as accuracy, reliability, scalability, performance, and efficiency
  • Implement observability principles and practices such as monitoring, logging, tracing, and alerting
  • Apply deep knowledge of data science engineering, including data collection, data cleaning, data analysis, data modeling, and data visualization
  • Build services to automate monitoring and triaging activities and provide critical information to facilitate response and resolution of performance issues and incidents
  • Build automation that recognizes, troubleshoots, and analyzes system disruptions, and develop solutions for improved reliability
  • Own and drive integrations with various service APIs, such as those of Cloud Service Providers, to automate the creation of environments and auto-populate data sources. Break down targeted manual processes into reusable software modules that can be integrated as code

Requirements

  • 10+ years of network architecture and automation experience
  • PhD or equivalent experience, plus a proven track record in architecting and automating large-scale, enterprise-grade networks for several types of organizations
  • Familiarity and hands-on experience with Arista, Fortinet, Juniper, and Mellanox
  • Strong track record of implementing network services in a variety of distributed computing environments
  • Hands-on experience with high-performance networking and network optimization in highly available, large-scale, multi-site, international environments
  • Hands-on experience contributing to tooling and automation for provisioning, monitoring, and managing network infrastructure
  • Must be able to read, write, and review automation code (Python, Bash, SQL, etc.)
  • Uses independent judgment and a high level of innovation to set company-level technology strategies and processes to accomplish objectives
  • Must have strong interpersonal and organizational skills, including the ability to meet deadlines, work in a team environment, follow written policies and procedures, and maintain superior customer service at all times

Benefits

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.