Used Tools & Technologies
Not specified
Required Skills & Competences
Security @ 4, Ansible @ 6, Docker @ 6, Go @ 6, Grafana @ 4, Jenkins @ 4, Kubernetes @ 6, Linux @ 4, Prometheus @ 4, IaC @ 4, Terraform @ 6, Python @ 6, Java @ 6, CI/CD @ 4, Datadog @ 4, ArgoCD @ 4, Distributed Systems @ 4, AWS @ 4, Azure @ 4, Networking @ 4, CloudFormation @ 6, Microservices @ 4, Thanos @ 4, API @ 7, HTTP @ 4, QA @ 4, OpenShift @ 6
Details
We are looking for a highly skilled Senior Test Developer Architect to join our Enterprise Software QA team. This role is an opportunity to shape the design, optimization, and testing of large-scale cloud infrastructure for foundational NVIDIA Unified Cloud Services and Data Center offerings. The ideal candidate is a cloud infrastructure expert with experience in distributed systems, test automation, and cloud architectures.
Responsibilities
- Leverage AI-powered testing tools to improve test automation, increase coverage, and accelerate testing cycles for cloud-based infrastructure.
- Collaborate with product engineering teams to deeply understand cloud service architectures and provide mentorship to SWQA teams on testing cloud-native applications at scale.
- Design and develop end-to-end test strategies for validating cloud infrastructure, including compute, storage, networking, security, and orchestration layers.
- Lead NVIDIA Cloud bring-up activities from a software quality assurance perspective, ensuring scalability, reliability, and performance.
- Architect and implement cloud-native test automation frameworks to validate multi-cloud (AWS, Azure, Google Cloud) and hybrid-cloud environments.
- Develop scalable and resilient infrastructure automation by using Infrastructure as Code (IaC), Configuration Management, and optimization techniques.
- Improve observability and monitoring through AI-powered anomaly detection, predictive analytics, and intelligent alerting.
- Perform resilience and failover testing of cloud-based microservices and distributed architectures.
- Collaborate with internal teams and cloud service partners to ensure alignment with industry-standard methodologies and real-world use cases.
Requirements
- Master’s or Ph.D. in Computer Science, Cloud Computing, or a related field, or equivalent experience.
- 4+ years of hands-on experience in cloud-native cluster management, including Docker, Slurm, Kubernetes, OpenShift, and Ansible.
- 8+ years of experience working with cloud infrastructure platforms like AWS, Azure, and Google Cloud, with deep expertise in multi-cloud and hybrid-cloud architectures.
- Strong hands-on experience with Cloud Networking (VPCs, Load Balancers, Service Mesh, API Gateways) and Storage Technologies (EBS, S3, Azure Blob, GFS).
- Advanced proficiency in Infrastructure as Code (IaC) and Configuration Management tools (e.g., Terraform, CloudFormation, Pulumi, Ansible).
- Deep expertise in Kubernetes administration, service mesh technologies (Istio, Linkerd), and container security.
- Proficiency in Python, Go, or Java for cloud automation, testing frameworks, and infrastructure scripting.
- Expertise in CI/CD pipelines using GitOps models, GitLab, Jenkins, ArgoCD, and Spinnaker for automated cloud deployments.
- Hands-on experience with cloud observability and monitoring tools (Prometheus, Grafana, CloudWatch, Thanos, Datadog, New Relic).
- Strong cloud security knowledge, including Kubernetes security, IAM policies, encryption, and vulnerability management.
- Proven track record of debugging complex cloud infrastructure issues involving DNS, HTTP, Linux, cloud networking, and containers.
Benefits
By joining our team, you will be part of a forward-thinking company that values innovation and creativity. We offer a competitive salary and benefits package, a flexible work environment, and the opportunity to work with some of the industry's leading experts. If you're ready to take your career to the next level, we’d love to hear from you.