Senior AI and FSI Developer Technology Engineer

at NVIDIA

📍 Santa Clara, United States

USD 152,000-287,500 per year

SENIOR

✅ Hybrid

Used Tools & Technologies

Machine Learning

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. About proficiency levels:

1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;

3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;

7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;

10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Algorithms @ 4, Communication @ 7, Parallel Programming @ 7, Prioritization @ 7, LLM @ 4, CUDA @ 7, GPU @ 4, AI @ 4, TensorRT @ 4, HPC @ 4

Details

Help shape the future of AI and LLMs in FSI (Financial Services Industry) at NVIDIA. We’re looking for a Senior AI Developer Technology Engineer to push the limits of performance at the intersection of AI, high-performance computing, and financial markets. In this role you will dive deep into parallel algorithms, GPUs, and sophisticated systems, identifying and eliminating bottlenecks to unlock the full power of advanced processing hardware.

Responsibilities

Research, design, and develop techniques to accelerate high-performance workloads for FSI-focused AI on NVIDIA CPUs and GPUs.
Work hands-on with technical experts to analyze, optimize, and scale complex AI and HPC workloads for modern CPU and GPU architectures.
Profile and eliminate performance bottlenecks across the stack: algorithms, kernels, and system-level behavior.
Publish and present results at conferences and in talks and blog posts to educate and inspire the developer community.
Influence the design of future hardware architectures, system software, libraries, and programming models by collaborating with NVIDIA research, hardware, compiler, and tools teams.
Act as a developer technology engineer working with external technologists to investigate application performance, design parallel algorithms, and implement GPU-accelerated optimizations.

Requirements

Master’s or PhD in Computer Science, Computer Engineering, or Electrical and Computer Engineering (or equivalent experience).
5+ years of relevant work or research experience.
Strong, hands-on experience with low-level parallel programming (for example, CUDA, OpenACC, OpenMP, MPI, pthreads, or TBB).
Fluency in C/C++ and solid foundations in algorithms and software design.
Deep understanding of CPU/GPU architecture fundamentals and how they impact performance.
Proven experience improving performance of large-scale computational applications on GPUs.
Excellent understanding of linear algebra.
Strong communication and organization skills, logical problem solving, and solid prioritization abilities.

Ways to stand out

Experience with inference optimization techniques and deploying optimized AI models in production.
Experience with TensorRT, TensorRT-LLM, and cuTile.
Background in capital markets, systematic/algorithmic strategies, or quantitative trading.
Experience parallelizing and optimizing ML methods such as decision trees, time series models, and Monte Carlo simulations.
Knowledge of financial data models, pricing and risk simulation algorithms, portfolio optimization, or other finance-focused applications.

Compensation & Other Info

Base salary ranges: 152,000 USD - 241,500 USD (Level 3) and 184,000 USD - 287,500 USD (Level 4).
Eligible for equity and benefits.
Location: Santa Clara, California, United States (hybrid role).
Applications are accepted at least until April 14, 2026. This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes. NVIDIA is an equal opportunity employer committed to diversity.