Principal Performance Modeling Engineer

at Groq
USD 205,000-248,000 per year
SENIOR
✅ Remote

Required Skills & Competences

  • Python @ 6
  • Algorithms @ 4
  • Leadership @ 4
  • Mathematics @ 4
  • Networking @ 4
  • Performance Optimization @ 4
  • LLM @ 4

Details

Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need. From our Bay Area roots to our growing global presence, we are on a mission to make high-performance AI compute more accessible and affordable.

This role focuses on modeling the performance of Groq systems on state-of-the-art AI/ML workloads, in order to identify bottlenecks early and guide future hardware development for Groq's AI accelerator.

Responsibilities

  • Develop and maintain performance models for multiple generations of Groq hardware on the latest AI/ML workloads (LLMs, CNNs, LSTMs, etc.)
  • Analyze AI/ML algorithms to understand their compute, networking and memory requirements, and map them effectively onto the underlying hardware architecture
  • Lead a matrixed team to enable software/hardware co-optimization across chip, system and software teams
  • Identify performance bottlenecks and help drive next-generation chip architecture through a solid understanding of Groq's software and hardware
  • Work with silicon and system integration engineers to evaluate the costs & benefits of new technologies on Groq systems
  • Provide what-if scenarios and continuous guidance directly to the CEO & senior leadership
  • Develop the Design Space Exploration (DSE) tool for performance analysis and exploration of both chip and system across various workloads
  • Define custom hardware solutions for high-profile customers

Requirements

  • Degree or equivalent experience in computer science, mathematics, electrical and computer engineering (ECE) or a related field
  • Strong fundamentals in computer architecture; deep knowledge of, and experience working on, domain-specific AI architectures is highly preferred
  • In-depth understanding of the latest AI/ML algorithms and their hardware implications
  • Ability to analyze and simplify complex hardware designs into simple abstracted timing models
  • Past experience modeling AI/ML workloads and creating tools for performance optimization; experience with modeling LLM performance is beneficial but not required
  • Proficiency in programming languages such as C/C++ and Python
  • Experience with cycle-accurate simulators for benchmarking analysis
  • Experience with ASIC microarchitecture design is a plus
  • Experience understanding and simulating RTL (SystemVerilog) designs is a plus

Attributes / Culture

Groq values humility, collaboration, a growth and giver mindset, curiosity and innovation, and passion and grit. Team members are expected to work collaboratively, share knowledge generously, and take creative approaches to projects and design.

Compensation

  • Base salary range (United States): $205,000 to $248,000
  • Base salary is part of a comprehensive compensation package that includes equity and benefits. Compensation for candidates outside the USA will depend on the local market.

Equal Opportunity & Accommodations

Groq is an Equal Opportunity Employer and is committed to creating an inclusive environment. Reasonable accommodations for applicants with disabilities are available upon request (contact: [email protected]).