Staff Software Engineer, Speculative Decoding

at Groq
USD 175,900-307,800 per year
MIDDLE SENIOR
✅ Remote

Required Skills & Competences

Security @ 3, Kubernetes @ 3, Algorithms @ 6, Distributed Systems @ 3, Data Science @ 2, Leadership @ 3, Communication @ 3, Rust @ 3, Technical Leadership @ 3, PyTorch @ 2

Details

Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need. Headquartered in Silicon Valley, we are on a mission to make high performance AI compute more accessible and affordable. When real-time AI is within reach, anything is possible. Build fast.

Mission

With a strong background in generative AI inference and expertise in speculative decoding, you will design, implement, and optimize cutting-edge algorithms to enhance our production AI infrastructure and capabilities in post-training, model evaluation, and operational performance.

Responsibilities

  • Design, implement, and optimize speculative decoding algorithms and underlying models that enhance the speed and accuracy of Generative AI Inference.
  • Collaborate with cross-functional teams to integrate your solutions into Groq’s production AI infrastructure.
  • Work in a multi-data-center production Kubernetes environment with Groq’s customer hardware, inference, and compiler stack.
  • Develop high-performance, scalable code primarily in C++ and Rust, ensuring efficient resource utilization and system stability, and model the performance of distributed high-performance systems.
  • Build production distributed systems that involve multi-process communication with technologies such as MPI, scheduling, and operation in a Kubernetes environment.
  • Stay up-to-date with the latest developments in generative AI and speculative decoding, and translate cutting-edge research into practical, production-ready implementations.
  • Work closely with teams across software engineering, research, and operations to drive improvements in post-training, model evaluation, and overall system performance.
  • Provide technical leadership and mentorship to team members, fostering an environment of continuous learning and innovation.
  • Champion code quality, maintainability, observability, monitoring, and best practices, ensuring that all deliverables meet rigorous performance and security standards.

Requirements

  • Master’s degree in Computer Science, Electrical Engineering, or a related field (or equivalent industry experience).
  • Extensive, hands-on experience in generative AI inference with a specific focus on speculative decoding.
  • Proficiency in C++ is essential, with demonstrated experience in developing high-performance systems.
  • Strong analytical and problem-solving skills, with a track record of delivering innovative technical solutions.
  • Proven ability to work effectively in fast-paced, cross-functional environments, driving projects from conception to production.
  • Understanding of generative AI model architectures and PyTorch, plus familiarity with the data science needed to evaluate model layers, their performance, and their quality.
  • Familiarity with AI infrastructure challenges and scalable system design.

Attributes of a Groqster

  • Humility - Egos are checked at the door.
  • Collaborative & Team Savvy - We make up the smartest person in the room, together.
  • Growth & Giver Mindset - Learn it all versus know it all; we share knowledge generously.
  • Curious & Innovative - Take a creative approach to projects, problems, and design.
  • Passion, Grit, & Boldness - No-limit thinking, fueling informed risk-taking.

If this sounds like you, we’d love to hear from you!

Compensation

At Groq, a competitive base salary is part of our comprehensive compensation package, which includes equity and benefits. For this role, the base salary range is $175,900 to $307,800, determined by your skills, qualifications, experience, and internal benchmarks.