Required Skills & Competences
Security @ 3, Kubernetes @ 3, Algorithms @ 6, Distributed Systems @ 3, Data Science @ 2, Leadership @ 3, Communication @ 3, Rust @ 3, Technical Leadership @ 3, PyTorch @ 2
Details
Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need. Headquartered in Silicon Valley, we are on a mission to make high performance AI compute more accessible and affordable. When real-time AI is within reach, anything is possible. Build fast.
Mission
With a strong background in generative AI inference and expertise in speculative decoding, you will design, implement, and optimize cutting-edge algorithms that enhance our production AI infrastructure and our capabilities in post-training, model evaluation, and operational performance.
Responsibilities
- Design, implement, and optimize speculative decoding algorithms and underlying models that enhance the speed and accuracy of Generative AI Inference.
- Collaborate with cross-functional teams to integrate your solutions into Groq’s production AI infrastructure.
- Work in a multi-data-center, Kubernetes-based production environment with Groq’s customer hardware, inference, and compiler stack.
- Develop high-performance, scalable code primarily in C++ and Rust, ensuring efficient resource utilization and system stability, and model the performance of distributed, high-performance systems.
- Build production distributed systems involving multi-process communication (with technologies such as MPI) and scheduling, working in a Kubernetes environment.
- Stay up-to-date with the latest developments in generative AI and speculative decoding, and translate cutting-edge research into practical, production-ready implementations.
- Work closely with teams across software engineering, research, and operations to drive improvements in post-training, model evaluation, and overall system performance.
- Provide technical leadership and mentorship to team members, fostering an environment of continuous learning and innovation.
- Champion code quality, maintainability, observability, monitoring, and best practices, ensuring that all deliverables meet rigorous performance and security standards.
Requirements
- Master’s degree in Computer Science, Electrical Engineering, or a related field (or equivalent industry experience).
- Extensive, hands-on experience in generative AI inference with a specific focus on speculative decoding.
- Proficiency in C++ is essential, with demonstrated experience in developing high-performance systems.
- Strong analytical and problem-solving skills, with a track record of delivering innovative technical solutions.
- Proven ability to work effectively in fast-paced, cross-functional environments, driving projects from conception to production.
- Understanding of generative AI model architectures and PyTorch, plus familiarity with the data science needed to evaluate model layers, their performance, and their quality.
- Familiarity with AI infrastructure challenges and scalable system design.
Attributes of a Groqster
- Humility - Egos are checked at the door.
- Collaborative & Team Savvy - We make up the smartest person in the room, together.
- Growth & Giver Mindset - Learn it all versus know it all, we share knowledge generously.
- Curious & Innovative - Take a creative approach to projects, problems, and design.
- Passion, Grit, & Boldness - No-limit thinking, fueling informed risk-taking.
If this sounds like you, we’d love to hear from you!
Compensation
At Groq, a competitive base salary is part of our comprehensive compensation package, which includes equity and benefits. For this role, the base salary range is $175,900 to $307,800, determined by your skills, qualifications, experience, and internal benchmarks.