Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level value.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Communication @ 6
macOS @ 3
AI @ 3
Details
Contribute to xAI's work on multilingual audio capabilities by curating, annotating, and recording high-quality audio data to improve Grok's voice interactions, speech recognition, and auditory experiences across languages, accents, and cultural contexts.
Responsibilities
- Use proprietary software to provide labels, annotations, recordings, and inputs for projects involving multilingual audio clips, voice recordings, speech samples, and auditory elements in various languages.
- Support delivery of high-quality curated audio data that ensures clear, natural spoken output and accurate representation of linguistic and prosodic details (intonation, rhythm, accent) to professional audio standards.
- Collaborate with technical staff to develop tasks that improve the AI's ability to handle speech modulation, accent variation, noise in real-world recordings, and multilingual audio processing.
- Work with technical staff to improve annotation tools and make audio workflows more efficient.
Requirements
- Native proficiency in Urdu, with exposure to diverse accents, dialects, or regional variations.
- Proficiency in English (minimum B2 level) with clear, natural vocal delivery and pronunciation suitable for audio recording.
- Strong auditory perception to identify nuances in speech, accents, pronunciation, intonation, and audio quality across languages.
- Demonstrated ability to handle multilingual audio content, including evaluating speech accuracy, cultural vocal expressions, and contextual interpretation in spoken form.
- Demonstrated ability to transcribe audio with high accuracy across accents and varying audio quality.
- Comfort providing high-quality voice recordings and feedback on audio samples in multiple languages.
- Strong comprehension, independent judgment on ambiguous or varied audio material (including noisy or accented speech), and strong communication, interpersonal, analytical, detail-oriented, and organizational skills.
- Ability to use a personal device that is a Chromebook, a Mac with macOS 11.0 or later, or Windows 10 or later.
- Commitment to developing AI that masters sophisticated multilingual audio capabilities.
Preferred Skills and Experience
- Exceptional attention to linguistic nuance, auditory detail, and data quality beyond standard transcription work.
- Deep understanding of, and taste for, what constitutes good, useful audio data.
- Strong command of advanced transcription and annotation practices, including handling disfluencies, accents, and prosodic features (intonation, stress, rhythm, emotion) with high consistency and accuracy.
- Background in linguistics (phonetics, phonology, sociolinguistics), speech sciences, cognitive science, or equivalent practical experience analyzing accent variation and multilingual speech patterns.
- Experience working with speech/audio datasets, annotation workflows, or AI training data, including familiarity with training voice models and how data quality impacts model performance.
- Professional experience in voice work (voice acting, voice recording, podcasting) demonstrating attention to clarity and recording quality.
- Portfolio (strongly preferred for advanced candidates): voice samples, annotated transcripts, or audio-related work demonstrating quality, methodology, and attention to detail.
Location and Other Expectations
- Tutor roles may be offered as full-time, part-time, or contractor positions depending on role needs and candidate fit.
- For contractor positions, hours vary widely with scope and availability, and there is no fixed commitment; on average, most projects may require at least 10 hours per week. Contractors have flexibility to set their own hours.
- Roles may be performed remotely from any location worldwide, subject to legal eligibility, time-zone compatibility, and role-specific needs. (Note: for US-based candidates, xAI cannot hire in Wyoming or Illinois.)
- xAI is unable to provide visa sponsorship.
Compensation and Benefits
- US-based candidates: $35/hour - $45/hour depending on experience, skills, education, geographic location, and qualifications. Compensation details for international candidates are provided during recruitment.
- Benefits vary by employment type, location, and jurisdiction. Benefits for eligible U.S.-based positions may include health insurance, a 401(k) plan, and paid sick leave. Specifics are provided during the interview process.