Humanlike

Conversation datasets that speak for themselves

We capture high-quality, unscripted, dyadic, multi-turn conversations with expert-verified transcriptions, modeled on the Seamless Interaction standard.

Conversational voice specialists

Our expertise lies in native audio for conversational models. We capture natural, multi-speaker dialogues.

Premium, expert-sourced datasets

Gain a proprietary edge with data from experts in various fields. We provide data that is richer and more nuanced than publicly available sources.

Natural, familiar dyad datasets

Get natural voice conversations. More than 40% of our dyads pair familiar partners (friends, classmates, family members) in authentic, unscripted, candid conversations.

Flexible engagement

We offer both highest-quality curated datasets and custom collection services, tailored to the rigorous standards of leading AI labs.

Our data advantage

Expert-led, natural multi-turn conversations
Includes overlapping and interleaved speech for real-world scenarios
Rich accent and dialect diversity
Background information on individual speakers
High-quality datasets with 1080p video (.mp4), 48 kHz audio (.wav), and native-speaker-verified transcriptions (.json)
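
For teams that want to sanity-check a delivery against the audio spec above, here is a minimal sketch using only Python's standard library. It assumes the .wav files are uncompressed PCM, which is what the `wave` module reads. The helper names (`wav_sample_rate`, `is_48khz`, `make_silent_wav`) are hypothetical and chosen for illustration, not part of any Humanlike tooling.

```python
import io
import wave

# Assumption: 48 kHz is taken from the spec line above; adjust if your delivery differs.
EXPECTED_RATE_HZ = 48_000


def wav_sample_rate(source) -> int:
    """Return the sample rate in Hz of a WAV file path or binary file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def is_48khz(source) -> bool:
    """True if the WAV's header reports the expected 48 kHz sample rate."""
    return wav_sample_rate(source) == EXPECTED_RATE_HZ


def make_silent_wav(rate_hz: int, seconds: float = 0.1) -> io.BytesIO:
    """Build a tiny mono 16-bit silent WAV in memory (demo data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 16-bit samples
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * int(rate_hz * seconds))
    buf.seek(0)                   # rewind so the buffer can be read back
    return buf


if __name__ == "__main__":
    print(is_48khz(make_silent_wav(48_000)))
    print(is_48khz(make_silent_wav(44_100)))
```

In practice you would pass a file path such as a delivered recording instead of the in-memory demo buffer. The check reads only the header, so it is cheap to run across a whole dataset.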