An online coaching company recently teamed up with language researchers to amass the world’s largest publicly available dataset of two-person virtual conversations. Already in use at institutions including Harvard, Columbia, and Cornell, BetterUp Labs’ CANDOR Conversation Corpus comprises more than 1,600 Zoom chats, totaling over 850 hours, recorded between January and November 2020. Its authors hope to give experts and scholars across an array of fields a deep trove of data on the myriad ways digital communication can affect everyday human interactions.

Zoom delays are the bane of many remote workers’ existence, and the problem goes beyond sheer annoyance: the delays cause us to awkwardly talk over one another. According to a study published last year, it takes the human brain approximately 297 milliseconds to process yes-or-no questions asked face to face; ask those same questions over a video chat platform like Zoom, and that delay increases to upwards of 976 milliseconds. As Business Insider relayed on Monday, that study’s researchers theorized that an audio delay of as little as 30 to 70 milliseconds (less than the blink of an eye) can disrupt the neural processing that underlies the very basics of human dialogue.

Enter BetterUp Labs’ “Conversation: A Naturalistic Dataset of Online Recordings,” aka CANDOR. With its methodology and results recently published in Science Advances, CANDOR offers one of the most expansive archives of two-person audio and video conversations to date. The process was simple enough: compensated participants were randomly paired with fellow volunteers and asked to chat for at least 25 minutes about whatever they wanted. Afterwards, they were surveyed about their thoughts and feelings on the conversation. Both the audio and video of each conversation were recorded, meaning that unlike most conversational corpora, CANDOR doesn’t merely archive transcripts. Speakers’ visual and audio information was also captured in detail, so every facial tic, verbal stutter, and subtle gesture is available for researchers to parse and analyze.

Initial analysis of CANDOR’s data reveals some quick takeaways about what makes a solid Zoom conversationalist. Generally speaking, the higher-rated, better-received participants were those who spoke faster and more intensely. As Insider explains, “people rated by their partners as better conversationalists spoke 3 percent faster than bad conversationalists—uttering about six more words a minute.” Although average volume didn’t differ between positively and negatively reviewed conversations, the more nuanced notion of “intensity” factored heavily into opinions, as did variation in decibel levels. More variation meant a better rating, while monotone conversationalists unsurprisingly didn’t score as well.

The authors of the new CANDOR corpus freely admit the limitations of their initial work: the first version includes only American English conversations, and randomly pairing participants might have produced social anxieties and other issues that skewed some of the data. Still, the CANDOR database offers one of the most expansive sets of two-person digital conversations ever amassed, and can serve as a launching pad for even more detailed investigations down the line. Those investigations will need more data, however, so don’t be surprised if you find yet another Zoom invite in your email inbox in the near future.