Building CoAffinity: Teaching Computers to Read the Room
Research · 2025-01-20 · 12 min read · Research Paper

Tags: Affective Computing, Cognitive Load, Remote Collaboration, Dataset, Machine Learning

In simple terms: CoAffinity is like a textbook for teaching AI to understand how you're feeling during video calls. I collected data from 39 people doing collaborative tasks over 38+ hours, capturing everything from their facial expressions to their heart rate. Now computers can learn to "read the room" in virtual meetings.


🎯 Key Takeaways

  • 38+ hours of multimodal data from 39 participants doing real collaborative tasks
  • 82.6% F1-score for detecting cognitive load (how mentally taxed someone is)
  • 80.2% accuracy in predicting emotional valence (positive vs. negative feelings)
  • First dataset of its kind, combining affect and cognitive load in remote collaboration contexts

The Problem I Wanted to Solve

CoAffinity Dataset Overview

When the world shifted to remote work, I noticed something that bothered me: our video conferencing tools were essentially blind to how we were actually doing.

Think about it. In a physical meeting, a good manager notices when someone looks confused. They see the glazed-over eyes of someone who's been staring at spreadsheets for too long. They pick up on the tension when two people disagree. But Zoom? Teams? Google Meet? They just stream video, completely oblivious to the human dynamics unfolding.

A team member could be overwhelmed, disengaged, or struggling, and the technology wouldn't know or care.

This observation became the seed for CoAffinity.

Why I Built Another Dataset

You might wonder: don't we already have datasets for affective computing? We do. But here's the gap I identified:

Most existing datasets capture emotions in isolated contexts: watching videos alone, performing solo tasks, or responding to stimuli in a lab. But collaboration is fundamentally different. When we work together remotely, our cognitive load fluctuates based on:

  • The complexity of the shared task
  • Communication breakdowns (can you hear me? you're muted!)
  • Technical issues (screen sharing not working... again)
  • Social dynamics we can't fully perceive through a screen
  • The exhaustion of trying to read people through tiny video windows

I wanted to capture this messy, real-world complexity that existing datasets missed.

How I Built It

CoAffinity Methodology

Building CoAffinity meant solving several hard problems simultaneously:

1. Multimodal Synchronization

We collected:

  • Video streams - facial expressions, gaze patterns, head movements
  • Audio features - voice tone, speaking patterns, turn-taking dynamics
  • Physiological signals - heart rate (PPG), skin conductance (GSR)
  • Self-reports - continuous annotations of subjective experience

Synchronizing all these streams with millisecond precision across participants in different locations was challenging. A few milliseconds of drift could make the data useless for training ML models.
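To make the alignment problem concrete, here is a minimal sketch (illustrative only, not the actual CoAffinity pipeline) that matches samples from two independently clocked streams by nearest timestamp and discards pairs whose clock gap exceeds a tolerance. The stream names, sample rates, and tolerance below are hypothetical:

```python
from bisect import bisect_left

def align_streams(ref_ts, other_ts, other_vals, tol_ms=5.0):
    """For each reference timestamp (ms), pick the nearest sample from the
    other stream; drop pairs whose clock gap exceeds tol_ms."""
    aligned = []
    for t in ref_ts:
        i = bisect_left(other_ts, t)
        # candidates: the neighbor on each side of the insertion point
        best = min(
            (j for j in (i - 1, i) if 0 <= j < len(other_ts)),
            key=lambda j: abs(other_ts[j] - t),
        )
        if abs(other_ts[best] - t) <= tol_ms:
            aligned.append((t, other_vals[best]))
    return aligned

# Hypothetical example: ~30 fps video timestamps vs. 64 Hz PPG samples
# whose clock runs 2 ms ahead of the video clock.
video_ts = [i * 33.3 for i in range(10)]
ppg_ts = [i * 15.625 + 2.0 for i in range(25)]
ppg = [0.1 * i for i in range(25)]
pairs = align_streams(video_ts, ppg_ts, ppg)
```

Even this toy version shows why drift matters: as the clocks diverge, more and more reference frames end up with no physiological sample inside the tolerance window and must be dropped.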

2. Ecological Validity

Lab studies are controlled but artificial. Nobody actually collaborates the way they do in a psychology lab. So I designed tasks that mirror real collaborative work:

  • Brainstorming sessions - generating ideas under time pressure
  • Problem-solving tasks - working through complex scenarios together
  • Decision-making exercises - reaching consensus with limited information

These created the natural fluctuations in cognitive load and emotion that make the dataset valuable.

3. Ground Truth Labels

Here's a fundamental challenge: cognitive load and affect are internal states. How do you get reliable labels for something you can't directly observe?

We combined three approaches:

  • Continuous self-reports - participants indicated their state at regular intervals
  • Post-task questionnaires - NASA-TLX for workload, SAM for affect
  • Physiological markers - heart rate variability and skin conductance as objective indicators

Cross-validating these sources gave us more reliable ground truth than any single method could provide.
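For the physiological side of that triangulation, one standard time-domain heart rate variability measure is RMSSD over inter-beat intervals. A minimal sketch (the formula is standard, but this is not necessarily the dataset's exact feature extraction, and the interval values are made up):

```python
import math

def rmssd(ibis_ms):
    """Root mean square of successive differences between inter-beat
    intervals (ms): a standard time-domain HRV measure, commonly used
    as a physiological proxy for stress and cognitive load."""
    diffs = [b - a for a, b in zip(ibis_ms, ibis_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Steady beats (low variability) vs. more variable beats
steady = [800, 805, 798, 802, 801]
variable = [800, 760, 840, 770, 830]
```

Lower RMSSD under load is the usual pattern, which is what lets a number like this serve as an objective cross-check on the self-reports.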

What Makes This Dataset Special

| Feature | CoAffinity | Previous Datasets |
| --- | --- | --- |
| Context | Real-time collaboration | Solo tasks or passive viewing |
| Data Sources | Video + Audio + Physio + Self-report | Usually 1-2 modalities |
| Duration | 38+ hours | Typically <10 hours |
| Cognitive Load Labels | Yes | Rarely included |
| Collaborative Tasks | Yes | No |

What CoAffinity Enables

With this dataset, researchers can now:

  • Train models that detect when remote collaborators are struggling
  • Design adaptive interfaces that respond to team cognitive states
  • Build AI facilitators (like CLARA!) that support group cognition

This directly feeds into my PhD research on AI-augmented collaborative cognition. Imagine a virtual meeting assistant that notices when the team's collective cognitive load is spiking and suggests a break, or identifies that someone hasn't contributed in a while and creates space for them.

That's not science fiction anymore. CoAffinity makes it trainable.
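A toy sketch of that break-suggestion idea, assuming a per-team cognitive-load estimate in [0, 1] produced by some upstream model (the window size, threshold, and load values below are all invented for illustration):

```python
from collections import deque

def detect_overload(load_stream, window=5, threshold=0.75):
    """Return the indices where the rolling mean of a per-team
    cognitive-load estimate (0-1) reaches the threshold - the kind
    of trigger a meeting assistant could use to suggest a break."""
    buf = deque(maxlen=window)
    alerts = []
    for i, x in enumerate(load_stream):
        buf.append(x)
        if len(buf) == window and sum(buf) / window >= threshold:
            alerts.append(i)
    return alerts

# Simulated load estimates sampled over a meeting
loads = [0.4, 0.5, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9, 0.6, 0.5]
alerts = detect_overload(loads)
```

The rolling window is the important design choice here: a single noisy spike shouldn't interrupt a meeting, but a sustained climb should.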

The Results That Surprised Me

When I trained baseline models on CoAffinity, I was honestly nervous. Real-world data is noisy. Collaborative settings add complexity. Would the signals even be detectable?

The results exceeded my expectations:

  • Cognitive load detection: 82.6% F1-score
  • Valence prediction: 80.2% accuracy
  • Arousal prediction: Strong correlation with physiological ground truth

These aren't perfect numbers, but they're good enough to be useful. Good enough to build systems that can actually help people.
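For readers less familiar with the metric: the F1-score reported above is the harmonic mean of precision and recall. A self-contained sketch of the computation, on made-up labels:

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Hypothetical binary "high cognitive load" labels vs. model predictions
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
score = f1_score(y_true, y_pred)
```

F1 is the right headline number for this task because high-load moments are rarer than low-load ones, and plain accuracy would reward a model that simply predicts "fine" all the time.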


📚 Personal Reflections: What I Learned

Creating CoAffinity taught me several things that changed how I approach research:

1. Design Decisions Are Everything

The hardest part of research isn't the technical implementation; it's the design decisions. Every choice shapes what questions the dataset can answer:

  • Which sensors? (We chose PPG and GSR for their balance of signal quality and wearability)
  • How many participants? (39 gave us statistical power without being logistically impossible)
  • What tasks? (Collaborative tasks that mirror real work, not artificial lab exercises)

Getting these decisions right required understanding both the technical constraints and the eventual use cases.

2. Collaboration Teaches You What You Don't Know

This work wouldn't exist without my co-authors: Kunal, Yun Suen, Huidong, and Mark, each bringing expertise I lacked. Kunal's deep knowledge of affective computing. Yun Suen's experience with physiological sensing. Huidong's work on remote collaboration. Mark's decades of perspective on where the field is heading.

Research is a team sport, and CoAffinity made that viscerally clear.

3. The Gap Between "Working" and "Published" Is Huge

I had working data collection pipelines a year before the paper was published. The gap was filled with cleaning data, debugging synchronization issues, running baseline experiments, writing and rewriting, reviewer responses, and countless iterations.

But that gap is where the work becomes trustworthy. Quick results that can't withstand scrutiny don't help anyone.


What's Next?

CoAffinity is published in IEEE Transactions on Affective Computing, one of the premier journals in the field. But this is just the foundation. I'm now using insights from this dataset to design AI systems that can:

  1. Predict team cognitive overload before it causes breakdowns
  2. Adapt meeting dynamics in real-time based on group state
  3. Personalize support based on individual cognitive styles

The future of remote work isn't just about better video quality; it's about technology that truly understands and supports human collaboration.

And that future starts with data that captures what collaboration actually looks like.