SCHEMA Engages in Multiparty Conversation

Waseda University’s Perceptual Computing Laboratory has posted a video showing their third-generation conversational robot, SCHEMA, engaging in conversation with multiple groups. The robot listens intently as two humans talk about movies. When another person arrives, SCHEMA asks the person to wait while it finishes what it was going to say. It then directs its attention to the newcomer, whom it has already recognized using image processing. What is important here is how quickly SCHEMA can respond to human questions and interjections. It doesn’t display any of the awkward pauses normally associated with talking robots. In fact, it almost responds too quickly! This type of work is helping to make robots more sociable and natural, and is the best work I’ve seen on the subject. Like the previous video featuring ROBISUKE, this one is also subtitled in English.

The lab won the Yamashita Memorial Research Award from the Conference on Spoken Language Processing Society Technical Committee two years in a row for their work on spoken language processing.

[source: Matsuyama720 @ YouTube]

  • alex

    I know it could all be real, but it looks just too good, and the presentation looks more like a (theatrical) play with exact choreography: I say this, you say that, the robot says this, a new person comes, the robot reacts to the new person, the robot asks him to wait…
    We saw only one minute, so they could have just selected a conversation topic (i.e. movies) and a situation they knew the robot would have no problems with. But if it’s just a chatbot inside, and the robot had to make conversation with a stranger on some random topic in a random situation, I bet it wouldn’t look as good as it does here. I bet we would see the standard chatbot behavior with random questions and answers.
    It’s a presentation video, so they just selected the things the robot does best. Why would they show a situation where the robot doesn’t look good…

  • Robotbling

    @ alex

    It’s likely SCHEMA was responding immediately to “why” in the sentence “why do you like it”. I agree it comes in too fast, but that could be one reason for it. As for knowing the direction of the new person, I’m sure SCHEMA is using some form of sound source localization with multiple microphones. This ability has been demonstrated in several other robots.

  • alex

    The reaction to “why do you like it” (0:23) seems a bit too fast. And when the new member arrives, it’s as if Schema knows exactly where he is even though the robot only hears him. Schema knows that someone else is calling, that it’s not one of the two, and exactly where he is. The scene just looks a bit strange.
    Considering it’s a promotional video, it’s quite possible they helped him a little.