From August 29th through September 4th, an International Summer School on Embodied Language Games and Construction Grammar took place at the Palazzone di Cortona in Tuscany, Italy. Distinguished researchers from around the world gave lectures and tutorials on evolutionary linguistics, among them Luc Steels of SONY CSL Paris. He and his team have been working with SONY’s AIBO and QRIO robots – research that has kept QRIO on life support for the past few years – focusing on dialogs for humanoid robots and fundamental questions about the origins of language and meaning.
Students enrolled in the program got some coveted hands-on time with a pair of QRIOs, which were used to investigate how robotic agents might describe to one another a room with objects and landmarks scattered around it.
Luc Steels has published a number of papers in which SONY’s QRIO plays a central role. The idea is to take embodied artificial agents (in this case, humanoid robots) equipped with machinery for interaction and for the invention and adoption of language, and see what kind of communication system forms. Interestingly, one of QRIO’s functions allowed it to pick up words it had heard and work them into daily conversation. In this way, your QRIO would appear more intelligent – especially when it used them in the correct context or situation. This ability makes QRIO an ideal subject for this line of research, since it can be programmed to invent its own words for objects it sees in its environment.
Since almost every language has a way to express the relationships between objects and events, the question is whether grammar might likewise form in a population of agents. When presented with an arbitrary set of colored objects, a pair of agents is given the task of naming what they see. The flow of conversation between two humanoid robots might go like this:
- Speaker perceives the scene
- Speaker conceptualizes what to say
- Speaker applies vocabulary to produce sentence
- Sentence transmitted to Listener
- Listener decodes sentence
- Listener parses sentence using its own vocabulary
- Listener applies meaning to its own perception of the scene
This presents many problems. When the listener hasn’t heard a word before, it adds that word to its own vocabulary. But what if the listener’s perception of the scene differs from the speaker’s? An object might be occluded by another object from the listener’s point of view, so it may mistake a word applied to the red object for a word describing the yellow object, and so on. This could lead to all sorts of misunderstandings. The researchers solve this problem by putting checks and balances in place so that over multiple generations the meanings of words become settled amongst a population of agents.
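To make this concrete, here is a minimal sketch of such a naming game – purely my own toy in Python, not Steels’ actual code. Each agent keeps confidence scores for the words it associates with each object; a successful game reinforces the winning word while damping its competitors (the “checks and balances” are a form of lateral inhibition), and a failed game weakens the word used. All object and word names are made up for illustration.

```python
import random

random.seed(0)

OBJECTS = ["red_ball", "yellow_box", "blue_pyramid"]

def invent_word():
    # Random syllables, standing in for the words QRIO invents on its own.
    return "".join(random.choice("bdgklmnprst") + random.choice("aeiou")
                   for _ in range(2))

class Agent:
    def __init__(self):
        # For each object, a table of candidate words with confidence scores.
        self.vocab = {obj: {} for obj in OBJECTS}

    def speak(self, obj):
        words = self.vocab[obj]
        if not words:
            words[invent_word()] = 0.5     # invent a name if none exists
        return max(words, key=words.get)   # use the best-scoring word

    def interpret(self, word):
        # The object this agent most strongly associates with the word.
        score, obj = max((words.get(word, 0.0), obj)
                         for obj, words in self.vocab.items())
        return obj if score > 0 else None

    def align(self, word, obj, success):
        words = self.vocab[obj]
        if word not in words:
            words[word] = 0.5              # adopt an unheard word
        elif success:
            words[word] = min(1.0, words[word] + 0.1)
            for w in words:                # lateral inhibition:
                if w != word:              # damp competing synonyms
                    words[w] = max(0.0, words[w] - 0.1)
        else:
            words[word] = max(0.0, words[word] - 0.1)

agents = [Agent() for _ in range(5)]
for _ in range(5000):
    speaker, listener = random.sample(agents, 2)
    topic = random.choice(OBJECTS)
    word = speaker.speak(topic)
    success = listener.interpret(word) == topic
    speaker.align(word, topic, success)
    listener.align(word, topic, success)   # on failure the speaker points at the topic

# After many games the population tends to agree on one name per object.
for obj in OBJECTS:
    print(obj, sorted({max(a.vocab[obj], key=a.vocab[obj].get) for a in agents}))
```

With enough games, the reinforcement-plus-inhibition dynamic usually drives the set of preferred names per object down to a single shared word, which is the toy version of meanings “settling” in the population.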
Steels has also done some fascinating work investigating how representations of body image develop and become coordinated, in a paper titled “The Robot in the Mirror”. We each maintain a body image of ourselves, which helps us control our own movements, plan actions, and recognize and perform the actions of others. This body image is so strong that it can cause amputees to experience phantom pain in limbs that are no longer there. The paper performs various experiments to study how these body representations might form.
The first experiment (called the mirror experiment) is viewed as preparatory. Robots learn the bi-directional mapping between visual-body-image and motor-body-image by standing before a mirror, executing actions, and observing the visual-body-images they generate. Once each robot in a group has learned this mapping, they play language games to settle on names for these actions. These language games are action games, in which the speaker asks the listener to perform an action; the game succeeds if the listener performs the requested action, and if it fails, the speaker repairs the communication by performing the action itself.
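A rough idea of the mirror stage can be sketched in a few lines. This is my own simplification, not the paper’s implementation: motor actions and visual appearances are reduced to hypothetical discrete labels, and “vision” is faked with a helper function.

```python
import random

random.seed(1)

# Hypothetical discrete repertoire; the real robots work with
# continuous joint angles and visual features.
MOTOR_ACTIONS = ["raise_left_arm", "raise_right_arm", "nod_head", "wave"]

def observe_in_mirror(action):
    # Stand-in for the vision system: each motor act has a visual appearance.
    return "visual_" + action

class MirrorLearner:
    def __init__(self):
        self.motor_to_visual = {}
        self.visual_to_motor = {}

    def babble(self):
        # Execute the repertoire in random order in front of the mirror,
        # recording the bi-directional mapping between the two body images.
        for action in random.sample(MOTOR_ACTIONS, len(MOTOR_ACTIONS)):
            visual = observe_in_mirror(action)
            self.motor_to_visual[action] = visual
            self.visual_to_motor[visual] = action

robot = MirrorLearner()
robot.babble()

# With the mapping in place, the robot can recognize an action it sees
# another robot perform and reproduce it with its own motors.
print(robot.visual_to_motor["visual_wave"])   # -> wave
```

The bi-directional table is exactly what the subsequent action games need: the listener hears a name, recalls the visual-body-image, and looks up the motor command that produces it.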
In a second experiment (called the body language experiment), the robots do not learn the bi-directional mapping between visual-body-image and motor-body-image through a mirror, but through the language game itself. They start without knowing the relation between visual-body-images and motor behaviors, and without a pre-defined lexicon. The paper shows that not only the bi-directional mapping between visual-body-image and motor-body-image is emergent, but also the lexicon for naming bodily actions and their visual appearance: the two co-evolve and bootstrap each other.
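Here is an equally simplified sketch of how the mapping and the lexicon could bootstrap each other through the action game alone, with no mirror. Again, this is my own toy (the names and the repair mechanics are invented for illustration): when a game fails, the speaker demonstrates the action, and the listener both adopts the word and searches its own motor repertoire for the command whose appearance matches.

```python
# Toy repertoire; the real robots use continuous motor commands and
# visual feature vectors (all names here are hypothetical).
ACTIONS = ["raise_arm", "nod", "crouch", "wave"]

def appearance(action):
    # Stand-in for perception: what an action looks like to an observer.
    return "looks_" + action

class Agent:
    def __init__(self, name):
        self.name = name
        self.lexicon = {}          # word -> visual-body-image
        self.visual_to_motor = {}  # visual-body-image -> motor command

    def word_for(self, visual):
        for w, v in self.lexicon.items():
            if v == visual:
                return w
        w = self.name + str(len(self.lexicon))  # invent a new word
        self.lexicon[w] = visual
        return w

def play(speaker, listener, action):
    visual = appearance(action)
    # The speaker knows its own body: doing the action teaches it the mapping.
    speaker.visual_to_motor[visual] = action
    word = speaker.word_for(visual)
    # The listener tries to comply.
    guess_visual = listener.lexicon.get(word)
    guess_motor = listener.visual_to_motor.get(guess_visual)
    if guess_motor == action:
        return True
    # Repair: the speaker performs the action itself; the listener observes,
    # adopts the word, and (in this toy) finds the matching motor command by
    # trying actions on its own body until the appearance matches.
    listener.lexicon[word] = visual
    for m in ACTIONS:
        if appearance(m) == visual:
            listener.visual_to_motor[visual] = m
            break
    return False

a, b = Agent("A"), Agent("B")
for epoch in range(3):
    for action in ACTIONS:
        for speaker, listener in [(a, b), (b, a)]:
            play(speaker, listener, action)

# After a few epochs both the lexicon and the mapping are shared.
print(all(play(a, b, act) for act in ACTIONS))   # -> True
```

Each failed game grows both structures at once, which is the toy analogue of the paper’s claim that the mapping and the lexicon co-evolve and bootstrap each other.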
As a big fan of SONY’s QRIO humanoid robot, it puts a smile on my face to know they are still being put to good use studying these A.I. topics. If you’re interested in learning more about Luc Steels’ work, I highly recommend the (appropriately titled) Talking Robots podcast interview with him (link below); alternatively, a quick Google search for PDFs with his name should bring up plenty to sink your teeth into. If by chance you or someone you know attended these sessions, I’d be delighted to see any photos or video (contact me: robotbling [at] plasticpals [dot] com).