The Rise of Emotional Robots

There seems to be an ever-increasing number of news stories about the emergence of robots, but I wanted to single out a few recent announcements that I found quite interesting. The first is the announcement of Google’s latest strategy regarding self-driving cars. It’s certainly not new that Google has been investing in self-driving cars; they have been traveling the streets of San Francisco and the freeways of the Bay Area for some time now.

However, the latest information reveals that Google is applying what they have learned from retrofitting conventional cars with sensors and computers to their own custom vehicles. Unlike their predecessors, the new Google cars are smaller and designed to travel at only about 25 miles per hour. Also, unlike their more conventional cousins, they have no explicit controls (i.e., no steering wheel) other than an emergency stop button. They have also been designed with soft exteriors, likely not so much as a backup should their safety mechanisms fail, but (as is the case with the robot Hoaloha is developing) to absorb the impact of objects the robot cannot easily avoid; in their case, perhaps an errant bicyclist.

But one of the most interesting aspects of Google’s new self-driving (robot) cars is their appearance. They are cute. We humans already tend to attribute social characteristics to cars, often giving them names or referring to them with pronouns. The designs of cars sometimes lend themselves to our imagining them with faces, a detail exploited by Pixar in its animated feature, “Cars”.

So it is perhaps no accident that Google’s new cars appear to have a simple “face” with two eyes, a nose, and a mouth-like bumper. The overall diminutive size (about that of a golf cart) also contributes to the friendly appearance. That appearance will clearly evoke an emotional reaction, even more so than a Roomba does.

This shift in strategy suggests that rather than trying to tackle the thorny concerns and issues of turning robot cars loose on our freeways, Google may be going for a subtler, less threatening approach. This may enable them to get into the market sooner, while also avoiding direct competition with traditional automobile manufacturers.

I can easily see how this could work well on corporate or university campuses or other large properties where you need to shuttle people around. But observing the increasing number of small Car2Go vehicles on the streets of Seattle, I can also see a huge potential business in personal taxi services that supplement other forms of public transportation, especially in metropolitan areas. Companies like Uber and Lyft are already disrupting the conventional model, and this seems like a logical progression.

It also may help address the growing issue of our aging population, who typically have to discontinue driving at some point, lose some of their mobility options, and may find crowded public transportation too onerous. But I can see yet another great business opportunity here. While Amazon has teased us with the prospect of flying drone delivery, Google’s approach seems to have fewer barriers. I can imagine ordering up groceries or other items online and finding them delivered to my door minutes later. I swipe my credit card or use my smartphone to enter my PIN, the door unlocks, and I retrieve my goods. This is a mobile form of what Amazon’s Locker Delivery offers, but far more convenient. Perhaps a modified form could deliver mail or other packages to me as well. Brilliant, though I am interested in what Google has yet to reveal in terms of how you interact with the car.

What I like about this most is the way Google is combining form and function to give this concept its potential for success. It is not simply the convenience this kind of vehicle would provide, but also its appearance. Who could resist taking a ride in something that looks like it came from Disneyland? Reminds me of Walt’s vision for the People Mover.

Emotion is a very important part of product design (and success). I have mentioned in previous posts how Apple took the smartphone and even the PC world by storm, in large part by designing products that people loved. The sooner those in the robotics industry recognize this, the sooner we will have personal robots.

Emotional design for personal robots requires more than a cute face. If that’s all it took, we could be there already. What it will also require is a socially engaging design.

That brings me to the announcement of Jibo, a new robotics company founded by Dr. Cynthia Breazeal, formerly of MIT. I have long followed the research that she and her students have worked on over the years, including Kismet, a mechanical, articulated robot head that engaged like a baby, and the more sophisticated, impressive Leonardo, which looks somewhat like a Furby on steroids and was designed for her lab by Stan Winston Studio. While at Microsoft, I also helped support the work on Huggable, a robot teddy bear that looks like a predecessor of the one that appeared in Spielberg’s movie AI.

I have tried to recruit Dr. Breazeal as an academic advisor to Hoaloha, but she always resisted, indicating that she had her own venture planned. Now the details are finally starting to come out. While the myjibo.com website still does not reveal too much about “Jibo”, it suggests that the robot will be “the first social robot for the home” and “a family robot” that “gets to know you, helps you, delights you, and grows with you”. Many of these attributes also fit our design goals for the Hoaloha robot, so I look forward to learning more about Jibo. Based on Dr. Breazeal’s past work, Jibo will without a doubt be designed for emotional and social engagement.

Dr. Breazeal’s announcement that she was leaving MIT to focus on Jibo was followed by an equally exciting announcement from SoftBank and Aldebaran, maker of Nao, a small humanoid robot that helped fill the gap left when Sony canceled further development of their “Aibo” robot dog. In a very dramatic and superbly choreographed presentation, SoftBank’s CEO, Masayoshi Son, and Aldebaran’s Bruno Maisonnier introduced us to “Pepper”, also a humanoid robot, though a bit larger than Nao and using an omnidirectional drive base rather than walking “legs” to move around.

Standing 4 feet tall and weighing about 60 pounds, with a 10” touch tablet on its chest, Pepper is the robot that comes closest to what we are developing at Hoaloha. Like our robot, Pepper also has a “head” and “face”, though that is roughly where the similarities end. While it is too early to reveal details about our robot, I can say that if you like what you see in Pepper, you will likely like our robot as well.

That said, Pepper is not altogether that different from Wakamaru, a similarly sized humanoid robot introduced by Mitsubishi in 2005. Like Pepper, Wakamaru has arms and hands, and like Pepper’s, they seem to be designed more for expression than for picking up and carrying things. However, there is a vast difference in price. Where Wakamaru was priced at $14,000, SoftBank proposes to sell Pepper for a most impressive price of just under $2,000. This seems almost unbelievable, especially if you consider that Aldebaran recently dropped the price of Nao from $16,000 to $8,000, and Nao is a much smaller and less advanced robot than Pepper. When I look at the specs, I find it hard to see how SoftBank can sell the robot at this price at a profit; in fact, Masayoshi suggested that they were intentionally being aggressive to get to an affordable (and attractive) price point.

This suggests that SoftBank may be willing to sell the robot, likely at an initial loss, just to prime the market. There is also the possibility that actually using the robot and its purported cloud-based services may require a subscription in addition to the initial sales price. When asked by the press, both Masayoshi and Maisonnier indicated that the robot would be usable without a subscription or cloud services, but somewhat hedged on to what degree. So there might be a hidden cost here. Remember that SoftBank also owns Sprint and is one of Japan’s largest telecom companies.

Pepper’s introduction was beautifully scripted, rivaling public demonstrations of Honda’s Asimo or Toyota’s Partner Robots even without legs, as the robot went through a variety of graceful interactions with Masayoshi and others on stage. That included dancing, which seems to be obligatory for Japanese robots, given the cultural inclination to blend technology with art. To that extent, Pepper did very well.

Within that theme, Pepper was billed as a robot that can both “read” and respond to emotions, adapting its behavior to human interaction. Its elegant appearance clearly distinguishes it from most Western robots and some of its creepier Asian cousins, though it also reflects its design relationship to Nao, its French cousin. At the same time, Pepper is a more modest design than Aldebaran’s Romeo, which looks like a scaled-up Nao or an Asimo competitor. The move from legs to an omnidirectional drive base likely contributes a great deal to the lower price and the promised battery runtime of over 12 hours.

As amazing as Pepper appears to be, there are important questions that still need to be answered, including:

–  What will Pepper’s availability be after its launch in February 2015? No word has been given yet about Pepper being available outside of Japan.

–  What will Pepper do out-of-the-box? If Pepper is sold like Nao, it may be more like buying an Apple II computer in 1978: a great platform for hobbyists and researchers, but not quite a consumer product the way the Macintosh was. To get there will require a great core user interface and a suite of core applications that deliver value when you first turn the robot on, as you would expect when you buy a smartphone today. Such has not been the case for Nao, so will Pepper be offered differently? Similarly, the absence of much more than pretty graphics on the tablet suggests that SoftBank/Aldebaran either plan to leave the core UI and applications for developers to build themselves or simply haven’t finished enough to show.

While Apple’s success with the iPhone and iPad has clearly benefited from third-party application developers, Apple also supplied these devices with a well-defined interface for installing, launching, and switching between applications; support for system settings and user preferences; a rich set of common interface components for use in applications; and a core starting set of applications that makes the device immediately useful when purchased. Third-party applications are additive, not required, so the device delivers value as soon as the user purchases it.

– As pleasing as Pepper’s outward design is, is it sufficient to succeed as a product? While Pepper’s arms are great for gesturing, are they useful beyond this? Can they actually be used for picking up and delivering objects, and if so, what kind of objects and in what situations? Pepper’s introduction implied this capability when it took a glowing heart from Masayoshi and then returned it to him. However, you could see that these hand-offs were carefully coordinated. Adaptive, dexterous manipulation remains a challenge even for robots with much more sophisticated arms. For example, Rethink Robotics’ Baxter can do a great job of repeating a manipulation task, but must first be walked through the procedure step-by-step, and even then the task can only be repeated within the same setup. Likewise, most demos of Asimo are carefully scripted when showing manipulation. It’s not that the hardware may not be capable, but the software required to pick and place objects demands coordination between vision, sensing, and the movement of the arms and hands, something along the lines of the rough sketch below. Doing this in an arbitrary scenario with a wide variety of objects is not trivial and has not been demonstrated yet.
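To make that coordination concrete, here is a deliberately simplified, hypothetical sketch of a pick routine; every class and function in it is a stub I made up for illustration, not anything from Pepper’s or Hoaloha’s actual software.

```python
# A hypothetical, stubbed-out pick routine illustrating why pick-and-place
# needs vision, touch sensing, and arm motion working together.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pose:
    x: float  # position in the robot's frame, in meters
    y: float
    z: float


def detect_object(name: str) -> Optional[Pose]:
    """Vision: locate the requested object with the depth camera (stub)."""
    return Pose(0.40, 0.05, 0.80)


def plan_grasp(obj: Pose) -> Pose:
    """Planning: choose a gripper pose that approaches from above (stub)."""
    return Pose(obj.x, obj.y, obj.z + 0.05)


def move_arm_to(target: Pose) -> bool:
    """Motion: drive the arm toward the target pose and report success (stub)."""
    print(f"moving arm to ({target.x:.2f}, {target.y:.2f}, {target.z:.2f})")
    return True


def gripper_feels_object() -> bool:
    """Touch/force sensing: confirm the object is actually in the hand (stub)."""
    return True


def pick(name: str) -> bool:
    obj = detect_object(name)
    if obj is None:
        return False                      # vision failed: nothing to grasp
    if not move_arm_to(plan_grasp(obj)):
        return False                      # motion failed: replan or give up
    # Close the gripper, then verify with touch rather than assuming success.
    return gripper_feels_object()


if __name__ == "__main__":
    print("pick succeeded:", pick("glowing heart"))
```

Even this toy version has three distinct failure points, and a real system additionally has to cope with moving people, occlusion, and objects it has never seen before.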

Similarly, I am curious about Pepper’s mobility. If the robot uses omniwheels to achieve its holonomic motion, these tend to perform poorly on smooth surfaces and are not very efficient from a power perspective. In addition, are Pepper’s sensors sufficient to support autonomous navigation as well as user interaction? Pepper’s specs call out only a single depth camera (in the head). These cameras typically have about a 60-degree field of view, whereas human peripheral vision is roughly double that; the quick calculation below shows how much that difference matters at conversational distance. I can say from experience that 60 degrees is not sufficient for supporting user interaction, even if the head moves. This seems to be reflected in some of the online videos of Pepper interacting with people in the SoftBank store in Japan.
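As a back-of-the-envelope illustration (my own numbers, not Pepper’s published specs), here is how wide a swath a 60-degree camera sees at one meter, compared with roughly 120 degrees of human peripheral vision:

```python
# Width of the region visible at a given distance for a horizontal field of view.
import math


def visible_width(fov_degrees: float, distance_m: float) -> float:
    """Horizontal coverage (in meters) at distance_m for the given FOV."""
    return 2.0 * distance_m * math.tan(math.radians(fov_degrees / 2.0))


for fov in (60, 120):
    print(f"{fov:3d} deg FOV at 1.0 m: {visible_width(fov, 1.0):.2f} m wide")
# 60 deg -> about 1.15 m; 120 deg -> about 3.46 m
```

Roughly a meter of coverage at arm’s length means a person who steps slightly to the side, or a second person joining the conversation, can easily fall out of view, which is why head movement alone only partially compensates.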

Finally, like Nao, Pepper’s “face” is mostly static, animated only by head movement and colored lights behind the eyes. Research suggests that even this can be scripted to evoke a social response from people (even a Roomba can do so without any humanoid features). Still, it seems like a limited design choice not to have given Pepper a more expressive face. While articulated features like movable eyes, mouth, and eyebrows might have been too costly to include, is what is here enough? Pepper’s developers will have to be much smarter about using body and head movement, gestures, and voice characteristics to achieve social communication. That will go a long way, but for a robot otherwise designed to be so wonderfully expressive, one wonders why it was given a static face.

– Is emotion recognition really a sufficient value proposition? Pepper was introduced as the first robot to recognize human emotions. This is simply not true: Kismet and Leonardo clearly demonstrated this ability years ago. The claim might be more accurate if stated as the first “commercial” robot to have this ability. Even so, reading human emotions by analyzing voice characteristics or facial expressions is not new. Numerous companies, including the MIT spin-out Affectiva, offer this as a service.

It is not clear whether Pepper will read emotions based on voice patterns or visual data, but based on existing technologies it seems likely to do so only by making simple, shallow connections from easily observed features, such as associating a frown with sadness or a smile with happiness, roughly along the lines of the sketch below. It is much harder to read emotions that are presented more subtly, even for people. Further, we humans are notorious for hiding or falsely presenting emotions.
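To illustrate what I mean by shallow, here is a deliberately naive, hypothetical mapping from a few coarse facial cues to emotion labels; the feature names and thresholds are my own invention, not any vendor’s API:

```python
# A deliberately naive emotion "recognizer": coarse facial cues in, single label out.
def guess_emotion(features: dict) -> str:
    """Map facial-cue scores (0.0 to 1.0) to one emotion label."""
    if features.get("smile", 0.0) > 0.7:
        return "happy"
    if features.get("frown", 0.0) > 0.7:
        return "sad"
    if features.get("brow_raise", 0.0) > 0.7:
        return "surprised"
    return "neutral"  # anything subtle, mixed, or deliberately masked lands here


print(guess_emotion({"smile": 0.85}))               # broad grin -> "happy"
print(guess_emotion({"smile": 0.3, "frown": 0.4}))  # polite, tired face -> "neutral"
```

A polite smile and genuine delight both read as “happy”, while frustration hidden behind a neutral expression reads as nothing at all, which is exactly the limitation I see in current approaches.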

While I see value in having a robot recognize when I am sad and try to cheer me up, or recognize that when I smile I may be happy, that seems a somewhat minimal benefit. Humans generally don’t exhibit strong emotions that often in everyday activities. We might smile when we greet someone or hear a funny joke, but unless there are special circumstances, most of our facial expressions tend to be more modest and include forms that help facilitate social engagement, such as head nods and gazes. Further, software typically needs somewhat exaggerated or extreme expressions, such as a broad smile or a big frown, to recognize them reliably. That said, I do agree that emotions are an essential element of personality, but like “telepresence”, this is a required feature, not a sufficient value proposition by itself.

Perhaps positioning Pepper as an “emotional” robot was intended to be more culturally appealing to Japanese users. Historically, the Japanese have had much more positive perspectives on the role of robots, especially as human assistants, which stands in contrast to some of the very negative fictional examples found in Western cultures. Still, I might have positioned Pepper as a “social” robot, which to me suggests much more than just reading and responding to emotions. However, I do like what may be the intent: the concept of a robot that contributes to your well-being. After all, our company’s name, Hoaloha, includes in its root the word “aloha”, a Hawaiian word that is used to convey a variety of positive emotions.

–  How dependent is Pepper’s operation on the cloud? Both Masayoshi and Maisonnier indicated that they wanted Pepper to use the cloud, but not necessarily require the cloud. That sounds good, but what does it really mean? Pepper’s published specs include no CPU details. Nao uses an Intel Atom, which we tried early on and found did not provide sufficient processing power. I suspect Pepper will have more, perhaps an Atom for core functions as well as a multi-core ARM to support the tablet (since most 10″ tablets these days include quad-core processors). Maybe the reason for not listing this is so that SoftBank/Aldebaran can upgrade the processor before Pepper releases. Otherwise, it suggests that perhaps some of Pepper’s core recognition and advanced functions are cloud-based.

None of these questions are intended to diminish what SoftBank and Aldebaran demonstrated. They have scheduled a developer conference in September, so perhaps answers to some of these questions will be forthcoming.

With these announcements, Google, Jibo, and Pepper help pave the way for what Hoaloha is developing. It is great to see the rise of emotional robots, validating Hoaloha’s design philosophy about the importance of social interaction and its role in the design of the user experience: that to be personal, robots must have personality.