Conversational AI and Smart Home

Executive Summary

20201029-conversation_design_smart_home

Meet Our Panelists

Victoria Tkatch

Software Product Manager, Voice Adoption @ Sonos

Victoria Tkatch is currently a Product Manager at Sonos, the world’s leading sound experience company. At Sonos, she has led voice assistant integration for Alexa and Google Assistant, including bringing the world’s first smart speaker with multiple assistants to market. Her areas of expertise include the voice assistant out-of-the-box experience, along with voice geo-expansion into new languages and locales. Victoria comes from a background in entrepreneurship, running a Seattle-based startup, and holds a Bachelor’s degree from the University of Washington. She is passionate about helping aspiring product managers break into the role and has recently launched her YouTube channel, ProductChat, to give anyone with an eye on product the tools and information they need.

Ilana Meir

Conversation Designer, AR/VR @ Facebook

Ilana Meir (meh-ear) is a conversation designer at Facebook Reality Labs, where she works with augmented reality and virtual reality technologies, namely Oculus and Portal. She approaches her work with a cultural lens: first seeking to understand the cultural underpinnings of behavior in a space, and then how a new product will affect them. For Ilana's contributions to the design field and voice community, Speech Technology magazine named her a "Speech Technology Luminary."

Michael Zagorsek

VP, Product Marketing @ SoundHound

Michael Zagorsek is VP of Product Marketing at SoundHound Inc. where he focuses on strengthening the company’s presence in the voice market and expanding the reach of its core products: SoundHound, Hound, and Houndify. Throughout his career, he’s incorporated technology into marketing, whether it was digital advertising, launching websites, or innovative forms of human/computer interaction. Prior to SoundHound Inc., Michael served in leadership roles at Square, Leap Motion, and Apple.

Toni Klopfenstein

Developer Advocate @ Google

Toni Klopfenstein is currently a Developer Advocate at Google, where she focuses on Smart Home Actions for the Google Assistant. In her career, she has focused on IoT and Smart Home applications, and she is well versed in the electrical and electronic manufacturing industry. Prior to Google, Toni worked as a Product Development Manager, leading the product development team at SparkFun Electronics.

Meet Our Moderator

Rob Hayes

Head of Product @ Voiceflow

Rob spends his days building better tools for conversation design teams as Head of Product at Voiceflow. Prior to that, he spent time working on customer support-focused AI chatbots at Ada Support, and running the User Experience Design agency, Heist.

Watch The Event Again

Event Q&A

What are some of the biggest use cases that you have seen in conversational AI, in the home, or in Smart Home products?

Victoria: So, I work at Sonos, and we develop Smart Home products that primarily focus on music and audio. So naturally, for us, our number one use case is media initiation, and people are spending a lot of time on that. This use case is split across a number of spaces. We have different types of music services that people use, and we also have multiple assistants on our device. So one interesting area is seeing how people build habits around their different services. They also build habits like: I like to use Google for queries around learning something or doing a Google search, versus I like to use another assistant for initiating media or something like that. Also within the space, we look at which services or which voice assistants are most effective at completing certain queries. You'll see that as you go deeper into a particular use case, you have assistants that are much stronger in working with a particular music service, and you'll see users following that sort of pattern, just as they will follow the areas of strength for other assistants. So I guess the message there is that it's very much affected by the space that you as a company are operating in, and by the strengths of the assistants. But yeah, music is a huge one. General searches are really important. We're also seeing a lot around the usage of voice assistants in cars. That's a really important space where people are not able to use their hands freely. And they're very comfortable because they're in a private space; they don't have anybody listening to them on the street, or concerns about feeling awkward. So that's an interesting space for me: how can we create new developments in people's usage of voice in the car, while pushing past certain barriers that we have in outdoor spaces, open offices, transit, etc.?

Are there any unexpected use cases that anyone's seen that are surprisingly popular?

Ilana: I love seeing that it's become so useful for the elderly population. They're normally a demographic that's left out of new technology waves, and to see that they're being included, and are a major population that benefits from this technology, is wonderful.

Do you have different design considerations when you're building different Smart Home products? Are there separate use case considerations that you'd have when thinking about interacting directly with the device or with the device via another input?

Toni: Yeah, so I think it really does depend on what the device is and what the use case is. From a product manufacturing standpoint, if you're trying to design a product that has a voice assistant embedded in it, you have to look at the bill of materials for that, and you have to use materials that can actually support that kind of technology on the device. In the case of a light bulb, for example, you may not want to have the computing power embedded in the bulb because of the cost that it would drive up. So it is very much dependent on the use case, and again, light bulbs are a really good example for this. If you want to have voice control of your light bulbs, consider where folks may put them: they might have 15-foot ceilings. You don't want to have the speaker in your light bulb, where you have to be standing in the middle of your living room shouting at your ceiling to get it to turn on or off or dim or brighten. So you have to think about the use case: how are your users going to be engaging with these devices? How do they want to engage with these devices? And again, I come at this from a hardware background, so it's very much: from a manufacturing standpoint, what is the cost of goods for you as the manufacturer, versus what are your customers willing to pay for those costs? And how do you balance that? The other big thing, from a user standpoint, is how easy it is for the user to set it up in their house. It's very easy to add a lot of features to these devices, but if that raises the barrier to entry for folks, they're not going to be able to, or they're not going to spend the time, setting it up in their house. So I think those are some big things to consider.

Michael: Yeah, sure. I just really want to echo everything Toni said. What's interesting about the question of conversational AI used in Smart Home products is that you have the relationship that the user has with the product and then the relationship the user has with the home. Those are two very interconnected things, and Toni nailed that when she said that sometimes your relationship with the product is better delegated to a central service like a smart speaker. It really depends on the product. In some cases, your relationship with the product can be well served by the device itself. And I think the one big distinction there is command and control versus access to services. So if you have a more complex device like an appliance, chances are you'll be in front of that appliance, whether it's a washing machine, a fridge, a microwave in the kitchen, or anything else. That device having its own voice interface is great for the organization that's building it because it adds product value, and it means control is not necessarily delegated. Especially if you over-delegate control of a product to third-party assistants, you then start to have to maintain product control, and there are multiple assistant environments to deal with. So I think what we're going to see is that a smart home will be about connected devices: you'll have smart hubs that connect those devices, but then you also have smart devices. And each company is going to make sure it's adding value to the product in a way that makes sense for how they want to connect with their customers.

In terms of designing for smart home products, whether it's a connected device, a device that connects through smart assistants, or one that has an assistant built directly in, are there any particular design challenges or design considerations that you're taking into account?

Ilana: Yeah, so I'll mention a couple of things the other panelists have mentioned, too. Toni was talking about acoustics: having a really tall ceiling and not having a speaker up there. So think about the acoustic environment of a space: a kitchen is noisy, a bedroom might be quieter, that sort of thing. And think about the context of an environment: if you're in a kitchen, you might want to set the standard volume a little bit higher, knowing that it's going to be louder. Then consider multiple assistants, like we were just talking about. Who has right of way when you're talking to an assistant and there are multiple assistants that can respond? Or you have the same assistant, one in the living room and one in the bedroom, and they both hear you: which one says what? And what does a command mean when it is ambiguous? If you say "stop," what does that do? Is that turning off the music? Is that turning off the screen? How are you dealing with those ambiguous commands? In some ways, those ambiguous commands carry across. Something that we haven't touched on, though, is that we're being invited into people's homes. It's a private space; you usually don't invite strangers in, or people that you don't know, other than maybe a repair person coming to fix something. In designing for the smart home, you have to respect that privacy. And you also have to respect the privacy of the individual in a communal environment. It seems fairly innocuous on the surface to save somebody's preferences for how they like whatever. But then you have to think: what if somebody else in the home has access to those preferences? Is that information that they want to be shared? Is that information that they would want other people to be able to modify? For example, with music, everybody loves their music, and sometimes they don't want their partners messing with their favorite playlist. That's an innocuous example, but it carries throughout when you think about saving user preferences and respecting the privacy of the space that you're invited into, and also of the individuals as they interact as a group.
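
As a rough illustration of the ambiguous "stop" problem Ilana raises, here is a minimal sketch in which a device resolves "stop" by looking at what it is currently doing. The `DeviceState` fields, the action names, and the priority order are all assumptions made for this example; none of them come from the panel or from any real assistant platform.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    """Hypothetical snapshot of what one device is doing right now."""
    timer_ringing: bool = False
    music_playing: bool = False
    screen_on: bool = False


# Assumed priority: the most interruptive activity is the one "stop" targets.
STOP_PRIORITY = [
    ("timer_ringing", "dismiss_timer"),
    ("music_playing", "pause_music"),
    ("screen_on", "turn_off_screen"),
]


def resolve_stop(state: DeviceState) -> str:
    """Map an ambiguous "stop" to one action based on current activity."""
    for attribute, action in STOP_PRIORITY:
        if getattr(state, attribute):
            return action
    # Nothing is active, so guessing would be worse than asking.
    return "ask_user_to_clarify"


if __name__ == "__main__":
    print(resolve_stop(DeviceState(music_playing=True, screen_on=True)))
```

The fallback matters as much as the ordering: when nothing is obviously active, asking the user is one way to avoid acting on the wrong interpretation.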

Being able to manage multiple different users on a single device: how is that changing the design experience or the design considerations?

Ilana: Yeah, as you said, it's still in its nascency. But something that I'm seeing or thinking about as I look ahead is that the home has very much become a place of individual expression. So much of how the people who are living in a home interact is about individual expression. And so I'm thinking about how that can extend beyond the physical space (these are the posters that the teenager hangs up on their walls, these are the throw pillows that the parents chose for the couch) into the digital. We're interacting with screens on all sorts of devices, so how can we come together and express our individual personalities and have them interact within the Smart Home space?

How widely available are the APIs for creating Smart Home experiences? Are these often built per use case? Or are they readily available to be built upon?

Toni: Speaking from the Google world specifically, we have a lot of different APIs available to interact and engage with the Google Assistant and tie in your devices. We have the Smart Home platform, which enables you to tie into a user's Home Graph. That helps you gain the context of what devices a user has, what rooms they have them in, and what supported traits they might have on their various devices. So it can help you build up an information model of the potential use cases and how users will want to combine these devices or set them into routines. We have the Local Home SDK, which is set up to help minimize latency in responses from devices. That way, if you say, "Hey, turn on my light bulbs," you're not sitting there in the dark for five seconds as all the lights slowly turn on one by one. That's not a great user experience. You want to have that instant feedback to make it feel like the Smart Home is responsive to you. There's also the Smart Home device access API through Nest. So if you are an OEM and you want to tie Nest control into your device, for example, if you're building a smart screen and you want users to be able to control those other elements in their house from your device, you can tie into that ecosystem as well. So there are a lot of different options there. And then, in the community at large, there are a lot of different DIY options as well that are enabling folks to go out and build their own devices that integrate.
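
To make the "information model" Toni describes a little more concrete, here is a minimal sketch of a home modeled as devices, rooms, and supported traits. This is not the Google Smart Home API: the `Device` class, the helper functions, and the sample home are hypothetical, and the trait labels are used only as illustrative strings.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical record: one device, its room, and what it can do."""
    name: str
    room: str
    traits: frozenset


def devices_by_room(devices):
    """Group device names by the room they are installed in."""
    rooms = defaultdict(list)
    for device in devices:
        rooms[device.room].append(device.name)
    return dict(rooms)


def devices_supporting(devices, trait):
    """List the devices that could take part in a routine needing `trait`."""
    return [device.name for device in devices if trait in device.traits]


# Sample home, invented for illustration only.
HOME = [
    Device("ceiling light", "living room", frozenset({"OnOff", "Brightness"})),
    Device("lamp", "bedroom", frozenset({"OnOff"})),
    Device("thermostat", "living room", frozenset({"TemperatureSetting"})),
]

if __name__ == "__main__":
    print(devices_by_room(HOME))
    print(devices_supporting(HOME, "OnOff"))
```

With a model like this, a question such as "which devices could a 'lights off' routine cover?" becomes a simple query over traits rather than a per-device special case.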

Victoria: One thing that's interesting to consider around voice and conversational AI in the Smart Home specifically is these different interactions that happen. We've got the lights turning on and all of that, and voice is sort of one piece of the smart home, but it's just one facet. There are all these other facets that contribute to a user's concept of the smart home. We break those up into a few categories. There's safety and security: things like locks, monitoring, and security systems. There's also well-being, so exercise or sleep or something like that. And there's entertainment, which we're pretty familiar with, for music or TV. There are all these different categories with a set of devices that come underneath them. And voice is interesting because a lot of those devices, I think as Mike was mentioning, don't necessarily have a screen or UI. So you have to think about what's the appropriate way to interact with them, and whether it makes sense to have a UI on them or not. Then there's thinking through how voice can serve as a primary interface to connect the different aspects of the smart home and as the channel for the user to interact with them. But that doesn't just happen through voice. It also happens through the different lights or signals that users see from those devices. It also happens through TTS or tones that come back from them during or after an interaction. It can even be tactile: maybe they do push-to-talk, where they push a button and then they talk. Or, potentially, I think one thing in the future is awareness on the device's part that the user is there: you come in, and the device is aware that you've come into the room. And I think all of the technology, APIs, and tools, and also the user experiences that contribute to those pieces, are a really important part of the overall voice experience that we're creating.

How much of an opportunity is there for proactive assistance in the home, to be able to react to or predict the actions that the user wants or needs to take?

Victoria: Yeah, I mean, it's really interesting. We have a scale that goes reactive, proactive, predictive, where predictive means you know before the user even knows what they want. And you have to walk this fine line between what technology can actually do and what people are comfortable receiving from this technology. So you have the creepy factor, where people might be excited about a capability but very quickly think, "How did you do that? What data are you collecting on me?" So proactive is sort of a good middle ground, where you're still reacting to something that the user gave you, some data that they gave you, without being too far ahead of their curve, where it might make them uncomfortable. There are definitely a lot of things that you can do there. I mean, theoretically, you can put in radar and track everybody's location without them needing to carry a phone around. But it's really about getting feedback on what kind of user experience people are looking for, and designing based on that, more so than on the limits of the technology available.

How do you approach designing multimodal experiences? And how does that experience expand when you start introducing new devices into that relationship, when it goes beyond just the user and the Smart Home device to the user, their phone, and the Smart Home device?

Victoria: That's a really big question, so I'm sure many of us have something to contribute there. There are a lot of things to consider. I'll start from a small example. On a Sonos device, you have a teeny tiny light at the top. And on average, you can't see that light because your Sonos is, like, up here on a shelf. So when you ask one of the assistants a question, you actually don't get any visual feedback, because normally that light isn't visible to you. In many cases, the voice assistant companies designing for these devices assume that the light is clearly visible, because it is on most products. But you have complexities when you have this large platform that's becoming available for all sorts of different products and use cases. There's a set of rules guiding what kind of design should happen, and there are equally large edge cases that conflict with that ruleset, where people might really not know what to expect: whether they're being heard, or how to know that their interaction is successful. So I think being aware of all of those edge cases and use cases is a big part of it. But it's also understanding that we need to align on a set of guidelines for all of the many devices that we're bringing in to interact together. I think interoperability has been a big topic recently. So, is there a shared set of guidelines, so that users on all devices can expect this type of light or this type of UI to mean a listening state, or a thinking state, or something like that? A shared set of principles that people can know and understand to mean something, which reduces the cognitive load of interacting with these voice systems.

Michael: Yeah, in a way, you can't have a smart home without smart products. There needs to be some capability in the product itself, and the collection of those products creates a smart home. It'll be interesting to see over time what protocols and standards determine the way forward. As Victoria was talking about, and Toni as well: is there interoperability, and is there a new smart home standard, so that the same way there are safety standards and electronics standards, there would be interconnectivity standards? I think that's starting to happen, but we're a ways away from it. So instead, the proxy for that is the smart speaker hub, and everything connects to that. For example, we haven't announced it yet, but we're working with a smart TV that will be voice-enabled, and your interaction with that particular product is done through the remote. So the dynamic between smart products and the smart home is still emerging. And I think the brands that are being aggressive are the ones recognizing that they need to be innovating in that space. Because if they wait for everything to be sorted out, then everyone else will have taken the space and emerged as leaders, and they'll crowd the space. Another point I would make is that smart is more than just voice. Right now, if you speak to something and it responds, that seems pretty smart, because voice is a very complex thing. But as was mentioned before, there's also proximity detection and facial recognition. So if you have a privacy issue, you want to make sure that the device recognizes you and doesn't reveal information that's not about you, and then there's voice and biometric identification. And the last thing I would point out is that you then ask yourself what capabilities are on the device, on the edge, versus what is actually in the cloud to make all of that work. And that's purely from a cost, privacy, efficiency, and scalability standpoint.

How do you ensure that all of the actions and interactions available with these devices are known to users? How does that consideration get built in? How do you get feedback, qualitative and quantitative, to be able to evolve?

Michael: Sure, absolutely. One thing that we were actually talking about in one of our breakout rooms is that there's a world of possibilities for how people could use conversational smart home products, but quite often people find their use cases and they just really stick with them. So there's always this dilemma: do you want to introduce somebody to a product and have them use all the things that they can? Or is it more important that they just find one thing, and they love it, and they use it all the time? My argument is that we're at a stage where we want to see consumer adoption grow. And as a marketing person myself, I'm very happy if someone reliably uses a diverse product, even for just one thing, because then your relationship with the customer can result in you educating them more, and their confidence to try more things goes up. So if you just use your smart speaker for music, you may realize that it's great for timers, or for general knowledge, or for setting reminders. You just always have to start somewhere. As for feedback, it's a bit tricky to use the product itself to get it, because people tend to be transactional, for the most part. If the product conversationally starts to engage with you and says, "Can I get your feedback on this?" or "What do you think about that?", the user may have already moved on. So it's really about getting a feel for how open people are to giving feedback in real time. Do you ask for their permission? Are there other channels you can use, like email? Or, if there's a screen, are there areas of the screen reserved for people to opt in to provide feedback? So I would say the main thing is to do a few things really, really well. Identify the people who are loyal. Ask for the right permissions, if you can, either verbally, through a screen, or through, say, marketing channels. And start gathering that information in a way that ends up strengthening the user's journey through your product, so that they end up using it more and more.

Ilana: So something that you can look at, besides just asking users, is this: if you can identify their intent, what they're trying to do, did that result in a successful, completed action? For example, with music, if we know that somebody's asking us about music, did we play a song? And then how long did they listen to that song for? If it was five seconds and then they asked for another song, well, we probably got it wrong; that wasn't the thing they were looking for. So you can look at how people move from thing to thing and ask, what is their task completion rate? I'll also say that with smart homes, the cost of errors is higher. If somebody's setting a timer, how many times are they going to ask you to set a timer before just going to their physical timer and learning that, oh, I can just set my physical timer and I don't need to use that thing? So only releasing things once you can get them right is really important for retention. And then in terms of discoverability, I think a lot about feature adjacencies. When a user interacts with an object, they're building a schema in their mind of what this assistant can and cannot do, and so you want to build out a consistent schema. A timer is very closely related to an alarm, so it makes sense that if you can set a timer, you'd be able to set an alarm, and somebody is likely to experiment with those things, starting with timers and going into alarms. But timers and music are very different. So if somebody knows that they can set a timer, how likely are they to ask to play music? Well, I'm not sure; they might be more likely to ask about time zones than to ask about music. So building a cohesive product, thinking about your feature rollout in terms of feature adjacencies, and building towards a consistent schema are really important for discoverability, as well as for your relationship with customers.
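
The task completion idea Ilana describes can be sketched in a few lines. This is only an illustration under stated assumptions: the `Interaction` record, the intent names, and the five-second "quick abandon" threshold (borrowed from her music example) are hypothetical, and a real product would define success per intent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interaction:
    """Hypothetical log entry for one recognized user request."""
    intent: str
    fulfilled: bool                           # did the assistant complete an action?
    engaged_seconds: Optional[float] = None   # e.g. listening time; None if not applicable


# Assumed threshold: abandoning within five seconds suggests a wrong result.
QUICK_ABANDON_SECONDS = 5.0


def is_success(interaction, threshold=QUICK_ABANDON_SECONDS):
    """Count a request as successful only if it was fulfilled and not quickly abandoned."""
    if not interaction.fulfilled:
        return False
    if interaction.engaged_seconds is not None and interaction.engaged_seconds <= threshold:
        return False
    return True


def task_completion_rate(interactions, intent):
    """Share of requests for `intent` that count as successful, or None if there were none."""
    relevant = [i for i in interactions if i.intent == intent]
    if not relevant:
        return None
    return sum(is_success(i) for i in relevant) / len(relevant)


# Sample log, invented for illustration only.
SAMPLE_LOG = [
    Interaction("play_music", True, 180.0),
    Interaction("play_music", True, 5.0),   # skipped after five seconds
    Interaction("play_music", False),       # nothing was played
    Interaction("set_timer", True),
]

if __name__ == "__main__":
    print(task_completion_rate(SAMPLE_LOG, "play_music"))
    print(task_completion_rate(SAMPLE_LOG, "set_timer"))
```

Tracking the rate per intent, rather than overall, lines up with her point about feature adjacencies: a weak timer experience can be spotted before users generalize it to alarms and other nearby features.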

What is the process of defining a personality for an in-home assistant or for a product? What role is it going to be playing in the user's life?

Michael: I would say the main question is: what relationship do users want to have with the product, with their assistant? And how well does that assistant adapt to some of those cues? One of the things that we see, and I'm sure we'll get nods from the other panelists, is that people's first experience with voice ends up being conversational: "Hi, how are you? What's your name?" And it's because voice is a conduit to emotional connection much more than it is about simply telling something to do something. I mean, the transactional component is: I'm going to ask you to do this, or I'm going to ask for an answer. Those are really the two transactional things: do it, or answer my question. But humans use their voice for so much more. It's how we express ourselves, our emotions, our anger; it's how we are curious. So it's hard to deprogram oneself when you first interact with a product that you can talk to, because we humans have been wired to express ourselves. So there's always a little give and take. And that's ultimately what's sometimes so empowering about voice assistants, but it can also be discouraging, because we see the potential but it can't be fully realized. Some of the subtle ways I've seen it done involve gauging the length of responses. Sometimes you can just ask the user: if you don't want me to respond, I can just make a sound, because you're not interested in conversation, or I can give a longer response. Or there's the idea of having a small talk capability, so that if users ask for jokes, that's an indication that this is an area that needs to be developed. And then there's programming the personality to a point where, for a popular assistant, whether it's Alexa, Google, or Siri, those companies are going to invest in responses that start to develop a relationship and a personality that people start to get attached to. And I know those companies, in particular, are working hard. In the same way, any company that adds its own custom voice assistant can do the same, because your brand becomes your voice, literally. And I'm really excited to see the diversity of assistants across the board and how they evolve. But I'm happy to hand it off to some other folks to continue.

Victoria: One thing that we have noticed is that no matter what kind of personality you try to add, whether it's through the voice itself or the way it phrases its responses, there's also a sort of built-in personality to the concept of a voice assistant in general. That's an interesting idea, in that voice is a new technology; it's something that people are still getting used to, and it really doesn't perform perfectly very often. So there's some personality associated with just the voice assistant itself: you can think of it as sort of geeky, a little bit complex and fragmented, somewhat technical. For companies like Sonos, one thing that has been interesting is that we can do things to change the voice, or whatever, to try to build personality on top of this. But if you have a very simple and premium sort of personality that you want to associate with your brand, you then have to find a way to absorb all of the complexity and technical character that comes with the voice assistant itself, and find a way to reconcile those together. That's been an interesting aspect of this. And I think that's also shaped the fact that we really look at voice, and how we approach it, from use cases: what are the things the user wants to accomplish? Personality is a little bit secondary to that, in that maybe we haven't gotten to the point where the voice assistant itself is powerful enough to match all of the things that personality can also accomplish.

What are the next major milestones in the industry that you are looking forward to?

Toni: Yeah, so I'm personally looking at it from the Google standpoint. I'm really excited to see what we do in the industry, especially on projects like Project CHIP, which is Connected Home over IP, through the Zigbee Alliance. That's working towards building some of those standards for connected devices and how they connect, to make the end-user experience of setting up all these devices easier and more simplified, so that it's not so much of a struggle for the end user. I'm really excited to see where that goes. And also just what kinds of new technologies and solutions are developed with things like edge computing, especially as TPU modules become smaller, more accessible, and more affordable for manufacturers. For me, personally, what I'm really excited to see is a lot of the accessibility solutions that are going to come out of this. I think we've just started seeing that, especially with COVID kind of forcing some of these issues, with everybody stuck at home. We're all running into a lot of the same issues and frustrations. So how do we change that and improve that, including for areas where folks need help? I know in our breakout session we were talking about emergency phone calls. If somebody's elderly relative needs help, they can use these voice assistants to get that help right away. And I think that's a whole big area that can be expanded upon.

Victoria: So I'll just throw out two. One that I'm excited for is users beginning to understand some of the implications of local versus cloud processing: where the strengths of each lie, the reasons why you might choose one or the other, and the privacy implications of that. I think this is going to be especially interesting as we have more and more devices that need to interact with each other. You want to make sure that they're all communicating with each other, and it might seem obvious to do that in the cloud. At the same time, you really want to think about privacy: how do you allow these devices to have only the appropriate information, and have anything that can happen locally happen locally? I've been surprised to learn that consumers are actually beginning to understand those concepts around local processing; more and more people are gaining an education in this space. So yeah, I'm excited for users to become more familiar with the technology underneath. The second one is that I'm excited for the upcoming world where, instead of a few main assistants, we have hundreds of different assistants, one for every company, and we need to figure out how to make those all really work together. I think that ties back to those shared-standards concepts, but also to a lot of concepts around interfaces, data sharing, and processing: all of the different things that you need to figure out to accommodate many, many companies now gaining the ability to develop their own assistants, and how to make that consumable by users.

Michael: I'd definitely reinforce a few points: edge, privacy, and the ability to very confidently provide users with the assurances they need will be a milestone as well, along with embedded voice computing. But the diversity angle is one that I really feel strongly about. At SoundHound we provide a custom voice assistant platform, so we're really founded on the principle that people want choice and variety. We have a vision of the world that says if you're surrounded by robots and devices and various forms of interaction, they're not all going to have the same name. We embrace the range of personalities and capabilities that our world offers, and voice assistants will be the same way. So to me, the next milestone, to keep it concrete, is when a user or user group says, "I really like my Mercedes because of the voice experience I have in it," right? And of course we power a lot of that. Or, "When I listen to Pandora, I love talking to Pandora because it understands me so well." And they understand that some of the larger assistants provided by Google, Amazon, Apple, and increasingly Facebook are really good at extending their services into the right environments, but not necessarily in a way that compromises their strong branded relationships with their products. These integrations are already happening, but as awareness of them becomes more mainstream, that diversity and the thousand assistants that Victoria talked about start to become real. It takes a while to get into the collective consciousness, but we see it coming and we're excited to contribute.

Ilana: So right now we're really good, or fairly good, at understanding individual needs and individual interactions. But the home is a space for multiple people, and the people in the home fluctuate over time. Kids grow up, maybe elderly parents move in, you have guests over, the guests leave, somebody is home alone. Adapting to all those fluctuations in the home space, dealing with multiple users, and handling all those relationship changes is something we could get better at in the near term, along with understanding people code-switching and which languages they're using. In the far term, I'm excited to see how smart home technologies affect architecture. It used to be that homes were just dwelling places. Then in the 60s, they became part of a domestic landscape, a place where you could show your individual expression. That's where you see the funky chairs with all the weird designs coming out and the experimentation in design. When we look to the future and think about AR/VR and about how communities are fragmented, might there be a room that's completely empty, for a completely virtual space where you can get together and be with your friends who are normally far apart? That's what I'm looking forward to, in short.

What are the benefits and drawbacks of local versus cloud processing?

Victoria: Great question. Yeah, I'll do a short and simple version of it. Essentially, right now, when you talk to any of the larger assistants, typically you have an intent that you provide to the system. It's picked up by the device, and then this intent is sent to the cloud, so they know that this user had this particular query. It's processed in the cloud and the response is sent back. So you've shared this piece of information so that all of the processing can happen in the model, which is hosted in the cloud. As you do that, you typically share data about your interests, what you want to do, and your own questions with third parties. The benefit is that less processing needs to happen on the device, and you improve the strength of the cloud model, the processing model that supports providing the correct response. In contrast, if you have a local version of this, the audio goes into a model that is stored directly on the device in your house, and all the processing happens there. Maybe it can then make a cloud call, for example directly to start the music, but it doesn't make a cloud call that sends your intent or your audio data anywhere. That matters especially when something isn't clear and they need to listen to the actual voice clip. That can cause issues, because now you have data directly from your home, maybe captured through an unintentional initiation, shared out again to third parties. Some of you may have heard of a company called Snips. They do local voice processing. They're a well-known startup in Europe that has been acquired by Sonos, and they developed a model for local processing that can address these queries as effectively as a cloud-based model. So we're pushing the case and proving that you don't have to have the whole model live in the cloud. You can actually accomplish a lot of these things effectively on a device without sharing data with third parties, the cloud, or anywhere outside the boundaries of your own home. I think that's a very exciting new space that we're moving into.
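
As a rough illustration of the two flows described above, the following Python sketch contrasts what leaves the home in each case. Everything in it is hypothetical: the `Network` recorder, the `toy_intent_model` keyword matcher, and the `assistant_cloud` and `music_service` destination names are assumptions made for this example, not the actual API of Sonos, Snips, or any other vendor.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Hypothetical stand-in for real network calls: records what leaves the home."""
    sent: list = field(default_factory=list)

    def send(self, destination: str, payload: dict) -> None:
        self.sent.append((destination, payload))


def toy_intent_model(utterance: str) -> str:
    """Toy keyword matcher standing in for a real speech/intent model."""
    text = utterance.lower()
    if "play" in text:
        return "play_music"
    if "light" in text:
        return "toggle_lights"
    return "unknown"


def handle_cloud(utterance: str, network: Network) -> str:
    # Cloud flow: the device forwards the user's query, and the model runs remotely.
    network.send("assistant_cloud", {"query": utterance})
    return toy_intent_model(utterance)  # imagine this call executing in the cloud


def handle_local(utterance: str, network: Network) -> str:
    # Local flow: the model runs on the device, so the query itself stays home.
    intent = toy_intent_model(utterance)
    if intent == "play_music":
        # Only the action needed to fulfil the request goes out.
        network.send("music_service", {"action": "start_playback"})
    return intent


cloud_net, local_net = Network(), Network()
cloud_intent = handle_cloud("Play some jazz", cloud_net)
local_intent = handle_local("Play some jazz", local_net)
```

Both handlers resolve the same intent, but `cloud_net.sent` contains the full query text while `local_net.sent` contains only the playback action. A real system would also have to weigh the on-device compute cost that the cloud flow avoids.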

I wondered if any of you are seeing a change in the way people are using smart home products and virtual assistants. Are people exploring new uses or decking out their homes? Do you think the new normal of remote working and remote education for children is going to inspire changes and innovations? And if so, what areas do you think are going to see those impacts?

Ilana: Video calling on Portal has been really important for people during this time. I can say anecdotally that for my grandma, who's been isolated, it has given her a way to call her family and connect with them over video. The camera follows her motion, so she doesn't have to worry about zooming in and out or using the screen in that way. I think some of these ways of connecting that we didn't think about in the past will endure past this time.

Victoria: I will pile on to that. From the audio and home environment point of view, we have definitely seen a large increase in people playing music, spending time making their home environment perfect, and initiating voice queries. One thing that's been funny to see is that we've also seen a huge spike in customer care requests to resolve various issues that people just haven't had time to deal with, like figuring out how to do something or why there's been an issue. They're like, "Oh, I have months now to figure out all of these problems." So definitely, I think people are taking the time to explore and understand new features and use cases as they relate to improving their comfort and their sense of, "This is really my home, this is the space that belongs to me, and I want to make it as good as possible." And I think voice plays a role in their ability to control that.

What are your thoughts on the future of protocols and privacy and collecting data? What are your thoughts on open-source data and voice especially? Do you think that there is room for growth in open source? And what does that look like?

Toni: Yeah, I was going to say, personally speaking, I think there's absolutely room for growth in the open-source community, because I do think that helps push on privacy standards and protocols, as people can see what data is in these data sets and how it is being used in edge computing and similar applications. Just from my own personal research, I've found more and more open data sets for those kinds of things over the last couple of years, and I think those will continue to proliferate, which is great. I think it's also something we just need to be cognizant about, maintaining openness about the policies and how the data is being used. That's something that, as consumers, we all need to push the big companies on, to make sure that they are explaining how this is being used, and that issues of implicit bias being built into these models are also being addressed. So that's more of my personal take on this.

Curated Resources By Voice Tech Global

Alex Capecelatro - the state of the union on voice control for the smart home

This old video (2018) outlines some of the key concepts that were discussed during our panel and also has some great insights on how josh.ai wants to define the premium experience for Smart Home.

Google - Local SDK

In this talk, Google explains the key components for building on the local network with its Local SDK.

