Fusion Smart Answers Integration with DialogFlow

Presented at virtual Activate 2020. Conversational user interfaces are growing in popularity, but the limitations of chatbots can leave users frustrated. By joining the capabilities of Google DialogFlow and Lucidworks Fusion’s Smart Answers, users can get a best-in-class conversational solution using AI-powered speech-to-text, powerful dialog flow design and AI-powered semantic search. In this presentation we will present how Smart Answers can be applied towards an impactful self-service experience powered by DialogFlow on GCP.

Speakers:
Steven Mierop, Sales Engineer, Lucidworks
Amit Kumar, Solution Consultant, Cloud AI, Google

Intended Audience:
Search Developers, Data Scientists, Integration engineers

Attendee Takeaway:

Integration is very easy when platforms are API-first
Documentation is straightforward and easy to follow for developers
Combining the two best-in-class platforms results in a powerful dialog front end and a scalable AI-based information retrieval back end.

Transcript:

Steven: Hi and welcome to Fusion Smart Answers integration with Dialogflow. My name is Steven Mierop. I’m a solution engineer here at Lucidworks, and I’m joined today by Amit Kumar, a technical solution consultant in the cloud AI practice at Google.

Today we’re gonna be talking about conversational AI trends that we see in the industry, semantic search, and how they relate to chatbots. We’ll do a quick demo of Smart Answers and then Amit will give us an overview of Dialogflow.

What are we seeing in the industry today? This was a report released from Gartner just a couple of months back showing that there’s been a 10X increase in the number of new chat conversations online. And I think in a lot of ways unsurprisingly so, we’ve been riding this digital transformation wave for a number of years now.

But an accelerant to all of this, of course, has been COVID-19, employees working remotely, shoppers trying to find answers to their questions in isolation. All of this put a lot of increased demand on customer support teams.

This was another report released in February from Gartner showing that the main use cases have been centered around customer support, call center agents, virtual assistants and employee portals.

Something else that we’re noticing is that, in addition to the use cases being related to cost savings, we have a lot of customers and prospects wanting a chatbot or an intelligent virtual assistant to serve as a revenue driver as well.

I think the bar is beginning to get set pretty high.

What were some of the limitations traditionally with chatbots? One of which was that they couldn’t retain the context of a conversation. And that was a problem because after a couple of questions, you’d quickly realize that probably it wasn’t gonna help you in the way you hoped and you’d abandon it entirely.

Part of the problem and the reason for that was because the backend intelligence was just static, rules-driven workflows that someone had to maintain. And it kind of became a maintenance nightmare to keep one of these running and keep it relevant in terms of the line of incoming questions.

In the past few years, we’ve definitely seen a massive improvement in this, particularly around natural language understanding. I’m even seeing a lot of these smaller boutique chatbot companies and frameworks sprout up that are allowing you to train a model, dockerize that solution, upload it to the cloud and then horizontally scale that out.

Now I think that’s absolutely a step in the right direction. But when it comes to having a usable chatbot that is easy to maintain and that can learn on its own over time, you definitely need more pieces and more components helping you here. And this is where Fusion can come into play.

Let’s build up to Smart Answers and we’ll look at some of the benefits of using Fusion’s information retrieval as kind of a foundational piece to this. So, when we think about answers to questions, answers can live in a lot of different places and come from a lot of different locations. It might be more than just a set of curated FAQs that your company has.

Questions and answers can also live in community forums. They could be in word files or PDFs or blog posts, for example.

Connecting to all of these different data sources and then ingesting that into a search index is an important first step.

In addition to the intelligence that Smart Answers applies and brings, we also need other types of natural language processing capabilities like entity extraction.

As a conversation is happening with a chatbot, we can identify what are some of the key concepts being spoken about. Cause we might wanna take some type of action if we recognize a known entity. We also need the ability to track signals. If a user is interacting with a chatbot and maybe they don’t get an answer that’s quite what they were looking for, they might wanna send some type of a signal to Fusion that says, give me the next answer or let’s try again. And what we can do is we could track that information and that particular type of behavior and then ultimately feed that back into Smart Answers, so that it becomes smarter and smarter over time.

All of these different layers and factors are what we use as a foundation to build Smart Answers.

What exactly is Smart Answers? Smart Answers is an intelligent Q and A system. It is a simplified way to use deep learning models to understand the context and semantic meaning of users’ questions.

Once we have that, we could then find the most appropriate answer in the backend of Fusion. And the benefits of doing this can help with call center deflection. It could help make your call center agents more effective and even help in e-commerce settings as well.

How does it work at a high level? Well, we have a couple of different ways that we can train a model. We could do supervised, unsupervised and we even ship with a generally trained model that’s pre-trained that you could use fast. As soon as you ingest your data, just use that model, create your encodings and then you’re essentially done and then we just tune it from there.

Once that model is created, it’s automatically uploaded to our machine learning service which makes it readily available inside of your index and query pipelines. Once we have that model referenceable in a stage, for example, any incoming documents or Q and A pairs will be encoded into deep vector representations.

Similarly, during query time, we’ll add that same model to the query pipeline and that incoming question or query from the user is also encoded into a deep vector representation. From there, we could find the similarity between that user’s question and the most appropriate answer inside of our search index.
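The encode-then-compare step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-dimensional vectors are hand-written stand-ins for what a trained model would produce, the names `cosine_similarity`, `best_answer` and `INDEX` are hypothetical, and cosine similarity is assumed as the measure because the talk does not say which one Smart Answers uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_answer(query_vector, indexed_answers):
    """Return the (answer_text, score) pair whose vector is most similar to the query."""
    scored = [
        (text, cosine_similarity(query_vector, vector))
        for text, vector in indexed_answers
    ]
    return max(scored, key=lambda pair: pair[1])


# Index time: each answer is stored next to its (toy, hand-written) encoding.
INDEX = [
    ("The service retries the upload after 60 minutes.", [0.9, 0.1, 0.0]),
    ("Schedule multiple child clients for simultaneous backup.", [0.1, 0.9, 0.2]),
]

# Query time: the user's question is encoded with the same model, then compared.
question_vector = [0.8, 0.2, 0.1]  # stand-in for the encoded question
answer, score = best_answer(question_vector, INDEX)
```

Because the comparison happens between vectors rather than between words, a differently worded question can still land closest to the same answer, provided the model encodes the two wordings near each other.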

Now, where does Google Dialogflow fit into this? Well, Smart Answers is conversational middleware. We make it easy to use and train a model without having to be a data scientist, although if you are a data scientist, we also have a lot of different configuration settings for you to fine-tune this process. But ultimately we don’t ship with a chatbot user interface or a virtual assistant.

For that we rely on a chatbot platform or a framework that has a friendly set of APIs that we can connect to.

We found Google Dialogflow to be easy to use. It came with text-to-speech and speech-to-text translation that was extremely accurate. It also came with a prebuilt chatbot that we could leverage a webhook for. As users ask questions into our chatbot, we just intercept that question, let Google translate the speech into text, send that over to Fusion where Smart Answers will find an answer and then we send that JSON payload back to Google Dialogflow for the answer to be spoken back to the user.

What’s great about it is we could use a chatbot interface, we could use a cellular phone for example, and create a virtual assistant type of experience. Also, if you’re using managed Fusion that lives in Google Cloud Platform, well, of course that’s where Google Dialogflow is hosted as well.

I have one place to go to, to evaluate logs and see exactly what’s going on if I have to troubleshoot any particular scenarios or just to help me when I’m developing this.

Let’s take a look at it and see what it looks like in action. I’m gonna switch over to my cell phone, which I’ll be screencasting live here, and I’m gonna talk to a fictitious company called Databack. I’m just gonna start asking some technical questions and hope I find the answer.

Hey Google? (Google beeping) Talk to Databack.

[Google voice] Okay, let’s get the test version of Databack. (Google beeping) How may I help you? (Google beeping)

Steven: What happens if the private analytics upload fails?

[Google voice] If the data upload fails, then the service will try to upload again after 60 minutes. Upload failures are reported in the dbd.logfile. (Google beeping)

Steven: Google Dialogflow translated my voice into text. We sent that over to Fusion and Fusion gave us an answer back and all of this, the user interface for this, we didn’t have to develop at all. We just had to write the webhook logic, which was about 100 lines of Node.js.

Let’s ask this question in a different way and hope we get the same answer.

(Google beeping) The upload isn’t working for the analytic server.

[Google voice] If the data upload fails, then the service will try to upload again after 60 minutes. (Google beeping)

Steven: We still get the same answer back, which is what we would expect. Let’s try it one more time and we’ll change up the wording yet again.

(Google beeping) What should I do if uploading to the analytic server is unsuccessful?

[Google voice] If the data upload fails, then the service will try to upload again after 60 minutes. (Google beeping)

Steven: What’s cool about this is I can ask that question in a variety of different ways, maybe even using words that are not in my search index at all. I can still get the right answer back. That saves us a lot of time. We don’t have to maintain a big list of synonyms or rules or anything of the sort. We just let the semantic understanding do basically the hard work for us.

Another thing we could do is we can create some self-tuning behavior just using the components built inside of the Fusion platform.

Let’s ask a question and we’ll see how we feel about it.

(Google beeping) How do I speed up backups?

[Google voice] You can backup an entire instance more quickly by scheduling multiple child clients for simultaneous backup. (Google beeping)

Steven: Now that was a perfectly acceptable answer, but let’s say that it wasn’t exactly what I was looking for. Maybe I want something a little different. I could send a request to Fusion to give me a new answer.

(Google beeping) Try again.

[Google voice] If you are experiencing performance issues during backup, you can troubleshoot them by enabling logging of performance details in the log files. Was that better? (Google beeping)

Steven: Yes, thank you.

[Google voice] I’ll let Fusion know you liked this answer the best. (Google beeping)

Steven: How can I speed up backups?

[Google voice] If you are experiencing performance issues during backup, you can troubleshoot them by enabling logging of performance details in the log files. (Google beeping)

Steven: You could see that we could send some positive feedback to Fusion and then Smart Answers can take that into account and then send me the most recent answer back that we indicated was most appropriate. That was just a quick example of what you could do with Google Dialogflow and how it integrates with Smart Answers.
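The "try again" and "yes, thank you" exchange in the demo can be pictured with a small Python sketch. It is not how Fusion stores or applies signals; the class `FeedbackSession` and its methods are hypothetical, and it matches repeat questions by exact text, whereas the demo matched a reworded question semantically.

```python
class FeedbackSession:
    """Toy model of the demo's feedback loop: ask, try again, confirm."""

    def __init__(self, ranked_answers_for):
        # ranked_answers_for: callable taking a question and returning
        # answers ordered best-first (a stand-in for the search backend).
        self._ranked_answers_for = ranked_answers_for
        self._preferred = {}   # normalized question -> answer the user confirmed
        self._question = None
        self._candidates = []
        self._position = 0

    def ask(self, question):
        key = question.strip().lower()
        candidates = list(self._ranked_answers_for(question))
        liked = self._preferred.get(key)
        if liked in candidates:
            # A previously confirmed answer is moved to the front.
            candidates.remove(liked)
            candidates.insert(0, liked)
        self._question, self._candidates, self._position = key, candidates, 0
        return candidates[0] if candidates else None

    def try_again(self):
        # Step to the next-ranked answer, staying on the last one if none remain.
        if self._position + 1 < len(self._candidates):
            self._position += 1
        return self._candidates[self._position] if self._candidates else None

    def confirm(self):
        # Record a positive signal for the answer currently on offer.
        if self._candidates:
            self._preferred[self._question] = self._candidates[self._position]


def ranked(question):
    # Hard-coded ranking used only for this illustration.
    return ["Schedule multiple child clients.", "Enable performance logging."]


session = FeedbackSession(ranked)
first = session.ask("How do I speed up backups?")
second = session.try_again()
session.confirm()
repeat = session.ask("How do I speed up backups?")
```

After the confirmation, `repeat` is the answer the user liked rather than the one originally ranked first, which mirrors what happened in the demo.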

How does this actually look behind the scenes? How does it work? Well, Fusion needs a friendly chatbot framework to work with. That framework of course was Google Dialogflow in this case and Fusion Smart Answers sits right in the middle of that.

As utterances come into the platform, in the form of questions, what we’re doing is we’re essentially creating one intent, one primary intent, and it’s actually the fallback intent. Everything gets routed to a single endpoint, which is ultimately controlled by a webhook, which is just some Node.js code that picks that text, sends it over to Fusion, Smart Answers finds an answer and then sends it right back through that pipeline. Then that answer was translated into speech which was spoken back to that particular user.
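A minimal sketch of that webhook logic follows. The demo's webhook was written in Node.js; Python is used here only for illustration. The `queryResult.queryText` request field and the `fulfillmentText` response field follow the Dialogflow ES webhook format, while `handle_webhook`, `fake_fusion_search` and the canned answer are hypothetical, and the real call to Fusion is replaced by an injected function so the sketch runs without a network.

```python
import json


def handle_webhook(request_body, find_answer):
    """Build a webhook response for one incoming Dialogflow request.

    request_body: the parsed JSON body that Dialogflow posts to the webhook.
    find_answer:  callable taking the user's text and returning an answer
                  string or None; it stands in for the query sent to Fusion.
    """
    query_text = request_body.get("queryResult", {}).get("queryText", "")
    answer = find_answer(query_text) if query_text else None
    if not answer:
        answer = "Sorry, I could not find an answer to that."
    # Dialogflow turns this text into speech (or chat text) for the user.
    return {"fulfillmentText": answer}


def fake_fusion_search(text):
    # Hypothetical stand-in for the search backend, using a keyword check
    # instead of semantic matching.
    if "upload" in text.lower():
        return "If the data upload fails, the service will try again after 60 minutes."
    return None


request = {"queryResult": {"queryText": "What happens if the upload fails?"}}
response = handle_webhook(request, fake_fusion_search)
response_json = json.dumps(response)
```

Passing the search function in as an argument keeps the request and response handling separate from the backend call, so the same handler can be exercised with a stub during development.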

Simple, but powerful in what you can do.

With that, I’m gonna hand it over to Amit and he can take it from there. Amit.

Amit: Thank you, Steven. I’m gonna talk a little bit about Dialogflow in general, what sort of different capabilities Dialogflow provides and what sort of architectural components that you can integrate and build sophisticated, conversational experiences for your enterprises.

At a very high level, Dialogflow is a cool conversational platform that Google provides. It is using the same set of services that some of the other Google products are using. For example, speech-to-text, text-to-speech, and a knowledge service for supporting knowledge bases.

You can upload a lot of your FAQs and articles and use those as a part of Dialogflow answers. Many of the NLP capabilities for natural language processing, as well as for sentiments. Then obviously speech conversion from text-to-speech, as well as from speech-to-text.

Dialogflow is the integrated conversational platform that you can use for your enterprise.

Next slide please, Steve?

At a high level, why exactly would you wanna utilize Dialogflow? So, Dialogflow gives you this integrated IDE development platform. It gives you a lot of the prebuilt agents where you could use those as a template.

For example, if you are trying to build a FAQ bot, or if you are trying to build a bot in a financial domain or any specific vertical, it gives you a lot of those prebuilt templates that you can import and start to configure for your own specific use cases.

Additionally, a lot of the training tasks, a lot of the training phrases and NLP related tasks are pretty simple. You do not need to learn any AI specific technology. This is all click and… click configuration type of tool where you just provide the training phrases and create and click configurations and just deploy your chatbots.

The development experience is fast, your go-to-market is fast. Similarly, you can engage, provide a similar sort of omni-channel experience for different channels, whether it’s a chat or it’s voice, you don’t need to change or you don’t need to build a different bot for that. It’s the same bot that you can enable in different channels. It gives you a lot of flexibility from an integration standpoint.

If you have some of your existing backend applications, for example, Salesforce or ServiceNow, you can pretty easily integrate those using fulfillment. That’s one of the integrations that we have done with Lucidworks here in the demo that Steven just showed.

Then again, the training and analytics is again, available across the platform. It’s a one single integrated platform that you are not doing differently for different channels.

From a reach standpoint, it supports a number of different languages. There are different platforms and SDKs available pretty much for any language you can think of that you can use and build your client applications. Then again, you just build it once and then continue to deploy in your different frameworks. It’s providing you a pretty sophisticated go-to-market, faster go-to-market and providing you the platform for building your conversational design and experiences.

This is at a very high level, the architecture, the way it looks. Dialogflow gives you a number of capabilities, a number of one-click integrations for different channels.

For example, Facebook, Apple Business Chat, Google Assistant, these are all one-click integrations that are available.

Similarly for voice, Dialogflow is integrated with a number of telephony partners like Genesys and Avaya.

Using those prebuilt integrations, you can bring your voice and chats into Dialogflow. That is where you are applying your NLP and a number of other artificial intelligence related capabilities. From there on, using fulfillment, you can integrate this with your backend applications.

For example, you may wanna provide an experience where customers wanna come in and they wanna know their order status. You can build that conversational experience with the intent and you can ask customers to provide the order number, and you would be able to integrate with your backend order management system and provide that status back to the customer.
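The order status example could look roughly like the following Python sketch. The parameter name `order_number`, the function `order_status_fulfillment` and the in-memory `ORDERS` table are assumptions made for illustration; a real fulfillment would call the order management system instead.

```python
# Stand-in for a backend order management system.
ORDERS = {"A1001": "shipped", "A1002": "processing"}


def order_status_fulfillment(request_body, lookup_status):
    """Answer an order status request from the parameters Dialogflow extracted.

    lookup_status: callable taking an order number and returning a status
                   string, or None when the order is unknown.
    """
    parameters = request_body.get("queryResult", {}).get("parameters", {})
    order_number = str(parameters.get("order_number", "")).strip()
    if not order_number:
        # No order number captured yet, so ask the customer for it.
        return {"fulfillmentText": "What is your order number?"}
    status = lookup_status(order_number)
    if status is None:
        return {"fulfillmentText": f"I could not find order {order_number}."}
    return {"fulfillmentText": f"Order {order_number} is {status}."}


found = order_status_fulfillment(
    {"queryResult": {"parameters": {"order_number": "A1001"}}}, ORDERS.get
)
missing = order_status_fulfillment({"queryResult": {"parameters": {}}}, ORDERS.get)
```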

You can build a lot of self-service types of experiences with Dialogflow pretty easily.

This is what we were just talking about earlier.

This is the webhook integration that we have done with Smart Answers. In general, we have defined an intent. Intent is the action that the user wants to do. As the user is speaking with the system or they’re providing the transcript, we are detecting the action that user wants to take.

Based on that action, we can specifically invoke certain webhooks and that webhook can do a number of things. It could integrate with some of the other cloud services, or it can integrate with some of your backend applications. In this case, we are using a webhook to integrate with Smart Answers and then provide this overall conversational experience.

Here we just wanted to highlight a few architectural components. As we discussed earlier, Dialogflow is the omni-channel platform. You would just build your agent once and you would enable that for different channels. You are not building different bots for voice and chat and social channels. It is intelligent enough to detect the channel and then, you know, continue to work on the same logic and the configuration that you have built.

It uses the similar AI capabilities, AI services that other Google products are using. It’s using the same speech-to-text, text-to-speech and NLP capabilities. It does have, you know, a very sophisticated, optimized, enhanced model so different sorts of experiences, different sorts of channels can be enabled through Dialogflow.

As we discussed earlier, in general Dialogflow is very well integrated with other GCP, Google Cloud products.

It simplifies the overall architecture. You are not integrating on your (indistinct) some of the other AI services. It does all that work behind the scenes for you. You are just simply invoking, integrating with Dialogflow, and then the rest, it’s gonna take care of for you. It does give you a lot of the enterprise readiness capabilities out of the box.

For example, all the traffic, all the interaction, in transit as well as at rest, is encrypted by default. You do not have to pay anything additional for that. It has pretty sophisticated monitoring and logging capabilities. You can see end-to-end transactions, end-to-end conversations. It logs all the interactions so you can go and see all of your chat histories and the conversations that agents and customers are having. And it does give you pretty sophisticated analytics capabilities as well.

It gives you all the real-time conversational analytics, why customers are calling, what sort of case categorizations are there. You can do a lot of work on that transcript and the data that is available to you.

Again, it is highly performant because it uses optimized hardware.

For example, the services are using TPUs behind the scenes. It does support gRPC for real-time streaming as customers are having interactions with the telephony partner. It’s highly performant and gives you low-latency integration options.

With that, I wanna take some time and give you a little demo of Dialogflow screens. I will take control here with this.

This is the Dialogflow IDE. It gives you all the different options for configuring and building your conversational experiences. In the Dialogflow terminology we call this as an agent, a Dialogflow agent, and then on the left hand side it gives you a lot of different options for configurations.

Intent is the utterance and what exactly the user is trying to do. As users are speaking and the voice is being streamed, Dialogflow is gonna look at the stream and then extract the intent or the actions that this user wants to take.

For example, in this case, I have an intent called hotel book. Essentially as the customer is speaking, they wanna book a hotel. You can click on an intent and you can provide a lot of different training phrases here, the different ways customers can say it when they are trying to book travel or book a hotel. You don’t have to do an exact match because it uses a lot of the NLP capabilities.

So, with that, based on just a few training phrases, it would be able to extract, you know, a specific intent.

Then from the intent, it has the ability to extract parameters. It does that automatically for you. As customers are saying, you know, I wanna book a hotel room for two people for tomorrow, for example. From this particular transcript, Dialogflow would be able to extract the number of people that you wanna book a hotel for, when exactly you wanna book it for and then, for example, locations and a few other things.

It does that automatically for you without doing any specific coding. Then you can set different responses, what exactly you wanna send back to the customer. As a part of this response, you can also configure a fulfillment action that we were just talking about. Essentially you are trying to understand what exactly the user wants to do.

You wanna extract those parameters and then you wanna invoke some sort of backend service, which will actually do the reservation booking for you.
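To make the intent, parameters and fulfillment chain concrete, here is a hedged Python sketch of a webhook that routes by intent name and reads the extracted parameters. The `queryResult.intent.displayName` and `queryResult.parameters` fields follow the Dialogflow ES webhook request format; the intent name `hotel.book`, the parameter names `guests` and `date`, and the helper names are hypothetical, and the date is assumed to arrive as an ISO-8601 string.

```python
from datetime import datetime


def book_hotel(parameters):
    """Turn extracted parameters into a booking reply (no real reservation is made)."""
    # Parameters that were not captured can arrive as empty strings, so fall back to 1.
    guests = int(parameters.get("guests") or 1)
    try:
        day = datetime.fromisoformat(parameters.get("date", "")).date().isoformat()
    except (TypeError, ValueError):
        return "Which date would you like to book?"
    return f"Booking a room for {guests} on {day}."


# One handler per intent; further intents would be registered here.
HANDLERS = {"hotel.book": book_hotel}


def dispatch(request_body):
    """Pick the handler for the detected intent and wrap its reply for Dialogflow."""
    query_result = request_body.get("queryResult", {})
    intent_name = query_result.get("intent", {}).get("displayName", "")
    handler = HANDLERS.get(intent_name)
    if handler is None:
        return {"fulfillmentText": "Sorry, I cannot help with that yet."}
    return {"fulfillmentText": handler(query_result.get("parameters", {}))}


booking_request = {
    "queryResult": {
        "intent": {"displayName": "hotel.book"},
        "parameters": {"guests": 2.0, "date": "2020-10-15T12:00:00-07:00"},
    }
}
booking_response = dispatch(booking_request)
```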

Entities are the variables that you define. Dialogflow comes with a lot of the system defined entities.

For example, date, time, color, city, code, zip code. A lot of those things are prebuilt, so you don’t have to define them.

As soon as you provide the training phrases, Dialogflow will automatically understand those. Then it will extract them and start using them for you. It does give you the ability to create your own specific custom entities.

In this case, for example, you know, we have created one for the star, what is the star rating for the hotel room that you wanna book. You define those values and the different synonyms that the customer can use for those values.
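A custom entity of that kind is essentially a set of reference values, each with a list of synonyms. The Python sketch below shows the idea with an exact, case-insensitive lookup; the entity contents and the function names are invented for illustration, and Dialogflow's own matching is more forgiving than a plain dictionary lookup.

```python
# Hypothetical star-rating entity: reference value -> synonyms a customer might say.
STAR_RATING_ENTITY = {
    "3-star": ["three star", "3 star", "mid-range"],
    "4-star": ["four star", "4 star", "upscale"],
    "5-star": ["five star", "5 star", "luxury"],
}


def build_synonym_lookup(entity):
    """Flatten the entity into a lowercase phrase -> reference value mapping."""
    lookup = {}
    for reference_value, synonyms in entity.items():
        lookup[reference_value.lower()] = reference_value
        for synonym in synonyms:
            lookup[synonym.lower()] = reference_value
    return lookup


def resolve_entity(phrase, lookup):
    """Return the reference value for a phrase, or None when nothing matches."""
    return lookup.get(phrase.strip().lower())


lookup = build_synonym_lookup(STAR_RATING_ENTITY)
rating = resolve_entity("Five Star", lookup)
```

Whatever wording the customer uses, the fulfillment code then only has to deal with the reference value.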

Fulfillment is another option here. We kinda touched on that briefly earlier. Fulfillment is an option for integrating Dialogflow with some of your backend applications. Here you have a number of different options. You could invoke your existing services that you may be running somewhere on-prem or on any other cloud provider. You could also build your serverless options here using Google Cloud Functions and deploy there.

You have a number of different options to build and deploy your backend service. Integrations are one-click integrations for some of the social channels as well as the telephony partners.

Then the next area here is giving you options for training. It gives you a pretty sophisticated out of the box training tool, where you would be able to upload a data training file, where you have listed training data for each of your intents that you have defined. You would be able to use that for training rather than going in each and every intent and typing it there, you can use this option where you could upload that training file.

Validation history is another one of the cool features, where you would be able to go and see the historical information.

As the customers are having interactions and conversations with the bot, you can go back and look at all that history information. It also gives you this tool here where you could test your chatbot or voice bot and see how it works and what exactly it’s doing.

So, for example, you know, I just said, book a hotel, and it gives you information about what the intent is, booking, and then you would be able to see, you know, some of the other additional parameters and what exactly the context is that it’s capturing from that particular customer transcript.

A lot of out of the box capability here, it’s a pretty well integrated IDE platform for building your enterprise conversational flow.

I think that’s all we wanted to cover today. Thank you very much for taking the time.