Getting a word in! Integrating Oracle Digital Assistant with Alexa

Hi Everyone,

With the rapid rise in the uptake of virtual assistants, we wanted to explore integrating Oracle Digital Assistant (ODA) with Alexa, so we can expand the range of platforms we offer our customers. The good news: with the latest release of Oracle's Node SDK for ODA and Amazon's alexa-app SDK, this can now be done very easily.

The integration is achieved by setting up a web server between Alexa and ODA that acts as middleware.

Let's jump right in and take a look at how it works.

Architecture Overview:

We need three elements for Alexa integration:

  1. An Amazon Alexa Skill.

  2. A web server application.

  3. Oracle Digital Assistant.

Procedure:

1. Creating an Amazon Alexa Skill:

We won't go into detail about this part. Please refer to
Amazon's official documentation for information about developing skills.

Note: The endpoint in the Alexa skill must have the suffix '/alexa/app' according to the sample code. This can be modified in the endpoint registered at line 345 of service.js:

app.locals.endpoints.push({ name: 'alexa', method: 'POST', endpoint: '/alexa/app' });

After the custom skill is created, we need to create a new intent and a slot type to pass control to the bot. Please follow the steps below to configure the intent and the slot type.

  • Click on Intents, then click the +Add button to create a custom intent. Enter CommandBot as the intent name and click the Create Custom Intent button.

  • Now, let's create a slot type by clicking Slot Types, then +Add. Name the slot type MyCustomSlotType and click Create Custom Slot Type.

    Add two values, something and do something, to the new slot type:

  • The next step is to associate this custom slot type with your custom intent. To do this, click CommandBot under the Intents menu and navigate down to Intent Slots to create a new slot.

  • Keep the slot name as command, click the + button, and select MyCustomSlotType as the slot type, as shown below:

  • Finally, add a sample utterance to your intent. Click CommandBot under the Intents menu, add {command} as a sample utterance, and click the + button. This passes the raw user input through to our web server application code.

  • Click on Save Model.

  • After saving, build the model by clicking the Build Model button. Building the model may take a few seconds; you will get a message once it succeeds.
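Taken together, the steps above should leave you with an interaction model roughly like the JSON below (visible under the JSON Editor in the Alexa Developer Console). This is trimmed to the custom parts; the invocation name is illustrative, and a real model also contains Amazon's built-in intents:

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "my oda bot",
      "intents": [
        {
          "name": "CommandBot",
          "slots": [
            { "name": "command", "type": "MyCustomSlotType" }
          ],
          "samples": ["{command}"]
        }
      ],
      "types": [
        {
          "name": "MyCustomSlotType",
          "values": [
            { "name": { "value": "something" } },
            { "name": { "value": "do something" } }
          ]
        }
      ]
    }
  }
}
```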

2. Creating a web server app:

Our web server app is a Node.js app with an Express Server.

It requires 'alexa-app' and 'bots-node-sdk' node modules.

Before we begin, please download the sample code and run npm install.
The sample code can be found in this repo: Alexa with ODA.
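Conceptually, the inbound leg of the middleware unwraps the Alexa request and forwards the captured slot value to ODA as a plain text message. Here is a minimal sketch of that mapping; the function name and exact payload shapes are illustrative assumptions, not taken from the sample code:

```javascript
// Sketch: map an incoming Alexa IntentRequest to the text message the
// middleware would relay to the ODA webhook channel. Illustrative only --
// the sample code in the repo does more (sessions, error handling, etc.).
function alexaRequestToOdaMessage(alexaRequest) {
  const request = alexaRequest.request;
  // The CommandBot intent captures the raw utterance in the 'command' slot.
  const text =
    request.type === 'IntentRequest'
      ? request.intent.slots.command.value
      : 'hi'; // e.g. treat a LaunchRequest as a greeting (assumption)
  return {
    // Correlate the Alexa user with an ODA conversation.
    userId: alexaRequest.session.user.userId,
    messagePayload: { type: 'text', text: text }
  };
}

// Example Alexa request (abridged):
const sample = {
  session: { user: { userId: 'amzn1.ask.account.DEMO' } },
  request: {
    type: 'IntentRequest',
    intent: { name: 'CommandBot', slots: { command: { value: 'order a pizza' } } }
  }
};
console.log(alexaRequestToOdaMessage(sample).messagePayload.text); // "order a pizza"
```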

The sample code runs on port 5000. It is very helpful to expose this port and obtain a public URL beforehand (I prefer ngrok; if you are new to ngrok, refer to its official documentation).

Hint: Execute the command 'ngrok http 5000' in the downloaded ngrok folder and make a note of the 'https' URL it provides.

After you have the code locally, there are a couple of changes you need to make. To make them, we need to set up a webhook channel in ODA and obtain the connection parameters from there.

Now, let's see how it's done.

3. Creating a Webhook channel in ODA:

  • After you login to ODA, click on the burger menu icon in the top-left corner, expand Development and select Channels.

  • Click on create a new Channel.

  • Select Webhook from the Channel Type dropdown.

  • Provide the details for the Name and the Outgoing Webhook URI leaving the Platform Version default. You can also add an optional Description and change the Session Expiration.

  • The Outgoing Webhook URI is the public URL we obtained in the previous step, suffixed with "/botWebhook/messages" (you can change this in the sample code at line 339 of service.js by changing the endpoint property, shown below).

    app.locals.endpoints = [];
    app.locals.endpoints.push({ name: 'webhook', method: 'POST', endpoint: '/botWebhook/messages' });

    After providing the details, click 'Create'. If there are no errors, your channel will be created and you will be presented with a screen like this:

  • Preserve the Secret Key and the Webhook URL somewhere (we will need them in the final step). Enable the channel by clicking the toggle next to the Channel Enabled label. Route this channel to your required digital assistant or skill from the dropdown at the Route To label.

We are now just one step away from finishing up the integration.

The Final Step:

We are just left with linking the web server app we've created to ODA.

To accomplish that, replace "YOUR_APP_ID", "YOUR_WEBHOOK_SECRET_KEY_FROM_ODA" and "YOUR_WEBHOOK_URL_FROM_ODA" in the metadata with your Alexa App ID (obtained from the Alexa Developer Console), the Secret Key, and the Webhook URL respectively.

The metadata is located at line 20 in the service.js file provided in the sample code:

var metadata = {
  allowConfigUpdate: true, // set to false to turn off the REST endpoint that allows updating this metadata
  waitForMoreResponsesMs: 200, // milliseconds to wait for additional webhook responses
  amzn_appId: "<YOUR_APP_ID>",
  channelSecretKey: "<YOUR_WEBHOOK_SECRET_KEY_FROM_ODA>",
  channelUrl: "<YOUR_WEBHOOK_URL_FROM_ODA>"
};

Testing:

Go to the developer console of your skill and click 'Build Model' if your skill is not already built.

After the skill is built, click Test in the console.

Make sure skill testing is enabled; if it isn't, select Development from the dropdown.

Type something in the input box, along with the invocation name of your skill, to test. If everything is properly configured, you should see the response from your bot:

And, we are done!

A few restrictions with Alexa

To wrap up, it's worth pointing out a few restrictions we've noticed in working through our ODA-Alexa integration.

Messaging types in Alexa are limited:

ODA features interactive components like lists, cards, and webviews. If one or more of these components are present in your conversation flow (BotML), Alexa is not advisable as a channel.

Alexa being a voice-based channel, it is fairly obvious that we can't have graphical interactive components, unlike Facebook and other channels.

It is possible to show cards (called Home Cards) in Alexa, but these cards only show up in the Alexa application on a mobile phone or tablet, and they are not interactive, meaning you can present the user with some information (e.g. a weather forecast or an order summary) but the user can't interact with it through Alexa.

Alexa also supports Progressive Responses (similar to a progress bar in GUI but with a vocal output).

For more information on Home cards and Progressive Responses, please visit Amazon's official documentation.
Even though there is no documentation around how ODA's graphical cards behave on Alexa, Alexa does render them, just in a completely different way.

Now we'll compare the rendered messages in Alexa and Facebook, and notice some of the Alexa shortcomings.

Multiple responses aren't received:


With the given SDK, Alexa processes only one response from the bot. Even if there are multiple responses from the ODA skill, Alexa renders only the first one. Check out the example below:

Expected Response:

Alexa's Response:

The second and third text messages are skipped by Alexa, making it a poor choice when the bot returns multiple responses.
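If your flow does produce multiple text responses, one possible workaround (assuming you are free to modify the middleware) is to merge the queued replies into a single utterance before handing them to Alexa. A hypothetical sketch, not part of the sample code:

```javascript
// Sketch: collapse several ODA text responses into one Alexa utterance.
// Hypothetical helper; payload shapes follow ODA's text message format.
function mergeTextResponses(responses) {
  return responses
    .filter(r => r.messagePayload && r.messagePayload.type === 'text')
    .map(r => r.messagePayload.text)
    .join(' ');
}

const queued = [
  { messagePayload: { type: 'text', text: 'Your order is confirmed.' } },
  { messagePayload: { type: 'text', text: 'It will arrive in 30 minutes.' } },
  { messagePayload: { type: 'card', cards: [] } } // non-text payloads are dropped
];
console.log(mergeTextResponses(queued));
// "Your order is confirmed. It will arrive in 30 minutes."
```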

Cards are a bit fuzzy to respond to:

We are accustomed to seeing navigable cards in both the web and Facebook channels. Alexa renders these cards in a completely different way: it adds the prefix 'CARD' to the card title and the button, then reads out all the card responses in the order of their appearance. To select a card you must tell Alexa the card name, to which Alexa responds with another prompt listing the options available for that card (mostly 'view' and 'return').

Alexa opens and reads out the card data when you say 'view' and returns if you say 'return'. Take a look at the screenshots:

Cards in Facebook:

Cards in Alexa:

Selecting a card in other channels is simply a matter of clicking the button on that particular card, whereas in Alexa the card title must be given as input, which is not as convenient.

Lists aren't that bad in Alexa:

Lists seem to be working fine in Alexa. The available options are read out to the user and the user has to reply with any one of those options.
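Lists work comparatively well because they reduce naturally to speech: the middleware can flatten the action labels of an ODA list payload into a spoken prompt. A hypothetical sketch of that flattening (the payload shape follows ODA's common message model; the helper itself is illustrative):

```javascript
// Sketch: turn an ODA text-with-actions (list) payload into a spoken prompt.
// Hypothetical helper -- not taken from the sample code.
function listToSpeech(payload) {
  const options = (payload.actions || []).map(a => a.label);
  return payload.text + ' Your options are: ' + options.join(', ') + '.';
}

const listPayload = {
  type: 'text',
  text: 'What would you like to order?',
  actions: [
    { type: 'postback', label: 'Pizza' },
    { type: 'postback', label: 'Pasta' }
  ]
};
console.log(listToSpeech(listPayload));
// "What would you like to order? Your options are: Pizza, Pasta."
```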

Test your patience with the FAQs (QnAs):

The "System.QnA" component is rendered as cards with the questions that are relevant to the user input, along with the answers. The user has to click on the card to view the full answer.

Alexa reads out all the matched questions prefixed with the word 'Card', just as it does with normal cards, as mentioned above. The user has to read the question back to Alexa (imagine how painful it would be if you got it wrong; the whole conversation heads nowhere!). Given this, I feel it's better to avoid using FAQs on the Alexa channel.

Let's take a look at these screenshots:

QnA's in Facebook Messenger:

QnA's in Alexa:

When you choose to 'view' :

When you choose to 'return':

Having seen all of these features and some of the shortcomings of Alexa as a channel, I personally feel it is the developer's responsibility to weigh them before choosing the appropriate channels.

Alexa is definitely great for sending single and concise messages, and messages with prompts, but you might want to reconsider using Alexa if your bots are highly interactive (if they contain cards, webviews, etc).

Using Alexa as a channel where messages contain private or confidential information (e.g. medical information, bank history) is not recommended.

Alexa is perfect when your messages require beautiful narration. It can even convey emotions that a user can connect to.

Thank you for reading my post; please let me know your thoughts in the comments.
Happy texting (of course, with the bot)! ;)

Acknowledgements

I would like to acknowledge a few people who contributed to this blog post by helping with reviews, testing, and validation. Thanks to my colleagues from Rubicon Red, Nikhil Bansal and Sri Charan, for trying out the integration, and to Mr. Rohit Dhamija from Oracle Product Management for the awesome blog post I used as a reference.

Image Credits:

rclassenlayouts and Chang Qing on Unsplash

Sharath Chandra Gavini

A technology enthusiast and a passionate learner just landed in the field of Information Technology; a cross channel chatbot implementation specialist looking forward to doing awesome things at work.

Hyderabad, India