@balakrishnav
As I mentioned in my first post, the mechanism for extracting specific data from a user’s utterance is an entity. Your dialog has to have an appropriate entity in the flow, and even if you are going to use an NER model those entities still have to exist, because they are needed in the NER annotation process.
So for scenario 1 you will need an entity of type Custom where you can add the regex to it - [A-Za-z]{4}\d{3}. And that is all you have to do; you don’t need to do any extra training. If you are feeling keen you could annotate the ML training for the dialog intent to highlight values for the NER model, but I personally would not bother as it is not going to gain you anything. One of the goals of the Kore.ai platform is to reduce the amount of training a bot developer has to do, which is why entities have processing behind their simple node definition.
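If you want to sanity-check that regex before pasting it into the Custom entity, you can try it in any JavaScript console. This is just an illustration of what the pattern matches, not platform code:

```javascript
// Four letters followed immediately by three digits, anywhere in the text.
const pattern = /[A-Za-z]{4}\d{3}/;

console.log(pattern.test("My code is ABCD123")); // true
console.log(pattern.test("AB123"));              // false - only two letters
```

Note the pattern is unanchored, so it will also match inside longer tokens (e.g. "XABCD1234"); add word boundaries if that matters for your data.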
Scenario 2 is a little different and there are, perhaps, three different ways it could be implemented. (In Kore.ai there is always more than one way of doing things!)
The first step is that you are asking the user a question, “Do you have a Fever or a Cough?”. The standard way to interrupt the dialog flow to ask a question and get a response is with an entity.
The next decision is the type of entity, and for a case like this where the user is choosing one of a set of values you would use an Enumerated List of Items. The list of items entity (note, some of us also use the term “LoV”) has an additional configuration section where you define the list of items and the set of synonyms that indicate each value. At run-time the platform will use those synonyms to determine which choice to select. In this example just “fever” and “high temperature” for one value, and “cough” and “wheeze” for the other, could be sufficient; it is generally best practice not to add words that are not explicitly relevant, to avoid false positives.
Again, there is no need to train an NER model because the platform and the LoV entity will be able to find those words anywhere in a user’s utterance.
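To illustrate what the platform is doing on your behalf, here is a rough sketch of what LoV synonym matching amounts to. This is not the platform’s implementation, just the idea; the value names and synonyms mirror the example above:

```javascript
// Each LoV value maps to the synonyms that indicate it.
const choices = {
  fever: ["fever", "high temperature"],
  cough: ["cough", "wheeze"]
};

function resolveChoice(utterance) {
  const text = utterance.toLowerCase();
  const matches = Object.keys(choices).filter(value =>
    choices[value].some(syn => text.includes(syn))
  );
  // Exactly one match: the choice is unambiguous.
  if (matches.length === 1) return matches[0];
  // None or several: the platform would re-prompt the user to clarify.
  return null;
}

console.log(resolveChoice("I think I have a high temperature")); // "fever"
console.log(resolveChoice("No fever, just cough is there"));     // null (ambiguous)
```

Notice the second example: both “fever” and “cough” are present, so a purely positive synonym match cannot resolve it, which is exactly the ambiguity discussed next.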
Now the LoV entity will work well for many cases, but one of your examples, #4 “No fever, just cough is there”, would likely generate an ambiguity and a subsequent prompt for the user to clarify their choice. The LoV synonyms are all positive selections, so you cannot really indicate the negative situation. You could decide that this is an edge case and the clarification prompt is OK, but consider a more ambiguous case, “no, fever”, where the comma downplays the role of “no” in negating “fever”. It starts to get very tricky.
That leads to two other options that utilize sentence training (or patterns), which may be closer to your desire to always use ML: traits and subintents.
Subintents are a technique to identify something within a dialog that is not easily handled by an entity, so using them to identify an entity value can be a little cumbersome, though it is possible. It is not the way I would recommend, and given ML’s propensity to try to find an answer you will run the risk of false positives. In the interests of brevity I’m not going to describe all the steps here.
That leaves traits. These are a somewhat misunderstood, and sometimes confusing, tool within the Kore platform. At a basic level, traits are just another way of identifying something potentially useful in a user’s utterance. There is no commitment to use whatever the traits engine identifies, but traits can help provide additional depth of meaning. Every user utterance is sent to the traits engine, and the context.traits array in the context object is updated with any results, which can then be referenced in scripts or connections.
Traits are grouped into a confusingly named “Trait Type”. Of the traits in a group, the engine will only identify one. So if you have a single classification of things (fever OR cough) then you would have two traits in a single trait type. But if both traits are independently valid (fever AND cough) then you would have two trait types with a single trait in each one. The individual traits are then trained with either sample phrases or patterns.
Now traits don’t force anything to happen; they are just a background identification scheme, so in the flow you still need a prompt to ask the user the question. Therefore an entity of some kind is still required, and this is what I would probably do:
- Add a List of Items entity to the dialog, like I described above, with simple and obvious synonyms.
- Define this entity as optional - you still get the prompt (if needed) but if the user doesn’t enter a known value then the flow carries on.
- Define two traits in a single group to cover the scenarios where the user is vague or uses idioms to describe their symptoms.
- The entity transitions to a script node where we check for an entity value in context.entities and/or a trait in context.traits. If neither is present then you can always transition back to the entity to prompt again.
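A hedged sketch of that script-node check, written as a plain function so it can be tested outside the platform. The entity name (“Symptom”) and the trait labels (“Fever”, “Cough”) are my assumptions; check context.traits in your own bot for the exact label format the engine produces:

```javascript
// Prefer the LoV entity value; fall back to whatever the traits engine found.
function pickSymptom(context) {
  var symptom = context.entities && context.entities.Symptom; // set if the LoV matched
  var traits = context.traits || [];                          // filled by the traits engine

  if (!symptom) {
    if (traits.indexOf("Fever") !== -1) symptom = "fever";
    else if (traits.indexOf("Cough") !== -1) symptom = "cough";
  }
  // null means neither source fired: transition back to the entity to re-prompt.
  return symptom || null;
}

// Example: the entity missed, but the traits engine spotted "Fever".
console.log(pickSymptom({ entities: {}, traits: ["Fever"] })); // "fever"
```

In the real script node you would store the result (e.g. in the session context) and drive the outgoing connections from it.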
This way the LoV handles the vast majority of utterances, which will be simple and obvious (and the UI might present buttons for those choices as part of the prompt), and traits can be used, if and when needed, to fill in the gaps for irregular and idiomatic expressions.
One thing you don’t mention is that fever and cough are not necessarily mutually exclusive symptoms: a user could say “I have a fever and a cough”. In that case you can turn on the multi-item switch in the entity and separate the traits into distinct trait types.
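For the multi-item case the script-node check would collect values rather than pick one. Again this is only a sketch under my assumptions (entity named “Symptom”, trait labels “Fever”/“Cough”, a multi-item entity returning an array):

```javascript
// Merge the multi-item entity value with anything the traits engine found,
// avoiding duplicates. With one trait per trait type, both traits can fire.
function pickSymptoms(context) {
  var fromEntity = context.entities && context.entities.Symptom;
  var symptoms = Array.isArray(fromEntity) ? fromEntity.slice()
               : fromEntity ? [fromEntity] : [];
  var traits = context.traits || [];

  if (traits.indexOf("Fever") !== -1 && symptoms.indexOf("fever") === -1) symptoms.push("fever");
  if (traits.indexOf("Cough") !== -1 && symptoms.indexOf("cough") === -1) symptoms.push("cough");
  return symptoms;
}

// Entity caught "fever", traits engine caught "Cough" from an idiom.
console.log(pickSymptoms({ entities: { Symptom: ["fever"] }, traits: ["Cough"] }));
// -> ["fever", "cough"]
```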