I have added a total of 900 FAQs in CSV format, of which 328 are primary questions and the rest are alternate phrases without tags.
I have not been receiving the correct responses for them. The format itself seems right: when I uploaded another CSV with fewer questions and without tags, I received the correct responses. But when I added the final list of FAQs (328 primary and the rest alternate), they did not work as expected.
So I started adding tags to a few questions, and those then started giving the expected responses.
But at a certain point, even after adding the correct tags, I did not receive the expected response.
@akhil.solanathan-ext, I'll explain the dynamics of how the platform's ontology engine identifies an FAQ.
FAQs can be organised using terms and tags in the Knowledge Graph. Path qualification is based on the keywords present in the user utterance: if those keywords cover more than 50 percent of a path's terms/tags (this setting can be managed from the NL - KG thresholds), the path is qualified, and a lookup then happens to identify the most relevant question on that path.
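As a minimal sketch of that qualification rule (illustrative Python assuming simple lowercased word tokens; not the platform's actual implementation):

```python
# Illustrative sketch of path qualification; not the platform's real code.
def qualifies(utterance: str, path_terms: list[str], threshold: float = 0.5) -> bool:
    """Qualify a path when the utterance covers more than `threshold`
    of the path's terms/tags (mirroring the NL - KG thresholds setting)."""
    keywords = set(utterance.lower().split())
    matched = sum(1 for term in path_terms if term.lower() in keywords)
    return matched / len(path_terms) > threshold

# Both "capital" and "france" are covered -> the path qualifies.
print(qualifies("what is the capital of france", ["capital", "france"]))  # True
# Only "capital" is covered (exactly 50%, not more) -> it does not.
print(qualifies("name the capital please", ["capital", "france"]))  # False
```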
In recent releases we added an Inspect feature to the Knowledge Graph, which analyses the entire graph and provides recommendations to the developer on how to better tune the engine.
Thank you for the explanation, although I am a little confused.
I uploaded a CSV file with roughly 300 primary and 600 alternate FAQs to the Knowledge Graph.
I provided 3 to 4 nodes as the path and uploaded it.
I have been getting responses for most of them after adding keywords/tags.
I trained and published the same and then clicked on Inspect (referring to the link you provided).
I see that I have a huge error count, and when I click on it, it says "Modify the path to represent the terms present in the question." But I do not want to modify the paths, as they are how I designed them and how I want them to be.
How will this error impact the current Knowledge Graph?
Perhaps it will help to understand how the Knowledge Graph selects an answer.
The KG works through a two-step process:
Identify a set of candidate questions
From those candidate questions, select the one that is most similar to the utterance
That first step revolves around extracting the important words from the user’s utterance and selecting the specific questions which reference a good proportion of those important words.
The second step is just a similarity computation between the question and the user’s utterance.
So the training is focused on building a hierarchy of important words, or terms, followed by placing each question on an appropriate path in that hierarchy. The terms in that path are the likeliest indicator of a match and will aid in the disambiguation across similar questions. By default the engine shortlists questions where more than 50% of their terms are found in the user's utterance.
Individual questions can augment this path through additional tags to cover situations where that 50% coverage wouldn’t be hit otherwise.
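As a rough illustration of those two steps, here is a simplified sketch, assuming plain word tokens and a Jaccard word-overlap similarity as a stand-in for the engine's actual scoring; the names and data layout are mine, not the platform's:

```python
# Simplified two-step selection; illustrative only, not the engine's real code.
def tokens(text: str) -> set[str]:
    """Naive tokenisation: lowercased whitespace-separated words."""
    return set(text.lower().split())

def select_answer(utterance: str, faqs: list[dict], threshold: float = 0.5):
    utter = tokens(utterance)

    # Step 1: shortlist questions where more than `threshold` of their
    # path terms (plus any local tags) appear in the utterance.
    candidates = []
    for q in faqs:
        terms = {t.lower() for t in q["path_terms"] + q.get("tags", [])}
        if terms and sum(t in utter for t in terms) / len(terms) > threshold:
            candidates.append(q)

    # Step 2: among the candidates, pick the question most similar to the
    # utterance (Jaccard overlap here, as a stand-in for the real scoring).
    def similarity(q: dict) -> float:
        qtok = tokens(q["question"])
        return len(qtok & utter) / len(qtok | utter)

    return max(candidates, key=similarity, default=None)

faqs = [
    {"question": "What is the capital of France",
     "path_terms": ["capital"], "tags": ["france"]},
    {"question": "Where is the capital of France",
     "path_terms": ["capital"], "tags": ["france", "where"]},
]
best = select_answer("what is the capital of france", faqs)
print(best["question"])  # -> What is the capital of France
```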
For example, consider the question "What is the capital of France". The most important words in it are "capital" and "France"; you could add "what" to that mix if there are other questions like "Where is the capital of France". Now you can manually add those words as tags to the question, but if you have lots of questions about European capitals then it makes sense to save some effort and create a node in the ontology labeled "capital"; every question under that node will then automatically have that word in its path.
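A hypothetical sketch of that inheritance, where the dictionary layout is my own illustration rather than the platform's data model:

```python
# Hypothetical ontology fragment; the layout is illustrative, not the real schema.
ontology = {
    "capital": [  # node labeled "capital": every question below inherits the term
        {"question": "What is the capital of France",  "tags": ["france"]},
        {"question": "What is the capital of Germany", "tags": ["germany"]},
        {"question": "Where is the capital of Spain",  "tags": ["spain", "where"]},
    ],
}

# Effective terms for each question = node terms on its path + its own tags.
for node, questions in ontology.items():
    for q in questions:
        print(q["question"], "->", [node] + q["tags"])
```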
This is what the message from the Inspection tool is telling you. You have questions with lots of words in them, but none of those words are defined as terms (either as ontology nodes or local tags) and so it will not be possible for KG to shortlist those questions.