How to handle misspellings

nlp

(Henry Correia) #1

Hello All,

I did not find any specific topic or documentation on this, so I would like to know the best approach to handling misspellings or digressions.

Please consider the use case below:

Intent Name: Installation (training has been completed; ML and FM recognize “Installation” as a correct match)
User types: I need intalation
Result: Intent is not recognized

In this case, what would be the best approach to get the intent recognized? It is not possible to predict every misspelling that users might make.

Is creating synonyms the way to handle this? Any ideas?

Thank you in advance.

(PS: Please check your documentation here, because it looks like there is a mistake in it: “normalize” is written twice with “z”, and the second one should be with “s”. :slight_smile: The sentence reads: Use the US spelling of words (i.e. “normalize” instead of “normalize”))


(Yoga Ramya Mendu) #2

Hi @henry.correia,

We apologize for the delay in response.

The platform has a misspelled-word algorithm. It recognizes misspelled words to an extent and identifies the appropriate intent.

The algorithm is designed around the common spelling errors made on a QWERTY keyboard.
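
The thread does not describe how the platform models QWERTY errors. As a rough illustration only, one common idea is to treat a substitution between neighbouring keys as a more likely typo than a substitution between distant keys. The sketch below is an assumption, not the platform's implementation; the layout table, the stagger offsets, and the names `is_adjacent` and `substitution_cost` are all hypothetical.

```python
# Hypothetical layout model: the three letter rows of a QWERTY keyboard,
# with an approximate horizontal stagger for each row.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
OFFSETS = [0.0, 0.25, 0.75]

# Map each letter to (row index, horizontal position).
KEY_POS = {
    ch: (row, col + OFFSETS[row])
    for row, keys in enumerate(ROWS)
    for col, ch in enumerate(keys)
}


def is_adjacent(a: str, b: str) -> bool:
    """Return True if two different letters are neighbours on the keyboard."""
    a, b = a.lower(), b.lower()
    if a == b or a not in KEY_POS or b not in KEY_POS:
        return False
    (row_a, x_a), (row_b, x_b) = KEY_POS[a], KEY_POS[b]
    return abs(row_a - row_b) <= 1 and abs(x_a - x_b) <= 1.0


def substitution_cost(a: str, b: str) -> float:
    """Charge less for a likely finger slip than for an unrelated letter."""
    if a.lower() == b.lower():
        return 0.0
    return 0.5 if is_adjacent(a, b) else 1.0
```

A cost function like this could replace the fixed substitution cost in an edit-distance comparison, so that typing “a” for “s” counts as a smaller error than typing “p” for “s”.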

For the utterance “instalation”, the intent is detected correctly even though the word is misspelled.

The same applies to “installaton”, “intallation”, etc.

We have made every effort to recognize a large number of misspelled words through the algorithm.

However, the algorithm only tolerates misspellings up to a certain threshold.
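
To make the idea of a threshold concrete, here is a minimal Python sketch that matches a typed word against intent names using plain Levenshtein edit distance. This is a stand-in technique chosen for illustration; the thread does not say which distance measure or threshold the platform uses, and `edit_distance`, `match_intent`, and `max_distance` are hypothetical names.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca -> cb (or keep)
            ))
        prev = curr
    return prev[-1]


def match_intent(word, intents, max_distance=1):
    """Return the closest intent within max_distance edits, else None."""
    word = word.lower()
    best = None
    for intent in intents:
        d = edit_distance(word, intent.lower())
        if d <= max_distance and (best is None or d < best[1]):
            best = (intent, d)
    return best[0] if best else None
```

Under this sketch, “instalation”, “installaton”, and “intallation” are each one edit away from “installation”, while “intalation” is two edits away. With an assumed `max_distance` of 1 the first three match and the last does not; raising the threshold to 2 would accept it, at the cost of more false matches on short words.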

We have asked our NLP team to analyze these kinds of scenarios and improve the algorithm further.

Kindly let us know if you need any further clarification on the above.

Regards,
Yoga Ramya.


(Henry Correia) #3

Hello @yogaramya.mendu,

Thank you very much for the clarification!
Just to confirm, I believe the same logic applies to additional languages, correct?
For example, consider the use case below:

Language: PT_BR
Correct Utterance: Instalação
Wrong Utterance: Instalacao (without the special characters used in PT_BR)

Or should the developer train the bot to recognize both utterances?

Best regards,
Henry


(Yoga Ramya Mendu) #4

Hi @henry.correia,

Currently, the misspelled-word algorithm handles only general letters without accents, so a missing accent is not treated as a misspelling.
If “Instalacao” is provided as an utterance, it is not recognized for the intent.
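
For readers who want to compare accented and unaccented forms themselves, accent folding can be sketched with Python's standard `unicodedata` module. This is only an illustration of the general technique; whether and where such preprocessing can be applied on the platform is not covered in this thread, and the helper names `strip_accents` and `same_ignoring_accents` are hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_ignoring_accents(a: str, b: str) -> bool:
    """Compare two strings while ignoring accents and letter case."""
    return strip_accents(a).casefold() == strip_accents(b).casefold()
```

With this folding, “Instalação” and “Instalacao” compare as equal, so the same comparison could be used to generate the unaccented variants of training utterances if both forms need to be trained.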

We have discussed this with our engineering team, and they will analyze the scenario for possible improvements.

Regards,
Yoga Ramya.


(Henry Correia) #5

Hi @yogaramya.mendu,

Thank you for clarifying it!

Regards,
Henry