I have a bot in which one of the entities expects a street name (just a street name, not a full address). All the possible street names for that field are in a list, so the entity is a list-type entity.
The issue is that the city these streets are in spells its street names horribly, and the streets are named after European cities of the 1800s. For example, one street is named “Barditchev” (pronounced Bar-Ditch-Ove), and when a user says the street name by voice, the speech recognition engine “corrects” it to different words, for example “Birthday Chef” or “By the Chevy”.
My initial thought was to add lots of synonyms covering all the possible transcriptions of each street name, so even when the engine converts it to “Birthday Chef” the bot will map it back to Barditchev (and so on for all 70-80 street names).
The issue is that this is becoming unsustainable, and even after adding 50+ synonyms for some of the street names it is still not working.
My question is: is there a better way to handle this? I would like the speech recognition engine to not process this value and not try to convert it to English words, and just give me what the user is saying. Alternatively, is there a way to train the bot with audio? For example, could I upload an audio file of someone saying a word and set things up so that when the bot hears this sound it returns a specific value?
This is a tricky issue and I don’t think there is just one simple solution. But there are things that can be done at the different stages to improve the entity recognition.
ASR systems can usually be extended with custom training, specifically for domain words. You didn’t mention what system you are using, but this is an important task to undertake. All an ASR does is try to find a suitable word from a dictionary. If it doesn’t know that Barditchev is a real word, it is never going to assemble it from the captured phonemes and send it as text to the NL engine.
This is compounded of course by the fact that users themselves may not know the correct pronunciation, so you will have to cover a wide range of possibilities, some of which you will only discover over time.
If you can reduce the number of “bad” transcriptions then the amount of NL training reduces. The technique of adding the variations to a concept is, for now, the best way, as there is no way any spell correction changes “Birthday Chef” to “Barditchev”.
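As a rough illustration of why plain spell correction struggles here, and of what a looser match might look like, here is a minimal Python sketch that compares consonant "skeletons" instead of spellings. It is not something the platform does; `skeleton`, `match_street`, the 0.6 threshold and the second street name are all made up for the example, and the heuristic is crude enough that it would sit alongside synonyms rather than replace them.

```python
import difflib
import re

VOWELS = set("aeiouy")


def skeleton(text):
    """Lowercase the text, keep only a-z, and drop vowels (crude sound key)."""
    letters = re.sub(r"[^a-z]", "", text.lower())
    return "".join(ch for ch in letters if ch not in VOWELS)


def match_street(transcript, streets, threshold=0.6):
    """Return the street whose skeleton is most similar to the transcript's
    skeleton, or None if nothing reaches the (hypothetical) threshold."""
    key = skeleton(transcript)
    if not key:
        return None
    best_name, best_score = None, 0.0
    for name in streets:
        score = difflib.SequenceMatcher(None, key, skeleton(name)).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# "Lemberg" is a placeholder entry, not taken from the real street list.
streets = ["Barditchev", "Lemberg"]
print(match_street("bar ditch ove", streets))
```

Transcriptions that keep the consonants roughly in order tend to score well; ones that drift further, like “By the Chevy”, may still fall below the threshold and need a synonym.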
The upcoming R10.1 release of our platform will include some support for phonetic-style spellings in utterances, such as “bravo alpha romeo …” or “b a r …” or “b like boy a for apple r as in rose …” or “bee ay are …”.
We will be looking into other techniques in the future too.
I’m not sure what you mean by this. What system are you referring to?
My question comes down to this: is there a way to add words to the dictionary?
The technique of adding, as synonyms, all the words that the ASR might produce when a user says Barditchev is unsustainable and getting out of hand. (I’m up to over 50 synonyms for Barditchev itself and I have 70 street names. Wish me luck.)
I am not an expert on ASRs by any stretch of the imagination, but I understand that all of them can be trained with new words and speech. Given the obscure nature of some of your street names, it would be really helpful to teach the ASR a few things. However, I don’t know the specifics of that process.
With names like Barditchev, I don’t think you have much choice. I had never seen the word myself until this thread!
I’m using Smart Assist and it’s set up to use Microsoft Azure Speech Services (see attached).
Is there a way I can add words to it?
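For reference, when Azure Speech is called directly through its Python SDK (`azure-cognitiveservices-speech`), a phrase list is one way to hint unusual words to the recognizer, and Custom Speech is the heavier option for training on your own data. Whether Smart Assist exposes either setting is an assumption I can't confirm. A minimal sketch, where `add_street_phrases` and `build_recognizer` are hypothetical helper names and the key and region are placeholders:

```python
def add_street_phrases(phrase_list, street_names):
    """Register each distinct, non-empty street name as a recognition hint.

    `phrase_list` is expected to expose addPhrase(), as the SDK's
    PhraseListGrammar does. Returns the number of phrases added.
    """
    seen = set()
    for name in street_names:
        cleaned = name.strip()
        if cleaned and cleaned.lower() not in seen:
            seen.add(cleaned.lower())
            phrase_list.addPhrase(cleaned)
    return len(seen)


def build_recognizer(key, region, street_names):
    """Create a recognizer on the default microphone with street-name hints."""
    # Imported lazily so the helper above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
    phrase_list = speechsdk.PhraseListGrammar.from_recognizer(recognizer)
    add_street_phrases(phrase_list, street_names)
    return recognizer
```

A phrase list only biases recognition toward the listed words; it does not guarantee that “Barditchev” comes back every time, so some synonyms would probably still be needed.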
I also had not heard this name until I had to build a bot for a company in that area - see it on the map here (you can check the neighboring street names as well to see what I’m going through).