TTS Error with Google Cloud Text-to-Speech

Description:

During bot conversations, the Google Cloud Text-to-Speech (TTS) engine has been observed to fail on long input. When the character sequence passed to the TTS model reaches 748 characters, the engine freezes. At 747 characters, audio is generated, but parts of the text are missing from it. The issue can be reproduced from the Start Flow by opening the TTS settings and using the Preview audio option. The attached screenshot shows the issue as it appears for the text; during a live call, the caller hears silence and the bot freezes.

Symptoms:

  • TTS engine freezes when processing text sequences of 748 characters or more.
  • Missing audio output for parts of the text when using 747 characters.
  • Silence during calls when the TTS engine fails to process the input.

Errors:

  • “Sentences are too long” error from Google Cloud Text-to-Speech.
  • INVALID_ARGUMENT: This request contains sentences that are too long.
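
Because the error refers to sentence length rather than total input length, it can help to check how long the individual sentences in a prompt are before sending it. The following is a minimal diagnostic sketch, not part of the platform or of the Google client library. The helper names sentence_lengths and longest_sentence are hypothetical, and the sentence splitting is a naive approximation that may differ from how Google Cloud Text-to-Speech detects sentence boundaries.

```python
import re


def sentence_lengths(text):
    """Return (length, sentence) pairs using a naive sentence split."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [(len(s), s) for s in sentences]


def longest_sentence(text):
    """Return the (length, sentence) pair of the longest sentence, or (0, '')."""
    return max(sentence_lengths(text), default=(0, ""))
```

Running longest_sentence on the prompt text shows whether a single unpunctuated run of text accounts for most of the input, which is the kind of input the error message describes.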

Solution Given:

To resolve the issue, enable the enableTTSChunking parameter, which splits long text into smaller segments before it is sent to the TTS provider. The parameter is configured in the Call Control Parameters section of the bot’s flow.
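
To illustrate the general idea of chunking, the sketch below splits text into segments that each stay under a character limit, preferring sentence boundaries and falling back to word boundaries. It is only an illustration under stated assumptions: the platform's actual chunking logic is not documented here, the function names chunk_text and split_on_words are hypothetical, and the default limit of 200 characters is an arbitrary value chosen for the example.

```python
import re


def split_on_words(sentence, max_chars):
    """Split one over-long sentence into pieces of at most max_chars."""
    parts, current = [], ""
    for word in sentence.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            parts.append(current)
        # A single word longer than the limit is hard-cut.
        while len(word) > max_chars:
            parts.append(word[:max_chars])
            word = word[max_chars:]
        current = word
    if current:
        parts.append(current)
    return parts


def chunk_text(text, max_chars=200):
    """Split text into chunks of at most max_chars (assumed limit)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if len(sentence) <= max_chars:
            pieces = [sentence]
        else:
            pieces = split_on_words(sentence, max_chars)
        for piece in pieces:
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= max_chars:
                # The piece still fits in the chunk being built.
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be synthesized as a separate request, so that no single request carries a very long sentence.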

Steps to Enable TTS Chunking:

  1. Go to Flows & Channels → Start Flow.
  2. Open the very first node. This parameter is supported across all node types in the flow, including Message, Entity, Confirmation, and Agent nodes.
  3. Inside the node (an Entity node in this example), navigate to the IVR section.
  4. On the right-side panel, scroll down until you see “Call Control Parameters”.
  5. Click “+ Add” and fill in the details as follows:
    • Parameter Name: enableTTSChunking
    • Value: true
  6. Click Confirm, then Save and Redeploy the flow.

Once this is done, retest the scenario to ensure the TTS freeze issue is resolved.

Commands:

enableTTSChunking = true

Additional Information:

  • If you want the setting to apply across the entire session, use the session. prefix:

    session.enableTTSChunking = true
    
  • If you need to restrict the parameter to a specific node only, use:

    node.enableTTSChunking = true
    

Voice Gateway Level Configuration:

Call Control Parameters can also be configured at the Voice Gateway level (Flows & Channels → Voice Gateway → Speech Customization).

When defined at this level, the parameters are applied across all dialog tasks within the bot, ensuring a consistent configuration without the need to set them individually at each node, unless a parameter is explicitly overridden at the node level.
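
The override behavior described above can be pictured as a simple lookup: a value set on the node wins, otherwise the Voice Gateway value applies. The sketch below is a conceptual model only, not the platform's implementation. The function name resolve_parameter and the dictionary-based representation of the two configuration levels are assumptions made for illustration.

```python
def resolve_parameter(name, gateway_params, node_params):
    """Return the effective value of a Call Control Parameter.

    Assumed model: a value defined on the node overrides the value
    defined at the Voice Gateway level; None means it is not set anywhere.
    """
    if name in node_params:
        return node_params[name]
    return gateway_params.get(name)


# Example: the gateway enables chunking for the whole bot,
# and one node explicitly overrides it.
gateway = {"enableTTSChunking": "true"}
default_value = resolve_parameter("enableTTSChunking", gateway, {})
overridden_value = resolve_parameter(
    "enableTTSChunking", gateway, {"enableTTSChunking": "false"}
)
```

In this model, nodes without their own setting inherit the gateway value, which matches the consistent bot-wide configuration described above.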

For further assistance, please reach out to the Kore.ai support team.