Whitepapers

Our users often ask how to install and configure Asterisk, UniMRCP, and LumenVox together, and we have published a number of articles and documents describing the process. We understand that people run into many complexities and issues, especially when building some of these components from source. Some of this confusion certainly comes from a lack of clear documentation describing how these pieces are meant to be used together. This video series aims to remove some of that confusion and complexity.

The Speech Synthesis Markup Language (SSML) offers text-to-speech (TTS) users a variety of ways to control how LumenVox TTS pronounces words and phrases. Using SSML, a developer can easily change voices or languages and specify that a string of digits be read as a single number rather than digit by digit. This whitepaper covers some of the most common ways SSML is used to control TTS pronunciations.
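
As a rough illustration of the number-versus-digits case, the following Python sketch builds a small SSML document and checks that it is well-formed XML. The voice name "ExampleVoice" is hypothetical, and the `interpret-as` values `cardinal` and `digits` are commonly used conventions rather than values confirmed here for LumenVox; the set of supported values depends on the TTS engine, so consult the engine documentation before relying on them.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Namespace defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(number_text: str, voice_name: str = "ExampleVoice") -> str:
    """Build SSML that reads number_text once as a number and once as digits.

    voice_name is a hypothetical placeholder; use a voice installed on
    your own TTS server.
    """
    text = escape(number_text)
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="en-US">\n'
        f"  <voice name={quoteattr(voice_name)}>\n"
        # Assumed interpret-as values; supported values are engine-specific.
        f'    Your total is <say-as interpret-as="cardinal">{text}</say-as>.\n'
        f'    Your code is <say-as interpret-as="digits">{text}</say-as>.\n'
        "  </voice>\n"
        "</speak>\n"
    )


ssml = build_ssml("1234")
root = ET.fromstring(ssml)  # raises ParseError if the markup is malformed
print(ssml)
```

The resulting string would be passed to the TTS engine in place of plain text; how it is submitted (MRCP, API call, or otherwise) depends on the integration.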

This whitepaper describes what a grammar is and the different grammar formats available, and provides several general strategies for building effective grammars, such as making use of built-in grammars, shaping prompts to reflect grammars, and keeping grammars as small as possible while still providing good coverage.

A common task, especially for new speech application developers, is converting DTMF-only applications to use speech recognition. This whitepaper describes a few approaches to the process and how to avoid some of the common pitfalls developers encounter when performing the conversion.

The LumenVox speech products are designed to work in a distributed fashion, allowing for load balancing, failover, and clustering. This whitepaper presents a case study of a deployment that leverages this functionality to ensure high availability (HA) and discusses the various HA models supported by LumenVox.

Disambiguation in speech recognition applications refers to the process by which an application determines what a user meant when they spoke a phrase whose meaning isn’t clear. This process can involve re-prompting users, using external data to narrow down a list, and other advanced speech recognition development strategies.
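
The two strategies named above, narrowing a list with external data and re-prompting, can be sketched roughly as follows. This is a minimal Python illustration, not the approach from the whitepaper: the directory entries, the `department` field, and the `disambiguate` function are all hypothetical.

```python
from typing import Optional


def disambiguate(
    candidates: list[dict], known_department: Optional[str] = None
) -> tuple[Optional[dict], Optional[str]]:
    """Return (match, prompt): a single match, or a re-prompt for the caller.

    candidates are the directory entries matching what the caller said.
    known_department stands in for external data about the caller
    (hypothetical, for example looked up from the caller's number).
    """
    if known_department is not None:
        narrowed = [c for c in candidates if c["department"] == known_department]
        if narrowed:  # only use the external data when it keeps some option
            candidates = narrowed

    if len(candidates) == 1:
        return candidates[0], None

    # Still ambiguous: build a re-prompt listing the remaining options.
    options = " or ".join(f'{c["name"]} in {c["department"]}' for c in candidates)
    return None, f"Did you mean {options}?"


entries = [
    {"name": "John Smith", "department": "Sales"},
    {"name": "John Smith", "department": "Support"},
]
print(disambiguate(entries, known_department="Sales"))
print(disambiguate(entries))
```

With external data available, the caller is never asked a follow-up question; without it, the application falls back to a targeted re-prompt rather than guessing.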

When performing speech tuning, it is important to collect relevant data to analyze. This whitepaper explains why, discusses how much data is needed and how to collect it, and shows how to gather relevant data rather than irrelevant data.

"Accuracy" is an overloaded term in the speech industry, and means different things to different people. This whitepaper defines a number of metrics that can be looked at to indicate the performance of a speech application and explains why a single “accuracy” number is often misleading.

With the introduction of the Media Resource Control Protocol (MRCP), speech solution and platform developers now have a choice in how they integrate the LumenVox Speech Engine and other speech engines into their applications: they can use MRCP, or write directly to the Application Programming Interface (API).

This paper discusses the pros and cons of both development methods so that you can choose the proper path for your organization.

Tuning a speech recognition application is the most effective way to improve a speech solution and deliver better caller experiences. The tuning process involves analyzing data from users in order to make prompt, grammar, and call flow enhancements to an existing speech-enabled Interactive Voice Response (IVR) application.