The Mechanics of Tuning: How to Get the Most Mileage from a Speech Application

Tuesday, June 20, 2017 - 08:15

Nigel Quinnin

As mentioned in our previous blog post, tuning analysis greatly enhances the performance of speech-enabled applications. LumenVox has invited Interactive Northwest Inc. (INI), a LumenVox Skills Certified Partner, to share their insights on the process of speech tuning and the improvements associated with ongoing tuning over the life of the solution.



by Maria Simonton, Director of Product Marketing, Interactive Northwest Inc. (INI)


I bought a fancy new car about five years ago; all the bells and whistles, every luxury upgrade that money could buy. At 3,000 miles, I got an oil and filter change and put air in the tires. I haven’t taken it to the mechanic since. It’s still running—maybe not as well as it used to—but as long as it gets me from point A to point B, why should I do anything else?

Okay, that’s not a true story. Yet I see this happen all the time with the Cadillac of self-service investments: speech recognition applications. Despite the initial time, effort, and dollars spent, companies often take a set-it-and-forget-it mentality with IVR. Tuning a speech application shouldn’t only happen after the pilot phase; it should occur at regular intervals to ensure the application is running just as smoothly as that vehicle in the driveway.

As we learned in the last blog post, there are many benefits to tuning, so I won’t reiterate them here. Rather, let’s talk about the process, how it works, and the type of improvements you can expect to see.

The first step is identifying when to initiate a tuning cycle. This can be at pre-defined intervals (annually, for example), following an application enhancement, or even timed in conjunction with an outside event that affects usage and traffic flow to your application. Perhaps you’re an insurance provider, and the government has just mandated coverage and benefit changes. Policyholders may call your application with questions and directives that are new and unexpected. Tuning helps reveal what callers are asking for, and how those needs change over time.

Once you’ve decided to engage in tuning, it’s time to enable utterance capture on your speech recognition server. I usually recommend a minimum two-week period for this, but the interval could be longer or shorter based on call volumes. Before utterance capture is enabled, be sure to play a “your call may be monitored or recorded” message up front to keep the legal department happy, and always double-check that utterance capture is working by listening to a WAV file or two. It’s not unusual for there to be permissions issues writing files to a directory on the server.
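The "listen to a WAV file or two" check can be backed up with a quick automated pass. The following is a minimal sketch, not part of any LumenVox tooling: the function name and the idea of pointing it at a capture directory are assumptions for illustration, and the actual capture location depends on your server configuration. It should be run under the same account the speech server uses, since permissions problems are account-specific. Note that Python's standard `wave` module reads only uncompressed PCM, so captures stored in a compressed telephony encoding would show up here as unreadable even if they are fine.

```python
import os
import tempfile
import wave


def check_capture_dir(path):
    """Return (writable, readable_wavs, unreadable_wavs) for a capture directory.

    `path` is a hypothetical utterance-capture directory supplied by the caller.
    """
    # Create and delete a scratch file to surface permissions problems.
    try:
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        writable = True
    except OSError:
        writable = False

    readable, unreadable = [], []
    for name in sorted(os.listdir(path)):
        if not name.lower().endswith(".wav"):
            continue
        try:
            with wave.open(os.path.join(path, name), "rb") as wav:
                # A recording with no audio frames is treated as unreadable.
                if wav.getnframes() > 0:
                    readable.append(name)
                else:
                    unreadable.append(name)
        except (wave.Error, EOFError, OSError):
            unreadable.append(name)
    return writable, readable, unreadable
```

A passing result does not replace listening to a few recordings, because a file can be structurally valid and still contain silence or clipped audio.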

While utterance capture is underway, you’ll want to make a list of the dialogs to tune. Prioritize those that get the most use, have high error-out rates, or represent critical “gates” in the call flow. If, for example, failure at a certain prompt prevents a caller from going down an important path (such as making a payment), that “gate” should be tuned for optimal performance. Call event reports will come in handy for identifying any problem areas in the application. In general, I suggest minimizing the number of yes/no or digit dialogs that receive tuning attention because they typically use shared grammars, and any findings at one prompt can be applied to the others.
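One way to turn those priorities into a working list is sketched below. Everything here is hypothetical: the `DialogStats` fields, the dialog names, the sample counts, and the `gate_weight` multiplier are illustrative assumptions, and the real inputs would come from your own call event reports. The sketch ranks dialogs by the number of callers who errored out (which reflects both usage and error rate), gives "gate" dialogs extra weight, and keeps only one representative dialog per shared grammar.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DialogStats:
    name: str
    attempts: int                          # times callers reached the dialog
    error_outs: int                        # times callers failed out of it
    is_gate: bool = False                  # blocks an important path if it fails
    shared_grammar: Optional[str] = None   # e.g. a common yes/no grammar


def prioritize(dialogs, gate_weight=2.0):
    """Return dialog names in suggested tuning order."""
    def score(d):
        # Absolute error-outs combine call volume and error rate.
        weight = gate_weight if d.is_gate else 1.0
        return d.error_outs * weight

    plan, seen_grammars = [], set()
    for d in sorted(dialogs, key=score, reverse=True):
        # Tune one dialog per shared grammar; findings carry over to the rest.
        if d.shared_grammar is not None:
            if d.shared_grammar in seen_grammars:
                continue
            seen_grammars.add(d.shared_grammar)
        plan.append(d.name)
    return plan


sample = [
    DialogStats("main_menu", attempts=10000, error_outs=1200),
    DialogStats("make_payment", attempts=3000, error_outs=700, is_gate=True),
    DialogStats("confirm_payment", attempts=2500, error_outs=100,
                shared_grammar="yes_no"),
    DialogStats("confirm_address", attempts=1800, error_outs=60,
                shared_grammar="yes_no"),
]
print(prioritize(sample))
```

With this made-up data, the payment gate moves ahead of the busier main menu, and only one of the two yes/no confirmations makes the list.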

Armed with utterances, logs, and a tuning plan, it’s now time to load up that data into the LumenVox Speech Tuner. Luckily, the tool provides a very user-friendly interface for transcription and analysis. But once you’re listening to caller recordings, what are you actually looking for? Well, tuning is a fairly subjective process that requires a skilled ear and critical thinking, and it’s sometimes difficult to distinguish trends from outliers. That said, I’m typically on the lookout for: (a) out-of-grammar utterances of a significant sample size, (b) red herring “sound-alikes” that confuse the speech engine, (c) prompts that mislead the caller into giving unexpected responses, (d) rejection of valid utterances due to confidence scores, and (e) “talk-off” issues where only partial utterances are captured. Addressing such problems may require grammar updates, new phrase recordings, configuration changes, or a combination of all three. A trained voice user interface expert can assist in making and implementing the tuning recommendations based on the data revealed by the Speech Tuner tool.
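For item (a), a simple tally of transcriptions can help separate trends from outliers. This is a minimal sketch that works on exported transcripts outside the Speech Tuner. It makes two simplifying assumptions: the grammar is treated as a flat set of phrases (real grammars are usually rule-based and match far more than a phrase list), and the `min_count` cutoff for a "significant" sample is arbitrary and should be scaled to your call volume.

```python
from collections import Counter


def frequent_out_of_grammar(transcripts, grammar_phrases, min_count=5):
    """Return (phrase, count) pairs for out-of-grammar phrases heard often.

    Both arguments are plain strings; matching is case-insensitive and
    ignores leading and trailing whitespace.
    """
    grammar = {p.strip().lower() for p in grammar_phrases}
    counts = Counter(
        t.strip().lower()
        for t in transcripts
        if t.strip().lower() not in grammar
    )
    # most_common() sorts by descending count, so the strongest trends lead.
    return [(phrase, n) for phrase, n in counts.most_common() if n >= min_count]
```

Phrases that clear the cutoff are candidates for a grammar update or a prompt rewording; one-off responses below it are more likely outliers.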

So, with analysis complete and the application changes in place, what type of measurable improvements can you expect to see? The answer here is always: it varies. You may experience self-service task completion rates that jump from 50% to 75%, but you may not. The fruits of a tuning effort are typically weighed over time and over multiple iterations. False-Accepts (the phrases that the recognizer accepted, but shouldn’t have) and False-Rejects (the phrases that it rejected, but should have accepted) should decrease. Confidence scores for Correct-Accepts should increase. Follow-up tuning cycles will expose these trends, but it’s almost never possible to assign your expectations a hard-and-fast number. Continuing to analyze call event reports will help illuminate where the gains have been made and where to focus your efforts in the next tuning cycle.
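To track these trends from one tuning cycle to the next, the same three figures can be computed from each cycle's transcribed utterances. The sketch below is illustrative only: the `Utterance` record and its field names are assumptions, the rates are expressed as a fraction of all utterances (other denominators are also in common use), and the confidence values use whatever scale your recognizer reports.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Utterance:
    accepted: bool        # what the recognizer did
    should_accept: bool   # what the transcriber says it should have done
    confidence: float     # score reported by the recognizer


def summarize(utterances):
    """Return false-accept rate, false-reject rate, and mean confidence
    of correct accepts for one tuning cycle's utterances."""
    total = len(utterances)
    if total == 0:
        raise ValueError("no utterances to summarize")

    correct_accepts = [u for u in utterances if u.accepted and u.should_accept]
    false_accepts = sum(1 for u in utterances
                        if u.accepted and not u.should_accept)
    false_rejects = sum(1 for u in utterances
                        if not u.accepted and u.should_accept)

    return {
        "false_accept_rate": false_accepts / total,
        "false_reject_rate": false_rejects / total,
        "mean_correct_accept_confidence": (
            mean(u.confidence for u in correct_accepts)
            if correct_accepts else None
        ),
    }
```

Running `summarize` on the utterances from each cycle and comparing the results side by side shows whether the two error rates are falling and the correct-accept confidence is rising.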

Performing these steps at regular intervals ensures that you’re getting the maximum mileage out of your investment and protects against a user interface breakdown that routine maintenance could have prevented. Not to mention that customer satisfaction will improve when the user experience does!

Please contact your account manager for more information on the LumenVox Speech Tuner. For more information on the speech tuning process, or to engage Interactive Northwest (INI) in an application tuning cycle, visit the Contact INI page to speak with a qualified speech mechanic. Vroom vroom!

Interactive Northwest, Inc. (INI) develops innovative interactive voice response (IVR), computer telephony integration (CTI), and self-service applications for high-volume contact centers in markets such as government, healthcare, finance, utilities, and service industries. A strong commitment to platform expertise, seamless systems integration, and project management excellence uniquely positions INI to provide value to its customers. As a long-standing partner in the Avaya DevConnect program and developer of call center speech applications, INI has a deep history in deploying applications on Avaya platforms, making it a reliable partner capable of delivering results that promote the success and profitability of its customers.