An old proverb says, "Man plans, God laughs." Nowhere is this more true than in software development. As developers, we try to plan for every contingency. In our minds we walk down every path our users may travel and compensate accordingly. Without fail, customers will traverse paths never intended for a user's feet. Immediately they come back to us and say "Look, it's broken!" No matter how much we plan, we must be able to modify our systems based on how our users actually use them, not on how we expect them to be used. In the world of Voice User Interfaces, this process is called tuning, and a tuning tool is a vital component of VUI development.

A good VUI is difficult to write and test. When users complain that an application is "broken," it may mean that the speech recognition is failing, but just as often it means that the developer simply did not anticipate the sort of responses users would give. Unlike DTMF applications, or even traditional GUI applications, an application with a speech-driven VUI allows for a practically endless number of potential user responses to any given prompt.

As an example, LumenVox built a VUI demonstration to illustrate the capabilities of our ASR. The demo told callers the current weather for a city of their choice. Using U.S. Census data, we built a grammar containing the name of every city with a population of more than 5,000 people. Once a user selected a city and state, the system retrieved weather information from the Internet and read it back using text-to-speech.

It was a fairly straightforward design, with no obvious snags or hang-ups. Almost as soon as the system was deployed, however, users reported that it was failing. We called the system ourselves to check that it was working:

Speech Application: Please tell me the area you would like the weather for.

Caller: San Diego, California.

Speech Application: I heard "San Diego, California." Is this correct?

Caller: Yes.

Speech Application: The weather for San Diego, California is...

The voice interface we had designed seemed to work fine from a technical perspective. The speech recognition was accurate, and all the components were working together as expected. And yet users kept reporting the system was failing. It wasn't until we reviewed actual recordings of calls using our Speech Tuner that the problems in the system's design were exposed. Our Speech Tuner allows us to listen to the audio recordings of callers, see what the Speech Engine recognized, and see how changes to grammars would have affected recognition.

One key feature of the Tuner is its Call Browser, a module that shows details about a call and each utterance in that call. This way we can follow a user through a call, see what the caller said, and see how the Engine recognized each response. A common user experience went like this:

Speech Application: Please tell me the area you would like the weather for.

Caller: 92123.

Speech Application: I am sorry, that is not a valid choice. Please try again.

Caller: ZIP code 92123.

Speech Application: I am sorry, that is not a valid choice. Please try again.

Caller: Area code 619.

Speech Application: I am sorry, that is not a valid choice. Please try again.

Caller: The moon, or anywhere nearby.

We listened in horror as the users ripped our robust application to shreds. While a developer cannot plan for every possible phrase a user may utter, it was clear the prompt was misleading our callers. The seemingly simple request, "Please tell me the area you would like the weather for," was far too open-ended. We also heard responses such as "Near my house" and "The beach."

After reviewing the results of our seemingly simple and fail-proof application, we decided to replace the open-ended initial prompt with one that elicited a specific response. We changed the prompt to say, "Please tell me the city and state you are interested in," and the application's success rate improved significantly. By reviewing actual calls with the Tuner, we were able to quickly pinpoint the exact cause of user failure and adjust the system accordingly. Just as importantly, we were able to review the results of the change to ensure it was successful.

The next demonstration application we developed was a mock pizza-ordering application. This demo allowed users to choose the toppings, size, and crust of a pizza. As with the weather demo, we built the initial application, tested it internally, and deployed it. Once again, users immediately complained that the application simply did not work. When the system asked users what size pizza they wanted, we expected them to ask for a small, medium, or large pizza. Listening to calls, we heard interactions such as:

Speech Application: "What size pizza would you like?"

Caller: "Twenty-seven inch, please."

Speech Application: "Hey, we only make three sizes of pizza: small, medium, or large."

Caller: "Medium, then."

Speech Application: "Was that a medium?"

Caller: "Yes."

Even though the mistake was handled by the "no match" prompt, this failure to answer an ambiguous question the way we expected could lead to caller fatigue and frustration. This condition can occur if callers encounter just a few questions that don't move them forward to the implied goal of the system (in this case, ordering a pizza).

Using our Speech Tuner, we were able to provide immediate and satisfying proof of the need for a change in the system. We added a grammar that would accommodate users specifying a pizza size in inches, as well as being able to say "small," "medium," or "large."
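To make the idea concrete, here is a minimal Python sketch of a size parser that accepts either a named size or a size in inches. It is only an illustration and not the grammar LumenVox deployed: the function names, the filler-word list, and the supported number words are all assumptions, and a real speech grammar would be written in the recognizer's own grammar format rather than as text matching.

```python
# Hypothetical vocabulary: the article names the three sizes, but the filler
# words and number words below are assumptions made for this sketch.
NAMED_SIZES = {"small", "medium", "large"}
FILLERS = {"please", "a", "an", "then", "pizza"}
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}
TEENS = {"ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13,
         "fourteen": 14, "fifteen": 15, "sixteen": 16, "seventeen": 17,
         "eighteen": 18, "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30}


def words_to_number(words):
    """Turn ['twenty', 'seven'] or ['27'] into an int, or return None."""
    if len(words) == 1:
        word = words[0]
        if word.isdigit():
            return int(word)
        return UNITS.get(word) or TEENS.get(word) or TENS.get(word)
    if len(words) == 2 and words[0] in TENS and words[1] in UNITS:
        return TENS[words[0]] + UNITS[words[1]]
    return None


def parse_size(utterance):
    """Return {'size': name}, {'inches': n}, or None when nothing matches."""
    # Treat hyphens and en dashes as spaces, then strip punctuation.
    text = utterance.lower().replace("-", " ").replace("\u2013", " ")
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text)
    tokens = [t for t in cleaned.split() if t not in FILLERS]
    if len(tokens) == 1 and tokens[0] in NAMED_SIZES:
        return {"size": tokens[0]}
    if len(tokens) >= 2 and tokens[-1] in ("inch", "inches"):
        inches = words_to_number(tokens[:-1])
        if inches is not None:
            return {"inches": inches}
    return None  # out of grammar: the application would play its no-match prompt
```

With this sketch, parse_size("Medium, then.") yields a named size and parse_size("Twenty-seven inch, please.") yields a size in inches, while an unrelated phrase still falls through to the no-match path.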

Unlike with the weather system, here we decided to add a grammar that accommodates a larger range of responses instead of simply rephrasing the prompt. In the case of the weather demo, there was no reasonable way to accommodate requests like "I need weather near my house." But the responses to the pizza demo were within a limited domain that could be easily handled by a modified grammar, so it made sense to make that change.

Before we made the changes to our live application, we needed to test the grammar. To do this, we transcribed a large number of utterances. The Tuner's built-in transcriber helps by automatically entering the Speech Engine's result into the transcript, but for out-of-grammar utterances we still need to spend time transcribing what the users said.

Once we had transcribed call data, we were able to make use of the Speech Tuner's Grammar Tester component. The tester takes transcribed interactions and gives us a list of the grammars that were active during the recognitions. It also gives us a wealth of statistics about recognition accuracy, based on the transcripts and the recognition results.
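The kind of per-grammar statistics described here can be approximated with a few lines of Python. This is a hedged sketch, not the Grammar Tester's actual output or data model: the Utterance record, the grammar names, and the two metrics (exact-match accuracy and out-of-grammar rate) are assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Utterance:
    grammar: str      # name of the grammar active for this recognition (hypothetical)
    transcript: str   # what the human transcriber heard
    recognized: str   # what the engine returned ("" for a no-match)
    in_grammar: bool  # whether the transcript is covered by the grammar


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    return " ".join(text.lower().split())


def summarize(utterances):
    """Group utterances by active grammar and compute simple accuracy figures."""
    counts = defaultdict(lambda: {"total": 0, "correct": 0, "out_of_grammar": 0})
    for u in utterances:
        c = counts[u.grammar]
        c["total"] += 1
        if normalize(u.recognized) == normalize(u.transcript):
            c["correct"] += 1
        if not u.in_grammar:
            c["out_of_grammar"] += 1
    return {
        name: {
            "total": c["total"],
            "accuracy": c["correct"] / c["total"],
            "out_of_grammar_rate": c["out_of_grammar"] / c["total"],
        }
        for name, c in counts.items()
    }


# Made-up sample data in the spirit of the pizza demo.
calls = [
    Utterance("pizza_size", "medium", "medium", True),
    Utterance("pizza_size", "twenty seven inch please", "", False),
    Utterance("pizza_size", "large", "large", True),
    Utterance("confirm", "yes", "yes", True),
]
report = summarize(calls)
```

A high out-of-grammar rate for one grammar, as in the made-up pizza_size data above, is the signal that the prompt or the grammar needs attention rather than the recognizer itself.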

The key feature of the tester is the ability to modify the grammars and then run the audio through the Speech Engine, getting new results based on our new grammars. This allowed us to evaluate how the application would handle our original responses with the new grammar entries (the ones that allowed users to specify a size in inches). We saw our semantic error rate drop significantly, because our grammars now accommodated what the users were saying. The grammar test provided us with empirical data to show that the change would be a beneficial one that could be done swiftly and without negative user impact.
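The before-and-after comparison can be sketched in Python under some simplifying assumptions: the grammars are modeled as plain functions over transcribed text instead of audio run through a speech engine, the sample utterances are invented, and "semantic error rate" is taken to mean the fraction of utterances whose interpretation differs from the one the transcriber expected. None of these names come from the Speech Tuner itself.

```python
import re

NAMED_SIZES = {"small", "medium", "large"}


def original_grammar(text):
    """Accept only the three named sizes; anything else is a no-match (None)."""
    cleaned = text.strip().lower()
    return cleaned if cleaned in NAMED_SIZES else None


def revised_grammar(text):
    """Accept the named sizes plus a hypothetical '<digits> inch(es)' form."""
    named = original_grammar(text)
    if named is not None:
        return named
    match = re.fullmatch(r"(\d+) inch(?:es)?", text.strip().lower())
    return f"{match.group(1)} inches" if match else None


def semantic_error_rate(grammar, labeled):
    """Fraction of utterances whose interpretation differs from the expected one."""
    errors = sum(1 for text, expected in labeled if grammar(text) != expected)
    return errors / len(labeled)


# Invented transcripts paired with the interpretation a transcriber would expect.
# An expected value of None means the utterance should be rejected.
labeled = [
    ("medium", "medium"),
    ("large", "large"),
    ("27 inch", "27 inches"),
    ("12 inches", "12 inches"),
    ("extra cheese", None),
]

before = semantic_error_rate(original_grammar, labeled)
after = semantic_error_rate(revised_grammar, labeled)
print(f"before: {before:.2f}, after: {after:.2f}")
```

Running both grammars over the same labeled set is the important design choice: it isolates the effect of the grammar change from any change in what callers say.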

Using a good tuning tool allows developers to quickly harness the only experiences that really matter: those of users. Only when we understand how our users actually interact with our speech applications can we plan improvements. And, with effective testing tools, we can accurately assess how those changes will affect our applications before deploying them in production environments.
