MRCPv2 (Media Resource Control Protocol Version 2) is a standard used in the field of ASR (Automatic Speech Recognition). It enables communication between voice processing applications, speech resource servers and telephony clients, standardizing their interactions and making it easier to integrate these components into larger environments.
More concretely, it is a key tool for setting up high-performance IVRs (Interactive Voice Response systems) or Voicebots (conversational agents).
MRCP custom or built-in grammars
A fundamental feature of the MRCPv2 protocol is its modularity and scalability, which translates into the ability to set up custom or built-in grammars. The possibilities these tools open up are almost endless. They not only improve the accuracy of ASR results by adding a layer of interpretation, but also let you handle use cases that are normally out of reach for traditional language models. For each turn of speech, you can choose the most appropriate grammar according to your needs and the expected return, as sketched below.
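To make this per-turn selection concrete, here is a minimal Python sketch; the grammar URIs and the recognize() helper are hypothetical placeholders, not Allo-Media's actual API:

# Minimal sketch: choose a grammar for each dialog step.
# The grammar URIs and the recognize() callable are hypothetical.
GRAMMAR_BY_STEP = {
    "ask_file_number": "builtin:grammar/spelling",
    "ask_address": "builtin:grammar/address",
    "ask_queue": "builtin:grammar/keyword",
    "ask_appointment": "builtin:grammar/date",
}

def run_turn(step, recognize):
    """Run one recognition turn with the grammar matching the dialog step."""
    grammar_uri = GRAMMAR_BY_STEP[step]
    return recognize(grammar_uri)  # expected to return the <instance> payload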
Here are a few examples of the custom grammars we’ve implemented at Allo-Media and the corresponding use cases:
💬 Spelling:
This grammar lets you interpret the transcription result by forcing letters and digits to be recognized as such and by removing extraneous words.
Usage cases: there are many. Spelling mode can be used to recognize identification numbers, file numbers, credit card numbers…
Example: In my voicebot, I want the user to be able to give me their file number.
💬 User: “My file number is ABC123.”
Allo-Media returns as value:
<instance>abc123</instance>
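As a quick illustration (not Allo-Media's official client code), such a fragment can be read with Python's standard XML parser:

import xml.etree.ElementTree as ET

# Extract the spelled value from the returned <instance> fragment.
instance = ET.fromstring("<instance>abc123</instance>")
file_number = instance.text  # "abc123"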
📍 Address:
This grammar will enable you to understand, transcribe and structure postal addresses based on the INSEE dictionary.
Usage case: when you want to know a customer’s address, a delivery address…
Example: In my voicebot, I want to be able to retrieve the user’s postal address.
💬 User: “So… my address is… 18 Boulevard Pasteur, Paris 15, 6th floor.”
Allo-Media returns as value:
<instance>
  <address>
    <number>18</number>
    <street>boulevard pasteur</street>
    <zipcode>75015</zipcode>
    <city>paris</city>
    <complement>6ème étage</complement>
  </address>
</instance>
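As a hedged sketch, assuming only the <instance> fragment shown above (the full response envelope may differ), the address fields can be collected into a Python dict:

import xml.etree.ElementTree as ET

ADDRESS_XML = """<instance>
  <address>
    <number>18</number>
    <street>boulevard pasteur</street>
    <zipcode>75015</zipcode>
    <city>paris</city>
    <complement>6ème étage</complement>
  </address>
</instance>"""

# Map each child of <address> to its text content.
address = {child.tag: child.text for child in ET.fromstring(ADDRESS_XML).find("address")}
# {'number': '18', 'street': 'boulevard pasteur', 'zipcode': '75015',
#  'city': 'paris', 'complement': '6ème étage'}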
🏷 Keyword:
This tool identifies and retrieves a predefined keyword from an answer provided by the speech recognition engine.
Usage case: when setting up an IVR, this directs the user to the corresponding queue or dialog step.
Example: In my IVR, I want the user to be able to choose between the “file” or “claim” queues according to their needs.
💬 User: “I’d like to make a claim.”
Allo-Media returns as value:
<instance>claim</instance>
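On the IVR side, routing can then be a simple lookup on the returned keyword; the queue names below are hypothetical:

import xml.etree.ElementTree as ET

QUEUES = {"file": "file_queue", "claim": "claims_queue"}  # hypothetical queue names

keyword = ET.fromstring("<instance>claim</instance>").text
target_queue = QUEUES.get(keyword, "fallback_queue")  # unknown answers go to a fallback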
📅 Date:
The built-in date grammar can analyze, understand and structure dates in multiple formats. It can also interpret relative dates.
Usage case: you need to determine an appointment date, retrieve a contract start or end date, a delivery date…
Example: In my voicebot, I want the user to be able to tell me when they will be available for an appointment.
💬 User: “I’m available next Tuesday.”
Allo-Media returns the following value:
<instance>
  <date>
    <day>30</day>
    <month>01</month>
    <year>2024</year>
  </date>
</instance>
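Again as an illustrative sketch based only on the fragment above, the structured return converts directly into a native Python date:

import xml.etree.ElementTree as ET
from datetime import date

DATE_XML = "<instance><date><day>30</day><month>01</month><year>2024</year></date></instance>"

d = ET.fromstring(DATE_XML).find("date")
appointment = date(int(d.findtext("year")), int(d.findtext("month")), int(d.findtext("day")))
# datetime.date(2024, 1, 30)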
By coupling these grammars with scalable, customized language models, you’ll be able to boost the performance of your IVRs and Voicebots, and above all work on new use cases that were previously difficult to handle.