Speech analytics relies on transcribing what the customer and the agent say on recorded calls to arm contact centers with customer intelligence such as buying behavior and competitive insight. What many don’t realize is that the transcription engine is subject to ‘garbage in, garbage out’. That is, if the engine cannot clearly distinguish what is being said by each party, it cannot accurately transcribe the conversation and yield usable intelligence.
Consider a talk-over scenario, in which both parties are talking at the same time. The transcription engine has a difficult time separating the two voice streams. The same thing happens when the recorded interaction is of poor audio quality: the transcription engine can’t accurately distill what it hears.
Capturing accurate audio comes down to the call recording software. When selecting a call recorder to feed your speech analytics software, you need to consider several elements, including:
- Stereo recording – Dual-channel stereo call recording provides much higher audio quality upon playback. The transcription engine hears each call participant on a separate channel – the agent on the right and the customer on the left, for example. This keeps the two voices apart, so talk-over by one party does not obscure what the other is saying. Speaker separation can also be particularly important if a discrepancy arises or if a potential HIPAA or PCI compliance infraction may have occurred. You need to be able to prove who said what, and when.
- Enhanced audio quality – In addition to speaker-separated audio, you want your recording solution to support high-quality audio codecs, including G.711 and Opus. This gives the transcription engine higher-quality audio to work with.
  - “G.711 is a narrowband audio codec that provides toll-quality audio at 64 kbit/s. G.711 passes audio signals in the range of 300–3400 Hz and samples them at the rate of 8,000 samples per second” (Wikipedia).
  - “OPUS is a lossy audio codec which provides remarkable audio quality, especially at low bitrates” (Auphonic).
- Metadata augmentation – Augmenting your speech analytics data with additional data sources enhances the value you derive from your analytics. By collecting non-audio data from CRM, ACD or agent desktop applications, your call recorder can attach that intelligence to each audio recording, going beyond the spoken word captured by the transcription engine. This improves your ability to correlate data, discover sales and marketing patterns and easily pinpoint specific types of interactions.
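To make the stereo point concrete, here is a minimal Python sketch of what speaker-separated audio allows downstream. It assumes the recording has already been decoded to 16-bit linear PCM with the two parties on separate channels; the function names, the 160-sample frame size and the energy threshold are illustrative assumptions, not part of any particular recorder or analytics product.

```python
import array
import sys
import wave


def split_stereo(pcm: bytes) -> tuple[array.array, array.array]:
    """Deinterleave 16-bit little-endian stereo PCM (L, R, L, R, ...)."""
    samples = array.array("h")
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples[0::2], samples[1::2]


def load_stereo_wav(path: str) -> tuple[array.array, array.array]:
    """Read a 16-bit stereo WAV file and return (left, right) channels."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo PCM")
        return split_stereo(wav.readframes(wav.getnframes()))


def talk_over_ratio(left, right, frame=160, threshold=500):
    """Fraction of frames in which both channels are active at once.

    frame=160 samples is 20 ms at 8,000 samples per second; the
    threshold on mean absolute amplitude is an arbitrary placeholder.
    """
    def active(channel, start):
        chunk = channel[start:start + frame]
        return sum(abs(s) for s in chunk) / len(chunk) > threshold

    starts = range(0, min(len(left), len(right)) - frame + 1, frame)
    if not starts:
        return 0.0
    both = sum(1 for i in starts if active(left, i) and active(right, i))
    return both / len(starts)
```

With the channels separated, each one can be passed to the transcription engine on its own, and a measure like `talk_over_ratio` can flag calls where the parties spoke over each other. With a mono recording, neither step is possible without first trying to separate the speakers from mixed audio.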
As this article shows, several elements can dramatically affect the value your speech analytics solution provides. That value is not solely dependent on the capabilities of your speech analytics software or the transcription engine it uses. The call recording software is vital to the entire process.
Learn about OrekaAC Streaming Audio Capture Service to bolster the value of your speech analytics solution.