Press or Say (Preview)

Press or Say lets flow builders create a voice flow in which a caller can respond either by speaking or by entering DTMF digits on their phone’s dial pad. Whichever input they choose is then streamed directly to the IBM Watson speech recognition service. Press or Say has been launched in preview while we continue to fine-tune it and gather feedback.



  • Lower latency: This action streams directly to speech recognition services, such as IBM Watson, for faster speech recognition in the voice flow. This is an improvement over previous solutions, which involved chaining multiple actions together and relied on API calls to the speech recognition service.
  • Convert spoken numbers to digits: The backend improvements also allow spoken numbers to be converted to digits (for example, the spoken phrase “Two four seven three” becomes “2473”).
  • Interrupt: Because this action streams to the speech recognition service directly, it responds quickly to speech, so a caller can interrupt the audio prompt while it is still playing.
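To illustrate the spoken-number conversion described above, here is a minimal Python sketch. It is not the Press or Say backend, which performs this conversion for you; the `DIGIT_WORDS` table and the `spoken_to_digits` helper are hypothetical names used only to show the idea of mapping each spoken digit word to its digit.

```python
# Illustrative only: a hypothetical helper, not the actual Press or Say implementation.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def spoken_to_digits(transcript: str) -> str:
    """Convert a transcript of spoken digit words into a digit string.

    Raises ValueError if a word is not a recognized digit word.
    """
    digits = []
    for raw_word in transcript.lower().split():
        # Drop punctuation that a transcription might attach to a word.
        word = raw_word.strip(".,!?")
        if word not in DIGIT_WORDS:
            raise ValueError(f"Not a digit word: {raw_word!r}")
        digits.append(DIGIT_WORDS[word])
    return "".join(digits)


print(spoken_to_digits("Two four seven three"))  # prints 2473
```

In a real flow you would not need this step, since the variable produced by the action already contains the digits.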



This action closely resembles Collect DTMF and allows for the same configuration of the following items:



Press or Say creates a variable that contains the caller’s input (either DTMF digits or transcription of their speech). A switch action can be used to route the flow based on the contents of the variable.
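The routing logic that a switch action applies to this variable can be sketched in Python as follows. This is only an illustration under stated assumptions: the menu options ("1" or "sales", "2" or "support"), the branch names, and the `route_caller` function are all hypothetical, and in an actual flow the switch action is configured in the flow builder rather than written as code.

```python
# Illustrative only: hypothetical menu and branch names, not a product API.
def route_caller(caller_input: str) -> str:
    """Pick a branch from the caller's input (DTMF digits or a speech transcript)."""
    # Normalize so that " Sales " and "sales" are treated the same way.
    value = caller_input.strip().lower()
    if value in ("1", "sales"):
        return "sales"
    if value in ("2", "support"):
        return "support"
    # Anything unrecognized falls through to a default branch,
    # for example one that replays the prompt.
    return "default"


print(route_caller("1"))          # prints sales
print(route_caller(" Support "))  # prints support
```

Because the same variable holds either keypad digits or transcribed speech, each branch condition here accepts both forms of the same choice.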