Integration overview for real-time audio capture from Twilio Flex

Plan and prepare for integrating the Verint recording solution with Twilio Flex by understanding the required components, how they communicate, and who is responsible for deploying them.

Diagram: Components for real-time audio and agent screen capture

Components for real-time audio capture

  1. Agent phone

  2. Twilio Flex contact center

  3. Amazon Kinesis streams

    Amazon Web Services (AWS) is used to stream CTI events.

  4. Web application

  5. Load balancer

    Deploy a load balancer to manage traffic and to improve the availability and scalability of your recording solution. A load balancer forwards the audio streaming requests from Twilio Flex to the Verint Recorder Adapter Proxy Service servers.

  6. Recorder Adapter Proxy Service, which hosts the Twilio Media Adapter

    The Twilio Media adapter invokes the Media Streams API to start audio streaming, listens for WebSocket connections, and redirects the audio packets.

  7. IP Recorder

  8. Recorder Integration Service, which hosts the Twilio Event Stream Adapter

  9. Verint Data Center

The customer is responsible for configuring the Twilio Flex contact center.

Communications for real-time audio capture

  1. Call between a customer and an agent over the Twilio Flex contact center.

  2. WebSockets carry text messages that contain the call metadata and the audio. The messages are in JSON format, and the audio payload is Base64 encoded.

  3. Agent events and call events (real-time)

  4. Interaction consolidation
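The WebSocket messages described in item 2 can be unpacked with a few lines of code. The following is a minimal sketch, assuming each audio message is a JSON object with an `event` field set to `media`, a `streamSid` field, and a Base64-encoded `media.payload` field, as in Twilio Media Streams messages. The function name `decode_media_message` and the sample values are hypothetical and are not part of the Verint product.

```python
import base64
import json


def decode_media_message(raw: str):
    """Return (stream_sid, audio_bytes) for a media message, else None."""
    message = json.loads(raw)
    if message.get("event") != "media":
        # Other message types (assumed here) carry metadata only, no audio.
        return None
    payload = message["media"]["payload"]
    return message.get("streamSid"), base64.b64decode(payload)


# Build a sample message locally; the SID and the audio bytes are made up.
sample_audio = b"\xff\x7f\x00\x10"
sample = json.dumps({
    "event": "media",
    "streamSid": "MZ-example",
    "media": {
        "track": "inbound",
        "payload": base64.b64encode(sample_audio).decode("ascii"),
    },
})

stream_sid, audio = decode_media_message(sample)
```

Base64 is an encoding, not encryption. Protection of the stream in transit would come from the transport layer, for example a secure WebSocket connection.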

Verint retrieves interactions from the contact center using the Twilio API, which is a set of RESTful APIs.

Call flow for real-time audio capture (A, B, C, D, and E)

Verint starts and stops capturing the audio from a customer engagement based on the events it receives from the external system. The recording solution captures the calls that a monitored agent makes and takes. When the last monitored agent leaves the call, recording stops.

Depending on how the Twilio Flex contact flows are configured, when a monitored agent calls a customer, the call is recorded.

The following scenarios are never recorded:

  • Calls between monitored agents

  • Calls transferred to parties outside the Twilio switch

  • Transfers to unmonitored parties
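The recording rules above can be sketched as a small decision function. This is a simplified reading of the rules, not the logic that Verint implements: it assumes that a call is recorded only while at least one monitored agent and at least one other party are on the call. The function name `is_recorded` and the party identifiers are hypothetical.

```python
def is_recorded(parties, monitored_agents):
    """Decide whether a call with the given parties would be recorded.

    parties: identifiers of everyone currently on the call.
    monitored_agents: identifiers of the agents configured for recording.
    """
    on_call = set(parties)
    monitored_on_call = on_call & set(monitored_agents)
    if not monitored_on_call:
        # The last monitored agent has left, or none ever joined.
        return False
    if monitored_on_call == on_call:
        # The call is only between monitored agents.
        return False
    return True


monitored = {"alice@example.com", "bob@example.com"}
customer_call = is_recorded({"customer", "alice@example.com"}, monitored)
agent_to_agent = is_recorded({"alice@example.com", "bob@example.com"}, monitored)
after_transfer = is_recorded({"customer", "carol@example.com"}, monitored)
```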

A typical call flow proceeds as follows:

  1. A customer calls an agent.

  2. The Verint Twilio Event Stream adapter (on the Recorder Integration Service) receives the CTI event “Reservation.Created”.

  3. The Twilio Event Stream adapter sends the CTI event “Reservation.Created” to the Twilio Event Stream tracker.

  4. The Twilio Event Stream tracker creates a call.

  5. The agent answers the phone.

  6. The Twilio Event Stream adapter receives the CTI event “Reservation.Accepted”.

  7. The Twilio Event Stream adapter sends the CTI event “Reservation.Accepted” to the Twilio Event Stream tracker (on the RIS).

  8. The Twilio Event Stream tracker adds a connection and the current call status (Alerting, Ringing, Active, or Hold) to the call.

  9. The Stream Manager sends a StartDuplicateStream message to the Twilio Event Stream adapter.

    If the StartDuplicateStream message does not include the call SID, the Twilio Event Stream adapter calls the Twilio Conference API to get it. The call SID is required to start the stream.

  10. The Twilio Event Stream adapter sends a request to the Twilio Start Stream API to start streaming audio from the call identified by the call SID.

  11. The Twilio Start Stream API connects to the WebSocket Target URL, which goes through the load balancer.

    The load balancer directs the request to one of the Recorder Adapter Proxy Service (RAPS) servers in the pool.

  12. The Twilio Media adapter (on the RAPS) creates a WebSocket connection. It receives the audio and sends it to the IP Recorder through the Recorder Media Interface (RMI) client (on the RAPS).

  13. The IP Recorder starts recording, creates an INUM, and sends a STARTED message to the Recorder Integration Service.

  14. The Recorder Integration Service receives the STARTED message and matches the INUM to the correct session and contact by using the agent’s extension.

    The extension is the user name that the agent uses to sign in to their Twilio Flex account. This user name is the agent's email address.

  15. The call ends when the customer or the agent hangs up the phone.

  16. The Stream Manager sends a StopDuplicateStream message to the Twilio Event Stream adapter.

  17. The Twilio Event Stream adapter sends a request to the Twilio Stop Stream API to stop streaming audio from the call SID.

  18. The WebSocket connection is closed.

  19. The Twilio Media adapter sends a stop message to the IP Recorder.

  20. The call is consolidated in the database.

    You can now search for and play back the call in the Verint Risk Management and Interactions modules.
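The event handling in steps 2 through 8 can be illustrated with a small state tracker. This is a hypothetical sketch, not the Verint implementation: the class names, the `task_id` key, and the choice of "Alerting" as the initial status and "Active" as the status after the agent answers are all assumptions made for illustration. Only the event names and the list of possible statuses come from the call flow above.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedCall:
    task_id: str
    status: str = "Alerting"  # assumed initial status
    connections: list = field(default_factory=list)


class EventStreamTracker:
    """Toy tracker that mirrors steps 4 and 8 of the call flow."""

    def __init__(self):
        self.calls = {}

    def handle(self, event_type, task_id, agent=None):
        if event_type == "Reservation.Created":
            # Step 4: the tracker creates a call.
            self.calls[task_id] = TrackedCall(task_id)
        elif event_type == "Reservation.Accepted":
            # Step 8: the tracker adds a connection and updates the status.
            call = self.calls[task_id]
            call.connections.append(agent)
            call.status = "Active"
        return self.calls.get(task_id)


tracker = EventStreamTracker()
created = tracker.handle("Reservation.Created", "task-1")
status_after_created = created.status
accepted = tracker.handle("Reservation.Accepted", "task-1", agent="agent@example.com")
```

The agent is identified by an email address in this example because the extension used to match recordings is the agent's Twilio Flex user name.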
