What is “Voice Ordering Framework”?

Voice Ordering Framework comprises an API for a set of chatbot services, a set of SDKs, and related tools for a SaaS that enables multi-channel, voice- and language-based (conversational) ordering from restaurants. The framework includes a cloud-based set of services that handle API calls, along with multiple SDKs for various clients/channels. The tools include a tablet-based kitchen display system (KDS) and a merchant portal/dashboard service with its UI. The framework allows for integration with restaurant POS solutions. Although the framework is designed to be “unassisted AI” (“independent AI”), it includes a capability for user-requested transfers to human customer service agents.

A high-level overview of the solution architecture is available here: Voice Ordering Framework API and Client SDK Overview

Business Use Cases

  1. Cloud-based API + client SDK for a single location (an independent restaurant or one location of a chain), for any selected channels: phone, drive-thru, app, kiosk
  2. Cloud-based API + client SDK for multiple locations of a restaurant chain (using selected channels)
  3. White-label solution that can be licensed by vendors and provided to restaurants
  4. Any of the above, used as a fallback/safety net solution to back up existing voice ordering systems

Ideal Customer Profiles

  1. Restaurant with a high volume of pickup/delivery phone calls – pizza, sushi, diner/café, Mexican-style
  2. QSR (quick-service restaurant) with drive-thru lanes
  3. Fast Casual Restaurant (pickup/delivery)

Available As:

  1. An API using our proprietary cloud hosting
  2. NuGet packages + Architecture/Topology guide and deployment scripts

Client SDKs, Tools, Other Samples

  1. Phone Client SDK – a fully functioning ASP.NET service that uses Twilio APIs for speech recognition and speech generation, and that interfaces with the framework’s chatbot services and its order and payment processing services to enable phone-based dialogues with restaurant customers.
  2. Drive-thru Client SDK – a Windows UWP sample with speech functionality and a REST interface that can be easily adapted to other platforms (e.g., Java).
  3. App or Kiosk Client SDK – a Windows Xamarin sample app (fully functional on Android devices). It incorporates speech recognition and generation, makes API calls to the chatbot services for dialogue management/turn handling, and interfaces with Stripe and with the Order and Payment services for completed order handling and payment processing.
  4. Kitchen Display System (KDS), also referred to as the “merchant tablet app” – a fully functional Windows Xamarin app that can be deployed to Windows devices, Android tablets, or iPads.
  5. Menu creation tools – a set of command-line tools that help create menu manifests for a service provider; they streamline this process so that it typically takes approximately six person-hours or less.
  6. Merchant Portal Dashboard – based on Microsoft Blazor technology, this comprises a Windows web app and browser-agnostic client-side code. Restaurant owners and managers use this tool to view and manage their profile, menus (with items, modifiers, and pricing), restaurant customers, and orders.
  7. Order and Payment Services – an ASP.NET web service that handles orders and order payments, using Stripe as the payment processor.

Chatbot Services Architectural Overview

  1. The chatbot services use a hybrid approach that utilizes generative AI as a “task helper” rather than as part of the core infrastructure. This avoids the key implementation and deployment hurdle of having to train custom ML/LLM models per-location or per-chain.
  2. Chatbot services are stateless and fully scalable; this set of services uses a relational database as the underlying data store for both reference data and operational data.
  3. Natural language understanding (NLU) within the chatbot services uses a paradigm primarily involving intents and sequence labels. Generative AI is used selectively where needed.
  4. Dialogue Management (control flow management) within the chatbot services uses a proprietary method.
  5. Natural language generation (NLG) within the chatbot services uses a proprietary method. Natural language prompts to the customer are customizable via configuration files.
  6. The chatbot services’ interfaces to external systems include:
    1. An API method to get completed order information
    2. An API method to get pending and completed orders (e.g., this is used by the Kitchen Display System)
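
To make the “stateless services backed by a relational database” point in item 2 more concrete, here is a minimal sketch of the general pattern. It is not the framework’s implementation: the table name, the column names, and the `init_store` and `handle_turn` functions are all hypothetical, and SQLite merely stands in for whichever relational database is actually used. The idea shown is that no dialogue state lives in the service process between turns, so any service instance can handle any turn.

```python
import json
import sqlite3


def init_store(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that holds per-dialogue state."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dialogue_state ("
        "dialogue_id TEXT PRIMARY KEY, state_json TEXT NOT NULL)"
    )
    conn.commit()


def handle_turn(conn: sqlite3.Connection, dialogue_id: str, item: str) -> dict:
    """Process one turn without keeping anything in process memory.

    The state is loaded from the database, updated, and written back, so the
    next turn can be served by a different service instance.
    """
    row = conn.execute(
        "SELECT state_json FROM dialogue_state WHERE dialogue_id = ?",
        (dialogue_id,),
    ).fetchone()
    state = json.loads(row[0]) if row else {"items": []}

    # Stand-in for the real NLU/dialogue logic: just record the ordered item.
    state["items"].append(item)

    conn.execute(
        "INSERT OR REPLACE INTO dialogue_state (dialogue_id, state_json) "
        "VALUES (?, ?)",
        (dialogue_id, json.dumps(state)),
    )
    conn.commit()
    return state
```

Because each call reads and writes the full state, scaling out becomes a matter of adding service instances that share the same database.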

Main Benefits of the Framework

  1. Unassisted AI
  2. High level of accuracy (high completion rate; 95%+ is possible)
  3. Uses a unique hybrid/blend of traditional dialogue management and generative AI, using techniques such as RAG and sophisticated prompting
  4. Highly customizable using reference data and configuration settings
  5. Integration points with payments, KDS, POS, etc.
  6. Multi-channel, with SDKs for Drive-thru, Phone, App, Kiosk
  7. All cloud services are fully scalable
  8. Ease of deployment – fully cloud-based API, no model training
  9. Low operational costs

Deployment Guide High-Level Overview

  1. (Note: These steps are for deployment using the SaaS hosted and provided by Software Engineering Concepts, Inc. Deployment using the Voice Ordering Framework NuGet packages is not described here.)
  2. (“Location” is used here to refer to an independent restaurant or to one location of a restaurant chain).
  3. Reference Data preparation and creation: 
    • A location that has been selected as a deployment site is referred to as a “Service Provider”. The first step is to create a Service Provider profile manifest, a JSON document that contains information such as contact information for owner(s) or managers, physical address, hours of operation, and various configuration settings.
    • The next step is to use the Menu Creation command-line tools to generate and create all menu data (including prices and attributes such as suggested items). 
  4. Creation of database reference entities: use the APIs (programmatically or from the command line) to create a chatbot instance object and a service provider object (which includes all service provider reference information).
  5. Creation of client(s). Client settings include API keys for invocation of chatbot services and other services as needed (e.g., order and payment processing).
  6. Event-based client code execution: newly created clients can immediately come to life in response to customer input. A client will typically wait for a user-initiated “start” event, at which time it will invoke the chatbot services “CreateDialogue” API. Thereafter, the client will call the chatbot services “HandleTurn” API to process user input for each turn of the dialogue (the order-handling conversation).
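
The event-based client flow in step 6 can be sketched roughly as follows. Only the API names “CreateDialogue” and “HandleTurn” come from the text above; the `DialogueClient` class, the request and response field names (`apiKey`, `serviceProviderId`, `dialogueId`, `utterance`, `prompt`), and the `fake_transport` stand-in are assumptions made for illustration, not the framework’s actual schema. A real client would replace the transport with an authenticated HTTPS call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Any callable taking (api_name, payload) and returning a response dict.
Transport = Callable[[str, Dict], Dict]


@dataclass
class DialogueClient:
    transport: Transport
    api_key: str
    service_provider_id: str
    dialogue_id: Optional[str] = None
    prompts: List[str] = field(default_factory=list)

    def on_start(self) -> str:
        """Handle the user-initiated "start" event by creating a dialogue."""
        reply = self.transport("CreateDialogue", {
            "apiKey": self.api_key,                         # assumed field
            "serviceProviderId": self.service_provider_id,  # assumed field
        })
        self.dialogue_id = reply["dialogueId"]
        self.prompts.append(reply["prompt"])
        return reply["prompt"]

    def on_user_input(self, utterance: str) -> str:
        """Send one turn of user input and return the next prompt."""
        if self.dialogue_id is None:
            raise RuntimeError("on_start must be called before on_user_input")
        reply = self.transport("HandleTurn", {
            "apiKey": self.api_key,
            "dialogueId": self.dialogue_id,
            "utterance": utterance,
        })
        self.prompts.append(reply["prompt"])
        return reply["prompt"]


def fake_transport(api_name: str, payload: Dict) -> Dict:
    """In-memory stand-in for the cloud API, for local experimentation only."""
    if api_name == "CreateDialogue":
        return {"dialogueId": "d-1",
                "prompt": "Welcome. What would you like to order?"}
    if api_name == "HandleTurn":
        return {"prompt": "Heard: " + payload["utterance"]}
    raise ValueError("unknown API: " + api_name)


if __name__ == "__main__":
    client = DialogueClient(fake_transport, api_key="demo-key",
                            service_provider_id="sp-1")
    print(client.on_start())
    print(client.on_user_input("one large pizza"))
```

Injecting the transport keeps the event handlers independent of the channel, so the same turn loop could sit behind a phone, drive-thru, app, or kiosk front end.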

Reference Knowledge

The Voice Ordering Framework uses reference knowledge: information that mainly consists of service provider details covering location, personnel, services, and menu, as needed. This reference knowledge is stored both as JSON files in cloud storage and in a relational database. The reference knowledge object model is described here: Object Models
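
As an illustration of what a JSON Service Provider profile manifest might look like, the sketch below builds a small example and checks it for the kinds of information the deployment guide mentions (contacts, physical address, hours of operation, configuration settings). Every key name, the placeholder values, and the `missing_keys` helper are hypothetical; the actual schema is defined by the framework’s object models.

```python
import json

# Top-level sections assumed for this sketch; not the framework's real schema.
REQUIRED_KEYS = ("contacts", "address", "hoursOfOperation", "settings")


def missing_keys(manifest: dict) -> list:
    """Return the assumed required sections that are absent from a manifest."""
    return [key for key in REQUIRED_KEYS if key not in manifest]


# Placeholder data for illustration only.
example_manifest = json.loads("""
{
  "name": "Example Pizza",
  "contacts": [{"role": "owner", "name": "A. Owner", "phone": "555-0100"}],
  "address": {"street": "1 Main St", "city": "Springfield"},
  "hoursOfOperation": {"mon-fri": "11:00-22:00", "sat-sun": "12:00-23:00"},
  "settings": {"channels": ["phone", "kiosk"]}
}
""")
```

A check like `missing_keys(example_manifest)` could run before the manifest is uploaded, so that incomplete profiles are caught before any database reference entities are created.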