What is “Voice Ordering Framework”?
Voice Ordering Framework is a SaaS offering made up of an API for a set of chatbot services, a set of SDKs, and related tools. Together they enable multi-channel, voice- and language-based (conversational) ordering from restaurants. The framework includes a cloud-based set of services that handle API calls, along with multiple SDKs for various clients/channels. The tools include a tablet-based kitchen display system (KDS) and a merchant portal/dashboard service with its UI. The framework allows for integration with restaurant POS solutions. It is designed to be “unassisted AI” (also called “independent AI”); nevertheless, it includes a capability for user-requested transfers to human customer service agents.
A high-level overview of the solution architecture is here: Voice Ordering Framework API and Client SDK Overview
Business Use Cases
- Cloud-based API + client SDK for a single location (an independent restaurant or one location of a chain), for any selected channels: phone, drive-thru, app, kiosk
- Cloud-based API + client SDK for multiple locations of a restaurant chain (using selected channels)
- White-label solution that can be licensed by vendors and provided to restaurants
- Any of the above, used as a fallback/safety net solution to back up existing voice ordering systems
Ideal Customer Profiles
- Restaurant with high volume of pickup/delivery phone calls – pizza, sushi, diner/café, Mexican-style
- QSR (quick-service restaurant) with drive-thru lanes
- Fast Casual Restaurant (pickup/delivery)
Available As
- An API using our proprietary cloud hosting
- NuGet packages, plus an architecture/topology guide and deployment scripts
Client SDKs, Tools, Other Samples
- Phone Client SDK – a fully functioning ASP.NET service that uses Twilio APIs for speech recognition and speech generation. It interfaces with the framework’s chatbot services and with the order and payment processing services to enable phone-based dialogues with restaurant customers.
- Drive-thru Client SDK – a Windows UWP sample with speech functionality and a REST interface; it can be readily adapted to other platforms, e.g. Java.
- App or Kiosk Client SDK – a Windows Xamarin sample app (fully functional on Android devices). It incorporates speech recognition and generation, makes API calls to the chatbot services for dialogue management/turn handling, and interfaces with Stripe and with the Order and Payment services for completed order handling and payment processing.
- Kitchen Display System (KDS), also referred to as the “merchant tablet app” – a fully functional Windows Xamarin app that can be deployed to Windows, Android tablet, or iPad.
- Menu creation tools – a set of command-line tools that help create menu manifests for a service provider; they streamline the process so that it typically takes about 6 person-hours or less.
- Merchant Portal Dashboard – based on Microsoft Blazor technology, this comprises a Windows web app and browser-agnostic client-side code. Restaurant owners and managers use it to view and manage their profile, menus (items, modifiers, and pricing), customers, and orders.
- Order and Payment Services – an ASP.NET web service that handles orders and order payments, using Stripe as the payment processor.
Chatbot Services Architectural Overview
- The chatbot services use a hybrid approach in which generative AI acts as a “task helper” rather than as part of the core infrastructure. This avoids the key implementation and deployment hurdle of having to train custom ML/LLM models per location or per chain.
- Chatbot services are stateless and fully scalable; they use a relational database as the underlying data store for both reference data and operational data.
- Natural language understanding (NLU) within the chatbot services uses a paradigm primarily involving intents and sequence labels. Generative AI is used selectively where needed.
- Dialogue Management (control flow management) within the chatbot services uses a proprietary method.
- Natural language generation (NLG) within the chatbot services uses a proprietary method. Natural language prompts to the customer are customizable via configuration files.
- The chatbot services’ interfaces to external systems include:
  - An API method to get completed order information
  - An API method to get pending and completed orders (used, for example, by the Kitchen Display System)
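To make the "intents and sequence labels" paradigm mentioned above more concrete, here is a minimal sketch in Python. It assumes the common BIO tagging convention (B- begins a slot, I- continues it, O is outside any slot). The intent name, slot names, and the `NluResult` type are hypothetical and chosen for illustration only; the framework's actual label set and data model are not described in this document.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NluResult:
    """Hypothetical container for one analyzed utterance."""
    intent: str          # one label for the whole utterance
    tokens: List[str]    # the utterance split into tokens
    labels: List[str]    # one BIO sequence label per token


def extract_slots(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (slot_name, text) pairs."""
    slots: List[Tuple[str, str]] = []
    name, words = None, []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A new slot begins; close any slot that was open.
            if name is not None:
                slots.append((name, " ".join(words)))
            name, words = label[2:], [token]
        elif label.startswith("I-") and name == label[2:]:
            # Continuation of the currently open slot.
            words.append(token)
        else:
            # "O" (or an inconsistent I- tag) closes the open slot.
            if name is not None:
                slots.append((name, " ".join(words)))
            name, words = None, []
    if name is not None:
        slots.append((name, " ".join(words)))
    return slots


# Illustrative utterance: "two large pepperoni pizzas please"
result = NluResult(
    intent="add_item",  # hypothetical intent name
    tokens=["two", "large", "pepperoni", "pizzas", "please"],
    labels=["B-quantity", "B-size", "B-item", "I-item", "O"],
)
slots = extract_slots(result.tokens, result.labels)
```

Here the intent classifies what the customer wants to do, while the sequence labels pick out the order details that a dialogue manager would then act on.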
Main Benefits of the Framework
- Unassisted AI
- High level of accuracy (order completion rates of 95%+ are possible)
- Uses a unique hybrid of traditional dialogue management and generative AI, with techniques such as retrieval-augmented generation (RAG) and sophisticated prompting
- Highly customizable using reference data and configuration settings
- Integration points with payments, KDS, POS, etc.
- Multi-channel, with SDKs for Drive-thru, Phone, App, Kiosk
- All cloud services are fully scalable
- Ease of deployment – fully cloud-based API, no model training
- Low operational costs
Deployment Guide High-Level Overview
- (Note: these steps are for deployment using the SaaS hosted and provided by Software Engineering Concepts, Inc. Deployment using the Voice Ordering Framework NuGet packages is not described here.)
- (“Location” is used here to refer to an independent restaurant or to one location of a restaurant chain.)
- Reference data preparation and creation:
  - A location that has been selected as a deployment site is referred to as a “Service Provider”. The first step is to create a Service Provider profile manifest, a JSON document that contains information such as contact information for owners or managers, physical address, hours of operation, and various configuration settings.
  - The next step is to use the Menu Creation command-line tools to generate all menu data (including prices and attributes such as suggested items).
- Creation of database reference entities: use the APIs (programmatically or from the command line) to create a chatbot instance object and a service provider object (which includes all service provider reference information).
- Creation of client(s). Client settings include API keys for invoking the chatbot services and other services as needed (e.g., order and payment processing).
- Event-based client code execution: newly created clients can respond to customer input immediately. A client typically waits for a user-initiated “start” event, at which time it invokes the chatbot services “CreateDialogue” API. Thereafter, the client calls the chatbot services “HandleTurn” API to process user input for each turn of the dialogue (the order-handling conversation).
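The event-based client flow in the last step can be sketched as follows. This is an illustrative outline only: the API names "CreateDialogue" and "HandleTurn" come from the text above, but the payload fields (`apiKey`, `dialogueId`, `utterance`), the response fields (`prompt`, `completed`), and the `DialogueClient` class are assumptions. The transport is injected so the sketch runs without a network; `fake_transport` stands in for the real HTTP call.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# A transport takes an API method name and a payload, and returns the
# decoded response. In a real client this would wrap an HTTP POST.
Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]


class DialogueClient:
    """Hypothetical client wrapper around the two dialogue API calls."""

    def __init__(self, post: Transport, api_key: str) -> None:
        self._post = post
        self._api_key = api_key
        self.dialogue_id: Optional[str] = None

    def start(self) -> str:
        """Handle the user-initiated "start" event: open a new dialogue."""
        reply = self._post("CreateDialogue", {"apiKey": self._api_key})
        self.dialogue_id = reply["dialogueId"]  # assumed response field
        return reply["prompt"]                  # assumed response field

    def handle_turn(self, utterance: str) -> Tuple[str, bool]:
        """Send one turn of user input; return (next prompt, completed?)."""
        if self.dialogue_id is None:
            raise RuntimeError("call start() before handle_turn()")
        reply = self._post("HandleTurn", {
            "apiKey": self._api_key,
            "dialogueId": self.dialogue_id,
            "utterance": utterance,
        })
        return reply["prompt"], reply.get("completed", False)


def fake_transport(method: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the hosted service, for demonstration only."""
    if method == "CreateDialogue":
        return {"dialogueId": "d-1", "prompt": "Welcome. What would you like to order?"}
    if method == "HandleTurn":
        done = "that's all" in payload["utterance"].lower()
        prompt = "Your order is placed." if done else "Anything else?"
        return {"prompt": prompt, "completed": done}
    raise ValueError(f"unknown API method: {method}")


client = DialogueClient(fake_transport, api_key="demo-key")
greeting = client.start()
first_reply = client.handle_turn("A large pepperoni pizza")
last_reply = client.handle_turn("That's all")
```

Keeping the dialogue identifier on the client side is one way to fit the stateless services described earlier: each "HandleTurn" call carries everything the service needs to locate the conversation.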
Reference Knowledge
The Voice Ordering Framework uses reference knowledge: information that consists mainly of service provider details (location, personnel as needed, services, and menu). This reference knowledge is stored both as JSON files in cloud storage and in a relational database. The reference knowledge object model is described here: Object Models
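As an illustration of reference knowledge in JSON form, the sketch below builds a Service Provider profile manifest with the kinds of information listed in the deployment guide (contacts, physical address, hours of operation, configuration settings). Every field name and value here is hypothetical; the actual schema is defined by the object model referenced above, not by this example.

```python
import json
from typing import Any, Dict, List

# Hypothetical top-level fields, chosen to mirror the categories named in
# the deployment guide. The real manifest schema may differ.
REQUIRED_FIELDS = ("name", "contacts", "address", "hoursOfOperation", "settings")


def validate_manifest(manifest: Dict[str, Any]) -> List[str]:
    """Return the required top-level fields missing from the manifest."""
    return [field for field in REQUIRED_FIELDS if field not in manifest]


manifest = {
    "name": "Example Pizzeria",
    "contacts": [
        {"role": "owner", "name": "A. Owner", "phone": "+1-555-0100"},
    ],
    "address": {
        "street": "1 Example Street",
        "city": "Exampleville",
        "postalCode": "00000",
    },
    "hoursOfOperation": {
        "mon-fri": {"open": "11:00", "close": "22:00"},
        "sat-sun": {"open": "12:00", "close": "23:00"},
    },
    "settings": {
        "channels": ["phone", "app"],         # assumed setting
        "allowTransferToAgent": True,         # assumed setting
    },
}

missing = validate_manifest(manifest)
manifest_json = json.dumps(manifest, indent=2)  # text form for cloud storage
```

A check like `validate_manifest` is cheap to run before uploading a manifest, since a missing section is easier to catch at preparation time than during a live dialogue.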