Natural language understanding (NLU)
NLU is one of the key capabilities a chatbot must have. It enables the bot to interpret requests and respond to them in line with clients’ expectations.
Chatbots apply NLU algorithms to solve two basic problems: detecting the communicative intention (intent) of their interlocutor and recognizing any mentioned named entities.
NLU features
The NLU core has the following features:
- Detecting user intents. An intent is the key unit of the NLU service: it combines a set of phrases, the user intention they express, and other metadata.
- System and custom entities. An entity is a unit of the NLU core: a sequence of words linked by an intent or rule, for example a name, a date and time, or a location.
- Client entities are entities that can be personalized by the client during a conversation with the bot. The contents of such an entity are accessible to the client only. Client entities are used when personalization is required to identify intents.
- Patterns are formal rules that describe keywords and expressions. You can use patterns to assign a client reply to one of the existing system states that defines state-specific reactions.
- Slot filling is the process of inquiring about additional details in order to process a client request. The data acquired during an additional data request are available for use in the script.
- Data labeling is a tool for extracting, from the loaded data, the message subjects that the bot will respond to.
- You can use your trained classifier in external applications via the NLP Direct API.
- Extended NLU settings. You can configure additional NLU options that are unique to each project.
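The slot filling item in the list above can be pictured with a minimal Python sketch. It is only an illustration of the idea, not JAICP code: the function name `next_question` and the slot and prompt dictionaries are hypothetical.

```python
from typing import Dict, Optional


def next_question(required: Dict[str, str], filled: Dict[str, str]) -> Optional[str]:
    """Return the prompt for the first required slot that is still empty.

    required: slot name -> question to ask the client (hypothetical format).
    filled:   slot name -> value already extracted from the conversation.
    Returns None once every required slot has a value.
    """
    for slot, prompt in required.items():
        if slot not in filled:
            return prompt
    return None


# Hypothetical slots for a booking request.
REQUIRED_SLOTS = {"city": "Which city?", "date": "For what date?"}
```

The bot would keep asking the returned question until `next_question` yields `None`, at which point all collected values are available to the script.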
NLU languages
Supported languages
When you create a project, the mandatory NLU language parameter determines the language the bot will understand. For every supported language, JAICP automatically provides the following:
- a library for tokenization and morphological parsing;
- built-in algorithms for intent recognition;
- a set of standard entities.
Language | Note |
---|---|
English | Supports paraphrasing training phrases. |
Chinese | Does not support fuzzy search and normalization of NLU entities, or the ~ and $morph advanced pattern elements. |
Danish | |
Dutch | |
French | |
German | |
Greek | |
Italian | |
Japanese | Does not support the recognition of time and numbers written out in full. |
Kazakh | If date, time, or numbers are written out in full, you can recognize them using the zb.datetime and zb.number system entities. |
Lithuanian | Does not support the recognition of time and numbers written out in full. |
Polish | |
Portuguese | |
Romanian | |
Russian | Supports spell checking and paraphrasing training phrases. |
Spanish | |
Ukrainian | Supports spell checking. |
Other languages
If your project requires support for a language not provided by JAICP, you can connect an external NLU service with support for any other language and use it instead.
You can develop such a service yourself or use a third-party one. The external NLU service must comply with the Model API specification.
Bot script
NLU core parameters
The NLU core parameters are specified by default in chatbot.yaml:

language: en
botEngine: v2
nlp:
  intentNoMatchThresholds:
    phrases: 0.2
    patterns: 0.2
The parameters are as follows:
- language is the classifier language.
- intentNoMatchThresholds is the minimum required similarity of the request to the intent phrases or intent patterns. The default value of phrases and patterns is 0.2. If the classifier cannot categorize the request, a noMatch event is triggered.

Tip: You can also set a threshold value for patterns from the q and q! tags using the patternNoMatchThreshold parameter.
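As a rough illustration of how a no-match threshold behaves, the Python sketch below compares the best candidate's score with the configured threshold. The function name `pick_intent`, the candidate format, and the rule that a score below the threshold means no match are assumptions made for illustration; they are not part of the JAICP API.

```python
from typing import Dict, List, Optional, Tuple

# Values mirror the intentNoMatchThresholds defaults from chatbot.yaml.
THRESHOLDS: Dict[str, float] = {"phrases": 0.2, "patterns": 0.2}


def pick_intent(
    candidates: List[Tuple[str, float]],
    engine: str = "phrases",
    thresholds: Dict[str, float] = THRESHOLDS,
) -> Optional[str]:
    """Return the best-scoring intent name, or None to signal a noMatch event.

    candidates: (intent_name, similarity) pairs, similarity in [0, 1]
                (hypothetical format).
    engine:     which threshold to apply, "phrases" or "patterns".
    """
    if not candidates:
        return None
    name, score = max(candidates, key=lambda c: c[1])
    # Assumption: a score below the threshold is treated as no match.
    return name if score >= thresholds[engine] else None
```

Raising a threshold makes the bot stricter: more requests fall through to noMatch, but fewer are assigned to a weakly matching intent.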
Combined use of intents and patterns
JAICP allows the combined use of intents and patterns in a single script. State activation rules defined using these engines have different priorities.
The mechanism for selecting activation rules when using intents and patterns can also be redefined manually, using either:
- the selectNLUResult handler;
- the $context.nBest field.
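To make the idea of a custom selection rule more concrete, here is a small Python sketch that picks one result from a mixed list of intent and pattern candidates. It does not reproduce the selectNLUResult handler or the structure of $context.nBest: the `Candidate` class, the `select_result` function, and the margin-based rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    state: str    # state the activation rule would switch to
    engine: str   # "intent" or "pattern"
    score: float  # similarity score in [0, 1]


def select_result(
    n_best: List[Candidate], prefer: str = "pattern", margin: float = 0.1
) -> Optional[Candidate]:
    """Pick one candidate from a mixed n-best list.

    Hypothetical rule: take the top-scoring candidate, but switch to the best
    candidate of the preferred engine if it trails the top score by at most
    `margin`.
    """
    if not n_best:
        return None
    top = max(n_best, key=lambda c: c.score)
    preferred = [c for c in n_best if c.engine == prefer]
    if preferred:
        best_preferred = max(preferred, key=lambda c: c.score)
        if top.score - best_preferred.score <= margin:
            return best_preferred
    return top
```

With `margin=0.0` the rule reduces to plain best-score selection; a larger margin gives the preferred engine more weight when the scores are close.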