Smarti’s conversational RESTful API allows it to be integrated into any chat-based communication platform.
Conversations in Smarti are based on the following JSON model:
{
"id": "5a1403a535887bbdfc94b531",
"meta": {
"status": "New",
"lastMessageAnalyzed": 4,
"channel_id": [
"lp3aiunsij2664ft"
]
/* any meta fields */
},
"user": {
"id": "be4df619-f273-4d15-b3fb-6fa91b08a71a",
"displayName": "Anonymous User"
},
"messages": [
{
"id": "{message-id}",
"time": 1509619512000,
"origin": "User",
"content": "{message-content}",
"user": {
"id": "hkxkyrwf",
"displayName": "Anonymous User"
},
"votes": 0,
"metadata": {},
"private": false
}
],
"context": {
"contextType": "rocket.chat",
"domain": "loadtest",
"environment": {
/* any environment variable */
}
},
"lastModified": 1511261254233
}
- id: Each conversation has an id that is assigned by Smarti.
- meta: Meta information about the conversation.
  - Smarti allows any meta fields to be added.
  - The channel_id was used up to v0.6.* to store the ID of the channel. While the mapping of conversation to channel is no longer managed by Smarti starting with version 0.7.0, this field SHOULD still be used in case an integrator wants to store such mappings within Smarti.
- user: The user that started this conversation.
  - Only the id is required.
- messages: The messages of the conversation in the order they were sent.
  - id: The id of the message (typically the id as provided by the messaging system). Smarti requires this id to be unique within the conversation.
  - time: The time when the message was sent.
  - origin: The origin of the message (User, Agent).
  - content: The textual content of the message.
  - user: Information about the user of this message.
    - Only the id is required.
  - votes: Allows setting the votes for this message.
  - metadata: Allows additional metadata to be stored with this message.
  - private: If true, this is a private message. Private messages will not be available for Conversation Search in Smarti.
- context: Contextual information about the conversation.
  - environment: Allows additional contextual information about the environment.
- lastModified: Managed by Smarti, this field holds the last modification date of this Conversation in Smarti.
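The field requirements above can be checked on the client before a conversation is sent to Smarti. The following is a minimal sketch in Python and is not part of Smarti. The helper name validate_conversation and its messages are hypothetical. It checks only the rules stated above, and treating any origin other than User or Agent as a problem is an assumption.

```python
from typing import Any, Dict, List


def validate_conversation(conversation: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a conversation payload (hypothetical helper)."""
    problems: List[str] = []

    # Only the id of the conversation user is required.
    if not (conversation.get("user") or {}).get("id"):
        problems.append("conversation user.id is missing")

    seen_ids = set()
    for idx, message in enumerate(conversation.get("messages", [])):
        message_id = message.get("id")
        # Smarti requires message ids to be unique within the conversation.
        if message_id in seen_ids:
            problems.append(f"message {idx}: duplicate id {message_id!r}")
        seen_ids.add(message_id)
        # Only the id of the message user is required.
        if not (message.get("user") or {}).get("id"):
            problems.append(f"message {idx}: user.id is missing")
        # Assumption: User and Agent are the only accepted origins.
        if message.get("origin") not in ("User", "Agent"):
            problems.append(f"message {idx}: unexpected origin {message.get('origin')!r}")

    return problems


conversation = {
    "user": {"id": "be4df619-f273-4d15-b3fb-6fa91b08a71a"},
    "messages": [
        {"id": "m1", "time": 1509619512000, "origin": "User",
         "content": "Hello", "user": {"id": "hkxkyrwf"}},
        {"id": "m1", "time": 1509619513000, "origin": "Agent",
         "content": "Hi", "user": {"id": "agent-1"}},
    ],
}
# The second message reuses the id "m1", so one problem is reported.
print(validate_conversation(conversation))
```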
Information extracted by Smarti is represented by Tokens, which are later combined in Templates.
{
"tokens": [
{
"messageIdx": 0,
"start": 0,
"end": 8,
"origin": "System",
"state": "Suggested",
"value": "Physiker",
"type": "Keyword",
"hints": [
"interestingTerm"
],
"confidence": 0.7794291
}
]
}
- The messageIdx is the index of the message within the conversation. start and end are offsets of the mention within the message. -1 is allowed for messageIdx, start and end in cases where a token has no mention in the conversation (e.g. a classification result based on the conversation as a whole would use -1 for messageIdx, start and end; a classification of the first message would use messageIdx=0 and -1 for start and end).
- The origin specifies who created the Token. Possible values are System and Agent. The value System is reserved for Tokens created by Smarti. Other values are for manual annotations.
- The state allows for user feedback on Token level. Suggested is reserved for Tokens created by the Smarti processing pipeline. Users can set Tokens to Confirmed or Rejected. Manually created Tokens should use Confirmed.
- The type defines the kind of the Token. The type also specifies the data type of the value.
  - Date: An extracted date/time. This type uses a JSON object such as { "date": "2016-09-01T16:30:00.000Z", "grain": "minute"} as value.
    - date holds the extracted date/time value.
    - grain specifies the granularity of the extracted value. Allowed values are millisecond, second, minute, hour, day, week, month and year.
  - Topic: Used for any kind of classification. The value is the identifier of the topic.
  - Place, Organization, Person, Product and Entity: Extracted named entities of different kinds. Entity shall be used if the named entity is not of any of the more specific kinds.
  - Term: A term of a terminology that was mentioned in the conversation.
  - Keyword: Keywords identified in the conversation. Keywords are important words and phrases within the conversation. In contrast to Terms, Keywords are NOT part of a controlled vocabulary.
  - Attribute: Attributes that can modify the meaning of other tokens (e.g. the attribute "romantic" can modify the meaning of the term "restaurant").
  - Other: Type to be used for Tokens that do not fit any of the above. The real type should be stored as a hint.
- hints: Hints allow additional information to be stored for Tokens. Examples are more specific type information; roles of Tokens (e.g. if a Place is the origin or target of a travel) …
- confidence: The confidence of the Token in the range [0..1].
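To illustrate how messageIdx, start and end relate to the conversation, the following Python sketch looks up the text a token refers to. The function name token_mention is hypothetical. The end offset is assumed to be exclusive, which matches the example token above (0 to 8 covers the eight characters of "Physiker"). It is also an assumption that the offsets line up with Python string indices.

```python
from typing import Any, Dict, List, Optional


def token_mention(token: Dict[str, Any], messages: List[Dict[str, Any]]) -> Optional[str]:
    """Return the text span a token refers to, or None if it has no mention."""
    message_idx = token.get("messageIdx", -1)
    start = token.get("start", -1)
    end = token.get("end", -1)
    # -1 marks "no mention": a token about the whole conversation has
    # messageIdx == -1, a token about a whole message has start/end == -1.
    if message_idx < 0 or start < 0 or end < 0:
        return None
    content = messages[message_idx]["content"]
    # Assumption: end is exclusive, as suggested by the example token.
    return content[start:end]


messages = [{"id": "m1", "content": "Physiker gesucht"}]
token = {"messageIdx": 0, "start": 0, "end": 8, "type": "Keyword", "value": "Physiker"}
print(token_mention(token, messages))
```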
Templates are an abstraction over single Tokens. They structure extracted information as required for a specific intent. Examples are travel planning, location-based recommendations and route planning, but also more abstract things like information retrieval or related content recommendation.
{
"templates": [
{
"type": "related.conversation",
"slots": [
{
"role": "Term",
"tokenType": null,
"required": false,
"tokenIndex": 36
}
],
"queries": [],
"confidence": 0.75306565
}
]
}
- type: The type refers to the definition this template was built from.
- slots: The slots of the template.
  - role: Templates define different roles for Tokens. Some roles may be multi-valued. In this case multiple slots will have the same role.
  - tokenType: If present, the referenced Token MUST be of the specified type.
  - required: Whether this slot is required for the Template to be valid. Required slots are always included in the template. If a required slot cannot be filled with a Token, the tokenIndex is set to -1.
  - tokenIndex: The index of the token within the tokens array (-1 means unassigned).
- confidence: The confidence for this template. The confidence is typically used in combination with intent classification, e.g. given a classification for the intent and templates representing intents, the confidence will be set to the confidence of the intent being present.
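Because slots refer to tokens only by index, a client typically has to resolve them against the tokens array. The following Python sketch groups the referenced tokens by slot role. The function name tokens_by_role is hypothetical, and raising an error on a tokenType mismatch is only one possible way to handle the MUST requirement.

```python
from collections import defaultdict
from typing import Any, Dict, List


def tokens_by_role(template: Dict[str, Any],
                   tokens: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group the tokens referenced by a template's slots by slot role."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for slot in template.get("slots", []):
        index = slot.get("tokenIndex", -1)
        if index < 0:
            continue  # -1 means the slot is unassigned
        token = tokens[index]
        expected = slot.get("tokenType")
        # If tokenType is present, the referenced token must have that type.
        if expected is not None and token.get("type") != expected:
            raise ValueError(f"slot {slot['role']!r} expects type {expected!r}")
        # Multi-valued roles simply appear in several slots.
        grouped[slot["role"]].append(token)
    return dict(grouped)


tokens = [
    {"type": "Keyword", "value": "Physiker"},
    {"type": "Term", "value": "Labor"},
]
template = {
    "type": "related.conversation",
    "slots": [
        {"role": "Keyword", "tokenType": "Keyword", "required": False, "tokenIndex": 0},
        {"role": "Term", "tokenType": None, "required": False, "tokenIndex": 1},
        {"role": "Term", "tokenType": None, "required": False, "tokenIndex": -1},
    ],
}
roles = tokens_by_role(template, tokens)
print(sorted(roles))
```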
Queries are the most specific extraction result of Smarti. Queries are used to retrieve information from a service and are built based on a template. Because of that, query is also a sub-element of template.
{
"queries": [
{
"creator": "queryBuilder%3Aconversationmlt%3Arelated-conversations",
"displayTitle": "Flug Salzburg -> Berlin (01.09)",
"inlineResultSupport": false,
"state": "Suggested",
"url": "{}",
"confidence": 0.75306565
}
]
}
Common fields supported by all queries include:
- The creator identifies the component/configuration that created this query.
- The displayTitle is intended to provide a human-readable representation of this query.
- If inlineResultSupport is true, the creator supports server-side execution of the query.
- The state can be used for user feedback. Every query starts with Suggested. Users can set queries to Confirmed or Rejected.
- The url represents this query. If server-side execution is supported, this might not be present.
- The confidence specifies how well the service is suited to search for information of the template (e.g. bahn.de is very suitable for a travel planning template, so a query for bahn.de would get a high confidence. One could also create a Google query for travel planning, but results would be less precise, so this query should get a lower confidence).
In addition to those fields, queries can provide additional information. This information is specific to the creator of the query.
Smarti processes ongoing Conversations. The analysis results are available via
- abstract: http(s)://${smarti-host}:${smarti-port}/conversation/{conversationId}/analysis
- specific: https://localhost:8080/conversation/59ed91d9de10751739a82358/analysis
or asynchronously, when a callback URI is provided, on most CRUD requests for conversations and messages.
{
"conversation" : "5a7befcb27c2da49d629dcab",
"date" : 1518082127519,
"tokens" : [ /* Tokens */ ],
"templates" : [ /* Templates and Queries for Templates */]
}
- conversation: The id of the analysed conversation.
- date: The modification date of the conversation, i.e. the version of the conversation used for the analysis.
- tokens: Information extracted from the conversation. Important, as template.slots refer to token indexes.
- templates: Templates represent a structured view onto the extracted information. The type of the template defines the slots it supports and also what queries one can expect for the template. The LATCH template is an example of such a template intended to be used for information retrieval.
- queries: Queries are built based on templates and are typically used to retrieve information related to the conversation from services. A single template can have multiple queries from different creators (e.g. the LATCH template can have multiple queries for different configured search indexes).
Note: Queries are NOT executed during processing of the conversation. The client is responsible for execution (typically based on a user interaction).
When inlineResultSupport is true, a query can be executed by Smarti. In such cases the url is often not defined, meaning that the query can only be executed by Smarti. To execute a query via Smarti, the following service has to be used:
- abstract: http(s)://${smarti-host}:${smarti-port}/conversation/{conversationId}/template/{templateIndex}/result/{creatorName}
where:
- {templateIndex} is the index within the "template" array returned by the /conversation/{conversationId}/template request (e.g. 0 for the first template in the original response)
- {creatorName} is the value of the creator attribute of the executed query (e.g. the value of template[0].queries[0].creator when executing the first query of the first template)
Note: In a future version it is planned to change the referencing of Messages, Tokens and Templates from an index-based to an ID-based model. This will affect this web service.
The response format is not normalised and is fully specific to the query type. Server-side execution is used, for example, for the related conversation queries.
For queries where inlineResultSupport is false, the client needs to execute the query. Typically this can be done by using the url attribute. However, specific queries might provide the information for query execution in additional fields.
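The distinction between server-executed and client-executed queries can be sketched as follows. This Python example only decides, per query, which path or URL a client would call; it does not perform any request. The function name execution_plan and the labels "server", "client" and "custom" are hypothetical. Percent-encoding the creator value in the path is an assumption.

```python
from typing import Any, Dict, List, Tuple
from urllib.parse import quote


def execution_plan(conversation_id: str,
                   templates: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """Return (mode, target) pairs for every query of every template."""
    plan: List[Tuple[str, str]] = []
    for template_index, template in enumerate(templates):
        for query in template.get("queries", []):
            if query.get("inlineResultSupport"):
                # Server-side execution: address the query by template index
                # and creator. Assumption: the creator is percent-encoded.
                creator = quote(query["creator"], safe="")
                plan.append(("server",
                             f"/conversation/{conversation_id}/template/"
                             f"{template_index}/result/{creator}"))
            elif query.get("url"):
                # Client-side execution, typically via the url attribute.
                plan.append(("client", query["url"]))
            else:
                # Execution details live in creator-specific fields.
                plan.append(("custom", query["creator"]))
    return plan


templates = [{
    "type": "related.conversation",
    "queries": [
        {"creator": "queryBuilder:conversationmlt:conversationmlt",
         "inlineResultSupport": True, "url": None},
        {"creator": "queryBuilder:conversationsearch:conversationsearch",
         "inlineResultSupport": False, "url": "/conversation/search"},
    ],
}]
plan = execution_plan("5a7befcb27c2da49d629dcab", templates)
for mode, target in plan:
    print(mode, target)
```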
The related conversation template allows finding conversations similar to the current one. Two query providers use this template:
- Similar Conversation: This searches for similar conversations based on the textual content of the current conversation. It does NOT use extracted tokens but rather uses the content of the messages to search for similar conversations. Starting with v0.7.0 only the content of recent messages is used as context. Up to v0.6.* the content of all messages in the conversation was used.
- Related Conversation Search: This uses extracted Keywords and Terms to search for related conversations. The search for related conversations uses the conversation/search service (part of the conversation API).
The following listing shows the structure used by this template and its queries:
{
"type" : "related.conversation",
"slots" : [ {
"role" : "Keyword",
"tokenType" : "Keyword",
"required" : false,
"tokenIndex" : {token-idx}
}, {
"role" : "Term",
"tokenType" : null,
"required" : false,
"tokenIndex" : {token-idx}
} ],
"state" : "Suggested",
"queries" : [ {
"_class" : "io.redlink.smarti.query.conversation.ConversationMltQuery",
"displayTitle" : "conversationmlt",
"confidence" : 0.55,
"url" : null,
"creator" : "queryBuilder:conversationmlt:conversationmlt",
"inlineResultSupport" : true,
"created" : 1518423401234,
"state" : "Suggested",
"content" : "{textual-context}"
}, {
"_class" : "io.redlink.smarti.query.conversation.ConversationSearchQuery",
"creator" : "queryBuilder:conversationsearch:conversationsearch",
"displayTitle" : "conversationsearch",
"confidence" : 0.6,
"url" : "/conversation/search",
"inlineResultSupport" : false,
"created" : 1518423401234,
"state" : "Suggested",
"defaults" : {
"ctx.before" : 1,
"rows" : 3,
"ctx.after" : 2,
"fl" : "id,message_id,meta_channel_id,user_id,time,message,type"
},
"keywords" : [ "{keyword-1}", "{keyword-1}", ".." ],
"terms" : [ "{term-1}", "{term-2}", ".." ],
"queryParams" : [ ],
"filterQueries" : [ "completed:true", "-conversation_id:5a7befcb27c2da49d629dcab", "-meta_support_area:*" ]
} ]
}
Related Conversation Template:
- The related.conversation template has two roles, Keyword and Term. Both allow for multiple assignments.
  - {token-idx} is the index of the token referenced by the slot of the template. -1 indicates an unassigned slot.
  - The tokenType defines the required type of the token. For slots with the role Keyword the token type will be Keyword. For slots with the role Term this will be null. The actual type of the token needs to be obtained from the referenced Token itself.
Similar Conversation Query:
- Similar Conversations are provided by the ConversationMltQuery query provider.
- The {textual-context} is the context used to search for similar conversations.
- The url is null as this query is server executed. This means that the GET conversation/{conversationId}/template/{templateIndex}/result/{creator} service needs to be used to obtain results.
Related Conversation Search Query:
- Related Conversations are provided by the ConversationSearchQuery query provider.
- The url is /conversation/search, the path to the conversation search service that is part of the conversation web service API.
- The keywords and terms are suggested search parameters for the conversation search.
  - They can be passed using the text parameter. In this case prefix and infix searches will be used in addition to the exact match (with different boosts).
  - Optionally they can also be passed via the Solr q parameter. This allows for more control over the actual search. See the Solr documentation for more information and options.
  - The defaults are additional parameters to be passed to the conversation search service. They are based on the client's configuration and should be included unchanged when calling the conversation search service.
  - filterQueries are expected to be passed as is via the Solr fq parameter. They are also calculated based on the client's configuration and are expected to be included unchanged in calls to the conversation search service. NOTE: Skipping those filters does NOT allow users to retrieve information they do not have access to, as all security-relevant restrictions are applied by the conversation search service based on the authentication information of the call. Removing some or all of those filters therefore only affects the quality of results.
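As an illustration of how the fields of a ConversationSearchQuery could be turned into a request, the following Python sketch builds the URL for the conversation search service. The function name conversation_search_url and the host smarti.example.org are hypothetical. Joining keywords and terms with spaces for the text parameter is an assumption. The defaults and filterQueries are passed through unchanged, as described above.

```python
from typing import Any, Dict, List, Tuple
from urllib.parse import urlencode


def conversation_search_url(base_url: str, query: Dict[str, Any]) -> str:
    """Build a conversation search URL from a ConversationSearchQuery."""
    params: List[Tuple[str, str]] = []
    # defaults come from the client configuration and are included unchanged.
    for name, value in query.get("defaults", {}).items():
        params.append((name, str(value)))
    # Assumption: keywords and terms are space-joined into the text parameter.
    text = " ".join(query.get("keywords", []) + query.get("terms", []))
    if text:
        params.append(("text", text))
    # Each filter query is passed as its own fq parameter.
    for filter_query in query.get("filterQueries", []):
        params.append(("fq", filter_query))
    return base_url.rstrip("/") + query["url"] + "?" + urlencode(params)


search_query = {
    "url": "/conversation/search",
    "defaults": {"rows": 3},
    "keywords": ["physiker"],
    "terms": ["labor"],
    "filterQueries": ["completed:true"],
}
url = conversation_search_url("https://smarti.example.org", search_query)
print(url)
```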
LATCH stands for Location, Alphabet, Time, Category and Hierarchy and represents a well established model for information retrieval.
The LATCH template collects information extracted from the conversation and assigns it to the different dimensions of the template (e.g. extracted tokens with the type Date will be assigned to the time dimension; extracted tokens with the type Place will be assigned to the location dimension …).
Based on this template a generic query builder for Apache Solr endpoints is implemented. This query builder can be configured for Solr search endpoints so that information related to the current conversation can be suggested to the user.
{
"type" : "ir_latch",
"slots" : [ {
"role" : "{role}",
"tokenType" : {type},
"required" : false,
"tokenIndex" : {token-idx}
} ],
"probability" : 0.0,
"state" : "Suggested",
"queries" : [ {
"_class" : "io.redlink.smarti.query.solr.SolrSearchQuery",
"creator" : "queryBuilder:solrsearch:config-name-1",
"resultConfig" : {
"mappings" : {
"source" : "source_field",
"title" : "title_field",
"description" : "desc_field",
"type" : "type_field",
"doctype" : null,
"link" : "link_field",
"date" : "date_field",
"thumb" : "thumb_field"
},
"numOfRows" : 10
},
"defaults" : { },
"displayTitle" : "config-title-1",
"confidence" : 0.8,
"url" : "https://search.host-1.org/query?{solr-query-parameter}",
"inlineResultSupport" : false,
"created" : 1518433563606,
"state" : "Suggested",
"queryParams" : [ "address_cityplz_de:\"Salzburg\"^0.8478922", "context_de:\"Tätowieren\"^0.39709732", "title_de_sv:\"Tätowieren\"^1.1912919" ],
"filterQueries" : [ ]
}, {
"_class" : "io.redlink.smarti.query.solr.SolrSearchQuery",
"creator" : "queryBuilder:solrsearch:config-name-2",
"resultConfig" : {
"mappings" : {
"source" : null,
"title" : "title",
"description" : "description",
"type" : "type",
"doctype" : null,
"link" : null,
"date" : "release_date",
"thumb" : null
},
"numOfRows" : 5
},
"defaults" : {
"fq" : "structure:(type_1 OR type_2)"
},
"displayTitle" : "Stadt Salzburg",
"confidence" : 0.8,
"url" : "https://www.host-2.com/search/select?fq=structure%3A(type_1%20OR%20type_2)&{solr-query-parameter}",
"inlineResultSupport" : false,
"created" : 1518433563606,
"state" : "Suggested",
"queryParams" : [ "content:\"Tätowieren\"^0.39709732", "title:\"Tätowieren\"^1.1912919" ],
"filterQueries" : [ ]
} ]
}
LATCH Information Retrieval
- The type of this template is ir_latch.
- It supports the following roles: location, alphabet, time, category and hierarchy.
- All roles are optional and allow for multiple assignments.
- Some roles require specific types: location → Place, time → Date, category → Topic.
Solr Search Query
- A generic query provider that can be configured for a specific Solr endpoint.
- The resultConfig is copied from the configuration and defines the Solr field names to be used for the visualisation of results. The numOfRows is the number of results per page.
- The url represents the Solr request for the data provided by the query provider. If the client wants to modify the query, it needs to strip the query part from the URL and create the query parameters to reflect the intended modifications.
- defaults are query parameters to be included in the request. They originate from the client configuration.
  - The second example shows how an fq (filter query) is configured in the defaults. NOTE that this is also present as a query parameter in the url.
- The queryParams represent parts of the Solr query (q parameter) generated for the extracted tokens and the Solr fields as defined by the configuration.
  - In the first example, config-name-1 has mappings for locations and full-text search. Because of that, the extracted token Salzburg with the type Place is used as query parameter for the location name field address_cityplz_de. The token Tätowieren with the type Keyword is used for full-text search: first in the title field title_de_sv with a boost of 1.1912919 and second in the full-text field context_de with a lower boost of 0.39709732.
  - The second example, config-name-2, has no field to search for locations. Because of this, only Tätowieren is included in the query parameters.
- filterQueries are additional filters set by the query provider based on the conversation (e.g. based on detected categories). Values need to be passed as fq parameters in the Solr query.
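A client that wants to modify a Solr search query has to rebuild the query string itself. The following Python sketch strips the query part from the url and recreates it from defaults, queryParams and filterQueries. The function name solr_request_url is hypothetical. Combining the queryParams with OR and mapping numOfRows to the Solr rows parameter are assumptions and not confirmed behaviour of Smarti.

```python
from typing import Any, Dict, List, Tuple
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def solr_request_url(query: Dict[str, Any]) -> str:
    """Rebuild the Solr request URL of a SolrSearchQuery from its parts."""
    parts = urlsplit(query["url"])
    params: List[Tuple[str, str]] = []
    # defaults originate from the client configuration.
    for name, value in query.get("defaults", {}).items():
        params.append((name, str(value)))
    # Assumption: the generated query parts are combined with OR.
    if query.get("queryParams"):
        params.append(("q", " OR ".join(query["queryParams"])))
    # Filter queries are passed as individual fq parameters.
    for filter_query in query.get("filterQueries", []):
        params.append(("fq", filter_query))
    # Assumption: numOfRows maps to the Solr rows parameter.
    rows = query.get("resultConfig", {}).get("numOfRows")
    if rows is not None:
        params.append(("rows", str(rows)))
    # Drop the original query part and replace it with the rebuilt one.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(params), ""))


solr_query = {
    "url": "https://search.host-1.org/query?{solr-query-parameter}",
    "defaults": {},
    "queryParams": ['address_cityplz_de:"Salzburg"^0.8478922',
                    'title_de_sv:"Tätowieren"^1.1912919'],
    "filterQueries": [],
    "resultConfig": {"numOfRows": 10},
}
url = solr_request_url(solr_query)
# Decode the query string again to inspect the generated q parameter.
print(parse_qs(urlsplit(url).query)["q"][0])
```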