by Jenny

on August 08, 2017

Building a simple FAQ bot with StarChat

Updated 13 October 2020

This article is about StarChat, the open-source engine that powers GetJenny's powerful chatbots. StarChat is built and maintained by GetJenny. We made it open-source because we believe that advances in technology should be shared and improved constantly. And with more minds working together, we can all build better solutions. 

You can view and download the source code and you can deploy StarChat to build a custom chatbot of your own. 

If you would like to leverage the power of StarChat but would prefer a clean, easy-to-use interface that's built, improved and maintained by GetJenny, check out our ready-made packages for JennyBot.



If you run a small company that is just dipping its toes into online support, you may have noticed that, despite your best efforts to give customers the information they need, they still come to your chat asking the same common questions...

Today we’re going to show you how to keep your support staff from tearing their hair out, by building a simple bot with StarChat that can serve as a first line of support for the most common questions.

StarChat powers Jenny chatbots and SmartLayer. You can access it as an open-source project, as described below. We also provide JennyStudio, an easy-to-use interface for bot trainers. To learn more, book a demo.

NLP processing

After you’ve set up StarChat with Docker, here’s a brief explanation of how it works and what you can do with it:

NLP processing is, of course, the core of any bot. StarChat has two primary ways of triggering states: through queries and through analyzers.


If the analyzer field is empty, StarChat will query Elasticsearch for the state containing the most similar sentence in the queries field. We have carefully configured Elasticsearch to provide good answers (e.g. by boosting results where the same words appear), and the results are promising. Still, you are encouraged to use the analyzer field, documented below.

Watch the webinar - How to run a successful chatbot project


Through the analyzers, you can easily leverage the various NLP algorithms included in StarChat, together with the NLP capabilities of Elasticsearch. You can also combine the results of those algorithms. The best way to see this is to look at the simple example for the state forgot_password, included in the CSV provided in the doc/ directory.


The expressions and and or are called operators, while keyword is an atom.

Expressions: Atoms

Presently, keyword("reset") in the example provides a very simple score: the number of occurrences of the word reset in the user’s query divided by the total number of words. Evaluated against the sentence “Want to reset my password”, keyword("reset") currently returns 0.2 (one occurrence in five words).
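To make the arithmetic concrete, here is a minimal Python sketch of that score. It is an illustration only, not StarChat's own (Scala) code: the function name keyword_score and the tokenization (lowercasing and splitting on non-word characters) are assumptions.

```python
import re


def keyword_score(word: str, query: str) -> float:
    """Occurrences of `word` in `query` divided by the total number of words."""
    # Assumption: words are lowercased runs of word characters.
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return 0.0
    return tokens.count(word.lower()) / len(tokens)


# One occurrence of "reset" among five words.
print(keyword_score("reset", "Want to reset my password"))
```

Because the score is a fraction of the words in the query, it always falls between 0 and 1, which is what "normalized" means in the list of expressions.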

These are the expressions you can currently use to evaluate a query (see DefaultFactoryAtomic and StarchatFactoryAtomic):

keyword("word"): as explained above; normalized

regex: evaluates a regular expression; not normalized

search(state_name): takes a state name as its argument, queries Elasticsearch and returns the score of the most similar query in the queries field of that state. In other words, it does what StarChat would do without any analyzer, only with a normalized score, e.g. search("lost_password_state")

synonym("word"): gives a normalized cosine distance between the argument and the closest word in the user’s sentence. We use word2vec; to get an idea of the distance between two words, you can use the word2vec demo by Turku University

similar("a whole sentence"): gives a normalized cosine distance between the argument and the closest word in the user’s sentence (word2vec)

similarState(state_name): same as above, but for the sentences in the queries field of the state given as argument.
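For intuition about the word2vec-based atoms, the Python sketch below scores a word against the closest word in a sentence. Everything in it is an assumption made for illustration: the three-dimensional vectors are made up (real word2vec vectors have hundreds of dimensions), the names TOY_VECTORS and synonym_score are hypothetical, and the score is treated as a cosine similarity clipped to the range 0 to 1, which may differ from how StarChat normalizes it.

```python
import math

# Made-up toy vectors standing in for real word2vec embeddings.
TOY_VECTORS = {
    "reset": [0.9, 0.1, 0.0],
    "restore": [0.8, 0.2, 0.1],
    "password": [0.1, 0.9, 0.2],
    "pizza": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def synonym_score(word, query, vectors=TOY_VECTORS):
    """Best similarity between `word` and any known word in `query`, clipped to [0, 1]."""
    if word not in vectors:
        return 0.0
    scores = [
        cosine_similarity(vectors[word], vectors[token])
        for token in query.lower().split()
        if token in vectors
    ]
    best = max(scores, default=0.0)
    return min(1.0, max(0.0, best))


print(synonym_score("reset", "please restore my password"))
print(synonym_score("reset", "I want pizza"))
```

With these toy vectors, a sentence containing "restore" scores much higher for "reset" than one that only mentions "pizza", which is the behaviour you would expect from a synonym-style atom.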

Expressions: Operators

Operators evaluate the output of one or more expressions and return a value. Currently, the following operators are implemented (see the source code):

boolean or: calls matches on all the expressions it contains and returns true or false. It is called with bor

boolean and: as above; it is called with band

boolean not: as above; it is called with bnot

conjunction: if the scores of the expressions it contains are normalized and can be read as the probabilities of the expressions being true, this is the probability that all of them are true (P(A)*P(B))

disjunction: as above, but the probability that at least one is true (1-(1-P(A))*(1-P(B)))

max: takes the maximum of the scores returned by its argument expressions
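The operators above can be sketched in a few lines of Python. This is only an illustration of the formulas, not StarChat's implementation: the function names mirror the operator names, maximum stands in for max to avoid shadowing the Python builtin, and the inputs are plain lists of scores or booleans rather than real expressions.

```python
import math


def conjunction(scores):
    """Probability that all expressions are true: P(A) * P(B) * ..."""
    return math.prod(scores)


def disjunction(scores):
    """Probability that at least one is true: 1 - (1 - P(A)) * (1 - P(B)) * ..."""
    return 1.0 - math.prod(1.0 - p for p in scores)


def maximum(scores):
    """Highest score among the argument expressions."""
    return max(scores)


def band(matches):
    """Boolean and over the results of matches."""
    return all(matches)


def bor(matches):
    """Boolean or over the results of matches."""
    return any(matches)


def bnot(match):
    """Boolean not of a single matches result."""
    return not match


scores = [0.5, 0.4]  # two normalized scores, read as probabilities
print(conjunction(scores), disjunction(scores), maximum(scores))
print(band([True, False]), bor([True, False]), bnot(True))
```

Note that the product formulas treat the expressions as independent events, so conjunction and disjunction are only meaningful when every score involved is normalized.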

Technical corner: expressions

Expressions like keyword in the example are called atoms, and have the following methods/members:

def evaluate(query: String): Double: produces a score. It may or may not be normalized to 1 (set val isEvaluateNormalized: Boolean accordingly)

val match_threshold: the threshold above which the expression is considered true when matches is called. NB: the default value is 0.0, which is normally not ideal.

def matches(query: String): Boolean: calls evaluate and checks the result against the threshold…

val rx: the name of the atom, as it should be used in the analyzer field.
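To show how these members fit together, here is a rough Python analogue of the atom interface. StarChat itself is written in Scala, so the class names Atom and KeywordAtom, the snake_case attribute is_evaluate_normalized and the strict "greater than" comparison in matches are all assumptions made for this sketch.

```python
from abc import ABC, abstractmethod


class Atom(ABC):
    """Hypothetical Python counterpart of the atom members listed above."""

    rx = ""                         # name used in the analyzer field
    is_evaluate_normalized = False  # True if evaluate() returns a score in [0, 1]
    match_threshold = 0.0           # default threshold, normally not ideal

    @abstractmethod
    def evaluate(self, query: str) -> float:
        """Produce a score for the query."""

    def matches(self, query: str) -> bool:
        # Assumption: "above the threshold" means strictly greater than.
        return self.evaluate(query) > self.match_threshold


class KeywordAtom(Atom):
    rx = "keyword"
    is_evaluate_normalized = True

    def __init__(self, word: str, match_threshold: float = 0.0):
        self.word = word.lower()
        self.match_threshold = match_threshold

    def evaluate(self, query: str) -> float:
        # Occurrences of the word divided by the total number of words.
        tokens = query.lower().split()
        return tokens.count(self.word) / len(tokens) if tokens else 0.0


query = "Want to reset my password"
print(KeywordAtom("reset").matches(query))
print(KeywordAtom("reset", match_threshold=0.5).matches(query))
```

With the default threshold of 0.0, a single occurrence of the word is enough for matches to return true, however long the query is. That is why the default is usually worth overriding.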

Configuration of the answer recommender (Knowledge Base)

Through the /knowledgebase endpoint you can add, edit and remove the question-and-answer pairs that StarChat uses to recommend possible answers when a question arrives.

Documents containing Q&A must be structured like this:

{
  "id": "0", // id of the pair
  "conversation": "id:1000", // id of the conversation. This can be useful to external services
  "index_in_conversation": 1, // position of the pair inside the conversation; useful to external services, as above
  "question": "thank you", // the question to be matched
  "answer": "you are welcome!", // the answer to be recommended
  "question_scored_terms": [ ... ], // a list of keywords and scores. You can use your own keyword extractor or our Manaus (see later)
  "verified": true, // a variable used in some call centers
  "topics": "t1 t2", // optional topics to be associated
  "doctype": "normal",
  "state": "",
  "status": 0
}

See POST /knowledgebase for an example with curl. Other calls (GET, DELETE, PUT) are used to get the state, delete it or update it.
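If you generate these documents from your own data, a small helper can keep the fields consistent. The Python sketch below only builds and serializes a document using the field names shown above. The helper name build_kb_document is hypothetical, the format of question_scored_terms (a list of keyword and score pairs) is an assumption, and the sketch deliberately stops before sending anything, because the host, port and full path of your StarChat instance depend on your setup.

```python
import json


def build_kb_document(doc_id, question, answer, conversation="",
                      index_in_conversation=1, scored_terms=None, topics=""):
    """Build one question-and-answer document for the knowledge base."""
    return {
        "id": doc_id,
        "conversation": conversation,
        "index_in_conversation": index_in_conversation,
        "question": question,
        "answer": answer,
        # Assumption: each entry is a [keyword, score] pair.
        "question_scored_terms": scored_terms or [],
        "verified": True,
        "topics": topics,
        "doctype": "normal",
        "state": "",
        "status": 0,
    }


doc = build_kb_document("0", "thank you", "you are welcome!",
                        conversation="id:1000", topics="t1 t2")
payload = json.dumps(doc, indent=2)  # body for a POST to /knowledgebase
print(payload)
```

The resulting payload is plain JSON without comments, so it can be passed as the request body with curl or any HTTP client.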

Testing the knowledge base

Just run the example in POST /knowledgebase_search.

And voilà! By configuring your bot with your existing knowledge base and beefing it up with chat logs of your most common conversations, you should have a functional first line of help.

All you have to do is to connect it to the chat system of your choice and configure when you want the bot to handle the conversation.


If you would like to learn more about chatbots, please read our guide on chatbots.
