stut-it Martin Stut - Information Technology Tailored to You
By Martin Stut, 2012-02-04
In my spare time, I'm helping a Christian charity that runs its own IT helpdesk for about 1000 users, distributed across much of the globe. This helpdesk service is greatly appreciated by the users, but it is also absorbing (too) much of the capacity of the IT team.
Given that number of users, similar issues tend to recur. So the idea came up of a pre-screening website that would ask certain questions and point to applicable instructions - or to a human (i.e. mail to RT) if there is no standard answer for the user's situation.
In the computer science world, such a system is called an "expert system", because it tries to mimic an expert. While looking for such a system for helpdesk use, I found surprisingly little that would match:
All other helpdesk software I found was about handling tickets by humans, not solving the underlying issue by a machine.
Perhaps I just didn't use the right search terms. Does anybody out there have an idea which search terms would lead to expert-system-backed helpdesk or troubleshooting systems?
So, without a web-ready expert system shell available, I started thinking about how I'd create one. So far I have not written a single line of code, but the idea might inspire others to do it - or to point me to someone who has done something similar.
The underlying expert knowledge needs to be stored in some data structure, residing e.g. in a database like MySQL. Here is my idea of how to represent that expert knowledge in a way that's ready to use for a machine (program):
A fact is a short statement, represented by a short string ("atom" in LISP or Ruby parlance), e.g. mail_works, worked_after_server_change or internet_connection_ok.
For each session (user problem instance), each fact has a state of "true" (either by user input or by logical inference [see rules below]), "false" (likewise by user input or by inference), "user doesn't know" or "not yet considered".
The state of the interaction is basically the combined state of all facts.
Certain states of certain facts are solution candidates: if e.g. internet_connection_ok becomes known to be false while looking for the reason for not (mail_works), then there is only one way out: "Go, fix your Internet connection".
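The four fact states and the idea of a solution candidate could be sketched in Python roughly as follows. This is only an illustration: the names FactState, Session, SOLUTIONS and find_solutions are assumptions made for this sketch, and a real system would keep this data in database tables rather than in memory.

```python
from enum import Enum


class FactState(Enum):
    """The four states a fact can have within one session."""
    TRUE = "true"
    FALSE = "false"
    USER_DOES_NOT_KNOW = "user doesn't know"
    NOT_YET_CONSIDERED = "not yet considered"


class Session:
    """State of one user problem instance: the combined state of all facts."""

    def __init__(self, fact_names):
        self.states = {name: FactState.NOT_YET_CONSIDERED for name in fact_names}

    def set(self, name, state):
        self.states[name] = state


# Hypothetical table of solution candidates: (fact, state) -> advice text.
SOLUTIONS = {
    ("internet_connection_ok", FactState.FALSE): "Go, fix your Internet connection",
}


def find_solutions(session):
    """Return the advice for every solution candidate matching the session."""
    return [
        advice
        for (name, state), advice in SOLUTIONS.items()
        if session.states.get(name) is state
    ]


session = Session(["mail_works", "internet_connection_ok"])
session.set("mail_works", FactState.FALSE)
session.set("internet_connection_ok", FactState.FALSE)
print(find_solutions(session))
```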
So the full data structure (database schema) of a fact is
A rule is a description of a material implication, stating which combinations of facts determine the state of another fact, e.g.
not (internet_connection_ok) => not (mail_works)
account_settings_outdated => not (worked_after_server_change)
This needs to be handled with full care, and it can be used with the full power of propositional logic.
a => b can be read as "a implies b": if a is true, then (it can be inferred with mathematical certainty that) b must also be true. If a is not true, then nothing is known about b. If b is true, then nothing is known about a, because b can be true for other reasons than a.
In the above example: if the internet connection is broken, then mail will certainly not work. But "mail not working" can be true for many other reasons than a broken internet connection.
By the rules of mathematical logic (the material implication (=>) works that way), a => b is fully equivalent to not (b) => not (a), so if b is false, then a must be false too, because if a were true, b would also have to be true, which it isn't.
Applied to the above example, an equivalent way to put the rule is
mail_works => internet_connection_ok
In other words, if you know that your mail works, you can infer that your internet connection is working too.
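The equivalence of a => b and not (b) => not (a) can be checked mechanically by walking through all four combinations of truth values. A small Python sketch (the helper name implies is just chosen for this example):

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: false only when a is true and b is false."""
    return (not a) or b


# Compare a => b with its contrapositive not(b) => not(a) for every input.
for a, b in product([False, True], repeat=2):
    direct = implies(a, b)
    contrapositive = implies(not b, not a)
    print(a, b, direct, contrapositive)
    assert direct == contrapositive
```

The two result columns agree in every row, which is why a rule stored once can be used in both directions.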
It may require more than one condition to be able to infer a fact. So the rules need to have the option of at least two conditions, each of them negate-able, and a link operator (AND, OR, perhaps XOR); the result may be negated, so the full data structure of a rule is:
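As a rough sketch of how such a two-condition rule might be held and evaluated in Python: all field and function names below are assumptions for illustration, not a fixed schema. Facts that are not known (yet) are represented as None, and the operators follow three-valued logic so that, for example, "false AND unknown" is still false. The example rule itself is made up for the demonstration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Tri = Optional[bool]  # True, False, or None for "not known"


def tri_not(x: Tri) -> Tri:
    return None if x is None else (not x)


def tri_and(x: Tri, y: Tri) -> Tri:
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True


def tri_or(x: Tri, y: Tri) -> Tri:
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False


def tri_xor(x: Tri, y: Tri) -> Tri:
    if x is None or y is None:
        return None
    return x != y


OPERATORS = {"AND": tri_and, "OR": tri_or, "XOR": tri_xor}


@dataclass
class Rule:
    condition1: str
    negate1: bool
    operator: str        # "AND", "OR" or "XOR"
    condition2: str
    negate2: bool
    result: str
    negate_result: bool

    def conclude(self, facts: Dict[str, Tri]) -> Optional[Tuple[str, bool]]:
        """Return (fact, value) if the premise is known to hold, else None."""
        a = facts.get(self.condition1)
        b = facts.get(self.condition2)
        if self.negate1:
            a = tri_not(a)
        if self.negate2:
            b = tri_not(b)
        if OPERATORS[self.operator](a, b) is True:
            return (self.result, not self.negate_result)
        return None


# Made-up rule: not(internet_connection_ok) OR account_settings_outdated
#               => not(mail_works)
rule = Rule("internet_connection_ok", True, "OR",
            "account_settings_outdated", False,
            "mail_works", True)
print(rule.conclude({"internet_connection_ok": False}))
```

This sketch only applies a rule in the forward direction; using the contrapositive of a two-condition rule would need additional logic.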
Somehow the facts need to be given to the system. A machine is perfectly happy dealing with an atom-like fact, but the average end-user (typically an expert in theology, not technology) would feel rudely treated if asked "worked_after_server_change ? yes/no/don't know".
So for each fact there should be one (or more, see below) longer string containing a question, e.g. "Are you sure that your Internet connection works?" This is a question most end users can answer with yes or no. Sometimes there may be alternative wordings for the same question. So there may be multiple questions for a single fact. One of these questions needs to be marked "preferred" (or get a numeric preference value, e.g. 1-100), so the system can know which one to present first.
Some of these alternative wordings may have the opposite meaning, e.g. "Are you currently experiencing trouble with your Internet connection, so you have problems visiting other websites too?". So another attribute per question needs to be "meaning reversed" (yes/no).
In a worldwide user group, people tend to prefer different languages. So for each fact there could (or rather should) be one question per language. The user could specify his preferred language(s) in his user profile on the helpdesk system, e.g. "native German, good English, very little Japanese".
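Picking a question by language and preference, and honouring the "meaning reversed" attribute, might look like the following Python sketch. The Question fields and the two helper functions are assumptions for illustration; the German wording is a made-up sample.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Question:
    fact: str
    language: str           # e.g. "en", "de"
    text: str
    preference: int         # 1-100, higher is presented first
    meaning_reversed: bool  # True if "yes" means the fact is false


def pick_question(questions: List[Question], fact: str,
                  user_languages: List[str]) -> Optional[Question]:
    """Best question for a fact, trying the user's languages in order."""
    for lang in user_languages:
        candidates = [q for q in questions
                      if q.fact == fact and q.language == lang]
        if candidates:
            return max(candidates, key=lambda q: q.preference)
    return None


def answer_to_fact_value(question: Question, answered_yes: bool) -> bool:
    """Translate a yes/no answer into the truth value of the fact."""
    return (not answered_yes) if question.meaning_reversed else answered_yes


QUESTIONS = [
    Question("internet_connection_ok", "en",
             "Are you sure that your Internet connection works?", 80, False),
    Question("internet_connection_ok", "en",
             "Are you currently experiencing trouble with your Internet "
             "connection, so you have problems visiting other websites too?",
             40, True),
    Question("internet_connection_ok", "de",
             "Funktioniert Ihre Internetverbindung?", 80, False),
]

q = pick_question(QUESTIONS, "internet_connection_ok", ["ja", "en"])
print(q.text)
```

For the reversed wording, a "yes" from the user is stored as internet_connection_ok being false.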
So the full data structure of a question could be:
When the system describes what it has concluded (or what the user has entered), it should present its findings in understandable human language to the user. Because there are multiple languages (English for users, English for technicians, German, Spanish, ...), a single description field won't suffice. Additionally, a "yes" state and a "no" state may have quite different wordings. So there is need for a separate table of descriptions:
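A lookup of such descriptions, keyed by fact, language and state, might be sketched in Python as below. The table contents, the language codes and the function name describe are assumptions made up for this example.

```python
from typing import List

# Hypothetical description table: (fact, language, state) -> wording.
DESCRIPTIONS = {
    ("internet_connection_ok", "en", True):
        "Your Internet connection works.",
    ("internet_connection_ok", "en", False):
        "Your Internet connection does not work.",
    ("internet_connection_ok", "de", False):
        "Ihre Internetverbindung funktioniert nicht.",
}


def describe(fact: str, state: bool, languages: List[str]) -> str:
    """Return the wording in the first language that has one."""
    for lang in languages:
        text = DESCRIPTIONS.get((fact, lang, state))
        if text is not None:
            return text
    # Last resort: show the raw fact name rather than nothing.
    return f"{fact} = {state}"


print(describe("internet_connection_ok", False, ["de", "en"]))
```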
Given all these (database) records about the state (within the session) of facts, rules and questions, the core logic of the expert system (I'll call it "rule engine") needs to decide which question to ask the user next. Because each question is closely linked to exactly one fact, the task can be reworded to "determine which fact to consider next".
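A very small rule engine along these lines could be sketched in Python as follows. It is a simplification under stated assumptions: each rule has a single condition, known facts are held in a plain dictionary, and the next fact is simply the first one mentioned in any rule that is neither known nor already answered with "don't know". The names infer and next_fact are made up for this sketch.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Literal = Tuple[str, bool]        # (fact name, required value)
Rule = Tuple[Literal, Literal]    # premise => conclusion

RULES: List[Rule] = [
    # not(internet_connection_ok) => not(mail_works)
    (("internet_connection_ok", False), ("mail_works", False)),
    # account_settings_outdated => not(worked_after_server_change)
    (("account_settings_outdated", True), ("worked_after_server_change", False)),
]


def infer(known: Dict[str, bool], rules: List[Rule]) -> Dict[str, bool]:
    """Apply every rule, forwards and as contrapositive, until nothing changes."""
    known = dict(known)
    changed = True
    while changed:
        changed = False
        for (p_fact, p_val), (c_fact, c_val) in rules:
            # Forward: premise holds, so the conclusion holds.
            if known.get(p_fact) == p_val and c_fact not in known:
                known[c_fact] = c_val
                changed = True
            # Contrapositive: conclusion is violated, so the premise is false.
            if c_fact in known and known[c_fact] != c_val and p_fact not in known:
                known[p_fact] = not p_val
                changed = True
    return known


def next_fact(known: Dict[str, bool], rules: List[Rule],
              skipped: FrozenSet[str] = frozenset()) -> Optional[str]:
    """First fact that is neither known nor answered with "don't know"."""
    for (p_fact, _), (c_fact, _) in rules:
        for fact in (p_fact, c_fact):
            if fact not in known and fact not in skipped:
                return fact
    return None


state = infer({"mail_works": True}, RULES)
print(state)
print(next_fact(state, RULES))
```

Starting from "mail works", the contrapositive of the first rule yields internet_connection_ok as true, so the engine moves on to account_settings_outdated.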
Core parts of the algorithm could be:
Possible extensions, in rough order of complexity, easiest first:
Save the state of the sessions for several weeks, in order to enable the user to return later and continue the session, e.g. after having determined the answer to a hard question. Enable the user to supply modified (corrected) fact answers, especially for facts previously answered with "don't know", or in cases of "a second look has revealed that things actually are different".
Specify a preset probability for facts as an additional criterion for choosing them, e.g. "a lot of people forgot to change their account settings when we switched servers" would increase the preset probability of account_settings_outdated.
Specify an "easiness to find out" for facts. Facts that are easier to find out (for the end-user) will be asked first.
Multiple possible solutions: if a fact is not a definite solution, but only a possible one, don't terminate, but continue investigating.
Use fuzzy logic for fact values (values 0...1) instead of Boolean logic (false or true, nothing else).
Specify a probability for rules, e.g. "if (100% a and 100% b) then there is an 80% probability of c".
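Two of these ideas are easy to sketch in Python. The first part ranks candidate facts by preset probability and easiness; multiplying the two is just one possible scoring, assumed here for illustration, and the numbers are invented. The second part evaluates a rule with fuzzy values, assuming the common min/max definitions of fuzzy AND/OR and treating the rule probability as a simple scaling factor.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FactMeta:
    name: str
    preset_probability: float  # 0..1, how likely this fact is the culprit
    easiness: float            # 0..1, how easily the user can find it out


def rank(candidates: List[FactMeta]) -> List[FactMeta]:
    """Order facts so that likely and easy-to-check ones are asked first."""
    return sorted(candidates,
                  key=lambda f: f.preset_probability * f.easiness,
                  reverse=True)


def fuzzy_and(a: float, b: float) -> float:
    return min(a, b)


def fuzzy_or(a: float, b: float) -> float:
    return max(a, b)


def fuzzy_not(a: float) -> float:
    return 1.0 - a


def apply_fuzzy_rule(a: float, b: float, rule_probability: float) -> float:
    """Degree of c for the rule "a AND b => c" with a given rule probability."""
    return rule_probability * fuzzy_and(a, b)


facts = [
    FactMeta("internet_connection_ok", 0.2, 0.9),       # invented values
    FactMeta("account_settings_outdated", 0.6, 0.5),    # invented values
    FactMeta("worked_after_server_change", 0.1, 0.3),   # invented values
]
print([f.name for f in rank(facts)])

# "if (100% a and 100% b) then there is an 80% probability of c"
print(apply_fuzzy_rule(1.0, 1.0, 0.8))
```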