Customizing full-sentence query pattern definitions

The NAM search engine lets you define full-sentence query patterns. The definitions are stored in casey-questions*.json files in your NAM Server\config directory.

The default queries are described in the casey-questions.json file that installs with the NAM Server.

Customization overview

When you're developing query patterns for NAM Server:

  1. Locate and make backups of the relevant configuration files on your NAM Server machine:

    • server\config\casey-questions*.json
    • server\config\casey-dictionary.json
    • server\config\casey-time-format.json

    Do not change the existing files. Instead, create new ones following the same naming pattern. You can use numbers in the file names to control whether your definitions have higher priority than (are loaded and evaluated before) the existing definitions. Keep the backups handy in case you need to revert your work. See Configuration files for file descriptions.

  2. Edit and test your queries incrementally:

    1. Make a change and save the file. You do not need to restart the NAM Server.
    2. Submit a test question to the NAM Server.
    3. Observe the results on the NAM Server. Basic things to look for:
      • Did the correct report open?
      • Were the correct filters applied to the report?
      • Is the correct time range displayed at the top of the report?
    4. Check the log file (server\log\casey.log) to verify the results. Make sure your definitions and utterances are handling your questions as expected. See Query log and troubleshooting for log file details.

Configuration files

NAM search configuration is controlled by JSON configuration files stored in your NAM Server\config directory.

casey-questions*.json

The casey-questions*.json files (there may be more than one) define the syntax for "intents". Currently, you can only define intents for opening NAM reports with proper filters. For example, the following entry (intent) defines phrases ("utterances") that will open the report named User explorer.

{
"report": "User explorer",
"utterances": [
  "What <$AUXV> [user] (*) do [in|over] {#TIME}",
  "What <was|were|is> [user] (*) doing"
],
"mappings": [
  ["BOTH", "userN"]
]
}

Warning

The casey-questions-*.json files that come with the product are overwritten during upgrades, so you will lose your modifications if you edit them and then upgrade the software. Instead, you should add and modify your own supplemental casey-questions*.json files as needed. See Query syntax below for details.

casey-dictionary.json

The casey-dictionary.json file is essentially a general thesaurus. For each keyword (key) used in search definitions, there is a list of equivalent words and phrases (utterances).

The collection of utterances related to a given key is used as alternative clauses when building the final utterance pattern within your intent. The key used in your utterance pattern is replaced with the alternative utterance values defined for it, as if you had listed them yourself.

The dictionary is used to simplify the process of writing and maintaining your intents as you don’t need to repeat the same collection of alternative utterances each time.

Important: You can only use simple words and phrases as utterances in the dictionary. The syntax used for utterance definitions in intents is not supported here.

{
 "key": "HAVE",
 "utterances": [
   "have",
   "has"
 ]
}

See Alternative phrases below for details.

casey-time-format.json

The casey-time-format.json file is a time-specific thesaurus. For each time keyword (key) used in search definitions, there is a list of equivalent words and phrases (utterances).

This works the same as casey-dictionary.json, with one exception: the utterance patterns are all compiled together into a single #TIME key. After a user query has been matched against one of the utterance patterns within one of your intents, the matched time phrase is resolved into the correct time range filter.

{
 "key": "3h",
 "utterances": [
   "[the] <last|past> <3|three> <hour|hours|h|hrs>",
   "[the] <last|past> 3h",
   "[the] <last|past> 3hrs"
 ]
}

See Time-specific alternative phrases below for details.

Query syntax

Actions supported by the search engine are defined in the default casey-questions-*.json files and any custom casey-questions*.json files you add to your deployment.

Each casey-questions*.json file defines a list of "intents" (things NAM should do in response to a query).

Each intent requires:

  • "report": the report to open if the query matches this intent. This must be the exact report name as defined in the NAM Server.
  • "utterances": a list of patterns defining the syntax of this query. A query that can be mapped to one of these utterance patterns will invoke the specified report.
  • "mappings": how information extracted from a question should be mapped to report filters.

The order of intents matters, as does the order of utterance patterns within each intent. You should define more specialized utterance patterns first.

Custom utterance patterns (questions) can be defined in any language. If you need to develop queries for another language, create a new casey-questions*.json file (such as a copy of the default file) and then translate utterances to that language.

Open your deployment's casey-questions-*.json file in a text editor to see the default set of questions and what they do. For example, the following intent specifies that NAM should open the User explorer report if you ask a question such as What was user John doing? or What is John doing?, which matches the second utterance ("What <was|were|is> [user] (*) doing") in the definition.

{
 "report": "User explorer",
 "utterances": [
   "What <$AUXV> [user] (*) do [in|over] {#TIME}",
   "What <was|were|is> [user] (*) doing"
 ],
 "mappings": [
   ["BOTH", "userN"]
 ]
}

where:

  • User explorer is (in this example) the name of the report that will provide the answer to the question.
  • $AUXV comes from the casey-dictionary.json dictionary of alternative phrases. See Alternative phrases below for details.
  • #TIME comes from casey-time-format.json, where it is expanded to regular time labels. See Time-specific alternative phrases below for details.

Bracket types

Optional phrases:

  • [ ] - This does not have to appear in the question and you don't want to use its value if it does appear.
  • { } - This does not have to appear in the question, but if it does appear in the question, you do want to put it into the next available capture.

Required phrases:

  • < > - This word or phrase has to appear in the question (or there will be no match to the utterance) but you don't want to use its value.
  • ( ) - This word or phrase has to appear in the question and you do want to put it into the next available capture.

In this utterance:
"What <$AUXV> [user] (*) do [in|over] {#TIME}"
you are saying that:

  • <$AUXV> has to appear in the question but you aren't interested in the actual value. It just has to be there.
  • [user] may or may not appear in the question. If it does appear, its value is not used.
  • (*) has to appear in the question and you care about the value (you want to store it in a capture and use it). In fact, (*) will capture any word or phrase in your question that would fit before "do".
  • do has to appear in the question. If you don’t use a dictionary key or alternative clause (see next), you don’t need to put the word or phrase in < > brackets.
  • [in|over] means that either "in" or "over" may but does not have to appear in the question. If it does appear, you don't want to use its value. The | is a logical OR to indicate that you want to match any of multiple listed possibilities (in this case, in or over will match in that position of the utterance).
  • {#TIME} does not have to appear in the question, but if it does appear in the question, you want to apply that time range to the report.
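
The bracket rules above can be illustrated with a small Python sketch that translates an utterance pattern into a regular expression. This is not how NAM implements matching; it only mirrors the rules described in this section. The contents of DICTIONARY and TIME_PHRASES are hypothetical stand-ins for entries in casey-dictionary.json and casey-time-format.json, and case-insensitive matching is an assumption.

```python
import re

# Hypothetical stand-ins for casey-dictionary.json and casey-time-format.json
# entries; the real files define their own keys and phrases.
DICTIONARY = {"AUXV": ["did", "does", "do"]}
TIME_PHRASES = ["last 7 days", "last 3 hours"]

# A token is a bracketed group or a bare word.
TOKEN = re.compile(r"\[[^\]]*\]|\{[^}]*\}|<[^>]*>|\([^)]*\)|\S+")

def literal(phrase):
    """Regex for a literal phrase, tolerant of repeated whitespace."""
    return r"\s+".join(re.escape(word) for word in phrase.split())

def alternatives(body):
    """Expand 'a|b', $KEY, #TIME and * into a regex alternation."""
    parts = []
    for alt in body.split("|"):
        alt = alt.strip()
        if alt == "*":
            parts.append(r".+?")  # any word or phrase
        elif alt == "#TIME":
            parts.extend(literal(p) for p in TIME_PHRASES)
        elif alt.startswith("$"):
            parts.extend(literal(p) for p in DICTIONARY[alt[1:]])
        else:
            parts.append(literal(alt))
    return "|".join(parts)

def compile_utterance(pattern):
    pieces = []
    for token in TOKEN.findall(pattern):
        kind = token[0] if token[0] in "[{<(" else ""
        body = token[1:-1] if kind else token
        inner = alternatives(body)
        # ( ) and { } capture their value; [ ] and < > do not.
        group = f"({inner})" if kind in ("{", "(") else f"(?:{inner})"
        piece = rf"\s+{group}"
        # [ ] and { } are optional; everything else is required.
        if kind in ("[", "{"):
            piece = f"(?:{piece})?"
        pieces.append(piece)
    return re.compile("".join(pieces), re.IGNORECASE)

def match_utterance(pattern, question):
    """Return the list of captures, or None if the question does not match."""
    text = " " + question.strip().rstrip("?").strip()
    m = compile_utterance(pattern).fullmatch(text)
    return list(m.groups()) if m else None

PATTERN = "What <$AUXV> [user] (*) do [in|over] {#TIME}"
print(match_utterance(PATTERN, "What did user John do in last 7 days"))
print(match_utterance(PATTERN, "What does John do?"))
```

In this sketch an optional capture that does not appear in the question is reported as None. How NAM numbers captures in that situation is not covered here.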

Mappings

Use mappings to handle filtering, encryption, and time ranges.

Wildcard placement in filters

Use BOTH, LEFT, RIGHT, or NONE (default) to indicate where to add a wildcard (asterisk) to a filter applied to the target report.

For example:
["BOTH", "appl", "section1", 1]
says that whatever is captured in the second capture (specified by 1, because numbering starts at 0) is sent as a filter on the appl (software service) dimension in section section1, and that wildcards are prepended and appended to the captured value in the filter.
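
As a rough illustration of the four placement options, the following Python sketch shows the filter value each one would produce. It assumes that LEFT adds the asterisk before the value and RIGHT adds it after; the function name apply_wildcard is made up for this example and is not part of NAM.

```python
def apply_wildcard(mode, value):
    """Return the filter value for a capture, given the wildcard placement.

    Assumption: LEFT puts the asterisk before the value, RIGHT after it.
    """
    if mode == "BOTH":
        return f"*{value}*"
    if mode == "LEFT":
        return f"*{value}"
    if mode == "RIGHT":
        return f"{value}*"
    return value  # NONE (default): no wildcard is added

print(apply_wildcard("BOTH", "CAS-Server"))  # *CAS-Server*
print(apply_wildcard("NONE", "CAS-Server"))  # CAS-Server
```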

Specifying how to handle name encryption

Use ENCRYPTED to offer the option of searching on encrypted (pseudonymized) data.

For example:
["ENCRYPTED", "userN", "s2"]
says that the first capture (no index is specified, so the default is capture 0) is sent as a filter on the userN (user name) dimension in section s2 of the report. If user name pseudonymization is enabled, the name is encrypted before it is passed on as a filter, so that it can match the values in the database, which are also encrypted in that case.

Specifying the report time range

Use TIME to specify a time range for your query.

For example, this mapping:
["TIME", 2]
uses the third capture (capture 2) in the query as the time range specification for the report. How the captured time phrase is interpreted is specified in the casey-time-format.json file.

See Time-specific alternative phrases for more on defining time ranges.

Mapping example

Sample definition:

  {
    "report": "Operation explorer",
    "utterances": [
      "[$THE] <$HEALTH> <of server> (*) <for> [$USER] (*) [$PREPOSITION] {#TIME}",
      "What <$BE> [$THE] <$HEALTH> <of server> (*) <for> [$USER] (*) [$PREPOSITION] {#TIME}"
    ],
    "mappings": [
      ["BOTH", "sDNSName", 0],
      ["ENCRYPTED", "userN", 1],
      ["TIME", 2],
      ["servers", "pPerspective"]
    ]
  }

In this example, if you asked the question:
What is the health of server CAS-Server for user Agnieszka in last 7 days
it would resolve to the second utterance in the above definition, which is:
"What <$BE> [$THE] <$HEALTH> <of server> (*) <for> [$USER] (*) [$PREPOSITION] {#TIME}"

where:

  • capture 0 = CAS-Server
  • capture 1 = Agnieszka
  • capture 2 = 7d as determined by a lookup in the casey-time-format.json file. See below for details.

NAM would open Operation explorer with the following mappings applied:

  • A filter on the sDNSName dimension in all sections with value *CAS-Server*
    BOTH means to set a wildcard on both sides of the value. Related mapping possibilities: LEFT, RIGHT, and NONE.
  • A filter on the userN dimension in all sections with value Agnieszka (if encryption is off) or enc:XXXXXXXXXXX (if encryption is on). If the person issuing the query does not have rights to view encrypted data, only the value Agnieszka would be added as a filter.
  • Filter pPerspective is set to the value servers in all sections (because no specific section is specified).
  • Time range for the report is set to Last 7 days (7d).
    To check whether last 7 days is a valid time range, the phrase is compared to the definitions in your casey-time-format.json file. In this example, last 7 days resolves to 7d through this time definition:
    {
     "key": "7d",
     "utterances": [
       "[the] <last|past> <7|seven> <day|days>"
     ]
    }

The report would be opened with:

  • Time range set to Last 7 days
  • A filter for Server name = *CAS-Server*
  • A filter for User name = *Agnieszka* if you do not have rights to view encrypted data, or a filter for User name = *Agnieszka* | enc:XXXXXXXXXXX if you do have rights to view encrypted data.
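
To make the mapping example easier to follow, here is a minimal Python sketch that turns the mappings and captures from this example into filters and a time range. The entry layout it assumes (an optional section name, then an optional capture index that defaults to 0, and a constant value when the first element is not a keyword) is inferred from the examples in this section and is not a specification. The resolve function and its output shape are hypothetical, and the encryption step is left out.

```python
WILDCARDS = {"BOTH": "*{}*", "LEFT": "*{}", "RIGHT": "{}*", "NONE": "{}"}

def resolve(mappings, captures):
    """Apply mapping entries to captures (layout inferred from the examples)."""
    result = {"filters": [], "time": None}
    for entry in mappings:
        head, rest = entry[0], list(entry[1:])
        # A trailing integer is the capture index; the default is capture 0.
        index = rest.pop() if rest and isinstance(rest[-1], int) else 0
        if head == "TIME":
            result["time"] = captures[index]
            continue
        dimension = rest[0]
        # No section given means the filter applies to all sections.
        section = rest[1] if len(rest) > 1 else "all"
        if head in WILDCARDS:
            value = WILDCARDS[head].format(captures[index])
        elif head == "ENCRYPTED":
            value = captures[index]  # encryption is omitted in this sketch
        else:
            value = head  # constant value, such as ["servers", "pPerspective"]
        result["filters"].append((dimension, section, value))
    return result

mappings = [
    ["BOTH", "sDNSName", 0],
    ["ENCRYPTED", "userN", 1],
    ["TIME", 2],
    ["servers", "pPerspective"],
]
captures = ["CAS-Server", "Agnieszka", "7d"]
print(resolve(mappings, captures))
```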

Alternative phrases

Open your deployment's casey-dictionary.json file in a text editor to see the dictionary of alternative terms defined for your deployment.

In this example, have, has, had, and having are all mapped to HAVE as it occurs in the default casey-questions.json file or any custom casey-questions*.json files.

{
 "key": "HAVE",
 "utterances": [
   "have",
   "has",
   "had",
   "having"
 ]
}

where:

  • key is the ID of the phrase alternatives collection.
  • utterances is all the alternative forms of this phrase.

For the query:
"What <$HAVE> [$USER] (*) been doing [$PREPOSITION] {#TIME}"
you would get a match on $HAVE if any of the alternatives (have, has, had, having) appeared in the corresponding place in a query. If you can think of any other likely alternatives, you could add them to the list.
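
The substitution described above can be pictured with a short Python sketch that replaces each $KEY in an utterance with its alternatives, as if you had listed them yourself. It is only an illustration: the expand_keys function is hypothetical, keys without a dictionary entry are left unchanged, and the assumption that the dictionary can be treated as a list of key/utterances entries is based only on the sample entry shown here.

```python
import re

# One entry in the shape shown above; a real dictionary contains many more.
entries = [
    {"key": "HAVE", "utterances": ["have", "has", "had", "having"]},
]
dictionary = {entry["key"]: entry["utterances"] for entry in entries}

def expand_keys(utterance, dictionary):
    """Replace each $KEY with its alternatives joined by '|'."""
    def replace(match):
        key = match.group(1)
        if key not in dictionary:
            return match.group(0)  # unknown key: leave it as written
        return "|".join(dictionary[key])
    return re.sub(r"\$(\w+)", replace, utterance)

print(expand_keys("What <$HAVE> [$USER] (*) been doing [$PREPOSITION] {#TIME}", dictionary))
# What <have|has|had|having> [$USER] (*) been doing [$PREPOSITION] {#TIME}
```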

Time-specific alternative phrases

Open your deployment's casey-time-format.json file in a text editor to see the #TIME phrases defined for your deployment. Refer back to your casey-questions.json file to see how these #TIME phrases are used in question definitions.

In this example, the 3h time range is defined.

{
 "key": "3h",
 "utterances": [
   "[the] <last|past> <3|three> <hour|hours|h|hrs>",
   "[the] <last|past> 3h",
   "[the] <last|past> 3hrs"
 ]
}

Phrases such as

  • the last 3 hours
  • past three hours
  • last 3 h

would all match this entry and set the report time range to Last 3 hours.
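
If you want to check candidate phrases against a time definition before editing the file, a small Python sketch such as the following can help. It handles only the [ ] and < > brackets used in the 3h entry, assumes case-insensitive matching, and is not the matching logic NAM itself uses. The to_regex and resolve_time names are made up for this example.

```python
import re

ENTRIES = [
    {
        "key": "3h",
        "utterances": [
            "[the] <last|past> <3|three> <hour|hours|h|hrs>",
            "[the] <last|past> 3h",
            "[the] <last|past> 3hrs",
        ],
    },
]

def to_regex(utterance):
    """Translate [optional] and <required> groups into a regular expression."""
    parts = []
    for token in re.findall(r"\[[^\]]*\]|<[^>]*>|\S+", utterance):
        if token[0] in "[<":
            alts = "|".join(re.escape(a) for a in token[1:-1].split("|"))
            part = rf"(?:{alts})\s+"
            if token[0] == "[":
                part = f"(?:{part})?"  # optional group
        else:
            part = re.escape(token) + r"\s+"  # bare word, required
        parts.append(part)
    return re.compile("".join(parts), re.IGNORECASE)

def resolve_time(phrase, entries=ENTRIES):
    """Return the key of the first entry whose utterance matches the phrase."""
    text = " ".join(phrase.split()) + " "  # every token is followed by a space
    for entry in entries:
        for utterance in entry["utterances"]:
            if to_regex(utterance).fullmatch(text):
                return entry["key"]
    return None

for phrase in ["the last 3 hours", "past three hours", "last 3 h", "last 5 hours"]:
    print(phrase, "->", resolve_time(phrase))
```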

You can add your own alternatives to the list of utterances for existing time range definitions, or add new time range definitions to suit your needs.

Query log and troubleshooting

Each question submitted to a NAM Server is logged to server\log\casey.log.

Successful question resolutions

A successfully evaluated question produces a log entry such as:
T CASEY 17-05-16 10:19:50.043 Handled query 'How are things today' with utterance (0) of intent casey-questions.json:0

This entry tells you how the question (How are things today) was matched:

  • Date and time of the query (in this example, 17-05-16 10:19:50.043).
  • Query: The label Handled query is followed by the actual query (in the example, How are things today).
  • File name: casey-questions.json is the name of the file containing the matching definition.
  • Definition: 0 (following the file name) is the intent definition number within the file. In this example, 0 says the first definition (numbering starts at 0) in the file contained a match.
  • Utterance: utterance (0) tells you which utterance pattern (numbered sequentially) within the definition was a match. In this example, 0 says the first utterance (numbering starts at 0) in the definition was applied to this query.

So this query matched the first utterance pattern in the first definition in file casey-questions.json.
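
When you test many questions, it can be convenient to pull these fields out of the log automatically. The following Python sketch parses a line in exactly the format of the example above; the regular expression is an assumption based on that single example, so adjust it if your log lines differ. The parse_handled function is hypothetical.

```python
import re

# Pattern derived from the single example line above (an assumption).
HANDLED = re.compile(
    r"^T CASEY (?P<date>\S+) (?P<time>\S+) Handled query '(?P<query>.*)' "
    r"with utterance \((?P<utterance>\d+)\) of intent (?P<file>.+):(?P<intent>\d+)$"
)

def parse_handled(line):
    """Return the fields of a 'Handled query' entry, or None for other lines."""
    match = HANDLED.match(line.strip())
    if not match:
        return None
    fields = match.groupdict()
    fields["utterance"] = int(fields["utterance"])
    fields["intent"] = int(fields["intent"])
    return fields

line = ("T CASEY 17-05-16 10:19:50.043 Handled query 'How are things today' "
        "with utterance (0) of intent casey-questions.json:0")
print(parse_handled(line))
```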

Unsuccessful question resolutions

If a query cannot be resolved using any question definition, a generic search is run instead and, for questions of three or more words, the query is logged like this:
T CASEY 17-05-16 10:23:54.924 Not handled query: Some unmatched question

That tells you:

  • Date and time of the query (in this example, 17-05-16 10:23:54.924).
  • Query: The label Not handled query is followed by the text of the query (in this example, Some unmatched question was the text of the question).

Search the log for instances of Not handled query to find queries that were not resolved according to your query definitions.
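
One way to run that search is a short Python sketch that counts how often each unmatched question appears. It relies only on the Not handled query: label shown above; the unmatched_queries function is hypothetical, and the sample lines are taken from this section.

```python
from collections import Counter

MARKER = "Not handled query: "

def unmatched_queries(lines):
    """Count each question that was logged as not handled."""
    counts = Counter()
    for line in lines:
        _, found, query = line.partition(MARKER)
        if found:
            counts[query.strip()] += 1
    return counts

sample = [
    "T CASEY 17-05-16 10:23:54.924 Not handled query: Some unmatched question",
    "T CASEY 17-05-16 10:19:50.043 Handled query 'How are things today' "
    "with utterance (0) of intent casey-questions.json:0",
    "T CASEY 17-05-16 10:23:54.924 Not handled query: Some unmatched question",
]
for query, count in unmatched_queries(sample).most_common():
    print(count, query)
```

To use it on a real log, pass an open file object for server\log\casey.log instead of the sample list. Questions that show up often are good candidates for new utterance patterns.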