7.3.58. `select`

7.3.58.1. Summary

select searches a table for records that match the specified conditions and then outputs them.

select is the most important command in Groonga. You need to understand select to use the full power of Groonga.

7.3.58.2. Syntax

This command takes many parameters.

The only required parameter is table. All other parameters are optional:

select table
       [match_columns=null]
       [query=null]
       [filter=null]
       [scorer=null]
       [sortby=null]
       [output_columns="_id, _key, *"]
       [offset=0]
       [limit=10]
       [drilldown=null]
       [drilldown_sortby=null]
       [drilldown_output_columns="_key, _nsubrecs"]
       [drilldown_offset=0]
       [drilldown_limit=10]
       [cache=yes]
       [match_escalation_threshold=0]
       [query_expansion=null]
       [query_flags=ALLOW_PRAGMA|ALLOW_COLUMN]
       [query_expander=null]
       [adjuster=null]
       [drilldown_calc_types=NONE]
       [drilldown_calc_target=null]
       [drilldown_filter=null]
       [sort_keys=null]
       [drilldown_sort_keys=null]
       [match_escalation=auto]
       [load_table=null]
       [load_columns=null]
       [load_values=null]
       [drilldown_max_n_target_records=-1]
       [n_workers=0]
       [fuzzy_max_distance_ratio=0]
       [fuzzy_max_distance=0]
       [fuzzy_max_expansions=10]
       [fuzzy_prefix_length=0]
       [fuzzy_with_transposition=yes]
       [fuzzy_tokenize=no]

This command has the following named parameters for dynamic columns:

columns[${NAME}].stage=null

columns[${NAME}].flags=COLUMN_SCALAR

columns[${NAME}].type=null

columns[${NAME}].value=null

columns[${NAME}].window.sort_keys=null

columns[${NAME}].window.group_keys=null

${NAME} can consist of one or more alphabetic characters, digits, and _. For example, column1 is a valid ${NAME}. This is the same rule as for normal columns. See also name.

Parameters that have the same ${NAME} are grouped.

For example, the following parameters specify one dynamic column:

--columns[name].stage initial

--columns[name].type UInt32

--columns[name].value 29

The following parameters specify two dynamic columns:

--columns[name1].stage initial

--columns[name1].type UInt32

--columns[name1].value 29

--columns[name2].stage filtered

--columns[name2].type Float

--columns[name2].value '_score * 0.1'
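
The grouping rule above can be illustrated with a small Python sketch. It is not part of Groonga. The function name group_dynamic_columns and the regular expression are assumptions made for illustration, and the pattern only approximates the ${NAME} rule described above.

```python
import re
from collections import defaultdict

# Hypothetical pattern for "columns[NAME].ATTRIBUTE" parameter names.
# NAME is limited to alphabetic characters, digits and "_".
DYNAMIC_COLUMN_PARAM = re.compile(r"columns\[([A-Za-z0-9_]+)\]\.(.+)")


def group_dynamic_columns(params):
    """Group parameters that share the same ${NAME} into one column."""
    groups = defaultdict(dict)
    for name, value in params.items():
        matched = DYNAMIC_COLUMN_PARAM.fullmatch(name)
        if matched:
            column_name, attribute = matched.groups()
            groups[column_name][attribute] = value
    return dict(groups)


params = {
    "table": "Entries",  # Not a dynamic column parameter: skipped.
    "columns[name1].stage": "initial",
    "columns[name1].type": "UInt32",
    "columns[name1].value": "29",
    "columns[name2].stage": "filtered",
    "columns[name2].type": "Float",
    "columns[name2].value": "_score * 0.1",
}
grouped = group_dynamic_columns(params)
print(sorted(grouped))
```

With the parameters above, the sketch reports two dynamic columns, name1 and name2, each with its own stage, type and value.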

This command has the following named parameters for advanced drilldown:

drilldowns[${LABEL}].keys=null

drilldowns[${LABEL}].sort_keys=null

drilldowns[${LABEL}].output_columns="_key, _nsubrecs"

drilldowns[${LABEL}].offset=0

drilldowns[${LABEL}].limit=10

drilldowns[${LABEL}].calc_types=NONE

drilldowns[${LABEL}].calc_target=null

drilldowns[${LABEL}].filter=null

drilldowns[${LABEL}].max_n_target_records=-1

drilldowns[${LABEL}].columns[${NAME}].stage=null

drilldowns[${LABEL}].columns[${NAME}].flags=COLUMN_SCALAR

drilldowns[${LABEL}].columns[${NAME}].type=null

drilldowns[${LABEL}].columns[${NAME}].value=null

drilldowns[${LABEL}].columns[${NAME}].window.sort_keys=null

drilldowns[${LABEL}].columns[${NAME}].window.group_keys=null

Deprecated since version 6.0.3: The drilldown[...] syntax is deprecated. Use drilldowns[...] instead.

${LABEL} can consist of one or more alphabetic characters, digits, _ and .. For example, parent.sub1 is a valid ${LABEL}.

Parameters that have the same ${LABEL} are grouped.

For example, the following parameters specify one drilldown:

--drilldowns[label].keys column

--drilldowns[label].sort_keys -_nsubrecs

The following parameters specify two drilldowns:

--drilldowns[label1].keys column1

--drilldowns[label1].sort_keys -_nsubrecs

--drilldowns[label2].keys column2

--drilldowns[label2].sort_keys _key
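
When an application generates these parameters, it may help to build them from a nested structure. The following Python sketch shows one way to do that. The helper name drilldown_args and the label pattern are assumptions for illustration only. They are not a Groonga API.

```python
import re

# Hypothetical check for ${LABEL}: alphabetic characters, digits, "_" and ".".
LABEL_PATTERN = re.compile(r"[A-Za-z0-9_.]+")


def drilldown_args(drilldowns):
    """Turn {label: {option: value}} into a flat command line argument list."""
    args = []
    for label, options in drilldowns.items():
        if not LABEL_PATTERN.fullmatch(label):
            raise ValueError(f"invalid drilldown label: {label!r}")
        for option, value in options.items():
            args.append(f"--drilldowns[{label}].{option}")
            args.append(value)
    return args


args = drilldown_args({
    "label1": {"keys": "column1", "sort_keys": "-_nsubrecs"},
    "label2": {"keys": "column2", "sort_keys": "_key"},
})
print(" ".join(args))
```

The resulting arguments are the same four parameters as in the two drilldowns example above.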

7.3.58.3. Usage

Let’s learn how to use select through examples. This section shows many common usages.

Here are the schema definition and sample data used in the examples.

Execution example:

table_create Entries TABLE_HASH_KEY ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
column_create Entries content COLUMN_SCALAR Text
# [[0,1337566253.89858,0.000355720520019531],true]
column_create Entries n_likes COLUMN_SCALAR UInt32
# [[0,1337566253.89858,0.000355720520019531],true]
column_create Entries tag COLUMN_SCALAR ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
table_create Terms TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram --normalizer NormalizerAuto
# [[0,1337566253.89858,0.000355720520019531],true]
column_create Terms entries_key_index COLUMN_INDEX|WITH_POSITION Entries _key
# [[0,1337566253.89858,0.000355720520019531],true]
column_create Terms entries_content_index COLUMN_INDEX|WITH_POSITION Entries content
# [[0,1337566253.89858,0.000355720520019531],true]
load --table Entries
[
{"_key":    "The first post!",
 "content": "Welcome! This is my first post!",
 "n_likes": 5,
 "tag": "Hello"},
{"_key":    "Groonga",
 "content": "I started to use Groonga. It's very fast!",
 "n_likes": 10,
 "tag": "Groonga"},
{"_key":    "Mroonga",
 "content": "I also started to use Mroonga. It's also very fast! Really fast!",
 "n_likes": 15,
 "tag": "Groonga"},
{"_key":    "Good-bye Senna",
 "content": "I migrated all Senna system!",
 "n_likes": 3,
 "tag": "Senna"},
{"_key":    "Good-bye Tritonn",
 "content": "I also migrated all Tritonn system!",
 "n_likes": 3,
 "tag": "Senna"}
]
# [[0,1337566253.89858,0.000355720520019531],5]

There is a table, Entries, for blog entries. An entry has a title, content, the number of likes for the entry and a tag. The title is the key of Entries. The content is the value of the Entries.content column. The number of likes is the value of the Entries.n_likes column. The tag is the value of the Entries.tag column.

The Entries._key column and the Entries.content column are indexed using the TokenBigram tokenizer. So both Entries._key and Entries.content are ready for fulltext search.

OK. The schema and data for the examples are ready.

7.3.58.3.1. Simple usage

Here is the simplest usage with the above schema and data. It outputs all records in the Entries table.

Execution example:

select Entries
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "The first post!",
#         "Welcome! This is my first post!",
#         5,
#         "Hello"
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ],
#       [
#         4,
#         "Good-bye Senna",
#         "I migrated all Senna system!",
#         3,
#         "Senna"
#       ],
#       [
#         5,
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         3,
#         "Senna"
#       ]
#     ]
#   ]
# ]

Why does the command output all records? There are two reasons. First, the command doesn’t specify any search conditions. No search condition means that all records are matched. Second, the table has only 5 records. The select command outputs at most 10 records by default, and 5 is less than 10. So the command outputs all records.

7.3.58.3.2. Search conditions

Search conditions are specified by query or filter. You can also specify both query and filter. In that case, selected records must match both query and filter.

7.3.58.3.2.1. Search condition: `query`

query is designed for a search box in a Web page. Imagine the search box at google.com. You specify search conditions for query as space-separated keywords. For example, search engine means that a matched record should contain two words, search and engine.

Normally, the query parameter is used for specifying fulltext search conditions. It can also be used for non-fulltext search conditions, but filter is normally used for that purpose.

The query parameter is used with the match_columns parameter when it specifies fulltext search conditions. match_columns specifies which columns and indexes are matched against query.

Here is a simple query usage example.

Execution example:

select Entries --match_columns content --query fast
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ]
#     ]
#   ]
# ]

The select command searches the Entries table for records that contain the word fast in the content column value.

query has its own syntax, but its details aren’t described here. See Query syntax for details.
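
The "all keywords must match" behavior of query can be illustrated with a small Python model over the sample data. This is only a sketch. The function naive_query is hypothetical, and case-insensitive substring matching stands in for Groonga’s tokenizer-based fulltext search, which behaves differently in general.

```python
# Sample data copied from the schema and data section above.
entries = [
    {"_key": "The first post!",
     "content": "Welcome! This is my first post!"},
    {"_key": "Groonga",
     "content": "I started to use Groonga. It's very fast!"},
    {"_key": "Mroonga",
     "content": "I also started to use Mroonga. It's also very fast! Really fast!"},
    {"_key": "Good-bye Senna",
     "content": "I migrated all Senna system!"},
    {"_key": "Good-bye Tritonn",
     "content": "I also migrated all Tritonn system!"},
]


def naive_query(records, match_column, query):
    """Keep records whose match_column contains every keyword in query."""
    keywords = query.lower().split()
    return [
        record for record in records
        if all(keyword in record[match_column].lower() for keyword in keywords)
    ]


one_keyword = [r["_key"] for r in naive_query(entries, "content", "fast")]
two_keywords = [r["_key"] for r in naive_query(entries, "content", "also fast")]
print(one_keyword)
print(two_keywords)
```

With one keyword, the model selects the same two records as the execution example above. Adding a second keyword narrows the result because both keywords must be present.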

7.3.58.3.2.2. Search condition: `filter`

filter is designed for complex search conditions. You specify search conditions for filter in an ECMAScript-like syntax.

Here is a simple filter usage example.

Execution example:

select Entries --filter 'content @ "fast" && _key == "Groonga"'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ]
#     ]
#   ]
# ]

The select command searches the Entries table for records that contain the word fast in the content column value and have Groonga as _key. There are three operators in the command: @, && and ==. @ is the fulltext search operator. && and == are the same as in ECMAScript. && is the logical AND operator and == is the equality operator.

filter has more operators and syntax, such as grouping by (...), but the details aren’t described here. See Script syntax for details.

7.3.58.3.3. Paging

You can specify the range of output records by offset and limit. Here is an example that outputs only the 2nd record.

Execution example:

select Entries --offset 1 --limit 1
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ]
#     ]
#   ]
# ]

offset is zero-based. --offset 1 means that the output range starts from the 2nd record.

limit specifies the maximum number of output records. --limit 1 means that at most 1 record is output. If no records are matched, the select command outputs no records.
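
The relation between offset, limit and the reported number of matched records can be modeled with list slicing. The following Python sketch is an illustration only. The function page is hypothetical and it handles only non-negative offset and limit values. Other values are not modeled here.

```python
def page(matched_records, offset=0, limit=10):
    """Return (number of matched records, records in the output range)."""
    if offset < 0 or limit < 0:
        raise ValueError("this sketch models only non-negative values")
    return len(matched_records), matched_records[offset:offset + limit]


# Keys of the sample records, in the order of the execution examples.
keys = [
    "The first post!",
    "Groonga",
    "Mroonga",
    "Good-bye Senna",
    "Good-bye Tritonn",
]

print(page(keys, offset=1, limit=1))
print(page(keys))
```

As in the execution example above, the number of matched records stays 5 even when only one record is output.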

7.3.58.3.4. The total number of records

You can use --limit 0 to retrieve the total number of records without any record contents.

Execution example:

select Entries --limit 0
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ]
#     ]
#   ]
# ]

--limit 0 is also useful for retrieving only the number of matched records.
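
An application usually reads the number of matched records from the JSON response. The following Python sketch parses the response shown above. The function name summarize_select_response is an assumption for illustration, and the sketch assumes a successful response. An error response such as the one returned for a nonexistent table has a different shape and isn’t handled.

```python
import json

# The response of "select Entries --limit 0" from the example above.
response_text = """
[
  [0, 1337566253.89858, 0.000355720520019531],
  [
    [
      [5],
      [["_id", "UInt32"], ["_key", "ShortText"], ["content", "Text"],
       ["n_likes", "UInt32"], ["tag", "ShortText"]]
    ]
  ]
]
"""


def summarize_select_response(text):
    """Extract return code, hit count, column names and records."""
    header, body = json.loads(text)
    result_set = body[0]            # The first result set is the search result.
    n_hits = result_set[0][0]       # [[N_HITS], COLUMNS, RECORD1, RECORD2, ...]
    column_names = [name for name, _type in result_set[1]]
    records = result_set[2:]        # Empty with --limit 0.
    return header[0], n_hits, column_names, records


summary = summarize_select_response(response_text)
print(summary)
```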

7.3.58.3.5. Drilldown

You can get additional grouped results for the search result in one select. In SQL you need two or more SELECTs, but select in Groonga can do it in one command.

This feature is called drilldown in Groonga. It’s also called faceted search in other search engines.

For example, think about the following situation.

You search for entries that contain the word fast:

Execution example:

select Entries --filter 'content @ "fast"'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ]
#     ]
#   ]
# ]

You want to use tag as an additional search condition like --filter 'content @ "fast" && tag == "???"'. But you don’t know a suitable tag until you see the result of content @ "fast".

If you know the number of matched records of each available tag, you can choose suitable tag. You can use drilldown for the case:

Execution example:

select Entries --filter 'content @ "fast"' --drilldown tag
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ]
#     ],
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Groonga",
#         2
#       ]
#     ]
#   ]
# ]

--drilldown tag returns a list of pairs. Each pair consists of an available tag and the number of matched records that have the tag. You can avoid the “no hit search” case by choosing a tag from the list. You can also avoid the “too many search results” case by choosing a tag that has few matched records.

You can create the following UI with the drilldown results:

Links to narrow search results. (Users don’t need to type a search query on their keyboard. They just click a link.)

Most e-commerce sites use this UI. See the side menu at Amazon.

Groonga supports not only counting grouped records but also finding the maximum and/or minimum value from grouped records, summing values in grouped records and so on. See Drilldown related parameters for details.
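
Conceptually, a drilldown by tag counts the matched records per tag value. The following Python sketch models that with collections.Counter over the two records matched by content @ "fast". It is a simplified illustration, not how Groonga implements drilldown, and the function name drilldown is hypothetical.

```python
from collections import Counter

# The two records matched by 'content @ "fast"' in the example above.
matched = [
    {"_key": "Groonga", "n_likes": 10, "tag": "Groonga"},
    {"_key": "Mroonga", "n_likes": 15, "tag": "Groonga"},
]


def drilldown(records, key):
    """Return (_key, _nsubrecs) pairs: each value of key and its record count."""
    counts = Counter(record[key] for record in records)
    return list(counts.items())


facets = drilldown(matched, "tag")
print(facets)
```

The single pair corresponds to the drilldown part of the execution example above: tag Groonga with 2 matched records.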

7.3.58.3.6. Dynamic column

You can create zero or more columns dynamically during a select execution. You can use them for drilldown by computed value, window functions and so on.

Here is an example that uses a dynamic column for drilldown by computed value. This example creates a new column named n_likes_class. The n_likes_class column holds the classified value of the Entries.n_likes value. This example classifies Entries.n_likes column values in steps of 10, and the lowest number in each class is the classified value. If an Entries.n_likes value is between 0 and 9, such as 3 and 5, the n_likes_class value (classified value) is 0. If an Entries.n_likes value is between 10 and 19, such as 10 and 15, the n_likes_class value (classified value) is 10.

You can use the number_classify function for the classification. You need to register the functions/number plugin by the plugin_register command to use the number_classify function.

This example does drilldown by the n_likes_class value. The drilldown result helps you see trends in the data.

Execution example:

plugin_register functions/number
# [[0,1337566253.89858,0.000355720520019531],true]
select \
  --table Entries \
  --columns[n_likes_class].stage initial \
  --columns[n_likes_class].type UInt32 \
  --columns[n_likes_class].value 'number_classify(n_likes, 10)' \
  --drilldown n_likes_class \
  --drilldown_sort_keys _nsubrecs \
  --output_columns n_likes,n_likes_class
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "n_likes_class",
#           "UInt32"
#         ]
#       ],
#       [
#         5,
#         0
#       ],
#       [
#         10,
#         10
#       ],
#       [
#         15,
#         10
#       ],
#       [
#         3,
#         0
#       ],
#       [
#         3,
#         0
#       ]
#     ],
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "UInt32"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         10,
#         2
#       ],
#       [
#         0,
#         3
#       ]
#     ]
#   ]
# ]

See Dynamic column related parameters for details.
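
The classification used in this example can be reproduced with integer arithmetic. The following Python sketch models number_classify(n_likes, 10) for non-negative integers and then counts the records per class. The Python function classify is an illustration of the rule described above, not the plugin’s implementation.

```python
from collections import Counter


def classify(value, interval):
    """Return the lowest number of the class that value belongs to."""
    return (value // interval) * interval


n_likes_values = [5, 10, 15, 3, 3]  # Entries.n_likes of the sample data.
classes = [classify(n_likes, 10) for n_likes in n_likes_values]

# Count records per class and sort by count, like --drilldown_sort_keys _nsubrecs.
counts = sorted(Counter(classes).items(), key=lambda pair: pair[1])

print(classes)
print(counts)
```

The classified values and the per-class counts agree with the execution example above: class 10 has 2 records and class 0 has 3 records.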

7.3.58.3.7. Window function

You can compute each record’s value from the values of grouped records. For example, you can compute the sum of each group and put the sum into each record. The difference from drilldown is that drilldown also computes the sum of each group, but it puts the sum into each group, not into each record.

Here is the result with a window function. Each record has the sum:

Group No.	Target value	Sum result
1	5	5
2	10	25
2	15	25
3	3	8
3	5	8

Here is the result with drilldown. Each group has sum:

Group No.	Target values	Sum result
1	5	5
2	10, 15	25
3	3, 5	8

Window function is useful for data analysis.

Here is an example that sums Entries.n_likes per Entries.tag:

Execution example:

plugin_register functions/number
# [[0,1337566253.89858,0.000355720520019531],true]
select \
  --table Entries \
  --columns[n_likes_sum_per_tag].stage initial \
  --columns[n_likes_sum_per_tag].type UInt32 \
  --columns[n_likes_sum_per_tag].value 'window_sum(n_likes)' \
  --columns[n_likes_sum_per_tag].window.group_keys tag \
  --output_columns tag,n_likes,n_likes_sum_per_tag
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "tag",
#           "ShortText"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "n_likes_sum_per_tag",
#           "UInt32"
#         ]
#       ],
#       [
#         "Hello",
#         5,
#         5
#       ],
#       [
#         "Groonga",
#         10,
#         25
#       ],
#       [
#         "Groonga",
#         15,
#         25
#       ],
#       [
#         "Senna",
#         3,
#         6
#       ],
#       [
#         "Senna",
#         3,
#         6
#       ]
#     ]
#   ]
# ]

See Window function related parameters for details.
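
The difference between a window function and drilldown can also be shown in a few lines of Python. This sketch models the window_sum example above with the sample data. The function names are hypothetical and the code only illustrates where the sums are placed.

```python
from collections import defaultdict

# (tag, n_likes) of the sample records, in output order.
rows = [
    ("Hello", 5),
    ("Groonga", 10),
    ("Groonga", 15),
    ("Senna", 3),
    ("Senna", 3),
]


def sum_per_group(records):
    """Drilldown style: one sum per group."""
    totals = defaultdict(int)
    for group, value in records:
        totals[group] += value
    return dict(totals)


def window_sum(records):
    """Window function style: the sum of its group is put into each record."""
    totals = sum_per_group(records)
    return [(group, value, totals[group]) for group, value in records]


print(sum_per_group(rows))
print(window_sum(rows))
```

The window style result has 5 rows, one per record, while the drilldown style result has 3 entries, one per tag.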

7.3.58.3.8. Typo tolerance

You can implement typo tolerance search by specifying how many characters are accepted as typos. If no records are matched by the given query, Groonga automatically searches again with a typo-fixed query.

The number of accepted typo characters is 0 by default. So typo tolerance search isn’t enabled by default.

You can enable typo tolerance search by specifying fuzzy_max_distance_ratio or fuzzy_max_distance. In general, --fuzzy_max_distance_ratio 0.34 is a good choice.

fuzzy_max_distance_ratio specifies how many typo characters are accepted based on the number of characters of each input term.

Here is a table that shows how many characters are accepted as typo with --fuzzy_max_distance_ratio 0.34:

The number of characters of a term	The number of accepted typo characters
1	0 (`floor(1 * 0.34) = floor(0.34) = 0`)
2	0 (`floor(2 * 0.34) = floor(0.68) = 0`)
3	1 (`floor(3 * 0.34) = floor(1.02) = 1`)
4	1 (`floor(4 * 0.34) = floor(1.36) = 1`)
5	1 (`floor(5 * 0.34) = floor(1.7) = 1`)
6	2 (`floor(6 * 0.34) = floor(2.04) = 2`)

In other words, Groonga doesn’t accept any typo for a short term (0-2 characters), accepts 1 typo for a middle-length term (3-5 characters) and accepts 2 or more typos for a long term (6 or more characters).
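
The table above follows from a single formula. Here is a small Python sketch of it. The function name accepted_typos is hypothetical, and counting characters with len() is an assumption that is adequate for the simple terms used here.

```python
import math


def accepted_typos(term, ratio):
    """floor(number of characters * ratio), as in the table above."""
    return math.floor(len(term) * ratio)


# Reproduce the table for terms of 1 to 6 characters, plus a 7 character term.
table = [(length, accepted_typos("x" * length, 0.34)) for length in range(1, 8)]
for length, n_typos in table:
    print(length, n_typos)
```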

Here is an example that shows that searching for Moronga (2 typos) finds Groonga:

Execution example:

select \
  --table Entries \
  --fuzzy_max_distance_ratio 0.34 \
  --match_columns content \
  --query Moronga \
  --output_columns content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "I started to use Groonga. It's very fast!",
#         1
#       ],
#       [
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         2
#       ]
#     ]
#   ]
# ]

You can specify a fixed number of accepted typo characters by fuzzy_max_distance. For example, Groonga accepts 2 typo characters for all terms with --fuzzy_max_distance 2. But --fuzzy_max_distance_ratio is better for many use cases.
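
To see why Moronga counts as 2 typos away from Groonga, you can compute an edit distance that treats swapping two adjacent characters as one edit. The following Python sketch uses the textbook optimal string alignment distance. This document doesn’t describe the algorithm Groonga uses internally, so treat the sketch as an approximation. The with_transposition argument is a hypothetical name chosen to mirror the fuzzy_with_transposition parameter.

```python
def typo_distance(a, b, with_transposition=True):
    """Optimal string alignment distance between a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # Delete all characters of a[:i].
    for j in range(cols):
        d[0][j] = j  # Insert all characters of b[:j].
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # Deletion
                d[i][j - 1] + 1,         # Insertion
                d[i - 1][j - 1] + cost,  # Substitution or match
            )
            if (with_transposition and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                # Swap of two adjacent characters counts as one edit.
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(typo_distance("moronga", "groonga"))
print(typo_distance("moronga", "mroonga"))
```

Under this model, moronga is within distance 2 of both groonga and mroonga, which is consistent with both records being found in the execution example above.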

Typo tolerance search needs correct terms to compare against. Groonga uses the terms in a lexicon as the correct terms. Terms is the lexicon in this case. Terms in a lexicon are generated by a tokenizer. If your data is in an alphabet-based language such as English, you can use TokenNgram, because TokenNgram tokenizes text into (almost) words for alphabet-based languages. If your data is in a non-alphabet-based language such as Japanese, you can’t use TokenNgram, because TokenNgram tokenizes text into N-character sequences for non-alphabet-based languages. You need to use a tokenizer based on a morphological analyzer for non-alphabet-based languages. For example, you can use TokenMecab for Japanese. (You can also use TokenMecab for non-Japanese languages with a suitable dictionary.)

Here is an example that uses typo tolerance search with Japanese text. --default_tokenizer TokenMecab for JapaneseTerms is important. JapaneseTerms is the lexicon in this case.

Execution example:

table_create JapaneseEntries TABLE_NO_KEY
# [[0,1337566253.89858,0.000355720520019531],true]
column_create JapaneseEntries content COLUMN_SCALAR Text
# [[0,1337566253.89858,0.000355720520019531],true]
table_create JapaneseTerms TABLE_PAT_KEY ShortText \
  --default_tokenizer TokenMecab \
  --normalizer NormalizerNFKC150
# [[0,1337566253.89858,0.000355720520019531],true]
column_create JapaneseTerms japanese_entries_content \
  COLUMN_INDEX|WITH_POSITION JapaneseEntries content
# [[0,1337566253.89858,0.000355720520019531],true]
load --table JapaneseEntries
[
{"content": "ようこそ！これが最初の投稿です！"},
{"content": "Groongaを使い始めました。とても速いですね！"},
{"content": "Mroongaも使い始めました。これもとても速いですね！本当に速い！"},
{"content": "Sennaのシステムをすべて移行しました！"},
{"content": "Tritonnのシステムもすべて移行しました！"}
]
# [[0,1337566253.89858,0.000355720520019531],5]
select \
  --table JapaneseEntries \
  --fuzzy_max_distance_ratio 0.34 \
  --match_columns content \
  --query ともて \
  --output_columns content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "Groongaを使い始めました。とても速いですね！",
#         1
#       ],
#       [
#         "Mroongaも使い始めました。これもとても速いですね！本当に速い！",
#         1
#       ]
#     ]
#   ]
# ]

See Fuzzy query related parameters for details.

7.3.58.4. Parameters

This section describes all parameters. Parameters are categorized.

7.3.58.4.1. Required parameters

There is one required parameter, table.

7.3.58.4.1.1. `table`

Specifies a table to be searched. table must be specified.

If a nonexistent table is specified, an error is returned.

Execution example:

select Nonexistent
# [
#   [
#     -22,
#     1337566253.89858,
#     0.000355720520019531,
#     "[select][table] invalid name: <Nonexistent>",
#     [
#       [
#         "execute",
#         "lib/proc/proc_select.cpp",
#         2929
#       ]
#     ]
#   ]
# ]

7.3.58.4.3. Advanced search parameters

7.3.58.4.3.1. `match_escalation_threshold`

Added in version 8.0.1.

Specifies the threshold that determines whether search strategy escalation is used. The threshold is compared against the number of matched records. If the number of matched records is equal to or less than the threshold, search strategy escalation is used. See Search for details about search strategy escalation.

The default threshold is 0. It means that search strategy escalation is used only when no records are matched.

The default threshold can be customized by one of the following:

--with-match-escalation-threshold option of configure

--match-escalation-threshold option of groonga command

match-escalation-threshold configuration item in configuration file

Here is a simple match_escalation_threshold usage example. The first select doesn’t have match_escalation_threshold parameter. The second select has match_escalation_threshold parameter.

Execution example:

select Entries --match_columns content --query groo
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ]
#     ]
#   ]
# ]
select Entries --match_columns content --query groo --match_escalation_threshold -1
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         0
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ]
#     ]
#   ]
# ]

The first select command searches the Entries table for records that contain the word groo in the content column value. But no records are matched because the TokenBigram tokenizer tokenizes groonga to groonga, not to gr|ro|oo|on|ng|ga. (The TokenBigramSplitSymbolAlpha tokenizer tokenizes groonga to gr|ro|oo|on|ng|ga. See Tokenizers for details.) It means that groonga is indexed but groo isn’t indexed. So no records are matched against groo by exact match. In this case, search strategy escalation is used because the number of matched records (0) is equal to match_escalation_threshold (0). One record is matched against groo by unsplit search.

The second select command also searches the Entries table for records that contain the word groo in the content column value. It also doesn’t find matched records by exact match. In this case, search strategy escalation is not used because the number of matched records (0) is larger than match_escalation_threshold (-1). So no more searches are executed, and no records are matched.
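
The threshold comparison described above fits in one line. The following Python sketch models only that comparison, as used when match_escalation is auto. The function name uses_escalation is hypothetical.

```python
def uses_escalation(n_matched_records, match_escalation_threshold=0):
    """Escalate when the number of matched records is <= the threshold."""
    return n_matched_records <= match_escalation_threshold


# First select: 0 matched records, default threshold 0.
print(uses_escalation(0))
# Second select: 0 matched records, threshold -1.
print(uses_escalation(0, match_escalation_threshold=-1))
```

Because the number of matched records is never negative, a threshold of -1 means that the comparison can never succeed.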

7.3.58.4.3.2. `match_escalation`

Specifies how to use match escalation. See also match_escalation_threshold and Search for details about match escalation.

Here are available values:

Value	Description
`auto`	Groonga uses match_escalation_threshold to determine whether match escalation is used or not. This is the default.
`yes`	Groonga always uses match escalation.
`no`	Groonga never uses match escalation.

--match_escalation yes is stronger than --match_escalation_threshold 9999...999. --filter 'true && column @ "query"' with --match_escalation yes uses match escalation. --filter 'true && column @ "query"' with --match_escalation_threshold 9999...999 doesn’t use match escalation.

Here is a simple match_escalation usage example. The first select doesn’t have match_escalation parameter. The second select has match_escalation parameter.

Execution example:

select Entries --filter 'true && content @ "groo"'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         0
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ]
#     ]
#   ]
# ]
select Entries --filter 'true && content @ "groo"' --match_escalation yes
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ]
#     ]
#   ]
# ]

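The first select command searches the Entries table for records that contain the word groo in the content column value. It doesn’t use match escalation, so no records are matched. The second select command runs the same search with --match_escalation yes. It uses match escalation, so it finds a matched record.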

7.3.58.4.3.3. `query_expansion`

Deprecated since version 3.0.2: Use query_expander instead.

7.3.58.4.3.4. `query_flags`

It customizes the query parameter syntax. By default, you cannot update column values by the query parameter. But if you specify ALLOW_COLUMN|ALLOW_UPDATE as query_flags, you can update column values by query.

Here are available values:

ALLOW_PRAGMA
ALLOW_COLUMN
ALLOW_UPDATE
ALLOW_LEADING_NOT
QUERY_NO_SYNTAX_ERROR
NONE

ALLOW_PRAGMA enables pragma at the head of query. This is not implemented yet.

ALLOW_COLUMN enables search against columns that are not included in match_columns. To specify column, there are COLUMN:... syntaxes.

ALLOW_UPDATE enables column update by query with COLUMN:=NEW_VALUE syntax. ALLOW_COLUMN is also required to update column because the column update syntax specifies column.

ALLOW_LEADING_NOT enables leading NOT condition with -WORD syntax. The query searches records that don’t match WORD. A leading NOT condition query is a heavy query in many cases because it matches many records. So this flag is disabled by default. Be careful when you use the flag.

QUERY_NO_SYNTAX_ERROR ensures that query never causes a syntax error. This flag is useful when an application uses user input directly and doesn’t want to show syntax errors to the user or in a log. This flag is disabled by default.

NONE is just ignored. You can use NONE for specifying no flags.

They can be combined by separating them with |, such as ALLOW_COLUMN|ALLOW_UPDATE.

The default value is ALLOW_PRAGMA|ALLOW_COLUMN.
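
Here is a minimal Python sketch of how the flag names described above combine. It is only a model for illustration, not Groonga's implementation: QueryFlag, parse_query_flags and can_update_columns are hypothetical names, and the numeric values are arbitrary.

```python
from enum import Flag


class QueryFlag(Flag):
    """Hypothetical mirror of the query_flags names (values are arbitrary)."""
    NONE = 0
    ALLOW_PRAGMA = 1
    ALLOW_COLUMN = 2
    ALLOW_UPDATE = 4
    ALLOW_LEADING_NOT = 8
    QUERY_NO_SYNTAX_ERROR = 16


def parse_query_flags(text="ALLOW_PRAGMA|ALLOW_COLUMN"):
    """Combine |-separated flag names; NONE contributes nothing."""
    flags = QueryFlag.NONE
    for name in text.split("|"):
        flags |= QueryFlag[name.strip()]
    return flags


def can_update_columns(flags):
    """COLUMN:=NEW_VALUE needs both ALLOW_COLUMN and ALLOW_UPDATE."""
    required = QueryFlag.ALLOW_COLUMN | QueryFlag.ALLOW_UPDATE
    return (flags & required) == required
```

In this model, the default value doesn't allow column updates, and ALLOW_UPDATE alone isn't enough either because ALLOW_COLUMN is missing.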

Here is a usage example of ALLOW_COLUMN.

Execution example:

select Entries --query content:@mroonga --query_flags ALLOW_COLUMN
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ]
#     ]
#   ]
# ]

The select command searches records that contain mroonga in content column value from Entries table.

Here is a usage example of ALLOW_UPDATE.

Execution example:

table_create Users TABLE_HASH_KEY ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
column_create Users age COLUMN_SCALAR UInt32
# [[0,1337566253.89858,0.000355720520019531],true]
load --table Users
[
{"_key": "alice", "age": 18},
{"_key": "bob",   "age": 20}
]
# [[0,1337566253.89858,0.000355720520019531],2]
select Users --query age:=19 --query_flags ALLOW_COLUMN|ALLOW_UPDATE
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "age",
#           "UInt32"
#         ]
#       ],
#       [
#         1,
#         "alice",
#         19
#       ],
#       [
#         2,
#         "bob",
#         19
#       ]
#     ]
#   ]
# ]
select Users
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "age",
#           "UInt32"
#         ]
#       ],
#       [
#         1,
#         "alice",
#         19
#       ],
#       [
#         2,
#         "bob",
#         19
#       ]
#     ]
#   ]
# ]

The first select command sets age column value of all records to 19. The second select command outputs updated age column values.

Here is a usage example of ALLOW_LEADING_NOT.

Execution example:

select Entries --match_columns content --query -mroonga --query_flags ALLOW_LEADING_NOT
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         4
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "The first post!",
#         "Welcome! This is my first post!",
#         5,
#         "Hello"
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ],
#       [
#         4,
#         "Good-bye Senna",
#         "I migrated all Senna system!",
#         3,
#         "Senna"
#       ],
#       [
#         5,
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         3,
#         "Senna"
#       ]
#     ]
#   ]
# ]

The select command searches records that don’t contain mroonga in content column value from Entries table.

Here are a schema definition and sample data to describe other flags:

Execution example:

table_create --name Magazine --flags TABLE_HASH_KEY --key_type ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
column_create --table Magazine --name title --type ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
load --table Magazine
[
{"_key":"http://test.jp/magazine/webplus","title":"WEB+"},
{"_key":"http://test.jp/magazine/database","title":"DataBase"},
]
# [[0,1337566253.89858,0.000355720520019531],2]

Here is an example of QUERY_NO_SYNTAX_ERROR:

Execution example:

select Magazine --match_columns title --query 'WEB +'  --query_flags ALLOW_PRAGMA|ALLOW_COLUMN|QUERY_NO_SYNTAX_ERROR
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "http://test.jp/magazine/webplus",
#         "WEB+"
#       ]
#     ]
#   ]
# ]

If you don’t specify this flag, the query causes a syntax error as below.

Execution example:

select Magazine --match_columns title --query 'WEB +'  --query_flags ALLOW_PRAGMA|ALLOW_COLUMN
# [
#   [
#     -63,
#     1337566253.89858,
#     0.000355720520019531,
#     "Syntax error: <WEB +||>",
#     [
#       [
#         "yy_syntax_error",
#         "grn_ecmascript.lemon",
#         2929
#       ]
#     ]
#   ]
# ]

Here is a usage example of NONE.

Execution example:

select Entries --match_columns content --query 'mroonga OR _key:Groonga' --query_flags NONE
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ]
#     ]
#   ]
# ]

The select command searches records that contain one of two words mroonga or _key:Groonga in content from Entries table. Note that _key:Groonga doesn’t mean that the value of _key column is equal to Groonga, because ALLOW_COLUMN flag is not specified.

7.3.58.4.3.5. `query_expander`#

It’s for query expansion. Query expansion substitutes specific words in query with other words. Normally, it’s used for synonym search.

It specifies a column that is used to substitute query parameter value. The format of this parameter value is “${TABLE}.${COLUMN}”. For example, “Terms.synonym” specifies synonym column in Terms table.

Table for query expansion is called “substitution table”. Substitution table’s key must be ShortText. So array table (TABLE_NO_KEY) can’t be used for query expansion, because an array table doesn’t have a key.

Column for query expansion is called “substitution column”. Substitution column’s value type must be ShortText. Column type must be vector (COLUMN_VECTOR).

Query expansion substitutes key of substitution table in query with values in substitution column. If a word in query is a key of substitution table, the word is substituted with substitution column value that is associated with the key. Substitution isn’t performed recursively. It means that substitution target words in substituted query aren’t substituted.

Here is a sample substitution table to show a simple query_expander usage example.

Execution example:

table_create Thesaurus TABLE_PAT_KEY ShortText --normalizer NormalizerAuto
# [[0,1337566253.89858,0.000355720520019531],true]
column_create Thesaurus synonym COLUMN_VECTOR ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
load --table Thesaurus
[
{"_key": "mroonga", "synonym": ["mroonga", "tritonn", "groonga mysql"]},
{"_key": "groonga", "synonym": ["groonga", "senna"]}
]
# [[0,1337566253.89858,0.000355720520019531],2]

Thesaurus substitution table has two keys, "mroonga" and "groonga". If a user searches with "mroonga", Groonga searches with "((mroonga) OR (tritonn) OR (groonga mysql))". If a user searches with "groonga", Groonga searches with "((groonga) OR (senna))".

Normally, it’s a good idea for the substitution table to use a normalizer. For example, if a normalizer is used, substitution target words are matched in a case insensitive manner. See Normalizers for available normalizers.

Note that those synonym values include the key value such as "mroonga" and "groonga". It’s recommended that you include the key value. If you don’t include the key value, the substituted value doesn’t include the original substitution target value. Normally, including the original value gives better search results. If you have a word that you don’t want to be searched, you should not include the original word. For example, you can implement “stop words” with an empty vector value.
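
Here is a minimal Python sketch of the substitution rule described above. It is only a model, not Groonga's implementation: THESAURUS and expand_query are hypothetical names, words are split at whitespace only, and lower() is a rough stand-in for a normalizer. Empty vector values (stop words) are not modeled.

```python
# Hypothetical in-memory stand-in for the Thesaurus substitution table.
THESAURUS = {
    "mroonga": ["mroonga", "tritonn", "groonga mysql"],
    "groonga": ["groonga", "senna"],
}


def expand_query(query, thesaurus=THESAURUS):
    """Substitute each word that is a key, in a single pass."""
    expanded = []
    for word in query.split():
        # lower() roughly imitates case insensitive matching by a normalizer.
        synonyms = thesaurus.get(word.lower())
        if synonyms:
            # Each value is wrapped in (...) and the values are joined by OR.
            expanded.append("(" + " OR ".join(f"({s})" for s in synonyms) + ")")
        else:
            expanded.append(word)
    return " ".join(expanded)


print(expand_query("mroonga"))
```

Because the words are visited only once, the groonga inside the substituted "groonga mysql" isn't expanded again. This corresponds to the rule that substitution isn't performed recursively.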

Here is a simple query_expander usage example.

Execution example:

select Entries --match_columns content --query "mroonga"
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ]
#     ]
#   ]
# ]
select Entries --match_columns content --query "mroonga" --query_expander Thesaurus.synonym
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ],
#       [
#         5,
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         3,
#         "Senna"
#       ]
#     ]
#   ]
# ]
select Entries --match_columns content --query "((mroonga) OR (tritonn) OR (groonga mysql))"
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ],
#       [
#         5,
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         3,
#         "Senna"
#       ]
#     ]
#   ]
# ]

The first select command doesn’t use query expansion. So a record that has "tritonn" isn’t found. The second select command uses query expansion. So a record that has "tritonn" is found. The third select command doesn’t use query expansion but it is the same as the second select command because it uses the expanded query.

Each substitute value can contain any Query syntax such as (...) and OR. You can use complex substitution by using this syntax.

Here is a complex substitution usage example that uses query syntax.

Execution example:

load --table Thesaurus
[
{"_key": "popular", "synonym": ["popular", "n_likes:>=10"]}
]
# [[0,1337566253.89858,0.000355720520019531],1]
select Entries --match_columns content --query "popular" --query_expander Thesaurus.synonym
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ]
#     ]
#   ]
# ]

The load command registers a new synonym "popular". It is substituted with ((popular) OR (n_likes:>=10)). The substituted query means that “popular” entries are entries that contain the word “popular” or have 10 or more likes.

The select command outputs records whose n_likes column value is equal to or more than 10 from Entries table.

7.3.58.4.3.6. `n_workers`#

Added in version 12.0.5.

Note

This is an experimental feature. Currently, this feature is still not stable.

This feature requires Command version 3 or later.

This feature requires that Apache Arrow is enabled in Groonga.

It depends on package provider whether Apache Arrow is enabled or not.

To check whether Apache Arrow is enabled, use the status command and see whether the apache_arrow value in its result is true.

If Apache Arrow is disabled, either build Groonga from the source code with Apache Arrow enabled by following the steps in Install, or ask the package provider to enable Apache Arrow.

drilldown, drilldowns and slices are executed in parallel when this parameter is set to -1, or to 2 or more.

By default, drilldown, drilldowns and slices are executed in serial. In other words, the next process is executed after the current process is finished. So queries tend to take a long time if there are a lot of drilldown, drilldowns and slices.

n_workers makes it possible to execute independent drilldown, drilldowns and slices in parallel. The total execution time of these processes can be shortened by executing them in parallel. This parallel execution is done for each select command.

“independent” means not using drilldowns.table to reference the results of other drilldowns or slices.

If there are dependencies equivalent to using drilldowns.table, a drilldown or slice waits until the drilldowns or slices it depends on are finished. Therefore, the degree of parallelism is reduced if they have dependencies.

Executing in parallel means using multiple CPUs at the same time. If you execute in parallel without free CPU resources, execution may actually become slower, because each thread has to wait until the other process running on its target CPU finishes.

It depends on a system configuration whether or not there are free CPU resources and how many n_workers should be specified.

For example, consider using Groonga HTTP server on a system with 6 CPUs.

Groonga HTTP server allocates 1 thread (= 1 CPU) for each request.

When the average number of concurrent connections is 6, there are no free CPU resources because all 6 CPUs are already in use processing requests.

When the average number of concurrent connections is 2, there are 4 free CPUs because only 2 CPUs are in use. When you specify 2 for n_workers, the select command uses at most 3 CPUs, including the thread for processing the request. Therefore, if two select commands with n_workers set to 2 are requested at the same time, they use at most 6 CPUs in total and are processed quickly by using all of the resources. When you specify a value greater than 2, the degree of parallelism can exceed the available CPU resources, so execution may actually become slower.
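
The arithmetic in this example can be written as a small Python sketch. peak_cpu_demand is a hypothetical helper for illustration only. It assumes that each request uses 1 request thread plus up to n_workers worker threads, as in the example above, and it covers only explicit (non-negative) values.

```python
def peak_cpu_demand(concurrent_requests, n_workers):
    """Upper bound on busy CPUs for a given request load."""
    if n_workers < 0:
        raise ValueError("this sketch only covers explicit thread counts")
    # 0 and 1 mean serial execution: only the request thread runs.
    workers = n_workers if n_workers >= 2 else 0
    # Each request has 1 request thread plus its worker threads.
    return concurrent_requests * (1 + workers)


n_cpus = 6
for requests, n_workers in [(6, 0), (2, 2), (2, 3)]:
    demand = peak_cpu_demand(requests, n_workers)
    print(requests, n_workers, demand, demand <= n_cpus)
```

With 6 CPUs, 2 concurrent requests with n_workers 2 need at most 6 CPUs, while n_workers 3 would need up to 8 CPUs, which is more than the system has.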

n_workers behaves as follows depending on the specified value.

When specifying 0 or 1
- Executes the select command in serial
When specifying 2 or more
- Executes the select command in parallel with at most the specified number of threads.
When specifying -1 or less
- Executes the select command in parallel with the threads of at most the number of CPU cores.

The default value of this parameter is 0. It means that the select command is executed in serial by default.
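
The value rules above can be summarized as a small Python sketch. max_threads is a hypothetical helper, not a Groonga API. It only restates the three cases above, and os.cpu_count() is used as a stand-in for the number of CPU cores.

```python
import os


def max_threads(n_workers, cpu_count=None):
    """Return the maximum thread count implied by an n_workers value."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if n_workers <= -1:
        return cpu_count  # -1 or less: up to the number of CPU cores
    if n_workers >= 2:
        return n_workers  # 2 or more: up to the specified number
    return 1              # 0 or 1: serial execution


print([max_threads(n, cpu_count=6) for n in (0, 1, 2, 4, -1)])
```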

Note

The default value can be changed by specifying the environment variable GRN_SELECT_N_WORKERS_DEFAULT.

7.3.58.4.5. Fuzzy query related parameters#

Added in version 13.0.8.

Note

This is an experimental feature. Currently, this feature is still not stable.

This section describes fuzzy query related parameters. See also Typo tolerance as a use case of fuzzy query.

You need to specify at least fuzzy_max_distance_ratio or fuzzy_max_distance to use fuzzy query.

Fuzzy query is executed automatically when no record is matched with the original query. It means that fuzzy query is implemented as one of match escalation methods. See also match_escalation_threshold for match escalation.

7.3.58.4.5.1. `fuzzy_max_distance_ratio`#

Added in version 13.0.8.

Note

This is an experimental feature. Currently, this feature is still not stable.

The default value is 0.

You need to specify fuzzy_max_distance_ratio or fuzzy_max_distance to enable fuzzy query. If you specify both of them, fuzzy_max_distance_ratio is used.

You can specify the accepted edit distance as a ratio of the length of the target query. For most use cases, this parameter is more suitable than fuzzy_max_distance.

In general, a long edit distance isn’t suitable for a short query because it increases noisy results. For example, hye with edit distance 3 accepts the following terms:

hey (preferable result)
eye (preferable result)
bye (preferable result)
hyper (preferable result?)
hyphen (preferable result?)

fuzzy_max_distance specifies a fixed edit distance for all queries (both short queries and long queries).

You can specify an edit distance for each query based on the number of characters of the target query by this parameter. For example, edit distance 1 is used for hye with --fuzzy_max_distance_ratio 0.34. Edit distance 2 is used for hypehn with --fuzzy_max_distance_ratio 0.34. It’s calculated by floor(${THE_NUMBER_OF_CHARACTERS} * ${FUZZY_MAX_DISTANCE_RATIO}):

hye: floor(3 * 0.34) = floor(1.02) = 1
hypehn: floor(6 * 0.34) = floor(2.04) = 2

In general, --fuzzy_max_distance_ratio 0.34 is a good value. If the value doesn’t fit your use case, you can change the value.

Here is a table that shows how many characters are accepted as typo with --fuzzy_max_distance_ratio 0.34:

The number of characters of a term	The number of accepted typo characters
1	0 (`floor(1 * 0.34) = floor(0.34) = 0`)
2	0 (`floor(2 * 0.34) = floor(0.68) = 0`)
3	1 (`floor(3 * 0.34) = floor(1.02) = 1`)
4	1 (`floor(4 * 0.34) = floor(1.36) = 1`)
5	1 (`floor(5 * 0.34) = floor(1.7) = 1`)
6	2 (`floor(6 * 0.34) = floor(2.04) = 2`)
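
The table above can be reproduced with a few lines of Python. accepted_edit_distance is a hypothetical helper that only applies the floor formula shown above. It assumes that len() matches the number of characters Groonga counts, which may not hold for every normalizer or encoding.

```python
import math


def accepted_edit_distance(term, ratio=0.34):
    """floor(number of characters * fuzzy_max_distance_ratio)."""
    return math.floor(len(term) * ratio)


for term in ["hye", "hypehn"]:
    print(term, accepted_edit_distance(term))
```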

Here is an example that accepts 1 typo character for vary and 2 typo characters for Gnoonag. This example specifies yes as match_escalation to enable fuzzy query for all queries (vary and Gnoonag). In general, you should not specify --match_escalation yes because it may increase noisy results.

Execution example:

select \
  --table Entries \
  --fuzzy_max_distance_ratio 0.34 \
  --match_columns content \
  --query 'vary Gnoonag' \
  --match_escalation yes \
  --output_columns content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "I started to use Groonga. It's very fast!",
#         2
#       ]
#     ]
#   ]
# ]

7.3.58.4.5.2. `fuzzy_max_distance`#

Added in version 13.0.8.

Note

This is an experimental feature. Currently, this feature is still not stable.

The default value is 0.

You need to specify fuzzy_max_distance_ratio or fuzzy_max_distance to enable fuzzy query. If you specify both of them, fuzzy_max_distance_ratio is used.

You can specify a fixed edit distance to be accepted by this parameter. For most use cases, fuzzy_max_distance_ratio is more suitable than this parameter.

Here is an example that accepts 1 typo character for vary:

Execution example:

select \
  --table Entries \
  --fuzzy_max_distance 1 \
  --match_columns content \
  --query vary \
  --output_columns content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "I started to use Groonga. It's very fast!",
#         1
#       ],
#       [
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         1
#       ]
#     ]
#   ]
# ]

7.3.58.4.5.3. `fuzzy_max_expansions`#

Added in version 13.0.8.

Note

This is an experimental feature. Currently, this feature is still not stable.

The default value is 10.

You can specify the maximum number of terms used as fixed terms. For example, if hye is the given query, this parameter is 2, and hey, eye and hyper are candidates for fixed terms, only hey and eye (2 terms) are used as fixed terms.

Here is an example that uses 1 fixed term for alx. Only all is used. also isn’t used.

Execution example:

select \
  --table Entries \
  --fuzzy_max_distance 2 \
  --fuzzy_max_expansions 1 \
  --match_columns content \
  --query alx \
  --output_columns content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "I migrated all Senna system!",
#         2
#       ],
#       [
#         "I also migrated all Tritonn system!",
#         2
#       ]
#     ]
#   ]
# ]

7.3.58.4.5.4. `fuzzy_prefix_length`#

Added in version 13.0.8.

Note

This is an experimental feature. Currently, this feature is still not stable.

The default value is 0.

You can specify the number of prefix characters. If this value is 1 and the given term is hye, the prefix is h. Fixed terms must start with h. For example, hey can be used as a fixed term but eye and bye can’t be used as fixed terms.

This option will improve performance when a lexicon has many terms.
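
Here is a minimal Python sketch of the prefix rule. filter_by_prefix is a hypothetical helper and the candidate lists are made up for illustration. It doesn't model how Groonga actually scans the lexicon.

```python
def filter_by_prefix(term, candidates, prefix_length=0):
    """Keep candidates that share the first prefix_length characters."""
    prefix = term[:prefix_length]
    # With prefix_length 0 the prefix is empty, so every candidate is kept.
    return [c for c in candidates if c.startswith(prefix)]


print(filter_by_prefix("hye", ["hey", "eye", "bye"], prefix_length=1))
print(filter_by_prefix("groonag", ["groonga", "mroonga"], prefix_length=2))
```

Only candidates that pass this check need an edit distance computation, which is why a longer prefix can reduce the work for a large lexicon.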

Here is an example that requires the gr prefix for fixed terms with the groonag query. Groonga can be used as a fixed term (matching is case insensitive here because NormalizerAuto is used), but Mroonga can’t be used as a fixed term because Mroonga doesn’t start with gr.

Execution example:

select \
  --table Entries \
  --fuzzy_max_distance 2 \
  --fuzzy_prefix_length 2 \
  --match_columns content \
  --query groonag \
  --output_columns content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "I started to use Groonga. It's very fast!",
#         2
#       ]
#     ]
#   ]
# ]

7.3.58.4.5.5. `fuzzy_with_transposition`#

Added in version 13.0.9.

Note

This is an experimental feature. Currently, this feature is still not stable.

The default value is yes.

You can choose edit distance 1 or 2 for the transposition case. An example of the transposition case is hello and ehllo: h and e are transposed. If this parameter is yes, the edit distance of this case is 1. Otherwise, it’s 2 (because an insertion and a deletion are needed).
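
Here is a Python sketch that shows the difference between the two settings. It uses the textbook dynamic programming algorithm (Levenshtein distance, optionally extended with adjacent transposition, also known as optimal string alignment). It isn't taken from Groonga's source code, and edit_distance is a hypothetical name.

```python
def edit_distance(a, b, with_transposition=True):
    """Edit distance; an adjacent swap costs 1 only when enabled."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (with_transposition and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


print(edit_distance("hello", "ehllo", with_transposition=True))
print(edit_distance("hello", "ehllo", with_transposition=False))
```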

Here is an example that uses edit distance 2 for transposition. In this example, you can’t use Mroonga as a fixed term for groonag because it has edit distance 3:

Substitute g with M: groonag -> Mroonag
Add a: Mroonag -> Mroonaga
Remove a: Mroonaga -> Mroonga

Execution example:

select \
  --table Entries \
  --fuzzy_max_distance 2 \
  --fuzzy_with_transposition no \
  --match_columns content \
  --query groonag \
  --output_columns content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "I started to use Groonga. It's very fast!",
#         1
#       ]
#     ]
#   ]
# ]

7.3.58.4.5.6. `fuzzy_tokenize`#

Added in version 13.0.9.

Note

This is an experimental feature. Currently, this feature is still not stable.

The default value is no.

You can choose whether to tokenize the given term before fuzzy query. If tokenizer is TokenNgram and the given term is he11o, it’s tokenized to he, 11 and o. If this value is yes, fixed terms are searched for each of them. For example, hi is found as a fixed term of he, 12 is found as a fixed term of 11 and x is found as a fixed term of o. And hi12x is searched. If this value is no, fixed terms are searched for he11o. For example, hello is found as a fixed term of he11o and hello is searched.

Note that the query is split into terms at one or more space characters before fuzzy query is executed. For example, hello world is split into hello and world by the query parser. (See also Query syntax.) hello and world are processed separately.

For morphological analyzer based tokenizers such as TokenMecab, no is suitable, because a typo-ed term isn’t tokenized as you expect in most cases. For example, ともて (a typo of とても) may be tokenized to とも (adjective) and て (conjunctive particle), not ともて (adverb).

Here is an example that tokenizes the given term before fuzzy query. In this example, you can’t use Groonga as a fixed term for gr00nga because gr, 00 and nga are processed separately.

Execution example:

select \
  --table Entries \
  --fuzzy_max_distance 2 \
  --fuzzy_tokenize yes \
  --match_columns content \
  --query gr00nga \
  --output_columns content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         0
#       ],
#       [
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ]
#     ]
#   ]
# ]

7.3.58.4.6. Dynamic column related parameters#

Added in version 6.0.6.

This section describes dynamic column related parameters. You can use dynamic column for window function but this section doesn’t describe window function. See Window function related parameters for window function.

You can create zero or more columns and fill values into these columns during a select execution. These columns are called “dynamic columns”. Once dynamic columns are created, you can use them in the same way as normal columns.

Dynamic column has a performance merit because its values are computed once and the computed values are reused.

Dynamic column increases memory usage because its values are kept during the select execution.

You need to use dynamic column in the following cases:

You want to name values like AS in SQL.

You want to use computed values for drilldown. Groonga doesn’t support drilldown target value computation in drilldown.

You want to use window function.

There are several points at which dynamic columns can be created. You must specify stage for each dynamic column to control its creation point. It’s important that you choose the proper point to get better performance.

For example, it’s not a good idea to create a dynamic column for all records when it is only used for output. The number of output records is small even if there are many records in a table, because in many cases you filter, sort and limit all records and output only the limited records.

See columns[${NAME}].stage for stage detail.

Here are parameters for dynamic column. They don’t include window function related parameters. See Window function related parameters for window function related parameters:

Name	Default value	Required or optional
`columns[${NAME}].stage`	`null`	Required
`columns[${NAME}].flags`	`COLUMN_SCALAR`	Optional
`columns[${NAME}].type`	`null`	Required
`columns[${NAME}].value`	`null`	Required

You need to specify multiple parameters for a dynamic column. ${NAME} is the name for each dynamic column. Parameters that use the same ${NAME} are treated as parameters for the same dynamic column. Here is an example to specify parameters for 2 dynamic columns (name1 and name2):

--columns[name1].stage initial
--columns[name1].type UInt32
--columns[name1].value 29

--columns[name2].stage filtered
--columns[name2].type ShortText
--columns[name2].value "29"
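
Here is a minimal Python sketch of this grouping rule. It is only an illustration of how parameters with the same ${NAME} belong together, not Groonga's parser. group_dynamic_columns and PARAM are hypothetical names.

```python
import re
from collections import defaultdict

# ${NAME} may use alphabets, digits and _ (the same rule as normal columns).
PARAM = re.compile(r"^columns\[([A-Za-z0-9_]+)\]\.(\w+(?:\.\w+)*)$")


def group_dynamic_columns(params):
    """Group {'columns[NAME].option': value} pairs by NAME."""
    grouped = defaultdict(dict)
    for key, value in params.items():
        match = PARAM.match(key)
        if match:  # parameters that aren't for dynamic columns are skipped
            name, option = match.groups()
            grouped[name][option] = value
    return dict(grouped)


params = {
    "columns[name1].stage": "initial",
    "columns[name1].type": "UInt32",
    "columns[name1].value": "29",
    "columns[name2].stage": "filtered",
    "columns[name2].type": "ShortText",
    "columns[name2].value": '"29"',
    "limit": "10",
}
print(group_dynamic_columns(params))
```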

7.3.58.4.6.1. `columns[${NAME}].stage`#

Added in version 6.0.6.

Specifies when the dynamic column is created. This is a required parameter to create a dynamic column.

Here are available stages:

Name	Description
`initial`	Dynamic column is created at first.
`filtered`	Dynamic column is created after query and filter are evaluated.
`output`	Dynamic column is created before output_columns is evaluated.

Here is the select process flow with dynamic column creation points. You should choose stage as late as possible:

Creates dynamic columns for initial stage. All records have these dynamic columns.

Evaluates query and filter. You can use dynamic columns created in initial stage.

Creates dynamic columns for filtered stage. Only filtered records have these dynamic columns.

Evaluates adjuster. You can use dynamic columns created in initial stage and filtered stage.

Evaluates scorer. You can use dynamic columns created in initial stage and filtered stage.

Evaluates sort_keys, offset and limit. You can use dynamic columns created in initial stage and filtered stage.

Evaluates Slice related parameters. You can use dynamic columns created in initial stage and filtered stage.

Evaluates Drilldown related parameters and Advanced drilldown related parameters. You can use dynamic columns created in initial stage and filtered stage. Note that you can create dynamic columns in each drilldown.

Creates dynamic columns for output stage. Only the records that remain after offset and limit are applied have these dynamic columns.

Evaluates output_columns. You can use dynamic columns created in initial stage, filtered stage and output stage.
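
Here is a Python sketch that counts how many values would be computed at each stage. It is only a model of the flow above, not Groonga's implementation. It uses the n_likes values of the five Entries records in this document, and the filter (n_likes >= 10), the sort order and the limit (1) are assumptions chosen for illustration.

```python
# n_likes values of the five Entries records used in this document.
records = [{"_id": i + 1, "n_likes": n} for i, n in enumerate([5, 10, 15, 3, 3])]
evaluations = {"initial": 0, "filtered": 0, "output": 0}


def fill(rows, stage):
    """Pretend to compute one dynamic column value per row."""
    evaluations[stage] += len(rows)
    return rows


# initial stage: all records get the dynamic column.
all_rows = fill(records, "initial")
# filtered stage: only records that passed query and filter.
matched = fill([r for r in all_rows if r["n_likes"] >= 10], "filtered")
# output stage: only records left after sort, offset and limit.
limited = fill(sorted(matched, key=lambda r: -r["n_likes"])[:1], "output")

print(evaluations)
```

The later the stage, the fewer values are computed. This is the reason to choose the stage as late as possible.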

Here is an example that creates is_popular column at initial stage. You can use is_popular in all parameters such as filter and output_columns:

Execution example:

select Entries \
  --columns[is_popular].stage initial \
  --columns[is_popular].type Bool \
  --columns[is_popular].value 'n_likes >= 10' \
  --filter is_popular \
  --output_columns _id,is_popular,n_likes
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "is_popular",
#           "Bool"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ]
#       ],
#       [
#         2,
#         true,
#         10
#       ],
#       [
#         3,
#         true,
#         15
#       ]
#     ]
#   ]
# ]

7.3.58.4.6.2. `columns[${NAME}].flags`#

Added in version 6.0.6.

Specifies flags for the dynamic column. It’s the same as flags parameter for column_create. See flags for available flags.

The default value is COLUMN_SCALAR.

Here is a columns[${NAME}].flags example. It creates a vector column with the COLUMN_VECTOR flag. plugin_register functions/vector is needed to use the vector_new function:

Execution example:

plugin_register functions/vector
# [[0,1337566253.89858,0.000355720520019531],true]
select Entries \
  --columns[vector].stage initial \
  --columns[vector].flags COLUMN_VECTOR \
  --columns[vector].type UInt32 \
  --columns[vector].value 'vector_new(1, 2, 3)' \
  --output_columns _id,vector
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "vector",
#           "UInt32"
#         ]
#       ],
#       [
#         1,
#         [
#           1,
#           2,
#           3
#         ]
#       ],
#       [
#         2,
#         [
#           1,
#           2,
#           3
#         ]
#       ],
#       [
#         3,
#         [
#           1,
#           2,
#           3
#         ]
#       ],
#       [
#         4,
#         [
#           1,
#           2,
#           3
#         ]
#       ],
#       [
#         5,
#         [
#           1,
#           2,
#           3
#         ]
#       ]
#     ]
#   ]
# ]

7.3.58.4.6.3. `columns[${NAME}].type`#

Added in version 6.0.6.

Specifies value type for the dynamic column. It’s the same as type parameter for column_create. See type for available types.

This is a required parameter.

Here is an example that creates a ShortText type column. The stored value is cast to ShortText automatically. In this example, a number is cast to ShortText:

Execution example:

select Entries \
  --columns[n_likes_string].stage initial \
  --columns[n_likes_string].type ShortText \
  --columns[n_likes_string].value n_likes \
  --output_columns _id,n_likes,n_likes_string
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "n_likes_string",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         5,
#         "5"
#       ],
#       [
#         2,
#         10,
#         "10"
#       ],
#       [
#         3,
#         15,
#         "15"
#       ],
#       [
#         4,
#         3,
#         "3"
#       ],
#       [
#         5,
#         3,
#         "3"
#       ]
#     ]
#   ]
# ]

7.3.58.4.6.4. `columns[${NAME}].value`#

Added in version 6.0.6.

Specifies expression that generates values for the dynamic column. The expression uses Script syntax. It’s the same syntax in filter. For example, 1 + 1, string_length("Hello"), column * 1.08 and so on are valid expressions.

When you use a window function, you need to specify the window function as the value of columns[${NAME}].value and also specify the other window function related parameters. See Window function related parameters for details.

This is a required parameter.

Here is an example that creates a new dynamic column that stores the number of characters of content. This example uses string_length function in functions/string plugin to compute the number of characters in a string. plugin_register is used to register functions/string plugin:

Execution example:

plugin_register functions/string
# [[0,1337566253.89858,0.000355720520019531],true]
select Entries \
  --columns[content_length].stage initial \
  --columns[content_length].type UInt32 \
  --columns[content_length].value 'string_length(content)' \
  --output_columns _id,content,content_length
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "content_length",
#           "UInt32"
#         ]
#       ],
#       [
#         1,
#         "Welcome! This is my first post!",
#         31
#       ],
#       [
#         2,
#         "I started to use Groonga. It's very fast!",
#         41
#       ],
#       [
#         3,
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         64
#       ],
#       [
#         4,
#         "I migrated all Senna system!",
#         28
#       ],
#       [
#         5,
#         "I also migrated all Tritonn system!",
#         35
#       ]
#     ]
#   ]
# ]

7.3.58.4.7. Window function related parameters#

Added in version 6.0.6.

This section describes window function related parameters. You need to use a dynamic column to use a window function. See Dynamic column related parameters for details on dynamic columns.

Window function in Groonga is similar to window function in SQL. A normal function computes its result with only the current record. A window function, on the other hand, computes its result with multiple records. Window function is useful for data analysis because it can process multiple records.

You can find supported window functions at Window function. For example, window_sum is a window function. It sums numbers in the target records.

Window function processes records that are grouped by the specified group keys. For example, window function processes three groups (Hello group, Groonga group and Senna group) in the following case. window_sum sums n_likes values in each group:

Group No.	Group key value	`n_likes` value	window_sum result
1	`Hello`	5	5
2	`Groonga`	10	25
2	`Groonga`	15	25
3	`Senna`	3	6
3	`Senna`	3	6

You can specify no group keys. In that case, window function processes only one group that includes all records. window_sum sums all n_likes values in the following case:

Group No.	`n_likes` value	window_sum result
1	5	36
1	10	36
1	15	36
1	3	36
1	3	36

Window function processes records in each group in the specified order. You can also omit sort keys, as in the group keys examples above.

The behavior when you specify no sort keys depends on each window function specification. For example, window_sum behaves differently depending on whether sort keys are specified. If you specify no sort keys, window_sum sums the values of all records in the group and puts the total to all target records, as in the group keys examples above. If you specify sort keys, window_sum behaves as a cumulative sum: it sums the values of the records in the group in sequence and puts the current sum to the current record, like the following:

Group No.	Group key value	Sort key value	`n_likes` value	window_sum result	Note
1	`Hello`	1	5	5	The first record in group No. 1. (`5 = 5`)
2	`Groonga`	90	10	10	The first record in group No. 2. (`10 = 10`)
2	`Groonga`	91	15	25	The second record in group No. 2. (`10 + 15 = 25`)
3	`Senna`	200	3	8	The second record in group No. 3. (`5 + 3 = 8`)
3	`Senna`	100	5	5	The first record in group No. 3. (`5 = 5`)
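
The two behaviors of window_sum described above can be modeled with a short Python sketch. It is an illustration under assumptions, not Groonga's implementation: the function name, its arguments, and the column name `order` (standing in for the sort key) are hypothetical, and records with equal sort key values are not treated specially. The data is the same as in the table above.

```python
from collections import defaultdict


def window_sum(records, value, group_key=None, sort_key=None):
    """Return one result per record, in input order (illustrative model)."""
    # No group key means a single group that includes all records.
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[record[group_key] if group_key else None].append(index)

    results = [None] * len(records)
    for indexes in groups.values():
        if sort_key is None:
            # No sort keys: every record in the group gets the group total.
            total = sum(records[i][value] for i in indexes)
            for i in indexes:
                results[i] = total
        else:
            # Sort keys: cumulative sum in the sorted order.
            running = 0
            for i in sorted(indexes, key=lambda i: records[i][sort_key]):
                running += records[i][value]
                results[i] = running
    return results


records = [
    {"tag": "Hello", "order": 1, "n_likes": 5},
    {"tag": "Groonga", "order": 90, "n_likes": 10},
    {"tag": "Groonga", "order": 91, "n_likes": 15},
    {"tag": "Senna", "order": 200, "n_likes": 3},
    {"tag": "Senna", "order": 100, "n_likes": 5},
]
print(window_sum(records, "n_likes", group_key="tag", sort_key="order"))
# [5, 10, 25, 8, 5]
print(window_sum(records, "n_likes", group_key="tag"))
# [5, 25, 25, 8, 8]
```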

Here are parameters for window function. Because window function is implemented on top of dynamic columns, you need to specify both the window function related parameters and the required dynamic column parameters. See Dynamic column related parameters for dynamic column related parameters:

Name	Required or optional	Note
`columns[${NAME}].value`	Required	Use Window function.
`columns[${NAME}].window.sort_keys`	Required if `columns[${NAME}].window.group_keys` isn’t specified.
`columns[${NAME}].window.group_keys`	Required if `columns[${NAME}].window.sort_keys` isn’t specified.

7.3.58.4.7.1. `columns[${NAME}].window.sort_keys`#

Added in version 6.0.6.

Specifies sort keys in each group. Window function processes records in each group in the specified order.

Sort keys are separated by ,. Each sort key is column name. It’s the same as sort_keys.

You must specify columns[${NAME}].window.sort_keys or columns[${NAME}].window.group_keys to use window function.

Here is an example that computes cumulative sum per Entries.tag. Each group is sorted by Entries._key:

Execution example:

select \
  --table Entries \
  --columns[n_likes_cumulative_sum_per_tag].stage initial \
  --columns[n_likes_cumulative_sum_per_tag].type UInt32 \
  --columns[n_likes_cumulative_sum_per_tag].value 'window_sum(n_likes)' \
  --columns[n_likes_cumulative_sum_per_tag].window.sort_keys _key \
  --columns[n_likes_cumulative_sum_per_tag].window.group_keys tag \
  --sort_keys _key \
  --output_columns tag,_key,n_likes,n_likes_cumulative_sum_per_tag
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "tag",
#           "ShortText"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "n_likes_cumulative_sum_per_tag",
#           "UInt32"
#         ]
#       ],
#       [
#         "Senna",
#         "Good-bye Senna",
#         3,
#         3
#       ],
#       [
#         "Senna",
#         "Good-bye Tritonn",
#         3,
#         6
#       ],
#       [
#         "Groonga",
#         "Groonga",
#         10,
#         10
#       ],
#       [
#         "Groonga",
#         "Mroonga",
#         15,
#         25
#       ],
#       [
#         "Hello",
#         "The first post!",
#         5,
#         5
#       ]
#     ]
#   ]
# ]

7.3.58.4.7.2. `columns[${NAME}].window.group_keys`#

Added in version 7.0.0.

Specifies group keys. Window function processes records in each group. If you specify no group keys, window function processes one group that includes all records.

Group keys are separated by ,. Each group key is column name. It’s the same as drilldown.

You must specify columns[${NAME}].window.sort_keys or columns[${NAME}].window.group_keys to use window function.

Here is an example that computes sum per Entries.tag:

Execution example:

select \
  --table Entries \
  --columns[n_likes_sum_per_tag].stage initial \
  --columns[n_likes_sum_per_tag].type UInt32 \
  --columns[n_likes_sum_per_tag].value 'window_sum(n_likes)' \
  --columns[n_likes_sum_per_tag].window.group_keys tag \
  --sort_keys _key \
  --output_columns tag,_key,n_likes,n_likes_sum_per_tag
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "tag",
#           "ShortText"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "n_likes_sum_per_tag",
#           "UInt32"
#         ]
#       ],
#       [
#         "Senna",
#         "Good-bye Senna",
#         3,
#         6
#       ],
#       [
#         "Senna",
#         "Good-bye Tritonn",
#         3,
#         6
#       ],
#       [
#         "Groonga",
#         "Groonga",
#         10,
#         25
#       ],
#       [
#         "Groonga",
#         "Mroonga",
#         15,
#         25
#       ],
#       [
#         "Hello",
#         "The first post!",
#         5,
#         5
#       ]
#     ]
#   ]
# ]

7.3.58.4.8. Drilldown related parameters#

This section describes basic drilldown related parameters. Advanced drilldown related parameters are described in another section.

7.3.58.4.8.1. `drilldown`#

Specifies keys for grouping separated by ,.

Matched records by specified search conditions are grouped by each key. If you specify no search condition, all records are grouped by each key.

Here is a simple drilldown example:

Execution example:

select Entries \
  --output_columns _key,tag \
  --drilldown tag
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         "The first post!",
#         "Hello"
#       ],
#       [
#         "Groonga",
#         "Groonga"
#       ],
#       [
#         "Mroonga",
#         "Groonga"
#       ],
#       [
#         "Good-bye Senna",
#         "Senna"
#       ],
#       [
#         "Good-bye Tritonn",
#         "Senna"
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Hello",
#         1
#       ],
#       [
#         "Groonga",
#         2
#       ],
#       [
#         "Senna",
#         2
#       ]
#     ]
#   ]
# ]

The select command outputs the following information:

There is one record that has the “Hello” tag.

There are two records that have the “Groonga” tag.

There are two records that have the “Senna” tag.

Here is a drilldown with search condition example:

Execution example:

select Entries \
  --output_columns _key,tag \
  --filter 'n_likes >= 5' \
  --drilldown tag
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         "The first post!",
#         "Hello"
#       ],
#       [
#         "Groonga",
#         "Groonga"
#       ],
#       [
#         "Mroonga",
#         "Groonga"
#       ]
#     ],
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Hello",
#         1
#       ],
#       [
#         "Groonga",
#         2
#       ]
#     ]
#   ]
# ]

The select command outputs the following information:

Among records whose n_likes value is 5 or larger:

There is one record that has the “Hello” tag.

There are two records that have the “Groonga” tag.

Here is a drilldown with multiple group keys example:

Execution example:

select Entries \
  --limit 0 \
  --output_columns _id \
  --drilldown tag,n_likes
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Hello",
#         1
#       ],
#       [
#         "Groonga",
#         2
#       ],
#       [
#         "Senna",
#         2
#       ]
#     ],
#     [
#       [
#         4
#       ],
#       [
#         [
#           "_key",
#           "UInt32"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         5,
#         1
#       ],
#       [
#         10,
#         1
#       ],
#       [
#         15,
#         1
#       ],
#       [
#         3,
#         2
#       ]
#     ]
#   ]
# ]

The select command outputs the following information:

About tag:

There is one record that has the “Hello” tag.

There are two records that have the “Groonga” tag.

There are two records that have the “Senna” tag.

About n_likes:

There is one record that has 5 as the n_likes value.

There is one record that has 10 as the n_likes value.

There is one record that has 15 as the n_likes value.

There are two records that have 3 as the n_likes value.
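
The three drilldown examples above can be modeled with a few lines of Python. This is only a sketch of the grouping idea, not Groonga's implementation. The helper name `drilldown` is hypothetical, and the sample data mirrors the Entries records shown in the outputs of this section.

```python
from collections import Counter

entries = [
    {"_key": "The first post!", "tag": "Hello", "n_likes": 5},
    {"_key": "Groonga", "tag": "Groonga", "n_likes": 10},
    {"_key": "Mroonga", "tag": "Groonga", "n_likes": 15},
    {"_key": "Good-bye Senna", "tag": "Senna", "n_likes": 3},
    {"_key": "Good-bye Tritonn", "tag": "Senna", "n_likes": 3},
]


def drilldown(records, key):
    """Return (group key, _nsubrecs) pairs in first-seen order."""
    return list(Counter(record[key] for record in records).items())


# No search condition: all records are grouped.
print(drilldown(entries, "tag"))
# [('Hello', 1), ('Groonga', 2), ('Senna', 2)]

# With a search condition: only matched records are grouped.
matched = [entry for entry in entries if entry["n_likes"] >= 5]
print(drilldown(matched, "tag"))
# [('Hello', 1), ('Groonga', 2)]

# Multiple group keys: each key produces its own independent result.
for key in ["tag", "n_likes"]:
    print(key, drilldown(entries, key))
# tag [('Hello', 1), ('Groonga', 2), ('Senna', 2)]
# n_likes [(5, 1), (10, 1), (15, 1), (3, 2)]
```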

7.3.58.4.8.2. `drilldown_sortby`#

Deprecated since version 6.0.3: Use drilldown_sort_keys instead.

7.3.58.4.8.3. `drilldown_sort_keys`#

Specifies sort keys for drilldown outputs separated by ,. Each sort key is column name.

You can refer to the number of grouped records by the _nsubrecs Pseudo column.

Here is a simple drilldown_sort_keys example:

Execution example:

select Entries \
  --limit 0 \
  --output_columns _id \
  --drilldown 'tag, n_likes' \
  --drilldown_sort_keys '-_nsubrecs, _key'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Groonga",
#         2
#       ],
#       [
#         "Senna",
#         2
#       ],
#       [
#         "Hello",
#         1
#       ]
#     ],
#     [
#       [
#         4
#       ],
#       [
#         [
#           "_key",
#           "UInt32"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         3,
#         2
#       ],
#       [
#         5,
#         1
#       ],
#       [
#         10,
#         1
#       ],
#       [
#         15,
#         1
#       ]
#     ]
#   ]
# ]

The drilldown result is sorted by the number of grouped records (= _nsubrecs) in descending order. Groups that have the same number of records are sorted by grouped key (= _key) in ascending order.
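
The ordering produced by '-_nsubrecs, _key' can be reproduced with a small Python sketch. The helper `sort_groups` is hypothetical and handles only the comma-separated keys and the leading - for descending order that appear in this example; it is not Groonga's sort implementation.

```python
def sort_groups(groups, sort_keys):
    """Sort group dicts by comma-separated keys; "-" prefix means descending."""
    ordered = list(groups)
    # Apply keys from last to first. Python's sort is stable, so the
    # earlier keys take priority and the later keys only break ties.
    for key in reversed([k.strip() for k in sort_keys.split(",")]):
        descending = key.startswith("-")
        name = key.lstrip("-")
        ordered.sort(key=lambda group: group[name], reverse=descending)
    return ordered


tag_groups = [
    {"_key": "Hello", "_nsubrecs": 1},
    {"_key": "Groonga", "_nsubrecs": 2},
    {"_key": "Senna", "_nsubrecs": 2},
]
n_likes_groups = [
    {"_key": 5, "_nsubrecs": 1},
    {"_key": 10, "_nsubrecs": 1},
    {"_key": 15, "_nsubrecs": 1},
    {"_key": 3, "_nsubrecs": 2},
]

print([g["_key"] for g in sort_groups(tag_groups, "-_nsubrecs, _key")])
# ['Groonga', 'Senna', 'Hello']
print([g["_key"] for g in sort_groups(n_likes_groups, "-_nsubrecs, _key")])
# [3, 5, 10, 15]
```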

The sort keys are used in all group keys specified in drilldown:

Execution example:

select Entries \
  --limit 0 \
  --output_columns _id \
  --drilldown 'tag, n_likes' \
  --drilldown_sort_keys '-_nsubrecs, _key'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Groonga",
#         2
#       ],
#       [
#         "Senna",
#         2
#       ],
#       [
#         "Hello",
#         1
#       ]
#     ],
#     [
#       [
#         4
#       ],
#       [
#         [
#           "_key",
#           "UInt32"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         3,
#         2
#       ],
#       [
#         5,
#         1
#       ],
#       [
#         10,
#         1
#       ],
#       [
#         15,
#         1
#       ]
#     ]
#   ]
# ]

The same sort keys are used in tag drilldown and n_likes drilldown.

If you want to use different sort keys for each drilldown, use Advanced drilldown related parameters.

7.3.58.4.8.4. `drilldown_output_columns`#

Specifies output columns for drilldown separated by ,.

Here is a drilldown_output_columns example:

Execution example:

select Entries \
  --limit 0 \
  --output_columns _id \
  --drilldown tag \
  --drilldown_output_columns _key
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ]
#       ],
#       [
#         "Hello"
#       ],
#       [
#         "Groonga"
#       ],
#       [
#         "Senna"
#       ]
#     ]
#   ]
# ]

The select command outputs only the grouped key.

If the grouped key is a referenced type column (= a column whose type is a table), you can access columns of the table referenced by the referenced type column.

Here are a schema definition and sample data to show drilldown against referenced type column:

Execution example:

table_create Tags TABLE_HASH_KEY ShortText --normalizer NormalizerAuto
# [[0,1337566253.89858,0.000355720520019531],true]
column_create Tags label COLUMN_SCALAR ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
column_create Tags priority COLUMN_SCALAR Int32
# [[0,1337566253.89858,0.000355720520019531],true]
table_create Items TABLE_HASH_KEY ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
column_create Items tag COLUMN_SCALAR Tags
# [[0,1337566253.89858,0.000355720520019531],true]
load --table Tags
[
{"_key": "groonga", "label": "Groonga", "priority": 10},
{"_key": "mroonga", "label": "Mroonga", "priority": 5}
]
# [[0,1337566253.89858,0.000355720520019531],2]
load --table Items
[
{"_key": "A", "tag": "groonga"},
{"_key": "B", "tag": "groonga"},
{"_key": "C", "tag": "mroonga"}
]
# [[0,1337566253.89858,0.000355720520019531],3]

Tags table is a referenced table. Items.tag is a referenced type column.

You can refer to Tags.label by label in drilldown_output_columns:

Execution example:

select Items \
  --limit 0 \
  --output_columns _id \
  --drilldown tag \
  --drilldown_output_columns '_key, label'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "label",
#           "ShortText"
#         ]
#       ],
#       [
#         "groonga",
#         "Groonga"
#       ],
#       [
#         "mroonga",
#         "Mroonga"
#       ]
#     ]
#   ]
# ]

You can use * to refer to all columns in the referenced table (= Tags):

Execution example:

select Items \
  --limit 0 \
  --output_columns _id \
  --drilldown tag \
  --drilldown_output_columns '_key, *'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "label",
#           "ShortText"
#         ],
#         [
#           "priority",
#           "Int32"
#         ]
#       ],
#       [
#         "groonga",
#         "Groonga",
#         10
#       ],
#       [
#         "mroonga",
#         "Mroonga",
#         5
#       ]
#     ]
#   ]
# ]

* is expanded to label, priority.

The default value of drilldown_output_columns is _key, _nsubrecs. It means that grouped key and the number of records in the group are output.

You can use more Pseudo columns in drilldown_output_columns, such as _max, _min, _sum and _avg, when you use drilldown_calc_types. See drilldown_calc_types document for details.

7.3.58.4.8.5. `drilldown_offset`#

Specifies the offset that determines the range of drilldown output records. The offset is zero-based. --drilldown_offset 1 means that the output range starts from the 2nd record.

Here is a drilldown_offset example:

Execution example:

select Entries \
  --limit 0 \
  --output_columns _id \
  --drilldown tag \
  --drilldown_sort_keys _key \
  --drilldown_offset 1
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Hello",
#         1
#       ],
#       [
#         "Senna",
#         2
#       ]
#     ]
#   ]
# ]

The select command outputs from the 2nd record.

You can specify a negative value. It means the number of grouped results + offset. If you have 3 grouped results and specify --drilldown_offset -2, you get grouped results from the 2nd grouped result (3 + -2 = 1, and 1 means the 2nd because offset is zero-based) to the 3rd grouped result.

Execution example:

select Entries \
  --limit 0 \
  --output_columns _id \
  --drilldown tag \
  --drilldown_sort_keys _key \
  --drilldown_offset -2
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Hello",
#         1
#       ],
#       [
#         "Senna",
#         2
#       ]
#     ]
#   ]
# ]

The select command outputs from the 2nd grouped result because the total number of grouped results is 3.

The default value of drilldown_offset is 0.
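
The offset arithmetic above can be checked with a small Python sketch. The helper name `apply_offset` is hypothetical, and offsets that fall outside the range of grouped results are not modeled here.

```python
def apply_offset(groups, offset=0):
    """Resolve a zero-based, possibly negative offset (illustrative only)."""
    if offset < 0:
        # A negative value means "the number of grouped results + offset".
        offset += len(groups)
    return groups[offset:]


groups = ["Groonga", "Hello", "Senna"]  # tag groups sorted by _key
print(apply_offset(groups, 1))   # ['Hello', 'Senna']
print(apply_offset(groups, -2))  # ['Hello', 'Senna']
print(apply_offset(groups))      # ['Groonga', 'Hello', 'Senna']
```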

7.3.58.4.8.6. `drilldown_limit`#

Specifies the maximum number of groups to output in a drilldown. If the number of groups is less than drilldown_limit, all groups are output.

Here is a drilldown_limit example:

Execution example:

select Entries \
  --limit 0 \
  --output_columns _id \
  --drilldown tag \
  --drilldown_sort_keys _key \
  --drilldown_offset 1 \
  --drilldown_limit 2
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Hello",
#         1
#       ],
#       [
#         "Senna",
#         2
#       ]
#     ]
#   ]
# ]

The select command outputs the 2nd and the 3rd groups.

You can specify a negative value. It means the number of groups + drilldown_limit + 1. For example, --drilldown_limit -1 outputs all groups. It’s a very useful value for showing all groups.

Here is a negative drilldown_limit value example.

Execution example:

select Entries \
  --limit 0 \
  --output_columns _id \
  --drilldown tag \
  --drilldown_sort_keys _key \
  --drilldown_limit -1
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Groonga",
#         2
#       ],
#       [
#         "Hello",
#         1
#       ],
#       [
#         "Senna",
#         2
#       ]
#     ]
#   ]
# ]

The select command outputs all groups.

The default value of drilldown_limit is 10.
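
The rule for negative values can be sketched in Python as follows. The helper name `apply_limit` is hypothetical, the offset is assumed to be non-negative here, and the sketch applies the formula from this section (the number of groups + drilldown_limit + 1) without modeling out-of-range values.

```python
def apply_limit(groups, offset=0, limit=10):
    """Apply a non-negative offset and a possibly negative limit (illustrative)."""
    if limit < 0:
        # A negative value means "the number of groups + limit + 1".
        limit += len(groups) + 1
    return groups[offset:offset + limit]


groups = ["Groonga", "Hello", "Senna"]  # tag groups sorted by _key
print(apply_limit(groups, offset=1, limit=2))  # ['Hello', 'Senna']
print(apply_limit(groups, limit=-1))           # ['Groonga', 'Hello', 'Senna']
print(apply_limit(groups, limit=-2))           # ['Groonga', 'Hello']
```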

7.3.58.4.8.7. `drilldown_calc_types`#

Specifies how to calculate (aggregate) values in grouped records by a drilldown. You can specify multiple calculation types separated by “,”. For example, MAX,MIN.

Calculation target values are read from a column of grouped records. The column is specified by drilldown_calc_target.

You can read calculated value by Pseudo column such as _max and _min in drilldown_output_columns.

You can use the following calculation types:

Type name	Pseudo column name	Needs drilldown_calc_target	Description
`NONE`	None	No	Ignored.
`COUNT`	`_nsubrecs`	No	Counts grouped records. It’s always enabled, so you don’t need to specify it.
`MAX`	`_max`	Yes	Finds the maximum integer value among integer values in grouped records.
`MIN`	`_min`	Yes	Finds the minimum integer value among integer values in grouped records.
`SUM`	`_sum`	Yes	Sums integer values in grouped records.
`AVG`	`_avg`	Yes	Averages integer/float values in grouped records.
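
As a rough model of the calculation types in the table above, the following Python sketch computes every pseudo column value per group. It is an illustration only, not Groonga's implementation; the helper name `drilldown_calc` is hypothetical and the sample data mirrors the Entries records used in this section.

```python
from collections import defaultdict

entries = [
    {"tag": "Hello", "n_likes": 5},
    {"tag": "Groonga", "n_likes": 10},
    {"tag": "Groonga", "n_likes": 15},
    {"tag": "Senna", "n_likes": 3},
    {"tag": "Senna", "n_likes": 3},
]


def drilldown_calc(records, key, target):
    """Aggregate `target` values per `key` group (illustrative model)."""
    values = defaultdict(list)
    for record in records:
        values[record[key]].append(record[target])
    return {
        group: {
            "_nsubrecs": len(group_values),                # COUNT
            "_max": max(group_values),                     # MAX
            "_min": min(group_values),                     # MIN
            "_sum": sum(group_values),                     # SUM
            "_avg": sum(group_values) / len(group_values), # AVG
        }
        for group, group_values in values.items()
    }


result = drilldown_calc(entries, "tag", "n_likes")
print(result["Groonga"])
# {'_nsubrecs': 2, '_max': 15, '_min': 10, '_sum': 25, '_avg': 12.5}
```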

Here is a MAX example:

Execution example:

select Entries \
  --limit -1 \
  --output_columns _id,n_likes \
  --drilldown tag \
  --drilldown_calc_types MAX \
  --drilldown_calc_target n_likes \
  --drilldown_output_columns _key,_max
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ]
#       ],
#       [
#         1,
#         5
#       ],
#       [
#         2,
#         10
#       ],
#       [
#         3,
#         15
#       ],
#       [
#         4,
#         3
#       ],
#       [
#         5,
#         3
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_max",
#           "Int64"
#         ]
#       ],
#       [
#         "Hello",
#         5
#       ],
#       [
#         "Groonga",
#         15
#       ],
#       [
#         "Senna",
#         3
#       ]
#     ]
#   ]
# ]

The select command groups all records by tag column value, finds the maximum n_likes column value for each group, and outputs pairs of the grouped key and the maximum n_likes column value for the group. It uses the _max Pseudo column to read the maximum n_likes column value.

Here is a MIN example:

Execution example:

select Entries \
  --limit -1 \
  --output_columns _id,n_likes \
  --drilldown tag \
  --drilldown_calc_types MIN \
  --drilldown_calc_target n_likes \
  --drilldown_output_columns _key,_min
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ]
#       ],
#       [
#         1,
#         5
#       ],
#       [
#         2,
#         10
#       ],
#       [
#         3,
#         15
#       ],
#       [
#         4,
#         3
#       ],
#       [
#         5,
#         3
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_min",
#           "Int64"
#         ]
#       ],
#       [
#         "Hello",
#         5
#       ],
#       [
#         "Groonga",
#         10
#       ],
#       [
#         "Senna",
#         3
#       ]
#     ]
#   ]
# ]

The select command groups all records by tag column value, finds the minimum n_likes column value for each group, and outputs pairs of the grouped key and the minimum n_likes column value for the group. It uses the _min Pseudo column to read the minimum n_likes column value.

Here is a SUM example:

Execution example:

select Entries \
  --limit -1 \
  --output_columns _id,n_likes \
  --drilldown tag \
  --drilldown_calc_types SUM \
  --drilldown_calc_target n_likes \
  --drilldown_output_columns _key,_sum
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ]
#       ],
#       [
#         1,
#         5
#       ],
#       [
#         2,
#         10
#       ],
#       [
#         3,
#         15
#       ],
#       [
#         4,
#         3
#       ],
#       [
#         5,
#         3
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_sum",
#           "Int64"
#         ]
#       ],
#       [
#         "Hello",
#         5
#       ],
#       [
#         "Groonga",
#         25
#       ],
#       [
#         "Senna",
#         6
#       ]
#     ]
#   ]
# ]

The select command groups all records by tag column value, sums all n_likes column values for each group and outputs pairs of grouped key and the summed n_likes column values for the group. It uses _sum Pseudo column to read the summed n_likes column values.

Here is an AVG example:

Execution example:

select Entries \
  --limit -1 \
  --output_columns _id,n_likes \
  --drilldown tag \
  --drilldown_calc_types AVG \
  --drilldown_calc_target n_likes \
  --drilldown_output_columns _key,_avg
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ]
#       ],
#       [
#         1,
#         5
#       ],
#       [
#         2,
#         10
#       ],
#       [
#         3,
#         15
#       ],
#       [
#         4,
#         3
#       ],
#       [
#         5,
#         3
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_avg",
#           "Float"
#         ]
#       ],
#       [
#         "Hello",
#         5.0
#       ],
#       [
#         "Groonga",
#         12.5
#       ],
#       [
#         "Senna",
#         3.0
#       ]
#     ]
#   ]
# ]

The select command groups all records by tag column value, averages all n_likes column values for each group and outputs pairs of grouped key and the averaged n_likes column values for the group. It uses _avg Pseudo column to read the averaged n_likes column values.

Here is an example that uses all calculation types:

Execution example:

select Entries \
  --limit -1 \
  --output_columns _id,n_likes \
  --drilldown tag \
  --drilldown_calc_types MAX,MIN,SUM,AVG \
  --drilldown_calc_target n_likes \
  --drilldown_output_columns _key,_nsubrecs,_max,_min,_sum,_avg
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ]
#       ],
#       [
#         1,
#         5
#       ],
#       [
#         2,
#         10
#       ],
#       [
#         3,
#         15
#       ],
#       [
#         4,
#         3
#       ],
#       [
#         5,
#         3
#       ]
#     ],
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ],
#         [
#           "_max",
#           "Int64"
#         ],
#         [
#           "_min",
#           "Int64"
#         ],
#         [
#           "_sum",
#           "Int64"
#         ],
#         [
#           "_avg",
#           "Float"
#         ]
#       ],
#       [
#         "Hello",
#         1,
#         5,
#         5,
#         5,
#         5.0
#       ],
#       [
#         "Groonga",
#         2,
#         15,
#         10,
#         25,
#         12.5
#       ],
#       [
#         "Senna",
#         2,
#         3,
#         3,
#         6,
#         3.0
#       ]
#     ]
#   ]
# ]

The select command specifies multiple calculation types separated by “,” like MAX,MIN,SUM,AVG. You can use _nsubrecs Pseudo column in drilldown_output_columns without specifying COUNT in drilldown_calc_types because COUNT is always enabled.

The default value of drilldown_calc_types is NONE. It means that only COUNT is enabled, because NONE is just ignored and COUNT is always enabled.
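
The meaning of the calculation types can be sketched outside Groonga. The following Python snippet is only an illustration, not Groonga's implementation. It hard-codes the tag and n_likes values of the five Entries records shown in the examples above and recomputes the values that _nsubrecs, _max, _min, _sum and _avg report for each group. The name `drilldown_calc` and the data layout are assumptions made for this sketch.

```python
from collections import defaultdict

# (tag, n_likes) pairs copied from the Entries records in the examples above.
ENTRIES = [
    ("Hello", 5),
    ("Groonga", 10),
    ("Groonga", 15),
    ("Senna", 3),
    ("Senna", 3),
]


def drilldown_calc(records):
    """Group (key, value) pairs by key and compute per-group aggregates."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    result = {}
    for key, values in groups.items():
        result[key] = {
            "_nsubrecs": len(values),  # COUNT: always available
            "_max": max(values),       # MAX
            "_min": min(values),       # MIN
            "_sum": sum(values),       # SUM
            "_avg": sum(values) / len(values),  # AVG
        }
    return result


for key, row in drilldown_calc(ENTRIES).items():
    print(key, row)
```

The per-group values agree with the drilldown results of the example that uses all calculation types.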

7.3.58.4.8.8. `drilldown_calc_target`#

Added in version 6.0.3.

Specifies the target column for drilldown_calc_types.

If you specify a calculation type that needs a target column such as MAX in drilldown_calc_types but you omit drilldown_calc_target, the calculation result is always 0.

You can specify only one column name like --drilldown_calc_target n_likes. You can’t specify multiple column names like --drilldown_calc_target _key,n_likes.

You can use a value referenced from the target record by joining column names with “.” like --drilldown_calc_target reference_column.nested_reference_column.value.

See drilldown_calc_types for how to use drilldown_calc_target.

The default value of drilldown_calc_target is null. It means that no calculation target column is specified.
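
The “.” notation can be pictured as following one reference after another until a plain value is reached. The Python sketch below models records as nested dictionaries, which is an assumption made only for illustration. `resolve_calc_target` is a hypothetical helper, not a Groonga API.

```python
from functools import reduce


def resolve_calc_target(record, path):
    """Follow a "."-separated path through nested dicts and return the final value."""
    return reduce(lambda current, name: current[name], path.split("."), record)


# A hypothetical record whose reference_column points to another record.
record = {
    "n_likes": 5,
    "reference_column": {"nested_reference_column": {"value": 7}},
}

print(resolve_calc_target(record, "n_likes"))
print(resolve_calc_target(record, "reference_column.nested_reference_column.value"))
```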

7.3.58.4.8.9. `drilldown_filter`#

Added in version 6.0.3.

Specifies the filter condition against the drilled down result.

The syntax is Script syntax. It’s the same as filter.

Here is an example that suppresses tags that occur only once:

Execution example:

select Entries \
  --limit -1 \
  --output_columns _id,tag \
  --drilldown tag \
  --drilldown_filter '_nsubrecs > 1' \
  --drilldown_output_columns _key,_nsubrecs
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "Hello"
#       ],
#       [
#         2,
#         "Groonga"
#       ],
#       [
#         3,
#         "Groonga"
#       ],
#       [
#         4,
#         "Senna"
#       ],
#       [
#         5,
#         "Senna"
#       ]
#     ],
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Groonga",
#         2
#       ],
#       [
#         "Senna",
#         2
#       ]
#     ]
#   ]
# ]
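
The effect of --drilldown_filter '_nsubrecs > 1' can be sketched in Python as counting first and filtering the groups afterwards. This is an illustration under simplifying assumptions, not Groonga's implementation. `drilldown_with_filter` and its `predicate` argument are hypothetical names.

```python
from collections import Counter

# tag values of the five Entries records shown above.
TAGS = ["Hello", "Groonga", "Groonga", "Senna", "Senna"]


def drilldown_with_filter(tags, predicate):
    """Count records per tag, then keep only the groups that satisfy predicate."""
    counts = Counter(tags)  # keeps first-seen order of the keys
    return [(key, n) for key, n in counts.items() if predicate(n)]


# Equivalent in spirit to: --drilldown_filter '_nsubrecs > 1'
print(drilldown_with_filter(TAGS, lambda nsubrecs: nsubrecs > 1))
```

The filter is applied to the drilled down groups, not to the searched records, which is why all five records still appear in the search result above.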

7.3.58.4.8.10. `drilldown_max_n_target_records`#

Added in version 12.0.0.

Specifies the maximum number of records of the drilldown target table (filtered result) to use for drilldown. If the number of filtered records is larger than the specified value, some records in the filtered result aren’t used for drilldown.

If the specified value is negative, it’s processed in the same way as limit. For example, -1 uses all records. The default value is -1. It means that all filtered records are used by default.

This feature is useful when the filtered result may be very large. A drilldown against a large filtered result may be slow. You can limit the maximum number of records to be used for drilldown with this feature.

Here is an example to limit the max number of records to be used for drilldown. The last 2 records, {"_id": 4, "tag": "Senna"} and {"_id": 5, "tag": "Senna"}, aren’t used:

Execution example:

select Entries \
  --limit -1 \
  --output_columns _id,tag \
  --drilldown tag \
  --drilldown_max_n_target_records 3
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "Hello"
#       ],
#       [
#         2,
#         "Groonga"
#       ],
#       [
#         3,
#         "Groonga"
#       ],
#       [
#         4,
#         "Senna"
#       ],
#       [
#         5,
#         "Senna"
#       ]
#     ],
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_nsubrecs",
#           "Int32"
#         ]
#       ],
#       [
#         "Hello",
#         1
#       ],
#       [
#         "Groonga",
#         2
#       ]
#     ]
#   ]
# ]
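
A rough Python sketch of this behavior follows. It assumes that the records used for drilldown are the leading records of the filtered result, which matches the example above, where the last 2 records aren't used. Only non-negative values and -1 are modeled. `drilldown_limited` is a hypothetical name.

```python
from collections import Counter

# tag values of the five Entries records shown above, in result order.
TAGS = ["Hello", "Groonga", "Groonga", "Senna", "Senna"]


def drilldown_limited(tags, max_n_target_records=-1):
    """Count tags using at most max_n_target_records leading records."""
    if max_n_target_records == -1:
        targets = tags  # -1: use all records
    elif max_n_target_records >= 0:
        targets = tags[:max_n_target_records]
    else:
        # Other negative values follow the rule of limit and are not modeled here.
        raise NotImplementedError("only -1 and non-negative values are modeled")
    return list(Counter(targets).items())


print(drilldown_limited(TAGS, 3))
print(drilldown_limited(TAGS))
```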

7.3.58.4.9. Advanced drilldown related parameters#

Added in version 4.0.8.

You can get multiple drilldown results by specifying multiple group keys by drilldown. But you need to use the same configuration for all drilldowns. For example, drilldown_output_columns is used by all drilldowns.

You can use a configuration for each drilldown by the following parameters:

drilldowns[${LABEL}].keys

drilldowns[${LABEL}].table

drilldowns[${LABEL}].sort_keys

drilldowns[${LABEL}].output_columns

drilldowns[${LABEL}].offset

drilldowns[${LABEL}].limit

drilldowns[${LABEL}].calc_types

drilldowns[${LABEL}].calc_target

drilldowns[${LABEL}].filter

drilldowns[${LABEL}].max_n_target_records

drilldowns[${LABEL}].key_vector_expansion

drilldowns[${LABEL}].columns[${NAME}].stage=null

drilldowns[${LABEL}].columns[${NAME}].flags=COLUMN_SCALAR

drilldowns[${LABEL}].columns[${NAME}].type=null

drilldowns[${LABEL}].columns[${NAME}].value=null

drilldowns[${LABEL}].columns[${NAME}].window.sort_keys=null

drilldowns[${LABEL}].columns[${NAME}].window.group_keys=null

${LABEL} is a variable. You can use the following characters for ${LABEL}:

Alphabets

Digits

.

_

${NAME} is a variable. You can use the following characters for ${NAME}:

Alphabets

Digits

_

Note

You can use more characters but it’s better that you use only these characters.

Parameters that have the same ${LABEL} value are grouped. Grouped parameters are used for one drilldown.

For example, there are 2 groups for the following parameters:

--drilldowns[label1].keys _key

--drilldowns[label1].output_columns _nsubrecs

--drilldowns[label2].keys tag

--drilldowns[label2].output_columns _key,_nsubrecs

drilldowns[label1].keys and drilldowns[label1].output_columns are grouped. drilldowns[label2].keys and drilldowns[label2].output_columns are also grouped.

In the label1 group, _key is used as the group key and _nsubrecs is used as the output column.

In the label2 group, tag is used as the group key and _key,_nsubrecs are used as the output columns.
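
The grouping rule can be illustrated with a small Python sketch that splits parameter names into a label and a per-drilldown parameter name. The regular expression accepts only the recommended label characters listed above (alphabets, digits, “.” and “_”). `group_drilldown_params` is a hypothetical helper written for this illustration and isn't how Groonga parses its arguments.

```python
import re
from collections import defaultdict

# Label characters follow the recommendation above: alphabets, digits, "." and "_".
PARAM_PATTERN = re.compile(r"^drilldowns\[(?P<label>[A-Za-z0-9._]+)\]\.(?P<name>.+)$")


def group_drilldown_params(params):
    """Group drilldowns[LABEL].NAME parameters by LABEL; ignore other parameters."""
    groups = defaultdict(dict)
    for name, value in params.items():
        match = PARAM_PATTERN.match(name)
        if match is None:
            continue
        groups[match.group("label")][match.group("name")] = value
    return dict(groups)


params = {
    "drilldowns[label1].keys": "_key",
    "drilldowns[label1].output_columns": "_nsubrecs",
    "drilldowns[label2].keys": "tag",
    "drilldowns[label2].output_columns": "_key,_nsubrecs",
    "limit": "-1",
}
print(group_drilldown_params(params))
```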

For the following parameters, see the documentation of the corresponding drilldown_XXX parameter to know how to use them:

drilldowns[${LABEL}].sort_keys: drilldown_sort_keys

drilldowns[${LABEL}].offset: drilldown_offset

drilldowns[${LABEL}].limit: drilldown_limit

drilldowns[${LABEL}].calc_types: drilldown_calc_types

drilldowns[${LABEL}].calc_target: drilldown_calc_target

drilldowns[${LABEL}].filter: drilldown_filter

drilldowns[${LABEL}].max_n_target_records: drilldown_max_n_target_records

For the following parameters, see the documentation of the corresponding columns[${NAME}].XXX parameter to know how to use them:

drilldowns[${LABEL}].columns[${NAME}].flags=COLUMN_SCALAR: columns[${NAME}].flags

drilldowns[${LABEL}].columns[${NAME}].type=null: columns[${NAME}].type

drilldowns[${LABEL}].columns[${NAME}].value=null: columns[${NAME}].value

drilldowns[${LABEL}].columns[${NAME}].window.sort_keys=null: columns[${NAME}].window.sort_keys

drilldowns[${LABEL}].columns[${NAME}].window.group_keys=null: columns[${NAME}].window.group_keys

The following parameters need more description:

drilldowns[${LABEL}].keys

drilldowns[${LABEL}].table

drilldowns[${LABEL}].output_columns

drilldowns[${LABEL}].columns[${NAME}].stage=null

The output format is also a bit different, so it needs more description too.

7.3.58.4.9.1. `drilldowns[${LABEL}].keys`#

Added in version 4.0.8.

drilldown can specify multiple keys for multiple drilldowns. But it can’t specify multiple keys for one drilldown.

drilldowns[${LABEL}].keys can’t specify multiple keys for multiple drilldowns. But it can specify multiple keys for one drilldown.

You can specify multiple keys separated by “,”.

Here is an example to group by multiple keys, tag and n_likes column values:

Execution example:

select Entries \
  --limit -1 \
  --output_columns tag,n_likes \
  --drilldowns[tag.n_likes].keys tag,n_likes \
  --drilldowns[tag.n_likes].output_columns _value.tag,_value.n_likes,_nsubrecs
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "tag",
#           "ShortText"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ]
#       ],
#       [
#         "Hello",
#         5
#       ],
#       [
#         "Groonga",
#         10
#       ],
#       [
#         "Groonga",
#         15
#       ],
#       [
#         "Senna",
#         3
#       ],
#       [
#         "Senna",
#         3
#       ]
#     ],
#     {
#       "tag.n_likes": [
#         [
#           4
#         ],
#         [
#           [
#             "tag",
#             "ShortText"
#           ],
#           [
#             "n_likes",
#             "UInt32"
#           ],
#           [
#             "_nsubrecs",
#             "Int32"
#           ]
#         ],
#         [
#           "Hello",
#           5,
#           1
#         ],
#         [
#           "Groonga",
#           10,
#           1
#         ],
#         [
#           "Groonga",
#           15,
#           1
#         ],
#         [
#           "Senna",
#           3,
#           2
#         ]
#       ]
#     }
#   ]
# ]

tag.n_likes is used as the label for the drilldown parameters group. You can refer to grouped keys by _value.${KEY_NAME} syntax in drilldowns[${LABEL}].output_columns. ${KEY_NAME} is a column name that is used as a group key. tag and n_likes are ${KEY_NAME} in this case.

Note that you can’t use _value.${KEY_NAME} syntax when you specify just one key as drilldowns[${LABEL}].keys like --drilldowns[tag].keys tag. You should use _key in that case. It’s the same rule as in drilldown_output_columns.
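
Grouping by multiple keys can be pictured as counting distinct combinations of key values. The Python sketch below is an illustration only. It models the Entries records as dictionaries, and `drilldown_by_keys` is a hypothetical name.

```python
from collections import Counter

# tag and n_likes values of the five Entries records shown above.
ENTRIES = [
    {"tag": "Hello", "n_likes": 5},
    {"tag": "Groonga", "n_likes": 10},
    {"tag": "Groonga", "n_likes": 15},
    {"tag": "Senna", "n_likes": 3},
    {"tag": "Senna", "n_likes": 3},
]


def drilldown_by_keys(records, keys):
    """Count records per distinct combination of the given key columns."""
    return Counter(tuple(record[key] for key in keys) for record in records)


for combination, nsubrecs in drilldown_by_keys(ENTRIES, ["tag", "n_likes"]).items():
    print(combination, nsubrecs)
```

The two Senna records share the same n_likes value, so they fall into one group, while the two Groonga records form separate groups.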

7.3.58.4.9.2. `drilldowns[${LABEL}].table`#

Added in version 6.0.2.

Specify ${LABEL} of other drilldowns or slices.

You can drilldown the result of specified ${LABEL}. It means that this parameter enables a nested drilldown.

Here is an example of a nested drilldown. The first drilldown groups by tag, and then the second drilldown groups the first result by category.

Execution example:

table_create NestedDrilldownTags TABLE_PAT_KEY ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
column_create NestedDrilldownTags category COLUMN_SCALAR ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
table_create NestedDrilldownMemos TABLE_HASH_KEY ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
column_create NestedDrilldownMemos tag COLUMN_SCALAR NestedDrilldownTags
# [[0,1337566253.89858,0.000355720520019531],true]
load --table NestedDrilldownMemos
[
{"_key": "Groonga is fast!", "tag": "Groonga"},
{"_key": "Groonga sticker!", "tag": "Groonga"},
{"_key": "Mroonga sticker!", "tag": "Mroonga"},
{"_key": "Rroonga is fast!", "tag": "Rroonga"}
]
# [[0,1337566253.89858,0.000355720520019531],4]
load --table NestedDrilldownTags
[
{"_key": "Groonga", "category": "C/C++"},
{"_key": "Mroonga", "category": "C/C++"},
{"_key": "PGroonga", "category": "C/C++"},
{"_key": "Rroonga", "category": "Ruby"}
]
# [[0,1337566253.89858,0.000355720520019531],4]
select NestedDrilldownMemos \
  --drilldowns[Tag].keys tag \
  --drilldowns[Tag].output_columns _key \
  --drilldowns[Category].table Tag \
  --drilldowns[Category].keys category \
  --drilldowns[Category].output_columns _key,_nsubrecs
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         4
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "tag",
#           "NestedDrilldownTags"
#         ]
#       ],
#       [
#         1,
#         "Groonga is fast!",
#         "Groonga"
#       ],
#       [
#         2,
#         "Groonga sticker!",
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga sticker!",
#         "Mroonga"
#       ],
#       [
#         4,
#         "Rroonga is fast!",
#         "Rroonga"
#       ]
#     ],
#     {
#       "Tag": [
#         [
#           3
#         ],
#         [
#           [
#             "_key",
#             "ShortText"
#           ]
#         ],
#         [
#           "Groonga"
#         ],
#         [
#           "Mroonga"
#         ],
#         [
#           "Rroonga"
#         ]
#       ],
#       "Category": [
#         [
#           2
#         ],
#         [
#           [
#             "_key",
#             "ShortText"
#           ],
#           [
#             "_nsubrecs",
#             "Int32"
#           ]
#         ],
#         [
#           "C/C++",
#           2
#         ],
#         [
#           "Ruby",
#           1
#         ]
#       ]
#     }
#   ]
# ]

In this example, the schema contains a table named NestedDrilldownMemos, which has a column named tag, and a table named NestedDrilldownTags, which has a column named category.

Tag drilldowns NestedDrilldownMemos by tag. Thus, the result of Tag contains one record each for Groonga, Mroonga and Rroonga. And then, Category drilldowns Tag by category. Thus, the result of Category shows that two records have C/C++ and one record has Ruby.
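
The two steps can be mirrored in a short Python sketch. It copies the data loaded in the example above and is meant only to illustrate why PGroonga, which no memo uses, doesn't contribute to the C/C++ count. `nested_drilldown` is a hypothetical name.

```python
from collections import Counter

# (memo key, tag) pairs loaded into NestedDrilldownMemos above.
MEMOS = [
    ("Groonga is fast!", "Groonga"),
    ("Groonga sticker!", "Groonga"),
    ("Mroonga sticker!", "Mroonga"),
    ("Rroonga is fast!", "Rroonga"),
]

# tag -> category, as loaded into NestedDrilldownTags above.
TAG_CATEGORY = {
    "Groonga": "C/C++",
    "Mroonga": "C/C++",
    "PGroonga": "C/C++",
    "Rroonga": "Ruby",
}


def nested_drilldown(memos, tag_category):
    """First group memos by tag, then group the distinct tags by category."""
    tags = list(dict.fromkeys(tag for _key, tag in memos))  # first drilldown
    categories = Counter(tag_category[tag] for tag in tags)  # second drilldown
    return tags, categories


tags, categories = nested_drilldown(MEMOS, TAG_CATEGORY)
print(tags)
print(categories)
```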

7.3.58.4.9.3. `drilldowns[${LABEL}].key_vector_expansion`#

Added in version 12.1.1.

This specifies how to expand keys when the keys for drilldown are vectors. Currently, you can specify NONE or POWER_SET.

This works only when one key is the target of the drilldown. It doesn’t work when two or more keys are the targets of the drilldown.

7.3.58.4.9.3.1. `NONE`#

This works the same as when key_vector_expansion is not specified.

Keys are not expanded. Each element within a vector is used as a separate key.

The following example aggregates the number of occurrences of 3 tags, Groonga, Mroonga, and PGroonga. Because keys are not expanded, only individual tags are counted.

Execution example:

table_create NoneExpantionDrilldownMemos TABLE_HASH_KEY ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
column_create NoneExpantionDrilldownMemos tags COLUMN_VECTOR ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
load --table NoneExpantionDrilldownMemos
[
{"_key": "Groonga is fast!", "tags": ["Groonga"]},
{"_key": "Mroonga uses Groonga!", "tags": ["Groonga", "Mroonga"]},
{"_key": "PGroonga uses Groonga!", "tags": ["Groonga", "PGroonga"]},
{"_key": "Mroonga and PGroonga are Groonga family", "tags": ["Groonga", "Mroonga", "PGroonga"]}
]
# [[0,1337566253.89858,0.000355720520019531],4]
select NoneExpantionDrilldownMemos \
  --drilldowns[tags].keys tags \
  --drilldowns[tags].key_vector_expansion NONE \
  --drilldowns[tags].columns[none_expantion].stage initial \
  --drilldowns[tags].columns[none_expantion].value _key \
  --drilldowns[tags].columns[none_expantion].flags COLUMN_VECTOR \
  --drilldowns[tags].sort_keys 'none_expantion' \
  --drilldowns[tags].output_columns 'none_expantion, _nsubrecs' \
  --limit 0
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         4
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "tags",
#           "ShortText"
#         ]
#       ]
#     ],
#     {
#       "tags": [
#         [
#           3
#         ],
#         [
#           [
#             "none_expantion",
#             "Text"
#           ],
#           [
#             "_nsubrecs",
#             "Int32"
#           ]
#         ],
#         [
#           [
#             "Groonga"
#           ],
#           4
#         ],
#         [
#           [
#             "Mroonga"
#           ],
#           2
#         ],
#         [
#           [
#             "PGroonga"
#           ],
#           2
#         ]
#       ]
#     }
#   ]
# ]

The results show the following:

Tag	Number of occurrences (`_nsubrecs`)
`Groonga`	4
`Mroonga`	2
`PGroonga`	2
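
The NONE behavior amounts to counting each vector element on its own. Here is a minimal Python sketch using the tags vectors loaded in the example above. `none_expansion_drilldown` is a hypothetical name and the snippet is not Groonga's implementation.

```python
from collections import Counter

# tags vectors of the four records loaded in the example above.
TAG_VECTORS = [
    ["Groonga"],
    ["Groonga", "Mroonga"],
    ["Groonga", "PGroonga"],
    ["Groonga", "Mroonga", "PGroonga"],
]


def none_expansion_drilldown(vectors):
    """Count every element of every vector as its own group key."""
    return Counter(element for vector in vectors for element in vector)


print(none_expansion_drilldown(TAG_VECTORS))
```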

7.3.58.4.9.3.2. `POWER_SET`#

This aggregates records by expanding each vector into its power set. In this case, a target vector is treated as a multiset. Thus, when a vector has multiple elements with the same value, each of them is treated as an individual element.

For example, suppose there is a vector [A, B, C]. In this case, the target set is {A, B, C}. The power set is the set of all subsets of a set. The following shows all subsets of the set {A, B, C}. However, Groonga does not use the empty set, whose number of elements is 0, because the empty set is not useful for drilldown results. Please report an issue if you find a use case for the empty set.

Subset with 1 element
- {A}
- {B}
- {C}
Subset with 2 elements
- {A, B}
- {B, C}
- {A, C}
Subset with 3 elements
- {A, B, C}

Those are all the subsets of {A, B, C} other than the empty set. Since the power set is the set of those subsets, {{A}, {B}, {C}, {A, B}, {B, C}, {A, C}, {A, B, C}} is the power set for the vector.

POWER_SET aggregates by each subset in {{A}, {B}, {C}, {A, B}, {B, C}, {A, C}, {A, B, C}}.

For example, consider aggregating [A, B, C] and [B, C, D] with the power set.

The power set for [A, B, C] is {{A}, {B}, {C}, {A, B}, {B, C}, {A, C}, {A, B, C}} as previously explained. Likewise, the power set for [B, C, D] is {{B}, {C}, {D}, {B, C}, {C, D}, {B, D}, {B, C, D}}.

Counting the occurrences of each subset across both power sets gives the following result.

Subset	Number of occurrences (`_nsubrecs`)
`{A}`	1
`{B}`	2
`{C}`	2
`{D}`	1
`{A, B}`	1
`{A, C}`	1
`{B, C}`	2
`{B, D}`	1
`{C, D}`	1
`{A, B, C}`	1
`{B, C, D}`	1

This aggregation method is useful when you need to count the occurrences of individual tags and of combinations of tags at once.
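
The counting described above can be reproduced with a short Python sketch based on itertools.combinations. It is an illustration, not Groonga's implementation. It assumes that each vector has no duplicate elements and that elements appear in a consistent order across vectors, so that the same subset always yields the same tuple. `non_empty_subsets` and `power_set_drilldown` are hypothetical names.

```python
from collections import Counter
from itertools import combinations


def non_empty_subsets(vector):
    """Yield every non-empty subset of vector as a tuple, smallest subsets first."""
    for size in range(1, len(vector) + 1):
        yield from combinations(vector, size)


def power_set_drilldown(vectors):
    """Count how many vectors contain each non-empty subset."""
    counts = Counter()
    for vector in vectors:
        counts.update(non_empty_subsets(vector))
    return counts


for subset, nsubrecs in power_set_drilldown([["A", "B", "C"], ["B", "C", "D"]]).items():
    print(subset, nsubrecs)
```

The empty subset is skipped by starting the subset size at 1, in line with the explanation above.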

The following example aggregates the occurrences of individual tags and of combinations of 3 tags, Groonga, Mroonga, and PGroonga.

Execution example:

table_create PowerSetDrilldownMemos TABLE_HASH_KEY ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
column_create PowerSetDrilldownMemos tags COLUMN_VECTOR ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
load --table PowerSetDrilldownMemos
[
{"_key": "Groonga is fast!", "tags": ["Groonga"]},
{"_key": "Mroonga uses Groonga!", "tags": ["Groonga", "Mroonga"]},
{"_key": "PGroonga uses Groonga!", "tags": ["Groonga", "PGroonga"]},
{"_key": "Mroonga and PGroonga are Groonga family", "tags": ["Groonga", "Mroonga", "PGroonga"]}
]
# [[0,1337566253.89858,0.000355720520019531],4]
select PowerSetDrilldownMemos \
  --drilldowns[tags].keys tags \
  --drilldowns[tags].key_vector_expansion POWER_SET \
  --drilldowns[tags].columns[power_set].stage initial \
  --drilldowns[tags].columns[power_set].value _key \
  --drilldowns[tags].columns[power_set].flags COLUMN_VECTOR \
  --drilldowns[tags].sort_keys 'power_set' \
  --drilldowns[tags].output_columns 'power_set, _nsubrecs' \
  --limit 0
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         4
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "tags",
#           "ShortText"
#         ]
#       ]
#     ],
#     {
#       "tags": [
#         [
#           7
#         ],
#         [
#           [
#             "power_set",
#             "Text"
#           ],
#           [
#             "_nsubrecs",
#             "Int32"
#           ]
#         ],
#         [
#           [
#             "Groonga"
#           ],
#           4
#         ],
#         [
#           [
#             "Mroonga"
#           ],
#           2
#         ],
#         [
#           [
#             "PGroonga"
#           ],
#           2
#         ],
#         [
#           [
#             "Groonga",
#             "Mroonga"
#           ],
#           2
#         ],
#         [
#           [
#             "Groonga",
#             "PGroonga"
#           ],
#           2
#         ],
#         [
#           [
#             "Mroonga",
#             "PGroonga"
#           ],
#           1
#         ],
#         [
#           [
#             "Groonga",
#             "Mroonga",
#             "PGroonga"
#           ],
#           1
#         ]
#       ]
#     }
#   ]
# ]

This aggregation result shows the following:

Tag	Number of occurrences (`_nsubrecs`)
`Groonga`	4
`Mroonga`	2
`PGroonga`	2
`Groonga` and `Mroonga`	2
`Groonga` and `PGroonga`	2
`Mroonga` and `PGroonga`	1
`Groonga` and `Mroonga` and `PGroonga`	1

This result lets you analyze how often each tag and each combination of tags is used. For example, in this sample, Groonga and Mroonga are used together twice, and in one of those two cases PGroonga is also used together with them.

7.3.58.4.9.4. `drilldowns[${LABEL}].output_columns`#

Added in version 4.0.8.

It’s almost the same as drilldown_output_columns. The difference between drilldown_output_columns and drilldowns[${LABEL}].output_columns is how to refer to group keys.

drilldown_output_columns uses _key Pseudo column to refer to the group key. drilldowns[${LABEL}].output_columns also uses _key Pseudo column to refer to the group key when you specify only one group key by drilldowns[${LABEL}].keys.

Here is an example that refers to a single group key by _key Pseudo column:

Execution example:

select Entries \
  --limit 0 \
  --output_columns _id \
  --drilldowns[tag.n_likes].keys tag \
  --drilldowns[tag.n_likes].output_columns _key
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     {
#       "tag.n_likes": [
#         [
#           3
#         ],
#         [
#           [
#             "_key",
#             "ShortText"
#           ]
#         ],
#         [
#           "Hello"
#         ],
#         [
#           "Groonga"
#         ],
#         [
#           "Senna"
#         ]
#       ]
#     }
#   ]
# ]

But when you specify multiple group keys, you can’t refer to each group key by _key Pseudo column in drilldowns[${LABEL}].output_columns. You need to use _value.${KEY_NAME} syntax. ${KEY_NAME} is a column name that is used as a group key in drilldowns[${LABEL}].keys.

Here is an example that refers to each group key in multiple group keys by _value.${KEY_NAME} syntax:

Execution example:

select Entries \
  --limit 0 \
  --output_columns _id \
  --drilldowns[tag.n_likes].keys tag,n_likes \
  --drilldowns[tag.n_likes].output_columns _value.tag,_value.n_likes
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ]
#       ]
#     ],
#     {
#       "tag.n_likes": [
#         [
#           4
#         ],
#         [
#           [
#             "tag",
#             "ShortText"
#           ],
#           [
#             "n_likes",
#             "UInt32"
#           ]
#         ],
#         [
#           "Hello",
#           5
#         ],
#         [
#           "Groonga",
#           10
#         ],
#         [
#           "Groonga",
#           15
#         ],
#         [
#           "Senna",
#           3
#         ]
#       ]
#     }
#   ]
# ]

Tip

Why _value.${KEY_NAME} syntax?

It’s implementation specific information.

_key is a vector value. The vector value consists of all group keys. You can see the byte sequence of the vector value by referring to _key in drilldowns[${LABEL}].output_columns.

When you specify multiple group keys to drilldowns[${LABEL}].keys, _value holds one grouped record for referring to each grouped value. So you can refer to each group key by _value.${KEY_NAME} syntax.

On the other hand, there is no grouped record in _value when you specify only one group key to drilldowns[${LABEL}].keys. So you can’t refer to the group key by _value.${KEY_NAME} syntax.

7.3.58.4.9.5. `drilldowns[${LABEL}].columns[${NAME}].stage`#

Added in version 6.0.5.

Specifies when the dynamic column is created. This is a required parameter to create a dynamic column.

Here are available stages:

Name	Description
`initial`	Dynamic column is created at first.
`filtered`	Dynamic column is created after `drilldowns[${LABEL}].filter` is evaluated.
`output`	Dynamic column is created before drilldowns[${LABEL}].output_columns is evaluated.

Note

The filtered stage and the output stage are available in 10.0.3 or later.

Here is one drilldown process flow with dynamic column creation points. You should choose stage as late as possible:

Evaluates drilldowns[${LABEL}].keys, drilldowns[${LABEL}].calc_types and drilldowns[${LABEL}].calc_target.

Creates dynamic columns for initial stage. All drilldown result records have these dynamic columns.

Evaluates drilldowns[${LABEL}].filter. You can use dynamic columns created in initial stage.

Creates dynamic columns for filtered stage. Only filtered records have these dynamic columns.

Evaluates drilldowns[${LABEL}].sort_keys, drilldowns[${LABEL}].offset and drilldowns[${LABEL}].limit. You can use dynamic columns created in initial stage and filtered stage.

Creates dynamic columns for output stage. Only drilldowns[${LABEL}].limit records have these dynamic columns.

Evaluates drilldowns[${LABEL}].output_columns. You can use dynamic columns created in initial stage, filtered stage, and output stage.
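
The flow above can be sketched in Python to show which dynamic columns are visible at which point. This is a simplified model, not Groonga's implementation: only COUNT is computed, sort_keys and offset are omitted, and dynamic columns are passed as dictionaries of hypothetical callbacks. `run_drilldown` and its arguments are names invented for this sketch.

```python
from collections import Counter


def run_drilldown(keys, initial=None, row_filter=None, filtered=None,
                  limit=10, output=None, output_columns=None):
    """Apply hypothetical dynamic-column callbacks in the documented stage order."""

    def add_columns(rows, columns):
        for row in rows:
            for name, compute in (columns or {}).items():
                row[name] = compute(row)

    # Step 1: group by key (only the always-enabled COUNT is modeled).
    rows = [{"_key": key, "_nsubrecs": n} for key, n in Counter(keys).items()]
    add_columns(rows, initial)        # step 2: initial stage
    if row_filter is not None:        # step 3: filter
        rows = [row for row in rows if row_filter(row)]
    add_columns(rows, filtered)       # step 4: filtered stage
    rows = rows[:limit]               # step 5: limit (sort_keys and offset omitted)
    add_columns(rows, output)         # step 6: output stage
    if output_columns is not None:    # step 7: output_columns
        rows = [{name: row[name] for name in output_columns} for row in rows]
    return rows


TAGS = ["Hello", "Groonga", "Groonga", "Senna", "Senna"]
rows = run_drilldown(
    TAGS,
    initial={"is_popular": lambda row: row["_nsubrecs"] > 1},
    row_filter=lambda row: row["is_popular"],
    output_columns=["_key", "is_popular", "_nsubrecs"],
)
print(rows)
```

In this model, a column created at the output stage doesn't exist yet when the filter runs, so a filter can only use columns from the initial stage. Columns created at later stages are computed for fewer records.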

Here is a drilldowns[${LABEL}].columns[${NAME}].stage example. It creates is_popular column at initial stage. You can use is_popular in all parameters such as drilldowns[${LABEL}].filter and drilldowns[${LABEL}].output_columns:

Execution example:

select Entries \
  --drilldowns[tag].keys tag \
  --drilldowns[tag].columns[is_popular].stage initial \
  --drilldowns[tag].columns[is_popular].type Bool \
  --drilldowns[tag].columns[is_popular].value '_nsubrecs > 1' \
  --drilldowns[tag].filter is_popular \
  --drilldowns[tag].output_columns _key,is_popular,_nsubrecs
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "The first post!",
#         "Welcome! This is my first post!",
#         5,
#         "Hello"
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ],
#       [
#         4,
#         "Good-bye Senna",
#         "I migrated all Senna system!",
#         3,
#         "Senna"
#       ],
#       [
#         5,
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         3,
#         "Senna"
#       ]
#     ],
#     {
#       "tag": [
#         [
#           2
#         ],
#         [
#           [
#             "_key",
#             "ShortText"
#           ],
#           [
#             "is_popular",
#             "Bool"
#           ],
#           [
#             "_nsubrecs",
#             "Int32"
#           ]
#         ],
#         [
#           "Groonga",
#           true,
#           2
#         ],
#         [
#           "Senna",
#           true,
#           2
#         ]
#       ]
#     }
#   ]
# ]

7.3.58.4.9.6. Output format for `drilldowns[${LABEL}]` style#

Added in version 4.0.8.

The output format differs between drilldown and drilldowns[${LABEL}].keys. drilldown outputs multiple drilldown results as an array. drilldowns[${LABEL}].keys outputs them as pairs of label and drilldown result.

drilldown uses the following output format:

[
  HEADER,
  [
    SEARCH_RESULT,
    DRILLDOWN_RESULT1,
    DRILLDOWN_RESULT2,
    ...
  ]
]

drilldowns[${LABEL}].keys uses the following output format:

[
  HEADER,
  [
    SEARCH_RESULT,
    {
      "LABEL1": DRILLDOWN_RESULT1,
      "LABEL2": DRILLDOWN_RESULT2,
      ...
    }
  ]
]
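A client can handle both layouts with one small helper. The following Python sketch is only an illustration: the function name and the shortened sample bodies are hypothetical, and it assumes the response has already been decoded from JSON.

```python
def drilldown_results(body):
    """Return (label, DRILLDOWN_RESULT) pairs from a select response body.

    `body` is the second element of the response: [SEARCH_RESULT, ...].
    Results in the drilldown style get their position as the label.
    """
    rest = body[1:]  # everything after SEARCH_RESULT
    if len(rest) == 1 and isinstance(rest[0], dict):
        # drilldowns[${LABEL}].keys style: one object keyed by label
        return list(rest[0].items())
    # drilldown style: one array element per drilldown key
    return list(enumerate(rest))


# Hypothetical, shortened bodies for illustration only.
positional = ["SEARCH_RESULT", "DRILLDOWN_RESULT1", "DRILLDOWN_RESULT2"]
labeled = ["SEARCH_RESULT", {"LABEL1": "DRILLDOWN_RESULT1"}]

print(drilldown_results(positional))
print(drilldown_results(labeled))
```

Returning pairs in both cases lets the calling code iterate over drilldown results without caring which parameter style produced them.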

7.3.58.4.10. Slice related parameters#

Added in version 6.0.3.

This section describes slice related parameters.

TODO

Here are parameters for slice:

Name	Required
`--slices[${LABEL}].match_columns`	Optional
`--slices[${LABEL}].query`	Required if `--slices[${LABEL}].filter` isn’t specified.
`--slices[${LABEL}].filter`	Required if `--slices[${LABEL}].query` isn’t specified.
`--slices[${LABEL}].query_expander`	Optional
`--slices[${LABEL}].query_flags`	Optional
`--slices[${LABEL}].sort_keys`	Optional
`--slices[${LABEL}].output_columns`	Optional
`--slices[${LABEL}].offset`	Optional
`--slices[${LABEL}].limit`	Optional

7.3.58.4.10.1. `slices[${LABEL}].match_columns`#

TODO

7.3.58.4.10.2. `slices[${LABEL}].query`#

TODO

7.3.58.4.10.3. `slices[${LABEL}].filter`#

TODO

7.3.58.4.10.4. `slices[${LABEL}].query_expander`#

TODO

7.3.58.4.10.5. `slices[${LABEL}].query_flags`#

TODO

7.3.58.4.10.6. `slices[${LABEL}].sort_keys`#

TODO

7.3.58.4.10.7. `slices[${LABEL}].output_columns`#

TODO

7.3.58.4.10.8. `slices[${LABEL}].offset`#

TODO

7.3.58.4.10.9. `slices[${LABEL}].limit`#

TODO

7.3.58.4.11. Cache related parameter#

7.3.58.4.11.1. `cache`#

Specifies whether to cache the result of this query.

If the result of this query is cached, the next identical query returns its response quickly by using the cache.

It doesn’t control whether an existing cached result is used.

Here are available values:

Value	Description
`no`	Don’t cache the output of this query.
`yes`	Cache the output of this query. It’s the default value.

Here is an example to disable caching the result of this query:

Execution example:

select Entries --cache no
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "The first post!",
#         "Welcome! This is my first post!",
#         5,
#         "Hello"
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         10,
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         15,
#         "Groonga"
#       ],
#       [
#         4,
#         "Good-bye Senna",
#         "I migrated all Senna system!",
#         3,
#         "Senna"
#       ],
#       [
#         5,
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         3,
#         "Senna"
#       ]
#     ]
#   ]
# ]

The default value is yes.
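When select is sent over HTTP, cache is passed like any other parameter. The following Python sketch builds such a request path. It assumes the command is served under the /d/select path of a Groonga HTTP server; the helper name select_path is hypothetical.

```python
from urllib.parse import urlencode


def select_path(table, **parameters):
    """Build a request path for select (assumes the /d/select HTTP path)."""
    query = urlencode({"table": table, **parameters})
    return f"/d/select?{query}"


# Ask Groonga not to cache the result of this query.
print(select_path("Entries", cache="no"))
# /d/select?table=Entries&cache=no
```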

7.3.58.4.12. Score related parameters#

There is a score related parameter, adjuster.

7.3.58.4.12.1. `adjuster`#

Specifies one or more score adjust expressions. You need to use adjuster with query or filter. adjuster has no effect on a request that doesn’t perform a search.

You can increase the score of specific records with adjuster. For example, you can use it to give a high score to important records.

For example, you can use adjuster to increase the score of records that have the groonga tag.

Here is the syntax:

--adjuster "SCORE_ADJUST_EXPRESSION1 + SCORE_ADJUST_EXPRESSION2 + ..."

Here is the SCORE_ADJUST_EXPRESSION syntax:

COLUMN @ "KEYWORD" * FACTOR

Note the following:

COLUMN must be indexed.

"KEYWORD" must be a string.

FACTOR must be a positive integer.
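If the adjuster value is assembled by a program, the rules above can be checked before the request is sent. The following Python sketch is an illustration under assumptions: both helper names are hypothetical, and it assumes that JSON-style double quoting is acceptable for "KEYWORD". It can't check that COLUMN is indexed, because that depends on the database schema.

```python
import json


def score_adjust_expression(column, keyword, factor):
    """Build one COLUMN @ "KEYWORD" * FACTOR expression."""
    if not isinstance(keyword, str):
        raise TypeError("KEYWORD must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(factor, bool) or not isinstance(factor, int) or factor <= 0:
        raise ValueError("FACTOR must be a positive integer")
    # Assumption: JSON-style quoting is valid for the keyword literal.
    quoted = json.dumps(keyword, ensure_ascii=False)
    return f"{column} @ {quoted} * {factor}"


def adjuster(*expressions):
    """Join SCORE_ADJUST_EXPRESSIONs with " + " as in the syntax above."""
    return " + ".join(expressions)


print(adjuster(score_adjust_expression("content", "groonga", 5)))
# content @ "groonga" * 5
```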

Here is an adjuster usage example that uses just one SCORE_ADJUST_EXPRESSION:

Execution example:

select Entries \
  --filter true \
  --adjuster 'content @ "groonga" * 5' \
  --output_columns _key,content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "The first post!",
#         "Welcome! This is my first post!",
#         1
#       ],
#       [
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         6
#       ],
#       [
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         1
#       ],
#       [
#         "Good-bye Senna",
#         "I migrated all Senna system!",
#         1
#       ],
#       [
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         1
#       ]
#     ]
#   ]
# ]

The select command matches all records and then applies adjuster. The adjuster increases the score of records that have "groonga" in the Entries.content column by 5. Only one record has "groonga" in the Entries.content column, so the record whose key is "Groonga" has score 6 (= 1 + 5).

You can omit FACTOR. If you omit FACTOR, it is treated as 1.

Here is an adjuster usage example that omits FACTOR:

Execution example:

select Entries \
  --filter true \
  --adjuster 'content @ "groonga"' \
  --output_columns _key,content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "The first post!",
#         "Welcome! This is my first post!",
#         1
#       ],
#       [
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         2
#       ],
#       [
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         1
#       ],
#       [
#         "Good-bye Senna",
#         "I migrated all Senna system!",
#         1
#       ],
#       [
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         1
#       ]
#     ]
#   ]
# ]

The adjuster in the select command doesn’t have FACTOR, so the factor is treated as 1. Only one record has "groonga" in the Entries.content column, so the record whose key is "Groonga" has score 2 (= 1 + 1).

Here is an adjuster usage example that uses multiple SCORE_ADJUST_EXPRESSIONs:

Execution example:

select Entries \
  --filter true \
  --adjuster 'content @ "groonga" * 5 + content @ "started" * 3' \
  --output_columns _key,content,_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "The first post!",
#         "Welcome! This is my first post!",
#         1
#       ],
#       [
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         9
#       ],
#       [
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         4
#       ],
#       [
#         "Good-bye Senna",
#         "I migrated all Senna system!",
#         1
#       ],
#       [
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         1
#       ]
#     ]
#   ]
# ]

The adjuster in the select command has two SCORE_ADJUST_EXPRESSIONs. The score increase of a record is the sum of the increases from the SCORE_ADJUST_EXPRESSIONs that match it. Both SCORE_ADJUST_EXPRESSIONs match the record whose key is "Groonga", so its score increase is the sum of both.

The first SCORE_ADJUST_EXPRESSION is content @ "groonga" * 5. It increases score by 5.

The second SCORE_ADJUST_EXPRESSION is content @ "started" * 3. It increases score by 3.

The final score is 9 (= 1 + 5 + 3).
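The arithmetic for every record in the example output can be reproduced with a few lines of Python. This is only a model of the sums shown above, not of how Groonga matches keywords; the function name is hypothetical, and the base score of 1 is taken from the example outputs.

```python
BASE_SCORE = 1  # score of every record matched by --filter true in the examples


def final_score(matched_factors):
    """Add the FACTOR of every matching SCORE_ADJUST_EXPRESSION to the base score."""
    return BASE_SCORE + sum(matched_factors)


# FACTORs from 'content @ "groonga" * 5 + content @ "started" * 3'
print(final_score([5, 3]))  # "Groonga" record, both keywords match: 9
print(final_score([3]))     # "Mroonga" record, only "started" matches: 4
print(final_score([]))      # records that match neither keyword: 1
```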

A SCORE_ADJUST_EXPRESSION has one factor for "KEYWORD". This means that every record that has "KEYWORD" gets the same score increase. You can also vary the score increase for each record that has the same "KEYWORD", which is useful for tuning search scores. See Weight vector column for details.

7.3.58.5. Return value#

The command returns a response with the following format:

[
  HEADER,
  [
    SEARCH_RESULT,
    DRILLDOWN_RESULT_1,
    DRILLDOWN_RESULT_2,
    ...,
    DRILLDOWN_RESULT_N
  ]
]

If the command fails, error details are in HEADER.

See Output format for HEADER.
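A client usually checks HEADER before reading the rest of the response. The following Python sketch assumes that the first element of HEADER is the return code and that 0 means success, as in the execution examples on this page. The function name and the failing sample header are hypothetical.

```python
def unwrap_response(response):
    """Return the body of a select response, or raise if HEADER reports an error."""
    header, *rest = response
    return_code = header[0]  # assumption: 0 means success
    if return_code != 0:
        raise RuntimeError(f"select failed: HEADER={header!r}")
    return rest[0]  # [SEARCH_RESULT, ...]


ok = [[0, 1337566253.89858, 0.000355720520019531], ["SEARCH_RESULT"]]
print(unwrap_response(ok))

# Hypothetical failing response, for illustration only.
failed = [[-22, 0.0, 0.0, "hypothetical error message"]]
try:
    unwrap_response(failed)
except RuntimeError as error:
    print(error)
```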

There are zero or more DRILLDOWN_RESULT. If neither drilldown nor drilldowns[${LABEL}].keys is specified, they are omitted as follows:

[
  HEADER,
  [
    SEARCH_RESULT
  ]
]

If drilldown has two or more keys like --drilldown "_key, column1, column2", multiple DRILLDOWN_RESULT exist:

[
  HEADER,
  [
    SEARCH_RESULT,
    DRILLDOWN_RESULT_FOR_KEY,
    DRILLDOWN_RESULT_FOR_COLUMN1,
    DRILLDOWN_RESULT_FOR_COLUMN2
  ]
]

If drilldowns[${LABEL}].keys is used, only one DRILLDOWN_RESULT exists:

[
  HEADER,
  [
    SEARCH_RESULT,
    DRILLDOWN_RESULT_FOR_LABELED_DRILLDOWN
  ]
]

DRILLDOWN_RESULT format is different between drilldown and drilldowns[${LABEL}].keys. It’s described later.

SEARCH_RESULT uses the following format:

[
  [N_HITS],
  COLUMNS,
  RECORDS
]

See Simple usage for a concrete example of the format.

N_HITS is the number of matched records before limit is applied.

COLUMNS describes the output columns specified by output_columns. It uses the following format:

[
  [COLUMN_NAME_1, COLUMN_TYPE_1],
  [COLUMN_NAME_2, COLUMN_TYPE_2],
  ...,
  [COLUMN_NAME_N, COLUMN_TYPE_N]
]

COLUMNS includes information about one or more output columns. The information for each output column includes the following:

Column name as string

Column type as string or null

Column name is extracted from value specified as output_columns.

Column type is Groonga’s type name or null. It doesn’t describe whether the column value is a vector or a scalar. You need to determine that by checking whether the actual column value is an array.

See Data types for type details.

null is used when the column value type can’t be determined. For example, a function call in output_columns such as --output_columns "snippet_html(content)" uses null.
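Because COLUMNS carries no vector or scalar flag, a client has to inspect the values themselves. The following Python sketch shows one way to do that after JSON decoding, where null becomes None. The function name, the tags column and the computed column are hypothetical.

```python
def column_kinds(columns, record):
    """Pair each column with its type name and whether its value is a vector."""
    kinds = {}
    for (name, type_name), value in zip(columns, record):
        kinds[name] = {
            "type": type_name,                  # None when the type is null
            "is_vector": isinstance(value, list),
        }
    return kinds


# Hypothetical columns and record, for illustration only.
columns = [["_key", "ShortText"], ["tags", "ShortText"], ["computed", None]]
record = ["Groonga", ["search", "fast"], "some value"]
print(column_kinds(columns, record))
```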

Here is an example of COLUMNS:

[
  ["_id",     "UInt32"],
  ["_key",    "ShortText"],
  ["n_likes", "UInt32"]
]

RECORDS includes column values for each matched record. Included records are selected by offset and limit. It uses the following format:

[
  [
    RECORD_1_COLUMN_1,
    RECORD_1_COLUMN_2,
    ...,
    RECORD_1_COLUMN_N
  ],
  [
    RECORD_2_COLUMN_1,
    RECORD_2_COLUMN_2,
    ...,
    RECORD_2_COLUMN_N
  ],
  ...,
  [
    RECORD_N_COLUMN_1,
    RECORD_N_COLUMN_2,
    ...,
    RECORD_N_COLUMN_N
  ]
]

Here is an example of RECORDS:

[
  [
    1,
    "The first post!",
    5
  ],
  [
    2,
    "Groonga",
    10
  ],
  [
    3,
    "Mroonga",
    15
  ]
]
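COLUMNS and RECORDS are often combined into one mapping per record on the client side. The following Python sketch uses the example COLUMNS and RECORDS values above. The execution examples on this page show each record as a sibling element that follows COLUMNS, so the sketch unpacks SEARCH_RESULT that way. The function name is hypothetical, and the N_HITS value of 5 is an assumption for illustration.

```python
def parse_search_result(search_result):
    """Return (n_hits, rows) where each row maps column name to value."""
    (n_hits,), columns, *records = search_result
    names = [name for name, _type in columns]
    rows = [dict(zip(names, record)) for record in records]
    return n_hits, rows


search_result = [
    [5],  # assumed N_HITS: 5 matched records, 3 of them returned
    [["_id", "UInt32"], ["_key", "ShortText"], ["n_likes", "UInt32"]],
    [1, "The first post!", 5],
    [2, "Groonga", 10],
    [3, "Mroonga", 15],
]

n_hits, rows = parse_search_result(search_result)
print(n_hits)
print(rows[1])
```

Because N_HITS is counted before limit is applied, n_hits can be larger than len(rows), as in this sample.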

DRILLDOWN_RESULT format is different between drilldown and drilldowns[${LABEL}].keys.

drilldown uses the same format as SEARCH_RESULT:

[
  [N_HITS],
  COLUMNS,
  RECORDS
]

drilldown generates one DRILLDOWN_RESULT for each of its keys.

drilldowns[${LABEL}].keys uses the following format. Multiple drilldowns[${LABEL}].keys are mapped to one object (key-value pairs):

{
  "LABEL_1": [
    [N_HITS],
    COLUMNS,
    RECORDS
  ],
  "LABEL_2": [
    [N_HITS],
    COLUMNS,
    RECORDS
  ],
  ...,
  "LABEL_N": [
    [N_HITS],
    COLUMNS,
    RECORDS
  ]
}

Each drilldowns[${LABEL}].keys corresponds to the following:

"LABEL": [
  [N_HITS],
  COLUMNS,
  RECORDS
]

The value part uses the same format as SEARCH_RESULT:

[
  [N_HITS],
  COLUMNS,
  RECORDS
]
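Since every value in the object has this shape, a client can convert all labeled drilldowns in one pass. The following Python sketch uses the "tag" result from the labeled drilldown execution example on this page and, like that output, treats each record as a sibling element that follows COLUMNS. The function name is hypothetical.

```python
def labeled_drilldown_rows(labeled_result):
    """Map each label to a list of rows keyed by column name."""
    rows_by_label = {}
    for label, value in labeled_result.items():
        _n_hits, columns, *records = value
        names = [name for name, _type in columns]
        rows_by_label[label] = [dict(zip(names, record)) for record in records]
    return rows_by_label


labeled_result = {
    "tag": [
        [2],
        [["_key", "ShortText"], ["is_popular", "Bool"], ["_nsubrecs", "Int32"]],
        ["Groonga", True, 2],
        ["Senna", True, 2],
    ]
}

print(labeled_drilldown_rows(labeled_result))
```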

See also Output format for drilldowns[${LABEL}] style for drilldowns[${LABEL}] style drilldown output format.

7.3.58.6. See also#

Query syntax

Script syntax
