7.14.14. query

7.14.14.1. Summary

query provides --match_columns and --query parameters of select feature as function. You can specify multiple query functions in --filter parameter in select.

Because of such flexibility, you can control full text search behavior by combination of multiple query functions.

query can be used in only --filter in select.

7.14.14.2. Syntax

query requires two arguments - match_columns and query_string.

The parameter query_expander or substitution_table is optional.

query(match_columns, query_string)
query(match_columns, query_string, query_expander)
query(match_columns, query_string, substitution_table)

7.14.14.3. Usage

Here are a schema definition and sample data to show usage.

Sample schema:

Execution example:

table_create Documents TABLE_NO_KEY
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Documents content COLUMN_SCALAR Text
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Terms TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram --normalizer NormalizerAuto
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms documents_content_index COLUMN_INDEX|WITH_POSITION Documents content
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Users TABLE_NO_KEY
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Users name COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Users memo COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Lexicon TABLE_HASH_KEY ShortText \
  --default_tokenizer TokenBigramSplitSymbolAlphaDigit \
  --normalizer NormalizerAuto
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon users_name COLUMN_INDEX|WITH_POSITION Users name
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon users_memo COLUMN_INDEX|WITH_POSITION Users memo
# [[0, 1337566253.89858, 0.000355720520019531], true]

Sample data:

Execution example:

load --table Users
[
{"name": "Alice", "memo": "groonga user"},
{"name": "Alisa", "memo": "mroonga user"},
{"name": "Bob",   "memo": "rroonga user"},
{"name": "Tom",   "memo": "nroonga user"},
{"name": "Tobby", "memo": "groonga and mroonga user. mroonga is ..."},
]
# [[0, 1337566253.89858, 0.000355720520019531], 5]

Here is the simple usage of query function which execute full text search by keyword 'alice' without using --match_columns and --query arguments in --filter.

Execution example:

select Users --output_columns name,_score --filter 'query("name * 10", "alice")'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "name",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "Alice",
#         10
#       ]
#     ]
#   ]
# ]

When executing above query, the keyword 'alice' is weighted to the value - '10'.

Here are the contrasting examples with/without query.

Execution example:

select Users --output_columns name,memo,_score --match_columns "memo * 10" --query "memo:@groonga OR memo:@mroonga OR memo:@user" --sortby -_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "name",
#           "ShortText"
#         ],
#         [
#           "memo",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "Tobby",
#         "groonga and mroonga user. mroonga is ...",
#         4
#       ],
#       [
#         "Alice",
#         "groonga user",
#         2
#       ],
#       [
#         "Alisa",
#         "mroonga user",
#         2
#       ],
#       [
#         "Bob",
#         "rroonga user",
#         1
#       ],
#       [
#         "Tom",
#         "nroonga user",
#         1
#       ]
#     ]
#   ]
# ]

In this case, the keywords 'groonga' and 'mroonga' and 'user' are given same weight value. You can't pass different weight value to each keyword in this way.

Execution example:

select Users --output_columns name,memo,_score --filter 'query("memo * 10", "groonga") || query("memo * 20", "mroonga") || query("memo * 1", "user")' --sortby -_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "name",
#           "ShortText"
#         ],
#         [
#           "memo",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "Tobby",
#         "groonga and mroonga user. mroonga is ...",
#         51
#       ],
#       [
#         "Alisa",
#         "mroonga user",
#         21
#       ],
#       [
#         "Alice",
#         "groonga user",
#         11
#       ],
#       [
#         "Tom",
#         "nroonga user",
#         1
#       ],
#       [
#         "Bob",
#         "rroonga user",
#         1
#       ]
#     ]
#   ]
# ]

On the other hand, by specifying multiple query, the keywords 'groonga' and 'mroonga' and 'user' are given different value of weight.

As a result, you can control full text search result by giving different weight to the keywords on your purpose.

7.14.14.4. Parameters

7.14.14.4.1. Required parameters

There are two required parameter, match_columns and query_string.

7.14.14.4.1.1. match_columns

Specifies the default target column for fulltext search by query_string parameter value. It is the same role as match_columns parameter in select.

7.14.14.4.1.2. query_string

Specifies the search condition in Query syntax. It is the same role as query parameter in select.

See match_columns about query parameter in select.

7.14.14.4.2. Optional parameter

There are some optional parameters.

7.14.14.4.2.1. query_expander

Specifies the plugin name for query expansion.

There is one plugin bundled in official release - QueryExpanderTSV.

See QueryExpanderTSV about details.

7.14.14.4.2.2. substitution_table

Specifies the substitution table and substitution column name by following format such as ${TABLE}.${COLUMN} for query expansion.

See query_expander about details.

7.14.14.5. Return value

query returns whether any record is matched or not. If one or more records are matched, it returns true. Otherwise, it returns false.

7.14.14.6. TODO

  • Support query_flags

7.14.14.7. See also