Groonga 8.0.2 has been released
In this release, you can define custom tokenizers and normalizers via options, without any programming. This helps you search sources that include many orthographical variants.
How to install: Install
Here are important changes in this release:
- [logical_range_filter] Added the sort_keys option.
- Added a new function time_format(). You can specify a time format for a column of Time type, using a format string with conversion specifiers such as %Y and %m.
- [tokenizers] Support new tokenizer TokenNgram. You can define its behavior dynamically.
- [normalizers] Support new normalizer NormalizerNFKC100. It is based on Unicode NFKC for Unicode 10.0.
- [normalizers] Support options for normalizers NormalizerNFKC51 and NormalizerNFKC100. You can change the normalizers' behavior dynamically.
- [dump][schema] Add support for options of tokenizer and normalizer. As a result, Groonga 8.0.1 and earlier versions cannot import dump and schema output generated by Groonga 8.0.2 or later; they raise an error because of the unsupported information.
logical_range_filter now supports a new option sort_keys, corresponding to sort_keys in select.
Note that it works only when a single shard is the search target; it doesn't work when multiple shards are search targets. For more details, see the command reference.
Added a new function time_format()
Now you can specify a time format for a column of Time type, using a format string with conversion specifiers such as %Y and %m.
For example, the following command line outputs the _key column both as UNIX time and in a human-readable format:
select Timestamps --sortby _id --limit -1 --output_columns '_key, time_format(_key, "%Y-%m-%dT%H:%M:%S")'
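To make the format string concrete, here is a minimal Python sketch that renders a UNIX timestamp with the same pattern as the command above. It assumes the conversion specifiers behave like those of Python's strftime and that the time is rendered in UTC; how Groonga itself chooses the time zone is not covered here. `format_unix_time` is a hypothetical helper for illustration, not part of Groonga.

```python
from datetime import datetime, timezone


def format_unix_time(seconds, pattern="%Y-%m-%dT%H:%M:%S"):
    """Render a UNIX timestamp (seconds since the epoch) with a strftime-style pattern."""
    # Assumption: render in UTC so the result does not depend on the local time zone.
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime(pattern)


# The epoch itself, shown in the human-readable form used in the command above.
print(format_unix_time(0))
```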
[tokenizers] Support new tokenizer TokenNgram
A new tokenizer, TokenNgram, is now available. You can define its behavior dynamically via its options.
Options are given in the following style:
'TokenNgram("[name 1]", [value 1], "[name 2]", [value 2], ...)'
For example:
table_create --name Terms --flags TABLE_PAT_KEY --key_type ShortText --default_tokenizer 'TokenNgram("n", 2, "loose_symbol", true)' --normalizer NormalizerAuto
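As a rough illustration of what the "n" option controls, the following Python sketch splits text into overlapping character n-grams. It is only a sketch of the general N-gram idea, not Groonga's implementation: `ngram_tokens` is a hypothetical helper, the handling of text shorter than n is an assumption, and options such as "loose_symbol" are not modeled.

```python
def ngram_tokens(text, n=2):
    """Split text into overlapping character n-grams (a simplified model)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(text) < n:
        # Assumption: text shorter than n is kept as a single token.
        return [text] if text else []
    # Slide a window of width n over the text, one character at a time.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(ngram_tokens("groonga", 2))
```

With n set to 2 this corresponds to bigram-style tokenization; a larger n yields fewer, longer tokens.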
[normalizers] Support new normalizer NormalizerNFKC100
A new normalizer, NormalizerNFKC100, based on Unicode NFKC (Normalization Form Compatibility Composition) for Unicode 10.0, is now available.
Both it and NormalizerNFKC51 support options.
For more details, see the next section.
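To show what NFKC does to orthographical variants, here is a small Python sketch using the standard unicodedata module. Note the assumptions: Python applies NFKC from the Unicode version bundled with the interpreter, which is not necessarily 10.0, and a Groonga normalizer may do more than plain NFKC. `nfkc` is a hypothetical helper name.

```python
import unicodedata


def nfkc(text):
    """Apply Unicode Normalization Form Compatibility Composition (NFKC)."""
    return unicodedata.normalize("NFKC", text)


# Full-width Latin letters, a circled digit, a ligature and half-width katakana
# are all folded to their compatibility equivalents.
for sample in ["ｇｒｏｏｎｇａ", "①", "ﬁ", "ｶ"]:
    print(repr(sample), "->", repr(nfkc(sample)))
```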
[normalizers] Support options for normalizers
NormalizerNFKC51 and NormalizerNFKC100 now support options to change their behavior dynamically.
Options are given in the following style:
'NormalizerNFKC100("[name 1]", [value 1], "[name 2]", [value 2], ...)'
For example:
table_create --name Terms --flags TABLE_PAT_KEY --key_type ShortText --default_tokenizer TokenBigram --normalizer 'NormalizerNFKC100("unify_kana", true, "unify_kana_case", true)'
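If you generate such commands from a script, the name/value style above can be assembled mechanically. The following Python sketch is one possible way to do it; `build_spec` and `format_value` are hypothetical helpers, not a Groonga API, and only the boolean and integer values that appear in the examples of this post are handled.

```python
def format_value(value):
    """Format an option value the way it appears in the examples above."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    # Assumption: other value types are out of scope for this sketch.
    raise TypeError("unsupported option value: %r" % (value,))


def build_spec(name, options):
    """Build 'Name("opt 1", value 1, "opt 2", value 2, ...)' from a dict of options."""
    parts = []
    for key, value in options.items():
        parts.append('"%s"' % key)
        parts.append(format_value(value))
    return "%s(%s)" % (name, ", ".join(parts))


print(build_spec("TokenNgram", {"n": 2, "loose_symbol": True}))
print(build_spec("NormalizerNFKC100", {"unify_kana": True, "unify_kana_case": True}))
```

The resulting string still has to be quoted as a single argument when it is placed on a command line, as in the table_create examples above.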
dump and schema commands now report options for tokenizers (TokenNgram) and normalizers (NormalizerNFKC51 and NormalizerNFKC100). For example:
table_create Site TABLE_HASH_KEY ShortText
column_create Site title COLUMN_SCALAR ShortText
table_create Terms TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram --normalizer "NormalizerNFKC100(\"unify_kana\", true, \"unify_kana_case\", true)"
As a result, Groonga 8.0.1 and earlier versions cannot import results of dump and schema that include such option information.
Tokenizers and normalizers without options are still reported the same way as in older versions, so you need to be careful only when you use the new tokenizer or normalizer features described above.
See Release 8.0.2 2018-04-29 for detailed changes since 8.0.1.
Let's search by Groonga!