Groonga 14.0.3 has been released
Groonga 14.0.3 has been released!
How to install: Install
Changes
Here are important changes in this release:
Improvements
-
We optimized performance as below.
-
We optimized performance of
OR
andAND
search when the number of hits were many. -
We optimized performance of prefix search(
@^
). -
We optimized performance of
AND
search when the number of records ofA
more thanB
in condition ofA AND B
. -
We optimized performance of search when we used many dynamic columns.
-
-
token_ngram Added new option
ignore_blank
.We can replace
TokenBigramIgnoreBlank
withTokenNgram("ignore_blank", true)
as below.Here is example of use
TokenBigram
.tokenize TokenBigram "! ! !" NormalizerAuto [ [ 0, 1715155644.64263, 0.001013517379760742 ], [ { "value": "!", "position": 0, "force_prefix": false, "force_prefix_search": false }, { "value": "!", "position": 1, "force_prefix": false, "force_prefix_search": false }, { "value": "!", "position": 2, "force_prefix": false, "force_prefix_search": false } ] ]
Here is example of use
TokenBigramIgnoreBlank
.tokenize TokenBigramIgnoreBlank "! ! !" NormalizerAuto [ [ 0, 1715155680.323451, 0.0009913444519042969 ], [ { "value": "!!!", "position": 0, "force_prefix": false, "force_prefix_search": false } ] ]
Here is example of use
TokenNgram("ignore_blank", true)
.tokenize 'TokenNgram("ignore_blank", true)' "! ! !" NormalizerAuto [ [ 0, 1715155762.340685, 0.001041412353515625 ], [ { "value": "!!!", "position": 0, "force_prefix": false, "force_prefix_search": false } ] ]
-
ubuntu Add support for Ubuntu 24.04 LTS (Noble Numbat).
Fixes
-
request_cancel Fix a bug that Groonga may crash when we execute
request_cancel
command while we execute the other query. -
Fixed the unexpected error when using
--post_filter
with--offset
greater than the post-filtered resultIn the same situation, using
--filter
with--offset
doesn't raise the error. This inconsistency in behavior between--filter
and--post-filter
has now been resolved.table_create Users TABLE_PAT_KEY ShortText column_create Users age COLUMN_SCALAR UInt32 load --table Users [ ["_key", "age"], ["Alice", 21], ["Bob", 22], ["Chris", 23], ["Diana", 24], ["Emily", 25] ] select Users \ --filter 'age >= 22' \ --post_filter 'age <= 24' \ --offset 3 \ --sort_keys -age --output_pretty yes [ [ -68, 1715224057.317582, 0.001833438873291016, "[table][sort] grn_output_range_normalize failed", [ [ "grn_table_sort", "/home/horimoto/Work/free-software/groonga.tag/lib/sort.c", 1052 ] ] ] ]
-
Fixed a bug where incorrect search result could be returned when not all phrases within
(...)
matched using near phrase product.For example, there is no record which matched
(2)
condition using--query '*NPP1"(a) (2)"'
. In this case, the expected behavior would be return no record. However, the actual behavior was equal to the query--query '*NPP and "(a)"
as below. This means that despite no records matched(2)
, records likeax1
andaxx1
were incorrectly returned.table_create Entries TABLE_NO_KEY column_create Entries content COLUMN_SCALAR Text table_create Terms TABLE_PAT_KEY ShortText --default_tokenizer TokenNgram column_create Terms entries_content COLUMN_INDEX|WITH_POSITION Entries content load --table Entries [ {"content": "ax1"}, {"content": "axx1"} ] select Entries \ --match_columns content \ --query '*NPP1"(a) (2)"' \ --output_columns 'content' [ [ 0, 1715224211.050228, 0.001366376876831055 ], [ [ [ 2 ], [ [ "content", "Text" ] ], [ "ax1" ], [ "axx1" ] ] ] ]
-
Fixed a bug that rehash failed or data in a table broke when rehash occurred that the table with
TABLE_HASH_KEY
has 2^28 or more records. -
Fixed a bug that highlight position slipped out of place in the following cases.
-
If full width space existed before highlight target characters as below.
We expected that Groonga returned
"Groonga <span class=\"keyword\">高</span>速!"
. However, Groonga returned"Groonga <span class=\"keyword\">高速</span>!"
as below.table_create Entries TABLE_NO_KEY column_create Entries body COLUMN_SCALAR ShortText table_create Terms TABLE_PAT_KEY ShortText \ --default_tokenizer 'TokenNgram("report_source_location", true)' \ --normalizer 'NormalizerNFKC150("report_source_offset", true)' column_create Terms document_index COLUMN_INDEX|WITH_POSITION Entries body load --table Entries [ {"body": "Groonga 高速!"} ] select Entries \ --output_columns \ --match_columns body \ --query '高' \ --output_columns 'highlight_html(body, Terms)' [ [ 0, 1715215640.979517, 0.001608610153198242 ], [ [ [ 1 ], [ [ "highlight_html", null ] ], [ "Groonga <span class=\"keyword\">高速</span>!" ] ] ] ]
-
If we used
TokenNgram("loose_blank", true)
and if highlight target characters included full width space as below.We expected that Groonga returned
"<span class=\"keyword\">山田 太郎</span>"
. However, Groonga returned"<span class=\"keyword\">山田 太</span>"
as below.table_create Entries TABLE_NO_KEY column_create Entries body COLUMN_SCALAR ShortText table_create Terms TABLE_PAT_KEY ShortText \ --default_tokenizer 'TokenNgram("loose_blank", true, "report_source_location", true)' \ --normalizer 'NormalizerNFKC150("report_source_offset", true)' column_create Terms document_index COLUMN_INDEX|WITH_POSITION Entries body load --table Entries [ {"body": "山田 太郎"} ] select Entries --output_columns \ --match_columns body --query '山田太郎' \ --output_columns 'highlight_html(body, Terms)' --output_pretty yes [ [ 0, 1715220409.096246, 0.0004854202270507812 ], [ [ [ 1 ], [ [ "highlight_html", null ] ], [ "<span class=\"keyword\">山田 太</span>" ] ] ] ]
-
If white space existed in the front of highlight target characters as below.
We expected that Groonga returned
" <span class=\"keyword\">山</span>田太郎"
. However, Groonga returned" <span class=\"keyword\">山</span>"
as below.table_create Entries TABLE_NO_KEY column_create Entries body COLUMN_SCALAR ShortText table_create Terms TABLE_PAT_KEY ShortText \ --default_tokenizer 'TokenNgram("report_source_location", true)' \ --normalizer 'NormalizerNFKC150("report_source_offset", true)' column_create Terms document_index COLUMN_INDEX|WITH_POSITION Entries body load --table Entries [ {"body": " 山田太郎"} ] select Entries \ --output_columns \ --match_columns body \ --query '山' \ --output_columns 'highlight_html(body, Terms)' --output_pretty yes [ [ 0, 1715221627.002193, 0.001977920532226562 ], [ [ [ 1 ], [ [ "highlight_html", null ] ], [ " <span class=\"keyword\">山</span>" ] ] ] ]
-
If the second character of highlight target was full width space as below.
We expected that Groonga returned
"<span class=\"keyword\">山 田</span>太郎"
. However, Groonga returned"<span class=\"keyword\">山 田太</span>郎"
as below.table_create Entries TABLE_NO_KEY column_create Entries body COLUMN_SCALAR ShortText table_create Terms TABLE_PAT_KEY ShortText \ --default_tokenizer 'TokenNgram("report_source_location", true)' \ --normalizer 'NormalizerNFKC150("report_source_offset", true)' column_create Terms document_index COLUMN_INDEX|WITH_POSITION Entries body load --table Entries [ {"body": "山 田太郎"} ] select Entries \ --output_columns \ --match_columns body \ --query '山 田' \ --output_columns 'highlight_html(body, Terms)' [ [ 0, 1715222501.496007, 0.0005536079406738281 ], [ [ [ 0 ], [ [ "highlight_html", "<span class=\"keyword\">山 田太</span>郎" ] ] ] ] ]
-
Conclusion
Please refert to the following news for more details. News Release 14.0.3
Let's search by Groonga!