Groonga 12.1.1 has been released
Groonga 12.1.1 has been released!
How to install: Install
Changes
Here are important changes in this release:
Improvements
-
[select][POWER_SET] Drilldowns can now aggregate over the power set of a vector.
A new option, key_vector_expansion, has been added to drilldowns. Currently, NONE or POWER_SET can be specified for key_vector_expansion.
Specifying POWER_SET for key_vector_expansion aggregates over the power set of each vector. This is useful for counting, in a single query, how often each tag occurs individually and in combination with other tags.
The following example counts the individual and combined occurrences of three tags: Groonga, Mroonga, and PGroonga.
Sample case:
table_create PowersetDrilldownMemos TABLE_HASH_KEY ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create PowersetDrilldownMemos tags COLUMN_VECTOR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
load --table PowersetDrilldownMemos
[
{"_key": "Groonga is fast!", "tags": ["Groonga"]},
{"_key": "Mroonga uses Groonga!", "tags": ["Groonga", "Mroonga"]},
{"_key": "PGroonga uses Groonga!", "tags": ["Groonga", "PGroonga"]},
{"_key": "Mroonga and PGroonga are Groonga family", "tags": ["Groonga", "Mroonga", "PGroonga"]}
]
# [[0, 1337566253.89858, 0.000355720520019531], 4]
select PowersetDrilldownMemos \
  --drilldowns[tags].keys tags \
  --drilldowns[tags].key_vector_expansion POWER_SET \
  --drilldowns[tags].columns[power_set].stage initial \
  --drilldowns[tags].columns[power_set].value _key \
  --drilldowns[tags].columns[power_set].flags COLUMN_VECTOR \
  --drilldowns[tags].sort_keys 'power_set' \
  --drilldowns[tags].output_columns 'power_set, _nsubrecs' \
  --limit 0
# [
#   [0, 1337566253.89858, 0.000355720520019531],
#   [
#     [
#       [4],
#       [["_id", "UInt32"], ["_key", "ShortText"], ["tags", "ShortText"]]
#     ],
#     {
#       "tags": [
#         [7],
#         [["power_set", "Text"], ["_nsubrecs", "Int32"]],
#         [["Groonga"], 4],
#         [["Mroonga"], 2],
#         [["PGroonga"], 2],
#         [["Groonga", "Mroonga"], 2],
#         [["Groonga", "PGroonga"], 2],
#         [["Mroonga", "PGroonga"], 1],
#         [["Groonga", "Mroonga", "PGroonga"], 1]
#       ]
#     }
#   ]
# ]
This result shows the following.
tag                                 number of occurrences
Groonga                             4
Mroonga                             2
PGroonga                            2
Groonga and Mroonga                 2
Groonga and PGroonga                2
Mroonga and PGroonga                1
Groonga and Mroonga and PGroonga    1
This feature is complex. For more information, please refer to POWER_SET.
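The counting rule behind these numbers can be sketched outside Groonga. The following Python snippet is only a conceptual model of power set aggregation, not Groonga's implementation; the function name power_set_counts is hypothetical, and the data is copied from the load command in the sample above.

```python
from collections import Counter
from itertools import combinations

# Sample data copied from the load command in the sample case.
memos = {
    "Groonga is fast!": ["Groonga"],
    "Mroonga uses Groonga!": ["Groonga", "Mroonga"],
    "PGroonga uses Groonga!": ["Groonga", "PGroonga"],
    "Mroonga and PGroonga are Groonga family": ["Groonga", "Mroonga", "PGroonga"],
}


def power_set_counts(records):
    """Count every non-empty subset of each record's tag vector.

    Hypothetical helper for illustration only; Groonga does this
    internally when key_vector_expansion is POWER_SET.
    """
    counts = Counter()
    for tags in records.values():
        unique_tags = sorted(set(tags))
        for size in range(1, len(unique_tags) + 1):
            for subset in combinations(unique_tags, size):
                counts[subset] += 1
    return counts


counts = power_set_counts(memos)
# Print smaller combinations first, then alphabetically within each size.
for subset, n in sorted(counts.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(" and ".join(subset), n)
```

Each record contributes one count to every non-empty combination of its own tags, which is why a record with three tags adds to seven groups at once.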
-
[select] A specific element of a vector column can now be a search target.
Specific elements of a vector column can be made search targets by specifying them in match_columns with an index number.
The following is a sample case.
table_create Memos TABLE_NO_KEY
column_create Memos contents COLUMN_VECTOR ShortText
table_create Lexicon TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram
column_create Lexicon memo_index COLUMN_INDEX|WITH_POSITION|WITH_SECTION Memos contents
load --table Memos
[
["contents"],
[["I like Groonga", "Use Groonga with Ruby"]],
[["I like Ruby", "Use Groonga"]]
]
select Memos \
  --match_columns "contents[1]" \
  --query Ruby \
  --output_columns "contents, _score"
# [
#   [0, 0.0, 0.0],
#   [
#     [
#       [1],
#       [["contents", "ShortText"], ["_score", "Int32"]],
#       [["I like Groonga", "Use Groonga with Ruby"], 1]
#     ]
#   ]
# ]
--match_columns "contents[1]" specifies only the 2nd element of the contents vector as the search target. In this sample, ["I like Groonga", "Use Groonga with Ruby"] is shown in the results because Ruby is in the 2nd element, Use Groonga with Ruby. However, ["I like Ruby", "Use Groonga"] is not shown in the results because Ruby is not in the 2nd element, Use Groonga.
-
[load] Added support for the YYYY-MM-DD time format.
YYYY-MM-DD is a commonly used time format, so supporting it makes load more useful.
The time of the loaded value is set to 00:00:00 in local time.
plugin_register functions/time
# [[0,0.0,0.0],true]
table_create Logs TABLE_NO_KEY
# [[0,0.0,0.0],true]
column_create Logs created_at COLUMN_SCALAR Time
# [[0,0.0,0.0],true]
column_create Logs created_at_text COLUMN_SCALAR ShortText
# [[0,0.0,0.0],true]
load --table Logs
[
{"created_at": "2000-01-01", "created_at_text": "2000-01-01"}
]
# [[0,0.0,0.0],1]
select Logs --output_columns "time_format_iso8601(created_at), created_at_text"
# [
#   [0, 0.0, 0.0],
#   [
#     [
#       [1],
#       [["time_format_iso8601", null], ["created_at_text", "ShortText"]],
#       ["2000-01-01T00:00:00.000000+09:00", "2000-01-01"]
#     ]
#   ]
# ]
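As a rough illustration of "00:00:00 in local time", the same interpretation can be sketched in Python. This is not Groonga code; parse_date_as_local_midnight is a hypothetical helper name, and the time zone is whatever the machine running the snippet is configured with.

```python
from datetime import datetime


def parse_date_as_local_midnight(text):
    """Parse YYYY-MM-DD and return an aware datetime at 00:00:00 local time.

    Hypothetical helper for illustration only.
    """
    # strptime leaves hour, minute, and second at zero for a date-only format.
    naive = datetime.strptime(text, "%Y-%m-%d")
    # astimezone() on a naive datetime assumes the system's local time zone.
    return naive.astimezone()


loaded = parse_date_as_local_midnight("2000-01-01")
print(loaded.isoformat(timespec="microseconds"))
```

The printed UTC offset depends on the local time zone, in the same way that the +09:00 in the sample output above reflects the environment where the sample was run.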
Fixes
-
[select] Fixed a bug that displayed a wrong label in drilldown results when command_version is 3.
The following is a sample case.
table_create Documents TABLE_NO_KEY
column_create Documents tag1 COLUMN_SCALAR ShortText
column_create Documents tag2 COLUMN_SCALAR ShortText
load --table Documents
[
{"tag1": "1", "tag2": "2"}
]
select Documents --drilldown tag1,tag2 --command_version 3
# {
#   "header": {
#     "return_code": 0,
#     "start_time": 1672123380.653039,
#     "elapsed_time": 0.0005846023559570312
#   },
#   "body": {
#     "n_hits": 1,
#     "columns": [
#       {"name": "_id", "type": "UInt32"},
#       {"name": "tag1", "type": "ShortText"},
#       {"name": "tag2", "type": "ShortText"}
#     ],
#     "records": [
#       [1, "1", "2"]
#     ],
#     "drilldowns": {
#       "ctor": {
#         "n_hits": 1,
#         "columns": [
#           {"name": "_key", "type": "ShortText"},
#           {"name": "_nsubrecs", "type": "Int32"}
#         ],
#         "records": [
#           ["1", 1]
#         ]
#       },
#       "tag2": {
#         "n_hits": 1,
#         "columns": [
#           {"name": "_key", "type": "ShortText"},
#           {"name": "_nsubrecs", "type": "Int32"}
#         ],
#         "records": [
#           ["2", 1]
#         ]
#       }
#     }
#   }
# }
ctor, displayed right after drilldowns in the select result, should be tag1. In this sample, ctor is shown instead of tag1. However, it is unknown what value will be shown in place of the correct label.
-
[NormalizerTable] Fixed a bug that made Groonga crash with specific definitions in NormalizerTable.
The following is a sample case.
table_create Normalizations TABLE_PAT_KEY ShortText --normalizer NormalizerNFKC130
column_create Normalizations normalized COLUMN_SCALAR ShortText
load --table Normalizations
[
{"_key": "Ⅰ", "normalized": "1"},
{"_key": "Ⅱ", "normalized": "2"},
{"_key": "Ⅲ", "normalized": "3"}
]
normalize 'NormalizerTable("normalized", "Normalizations.normalized")' "ⅡⅡ"
This bug is reported to occur when all three of the following conditions are met.
-
Keys are normalized in the target table.
In this sample, the condition is met by specifying --normalizer NormalizerNFKC130 for Normalizations. The original keys Ⅰ, Ⅱ, and Ⅲ are normalized into i, ii, and iii respectively by NormalizerNFKC130.
-
A normalized key contains the same characters as other normalized keys.
In this sample, the condition is met because the normalized key iii contains ii and i, which are the normalized keys of the original keys Ⅱ and Ⅰ.
-
The characters from the second condition are used multiple times.
In this sample, the condition is met because iiii, which is the original text ⅡⅡ normalized with NormalizerNFKC130, is treated as the normalized keys for Ⅲ and Ⅰ in sequence.
Normalizing iiii with Normalizations takes the following steps, which meet the condition.
-
First
iii (applied for Ⅲ). ii and i are not used at first because NormalizerTable works with longest-common-prefix search.
-
Last
i (applied for Ⅰ)
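The two steps above can be sketched as a greedy longest-prefix replacement. The Python snippet below is a simplified model under stated assumptions, not Groonga's NormalizerTable implementation: normalize_with_table is a hypothetical helper, and the dictionary stands in for the Normalizations table after NormalizerNFKC130 has turned Ⅰ, Ⅱ, and Ⅲ into i, ii, and iii.

```python
# Normalized keys mapped to their replacement values, mirroring the
# Normalizations table from the sample (assumption: keys already normalized).
table = {"i": "1", "ii": "2", "iii": "3"}


def normalize_with_table(text, table):
    """Replace the longest matching key at each position.

    Hypothetical helper for illustration only. Returns the normalized
    text and the list of keys that were applied, in order.
    """
    keys_longest_first = sorted(table, key=len, reverse=True)
    output = []
    steps = []
    pos = 0
    while pos < len(text):
        for key in keys_longest_first:
            if text.startswith(key, pos):
                output.append(table[key])
                steps.append(key)
                pos += len(key)
                break
        else:
            # No key matches at this position: copy the character unchanged.
            output.append(text[pos])
            pos += 1
    return "".join(output), steps


# "iiii" is what ⅡⅡ becomes after NormalizerNFKC130 in the sample.
result, steps = normalize_with_table("iiii", table)
print(steps, result)
```

In this sketch the applied keys are iii followed by i, matching the First and Last steps described above, rather than ii applied twice.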
Known Issues
-
Currently, Groonga has a bug where data may be corrupted when many additions, deletions, and updates are executed against a vector column.
-
*< and *> are only valid when query() is used on the right side of the filter condition. If they are specified as below, *< and *> work as &&.
'content @ "Groonga" *< content @ "Mroonga"'
-
Groonga may fail to return records that should match because of GRN_II_CURSOR_SET_MIN_ENABLE.
Conclusion
Please refer to the following news for more details.
Let's search by Groonga!