Groonga 11.0.4 has been released
Groonga 11.0.4 has been released!
How to install: Install
Changes
Here are important changes in this release:
Improvements
-
[Normalizer] Added support for customized normalizer.
-
Added a new command
object_warm
.This commnad ship Groonga's DB to OS's page cache.
If we never startup Groonga after OS startup, Groonga's DB doesn't exist on OS's page cache When Groonga on the first run. Therefore, the first operation to Groonga is slow.
If we execute this command in advance, the first operation to Groonga is fast. In Linux, we can do the same by also executing
cat *.db > dev/null
. However, we could not do the same thing in Windows until now.By using this command, we can ship Groonga's DB to OS's page cache in both Linux and Windows. Then, we can also do that in units of table, column, and index. Therefore, we can ship only table, column, and index that we often use to OS's page cache.
-
select Added support for adjusting the score of a specific record in
--filter
.We can adjust the score of a specific record by using a oprtator named
*~
.*~
is logical operator same as&&
and||
. Therefore, we can use*~
like as&&
ans||
. Default weight of*~
is -1.Therefore, for example,
'content @ "Groonga" *~ content @ "Mroonga"'
mean the following operations.- Extract records that match
'content @ "Groonga"
andcontent @ "Mroonga"'
. - Add a score as below.
a. Calculate the score of 'content @ "Groonga"'. b. Calculate the score of 'content @ "Mroonga"'. c. b's score multiplied by -1 by *~. d. The socre of this record is a + b Therefore, if a's socre is 1 and b's score is 1, the score of this record is 1 + (1 * -1) = 0.
Then, we can specify score quantity by
*~${score_quantity}
.In particular, the following query adjust the score of match records by the following condition(
'content @ "Groonga" *~2.5 content @ "Mroonga")'
).``` table_create Memos TABLE_NO_KEY column_create Memos content COLUMN_SCALAR ShortText table_create Terms TABLE_PAT_KEY ShortText \ --default_tokenizer TokenBigram \ --normalizer NormalizerAuto column_create Terms index COLUMN_INDEX|WITH_POSITION Memos content load --table Memos [ {"content": "Groonga is a full text search engine."}, {"content": "Rroonga is the Ruby bindings of Groonga."}, {"content": "Mroonga is a MySQL storage engine based of Groonga."} ] select Memos \ --command_version 3 \ --filter 'content @ "Groonga" *~2.5 content @ "Mroonga"' \ --output_columns 'content, _score' \ --sort_keys -_score,_id { "header": { "return_code": 0, "start_time": 1624605205.641078, "elapsed_time": 0.002965450286865234 }, "body": { "n_hits": 3, "columns": [ { "name": "content", "type": "ShortText" }, { "name": "_score", "type": "Float" } ], "records": [ [ "Groonga is a full text search engine.", 1.0 ], [ "Rroonga is the Ruby bindings of Groonga.", 1.0 ], [ "Mroonga is a MySQL storage engine based of Groonga.", -1.5 ] ] } } ```
We can do the same by also useing adjuster . If we use adjuster , we need to make
--filter
condition and--adjuster
conditon on our application, but we make only--filter
condition on it by this improvement.We can also describe filter condition as below by using
query()
.--filter 'content @ "Groonga" *~2.5 content @ "Mroonga"'
- Extract records that match
-
select Added support for
&&
with weight.We can use
&&
with weight by using*<
or*>
. Default weight of*<
is 0.5. Default weight of*>
is 2.0.We can specify score quantity by
*<${score_quantity}
and*>${score_quantity}
. Then, if we specify*<${score_quantity}
, a plus or minus sign of${score_quantity}
is reverse.For example,
'content @ "Groonga" *<2.5 query("content", "MySQL")'
is as below.- Extract records that match
'content @ "Groonga"
andcontent @ "Mroonga"'
. - Add a score as below.
a. Calculate the score of 'content @ "Groonga"'. b. Calculate the score of 'query("content", "MySQL")'. c. b's score multiplied by -2.5 by *<. d. The socre of this record is a + b Therefore, if a's socre is 1 and b's score is 1, the score of this record is 1 + (1 * -2.5) = -1.5.
In particular, the following query adjust the score of match records by the following condition(
'content @ "Groonga" *<2.5 query("content", "Mroonga")'
).``` table_create Memos TABLE_NO_KEY column_create Memos content COLUMN_SCALAR ShortText table_create Terms TABLE_PAT_KEY ShortText \ --default_tokenizer TokenBigram \ --normalizer NormalizerAuto column_create Terms index COLUMN_INDEX|WITH_POSITION Memos content load --table Memos [ {"content": "Groonga is a full text search engine."}, {"content": "Rroonga is the Ruby bindings of Groonga."}, {"content": "Mroonga is a MySQL storage engine based of Groonga."} ] select Memos \ --command_version 3 \ --filter 'content @ "Groonga" *<2.5 query("content", "Mroonga")' \ --output_columns 'content, _score' \ --sort_keys -_score,_id { "header": { "return_code": 0, "start_time": 1624605205.641078, "elapsed_time": 0.002965450286865234 }, "body": { "n_hits": 3, "columns": [ { "name": "content", "type": "ShortText" }, { "name": "_score", "type": "Float" } ], "records": [ [ "Groonga is a full text search engine.", 1.0 ], [ "Rroonga is the Ruby bindings of Groonga.", 1.0 ], [ "Mroonga is a MySQL storage engine based of Groonga.", -1.5 ] ] } } ```
- Extract records that match
-
Log Added support for outputting to stdout and stderr.
Process-log and Query-log supported output to stdout and stderr.
- If we specify as
--log-path -
,--query-log-path -
, Groonga output log to stdout. - If we specify as
--log-path +
,--query-log-path +
, Groonga output log to stderr.
Process-log is for all of Groonga works. Query-log is just for query processing.
This feature is useful when we execute Groonga on Docker. Docker has the feature that records stdout and stderr in standard. Therefore, we don't need to login into the environment of Docker to get Groonga's log.
For example, this feature is useful as he following case.
-
If we want to analyze slow queries of Groonga on Docker.
If we specify
--query-log-path -
when startup Groonga, we can analyze slow queries by only execution the following commands.docker logs ${container_name} | groonga-query-log-analyze
By this, we can analyze slow query with the query log that output from Groonga on Docker simply.
- If we specify as
-
[Documentation] Filled missing documentation of
string_substring
.
Known Issues
-
Currently, Groonga has a bug that there is possible that data is corrupt when we execute many additions, delete, and update data to vector column.
-
[The browser based administration tool] Currently, Groonga has a bug that a search query that is inputted to non-administration mode is sent even if we input checks to the checkbox for the administration mode of a record list.
-
*<
and*>
only valid when we usequery()
the right side of filter condition. If we specify as below,*<
and*>
work as&&
.'content @ "Groonga" *< content @ "Mroonga"'
Conclusion
Please refert to the following news for more details.
Let's search by Groonga!