BloGroonga

2018-03-29

Groonga 8.0.1 has been released

Groonga 8.0.1 has been released!

How to install: Install

Changes

Here are important changes in this release:

  • [log] Show filter conditions in query log.
  • [Windows] Install *.pdb into the directory where *.dll and *.exe are installed.
  • [logical_count] Support filtered stage dynamic columns.
  • [logical_count] Added a new filter timing.
  • [logical_select] Added a new filter timing.
  • [logical_range_filter] Optimize window function for large result set.
  • [select] Added --match_escalation parameter.
  • [httpd] Updated bundled nginx to 1.13.10.
  • Fixed memory leak that occurs when a prefix query doesn't match any token.
  • Fixed a bug that a cache for different databases is used when multiple databases are opened in the same process.
  • Fixed a bug that a constant value can overflow or underflow in comparison (>,>=,<,<=,==,!=).

[log] Show filter conditions in query log.

With this change, you can see how many records were narrowed down and under which conditions. For example:

2018-02-15 19:04:02.303809|0x7ffd9eedf6f0|:000000013837058 filter(17): product equal "test_product"

In the above example, we can see that 17 records were narrowed down by product == "test_product". This feature is disabled by default. To enable it, set the environment variable below:

GRN_QUERY_LOG_SHOW_CONDITION=yes
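For example, you could set the variable when starting groonga (a minimal sketch; the database path and query log path below are hypothetical):

GRN_QUERY_LOG_SHOW_CONDITION=yes \
  groonga --query-log-path query.log /tmp/db/db

With this variable set, each filter entry in the query log includes its condition as shown above.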

[logical_count] Support filtered stage dynamic columns.

Until now, logical_count supported only initial stage dynamic columns. From this release, you can also use filtered stage dynamic columns in logical_count (see the logical_count sketch after the next example).

[logical_count][logical_select] Added a new filter timing.

The new filter timing is --post_filter. It's executed after filtered stage dynamic columns are generated. For example:

logical_select \
    --logical_table Entries \
    --shard_key created_at \
    --columns[n_likes_sum_per_tag].stage filtered \
    --columns[n_likes_sum_per_tag].type UInt32 \
    --columns[n_likes_sum_per_tag].value 'window_sum(n_likes)' \
    --columns[n_likes_sum_per_tag].window.group_keys 'tag' \
    --filter 'content @ "system" || content @ "use"' \
    --post_filter 'n_likes_sum_per_tag > 10' \
    --output_columns _key,n_likes,n_likes_sum_per_tag

  # [
  #   [
  #     0, 
  #     1519030779.410312,
  #     0.04758048057556152
  #   ], 
  #   [
  #     [
  #       [
  #         2
  #       ], 
  #       [
  #         [
  #           "_key", 
  #           "ShortText"
  #         ], 
  #         [
  #           "n_likes", 
  #           "UInt32"
  #         ], 
  #         [
  #           "n_likes_sum_per_tag", 
  #           "UInt32"
  #         ]
  #       ], 
  #       [
  #         "Groonga", 
  #         10, 
  #         25
  #       ], 
  #       [
  #         "Mroonga", 
  #         15, 
  #         25
  #       ]
  #     ]
  #   ]
  # ]

The point of this feature is that filtered stage dynamic columns can be used in --post_filter. The above example uses logical_select, but --post_filter is also available in logical_count.
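For example, the same filter and post filter could be expressed with logical_count like this (a sketch assuming the same Entries table as above; logical_count returns only the number of matched records):

logical_count \
    --logical_table Entries \
    --shard_key created_at \
    --columns[n_likes_sum_per_tag].stage filtered \
    --columns[n_likes_sum_per_tag].type UInt32 \
    --columns[n_likes_sum_per_tag].value 'window_sum(n_likes)' \
    --columns[n_likes_sum_per_tag].window.group_keys 'tag' \
    --filter 'content @ "system" || content @ "use"' \
    --post_filter 'n_likes_sum_per_tag > 10'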

[logical_range_filter] Optimize window function for large result set.

If we find enough matched records, we don't apply the window function to the remaining windows. This optimization is disabled for a small result set when its overhead is not negligible.
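For example, in a request like the following (a sketch assuming the same Entries table as above), once enough records are matched to fill the requested range, window_sum doesn't need to be applied to the remaining windows:

logical_range_filter \
    --logical_table Entries \
    --shard_key created_at \
    --columns[n_likes_sum_per_tag].stage initial \
    --columns[n_likes_sum_per_tag].type UInt32 \
    --columns[n_likes_sum_per_tag].value 'window_sum(n_likes)' \
    --columns[n_likes_sum_per_tag].window.group_keys 'tag' \
    --filter 'content @ "system"' \
    --limit 10 \
    --output_columns _key,n_likes_sum_per_tag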

[select] Added --match_escalation parameter.

You can force match escalation to be enabled with --match_escalation yes. It's stronger than --match_escalation_threshold 99999....999 because --match_escalation yes also works with SOME_CONDITIONS && column @ 'query'. --match_escalation_threshold isn't used in this case.

The default is --match_escalation auto. It doesn't change the current behavior.

You can disable match escalation by --match_escalation no. It's the same as --match_escalation_threshold -1.
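For example (a sketch assuming a hypothetical Users table whose name column has a full text search index):

select Users \
    --match_columns name \
    --query Alice \
    --match_escalation yes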

Fixed memory leak that occurs when a prefix query doesn't match any token.

Fixed a memory leak that occurs when a prefix query doesn't match any token in fuzzy search, as in the example below.

table_create Users TABLE_NO_KEY
[[0,0.0,0.0],true]
column_create Users name COLUMN_SCALAR ShortText
[[0,0.0,0.0],true]
table_create Names TABLE_PAT_KEY ShortText
[[0,0.0,0.0],true]
column_create Names user COLUMN_INDEX Users name
[[0,0.0,0.0],true]
load --table Users
[
{"name": "Tom"},
{"name": "Tomy"},
{"name": "Pom"},
{"name": "Tom"}
]
[[0,0.0,0.0],4]
select Users --filter 'fuzzy_search(name, "Atom", {"prefix_length": 1})'   --output_columns 'name, _score'   --match_escalation_threshold -1
[[0,0.0,0.0],[[[0],[["name","ShortText"],["_score","Int32"]]]]]

Fixed a bug that a cache for different databases is used when multiple databases are opened in the same process.

Fixed a bug that, when multiple databases are opened in the same process, results are returned from the cache of another database because the cache was shared within the process.

Fixed a bug that a constant value can overflow or underflow in comparison (>,>=,<,<=,==,!=).

Fixed a bug that a constant value can overflow or underflow in comparison, as in the example below.

table_create Values TABLE_NO_KEY
[[0,0.0,0.0],true]
column_create Values number COLUMN_SCALAR Int16
[[0,0.0,0.0],true]
load --table Values
[
{"number": 3},
{"number": 4},
{"number": -1}
]
[[0,0.0,0.0],3]
select Values   --filter 'number > 32768'   --output_columns 'number'
[[0,1522305525.361629,0.0003235340118408203],[[[3],[["number","Int16"]],[3],[4],[-1]]]]

An overflow occurs because 32768 is out of the range of Int16 (-32,768 to 32,767). In this case, number > 32768 was evaluated as number > -32768, so all records matched. Since this release, when an overflow or underflow occurs as above, no results are returned.
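With this fix, the same select matches no records. A sketch of the expected output (timings elided as 0.0):

select Values   --filter 'number > 32768'   --output_columns 'number'
[[0,0.0,0.0],[[[0],[["number","Int16"]]]]]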

Conclusion

See Release 8.0.1 - 2018-03-29 for detailed changes since 8.0.0.

Let's search by Groonga!