News - 14 series#
Release 14.0.2 - 2024-03-29#
Improvements#
Reduced the log level of the log that is output when Groonga sets normalizers/tokenizer/token_filters on a temporary table.

For example, the target log of this change is the following:

```
DDL:1234567890:set_normalizers NormalizerAuto
```

PGroonga sets normalizers on temporary tables on startup, so this log is noise: it is output every time PGroonga starts because PGroonga's default log level is `notice`. Since this release, the log level of this log is reduced to `debug`, so it is no longer output on PGroonga startup by default.
Release 14.0.1 - 2024-03-14#
Improvements#
[load] Stopped reporting an error when we `load` a key that becomes an empty key by normalization.

`"-"` becomes `""` with `NormalizerNFKC150("remove_symbol", true)`. So the following case reported an "empty key" error:

```
table_create Values TABLE_HASH_KEY ShortText \
  --normalizers 'NormalizerNFKC150("remove_symbol", true)'
table_create Data TABLE_NO_KEY
column_create Data value COLUMN_SCALAR Values
load --table Data
[
  {"value": "-"}
]
```

However, if we `load` many such values, many error logs were generated, because Groonga output an "empty key" error for each of them: Groonga can't register an empty string in an index. There is no problem even if an empty string can't be registered in an index in this case, because a search for an empty string never matches anything anyway. So we stopped reporting the "empty key" error in this case.
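As a rough illustration of why `"-"` can become an empty key, here is a minimal Python sketch of symbol-removing normalization. The `remove_symbols` helper is hypothetical and far simpler than Groonga's actual `NormalizerNFKC150` implementation; it only shows the general idea:

```python
import unicodedata

def remove_symbols(text: str) -> str:
    # Hypothetical, simplified analogue of
    # NormalizerNFKC150("remove_symbol", true): NFKC-normalize,
    # then drop symbol (S*) and punctuation (P*) characters.
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(
        ch for ch in normalized
        if unicodedata.category(ch)[0] not in ("S", "P")
    )

print(repr(remove_symbols("-")))    # '' -- nothing is left to use as a key
print(repr(remove_symbols("a-b")))  # 'ab'
```

A key that normalizes to the empty string can't be registered in the index, but because an empty-string query never matches anything, silently skipping such keys loses nothing.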
Fixes#
Fixed a crash bug when a request is canceled during an OR search or a range search.

This bug doesn't always occur: it only occurs when a request is canceled at a specific timing. It occurs more easily when a search takes a long time, for example a sequential search.
Fixed a bug that `highlight_html` may return an invalid result when the following conditions are met:

* We use multiple normalizers such as `NormalizerTable` and `NormalizerNFKC150`.
* We highlight a string that includes whitespace.

For example, this bug occurred in the following case:

```
table_create NormalizationsIndex TABLE_PAT_KEY ShortText --normalizer NormalizerAuto

table_create Normalizations TABLE_HASH_KEY UInt64
column_create Normalizations normalized COLUMN_SCALAR LongText
column_create Normalizations target COLUMN_SCALAR NormalizationsIndex

column_create NormalizationsIndex index COLUMN_INDEX Normalizations target

table_create Lexicon TABLE_PAT_KEY ShortText \
  --normalizers 'NormalizerTable("normalized", \
                                 "Normalizations.normalized", \
                                 "target", \
                                 "target"), NormalizerNFKC150'

table_create Names TABLE_HASH_KEY UInt64
column_create Names name COLUMN_SCALAR Lexicon

load --table Names
[
  ["_key","name"],
  [1,"Sato Toshio"]
]

select Names \
  --query '_key:1 OR name._key:@"Toshio"' \
  --output_columns 'highlight_html(name._key, Lexicon)'
[
  [
    0,
    1710401574.332274,
    0.001911401748657227
  ],
  [
    [
      [
        1
      ],
      [
        [
          "highlight_html",
          null
        ]
      ],
      [
        "sato <span class=\"keyword\">toshi</span>o"
      ]
    ]
  ]
]
```
[Ubuntu] We can provide packages for Ubuntu again.

We didn't provide packages for Ubuntu in Groonga 14.0.0 because we failed to make the Groonga package for Ubuntu due to a problem in the build environment for Ubuntu packages. We fixed this problem in 14.0.1, so we can provide packages for Ubuntu again since this release.
Fixed a build error when we build from source by using `clang`. [GitHub#1738][Reported by windymelt]
Thanks#
windymelt
Release 14.0.0 - 2024-02-29#
This is a major version up! But it keeps backward compatibility. We can upgrade to 14.0.0 without rebuilding the database.
Improvements#
Added a new tokenizer `TokenH3Index` (experimental).

`TokenH3Index` tokenizes WGS84GeoPoint to UInt64 (H3 index).

Added support for offline and online index construction with non-text-based tokenizers (experimental).

`TokenH3Index` is one of the non-text-based tokenizers.

[select] Added support for searching by index with a non-text-based tokenizer (experimental).

`TokenH3Index` is one of the non-text-based tokenizers.

Added new functions: `distance_cosine()`, `distance_inner_product()`, `distance_l2_norm_squared()`, `distance_l1_norm()`.

We can get only the records with a small distance as vectors by combining these functions with `limit N`.

These functions calculate distance in the `output` stage. However, we don't optimize these functions yet.

* `distance_cosine()`: Calculates cosine similarity.
* `distance_inner_product()`: Calculates inner product.
* `distance_l2_norm_squared()`: Calculates squared Euclidean distance.
* `distance_l1_norm()`: Calculates Manhattan distance.
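For reference, the four distance measures can be sketched in plain Python. These are the standard textbook formulas, not Groonga's implementation (Groonga evaluates them against vector columns in the `output` stage):

```python
import math

def distance_cosine(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def distance_inner_product(a, b):
    # Inner (dot) product of the two vectors.
    return sum(x * y for x, y in zip(a, b))

def distance_l2_norm_squared(a, b):
    # Squared Euclidean distance (no final square root).
    return sum((x - y) ** 2 for x, y in zip(a, b))

def distance_l1_norm(a, b):
    # Manhattan distance: sum of absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))

a, b = [1.0, 2.0], [4.0, 6.0]
print(distance_inner_product(a, b))    # 16.0
print(distance_l2_norm_squared(a, b))  # 25.0
print(distance_l1_norm(a, b))          # 7.0
```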
Added a new function `number_round()`.

[load] Added support for parallel `load`.

This feature is only enabled when the `input_type` of `load` is `apache-arrow`. This feature uses one thread per column. If there are many target columns, it can reduce load time.
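The "one thread per column" idea can be sketched conceptually in Python. This illustrates only the parallelization strategy, not Groonga's code; `store_column` is a hypothetical stand-in for the per-column write work:

```python
from concurrent.futures import ThreadPoolExecutor

def store_column(name, values):
    # Hypothetical stand-in for writing one column's values to storage.
    return name, list(values)

records = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]

# Columnar input (as with Apache Arrow) groups values by column,
# so each column can be handed to its own worker.
columns = {name: [r[name] for r in records] for name in records[0]}

with ThreadPoolExecutor(max_workers=len(columns)) as pool:
    results = dict(pool.map(lambda item: store_column(*item), columns.items()))

print(results)  # {'a': [1, 2], 'b': ['x', 'y']}
```

With many target columns, the per-column work overlaps instead of running sequentially, which is why wide tables benefit most.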
[select] We can use uvector as much as possible for array literals in `--filter`.

uvector is a vector of elements with a fixed size. If all elements have the same type, we use uvector instead of vector.
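The vector vs. uvector distinction is loosely analogous to a generic list vs. a typed array in Python. This is an analogy only; Groonga's internal layout differs:

```python
import array

# "vector"-like: a generic list may hold elements of mixed types,
# so each element is a separately boxed object.
generic = [1, 2, 3, 4]

# "uvector"-like: when every element shares one fixed-size type,
# the values can be stored as raw 64-bit integers, back to back.
typed = array.array("q", generic)

print(typed.itemsize)  # 8 (bytes per element)
print(list(typed))     # [1, 2, 3, 4]
```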
[status] Added `n_workers` to the output of `status`.

Optimized dynamic column creation.
[WAL] Added support for rebuilding broken indexes in parallel.
[select] Added support for `Int64` in `output_type=apache-arrow` for columns that reference other tables.
Fixes#
[Windows] Fixed the path for documents of `groonga-normalizer-mysql` in the package for Windows.

Documents of `groonga-normalizer-mysql` are put under `share/` in this release.

[select] Fixed a bug that Groonga may crash when we use bitwise operations.