News - 16 series#
Release 16.0.1 - 2026-03-30#
Improvements#
[language_model_vectorize] Added prefix option#
You can now add a prefix to the input text.
This is useful for models that require a prefix, similar to the passage_prefix and query_prefix options of TokenLanguageModelKNN.
language_model_vectorize("hf:///groonga/multilingual-e5-base-Q4_K_M-GGUF", \
"male child", \
{"prefix": "query: "})
[object_list] Added normalizer information to the output#
The object_list command now includes a normalizers field in its output for each table.
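As a minimal sketch (the table and normalizer names here are illustrative), a table created with a normalizer should now report it via object_list:

table_create Names TABLE_HASH_KEY ShortText \
  --normalizers NormalizerNFKC150
object_list

The entry for Names in the response should include a normalizers field listing NormalizerNFKC150.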
Fixes#
[HTTP] Fixed a false error for chunked HTTP requests#
Fixed a bug where valid chunked HTTP requests could fail with an error under certain conditions.
This issue could only occur when a chunked request body was split across multiple receives in a very specific pattern, so most users would never encounter it.
If it did occur during a load command, the load would only be partially completed and would need to be re-executed.
No index corruption occurs due to this bug.
Fixed a bug that caused a crash due to using the wrong free function#
Fixed a bug where grn_obj_unlink() was used instead of GRN_OBJ_FIN() for bulk objects.
This could cause a crash in some cases.
Reported by Daniel Black.
Experimental features#
These features are still experimental and unstable. Do not use them in production.
[Data types] Added support for arrays and objects in JSON type#
The JSON type now supports arrays and objects, including nested ones.
All JSON value types are now supported.
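For example, loading a nested object into a JSON column might look like the following sketch (the table name, column name, and record content are illustrative):

table_create Logs TABLE_NO_KEY
column_create Logs record COLUMN_SCALAR JSON
load --table Logs
[
{"record": {"user": "alice", "tags": ["admin", "dev"], "active": true}}
]

This stores an object containing a string, an array, and a boolean in a single JSON value.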
Added support for OpenZL compression and decompression for scalar columns and Float32 vector columns#
OpenZL compression and decompression are now supported for scalar columns and Float32 vector columns. OpenZL can achieve better compression ratios than Zstandard.
Compression example:
The following results are for a Float32 vector column with:
40,960 elements per record
10,000 records
Results:
OpenZL: 1.3GB
Zstandard: 1.5GB
No compression: 1.6GB
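As a sketch, enabling OpenZL for a column would presumably follow the same pattern as the existing COMPRESS_ZSTD flag. Note that COMPRESS_OPENZL below is an assumed flag name based on that convention; check the column_create reference for the actual name:

column_create Items embedding COLUMN_VECTOR|COMPRESS_OPENZL Float32
column_create Items note COLUMN_SCALAR|COMPRESS_OPENZL Text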
Added extractor support#
Extractors are a new type of module in Groonga that extract plain text from structured data before tokenization.
The built-in ExtractorHTML extractor strips HTML tags and decodes HTML entities from a value, leaving only the text content.
Here is an example of using ExtractorHTML with the extract command:
extract \
--extractors 'ExtractorHTML' \
--value "<html><body>He<ll>o</body></html>"
[[0,0.0,0.0],{"extracted":"Heo"}]
Thanks#
Daniel Black
Release 16.0.0 - 2026-02-09#
This is our annual major release! This release doesn’t have any backward incompatible changes! So you can upgrade Groonga without migrating your existing databases. You can still use your existing databases as-is.
Fixes#
Fixed a bug that caused an overflow in the key of TABLE_DAT_KEY tables#
If you set a 4096-byte value as a key, a TABLE_DAT_KEY table may be broken.