News - 7 series#

Release 7.1.1 - 2018-01-29#

Improvements#

  • [Ubuntu] Dropped Ubuntu 17.04 (Zesty Zapus) support. It reached EOL on Jan 13, 2018.

  • Added quorum match support. You can use quorum match in both script syntax and query syntax. [groonga-talk,385][Suggested by 付超群]

    TODO: Add documents for quorum match syntax and link to them.

  • Added custom similarity threshold support in script syntax. You can now specify your own similarity threshold in script syntax.

    TODO: Add document for the syntax and link to it.

  • [grndb][--force-lock-clear] Added --force-lock-clear option. With this option, grndb forcibly clears locks of the database, tables and data columns. You can use your database again even if locks remain in the database, tables and data columns.

    But this option is very risky. Normally, you should not use it. If your database is broken, it is still broken after you use this option. This option just ignores locks.

  • [load] Added surrogate pairs support in escape syntax. For example, \uD83C\uDF7A is processed as 🍺.

  • [Windows] Changed to use sparse file on Windows. It reduces disk space usage and there is no performance penalty.

  • [Online index construction] Added GRN_II_REDUCE_EXPIRE_THRESHOLD environment variable to control when memory maps are expired in index column. It’s -1 by default. It means that the expire timing depends on the index column size. Smaller index columns are expired more often. Larger index columns are expired less often.

    You can get the previous behavior by setting it to 0. It means that Groonga always tries to expire.

  • [logical_range_filter] [post_filter] Added a new filter timing. It’s executed after filtered stage generated columns are generated.
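
The surrogate pair handling noted in the [load] item above can be illustrated with a minimal sketch. It only shows how a UTF-16 high/low surrogate pair maps to a single code point. It is not Groonga's implementation, and the helper name combine_surrogates is made up for this example.

```python
import json


def combine_surrogates(high: int, low: int) -> str:
    """Combine a UTF-16 high/low surrogate pair into one character."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate carries 10 bits of the code point above U+FFFF.
    code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
    return chr(code_point)


beer = combine_surrogates(0xD83C, 0xDF7A)
print(hex(ord(beer)))

# Python's JSON parser applies the same rule to escaped pairs.
print(json.loads('"\\uD83C\\uDF7A"') == beer)
```

The pair \uD83C\uDF7A resolves to U+1F37A, which is the 🍺 character used in the release note.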

Fixes#

  • Reduced resource usage for creating index for reference vector. [GitHub#806][Reported by Naoya Murakami]

  • [table_create] Fixed a bug that a table is created even when token_filters is invalid. [GitHub#266]

Thanks#

  • 付超群

  • Naoya Murakami

Release 7.1.0 - 2017-12-29#

Improvements#

  • [load] Improved the load’s query-log format. Added the following details to the load’s query-log.

    • outputs number of loaded records.

    • outputs number of error records and columns.

    • outputs number of total records.

  • [logical_count] Improved the logical_count’s query-log format. Added the following details to the logical_count’s query-log.

    • outputs number of count.

  • [logical_select] Improved the logical_select’s query-log format. Added the following details to the logical_select’s query-log.

    • log N outputs.

    • outputs plain drilldown.

    • outputs labeled drilldown.

    • outputs selected in each shard.

    • use “[…]” for target information.

  • [delete] Improved the delete’s query-log format. Added the following details to the delete’s query-log.

    • outputs number of deleted and error records.

    • outputs number of remaining records.

  • [Groonga HTTP server] The server started by groonga -s now reliably stops on C-c.

  • Used NaN and Infinity, -Infinity instead of Lisp representations(#<nan> and #i1/0, #-i1/0).

  • Supported vector for drilldown calc target.

  • Partially supported keyword extraction from regexp search. It enables highlight_html and snippet_html for regexp search. [GitHub#787][Reported by takagi01]

  • [bulk] Reduced the number of realloc(). grn_bulk_*() API supports it.

    It improves performance for large output on Windows, because realloc() is heavy on Windows. For example, output of over 100MB becomes 100x faster.

  • Enabled GRN_II_OVERLAP_TOKEN_SKIP_ENABLE only when its value is “yes”.

  • Deprecated GRN_NGRAM_TOKENIZER_REMOVE_BLANK_DISABLE. Use GRN_NGRAM_TOKENIZER_REMOVE_BLANK_ENABLE=no instead.

  • Added new function index_column_source_records. It gets source records of index column.[Patch by Naoya Murakami]

  • [select] Supported negative “offset” for “offset + size - limit” >= 0

  • Added grn_column_cache. It’ll improve performance for getter of fixed size column value.

  • [groonga executable file] Added --listen-backlog option. You can customize listen(2)’s backlog by this option.

  • [httpd] Updated bundled nginx to 1.13.8.
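
For the NaN, Infinity and -Infinity change listed above, a client may need to check how its JSON parser treats these tokens, since they are not part of strict JSON. The following is a minimal sketch that assumes the tokens appear bare in JSON output. The sample string is made up for illustration and is not actual Groonga output.

```python
import json
import math

# Hypothetical fragment of a response body containing the three tokens.
text = "[NaN, Infinity, -Infinity]"

# Python's json module accepts these tokens by default.
values = json.loads(text)
print(math.isnan(values[0]), values[1], values[2])


def reject_constant(token):
    """Callback for clients that want to refuse non-standard tokens."""
    raise ValueError(f"non-standard JSON token: {token}")


try:
    json.loads(text, parse_constant=reject_constant)
except ValueError as error:
    print("strict mode:", error)
```

Parsers in other languages may reject these tokens by default, so it is worth testing the client side before relying on float output.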

Fixes#

  • Fixed a memory leak in highlight_full.

  • Fixed a crash bug caused by early unlink. It’s not caused by instructions in grn_expr_parse(), but it occurs when a libgroonga user such as Mroonga uses the following instructions:

    1. grn_expr_append_const("_id")

    2. grn_expr_append_op(GRN_OP_GET_VALUE)

Thanks#

  • takagi01

  • Naoya Murakami

Release 7.0.9 - 2017-11-29#

Improvements#

  • Supported newer version of Apache Arrow. In this release, 0.8.0 or later is required for Apache Arrow support.

  • [sharding] Added new API for dynamic columns.

    • Groonga::LabeledArguments

  • [sharding] Added convenient Table#select_all method.

  • [logical_range_filter] Supported dynamic columns. Note that only the initial and filtered stages are supported.

  • [logical_range_filter] Added documentation about cache parameter and dynamic columns.

  • [logical_count] Supported dynamic columns. Note that only the initial stage is supported.

  • [logical_count] Added documentation about named parameters.

  • [select] Supported --match_columns _key without index.

  • [in_values] Supported to specify more than 126 values. [GitHub#760] [GitHub#781] [groonga-dev,04449] [Reported by Murata Satoshi]

  • [httpd] Updated bundled nginx to 1.13.7.

Fixes#

  • [httpd] Fixed build error when old Groonga is already installed. [GitHub#775] [Reported by myamanishi3]

  • [in_values] Fixed a bug that in_values with too many arguments can cause a crash. This bug was found while adding support for more than 126 values. [GitHub#780]

  • [cmake] Fixed LZ4 and MessagePack detection. [Reported by Sergei Golubchik]

  • [Offline index construction] Fixed a bug that offline index construction for vector column consumes unnecessary resources. If you have a lot of elements in one vector column and many records, Groonga will crash. [groonga-dev,04533][Reported by Toshio Uchiyama]

Thanks#

  • Murata Satoshi

  • myamanishi3

  • Sergei Golubchik

  • Toshio Uchiyama

Release 7.0.8 - 2017-10-29#

Improvements#

  • [windows] Supported backtrace on crash. This feature shows not only the function call history but also source filenames and line numbers as much as possible. This feature makes problem solving easier.

  • Supported ( ) (empty block) only query (--query "( )") for QUERY_NO_SYNTAX_ERROR. In the previous version, it caused an error. [GitHub#767]

  • Supported (+) (only and block) only query (--query "(+)") for QUERY_NO_SYNTAX_ERROR. In the previous version, it caused an error. [GitHub#767]

  • Supported ~foo (starting with “~”) query (--query "~y") for QUERY_NO_SYNTAX_ERROR. In the previous version, it caused an error. [GitHub#767]

  • Changed the log level of the expired message from info to debug. Here is an example of the message:

    2017-10-29 14:05:34.123456|i| <0000000012345678:0> expired i=000000000B123456 max=10 (2/2)

    This message is logged when a memory mapped area for an index is unmapped. It is useful for debugging but unnecessary in normal operation, so the log level was changed from info to debug.

  • Supported Ubuntu 17.10 (Artful Aardvark)

Fixes#

  • [dat] Fixed a bug that a large file is created unexpectedly in the worst case during the database expansion process. This bug may occur when you create/delete index columns very frequently. In the 7.0.7 release, a related bug was fixed - “table_create command fails when there are many deleted keys”, but it turned out that the fix was not enough in the worst case.

  • [logical_select] Fixed a bug that fewer records than expected may be returned when offset and limit are applied to multiple shards at the same time.

Release 7.0.7 - 2017-09-29#

Improvements#

  • Supported + only query (--query "+") for QUERY_NO_SYNTAX_ERROR. In the previous version, it caused an error.

  • [httpd] Updated bundled nginx to 1.13.5.

  • [dump] Added the default argument values to the syntax section.

  • [Command version] Supported --default-command-version 3.

  • Supported caching select result with function call. Now, most existing functions support this feature. There are two exceptions: when now() or rand() is used in a query, the select result is not cached. Because of this default behavior change, new APIs are introduced.

    • grn_proc_set_is_stable()

    • grn_proc_is_stable()

    Note that if you add a new function that may return a different result with the same arguments, you must call grn_proc_set_is_stable(ctx, proc, GRN_FALSE). If you don’t call it, the select result with the function call is cached and a wrong result is returned for multiple requests.
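
The idea behind the stable flag can be sketched conceptually as follows. This is not Groonga's cache implementation. The FUNCTIONS registry, the ResultCache class and its evaluate method are hypothetical names used only to show why results of functions such as now() and rand() must not be cached.

```python
import random
import time

# Hypothetical registry: function name -> (callable, is_stable).
# A stable function returns the same result for the same arguments.
FUNCTIONS = {
    "abs": (abs, True),
    "rand": (lambda: random.random(), False),
    "now": (lambda: time.time(), False),
}


class ResultCache:
    """Caches results only for calls to stable functions."""

    def __init__(self):
        self._entries = {}
        self.hits = 0

    def evaluate(self, name, *args):
        func, is_stable = FUNCTIONS[name]
        key = (name, args)
        if is_stable and key in self._entries:
            self.hits += 1
            return self._entries[key]
        result = func(*args)
        if is_stable:
            # Unstable results are never stored, so they are
            # recomputed for every request.
            self._entries[key] = result
        return result


cache = ResultCache()
cache.evaluate("abs", -3)
cache.evaluate("abs", -3)   # served from the cache
cache.evaluate("now")
cache.evaluate("now")       # computed again
print(cache.hits)
```

If an unstable function were registered as stable in this sketch, the second request would silently receive the first request's value, which is the wrong-result case the note above warns about.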

Fixes#

  • [windows] Fixed to clean up file handle correctly on failure when database_unmap is executed. There is a case that the critical section is not initialized when a request is canceled before executing database_unmap. In such a case, it caused a crash bug.

  • [Tokenizers] Fixed document for wrong tokenizer names. It should be TokenBigramIgnoreBlankSplitSymbolAlpha and TokenBigramIgnoreBlankSplitSymbolAlphaDigit.

  • Changed not to keep created empty file on error.

    In the previous versions, there was a case that an empty file remained on error.

    Here is the scenario to reproduce:

    1. creating new file by grn_fileinfo_open succeeds

    2. mapping file by DO_MAP() fails

    In such a case, it causes another error such as “already file exists” because of the file which isn’t under control. So such files should be removed during the cleanup process.

  • Fixed a bug that Groonga may crash when a search is executed while many updates are executed in a short time.

  • [table_create] Fixed a bug that table_create failed when there are many deleted keys.

Release 7.0.6 - 2017-08-29#

Improvements#

  • Supported prefix match search using multiple indexes. (e.g. --query "Foo*" --match_columns "TITLE_INDEX_COLUMN||BODY_INDEX_COLUMN").

  • [window_count] Supported window_count function to add count data to result set. It is useful to analyze or filter additionally.

  • Added the following API

    • grn_obj_get_disk_usage():

    • GRN_EXPR_QUERY_NO_SYNTAX_ERROR

    • grn_expr_syntax_expand_query_by_table()

    • grn_table_find_reference_object()

  • [object_inspect] Supported to show disk usage about specified object.

  • Supported falling back query parse feature. It is enabled when the QUERY_NO_SYNTAX_ERROR flag is set in query_flags. (This feature is disabled by default.) If this flag is set, a query never causes a syntax error. For example, “A +” is parsed and escaped automatically into “A \+”. This behavior is useful when an application uses user input directly and doesn’t want to show syntax errors to the user or in the log.

  • Supported to adjust score for term in query. “>”, “<”, and “~” operators are supported. For example, “>Groonga” increments score of “Groonga”, “<Groonga” decrements score of “Groonga”. “~Groonga” decreases score of matched document in the current search result. “~” operator doesn’t change search result itself.

  • Improved performance to remove table. thread_limit=1 is not needed for it. The process about checking referenced table existence is done without opening objects. As a result, performance is improved.

  • [httpd] Updated bundled nginx to 1.13.4.
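
As an illustration of the QUERY_NO_SYNTAX_ERROR item above, the following sketch builds an HTTP select request that passes raw user input with the flag set. The host, port, table name Entries and column name content are assumptions made for this example, and build_select_url is a hypothetical helper, not part of any Groonga client library.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_select_url(base, table, match_columns, user_input):
    """Build a select URL that tolerates syntax errors in user input."""
    params = {
        "table": table,
        "match_columns": match_columns,
        "query": user_input,
        # Keep the default flags and add QUERY_NO_SYNTAX_ERROR.
        "query_flags": "ALLOW_PRAGMA|ALLOW_COLUMN|QUERY_NO_SYNTAX_ERROR",
    }
    return f"{base}/d/select?{urlencode(params)}"


url = build_select_url("http://localhost:10041", "Entries", "content", "A +")
print(url)

# The user input survives URL encoding unchanged.
print(parse_qs(urlsplit(url).query)["query"])
```

With the flag set, the trailing "+" in the input is expected to be escaped by Groonga instead of producing a syntax error.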

Fixes#

  • [dump] Fixed a bug that the 7th unnamed parameter for the --sort_hash_table option is ignored.

  • [schema] Fixed a typo in command line parameter name. It should be source instead of sources. [groonga-dev,04449] [Reported by murata satoshi]

  • [ruby_eval] Fixed crash when ruby_eval returned syntax error. [GitHub#751] [Patch by ryo-pinus]

Thanks#

  • murata satoshi

  • ryo-pinus

Release 7.0.5 - 2017-07-29#

Improvements#

  • [httpd] Updated bundled nginx to 1.13.3. Note that this version contains security fix for CVE-2017-7529.

  • [load] Supported to load the value of max UInt64. In the previous versions, the max UInt64 value was converted into 0 unexpectedly.

  • Added the following API

    • grn_window_get_size() [GitHub#725] [Patch by Naoya Murakami]

  • [math_abs] Supported math_abs() function to calculate absolute value. [GitHub#721]

  • Supported to make grn_default_logger_set_path() and grn_default_query_logger_set_path() thread safe.

  • [windows] Updated bundled pcre library to 8.41.

  • [normalize] Improved not to output redundant empty string "" on error. [GitHub#730]

  • [functions/time] Supported to show error message when division by zero happens. [GitHub#733] [Patch by Naoya Murakami]

  • [windows] Changed to map ERROR_NO_SYSTEM_RESOURCES to GRN_RESOURCE_TEMPORARILY_UNAVAILABLE. In the previous versions, it returns rc=-1 as a result code. It is not helpful to investigate what actually happened. With this fix, it returns rc=-12.

  • [functions/min][functions/max] Supported vector column. Now you don’t need to care whether the column is a scalar column or a vector column. [GitHub#735] [Patch by Naoya Murakami]

  • [dump] Supported --sort_hash_table option to sort by _key for hash table. Specify --sort_hash_table yes to use it.

  • [between] Supported to specify index column. [GitHub#740] [Patch by Naoya Murakami]

  • [load] Supported Apache Arrow 0.5.0 or later.

  • [How to analyze error messages] Added howto article to analyze error message in Groonga.

  • [Debian GNU/Linux] Updated required package list to build from source.

  • [Ubuntu] Dropped Ubuntu 16.10 (Yakkety Yak) support. It reached EOL on July 20, 2017.

Fixes#

  • Fixed to construct correct fulltext indexes against vector column whose type belongs to text family (ShortText and so on). This fix resolves the problem that fulltext search doesn’t work well against text vector column after updating indexes. [GitHub#494]

  • [thread_limit] Fixed a bug that deadlock occurs when thread_limit?max=1 is requested at once.

  • [groonga-httpd] Fixed a mismatch path of pid file between default one and restart command assumed. This mismatch blocked restarting groonga-httpd. [GitHub#743] [Reported by sozaki]

Thanks#

  • Naoya Murakami

Release 7.0.4 - 2017-06-29#

Improvements#

  • Added physical create/delete operation logs to identify problem for troubleshooting. [GitHub#700,#701]

  • [in_records] Improved performance for fixed sized column. It may reduce 50% execution time.

  • [grndb] Added --log-path option. [GitHub#702,#703]

  • [grndb] Added --log-level option. [GitHub#706,#708]

  • Added the following API

    • grn_operator_to_exec_func()

    • grn_obj_is_corrupt()

  • Improved performance for “FIXED_SIZE_COLUMN OP CONSTANT”. Supported operators are: ==, !=, <, >, <= and >=.

  • Improved performance for “COLUMN OP VALUE && COLUMN OP VALUE && …”.

  • [grndb] Supported corrupted object detection with grndb check.

  • [io_flush] Supported --only_opened option which enables to flush only opened database objects.

  • [grndb] Supported to detect/delete orphan “inspect” object. The orphaned “inspect” object is left behind because the command was renamed from inspect to object_inspect.

Fixes#

  • [rpm][centos] Fixed unexpected macro expansion problem with customized build. This bug only occurs when rebuilding Groonga SRPM with customized additional_configure_options parameter in spec file.

  • Fixed missing null check for grn_table_setoperation(). There is a possibility of crash bug when indexes are broken. [GitHub#699]

Thanks#

Release 7.0.3 - 2017-05-29#

Improvements#

  • [select] Add document about Full text search with specific index name.

  • [index] Supported to log a warning message that shows which record causes posting list overflow.

  • [load][dump] Supported Apache Arrow. [GitHub#691]

  • [cmake] Supported linking lz4 in embedded static library build. [Original patch by Sergei Golubchik]

  • [delete] Supported to cancel.

  • [httpd] Updated bundled nginx to 1.13.0

  • Exported the following API

    • grn_plugin_proc_get_caller()

  • Added index column related function and selector.

    • Added new selector: index_column_df_ratio_between()

    • Added new function: index_column_df_ratio()

Fixes#

  • [delete] Fixed a bug that error isn’t cleared correctly. It affects the following deletions and causes unexpected behavior.

  • [windows] Fixed a bug that IO version is not detected correctly when the file is opened with O_CREAT flag.

  • [vector_slice] Fixed a bug that non 4 bytes vector columns can’t slice. [GitHub#695] [Patch by Naoya Murakami]

  • Fixed a bug that non 4 bytes fixed vector column can’t sequential match by specifying index of vector. [GitHub#696] [Patch by Naoya Murakami]

  • [logical_select] Fixed a bug that “argument out of range” occurs when setting last day of month to the min. [GitHub#698]

Thanks#

  • Sergei Golubchik

  • Naoya Murakami

Release 7.0.2 - 2017-04-29#

Improvements#

Fixes#

  • [logical_select] Fixed a bug that wrong cache is used. This bug occurred when dynamic column parameter is used.

  • [logical_select] Fixed a bug that dynamic columns aren’t created. It occurred when no records matched.

  • [reindex] Fixed a bug that data is lost by reindex. [GitHub#646]

  • [httpd] Fixed a bug that response of quit and shutdown is broken JSON when worker is running as another user. [GitHub ranguba/groonga-client#12]

Release 7.0.1 - 2017-03-29#

Improvements#

  • Exported the following API

    • grn_ii_cursor_next_pos()

    • grn_table_apply_expr()

    • grn_obj_is_data_column()

    • grn_obj_is_expr()

    • grn_obj_is_scalar_column()

  • [dump] Supported to dump weight reference vector.

  • [load] Supported to load array<object> style weight vector column. The example of array<object> style is: [{"key1": weight1}, {"key2": weight2}].

  • Supported to search !(XXX OPERATOR VALUE) by index. Supported operator is not only > but also >=, <, <=, == and !=.

  • Supported index search for “!(column == CONSTANT)”. The example in this case is: !(column == 29) and so on.

  • Supported more “!” optimization in the following patterns.

    • !(column @ "X") && (column @ "Y")

    • (column @ "Y") && !(column @ "X")

    • (column @ "Y") &! !(column @ "X")

  • Supported to search XXX || !(column @ "xxx") by index.

  • [dump] Changed to use '{"x": 1, "y": 2}' style for not referenced weight vector. This change doesn’t affect old Groonga because it already supports this style.

  • [experimental] Supported GRN_ORDER_BY_ESTIMATED_SIZE_ENABLE environment variable. This variable controls whether query optimization which is based on estimated size is applied or not. This feature is disabled by default. Set GRN_ORDER_BY_ESTIMATED_SIZE_ENABLE=yes if you want to try it.

  • [select] Added query log for columns, drilldown evaluation.

  • [select] Changed query log format for drilldown. This is backward incompatible change, but it only affects users who convert query log by own programs.

  • [table_remove] Reduced temporary memory usage. It’s enabled when the number of max threads is 0.

  • [select] columns[LABEL](N) is used for query log format instead of columns(N)[LABEL].

  • [Query expansion] Updated example to use vector column because it is recommended way. [Reported by Gurunavi, Inc]

  • Supported to detect canceled request while locking. It fixes the problem that request_cancel is ignored unexpectedly while locking.

  • [logical_select] Supported initial and filtered stage dynamic columns. The examples are: --columns[LABEL].stage initial or --columns[LABEL].stage filtered.

  • [logical_select] Supported match_columns, query and drilldown_filter option.

  • [highlight_html] Supported similar search.

  • [logical_select] Supported initial and stage dynamic columns in labeled drilldown. The example is: --drilldowns[LABEL].stage initial.

  • [logical_select] Supported window function in dynamic column.

  • [select] Added documentation about dynamic columns.

  • [Window function] Added section about window functions.

  • [CentOS] Dropped CentOS 5 support because of EOL.

  • [httpd] Updated bundled nginx to 1.11.12

  • Supported to disable AND match optimization by environment variable. You can disable this feature by GRN_TABLE_SELECT_AND_MIN_SKIP_ENABLE=no. This feature is enabled by default.

  • [vector_new] Added a new function to create a new vector.

  • [select] Added documentation about drilldown_filter.
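
The two weight vector styles mentioned in the [load] and [dump] items above, array<object> style such as [{"key1": weight1}, {"key2": weight2}] and object style such as {"x": 1, "y": 2}, carry the same information. The following is a minimal sketch of converting between them on the client side. The helper names are made up for this example and are not Groonga APIs.

```python
import json


def object_to_array_style(weights: dict) -> list:
    """{"x": 1, "y": 2} -> [{"x": 1}, {"y": 2}]"""
    return [{key: weight} for key, weight in weights.items()]


def array_to_object_style(elements: list) -> dict:
    """[{"x": 1}, {"y": 2}] -> {"x": 1, "y": 2}"""
    merged = {}
    for element in elements:
        merged.update(element)
    return merged


object_style = {"x": 1, "y": 2}
array_style = object_to_array_style(object_style)
print(json.dumps(array_style))
print(json.dumps(array_to_object_style(array_style)))
```

Either form can then be embedded as the column value in a load request body.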

Fixes#

  • [lock_clear] Fixed a crash bug against temporary database.

  • Fixed a problem that dynamically updated index size was increased for natural language since Groonga 6.1.4.

  • [select] Fixed a bug that “A && B.C @ X” may not return records that should be matched.

  • Fixed a conflict with grn_io_flush() and grn_io_expire(). Without this change, if io_flush and load command are executed simultaneously in specific timing, it causes a crash bug by access violation.

  • [logical_table_remove] Fixed a crash bug when the max number of threads is 1.

Thanks#

  • Gurunavi, Inc.

Release 7.0.0 - 2017-02-09#

Improvements#

  • [in_values] Supported sequential search for reference vector column. [Patch by Naoya Murakami] [GitHub#629]

  • [select] Changed to report error instead of ignoring on invalid drilldown[LABEL].sort_keys.

  • [select] Removed needless metadata updates on DB. It reduces the case that database lock remains even though select command is executed. [Reported by aomi-n]

  • [lock_clear] Changed to clear metadata lock by lock_clear against DB.

  • [CentOS] Enabled EPEL by default to install Groonga on Amazon Linux.

  • [query] Supported “@X” style in script syntax for prefix(“@^”), suffix(“@$”), regexp(“@~”) search.

  • [query] Added documentation about available list of mode. The default mode is MATCH (“@”) mode which executes full text search.

  • [rpm][centos] Supported groonga-token-filter-stem package which provides stemming feature by TokenFilterStem token filter on CentOS 7. [GitHub#633] [Reported by Tim Bellefleur]

  • [window_record_number] Marked record_number as deprecated. Use window_record_number instead. record_number is still available for backward compatibility.

  • [window_sum] Added window_sum window function. It behaves similarly to the window function sum() in PostgreSQL.

  • Supported to construct offline indexing with in-memory (temporary) TABLE_DAT_KEY table. [GitHub#623] [Reported by Naoya Murakami]

  • [onigmo] Updated bundled Onigmo to 6.1.1.

  • Supported columns[LABEL].window.group_keys. It’s used to apply window function for every group.

  • [load] Supported to report error on invalid key. It enables you to detect mismatch type of key.

  • [load] Supported --output_errors yes option. If you specify “yes”, you can get errors for each load failed record. Note that this feature requires command version 3.

  • [load] Improved error message on table key cast failure. Instead of only “cast failed”, the type of the table key and the target type of the table key are also contained in the error message.

  • [httpd] Updated bundled nginx to 1.11.9.
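
To give an intuition for the window_sum and columns[LABEL].window.group_keys items above, here is a conceptual model of a running sum over sorted records, optionally restarted for every group, in the spirit of a PostgreSQL-style windowed sum(). It is a sketch under that assumption, not Groonga's implementation. The Python function, its parameters and the sample records are hypothetical.

```python
def window_sum(records, value_key, sort_key, group_key=None):
    """Return records with a running "sum" of value_key.

    Records are processed in sort_key order. When group_key is given,
    a separate running total is kept for each group.
    """
    totals = {}
    output = []
    for record in sorted(records, key=lambda r: r[sort_key]):
        group = record[group_key] if group_key else None
        totals[group] = totals.get(group, 0) + record[value_key]
        output.append({**record, "sum": totals[group]})
    return output


records = [
    {"_key": "a", "tag": "x", "n": 1},
    {"_key": "b", "tag": "y", "n": 10},
    {"_key": "c", "tag": "x", "n": 2},
]

print([r["sum"] for r in window_sum(records, "n", "_key")])
print([r["sum"] for r in window_sum(records, "n", "_key", group_key="tag")])
```

Without a group key the total accumulates across all records. With a group key each group gets its own running total.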

Fixes#

  • Fixed a bug that nonexistent sort keys for drilldowns[LABEL] or slices[LABEL] causes invalid JSON parse error. [Patch by Naoya Murakami] [GitHub#627]

  • Fixed a bug that access to nonexistent sub records for group causes a crash. For example, this bug affects the case when you use drilldowns[LABEL].sort_keys _sum without specifying calc_types. [Patch by Naoya Murakami] [GitHub#625]

  • Fixed a crash bug when tokenizer has an error. It’s caused when tokenizer and token filter are registered and tokenizer has an error.

  • [window_record_number] Fixed a bug that arguments for window function are not correctly passed. [GitHub#634][Patch by Naoya Murakami]

Thanks#

  • Naoya Murakami

  • aomi-n