BloGroonga

2020-03-12

PGroonga (fast full text search module for PostgreSQL) 2.2.5 has been released

PGroonga 2.2.5 has been released! PGroonga makes PostgreSQL a fast full text search platform for all languages.

If you are a new user, see also About PGroonga.

Note

PGroonga 2.2.3 and 2.2.4 have the problems below, so please don't use them.

  • PGroonga 2.2.3

    • Installing the packages for Debian and Ubuntu fails with this version because the packages fail to resolve their dependencies.
  • PGroonga 2.2.4

    • Upgrading with ALTER EXTENSION pgroonga UPDATE; fails with this version.

Highlight

Here are the highlights since PGroonga 2.2.2:

  • Added support for CREATE TABLE, CREATE INDEX, and TRUNCATE in the same transaction.

  • Adjusted the estimated search cost of PGroonga's index so that PostgreSQL selects it more readily when searching records.

    • In particular, PostgreSQL now chooses PGroonga's index when a record includes many identical tokens.
  • Fixed a bug that upgrading to PGroonga 2.2.4 failed.

  • Fixed a bug that a query without a conditional expression failed if the table had only a PGroonga jsonb index.

  • Fixed a bug that a sequential scan raised an error when Groonga library ( libgroonga ) version 10.0.0 or later was used.

Added support for CREATE TABLE, CREATE INDEX, and TRUNCATE in the same transaction.

In previous versions, TRUNCATE failed when we created a PGroonga index and executed TRUNCATE in the same transaction.

For example, TRUNCATE failed in the following case.

BEGIN TRANSACTION;
CREATE EXTENSION IF NOT EXISTS pgroonga;
CREATE TABLE test (
    id SERIAL PRIMARY KEY,
    content text
);
CREATE INDEX test_id ON test USING pgroonga (content);
TRUNCATE TABLE test CASCADE;
COMMIT;

Fixed a bug that a query without a conditional expression failed if the table had only a PGroonga jsonb index.

For example, SELECT COUNT(*) failed in the following case.

CREATE TABLE logs (
  record jsonb
);
CREATE INDEX pgroonga_index ON logs
  USING pgroonga (record pgroonga_jsonb_ops_v2);
INSERT INTO logs VALUES ('{}');
SET enable_seqscan = off;
SELECT count(*) FROM logs;

Fixed a bug that a sequential scan raised an error when Groonga library ( libgroonga ) version 10.0.0 or later was used.

This problem didn't occur if we installed PGroonga from a package. It occurred only if we used PGroonga and Groonga built from source.

For example, SELECT failed in the following case.

CREATE TABLE memos (
  content varchar(256)
);

INSERT INTO memos VALUES ('Green apple');
INSERT INTO memos VALUES ('Apple');

CREATE INDEX pgrn_index ON memos
 USING pgroonga (content pgroonga_varchar_full_text_search_ops_v2);

SET enable_seqscan = on;
SET enable_indexscan = off;
SET enable_bitmapscan = off;

SELECT content, pgroonga_score(tableoid, ctid)
  FROM memos
 WHERE content &@~ 'Apple';

We can confirm the version of libgroonga with the following query.

SHOW pgroonga.libgroonga_version;

 pgroonga.libgroonga_version 
-----------------------------
 10.0.0
(1 row)
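
To check from a script whether an installation reports libgroonga 10.0.0 or later, the version string returned by the query can be compared numerically. The following is a minimal sketch, not part of PGroonga: the helper names are hypothetical, and it assumes the value starts with a plain MAJOR.MINOR.MICRO number like the one shown above (any suffix after that is ignored).

```python
import re


def parse_version(text):
    """Turn a dotted version string such as '10.0.0' into a tuple of ints.

    Only the leading MAJOR.MINOR.MICRO part is used; anything after it
    (for example a build suffix) is ignored.
    """
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_libgroonga_10_or_later(text):
    """Return True if the reported libgroonga version is 10.0.0 or later."""
    return parse_version(text) >= (10, 0, 0)


# The first value is the one shown by SHOW pgroonga.libgroonga_version; above.
print(is_libgroonga_10_or_later("10.0.0"))
print(is_libgroonga_10_or_later("9.1.2"))
```

Comparing tuples of integers avoids the string-comparison trap where "9.1.2" sorts after "10.0.0".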

How to upgrade

This version is compatible with previous versions. You can upgrade by following the steps in "Compatible case" in the Upgrade document.

Conclusion

Try PGroonga when you want to perform fast full text search against all languages on PostgreSQL!

2020-01-29

Groonga 9.1.2 has been released

Groonga 9.1.2 has been released!

How to install: Install

Changes

Here are important changes in this release:

  • [tools] Added a script for copying only the files of specified tables or columns.

    • The script is named copy-related-files.rb.
    • This script is useful if we want to extract specific tables or columns from a huge database.
    • The files related to specific tables or columns may be needed to reproduce a fault.
    • If it is difficult to provide the whole database, we can extract the files related to the target tables or columns with this tool.
    • Use this tool as follows.
    copy-related-files.rb \
      --destination=db.copy \
      --target=column.index \
      db
    
    • We specify the destination directory with --destination.
    • We specify a table or a column to copy with --target.
      • We can specify this option multiple times to copy multiple tables or columns.
  • shutdown Now accepts /d/shutdown?mode=immediate immediately even when all threads are in use.

    • This feature is available only on the Groonga HTTP server.
  • Unused objects are freed immediately when GRN_ENABLE_REFERENCE_COUNT=yes is set.

    • This feature is experimental. It degrades performance.
    • If we load data that spans many tables, we can expect this feature to keep memory usage down.
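
Because --target can be repeated, a wrapper script may want to build the copy-related-files.rb command line from a list of targets. The following is a minimal sketch under stated assumptions: the helper name is hypothetical, "Memos" and "Terms.index" are made-up table and column names, and the code only assembles and prints the command without running it.

```python
import shlex


def build_copy_command(database, destination, targets):
    """Build the argument list for copy-related-files.rb.

    Each entry in `targets` becomes its own --target option, and the
    database path comes last, as in the usage example above.
    """
    command = ["copy-related-files.rb", f"--destination={destination}"]
    for target in targets:
        command.append(f"--target={target}")
    command.append(database)
    return command


# "Memos" and "Terms.index" are hypothetical table and column names.
command = build_copy_command("db", "db.copy", ["Memos", "Terms.index"])
print(shlex.join(command))
```

The resulting list can be handed to subprocess.run, assuming the script is reachable on the PATH of the machine that holds the database.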

Conclusion

See Release 9.1.2 2020-01-29 for detailed changes since 9.1.1.

Let's search by Groonga!

2020-01-07

Groonga 9.1.1 has been released

Groonga 9.1.1 has been released!

How to install: Install

Changes

Here are important changes in this release:

  • load Added support for Apache Arrow format data.

    • If we use Apache Arrow format data, we may reduce the parse cost, so data might load faster than with other formats.
    • With this improvement, Groonga can also directly accept Apache Arrow format data from other data analysis systems.
    • However, the Apache Arrow format is available only in the HTTP interface. We can't use it in the command line interface.
  • load Improved error messages.

    • The response of the load command now also includes an error message.
    • If loading data fails, Groonga outputs the error details of the load command as follows.
    table_create Memos TABLE_NO_KEY
    [[0,0.0,0.0],true]
    column_create Memos content COLUMN_SCALAR Text
    [[0,0.0,0.0],true]
    load --table Memos
    [
    {"content": "Groonga is fast"}
    ]
    [[0,0.0,0.0],1]
    load --table Memos
    [
    {"_id": "invalid", "content": "Mroonga is fast"}
    ]
    [[[-22,0.0,0.0],"<_id>: failed to cast to <UInt32>: <\"invalid\">"],0]
    
    • If we want to output multiple error messages, we use the output_errors option with command_version 3 as follows.
    table_create Memos TABLE_NO_KEY
    [[0,0.0,0.0],true]
    column_create Memos content COLUMN_SCALAR Text
    [[0,0.0,0.0],true]
    load --table Memos --command_version 3 --output_errors yes
    [
    {"_id": "invalid", "content": "Groonga is fast"},
    {"_id": "invalid", "content": "Mroonga is fast"}
    ]
    {
      "header":{
        "return_code":-22,
        "start_time":1576717803.408522,
        "elapsed_time":0.8798723220825195,
        "error":{
          "message":"<_id>: failed to cast to <UInt32>: <\"invalid\">",
          "function":"parse_id_value",
          "file":"load.c","line":394
        }
      },
      "body":{
        "n_loaded_records":0,
        "errors":[
          {
            "return_code":-22,
            "message":"<_id>: failed to cast to <UInt32>: <\"invalid\">"
          },
          {
            "return_code":-22,
            "message":"<_id>: failed to cast to <UInt32>: <\"invalid\">"
          }
        ]
      }
    }
    
  • [httpd] Updated bundled nginx to 1.17.7.

  • Groonga HTTP server Added support for sending command parameters in the body of an HTTP request.

    • We must set Content-Type to application/x-www-form-urlencoded in this case.
    • If we use an HTTP POST request, we can specify multiple parameters in the HTTP request body as follows.
    POST /d/status HTTP/1.1
    Host: 127.0.0.1:10041
    Content-Length: 35
    Content-Type: application/x-www-form-urlencoded
    
    command_version=3&output_pretty=yes
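
On the client side, the two changes above (the errors array in the command_version 3 load response and parameters sent in a POST body) can be handled with the Python standard library alone. The following is a minimal sketch, not an official client: the helper names are hypothetical, nothing is sent over the network, and the sample response is a trimmed copy of the one shown earlier.

```python
import json
from urllib.parse import urlencode


def build_form_body(parameters):
    """Encode command parameters as an application/x-www-form-urlencoded body."""
    return urlencode(parameters).encode("ascii")


def collect_load_errors(response_text):
    """Return (n_loaded_records, error messages) from a command_version 3 load response."""
    response = json.loads(response_text)
    body = response["body"]
    messages = [error["message"] for error in body.get("errors", [])]
    return body["n_loaded_records"], messages


# Same parameters as the POST /d/status example above.
body = build_form_body({"command_version": 3, "output_pretty": "yes"})
headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "Content-Length": str(len(body)),
}
print(body.decode("ascii"), headers["Content-Length"])

# Trimmed copy of the output_errors response shown earlier.
cast_error = '<_id>: failed to cast to <UInt32>: <"invalid">'
sample = json.dumps({
    "header": {"return_code": -22},
    "body": {
        "n_loaded_records": 0,
        "errors": [
            {"return_code": -22, "message": cast_error},
            {"return_code": -22, "message": cast_error},
        ],
    },
})
n_loaded, messages = collect_load_errors(sample)
print(n_loaded, len(messages))
```

The encoded body is 35 bytes long, which matches the Content-Length header in the request above.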
    

Conclusion

See Release 9.1.1 2020-01-07 for detailed changes since 9.1.0.

Let's search by Groonga!

2019-11-29

Groonga 9.1.0 has been released

Groonga 9.1.0 has been released!

How to install: Install

Changes

Here are important changes in this release:

  • Improved the performance of the "&&" operation.

    • For example, the performance of conditional expressions such as the following is improved.

    • ( A || B ) && ( C || D ) && ( E || F) ...

    • This optimization is especially effective when a condition that hits many records and one that hits few records are mixed.

  • TokenMecab Added a new option use_base_form.

    • We can search using the base form of a token with this option.

    • For example, if we search for "支える" with this option, "支えた" also hits.

  • Fixed a bug that performance decreased when the accessor was an index.

    • For example, it occurred with queries that include the following conditions.

      • accessor @ query

      • accessor == query

  • Fixed a bug that the estimated size of a search result overflowed when the buffer was big enough.

  • Added missing tools.

    • index-column-diff-all.sh and object-inspect-all.sh were not bundled in the previous version.

Conclusion

See Release 9.1.0 2019-11-29 for detailed changes since 9.0.9.

Let's search by Groonga!

2019-10-30

Groonga 9.0.9 has been released

Groonga 9.0.9 has been released!

How to install: Install

Notice

Performance may decrease from this version. If performance is worse than before, please report it to us with reproducible steps.

Changes

Here are important changes in this release:

  • log Improved the query log to output the sending time of the response.

  • status Added the number of current jobs to the status command response.

  • groonga-httpd Added support for $request_time in log.

    • In the previous version, even if we specified the $request_time in the log_format directive, it was always 0.00.
    • If we specify $request_time, groonga-httpd outputs the correct time from this version.
  • groonga-httpd Documented how to set $request_time.

  • Supported Ubuntu 19.10 (Eoan Ermine)

  • Supported CentOS 8 (experimental)

    • The package for CentOS 8 can't use some features (e.g. we can't use TokenMecab and can't cast a JSON string to an int32 vector) because some development packages are missing.
  • [tools] Added a script for executing the index_column_diff command easily.

    • The script is named index-column-diff-all.sh.
    • This script extracts index columns from Groonga's database and executes the index_column_diff command on them.
  • [tools] Added a script for executing object_inspect against all objects.

    • The script is named object-inspect-all.sh.
  • Fixed a bug that Groonga crashed when we specified a value as the first argument of between.

Fixed a bug that Groonga crashed when we specified a value as the first argument of between.

Groonga crashed in the following case.

table_create Users TABLE_HASH_KEY ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Users age COLUMN_SCALAR Int32
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Ages TABLE_HASH_KEY Int32
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Ages user_age COLUMN_INDEX Users age
# [[0, 1337566253.89858, 0.000355720520019531], true]
load --table Users
[
{"_key": "Alice",  "age": 12},
{"_key": "Bob",    "age": 13},
{"_key": "Calros", "age": 15},
{"_key": "Dave",   "age": 16},
{"_key": "Eric",   "age": 20},
{"_key": "Frank",  "age": 21}
]
# [[0, 1337566253.89858, 0.000355720520019531], 6]

select Users --filter 'between(14, 13, "include", 16, "include")'
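
For reference, the filter above passes the literal value 14 as the first argument and asks whether it lies between 13 and 16 with both borders included. The following is a minimal Python emulation of that range check, written for illustration only: it is not Groonga code, and the function is a hypothetical stand-in that mirrors the argument order of between.

```python
def between(value, min_value, min_border, max_value, max_border):
    """Emulate the range check expressed by between().

    A border of "include" makes that end of the range inclusive;
    "exclude" makes it exclusive.
    """
    for border in (min_border, max_border):
        if border not in ("include", "exclude"):
            raise ValueError(f"unknown border: {border!r}")
    above_min = value >= min_value if min_border == "include" else value > min_value
    below_max = value <= max_value if max_border == "include" else value < max_value
    return above_min and below_max


# Same arguments as the filter in the select command above.
print(between(14, 13, "include", 16, "include"))
# An excluded border rejects a value that sits exactly on it.
print(between(13, 13, "exclude", 16, "include"))
```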

Conclusion

See Release 9.0.9 2019-10-30 for detailed changes since 9.0.8.

Let's search by Groonga!