BloGroonga

2021-12-29

Groonga 11.1.1 has been released

Groonga 11.1.1 has been released!

How to install: Install

Changes

Here are important changes in this release:

Improvements

  • [select] Added support for near phrase product search.

    This feature is a shortcut for '*NP"..." OR *NP"..." OR ...'. For example, we can use *NPP instead of an expression that executes multiple *NP with query as below.

    query ("title * 10 || content",
           "*NP"a 1 x" OR
            *NP"a 1 y" OR
            *NP"a 1 z" OR
            *NP"a 2 x" OR
            *NP"a 2 y" OR
            *NP"a 2 z" OR
            *NP"a 3 x" OR
            *NP"a 3 y" OR
            *NP"a 3 z" OR
            *NP"b 1 x" OR
            *NP"b 1 y" OR
            *NP"b 1 z" OR
            *NP"b 2 x" OR
            *NP"b 2 y" OR
            *NP"b 2 z" OR
            *NP"b 3 x" OR
            *NP"b 3 y" OR
            *NP"b 3 z"")
    

    With this feature, we can write the above expression as *NPP"(a b) (1 2 3) (x y z)". In addition, *NPP"(a b) (1 2 3) (x y z)" is faster than '*NP"..." OR *NP"..." OR ...'.

    query ("title * 10 || content",
           "*NPP"(a b) (1 2 3) (x y z)"")
    

    We implemented this feature to improve the performance of near phrase searches like '*NP"..." OR *NP"..." OR ...'.

  • [select] Added support for ordered near phrase product search.

    This feature is a shortcut for '*ONP"..." OR *ONP"..." OR ...'. For example, we can use *ONPP instead of an expression that executes multiple *ONP with query as below.

    query ("title * 10 || content",
            "*ONP"a 1 x" OR
             *ONP"a 1 y" OR
             *ONP"a 1 z" OR
             *ONP"a 2 x" OR
             *ONP"a 2 y" OR
             *ONP"a 2 z" OR
             *ONP"a 3 x" OR
             *ONP"a 3 y" OR
             *ONP"a 3 z" OR
             *ONP"b 1 x" OR
             *ONP"b 1 y" OR
             *ONP"b 1 z" OR
             *ONP"b 2 x" OR
             *ONP"b 2 y" OR
             *ONP"b 2 z" OR
             *ONP"b 3 x" OR
             *ONP"b 3 y" OR
             *ONP"b 3 z"")
    

    With this feature, we can write the above expression as *ONPP"(a b) (1 2 3) (x y z)". In addition, *ONPP"(a b) (1 2 3) (x y z)" is faster than '*ONP"..." OR *ONP"..." OR ...'.

    query ("title * 10 || content",
           "*ONPP"(a b) (1 2 3) (x y z)"")
    

    We implemented this feature to improve the performance of ordered near phrase searches like '*ONP"..." OR *ONP"..." OR ...'.

  • [request_cancel] Groonga now detects request_cancel more easily while executing a search.

    This is because we added more return code checks to detect request_cancel.

  • [thread_dump] Added a new command thread_dump.

    Currently, this command works only on Windows.

    When we run this command, Groonga writes backtraces of all threads at that time to the log at NOTICE level.

    This feature is useful for investigating problems such as Groonga not returning a response.

  • [CentOS] Dropped support for CentOS 8.

    This is because CentOS 8 will reach EOL on 2021-12-31.
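
The relationship between the product operators (*NPP and *ONPP) and the long OR forms shown above can be sketched in Python. This is only an illustration of the query strings, not Groonga's implementation, and it does not send anything to Groonga. The helper name expand_product is hypothetical and introduced here for explanation.

```python
from itertools import product


def expand_product(operator, groups):
    """Expand word groups into the long OR form that *NPP / *ONPP abbreviate.

    `expand_product` is a hypothetical helper for illustration only;
    it builds query strings and does not talk to Groonga.
    """
    # One phrase per combination, taking one word from each group in order.
    phrases = [" ".join(words) for words in product(*groups)]
    return " OR ".join(f'{operator}"{phrase}"' for phrase in phrases)


groups = [["a", "b"], ["1", "2", "3"], ["x", "y", "z"]]

# Long form: one *NP expression per combination, joined by OR.
long_form = expand_product("*NP", groups)

# Short form: each group is written once, in parentheses.
short_form = '*NPP"' + " ".join("(" + " ".join(g) + ")" for g in groups) + '"'

print(long_form.count("*NP"), "expressions in the long form")
print(short_form)
```

The long form grows with the product of the group sizes (2 * 3 * 3 = 18 here), while the short form lists each word only once.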

Fixes

  • Fixed a bug that we couldn't remove an index column created with invalid parameters.

    • For example, we couldn't remove a table after creating an invalid index column with column_create as below.

      table_create Statuses TABLE_NO_KEY
      column_create Statuses start_time COLUMN_SCALAR UInt16
      column_create Statuses end_time COLUMN_SCALAR UInt16
      
      table_create Times TABLE_PAT_KEY UInt16
      column_create Times statuses COLUMN_INDEX Statuses start_time,end_time
      # [
          [
            -22,
            1639037503.16114,
            0.003981828689575195,
            "grn_obj_set_info(): GRN_INFO_SOURCE: multi column index must be created with WITH_SECTION flag: <Times.statuses>",
            [
              [
                "grn_obj_set_info_source_validate",
                "../../groonga/lib/db.c",
                9605
              ],
              [
                "/tmp/d.grn",
                6,
                "column_create Times statuses COLUMN_INDEX Statuses start_time,end_time"
              ]
            ]
          ],
          false
        ]
      table_remove Times
      # [
          [
            -22,
            1639037503.16515,
            0.0005414485931396484,
            "[object][remove] column is broken: <Times.statuses>",
            [
              [
                "remove_columns",
                "../../groonga/lib/db.c",
                10649
              ],
              [
                "/tmp/d.grn",
                8,
                "table_remove Times"
              ]
            ]
          ],
          false
        ]
      

Known Issues

  • Currently, Groonga has a bug where data may be corrupted when we execute many additions, deletions, and updates on a vector column.

  • *< and *> are only valid when we use query() on the right side of the filter condition. If we specify them as below, *< and *> work as &&.

    • 'content @ "Groonga" *< content @ "Mroonga"'
  • Groonga may not return records that should match because of GRN_II_CURSOR_SET_MIN_ENABLE.

Conclusion

Please refer to the following news for more details.

News Release 11.1.1

Let's search by Groonga!

2021-11-29

Groonga 11.1.0 has been released

Groonga 11.1.0 has been released!

How to install: Install

Changes

Here are important changes in this release:

Improvements

  • [load] Added support for ISO 8601 time format.

    With this change, load supports the following formats.

    • YYYY-MM-ddThh:mm:ss.sZ
    • YYYY-MM-ddThh:mm:ss.s+10:00
    • YYYY-MM-ddThh:mm:ss.s-10:00

    We can also use the t and z characters instead of T and Z in this syntax. We can also use the / character instead of -. However, note that the / variant is not an ISO 8601 format; it is supported for compatibility.

  • [select] Added a new query_flags value, DISABLE_PREFIX_SEARCH.

    With DISABLE_PREFIX_SEARCH, we can use the prefix search operators ^ and * as search keywords as below.

    This feature is useful if we want to search for documents that include ^ and *.

      table_create Users TABLE_PAT_KEY ShortText
    
      load --table Users
      [
      {"_key": "alice"},
      {"_key": "alan"},
      {"_key": "ba*"}
      ]
    
      select Users \
        --match_columns "_key" \
        --query "a*" \
        --query_flags "DISABLE_PREFIX_SEARCH"
      [[0,0.0,0.0],[[[1],[["_id","UInt32"],["_key","ShortText"]],[3,"ba*"]]]]
    
      table_create Users TABLE_PAT_KEY ShortText
    
      load --table Users
      [
      {"_key": "alice"},
      {"_key": "alan"},
      {"_key": "^a"}
      ]
    
      select Users \
        --query "_key:^a" \
        --query_flags "ALLOW_COLUMN|DISABLE_PREFIX_SEARCH"
      [[0,0.0,0.0],[[[1],[["_id","UInt32"],["_key","ShortText"]],[3,"^a"]]]]
    
  • [select] Added a new query_flags value, DISABLE_AND_NOT.

    With DISABLE_AND_NOT, we can use the AND NOT operator - as a search keyword as below.

    This feature is useful if we want to search for documents that include -.

      table_create Users TABLE_PAT_KEY ShortText
    
      load --table Users
      [
      {"_key": "alice"},
      {"_key": "bob"},
      {"_key": "cab-"}
      ]
    
      select Users \
        --match_columns "_key" \
        --query "b - a" \
        --query_flags "DISABLE_AND_NOT"
      [[0,0.0,0.0],[[[1],[["_id","UInt32"],["_key","ShortText"]],[3,"cab-"]]]]
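
The three ISO 8601 forms listed for load in this release (a Z suffix, a +10:00 offset, and a -10:00 offset) can be illustrated outside Groonga. The following Python sketch only shows what the notations mean. It uses Python's standard datetime module, not Groonga's parser, and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timezone

# Sample values in the three forms listed for load (made-up timestamps).
samples = [
    "2021-11-29T12:34:56.5Z",
    "2021-11-29T22:34:56.5+10:00",
    "2021-11-29T02:34:56.5-10:00",
]

# %z accepts both "Z" and "+hh:mm" style offsets (Python 3.7+).
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def to_utc(text):
    """Parse an ISO 8601 timestamp with an offset and convert it to UTC."""
    return datetime.strptime(text, ISO_FORMAT).astimezone(timezone.utc)


for text in samples:
    print(text, "->", to_utc(text).isoformat())
```

All three sample strings denote the same instant, which is why they convert to the same UTC value.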
    

Fixes

  • [The browser based administration tool] Fixed a bug where a search query entered in non-administration mode was sent even if we checked the checkbox for the administration mode of a record list.

Known Issues

  • Currently, Groonga has a bug where data may be corrupted when we execute many additions, deletions, and updates on a vector column.

  • *< and *> are only valid when we use query() on the right side of the filter condition. If we specify them as below, *< and *> work as &&.

    • 'content @ "Groonga" *< content @ "Mroonga"'
  • Groonga may not return records that should match because of GRN_II_CURSOR_SET_MIN_ENABLE.

Conclusion

Please refer to the following news for more details.

News Release 11.1.0

Let's search by Groonga!

2021-11-09

PGroonga (fast full text search module for PostgreSQL) 2.3.4 has been released

PGroonga 2.3.4 has been released! PGroonga enables fast full text search for all languages in PostgreSQL.

If you are a new user, see also About PGroonga.

We implemented major features in PGroonga 2.3.3, so this post also announces the features of PGroonga 2.3.3.

Highlight

Here are highlights in PGroonga 2.3.3 and 2.3.4:

  • Added support for PostgreSQL's RLS (Row Level Security).

    Until now, PostgreSQL did not select PGroonga's index for a table with RLS enabled. Since this release, PostgreSQL can select PGroonga's index for such a table.

  • Dropped support for PostgreSQL 9.6.

    This is because PostgreSQL 9.6 reached EOL on 2021-11-11.

  • Added support for applying PGroonga's WAL automatically on the standby server when we use streaming replication.

    Until now, PGroonga's WAL wasn't deleted unless we applied the WAL manually on the standby server. Since this release, the WAL can be applied automatically on the standby server.

    This feature is disabled by default, so we need to enable it to use it. Please refer to the following link about how to use this feature.

  • Added support for AlmaLinux 8.

  • Fixed a crash bug when EXPLAIN ANALYZE is executed with seqscan.

    This bug only occurs in PGroonga 2.3.3.

  • Added crash-safe support. (Experimental feature)

    If we enable this feature, PGroonga automatically recovers its index when PostgreSQL crashes, so we don't need to execute REINDEX manually. This feature is disabled by default, so we need to enable it to use it. Please refer to the following link about how to use this feature.

How to upgrade

This version is compatible with previous versions. You can upgrade by following the steps in "Compatible case" in the Upgrade document.

Conclusion

Try PGroonga when you want to perform fast full text search against all languages on PostgreSQL!

2021-11-05

Groonga 11.0.9 has been released

Groonga 11.0.9 has been released!

How to install: Install

Changes

Here are important changes in this release:

Improvements

  • [snippet] Added a new option delimiter_regexp for detecting snippet delimiters with a regular expression.

    snippet() extracts text around search keywords. We call the text extracted by snippet() a snippet.

    Normally, snippet() returns 200 bytes of text around the search keywords. However, snippet() doesn't take sentence delimiters into account, so a snippet may span multiple sentences.

    The delimiter_regexp option is useful if we want to extract only the text of the sentence that contains the search keywords.

  • [window_rank] Added a new function window_rank().

    • We can calculate a rank that includes gaps for each record. The rank isn't incremented for records that have the same order. For example, if the values of the sort keys are 100, 100, 200, then their ranks are 1, 1, 3. The rank of the last record is 3, not 2, because there are two records with rank 1.

      This is similar to window_record_number(). However, window_record_number() doesn't take gaps into account.

  • [in_values] Added support for auto cast when we search tables.

    For example, if we load UInt32 values into a table whose key type is UInt64, Groonga casts the values to UInt64 automatically when we search the table with in_values(). However, in_values(_key, 10) doesn't work with a UInt64 key table, because 10 is parsed as Int32.

  • [httpd] Updated bundled nginx to 1.21.3.

  • [AlmaLinux] Added support for AlmaLinux 8.

  • [Ubuntu] Added support for Ubuntu 21.10 (Impish Indri).
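
The difference between a rank with gaps, as described for window_rank() above, and plain record numbering can be sketched in Python. This is a minimal illustration of the ranking rule using the 100, 100, 200 example, not Groonga's implementation. The function names are hypothetical, and record_numbers assumes that window_record_number() simply numbers records sequentially.

```python
def rank_with_gaps(sorted_keys):
    """Return ranks for already sorted keys, leaving gaps after ties."""
    ranks = []
    for position, key in enumerate(sorted_keys, start=1):
        if ranks and key == sorted_keys[position - 2]:
            # Same sort key as the previous record: reuse its rank.
            ranks.append(ranks[-1])
        else:
            # New sort key: the rank jumps to the record's position.
            ranks.append(position)
    return ranks


def record_numbers(sorted_keys):
    """Return plain sequential numbers, ignoring ties (assumed behavior)."""
    return list(range(1, len(sorted_keys) + 1))


keys = [100, 100, 200]
print(rank_with_gaps(keys))   # [1, 1, 3]
print(record_numbers(keys))   # [1, 2, 3]
```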

Fixes

  • Fixed a bug where Groonga didn't return a response when an error occurred in a command (e.g. a syntax error in filter).

    • This bug only occurs when we use --output_type apache-arrow.

Known Issues

  • Currently, Groonga has a bug where data may be corrupted when we execute many additions, deletions, and updates on a vector column.

  • [The browser based administration tool] Currently, Groonga has a bug where a search query entered in non-administration mode is sent even if we check the checkbox for the administration mode of a record list.

  • *< and *> are only valid when we use query() on the right side of the filter condition. If we specify them as below, *< and *> work as &&.

    • 'content @ "Groonga" *< content @ "Mroonga"'
  • Groonga may not return records that should match because of GRN_II_CURSOR_SET_MIN_ENABLE.

Conclusion

Please refer to the following news for more details.

News Release 11.0.9

Let's search by Groonga!

2021-10-04

PGroonga (fast full text search module for PostgreSQL) 2.3.2 has been released

PGroonga 2.3.2 has been released! PGroonga enables fast full text search for all languages in PostgreSQL.

If you are a new user, see also About PGroonga.

This release adds support for PostgreSQL 14, which was just released!

Highlight

Here are the highlights in PGroonga 2.3.2:

  • Added support for PostgreSQL 14.

  • Added support for parallel scan.

  • Added support for parallel scan against declarative partitioning.

  • [CREATE INDEX USING PGroonga] Added index_flags_mapping option that can be used to customize index flags for each indexed target.

  • [CREATE INDEX USING PGroonga] Added support for ${table:INDEX_NAME} substitution in normalizers_mapping option.

  • [Ubuntu] Added support for Ubuntu 21.04.

  • [pgroonga_highlight_html function] Fixed a bug where a lexicon might not be updated when we recreate the lexicon.

How to upgrade

This version is compatible with previous versions. You can upgrade by following the steps in "Compatible case" in the Upgrade document.

Announce

Session

This tutorial session is for people who have already used PGroonga. We will introduce how to improve search results by using PGroonga.

Conclusion

Try PGroonga when you want to perform fast full text search against all languages on PostgreSQL!