BloGroonga

2013-03-29

Groonga 3.0.2 has been released

Groonga 3.0.2 has been released!

How to install: Install

There are two topics for this release.

  • Supported sub_filter() function
  • Supported query expander in query() function

Supported sub_filter() function

In this release, sub_filter() function has been supported.

For example, we introduce the suitable case for sub_filter() function.

Here is the sample schema:

  table_create Comment TABLE_PAT_KEY UInt32
  column_create Comment name COLUMN_SCALAR ShortText
  column_create Comment content COLUMN_SCALAR ShortText

  table_create Blog TABLE_PAT_KEY ShortText
  column_create Blog title COLUMN_SCALAR ShortText
  column_create Blog content COLUMN_SCALAR ShortText
  column_create Blog comments COLUMN_VECTOR Comment

  column_create Comment blog_comment_index COLUMN_INDEX Blog comments

  table_create Lexicon TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram
  column_create Lexicon comment_content COLUMN_INDEX|WITH_POSITION Comment content
  column_create Lexicon comment_name COLUMN_INDEX|WITH_POSITION Comment name
  column_create Lexicon blog_content COLUMN_INDEX|WITH_POSITION Blog content

Blog table is related to Comment table. Lexicon table is created for full text search.

Here are the sample data:

  load --table Comment
  [
  {"_key": 1, "name": "A", "content": "groonga"},
  {"_key": 2, "name": "B", "content": "groonga"},
  {"_key": 3, "name": "C", "content": "rroonga"},
  {"_key": 4, "name": "A", "content": "mroonga"},
  ]

  load --table Blog
  [
  {"_key": "groonga's blog", "content": "content of groonga's blog", comments: [1, 2, 3]},
  {"_key": "mroonga's blog", "content": "content of mroonga's blog", comments: [2, 3, 4]},
  {"_key": "rroonga's blog", "content": "content of rroonga's blog", comments: [3]},
  ]

If you want to search blog entries that user "A" commented. Here is the query for this purpose.

  select Comment --output_columns _key,name,content --query "name:@"A""

  [
    [2],
    [
      ["_key","UInt32"],["name","ShortText"],["content","ShortText"]
    ],
    [1,"A","groonga"],
    [4,"A","mroonga"]
  ]

You know User "A" commented 'groonga' and 'mroonga'.

Then, let's search the entries that user "A" commented about 'groonga'. At first, search the entries user "A"'s comment from Blog table.

  select Blog --output_columns _key --filter "comments.name @ "A""

  [
    [
      [2],
      [["_key","ShortText"]],
      ["groonga's blog"],
      ["mroonga's blog"]
    ]
  ]

You know user "A" commented to "groonga's blog" and "mroonga's blog".

Then, search the entries that user "A" commented about 'groonga'. Here is the query which is added extra condition about 'groonga'.

  select Blog --output_columns _key --filter "comments.name @ "A" && comments.content @ "groonga""

  [
    [
      [2],
      [["_key","ShortText"]],
      ["groonga's blog"],
      ["mroonga's blog"]
    ]
  ]

You know that the search results is same as before. User "A" commented to "mroonga's blog", but not mentioned about "groonga".

This is not the intended search results.

What is the problem?

The fact about following query:

--filter "comment.name @ "A" && comment.content @ "groonga"
  • Are there entry matched to comment.name @ "A"? => Yes. there are comment of user "A".
  • Are there entry matched to comment.content @ "groonga"? => Yes. there are entries matched to groonga.

The both conditions above are met, "mroonga's blog" entry is regarded as matched entry.

But, this is not intended results.

We expect only "groonga's blog" is matched.

sub_filter() function is used for this purpose.

You can filter the results by comments context.

Here is the example by using sub_filter() function:

  select Blog --output_columns _key 
    --filter 'sub_filter(comments, "name @ "A" && content @ "groonga"")'

  [
    [
      [1],
      [["_key","ShortText"]],
      ["groonga's blog"],
    ]
  ]

Confirm the results of above query. Now, you can search the Blog entry that user "A" commented about 'groonga'.

Supported query expander in query() function

This release began to support query expander in query() function.

The select command supports query expander and the keywords which has different weight to the column. In 2.1.2 release, multiple query() function had been supported, you can specify the different weight to the column each query.

But, as query() fuction doesn't support query expander functionality, you need to do query expansion by yourself if you want to search as same as --query.

In the previous release, query() function syntax is:

 query("MATCH_COLUMNS", "QUERY")

In this release, query() function accepts 3rd argument:

 query("MATCH_COLUMNS", "QUERY", "QueryExpanderTSV")

Now you can use with QueryExpanderTSV or the substitution table.

See following documentation about QueryExpanderTSV

See following documentation about substitution table:

As you know, the limitation that you need to use --query if you want to use query expander and keywords which has different weight at once.

Here is the query which returns the same results as substitution table:

  select Users --match_columns "memo * 100" --query "memo:@groonga" 
    --query_expander Synonyms.synonym 
    --filter 'query("name * 10", "((...) OR (...) OR (...))")'

Here is the query which returns the same effects by using query expander functionality introduced at this release:

  select Users --match_columns "memo * 100" --query "memo:@groonga" 
    --query_expander Synonyms.synonym 
    --filter 'query("name * 10", "...", "Synonyms.synonym")'

Conclusion

See Release 3.0.2 2013/03/29 about detailed changes since 3.0.1.

Let's search by groonga!