Groonga 10.0.3 has been released
Groonga 10.0.3 has been released!
How to install: Install
Changes
Here are important changes in this release:
- 
    
We came to be able to construct an inverted index from data that are tokenized in advance.
 - 
    
select We came to be able to specify a
vectorfor the argument of a function. - 
    
select Added a new stage
result_setfor dynamic columns.- 
        
This stage generates a column into a result set table. Therefore, it is not generated if
queryorfilterdoesn't exist- Because if 
queryorfilterdoesn't exist, Groonga doesn't make a result set table. 
 - Because if 
 - 
        
We can't use
_valuefor the stage. Theresult_setstage is for storing value byscore_column. 
 - 
        
 - 
    
[vector_slice] Added support for weight vector that has weight of
Float32type. - 
    
select Added support for
filteredstage andoutputstage of dynamic columns on drilldowns.- We can use 
filteredandoutputstage of dynamic columns on drilldowns as withdrilldowns[Label].stage filteredanddrilldowns[Label].stage output. 
 - We can use 
 - 
    
select Added support for
Floattype value in aggregating on drilldown.- We can aggregate max value, min value, and sum value for 
Floattype value usingMAX,MIN, andSUM. 
 - We can aggregate max value, min value, and sum value for 
 - 
    
query, geo_in_rectangle, geo_in_circle Added a new option
score_columnforquery(),geo_in_rectangle(), andgeo_in_circle(). - 
    
[Windows] Groonga came to be able to output backtrace when it occurs error even if it doesn't crash.
 - 
    
[Windows] Dropped support for old Windows.
- Groonga for Windows come to require Windows 8 (Windows Server 2012) or later from 10.0.3.
 
 - 
    
select Improved sort performance when sort keys were mixed referable sort keys and the other sort keys.
 - 
    
select Improved sort performance when sort keys are all referable keys case.
 - 
    
select Improve scorer performance as a
_socre = column1*X + column2*Y + ...case.- This optimization effective when there are many 
+or*in_score. - At the moment, it has only effective against 
+and*. 
 - This optimization effective when there are many 
 - 
    
select Added support for phrase near search.
 - 
    
vector Added support for
float32weight vector. - 
    
Fixed a crash bug if the modules (tokenizers, normalizers, and token filters) are used at the same time from multiple threads.
 - 
    
Fixed precision of
Float32value when it outputted.- The precision of it changes to 8-digit to 7-digit from 10.0.3.
 
 - 
    
Fixed a bug that Groonga used the wrong cache when the query that just the parameters of dynamic column different was executed.
 
We came to be able to construct an inverted index from data that are tokenized in advance.
- 
    
The construct of an index is speeded up from this.
 - 
    
We need to prepare token column to use this improvement.
 - 
    
token column is an auto generated value column like an index column.
 - 
    
token column value is generated from source column value by tokenizing the source column value.
 - 
    
We can create a token column by setting the source column as below.
table_create Terms TABLE_PAT_KEY ShortText \ --normalizer NormalizerNFKC121 \ --default_tokenizer TokenNgram table_create Notes TABLE_NO_KEY column_create Notes title COLUMN_SCALAR Text # The last "title" is the source column. column_create Notes title_terms COLUMN_VECTOR Terms title 
select We came to be able to specify a vector for the argument of a function.
- 
    
For example,
flagsoptions ofquerycan describe by avectoras below.select \ --table Memos \ --filter 'query("content", "-content:@mroonga", \ { \ "expander": "QueryExpanderTSV", \ "flags": ["ALLOW_LEADING_NOT", "ALLOW_COLUMN"] \ })' 
query, geo_in_rectangle, geo_in_circle Added a new option score_column for query(), geo_in_rectangle(), and geo_in_circle().
- 
    
We can store a score value by condition using
score_column. - 
    
Normally, Groonga calculate a score by adding scores of all conditions. However, we sometimes want to get a score value by condition.
 - 
    
For example, if we want to only use how near central coordinate as score as below, we use
score_column.table_create LandMarks TABLE_NO_KEY column_create LandMarks name COLUMN_SCALAR ShortText column_create LandMarks category COLUMN_SCALAR ShortText column_create LandMarks point COLUMN_SCALAR WGS84GeoPoint table_create Points TABLE_PAT_KEY WGS84GeoPoint column_create Points land_mark_index COLUMN_INDEX LandMarks point load --table LandMarks [ {"name": "Aries" , "category": "Tower" , "point": "11x11"}, {"name": "Taurus" , "category": "Lighthouse", "point": "9x10" }, {"name": "Gemini" , "category": "Lighthouse", "point": "8x8" }, {"name": "Cancer" , "category": "Tower" , "point": "12x12"}, {"name": "Leo" , "category": "Tower" , "point": "11x13"}, {"name": "Virgo" , "category": "Temple" , "point": "22x10"}, {"name": "Libra" , "category": "Tower" , "point": "14x14"}, {"name": "Scorpio" , "category": "Temple" , "point": "21x9" }, {"name": "Sagittarius", "category": "Temple" , "point": "43x12"}, {"name": "Capricorn" , "category": "Tower" , "point": "33x12"}, {"name": "Aquarius" , "category": "mountain" , "point": "55x11"}, {"name": "Pisces" , "category": "Tower" , "point": "9x9" }, {"name": "Ophiuchus" , "category": "mountain" , "point": "21x21"} ] select LandMarks \ --sort_keys 'distance' \ --columns[distance].stage initial \ --columns[distance].type Float \ --columns[distance].flags COLUMN_SCALAR \ --columns[distance].value 0.0 \ --output_columns 'name, category, point, distance, _score' \ --limit -1 \ --filter 'geo_in_circle(point, "11x11", "11x1", {"score_column": distance}) && category == "Tower"' [ [ 0, 1590647445.406149, 0.0002503395080566406 ], [ [ [ 5 ], [ [ "name", "ShortText" ], [ "category","ShortText" ], [ "point", "WGS84GeoPoint" ], [ "distance", "Float" ], [ "_score", "Int32" ] ], [ "Aries", "Tower", "11x11", 0.0, 1 ], [ "Cancer", "Tower", "12x12", 0.0435875803232193, 1 ], [ "Leo", "Tower", "11x13", 0.06164214760065079, 1 ], [ "Pisces", "Tower", "9x9", 0.0871751606464386, 1 ], [ "Libra", "Tower", "14x14", 0.1307627409696579, 1 ] ] ] ] - 
    
The sort by
_scoreis meaningless in the above example. Because the value of_scoreis all1bycategory == "Tower". However, we can sort distance from central coordinate usingscore_column. 
select Improved sort performance when sort keys were mixed referable sort keys and the other sort keys.
- 
    
We improved sort performance if mixed referable sort keys and the other and there are referable keys two or more.
- 
        
Referable sort keys are sort keys that except below them.
- Compressed columns
 _valueagainst the result of drilldown that is specified multiple values to the key of drilldown._keyagainst patricia trie table that has not the key ofShortTexttype._score
 
 - 
        
 - 
    
The more sort keys that except string, a decrease in the usage of memory for sort.
 
select Added support for phrase near search.
- 
    
We can search phrase by phrase by a near search.
- Query syntax for near phrase search is 
*NP"Phrase1 phrase2 ...". - 
        
Script syntax for near phrase search is
column *NP "phrase1 phrase2 ...". - 
        
If the search target phrase includes space, we can search for it by surrounding it with
"as below.table_create Entries TABLE_NO_KEY column_create Entries content COLUMN_SCALAR Text table_create Terms TABLE_PAT_KEY ShortText \ --default_tokenizer 'TokenNgram("unify_alphabet", false, \ "unify_digit", false)' \ --normalizer NormalizerNFKC121 column_create Terms entries_content COLUMN_INDEX|WITH_POSITION Entries content load --table Entries [ {"content": "I started to use Groonga. It's very fast!"}, {"content": "I also started to use Groonga. It's also very fast! Really fast!"} ] select Entries --filter 'content *NP "\\"I started\\" \\"use Groonga\\""' --output_columns 'content' [ [ 0, 1590469700.715882, 0.03997230529785156 ], [ [ [ 1 ], [ [ "content", "Text" ] ], [ "I started to use Groonga. It's very fast!" ] ] ] ] 
 - Query syntax for near phrase search is 
 
vector Added support for float32 weight vector.
- We can store weight as 
float32instead ofuint32. - 
    
We need to add
WEIGHT_FLOAT32flag when executecolumn_createto use this feature.column_create Records tags COLUMN_VECTOR|WITH_WEIGHT|WEIGHT_FLOAT32 Tags - However, 
WEIGHT_FLOAT32flag isn't available withCOLUMN_INDEXflag for now. 
Conclusion
Let's search by Groonga!