Groonga 5.0.0 has been released
Groonga 5.0.0 has been released!
How to install: Install
Since Groonga 4.0.0 had been released, many improvements, changes, or bug fixes was shipped.
Changes
In this release, there is following topic:
- Experimental improvements
- Supported plugin which enable sharding feature
Experimental improvement - Supported plugin which enable sharding feature
In this release, we added plugin which supports experimental sharding feature. Register this plugin in advance.
$ groonga testdb/db
> register sharding
Here is the simple example which shows how to use this feature. Let's consider to count specified logs which are stored into multiple tables.
Here is the schema and data.
table_create Logs_20150203 TABLE_NO_KEY
column_create Logs_20150203 timestamp COLUMN_SCALAR Time
column_create Logs_20150203 message COLUMN_SCALAR Text
table_create Logs_20150204 TABLE_NO_KEY
column_create Logs_20150204 timestamp COLUMN_SCALAR Time
column_create Logs_20150204 message COLUMN_SCALAR Text
table_create Logs_20150205 TABLE_NO_KEY
column_create Logs_20150205 timestamp COLUMN_SCALAR Time
column_create Logs_20150205 message COLUMN_SCALAR Text
load --table Logs_20150203
[
{"timestamp": "2015-02-03 23:59:58", "message": "Start"},
{"timestamp": "2015-02-03 23:59:58", "message": "Shutdown"},
{"timestamp": "2015-02-03 23:59:59", "message": "Start"},
{"timestamp": "2015-02-03 23:59:59", "message": "Shutdown"}
]
load --table Logs_20150204
[
{"timestamp": "2015-02-04 00:00:00", "message": "Start"},
{"timestamp": "2015-02-04 00:00:00", "message": "Shutdown"},
{"timestamp": "2015-02-04 00:00:01", "message": "Start"},
{"timestamp": "2015-02-04 00:00:01", "message": "Shutdown"},
{"timestamp": "2015-02-04 23:59:59", "message": "Start"},
{"timestamp": "2015-02-04 23:59:59", "message": "Shutdown"}
]
load --table Logs_20150205
[
{"timestamp": "2015-02-05 00:00:00", "message": "Start"},
{"timestamp": "2015-02-05 00:00:00", "message": "Shutdown"},
{"timestamp": "2015-02-05 00:00:01", "message": "Start"},
{"timestamp": "2015-02-05 00:00:01", "message": "Shutdown"}
]
There are three tables which are mapped each day from 2015 Feb 03 to 2015 Feb 05.
- Logs_20150203
- Logs_20150204
- Logs_20150205
Then, it loads data into each table which correspond to.
Let's count logs which contains "Shutdown" in timestamp column and the value of timestamp is "2015-02-04 00:00:00" or later.
Here is the query to achieve above purpose.
logical_count Logs timestamp \
--filter 'message == "Shutdown"' \
--min "2015-02-04 00:00:00" \
--min_border "include"
Pros. You don't need to consider limitation of table.
There is a limitation about the number of records. By sharding feature, you can overcome such limitations.
Cons. You need to create each table and load to it explicitly
There is no convenient query such as PARTITIONING BY
in SQL. Thus, you must create table by table_create
for each tables
which contains _YYYYMMDD
postfix in table name.
And more, you must load data into proper table carefully.
Conclusion
See Release 5.0.0 2015-02-09 about detailed changes since 4.1.1.
Let's search by Groonga!