4.1. Basic operations#

A Groonga package provides a C library (libgroonga) and a command line tool (groonga). This tutorial explains how to use the command line tool, with which you can create/operate databases, start a server, establish a connection with a server, etc.

4.1.1. Create a database#

The first step to using Groonga is to create a new database. The following shows how to do it.

Form:

groonga -n DB_PATH

The -n option specifies to create a new database and DB_PATH specifies the path of the new database. Actually, a database consists of a series of files and DB_PATH specifies the file which will be the entrance to the new database. DB_PATH also specifies the path prefix for other files. Note that database creation fails if DB_PATH points to an existing file (For example, db open failed (DB_PATH): syscall error 'DB_PATH' (File exists). You can operate an existing database in a way that is in the next chapter).

This command creates a new database and then enters into interactive mode in which Groonga prompts you to enter commands for operating that database. You can terminate this mode with Ctrl-d.

Execution example:

% groonga -n /tmp/groonga-databases/introduction.db

After this database creation, you can find a series of files in /tmp/groonga-databases.

4.1.2. Operate a database#

The following shows how to operate an existing database.

Form:

groonga DB_PATH [COMMAND]

DB_PATH specifies the path of a target database. This command fails if the specified database does not exist.

If COMMAND is specified, Groonga executes COMMAND and returns the result. Otherwise, Groonga starts in interactive mode that reads commands from the standard input and executes them one by one. This tutorial focuses on the interactive mode.

Let’s see the status of a Groonga process by using a status command.

Execution example:

% groonga /tmp/groonga-databases/introduction.db
status
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   {
#     "alloc_count": 29,
#     "starttime": 1696558618,
#     "start_time": 1696558618,
#     "uptime": 0,
#     "version": "2.9.1",
#     "n_queries": 0,
#     "cache_hit_rate": 0.0,
#     "command_version": 1,
#     "default_command_version": 1,
#     "max_command_version": 3,
#     "n_jobs": 0,
#     "features": {
#       "nfkc": true,
#       "mecab": true,
#       "message_pack": true,
#       "mruby": true,
#       "onigmo": true,
#       "zlib": true,
#       "lz4": true,
#       "zstandard": true,
#       "kqueue": false,
#       "epoll": true,
#       "poll": false,
#       "rapidjson": false,
#       "apache_arrow": true,
#       "xxhash": true,
#       "blosc": true,
#       "bfloat16": true,
#       "h3": true,
#       "simdjson": true,
#       "llama.cpp": true,
#       "back_trace": true,
#       "reference_count": false
#     },
#     "apache_arrow": {
#       "version_major": 2,
#       "version_minor": 9,
#       "version_patch": 1,
#       "version": "2.9.1"
#     },
#     "memory_map_size": 2929,
#     "n_workers": 0,
#     "default_n_workers": 0,
#     "os": "Linux",
#     "cpu": "x86_64"
#   }
# ]

As shown in the above example, a command returns a JSON array. The first element contains an error code, execution time, etc. The second element is the result of an operation.

Note

You can format a JSON using additional tools. For example, grnwrap, Grnline, jq and so on.

4.1.3. Command format#

Commands for operating a database accept arguments as follows:

Form_1: COMMAND VALUE_1 VALUE_2 ..

Form_2: COMMAND --NAME_1 VALUE_1 --NAME_2 VALUE_2 ..

In the first form, arguments must be passed in order. This kind of arguments are called positional arguments because the position of each argument determines its meaning.

In the second form, you can specify a parameter name with its value. So, the order of arguments is not defined. This kind of arguments are known as named parameters or keyword arguments.

If you want to specify a value which contains white-spaces or special characters, such as quotes and parentheses, please enclose the value with single-quotes or double-quotes.

For details, see also the paragraph of “command” in groonga executable file.

4.1.4. Basic commands#

status

shows status of a Groonga process.

table_list

shows a list of tables in a database.

column_list

shows a list of columns in a table.

table_create

adds a table to a database.

column_create

adds a column to a table.

select

searches records from a table and shows the result.

load

inserts records to a table.

4.1.5. Create a table#

A table_create command creates a new table.

In most cases, a table has a primary key which must be specified with its data type and index type.

There are various data types such as integers, strings, etc. See also Data types for more details. The index type determines the search performance and the availability of prefix searches. The details will be described later.

Let’s create a table. The following example creates a table with a primary key. The name parameter specifies the name of the table. The flags parameter specifies the index type for the primary key. The key_type parameter specifies the data type of the primary key.

Execution example:

table_create --name Site --flags TABLE_HASH_KEY --key_type ShortText
# [[0,1337566253.89858,0.000355720520019531],true]

The second element of the result indicates that the operation succeeded.

4.1.6. View a table#

A select command can enumerate records in a table.

Execution example:

select --table Site
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         0
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ]
#       ]
#     ]
#   ]
# ]

When only a table name is specified with a table parameter, a select command returns the first (at most) 10 records in the table. [0] in the result shows the number of records in the table. The next array is a list of columns. [“_id”,”Uint32”] is a column of UInt32, named _id. [“_key”,”ShortText”] is a column of ShortText, named _key.

The above two columns, _id and _key, are the necessary columns. The _id column stores IDs those are automatically allocated by Groonga. The _key column is associated with the primary key. You are not allowed to rename these columns.

4.1.7. Create a column#

A column_create command creates a new column.

Let’s add a column. The following example adds a column to the Site table. The table parameter specifies the target table. The name parameter specifies the name of the column. The type parameter specifies the data type of the column.

Execution example:

column_create --table Site --name title --type ShortText
# [[0,1337566253.89858,0.000355720520019531],true]
select --table Site
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         0
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ]
#     ]
#   ]
# ]

4.1.8. Load records#

A load command loads JSON-formatted records into a table.

The following example loads nine records into the Site table.

Execution example:

load --table Site
[
{"_key":"http://example.org/","title":"This is test record 1!"},
{"_key":"http://example.net/","title":"test record 2."},
{"_key":"http://example.com/","title":"test test record three."},
{"_key":"http://example.net/afr","title":"test record four."},
{"_key":"http://example.org/aba","title":"test test test record five."},
{"_key":"http://example.com/rab","title":"test test test test record six."},
{"_key":"http://example.net/atv","title":"test test test record seven."},
{"_key":"http://example.org/gat","title":"test test record eight."},
{"_key":"http://example.com/vdw","title":"test test record nine."},
]
# [[0,1337566253.89858,0.000355720520019531],9]

The second element of the result indicates how many records were successfully loaded. In this case, all the records are successfully loaded.

Let’s make sure that these records are correctly stored.

Execution example:

select --table Site
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         9
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "http://example.org/",
#         "This is test record 1!"
#       ],
#       [
#         2,
#         "http://example.net/",
#         "test record 2."
#       ],
#       [
#         3,
#         "http://example.com/",
#         "test test record three."
#       ],
#       [
#         4,
#         "http://example.net/afr",
#         "test record four."
#       ],
#       [
#         5,
#         "http://example.org/aba",
#         "test test test record five."
#       ],
#       [
#         6,
#         "http://example.com/rab",
#         "test test test test record six."
#       ],
#       [
#         7,
#         "http://example.net/atv",
#         "test test test record seven."
#       ],
#       [
#         8,
#         "http://example.org/gat",
#         "test test record eight."
#       ],
#       [
#         9,
#         "http://example.com/vdw",
#         "test test record nine."
#       ]
#     ]
#   ]
# ]

4.1.9. Get a record#

A select command can search records in a table.

If a search condition is specified with a query parameter, a select command searches records matching the search condition and returns the matched records.

Let’s get a record having a specified record ID. The following example gets the first record in the Site table. More precisely, the query parameter specifies a record whose _id column stores 1.

Execution example:

select --table Site --query _id:1
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "http://example.org/",
#         "This is test record 1!"
#       ]
#     ]
#   ]
# ]

Next, let’s get a record having a specified key. The following example gets the record whose primary key is “http://example.org/”. More precisely, the query parameter specifies a record whose _key column stores “http://example.org/”.

Execution example:

select --table Site --query '_key:"http://example.org/"'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "http://example.org/",
#         "This is test record 1!"
#       ]
#     ]
#   ]
# ]

4.1.13. Specify output columns#

An output_columns parameter of a select command specifies columns to appear in the search result. If you want to specify more than one columns, please separate column names by commas (‘,’).

Execution example:

select --table Site --output_columns _key,title,_score --query title:@test
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         9
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "title",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "http://example.org/",
#         "This is test record 1!",
#         1
#       ],
#       [
#         "http://example.net/",
#         "test record 2.",
#         1
#       ],
#       [
#         "http://example.com/",
#         "test test record three.",
#         2
#       ],
#       [
#         "http://example.net/afr",
#         "test record four.",
#         1
#       ],
#       [
#         "http://example.org/aba",
#         "test test test record five.",
#         3
#       ],
#       [
#         "http://example.com/rab",
#         "test test test test record six.",
#         4
#       ],
#       [
#         "http://example.net/atv",
#         "test test test record seven.",
#         3
#       ],
#       [
#         "http://example.org/gat",
#         "test test record eight.",
#         2
#       ],
#       [
#         "http://example.com/vdw",
#         "test test record nine.",
#         2
#       ]
#     ]
#   ]
# ]

This example specifies three output columns including the _score column, which stores the relevance score of each record.

4.1.14. Specify output ranges#

A select command returns a part of its search result if offset and/or limit parameters are specified. These parameters are useful to paginate a search result, a widely-used interface which shows a search result on a page by page basis.

An offset parameter specifies the starting point and a limit parameter specifies the maximum number of records to be returned. If you need the first record in a search result, the offset parameter must be 0 or omitted.

Execution example:

select --table Site --offset 0 --limit 3
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         9
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "http://example.org/",
#         "This is test record 1!"
#       ],
#       [
#         2,
#         "http://example.net/",
#         "test record 2."
#       ],
#       [
#         3,
#         "http://example.com/",
#         "test test record three."
#       ]
#     ]
#   ]
# ]
select --table Site --offset 3 --limit 3
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         9
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ],
#       [
#         4,
#         "http://example.net/afr",
#         "test record four."
#       ],
#       [
#         5,
#         "http://example.org/aba",
#         "test test test record five."
#       ],
#       [
#         6,
#         "http://example.com/rab",
#         "test test test test record six."
#       ]
#     ]
#   ]
# ]
select --table Site --offset 7 --limit 3
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         9
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ],
#       [
#         8,
#         "http://example.org/gat",
#         "test test record eight."
#       ],
#       [
#         9,
#         "http://example.com/vdw",
#         "test test record nine."
#       ]
#     ]
#   ]
# ]

4.1.15. Sort a search result#

A select command sorts its result when used with a sort_keys parameter.

A sort_keys parameter specifies a column as a sorting creteria. A search result is arranged in ascending order of the column values. If you want to sort a search result in reverse order, please add a leading hyphen (‘-’) to the column name in a parameter.

The following example shows records in the Site table in reverse order.

Execution example:

select --table Site --sort_keys -_id
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         9
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ],
#       [
#         9,
#         "http://example.com/vdw",
#         "test test record nine."
#       ],
#       [
#         8,
#         "http://example.org/gat",
#         "test test record eight."
#       ],
#       [
#         7,
#         "http://example.net/atv",
#         "test test test record seven."
#       ],
#       [
#         6,
#         "http://example.com/rab",
#         "test test test test record six."
#       ],
#       [
#         5,
#         "http://example.org/aba",
#         "test test test record five."
#       ],
#       [
#         4,
#         "http://example.net/afr",
#         "test record four."
#       ],
#       [
#         3,
#         "http://example.com/",
#         "test test record three."
#       ],
#       [
#         2,
#         "http://example.net/",
#         "test record 2."
#       ],
#       [
#         1,
#         "http://example.org/",
#         "This is test record 1!"
#       ]
#     ]
#   ]
# ]

The next example uses the _score column as the sorting criteria for ranking the search result. The result is sorted in relevance order.

Execution example:

select --table Site --query title:@test --output_columns _id,_score,title --sort_keys -_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         9
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_score",
#           "Int32"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ],
#       [
#         6,
#         4,
#         "test test test test record six."
#       ],
#       [
#         5,
#         3,
#         "test test test record five."
#       ],
#       [
#         7,
#         3,
#         "test test test record seven."
#       ],
#       [
#         8,
#         2,
#         "test test record eight."
#       ],
#       [
#         3,
#         2,
#         "test test record three."
#       ],
#       [
#         9,
#         2,
#         "test test record nine."
#       ],
#       [
#         1,
#         1,
#         "This is test record 1!"
#       ],
#       [
#         4,
#         1,
#         "test record four."
#       ],
#       [
#         2,
#         1,
#         "test record 2."
#       ]
#     ]
#   ]
# ]

If you want to specify more than one columns, please separate column names by commas (‘,’). In such a case, a search result is sorted in order of the values in the first column, and then records having the same values in the first column are sorted in order of the second column values.

Execution example:

select --table Site --query title:@test --output_columns _id,_score,title --sort_keys -_score,_id
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         9
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_score",
#           "Int32"
#         ],
#         [
#           "title",
#           "ShortText"
#         ]
#       ],
#       [
#         6,
#         4,
#         "test test test test record six."
#       ],
#       [
#         5,
#         3,
#         "test test test record five."
#       ],
#       [
#         7,
#         3,
#         "test test test record seven."
#       ],
#       [
#         3,
#         2,
#         "test test record three."
#       ],
#       [
#         8,
#         2,
#         "test test record eight."
#       ],
#       [
#         9,
#         2,
#         "test test record nine."
#       ],
#       [
#         1,
#         1,
#         "This is test record 1!"
#       ],
#       [
#         2,
#         1,
#         "test record 2."
#       ],
#       [
#         4,
#         1,
#         "test record four."
#       ]
#     ]
#   ]
# ]

footnote