7.3.2. Parallel execution
7.3.2.1. Summary
Groonga executes commands serially by default. You can execute them in parallel by specifying the n_workers option.
The next section shows how to set up parallel execution. Please read the notes before using this option.
7.3.2.2. How to use
7.3.2.2.1. Set per Groonga command
This setting applies only to the command being executed.
7.3.2.2.1.1. Specified by --n_workers option of Groonga command
Groonga command example:
load --table Data --n_workers -1
[
{"_key": "value"}
]
7.3.2.2.2. Set the default value
If you set a default value, you do not need to specify it for each Groonga command. The default value is used for all Groonga commands.
7.3.2.2.2.1. Specified by --default-n-workers option of groonga executable file
Execution example:
$ groonga --default-n-workers -1 DB_PATH status
7.3.2.2.2.2. Specified by environment variable GRN_N_WORKERS_DEFAULT
Execution example:
$ GRN_N_WORKERS_DEFAULT=-1 groonga DB_PATH status
7.3.2.2.3. Available Values
You can set the degree of parallelism.
If you specify -1, or 2 or more, the command is executed in parallel.
| n_workers | Behavior |
|---|---|
| When specifying 0 or 1 | Execute in serial. |
| When specifying 2 or more | Execute in parallel with at most the specified number of threads. |
| When specifying -1 | Execute in parallel with at most as many threads as there are CPU cores. |
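The available values can be read as a mapping from n_workers to an upper bound on the number of threads. The following Python sketch only models that mapping; it is not Groonga's implementation. The function name `max_threads` is hypothetical, treating 0 and 1 as the serial values is inferred from the surrounding text, and values below -1 are rejected because this section does not describe them.

```python
import os
from typing import Optional


def max_threads(n_workers: int, cpu_cores: Optional[int] = None) -> int:
    """Return the upper bound on threads for an n_workers value (illustrative model)."""
    if cpu_cores is None:
        # os.cpu_count() may return None; fall back to 1 in that case.
        cpu_cores = os.cpu_count() or 1
    if n_workers == -1:
        # Parallel, bounded by the number of CPU cores.
        return cpu_cores
    if n_workers >= 2:
        # Parallel, bounded by the specified number of threads.
        return n_workers
    if n_workers in (0, 1):
        # Serial execution (assumption: inferred from the text above).
        return 1
    raise ValueError("n_workers below -1 is not covered by this sketch")


print(max_threads(4))                # 4
print(max_threads(-1, cpu_cores=6))  # 6
```

Passing `cpu_cores` explicitly keeps the sketch deterministic; leaving it out uses the core count of the machine running the script.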
7.3.2.3. Check the settings
You can check the settings with the n_workers and default_n_workers values in the output of the status command.
Execution example:
status
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# {
# "alloc_count": 29,
# "starttime": 1696558618,
# "start_time": 1696558618,
# "uptime": 1,
# "version": "2.9.1",
# "n_queries": 0,
# "cache_hit_rate": 0.0,
# "command_version": 1,
# "default_command_version": 1,
# "max_command_version": 3,
# "n_jobs": 0,
# "features": {
# "nfkc": true,
# "mecab": true,
# "message_pack": true,
# "mruby": true,
# "onigmo": true,
# "zlib": true,
# "lz4": true,
# "zstandard": true,
# "kqueue": false,
# "epoll": true,
# "poll": false,
# "rapidjson": false,
# "apache_arrow": true,
# "xxhash": true,
# "blosc": true,
# "bfloat16": true,
# "h3": true,
# "simdjson": true,
# "llama.cpp": true,
# "back_trace": true,
# "reference_count": false
# },
# "apache_arrow": {
# "version_major": 2,
# "version_minor": 9,
# "version_patch": 1,
# "version": "2.9.1"
# },
# "memory_map_size": 2929,
# "n_workers": 0,
# "default_n_workers": 0,
# "os": "Linux",
# "cpu": "x86_64"
# }
# ]
n_workers is the per-command value. default_n_workers is the default value.
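If you want to read these two values from a script, a minimal sketch could look like the following. It assumes the raw status response is available as a JSON string in the `[header, body]` shape shown above (without the leading `# ` that the example uses for display). The function name `worker_settings` and the trimmed `sample` response are hypothetical.

```python
import json


def worker_settings(status_response: str) -> dict:
    """Extract n_workers and default_n_workers from a status response."""
    # The response is a two-element array: [header, body].
    _header, body = json.loads(status_response)
    return {
        "n_workers": body["n_workers"],
        "default_n_workers": body["default_n_workers"],
    }


# Trimmed response with only the two keys of interest in the body.
sample = (
    '[[0, 1337566253.89858, 0.000355720520019531], '
    '{"n_workers": 0, "default_n_workers": 0}]'
)
print(worker_settings(sample))  # {'n_workers': 0, 'default_n_workers': 0}
```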
7.3.2.4. Notes
7.3.2.4.1. Apache Arrow is required
This feature requires Apache Arrow to be enabled in Groonga.
Whether Apache Arrow is enabled depends on the package provider.
To check whether Apache Arrow is enabled, run the status command and see whether apache_arrow is true in the result.
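This check can also be scripted. The sketch below assumes the raw status response is a JSON string in the `[header, body]` shape of the execution example in "Check the settings", where the flag appears as `apache_arrow` under `features`. The function name `apache_arrow_enabled` is hypothetical, and treating a missing key as "not enabled" is an assumption of this sketch.

```python
import json


def apache_arrow_enabled(status_response: str) -> bool:
    """Return True if the status response reports Apache Arrow as enabled."""
    _header, body = json.loads(status_response)
    # Assumption: a missing "features" or "apache_arrow" key means "not enabled".
    return body.get("features", {}).get("apache_arrow", False) is True


enabled = '[[0, 0.0, 0.0], {"features": {"apache_arrow": true}}]'
disabled = '[[0, 0.0, 0.0], {"features": {"apache_arrow": false}}]'
print(apache_arrow_enabled(enabled))   # True
print(apache_arrow_enabled(disabled))  # False
```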
7.3.2.4.2. For use as a daemon process
For example, consider running the Groonga HTTP server on a system with 6 CPUs.
The Groonga HTTP server allocates 1 thread (= 1 CPU) to each request.
When the average number of concurrent connections is 6, there are no free CPU resources because all 6 CPUs are already busy processing requests.
When the average number of concurrent connections is 2, there are 4 free CPUs because only 2 CPUs are in use.
When 2 is specified for n_workers, each request uses at most 3 CPUs, including the thread that processes the request.
Therefore, if two requests with n_workers set to 2 arrive at the same time, they use at most 6 CPUs in total and are processed quickly because all of the resources are used.
If a value greater than 2 is specified, the degree of parallelism can exceed the available CPU resources, so execution may actually become slower.
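The arithmetic in this example can be sketched as a small capacity check. This is a simplified model of the reasoning above, not a measurement of Groonga itself. The function names `peak_threads` and `oversubscribed` are hypothetical, and the sketch only handles non-negative n_workers values because the example does not cover -1.

```python
def peak_threads(concurrent_requests: int, n_workers: int) -> int:
    """Upper bound on busy threads: one request thread plus its workers."""
    if n_workers < 0:
        raise ValueError("this sketch only models n_workers >= 0")
    # Worker threads are only added when parallel execution is on (2 or more).
    workers = n_workers if n_workers >= 2 else 0
    return concurrent_requests * (1 + workers)


def oversubscribed(cpus: int, concurrent_requests: int, n_workers: int) -> bool:
    """Return True if the thread upper bound exceeds the number of CPUs."""
    return peak_threads(concurrent_requests, n_workers) > cpus


print(peak_threads(2, 2))       # 6
print(oversubscribed(6, 2, 2))  # False
print(oversubscribed(6, 2, 3))  # True
```

With 6 CPUs and 2 concurrent requests, n_workers set to 2 fills the machine exactly, while 3 already asks for 8 threads.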