7.3.34. logical_select

7.3.34.1. Summary

New in version 5.0.5.

logical_select is a sharding version of select. logical_select searches records from multiple tables and outputs them.

You need to plugin_register sharding plugin because logical_select is included in sharding plugin.

7.3.34.2. Syntax

This command takes many parameters.

The required parameters are logical_table and shard_key. Other parameters are optional:

logical_select logical_table
               shard_key
               [min=null]
               [min_border="include"]
               [max=null]
               [max_border="include"]
               [filter=null]
               [sortby=null]
               [output_columns="_id, _key, *"]
               [offset=0]
               [limit=10]
               [drilldown=null]
               [drilldown_sortby=null]
               [drilldown_output_columns="_key, _nsubrecs"]
               [drilldown_offset=0]
               [drilldown_limit=10]
               [drilldown_calc_types=NONE]
               [drilldown_calc_target=null]
               [sort_keys=null]
               [drilldown_sort_keys=null]
               [match_columns=null]
               [query=null]
               [drilldown_filter=null]

This command has the following named parameters for dynamic columns:

  • columns[${NAME}].stage=null
  • columns[${NAME}].flags=COLUMN_SCALAR
  • columns[${NAME}].type=null
  • columns[${NAME}].value=null
  • columns[${NAME}].window.sort_keys=null
  • columns[${NAME}].window.group_keys=null

You can use one or more alphabets, digits, _ for ${NAME}. For example, column1 is a valid ${NAME}. This is the same rule as normal column. See also name.

Parameters that have the same ${NAME} are grouped.

For example, the following parameters specify one dynamic column:

  • --columns[name].stage initial
  • --columns[name].type UInt32
  • --columns[name].value 29

The following parameters specify two dynamic columns:

  • --columns[name1].stage initial
  • --columns[name1].type UInt32
  • --columns[name1].value 29
  • --columns[name2].stage filtered
  • --columns[name2].type Float
  • --columns[name2].value '_score * 0.1'

This command has the following named parameters for advanced drilldown:

  • drilldowns[${LABEL}].keys=null
  • drilldowns[${LABEL}].sort_keys=null
  • drilldowns[${LABEL}].output_columns="_key, _nsubrecs"
  • drilldowns[${LABEL}].offset=0
  • drilldowns[${LABEL}].limit=10
  • drilldowns[${LABEL}].calc_types=NONE
  • drilldowns[${LABEL}].calc_target=null
  • drilldowns[${LABEL}].filter=null
  • drilldowns[${LABEL}].columns[${NAME}].stage=null
  • drilldowns[${LABEL}].columns[${NAME}].flags=COLUMN_SCALAR
  • drilldowns[${LABEL}].columns[${NAME}].type=null
  • drilldowns[${LABEL}].columns[${NAME}].value=null
  • drilldowns[${LABEL}].columns[${NAME}].window.sort_keys=null
  • drilldowns[${LABEL}].columns[${NAME}].window.group_keys=null

Deprecated since version 6.1.4: drilldown[...] syntax is deprecated. Use drilldowns[...] instead.

Deprecated since version 6.1.5: drilldowns[${LABEL}].sortby is deprecated. Use drilldowns[${LABEL}].sort_keys instead.

You can use one or more alphabets, digits, _ and . for ${LABEL}. For example, parent.sub1 is a valid ${LABEL}.

Parameters that have the same ${LABEL} are grouped.

For example, the following parameters specify one drilldown:

  • --drilldowns[label].keys column
  • --drilldowns[label].sort_keys -_nsubrecs

The following parameters specify two drilldowns:

  • --drilldowns[label1].keys column1
  • --drilldowns[label1].sort_keys -_nsubrecs
  • --drilldowns[label2].keys column2
  • --drilldowns[label2].sort_keys _key

7.3.34.3. Differences from select

Most of logical_select features can be used like corresponding select features. For example, parameter name is same, output format is same and so on.

But there are some differences from select:

  • logical_table and shard_key parameters are required instead of table parameter.
  • sort_keys isn't supported when multiple shards are used. (Only one shard is used, they are supported.)
  • _value.${KEY_NAME} in drilldowns[${LABEL}].sort_keys doesn't work with multiple shards. It works with one shard. _key in drilldowns[${LABEL}].sort_keys work with multiple shards.
  • cache isn't supported yet.
  • match_escalation_threshold isn't supported yet.
  • query_flags isn't supported yet.
  • query_expander isn't supported yet.
  • adjuster isn't supported yet.

7.3.34.4. Usage

Let's learn about logical_select usage with examples. This section shows many popular usages.

You need to register sharding plugin because logical_select is included in sharding plugin.

Execution example:

plugin_register sharding
# [[0, 1337566253.89858, 0.000355720520019531], true]

Here are a schema definition and sample data to show usage.

Execution example:

table_create Entries_20150708 TABLE_HASH_KEY ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150708 created_at COLUMN_SCALAR Time
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150708 content COLUMN_SCALAR Text
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150708 n_likes COLUMN_SCALAR UInt32
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150708 tag COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Entries_20150709 TABLE_HASH_KEY ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150709 created_at COLUMN_SCALAR Time
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150709 content COLUMN_SCALAR Text
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150709 n_likes COLUMN_SCALAR UInt32
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150709 tag COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Terms TABLE_PAT_KEY ShortText \
  --default_tokenizer TokenBigram \
  --normalizer NormalizerAuto
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms entries_key_index_20150708 \
  COLUMN_INDEX|WITH_POSITION Entries_20150708 _key
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms entries_content_index_20150708 \
  COLUMN_INDEX|WITH_POSITION Entries_20150708 content
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms entries_key_index_20150709 \
  COLUMN_INDEX|WITH_POSITION Entries_20150709 _key
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms entries_content_index_20150709 \
  COLUMN_INDEX|WITH_POSITION Entries_20150709 content
# [[0, 1337566253.89858, 0.000355720520019531], true]
load --table Entries_20150708
[
{"_key":       "The first post!",
 "created_at": "2015/07/08 00:00:00",
 "content":    "Welcome! This is my first post!",
 "n_likes":    5,
 "tag":        "Hello"},
{"_key":       "Groonga",
 "created_at": "2015/07/08 01:00:00",
 "content":    "I started to use Groonga. It's very fast!",
 "n_likes":    10,
 "tag":        "Groonga"},
{"_key":       "Mroonga",
 "created_at": "2015/07/08 02:00:00",
 "content":    "I also started to use Mroonga. It's also very fast! Really fast!",
 "n_likes":    15,
 "tag":        "Groonga"}
]
# [[0, 1337566253.89858, 0.000355720520019531], 3]
load --table Entries_20150709
[
{"_key":       "Good-bye Senna",
 "created_at": "2015/07/09 00:00:00",
 "content":    "I migrated all Senna system!",
 "n_likes":    3,
 "tag":        "Senna"},
{"_key":       "Good-bye Tritonn",
 "created_at": "2015/07/09 01:00:00",
 "content":    "I also migrated all Tritonn system!",
 "n_likes":    3,
 "tag":        "Senna"}
]
# [[0, 1337566253.89858, 0.000355720520019531], 2]

There are two tables, Entries_20150708 and Entries_20150709, for blog entries.

Note

You need to use ${LOGICAL_TABLE_NAME}_${YYYYMMDD} naming rule for table names. In this example, LOGICAL_TABLE_NAME is Entries and YYYYMMDD is 20150708 or 20150709.

An entry has title, created time, content, the number of likes for the entry and tag. Title is key of Entries_YYYYMMDD. Created time is value of Entries_YYYYMMDD.created_at column. Content is value of Entries_YYYYMMDD.content column. The number of likes is value of Entries_YYYYMMDD.n_likes column. Tag is value of Entries_YYYYMMDD.tag column.

Entries_YYYYMMDD._key column and Entries_YYYYMMDD.content column are indexed using TokenBigram tokenizer. So both Entries_YYYYMMDD._key and Entries_YYYYMMDD.content are fulltext search ready.

OK. The schema and data for examples are ready.

7.3.34.4.1. Simple usage

TODO

7.3.34.5. Parameters

This section describes parameters of logical_select.

7.3.34.5.1. Required parameters

There are required parameters, logical_table and shard_key.

7.3.34.5.1.1. logical_table

Specifies logical table name. It means table name without _YYYYMMDD postfix. If you use actual table such as Entries_20150708, Entries_20150709 and so on, logical table name is Entries.

You can show 10 records by specifying logical_table and shard_key parameters. They are required parameters.

Execution example:

logical_select --logical_table Entries --shard_key created_at
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "created_at",
#           "Time"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "The first post!",
#         "Welcome! This is my first post!",
#         1436281200.0,
#         5,
#         "Hello"
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         1436284800.0,
#         10,
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         1436288400.0,
#         15,
#         "Groonga"
#       ],
#       [
#         1,
#         "Good-bye Senna",
#         "I migrated all Senna system!",
#         1436367600.0,
#         3,
#         "Senna"
#       ],
#       [
#         2,
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         1436371200.0,
#         3,
#         "Senna"
#       ]
#     ]
#   ]
# ]

If nonexistent table is specified, an error is returned.

Execution example:

logical_select --logical_table Nonexistent --shard_key created_at
# [
#   [
#     -22,
#     1337566253.89858,
#     0.000355720520019531,
#     "[logical_select] no shard exists: logical_table: <Nonexistent>: shard_key: <created_at>",
#     [
#       [
#         "Groonga::Sharding::LogicalSelectCommand::Executor.execute_search",
#         "/home/kou/work/c/groonga.clean/plugins/sharding/logical_select.rb",
#         728
#       ]
#     ]
#   ]
# ]

7.3.34.5.1.2. shard_key

Specifies column name which is treated as shared key. Shard key is a column that stores data that is used for distributing records to suitable shards.

Shard key must be Time type for now.

See logical_table how to specify shard_key.

7.3.34.5.2. Optional parameters

There are optional parameters.

7.3.34.5.2.1. min

Specifies the minimum value of shard_key column. If shard doesn't have any matched records, the shard isn't searched.

For example, min is "2015/07/09 00:00:00", Entry_20150708 isn't searched. Because Entry_20150708 has only records for "2015/07/08".

The following example only uses Entry_20150709 table. Entry_20150708 isn't used.

Execution example:

logical_select \
  --logical_table Entries \
  --shard_key created_at \
  --min "2015/07/09 00:00:00"
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         2
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "created_at",
#           "Time"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "Good-bye Senna",
#         "I migrated all Senna system!",
#         1436367600.0,
#         3,
#         "Senna"
#       ],
#       [
#         2,
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         1436371200.0,
#         3,
#         "Senna"
#       ]
#     ]
#   ]
# ]

7.3.34.5.2.2. min_border

Specifies whether the minimum value is included or not. Here is available values.

Value Description
include Includes min value. This is the default.
exclude Doesn't include min value.

Here is an example for exclude. The result doesn't include the "Good-bye Senna" record because its created_at value is "2015/07/09 00:00:00".

Execution example:

logical_select \
  --logical_table Entries \
  --shard_key created_at \
  --min "2015/07/09 00:00:00" \
  --min_border "exclude"
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "created_at",
#           "Time"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         2,
#         "Good-bye Tritonn",
#         "I also migrated all Tritonn system!",
#         1436371200.0,
#         3,
#         "Senna"
#       ]
#     ]
#   ]
# ]

7.3.34.5.2.3. max

Specifies the maximum value of shard_key column. If shard doesn't have any matched records, the shard isn't searched.

For example, max is "2015/07/08 23:59:59", Entry_20150709 isn't searched. Because Entry_20150709 has only records for ""2015/07/09".

The following example only uses Entry_20150708 table. Entry_20150709 isn't used.

Execution example:

logical_select \
  --logical_table Entries \
  --shard_key created_at \
  --max "2015/07/08 23:59:59"
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "created_at",
#           "Time"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "The first post!",
#         "Welcome! This is my first post!",
#         1436281200.0,
#         5,
#         "Hello"
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         1436284800.0,
#         10,
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         1436288400.0,
#         15,
#         "Groonga"
#       ]
#     ]
#   ]
# ]

7.3.34.5.2.4. max_border

Specifies whether the maximum value is included or not. Here is available values.

Value Description
include Includes max value. This is the default.
exclude Doesn't include max value.

Here is an example for exclude. The result doesn't include the "Good-bye Senna" record because its created_at value is "2015/07/09 00:00:00".

Execution example:

logical_select \
  --logical_table Entries \
  --shard_key created_at \
  --max "2015/07/09 00:00:00" \
  --max_border "exclude"
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         3
#       ],
#       [
#         [
#           "_id",
#           "UInt32"
#         ],
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "content",
#           "Text"
#         ],
#         [
#           "created_at",
#           "Time"
#         ],
#         [
#           "n_likes",
#           "UInt32"
#         ],
#         [
#           "tag",
#           "ShortText"
#         ]
#       ],
#       [
#         1,
#         "The first post!",
#         "Welcome! This is my first post!",
#         1436281200.0,
#         5,
#         "Hello"
#       ],
#       [
#         2,
#         "Groonga",
#         "I started to use Groonga. It's very fast!",
#         1436284800.0,
#         10,
#         "Groonga"
#       ],
#       [
#         3,
#         "Mroonga",
#         "I also started to use Mroonga. It's also very fast! Really fast!",
#         1436288400.0,
#         15,
#         "Groonga"
#       ]
#     ]
#   ]
# ]

7.3.34.5.4. Advanced search parameters

logical_select doesn't implement advanced search parameters yet.

7.3.34.5.4.1. match_escalation_threshold

Not implemented yet.

7.3.34.5.4.2. query_flags

Not implemented yet.

7.3.34.5.4.3. query_expander

Not implemented yet.

7.3.34.6. Return value

The return value format of logical_select is compatible with select. See Return value for details.