7.15.25. string_slice#

New in version 11.0.3.

7.15.25.1. Summary#

string_slice extracts a substring of a string. You can use two different extraction methods depending on the arguments.

  • Extraction by position

  • Extraction by regular expression

Groonga uses the same regular expression syntax as Ruby.

To enable this function, register the functions/string plugin with the following command:

plugin_register functions/string

7.15.25.2. Syntax#

string_slice requires two to four parameters. The required parameters depend on the extraction method.

7.15.25.2.1. Extraction by position#

string_slice(target, nth[, options])
string_slice(target, nth, length[, options])

options uses the following format. All key-value pairs are optional:

{
  "default_value": default_value
}

7.15.25.2.2. Extraction by regular expression#

string_slice(target, regexp, nth[, options])
string_slice(target, regexp, name[, options])

options uses the following format. All key-value pairs are optional:

{
  "default_value": default_value
}

7.15.25.3. Usage#

Here are a schema definition and sample data to show usage.

Sample schema:

Execution example:

plugin_register functions/string
# [[0,1337566253.89858,0.000355720520019531],true]
table_create Memos TABLE_HASH_KEY ShortText
# [[0,1337566253.89858,0.000355720520019531],true]

Sample data:

Execution example:

load --table Memos
[
{"_key": "Groonga"}
]
# [[0,1337566253.89858,0.000355720520019531],1]

Here is a simple example for the extraction by position.

Execution example:

select Memos --output_columns '_key, string_slice(_key, 2, 3)'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "string_slice",
#           null
#         ]
#       ],
#       [
#         "Groonga",
#         "oon"
#       ]
#     ]
#   ]
# ]

Here are simple examples for the extraction by regular expression.

The following example extracts a substring by specifying the group number of a capturing group: (subexp).

Execution example:

select Memos --output_columns '_key, string_slice(_key, "(Gro+)(.*)", 2)'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "string_slice",
#           null
#         ]
#       ],
#       [
#         "Groonga",
#         "nga"
#       ]
#     ]
#   ]
# ]

The following example extracts a substring by specifying the name of a named capturing group: (?<name>subexp).

Execution example:

select Memos --output_columns '_key, string_slice(_key, "(Gr)(?<Name1>o*)(?<Name2>.*)", "Name1")'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "string_slice",
#           null
#         ]
#       ],
#       [
#         "Groonga",
#         "oo"
#       ]
#     ]
#   ]
# ]

The following example specifies a default value, which is returned because the regular expression does not match.

Execution example:

select Memos --output_columns '_key, string_slice(_key, "mismatch", 2, { "default_value" : "default" })'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "string_slice",
#           null
#         ]
#       ],
#       [
#         "Groonga",
#         "default"
#       ]
#     ]
#   ]
# ]

You can specify a string literal instead of a column.

Execution example:

select Memos --output_columns 'string_slice("Groonga", "(roon)(g)", 2)'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "string_slice",
#           null
#         ]
#       ],
#       [
#         "g"
#       ]
#     ]
#   ]
# ]

7.15.25.4. Parameters#

7.15.25.4.1. Extraction by position#

There are two required parameters, target and nth.

There are two optional parameters, length and options.

7.15.25.4.1.1. target#

Specify a string literal or a string type column.

7.15.25.4.1.2. nth#

Specify the 0-based character index in target at which the extraction starts.

If you specify a negative value, it counts from the end of target.
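
Here is a minimal sketch of a negative nth. It assumes the Memos table and the plugin registration from the Usage section. If negative values follow the convention in which -1 refers to the last character (an assumption that this section does not confirm), the call starts at the n of Groonga and extracts nga.

```
# Assumption: Memos and functions/string are set up as in the Usage section.
select Memos --output_columns '_key, string_slice(_key, -3, 3)'
```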

7.15.25.4.1.3. length#

Specify the number of characters to extract, starting at nth.

The default is 1.
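
Here is a minimal sketch that omits length, assuming the Memos table and the plugin registration from the Usage section. Because length defaults to 1, the call is expected to extract a single character at index 2 of _key.

```
# Assumption: Memos and functions/string are set up as in the Usage section.
# length is omitted, so the default of 1 applies.
select Memos --output_columns '_key, string_slice(_key, 2)'
```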

7.15.25.4.1.4. options#

Specify the following key.

default_value

Specify the string to return when the extracted substring is empty. default_value is not used when 0 is specified for length.

The default is an empty string.
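
Here is a minimal sketch of default_value for the extraction by position, assuming the Memos table and the plugin registration from the Usage section. The index 100 is a hypothetical value chosen to lie beyond the end of Groonga. This section does not state how an out-of-range nth is handled, so the assumption here is that the substring is empty and N/A is returned instead.

```
# Assumption: Memos and functions/string are set up as in the Usage section.
# 100 is a hypothetical out-of-range index; "N/A" is an illustrative default.
select Memos --output_columns '_key, string_slice(_key, 100, 1, { "default_value" : "N/A" })'
```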

7.15.25.4.2. Extraction by regular expression#

There are three required parameters: target, regexp, and either nth or name.

There is one optional parameter, options.

7.15.25.4.2.1. target#

Specify a string literal or a string type column.

7.15.25.4.2.2. regexp#

Specify a regular expression string.

When you use nth and specify a value greater than 0, regexp must contain a capturing group: (subexp).

When you use name, regexp must contain a named capturing group: (?<name>subexp) or (?'name'subexp).

7.15.25.4.2.3. nth#

Specify the number of a capturing group in regexp.

When regexp matches target, the string captured by the nth capturing group is returned.

If 0 is specified for nth, the entire string that matches regexp is returned.

Specify either nth or name.
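
Here is a minimal sketch of nth set to 0, assuming the Memos table and the plugin registration from the Usage section. The pattern Gro+ contains no capturing group, which is allowed because nth is not greater than 0. For the key Groonga, the entire match would be Groo.

```
# Assumption: Memos and functions/string are set up as in the Usage section.
# nth is 0, so the whole matched string is returned rather than a group.
select Memos --output_columns '_key, string_slice(_key, "Gro+", 0)'
```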

7.15.25.4.2.4. name#

Specify the name of a named capturing group in regexp.

When regexp matches target, the string captured by the named capturing group called name is returned.

Specify either nth or name.

7.15.25.4.2.5. options#

Specify the following key.

default_value

Specify the string to return if regexp does not match target. This value is also returned when the value of nth or name is incorrect.

The default is an empty string.
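
Here is a minimal sketch of default_value with an incorrect name, assuming the Memos table and the plugin registration from the Usage section. NoSuchName is a hypothetical name that does not appear in regexp, so according to the description above the call should return default even though the pattern itself matches Groonga.

```
# Assumption: Memos and functions/string are set up as in the Usage section.
# "NoSuchName" is a hypothetical group name that regexp does not define.
select Memos --output_columns '_key, string_slice(_key, "(?<Name1>o+)", "NoSuchName", { "default_value" : "default" })'
```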

7.15.25.5. Return value#

string_slice returns the substring extracted from target under the specified conditions.