7.15.22. string_slice

New in version 11.0.3.

7.15.22.1. Summary

string_slice extracts a substring of a string. You can use two different extraction methods depending on the arguments.

  • Extraction by position

  • Extraction by regular expression

Groonga uses the same regular expression syntax as Ruby.

To enable this function, register the functions/string plugin with the following command:

plugin_register functions/string

7.15.22.2. Syntax

string_slice requires two to four parameters. The required parameters depend on the extraction method.

7.15.22.2.1. Extraction by position

string_slice(target, nth[, options])
string_slice(target, nth, length[, options])

options uses the following format. All key-value pairs are optional:

{
  "default_value": default_value
}

7.15.22.2.2. Extraction by regular expression

string_slice(target, regexp, nth[, options])
string_slice(target, regexp, name[, options])

options uses the following format. All key-value pairs are optional:

{
  "default_value": default_value
}

7.15.22.3. Usage

Here are a schema definition and sample data to show usage.

Sample schema:

Execution example:

plugin_register functions/string
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Memos TABLE_HASH_KEY ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]

Sample data:

Execution example:

load --table Memos
[
{"_key": "Groonga"}
]
# [[0, 1337566253.89858, 0.000355720520019531], 1]

Here is a simple example for the extraction by position.

Execution example:

select Memos --output_columns '_key, string_slice(_key, 2, 3)'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "string_slice",
#           null
#         ]
#       ],
#       [
#         "Groonga",
#         "oon"
#       ]
#     ]
#   ]
# ]
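The response above nests the extracted value several levels deep. The following Python sketch shows one way a client might read it. It assumes the response arrives as JSON text shaped like the example above (a header, then a body holding one result set). The helper name rows_as_dicts is hypothetical and is not part of Groonga.

```python
import json

# Response text copied from the execution example above (comment markers removed).
response_text = """
[[0, 1337566253.89858, 0.000355720520019531],
 [[[1],
   [["_key", "ShortText"], ["string_slice", null]],
   ["Groonga", "oon"]]]]
"""


def rows_as_dicts(text):
    """Turn a select response into a list of {column name: value} dicts.

    Hypothetical helper: it assumes the layout shown in the example above.
    """
    header, body = json.loads(text)
    # The first header element is the return code; 0 is what the examples show.
    if header[0] != 0:
        raise RuntimeError("command failed with return code %r" % header[0])
    # The body holds one result set: [n_hits], column definitions, then records.
    result_set = body[0]
    _n_hits, columns, *records = result_set
    names = [name for name, _type in columns]
    return [dict(zip(names, record)) for record in records]


rows = rows_as_dicts(response_text)
print(rows[0]["string_slice"])
```

The column produced by the function call is reported under the name string_slice with a null type, so a client can look the value up by that name.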

Here are simple examples for the extraction by regular expression.

The following example extracts a substring by specifying the group number of a capturing group: (subexp).

Execution example:

select Memos --output_columns '_key, string_slice(_key, "(Gro+)(.*)", 2)'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "string_slice",
#           null
#         ]
#       ],
#       [
#         "Groonga",
#         "nga"
#       ]
#     ]
#   ]
# ]

The following example extracts a substring by specifying the name of a named capturing group: (?<name>subexp).

Execution example:

select Memos --output_columns '_key, string_slice(_key, "(Gr)(?<Name1>o*)(?<Name2>.*)", "Name1")'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "string_slice",
#           null
#         ]
#       ],
#       [
#         "Groonga",
#         "oo"
#       ]
#     ]
#   ]
# ]

The following example specifies the default value.

Execution example:

select Memos --output_columns '_key, string_slice(_key, "mismatch", 2, { "default_value" : "default" })'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "string_slice",
#           null
#         ]
#       ],
#       [
#         "Groonga",
#         {
#           "default_value": "default"
#         }
#       ]
#     ]
#   ]
# ]

You can specify a string literal instead of a column.

Execution example:

select Memos --output_columns 'string_slice("Groonga", "(roon)(g)", 2)'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "string_slice",
#           null
#         ]
#       ],
#       [
#         "g"
#       ]
#     ]
#   ]
# ]

7.15.22.4. Parameters

7.15.22.4.1. Extraction by position

There are two required parameters, target and nth.

There are two optional parameters, length and options.

7.15.22.4.1.1. target

Specify a string literal or a string type column.

7.15.22.4.1.2. nth

Specify the 0-based index of the character at which to start the extraction from target.

If you specify a negative value, it counts from the end of target.

7.15.22.4.1.3. length

Specify the number of characters to extract, starting from nth.

The default is 1.

7.15.22.4.1.4. options

Specify the following key.

default_value

Specify a string to be returned when the extracted substring is an empty string, except when 0 is specified for length.

The default is an empty string.
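The parameter rules above can be modeled in a few lines of Python. This is a minimal sketch of the described behavior, not Groonga's implementation. The function name slice_by_position is hypothetical, and the handling of an out-of-range nth and of a negative length is an assumption, because the text above does not define those cases.

```python
def slice_by_position(target, nth, length=1, default_value=""):
    """Model of string_slice(target, nth[, length][, options]) as described above."""
    # A negative nth counts from the end of target.
    start = nth if nth >= 0 else len(target) + nth
    # Assumption: an out-of-range start or a negative length yields no substring.
    if start < 0 or start >= len(target) or length < 0:
        substring = ""
    else:
        substring = target[start:start + length]
    # default_value replaces an empty result, except when length is 0.
    if substring == "" and length != 0:
        return default_value
    return substring


print(slice_by_position("Groonga", 2, 3))   # same arguments as the usage example
print(slice_by_position("Groonga", -3, 3))  # a negative nth counts from the end
```

The first call uses the same arguments as the position example in the Usage section and yields the same substring, "oon".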

7.15.22.4.2. Extraction by regular expression

There are three required parameters: target, regexp, and one of nth or name. Specify either nth or name.

There is one optional parameter, options.

7.15.22.4.2.1. target

Specify a string literal or a string type column.

7.15.22.4.2.2. regexp

Specify a regular expression string.

When you use nth and specify a value greater than 0, you must use a capturing group: (subexp).

When you use name, you must use a named capturing group: (?<name>subexp) or (?'name'subexp).

7.15.22.4.2.3. nth

Specify the number of the capturing group in regexp.

The string captured by the nth capturing group is returned when regexp matches target.

If 0 is specified for nth, the entire string that matches regexp is returned.

Specify either nth or name.

7.15.22.4.2.4. name

Specify the name of a named capturing group in regexp.

The string captured by the named capturing group whose name is name is returned when regexp matches target.

Specify either nth or name.

7.15.22.4.2.5. options

Specify the following key.

default_value

Specify a string to be returned if regexp does not match target. This value is also returned when the value of nth or name is incorrect.

The default is an empty string.
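The rules for nth, name, and default_value can be modeled with Python's re module. This is a rough sketch under stated assumptions, not Groonga's implementation. The function name slice_by_regexp is hypothetical. Python's regular expression syntax differs from the Ruby syntax that Groonga uses: a named group is written (?P<name>subexp) in Python, so the pattern from the Usage section is adapted below. Returning default_value for a group that took no part in the match is also an assumption.

```python
import re


def slice_by_regexp(target, regexp, group, default_value=""):
    """Model of string_slice(target, regexp, nth_or_name[, options]).

    group is either a group number (nth) or a group name (name).
    """
    # The pattern may match anywhere in target, as in the "(roon)(g)" usage example.
    match = re.search(regexp, target)
    if match is None:
        return default_value
    try:
        captured = match.group(group)
    except IndexError:
        # An unknown group number or name counts as an incorrect nth or name.
        return default_value
    # Assumption: a group that took no part in the match also yields default_value.
    return default_value if captured is None else captured


print(slice_by_regexp("Groonga", "(Gro+)(.*)", 2))
print(slice_by_regexp("Groonga", "(Gr)(?P<Name1>o*)(?P<Name2>.*)", "Name1"))
print(slice_by_regexp("Groonga", "(Gro+)(.*)", 0))
```

The first two calls mirror the group number and group name examples in the Usage section and yield "nga" and "oo". The third call passes 0 and yields the entire matched string.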

7.15.22.5. Return value

string_slice returns a substring extracted under the specified conditions from target.