7.1.8. groonga-suggest-httpd

7.1.8.1. Summary

groonga-suggest-httpd is a program that provides an HTTP interface for the following features:

  • Returning Suggest execution result

  • Saving logs for learning

groonga-suggest-httpd provides the same suggest feature as the suggest command. Note that some parameter names differ between the two.

7.1.8.2. Syntax

groonga-suggest-httpd requires a database path:

groonga-suggest-httpd [options] DATABASE_PATH

7.1.8.3. Usage

You need to create one or more datasets to use groonga-suggest-httpd. A dataset consists of tables and columns. You can define them with groonga-suggest-create-dataset.

You need to use groonga-suggest-learner to learn suggestion data from user inputs. You don’t need groonga-suggest-learner when you create suggestion data by hand. See Suggest and its sub documents for how to create suggestion data by hand.

You can use groonga-suggest-httpd via HTTP after you create one or more datasets.

The following sections describe:

  • How to set up a dataset

  • How to use groonga-suggest-httpd with groonga-suggest-learner

  • How to use groonga-suggest-httpd for retrieving suggestions

7.1.8.3.1. Setup

You need to create a dataset with groonga-suggest-create-dataset.

Here is an example that creates a dataset named query:

Execution example:

$ groonga-suggest-create-dataset ${DB_PATH} query
> plugin_register suggest/suggest
true
> table_create event_type TABLE_HASH_KEY ShortText
true
> table_create bigram TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram --normalizer NormalizerAuto
true
> table_create kana TABLE_PAT_KEY ShortText --normalizer NormalizerAuto
true
> table_create item_query TABLE_PAT_KEY ShortText --default_tokenizer TokenDelimit --normalizer NormalizerAuto
true
> column_create bigram item_query_key COLUMN_INDEX|WITH_POSITION item_query _key
true
> column_create item_query kana COLUMN_VECTOR kana
true
> column_create kana item_query_kana COLUMN_INDEX item_query kana
true
> column_create item_query freq COLUMN_SCALAR Int32
true
> column_create item_query last COLUMN_SCALAR Time
true
> column_create item_query boost COLUMN_SCALAR Int32
true
> column_create item_query freq2 COLUMN_SCALAR Int32
true
> column_create item_query buzz COLUMN_SCALAR Int32
true
> table_create pair_query TABLE_HASH_KEY UInt64
true
> column_create pair_query pre COLUMN_SCALAR item_query
true
> column_create pair_query post COLUMN_SCALAR item_query
true
> column_create pair_query freq0 COLUMN_SCALAR Int32
true
> column_create pair_query freq1 COLUMN_SCALAR Int32
true
> column_create pair_query freq2 COLUMN_SCALAR Int32
true
> column_create item_query co COLUMN_INDEX pair_query pre
true
> table_create sequence_query TABLE_HASH_KEY ShortText
true
> table_create event_query TABLE_NO_KEY
true
> column_create sequence_query events COLUMN_VECTOR|RING_BUFFER event_query
true
> column_create event_query type COLUMN_SCALAR event_type
true
> column_create event_query time COLUMN_SCALAR Time
true
> column_create event_query item COLUMN_SCALAR item_query
true
> column_create event_query sequence COLUMN_SCALAR sequence_query
true
> table_create configuration TABLE_HASH_KEY ShortText
true
> column_create configuration weight COLUMN_SCALAR UInt32
true
> load --table configuration
> [
> {"_key": "query", "weight": 1}
> ]
1

groonga-suggest-create-dataset outputs the commands that it executes, so you can confirm which tables and columns are created for the new dataset.

7.1.8.3.2. Launch groonga-suggest-learner

You can choose whether to use learned suggestion data immediately or not.

There are two ways to use learned suggestion data immediately:

  • Both groonga-suggest-httpd and groonga-suggest-learner use the same database

  • groonga-suggest-httpd receives learned suggestion data from groonga-suggest-learner

In the former case, you must run both groonga-suggest-httpd and groonga-suggest-learner on the same host.

In the latter case, you can run groonga-suggest-httpd and groonga-suggest-learner on different hosts.

If you don’t need to use learned suggestion data immediately, you need to apply learned suggestion data by hand from the database used by groonga-suggest-learner to the database used by groonga-suggest-httpd. Normally, this usage is recommended because learned suggestion data may contain garbage data from inputs by malicious users.

In this document, learned suggestion data are used immediately: groonga-suggest-httpd receives them from groonga-suggest-learner. Both groonga-suggest-httpd and groonga-suggest-learner run on the same host because that setup is easy to explain.

Here is an example that launches groonga-suggest-learner. You need to specify a database that has the query dataset. The steps for creating the query dataset are omitted here:

Execution example:

$ groonga-suggest-learner --daemon ${DB_PATH}

The groonga-suggest-learner process opens two endpoints, on port 1234 and port 1235:

  • Port 1234: Endpoint that accepts user input data from groonga-suggest-httpd

  • Port 1235: Endpoint that sends learned suggestion data to groonga-suggest-httpd

7.1.8.3.3. Launch groonga-suggest-httpd

You need to launch groonga-suggest-httpd for the following purposes:

  • Learning suggestion data from user inputs

  • Providing suggestion results to clients

Here is an example that launches groonga-suggest-httpd so that it communicates with groonga-suggest-learner:

Execution example:

$ groonga-suggest-httpd --send-endpoint 'tcp://127.0.0.1:1234' --receive-endpoint 'tcp://127.0.0.1:1235' --daemon ${DB_PATH}

The groonga-suggest-httpd process accepts HTTP requests on port 8080.

If you want to save requests into log files, use the --log-base-path option.

Here is an example that saves log files under the logs directory with log as the prefix of each file name:

% groonga-suggest-httpd --log-base-path logs/log ${DB_PATH}

groonga-suggest-httpd creates log files such as logYYYYmmddHHMMSS-00 under the logs directory.
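When post-processing saved logs, it can help to recover the timestamp encoded in each file name. The following Python sketch assumes that a file name is the prefix followed by a 14-digit YYYYmmddHHMMSS timestamp, a hyphen and a numeric suffix, as in the example name above. The meaning of the suffix is not stated here, so it is returned untouched. The function name parse_log_name and the sample file name are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout: <prefix><YYYYmmddHHMMSS>-<numeric suffix>
LOG_NAME = re.compile(r"^(?P<prefix>.*?)(?P<stamp>\d{14})-(?P<suffix>\d+)$")


def parse_log_name(name):
    """Split a log file name into (prefix, timestamp, suffix), or return None."""
    match = LOG_NAME.match(name)
    if match is None:
        return None
    stamp = datetime.strptime(match.group("stamp"), "%Y%m%d%H%M%S")
    # The suffix is kept as an opaque string because its meaning is not documented here.
    return match.group("prefix"), stamp, match.group("suffix")


if __name__ == "__main__":
    # Hypothetical file name that follows the logYYYYmmddHHMMSS-00 pattern.
    print(parse_log_name("log20240131123456-00"))
```

Names that don’t follow the assumed layout yield None, so unrelated files in the same directory can be skipped safely.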

7.1.8.3.4. Learn from user inputs

You can learn suggestion data from user inputs.

You need to specify the following parameters to learn suggestion data:

  • i: The ID of the user (you may use the IP address of the client)

  • l: The dataset name

  • s: The timestamp of the input in milliseconds

  • t: The query type (optional; specify submit only when the user input is submitted)

  • q: The user input

Here are example requests to learn the user input “Groonga” in the query dataset:

Execution example:

% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=92619&q=G'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=93850&q=Gr'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=94293&q=Gro'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=94734&q=Groo'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=95147&q=Grooon'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=95553&q=Groonga'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=95959&t=submit&q=Groonga'

Requests for input that is still being typed must not use the t=submit parameter. The above example only learns user inputs, but you can learn and get completion candidates at once, as described in the next section.

Requests for submitted input must use the t=submit parameter.
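A request sequence like the one above can also be generated programmatically. The following Python sketch only builds the URLs and does not send them. It assumes a server at http://localhost:8080/, and learn_url and keystroke_urls are hypothetical helper names, not part of Groonga.

```python
from urllib.parse import urlencode

# Assumption: groonga-suggest-httpd listens on localhost at the default port 8080.
BASE_URL = "http://localhost:8080/"


def learn_url(user_id, dataset, timestamp, text, submitted=False):
    """Build one learning request URL from the i, l, s, t and q parameters."""
    params = {"i": user_id, "l": dataset, "s": timestamp}
    if submitted:
        # t=submit marks input that the user actually submitted.
        params["t"] = "submit"
    params["q"] = text
    return BASE_URL + "?" + urlencode(params)


def keystroke_urls(user_id, dataset, keystrokes, submit_timestamp):
    """Return one URL per partial input, then one t=submit URL for the last input."""
    urls = [learn_url(user_id, dataset, ts, text) for ts, text in keystrokes]
    if keystrokes:
        final_text = keystrokes[-1][1]
        urls.append(learn_url(user_id, dataset, submit_timestamp, final_text, submitted=True))
    return urls


if __name__ == "__main__":
    typed = [(92619, "G"), (93850, "Gr"), (95553, "Groonga")]
    for url in keystroke_urls("127.0.0.1", "query", typed, 95959):
        print(url)
```

Keeping the submit request separate from the partial inputs mirrors the rule that t=submit is only used for input the user has actually submitted.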

7.1.8.3.5. Use suggested response

You can get suggestion results from groonga-suggest-httpd.

You need to specify the following parameters to get suggestion results:

  • n: The dataset name

  • t: The query type (complete, correct and/or suggest)

  • q: The user input

You can also specify parameters of suggest as options.

Here is an example that gets a completion result. The result is computed from the data learned in the previous section. The frequency_threshold=1 parameter is used only because this is an example: it enables input data that occurred one or more times. Normally, you should not use this setting in production because it increases noise:

Execution example:

$ curl 'http://localhost:8080/?n=query&t=complete&q=G&frequency_threshold=1'
{"complete":[[1],[["_key","ShortText"],["_score","Int32"]],["groonga",1]]}

You can combine completion and learning by specifying parameters for both:

Execution example:

$ curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=96000&q=G&n=query&t=complete&frequency_threshold=1'
{"complete":[[1],[["_key","ShortText"],["_score","Int32"]],["groonga",1]]}
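A client usually wants only the candidates and their scores from such a response. The following Python sketch extracts them from the response body shown above. It assumes that candidate rows are two-element lists of a string and a number, and it skips every other entry, such as the leading count and the column definitions. The function name candidates is hypothetical.

```python
import json


def candidates(response_body, query_type):
    """Return (candidate, score) pairs for one query type, skipping header entries."""
    data = json.loads(response_body)
    pairs = []
    for entry in data.get(query_type, []):
        is_pair = (
            isinstance(entry, list)
            and len(entry) == 2
            and isinstance(entry[0], str)
            and isinstance(entry[1], (int, float))
            and not isinstance(entry[1], bool)
        )
        if is_pair:
            pairs.append((entry[0], entry[1]))
    return pairs


if __name__ == "__main__":
    # Response body copied from the execution example above.
    body = '{"complete":[[1],[["_key","ShortText"],["_score","Int32"]],["groonga",1]]}'
    print(candidates(body, "complete"))  # prints [('groonga', 1)]
```

Filtering by shape instead of by position keeps the sketch usable whether or not the header entries are present in a given response.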

7.1.8.4. Command line parameters

7.1.8.4.1. Required parameters

There is only one required parameter.

7.1.8.4.1.1. DATABASE_PATH

Specifies the path to a Groonga database. This database must have one or more datasets. Each dataset must be created by groonga-suggest-create-dataset.

7.1.8.4.2. Optional parameters

-p, --port

Specify HTTP server port number.

The default port number is 8080.

-t, --n-threads

Specify number of threads.

The maximum value is 128, but you should use the number of CPU cores for performance.

The default value is the number of CPU cores.

-s, --send-endpoint

Specify endpoint URI of groonga-suggest-learner for sending user inputs.

The format is tcp://${HOST}:${PORT} such as tcp://192.168.0.1:2929.

The default value is none.

-r, --receive-endpoint

Specify endpoint URI of groonga-suggest-learner for receiving learned suggestion data.

The format is tcp://${HOST}:${PORT} such as tcp://192.168.0.1:2929.

The default value is none.

-l, --log-base-path

Specify path prefix of log.

The default value is none.

--n-lines-per-log-file

Specify the max number of lines in a log file.

The default value is 1000000.

-d, --daemon

Specify this option to daemonize.

The process doesn’t daemonize by default.

--disable-max-fd-check

Specify this option to disable checking the max number of file descriptors on start.

The check is enabled by default.
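When groonga-suggest-httpd is started from a script, the options above can be assembled into an argument list. The following Python sketch builds such a list without running anything, and checks that endpoints follow the tcp://${HOST}:${PORT} format. The function names is_tcp_endpoint and httpd_argv are hypothetical, and passing each option value as a separate argument is an assumption based on the command lines shown in Usage.

```python
from urllib.parse import urlsplit


def is_tcp_endpoint(uri):
    """Return True if uri looks like tcp://HOST:PORT."""
    try:
        parts = urlsplit(uri)
        return parts.scheme == "tcp" and bool(parts.hostname) and parts.port is not None
    except ValueError:
        # Raised for a malformed host or port.
        return False


def httpd_argv(database_path, port=None, n_threads=None, send_endpoint=None,
               receive_endpoint=None, log_base_path=None, daemon=False):
    """Build a groonga-suggest-httpd command line from the documented options."""
    argv = ["groonga-suggest-httpd"]
    if port is not None:
        argv += ["--port", str(port)]
    if n_threads is not None:
        argv += ["--n-threads", str(n_threads)]
    for flag, endpoint in (("--send-endpoint", send_endpoint),
                           ("--receive-endpoint", receive_endpoint)):
        if endpoint is not None:
            if not is_tcp_endpoint(endpoint):
                raise ValueError(f"{flag} must look like tcp://HOST:PORT: {endpoint!r}")
            argv += [flag, endpoint]
    if log_base_path is not None:
        argv += ["--log-base-path", log_base_path]
    if daemon:
        argv.append("--daemon")
    # DATABASE_PATH is the only required parameter and comes last.
    argv.append(database_path)
    return argv


if __name__ == "__main__":
    # "/tmp/suggest.db" is a hypothetical database path.
    print(httpd_argv("/tmp/suggest.db", send_endpoint="tcp://127.0.0.1:1234",
                     receive_endpoint="tcp://127.0.0.1:1235", daemon=True))
```

The resulting list can be handed to a process launcher such as subprocess.run; rejecting malformed endpoints before launch keeps the failure close to its cause.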

7.1.8.5. GET parameters

groonga-suggest-httpd accepts some GET parameters.

Which parameters are required depends on the query type.

For the complete, correct and suggest query types, unhandled parameters are passed through to suggest. It means that you can use the parameters of suggest.

7.1.8.5.1. Required parameters

You must specify the following parameters.

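====  ============================================  ============================================
Key   Description                                   Note
====  ============================================  ============================================
q     Input by user. It must be a UTF-8 encoded
      string.
====  ============================================  ============================================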

7.1.8.5.2. Required parameters for learning

You must specify the following parameters when you specify --send-endpoint.

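====  ============================================  ============================================
Key   Description                                   Note
====  ============================================  ============================================
s     Elapsed time since 1970-01-01T00:00:00Z.      The unit is milliseconds.
i     Unique ID to distinguish each user.           A session ID, IP address and so on can be
                                                    used for this value.
l     One or more learn target dataset names.       A dataset name is the name that you specify
      Use | as the separator, such as               to groonga-suggest-create-dataset.
      dataset1|dataset2|dataset3.
====  ============================================  ============================================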
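The learning parameters can be assembled in one place so that none of them is forgotten. The following Python sketch builds the parameter set for one input, taking the current time in milliseconds when no timestamp is given and joining dataset names with |. The function names and the sample values are hypothetical.

```python
import time

# Parameters required when learning is enabled: q plus s, i and l.
REQUIRED_FOR_LEARNING = ("q", "s", "i", "l")


def now_in_milliseconds():
    """Elapsed time since 1970-01-01T00:00:00Z in milliseconds."""
    return int(time.time() * 1000)


def learning_params(user_id, datasets, text, timestamp_ms=None):
    """Build the GET parameters for one learning request."""
    if timestamp_ms is None:
        timestamp_ms = now_in_milliseconds()
    return {
        "i": user_id,
        # Multiple learn target datasets are separated by "|".
        "l": "|".join(datasets),
        "s": str(timestamp_ms),
        "q": text,
    }


def missing_learning_params(params):
    """Return the required learning parameters that are absent or empty."""
    return [key for key in REQUIRED_FOR_LEARNING if not params.get(key)]


if __name__ == "__main__":
    # "session-1" and the dataset names are made-up sample values.
    params = learning_params("session-1", ["dataset1", "dataset2"], "G")
    print(params, missing_learning_params(params))
```

Checking for missing keys before sending a request makes a forgotten s, i or l visible on the client side.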

7.1.8.5.3. Required parameters for suggestion

You must specify the following parameters when you specify complete, correct or suggest for the t parameter.

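====  ============================================  ============================================
Key   Description                                   Note
====  ============================================  ============================================
n     The dataset name used to compute the          A dataset name is the name that you specify
      suggestion result.                            to groonga-suggest-create-dataset.
t     The query type. Available values are          You can specify multiple types. Use | as
      complete, correct and suggest.                the separator, such as complete|correct.
====  ============================================  ============================================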
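A suggestion request that asks for several query types at once can be built as follows. This Python sketch joins the types with | and keeps the separator unescaped in the URL, matching the notation used above; whether a percent-encoded separator is also accepted is not covered here. The function name suggest_url is hypothetical, and any extra keyword argument is treated as a parameter passed through to suggest.

```python
from urllib.parse import urlencode

QUERY_TYPES = ("complete", "correct", "suggest")


def suggest_url(base_url, dataset, types, text, **extra):
    """Build a suggestion request URL from the n, t and q parameters."""
    unknown = [t for t in types if t not in QUERY_TYPES]
    if unknown:
        raise ValueError(f"unknown query types: {unknown}")
    params = {"n": dataset, "t": "|".join(types), "q": text}
    # Remaining parameters, such as frequency_threshold, are appended as given.
    params.update(extra)
    # safe="|" keeps the separator literal instead of encoding it as %7C.
    return base_url + "?" + urlencode(params, safe="|")


if __name__ == "__main__":
    print(suggest_url("http://localhost:8080/", "query",
                      ["complete", "correct"], "G", frequency_threshold=1))
```

Validating the query types up front avoids sending a request whose t value the server cannot interpret as a suggestion type.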

7.1.8.5.4. Optional parameters

Here are optional parameters.

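========  ============================================  ============================================
Key       Description                                   Note
========  ============================================  ============================================
callback  Function name for JSONP
========  ============================================  ============================================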

7.1.8.5.5. Optional parameters for learning

Here are the optional parameters that apply when you specify --send-endpoint.

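====  ============================================  ============================================
Key   Description                                   Note
====  ============================================  ============================================
t     The query type. The only available value      You must specify submit when the user
      is submit.                                    submits the input specified as q. You must
                                                    not specify submit for user inputs that
                                                    aren’t submitted yet. You can use suggestion
                                                    by specifying complete, correct and/or
                                                    suggest to t when you don’t specify submit.
                                                    See Required parameters for suggestion for
                                                    details about these values.
====  ============================================  ============================================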

7.1.8.6. Return value

groonga-suggest-httpd returns a response in the following format. It’s the same format as the body of suggest:

{
  TYPE1: [
    [CANDIDATE_1, SCORE_1],
    [CANDIDATE_2, SCORE_2],
    ...,
    [CANDIDATE_N, SCORE_N]
  ],
  TYPE2: [
    [CANDIDATE_1, SCORE_1],
    [CANDIDATE_2, SCORE_2],
    ...,
    [CANDIDATE_N, SCORE_N]
  ],
  ...
}

Here is the response when t is submit:

{}

7.1.8.6.1. TYPE

One of complete, correct and suggest.

7.1.8.6.2. CANDIDATE_N

The string of candidate in UTF-8.

7.1.8.6.3. SCORE_N

The score of the candidate.

Candidates are sorted by score in descending order.
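Because candidates arrive in descending score order, a client can take the first few entries as the best ones. The following Python sketch does that for a response in the format described in this section, where each TYPE maps to a list of [CANDIDATE, SCORE] pairs; it does not handle any additional header entries. The function names and the sample response are hypothetical, and the sample candidates and scores are made up.

```python
def is_sorted_by_score(rows):
    """Return True if the [candidate, score] rows are in descending score order."""
    scores = [score for _candidate, score in rows]
    return all(a >= b for a, b in zip(scores, scores[1:]))


def top_candidates(response, query_type, limit):
    """Return up to limit candidate strings for one query type."""
    rows = response.get(query_type, [])
    # The rows are already sorted, so the best candidates come first.
    return [candidate for candidate, _score in rows[:limit]]


if __name__ == "__main__":
    # Made-up response in the documented format.
    sample = {"complete": [["groonga", 10], ["groonga search", 4], ["group", 1]]}
    print(is_sorted_by_score(sample["complete"]))
    print(top_candidates(sample, "complete", 2))
```

A response of {} (as returned for t=submit) simply yields an empty list for every query type.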

7.1.8.7. See also