7.1.8. groonga-suggest-httpd

7.1.8.1. Summary

groonga-suggest-httpd is a program that provides an HTTP interface for the following features:

  • Returning Suggest execution result
  • Saving logs for learning

groonga-suggest-httpd provides the same suggest feature as the suggest command. Note that some parameter names differ between the two.

7.1.8.2. Syntax

groonga-suggest-httpd requires a database path:

groonga-suggest-httpd [options] DATABASE_PATH

7.1.8.3. Usage

You need to create one or more datasets to use groonga-suggest-httpd. A dataset consists of tables and columns. You can define them with groonga-suggest-create-dataset.

You need to use groonga-suggest-learner to learn suggestion data from user inputs. You don't need groonga-suggest-learner when you create suggestion data by hand. See Suggest and its sub documents for how to create suggestion data by hand.

You can use groonga-suggest-httpd via HTTP after you create one or more datasets.

The following sections describe:

  • How to set up a dataset
  • How to use groonga-suggest-httpd with groonga-suggest-learner
  • How to use groonga-suggest-httpd for retrieving suggestions

7.1.8.3.1. Setup

You need to create a dataset with groonga-suggest-create-dataset.

Here is an example that creates a dataset named query:

Execution example:

% groonga-suggest-create-dataset ${DB_PATH} query
> plugin_register suggest/suggest
true
> table_create event_type TABLE_HASH_KEY ShortText
true
> table_create bigram TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram --normalizer NormalizerAuto
true
> table_create kana TABLE_PAT_KEY ShortText --normalizer NormalizerAuto
true
> table_create item_query TABLE_PAT_KEY ShortText --default_tokenizer TokenDelimit --normalizer NormalizerAuto
true
> column_create bigram item_query_key COLUMN_INDEX|WITH_POSITION item_query _key
true
> column_create item_query kana COLUMN_VECTOR kana
true
> column_create kana item_query_kana COLUMN_INDEX item_query kana
true
> column_create item_query freq COLUMN_SCALAR Int32
true
> column_create item_query last COLUMN_SCALAR Time
true
> column_create item_query boost COLUMN_SCALAR Int32
true
> column_create item_query freq2 COLUMN_SCALAR Int32
true
> column_create item_query buzz COLUMN_SCALAR Int32
true
> table_create pair_query TABLE_HASH_KEY UInt64
true
> column_create pair_query pre COLUMN_SCALAR item_query
true
> column_create pair_query post COLUMN_SCALAR item_query
true
> column_create pair_query freq0 COLUMN_SCALAR Int32
true
> column_create pair_query freq1 COLUMN_SCALAR Int32
true
> column_create pair_query freq2 COLUMN_SCALAR Int32
true
> column_create item_query co COLUMN_INDEX pair_query pre
true
> table_create sequence_query TABLE_HASH_KEY ShortText
true
> table_create event_query TABLE_NO_KEY
true
> column_create sequence_query events COLUMN_VECTOR|RING_BUFFER event_query
true
> column_create event_query type COLUMN_SCALAR event_type
true
> column_create event_query time COLUMN_SCALAR Time
true
> column_create event_query item COLUMN_SCALAR item_query
true
> column_create event_query sequence COLUMN_SCALAR sequence_query
true
> table_create configuration TABLE_HASH_KEY ShortText
true
> column_create configuration weight COLUMN_SCALAR UInt32
true
> load --table configuration
> [
> {"_key": "query", "weight": 1}
> ]
1

groonga-suggest-create-dataset outputs the commands it executes. You can use the output to confirm which tables and columns are created for the new dataset.

7.1.8.3.2. Launch groonga-suggest-learner

You can choose whether you use learned suggestion data immediately or not.

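There are two ways to use learned suggestion data immediately:

  • Both groonga-suggest-httpd and groonga-suggest-learner use the same database
  • groonga-suggest-httpd receives learned suggestion data from groonga-suggest-learner

In the former case, you must run both groonga-suggest-httpd and groonga-suggest-learner on the same host.

In the latter case, you can run groonga-suggest-httpd and groonga-suggest-learner on different hosts.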

If you don't need to use learned suggestion data immediately, you need to apply the learned suggestion data by hand from the database used by groonga-suggest-learner to the database used by groonga-suggest-httpd. Normally, this usage is recommended because learned suggestion data may contain garbage data entered by malicious users.

In this document, learned suggestion data are used immediately: groonga-suggest-httpd receives them from groonga-suggest-learner. Both groonga-suggest-httpd and groonga-suggest-learner run on the same host because that is the easiest setup to explain.

Here is an example that launches groonga-suggest-learner. You need to specify a database that has the query dataset, such as the one created in Setup:

Execution example:

% groonga-suggest-learner --daemon ${DB_PATH}

The groonga-suggest-learner process opens two endpoints on ports 1234 and 1235:

  • Port 1234: Endpoint that accepts user input data from groonga-suggest-httpd
  • Port 1235: Endpoint that sends learned suggestion data to groonga-suggest-httpd

7.1.8.3.3. Launch groonga-suggest-httpd

You need to launch groonga-suggest-httpd for the following purposes:

  • Learning suggestion data from user inputs
  • Providing suggestion result to clients

Here is an example that launches groonga-suggest-httpd so that it communicates with groonga-suggest-learner:

Execution example:

% groonga-suggest-httpd --send-endpoint 'tcp://127.0.0.1:1234' --receive-endpoint 'tcp://127.0.0.1:1235' --daemon ${DB_PATH}

The groonga-suggest-httpd process accepts HTTP requests on port 8080.

If you want to save requests to log files, use the --log-base-path option.

Here is an example that saves log files under the logs directory, using log as the prefix of each file name:

% groonga-suggest-httpd --log-base-path logs/log ${DB_PATH}

groonga-suggest-httpd creates log files such as logYYYYmmddHHMMSS-00 under the logs directory.

7.1.8.3.4. Learn from user inputs

You can learn suggestion data from user inputs.

You need to specify the following parameters to learn suggestion data:

  • i: The ID of the user (You may use IP address of client)
  • l: The dataset name
  • s: The timestamp of the input in milliseconds
  • t: The query type (It's optional. You must specify submit only when the user input is submitted.)
  • q: The user input

Here are example requests that learn the user input "Groonga" in the query dataset:

Execution example:

% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=92619&q=G'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=93850&q=Gr'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=94293&q=Gro'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=94734&q=Groo'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=95147&q=Grooon'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=95553&q=Groonga'
% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=95959&t=submit&q=Groonga'

Requests for input that is still being typed must not use the t=submit parameter. The above example only learns user inputs, but you can learn and get completion candidates in a single request. This is described in the next section.

Requests for submitted input must use the t=submit parameter.
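
As a minimal sketch, the learning requests above could also be generated programmatically. The helper name build_learn_url and the BASE_URL constant are hypothetical names introduced for illustration; the parameter names (i, l, s, t, q) and the values are taken from the requests above. The sketch only builds the URLs and does not send them.

```python
from urllib.parse import urlencode

# Assumption: groonga-suggest-httpd listens on localhost with the default port.
BASE_URL = "http://localhost:8080/"


def build_learn_url(user_id, dataset, timestamp, user_input, submitted=False):
    """Build one learning request URL (hypothetical helper)."""
    params = [("i", user_id), ("l", dataset), ("s", str(timestamp))]
    if submitted:
        # t=submit is added only for the input the user actually submitted.
        params.append(("t", "submit"))
    params.append(("q", user_input))
    return BASE_URL + "?" + urlencode(params)


# The s values and partial inputs are copied from the requests above.
keystrokes = [
    (92619, "G"), (93850, "Gr"), (94293, "Gro"),
    (94734, "Groo"), (95147, "Grooon"), (95553, "Groonga"),
]
urls = [build_learn_url("127.0.0.1", "query", s, q) for s, q in keystrokes]
urls.append(build_learn_url("127.0.0.1", "query", 95959, "Groonga", submitted=True))

for url in urls:
    print(url)
```

Each printed URL can be passed to curl or to any HTTP client. Only the last request carries t=submit.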

7.1.8.3.5. Use suggested response

You can get suggestion results from groonga-suggest-httpd.

You need to specify the following parameters to get suggestion results:

  • n: The dataset name
  • t: The query type (complete, correct and/or suggest)
  • q: The user input

You can also specify parameters of suggest as options.

Here is an example that gets a Completion result. The result is computed from the data learned in the previous section. The frequency_threshold=1 parameter is used only because this is an example: it makes inputs that occurred one or more times eligible as candidates. Normally, you should not use this value in production because it increases noise:

Execution example:

% curl 'http://localhost:8080/?n=query&t=complete&q=G&frequency_threshold=1'
{"complete":[[6],[["_key","ShortText"],["_score","Int32"]],["groonga",3],["g",2],["grooon",2],["gro",2],["groo",2],["gr",2]]}

You can combine completion and learning by specifying parameters for both:

Execution example:

% curl 'http://localhost:8080/?i=127.0.0.1&l=query&s=93000&q=G&n=query&t=complete&frequency_threshold=1'
{"complete":[[6],[["_key","ShortText"],["_score","Int32"]],["groonga",3],["g",2],["grooon",2],["gro",2],["groo",2],["gr",2]]}
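
The response above can be read with any JSON parser. The following sketch is one possible way to extract the candidates. The helper name parse_candidates is hypothetical, and the layout it assumes (hit count first, column definitions second, records after that) is inferred only from the example output above.

```python
import json

# Response body copied from the execution example above.
body = (
    '{"complete":[[6],[["_key","ShortText"],["_score","Int32"]],'
    '["groonga",3],["g",2],["grooon",2],["gro",2],["groo",2],["gr",2]]}'
)


def parse_candidates(response_text, query_type):
    """Return (candidate, score) pairs for one query type (hypothetical helper)."""
    result = json.loads(response_text).get(query_type, [])
    # Assumption from the example output: element 0 is the hit count,
    # element 1 is the column definitions, the remaining elements are records.
    return [(key, score) for key, score in result[2:]]


candidates = parse_candidates(body, "complete")
for key, score in candidates:
    print(key, score)
```

A query type that is missing from the response, such as in the empty response to t=submit, yields an empty list.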

7.1.8.4. Command line parameters

7.1.8.4.1. Required parameters

There is one required parameter.

7.1.8.4.1.1. DATABASE_PATH

Specifies the path to a Groonga database. This database must have one or more datasets. Each dataset must be created by groonga-suggest-create-dataset.

7.1.8.4.2. Optional parameters

-p, --port

Specify HTTP server port number.

The default port number is 8080.

-t, --n-threads

Specify number of threads.

This option accepts values up to 128, but for performance you should use the number of CPU cores.

The default value is the number of CPU cores.

-s, --send-endpoint

Specify endpoint URI of groonga-suggest-learner for sending user inputs.

The format is tcp://${HOST}:${PORT} such as tcp://192.168.0.1:2929.

The default value is none.

-r, --receive-endpoint

Specify endpoint URI of groonga-suggest-learner for receiving learned suggestion data.

The format is tcp://${HOST}:${PORT} such as tcp://192.168.0.1:2929.

The default value is none.

-l, --log-base-path

Specify path prefix of log.

The default value is none.

--n-lines-per-log-file

Specify the max number of lines in a log file.

The default value is 1000000.

-d, --daemon

Specify this option to daemonize.

The default is not to daemonize.

--disable-max-fd-check

Specify this option to disable checking the max number of file descriptors on start.

The check is enabled by default.
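
When groonga-suggest-httpd is launched from a script, the options above can be assembled into an argument list. The following is a minimal sketch, not part of Groonga: build_httpd_argv is a hypothetical helper and /path/to/db is a placeholder database path. Only the long option names documented above are used.

```python
def build_httpd_argv(database_path, port=None, n_threads=None,
                     send_endpoint=None, receive_endpoint=None,
                     log_base_path=None, daemon=False):
    """Assemble a groonga-suggest-httpd command line (hypothetical helper)."""
    argv = ["groonga-suggest-httpd"]
    options = [
        ("--port", port),
        ("--n-threads", n_threads),
        ("--send-endpoint", send_endpoint),
        ("--receive-endpoint", receive_endpoint),
        ("--log-base-path", log_base_path),
    ]
    for flag, value in options:
        if value is not None:  # omitted options keep their documented defaults
            argv += [flag, str(value)]
    if daemon:
        argv.append("--daemon")
    argv.append(database_path)  # the required DATABASE_PATH goes last
    return argv


argv = build_httpd_argv(
    "/path/to/db",  # placeholder path, replace with a real database
    send_endpoint="tcp://127.0.0.1:1234",
    receive_endpoint="tcp://127.0.0.1:1235",
    daemon=True,
)
print(" ".join(argv))
```

The resulting list could be handed to subprocess.run, which avoids shell quoting issues with the endpoint URIs.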

7.1.8.5. GET parameters

groonga-suggest-httpd accepts some GET parameters.

Which parameters are required depends on the query type.

For the complete, correct and suggest query types, unhandled parameters are passed through to suggest. This means that you can use the parameters of suggest.

7.1.8.5.1. Required parameters

You must specify the following parameters.

  • q: Input by user. It must be a UTF-8 encoded string.

7.1.8.5.2. Required parameters for learning

You must specify the following parameters when you specify --send-endpoint.

  • s: Elapsed time since 1970-01-01T00:00:00Z. The unit is milliseconds.
  • i: Unique ID to distinguish each user. A session ID, an IP address and so on can be used for this value.
  • l: One or more learning target dataset names. Use | as the separator, such as dataset1|dataset2|dataset3. A dataset name is the name that you specify to groonga-suggest-create-dataset.
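
As an illustration of the s and l values described above, the following sketch converts a timestamp to milliseconds since 1970-01-01T00:00:00Z and joins dataset names with |. Both helper names are hypothetical, and the dataset names are the placeholders used above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_s_parameter(moment):
    """Milliseconds elapsed since 1970-01-01T00:00:00Z.

    moment must be a timezone-aware datetime; a naive one raises TypeError.
    """
    return (moment - EPOCH) // timedelta(milliseconds=1)


def to_l_parameter(dataset_names):
    """Join one or more dataset names with the | separator."""
    return "|".join(dataset_names)


s_value = to_s_parameter(datetime(1970, 1, 1, 0, 0, 1, 500000, tzinfo=timezone.utc))
l_value = to_l_parameter(["dataset1", "dataset2", "dataset3"])
print(s_value, l_value)
```

In a real client, datetime.now(timezone.utc) would typically be passed to to_s_parameter.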

7.1.8.5.3. Required parameters for suggestion

You must specify the following parameters when you specify one or more of complete, correct and suggest for the t parameter.

  • n: The name of the dataset used to compute the suggestion result. A dataset name is the name that you specify to groonga-suggest-create-dataset.
  • t: The query type. Available values are complete, correct and suggest. You can specify multiple types by using | as the separator, such as complete|correct.
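
A request URL that combines several query types might be built as in the following sketch. build_suggest_url is a hypothetical helper; the parameter names n, t and q and the | separator come from the description above, and extra keyword arguments stand for parameters that are passed through to suggest. The | character is deliberately left unescaped so that the URL matches the form shown in this document.

```python
from urllib.parse import urlencode

ALLOWED_TYPES = {"complete", "correct", "suggest"}


def build_suggest_url(base_url, dataset, types, user_input, **suggest_params):
    """Build a suggestion request URL (hypothetical helper)."""
    unknown = set(types) - ALLOWED_TYPES
    if unknown:
        raise ValueError("unsupported query types: %s" % sorted(unknown))
    params = {"n": dataset, "t": "|".join(types), "q": user_input}
    params.update(suggest_params)  # passed through to suggest by the server
    # safe="|" keeps the type separator readable instead of encoding it as %7C.
    return base_url + "?" + urlencode(params, safe="|")


single = build_suggest_url("http://localhost:8080/", "query", ["complete"], "G",
                           frequency_threshold=1)
multi = build_suggest_url("http://localhost:8080/", "query",
                          ["complete", "correct"], "G")
print(single)
print(multi)
```

The first URL corresponds to a completion request with frequency_threshold=1; the second asks for completion and correction in one request.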

7.1.8.5.4. Optional parameters

Here are optional parameters.

  • callback: Function name for JSONP.

7.1.8.5.5. Optional parameters for learning

Here are optional parameters when you specify --send-endpoint.

  • t: The query type. The only available value is submit. You must specify submit when the user submits the input specified as q. You must not specify submit for user inputs that aren't submitted yet. When you don't specify submit, you can request suggestions by specifying complete, correct and/or suggest for t. See Required parameters for suggestion for details about these values.

7.1.8.6. Return value

groonga-suggest-httpd returns a response in the following format. It's the same format as the body of suggest:

{
  TYPE1: [
    [CANDIDATE_1, SCORE_1],
    [CANDIDATE_2, SCORE_2],
    ...,
    [CANDIDATE_N, SCORE_N]
  ],
  TYPE2: [
    [CANDIDATE_1, SCORE_1],
    [CANDIDATE_2, SCORE_2],
    ...,
    [CANDIDATE_N, SCORE_N]
  ],
  ...
}

Here is the response when t is submit:

{}

7.1.8.6.1. TYPE

One of complete, correct and suggest.

7.1.8.6.2. CANDIDATE_N

The string of candidate in UTF-8.

7.1.8.6.3. SCORE_N

The score of the candidate.

Candidates are sorted by score in descending order.

7.1.8.7. See also