INV(1)

NAME

inv - make an inverted index from the output of mkey

SYNOPSIS

inv [-danpv] [-hn] [-i [u] name] [outfile]

DESCRIPTION

The inv program computes the hash codes and writes the inverted files. It reads the output of mkey and writes the set of files listed under FILES. It takes one argument, outfile, which is used as the base name for the three (or four) files to be written. Assuming an argument of Index (the default), the entry file is named Index.ia, the posting file Index.ib, the tag file Index.ic, and the key file (if present) Index.id.
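The naming rule above can be sketched in a few lines of Python. This is only an illustration of the suffix convention, not part of inv; the helper name inv_output_names and its with_key_file parameter are hypothetical.

```python
def inv_output_names(base="Index", with_key_file=False):
    """Return the file names derived from a base name (illustrative helper).

    The suffixes follow the description above: .ia entry file,
    .ib posting file, .ic tag file, and .id key file (optional).
    """
    names = {
        "entry": base + ".ia",
        "posting": base + ".ib",
        "tag": base + ".ic",
    }
    if with_key_file:
        # The key file is only written in some cases, so it is opt-in here.
        names["key"] = base + ".id"
    return names


if __name__ == "__main__":
    for role, name in inv_output_names("Index", with_key_file=True).items():
        print(role, name)
```

With the default base name the helper yields Index.ia, Index.ib, and Index.ic, plus Index.id when the key file is requested.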

The inv program recognizes the following options:

About half the time used in inv is spent in the sort it contains. Assuming the sort is roughly linear, however, a rough estimate of the overall speed of inv is 250 keys per second. Space is usually the more important concern: the entry file uses four bytes per possible hash code (note the -h option), and the tag file uses around 15-20 bytes per item indexed. Roughly, the posting file contains one item for each key instance and one item for each possible hash code; the items are two bytes wide if the tag file is shorter than 65536 bytes and four bytes wide if it is longer. To minimize storage, the hash tables should be over-full; for most of the files indexed in this way there is no other real choice, since the entry file must fit in memory.
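The figures above can be combined into a back-of-the-envelope estimate. The following Python sketch is not part of inv; the function name estimate_inv_cost, its parameters, and the sample numbers are hypothetical. It assumes 20 bytes per tag entry (the upper end of the quoted range) and treats a tag file of exactly 65536 bytes as needing four-byte items, a case the text leaves open.

```python
def estimate_inv_cost(hash_size, items_indexed, key_instances,
                      tag_bytes_per_item=20, keys_per_second=250):
    """Rough space and time estimate based on the figures quoted above."""
    # Entry file: four bytes per possible hash code.
    entry_bytes = 4 * hash_size
    # Tag file: around 15-20 bytes per item indexed (20 assumed by default).
    tag_bytes = tag_bytes_per_item * items_indexed
    # Posting items are two bytes wide for a small tag file, four otherwise.
    # Assumption: a tag file of exactly 65536 bytes gets four-byte items.
    item_width = 2 if tag_bytes < 65536 else 4
    # Posting file: one item per key instance plus one per possible hash code.
    posting_bytes = item_width * (key_instances + hash_size)
    return {
        "entry_bytes": entry_bytes,
        "tag_bytes": tag_bytes,
        "posting_bytes": posting_bytes,
        "total_bytes": entry_bytes + tag_bytes + posting_bytes,
        "seconds": key_instances / keys_per_second,
    }


if __name__ == "__main__":
    # Hypothetical workload: 997 hash codes, 1000 items, 20000 key instances.
    for field, value in estimate_inv_cost(997, 1000, 20000).items():
        print(field, value)
```

For this sample workload the tag file stays under 65536 bytes, so the posting items are two bytes wide; raising items_indexed to 5000 pushes the tag file past the threshold and doubles the posting file.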

FILES

@BINDIR@/inv    Executable.

Assuming an argument of Index (the default):

Index.ia    Entry file.
Index.ib    Posting file.
Index.ic    Tag file.
Index.id    Key file.

LICENSE

The text of this manual page comes from Some Applications of Inverted Indexes on the UNIX System by M. E. Lesk, which is distributed under the BSD 4-clause license. The inv software is distributed under the CDDL license.

SEE ALSO

refer(1), referformat(7), mkey(1), hunt(1), and Some Applications of Inverted Indexes on the UNIX System by M. E. Lesk.

AUTHORS

M. E. Lesk. Modified by Pierre-Jean Fichet.