Creates a lookup table that converts string tensors into integer IDs.
This operation constructs a lookup table to convert tensors of strings into tensors of INT64 IDs. The mapping is initialized from a vocabulary file specified in filename, where the whole line is the key and the zero-based line number is the ID.
Any lookup of an out-of-vocabulary token returns a bucket ID based on the token's hash if numOOVBuckets is greater than zero. Otherwise, the token is assigned defaultValue. The bucket ID range is: [vocabularySize, vocabularySize + numOOVBuckets - 1].
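The lookup rule described above can be illustrated with a small plain-Scala sketch that does not depend on TensorFlow. The object name IndexTableSketch and its two methods are hypothetical helpers introduced only for illustration, and String.hashCode is a stand-in for whatever hash function the library actually applies, so the specific bucket chosen for a given token will generally differ from the real op.

```scala
object IndexTableSketch {
  // In-vocabulary mapping: the whole line is the key, the zero-based
  // line number is the ID.
  def buildVocabulary(lines: Seq[String]): Map[String, Long] =
    lines.zipWithIndex.map { case (line, index) => line -> index.toLong }.toMap

  // Applies the rule described above:
  //   - known key: its line number;
  //   - unknown key and numOOVBuckets > 0: an ID in
  //     [vocabularySize, vocabularySize + numOOVBuckets - 1];
  //   - unknown key otherwise: defaultValue.
  // ASSUMPTION: String.hashCode is used here only as a placeholder hash.
  def lookup(
      vocabulary: Map[String, Long],
      key: String,
      numOOVBuckets: Int,
      defaultValue: Long
  ): Long = vocabulary.get(key) match {
    case Some(id) => id
    case None if numOOVBuckets > 0 =>
      // floorMod keeps the bucket offset non-negative for negative hash codes.
      vocabulary.size.toLong + Math.floorMod(key.hashCode, numOOVBuckets).toLong
    case None => defaultValue
  }
}
```

With a vocabulary built from Seq("emerson", "lake", "palmer"), looking up "lake" yields 1. Looking up an unknown token such as "king" yields defaultValue when numOOVBuckets is 0, and an ID of either 3 or 4 when numOOVBuckets is 2.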
The underlying table must be initialized by executing the tf.tablesInitializer() op or the op returned by table.initialize().
Example usage:
If we have a vocabulary file "test.txt" with the following content (one token per line):
emerson
lake
palmer
Then we can use the following code to create a table mapping "emerson" -> 0, "lake" -> 1, and "palmer" -> 2:
val table = tf.indexTableFromFile("test.txt")
Filename of the text file to be used for initialization. The path must be accessible from wherever the graph is initialized (e.g., trainer or evaluation workers).
Delimiter to use in case a TextFileColumn extractor is being used.
Number of elements in the file, if known. If not known, set to -1 (the default value).
Default value to use if a key is missing from the table.
Number of out-of-vocabulary buckets.
Hashing function specification to use.
Data type of the table keys.
Name for the created table.
Created table.
Returns the set of all lookup table initializers that have been created in the current graph.
Returns an initializer op for all lookup table initializers that have been created in the current graph.