@TensorType(dataType=DT_STRING, byteSize=-1, mapperClass=TStringMapper.class) public interface TString extends org.tensorflow.ndarray.NdArray<String>, TType
This type can store arbitrary byte sequences of variable length.
Since the size of a tensor is fixed, creating a tensor of this type requires providing all of its values up front, so that TensorFlow can compute and allocate the right amount of memory. The data in the tensor is then initialized once and cannot be modified afterwards.
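As a rough illustration of why every value is needed before allocation, the sketch below sums the encoded length of each string using only the JDK. `StringPayloadSize` and `payloadBytes` are hypothetical names, not part of TensorFlow, and any per-element bookkeeping that TensorFlow adds to the allocation is not modeled here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of TensorFlow: sums the encoded length of every value.
final class StringPayloadSize {

    // The total is only known once all values (and the charset) are known.
    static long payloadBytes(Charset charset, String... values) {
        long total = 0;
        for (String value : values) {
            total += value.getBytes(charset).length; // length depends on content and charset
        }
        return total;
    }

    public static void main(String[] args) {
        // "\u00e9" is a single character but takes two bytes in UTF-8.
        long total = payloadBytes(StandardCharsets.UTF_8, "a", "\u00e9", "TensorFlow");
        System.out.println(total); // prints 13 (1 + 2 + 10)
    }
}
```

Because each element can have a different encoded length, the total cannot be derived from the shape alone, which is consistent with the `byteSize=-1` value in the type annotation above.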
Modifier and Type | Method and Description |
---|---|
org.tensorflow.ndarray.NdArray<byte[]> | asBytes() |
static TString | scalarOf(String value): Allocates a new tensor for storing a string scalar. |
static TString | tensorOf(Charset charset, org.tensorflow.ndarray.NdArray<String> src): Allocates a new tensor which is a copy of a given array. |
static TString | tensorOf(Charset charset, org.tensorflow.ndarray.Shape shape, org.tensorflow.ndarray.buffer.DataBuffer<String> data): Allocates a new tensor with the given shape and data. |
static TString | tensorOf(org.tensorflow.ndarray.NdArray<String> src): Allocates a new tensor which is a copy of a given array. |
static TString | tensorOf(org.tensorflow.ndarray.Shape shape, org.tensorflow.ndarray.buffer.DataBuffer<String> data): Allocates a new tensor with the given shape and data. |
static TString | tensorOfBytes(org.tensorflow.ndarray.NdArray<byte[]> src): Allocates a new tensor which is a copy of a given array of raw bytes. |
static TString | tensorOfBytes(org.tensorflow.ndarray.Shape shape, org.tensorflow.ndarray.buffer.DataBuffer<byte[]> data): Allocates a new tensor with the given shape and raw bytes. |
TString | using(Charset charset): Use a specific charset for decoding data from a string tensor, instead of the default UTF-8. |
static TString | vectorOf(String... values): Allocates a new tensor for storing a vector of strings. |
static TString scalarOf(String value)
The string is encoded into bytes using the UTF-8 charset.
value - scalar value to store in the new tensor
static TString vectorOf(String... values)
The strings are encoded into bytes using the UTF-8 charset.
values - values to store in the new tensor
static TString tensorOf(org.tensorflow.ndarray.NdArray<String> src)
The tensor will have the same shape as the source array and its data will be copied. The strings are encoded into bytes using the UTF-8 charset.
src - the source array giving the shape and data to the new tensor
static TString tensorOf(Charset charset, org.tensorflow.ndarray.NdArray<String> src)
The tensor will have the same shape as the source array and its data will be copied. The strings are encoded into bytes using the given charset.
If the charset is not the default UTF-8, it must also be provided explicitly when reading data from the tensor, by calling using(Charset):
// Given `originalStrings`, an initialized vector of strings
TString tensor = TString.tensorOf(StandardCharsets.UTF_16, originalStrings);
...
TString tensorStrings = tensor.using(StandardCharsets.UTF_16);
assertEquals(originalStrings.getObject(0), tensorStrings.getObject(0));
charset - charset to use for encoding the strings into bytes
src - the source array giving the shape and data to the new tensor
static TString tensorOf(org.tensorflow.ndarray.Shape shape, org.tensorflow.ndarray.buffer.DataBuffer<String> data)
The data will be copied from the provided buffer to the tensor after it is allocated. The strings are encoded into bytes using the UTF-8 charset.
shape - shape of the tensor
data - buffer of strings to initialize the tensor with
static TString tensorOf(Charset charset, org.tensorflow.ndarray.Shape shape, org.tensorflow.ndarray.buffer.DataBuffer<String> data)
The data will be copied from the provided buffer to the tensor after it is allocated. The strings are encoded into bytes using the given charset.
If the charset is not the default UTF-8, it must also be provided explicitly when reading data from the tensor, by calling using(Charset):
// Given `originalStrings`, an initialized buffer of strings
TString tensor =
    TString.tensorOf(StandardCharsets.UTF_16, Shape.of(originalStrings.size()), originalStrings);
...
TString tensorStrings = tensor.using(StandardCharsets.UTF_16);
assertEquals(originalStrings.getObject(0), tensorStrings.getObject(0));
charset - charset to use for encoding the strings into bytes
shape - shape of the tensor
data - buffer of strings to initialize the tensor with
static TString tensorOfBytes(org.tensorflow.ndarray.NdArray<byte[]> src)
The tensor will have the same shape as the source array and its data will be copied.
If the data must also be read back as raw bytes, the caller must request it explicitly by invoking asBytes() on the tensor:
byte[] bytes = tensor.asBytes().getObject(0); // returns the first sequence of bytes in the tensor
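To illustrate why raw bytes should be read back as bytes rather than as strings, the following JDK-only sketch checks whether a byte sequence survives being decoded as UTF-8 text and re-encoded. It does not call TensorFlow; `RawBytesVersusString` and `survivesUtf8Decoding` are hypothetical names used only for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration, not part of TensorFlow: arbitrary bytes are not always valid text.
final class RawBytesVersusString {

    // Decodes the bytes as UTF-8 text, re-encodes them, and compares with the input.
    static boolean survivesUtf8Decoding(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8); // malformed input is replaced
        byte[] reencoded = decoded.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(raw, reencoded);
    }

    public static void main(String[] args) {
        byte[] text = "TensorFlow".getBytes(StandardCharsets.UTF_8);
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x00}; // not a valid UTF-8 sequence

        System.out.println(survivesUtf8Decoding(text));   // true
        System.out.println(survivesUtf8Decoding(binary)); // false
    }
}
```

Bytes that are not valid in the decoding charset are silently replaced during decoding, so reading binary payloads through the string view can lose information.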
src - the source array giving the shape and data to the new tensor
static TString tensorOfBytes(org.tensorflow.ndarray.Shape shape, org.tensorflow.ndarray.buffer.DataBuffer<byte[]> data)
The data will be copied from the provided buffer to the tensor after it has been allocated.
If the data must also be read back as raw bytes, the caller must request it explicitly by invoking asBytes() on the tensor:
byte[] bytes = tensor.asBytes().getObject(0); // returns the first sequence of bytes in the tensor
shape - shape of the tensor to create
data - buffer of byte sequences to initialize the tensor with
TString using(Charset charset)
The charset must match the one used for encoding the string values when the tensor was created. For example:
TString tensor =
    TString.tensorOf(StandardCharsets.UTF_16, NdArrays.scalarOfObject("TensorFlow"));
assertEquals("TensorFlow", tensor.using(StandardCharsets.UTF_16).getObject());
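The effect of a charset mismatch can be reproduced without TensorFlow. The sketch below is a JDK-only stand-in for the encode-on-create and decode-on-read steps described above; `CharsetRoundTrip` and `roundTrip` are hypothetical names, and the sketch makes no claim about how TensorFlow stores the bytes internally.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the encode/decode steps; uses only the JDK.
final class CharsetRoundTrip {

    // Encodes with one charset and decodes with another, as a mismatched read would.
    static String roundTrip(String value, Charset writeCharset, Charset readCharset) {
        byte[] stored = value.getBytes(writeCharset); // bytes as they would be stored
        return new String(stored, readCharset);       // string as it would be read back
    }

    public static void main(String[] args) {
        String original = "TensorFlow";
        String matching = roundTrip(original, StandardCharsets.UTF_16, StandardCharsets.UTF_16);
        String mismatched = roundTrip(original, StandardCharsets.UTF_16, StandardCharsets.UTF_8);

        System.out.println(original.equals(matching));   // true
        System.out.println(original.equals(mismatched)); // false
    }
}
```

No exception is thrown in the mismatched case; the decoded text is simply wrong, which is why the read-side charset has to be chosen deliberately.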
charset - charset to use
org.tensorflow.ndarray.NdArray<byte[]> asBytes()
Copyright © 2015–2022. All rights reserved.