Interface Embedder

All Known Implementing Classes:
Embedder.FailingEmbedder

public interface Embedder
An embedder converts a text string to a tensor.
Author:
bratseth
  • Field Details

    • defaultEmbedderId

      static final String defaultEmbedderId
      Name of the embedder when none is explicitly given
    • throwsOnUse

      static final Embedder throwsOnUse
      An instance of this interface which throws IllegalStateException on any attempt to use it
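
A fail-fast placeholder like throwsOnUse can be sketched as follows. This is only an illustration under stated assumptions: SimpleEmbedder is a hypothetical, simplified stand-in for Embedder (it omits the Context argument), and the exception message is invented for the example.

```java
import java.util.List;

final class ThrowsOnUseSketch {

    /** Hypothetical, simplified stand-in for Embedder; the real interface also takes a Context. */
    interface SimpleEmbedder {
        List<Integer> embed(String text);
    }

    /** Placeholder that fails on any use, in the spirit of throwsOnUse. */
    static final SimpleEmbedder THROWS_ON_USE = text -> {
        // The message is an assumption made for this sketch.
        throw new IllegalStateException("No embedder has been configured");
    };

    /** Returns true if calling embed on the placeholder raises IllegalStateException. */
    static boolean failsOnUse() {
        try {
            THROWS_ON_USE.embed("hello");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

Such a placeholder lets code hold a non-null embedder reference while still surfacing a clear error if embedding is attempted without a real embedder.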
  • Method Details

    • asMap

      default Map<String,Embedder> asMap()
      Returns this embedder instance as a map with the default embedder name
    • asMap

      default Map<String,Embedder> asMap(String name)
      Returns this embedder instance as a map with the given name
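
The two asMap variants can be pictured as wrapping the instance in a single-entry map. The sketch below assumes that behavior; SimpleEmbedder, DEFAULT_ID and its value "default" are hypothetical names chosen for illustration and are not taken from this page.

```java
import java.util.List;
import java.util.Map;

final class AsMapSketch {

    /** Hypothetical, simplified stand-in for Embedder (no Context argument). */
    interface SimpleEmbedder {
        String DEFAULT_ID = "default"; // assumed value, not taken from the documentation

        List<Integer> embed(String text);

        /** Wraps this instance in a map under the default name. */
        default Map<String, SimpleEmbedder> asMap() { return asMap(DEFAULT_ID); }

        /** Wraps this instance in a single-entry map under the given name. */
        default Map<String, SimpleEmbedder> asMap(String name) { return Map.of(name, this); }
    }

    /** Toy embedder used only to have an instance to wrap. */
    static SimpleEmbedder lengthEmbedder() {
        return text -> List.of(text.length());
    }
}
```

With this sketch, lengthEmbedder().asMap("myEmbedder") yields a map whose only key is "myEmbedder" and whose value is the embedder itself.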
    • embed

      List<Integer> embed(String text, Embedder.Context context)
      Converts text into a list of token ids (a vector embedding)
      Parameters:
      text - the text to embed
      context - the context which may influence an embedder's behavior
      Returns:
      the text embedded as a list of token ids
      Throws:
      IllegalArgumentException - if the language is not supported by this embedder
    • decode

      default String decode(List<Integer> tokens, Embedder.Context context)
      Converts the list of token ids back into text. This is the inverse operation of embed.
      Parameters:
      tokens - the list of tokens to decode to a string
      context - the context which specifies the language used to select a model
      Returns:
      the string formed by decoding the tokens back to their string representation
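
The relationship between embed and decode can be illustrated with a toy round trip. ToyEmbedder below is a hypothetical word-level tokenizer written for this example; it is not an implementation of Embedder, it ignores the Context argument, and real embedders typically use a trained subword vocabulary instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RoundTripSketch {

    /** Toy word-level tokenizer: assigns ids in order of first appearance. */
    static final class ToyEmbedder {
        private final Map<String, Integer> idOf = new HashMap<>();
        private final List<String> wordOf = new ArrayList<>();

        /** Splits on whitespace and maps each word to its token id. */
        List<Integer> embed(String text) {
            List<Integer> ids = new ArrayList<>();
            for (String word : text.trim().split("\\s+")) {
                if (word.isEmpty()) continue; // empty input yields no tokens
                Integer id = idOf.get(word);
                if (id == null) {
                    id = wordOf.size();
                    idOf.put(word, id);
                    wordOf.add(word);
                }
                ids.add(id);
            }
            return ids;
        }

        /** Maps token ids back to words and joins them with single spaces. */
        String decode(List<Integer> tokens) {
            List<String> words = new ArrayList<>();
            for (int id : tokens) {
                if (id < 0 || id >= wordOf.size())
                    throw new IllegalArgumentException("Unknown token id: " + id);
                words.add(wordOf.get(id));
            }
            return String.join(" ", words);
        }
    }
}
```

In this toy version, decode restores the original text only up to whitespace normalization; whether a real embedder reproduces the input exactly depends on its tokenization model.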
    • embed

      com.yahoo.tensor.Tensor embed(String text, Embedder.Context context, com.yahoo.tensor.TensorType tensorType)
      Converts text into tokens in a tensor. The information contained in the embedding may depend on the tensor type.
      Parameters:
      text - the text to embed
      context - the context which may influence an embedder's behavior
      tensorType - the type of the tensor to be returned
      Returns:
      the tensor embedding of the text, as the specified tensor type
      Throws:
      IllegalArgumentException - if the language or tensor type is not supported by this embedder
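
One way the requested tensor type could influence the result is through its size: a one-dimensional dense tensor with N cells can hold at most N token ids. The sketch below assumes that truncate-and-zero-pad behavior purely for illustration; it uses a plain double[] as a stand-in for com.yahoo.tensor.Tensor, and toDenseVector is a hypothetical helper, not part of this API.

```java
import java.util.List;

final class TensorShapeSketch {

    /**
     * Copies token ids into a dense vector of the requested size:
     * tokens beyond the size are dropped, unused cells stay 0.
     */
    static double[] toDenseVector(List<Integer> tokenIds, int size) {
        if (size < 0)
            throw new IllegalArgumentException("Size must be non-negative, got " + size);
        double[] cells = new double[size];
        int n = Math.min(size, tokenIds.size());
        for (int i = 0; i < n; i++)
            cells[i] = tokenIds.get(i);
        return cells;
    }
}
```

For example, three token ids placed in a vector of size 5 leave the last two cells at 0, while a vector of size 2 keeps only the first two ids.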