Interface Embedder

  • All Known Implementing Classes:
    Embedder.FailingEmbedder

    public interface Embedder
    An embedder converts a text string to a tensor.
    Author:
    bratseth
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static Embedder throwsOnUse
      An instance of this interface that throws IllegalStateException when an attempt is made to use it
    • Method Summary

      Modifier and Type Method Description
      java.util.List<java.lang.Integer> embed(java.lang.String text, Embedder.Context context)
      Converts text into a list of token ids (a vector embedding)
      com.yahoo.tensor.Tensor embed(java.lang.String text, Embedder.Context context, com.yahoo.tensor.TensorType tensorType)
      Converts text into tokens in a tensor.
    • Field Detail

      • throwsOnUse

        static final Embedder throwsOnUse
        An instance of this interface that throws IllegalStateException when an attempt is made to use it
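
The placeholder field described above can be pictured with a minimal sketch. It does not use the real Vespa classes: SketchEmbedder and Context are hypothetical stand-ins that only mirror the two-argument embed signature, and the lambda is one possible way to build such an instance, not necessarily how Embedder.FailingEmbedder is implemented.

```java
import java.util.List;

public class ThrowsOnUseSketch {

    // Hypothetical stand-in for Embedder.Context; the real class is not described on this page
    static class Context { }

    // Hypothetical stand-in mirroring embed(String text, Embedder.Context context)
    interface SketchEmbedder {
        List<Integer> embed(String text, Context context);

        // Placeholder instance: safe to hold and pass around, fails only when invoked
        SketchEmbedder throwsOnUse = (text, context) -> {
            throw new IllegalStateException("No embedder has been configured");
        };
    }

    public static void main(String[] args) {
        try {
            SketchEmbedder.throwsOnUse.embed("hello", new Context());
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

A placeholder like this lets a component declare a non-null embedder dependency while deferring the failure to the first actual call, which tends to give a clearer error than a NullPointerException.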
    • Method Detail

      • embed

        java.util.List<java.lang.Integer> embed(java.lang.String text,
                                                Embedder.Context context)
        Converts text into a list of token ids (a vector embedding)
        Parameters:
        text - the text to embed
        context - the context which may influence an embedder's behavior
        Returns:
        the text embedded as a list of token ids
        Throws:
        java.lang.IllegalArgumentException - if the language is not supported by this embedder
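
To illustrate the contract of this method, here is a toy sketch that maps whitespace-separated words to integer ids and rejects unsupported languages. Everything in it is an assumption made for illustration: the class does not implement the real Embedder interface, Context is a hypothetical stand-in whose language field is invented here, and the growing vocabulary replaces whatever tokenizer a real embedder would use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class TokenIdEmbedderSketch {

    // Hypothetical stand-in for Embedder.Context; the language field is an assumption
    static class Context {
        final String language;
        Context(String language) { this.language = language; }
    }

    // Toy vocabulary: each distinct lowercased word gets the next free id
    private final Map<String, Integer> vocabulary = new HashMap<>();

    // Mirrors embed(String text, Embedder.Context context) from the method detail above
    public List<Integer> embed(String text, Context context) {
        if (!"en".equals(context.language))
            throw new IllegalArgumentException("Unsupported language: " + context.language);
        List<Integer> ids = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (token.isEmpty()) continue; // an empty input splits into one empty token
            Integer id = vocabulary.get(token);
            if (id == null) {
                id = vocabulary.size();
                vocabulary.put(token, id);
            }
            ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) {
        TokenIdEmbedderSketch embedder = new TokenIdEmbedderSketch();
        // prints [0, 1, 2, 0, 3]
        System.out.println(embedder.embed("the cat saw the dog", new Context("en")));
    }
}
```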
      • embed

        com.yahoo.tensor.Tensor embed(java.lang.String text,
                                      Embedder.Context context,
                                      com.yahoo.tensor.TensorType tensorType)
        Converts text into tokens in a tensor. The information contained in the embedding may depend on the tensor type.
        Parameters:
        text - the text to embed
        context - the context which may influence an embedder's behavior
        tensorType - the type of the tensor to be returned
        Returns:
        the tensor embedding of the text, as the specified tensor type
        Throws:
        java.lang.IllegalArgumentException - if the language or tensor type is not supported by this embedder
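
One way to read the statement that the embedded information may depend on the tensor type is sketched below. It avoids the real com.yahoo.tensor classes: DenseType is a hypothetical stand-in for a tensor type with a single indexed dimension, the tokenizer is invented (a token's id is its length), and the context argument is left out for brevity. Under those assumptions, the requested size decides whether the token ids are zero-padded or truncated.

```java
import java.util.Arrays;

public class TensorEmbedSketch {

    // Hypothetical stand-in for com.yahoo.tensor.TensorType: one indexed dimension of fixed size
    static class DenseType {
        final int size;
        DenseType(int size) { this.size = size; }
    }

    // Hypothetical tokenizer: a token's id is simply its length (for illustration only)
    static int[] tokenIds(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) return new int[0];
        return Arrays.stream(trimmed.split("\\s+")).mapToInt(String::length).toArray();
    }

    // Loosely follows the three-argument embed; the requested type decides how much is kept
    static float[] embed(String text, DenseType type) {
        if (type.size <= 0)
            throw new IllegalArgumentException("Unsupported tensor type size: " + type.size);
        int[] ids = tokenIds(text);
        float[] cells = new float[type.size]; // zero-filled, so short texts end up padded
        for (int i = 0; i < Math.min(ids.length, cells.length); i++)
            cells[i] = ids[i];                // tokens beyond the requested size are dropped
        return cells;
    }

    public static void main(String[] args) {
        // prints [1.0, 2.0, 3.0, 0.0, 0.0]
        System.out.println(Arrays.toString(embed("a bb ccc", new DenseType(5))));
        // prints [1.0, 2.0]
        System.out.println(Arrays.toString(embed("a bb ccc", new DenseType(2))));
    }
}
```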