Class

com.johnsnowlabs.ml.tensorflow

TensorflowAlbert

Related Doc: package tensorflow

Permalink

class TensorflowAlbert extends Serializable

This class is used to calculate ALBERT embeddings for For Sequence Batches of WordpieceTokenizedSentence. Input for this model must be tokenzied with a SentencePieceModel,

This Tensorflow model is using the weights provided by https://tfhub.dev/google/albert_base/3 * sequence_output: representations of every token in the input sequence with shape [batch_size, max_sequence_length, hidden_size].

ALBERT: A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS - Google Research, Toyota Technological Institute at Chicago This these embeddings represent the outputs generated by the Albert model. All offical Albert releases by google in TF-HUB are supported with this Albert Wrapper:

TF-HUB Models : albert_base = https://tfhub.dev/google/albert_base/3 | 768-embed-dim, 12-layer, 12-heads, 12M parameters albert_large = https://tfhub.dev/google/albert_large/3 | 1024-embed-dim, 24-layer, 16-heads, 18M parameters albert_xlarge = https://tfhub.dev/google/albert_xlarge/3 | 2048-embed-dim, 24-layer, 32-heads, 60M parameters albert_xxlarge = https://tfhub.dev/google/albert_xxlarge/3 | 4096-embed-dim, 12-layer, 64-heads, 235M parameters

This model requires input tokenization with SentencePiece model, which is provided by Spark NLP

For additional information see : https://arxiv.org/pdf/1909.11942.pdf https://github.com/google-research/ALBERT https://tfhub.dev/s?q=albert

Tips:

ALBERT uses repeating layers which results in a small memory footprint, however the computational cost remains similar to a BERT-like architecture with the same number of hidden layers as it has to iterate through the same number of (repeating) layers.

Linear Supertypes
Serializable, Serializable, AnyRef, Any
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. TensorflowAlbert
  2. Serializable
  3. Serializable
  4. AnyRef
  5. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Instance Constructors

  1. new TensorflowAlbert(tensorflow: TensorflowWrapper, spp: SentencePieceWrapper, batchSize: Int, configProtoBytes: Option[Array[Byte]] = None, signatures: Option[Map[String, String]] = None)

    Permalink

    tensorflow

    Albert Model wrapper with TensorFlowWrapper

    spp

    Albert SentencePiece model with SentencePieceWrapper

    batchSize

    size of batch

    configProtoBytes

    Configuration for TensorFlow session

Value Members

  1. final def !=(arg0: Any): Boolean

    Permalink
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Permalink
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Permalink
    Definition Classes
    AnyRef → Any
  4. val _tfAlbertSignatures: Map[String, String]

    Permalink
  5. final def asInstanceOf[T0]: T0

    Permalink
    Definition Classes
    Any
  6. def calculateEmbeddings(tokenizedSentences: Seq[TokenizedSentence], batchSize: Int, maxSentenceLength: Int, caseSensitive: Boolean): Seq[WordpieceEmbeddingsSentence]

    Permalink
  7. def clone(): AnyRef

    Permalink
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Permalink
    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Permalink
    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Permalink
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]

    Permalink
    Definition Classes
    AnyRef → Any
  12. def hashCode(): Int

    Permalink
    Definition Classes
    AnyRef → Any
  13. final def isInstanceOf[T0]: Boolean

    Permalink
    Definition Classes
    Any
  14. final def ne(arg0: AnyRef): Boolean

    Permalink
    Definition Classes
    AnyRef
  15. final def notify(): Unit

    Permalink
    Definition Classes
    AnyRef
  16. final def notifyAll(): Unit

    Permalink
    Definition Classes
    AnyRef
  17. def prepareBatchInputs(sentences: Seq[(WordpieceTokenizedSentence, Int)], maxSequenceLength: Int): Seq[Array[Int]]

    Permalink
  18. val spp: SentencePieceWrapper

    Permalink

    Albert SentencePiece model with SentencePieceWrapper

  19. final def synchronized[T0](arg0: ⇒ T0): T0

    Permalink
    Definition Classes
    AnyRef
  20. def tag(batch: Seq[Array[Int]]): Seq[Array[Array[Float]]]

    Permalink
  21. val tensorflow: TensorflowWrapper

    Permalink

    Albert Model wrapper with TensorFlowWrapper

  22. def toString(): String

    Permalink
    Definition Classes
    AnyRef → Any
  23. def tokenizeWithAlignment(sentences: Seq[TokenizedSentence], maxSeqLength: Int, caseSensitive: Boolean): Seq[WordpieceTokenizedSentence]

    Permalink
  24. final def wait(): Unit

    Permalink
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long, arg1: Int): Unit

    Permalink
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long): Unit

    Permalink
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Inherited from Serializable

Inherited from Serializable

Inherited from AnyRef

Inherited from Any

Ungrouped