Class

com.johnsnowlabs.ml.tensorflow

TensorflowAlbert


class TensorflowAlbert extends Serializable

This class calculates ALBERT embeddings for sequence batches of WordpieceTokenizedSentence. Input for this model must be tokenized with a SentencePieceModel.

This TensorFlow model uses the weights provided by https://tfhub.dev/google/albert_base/3

* sequence_output: representations of every token in the input sequence with shape [batch_size, max_sequence_length, hidden_size].
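To make the sequence_output shape concrete, the sketch below regroups a flat float buffer into the nested [batch_size][max_sequence_length][hidden_size] layout, which matches the Seq[Array[Array[Float]]] return type of tag. The object and the helper reshapeSequenceOutput are hypothetical names used for illustration only; this is an assumption about how such a buffer could be unpacked, not the code used inside TensorflowAlbert.

```scala
object SequenceOutputSketch {
  // Hypothetical helper (not part of Spark NLP): regroup a flat, row-major
  // buffer into one Array[Array[Float]] per sentence, i.e. [token][hidden].
  def reshapeSequenceOutput(
      flat: Array[Float],
      batchSize: Int,
      maxSeqLength: Int,
      hiddenSize: Int): Seq[Array[Array[Float]]] = {
    require(
      flat.length == batchSize * maxSeqLength * hiddenSize,
      "buffer length must equal batchSize * maxSeqLength * hiddenSize")
    flat
      .grouped(hiddenSize)   // one vector of hiddenSize floats per token
      .toArray
      .grouped(maxSeqLength) // maxSeqLength token vectors per sentence
      .toVector
  }
}
```

With batchSize = 2, maxSeqLength = 3 and hiddenSize = 4, the element at flat index 23 ends up at position (1)(2)(3) of the result.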

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations - Google Research, Toyota Technological Institute at Chicago

These embeddings represent the outputs generated by the ALBERT model. All official ALBERT releases by Google on TF-Hub are supported by this ALBERT wrapper:

TF-Hub models:

albert_base = https://tfhub.dev/google/albert_base/3 | 768-embed-dim, 12-layer, 12-heads, 12M parameters
albert_large = https://tfhub.dev/google/albert_large/3 | 1024-embed-dim, 24-layer, 16-heads, 18M parameters
albert_xlarge = https://tfhub.dev/google/albert_xlarge/3 | 2048-embed-dim, 24-layer, 32-heads, 60M parameters
albert_xxlarge = https://tfhub.dev/google/albert_xxlarge/3 | 4096-embed-dim, 12-layer, 64-heads, 235M parameters

This model requires its input to be tokenized with a SentencePiece model, which is provided by Spark NLP.

For additional information see:

https://arxiv.org/pdf/1909.11942.pdf
https://github.com/google-research/ALBERT
https://tfhub.dev/s?q=albert

Tips:

ALBERT shares parameters across its repeating layers, which results in a small memory footprint. The computational cost, however, remains similar to that of a BERT-like architecture with the same number of hidden layers, because the input still has to pass through the same number of (repeating) layers.
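The tip above can be illustrated with a toy model. In the sketch below, ToyLayer is a hypothetical stand-in for a transformer layer (it only scales its input), and the sizes are arbitrary assumptions. Reusing one layer instance 12 times stores far fewer parameters than 12 independent layers, yet both stacks execute the same number of layer passes.

```scala
object SharedLayersSketch {
  // Toy stand-in for a transformer layer: scales each input value by a weight.
  final class ToyLayer(val weights: Array[Float]) {
    def apply(x: Array[Float]): Array[Float] =
      x.zip(weights).map { case (v, w) => v * w }
  }

  // Counts stored parameters; a layer instance shared by reference counts once.
  def parameterCount(layers: Seq[ToyLayer]): Int = {
    val distinctLayers = layers.foldLeft(List.empty[ToyLayer]) { (seen, layer) =>
      if (seen.exists(_ eq layer)) seen else layer :: seen
    }
    distinctLayers.map(_.weights.length).sum
  }

  // Runs the input through every layer and reports how many passes were executed.
  def forward(layers: Seq[ToyLayer], input: Array[Float]): (Array[Float], Int) =
    layers.foldLeft((input, 0)) { case ((x, passes), layer) =>
      (layer(x), passes + 1)
    }

  val hidden = 4  // illustrative size, not a real ALBERT dimension
  val depth = 12

  // ALBERT-like: the same layer object is repeated `depth` times.
  val shared = new ToyLayer(Array.fill(hidden)(0.5f))
  val albertLike: Seq[ToyLayer] = Seq.fill(depth)(shared)

  // BERT-like: every position in the stack gets its own layer and weights.
  val bertLike: Seq[ToyLayer] =
    Seq.fill(depth)(new ToyLayer(Array.fill(hidden)(0.5f)))
}
```

Here parameterCount reports 4 stored weights for albertLike and 48 for bertLike, while forward performs 12 layer passes for either stack.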

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new TensorflowAlbert(tensorflow: TensorflowWrapper, spp: SentencePieceWrapper, batchSize: Int, configProtoBytes: Option[Array[Byte]] = None)


    tensorflow

    Albert Model wrapper with TensorFlowWrapper

    spp

    Albert SentencePiece model with SentencePieceWrapper

    batchSize

    size of batch

    configProtoBytes

    Configuration for TensorFlow session
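As a rough illustration of what a batchSize parameter usually controls, the sketch below splits a sequence of inputs into consecutive batches of at most batchSize items. The object and the helper inBatches are hypothetical, and the assumption that batching works this way is not confirmed by this page.

```scala
object BatchingSketch {
  // Hypothetical helper: split items into consecutive batches of at most
  // batchSize elements; the last batch may be smaller.
  def inBatches[A](items: Seq[A], batchSize: Int): Seq[Seq[A]] = {
    require(batchSize > 0, "batchSize must be positive")
    items.grouped(batchSize).toVector
  }
}
```

For example, seven inputs with batchSize = 3 yield batches of sizes 3, 3 and 1.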

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def calculateEmbeddings(sentences: Seq[TokenizedSentence], batchSize: Int, maxSentenceLength: Int, caseSensitive: Boolean): Seq[WordpieceEmbeddingsSentence]

  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  11. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  12. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  13. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  14. final def notify(): Unit

    Definition Classes
    AnyRef
  15. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  16. val spp: SentencePieceWrapper


    Albert SentencePiece model with SentencePieceWrapper

  17. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  18. def tag(batch: Seq[Array[Int]]): Seq[Array[Array[Float]]]

  19. val tensorflow: TensorflowWrapper


    Albert Model wrapper with TensorFlowWrapper

  20. def toString(): String

    Definition Classes
    AnyRef → Any
  21. def tokenize(sentences: Seq[TokenizedSentence], maxSeqLength: Int, caseSensitive: Boolean): Seq[Array[WordpieceTokenizedSentence]]

  22. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
