Package ai.djl.nn.transformer
Class BertBlock
- java.lang.Object
  - ai.djl.nn.AbstractBaseBlock
    - ai.djl.nn.AbstractBlock
      - ai.djl.nn.transformer.BertBlock
- All Implemented Interfaces:
  Block
public final class BertBlock extends AbstractBlock
Implements the core BERT model (without the next sentence prediction and masked language modeling tasks). This closely follows the original paper by Devlin et al. and its reference implementation.
-
Nested Class Summary
Nested Classes
Modifier and Type / Class / Description
static class BertBlock.Builder
-
Field Summary
-
Fields inherited from class ai.djl.nn.AbstractBlock
children, parameters
-
Fields inherited from class ai.djl.nn.AbstractBaseBlock
inputNames, inputShapes, version
-
Method Summary
Modifier and Type / Method / Description
static BertBlock.Builder builder()
    Returns a new BertBlock builder.
static NDArray createAttentionMaskFromInputMask(NDArray ids, NDArray mask)
    Creates a 3D attention mask from a 2D tensor mask.
protected NDList forwardInternal(ParameterStore ps, NDList inputs, boolean training, ai.djl.util.PairList<java.lang.String,java.lang.Object> params)
    A helper for Block.forward(ParameterStore, NDList, boolean, PairList) after initialization.
int getEmbeddingSize()
    Returns the embedding size used for tokens.
Shape[] getOutputShapes(Shape[] inputShapes)
    Returns the expected output shapes of the block for the specified input shapes.
int getTokenDictionarySize()
    Returns the size of the token dictionary.
IdEmbedding getTokenEmbedding()
    Returns the token embedding used by this BERT model.
int getTypeDictionarySize()
    Returns the size of the type dictionary.
void initializeChildBlocks(NDManager manager, DataType dataType, Shape... inputShapes)
    Initializes the child blocks of this block.
-
Methods inherited from class ai.djl.nn.AbstractBlock
addChildBlock, addChildBlock, addChildBlockSingleton, addParameter, getChildren, getDirectParameters
-
Methods inherited from class ai.djl.nn.AbstractBaseBlock
beforeInitialize, cast, clear, describeInput, forward, forward, forwardInternal, getInputShapes, getParameters, initialize, isInitialized, loadMetadata, loadParameters, prepare, readInputShapes, saveInputShapes, saveMetadata, saveParameters, setInitializer, setInitializer, setInitializer, toString
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface ai.djl.nn.Block
forward, freezeParameters
-
Method Detail
-
getTokenEmbedding
public IdEmbedding getTokenEmbedding()
Returns the token embedding used by this BERT model.
- Returns:
  the token embedding used by this BERT model
-
getEmbeddingSize
public int getEmbeddingSize()
Returns the embedding size used for tokens.
- Returns:
  the embedding size used for tokens
-
getTokenDictionarySize
public int getTokenDictionarySize()
Returns the size of the token dictionary.
- Returns:
  the size of the token dictionary
-
getTypeDictionarySize
public int getTypeDictionarySize()
Returns the size of the type dictionary.
- Returns:
  the size of the type dictionary
-
getOutputShapes
public Shape[] getOutputShapes(Shape[] inputShapes)
Returns the expected output shapes of the block for the specified input shapes.
- Parameters:
  inputShapes - the shapes of the inputs
- Returns:
  the expected output shapes of the block
-
initializeChildBlocks
public void initializeChildBlocks(NDManager manager, DataType dataType, Shape... inputShapes)
Initializes the child blocks of this block. Override this method if your subclass has child blocks. It is used to determine the correct input shapes for child blocks based on the requested input shape for this block.
- Overrides:
  initializeChildBlocks in class AbstractBaseBlock
- Parameters:
  manager - the manager to use for initialization
  dataType - the requested data type
  inputShapes - the expected input shapes for this block
-
createAttentionMaskFromInputMask
public static NDArray createAttentionMaskFromInputMask(NDArray ids, NDArray mask)
Creates a 3D attention mask from a 2D tensor mask.
- Parameters:
  ids - 2D tensor of shape (B, F)
  mask - 2D tensor of shape (B, T)
- Returns:
  float tensor of shape (B, F, T)
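The shape contract of createAttentionMaskFromInputMask can be illustrated without DJL. The following is a minimal sketch using plain Java arrays. It assumes the 2D mask is simply repeated along the F dimension, as in the BERT reference implementation, where B is the batch size, F the "from" sequence length, and T the "to" sequence length. The class name AttentionMaskSketch and its method are hypothetical, and this is not DJL's actual implementation, which operates on NDArray.

```java
import java.util.Arrays;

/** Plain-array illustration of expanding a (B, T) mask to shape (B, F, T). */
public final class AttentionMaskSketch {

    private AttentionMaskSketch() {}

    /**
     * Expands a 2D padding mask into a 3D attention mask.
     *
     * @param ids token ids of shape (B, F); only the shape is used in this sketch
     * @param mask 0/1 padding mask of shape (B, T)
     * @return float mask of shape (B, F, T), where result[b][f][t] == mask[b][t]
     */
    public static float[][][] createAttentionMask(int[][] ids, int[][] mask) {
        if (ids.length != mask.length) {
            throw new IllegalArgumentException("ids and mask must have the same batch size");
        }
        int batch = ids.length;
        float[][][] result = new float[batch][][];
        for (int b = 0; b < batch; b++) {
            int from = ids[b].length; // F
            int to = mask[b].length;  // T
            result[b] = new float[from][to];
            for (int f = 0; f < from; f++) {
                for (int t = 0; t < to; t++) {
                    // Every "from" position sees the same padding mask.
                    result[b][f][t] = mask[b][t];
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Arbitrary illustrative ids; the last position is padding.
        int[][] ids = {{11, 12, 13, 0}};
        int[][] mask = {{1, 1, 1, 0}};
        float[][][] attention = createAttentionMask(ids, mask);
        System.out.println(Arrays.deepToString(attention));
    }
}
```

In this sketch each of the F rows in a batch entry is a copy of that entry's mask, so padded "to" positions are zeroed out for every "from" position.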
-
forwardInternal
protected NDList forwardInternal(ParameterStore ps, NDList inputs, boolean training, ai.djl.util.PairList<java.lang.String,java.lang.Object> params)
A helper for Block.forward(ParameterStore, NDList, boolean, PairList) after initialization.
- Specified by:
  forwardInternal in class AbstractBaseBlock
- Parameters:
  ps - the parameter store
  inputs - the input NDList
  training - true for a training forward pass
  params - optional parameters
- Returns:
  the output of the forward pass
-
builder
public static BertBlock.Builder builder()
Returns a new BertBlock builder.
- Returns:
  a new BertBlock builder
-