public final class ScaledDotProductAttentionBlock extends AbstractBlock
Abbreviations used:
In many use cases F=T. For self attention, the input is equal to the output.
This block can process input in four forms:
Attention masks must contain a 1 for positions to keep and a 0 for positions to mask.
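The mask convention above (1 = keep, 0 = mask) can be illustrated with a minimal plain-Java sketch. It does not use DJL and is not the block's internal implementation. The class `AttentionMaskSketch` and its methods `paddingMask` and `maskedSoftmax` are hypothetical names introduced only for this illustration, and the mask shape this block actually expects is not assumed here.

```java
import java.util.Arrays;

/** Plain-Java sketch of the 1 = keep / 0 = mask convention; hypothetical, not DJL code. */
final class AttentionMaskSketch {

    /** Builds a padding mask: 1 for real tokens (keep), 0 for padding (mask). */
    static float[][] paddingMask(int[] lengths, int maxLength) {
        float[][] mask = new float[lengths.length][maxLength];
        for (int b = 0; b < lengths.length; b++) {
            // Positions [0, length) are real tokens; the rest stay 0.
            Arrays.fill(mask[b], 0, Math.min(lengths[b], maxLength), 1f);
        }
        return mask;
    }

    /** Softmax over one row of attention scores that skips positions whose mask is 0. */
    static float[] maskedSoftmax(float[] scores, float[] mask) {
        float max = Float.NEGATIVE_INFINITY;
        for (int i = 0; i < scores.length; i++) {
            if (mask[i] != 0f) {
                max = Math.max(max, scores[i]);
            }
        }
        float[] weights = new float[scores.length];
        if (max == Float.NEGATIVE_INFINITY) {
            return weights; // nothing to attend to: all weights stay 0
        }
        float sum = 0f;
        for (int i = 0; i < scores.length; i++) {
            if (mask[i] != 0f) {
                // Subtracting the maximum keeps the exponentials numerically stable.
                weights[i] = (float) Math.exp(scores[i] - max);
                sum += weights[i];
            }
        }
        for (int i = 0; i < scores.length; i++) {
            weights[i] /= sum;
        }
        return weights;
    }

    public static void main(String[] args) {
        float[][] mask = paddingMask(new int[] {3, 1}, 4);
        System.out.println(Arrays.deepToString(mask));
        // The last score is the largest, but its position is masked out.
        float[] weights = maskedSoftmax(new float[] {2f, 2f, 2f, 9f}, mask[0]);
        System.out.println(Arrays.toString(weights));
    }
}
```

In this sketch a masked position receives an attention weight of exactly 0, however large its raw score is, and the remaining weights still sum to 1.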
Modifier and Type | Class and Description |
---|---|
static class | ScaledDotProductAttentionBlock.Builder: A builder for ScaledDotProductAttentionBlocks. |
Fields inherited from class AbstractBlock: children, inputNames, inputShapes, parameters, parameterShapeCallbacks, version
Modifier and Type | Method and Description |
---|---|
static ScaledDotProductAttentionBlock.Builder | builder(): Creates a new Builder to build an Attention Block with. |
NDList | forward(ParameterStore parameterStore, NDList inputs, boolean training, ai.djl.util.PairList<java.lang.String,java.lang.Object> params): Applies the operating function of the block once. |
Linear | getKeyProjection(): Pointwise Linear projection of the keys. |
Shape[] | getOutputShapes(NDManager manager, Shape[] inputShapes): Returns the expected output shapes of the block for the specified input shapes. |
Linear | getQueryProjection(): Pointwise Linear projection of the queries. |
Linear | getResultProjection(): Pointwise Linear projection of the results. |
Linear | getValueProjection(): Pointwise Linear projection of the values. |
void | initializeChildBlocks(NDManager manager, DataType dataType, Shape... inputShapes): Initializes the Child blocks of this block. |
Methods inherited from class AbstractBlock: addChildBlock, addParameter, addParameter, addParameter, beforeInitialize, cast, clear, describeInput, getChildren, getDirectParameters, getParameters, getParameterShape, initialize, isInitialized, loadMetadata, loadParameters, readInputShapes, saveInputShapes, saveMetadata, saveParameters, setInitializer, setInitializer, toString
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Block: forward, forward, validateLayout
public Linear getKeyProjection()
public Linear getQueryProjection()
public Linear getValueProjection()
public Linear getResultProjection()
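The four getters above each return a pointwise Linear projection. As a rough illustration of what "pointwise" is taken to mean here (the same weights and bias applied to every sequence position independently), the following plain-Java sketch may help. It is an assumption-laden illustration, not DJL's `Linear` implementation; the class `PointwiseProjectionSketch` and its `project` method are hypothetical names.

```java
/** Plain-Java sketch of a pointwise linear projection; hypothetical, not DJL code. */
final class PointwiseProjectionSketch {

    /**
     * Projects every position of a sequence with the same weights and bias.
     *
     * @param sequence input of shape [sequenceLength][inputSize]
     * @param weights  weight matrix of shape [outputSize][inputSize]
     * @param bias     bias vector of shape [outputSize]
     * @return output of shape [sequenceLength][outputSize]
     */
    static float[][] project(float[][] sequence, float[][] weights, float[] bias) {
        float[][] out = new float[sequence.length][weights.length];
        for (int s = 0; s < sequence.length; s++) {
            // Each position s is transformed on its own; positions never mix here.
            for (int o = 0; o < weights.length; o++) {
                float acc = bias[o];
                for (int i = 0; i < sequence[s].length; i++) {
                    acc += weights[o][i] * sequence[s][i];
                }
                out[s][o] = acc;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] sequence = {{1f, 2f}, {3f, 4f}};
        float[][] weights = {{1f, 0f}, {0f, 1f}, {1f, 1f}};
        float[] bias = {0f, 0f, 1f};
        System.out.println(java.util.Arrays.deepToString(project(sequence, weights, bias)));
    }
}
```

Because the projection treats each position separately, the sequence length is unchanged and only the size of the last dimension differs between input and output.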
public Shape[] getOutputShapes(NDManager manager, Shape[] inputShapes)
manager - an NDManager
inputShapes - the shapes of the inputs
public void initializeChildBlocks(NDManager manager, DataType dataType, Shape... inputShapes)
initializeChildBlocks in class AbstractBlock
manager - the manager to use for initialization
dataType - the requested data type
inputShapes - the expected input shapes for this block
public NDList forward(ParameterStore parameterStore, NDList inputs, boolean training, ai.djl.util.PairList<java.lang.String,java.lang.Object> params)
parameterStore - the parameter store
inputs - the input NDList
training - true for a training forward pass
params - optional parameters
public static ScaledDotProductAttentionBlock.Builder builder()