public class SDNN extends SDOps
Modifier and Type | Method and Description
---|---
SDVariable | batchNorm(SDVariable input, SDVariable mean, SDVariable variance, SDVariable gamma, SDVariable beta, double epsilon, int... axis)
SDVariable | batchNorm(String name, SDVariable input, SDVariable mean, SDVariable variance, SDVariable gamma, SDVariable beta, double epsilon, int... axis)
SDVariable | biasAdd(SDVariable input, SDVariable bias, boolean nchw) - Bias addition operation: a special case of addition, typically used with CNN 4D activations and a 1D bias vector
SDVariable | biasAdd(String name, SDVariable input, SDVariable bias, boolean nchw) - Bias addition operation: a special case of addition, typically used with CNN 4D activations and a 1D bias vector
SDVariable | cReLU(SDVariable x) - Concatenates a ReLU which selects only the positive part of the activation with a ReLU which selects only the negative part of the activation
SDVariable | cReLU(String name, SDVariable x) - Concatenates a ReLU which selects only the positive part of the activation with a ReLU which selects only the negative part of the activation
SDVariable | dotProductAttention(SDVariable queries, SDVariable keys, SDVariable values, SDVariable mask, boolean scaled) - Performs dot product attention on the given timeseries input with the given queries: out = sum(similarity(k_i, q) * v_i), where similarity(k, q) = softmax(k * q) and k * q is the dot product of k and q; optionally with a normalization step, similarity(k, q) = softmax(k * q / sqrt(size(q))). See also "Attention Is All You Need" (https://arxiv.org/abs/1706.03762)
SDVariable | dotProductAttention(String name, SDVariable queries, SDVariable keys, SDVariable values, SDVariable mask, boolean scaled) - Performs dot product attention on the given timeseries input with the given queries: out = sum(similarity(k_i, q) * v_i), where similarity(k, q) = softmax(k * q) and k * q is the dot product of k and q; optionally with a normalization step, similarity(k, q) = softmax(k * q / sqrt(size(q))). See also "Attention Is All You Need" (https://arxiv.org/abs/1706.03762)
SDVariable | dropout(SDVariable input, double inputRetainProbability) - Dropout operation
SDVariable | dropout(String name, SDVariable input, double inputRetainProbability) - Dropout operation
SDVariable | elu(SDVariable x) - Element-wise exponential linear unit (ELU) function: out = x if x > 0; out = a * (exp(x) - 1) if x <= 0, with constant a = 1.0
SDVariable | elu(String name, SDVariable x) - Element-wise exponential linear unit (ELU) function: out = x if x > 0; out = a * (exp(x) - 1) if x <= 0, with constant a = 1.0
SDVariable | gelu(SDVariable x) - GELU activation function (Gaussian Error Linear Units); this method uses the sigmoid approximation. For more details, see "Gaussian Error Linear Units (GELUs)": https://arxiv.org/abs/1606.08415
SDVariable | gelu(String name, SDVariable x) - GELU activation function (Gaussian Error Linear Units); this method uses the sigmoid approximation. For more details, see "Gaussian Error Linear Units (GELUs)": https://arxiv.org/abs/1606.08415
SDVariable | hardSigmoid(SDVariable x) - Element-wise hard sigmoid function: out[i] = 0 if in[i] <= -2.5; out[i] = 0.2*in[i] + 0.5 if -2.5 < in[i] < 2.5; out[i] = 1 if in[i] >= 2.5
SDVariable | hardSigmoid(String name, SDVariable x) - Element-wise hard sigmoid function: out[i] = 0 if in[i] <= -2.5; out[i] = 0.2*in[i] + 0.5 if -2.5 < in[i] < 2.5; out[i] = 1 if in[i] >= 2.5
SDVariable | hardTanh(SDVariable x) - Element-wise hard tanh function: out[i] = -1 if in[i] <= -1; out[i] = in[i] if -1 < in[i] < 1; out[i] = 1 if in[i] >= 1
SDVariable | hardTanh(String name, SDVariable x) - Element-wise hard tanh function: out[i] = -1 if in[i] <= -1; out[i] = in[i] if -1 < in[i] < 1; out[i] = 1 if in[i] >= 1
SDVariable | hardTanhDerivative(SDVariable x) - Derivative (dOut/dIn) of the element-wise hard tanh function
SDVariable | hardTanhDerivative(String name, SDVariable x) - Derivative (dOut/dIn) of the element-wise hard tanh function
SDVariable | layerNorm(SDVariable input, SDVariable gain, boolean channelsFirst, int... dimensions) - Apply layer normalization: y = gain * standardize(x) + bias
SDVariable | layerNorm(SDVariable input, SDVariable gain, SDVariable bias, boolean channelsFirst, int... dimensions) - Apply layer normalization: y = gain * standardize(x) + bias
SDVariable | layerNorm(String name, SDVariable input, SDVariable gain, boolean channelsFirst, int... dimensions) - Apply layer normalization: y = gain * standardize(x) + bias
SDVariable | layerNorm(String name, SDVariable input, SDVariable gain, SDVariable bias, boolean channelsFirst, int... dimensions) - Apply layer normalization: y = gain * standardize(x) + bias
SDVariable | leakyRelu(SDVariable x, double alpha) - Element-wise leaky ReLU function: out = x if x >= 0.0; out = alpha * x if x < 0.0. Alpha is most commonly set to 0.01
SDVariable | leakyRelu(String name, SDVariable x, double alpha) - Element-wise leaky ReLU function: out = x if x >= 0.0; out = alpha * x if x < 0.0. Alpha is most commonly set to 0.01
SDVariable | leakyReluDerivative(SDVariable x, double alpha) - Leaky ReLU derivative: dOut/dIn given input
SDVariable | leakyReluDerivative(String name, SDVariable x, double alpha) - Leaky ReLU derivative: dOut/dIn given input
SDVariable | linear(SDVariable input, SDVariable weights, SDVariable bias) - Linear layer operation: out = mmul(in, w) + bias. Note that the bias array is optional
SDVariable | linear(String name, SDVariable input, SDVariable weights, SDVariable bias) - Linear layer operation: out = mmul(in, w) + bias. Note that the bias array is optional
SDVariable | logSigmoid(SDVariable x) - Element-wise log sigmoid function: out[i] = log(sigmoid(in[i]))
SDVariable | logSigmoid(String name, SDVariable x) - Element-wise log sigmoid function: out[i] = log(sigmoid(in[i]))
SDVariable | logSoftmax(SDVariable x) - Log softmax activation
SDVariable | logSoftmax(SDVariable x, int dimension) - Log softmax activation
SDVariable | logSoftmax(String name, SDVariable x) - Log softmax activation
SDVariable | logSoftmax(String name, SDVariable x, int dimension) - Log softmax activation
SDVariable | multiHeadDotProductAttention(SDVariable queries, SDVariable keys, SDVariable values, SDVariable Wq, SDVariable Wk, SDVariable Wv, SDVariable Wo, SDVariable mask, boolean scaled) - Performs multi-headed dot product attention on the given timeseries input: out = concat(head_1, head_2, ..., head_n) * Wo, where head_i = dot_product_attention(Wq_i*q, Wk_i*k, Wv_i*v); optionally with normalization when calculating the attention for each head. See also "Attention Is All You Need" (https://arxiv.org/abs/1706.03762)
SDVariable | multiHeadDotProductAttention(String name, SDVariable queries, SDVariable keys, SDVariable values, SDVariable Wq, SDVariable Wk, SDVariable Wv, SDVariable Wo, SDVariable mask, boolean scaled) - Performs multi-headed dot product attention on the given timeseries input: out = concat(head_1, head_2, ..., head_n) * Wo, where head_i = dot_product_attention(Wq_i*q, Wk_i*k, Wv_i*v); optionally with normalization when calculating the attention for each head. See also "Attention Is All You Need" (https://arxiv.org/abs/1706.03762)
SDVariable | pad(SDVariable input, SDVariable padding, double constant) - Padding operation
SDVariable | pad(SDVariable input, SDVariable padding, PadMode PadMode, double constant) - Padding operation
SDVariable | pad(String name, SDVariable input, SDVariable padding, double constant) - Padding operation
SDVariable | pad(String name, SDVariable input, SDVariable padding, PadMode PadMode, double constant) - Padding operation
SDVariable | preciseGelu(SDVariable x) - GELU activation function (Gaussian Error Linear Units); this method uses the precise method. For more details, see "Gaussian Error Linear Units (GELUs)": https://arxiv.org/abs/1606.08415
SDVariable | preciseGelu(String name, SDVariable x) - GELU activation function (Gaussian Error Linear Units); this method uses the precise method. For more details, see "Gaussian Error Linear Units (GELUs)": https://arxiv.org/abs/1606.08415
SDVariable | prelu(SDVariable input, SDVariable alpha, int... sharedAxes) - PReLU (Parameterized Rectified Linear Unit) operation
SDVariable | prelu(String name, SDVariable input, SDVariable alpha, int... sharedAxes) - PReLU (Parameterized Rectified Linear Unit) operation
SDVariable | relu(SDVariable x, double cutoff) - Element-wise rectified linear function with specified cutoff: out[i] = in[i] if in[i] >= cutoff; out[i] = 0 otherwise
SDVariable | relu(String name, SDVariable x, double cutoff) - Element-wise rectified linear function with specified cutoff: out[i] = in[i] if in[i] >= cutoff; out[i] = 0 otherwise
SDVariable | relu6(SDVariable x, double cutoff) - Element-wise "rectified linear 6" function with specified cutoff: out[i] = min(max(in, cutoff), 6)
SDVariable | relu6(String name, SDVariable x, double cutoff) - Element-wise "rectified linear 6" function with specified cutoff: out[i] = min(max(in, cutoff), 6)
SDVariable | reluLayer(SDVariable input, SDVariable weights, SDVariable bias) - ReLU (Rectified Linear Unit) layer operation: out = relu(mmul(in, w) + bias). Note that the bias array is optional
SDVariable | reluLayer(String name, SDVariable input, SDVariable weights, SDVariable bias) - ReLU (Rectified Linear Unit) layer operation: out = relu(mmul(in, w) + bias). Note that the bias array is optional
SDVariable | selu(SDVariable x) - Element-wise SELU function (Scaled Exponential Linear Unit), see "Self-Normalizing Neural Networks": out[i] = scale * in[i] if in[i] > 0; out[i] = scale * alpha * (exp(in[i]) - 1) if in[i] <= 0. Uses default scale and alpha values
SDVariable | selu(String name, SDVariable x) - Element-wise SELU function (Scaled Exponential Linear Unit), see "Self-Normalizing Neural Networks": out[i] = scale * in[i] if in[i] > 0; out[i] = scale * alpha * (exp(in[i]) - 1) if in[i] <= 0. Uses default scale and alpha values
SDVariable | sigmoid(SDVariable x) - Element-wise sigmoid function: out[i] = 1.0 / (1 + exp(-in[i]))
SDVariable | sigmoid(String name, SDVariable x) - Element-wise sigmoid function: out[i] = 1.0 / (1 + exp(-in[i]))
SDVariable | sigmoidDerivative(SDVariable x, SDVariable wrt) - Element-wise sigmoid function derivative: dL/dIn given input and dL/dOut
SDVariable | sigmoidDerivative(String name, SDVariable x, SDVariable wrt) - Element-wise sigmoid function derivative: dL/dIn given input and dL/dOut
SDVariable | softmax(SDVariable x) - Softmax activation, along the specified dimension
SDVariable | softmax(SDVariable x, int dimension) - Softmax activation, along the specified dimension
SDVariable | softmax(String name, SDVariable x) - Softmax activation, along the specified dimension
SDVariable | softmax(String name, SDVariable x, int dimension) - Softmax activation, along the specified dimension
SDVariable | softmaxDerivative(SDVariable x, SDVariable wrt, int dimension) - Softmax derivative function
SDVariable | softmaxDerivative(String name, SDVariable x, SDVariable wrt, int dimension) - Softmax derivative function
SDVariable | softplus(SDVariable x) - Element-wise softplus function: out = log(exp(x) + 1)
SDVariable | softplus(String name, SDVariable x) - Element-wise softplus function: out = log(exp(x) + 1)
SDVariable | softsign(SDVariable x) - Element-wise softsign function: out = x / (abs(x) + 1)
SDVariable | softsign(String name, SDVariable x) - Element-wise softsign function: out = x / (abs(x) + 1)
SDVariable | softsignDerivative(SDVariable x) - Element-wise derivative (dOut/dIn) of the softsign function
SDVariable | softsignDerivative(String name, SDVariable x) - Element-wise derivative (dOut/dIn) of the softsign function
SDVariable | swish(SDVariable x) - Element-wise "swish" function: out = x * sigmoid(b*x), with b = 1.0. See https://arxiv.org/abs/1710.05941
SDVariable | swish(String name, SDVariable x) - Element-wise "swish" function: out = x * sigmoid(b*x), with b = 1.0. See https://arxiv.org/abs/1710.05941
SDVariable | tanh(SDVariable x) - Element-wise tanh (hyperbolic tangent) operation: out = tanh(x)
SDVariable | tanh(String name, SDVariable x) - Element-wise tanh (hyperbolic tangent) operation: out = tanh(x)
public SDNN(SameDiff sameDiff)
public SDVariable cReLU(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable cReLU(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)
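All of these ops follow the same calling convention: each method has an unnamed overload and an overload whose leading String names the output variable. A minimal, self-contained sketch of that convention using cReLU (the class name, variable names, and shapes are illustrative, not part of the API; the later sketches in this section assume the same imports and main-method scaffolding):

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.factory.Nd4j;

public class CReluExample {
    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();
        SDVariable x = sd.var("x", Nd4j.rand(2, 4));
        // Named overload: "crelu" becomes the name of the output variable
        SDVariable out = sd.nn().cReLU("crelu", x);
        // cReLU concatenates the positive-part ReLU with the negative-part ReLU,
        // so the activation size doubles along the concatenation dimension
        System.out.println(out.eval());
    }
}
```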
public SDVariable batchNorm(SDVariable input, SDVariable mean, SDVariable variance, SDVariable gamma, SDVariable beta, double epsilon, int... axis)
Parameters:
input - Input variable. (NUMERIC type)
mean - Mean value. For 1d axis, this should match input.size(axis) (NUMERIC type)
variance - Variance value. For 1d axis, this should match input.size(axis) (NUMERIC type)
gamma - Gamma value. For 1d axis, this should match input.size(axis) (NUMERIC type)
beta - Beta value. For 1d axis, this should match input.size(axis) (NUMERIC type)
epsilon - Epsilon constant for numerical stability (to avoid division by 0)
axis - For 2d CNN activations: 1 for NCHW format activations, or 3 for NHWC format activations. For 3d CNN activations: 1 for NCDHW format, 4 for NDHWC. For 1d/RNN activations: 1 for NCW format, 2 for NWC (Size: AtLeast(min=1))

public SDVariable batchNorm(String name, SDVariable input, SDVariable mean, SDVariable variance, SDVariable gamma, SDVariable beta, double epsilon, int... axis)
Parameters:
name - Name for the output variable. May be null.
input - Input variable. (NUMERIC type)
mean - Mean value. For 1d axis, this should match input.size(axis) (NUMERIC type)
variance - Variance value. For 1d axis, this should match input.size(axis) (NUMERIC type)
gamma - Gamma value. For 1d axis, this should match input.size(axis) (NUMERIC type)
beta - Beta value. For 1d axis, this should match input.size(axis) (NUMERIC type)
epsilon - Epsilon constant for numerical stability (to avoid division by 0)
axis - For 2d CNN activations: 1 for NCHW format activations, or 3 for NHWC format activations. For 3d CNN activations: 1 for NCDHW format, 4 for NDHWC. For 1d/RNN activations: 1 for NCW format, 2 for NWC (Size: AtLeast(min=1))
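A sketch of batch-normalizing NCHW activations, assuming the same scaffolding as the cReLU sketch above plus an import of org.nd4j.linalg.api.buffer.DataType; the shapes and the 1e-5 epsilon are illustrative choices:

```java
SameDiff sd = SameDiff.create();
int channels = 3;
// NCHW activations: [minibatch, channels, height, width]
SDVariable in = sd.var("in", Nd4j.rand(new int[]{2, channels, 4, 4}));
// Per-channel statistics and learned scale/shift, each of length input.size(1)
SDVariable mean     = sd.var("mean",     Nd4j.zeros(DataType.FLOAT, channels));
SDVariable variance = sd.var("variance", Nd4j.ones(DataType.FLOAT, channels));
SDVariable gamma    = sd.var("gamma",    Nd4j.ones(DataType.FLOAT, channels));
SDVariable beta     = sd.var("beta",     Nd4j.zeros(DataType.FLOAT, channels));
// axis = 1 selects the channel dimension for NCHW format
SDVariable out = sd.nn().batchNorm("bn", in, mean, variance, gamma, beta, 1e-5, 1);
```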
public SDVariable biasAdd(SDVariable input, SDVariable bias, boolean nchw)
Parameters:
input - 4d input variable (NUMERIC type)
bias - 1d bias (NUMERIC type)
nchw - The format: nchw=true means [minibatch, channels, height, width]; nchw=false means [minibatch, height, width, channels]. Unused for 2d inputs

public SDVariable biasAdd(String name, SDVariable input, SDVariable bias, boolean nchw)
Parameters:
name - Name for the output variable. May be null.
input - 4d input variable (NUMERIC type)
bias - 1d bias (NUMERIC type)
nchw - The format: nchw=true means [minibatch, channels, height, width]; nchw=false means [minibatch, height, width, channels]. Unused for 2d inputs
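A sketch of adding a per-channel bias to NCHW activations (same scaffolding as the earlier sketches; shapes are illustrative):

```java
SameDiff sd = SameDiff.create();
SDVariable act  = sd.var("act",  Nd4j.rand(new int[]{2, 3, 4, 4})); // NCHW activations
SDVariable bias = sd.var("bias", Nd4j.rand(new int[]{3}));          // one bias value per channel
SDVariable out = sd.nn().biasAdd("biased", act, bias, true);        // nchw=true: channel axis is 1
```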
public SDVariable dotProductAttention(SDVariable queries, SDVariable keys, SDVariable values, SDVariable mask, boolean scaled)
Parameters:
queries - Input 3D array "queries" of shape [batchSize, featureKeys, queryCount], or 4D array of shape [batchSize, numHeads, featureKeys, queryCount] (NUMERIC type)
keys - Input 3D array "keys" of shape [batchSize, featureKeys, timesteps], or 4D array of shape [batchSize, numHeads, featureKeys, timesteps] (NUMERIC type)
values - Input 3D array "values" of shape [batchSize, featureValues, timesteps], or 4D array of shape [batchSize, numHeads, featureValues, timesteps] (NUMERIC type)
mask - OPTIONAL: array of shape [batchSize, timesteps] that defines which values should be skipped (NUMERIC type)
scaled - Normalization: false to not apply normalization, true to apply normalization

public SDVariable dotProductAttention(String name, SDVariable queries, SDVariable keys, SDVariable values, SDVariable mask, boolean scaled)
Parameters:
name - Name for the output variable. May be null.
queries - Input 3D array "queries" of shape [batchSize, featureKeys, queryCount], or 4D array of shape [batchSize, numHeads, featureKeys, queryCount] (NUMERIC type)
keys - Input 3D array "keys" of shape [batchSize, featureKeys, timesteps], or 4D array of shape [batchSize, numHeads, featureKeys, timesteps] (NUMERIC type)
values - Input 3D array "values" of shape [batchSize, featureValues, timesteps], or 4D array of shape [batchSize, numHeads, featureValues, timesteps] (NUMERIC type)
mask - OPTIONAL: array of shape [batchSize, timesteps] that defines which values should be skipped (NUMERIC type)
scaled - Normalization: false to not apply normalization, true to apply normalization
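A sketch of single-head attention over a small batch (same scaffolding as the earlier sketches); the shapes are illustrative, and passing null for the optional mask is this sketch's reading of "OPTIONAL" in the parameter docs above:

```java
SameDiff sd = SameDiff.create();
int batch = 2, featureKeys = 4, featureValues = 5, timesteps = 3, queryCount = 1;
SDVariable q = sd.var("q", Nd4j.rand(new int[]{batch, featureKeys, queryCount}));
SDVariable k = sd.var("k", Nd4j.rand(new int[]{batch, featureKeys, timesteps}));
SDVariable v = sd.var("v", Nd4j.rand(new int[]{batch, featureValues, timesteps}));
// scaled=true applies the softmax(k * q / sqrt(size(q))) normalization
SDVariable out = sd.nn().dotProductAttention("attn", q, k, v, null, true);
```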
public SDVariable dropout(SDVariable input, double inputRetainProbability)
Parameters:
input - Input array (NUMERIC type)
inputRetainProbability - Probability of retaining an input (set to 0 with probability 1-p)

public SDVariable dropout(String name, SDVariable input, double inputRetainProbability)
Parameters:
name - Name for the output variable. May be null.
input - Input array (NUMERIC type)
inputRetainProbability - Probability of retaining an input (set to 0 with probability 1-p)
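A sketch of dropout with a retain probability of 0.8 (same scaffolding as the earlier sketches):

```java
SameDiff sd = SameDiff.create();
SDVariable in = sd.var("in", Nd4j.ones(2, 4));
// Retain each activation with probability 0.8, i.e. zero it with probability 0.2
SDVariable dropped = sd.nn().dropout("drop", in, 0.8);
```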
public SDVariable elu(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable elu(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable gelu(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable gelu(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable hardSigmoid(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable hardSigmoid(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable hardTanh(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable hardTanh(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable hardTanhDerivative(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable hardTanhDerivative(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)
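These element-wise activations all share the x / (name, x) signature, so one sketch covers the pattern (same scaffolding as the earlier sketches; the sample values are illustrative):

```java
SameDiff sd = SameDiff.create();
SDVariable x = sd.var("x", Nd4j.createFromArray(-2.0f, -0.5f, 0.0f, 0.5f, 2.0f));
System.out.println(sd.nn().elu(x).eval());      // x > 0 passes through; x <= 0 becomes exp(x) - 1
System.out.println(sd.nn().hardTanh(x).eval()); // values clipped to [-1, 1]
System.out.println(sd.nn().gelu(x).eval());     // sigmoid-approximated GELU
```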
public SDVariable layerNorm(SDVariable input, SDVariable gain, SDVariable bias, boolean channelsFirst, int... dimensions)
Parameters:
input - Input variable (NUMERIC type)
gain - Gain (NUMERIC type)
bias - Bias (NUMERIC type)
channelsFirst - Unused for 2D input. True for NCHW (minibatch, channels, height, width) data, false for NHWC data
dimensions - Dimensions to perform layer norm over: dimension=1 for 2d/MLP data, dimensions=1,2,3 for CNNs (Size: AtLeast(min=1))

public SDVariable layerNorm(String name, SDVariable input, SDVariable gain, SDVariable bias, boolean channelsFirst, int... dimensions)
Parameters:
name - Name for the output variable. May be null.
input - Input variable (NUMERIC type)
gain - Gain (NUMERIC type)
bias - Bias (NUMERIC type)
channelsFirst - Unused for 2D input. True for NCHW (minibatch, channels, height, width) data, false for NHWC data
dimensions - Dimensions to perform layer norm over: dimension=1 for 2d/MLP data, dimensions=1,2,3 for CNNs (Size: AtLeast(min=1))

public SDVariable layerNorm(SDVariable input, SDVariable gain, boolean channelsFirst, int... dimensions)
Parameters:
input - Input variable (NUMERIC type)
gain - Gain (NUMERIC type)
channelsFirst - Unused for 2D input. True for NCHW (minibatch, channels, height, width) data, false for NHWC data
dimensions - Dimensions to perform layer norm over: dimension=1 for 2d/MLP data, dimensions=1,2,3 for CNNs (Size: AtLeast(min=1))

public SDVariable layerNorm(String name, SDVariable input, SDVariable gain, boolean channelsFirst, int... dimensions)
Parameters:
name - Name for the output variable. May be null.
input - Input variable (NUMERIC type)
gain - Gain (NUMERIC type)
channelsFirst - Unused for 2D input. True for NCHW (minibatch, channels, height, width) data, false for NHWC data
dimensions - Dimensions to perform layer norm over: dimension=1 for 2d/MLP data, dimensions=1,2,3 for CNNs (Size: AtLeast(min=1))
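A sketch of layer norm over 2d/MLP activations (same scaffolding as the earlier sketches; the shape [2, 6] is illustrative):

```java
SameDiff sd = SameDiff.create();
SDVariable in   = sd.var("in",   Nd4j.rand(2, 6));               // [minibatch, size]
SDVariable gain = sd.var("gain", Nd4j.ones(DataType.FLOAT, 6));
SDVariable bias = sd.var("bias", Nd4j.zeros(DataType.FLOAT, 6));
// dimensions=1 for 2d data; channelsFirst is unused for 2d input
SDVariable out = sd.nn().layerNorm("ln", in, gain, bias, false, 1);
```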
public SDVariable leakyRelu(SDVariable x, double alpha)
Parameters:
x - Input variable (NUMERIC type)
alpha - Slope for negative inputs; commonly 0.01

public SDVariable leakyRelu(String name, SDVariable x, double alpha)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)
alpha - Slope for negative inputs; commonly 0.01

public SDVariable leakyReluDerivative(SDVariable x, double alpha)
Parameters:
x - Input variable (NUMERIC type)
alpha - Slope for negative inputs; commonly 0.01

public SDVariable leakyReluDerivative(String name, SDVariable x, double alpha)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)
alpha - Slope for negative inputs; commonly 0.01
public SDVariable linear(SDVariable input, SDVariable weights, SDVariable bias)
Parameters:
input - Input data (NUMERIC type)
weights - Weights variable, shape [nIn, nOut] (NUMERIC type)
bias - Optional bias variable (may be null) (NUMERIC type)

public SDVariable linear(String name, SDVariable input, SDVariable weights, SDVariable bias)
Parameters:
name - Name for the output variable. May be null.
input - Input data (NUMERIC type)
weights - Weights variable, shape [nIn, nOut] (NUMERIC type)
bias - Optional bias variable (may be null) (NUMERIC type)
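A sketch of a fully-connected layer via linear (same scaffolding as the earlier sketches; shapes are illustrative):

```java
SameDiff sd = SameDiff.create();
SDVariable in = sd.var("in", Nd4j.rand(3, 4));               // [minibatch, nIn]
SDVariable w  = sd.var("w",  Nd4j.rand(4, 2));               // [nIn, nOut]
SDVariable b  = sd.var("b",  Nd4j.zeros(DataType.FLOAT, 2)); // optional; may be null
SDVariable out = sd.nn().linear("fc", in, w, b);             // out = mmul(in, w) + b
```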
public SDVariable logSigmoid(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable logSigmoid(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable logSoftmax(SDVariable x)
Parameters:
x - Input (NUMERIC type)

public SDVariable logSoftmax(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input (NUMERIC type)

public SDVariable logSoftmax(SDVariable x, int dimension)
Parameters:
x - Input (NUMERIC type)
dimension - Dimension along which to apply log softmax

public SDVariable logSoftmax(String name, SDVariable x, int dimension)
Parameters:
name - Name for the output variable. May be null.
x - Input (NUMERIC type)
dimension - Dimension along which to apply log softmax
public SDVariable multiHeadDotProductAttention(SDVariable queries, SDVariable keys, SDVariable values, SDVariable Wq, SDVariable Wk, SDVariable Wv, SDVariable Wo, SDVariable mask, boolean scaled)
Parameters:
queries - Input 3D array "queries" of shape [batchSize, featureKeys, queryCount] (NUMERIC type)
keys - Input 3D array "keys" of shape [batchSize, featureKeys, timesteps] (NUMERIC type)
values - Input 3D array "values" of shape [batchSize, featureValues, timesteps] (NUMERIC type)
Wq - Input query projection weights of shape [numHeads, projectedKeys, featureKeys] (NUMERIC type)
Wk - Input key projection weights of shape [numHeads, projectedKeys, featureKeys] (NUMERIC type)
Wv - Input value projection weights of shape [numHeads, projectedValues, featureValues] (NUMERIC type)
Wo - Output projection weights of shape [numHeads * projectedValues, outSize] (NUMERIC type)
mask - OPTIONAL: array of shape [batchSize, timesteps] that defines which values should be skipped (NUMERIC type)
scaled - Normalization: false to not apply normalization, true to apply normalization

public SDVariable multiHeadDotProductAttention(String name, SDVariable queries, SDVariable keys, SDVariable values, SDVariable Wq, SDVariable Wk, SDVariable Wv, SDVariable Wo, SDVariable mask, boolean scaled)
Parameters:
name - Name for the output variable. May be null.
queries - Input 3D array "queries" of shape [batchSize, featureKeys, queryCount] (NUMERIC type)
keys - Input 3D array "keys" of shape [batchSize, featureKeys, timesteps] (NUMERIC type)
values - Input 3D array "values" of shape [batchSize, featureValues, timesteps] (NUMERIC type)
Wq - Input query projection weights of shape [numHeads, projectedKeys, featureKeys] (NUMERIC type)
Wk - Input key projection weights of shape [numHeads, projectedKeys, featureKeys] (NUMERIC type)
Wv - Input value projection weights of shape [numHeads, projectedValues, featureValues] (NUMERIC type)
Wo - Output projection weights of shape [numHeads * projectedValues, outSize] (NUMERIC type)
mask - OPTIONAL: array of shape [batchSize, timesteps] that defines which values should be skipped (NUMERIC type)
scaled - Normalization: false to not apply normalization, true to apply normalization
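A sketch of multi-head self-attention with two heads (same scaffolding as the earlier sketches); all sizes are illustrative, chosen only to satisfy the shape requirements listed above, and the null mask again relies on the parameter being optional:

```java
SameDiff sd = SameDiff.create();
int batch = 2, features = 4, timesteps = 5;
int numHeads = 2, projected = 3, outSize = 4;
// Self-attention: queries, keys, and values share the same shape
SDVariable q = sd.var("q", Nd4j.rand(new int[]{batch, features, timesteps}));
SDVariable k = sd.var("k", Nd4j.rand(new int[]{batch, features, timesteps}));
SDVariable v = sd.var("v", Nd4j.rand(new int[]{batch, features, timesteps}));
SDVariable Wq = sd.var("Wq", Nd4j.rand(new int[]{numHeads, projected, features}));
SDVariable Wk = sd.var("Wk", Nd4j.rand(new int[]{numHeads, projected, features}));
SDVariable Wv = sd.var("Wv", Nd4j.rand(new int[]{numHeads, projected, features}));
SDVariable Wo = sd.var("Wo", Nd4j.rand(numHeads * projected, outSize));
SDVariable out = sd.nn().multiHeadDotProductAttention("mha", q, k, v, Wq, Wk, Wv, Wo, null, true);
```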
public SDVariable pad(SDVariable input, SDVariable padding, PadMode PadMode, double constant)
Parameters:
input - Input tensor (NUMERIC type)
padding - Padding value (NUMERIC type)
PadMode - Padding format
constant - Padding constant

public SDVariable pad(String name, SDVariable input, SDVariable padding, PadMode PadMode, double constant)
Parameters:
name - Name for the output variable. May be null.
input - Input tensor (NUMERIC type)
padding - Padding value (NUMERIC type)
PadMode - Padding format
constant - Padding constant

public SDVariable pad(SDVariable input, SDVariable padding, double constant)
Parameters:
input - Input tensor (NUMERIC type)
padding - Padding value (NUMERIC type)
constant - Padding constant

public SDVariable pad(String name, SDVariable input, SDVariable padding, double constant)
Parameters:
name - Name for the output variable. May be null.
input - Input tensor (NUMERIC type)
padding - Padding value (NUMERIC type)
constant - Padding constant
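A sketch of constant padding using the overload without a PadMode (same scaffolding as the earlier sketches); the padding amounts, and the assumption that padding is one [before, after] pair per input dimension, are illustrative:

```java
SameDiff sd = SameDiff.create();
SDVariable in = sd.var("in", Nd4j.rand(2, 2));
// One [padBefore, padAfter] pair per input dimension
SDVariable padding = sd.constant(Nd4j.createFromArray(new int[][]{{1, 1}, {2, 2}}));
SDVariable out = sd.nn().pad(in, padding, 0.0); // constant padding with value 0.0
```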
public SDVariable preciseGelu(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable preciseGelu(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable prelu(SDVariable input, SDVariable alpha, int... sharedAxes)
Parameters:
input - Input data (NUMERIC type)
alpha - The slope variable. Note that the batch dimension (the 0th, whether it is batch or not) should not be part of alpha. (NUMERIC type)
sharedAxes - Which axes to share the slope parameters along (Size: AtLeast(min=1))

public SDVariable prelu(String name, SDVariable input, SDVariable alpha, int... sharedAxes)
Parameters:
name - Name for the output variable. May be null.
input - Input data (NUMERIC type)
alpha - The slope variable. Note that the batch dimension (the 0th, whether it is batch or not) should not be part of alpha. (NUMERIC type)
sharedAxes - Which axes to share the slope parameters along (Size: AtLeast(min=1))
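A sketch of PReLU on NCHW activations, sharing the learned slope across the spatial axes (same scaffolding as the earlier sketches); the alpha shape [3, 1, 1] is this sketch's assumption about how shared axes collapse, not a documented guarantee:

```java
SameDiff sd = SameDiff.create();
SDVariable in = sd.var("in", Nd4j.rand(new int[]{2, 3, 4, 4})); // NCHW activations
// One learnable slope per channel; sharing across spatial axes 2 and 3 leaves
// alpha with shape [3, 1, 1]. Note it excludes the batch (0th) dimension.
SDVariable alpha = sd.var("alpha", Nd4j.valueArrayOf(new long[]{3, 1, 1}, 0.25));
SDVariable out = sd.nn().prelu("prelu", in, alpha, 2, 3);
```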
public SDVariable relu(SDVariable x, double cutoff)
Parameters:
x - Input (NUMERIC type)
cutoff - Cutoff value for ReLU operation: x > cutoff ? x : 0. Usually 0

public SDVariable relu(String name, SDVariable x, double cutoff)
Parameters:
name - Name for the output variable. May be null.
x - Input (NUMERIC type)
cutoff - Cutoff value for ReLU operation: x > cutoff ? x : 0. Usually 0

public SDVariable relu6(SDVariable x, double cutoff)
Parameters:
x - Input (NUMERIC type)
cutoff - Cutoff value for ReLU operation. Usually 0

public SDVariable relu6(String name, SDVariable x, double cutoff)
Parameters:
name - Name for the output variable. May be null.
x - Input (NUMERIC type)
cutoff - Cutoff value for ReLU operation. Usually 0

public SDVariable reluLayer(SDVariable input, SDVariable weights, SDVariable bias)
Parameters:
input - Input data (NUMERIC type)
weights - Weights variable (NUMERIC type)
bias - Optional bias variable (may be null) (NUMERIC type)

public SDVariable reluLayer(String name, SDVariable input, SDVariable weights, SDVariable bias)
Parameters:
name - Name for the output variable. May be null.
input - Input data (NUMERIC type)
weights - Weights variable (NUMERIC type)
bias - Optional bias variable (may be null) (NUMERIC type)
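A sketch of a fused ReLU layer (same scaffolding as the earlier sketches; shapes are illustrative):

```java
SameDiff sd = SameDiff.create();
SDVariable in = sd.var("in", Nd4j.rand(3, 4));
SDVariable w  = sd.var("w",  Nd4j.rand(4, 2));
SDVariable b  = sd.var("b",  Nd4j.zeros(DataType.FLOAT, 2));
// Equivalent to relu(mmul(in, w) + b) with the default cutoff of 0
SDVariable hidden = sd.nn().reluLayer("hidden", in, w, b);
```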
public SDVariable selu(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable selu(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable sigmoid(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable sigmoid(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable sigmoidDerivative(SDVariable x, SDVariable wrt)
Parameters:
x - Input variable (NUMERIC type)
wrt - Gradient at the output, dL/dOut. Must have the same shape as the input (NUMERIC type)

public SDVariable sigmoidDerivative(String name, SDVariable x, SDVariable wrt)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)
wrt - Gradient at the output, dL/dOut. Must have the same shape as the input (NUMERIC type)

public SDVariable softmax(SDVariable x, int dimension)
Parameters:
x - Input (NUMERIC type)
dimension - Dimension along which to apply softmax

public SDVariable softmax(String name, SDVariable x, int dimension)
Parameters:
name - Name for the output variable. May be null.
x - Input (NUMERIC type)
dimension - Dimension along which to apply softmax

public SDVariable softmax(SDVariable x)
Parameters:
x - Input (NUMERIC type)

public SDVariable softmax(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input (NUMERIC type)

public SDVariable softmaxDerivative(SDVariable x, SDVariable wrt, int dimension)
Parameters:
x - Softmax input (NUMERIC type)
wrt - Gradient at output, dL/dx (NUMERIC type)
dimension - Softmax dimension

public SDVariable softmaxDerivative(String name, SDVariable x, SDVariable wrt, int dimension)
Parameters:
name - Name for the output variable. May be null.
x - Softmax input (NUMERIC type)
wrt - Gradient at output, dL/dx (NUMERIC type)
dimension - Softmax dimension
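A sketch contrasting softmax and logSoftmax along the class dimension (same scaffolding as the earlier sketches):

```java
SameDiff sd = SameDiff.create();
SDVariable logits = sd.var("logits", Nd4j.rand(2, 5));
// Softmax over the class dimension (1): each row of the result sums to 1.0
SDVariable probs = sd.nn().softmax("probs", logits, 1);
// logSoftmax over the same dimension is the numerically safer choice
// when feeding a negative-log-likelihood-style loss
SDVariable logProbs = sd.nn().logSoftmax("logProbs", logits, 1);
```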
public SDVariable softplus(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable softplus(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable softsign(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable softsign(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable softsignDerivative(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable softsignDerivative(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable swish(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable swish(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)

public SDVariable tanh(SDVariable x)
Parameters:
x - Input variable (NUMERIC type)

public SDVariable tanh(String name, SDVariable x)
Parameters:
name - Name for the output variable. May be null.
x - Input variable (NUMERIC type)