public class Conv2d extends Convolution
A Conv2d layer operates on two dimensions of its input, LayoutType.WIDTH and LayoutType.HEIGHT, because it is typically used to process data with two spatial dimensions, namely images. The concept is the same as for Convolution: each filter slides over the input data in two directions, first traversing along LayoutType.WIDTH and then moving on to each successive row of the data.
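As a rough illustration of the sliding described above, the following plain-Java sketch applies one filter to a single-channel input with stride 1 and no padding or dilation. The class `NaiveConv2d` and its method are hypothetical and not part of this API. The sketch assumes the cross-correlation convention common in deep learning frameworks, where the kernel is not flipped.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of this API. */
final class NaiveConv2d {

    /** Slides one kernel over a single-channel input (stride 1, no padding, no dilation). */
    static float[][] slide(float[][] input, float[][] kernel) {
        int kernelH = kernel.length;
        int kernelW = kernel[0].length;
        int outH = input.length - kernelH + 1;
        int outW = input[0].length - kernelW + 1;
        float[][] out = new float[outH][outW];
        for (int y = 0; y < outH; y++) {        // then move on to the next row
            for (int x = 0; x < outW; x++) {    // first traverse along the width
                float sum = 0f;
                for (int i = 0; i < kernelH; i++) {
                    for (int j = 0; j < kernelW; j++) {
                        sum += input[y + i][x + j] * kernel[i][j];
                    }
                }
                out[y][x] = sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] input = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        float[][] kernel = {{1, 0}, {0, 1}};
        // prints [[6.0, 8.0], [12.0, 14.0]]
        System.out.println(Arrays.deepToString(slide(input, kernel)));
    }
}
```

A real layer repeats this for every filter and sums over all input channels; the sketch leaves that out to keep the traversal order visible.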
First proposed in a paper by LeCun et al., the 2-dimensional convolution layer attracted growing interest after the publication of the AlexNet paper on image classification. It is still commonly used in image-related tasks and has been adapted to other tasks, including 1-dimensional data that can be transformed into 2-dimensional data, although Conv1d is now available for such data.
The input to a Conv2d is an NDList with a single 4-D NDArray. The layout of the NDArray must be "NCHW". The shapes are:
data: (batch_size, channel, height, width)
weight: (num_filter, channel, kernel[0], kernel[1])
bias: (num_filter,)
out: (batch_size, num_filter, out_height, out_width)
out_height = f(height, kernel[0], pad[0], stride[0], dilate[0])
out_width = f(width, kernel[1], pad[1], stride[1], dilate[1])
where f(x, k, p, s, d) = floor((x + 2 * p - d * (k - 1) - 1)/s) + 1
Both weight and bias are learnable parameters.
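The output-size formula above can be checked with a small plain-Java sketch. The class `ConvOutputShape` and its method names are hypothetical helpers for illustration and are not part of this API; the example values (a 224x224 input with a 7x7 kernel, padding 3, stride 2) are likewise only assumptions chosen for the demonstration.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of this API: evaluates the documented output-size formula. */
final class ConvOutputShape {

    /** f(x, k, p, s, d) = floor((x + 2 * p - d * (k - 1) - 1) / s) + 1 */
    static long outSize(long x, long k, long p, long s, long d) {
        // Math.floorDiv rounds toward negative infinity, matching floor() in the formula
        return Math.floorDiv(x + 2 * p - d * (k - 1) - 1, s) + 1;
    }

    /** Returns {batch_size, num_filter, out_height, out_width} for an NCHW data shape. */
    static long[] outShape(
            long[] data, long numFilter, long[] kernel, long[] pad, long[] stride, long[] dilate) {
        return new long[] {
            data[0],
            numFilter,
            outSize(data[2], kernel[0], pad[0], stride[0], dilate[0]),
            outSize(data[3], kernel[1], pad[1], stride[1], dilate[1])
        };
    }

    public static void main(String[] args) {
        long[] out = outShape(
                new long[] {1, 3, 224, 224}, 64,
                new long[] {7, 7}, new long[] {3, 3}, new long[] {2, 2}, new long[] {1, 1});
        // prints [1, 64, 112, 112]
        System.out.println(Arrays.toString(out));
    }
}
```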
Modifier and Type | Class and Description |
---|---|
static class | Conv2d.Builder |
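Nested classes inherited from class Convolution: Convolution.ConvolutionBuilder<T extends Convolution.ConvolutionBuilder>
Fields inherited from class Convolution: bias, dilation, filters, groups, includeBias, kernelShape, padding, stride, weight
Fields inherited from class AbstractBlock: children, inputNames, inputShapes, parameters, parameterShapeCallbacks, version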
Modifier and Type | Method and Description |
---|---|
static Conv2d.Builder | builder() Creates a builder to build a Conv2d. |
static NDList | conv2d(NDArray input, NDArray weight) Applies 2D convolution over an input signal composed of several input planes. |
static NDList | conv2d(NDArray input, NDArray weight, NDArray bias) Applies 2D convolution over an input signal composed of several input planes. |
static NDList | conv2d(NDArray input, NDArray weight, NDArray bias, Shape stride) Applies 2D convolution over an input signal composed of several input planes. |
static NDList | conv2d(NDArray input, NDArray weight, NDArray bias, Shape stride, Shape padding) Applies 2D convolution over an input signal composed of several input planes. |
static NDList | conv2d(NDArray input, NDArray weight, NDArray bias, Shape stride, Shape padding, Shape dilation) Applies 2D convolution over an input signal composed of several input planes. |
static NDList | conv2d(NDArray input, NDArray weight, NDArray bias, Shape stride, Shape padding, Shape dilation, int groups) Applies 2D convolution over an input signal composed of several input planes. |
protected LayoutType[] | getExpectedLayout() Returns the expected layout of the input. |
protected java.lang.String | getStringLayout() Returns the string representing the layout of the input. |
protected int | numDimensions() Returns the number of dimensions of the input. |
Methods inherited from class Convolution: beforeInitialize, forward, getOutputShapes, loadMetadata
Methods inherited from class AbstractBlock: addChildBlock, addParameter, addParameter, addParameter, cast, clear, describeInput, getChildren, getDirectParameters, getParameters, getParameterShape, initialize, initializeChildBlocks, isInitialized, loadParameters, readInputShapes, saveInputShapes, saveMetadata, saveParameters, setInitializer, setInitializer, toString
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Block: forward, forward, validateLayout
protected LayoutType[] getExpectedLayout()
Specified by: getExpectedLayout in class Convolution

protected java.lang.String getStringLayout()
Specified by: getStringLayout in class Convolution

protected int numDimensions()
Specified by: numDimensions in class Convolution

public static NDList conv2d(NDArray input, NDArray weight)
input - the input NDArray of shape (batchSize, inputChannel, height, width)
weight - filters NDArray of shape (outChannel, inputChannel/groups, height, width)

public static NDList conv2d(NDArray input, NDArray weight, NDArray bias)
input - the input NDArray of shape (batchSize, inputChannel, height, width)
weight - filters NDArray of shape (outChannel, inputChannel/groups, height, width)
bias - bias NDArray of shape (outChannel)

public static NDList conv2d(NDArray input, NDArray weight, NDArray bias, Shape stride)
input - the input NDArray of shape (batchSize, inputChannel, height, width)
weight - filters NDArray of shape (outChannel, inputChannel/groups, height, width)
bias - bias NDArray of shape (outChannel)
stride - the stride of the convolving kernel: Shape(height, width)

public static NDList conv2d(NDArray input, NDArray weight, NDArray bias, Shape stride, Shape padding)
input - the input NDArray of shape (batchSize, inputChannel, height, width)
weight - filters NDArray of shape (outChannel, inputChannel/groups, height, width)
bias - bias NDArray of shape (outChannel)
stride - the stride of the convolving kernel: Shape(height, width)
padding - implicit padding on both sides of the input: Shape(height, width)

public static NDList conv2d(NDArray input, NDArray weight, NDArray bias, Shape stride, Shape padding, Shape dilation)
input - the input NDArray of shape (batchSize, inputChannel, height, width)
weight - filters NDArray of shape (outChannel, inputChannel/groups, height, width)
bias - bias NDArray of shape (outChannel)
stride - the stride of the convolving kernel: Shape(height, width)
padding - implicit padding on both sides of the input: Shape(height, width)
dilation - the spacing between kernel elements: Shape(height, width)

public static NDList conv2d(NDArray input, NDArray weight, NDArray bias, Shape stride, Shape padding, Shape dilation, int groups)
input - the input NDArray of shape (batchSize, inputChannel, height, width)
weight - filters NDArray of shape (outChannel, inputChannel/groups, height, width)
bias - bias NDArray of shape (outChannel)
stride - the stride of the convolving kernel: Shape(height, width)
padding - implicit padding on both sides of the input: Shape(height, width)
dilation - the spacing between kernel elements: Shape(height, width)
groups - splits the input into groups: the number of input channels (input.size(1)) should be divisible by the number of groups

public static Conv2d.Builder builder()
Creates a builder to build a Conv2d.
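
To make the effect of the groups parameter of conv2d on the documented weight shape concrete, here is a minimal plain-Java sketch. The class `GroupedConvShapes` and its method are hypothetical and not part of this API; the sketch only encodes the divisibility requirement and the (outChannel, inputChannel/groups, height, width) shape stated in the parameter descriptions.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of this API: derives the filter shape for a grouped convolution. */
final class GroupedConvShapes {

    /** Returns {outChannel, inputChannel / groups, kernelH, kernelW}. */
    static long[] weightShape(long outChannel, long inputChannel, int groups, long kernelH, long kernelW) {
        // The documentation requires the input channels to be divisible by the number of groups
        if (groups < 1 || inputChannel % groups != 0) {
            throw new IllegalArgumentException(
                    "input channels (" + inputChannel + ") must be divisible by groups (" + groups + ")");
        }
        return new long[] {outChannel, inputChannel / groups, kernelH, kernelW};
    }

    public static void main(String[] args) {
        // prints [8, 4, 3, 3]
        System.out.println(Arrays.toString(weightShape(8, 4, 1, 3, 3)));
        // prints [8, 2, 3, 3]: each filter sees only half of the input channels
        System.out.println(Arrays.toString(weightShape(8, 4, 2, 3, 3)));
    }
}
```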