Class Conv2d

  • All Implemented Interfaces:
    Block

    public class Conv2d
    extends Convolution
    Conv2d is the earliest of the convolution layers. It operates over two dimensions of the input, LayoutType.HEIGHT and LayoutType.WIDTH, because it is typically used to process data with two spatial dimensions, such as images. It works as described in Convolution: each filter slides over the input in two directions, first traversing LayoutType.WIDTH within a row and then moving on to the next row.
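
    The traversal order described above can be sketched with plain Java arrays. This is an illustration only, not the library's implementation: the class and method names are hypothetical, a single channel and a single filter are assumed, and stride 1 with no padding is used. It also assumes the usual deep-learning convention that the filter is applied without flipping (cross-correlation).

```java
import java.util.Arrays;

/** Hypothetical illustration class; not part of the documented API. */
final class SlidingWindowSketch {
    private SlidingWindowSketch() {}

    /** Slides one filter over a single-channel image: stride 1, no padding. */
    static float[][] slide(float[][] image, float[][] filter) {
        int kernelH = filter.length;
        int kernelW = filter[0].length;
        int outH = image.length - kernelH + 1;
        int outW = image[0].length - kernelW + 1;
        float[][] out = new float[outH][outW];
        for (int y = 0; y < outH; y++) {       // then move down to the next row
            for (int x = 0; x < outW; x++) {   // traverse the width first
                float sum = 0f;
                for (int i = 0; i < kernelH; i++) {
                    for (int j = 0; j < kernelW; j++) {
                        sum += image[y + i][x + j] * filter[i][j];
                    }
                }
                out[y][x] = sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] image = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        float[][] filter = {{1, 0}, {0, 1}};
        // Each output value is the sum of the top-left and bottom-right pixel of a 2x2 window.
        System.out.println(Arrays.deepToString(slide(image, filter)));
    }
}
```

    With the 3x3 image and 2x2 filter in main, the result is a 2x2 array: one value per position the filter can occupy.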

    The 2-dimensional convolution layer was first proposed by LeCun et al. and attracted widespread interest after the publication of the AlexNet paper on image classification. It is still commonly used in image-related tasks and has been adapted to other tasks, including 1-dimensional data that is transformed into 2-dimensional data, although Conv1d is now available for that case.

    The input to a Conv2d is an NDList containing a single 4-D NDArray, whose layout must be "NCHW". The shapes are:

    • data: (batch_size, channel, height, width)
    • weight: (num_filter, channel, kernel[0], kernel[1])
    • bias: (num_filter,)
    • out: (batch_size, num_filter, out_height, out_width)
      out_height = f(height, kernel[0], pad[0], stride[0], dilate[0])
      out_width = f(width, kernel[1], pad[1], stride[1], dilate[1])
      where f(x, k, p, s, d) = floor((x + 2 * p - d * (k - 1) - 1)/s) + 1
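
    The formula for f can be checked with a few lines of plain Java. The helper below is a sketch, not part of the documented API: the class name, method name, and argument validation are assumptions made for illustration.

```java
/** Hypothetical helper that evaluates the output-size formula f(x, k, p, s, d). */
final class ConvOutputSize {
    private ConvOutputSize() {}

    /**
     * Computes floor((x + 2 * p - d * (k - 1) - 1) / s) + 1.
     *
     * @param x input size along one spatial axis
     * @param k kernel size along that axis
     * @param p padding applied to each side
     * @param s stride
     * @param d dilation
     */
    static long outputSize(long x, long k, long p, long s, long d) {
        if (k < 1 || s < 1 || d < 1 || p < 0) {
            throw new IllegalArgumentException("kernel, stride and dilation must be >= 1, pad >= 0");
        }
        long numerator = x + 2 * p - d * (k - 1) - 1;
        if (numerator < 0) {
            throw new IllegalArgumentException("dilated kernel is larger than the padded input");
        }
        return Math.floorDiv(numerator, s) + 1;
    }

    public static void main(String[] args) {
        // 3x3 kernel, pad 1, stride 1: floor((224 + 2 - 2 - 1) / 1) + 1 = 224
        System.out.println(outputSize(224, 3, 1, 1, 1));
        // 7x7 kernel, pad 3, stride 2: floor((224 + 6 - 6 - 1) / 2) + 1 = 112
        System.out.println(outputSize(224, 7, 3, 2, 1));
        // 3x3 kernel, dilation 2, no pad: floor((32 - 4 - 1) / 1) + 1 = 28
        System.out.println(outputSize(32, 3, 0, 1, 2));
    }
}
```

    The same function is applied once per spatial axis: with index 0 of kernel, pad, stride and dilate for out_height, and index 1 for out_width.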

    Both weight and bias are learnable parameters.
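
    As a rough sketch, the number of learnable values follows directly from the weight and bias shapes listed above. The class below is hypothetical and assumes the ungrouped weight shape (num_filter, channel, kernel[0], kernel[1]).

```java
/** Hypothetical helper that counts learnable values from the documented shapes. */
final class Conv2dParameterCount {
    private Conv2dParameterCount() {}

    /** Number of elements in weight: (num_filter, channel, kernel[0], kernel[1]). */
    static long weightCount(long numFilter, long channel, long kernelH, long kernelW) {
        return numFilter * channel * kernelH * kernelW;
    }

    /** Number of elements in bias: (num_filter,). */
    static long biasCount(long numFilter) {
        return numFilter;
    }

    public static void main(String[] args) {
        // 64 filters of size 7x7 over a 3-channel input.
        long weights = weightCount(64, 3, 7, 7); // 64 * 3 * 49 = 9408
        long biases = biasCount(64);             // 64
        System.out.println(weights + biases);    // 9472
    }
}
```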

    See Also:
    Convolution
    • Method Detail

      • getExpectedLayout

        protected LayoutType[] getExpectedLayout()
        Returns the expected layout of the input.
        Specified by:
        getExpectedLayout in class Convolution
        Returns:
        the expected layout of the input
      • getStringLayout

        protected java.lang.String getStringLayout()
        Returns the string representing the layout of the input.
        Specified by:
        getStringLayout in class Convolution
        Returns:
        the string representing the layout of the input
      • numDimensions

        protected int numDimensions()
        Returns the number of dimensions of the input.
        Specified by:
        numDimensions in class Convolution
        Returns:
        the number of dimensions of the input
      • conv2d

        public static NDList conv2d(NDArray input,
                                    NDArray weight)
        Applies 2D convolution over an input signal composed of several input planes.
        Parameters:
        input - the input NDArray of shape (batchSize, inputChannel, height, width)
        weight - filters NDArray of shape (outChannel, inputChannel/groups, kernelHeight, kernelWidth)
        Returns:
        the output of the conv2d operation
      • conv2d

        public static NDList conv2d(NDArray input,
                                    NDArray weight,
                                    NDArray bias)
        Applies 2D convolution over an input signal composed of several input planes.
        Parameters:
        input - the input NDArray of shape (batchSize, inputChannel, height, width)
        weight - filters NDArray of shape (outChannel, inputChannel/groups, kernelHeight, kernelWidth)
        bias - bias NDArray of shape (outChannel)
        Returns:
        the output of the conv2d operation
      • conv2d

        public static NDList conv2d(NDArray input,
                                    NDArray weight,
                                    NDArray bias,
                                    Shape stride)
        Applies 2D convolution over an input signal composed of several input planes.
        Parameters:
        input - the input NDArray of shape (batchSize, inputChannel, height, width)
        weight - filters NDArray of shape (outChannel, inputChannel/groups, kernelHeight, kernelWidth)
        bias - bias NDArray of shape (outChannel)
        stride - the stride of the convolving kernel: Shape(height, width)
        Returns:
        the output of the conv2d operation
      • conv2d

        public static NDList conv2d(NDArray input,
                                    NDArray weight,
                                    NDArray bias,
                                    Shape stride,
                                    Shape padding)
        Applies 2D convolution over an input signal composed of several input planes.
        Parameters:
        input - the input NDArray of shape (batchSize, inputChannel, height, width)
        weight - filters NDArray of shape (outChannel, inputChannel/groups, kernelHeight, kernelWidth)
        bias - bias NDArray of shape (outChannel)
        stride - the stride of the convolving kernel: Shape(height, width)
        padding - implicit padding added to both sides of the input: Shape(height, width)
        Returns:
        the output of the conv2d operation
      • conv2d

        public static NDList conv2d(NDArray input,
                                    NDArray weight,
                                    NDArray bias,
                                    Shape stride,
                                    Shape padding,
                                    Shape dilation)
        Applies 2D convolution over an input signal composed of several input planes.
        Parameters:
        input - the input NDArray of shape (batchSize, inputChannel, height, width)
        weight - filters NDArray of shape (outChannel, inputChannel/groups, kernelHeight, kernelWidth)
        bias - bias NDArray of shape (outChannel)
        stride - the stride of the convolving kernel: Shape(height, width)
        padding - implicit padding added to both sides of the input: Shape(height, width)
        dilation - the spacing between kernel elements: Shape(height, width)
        Returns:
        the output of the conv2d operation
      • conv2d

        public static NDList conv2d(NDArray input,
                                    NDArray weight,
                                    NDArray bias,
                                    Shape stride,
                                    Shape padding,
                                    Shape dilation,
                                    int groups)
        Applies 2D convolution over an input signal composed of several input planes.
        Parameters:
        input - the input NDArray of shape (batchSize, inputChannel, height, width)
        weight - filters NDArray of shape (outChannel, inputChannel/groups, kernelHeight, kernelWidth)
        bias - bias NDArray of shape (outChannel)
        stride - the stride of the convolving kernel: Shape(height, width)
        padding - implicit padding added to both sides of the input: Shape(height, width)
        dilation - the spacing between kernel elements: Shape(height, width)
        groups - the number of groups to split the input channels into; the input channel count (input.size(1)) must be divisible by groups
        Returns:
        the output of the conv2d operation
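
        To make the roles of stride, padding, dilation and groups concrete, the following reference loop computes the same kind of result on plain Java arrays. It is a sketch under stated assumptions, not the library's implementation: the class and method names are hypothetical, nested float arrays stand in for NDArray, int pairs stand in for Shape(height, width), padding is assumed to be zero-filled, and outChannel is assumed to be divisible by groups.

```java
/** Hypothetical reference loop; plain arrays stand in for NDArray and Shape. */
final class NaiveConv2d {
    private NaiveConv2d() {}

    static int outSize(int x, int k, int p, int s, int d) {
        return Math.floorDiv(x + 2 * p - d * (k - 1) - 1, s) + 1;
    }

    /**
     * input:  [batchSize][inputChannel][height][width]
     * weight: [outChannel][inputChannel / groups][kernelHeight][kernelWidth]
     * bias:   [outChannel], or null for no bias
     * stride, padding, dilation: {height, width}
     */
    static float[][][][] conv2d(float[][][][] input, float[][][][] weight, float[] bias,
            int[] stride, int[] padding, int[] dilation, int groups) {
        int batch = input.length;
        int inC = input[0].length;
        int h = input[0][0].length;
        int w = input[0][0][0].length;
        int outC = weight.length;
        int inPerGroup = weight[0].length;
        int kh = weight[0][0].length;
        int kw = weight[0][0][0].length;
        if (inC % groups != 0 || outC % groups != 0 || inPerGroup != inC / groups) {
            throw new IllegalArgumentException("channel counts do not match groups");
        }
        int outPerGroup = outC / groups;
        int outH = outSize(h, kh, padding[0], stride[0], dilation[0]);
        int outW = outSize(w, kw, padding[1], stride[1], dilation[1]);
        float[][][][] out = new float[batch][outC][outH][outW];
        for (int n = 0; n < batch; n++) {
            for (int oc = 0; oc < outC; oc++) {
                // First input channel seen by the group this filter belongs to.
                int firstIn = (oc / outPerGroup) * inPerGroup;
                for (int oy = 0; oy < outH; oy++) {
                    for (int ox = 0; ox < outW; ox++) {
                        float sum = bias == null ? 0f : bias[oc];
                        for (int c = 0; c < inPerGroup; c++) {
                            for (int i = 0; i < kh; i++) {
                                int iy = oy * stride[0] - padding[0] + i * dilation[0];
                                if (iy < 0 || iy >= h) continue; // padded row contributes zero
                                for (int j = 0; j < kw; j++) {
                                    int ix = ox * stride[1] - padding[1] + j * dilation[1];
                                    if (ix < 0 || ix >= w) continue; // padded column contributes zero
                                    sum += input[n][firstIn + c][iy][ix] * weight[oc][c][i][j];
                                }
                            }
                        }
                        out[n][oc][oy][ox] = sum;
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two input channels with groups = 2: each 1x1 filter sees exactly one channel.
        float[][][][] input = {{{{1, 2}, {3, 4}}, {{5, 6}, {7, 8}}}};
        float[][][][] weight = {{{{2}}}, {{{3}}}};
        float[][][][] out = conv2d(input, weight, new float[] {1, 0},
                new int[] {1, 1}, new int[] {0, 0}, new int[] {1, 1}, 2);
        // Channel 0 becomes 2 * x + 1, channel 1 becomes 3 * x.
        System.out.println(java.util.Arrays.deepToString(out));
    }
}
```

        With groups equal to inputChannel, as in main, every filter reads a single input channel, which is why the second dimension of weight is inputChannel/groups.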
      • builder

        public static Conv2d.Builder builder()
        Creates a builder to build a Conv2d.
        Returns:
        a new builder