Class VariationalAutoencoder

    • Field Detail

      • paramsFlattened

        protected INDArray paramsFlattened
      • gradientsFlattened

        protected INDArray gradientsFlattened
      • score

        protected double score
      • index

        protected int index
      • maskArray

        protected INDArray maskArray
      • solver

        protected Solver solver
      • encoderLayerSizes

        protected int[] encoderLayerSizes
      • decoderLayerSizes

        protected int[] decoderLayerSizes
      • pzxActivationFn

        protected IActivation pzxActivationFn
      • numSamples

        protected int numSamples
      • zeroedPretrainParamGradients

        protected boolean zeroedPretrainParamGradients
      • iterationCount

        protected int iterationCount
      • epochCount

        protected int epochCount
    • Method Detail

      • setCacheMode

        public void setCacheMode​(CacheMode mode)
        Description copied from interface: Layer
        This method sets the given CacheMode for the current layer
        Specified by:
        setCacheMode in interface Layer
      • layerId

        protected String layerId()
      • init

        public void init()
        Init the model
        Specified by:
        init in interface Model
      • update

        public void update​(Gradient gradient)
        Description copied from interface: Model
        Update layer weights and biases with gradient change
        Specified by:
        update in interface Model
      • update

        public void update​(INDArray gradient,
                           String paramType)
        Description copied from interface: Model
        Perform one update applying the gradient
        Specified by:
        update in interface Model
        Parameters:
        gradient - the gradient to apply
        paramType - the type of parameter to update
      • score

        public double score()
        Description copied from interface: Model
        The score for the model
        Specified by:
        score in interface Model
        Returns:
        the score for the model
      • params

        public INDArray params()
        Description copied from interface: Model
        Parameters of the model (if any)
        Specified by:
        params in interface Model
        Specified by:
        params in interface Trainable
        Returns:
        the parameters of the model
      • numParams

        public long numParams()
        Description copied from interface: Model
        the number of parameters for the model
        Specified by:
        numParams in interface Model
        Specified by:
        numParams in interface Trainable
        Returns:
        the number of parameters for the model
      • numParams

        public long numParams​(boolean backwards)
        Description copied from interface: Model
        the number of parameters for the model
        Specified by:
        numParams in interface Model
        Returns:
        the number of parameters for the model
      • setParams

        public void setParams​(INDArray params)
        Description copied from interface: Model
        Set the parameters for this model. This expects a linear ndarray which is then unpacked internally, relative to the expected ordering of the model
        Specified by:
        setParams in interface Model
        Parameters:
        params - the parameters for the model
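        A minimal sketch of this flattened-parameter contract (the vae variable is assumed to be a reference to this layer implementation, obtained as shown in the sketch under fit(INDArray, LayerWorkspaceMgr) below):

            import org.nd4j.linalg.api.ndarray.INDArray;

            // params() returns all parameters as a single flattened vector whose length
            // equals numParams(); setParams expects the same linear layout and ordering.
            INDArray flat = vae.params();
            assert flat.length() == vae.numParams();
            vae.setParams(flat.dup());   // a duplicate with identical layout is valid input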
      • setParamsViewArray

        public void setParamsViewArray​(INDArray params)
        Description copied from interface: Model
        Set the initial parameters array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
        Specified by:
        setParamsViewArray in interface Model
        Parameters:
        params - a 1 x nParams row vector that is a view of the larger (MLN/CG) parameters array
      • setBackpropGradientsViewArray

        public void setBackpropGradientsViewArray​(INDArray gradients)
        Description copied from interface: Model
        Set the gradients array as a view of the full (backprop) network gradients. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
        Specified by:
        setBackpropGradientsViewArray in interface Model
        Parameters:
        gradients - a 1 x nParams row vector that is a view of the larger (MLN/CG) gradients array
      • fit

        public void fit​(INDArray data,
                        LayerWorkspaceMgr workspaceMgr)
        Description copied from interface: Model
        Fit the model to the given data
        Specified by:
        fit in interface Model
        Parameters:
        data - the data to fit the model to
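        As a usage sketch: in practice this layer is usually trained unsupervised via the enclosing network rather than by calling fit directly. The following is illustrative only (variable names and hyperparameters are not canonical, and the pretraining entry point may differ between DL4J versions); it also shows how to obtain the vae reference used in the sketches further down this page:

            import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
            import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
            import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
            import org.nd4j.linalg.activations.Activation;
            import org.nd4j.linalg.api.ndarray.INDArray;

            // Configure a network with a single VAE layer (this is the configuration class,
            // not the layer implementation class documented on this page)
            MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(12345)
                .list()
                .layer(new org.deeplearning4j.nn.conf.layers.variational.VariationalAutoencoder.Builder()
                    .nIn(784).nOut(32)              // 784 input features, 32 latent units
                    .encoderLayerSizes(256, 256)
                    .decoderLayerSizes(256, 256)
                    .activation(Activation.RELU)
                    .build())
                .build();

            MultiLayerNetwork net = new MultiLayerNetwork(conf);
            net.init();
            net.pretrainLayer(0, trainFeatures);    // trainFeatures: an INDArray of examples, obtained elsewhere

            // Extract this layer implementation to use the methods documented on this page
            org.deeplearning4j.nn.layers.variational.VariationalAutoencoder vae =
                (org.deeplearning4j.nn.layers.variational.VariationalAutoencoder) net.getLayer(0);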
      • gradient

        public Gradient gradient()
        Description copied from interface: Model
        Get the gradient. Note that this method will not calculate the gradient, but rather returns the gradient that has been computed previously. For calculating the gradient, see Model.computeGradientAndScore(LayerWorkspaceMgr).
        Specified by:
        gradient in interface Model
        Returns:
        the gradient for this model, as calculated before
      • gradientAndScore

        public Pair<Gradient,​Double> gradientAndScore()
        Description copied from interface: Model
        Get the gradient and score
        Specified by:
        gradientAndScore in interface Model
        Returns:
        the gradient and score
      • batchSize

        public int batchSize()
        Description copied from interface: Model
        The batch size of the current input
        Specified by:
        batchSize in interface Model
        Returns:
        the batch size of the current input
      • conf

        public NeuralNetConfiguration conf()
        Description copied from interface: Model
        The configuration for the neural network
        Specified by:
        conf in interface Model
        Returns:
        the configuration for the neural network
      • input

        public INDArray input()
        Description copied from interface: Model
        The input/feature matrix for the model
        Specified by:
        input in interface Model
        Returns:
        the input/feature matrix for the model
      • getOptimizer

        public ConvexOptimizer getOptimizer()
        Description copied from interface: Model
        Returns this model's optimizer
        Specified by:
        getOptimizer in interface Model
        Returns:
        this model's optimizer
      • getParam

        public INDArray getParam​(String param)
        Description copied from interface: Model
        Get the parameter
        Specified by:
        getParam in interface Model
        Parameters:
        param - the key of the parameter
        Returns:
        the parameter vector/matrix with that particular key
      • paramTable

        public Map<String,​INDArray> paramTable​(boolean backpropParamsOnly)
        Description copied from interface: Model
        Table of parameters by key, for backprop. For many models (dense layers, etc.), all parameters are backprop parameters
        Specified by:
        paramTable in interface Model
        Specified by:
        paramTable in interface Trainable
        Parameters:
        backpropParamsOnly - If true, return backprop params only. If false: return all params (equivalent to paramTable())
        Returns:
        Parameter table
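        A short sketch iterating the parameter table (vae as in the sketch under fit above; the parameter keys themselves are layer-specific):

            import java.util.Arrays;
            import java.util.Map;
            import org.nd4j.linalg.api.ndarray.INDArray;

            // Print every parameter key and the shape of its array
            for (Map.Entry<String, INDArray> e : vae.paramTable(false).entrySet()) {
                System.out.println(e.getKey() + " shape: " + Arrays.toString(e.getValue().shape()));
            }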
      • updaterDivideByMinibatch

        public boolean updaterDivideByMinibatch​(String paramName)
        Description copied from interface: Trainable
        DL4J layers typically produce the sum of the gradients during the backward pass for each layer, and if required (if minibatch=true) then divide by the minibatch size.
        However, there are some exceptions, such as the batch norm mean/variance estimate parameters: these "gradients" are actually not gradients, but are updates to be applied directly to the parameter vector. Put another way, most gradients should be divided by the minibatch to get the average; some "gradients" are actually final updates already, and should not be divided by the minibatch size.
        Specified by:
        updaterDivideByMinibatch in interface Trainable
        Parameters:
        paramName - Name of the parameter
        Returns:
        True if gradients should be divided by minibatch (most params); false otherwise (edge cases like batch norm mean/variance estimates)
      • setParamTable

        public void setParamTable​(Map<String,​INDArray> paramTable)
        Description copied from interface: Model
        Setter for the param table
        Specified by:
        setParamTable in interface Model
      • setParam

        public void setParam​(String key,
                             INDArray val)
        Description copied from interface: Model
        Set the parameter with a new ndarray
        Specified by:
        setParam in interface Model
        Parameters:
        key - the key to set
        val - the new ndarray
      • clear

        public void clear()
        Description copied from interface: Model
        Clear input
        Specified by:
        clear in interface Model
      • applyConstraints

        public void applyConstraints​(int iteration,
                                     int epoch)
        Description copied from interface: Model
        Apply any constraints to the model
        Specified by:
        applyConstraints in interface Model
      • isPretrainParam

        public boolean isPretrainParam​(String param)
      • calcRegularizationScore

        public double calcRegularizationScore​(boolean backpropParamsOnly)
        Description copied from interface: Layer
        Calculate the regularization component of the score, for the parameters in this layer
        For example, the L1, L2 and/or weight decay components of the loss function
        Specified by:
        calcRegularizationScore in interface Layer
        Parameters:
        backpropParamsOnly - If true: calculate regularization score based on backprop params only. If false: calculate based on all params (including pretrain params, if any)
        Returns:
        the regularization score for the parameters in this layer
      • type

        public Layer.Type type()
        Description copied from interface: Layer
        Returns the layer type
        Specified by:
        type in interface Layer
        Returns:
        the layer type
      • backpropGradient

        public Pair<Gradient,​INDArray> backpropGradient​(INDArray epsilon,
                                                              LayerWorkspaceMgr workspaceMgr)
        Description copied from interface: Layer
        Calculate the gradient relative to the error in the next layer
        Specified by:
        backpropGradient in interface Layer
        Parameters:
        epsilon - w^(L+1)*delta^(L+1). Or, equivalently: dC/da, i.e., (dC/dz)*(dz/da) = dC/da, where C is the cost function and a = sigma(z) is the activation.
        workspaceMgr - Workspace manager
        Returns:
        Pair where Gradient is gradient for this layer, INDArray is epsilon (activation gradient) needed by next layer, but before element-wise multiply by sigmaPrime(z). So for standard feed-forward layer, if this layer is L, then return.getSecond() == dL/dIn = (w^(L)*(delta^(L))^T)^T. Note that the returned array should be placed in the ArrayType.ACTIVATION_GRAD workspace via the workspace manager
      • activate

        public INDArray activate​(boolean training,
                                 LayerWorkspaceMgr workspaceMgr)
        Description copied from interface: Layer
        Perform forward pass and return the activations array with the last set input
        Specified by:
        activate in interface Layer
        Parameters:
        training - training or test mode
        workspaceMgr - Workspace manager
        Returns:
        the activation (layer output) of the last specified input. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager
      • activate

        public INDArray activate​(INDArray input,
                                 boolean training,
                                 LayerWorkspaceMgr workspaceMgr)
        Description copied from interface: Layer
        Perform forward pass and return the activations array with the specified input
        Specified by:
        activate in interface Layer
        Parameters:
        input - the input to use
        training - train or test mode
        workspaceMgr - Workspace manager.
        Returns:
        Activations array. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager
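        A sketch of a standalone forward pass (vae as in the sketch under fit above; using LayerWorkspaceMgr.noWorkspaces() here is an assumption that simply disables workspaces for one-off calls):

            import org.deeplearning4j.nn.workspace.LayerWorkspaceMgr;

            // Inference-mode forward pass (training = false) on a features matrix
            INDArray out = vae.activate(features, false, LayerWorkspaceMgr.noWorkspaces());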
      • addListeners

        public void addListeners​(TrainingListener... listeners)
        This method ADDS the given TrainingListener instances to the existing listeners
        Specified by:
        addListeners in interface Model
        Parameters:
        listeners - the listeners to add
      • setIndex

        public void setIndex​(int index)
        Description copied from interface: Layer
        Set the layer index.
        Specified by:
        setIndex in interface Layer
      • getIndex

        public int getIndex()
        Description copied from interface: Layer
        Get the layer index.
        Specified by:
        getIndex in interface Layer
      • setInputMiniBatchSize

        public void setInputMiniBatchSize​(int size)
        Description copied from interface: Layer
        Set current/last input mini-batch size.
        Used for score and gradient calculations. Mini batch size may be different from getInput().size(0) due to reshaping operations - for example, when using RNNs with DenseLayer and OutputLayer. Called automatically during forward pass.
        Specified by:
        setInputMiniBatchSize in interface Layer
      • isPretrainLayer

        public boolean isPretrainLayer()
        Description copied from interface: Layer
        Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc)
        Specified by:
        isPretrainLayer in interface Layer
        Returns:
        true if the layer can be pretrained (using fit(INDArray)), false otherwise
      • allowInputModification

        public void allowInputModification​(boolean allow)
        Description copied from interface: Layer
        A performance optimization: mark whether the layer is allowed to modify its input array in-place. In many cases, this is totally safe - in others, the input array will be shared by multiple layers, and hence it's not safe to modify the input array. This is usually used by ops such as dropout.
        Specified by:
        allowInputModification in interface Layer
        Parameters:
        allow - If true: the input array is safe to modify. If false: the input array should be copied before it is modified (i.e., in-place modifications are unsafe)
      • feedForwardMaskArray

        public Pair<INDArray,​MaskState> feedForwardMaskArray​(INDArray maskArray,
                                                                   MaskState currentMaskState,
                                                                   int minibatchSize)
        Description copied from interface: Layer
        Feed forward the input mask array, setting it in the layer as appropriate. This allows different layers to handle masks differently - for example, bidirectional RNNs and normal RNNs operate differently with masks: the former set activations to 0 outside of the data-present region (and keep the mask active for future layers such as dense layers), whereas normal RNNs don't zero out the activations/errors, instead relying on backpropagated error arrays to handle the variable-length case.
        This is also used for example for networks that contain global pooling layers, arbitrary preprocessors, etc.
        Specified by:
        feedForwardMaskArray in interface Layer
        Parameters:
        maskArray - Mask array to set
        currentMaskState - Current state of the mask - see MaskState
        minibatchSize - Current minibatch size. Needs to be known as it cannot always be inferred from the activations array due to reshaping (such as a DenseLayer within a recurrent neural network)
        Returns:
        New mask array after this layer, along with the new mask state.
      • getHelper

        public LayerHelper getHelper()
        Specified by:
        getHelper in interface Layer
        Returns:
        The layer helper, if any
      • fit

        public void fit()
        Description copied from interface: Model
        All models have a fit method
        Specified by:
        fit in interface Model
      • reconstructionProbability

        public INDArray reconstructionProbability​(INDArray data,
                                                  int numSamples)
        Calculate the reconstruction probability, as described in An & Cho, 2015 - "Variational Autoencoder based Anomaly Detection using Reconstruction Probability" (Algorithm 4)
        The authors describe it as follows: "This is essentially the probability of the data being generated from a given latent variable drawn from the approximate posterior distribution."

        Specifically, for each example x in the input, calculate p(x). Note however that p(x) is a stochastic (Monte-Carlo) estimate of the true p(x), based on the specified number of samples. More samples will produce a more accurate (lower variance) estimate of the true p(x) for the current model parameters.

        Internally uses reconstructionLogProbability(INDArray, int) for the actual implementation. That method may be more numerically stable in some cases.

        The returned array is a column vector of reconstruction probabilities, one for each example. Thus, reconstruction probabilities can (and should, for efficiency) be calculated in a batched manner.
        Parameters:
        data - The data to calculate the reconstruction probability for
        numSamples - Number of samples on which to base the reconstruction probability.
        Returns:
        Column vector of reconstruction probabilities for each example (shape: [numExamples,1])
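        A sketch of using this for anomaly detection (vae as in the sketch under fit above; the sample count of 16 is illustrative):

            import org.nd4j.linalg.api.ndarray.INDArray;

            // Monte-Carlo estimate of p(x) per example; more samples = lower variance, higher cost
            INDArray p = vae.reconstructionProbability(testFeatures, 16);
            // p has shape [numExamples, 1]; unusually low p(x) marks anomaly candidates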
      • reconstructionLogProbability

        public INDArray reconstructionLogProbability​(INDArray data,
                                                     int numSamples)
        Return the log reconstruction probability given the specified number of samples.
        See reconstructionProbability(INDArray, int) for more details
        Parameters:
        data - The data to calculate the log reconstruction probability for
        numSamples - Number of samples on which to base the reconstruction probability.
        Returns:
        Column vector of reconstruction log probabilities for each example (shape: [numExamples,1])
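        A sketch of the log variant, which is preferable when p(x) would underflow (vae and testFeatures as in the sketches above):

            import org.nd4j.linalg.ops.transforms.Transforms;

            // Log-probabilities can be thresholded/ranked directly in log space
            INDArray logP = vae.reconstructionLogProbability(testFeatures, 16);
            INDArray p = Transforms.exp(logP);   // only if raw probabilities are really needed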
      • generateAtMeanGivenZ

        public INDArray generateAtMeanGivenZ​(INDArray latentSpaceValues)
        Given specified values for the latent space as input (the latent space being z in p(z|data)), generate output from P(x|z), where x = E[P(x|z)]
        i.e., return the mean value for the distribution P(x|z)
        Parameters:
        latentSpaceValues - Values for the latent space. size(1) must equal the nOut configuration parameter
        Returns:
        Sample of data: E[P(x|z)]
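        A sketch decoding latent points drawn from the standard normal prior (vae as in the sketch under fit above; the 32 here must match this layer's nOut setting):

            import org.nd4j.linalg.factory.Nd4j;

            // 10 latent points, each of dimension 32
            INDArray z = Nd4j.randn(10, 32);
            INDArray meanX = vae.generateAtMeanGivenZ(z);   // E[P(x|z)], one row per latent point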
      • generateRandomGivenZ

        public INDArray generateRandomGivenZ​(INDArray latentSpaceValues,
                                             LayerWorkspaceMgr workspaceMgr)
        Given specified values for the latent space as input (the latent space being z in p(z|data)), randomly generate output x, where x ~ P(x|z)
        Parameters:
        latentSpaceValues - Values for the latent space. size(1) must equal the nOut configuration parameter
        workspaceMgr - Workspace manager
        Returns:
        Sample of data: x ~ P(x|z)
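        A sketch of sampling rather than taking the mean (vae and z as in the sketches above; passing LayerWorkspaceMgr.noWorkspaces() is an assumption for standalone use):

            // Draw x ~ P(x|z) instead of returning E[P(x|z)]
            INDArray sampledX = vae.generateRandomGivenZ(z, LayerWorkspaceMgr.noWorkspaces());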
      • hasLossFunction

        public boolean hasLossFunction()
        Does the reconstruction distribution have a loss function (such as mean squared error) or is it a standard probabilistic reconstruction distribution?
      • reconstructionError

        public INDArray reconstructionError​(INDArray data)
        Return the reconstruction error for this variational autoencoder.
        NOTE (important): This method is used ONLY for VAEs that have a standard neural network loss function (i.e., an ILossFunction instance such as mean squared error) instead of using a probabilistic reconstruction distribution P(x|z) for the reconstructions (as presented in the VAE architecture by Kingma and Welling).
        You can check if the VAE has a loss function using hasLossFunction()
        Consequently, the reconstruction error is a simple deterministic function (no Monte-Carlo sampling is required, unlike reconstructionProbability(INDArray, int) and reconstructionLogProbability(INDArray, int))
        Parameters:
        data - The data to calculate the reconstruction error on
        Returns:
        Column vector of reconstruction errors for each example (shape: [numExamples,1])
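        A sketch that dispatches on the configuration, per the note above (vae and testFeatures as in the sketches above; the sample count is illustrative):

            // Use the deterministic reconstruction error when an ILossFunction is configured,
            // otherwise fall back to the Monte-Carlo log-probability estimate
            INDArray scores = vae.hasLossFunction()
                    ? vae.reconstructionError(testFeatures)
                    : vae.reconstructionLogProbability(testFeatures, 16);
            // Either way: one score per example, shape [numExamples, 1]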
      • assertInputSet

        public void assertInputSet​(boolean backprop)
      • close

        public void close()
        Specified by:
        close in interface Model