Helper to produce functions used to create the covariate data passed to the dataset generators.
Helper to provide a way to construct dependent variables.
A function that takes a value and returns extracted features and information on missing and erring features.
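As a rough illustration of the shape described above, the sketch below returns both the extracted values and a record of which features were missing or threw. `FeatureProblems`, `Input`, `Feature`, and `rowCreator` are hypothetical names invented for this example, not the library's API.

```scala
import scala.util.{Failure, Success, Try}

object RowCreatorSketch {
  // Hypothetical record of features that were absent or failed to compute.
  final case class FeatureProblems(missing: Seq[String], erring: Seq[String])

  type Input = Map[String, String]
  // A named feature: returns None when the value is missing, may throw on bad data.
  type Feature = (String, Input => Option[Double])

  // Builds a function from an input to (problem report, extracted values).
  def rowCreator(features: Seq[Feature]): Input => (FeatureProblems, Seq[Option[Double]]) =
    input => {
      // Evaluate every feature, capturing exceptions instead of propagating them.
      val results = features.map { case (name, f) => name -> Try(f(input)) }
      val missing = results.collect { case (n, Success(None)) => n }
      val erring  = results.collect { case (n, Failure(_)) => n }
      // Missing and erring features both yield None in the output row.
      val values  = results.map { case (_, r) => r.getOrElse(None) }
      (FeatureProblems(missing, erring), values)
    }
}
```

Returning the problem report alongside the row lets the caller decide whether a partially populated row is acceptable.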
Given a Semantics, a JSON specification, and an ordered sequence of RowCreatorProducers, find the first producer that can create a Spec from the JSON specification and use it to instantiate the RowCreator object.
the type consumed by the RowCreator produced by this Readable.
the type produced by the RowCreator produced by this Readable.
the implementation of RowCreator.
a Semantics to be used for creating the RowCreator.
an ordered sequence of RowCreatorProducers. These producers form the basis of a chain of responsibility pattern. Therefore, the order is important.
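The chain-of-responsibility lookup described above might be sketched as follows. This is a minimal illustration under stated assumptions: `Semantics`, `Json`, `Producer`, `KindProducer`, and `firstApplicable` are simplified, hypothetical stand-ins, and a plain `Map[String, String]` takes the place of a real JSON value.

```scala
import scala.util.{Failure, Success, Try}

object ChainSketch {
  // Hypothetical, simplified stand-ins for the library's types.
  final case class Semantics(name: String)
  type Json = Map[String, String]

  trait Producer[A, B] {
    type Spec
    // Succeeds only if this producer understands the specification.
    def parse(json: Json): Try[Spec]
    def getRowCreator(semantics: Semantics, spec: Spec): Try[A => B]
  }

  // Hypothetical producer that applies only when json("type") equals `kind`.
  final class KindProducer(kind: String) extends Producer[String, String] {
    type Spec = String
    def parse(json: Json): Try[Spec] =
      json.get("type").filter(_ == kind) match {
        case Some(k) => Success(k)
        case None    => Failure(new IllegalArgumentException(s"not a $kind spec"))
      }
    def getRowCreator(semantics: Semantics, spec: Spec): Try[String => String] =
      Success((input: String) => s"$spec:$input")
  }

  // Walk the producers in order; the first one whose parse succeeds is the
  // one used to build the row creator, even if that build then fails.
  def firstApplicable[A, B](
      semantics: Semantics,
      json: Json,
      producers: Seq[Producer[A, B]]
  ): Try[A => B] =
    producers.iterator
      .map(p => p.parse(json).map(spec => () => p.getRowCreator(semantics, spec)))
      .collectFirst { case Success(build) => build() }
      .getOrElse(Failure(new NoSuchElementException("no producer applies")))
}
```

Because the iterator is lazy, producers after the first applicable one are never consulted, which is why the order of the sequence matters.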
RowCreatorProducer is used to create different kinds of RowCreator instances.
Classes that extend RowCreatorProducer should (try to) have only zero-argument constructors.
This is because RowCreator instances should ideally only be parametrized by the JSON specification. Otherwise, one JSON specification could produce non-equivalent RowCreator instances in different environments.
This statelessness is a design goal and should only be broken with good reason.
One of the reasons this rule will likely be broken is that things like context bounds on a type parameter to a RowCreatorProducer become constructor arguments. So if a RowCreatorProducer is parametrized by a type that requires a type class to decode the JSON representation, this rule would be broken.
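To see why a context bound breaks the zero-argument rule, consider the sketch below. `Decoder`, `LabelProducer`, and `LabelProducerExplicit` are hypothetical names used only for illustration. The two classes are equivalent: the Scala compiler rewrites a context bound into an implicit constructor parameter.

```scala
import scala.util.Try

object ContextBoundSketch {
  // Hypothetical type class for decoding a value from its raw JSON text.
  trait Decoder[K] { def decode(raw: String): Option[K] }

  object Decoder {
    implicit val intDecoder: Decoder[Int] = new Decoder[Int] {
      def decode(raw: String): Option[Int] = Try(raw.trim.toInt).toOption
    }
  }

  // Written with a context bound: looks like a zero-argument constructor ...
  final class LabelProducer[K: Decoder] {
    def parse(raw: String): Option[K] = implicitly[Decoder[K]].decode(raw)
  }

  // ... but desugars to a constructor taking an implicit Decoder[K] argument.
  final class LabelProducerExplicit[K](implicit decoder: Decoder[K]) {
    def parse(raw: String): Option[K] = decoder.decode(raw)
  }
}
```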
Another example might be in training multi-label models. Whereas in binary classifiers the label values are known automatically (because they are isomorphic to the set {true, false}), in multi-label models the label set isn't known a priori (because each problem's codomain might be different). Therefore, we might ask for the set of labels to expect.
NOTE: com.eharmony.aloha.dataset.RowCreatorProducerTest will be used to control which RowCreatorProducers can accept parameters.
type of input passed to the RowCreator.
type of output returned from the RowCreator.
implementation of the RowCreator that is returned by the getRowCreator function.
A mixin that gives a standardized name to RowCreatorProducer instances.
A row creator that requires state. This state should be modeled functionally, meaning implementations should be referentially transparent.
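One way to model such state functionally is to pass the current state in and return the next state alongside the output, so nothing is mutated. The sketch below assumes this shape; `StatefulCreator`, `mapWithState`, and `Indexed` are hypothetical names, not the library's API.

```scala
object StatefulSketch {
  // Hypothetical shape: the state is an explicit argument and an explicit result.
  trait StatefulCreator[A, B, S] {
    def initialState: S
    def apply(a: A, state: S): (B, S)

    // Thread the state through a sequence without mutating anything.
    def mapWithState(as: Seq[A]): (Vector[B], S) =
      as.foldLeft((Vector.empty[B], initialState)) { case ((out, s), a) =>
        val (b, next) = apply(a, s)
        (out :+ b, next)
      }
  }

  // Example: prefix each row with a running index held in the state.
  object Indexed extends StatefulCreator[String, String, Int] {
    val initialState: Int = 0
    def apply(a: String, state: Int): (String, Int) = (s"$state:$a", state + 1)
  }
}
```

Since `apply` depends only on its arguments, calling it twice with the same input and state gives the same result, which is the referential transparency the description asks for.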
Created by ryan.deak on 11/2/17.
Created by deaktator on 11/6/17.
This is for backward compatibility, but should someday be removed and SpecBuilder should be updated to remove it too. Eventually, the spec type should always appear in the JSON used to create the JsonSpec.
This package name is kept for eHarmony compatibility.