package org.pmml4s.model

PMML is a standard for XML documents that express trained instances of analytic models. The classes of model addressed by this package are listed under Type Members.

Type Members

  1. class AnomalyDetectionAttributes extends ModelAttributes with HasAnomalyDetectionAttributes

    Holds attributes of an Anomaly Detection Model.

  2. class AnomalyDetectionModel extends Model with HasWrappedAnomalyDetectionAttributes

    Anomaly detection (also called outlier detection) is the identification of items, events or observations that do not conform to an expected pattern or to other items in a data set. Traditional approaches are distance-based or density-based. Common ways to define distance or density are the distance to the k-nearest neighbors or the count of points within a given fixed radius. These methods, however, cannot handle data sets with regions of different densities and do not scale well to large data. Other algorithms have been proposed that handle such cases better; the PMML standard currently supports three of them:

    - Isolation Forest
    - One Class SVM
    - Clustering mean distance based anomaly detection model

    Other models can also be used if their scoring follows PMML standard rules.

  3. class AnomalyDetectionOutput extends RegOutputs

  4. class AssociationAttributes extends ModelAttributes with HasAssociationAttributes

  5. class AssociationModel extends Model with HasWrappedAssociationAttributes

    The Association Rule model represents rules where some set of items is associated to another set of items. For example a rule can express that a certain product or set of products is often bought in combination with a certain set of other products, also known as Market Basket Analysis. An Association Rule model typically has two variables: one for grouping records together into transactions (usageType="group") and another that uniquely identifies each record (usageType="active"). Alternatively, association rule models can be built on regular data, where each category of each categorical field is an item. Yet another possible format of data is a table with true/false values, where only the fields having true value in a record are considered valid items.

    An Association Rule model consists of four major parts:

    - Model attributes
    - Items
    - ItemSets
    - AssociationRules

  6. class AssociationOutputs extends ModelOutputs with HasAssociationRules

  7. class AssociationRule extends HasPredictedValue with HasEntityId with HasConfidence with PmmlElement

    Represents an association rule of the form "<antecedent itemset> => <consequent itemset>".
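
    To make the rule form concrete, the following sketch treats a rule as a pair of item sets and checks whether it applies to a basket. It is an illustration only, not the pmml4s implementation: SimpleAssocRule, fires and recommend are hypothetical names, and ranking firing rules by confidence is an assumption made for the example.

```scala
object AssociationRuleSketch {
  // Hypothetical, simplified rule; not the pmml4s AssociationRule class.
  final case class SimpleAssocRule(
    antecedent: Set[String],
    consequent: Set[String],
    confidence: Double)

  // A rule fires for a basket when every antecedent item is in the basket.
  def fires(rule: SimpleAssocRule, basket: Set[String]): Boolean =
    rule.antecedent.subsetOf(basket)

  // Consequent items of the firing rules, highest confidence first,
  // skipping items the basket already contains.
  def recommend(rules: Seq[SimpleAssocRule], basket: Set[String]): Seq[String] =
    rules
      .filter(fires(_, basket))
      .sortWith(_.confidence > _.confidence)
      .flatMap(_.consequent.toSeq.sorted)
      .filterNot(basket.contains)
      .distinct
}
```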

  8. class Attribute extends Predicate with PmmlElement

    Input attributes for each scorecard characteristic are defined in terms of predicates. For numeric characteristics, predicates implement the mapping from a range of continuous values to a partial score. For example, the age range 20 to 29 may map to the partial score "15". For categorical characteristics, predicates implement the mapping of categorical values to partial scores. Note that while predicates will not (typically) overlap, the Scoring Procedure requires the ordering of Attributes to be respected, and the first matching Attribute determines the partial score.
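
    The first-match rule can be sketched as below. This is a minimal illustration under stated assumptions: SimpleAttribute is a hypothetical stand-in for an Attribute (a predicate reduced to a Scala function plus a partial score), and the overlapping second range exists only to show that ordering decides the result.

```scala
object AttributeSketch {
  // Hypothetical stand-in for a scorecard Attribute.
  final case class SimpleAttribute(matches: Double => Boolean, partialScore: Double)

  // Attributes are tested in the order given; the first match determines the score.
  def partialScore(attributes: Seq[SimpleAttribute], value: Double): Option[Double] =
    attributes.find(_.matches(value)).map(_.partialScore)

  val ageAttributes: Seq[SimpleAttribute] = Seq(
    SimpleAttribute(age => age >= 20 && age <= 29, 15.0),
    SimpleAttribute(age => age >= 25 && age <= 39, 22.0), // overlaps on purpose
    SimpleAttribute(_ => true, 5.0)                       // catch-all
  )
}
```

    With these attributes, an age of 27 matches both of the first two predicates, and the first one in the list supplies the partial score.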

  9. class BaseCumHazardTables extends PmmlElement

  10. class BaselineCell extends PmmlElement

  11. class BaselineStratum extends PmmlElement

  12. class BayesInput extends PmmlElement

    For a discrete field, each BayesInput contains the counts pairing the discrete values of that field with those of the target field. For a continuous field, the BayesInput element lists the distributions obtained for that field with each value of the target field. BayesInput may also be used to define how continuous values are encoded as discrete bins. (Discretization is achieved using DerivedField; only the Discretize mapping for DerivedField may be invoked here).

    Note that a BayesInput element encompasses either one TargetValueStats element or one or more PairCounts elements. Element DerivedField can only be used in conjunction with PairCounts.

  13. class BayesInputs extends PmmlElement

    Contains several BayesInput elements.

  14. class BayesOutput extends PmmlElement

    Contains the counts associated with the values of the target field.

  15. class Categories extends PmmlElement

  16. class Category extends PmmlElement

  17. class Characteristic extends PmmlElement

    Defines the point allocation strategy for each scorecard characteristic (numeric or categorical). Once point allocation between input attributes and partial scores takes place, each scorecard characteristic is assigned a single partial score which is used to compute the overall score. The overall score is simply the sum of all partial scores. Partial scores are assumed to be continuous values of type "double".

  18. class Characteristics extends PmmlElement

    Envelopes for all scorecard characteristics.

  19. class Cluster extends PmmlElement

    A cluster is defined by its center vector or by statistics. A center vector is implemented by a NUM-ARRAY. Each Partition corresponds to a cluster and holds field statistics to describe it. The definition of a cluster may contain a center vector as well as statistics. The attribute modelClass in the ClusteringModel defines which one is used to actually define the cluster.

  20. class ClusteringAttributes extends ModelAttributes with HasClusteringAttributes

  21. class ClusteringField extends PmmlElement

  22. class ClusteringModel extends Model with HasWrappedClusteringAttributes

    A cluster model basically consists of a set of clusters. For each cluster a center vector can be given. In center-based models a cluster is defined by a vector of center coordinates. Some distance measure is used to determine the nearest center, that is the nearest cluster for a given input record. For distribution-based models (e.g., in demographic clustering) the clusters are defined by their statistics. Some similarity measure is used to determine the best matching cluster for a given record. The center vectors then only approximate the clusters.
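
    For the center-based case, the nearest-center lookup can be sketched as follows. The sketch assumes squared Euclidean distance as the measure, which is only one of the possible choices, and the names squaredEuclidean and nearestCluster are hypothetical rather than part of pmml4s.

```scala
object NearestCenterSketch {
  // Squared Euclidean distance between two vectors of equal dimension.
  def squaredEuclidean(a: Seq[Double], b: Seq[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum
  }

  // Index of the cluster whose center vector is nearest to the record.
  def nearestCluster(centers: Seq[Seq[Double]], record: Seq[Double]): Int = {
    require(centers.nonEmpty, "at least one cluster center is required")
    centers.indices.minBy(i => squaredEuclidean(centers(i), record))
  }
}
```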

  23. class ClusteringOutputs extends CluOutputs

  24. class Coefficient extends PmmlElement

    Describes a single support vector coefficient αi.

  25. class Coefficients extends PmmlElement

    Used to store the support vector coefficients αi and b.

  26. class Comparisons extends PmmlElement

    Comparisons is a matrix which contains the similarity values or distance values, depending on the attribute modelClass in ClusteringModel. The order of the rows and columns corresponds to the order of discrete values or intervals in that field.

  27. class ComplexPartialScore extends RegressionEvaluator with PmmlElement

    Defines ComplexPartialScore. The actual partial score is the value returned by the EXPRESSION (see org.pmml4s.transformations for more information).

  28. class CompoundRule extends Rule with PmmlElement

    CompoundRule consists of a predicate and one or more rules. CompoundRules offer a shorthand for a more compact representation of rulesets and suggest a more efficient execution mechanism.

  29. class Con extends PmmlElement

    Defines the connections coming into that parent element. The neuron identified by from may be part of any layer.

  30. class Covariances extends PmmlElement

    Stores coordinate-by-coordinate variances (diagonal cells) and covariances (non-diagonal cells).

  31. class CovariateList extends PmmlElement

    List of covariate names. Will not be present when there is no covariate. Each name in the list must match a DataField name or a DerivedField name. The covariates will be treated as continuous variables.

  32. class DataModel extends Model

    DataModel is a container for all metadata information; it is the parent model of all predictive models.

  33. class DecisionTree extends EmbeddedModel

  34. abstract class EmbeddedModel extends Model

    Model Composition

  35. class EventValues extends PmmlElement

  36. class FactorList extends PmmlElement

    List of factor (categorical predictor) names. Not present if this particular regression flavor does not support factors (e.g., linear regression). If present, the list may or may not be empty. Each name in the list must match a DataField name or a DerivedField name. The factors must be categorical variables.

  37. class GeneralRegressionAttributes extends ModelAttributes with HasGeneralRegressionAttributes

  38. class GeneralRegressionModel extends Model with HasWrappedGeneralRegressionAttributes

    Definition of a general regression model. As the name suggests, it is intended to support a multitude of regression models.

  39. class GeneralRegressionOutputs extends MixedClsWithRegOutputs

  40. trait HasAnomalyDetectionAttributes extends HasModelAttributes

  41. trait HasAssociationAttributes extends HasModelAttributes

  42. trait HasClusteringAttributes extends HasModelAttributes

  43. trait HasGeneralRegressionAttributes extends HasModelAttributes

  44. trait HasNaiveBayesAttributes extends HasModelAttributes

  45. trait HasNearestNeighborAttributes extends HasModelAttributes

  46. trait HasNeuralNetworkAttributes extends HasModelAttributes

  47. trait HasRegressionAttributes extends HasModelAttributes

  48. trait HasScorecardAttributes extends HasModelAttributes

  49. trait HasSupportVectorMachineAttributes extends HasModelAttributes

  50. trait HasTreeAttributes extends HasModelAttributes

  51. trait HasWrappedAnomalyDetectionAttributes extends HasWrappedModelAttributes with HasAnomalyDetectionAttributes

  52. trait HasWrappedAssociationAttributes extends HasWrappedModelAttributes with HasAssociationAttributes

  53. trait HasWrappedClusteringAttributes extends HasWrappedModelAttributes with HasClusteringAttributes

  54. trait HasWrappedGeneralRegressionAttributes extends HasWrappedModelAttributes with HasGeneralRegressionAttributes

  55. trait HasWrappedNaiveBayesAttributes extends HasWrappedModelAttributes with HasNaiveBayesAttributes

  56. trait HasWrappedNearestNeighborAttributes extends HasWrappedModelAttributes with HasNearestNeighborAttributes

  57. trait HasWrappedNeuralNetworkAttributes extends HasWrappedModelAttributes with HasNeuralNetworkAttributes

  58. trait HasWrappedRegressionAttributes extends HasWrappedModelAttributes with HasRegressionAttributes

  59. trait HasWrappedScorecardAttributes extends HasWrappedModelAttributes with HasScorecardAttributes

  60. trait HasWrappedSupportVectorMachineAttributes extends HasWrappedModelAttributes with HasSupportVectorMachineAttributes

  61. trait HasWrappedTreeAttributes extends HasWrappedModelAttributes with HasTreeAttributes

  62. class InstanceField extends PmmlElement

  63. class InstanceFields extends PmmlElement

    Serves as an envelope for all the fields included in the training instances. It encapsulates InstanceField elements.

  64. class Item extends PmmlElement

    The id of an Item must be unique. Furthermore, the Item values must be unique, or if they are not unique, then the attributes field and category must distinguish them. That is, an AssociationModel must not have different instances of Item where the values of the value, field, and category attributes are all the same. The entries in mappedValue may be the same, though.

  65. class ItemRef extends PmmlElement

    Item references point to elements of type Item

  66. class Itemset extends PmmlElement

  67. class KNNInput extends PmmlElement

  68. class KNNInputs extends PmmlElement

    Encapsulates several KNNInput elements, which define the fields used to query the k-NN model, one KNNInput element per field.

  69. trait KernelType extends AnyRef

  70. class KohonenMap extends PmmlElement

    The element KohonenMap is appropriate for clustering models that were produced by a Kohonen map algorithm. The attributes coord1, coord2 and coord3 describe the position of the current cluster in a map with up to three dimensions. This element is not relevant to the scoring function.

  71. class LinearKernelType extends KernelType with PmmlElement

    Linear basis functions which lead to a hyperplane as classifier. K(x,y) = <x,y>

  72. class MeanClusterDistances extends PmmlElement

    Contains an array of non-negative real values; it is required when the algorithm type is clusterMeanDist. The length of the array must equal the number of clusters in the model, and the values in it are the mean distances/similarities to the center for each cluster.

  73. class MiningModel extends Model with HasWrappedModelAttributes

    The element MiningModel allows precise specification of the usage of multiple models within one PMML file. The two main approaches are Model Composition, and Segmentation.

    Model Composition includes model sequencing and model selection but is only applicable to Tree and Regression models. Segmentation allows representation of different models for different data segments and also can be used for model ensembles and model sequences. Scoring a case using a model ensemble consists of scoring it using each model separately, then combining the results into a single scoring result using one of the pre-defined combination methods. Scoring a case using a sequence, or chain, of models allows the output of one model to be passed in as input to subsequent models.

    ModelComposition uses "embedded model elements" that are defeatured copies of "standalone model elements" -- specifically, Regression for RegressionModel, DecisionTree for TreeModel. Besides being limited to Regression and Tree models, these embedded model elements lack key features like a MiningSchema (essential to manage scope across multiple model elements). Therefore, in PMML 4.2, the Model Composition approach has been deprecated since the Segmentation approach allows for a wider range of models to be used more reliably. For more on deprecation, see Conformance.

    Segmentation is accomplished by using any PMML model element inside of a Segment element, which also contains a PREDICATE and an optional weight. MiningModel then contains a Segmentation element with a number of Segment elements as well as the attribute multipleModelMethod, which specifies how all the models applicable to a record should be combined. It is also possible to use a combination of model composition and segmentation approaches, using simple regression or decision trees for data preprocessing before segmentation.
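
    The way a Segmentation combines applicable segments can be sketched for two combination methods, selectFirst and weightedAverage. The sketch is an illustration under assumptions, not the pmml4s code: SimpleSegment is a hypothetical type in which the predicate and the wrapped model are reduced to plain functions over a record of numeric fields.

```scala
object SegmentationSketch {
  type Record = Map[String, Double]

  // Hypothetical segment: predicate, weight, and a model reduced to a scoring function.
  final case class SimpleSegment(
    predicate: Record => Boolean,
    weight: Double,
    score: Record => Double)

  // selectFirst: only the first segment whose predicate is true is scored.
  def selectFirst(segments: Seq[SimpleSegment], record: Record): Option[Double] =
    segments.find(_.predicate(record)).map(_.score(record))

  // weightedAverage: weighted mean of the scores of all applicable segments.
  def weightedAverage(segments: Seq[SimpleSegment], record: Record): Option[Double] = {
    val active = segments.filter(_.predicate(record))
    val totalWeight = active.map(_.weight).sum
    if (active.isEmpty || totalWeight == 0.0) None
    else Some(active.map(s => s.weight * s.score(record)).sum / totalWeight)
  }
}
```

    Both functions return None when no segment applies, which leaves the handling of that case to the caller.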

  74. class MiningOutputs extends ClsOutputs with RegOutputs with CluOutputs with SegmentOutputs

  75. class MissingValueWeights extends PmmlElement

    MissingValueWeights is used to adjust distance or similarity measures for missing data.

  76. abstract class Model extends HasParent with HasVersion with HasWrappedModelAttributes with HasMiningSchema with HasOutput with HasModelStats with HasModelExplanation with HasTargets with HasLocalTransformations with FieldScope with ModelLocation with HasTargetFields with Predictable with HasModelVerification with PmmlElement

    Abstract class that represents a PMML model

  77. sealed trait ModelElement extends AnyRef

  78. trait ModelLocation extends AnyRef

  79. class MutableModel extends Model

  80. class NaiveBayesAttributes extends ModelAttributes with HasNaiveBayesAttributes

  81. class NaiveBayesModel extends Model with HasWrappedNaiveBayesAttributes

    Naïve Bayes uses Bayes' Theorem, combined with a ("naive") presumption of conditional independence, to predict the value of a target (output), from evidence given by one or more predictor (input) fields.

    Naïve Bayes models require the target field to be discretized so that a finite number of values are considered by the model.
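
    As a rough illustration of the idea, the sketch below turns raw counts into normalized posterior probabilities for discrete inputs. It is a simplification under stated assumptions: the names are hypothetical, every target count is assumed to be positive, and the handling of zero counts and of continuous inputs is left out.

```scala
object NaiveBayesSketch {
  // targetCounts: number of training records per target value (assumed > 0).
  // pairCounts(field)(inputValue)(targetValue): co-occurrence counts.
  def posterior(
      targetCounts: Map[String, Double],
      pairCounts: Map[String, Map[String, Map[String, Double]]],
      record: Map[String, String]): Map[String, Double] = {
    val unnormalized = targetCounts.map { case (target, targetCount) =>
      // Conditional independence: multiply the per-field likelihoods.
      val likelihood = record.foldLeft(1.0) { case (acc, (field, value)) =>
        val count = pairCounts.get(field)
          .flatMap(_.get(value))
          .flatMap(_.get(target))
          .getOrElse(0.0)
        acc * count / targetCount
      }
      target -> targetCount * likelihood
    }
    val total = unnormalized.values.sum
    if (total == 0.0) unnormalized
    else unnormalized.map { case (target, p) => target -> p / total }
  }
}
```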

  82. class NaiveBayesOutputs extends ClsOutputs

  83. class NearestNeighborAttributes extends ModelAttributes with HasNearestNeighborAttributes

  84. class NearestNeighborModel extends Model with HasWrappedNearestNeighborAttributes

    k-Nearest Neighbors (k-NN) is an instance-based learning algorithm. In a k-NN model, a hypothesis or generalization is built from the training data directly at the time a query is made to the system. The prediction is based on the K training instances closest to the case being scored. Therefore, all training cases have to be stored, which may be problematic when the amount of data is large. This model has the ability to store the data directly in PMML using InlineTable or elsewhere using the TableLocator element defined in the Taxonomy document.

    A k-NN model can have one or more target variables or no targets. When one or more targets are present, the predicted value is computed based on the target values of the nearest neighbors. When no targets are present, the model specifies a case ID variable for the training data. In this way, one can easily obtain the IDs of the K closest training cases (nearest neighbors).

    A k-NN model consists of four major parts:

    - Model attributes
    - Training instances
    - Comparison measure
    - Input fields
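
    The scoring idea described above can be sketched as follows: find the K stored training instances closest to the query, then derive a prediction from their targets. This is an illustration only. TrainingInstance, nearest and predict are hypothetical names, Euclidean distance stands in for the comparison measure, and majority voting with ties resolved in favor of the nearer neighbor is an assumption.

```scala
object KnnSketch {
  // Hypothetical stored training case with a case ID and a categorical target.
  final case class TrainingInstance(id: String, features: Seq[Double], target: String)

  def euclidean(a: Seq[Double], b: Seq[Double]): Double =
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)

  // The k training instances closest to the query, nearest first.
  def nearest(train: Seq[TrainingInstance], query: Seq[Double], k: Int): Seq[TrainingInstance] =
    train.sortBy(t => euclidean(t.features, query)).take(k)

  // Majority vote over the neighbors' targets.
  def predict(train: Seq[TrainingInstance], query: Seq[Double], k: Int): Option[String] = {
    val neighbors = nearest(train, query, k)
    val counts = neighbors.groupBy(_.target).map { case (t, ns) => t -> ns.size }
    // distinct keeps nearest-first order and sortBy is stable, so ties go to the nearer target.
    neighbors.map(_.target).distinct.sortBy(t => -counts(t)).headOption
  }
}
```

    When the model has no target, the ids returned by nearest correspond to the case IDs of the closest training cases.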

  85. class NearestNeighborModelOutputs extends KNNOutputs

  86. class NeuralInput extends PmmlElement

    Defines how input fields are normalized so that the values can be processed in the neural network. For example, string values must be encoded as numeric values.

  87. class NeuralInputs extends PmmlElement

    An input neuron represents the normalized value for an input field. A numeric input field is usually mapped to a single input neuron while a categorical input field is usually mapped to a set of input neurons using some fan-out function. The normalization is defined using the elements NormContinuous and NormDiscrete defined in the Transformation Dictionary. The element DerivedField is the general container for these transformations.

  88. class NeuralLayer extends PmmlElement

  89. class NeuralNetwork extends Model with HasWrappedNeuralNetworkAttributes

    A neural network has one or more input nodes and one or more neurons. Some neurons' outputs are the output of the network. The network is defined by the neurons and their connections, aka weights. All neurons are organized into layers; the sequence of layers defines the order in which the activations are computed. All output activations for neurons in some layer L are evaluated before computation proceeds to the next layer L+1. Note that this allows for recurrent networks where outputs of neurons in layer L+i can be used as input in layer L where L+i > L. The model does not define a specific evaluation order for neurons within a layer.

  90. class NeuralNetworkAttributes extends ModelAttributes with HasNeuralNetworkAttributes

  91. class NeuralNetworkOutputs extends MixedClsWithRegOutputs

  92. class NeuralOutput extends PmmlElement

    Defines how the output of the neural network must be interpreted.

  93. class NeuralOutputs extends PmmlElement

  94. class Neuron extends PmmlElement

    Contains an identifier id which must be unique in all layers. The attribute bias implicitly defines a connection to a bias unit where the unit's value is 1.0 and the weight is the value of bias. The activation function and normalization method for Neuron can be defined in NeuralLayer. If either one is not defined for the layer, then the default one specified for NeuralNetwork applies. If the activation function is radialBasis, the attribute width must be specified in Neuron, NeuralLayer or NeuralNetwork. Again, a width specified in Neuron overrides the respective value from NeuralLayer, which in turn overrides a value given in NeuralNetwork.

    Weighted connections between neural net nodes are represented by Con elements.
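
    A neuron's activation and the layer-by-layer evaluation can be sketched as below. The sketch makes several assumptions: SimpleCon and SimpleNeuron are hypothetical stand-ins for Con and Neuron, the logistic function is used as the activation for every neuron, and normalization and the radialBasis case are omitted.

```scala
object NeuronSketch {
  // Hypothetical stand-ins for the Con and Neuron elements.
  final case class SimpleCon(from: String, weight: Double)
  final case class SimpleNeuron(id: String, bias: Double, cons: Seq[SimpleCon])

  def logistic(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // bias behaves like a connection from a unit whose value is always 1.0.
  def activate(neuron: SimpleNeuron, outputs: Map[String, Double]): Double = {
    val z = neuron.bias * 1.0 + neuron.cons.map(c => c.weight * outputs(c.from)).sum
    logistic(z)
  }

  // Layers are evaluated in sequence; a layer sees the inputs and all earlier outputs.
  def forward(layers: Seq[Seq[SimpleNeuron]], inputs: Map[String, Double]): Map[String, Double] =
    layers.foldLeft(inputs) { (known, layer) =>
      known ++ layer.map(n => n.id -> activate(n, known))
    }
}
```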

  95. class Node extends Predicate with HasScoreDistributions

    This element is an encapsulation for either defining a split or a leaf in a tree model. Every Node contains a predicate that identifies a rule for choosing itself or any of its siblings. A predicate may be an expression composed of other nested predicates.
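
    Scoring with such nodes amounts to descending into the first child whose predicate holds. The sketch below is a simplification with hypothetical names: SimpleNode reduces predicates to Scala functions, and when no child matches it returns the score of the last node reached, which is an assumption about the desired behavior rather than a rule taken from this page.

```scala
object TreeNodeSketch {
  type Record = Map[String, Double]

  // Hypothetical node: a predicate, a score, and child nodes (empty for a leaf).
  final case class SimpleNode(
    predicate: Record => Boolean,
    score: String,
    children: Seq[SimpleNode] = Nil)

  // Descend into the first child whose predicate is true; stop at a leaf
  // or when no child matches.
  @annotation.tailrec
  def score(node: SimpleNode, record: Record): String =
    node.children.find(_.predicate(record)) match {
      case Some(child) => score(child, record)
      case None        => node.score
    }
}
```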

  96. class PCell extends PmmlElement

    Cell in the ParamMatrix. The optional targetCategory and required parameterName attributes determine the cell's location in the Parameter matrix. The information contained is: beta (actual Parameter value, required), and df (degrees of freedom, optional). For an ordinalMultinomial model, ParamMatrix specifies different values for the intercept parameter: one for each target category except one. Values for all other parameters are constant across all target variable values. For a multinomialLogistic model, ParamMatrix specifies parameter estimates for each target category except the reference category.

  97. class PCovCell extends PmmlElement

  98. class PCovMatrix extends PmmlElement

    Matrix of Parameter estimate covariances. Made up of PCovCells, each of them being located via row information for Parameter name (pRow), row information for target variable value (tRow), column information for Parameter name (pCol) and column information for target variable value (tCol). Note that the matrix is symmetric with respect to the main diagonal (interchanging tRow and tCol together with pRow and pCol will not change the value). Therefore it is sufficient that only half of the matrix be exported. Attributes tRow and tCol are optional since they are not needed for linear regression models. This element has an optional attribute type that can take values model and robust. This attribute describes the way the covariance matrix was computed in generalizedLinear model. The robust option is also known as Huber-White or sandwich or HCCM.

  99. class PPCell extends PmmlElement

    Cell in the PPMatrix. Knows its row name and column name.

  100. class PPMatrix extends PmmlElement

    Predictor-to-Parameter correlation matrix. It is a rectangular matrix having a column for each Predictor (factor or covariate) and a row for each Parameter. The matrix is represented as a sequence of cells, each cell containing a number representing the correlation between the Predictor and the Parameter.

  101. class PairCounts extends PmmlElement

    PairCounts lists, for a discrete value I_ij of a field I_i, the TargetValueCounts that pair the value I_ij with each value of the target field.

  102. class ParamMatrix extends PmmlElement

    Parameter matrix. A table containing the Parameter values along with associated statistics (degrees of freedom). One dimension has the target variable's categories, the other has the Parameter names. The table is represented by specifying each cell. There is no requirement for Parameter names other than that each name should uniquely identify one Parameter.

  103. class Parameter extends PmmlElement

    Each Parameter contains a required name and optional label.

  104. class ParameterList extends PmmlElement

    Lists all Parameters. ParameterList can be empty only for CoxRegression models; for other models at least one Parameter should be present.

  105. class PolynomialKernelType extends KernelType with PmmlElement

    Polynomial basis functions which lead to a polynomial classifier. K(x,y) = (gamma*<x,y>+coef0)^degree
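
    The polynomial kernel can be evaluated directly from its definition, reading <x,y> as the dot product. The following is a small sketch with hypothetical function names; it is not the pmml4s PolynomialKernelType API.

```scala
object PolynomialKernelSketch {
  // Dot product <x,y> of two vectors of equal dimension.
  def dot(x: Seq[Double], y: Seq[Double]): Double = {
    require(x.length == y.length, "vectors must have the same dimension")
    x.zip(y).map { case (a, b) => a * b }.sum
  }

  // K(x,y) = (gamma * <x,y> + coef0) ^ degree
  def polynomial(x: Seq[Double], y: Seq[Double],
                 gamma: Double, coef0: Double, degree: Double): Double =
    math.pow(gamma * dot(x, y) + coef0, degree)
}
```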

  106. class Predictor extends PmmlElement

    Describes a categorical (factor) or a continuous (covariate) predictor for the model. When describing a factor, it can optionally contain a list of categories and a contrast matrix. Such matrix describes the codings of categorical variables. If a categorical variable has n values, there will be n rows and n-1 or n columns in the matrix. The rows and columns correspond to the categories of the factor in the order listed in the Category element if it is present, otherwise in the order listed in the DataField or DerivedField element. If the Categories element is present and the corresponding DataField or DerivedField element has a list of valid categories, then the list in Categories should be a subset of that in DataField or DerivedField. A contrast matrix with n-1 columns helps to reduce the total number of parameters in the model.

  107. class RadialBasisKernelType extends KernelType with PmmlElement

    Radial basis functions, the most common kernel type: K(x,y) = exp(-gamma*||x - y||^2)
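
    A direct evaluation of the radial basis kernel is sketched below, taking ||x - y|| as the Euclidean norm. The function names are hypothetical and are not part of the pmml4s API.

```scala
object RadialBasisKernelSketch {
  // Squared Euclidean distance ||x - y||^2.
  def squaredDistance(x: Seq[Double], y: Seq[Double]): Double = {
    require(x.length == y.length, "vectors must have the same dimension")
    x.zip(y).map { case (a, b) => (a - b) * (a - b) }.sum
  }

  // K(x,y) = exp(-gamma * ||x - y||^2)
  def radialBasis(x: Seq[Double], y: Seq[Double], gamma: Double): Double =
    math.exp(-gamma * squaredDistance(x, y))
}
```

    The kernel equals 1 for identical vectors and decays toward 0 as the vectors move apart, at a rate set by gamma.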

  108. class Regression extends EmbeddedModel

  109. class RegressionAttributes extends ModelAttributes with HasRegressionAttributes

  110. class RegressionModel extends Model with HasWrappedRegressionAttributes

    The regression functions are used to determine the relationship between the dependent variable (target field) and one or more independent variables. The dependent variable is the one whose values you want to predict, whereas the independent variables are the variables that you base your prediction on. While the term regression usually refers to the prediction of numeric values, the PMML element RegressionModel can also be used for classification. This is due to the fact that multiple regression equations can be combined in order to predict categorical values.
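
    One way several regression equations can be combined into a classifier is sketched below: each category gets its own equation, and the raw values are turned into probabilities. The names are hypothetical, only numeric predictors are handled, and softmax is used as the normalization purely as an assumption for the example.

```scala
object RegressionSketch {
  // One regression equation: intercept plus a weighted sum of numeric predictors.
  final case class Equation(intercept: Double, coefficients: Map[String, Double]) {
    def evaluate(record: Map[String, Double]): Double =
      intercept + coefficients.toSeq.map { case (field, c) => c * record(field) }.sum
  }

  // One equation per category; softmax turns the raw values into probabilities.
  def classify(
      equations: Map[String, Equation],
      record: Map[String, Double]): Map[String, Double] = {
    val raw = equations.map { case (category, equation) =>
      category -> math.exp(equation.evaluate(record))
    }
    val total = raw.values.sum
    raw.map { case (category, v) => category -> v / total }
  }
}
```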

  111. class RegressionOutputs extends MixedClsWithRegOutputs

  112. sealed trait Rule extends AnyRef

  113. class RuleSelectionMethod extends PmmlElement

    Describes how rules are selected to apply the model to a new case

  114. class RuleSet extends PmmlElement

  115. class RuleSetModel extends Model with HasWrappedModelAttributes

    Ruleset models can be thought of as flattened decision tree models. A ruleset consists of a number of rules. Each rule contains a predicate and a predicted class value, plus some information collected at training or testing time on the performance of the rule.
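
    Applying a ruleset to a case can be sketched for two selection criteria, firstHit and weightedMax. This is an illustration with hypothetical names: FlatRule reduces a rule to a predicate function, a score and a weight, and the defaultScore parameter stands for the value returned when no rule fires.

```scala
object RuleSetSketch {
  type Record = Map[String, Double]

  // Hypothetical flat rule: identifier, predicate, predicted class value, weight.
  final case class FlatRule(id: String, predicate: Record => Boolean, score: String, weight: Double)

  // firstHit: the first rule (in order) whose predicate is true determines the score.
  def firstHit(rules: Seq[FlatRule], record: Record, defaultScore: String): String =
    rules.find(_.predicate(record)).map(_.score).getOrElse(defaultScore)

  // weightedMax: among the firing rules, the one with the highest weight wins.
  def weightedMax(rules: Seq[FlatRule], record: Record, defaultScore: String): String = {
    val firing = rules.filter(_.predicate(record))
    if (firing.isEmpty) defaultScore else firing.maxBy(_.weight).score
  }
}
```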

  116. class RuleSetOutputs extends ModelOutputs with MutablePredictedValue with MutableConfidence

  117. class SVMOutputs extends MixedClsWithRegOutputs

  118. class Scorecard extends Model with HasWrappedScorecardAttributes

    A data mining model contains a set of input fields which are used to predict a certain target value. This prediction can be seen as an assessment about a prospect, a customer, or a scenario for which an outcome is predicted based on historical data. In a scorecard, input fields, also referred to as characteristics (for example, "age"), are broken down into attributes (for example, "19-29" and "30-39" age groups or ranges) with specific partial scores associated with them. These scores represent the influence of the input attributes on the target and are readily available for inspection. Partial scores are then summed up so that an overall score can be obtained for the target value.

    Scorecards are very popular in the financial industry for their interpretability and ease of implementation, and because input attributes can be mapped to a series of reason codes which provide explanations of each individual's score. Usually, the lower the overall score produced by a scorecard, the higher the chances of it triggering an adverse decision, which usually involves the referral or denial of services. Reason codes, as the name suggests, allow for an explanation of scorecard behavior and any adverse decisions generated as a consequence of the overall score. They basically answer the question: "Why is the score low, given its input conditions?"

  119. class ScorecardAttributes extends ModelAttributes with HasScorecardAttributes

    Holds attributes of a Scorecard.

  120. class ScorecardOutput extends RegOutputs with MutableReasonCodes

  121. class Segment extends Predictable with Predicate with PmmlElement

  122. class Segmentation extends PmmlElement

  123. class SigmoidKernelType extends KernelType with PmmlElement

    Sigmoid kernel function, used by some models of neural-network type: K(x,y) = tanh(gamma*<x,y> + coef0).
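
    The kernel formula can be written out directly. The sketch below is plain Scala and does not use the pmml4s classes; the object and method names are assumptions chosen for illustration.

```scala
object SigmoidKernelSketch {
  /** Dot product <x, y> of two equally sized vectors. */
  def dot(x: Array[Double], y: Array[Double]): Double = {
    require(x.length == y.length, "vectors must have the same length")
    x.indices.map(i => x(i) * y(i)).sum
  }

  /** K(x, y) = tanh(gamma * <x, y> + coef0) */
  def sigmoid(gamma: Double, coef0: Double)(x: Array[Double], y: Array[Double]): Double =
    math.tanh(gamma * dot(x, y) + coef0)
}
```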

  124. class SimpleRule extends Rule with HasScoreDistributions with PmmlElement

    SimpleRule consists of an identifier, a predicate, a score and information on rule performance.

  125. class SupportVector extends PmmlElement

    A support vector. Its only attribute is vectorId, a reference to the support vector in the VectorDictionary.

  126. class SupportVectorMachine extends PmmlElement

    Holds a single instance of an SVM.

    SupportVectors holds the support vectors as references into the VectorDictionary used by the respective SVM instance. The element Coefficients stores the SVM coefficients. Both are combined in the element SupportVectorMachine, which holds a single instance of an SVM.

    The attribute targetCategory is required for classification models and gives the corresponding class label. This attribute is to be used for classification models implementing the one-against-all method. In this method, for n classes, there are exactly n SupportVectorMachine elements. Depending on the model attribute maxWins, the SVM with the largest or the smallest value determines the predicted class label.

    The attribute alternateTargetCategory is required in case of binary classification models with only one SupportVectorMachine element. It is also required in case of multi-class classification models implementing the one-against-one method. In this method, for n classes, there are exactly n(n-1)/2 SupportVectorMachine elements where each SVM is trained on data from two classes. The first class is represented by the targetCategory attribute and the second class by the alternateTargetCategory attribute. The predicted class label is determined based on a voting scheme in which the category with the maximum number of votes wins. In case of a tie, the predicted class label is the first category with maximal number of votes. For both cases (binary classification and multi-class classification with one-against-one), the corresponding class labels are determined by comparing the numeric prediction with the threshold. If maxWins is true and the prediction is larger than the threshold or maxWins is false and the prediction is smaller than the threshold, the class label is the targetCategory attribute, otherwise, it is the alternateTargetCategory attribute.

    Note that each SupportVectorMachine element may have its own threshold that overrides the default.
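
    The maxWins/threshold rule and the one-against-one voting scheme described above can be sketched as follows. This is an illustration in plain Scala under stated assumptions: BinarySvm is a hypothetical stand-in for a SupportVectorMachine element (not the pmml4s class), and the order of the categories argument is assumed to define which category counts as "first" in a tie.

```scala
// Hypothetical stand-in for one SupportVectorMachine element; not the pmml4s class.
final case class BinarySvm(
    targetCategory: String,
    alternateTargetCategory: String,
    threshold: Double,
    decision: Array[Double] => Double // numeric prediction f(x)
)

object OneAgainstOneSketch {
  /** Label chosen by a single binary SVM, following the maxWins/threshold rule. */
  def label(svm: BinarySvm, x: Array[Double], maxWins: Boolean): String = {
    val p = svm.decision(x)
    val pickTarget = if (maxWins) p > svm.threshold else p < svm.threshold
    if (pickTarget) svm.targetCategory else svm.alternateTargetCategory
  }

  /** Majority vote; ties go to the category that appears first in `categories`. */
  def predict(svms: Seq[BinarySvm], categories: Seq[String], x: Array[Double], maxWins: Boolean): String = {
    require(svms.nonEmpty, "at least one SVM is required")
    val votes = svms.map(label(_, x, maxWins)).groupBy(identity).map { case (k, v) => k -> v.size }
    val best = votes.values.max
    categories
      .find(c => votes.getOrElse(c, 0) == best)
      .getOrElse(sys.error("no listed category received the maximal number of votes"))
  }
}
```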

  127. class SupportVectorMachineAttributes extends ModelAttributes with HasSupportVectorMachineAttributes

  128. class SupportVectorMachineModel extends Model with HasWrappedSupportVectorMachineAttributes

    Support Vector Machine models for classification and regression are considered. A Support Vector Machine is a function f which is defined in the space spanned by the kernel basis functions K(x,xi) of the support vectors xi: f(x) = Sum_{i=1..n} αi*K(x,xi) + b.

    Here n is the number of all support vectors, αi are the basis coefficients and b is the absolute coefficient. In an equivalent interpretation, n could also be considered as the total number of all training vectors xi. Then the support vectors are the subset of all those vectors xi whose coefficients αi are greater than zero. The term Support Vector (SV) also has a geometrical interpretation, because these vectors really support the discrimination function f(x) = 0 in the mechanical interpretation.
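
    The function f can be evaluated directly from the support vectors, the coefficients αi and b. The sketch below is plain Scala, not the pmml4s implementation; the names SvmFunctionSketch, Kernel and linear are assumptions, and a linear kernel is used only to keep the example small.

```scala
object SvmFunctionSketch {
  type Kernel = (Array[Double], Array[Double]) => Double

  /** Linear kernel K(x, y) = <x, y>, chosen here only for brevity. */
  val linear: Kernel = (x, y) => x.zip(y).map { case (a, b) => a * b }.sum

  /** f(x) = Sum_{i=1..n} alpha_i * K(x, x_i) + b */
  def f(kernel: Kernel, supportVectors: Seq[Array[Double]], alphas: Seq[Double], b: Double)(
      x: Array[Double]
  ): Double = {
    require(supportVectors.length == alphas.length, "one coefficient per support vector")
    supportVectors.zip(alphas).map { case (xi, ai) => ai * kernel(x, xi) }.sum + b
  }
}
```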

  129. class SupportVectors extends PmmlElement

    Contains all support vectors required for the respective SVM instance.

  130. class TargetValueCount extends PmmlElement

  131. class TargetValueCounts extends PmmlElement

    Lists the counts associated with each value of the target field. However, a TargetValueCount whose count is zero may be omitted. Within BayesOutput, TargetValueCounts lists the total count of occurrences of each target value. Within PairCounts, TargetValueCounts lists, for each target value, the count of the joint occurrences of that target value with a particular discrete input value.

  132. class TargetValueStat extends PmmlElement

    Used for a continuous input field Ii to define statistical measures associated with each value of the target field. As defined in CONTINUOUS-DISTRIBUTION-TYPES, different distribution types can be used to represent such measures. For Bayes models, these are restricted to Gaussian and Poisson distributions.

  133. class TargetValueStats extends PmmlElement

    Serves as the envelope for element TargetValueStat.

  134. class TrainingInstances extends PmmlElement

    Encapsulates the definition of the fields included in the training instances as well as their values.

  135. class TransformationModel extends DataModel

  136. class TreeAttributes extends ModelAttributes with HasTreeAttributes

    Holds attributes of a Tree model.

  137. class TreeModel extends Model with HasWrappedTreeAttributes

    The TreeModel in PMML allows for defining either a classification or prediction structure. Each Node holds a logical predicate expression that defines the rule for choosing the Node or any of the branching Nodes.
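
    The way predicates steer scoring through the tree can be sketched as follows. This is an illustration in plain Scala, not the pmml4s implementation: the Node type here is hypothetical, and when no child predicate is true the sketch simply returns the score of the last node reached, which is only one possible way to handle that situation.

```scala
import scala.annotation.tailrec

// Hypothetical node type for illustration; not the pmml4s Node class.
final case class Node(
    predicate: Map[String, Double] => Boolean,
    score: String,
    children: Seq[Node] = Nil
)

object TreeSketch {
  /** Descends into the first child whose predicate is true; stops when no child matches. */
  @tailrec
  def scoreFrom(node: Node, record: Map[String, Double]): String =
    node.children.find(_.predicate(record)) match {
      case Some(child) => scoreFrom(child, record)
      case None        => node.score // leaf, or no true child: score of the last node reached
    }
}
```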

  138. class TreeOutputs extends MixedClsWithRegOutputs with MutableConfidence with MutableEntityId

  139. class VariableWeight extends PmmlElement

  140. class VectorDictionary extends PmmlElement

    Contains the set of support vectors, which are of the type VectorInstance.

  141. class VectorFields extends PmmlElement

    Defines which entries in the vectors correspond to which fields.

  142. class VectorInstance extends PmmlElement

    A data vector given in dense or sparse array format. The order of the values corresponds to that of the VectorFields. The sizes of the sparse arrays must match the number of fields included in the VectorFields element.
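
    The relationship between the sparse and dense formats can be sketched as an expansion of (index, entry) pairs into a full array whose length equals the number of fields. The sketch is plain Scala and not the pmml4s implementation; the name toDense is hypothetical, and it assumes 1-based indices and a default value of 0 for positions that are not listed.

```scala
object VectorSketch {
  /** Expands a sparse array (assumed 1-based indices plus entries) into a dense array of length n. */
  def toDense(n: Int, indices: Seq[Int], entries: Seq[Double], default: Double = 0.0): Array[Double] = {
    require(indices.length == entries.length, "one entry per index")
    val dense = Array.fill(n)(default)
    indices.zip(entries).foreach { case (i, v) =>
      require(i >= 1 && i <= n, s"index $i out of range 1..$n")
      dense(i - 1) = v
    }
    dense
  }
}
```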

Value Members

  1. object ActivationFunction extends Enumeration

  2. object AlgorithmType extends Enumeration

    Defines model types used by the anomaly model.

  3. object BaselineMethod extends Enumeration

    An informational string describing the technique used by the model designer to establish the baseline scores. Allowed values are:

    - max: Indicates that baseline scores are the maximum partial score in element Characteristic
    - min: Baseline scores are the minimum partial score in Characteristic
    - mean: Baseline scores are the mean (weighted average) partial score in Characteristic
    - neutral: Baseline scores are the risk-neutral partial score in Characteristic
    - other: Baseline scores are derived using any other technique.

    This attribute is purely informational and does not influence the runtime calculations of reason codes. (By contrast, the reasonCodeAlgorithm is critical to achieving an accurate calculation of reasons.)

  4. object CatScoringMethod extends Enumeration

  5. object ContScoringMethod extends Enumeration

  6. object Criterion extends Enumeration

  7. object CumulativeLinkFunction extends Enumeration

    Specifies the cumulative link function used in an ordinalMultinomial model.

  8. object Distribution extends Enumeration

    The probability distribution of the dependent variable for a generalizedLinear model.

  9. object GeneralModelType extends Enumeration

    Specifies the type of regression model in use. This information will be used to select the appropriate mathematical formulas during scoring.

  10. object KernelType

  11. object LinkFunction extends Enumeration

    Specifies the type of link function to use when the generalizedLinear model type is specified.

  12. object MissingPredictionTreatment extends Enumeration

    The missing prediction treatment options are used when at least one model for which the predicate in the Segment evaluates to true has a missing result. The attribute missingThreshold is closely related and has default value 1. The options are defined as follows:

    - returnMissing means that if at least one model has a missing result, the whole MiningModel's result should be missing.
    - skipSegment says that if a model has a missing result, that segment is ignored and the results are computed based on other segments. However, if the fraction of the models with missing results (weighted if the model combination method is weighted) exceeds the missingThreshold, the returned result must be missing. This option should not be used with modelChain combination method.
    - continue says that if a model has a missing result, the processing should continue normally. This can work well for voting or modelChain situations, as well as returnFirst and returnAll. In case of majorityVote or weightedMajorityVote the missing result can be returned if it gets the most (possibly weighted) votes, or if the fraction of the models with missing result exceeds the missingThreshold. Otherwise a valid result is computed normally. Other model combination methods will return a missing value as the result.
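
    The returnMissing and skipSegment options can be illustrated with a weighted-average combination. The sketch below is plain Scala under stated assumptions and is not the pmml4s implementation: the names MissingPredictionSketch, Treatment and weightedAverage are hypothetical, a missing segment result is modelled as None, weights are assumed positive, and the continue option is left out.

```scala
object MissingPredictionSketch {
  sealed trait Treatment
  case object ReturnMissing extends Treatment
  case object SkipSegment extends Treatment

  /** Weighted average over segment results, where None marks a missing result. */
  def weightedAverage(
      results: Seq[(Option[Double], Double)], // (segment result, segment weight)
      treatment: Treatment,
      missingThreshold: Double = 1.0
  ): Option[Double] = {
    require(results.forall(_._2 > 0), "weights are assumed to be positive")
    val totalWeight = results.map(_._2).sum
    val missingWeight = results.collect { case (None, w) => w }.sum
    val present = results.collect { case (Some(v), w) => (v, w) }
    treatment match {
      case _ if present.isEmpty                       => None
      case ReturnMissing if present.size < results.size => None // any missing result
      case SkipSegment if missingWeight / totalWeight > missingThreshold => None
      case _ =>
        // Missing segments are ignored; the average uses only the present results.
        Some(present.map { case (v, w) => v * w }.sum / present.map(_._2).sum)
    }
  }
}
```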

  13. object MissingValueStrategy extends Enumeration

    Defines a strategy for dealing with missing values.

  14. object Model extends Serializable

  15. object ModelClass extends Enumeration

  16. object ModelElement

  17. object MultipleModelMethod extends Enumeration

    Specifies how all the models applicable to a record should be combined.

  18. object NNNormalizationMethod extends Enumeration

    A normalization method softmax (pj = exp(yj) / Sum_i(exp(yi))) or simplemax (pj = yj / Sum_i(yi)) can be applied to the computed activation values. The attribute normalizationMethod is defined for the network with default value none (pj = yj), but can be specified for each layer as well. Softmax normalization is most often applied to the output layer of a classification network to get the probabilities of all answers. Simplemax normalization is often applied to the hidden layer consisting of elements with radial basis activation function to get a "normalized RBF" activation.
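
    Both normalization formulas are short enough to write out. The sketch below is plain Scala rather than the pmml4s implementation, and the object and method names are assumptions. Subtracting the maximum before exponentiating is a common numerical safeguard that leaves the softmax result mathematically unchanged.

```scala
object NormalizationSketch {
  /** softmax: p_j = exp(y_j) / Sum_i exp(y_i) */
  def softmax(y: Seq[Double]): Seq[Double] = {
    require(y.nonEmpty, "at least one activation value is required")
    // Shifting by the maximum keeps the ratios unchanged but avoids overflow in exp.
    val m = y.reduce((a, b) => math.max(a, b))
    val e = y.map(v => math.exp(v - m))
    val s = e.sum
    e.map(_ / s)
  }

  /** simplemax: p_j = y_j / Sum_i y_i */
  def simplemax(y: Seq[Double]): Seq[Double] = {
    val s = y.sum
    y.map(_ / s)
  }
}
```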

  19. object NoTrueChildStrategy extends Enumeration

    Defines what to do in situations where scoring cannot reach a leaf node.

  20. object PCovMatrixType extends Enumeration

  21. object ReasonCodeAlgorithm extends Enumeration

    Describes how reason codes shall be ranked.

  22. object RegressionModelType extends Enumeration

    Specifies the type of a regression model. The attribute modelType is for information only.

  23. object RegressionNormalizationMethod extends Enumeration

    Describes how the prediction is converted into a confidence value (aka probability).

  24. object Rule

  25. object SVMClassificationMethod extends Enumeration

    The two most popular methods for multi-class classification are one-against-all (also known as one-against-rest) and one-against-one. Depending on the method used, the number of SVMs built will differ.

    The SVM classification method specifies which of the two methods is used.

  26. object SVMRepresentation extends Enumeration

    Usually the SVM model uses support vectors to define the model function. However, for the case of a linear function (linear kernel type) the function is a linear hyperplane that can be more efficiently expressed using the coefficients of all mining fields. In this case, no support vectors are required at all, and hence SupportVectors will be absent and only the Coefficients element is necessary.

    The SVM representation specifies which of the two representations is used.

  27. object SplitCharacteristic extends Enumeration

    Indicates whether non-leaf Nodes in the tree model have exactly two children, or an unrestricted number of children.
