reg

Type Members

case class Coefficient(coeff: Double, featureIndices: IndexedSeq[Int] = IndexedSeq.empty[Int]) extends Product with Serializable
case class ConstantDeltaSpline(min: Double, max: Double, knots: IndexedSeq[Double]) extends Spline with Product with Serializable

A spline with the property that the delta between consecutive domain values is a fixed constant. Because of this, we only need to specify the min and max and the values in the image of the function.
NOTE: This class is exposed outside the package for use in aloha-conversions only. This class SHOULD NOT be used outside the aloha libraries.
min
the minimum domain value
max
the maximum domain value (strictly greater than min IFF spline has at least two knots, or equal to min IFF spline has one knot)
knots
Required to have a positive number of knots (size > 0).
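To make the constraints on min, max and knots concrete, here is a minimal sketch of a constant-delta spline. The name LinearConstantDeltaSpline is hypothetical, and both the linear interpolation between knots and the clamping outside [min, max] are assumptions made for illustration; the documentation above does not say how ConstantDeltaSpline evaluates points.

```scala
object SplineSketch {
  // Hypothetical stand-in for ConstantDeltaSpline. Linear interpolation and
  // clamping outside [min, max] are assumptions, not documented behavior.
  final case class LinearConstantDeltaSpline(min: Double, max: Double, knots: IndexedSeq[Double])
      extends (Double => Double) {
    require(knots.nonEmpty, "a positive number of knots is required")
    require(if (knots.size == 1) max == min else max > min,
      "max must equal min for one knot and exceed min otherwise")

    // Fixed spacing between consecutive domain values.
    private val delta: Double =
      if (knots.size == 1) 0.0 else (max - min) / (knots.size - 1)

    def apply(x: Double): Double =
      if (knots.size == 1 || x <= min) knots.head
      else if (x >= max) knots.last
      else {
        val pos  = (x - min) / delta                 // position measured in knot units
        val i    = math.min(pos.toInt, knots.size - 2)
        val frac = pos - i
        knots(i) + frac * (knots(i + 1) - knots(i))
      }
  }
}
```

Because the spacing is fixed, locating the surrounding knots is a single division rather than a search.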
trait MapTreeLike[K, +A] extends Tree[A, [+V]Map[K, V], MapTreeLike[K, A]]

A tree with a map structure for the descendants data structure. Note: Map keys are invariant. We could make K contravariant and use existential types in the type lambda so that MapTree could be constructed without having to specify the type in the root instance or leaf instance.
K
key type of the map structure
A
type of the descendant data structure.
final class OptToKv[A] extends AnyVal

Provides an extension method toKv to convert Options to Seq[(String, Double)]. This is used to coerce the value to the type that is used in regression models. We don't do an implicit conversion method from Option[A] to Iterable[(String, Double)] because it can negatively impact type inference. So we make the users convert explicitly via:
```
val option: Option[Int] = Option(1)
val iterable = option.toKv
require(iterable == List(("", 1d)))
```
A
the type of Option.
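For readers unfamiliar with the extension-method pattern, the following is a rough sketch of how a toKv method could be attached to Option. The names OptToKvSketch and OptToKvOps are hypothetical, and the use of Numeric[A] to obtain a Double is an assumption; the real OptToKv may obtain the conversion differently. The sketch omits `extends AnyVal` only to keep it portable across compilation contexts.

```scala
object OptToKvSketch {
  // Hypothetical wrapper that adds toKv to any Option whose element type is numeric.
  implicit final class OptToKvOps[A](val opt: Option[A]) {
    // Some(a) becomes a single ("", a.toDouble) pair; None becomes an empty list.
    def toKv(implicit num: Numeric[A]): Iterable[(String, Double)] =
      opt.map(a => ("", num.toDouble(a))).toList
  }
}
```

The conversion stays explicit at the call site, which is the point made above about not disturbing type inference.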
trait PolynomialEvaluationAlgo extends AnyRef

An algorithm for efficiently evaluating polynomials at a given point. Evaluating first order polynomials is a special case, which is important because first order polynomial evaluation is isomorphic to linear regression, which may be the standard use case.
As an example, imagine that we wanted to evaluate $Z(u,v,x,y) = w_{u,v}\,uv + w_{u,v,x}\,uvx + w_{u,v,y}\,uvy$ for coefficients $W = [w_{u,v},\ w_{u,v,x},\ w_{u,v,y}]^T$.
This is:
- $Z = w_{u,v}\,uv + w_{u,v,x}\,uvx + w_{u,v,y}\,uvy$
- $Z = uv\,(w_{u,v} + w_{u,v,x}\,x + w_{u,v,y}\,y)$
That Z can be factored indicates there is a way to structure the algorithm to efficiently reuse computations. The way to achieve this is to structure the possible polynomials as a tree and traverse and aggregate over the tree. As the tree is traversed, the computation is accumulated. As a concrete example, let's take the above example and show it using real code. Then a motivation of the example will be provided.
The computation tree works as follows: the edge labels are multiplied by the associated coefficient (0 if non-existent) to get the node values. Node values are added together to get the inner product. So, every time we descend farther into the tree, we multiply the running product by the value we extract from the input vector X, and every time a weight is found, it is multiplied by the current product and added to the running sum. The process recurses until the tree can no longer be traversed. The sum is then returned.
```
//           u            u      v                                u     v     x
//    (1)*1.00      (1*1.00)*1.000        u     v    w1     (1*1.00*1.000)*0.75        u     v    x      w2
//   ----------> 0 ----------------> 1*1.00*1.000 * 0.5 ------------------------> 1*1.00*1.000*0.75 * 0.111
//                                                      \
//                                                       \        u     v     y
//                                                        \ (1*1.00*1.000)*0.25        u     v    y       w3
//                                                         ---------------------> 1*1.00*1.000*0.25 * 0.4545
//
//         u *     v *  w1    +       u *     v *    x *    w2   +       u *     v *    y *     w3
val Z = 1.00 * 1.000 * 0.5    +    1.00 * 1.000 * 0.75 * 0.111   +    1.00 * 1.000 * 0.25 * 0.4545

val X = IndexedSeq(
          Seq(("a=1", 1.00)),                  // u
          Seq(("b=1", 1.000)),                 // v
          Seq(("c=1", 0.75), ("c=2", 0.25)))   // x and y, respectively

val W1 =
  PolynomialEvaluator(Coefficient(0, IndexedSeq(0)), Map(
    "a=1" -> PolynomialEvaluator(Coefficient(0, IndexedSeq(1)), Map(
      "b=1" -> PolynomialEvaluator(Coefficient(0.5, IndexedSeq(2)), Map(   // w1
        "c=1" -> PolynomialEvaluator(Coefficient(0.111)),                  // w2
        "c=2" -> PolynomialEvaluator(Coefficient(0.4545))))))))            // w3


assert(Z == (W1 dot X))
```
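The traversal described above can be sketched without the library. In the block below, Node, Input and dot are hypothetical stand-ins for PolynomialEvaluator, the input vector type and the evaluation method; this is an illustration of the running-product and running-sum idea under those assumptions, not the library's implementation.

```scala
object PolyEvalSketch {
  // Hypothetical stand-in for PolynomialEvaluator: a coefficient, the indices
  // of X to inspect next, and the child nodes keyed by feature name.
  final case class Node(
      coeff: Double,
      featureIndices: IndexedSeq[Int] = IndexedSeq.empty,
      descendants: Map[String, Node] = Map.empty)

  type Input = IndexedSeq[Iterable[(String, Double)]]

  // `product` is the running product of the input values seen on the path so far.
  def dot(node: Node, x: Input, product: Double = 1.0): Double = {
    // Contribution of the weight stored at this node (0 if there is none).
    val here = node.coeff * product
    // Descend only along keys that exist both in the input and in the tree.
    val below = node.featureIndices.flatMap { i =>
      x(i).toSeq.collect {
        case (key, value) if node.descendants.contains(key) =>
          dot(node.descendants(key), x, product * value)
      }
    }
    here + below.sum
  }
}
```

Only the entries of X named by featureIndices are inspected at each node, so keys that cannot match any term are never visited.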
While constructing a PolynomialEvaluator via direct means is entirely possible, it is less straightforward than using a builder to do it. Below, we show a better way to construct PolynomialEvaluator instances where we just specify the terms in the polynomial and the associated coefficient values. Note linear regression is the special case when all of the inner maps contain exactly one element.
```
val W2 = (PolynomialEvaluator.builder ++= Map(
  Map("a=1" -> 0, "b=1" -> 1            ) -> 0.5,
  Map("a=1" -> 0, "b=1" -> 1, "c=1" -> 2) -> 0.111,
  Map("a=1" -> 0, "b=1" -> 1, "c=2" -> 2) -> 0.4545
)).result

assert(W2 == W1)
```
Notice the values in the inner map look a little weird. These are the indices into the input vector X from which the key comes. They are included for efficiency: they allow the algorithm to dramatically prune the search space while accumulating over the tree.
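To illustrate how a term-to-coefficient map can be folded into a tree, here is a small sketch. BuilderSketch, Node, insert and build are hypothetical names, and ordering each term's keys by their index into X is an assumption about how the real builder lays out the tree.

```scala
object BuilderSketch {
  // Hypothetical tree node mirroring the shape of PolynomialEvaluator.
  final case class Node(
      coeff: Double = 0.0,
      featureIndices: IndexedSeq[Int] = IndexedSeq.empty,
      descendants: Map[String, Node] = Map.empty)

  // Walk (or create) the path for one term and store its weight at the end.
  def insert(node: Node, path: List[(String, Int)], weight: Double): Node = path match {
    case Nil => node.copy(coeff = weight)
    case (key, idx) :: rest =>
      val child = insert(node.descendants.getOrElse(key, Node()), rest, weight)
      node.copy(
        // Record which index of X must be inspected to find this child's key.
        featureIndices = (node.featureIndices :+ idx).distinct.sorted,
        descendants = node.descendants.updated(key, child))
  }

  // Each term maps feature key -> index into X; keys are ordered by index (assumption).
  def build(terms: Map[Map[String, Int], Double]): Node =
    terms.foldLeft(Node()) { case (root, (term, weight)) =>
      insert(root, term.toList.sortBy { case (key, idx) => (idx, key) }, weight)
    }
}
```

Terms that share a prefix, such as "a=1" and "b=1" in the example, share the same path, which is what lets the evaluation reuse the partial product.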
case class PolynomialEvaluator(value: Coefficient, descendants: Map[String, PolynomialEvaluator] = ...) extends MapTreeLike[String, Coefficient] with PolynomialEvaluationAlgo with Product with Serializable

Provides a method to evaluate polynomials given an input. Default implementation of com.eharmony.aloha.models.reg.PolynomialEvaluationAlgo.
trait RegFeatureCompiler extends AnyRef

Created by deak on 11/1/15.
trait RegressionFeatures[-A] extends AnyRef

A helper trait for sparse regression models with String keys. This trait exposes the constructFeatures method which applies the featureFunctions to the input data and keeps track of missing features.
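A rough sketch of what applying feature functions while tracking missing features could look like is given below. The Feature type alias and this constructFeatures signature are hypothetical, plain functions stand in for GenAggFunc, and treating a feature as missing when it emits no key-value pairs is an assumption rather than the trait's documented rule.

```scala
object FeatureSketch {
  // Plain functions stand in for GenAggFunc[A, Iterable[(String, Double)]].
  type Feature[A] = A => Iterable[(String, Double)]

  // Returns the features prefixed by their names, plus the names of the
  // features that produced no output (assumed here to mean "missing").
  def constructFeatures[A](
      names: IndexedSeq[String],
      functions: IndexedSeq[Feature[A]],
      input: A): (IndexedSeq[Iterable[(String, Double)]], Seq[String]) = {
    val raw = functions.map(f => f(input))
    val prefixed = names.zip(raw).map { case (name, kvs) =>
      kvs.map { case (key, value) => (name + key, value) }
    }
    val missing = names.zip(raw).collect { case (name, kvs) if kvs.isEmpty => name }
    (prefixed, missing)
  }
}
```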
case class RegressionModel[U, -A, +B <: U](modelId: ModelIdentity, featureNames: IndexedSeq[String], featureFunctions: IndexedSeq[GenAggFunc[A, Iterable[(String, Double)]]], beta: PolynomialEvaluationAlgo, invLinkFunction: (Double) ⇒ Double, spline: Option[Spline], numMissingThreshold: Option[Int], auditor: Auditor[U, Double, B]) extends SubmodelBase[U, Double, A, B] with RegressionFeatures[A] with Logging with Product with Serializable

A regression model capable of doing not only linear regression but polynomial regression in general.
```
val regImp = "com.eharmony.aloha.models.reg.RegressionModelValueToTupleConversions._"
val compiler = ...
val plugin = ...
val imports: Seq[String] = ...
val s = CompiledSemantics(compiler, plugin, imports :+ regImp)
```
Importing these conversions is useful because they provide implicit conversions from some of the AnyVal types and Options of AnyVal types to Iterable[(String, Double)]. This translates features specified in the JSON spec like:
```
{
  ...
  "features": {
    "intercept": "-3",
    "income": "${user.profile.income}"
  }
}
```
into sequences like:
```
val interceptFeature = Iterable(("intercept", -3.0))  // AND
val incomeFeature = Iterable(("income", [the income value converted to a double]))
```
For more information, see com.eharmony.aloha.models.reg.RegressionModelValueToTupleConversions.
A
model input type
B
model output type. to convert from B to com.eharmony.aloha.score.Scores.Score
modelId
An identifier for the model. Used in score and error reporting.
featureNames
feature names (parallel to featureFunctions)
featureFunctions
feature extracting functions.
beta
representation of the regression model parameters.
invLinkFunction
a function applied to the inner product of the input vector and weight vector.
spline
an optional calibration spline. See "Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers", Zadrozny and Elkan (ICML, 2001). This is applied prior to invLinkFunction.
numMissingThreshold
if provided, we check whether the threshold is exceeded. If so, return an error instead of the computed score. This is for missing data situations.
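The way these parameters interact can be sketched as a single scoring function. Everything here is hypothetical: score is not a library method, a flat weight map stands in for beta (so only first-order terms are covered), and "exceeded" is read as strictly greater than the threshold. The order of spline and invLinkFunction follows the parameter descriptions above.

```scala
object ModelSketch {
  def score(
      features: IndexedSeq[Iterable[(String, Double)]],
      numMissing: Int,
      weights: Map[String, Double],            // stand-in for beta, first order only
      invLinkFunction: Double => Double,
      spline: Option[Double => Double],
      numMissingThreshold: Option[Int]): Either[String, Double] =
    if (numMissingThreshold.exists(numMissing > _))
      Left(s"too many missing features: $numMissing")
    else {
      // Inner product of the sparse input vector and the weight vector.
      val innerProduct =
        features.flatten.map { case (key, value) => weights.getOrElse(key, 0.0) * value }.sum
      // Optional calibration, applied before the inverse link function.
      val calibrated = spline.fold(innerProduct)(s => s(innerProduct))
      Right(invLinkFunction(calibrated))
    }
}
```

With a logistic invLinkFunction this reduces to logistic regression; with the identity function it is plain linear regression.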
trait RegressionModelValueToTupleConversions extends AnyRef

Provides a series of implicit conversions to make the specification of regression models cleaner.
Each feature in the regression model constructs an Iterable[(String, Double)]. Once each feature constructs the iterable, the regression model maps this to a new one prefixed by the feature name. For instance, in the example that follows, "intercept" would emit a value of type Long, which would become a function of type com.eharmony.aloha.semantics.func.GenAggFunc[A, Long]. This, however, doesn't match the expected output type of com.eharmony.aloha.semantics.func.GenAggFunc[A, Iterable[(String, Double)]]. Conversions are provided for {Byte, Short, Int, Long, Float, Double} and their Option equivalents so that the JSON key-value pair "intercept": "1234L" is translated to Iterable(("", 1234.0)), which when prefixed yields Iterable(("intercept", 1234.0)).
```
{
  "modelType": "Regression",
  "modelId": {"id": 0, "name": ""},
  "features": {
    "intercept": "1234L",
    "some_option": "Option(5678L).toKv"
    ...
  },
  ...
}
```
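The conversion-then-prefix step can be sketched in a few lines. TupleConversionsSketch, Kv, longToKv, doubleToKv and prefixWith are hypothetical names chosen for illustration; only the Long and Double cases are shown, and the real trait may structure its implicits differently.

```scala
object TupleConversionsSketch {
  import scala.language.implicitConversions

  type Kv = Iterable[(String, Double)]

  // Hypothetical conversions: a bare number becomes one pair with an empty key.
  implicit def longToKv(x: Long): Kv = List(("", x.toDouble))
  implicit def doubleToKv(x: Double): Kv = List(("", x))

  // The model then prefixes every key with the feature name.
  def prefixWith(featureName: String, kvs: Kv): Kv =
    kvs.map { case (key, value) => (featureName + key, value) }
}
```

With these in scope, an expression of type Long can be used wherever an Iterable[(String, Double)] is expected, which is what lets "1234L" stand alone as a feature specification.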
sealed trait Spline extends (Double) ⇒ Double

Value Members

object PolynomialEvaluator extends Serializable

A polynomial evaluator.
object RegressionModel extends ParserProviderCompanion with RegressionModelJson with Serializable
object RegressionModelValueToTupleConversions extends RegressionModelValueToTupleConversions
package json
