Package

com.eharmony.aloha.models

reg

package reg

Linear Supertypes
AnyRef, Any

Type Members

  1. case class Coefficient(coeff: Double, featureIndices: IndexedSeq[Int] = IndexedSeq.empty[Int]) extends Product with Serializable

  2. case class ConstantDeltaSpline(min: Double, max: Double, knots: IndexedSeq[Double]) extends Spline with Product with Serializable

    A spline with the property that the delta between consecutive domain values is a fixed constant. Because of this, we only need to specify the min and max and the values in the image of the function.

    NOTE: This class is exposed outside the package for use in aloha-conversions only. This class SHOULD NOT be used outside the aloha libraries.

    min

    the minimum domain value

    max

    the maximum domain value (strictly greater than min iff the spline has at least two knots; equal to min iff the spline has exactly one knot)

    knots

    the values in the image of the function. At least one knot is required (size > 0).
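The relationship between min, max and the knots can be sketched as follows. This is a minimal illustration, not the aloha implementation: the Sketch class, its domainValue helper, and the choice of clamped linear interpolation between knots are all assumptions made for the example. Only the constant spacing and the constraints on min, max and knots come from the description above.

```scala
object ConstantDeltaSplineSketch {
  // Hypothetical stand-in for illustration; NOT the aloha ConstantDeltaSpline.
  final case class Sketch(min: Double, max: Double, knots: IndexedSeq[Double]) {
    require(knots.nonEmpty, "at least one knot is required")
    require(if (knots.size == 1) max == min else max > min, "min/max inconsistent with knot count")

    // Fixed spacing between consecutive domain values.
    val delta: Double =
      if (knots.size == 1) 0.0 else (max - min) / (knots.size - 1)

    // Domain value of the i-th knot, recovered from min and delta alone.
    def domainValue(i: Int): Double = min + i * delta

    // ASSUMPTION: linear interpolation between knots, clamped outside [min, max].
    def apply(x: Double): Double =
      if (knots.size == 1 || x <= min) knots.head
      else if (x >= max) knots.last
      else {
        val pos  = (x - min) / delta                // fractional knot position
        val i    = math.min(pos.toInt, knots.size - 2)
        val frac = pos - i
        knots(i) + frac * (knots(i + 1) - knots(i))
      }
  }
}
```

Because the spacing is constant, locating the surrounding knots is a single division rather than a search over stored domain values.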

  3. trait MapTreeLike[K, +A] extends Tree[A, [+V]Map[K, V], MapTreeLike[K, A]]

    A tree with a map structure for the descendants data structure. Note: Map keys are invariant. We could make K contravariant and use existential types in the type lambda so that a MapTree could be constructed without having to specify the type in the root instance or leaf instances.

    K

    key type of the map structure

    A

    type of the descendant data structure.

  4. final class OptToKv[A] extends AnyVal

    Provides an extension method toKv to convert Options to Seq[(String, Double)]. This is used to coerce the value to the type that is used in regression models. We don't provide an implicit conversion from Option[A] to Iterable[(String, Double)] because it can negatively impact type inference. Instead, users convert explicitly via:

    val option: Option[Int] = Option(1)
    val iterable = option.toKv
    require(iterable == List(("", 1d)))

    A

    the type of the value contained in the Option.
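The extension-method pattern described above can be sketched as follows. This is a hypothetical re-creation for illustration only: the name OptToKvLike, the Numeric constraint, and the behaviour for None (an empty sequence) are assumptions, not the actual aloha signature.

```scala
object OptToKvSketch {
  // Hypothetical re-creation for illustration; NOT the aloha OptToKv class.
  // A value class (extends AnyVal) avoids allocating a wrapper in most cases.
  implicit final class OptToKvLike[A](private val opt: Option[A]) extends AnyVal {
    // ASSUMPTION: None yields an empty sequence, and a Numeric[A] instance
    // performs the conversion to Double.
    def toKv(implicit num: Numeric[A]): Seq[(String, Double)] =
      opt.map(a => ("", num.toDouble(a))).toList
  }
}
```

Since toKv must be called explicitly, no implicit view from Option[A] to Iterable[(String, Double)] is in scope to interfere with type inference elsewhere.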

  5. trait PolynomialEvaluationAlgo extends AnyRef

    An algorithm for efficiently evaluating polynomials at a given point. Evaluating first-order polynomials is a special case, and an important one, because first-order polynomial evaluation is equivalent to linear regression, which may be the standard use case.

    As an example, imagine that we wanted to evaluate $Z(u,v,x,y) = w_{u,v}\,uv + w_{u,v,x}\,uvx + w_{u,v,y}\,uvy$ for coefficients $W = [w_{u,v},\ w_{u,v,x},\ w_{u,v,y}]^T$.

    This is:

    • $Z = w_{u,v}\,uv + w_{u,v,x}\,uvx + w_{u,v,y}\,uvy$
    • $Z = uv\,(w_{u,v} + w_{u,v,x}\,x + w_{u,v,y}\,y)$

    That Z can be factored indicates that there is a way to structure the algorithm so that computations are reused efficiently. The way to achieve this is to structure the possible polynomials as a tree and to traverse and aggregate over the tree, accumulating the computation during the traversal. As a concrete example, the polynomial above is shown below using real code, followed by the motivation for the example.

    The computation tree works as follows: the edge labels are multiplied by the associated coefficient (0 if non-existent) to get the node values. Node values are added together to get the inner product. So, every time we descend farther into the tree, we multiply the running product by the value we extract from the input vector X, and every time a weight is found, it is multiplied by the current product and added to the running sum. The process recurses until the tree can no longer be traversed. The sum is then returned.

    //           u            u      v                                u     v     x
    //    (1)*1.00      (1*1.00)*1.000        u     v    w1     (1*1.00*1.000)*0.75        u     v    x      w2
    //   ----------> 0 ----------------> 1*1.00*1.000 * 0.5 ------------------------> 1*1.00*1.000*0.75 * 0.111
    //                                                      \
    //                                                       \        u     v     y
    //                                                        \ (1*1.00*1.000)*0.25        u     v    y       w3
    //                                                         ---------------------> 1*1.00*1.000*0.25 * 0.4545
    //
    //         u *     v *  w1    +       u *     v *    x *    w2   +       u *     v *    y *     w3
    val Z = 1.00 * 1.000 * 0.5    +    1.00 * 1.000 * 0.75 * 0.111   +    1.00 * 1.000 * 0.25 * 0.4545
    
    val X = IndexedSeq(
              Seq(("a=1", 1.00)),                  // u
              Seq(("b=1", 1.000)),                 // v
              Seq(("c=1", 0.75), ("c=2", 0.25)))   // x and y, respectively
    
    val W1 =
      PolynomialEvaluator(Coefficient(0, IndexedSeq(0)), Map(
        "a=1" -> PolynomialEvaluator(Coefficient(0, IndexedSeq(1)), Map(
          "b=1" -> PolynomialEvaluator(Coefficient(0.5, IndexedSeq(2)), Map(   // w1
            "c=1" -> PolynomialEvaluator(Coefficient(0.111)),                  // w2
            "c=2" -> PolynomialEvaluator(Coefficient(0.4545))))))))            // w3
    
    
    assert(Z == (W1 dot X))

    While constructing a PolynomialEvaluator directly is entirely possible, it is less straightforward than using a builder. Below, we show a better way to construct PolynomialEvaluator instances, where we just specify the terms in the polynomial and the associated coefficient values. Note that linear regression is the special case when all of the inner maps contain exactly one element.

    val W2 = (PolynomialEvaluator.builder ++= Map(
      Map("a=1" -> 0, "b=1" -> 1            ) -> 0.5,
      Map("a=1" -> 0, "b=1" -> 1, "c=1" -> 2) -> 0.111,
      Map("a=1" -> 0, "b=1" -> 1, "c=2" -> 2) -> 0.4545
    )).result
    
    assert(W2 == W1)

    Notice that the values in the inner map look a little unusual. These are the indices into the input vector X from which the key comes. They are there for efficiency: they allow the algorithm to dramatically prune the search space while accumulating over the tree.
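The traversal described above (multiply the running product on descent, add weight times product at each node) can be sketched in a self-contained form. The Node class and the dot function below are hypothetical simplifications written for this illustration; they are not the aloha PolynomialEvaluator and make no claim about how it is implemented internally.

```scala
object PolySketch {
  // Hypothetical simplified tree node; NOT the aloha PolynomialEvaluator.
  final case class Node(
      weight: Double,                                      // coefficient (0 if the term is absent)
      featureIndices: IndexedSeq[Int] = IndexedSeq.empty,  // entries of x to inspect for children
      children: Map[String, Node] = Map.empty)

  def dot(root: Node, x: IndexedSeq[Seq[(String, Double)]]): Double = {
    // product: running product of the feature values on the path from the root.
    def go(node: Node, product: Double): Double = {
      // Only the listed indices of x are inspected, which prunes the search.
      val below = node.featureIndices.flatMap { i =>
        x(i).flatMap { case (k, v) =>
          node.children.get(k).map(child => go(child, product * v)).toList
        }
      }
      node.weight * product + below.sum
    }
    go(root, 1.0)
  }

  // The same input and weights as in the example above.
  val x: IndexedSeq[Seq[(String, Double)]] = IndexedSeq(
    Seq(("a=1", 1.00)),
    Seq(("b=1", 1.000)),
    Seq(("c=1", 0.75), ("c=2", 0.25)))

  val w: Node =
    Node(0.0, IndexedSeq(0), Map(
      "a=1" -> Node(0.0, IndexedSeq(1), Map(
        "b=1" -> Node(0.5, IndexedSeq(2), Map(
          "c=1" -> Node(0.111),
          "c=2" -> Node(0.4545)))))))
}
```

In this sketch the shared factor u * v is computed once on the way down to the "b=1" node and reused for all three terms, which mirrors the factored form of Z.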

  6. case class PolynomialEvaluator(value: Coefficient, descendants: Map[String, PolynomialEvaluator] = ...) extends MapTreeLike[String, Coefficient] with PolynomialEvaluationAlgo with Product with Serializable

    Provides a method to evaluate polynomials given an input. Default implementation of com.eharmony.aloha.models.reg.PolynomialEvaluationAlgo.

  7. trait RegFeatureCompiler extends AnyRef

    Created by deak on 11/1/15.

  8. trait RegressionFeatures[-A] extends AnyRef

    A helper trait for sparse regression models with String keys. This trait exposes the constructFeatures method which applies the featureFunctions to the input data and keeps track of missing features.

  9. case class RegressionModel[U, -A, +B <: U](modelId: ModelIdentity, featureNames: IndexedSeq[String], featureFunctions: IndexedSeq[GenAggFunc[A, Iterable[(String, Double)]]], beta: PolynomialEvaluationAlgo, invLinkFunction: (Double) ⇒ Double, spline: Option[Spline], numMissingThreshold: Option[Int], auditor: Auditor[U, Double, B]) extends SubmodelBase[U, Double, A, B] with RegressionFeatures[A] with Logging with Product with Serializable

    A regression model capable of doing not only linear regression but polynomial regression in general.

    val regImp = "com.eharmony.aloha.models.reg.RegressionModelValueToTupleConversions._"
    val compiler = ...
    val plugin = ...
    val imports: Seq[String] = ...
    val s = CompiledSemantics(compiler, plugin, imports :+ regImp)

    Adding regImp to the imports passed to CompiledSemantics, as above, is useful because it brings into scope implicit conversions from some of the AnyVal types, and from Options of those types, to Iterable[(String, Double)]. These conversions allow features specified in the JSON spec like:

    {
      ...
      "features": {
        "intercept": "-3",
        "income": "${user.profile.income}"
      }
    }

    to be converted into sequences like:

    val interceptFeature = Iterable(("intercept", -3.0))  // AND
    val incomeFeature = Iterable(("income", [the income value converted to a double]))

    For more information, see com.eharmony.aloha.models.reg.RegressionModelValueToTupleConversions.

    A

    model input type

    B

    model output type. to convert from B to com.eharmony.aloha.score.Scores.Score

    modelId

    An identifier for the model. Used in score and error reporting.

    featureNames

    feature names (parallel to featureFunctions)

    featureFunctions

    feature extracting functions.

    beta

    representation of the regression model parameters.

    invLinkFunction

    a function applied to the inner product of the input vector and weight vector.

    spline

    an optional calibration spline (see Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers, Zadrozny and Elkan, ICML 2001). This is applied prior to invLinkFunction.

    numMissingThreshold

    if provided, the number of missing features is checked against this threshold. If the threshold is exceeded, an error is returned instead of the computed score. This is for missing data situations.
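How the parameters above fit together can be sketched as a single function. This is a simplified illustration under stated assumptions, not the aloha scoring code: the score helper, its Either result type, the error message, and the reading of "exceeded" as numMissing > threshold are all hypothetical. The ordering (spline first, then invLinkFunction) follows the parameter descriptions.

```scala
object ScoreSketch {
  // Hypothetical helper for illustration; NOT part of aloha.
  def score(
      innerProduct: Double,                // result of evaluating beta on the features
      numMissing: Int,                     // number of missing features observed
      spline: Option[Double => Double],
      invLinkFunction: Double => Double,
      numMissingThreshold: Option[Int]): Either[String, Double] =
    // ASSUMPTION: "exceeded" means strictly greater than the threshold.
    if (numMissingThreshold.exists(numMissing > _))
      Left(s"Too many missing features: $numMissing")
    else {
      // The optional calibration spline is applied before the inverse link.
      val calibrated = spline.fold(innerProduct)(s => s(innerProduct))
      Right(invLinkFunction(calibrated))
    }
}
```

For example, with a logistic inverse link such as x => 1.0 / (1.0 + math.exp(-x)) and no spline, an inner product of 0.0 yields 0.5 in this sketch.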

  10. trait RegressionModelValueToTupleConversions extends AnyRef

    Provides a series of implicit conversions to make the specification of regression models cleaner.

    Each feature in the regression model constructs an Iterable[(String, Double)]. Once each feature constructs the iterable, the regression model maps this to a new one prefixed by the feature name. For instance, in the example that follows, "intercept" would emit a value of type Long, which would become a function of type com.eharmony.aloha.semantics.func.GenAggFunc[A, Long]. This, however, doesn't match the expected output type of com.eharmony.aloha.semantics.func.GenAggFunc[A, Iterable[(String, Double)]]. Conversions are provided for {Byte, Short, Int, Long, Float, Double} and their Option equivalents, so that the JSON key-value pair "intercept": "1234L" can be translated to Iterable(("", 1234.0)), which when prefixed yields Iterable(("intercept", 1234.0)).

    {
      "modelType": "Regression",
      "modelId": {"id": 0, "name": ""},
      "features": {
        "intercept": "1234L",
        "some_option": "Option(5678L).toKv"
        ...
      },
      ...
    }
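The prefixing step described above can be sketched as follows. The prefixed helper is hypothetical and assumes that prefixing is plain string concatenation of the feature name and the emitted key, which is consistent with "" becoming "intercept" but is not confirmed for other keys.

```scala
object PrefixSketch {
  // Hypothetical helper for illustration; NOT the aloha implementation.
  // ASSUMPTION: the feature name is simply prepended to each emitted key.
  def prefixed(
      featureName: String,
      kvs: Iterable[(String, Double)]): Iterable[(String, Double)] =
    kvs.map { case (k, v) => (featureName + k, v) }
}
```

With this sketch, PrefixSketch.prefixed("intercept", Iterable(("", 1234.0))) contains the single pair ("intercept", 1234.0), matching the example in the text.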

  11. sealed trait Spline extends (Double) ⇒ Double

Value Members

  1. object PolynomialEvaluator extends Serializable

    A polynomial evaluator.

  2. object RegressionModel extends ParserProviderCompanion with RegressionModelJson with Serializable

  3. object RegressionModelValueToTupleConversions extends RegressionModelValueToTupleConversions

  4. package json
