PraFeatureGenerator

Instance Constructors

new PraFeatureGenerator(params: JValue, graph: GraphOnDisk, relation: String, relationMetadata: RelationMetadata, outputter: Outputter, fileUtil: FileUtil = new com.mattg.util.FileUtil())

Value Members

final def !=(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def !=(arg0: Any): Boolean

Definition Classes
Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def ==(arg0: Any): Boolean

Definition Classes
Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def computeFeatureValues(pathTypes: Seq[PathType], data: Dataset[NodePairInstance], isTraining: Boolean): FeatureMatrix

Given a set of source nodes and path types, compute values for a feature matrix where the feature types (or columns) are the path types, the rows are (source node, target node) pairs, and the values are the probability of starting at source node, following a path of a particular type, and ending at target node.
This is essentially a simple wrapper around the PathFollower GraphChi program, which computes these features using random walks.
Note that this computes a fixed number of _columns_ of the feature matrix, but the number of rows is not necessarily known in advance (when only the source node of the row is specified).
pathTypes
A list of PathType objects specifying the path types to follow from each source node.
returns
A feature matrix encoded as a list of MatrixRow objects. Note that this feature matrix may not have rows corresponding to every source in sourcesMap: if there were no paths from a source to an acceptable target following any of the path types, there will be no row in the matrix for that source.
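The shape of the matrix described above can be illustrated with a small, self-contained sketch. It is not the PathFollower implementation: it uses a hypothetical in-memory graph (the Graph type alias below) instead of GraphOnDisk, represents a path type as a plain sequence of relation names instead of a PathType, and computes exact path probabilities rather than estimating them with random walks. All names in it (PathFollowSketch, followPath, featureRows) are assumptions made for illustration.

```scala
object PathFollowSketch {
  // Hypothetical toy graph: node -> relation -> neighbors. Not the GraphOnDisk format.
  type Graph = Map[String, Map[String, Seq[String]]]

  // Distribution over end nodes when starting at `source` and following the
  // relations in `pathType` in order, choosing uniformly among matching edges.
  def followPath(graph: Graph, source: String, pathType: Seq[String]): Map[String, Double] =
    pathType.foldLeft(Map(source -> 1.0)) { (dist, relation) =>
      val steps = for {
        (node, prob) <- dist.toSeq
        neighbors = graph
          .getOrElse(node, Map.empty[String, Seq[String]])
          .getOrElse(relation, Seq.empty[String])
        next <- neighbors
      } yield next -> prob / neighbors.size
      // Several routes may reach the same node; add their probabilities.
      steps.groupBy(_._1).map { case (node, ps) => node -> ps.map(_._2).sum }
    }

  // One sparse row per (source, target) pair; column i belongs to pathTypes(i).
  // A source that reaches no target under any path type produces no row.
  def featureRows(
      graph: Graph,
      sources: Seq[String],
      pathTypes: Seq[Seq[String]]): Map[(String, String), Map[Int, Double]] = {
    val entries = for {
      source <- sources
      (pathType, column) <- pathTypes.zipWithIndex
      (target, prob) <- followPath(graph, source, pathType).toSeq
    } yield ((source, target), column, prob)
    entries.groupBy(_._1).map { case (pair, es) =>
      pair -> es.map(e => e._2 -> e._3).toMap
    }
  }
}
```

In this sketch the number of columns is fixed by pathTypes, while the set of rows depends on which targets are actually reachable, which mirrors the note about the number of rows not being known up front.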
def constructMatrixRow(instance: NodePairInstance): Option[MatrixRow]

Constructs a MatrixRow for a single instance. This is intended for SGD-style training or online prediction. Note that this could be _really_ inefficient for some kinds of feature generators, and so far is only implemented for SFE.

Definition Classes
PraFeatureGenerator → FeatureGenerator
def createPathFollower(followerParams: JValue, pathTypes: Seq[PathType], data: Dataset[NodePairInstance], isTraining: Boolean): PathFollower
def createPathTypeSelector(selectorParams: JValue, finder: PathFinder[NodePairInstance]): PathTypeSelector
def createTestMatrix(data: Dataset[NodePairInstance]): FeatureMatrix

Constructs a matrix for the test data. In general, if this step is dependent on training (because, for instance, a feature set was selected at training time), the FeatureGenerator should save that state internally, and use it to do this computation. Not all implementations need internal state to do this, but some do.

Definition Classes
PraFeatureGenerator → FeatureGenerator
def createTrainingMatrix(data: Dataset[NodePairInstance]): FeatureMatrix

Takes the data, probably does some random walks (or maybe some matrix multiplications, or a few other possibilities), and returns a FeatureMatrix.

Definition Classes
PraFeatureGenerator → FeatureGenerator
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
val featureParamKeys: Seq[String]
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
implicit val formats: DefaultFormats.type
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def getFeatureNames(): Array[String]

Returns a string representation of the features in the feature matrix. This need only be defined after createTrainingMatrix is called once, and calling removeZeroWeightFeatures may change the output of this function (because the training and test matrices may have different feature spaces; see comments above).

Definition Classes
PraFeatureGenerator → FeatureGenerator
def getMatrixAcceptPolicy(params: JValue, isTraining: Boolean): MatrixRowPolicy
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
var pathTypes: Seq[PathType]
def removeZeroWeightFeatures(weights: Seq[Double]): Seq[Double]

For efficiency in creating the test matrix, we might drop some features if they have zero weight. In some FeatureGenerator implementations, computing feature values can be very expensive, so this allows us to save some work. The return value is the updated set of weights, with any desired values removed. Yes, this potentially changes the indices and thus the meaning of the feature matrix. Thus the updated weights can't be used anymore on the training matrix, only on the test matrix.

Definition Classes
PraFeatureGenerator → FeatureGenerator
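The index shift described for removeZeroWeightFeatures can be seen in a minimal sketch. This is not the method's actual implementation; ZeroWeightSketch and dropZeroWeights are hypothetical names, and features are represented here as plain strings rather than PathType objects.

```scala
object ZeroWeightSketch {
  // Keeps only the features whose learned weight is non-zero and returns the
  // surviving feature names and weights in matching order.
  def dropZeroWeights(
      features: Seq[String],
      weights: Seq[Double]): (Seq[String], Seq[Double]) = {
    require(features.size == weights.size, "one weight per feature expected")
    features.zip(weights).filter { case (_, w) => w != 0.0 }.unzip
  }
}
```

With three features and weights 0.7, 0.0 and -1.2, the third feature moves from index 2 to index 1 after filtering. A training matrix built with the original three columns would therefore no longer line up with the returned weights, which is why they are only meaningful for a test matrix built afterwards.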
def selectPathFeatures(data: Dataset[NodePairInstance]): Seq[PathType]

Do feature selection for a PRA model, which amounts to finding common paths between sources and targets.
This pretty much just wraps around the PathFinder GraphChi program, which does random walks to find paths between source and target nodes, along with a little bit of post-processing to (for example) collapse paths that are the same, but are written differently in the GraphChi output because of inverse relationships.
data
A Dataset containing source and target nodes from which we start walks.
returns
A ranked list of the numPaths highest ranked path features, encoded as PathType objects.
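The post-processing and ranking steps described above might look roughly like the following sketch. It makes several assumptions that the documentation does not confirm: the Step encoding of a path, the inverseOf map from a relation to the relation it inverts, and ranking by raw path counts are all hypothetical, and the real selection is delegated to a PathTypeSelector. The sketch only shows how two differently written but equivalent paths can be merged before the top numPaths are chosen.

```scala
object PathSelectionSketch {
  // Hypothetical step encoding: a relation name plus the traversal direction.
  final case class Step(relation: String, reversed: Boolean)

  // Rewrites steps over a relation listed in `inverseOf` as steps over the
  // relation it inverts, traversed in the opposite direction.
  def canonicalize(path: Seq[Step], inverseOf: Map[String, String]): Seq[Step] =
    path.map { step =>
      inverseOf.get(step.relation) match {
        case Some(base) => Step(base, !step.reversed)
        case None       => step
      }
    }

  // Merges the counts of equivalent paths and returns the `numPaths` most
  // frequent ones, most frequent first. Tie order is unspecified.
  def selectTopPaths(
      counts: Seq[(Seq[Step], Int)],
      inverseOf: Map[String, String],
      numPaths: Int): Seq[Seq[Step]] = {
    val merged = counts
      .groupBy { case (path, _) => canonicalize(path, inverseOf) }
      .map { case (path, group) => path -> group.map(_._2).sum }
    merged.toSeq.sortBy { case (_, count) => -count }.take(numPaths).map(_._1)
  }
}
```

For example, if childOf is declared the inverse of parentOf, a path that walks childOf backwards is counted together with a path that walks parentOf forwards, and the combined count decides its rank.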
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

class PraFeatureGenerator extends FeatureGenerator[NodePairInstance]
