Class

com.twitter.scalding.examples

WeightedPageRank

class WeightedPageRank extends Job

Computes weighted pagerank for the given graph. Starting from the given pagerank, the job performs one iteration and tests for convergence; if it has not yet converged, it clones itself and starts the next pagerank job with the updated pagerank as input.

This class is very similar to the PageRank class. The main differences are:
  1. weighted pagerank is supported
  2. the reset pagerank is pregenerated, possibly by a previous job
  3. dead pagerank is evenly distributed

Options:
  --pwd: working directory; the job reads and generates the following files there
    numnodes: total number of nodes
    nodes: nodes file <'src_id, 'dst_ids, 'weights, 'mass_prior>
    pagerank: the pagerank file, e.g. pagerank_0, pagerank_1, etc.
    totaldiff: the current max pagerank delta

Optional arguments:
  --weighted: do weighted pagerank, default false
  --curiteration: the current iteration, default 0
  --maxiterations: how many iterations to run, default 20
  --jumpprob: probability of a random jump, default 0.1
  --threshold: total difference before finishing early, default 0.001
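The documented options and their defaults can be summarised as a small configuration record. The following is a minimal sketch in plain Scala and does not use Scalding's Args class: PageRankOptions and parse are hypothetical names, and the parser assumes that every flag, including --weighted, is followed by exactly one value.

```scala
// Hypothetical container for the documented options; not part of Scalding.
final case class PageRankOptions(
    pwd: String,                // --pwd (required)
    weighted: Boolean = false,  // --weighted
    curIteration: Int = 0,      // --curiteration
    maxIterations: Int = 20,    // --maxiterations
    jumpProb: Double = 0.1,     // --jumpprob
    threshold: Double = 0.001   // --threshold
)

object PageRankOptions {
  // Assumes the form "--key value --key value ..." (one value per flag).
  def parse(argv: Seq[String]): PageRankOptions = {
    val kv: Map[String, String] = argv.grouped(2).collect {
      case Seq(k, v) if k.startsWith("--") => k.drop(2) -> v
    }.toMap

    PageRankOptions(
      pwd = kv.getOrElse("pwd", sys.error("--pwd is required")),
      weighted = kv.get("weighted").map(_.toBoolean).getOrElse(false),
      curIteration = kv.get("curiteration").map(_.toInt).getOrElse(0),
      maxIterations = kv.get("maxiterations").map(_.toInt).getOrElse(20),
      jumpProb = kv.get("jumpprob").map(_.toDouble).getOrElse(0.1),
      threshold = kv.get("threshold").map(_.toDouble).getOrElse(0.001)
    )
  }
}
```

For example, PageRankOptions.parse(Seq("--pwd", "/tmp/pr", "--weighted", "true")) yields a record with weighted set and every other optional argument at its documented default.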

Linear Supertypes
Job, Serializable, FieldConversions, LowPriorityFieldConversions, AnyRef, Any

Instance Constructors

  1. new WeightedPageRank(args: Args)


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. val ALPHA: Double

  5. val CURITERATION: Int

  6. val MAXITERATIONS: Int

  7. val PWD: String

  8. val ROW_TYPE_1: Int

  9. val ROW_TYPE_2: Int

  10. val THRESHOLD: Double

  11. val WEIGHTED: Boolean

  12. implicit def _implicitJobArgs: Args

    Attributes
    protected
    Definition Classes
    Job
  13. def anyToFieldArg(f: Any): Comparable[_]

    Attributes
    protected
    Definition Classes
    LowPriorityFieldConversions
  14. val args: Args

    Definition Classes
    Job
  15. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  16. def asList(f: Fields): List[Comparable[_]]

    Definition Classes
    FieldConversions
  17. def asSet(f: Fields): Set[Comparable[_]]

    Definition Classes
    FieldConversions
  18. def buildFlow: Flow[_]

    Definition Classes
    Job
  19. def classIdentifier: String

    Definition Classes
    Job
  20. def clear: Unit

    Definition Classes
    Job
  21. def clone(nextargs: Args): Job

    Definition Classes
    Job
  22. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. def config: Map[AnyRef, AnyRef]

    Definition Classes
    Job
  24. implicit def dateParser: DateParser

    Definition Classes
    Job
  25. def defaultComparator: Option[Class[_ <: Comparator[_]]]

    Definition Classes
    Job
  26. def defaultMode(fromFields: Fields, toFields: Fields): Fields

    Definition Classes
    FieldConversions
  27. def defaultSpillThreshold: Int

    Definition Classes
    Job
  28. def doPageRank(nodeRows: RichPipe, inputPagerank: RichPipe): RichPipe


    Runs one iteration of pagerank.

    inputPagerank: <'src_id_input, 'mass_input>
    returns: <'src_id, 'mass_n, 'mass_input>

    Here is a high-level view of the unweighted algorithm. Let
      N: number of nodes
      inputPagerank(N_i): probability of walking to node i
      d(N_j): out-degree of N_j
    then
      pagerankNext(N_i) = \sum_{j points to i} inputPagerank(N_j) / d(N_j)
      deadPagerank = (1 - \sum_{i} pagerankNext(N_i)) / N
      randomPagerank(N_i) = userMass(N_i) * ALPHA + deadPagerank * (1 - ALPHA)
      pagerankOutput(N_i) = randomPagerank(N_i) + pagerankNext(N_i) * (1 - ALPHA)

    For the weighted algorithm, let
      w(N_j, N_i): weight of the edge from N_j to N_i
      tw(N_j): total out-weight of N_j
    then
      pagerankNext(N_i) = \sum_{j points to i} inputPagerank(N_j) * w(N_j, N_i) / tw(N_j)
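The formulas above can be illustrated with a small in-memory sketch. It uses plain Scala collections rather than Scalding pipes, so it is not the job's actual implementation: Node, PageRankSketch and iterate are hypothetical names, and the sketch assumes that userMass(N_i) corresponds to the 'mass_prior column of the nodes file.

```scala
object PageRankSketch {
  // Hypothetical in-memory row mirroring <'src_id, 'dst_ids, 'weights, 'mass_prior>.
  final case class Node(id: Int, dsts: Seq[Int], weights: Seq[Double], massPrior: Double)

  // One iteration; `input` maps node id to its current pagerank mass.
  def iterate(nodes: Seq[Node], input: Map[Int, Double],
              alpha: Double, weighted: Boolean): Map[Int, Double] = {
    val n = nodes.size.toDouble

    // Mass each node sends along its out-edges.
    val contributions: Seq[(Int, Double)] = nodes.flatMap { node =>
      val mass = input.getOrElse(node.id, 0.0)
      if (weighted) {
        val tw = node.weights.sum // total out-weight tw(N_j)
        node.dsts.zip(node.weights).map { case (dst, w) => dst -> mass * w / tw }
      } else {
        node.dsts.map(dst => dst -> mass / node.dsts.size) // out-degree d(N_j)
      }
    }

    // pagerankNext(N_i): sum of incoming contributions.
    val next: Map[Int, Double] =
      contributions.groupBy(_._1).map { case (id, cs) => id -> cs.map(_._2).sum }

    // Mass lost at nodes without out-edges, spread evenly over all nodes.
    val dead = (1.0 - next.values.sum) / n

    nodes.map { node =>
      val random = node.massPrior * alpha + dead * (1 - alpha)
      node.id -> (random + next.getOrElse(node.id, 0.0) * (1 - alpha))
    }.toMap
  }
}
```

When the input masses and the priors each sum to 1, the output of this sketch also sums to 1, because the dead mass is redistributed instead of being dropped.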

  29. final def ensureUniqueFields(left: Fields, right: Fields, rightPipe: Pipe): (Fields, Pipe)

    Definition Classes
    FieldConversions
  30. implicit def enumValueToFields(x: Value): Fields

    Definition Classes
    FieldConversions
  31. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  32. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  33. implicit def fieldFields[T <: TraversableOnce[Field[_]]](f: T): RichFields

    Definition Classes
    FieldConversions
  34. implicit def fieldToFields(f: Field[_]): RichFields

    Definition Classes
    FieldConversions
  35. implicit def fields[T <: TraversableOnce[Symbol]](f: T): Fields

    Definition Classes
    FieldConversions
  36. implicit def fieldsToRichFields(fields: Fields): RichFields

    Definition Classes
    FieldConversions
  37. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  38. implicit val flowDef: FlowDef

    Attributes
    protected
    Definition Classes
    Job
  39. implicit def fromEnum[T <: Enumeration](enumeration: T): Fields

    Definition Classes
    FieldConversions
  40. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  41. def getField(f: Fields, idx: Int): Fields

    Definition Classes
    FieldConversions
  42. def getInputPagerank(fileName: String): Pipe

  43. def getNodes(fileName: String): Pipe


    read the pregenerated nodes file <'src_id, 'dst_ids, 'weights, 'mass_prior>

  44. def getNumNodes(fileName: String): Pipe


    read the total number of nodes from a single-line file

  45. def handleStats(statsData: CascadingStats): Unit

    Attributes
    protected
    Definition Classes
    Job
  46. def hasInts(f: Fields): Boolean

    Definition Classes
    FieldConversions
  47. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  48. val inputPagerank: Pipe

  49. implicit def intFields[T <: TraversableOnce[Int]](f: T): Fields

    Definition Classes
    FieldConversions
  50. implicit def intToFields(x: Int): Fields

    Definition Classes
    FieldConversions
  51. implicit def integerToFields(x: Integer): Fields

    Definition Classes
    FieldConversions
  52. def ioSerializations: List[Class[_ <: Serialization[_]]]

    Definition Classes
    Job
  53. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  54. implicit def iterableToRichPipe[T](iter: Iterable[T])(implicit set: TupleSetter[T], conv: TupleConverter[T]): RichPipe

    Definition Classes
    Job
  55. def keepAlive: Unit

    Definition Classes
    Job
  56. def listeners: List[FlowListener]

    Definition Classes
    Job
  57. implicit def mode: Mode

    Definition Classes
    Job
  58. def name: String

    Definition Classes
    Job
  59. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  60. final def newSymbol(avoid: Set[Symbol], guess: Symbol, trial: Int): Symbol

    Definition Classes
    FieldConversions
    Annotations
    @tailrec()
  61. def next: Option[Job]


    Tests for convergence and, if the job has not yet converged, kicks off the next iteration.

    Definition Classes
    WeightedPageRank → Job
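The convergence test can be pictured as a loop that stops on either the threshold or the iteration limit. The following is a minimal sketch in plain Scala, not the job's code: ConvergenceSketch, run and step are hypothetical names, the difference is assumed to be the sum of absolute per-node changes, and the exact comparison operators are also assumptions.

```scala
object ConvergenceSketch {
  // Repeats `step` (one pagerank iteration) until the change is small enough
  // or the iteration limit is reached. Returns the final ranks and the number
  // of iterations that have been run, counted from `curIteration`.
  @scala.annotation.tailrec
  def run(rank: Map[Int, Double], curIteration: Int,
          maxIterations: Int, threshold: Double)(
          step: Map[Int, Double] => Map[Int, Double]): (Map[Int, Double], Int) = {
    val updated = step(rank)

    // Assumed metric: sum of absolute per-node differences.
    val totalDiff = updated.map { case (id, mass) =>
      math.abs(mass - rank.getOrElse(id, 0.0))
    }.sum

    val iterationsDone = curIteration + 1
    if (totalDiff <= threshold || iterationsDone >= maxIterations)
      (updated, iterationsDone) // converged or out of budget: stop
    else
      run(updated, iterationsDone, maxIterations, threshold)(step) // next iteration
  }
}
```

In the job itself the loop is not in-process: each iteration is a separate job, and next returns the cloned job for the following iteration. The sketch only mirrors the stopping rule.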
  62. val nodes: Pipe

  63. final def notify(): Unit

    Definition Classes
    AnyRef
  64. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  65. val numNodes: Pipe

  66. val outputFileName: String

  67. val outputPagerank: RichPipe

  68. implicit def parseAnySeqToFields[T <: TraversableOnce[Any]](anyf: T): Fields

    Definition Classes
    FieldConversions
  69. implicit def pipeToRichPipe(pipe: Pipe): RichPipe

    Definition Classes
    Job
  70. implicit def productToFields(f: Product): Fields

    Definition Classes
    LowPriorityFieldConversions
  71. implicit def read(src: Source): Pipe

    Definition Classes
    Job
  72. def run: Boolean

    Definition Classes
    Job
  73. implicit def scaldingConfig: Config

    Attributes
    protected
    Definition Classes
    Job
  74. def skipStrategy: Option[FlowSkipStrategy]

    Definition Classes
    Job
  75. implicit def sourceToRichPipe(src: Source): RichPipe

    Definition Classes
    Job
  76. def stepListeners: List[FlowStepListener]

    Definition Classes
    Job
  77. def stepStrategy: Option[FlowStepStrategy[_]]

    Definition Classes
    Job
  78. implicit def strFields[T <: TraversableOnce[String]](f: T): Fields

    Definition Classes
    FieldConversions
  79. implicit def stringToFields(x: String): Fields

    Definition Classes
    FieldConversions
  80. implicit def symbolToFields(x: Symbol): Fields

    Definition Classes
    FieldConversions
  81. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  82. def timeout[T](timeout: AbsoluteDuration)(t: ⇒ T): Option[T]

    Definition Classes
    Job
  83. implicit def toPipe[T](iter: Iterable[T])(implicit set: TupleSetter[T], conv: TupleConverter[T]): Pipe

    Definition Classes
    Job
  84. def toString(): String

    Definition Classes
    AnyRef → Any
  85. val totalDiff: Pipe

  86. implicit def tuple2ToFieldsPair[T, U](pair: (T, U))(implicit tf: (T) ⇒ Fields, uf: (U) ⇒ Fields): (Fields, Fields)

    Definition Classes
    FieldConversions
  87. implicit def unitToFields(u: Unit): Fields

    Definition Classes
    FieldConversions
  88. def validate: Unit

    Definition Classes
    Job
  89. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  90. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  91. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  92. def write(pipe: Pipe, src: Source): Unit

    Definition Classes
    Job
