package com.twitter.scalding

Type Members

  1. sealed abstract class AccessMode extends AnyRef

  2. class BufferOp[I, T, X] extends BaseOperation[Any] with Buffer[Any]

  3. abstract class CascadeJob extends Job

  4. class CascadeTest extends JobTest

  5. trait CascadingLocal extends Mode

  6. trait CaseClassPackers extends LowPriorityTuplePackers

  7. class CoGroupBuilder extends GroupBuilder

    Builder classes used internally to implement coGroups (joins).

  8. case class Csv(p: String, separator: String = ",", fields: Fields = cascading.tuple.Fields.ALL, skipHeader: Boolean = false, writeHeader: Boolean = false, quote: String = "\"") extends FixedPathSource with DelimitedScheme with Product with Serializable

    CSV source: values are separated by commas, with quotes wrapping all fields.

  9. trait DefaultDateRangeJob extends Job

    Sets up an implicit dateRange to use in your sources and an implicit timezone.

  10. trait DelimitedScheme extends Source

    Mix this in for delimited schemes such as TSV or one-separated values. By default, TSV is given.

  11. sealed abstract class Field[T] extends Serializable

  12. trait FieldConversions extends LowPriorityFieldConversions

  13. abstract class FileSource extends Source

    This is a base class for File-based sources

  14. class FilterFunction[T] extends BaseOperation[Any] with Filter[Any]

  15. abstract class FixedPathSource extends FileSource

  16. class FlatMapFunction[S, T] extends BaseOperation[Any] with Function[Any]

  17. class FoldAggregator[T, X] extends BaseOperation[X] with Aggregator[X]

  18. abstract class FoldFunctor[X] extends Functor

    This handles the mapReduceMap work on the map-side of the operation.

  19. trait FoldOperations[+Self <: FoldOperations[Self]] extends ReduceOperations[Self] with Sortable[Self]

    Implements reductions on top of a simple abstraction for the Fields-API. We use the f-bounded polymorphism trick to return the type called Self in each operation.
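
The f-bounded polymorphism trick mentioned above can be sketched in plain Scala. This is a minimal illustration, not scalding's code: Chainable, Builder, withStep and describe are hypothetical names invented for the example.

```scala
// Hypothetical names throughout; this only illustrates the Self-type pattern.
trait Chainable[+Self <: Chainable[Self]] {
  // Every operation returns Self, so subclass-specific methods remain
  // available after chaining.
  def withStep(name: String): Self
}

final class Builder(val steps: List[String]) extends Chainable[Builder] {
  def withStep(name: String): Builder = new Builder(steps :+ name)

  // Defined only on Builder, yet callable after withStep because
  // withStep is typed to return Builder rather than Chainable.
  def describe: String = steps.mkString(" -> ")
}

object FBoundedDemo {
  // Generic code can also preserve the concrete type.
  def twoSteps[S <: Chainable[S]](s: S): S = s.withStep("sum").withStep("sort")

  val plan: String = twoSteps(new Builder(Nil)).describe
}
```

Without the Self parameter, withStep would have to return the base trait, and the call to describe would not compile.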

  20. trait GeneratedConversions extends LowPriorityConversions

  21. trait GeneratedTupleAdders extends AnyRef

  22. class GroupBuilder extends FoldOperations[GroupBuilder] with StreamOperations[GroupBuilder]

    This controls the sequence of reductions that happen inside a particular grouping operation.

  23. class Grouped[K, +T] extends KeyedList[K, T] with Serializable

    Represents a grouping, which is the transition from the map phase to the reduce phase in Hadoop.

  24. trait HadoopMode extends Mode

  25. case class HadoopTest(conf: Configuration, buffers: Map[Source, Buffer[Tuple]]) extends Mode with HadoopMode with TestMode with Product with Serializable

  26. case class Hdfs(strict: Boolean, conf: Configuration) extends Mode with HadoopMode with Product with Serializable

  27. case class IntField[T](id: Integer)(implicit ord: Ordering[T], mf: Option[Manifest[T]]) extends Field[T] with Product with Serializable

  28. class IntegralComparator extends Comparator[AnyRef] with Hasher[AnyRef] with Serializable

  29. class InvalidJoinModeException extends Exception

  30. class InvalidSourceException extends RuntimeException

    Thrown when validateTaps fails.

  31. case class IterableSource[T](iter: Iterable[T], inFields: Fields = cascading.tuple.Fields.NONE)(implicit set: TupleSetter[T], converter: TupleConverter[T]) extends Source with Mappable[T] with Product with Serializable

    Allows an iterable object defined in the job (on the submitter) to be used within a Job as you would use a Pipe/RichPipe.

  32. class Job extends TupleConversions with FieldConversions with Serializable

  33. class JobTest extends TupleConversions

    This class is used to construct unit tests for scalding jobs.

  34. trait JoinAlgorithms extends AnyRef

  35. sealed abstract class JoinMode extends AnyRef

  36. case class JsonLine(p: String, fields: Fields = cascading.tuple.Fields.ALL) extends FixedPathSource with TextLineScheme with Product with Serializable

    This Source writes out the TupleEntry as a simple JSON object, using the field names as keys and the string representation of the values.

  37. trait KeyedList[K, +T] extends AnyRef

    Represents sharded lists of items of type T

  38. case class Local(strict: Boolean) extends Mode with CascadingLocal with Product with Serializable

  39. trait LocalTapSource extends FileSource

  40. trait LowPriorityConversions extends AnyRef

  41. trait LowPriorityFieldConversions extends AnyRef

  42. trait LowPriorityTuplePackers extends TupleConversions

  43. trait LowPriorityTupleUnpackers extends TupleConversions

  44. class MRMAggregator[T, X, U] extends BaseOperation[Tuple] with Aggregator[Tuple]

  45. class MRMBy[T, X, U] extends AggregateBy

    MapReduceMapBy Class

  46. class MRMFunctor[T, X] extends FoldFunctor[X]

    This handles the mapReduceMap work on the map-side of the operation.
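
The map, reduce, map shape that the name mapReduceMap suggests can be sketched on an in-memory collection. This is an assumption-laden illustration: the helper below and its signature are hypothetical and run locally, whereas the classes listed here do the work inside Cascading operations.

```scala
object MapReduceMapSketch {
  // Hypothetical local helper: map each item, reduce the mapped values
  // pairwise, then apply a final map to the single reduced value.
  def mapReduceMap[T, X, U](items: Iterable[T])(mapFn: T => X)(
      redFn: (X, X) => X)(finishFn: X => U): Option[U] =
    items.iterator.map(mapFn).reduceOption(redFn).map(finishFn)

  // Average: map each value to (sum, count), add the pairs, divide once.
  def average(xs: Iterable[Double]): Option[Double] =
    mapReduceMap(xs)(x => (x, 1L))((a, b) => (a._1 + b._1, a._2 + b._2)) {
      case (sum, n) => sum / n
    }
}
```

Because the middle step only ever combines two values of the same type, it can be applied to partial results on the map side before the final reduction.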

  47. class MapFunction[S, T] extends BaseOperation[Any] with Function[Any]

  48. trait Mappable[T] extends Source

    Usually as soon as we open a source, we read and do some mapping operation on a single column or set of columns.

  49. class MappedOrdering[B, T] extends Ordering[T] with Serializable

  50. class MapsideReduce[V] extends BaseOperation[SummingCache[Tuple, V]] with Function[SummingCache[Tuple, V]]

    An implementation of map-side combining which is appropriate for associative and commutative functions. If a cacheSize is given, it is used, else we query the config for cascading.
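
A minimal sketch of map-side combining with a bounded cache, assuming a simple least-recently-updated eviction policy. CombiningCache and CombineDemo are hypothetical names; the real class is built on SummingCache, whose behaviour is not shown on this page.

```scala
import scala.collection.mutable

// Hypothetical bounded cache that merges values per key before emitting them.
final class CombiningCache[K, V](capacity: Int)(combine: (V, V) => V) {
  require(capacity > 0, "capacity must be positive")
  private val cache = mutable.LinkedHashMap.empty[K, V]

  // Merge a value into the cache; returns the entry evicted to stay
  // within capacity, if any.
  def put(key: K, value: V): Option[(K, V)] = {
    val merged = cache.remove(key).map(combine(_, value)).getOrElse(value)
    cache.put(key, merged) // re-inserted last, so the head is the stalest key
    if (cache.size > capacity) {
      val oldest = cache.head
      cache.remove(oldest._1)
      Some(oldest)
    } else None
  }

  // Emit everything still cached, for example when the mapper finishes.
  def flush(): List[(K, V)] = {
    val out = cache.toList
    cache.clear()
    out
  }
}

object CombineDemo {
  def wordCounts(words: List[String]): Map[String, Int] = {
    val cache = new CombiningCache[String, Int](2)(_ + _)
    val evicted = words.flatMap(w => cache.put(w, 1).toList)
    val emitted = evicted ++ cache.flush()
    // The reduce side still merges partial sums: a key may be emitted
    // more than once if it was evicted and then seen again.
    emitted.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).sum }
  }
}
```

The partial sums can arrive in any order and grouping, which is why the combining function needs to be associative and commutative.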

  51. class MemoryTap[In, Out] extends Tap[Properties, In, Out]

  52. class MemoryTupleEntryCollector extends TupleEntryCollector

  53. abstract class Mode extends AnyRef

    There are three ways to run jobs. sourceStrictness is set to true

  54. abstract class MostRecentGoodSource extends TimePathedSource

  55. case class MultipleDelimitedFiles(f: Fields, separator: String, quote: String, skipHeader: Boolean, writeHeader: Boolean, p: String*) extends FixedPathSource with DelimitedScheme with Product with Serializable

    Delimited files source that allows overriding the separator, the quotation characters, and the header configuration.

  56. case class MultipleSequenceFiles(p: String*) extends FixedPathSource with SequenceFileScheme with LocalTapSource with Product with Serializable

  57. case class MultipleTextLineFiles(p: String*) extends FixedPathSource with TextLineScheme with Product with Serializable

  58. case class MultipleWritableSequenceFiles[K <: Writable, V <: Writable](p: Seq[String], f: Fields)(implicit evidence$9: Manifest[K], evidence$10: Manifest[V]) extends FixedPathSource with WritableSequenceFileScheme with LocalTapSource with Product with Serializable

  59. class NamedPoolThreadFactory extends ThreadFactory

  60. class NullTap[Config, Input, Output, SourceContext, SinkContext] extends SinkTap[Config, Output]

    A tap that outputs nothing.

  61. class OrderedConstructorConverter[T] extends TupleConverter[T]

  62. class OrderedTuplePacker[T] extends TuplePacker[T]

    This just blindly uses the first public constructor with the same arity as the fields size
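
The constructor lookup described above might look roughly like the following reflection sketch. The helper name pack and its signature are hypothetical; it only shows the idea of choosing a public constructor by arity and is not the library's implementation.

```scala
object ConstructorPackSketch {
  // Hypothetical helper: build an instance of cls from positional values,
  // using a public constructor whose arity matches the number of values.
  def pack[T](cls: Class[T], values: Seq[AnyRef]): T = {
    val ctor = cls.getConstructors
      .find(_.getParameterTypes.length == values.size)
      .getOrElse(throw new IllegalArgumentException(
        s"no public constructor of arity ${values.size} on ${cls.getName}"))
    // Argument types are not checked up front: a mismatch only surfaces
    // as an exception when the constructor is invoked.
    cls.cast(ctor.newInstance(values: _*).asInstanceOf[AnyRef])
  }
}
```

Note that the JVM does not specify the order in which getConstructors returns constructors, so "first" is only well defined when a single public constructor has the required arity.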

  63. case class Osv(p: String, f: Fields = cascading.tuple.Fields.ALL) extends FixedPathSource with DelimitedScheme with Product with Serializable

    One separated value (commonly used by Pig)

  64. class PipeTExtensions extends Serializable

  65. trait ReduceOperations[+Self <: ReduceOperations[Self]] extends Serializable

    Implements reductions on top of a simple abstraction for the Fields-API. This is for associative and commutative operations (Monoids in particular play a big role here).
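
Why associativity matters for these reductions can be shown with a toy Monoid. The trait below is a hypothetical minimal version written for this example, not the type class scalding uses.

```scala
object MonoidSketch {
  // Hypothetical minimal Monoid: an identity element plus an associative plus.
  trait Monoid[A] {
    def zero: A
    def plus(l: A, r: A): A
  }

  object Monoid {
    implicit val intSum: Monoid[Int] = new Monoid[Int] {
      val zero: Int = 0
      def plus(l: Int, r: Int): Int = l + r
    }
  }

  // Single pass over all the data.
  def sum[A](xs: Iterable[A])(implicit m: Monoid[A]): A =
    xs.foldLeft(m.zero)((l, r) => m.plus(l, r))

  // Sum each chunk separately, then sum the partial results. For a lawful
  // Monoid this equals the single-pass sum, whatever the chunk size.
  def sumInChunks[A: Monoid](xs: Seq[A], chunkSize: Int): A =
    sum(xs.grouped(chunkSize).map(chunk => sum(chunk)).toList)
}
```

Chunks here stand in for the partial results that different mappers or reducers would produce.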

  66. class ReflectionSetter[T] extends TupleSetter[T]

  67. class ReflectionTupleConverter[T] extends TupleConverter[T]

  68. class ReflectionTuplePacker[T] extends TuplePacker[T]

    Packs a tuple into any object with set methods, e.

  69. class ReflectionTupleUnpacker[T] extends TupleUnpacker[T]

  70. class RichFields extends Fields

  71. class RichPipe extends Serializable with JoinAlgorithms

  72. class ScaldingMultiSourceTap extends MultiSourceTap[Tap[JobConf, RecordReader[_, _], OutputCollector[_, _]], JobConf, RecordReader[_, _]]

  73. class ScanLeftIterator[T, U] extends Iterator[U] with Serializable

    Scala 2.

  74. class ScriptJob extends Job

  75. case class SequenceFile(p: String, f: Fields = cascading.tuple.Fields.ALL) extends FixedPathSource with SequenceFileScheme with LocalTapSource with Product with Serializable

  76. trait SequenceFileScheme extends Source

  77. abstract class SideEffectBaseOperation[C] extends BaseOperation[C]

  78. class SideEffectBufferOp[I, T, C, X] extends SideEffectBaseOperation[C] with Buffer[C]

  79. class SideEffectFlatMapFunction[S, C, T] extends SideEffectBaseOperation[C] with Function[C]

  80. class SideEffectMapFunction[S, C, T] extends SideEffectBaseOperation[C] with Function[C]

  81. sealed abstract class SkewReplication extends AnyRef

    Represents a strategy for replicating rows when performing skewed joins.

  82. case class SkewReplicationA(replicationFactor: Int = 1) extends SkewReplication with Product with Serializable

    See https://github.

  83. case class SkewReplicationB(maxKeysInMemory: Int = 1000000.0.toInt, maxReducerOutput: Int = 1.0E7.toInt) extends SkewReplication with Product with Serializable

    See https://github.

  84. trait Sortable[+Self] extends AnyRef

  85. abstract class Source extends Serializable

    Every source must have a correct toString method.

  86. trait Stateful extends AnyRef

    A simple trait for releasable resources.

  87. trait StreamOperations[+Self <: StreamOperations[Self]] extends Sortable[Self] with Serializable

    Implements reductions on top of a simple abstraction for the Fields-API. We use the f-bounded polymorphism trick to return the type called Self in each operation.

  88. case class StringField[T](id: String)(implicit ord: Ordering[T], mf: Option[Manifest[T]]) extends Field[T] with Product with Serializable

  89. trait SuccessFileSource extends FileSource

    Ensures that a _SUCCESS file is present in the Source path.

  90. case class Test(buffers: Map[Source, Buffer[Tuple]]) extends Mode with TestMode with CascadingLocal with Product with Serializable

    Memory only testing for unit tests

  91. trait TestMode extends Mode

  92. case class TextLine(p: String) extends FixedPathSource with TextLineScheme with Product with Serializable

  93. trait TextLineScheme extends Source with Mappable[String]

    The fields here are ('offset, 'line)

  94. abstract class TimePathedSource extends FileSource

    This will automatically produce a globbed version of the given path.

  95. class Tool extends Configured with org.apache.hadoop.util.Tool

  96. case class Tsv(p: String, fields: Fields = cascading.tuple.Fields.ALL, skipHeader: Boolean = false, writeHeader: Boolean = false) extends FixedPathSource with DelimitedScheme with Product with Serializable

    Tab separated value source

  97. trait TupleArity extends AnyRef

    Mixed in to both TupleConverter and TupleSetter to improve arity safety of cascading jobs before we run anything on Hadoop.

  98. trait TupleConversions extends GeneratedConversions

  99. abstract class TupleConverter[T] extends Serializable with TupleArity

  100. abstract class TupleGetter[T] extends Serializable

  101. abstract class TuplePacker[T] extends Serializable

  102. abstract class TupleSetter[-T] extends Serializable with TupleArity

  103. abstract class TupleUnpacker[-T] extends Serializable

  104. class TupleUnpackerException extends Exception

  105. class TypedDelimited[T] extends FixedPathSource with DelimitedScheme with Mappable[T]

  106. class TypedPipe[+T] extends Serializable

    Represents a phase in a distributed computation on an input data source. Wraps a cascading Pipe object, and holds the transformation done up until that point.

  107. trait UtcDateRangeJob extends Job with DefaultDateRangeJob

  108. case class WritableSequenceFile[K <: Writable, V <: Writable](p: String, f: Fields)(implicit evidence$7: Manifest[K], evidence$8: Manifest[V]) extends FixedPathSource with WritableSequenceFileScheme with LocalTapSource with Product with Serializable

  109. trait WritableSequenceFileScheme extends Source

  110. class XHandler extends AnyRef

    Provide handlers and mapping for exceptions

  111. class LtOrdering[T] extends Ordering[T] with Serializable

    Annotations: @deprecated

    Deprecated (Since version 0.8.3): Using Ordering.fromLessThan, duh..
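
The replacement named in the LtOrdering deprecation note is part of the Scala standard library. A short sketch of its use follows; the Event type and the field names are hypothetical and exist only for the example.

```scala
object FromLessThanSketch {
  // Hypothetical record type used only for illustration.
  final case class Event(name: String, priority: Int)

  // Build an Ordering from a strict less-than predicate.
  val byPriority: Ordering[Event] =
    Ordering.fromLessThan[Event]((a, b) => a.priority < b.priority)

  val sortedNames: List[String] =
    List(Event("c", 3), Event("a", 1), Event("b", 2))
      .sorted(byPriority)
      .map(_.name)
}
```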

Value Members

  1. object CascadeTest

  2. object CascadingUtils

  3. object Dsl extends FieldConversions with TupleConversions with GeneratedTupleAdders with Serializable

    This object has all the implicit functions and values that are used to make the scalding DSL.

  4. object Field extends Serializable

  5. object Grouped extends Serializable

  6. object HadoopSchemeInstance

  7. object InnerJoinMode extends JoinMode with Product with Serializable

  8. object Job extends Serializable

  9. object JobTest

  10. object JsonLine extends AbstractFunction2[String, Fields, JsonLine] with Serializable with Serializable

    TODO: at the next binary incompatible version remove the AbstractFunction2/scala.

  11. object Mode

  12. object NullSource extends Source

    A source that outputs nothing.

  13. object OuterJoinMode extends JoinMode with Product with Serializable

  14. object Read extends AccessMode with Product with Serializable

  15. object ReflectionUtils

    A helper for working with class reflection.

  16. object RichFields extends Serializable

  17. object RichPipe extends Serializable

  18. object RichXHandler

    Provides an apply method for creating XHandlers with default or custom settings, and contains the messages and mappings.

  19. object TDsl extends Serializable with GeneratedTupleAdders

    implicits for the type-safe DSL import TDsl.

  20. object TimePathedSource extends Serializable

  21. object Tool

  22. object TuplePacker extends CaseClassPackers with Serializable

    Base class for classes which pack a Tuple into a serializable object.

  23. object TupleUnpacker extends LowPriorityTupleUnpackers with Serializable

    Base class for objects which unpack an object into a tuple.

  24. object TypedPipe extends Serializable

    factory methods for TypedPipe

  25. object TypedTsv

    Allows you to set the types, prefer this: If T is a subclass of Product, we assume it is a tuple.

  26. object Write extends AccessMode with Product with Serializable

  27. package examples

  28. package mathematics

  29. package serialization

  30. package source

  31. package typed
