com.twitter.scalding.typed

PartitionedTextLine

case class PartitionedTextLine[P](path: String, template: String, encoding: String = ...)(implicit valueSetter: TupleSetter[String], valueConverter: TupleConverter[(Long, String)], partitionSetter: TupleSetter[P], partitionConverter: TupleConverter[P]) extends SchemedSource with TypedSink[(P, String)] with Mappable[(P, (Long, String))] with HfsTapProvider with Serializable with Product with Serializable

Scalding source to read or write partitioned text.

For writing, it expects a pair (P, String), where P is the data used for partitioning and String is the line to write. Below is an example.

val data = List(
  (("a", "x"), "line1"),
  (("a", "y"), "line2"),
  (("b", "z"), "line3")
)
IterablePipe(data, flowDef, mode)
  .write(PartitionedTextLine[(String, String)](args("out"), "col1=%s/col2=%s"))

For reading, it produces a pair (P, (Long, String)), where P is the partition data, Long is the offset into the file, and String is a line from the file. Below is an example.

val in: TypedPipe[((String, String), (Long, String))] =
  TypedPipe.from(PartitionedTextLine[(String, String)](args("in"), "col1=%s/col2=%s"))

path

Base path of the partitioned directory

template

Template for the partitioned path

encoding

Text encoding of the file content
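
To make the template parameter concrete, here is a minimal sketch of how a template such as "col1=%s/col2=%s" could map partition values to subdirectories under path. It assumes the %s placeholders are filled in order, as with String.format. The helper name partitionDir and the base path "/data/out" are hypothetical and not part of Scalding.

```scala
object TemplateSketch {
  // Hypothetical helper: fill the template's %s placeholders in order and
  // append the result to the base path. This only models the directory
  // layout; it is not Scalding's implementation.
  def partitionDir(base: String, template: String, partition: Product): String = {
    val values = partition.productIterator.map(_.toString).toSeq
    s"$base/${template.format(values: _*)}"
  }

  def main(args: Array[String]): Unit = {
    // Same sample data as the write example above.
    val data = List(
      (("a", "x"), "line1"),
      (("a", "y"), "line2"),
      (("b", "z"), "line3")
    )
    data.foreach { case (p, line) =>
      println(s"${partitionDir("/data/out", "col1=%s/col2=%s", p)} <- $line")
    }
  }
}
```

Under that assumption, each distinct partition value in the sample data ends up in its own directory, so the three records would land in three separate subdirectories.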

Linear Supertypes
Serializable, Product, Equals, HfsTapProvider, Mappable[(P, (Long, String))], TypedSource[(P, (Long, String))], TypedSink[(P, String)], SchemedSource, Source, Serializable, AnyRef, Any

Instance Constructors

  1. new PartitionedTextLine(path: String, template: String, encoding: String = ...)(implicit valueSetter: TupleSetter[String], valueConverter: TupleConverter[(Long, String)], partitionSetter: TupleSetter[P], partitionConverter: TupleConverter[P])

    path

    Base path of the partitioned directory

    template

    Template for the partitioned path

    encoding

    Text encoding of the file content

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def andThen[U](fn: ((P, (Long, String))) ⇒ U): TypedSource[U]

    Transform this TypedSource into another by mapping after. We don't call this map because of conflicts with Mappable, unfortunately

    Definition Classes
    TypedSource
  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def checkFlowDefNotNull()(implicit flowDef: FlowDef, mode: Mode): Unit

    Attributes
    protected
    Definition Classes
    Source
  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. def contraMap[U](fn: (U) ⇒ (P, String)): TypedSink[U]

    Transform this sink into another type by applying a function first

    Definition Classes
    TypedSink
  11. def converter[U >: (P, (Long, String))]: TupleConverter[U]

    Combine both the partition and value converter to extract the data from a flat cascading tuple into a pair of P and (offset, line).

    Definition Classes
    PartitionedTextLine → TypedSource
  12. def createHfsTap(scheme: Scheme[JobConf, RecordReader[_, _], OutputCollector[_, _], _, _], path: String, sinkMode: SinkMode): Hfs

    Definition Classes
    HfsTapProvider
  13. def createTap(readOrWrite: AccessMode)(implicit mode: Mode): Tap[_, _, _]

    Creates the taps for local and hdfs mode.

    Definition Classes
    PartitionedTextLine → Source
  14. val encoding: String

    Text encoding of the file content

  15. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  17. final def flatMapTo[U](out: Fields)(mf: ((P, (Long, String))) ⇒ TraversableOnce[U])(implicit flowDef: FlowDef, mode: Mode, setter: TupleSetter[U]): Pipe

    To filter, use this and return an Iterable of length 0 or 1. Filter does not change column names, and we generally expect to change columns here.

    Definition Classes
    Mappable
  18. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  19. def hdfsScheme: Scheme[JobConf, RecordReader[_, _], OutputCollector[_, _], _, _]

    The scheme to use if the source is on hdfs.

    Definition Classes
    PartitionedTextLine → SchemedSource
  20. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  21. def localScheme: Scheme[Properties, InputStream, OutputStream, _, _]

    The scheme to use if the source is local.

    Definition Classes
    PartitionedTextLine → SchemedSource
  22. final def mapTo[U](out: Fields)(mf: ((P, (Long, String))) ⇒ U)(implicit flowDef: FlowDef, mode: Mode, setter: TupleSetter[U]): Pipe

    Definition Classes
    Mappable
  23. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  24. final def notify(): Unit

    Definition Classes
    AnyRef
  25. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  26. implicit val partitionConverter: TupleConverter[P]

  27. val partitionFields: Fields

  28. implicit val partitionSetter: TupleSetter[P]

  29. val path: String

    Base path of the partitioned directory

  30. def read(implicit flowDef: FlowDef, mode: Mode): Pipe

    Definition Classes
    Source
  31. def setter[U <: (P, String)]: TupleSetter[U]

    Flatten a pair of P and line into a cascading tuple.

    Definition Classes
    PartitionedTextLine → TypedSink
  32. def sinkFields: Fields

    Definition Classes
    PartitionedTextLine → TypedSink
  33. val sinkMode: SinkMode

    Definition Classes
    SchemedSource
  34. def sourceFields: Fields

    Definition Classes
    TypedSource
  35. def sourceId: String

    This is a name that refers to this exact instance of the source (put another way, if s1.sourceId == s2.sourceId, the job should work the same if one is replaced with the other).

    Definition Classes
    Source
  36. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  37. val template: String

    Template for the partitioned path

  38. def toIterator(implicit config: Config, mode: Mode): Iterator[(P, (Long, String))]

    Allows you to read a Tap on the submit node. NOT FOR USE IN THE MAPPERS OR REDUCERS. Typical use might be to read in Job.next to determine if another job is needed.

    Definition Classes
    Mappable
  39. def transformForRead(pipe: Pipe): Pipe

    Attributes
    protected
    Definition Classes
    Source
  40. def transformForWrite(pipe: Pipe): Pipe

    Attributes
    protected
    Definition Classes
    Source
  41. def transformInTest: Boolean

    The mock passed in to scalding.JobTest may be considered as a mock of the Tap or the Source. By default, as of 0.9.0, it is considered a mock of the Source. If you set this to true, the mock in TestMode will be considered a mock of the Tap (which must be transformed) and not the Source.

    Definition Classes
    Source
  42. def validateTaps(mode: Mode): Unit

    Definition Classes
    Source
  43. implicit val valueConverter: TupleConverter[(Long, String)]

  44. implicit val valueSetter: TupleSetter[String]

  45. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  46. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  47. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  48. def writeFrom(pipe: Pipe)(implicit flowDef: FlowDef, mode: Mode): Pipe

    Write the pipe but return the input so it can be chained into the next operation.

    Definition Classes
    Source
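
Several members above take a function over the record types: andThen and flatMapTo receive the read-side shape (P, (Long, String)), and contraMap produces the write-side shape (P, String). The sketch below shows plausible functions of those shapes, applied to plain Scala collections rather than a running job. Event, dropOffset, nonEmptyLine and toRecord are hypothetical names used only for illustration.

```scala
object RecordShapes {
  // Partition type used throughout this sketch (matches the examples above).
  type Partition = (String, String)

  // Hypothetical domain type for the write side.
  final case class Event(country: String, day: String, message: String)

  // Shape accepted by andThen: drop the file offset, keep partition and line.
  val dropOffset: ((Partition, (Long, String))) => (Partition, String) = {
    case (p, (_, line)) => (p, line)
  }

  // Shape accepted by flatMapTo: return zero or one element to filter.
  val nonEmptyLine: ((Partition, (Long, String))) => List[String] = {
    case (_, (_, line)) => if (line.trim.isEmpty) Nil else List(line)
  }

  // Shape accepted by contraMap: turn a domain value into (P, String).
  val toRecord: Event => (Partition, String) =
    e => ((e.country, e.day), e.message)

  def main(args: Array[String]): Unit = {
    // Stand-in for records read from a partitioned directory.
    val read = List(
      (("a", "x"), (0L, "line1")),
      (("a", "y"), (6L, "")),
      (("b", "z"), (7L, "line3"))
    )
    println(read.map(dropOffset))
    println(read.flatMap(nonEmptyLine))
    println(toRecord(Event("us", "2020-01-01", "hello")))
  }
}
```

Given a source src of type PartitionedTextLine[(String, String)], the signatures listed above suggest these would be passed as src.andThen(dropOffset) and src.contraMap(toRecord).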

Deprecated Value Members

  1. def readAtSubmitter[T](implicit mode: Mode, conv: TupleConverter[T]): Stream[T]

    Definition Classes
    Source
    Annotations
    @deprecated
    Deprecated

    (Since version 0.9.0) replace with Mappable.toIterator
