com.datastax.spark.connector

package connector

The root package of the Cassandra connector for Apache Spark. Offers handy implicit conversions that add Cassandra-specific methods to SparkContext and RDD.

Call the cassandraTable method on the SparkContext object to create a CassandraRDD exposing a Cassandra table as a Spark RDD.

Call the saveToCassandra method (provided by RDDFunctions) on any RDD to save a distributed collection to a Cassandra table.

Example:

CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE test.words (word text PRIMARY KEY, count int);
INSERT INTO test.words(word, count) VALUES ('and', 50);

import com.datastax.spark.connector._

val sparkMasterHost = "127.0.0.1"
val cassandraHost = "127.0.0.1"
val keyspace = "test"
val table = "words"

// Tell Spark the address of one Cassandra node:
val conf = new SparkConf(true).set("spark.cassandra.connection.host", cassandraHost)

// Connect to the Spark cluster:
val sc = new SparkContext("spark://" + sparkMasterHost + ":7077", "example", conf)

// Read the table and print its contents:
val rdd = sc.cassandraTable(keyspace, table)
rdd.collect().foreach(println)

// Write two rows to the table:
val col = sc.parallelize(Seq(("of", 1200), ("the", 863)))
col.saveToCassandra(keyspace, table)

sc.stop()
Linear Supertypes
AnyRef, Any

Type Members

  1. trait AbstractGettableData extends AnyRef

  2. sealed trait BatchSize extends AnyRef

  3. case class BytesInBatch(batchSize: Int) extends BatchSize with Product with Serializable

  4. final class CassandraRow extends ScalaGettableData with Serializable

    Represents a single row fetched from Cassandra. Offers getters to read individual fields by column name or column index. The getters try to convert the value to the desired type whenever possible. Most column types can be converted to a String. For nullable columns, use the getXXXOption getters, which convert nulls to None values; otherwise a NullPointerException would be thrown.

    All getters throw an exception if the column name/index is not found. Column indexes start at 0.

    If the value cannot be converted to the desired type, com.datastax.spark.connector.types.TypeConversionException is thrown.

    Recommended getters for Cassandra types:

    • ascii: getString, getStringOption
    • bigint: getLong, getLongOption
    • blob: getBytes, getBytesOption
    • boolean: getBool, getBoolOption
    • counter: getLong, getLongOption
    • decimal: getDecimal, getDecimalOption
    • double: getDouble, getDoubleOption
    • float: getFloat, getFloatOption
    • inet: getInet, getInetOption
    • int: getInt, getIntOption
    • text: getString, getStringOption
    • timestamp: getDate, getDateOption
    • timeuuid: getUUID, getUUIDOption
    • uuid: getUUID, getUUIDOption
    • varchar: getString, getStringOption
    • varint: getVarInt, getVarIntOption
    • list: getList[T]
    • set: getSet[T]
    • map: getMap[K, V]
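
    For instance, a minimal sketch reading the words table from the package example above:

    val row = sc.cassandraTable("test", "words").first
    val word = row.getString("word")               // access by name
    val count = row.getInt(1)                      // access by index; indexes start at 0
    val countOpt = row.getIntOption("count")       // None when the column value is null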

    Collection getters getList, getSet and getMap require the appropriate item type to be passed explicitly:

    row.getList[String]("a_list")
    row.getList[Int]("a_list")
    row.getMap[Int, String]("a_map")

    The generic get method allows collections to be converted automatically to other collection types. Supported containers:

    • scala.collection.immutable.List
    • scala.collection.immutable.Set
    • scala.collection.immutable.TreeSet
    • scala.collection.immutable.Vector
    • scala.collection.immutable.Map
    • scala.collection.immutable.TreeMap
    • scala.collection.Iterable
    • scala.collection.IndexedSeq
    • java.util.ArrayList
    • java.util.HashSet
    • java.util.HashMap

    Example:

    row.get[List[Int]]("a_list")
    row.get[Vector[Int]]("a_list")
    row.get[java.util.ArrayList[Int]]("a_list")
    row.get[TreeMap[Int, String]]("a_map")

    Timestamps can be converted to other date types by using the generic get. Supported date types:

    • java.util.Date
    • java.sql.Date
    • org.joda.time.DateTime
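
    For example (a_timestamp is a hypothetical timestamp column, following the naming pattern of the examples above):

    row.get[java.util.Date]("a_timestamp")
    row.get[org.joda.time.DateTime]("a_timestamp")
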
  5. case class ColumnIndex(columnIndex: Int) extends ColumnRef with Product with Serializable

    References a column by its index in the row. Useful for tuples.

  6. case class ColumnName(columnName: String, alias: Option[String] = None) extends NamedColumnRef with Product with Serializable

    References a column by name.
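
    For example, thanks to the implicit toNamedColumnRef conversion listed under Value Members, plain strings can be used wherever a column reference is expected (a sketch based on the words table from the package example):

    sc.cassandraTable("test", "words").select("word", "count")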

  7. implicit final class ColumnNameFunctions extends AnyVal

  8. class ColumnNotFoundException extends Exception

    Thrown when the requested column does not exist in the result set.

  9. sealed trait ColumnRef extends AnyRef

    Unambiguous reference to a column in the query result set row.

  10. sealed trait ColumnSelector extends AnyRef

  11. sealed trait NamedColumnRef extends SelectableColumnRef

    A selectable column backed by a real, non-virtual column with a name in the table.

  12. class PairRDDFunctions[K, V] extends Serializable

  13. class RDDFunctions[T] extends WritableToCassandra[T] with Serializable

    Provides Cassandra-specific methods on RDD.

  14. case class RowsInBatch(batchSize: Int) extends BatchSize with Product with Serializable
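
    For example, batch sizes could be tuned when writing (a hedged sketch, assuming the writer package's WriteConf case class exposes a batchSize parameter, as in recent connector versions):

    import com.datastax.spark.connector.writer.WriteConf

    sc.parallelize(Seq(("of", 1200)))
      .saveToCassandra("test", "words", writeConf = WriteConf(batchSize = RowsInBatch(64)))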

  15. trait ScalaGettableData extends AbstractGettableData

  16. sealed trait SelectableColumnRef extends ColumnRef

    A column that can be selected from a CQL result set by name.

  17. case class SomeColumns(columns: SelectableColumnRef*) extends ColumnSelector with Product with Serializable
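
    For example, to write only the listed columns (strings are implicitly converted to column references; a sketch based on the words table from the package example):

    sc.parallelize(Seq(("of", 1200), ("the", 863)))
      .saveToCassandra("test", "words", SomeColumns("word", "count"))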

  18. class SparkContextFunctions extends Serializable

    Provides Cassandra-specific methods on SparkContext.

  19. case class TTL(columnName: String, alias: Option[String] = None) extends NamedColumnRef with Product with Serializable

  20. final class UDTValue extends ScalaGettableData with Serializable

  21. case class WriteTime(columnName: String, alias: Option[String] = None) extends NamedColumnRef with Product with Serializable
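
    For example, TTL and WriteTime references can be selected alongside regular columns (a sketch; the write time comes back as a Long, read here by index):

    val row = sc.cassandraTable("test", "words").select("word", WriteTime("count")).first
    val writtenAt = row.getLong(1)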

Value Members

  1. object AbstractGettableData

  2. object AllColumns extends ColumnSelector with Product with Serializable

  3. object BatchSize

  4. object CassandraRow extends Serializable

  5. object NamedColumnRef

  6. object PartitionKeyColumns extends ColumnSelector with Product with Serializable

  7. object RowCountRef extends SelectableColumnRef with Product with Serializable

  8. object SelectableColumnRef

  9. object SomeColumns extends Serializable

  10. object UDTValue extends Serializable

  11. package cql

    Contains the cql.CassandraConnector object, which is used to connect to a Cassandra cluster and to send CQL statements to it. CassandraConnector provides a Scala-idiomatic way of working with Cluster and Session objects and takes care of connection pooling and proper resource disposal.
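
    For example (a sketch reusing the conf from the package example; withSessionDo borrows a pooled session and returns it afterwards):

    import com.datastax.spark.connector.cql.CassandraConnector

    CassandraConnector(conf).withSessionDo { session =>
      session.execute("SELECT count(*) FROM test.words")
    }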

  12. package mapper

    Provides machinery for mapping Cassandra tables to user-defined Scala classes or tuples. The main class in this package is mapper.ColumnMapper, responsible for matching a Scala object's properties with Cassandra column names.
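
    For example, rows can be mapped to a case class whose fields correspond to the column names (a sketch based on the words table from the package example):

    case class WordCount(word: String, count: Int)

    sc.cassandraTable[WordCount]("test", "words").foreach(println)
    sc.parallelize(Seq(WordCount("and", 50))).saveToCassandra("test", "words")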

  13. package metrics

  14. package rdd

    Contains the com.datastax.spark.connector.rdd.CassandraTableScanRDD class, the main entry point for analyzing Cassandra data from Spark.
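
    For example, rows can be filtered server-side with the where method (a sketch against a hypothetical test.cars table with an indexed color column):

    sc.cassandraTable("test", "cars").select("id", "model").where("color = ?", "black").foreach(println)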

  15. package streaming

  16. implicit def toNamedColumnRef(columnName: String): ColumnName

  17. implicit def toPairRDDFunctions[K, V](rdd: RDD[(K, V)]): PairRDDFunctions[K, V]

  18. implicit def toRDDFunctions[T](rdd: RDD[T]): RDDFunctions[T]

  19. implicit def toSparkContextFunctions(sc: SparkContext): SparkContextFunctions

  20. package types

    Offers type conversion magic, so you can receive Cassandra column values in the form you like most. Simply specify the type you want to use on the Scala side, and the column value will be converted automatically. This also works with complex objects like collections.
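
    For example, the same column value can be requested as different Scala types (a sketch based on the words table from the package example):

    val row = sc.cassandraTable("test", "words").first
    row.get[Int]("count")      // the native column type
    row.get[Long]("count")     // widened automatically
    row.get[String]("count")   // converted to a String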

  21. package util

    Useful stuff that didn't fit elsewhere.

  22. package writer

    Contains components for writing RDDs to Cassandra.
