Adds a method, ryft, to DataFrameReader that allows you to read Ryft files using the DataFileReader.
Provides Ryft-specific methods on SparkContext
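As a sketch of how these SparkContext extensions might be used (the ryftRDD method name and its argument shapes below are assumptions and should be verified against the connector's API documentation):

```scala
import com.ryft.spark.connector._
import com.ryft.spark.connector.query.SimpleQuery

// Assumed usage: ryftRDD is added to SparkContext by the connector's
// implicit conversions. The Seq-of-queries argument and the pre-built
// queryOptions value (a RyftQueryOptions) are assumptions, not the
// confirmed signature.
val rdd = sc.ryftRDD(Seq(SimpleQuery("James Bond")), queryOptions)
rdd.count()
```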
Provides classes to describe query data model.
Here are some essential enums and classes:
com.ryft.spark.connector.domain.LogicalOperator - Logical operators (and, or, xor) used by com.ryft.spark.connector.query.RyftQuery.
com.ryft.spark.connector.domain.Action - Represents an action on RyftOne and the corresponding REST endpoint. Currently supported: Search and Count.
com.ryft.spark.connector.domain.RyftQueryOptions - An important class representing REST search query parameters, such as fuzziness, surrounding width, the list of fields, and the number of nodes to use for the search.
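A sketch of how these search parameters might be supplied; the parameter names and constructor shape below are assumptions, not the confirmed API, and should be checked against the connector's documentation:

```scala
import com.ryft.spark.connector.domain.RyftQueryOptions

// Hypothetical construction: a search with fuzziness 2, 10 bytes of
// surrounding context, restricted to the listed record fields.
// Parameter names and order are assumptions.
val options = RyftQueryOptions(
  fields = List("Date", "Arrest"),
  surrounding = 10,
  fuzziness = 2)
```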
Provides classes to represent connector-specific exceptions.
Provides classes to represent different RyftOne queries.
The main class to use is com.ryft.spark.connector.query.RecordQuery, as follows:
import com.ryft.spark.connector._
import com.ryft.spark.connector.domain.{recordField, contains}
import com.ryft.spark.connector.query.RecordQuery

val query = RecordQuery(recordField("Date"), contains, "04/14/2015")
  .and(recordField("Arrest"), contains, "false")
For raw text searches, the com.ryft.spark.connector.query.SimpleQuery class should be used, as follows:
import com.ryft.spark.connector._
import com.ryft.spark.connector.query.SimpleQuery

val query = SimpleQuery("James Bond")
Provides classes for RyftOne connector specific RDD operations.
Provides classes for DataFrames and SparkSQL support.
These classes are not intended to be used directly; they should be accessed via com.ryft.spark.connector.RyftDataFrameReader.
For example:
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.types._
import org.apache.spark.{SparkConf, SparkContext}
import com.ryft.spark.connector._

// Definition for the DataFrame setup
val schema = StructType(Seq(
  StructField("Arrest", BooleanType),
  StructField("Beat", IntegerType),
  StructField("Block", StringType),
  StructField("CaseNumber", StringType),
  StructField("_index", StructType(Seq(
    StructField("file", StringType),
    StructField("offset", StringType),
    StructField("length", IntegerType),
    StructField("fuzziness", ByteType))))))

sqlContext.read.ryft(schema, "*.crimestat", "crime_table")