org.apache.spark.sql.hive

HiveContext

class HiveContext extends SQLContext with Logging

An instance of the Spark SQL execution engine that integrates with data stored in Hive. Configuration for Hive is read from hive-site.xml on the classpath.

Self Type
HiveContext
Since

1.0.0

Linear Supertypes
SQLContext, Serializable, Serializable, Logging, AnyRef, Any

Instance Constructors

  1. new HiveContext(sc: SparkContext)
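
    A minimal sketch of constructing a HiveContext from an existing SparkContext and running a statement through it. The object name, application name, and local master URL are illustrative assumptions; Hive settings are read from hive-site.xml when it is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical example object; not part of the Spark API.
object HiveContextExample {
  // The only public constructor takes the SparkContext.
  def create(sc: SparkContext): HiveContext = new HiveContext(sc)

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with an illustrative application name.
    val conf = new SparkConf().setAppName("HiveContextExample").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val hiveContext = create(sc)
      // sql() is inherited from SQLContext and returns a DataFrame.
      hiveContext.sql("SHOW TABLES").show()
    } finally {
      sc.stop()
    }
  }
}
```

    Stopping the SparkContext in a finally block releases local resources even if the query fails.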

Type Members

  1. class QueryExecution extends HiveContext.QueryExecution

    Extends QueryExecution with hive specific features.

  2. class SQLSession extends HiveContext.SQLSession

    Attributes
    protected[org.apache.spark.sql.hive]
    Definition Classes
    HiveContext → SQLContext
  3. class SparkPlanner extends SparkStrategies

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def analyze(tableName: String): Unit

    Analyzes the given table in the current database to generate statistics, which will be used in query optimizations.

    Right now, it only supports Hive tables and it only updates the size of a Hive table in the Hive metastore.

    Annotations
    @Experimental()
    Since

    1.2.0
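
    A short usage sketch for analyze. The object and method names are hypothetical, and the table is assumed to be an existing Hive table in the current database.

```scala
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper; not part of the Spark API.
object AnalyzeExample {
  // Assumption: `tableName` names an existing Hive table in the current database.
  def updateTableSize(hiveContext: HiveContext, tableName: String): Unit = {
    // Per the description above, this updates the table size stored in the Hive metastore.
    hiveContext.analyze(tableName)
  }
}
```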

  7. lazy val analyzer: Analyzer

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    HiveContext → SQLContext
  8. def applySchemaToPythonRDD(rdd: RDD[Array[Any]], schema: StructType): DataFrame

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  9. def applySchemaToPythonRDD(rdd: RDD[Array[Any]], schemaString: String): DataFrame

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  10. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  11. def baseRelationToDataFrame(baseRelation: BaseRelation): DataFrame

    Definition Classes
    SQLContext
  12. val cacheManager: execution.CacheManager

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  13. def cacheTable(tableName: String): Unit

    Definition Classes
    SQLContext
  14. lazy val catalog: HiveMetastoreCatalog with OverrideCatalog

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    HiveContext → SQLContext
  15. def clearCache(): Unit

    Definition Classes
    SQLContext
  16. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  17. def conf: SQLConf

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  18. def configure(): Map[String, String]

    Overridden by child classes that need to set configuration before the client init.

    Attributes
    protected
  19. def convertCTAS: Boolean

    When true, a table created by a Hive CTAS statement (no USING clause) will be converted to a data source table, using the data source set by spark.sql.sources.default. The table in the CTAS statement is converted when any of the following conditions holds:

    • The CTAS statement does not specify a SerDe (ROW FORMAT SERDE), a file format (STORED AS), or a storage handler (STORED BY), and the value of hive.default.fileformat in hive-site.xml is either TextFile or SequenceFile.
    • The CTAS statement specifies TextFile (STORED AS TEXTFILE) as the file format and no SerDe is specified (no ROW FORMAT SERDE clause).
    • The CTAS statement specifies SequenceFile (STORED AS SEQUENCEFILE) as the file format and no SerDe is specified (no ROW FORMAT SERDE clause).
    Attributes
    protected[org.apache.spark.sql]
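
    The three conditions listed for convertCTAS can be modelled as a small predicate. This is a sketch of the documented rules only, not Spark's implementation; CtasClauses and meetsConversionCondition are hypothetical names, and case-insensitive matching of the format names is an assumption.

```scala
// Hypothetical model of the documented conditions; not Spark's own code.
object CtasConversionSketch {
  // Storage-related clauses that a CTAS statement may carry.
  final case class CtasClauses(
      serde: Option[String] = None,          // ROW FORMAT SERDE
      storedAs: Option[String] = None,       // STORED AS
      storageHandler: Option[String] = None) // STORED BY

  private def isTextOrSequence(format: String): Boolean =
    format.equalsIgnoreCase("TextFile") || format.equalsIgnoreCase("SequenceFile")

  // Returns true when any of the three documented conditions holds.
  // `defaultFileFormat` stands for the value of hive.default.fileformat.
  def meetsConversionCondition(ctas: CtasClauses, defaultFileFormat: String): Boolean = {
    val nothingSpecified =
      ctas.serde.isEmpty && ctas.storedAs.isEmpty && ctas.storageHandler.isEmpty
    val byDefaultFormat = nothingSpecified && isTextOrSequence(defaultFileFormat)
    val byTextFile =
      ctas.storedAs.exists(_.equalsIgnoreCase("TEXTFILE")) && ctas.serde.isEmpty
    val bySequenceFile =
      ctas.storedAs.exists(_.equalsIgnoreCase("SEQUENCEFILE")) && ctas.serde.isEmpty
    byDefaultFormat || byTextFile || bySequenceFile
  }
}
```

    The predicate only covers the listed conditions; the conversion itself still requires convertCTAS to be true.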
  20. def convertMetastoreParquet: Boolean

    When true, enables an experimental feature where metastore tables that use the parquet SerDe are automatically converted to use the Spark SQL parquet table scan, instead of the Hive SerDe.

    Attributes
    protected[org.apache.spark.sql]
  21. def convertMetastoreParquetWithSchemaMerging: Boolean

    When true, also tries to merge possibly different but compatible Parquet schemas in different Parquet data files.

    This configuration is only effective when "spark.sql.hive.convertMetastoreParquet" is true.

    Attributes
    protected[org.apache.spark.sql]
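
    Both Parquet flags are driven by configuration, so they can be set through the public setConf method. A sketch follows; the first key is the one quoted above, while the schema-merging key name is an assumption that should be checked against the Spark version in use. The object and method names are hypothetical.

```scala
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper; not part of the Spark API.
object ParquetConversionExample {
  def enableParquetConversion(hiveContext: HiveContext, mergeSchemas: Boolean): Unit = {
    // Key quoted in the description of convertMetastoreParquetWithSchemaMerging.
    hiveContext.setConf("spark.sql.hive.convertMetastoreParquet", "true")
    // Assumption: key name of the schema-merging flag; it only takes effect
    // when the conversion flag above is true.
    hiveContext.setConf(
      "spark.sql.hive.convertMetastoreParquet.mergeSchema", mergeSchemas.toString)
  }
}
```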
  22. def createDataFrame(rdd: JavaRDD[_], beanClass: Class[_]): DataFrame

    Definition Classes
    SQLContext
  23. def createDataFrame(rdd: RDD[_], beanClass: Class[_]): DataFrame

    Definition Classes
    SQLContext
  24. def createDataFrame(rowRDD: JavaRDD[Row], schema: StructType): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @DeveloperApi()
  25. def createDataFrame(rowRDD: RDD[Row], schema: StructType): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @DeveloperApi()
  26. def createDataFrame[A <: Product](data: Seq[A])(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[A]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  27. def createDataFrame[A <: Product](rdd: RDD[A])(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[A]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  28. def createExternalTable(tableName: String, source: String, schema: StructType, options: Map[String, String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  29. def createExternalTable(tableName: String, source: String, schema: StructType, options: Map[String, String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  30. def createExternalTable(tableName: String, source: String, options: Map[String, String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  31. def createExternalTable(tableName: String, source: String, options: Map[String, String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  32. def createExternalTable(tableName: String, path: String, source: String): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  33. def createExternalTable(tableName: String, path: String): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  34. def createSession(): SQLSession

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    HiveContext → SQLContext
  35. def currentSession(): HiveContext.SQLSession

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  36. val ddlParser: DDLParser

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  37. val defaultSession: HiveContext.SQLSession

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  38. def detachSession(): Unit

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  39. def dialectClassName: String

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    HiveContext → SQLContext
  40. def dropTempTable(tableName: String): Unit

    Definition Classes
    SQLContext
  41. lazy val emptyDataFrame: DataFrame

    Definition Classes
    SQLContext
  42. lazy val emptyResult: RDD[InternalRow]

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  43. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  44. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  45. def executePlan(plan: LogicalPlan): QueryExecution

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    HiveContext → SQLContext
  46. def executeSql(sql: String): HiveContext.QueryExecution

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  47. lazy val executionHive: ClientWrapper

    The copy of the Hive client that is used for execution. Currently this must always be Hive 13, as this is the version of Hive that is packaged with Spark SQL. This copy of the client is used for execution-related tasks such as registering temporary functions or ensuring that the ThreadLocal SessionState is correctly populated. This copy of Hive is *not* used for storing persistent metadata, and only points to a dummy metastore in a temporary directory.

    Attributes
    protected[org.apache.spark.sql.hive]
  48. val experimental: ExperimentalMethods

    Definition Classes
    SQLContext
  49. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  50. lazy val functionRegistry: FunctionRegistry

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    HiveContext → SQLContext
  51. def getAllConfs: Map[String, String]

    Definition Classes
    SQLContext
  52. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  53. def getConf(key: String, defaultValue: String): String

    Definition Classes
    SQLContext
  54. def getConf(key: String): String

    Definition Classes
    SQLContext
  55. def getSQLDialect(): ParserDialect

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  56. def getSchema(beanClass: Class[_]): Seq[AttributeReference]

    Attributes
    protected
    Definition Classes
    SQLContext
  57. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  58. def hiveMetastoreBarrierPrefixes: Seq[String]

    A comma-separated list of class prefixes that should explicitly be reloaded for each version of Hive that Spark SQL is communicating with. An example is Hive UDFs that are declared in a prefix that would typically be shared (i.e. org.apache.spark.*).

    Attributes
    protected[org.apache.spark.sql.hive]
  59. def hiveMetastoreJars: String

    The location of the jars that should be used to instantiate the HiveMetastoreClient. This property can be one of three options:

    • a classpath in the standard format for both Hive and Hadoop.
    • builtin - attempt to discover the jars that were used to load Spark SQL and use those. This option is only valid when using the execution version of Hive.
    • maven - download the correct version of Hive on demand from Maven.
    Attributes
    protected[org.apache.spark.sql.hive]
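
    The three accepted forms of the hiveMetastoreJars value can be told apart with a simple match. The sketch below is illustrative only: JarsSource, Builtin, Maven, Classpath, and classify are hypothetical names, and splitting the classpath on the platform path separator is an assumption.

```scala
import java.io.File

// Hypothetical model of the three documented options; not Spark's own code.
object MetastoreJarsSketch {
  sealed trait JarsSource
  case object Builtin extends JarsSource // reuse the jars that loaded Spark SQL
  case object Maven extends JarsSource   // download the matching Hive version on demand
  final case class Classpath(entries: List[String]) extends JarsSource

  def classify(value: String): JarsSource = value match {
    case "builtin" => Builtin
    case "maven"   => Maven
    // Assumption: any other value is a classpath split on the platform separator.
    case other     => Classpath(other.split(File.pathSeparator).filter(_.nonEmpty).toList)
  }
}
```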
  60. def hiveMetastoreSharedPrefixes: Seq[String]

    A comma separated list of class prefixes that should be loaded using the classloader that is shared between Spark SQL and a specific version of Hive. An example of classes that should be shared is JDBC drivers that are needed to talk to the metastore. Other classes that need to be shared are those that interact with classes that are already shared. For example, custom appenders that are used by log4j.

    Attributes
    protected[org.apache.spark.sql.hive]
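
    How hiveMetastoreBarrierPrefixes and hiveMetastoreSharedPrefixes might interact can be sketched as a routing decision on the class name. This is not Spark's class-loading code: the names below are hypothetical, and giving barrier prefixes precedence over shared prefixes is an assumption made so that a UDF package inside an otherwise shared prefix is still reloaded.

```scala
// Hypothetical model of prefix-based class routing; not Spark's own code.
object PrefixRoutingSketch {
  sealed trait Route
  case object Reloaded extends Route // matched a barrier prefix: reloaded per Hive version
  case object Shared extends Route   // matched a shared prefix: loaded by the shared classloader
  case object Isolated extends Route // everything else: loaded for one specific Hive version

  def route(
      className: String,
      sharedPrefixes: Seq[String],
      barrierPrefixes: Seq[String]): Route = {
    // Assumption: barrier prefixes are checked before shared prefixes.
    if (barrierPrefixes.exists(p => className.startsWith(p))) Reloaded
    else if (sharedPrefixes.exists(p => className.startsWith(p))) Shared
    else Isolated
  }
}
```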
  61. def hiveMetastoreVersion: String

    The version of the hive client that will be used to communicate with the metastore. Note that this does not necessarily need to be the same version of Hive that is used internally by Spark SQL for execution.

    Attributes
    protected[org.apache.spark.sql.hive]
  62. def hiveThriftServerAsync: Boolean

    Attributes
    protected[org.apache.spark.sql.hive]
  63. def hiveconf: HiveConf

    Attributes
    protected[org.apache.spark.sql.hive]
  64. def invalidateTable(tableName: String): Unit

    Attributes
    protected[org.apache.spark.sql.hive]
  65. def isCached(tableName: String): Boolean

    Definition Classes
    SQLContext
  66. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  67. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  68. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  69. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  70. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  71. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  72. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  73. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  74. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  75. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  76. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  77. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  78. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  79. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  80. lazy val metadataHive: ClientInterface

    The copy of the Hive client that is used to retrieve metadata from the Hive MetaStore. The version of the Hive client that is used here must match the metastore that is configured in the hive-site.xml file.

    Attributes
    protected[org.apache.spark.sql.hive]
  81. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  82. final def notify(): Unit

    Definition Classes
    AnyRef
  83. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  84. def openSession(): HiveContext.SQLSession

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  85. lazy val optimizer: Optimizer

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  86. def parseDataType(dataTypeString: String): DataType

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  87. def parseSql(sql: String): LogicalPlan

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    HiveContext → SQLContext
  88. val planner: SparkPlanner with HiveStrategies

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    HiveContext → SQLContext
  89. val prepareForExecution: RuleExecutor[SparkPlan]

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  90. def range(start: Long, end: Long, step: Long, numPartitions: Int): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  91. def range(start: Long, end: Long): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  92. def range(end: Long): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  93. def read: DataFrameReader

    Definition Classes
    SQLContext
    Annotations
    @Experimental()
  94. def refreshTable(tableName: String): Unit

    Invalidates and refreshes all the cached metadata of the given table. For performance reasons, Spark SQL or the external data source library it uses might cache certain metadata about a table, such as the location of blocks. When those change outside of Spark SQL, users should call this function to invalidate the cache.

    Since

    1.3.0
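
    A usage sketch for refreshTable: refresh the cached metadata, then read the table again. The object and method names are hypothetical, and the table is assumed to have been modified by a process outside Spark SQL.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper; not part of the Spark API.
object RefreshTableExample {
  // Assumption: files backing `tableName` were changed outside Spark SQL.
  def rereadAfterExternalChange(hiveContext: HiveContext, tableName: String): DataFrame = {
    // Drop and refresh any cached metadata before building a new DataFrame.
    hiveContext.refreshTable(tableName)
    hiveContext.table(tableName)
  }
}
```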

  95. def runSqlHive(sql: String): Seq[String]

    Attributes
    protected[org.apache.spark.sql.hive]
  96. def setConf(key: String, value: String): Unit

    Definition Classes
    HiveContext → SQLContext
  97. def setConf(props: Properties): Unit

    Definition Classes
    SQLContext
  98. def setSession(session: HiveContext.SQLSession): Unit

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  99. val sparkContext: SparkContext

    Definition Classes
    SQLContext
  100. def sql(sqlText: String): DataFrame

    Definition Classes
    SQLContext
  101. val sqlParser: SparkSQLParser

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  102. lazy val substitutor: VariableSubstitution

    Attributes
    protected[org.apache.spark.sql]
  103. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  104. def table(tableName: String): DataFrame

    Definition Classes
    SQLContext
  105. def tableNames(databaseName: String): Array[String]

    Definition Classes
    SQLContext
  106. def tableNames(): Array[String]

    Definition Classes
    SQLContext
  107. def tables(databaseName: String): DataFrame

    Definition Classes
    SQLContext
  108. def tables(): DataFrame

    Definition Classes
    SQLContext
  109. val tlSession: ThreadLocal[HiveContext.SQLSession]

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SQLContext
  110. def toString(): String

    Definition Classes
    AnyRef → Any
  111. val udf: UDFRegistration

    Definition Classes
    SQLContext
  112. def uncacheTable(tableName: String): Unit

    Definition Classes
    SQLContext
  113. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  114. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  115. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Deprecated Value Members

  1. def applySchema(rdd: JavaRDD[_], beanClass: Class[_]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.3.0) use createDataFrame

  2. def applySchema(rdd: RDD[_], beanClass: Class[_]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.3.0) use createDataFrame

  3. def applySchema(rowRDD: JavaRDD[Row], schema: StructType): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.3.0) use createDataFrame

  4. def applySchema(rowRDD: RDD[Row], schema: StructType): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.3.0) use createDataFrame

  5. def jdbc(url: String, table: String, theParts: Array[String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) use read.jdbc()

  6. def jdbc(url: String, table: String, columnName: String, lowerBound: Long, upperBound: Long, numPartitions: Int): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) use read.jdbc()

  7. def jdbc(url: String, table: String): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) use read.jdbc()

  8. def jsonFile(path: String, samplingRatio: Double): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.json()

  9. def jsonFile(path: String, schema: StructType): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.json()

  10. def jsonFile(path: String): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.json()

  11. def jsonRDD(json: JavaRDD[String], samplingRatio: Double): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.json()

  12. def jsonRDD(json: RDD[String], samplingRatio: Double): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.json()

  13. def jsonRDD(json: JavaRDD[String], schema: StructType): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.json()

  14. def jsonRDD(json: RDD[String], schema: StructType): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.json()

  15. def jsonRDD(json: JavaRDD[String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.json()

  16. def jsonRDD(json: RDD[String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.json()

  17. def load(source: String, schema: StructType, options: Map[String, String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.format(source).schema(schema).options(options).load()

  18. def load(source: String, schema: StructType, options: Map[String, String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.format(source).schema(schema).options(options).load()

  19. def load(source: String, options: Map[String, String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.format(source).options(options).load()

  20. def load(source: String, options: Map[String, String]): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.format(source).options(options).load()

  21. def load(path: String, source: String): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.format(source).load(path)

  22. def load(path: String): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated
    Deprecated

    (Since version 1.4.0) Use read.load(path)

  23. def parquetFile(paths: String*): DataFrame

    Definition Classes
    SQLContext
    Annotations
    @deprecated @varargs()
    Deprecated

    (Since version 1.4.0) Use read.parquet()
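
    The deprecation notes above point to the DataFrameReader returned by read. A hedged sketch of the replacements follows; the object and method names are hypothetical, and the empty JDBC properties object is an assumption for illustration.

```scala
import java.util.Properties

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.hive.HiveContext

// Hypothetical wrappers showing the non-deprecated calls.
object ReaderMigrationExample {
  // jsonFile(path) -> read.json(path)
  def loadJson(hc: HiveContext, path: String): DataFrame =
    hc.read.json(path)

  // parquetFile(paths: _*) -> read.parquet(paths: _*)
  def loadParquet(hc: HiveContext, paths: String*): DataFrame =
    hc.read.parquet(paths: _*)

  // load(source, options) -> read.format(source).options(options).load()
  def loadFromSource(hc: HiveContext, source: String, options: Map[String, String]): DataFrame =
    hc.read.format(source).options(options).load()

  // jdbc(url, table) -> read.jdbc(url, table, properties)
  // Assumption: no extra connection properties are needed.
  def loadJdbc(hc: HiveContext, url: String, table: String): DataFrame =
    hc.read.jdbc(url, table, new Properties())
}
```

    The deprecated applySchema overloads map directly onto the createDataFrame overloads with the same parameter lists.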
