where to load the data from
the schema as present in the metastore, used to match up with the raw data in dialects where the schema is not stored with the data. For example, with a CSV format in Hive, the metastoreSchema is required in order to know what each column represents. We can't use the projection schema for this because the projection schema might be in a different order.
the schema required to read. This might not be the full schema present in the data, but it is required here because some file formats can read data more efficiently if they know they can omit some fields (e.g. Parquet).
used by some implementations to filter data at the file read level (e.g. Parquet).

The dataSchema represents the schema that was written for the data files. This won't necessarily be the same as the Hive metastore schema, because partition values are not written to the data files. We must include this here because some Hive formats, such as delimited files, don't store schema information with the data. The readerSchema is the schema required by the caller, which may be the same as the written data, or it may be a subset if a projection pushdown is being used.
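
As a rough illustration of why these schemas are kept separate, the sketch below parses one delimited row by position using the written column order, then returns only the fields a projection asks for. It is not the dialect's actual code: the column names, the `read_row` helper, and the modelling of schemas as plain lists of field names are all assumptions made for this example.

```python
# Hypothetical schemas, modelled as ordered lists of field names.
metastore_schema = ["id", "name", "country", "dt"]  # as recorded in the metastore
partition_keys = ["dt"]                             # partition values are not written to data files

# The data schema is what was actually written: the metastore schema minus partition keys.
data_schema = [f for f in metastore_schema if f not in partition_keys]

# The reader (projection) schema may be a subset and may be in a different order.
reader_schema = ["country", "id"]


def read_row(line, data_schema, reader_schema, delimiter=","):
    """Parse one delimited line by position, then return only the projected fields."""
    values = line.split(delimiter)
    if len(values) != len(data_schema):
        raise ValueError("row does not match the data schema")
    # A delimited file carries no field names, so position in data_schema is the only key.
    by_name = dict(zip(data_schema, values))
    return [by_name[f] for f in reader_schema]


row = read_row("42,alice,UK", data_schema, reader_schema)
print(row)
```

Parsing against `reader_schema` directly would assign the first raw value to `country`, which is why the written order has to be supplied separately for formats that store no schema of their own.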
Creates a new writer ready to do the bidding of the hive sink.
the schema that will be written to the underlying file. Since this is Hive, the caller should have stripped out any partition values.
the location to write the file
optional permission to set on the file once completed
any metadata we wish to include in the file. This might not be supported by all file types.
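
The writer expects a schema with partition values already removed. A minimal sketch of that preparation step on the caller's side might look like the following; `strip_partitions` and the sample field names are hypothetical and only illustrate the idea, not the sink's real implementation.

```python
def strip_partitions(schema, row, partition_keys):
    """Return (fields, values) with any partition columns removed, preserving order."""
    kept = [(f, v) for f, v in zip(schema, row) if f not in partition_keys]
    fields = [f for f, _ in kept]
    values = [v for _, v in kept]
    return fields, values


# Hypothetical record destined for a table partitioned by `dt`.
full_schema = ["id", "name", "dt"]
full_row = [42, "alice", "2017-01-01"]

write_schema, write_row = strip_partitions(full_schema, full_row, ["dt"])
print(write_schema, write_row)
```

In Hive the partition value is carried by the directory the file lives in, so only the remaining fields need to reach the file writer.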