Class JsonDataSources
java.lang.Object
net.sansa_stack.spark.io.json.input.JsonDataSources
-
Nested Class Summary
Modifier and Type    Class
static class         JsonDataSources.JsonProbeResult
static enum
-
Constructor Summary
Constructor
JsonDataSources()
-
Method Summary
Modifier and Type    Method    Description
static Function<org.apache.spark.api.java.JavaRDD<com.google.gson.JsonElement>, org.apache.spark.api.java.JavaRDD<org.apache.jena.sparql.engine.binding.Binding>>
bindingMapper(org.apache.jena.sparql.core.Var outputVar)
Convert a JavaRDD<JsonElement> into a JavaRDD<Binding> by converting JSON elements into Nodes (primitive JSON values become native RDF terms) and adding them to bindings with the given outputVar.
static org.apache.spark.api.java.JavaRDD<org.apache.jena.sparql.engine.binding.Binding>
createRddFromJson(org.apache.spark.api.java.JavaSparkContext javaSparkContext, String filename, int probeCount, org.apache.jena.sparql.core.Var outputVar)
static HadoopInputData<org.apache.hadoop.io.LongWritable, com.google.gson.JsonElement, org.apache.spark.api.java.JavaRDD<com.google.gson.JsonElement>>
jsonArray(String filename, org.apache.hadoop.conf.Configuration conf)
static HadoopInputData<org.apache.hadoop.io.LongWritable, com.google.gson.JsonElement, org.apache.spark.api.java.JavaRDD<com.google.gson.JsonElement>>
jsonSequence(String filename, org.apache.hadoop.conf.Configuration conf)
static JsonDataSources.JsonProbeResult
probeJsonFormat(Reader reader, com.google.gson.Gson gson, int probeCount)
Detect whether the input is a JSON array or a sequence of JSON elements.
static JsonDataSources.JsonProbeResult
probeJsonFormat(String filename, org.apache.hadoop.conf.Configuration conf, int probeCount)
static HadoopInputData<org.apache.hadoop.io.LongWritable, com.google.gson.JsonElement, org.apache.spark.api.java.JavaRDD<com.google.gson.JsonElement>>
probeJsonInputFormat(String filename, org.apache.hadoop.conf.Configuration conf, int probeCount)
-
Constructor Details
-
JsonDataSources
public JsonDataSources()
-
-
Method Details
-
createRddFromJson
public static org.apache.spark.api.java.JavaRDD<org.apache.jena.sparql.engine.binding.Binding> createRddFromJson(org.apache.spark.api.java.JavaSparkContext javaSparkContext, String filename, int probeCount, org.apache.jena.sparql.core.Var outputVar)
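For illustration, a minimal end-to-end sketch; the master setting, file name, and variable name are assumptions rather than part of the API:

import net.sansa_stack.spark.io.json.input.JsonDataSources;
import org.apache.jena.sparql.core.Var;
import org.apache.jena.sparql.engine.binding.Binding;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class CreateRddFromJsonExample {
    public static void main(String[] args) {
        // Local Spark context; depending on the serializer configuration,
        // Kryo registration for Jena types may additionally be required.
        SparkConf conf = new SparkConf().setAppName("json-rdd").setMaster("local[*]");
        try (JavaSparkContext jsc = new JavaSparkContext(conf)) {
            Var outputVar = Var.alloc("json");
            // Probe up to 10 elements to detect the file's layout (array vs.
            // sequence), then bind each JSON element to ?json.
            JavaRDD<Binding> bindings =
                    JsonDataSources.createRddFromJson(jsc, "data.json", 10, outputVar);
            bindings.take(5).forEach(b -> System.out.println(b.get(outputVar)));
        }
    }
}
-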
probeJsonInputFormat
public static HadoopInputData<org.apache.hadoop.io.LongWritable, com.google.gson.JsonElement, org.apache.spark.api.java.JavaRDD<com.google.gson.JsonElement>> probeJsonInputFormat(String filename, org.apache.hadoop.conf.Configuration conf, int probeCount) throws IOException
Throws:
IOException
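A sketch of the probing variant; the file name is an assumption, and the import of HadoopInputData is omitted because its package is not shown on this page:

import com.google.gson.JsonElement;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.spark.api.java.JavaRDD;

// Reads up to 10 elements to decide between array and sequence layout and
// returns the matching Hadoop input description.
HadoopInputData<LongWritable, JsonElement, JavaRDD<JsonElement>> input =
        JsonDataSources.probeJsonInputFormat("data.json", new Configuration(), 10);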
-
jsonArray
public static HadoopInputData<org.apache.hadoop.io.LongWritable, com.google.gson.JsonElement, org.apache.spark.api.java.JavaRDD<com.google.gson.JsonElement>> jsonArray(String filename, org.apache.hadoop.conf.Configuration conf)
-
jsonSequence
public static HadoopInputData<org.apache.hadoop.io.LongWritable, com.google.gson.JsonElement, org.apache.spark.api.java.JavaRDD<com.google.gson.JsonElement>> jsonSequence(String filename, org.apache.hadoop.conf.Configuration conf)
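When the layout is known in advance, the probing step can be skipped. A sketch covering both jsonArray and jsonSequence, with hypothetical file names:

import com.google.gson.JsonElement;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.spark.api.java.JavaRDD;

Configuration conf = new Configuration();
// A file containing a single JSON array: [ {...}, {...}, ... ]
HadoopInputData<LongWritable, JsonElement, JavaRDD<JsonElement>> arrayInput =
        JsonDataSources.jsonArray("records-array.json", conf);
// A file containing a plain sequence of JSON elements: {...} {...} ...
HadoopInputData<LongWritable, JsonElement, JavaRDD<JsonElement>> sequenceInput =
        JsonDataSources.jsonSequence("records-sequence.json", conf);
-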
probeJsonFormat
public static JsonDataSources.JsonProbeResult probeJsonFormat(String filename, org.apache.hadoop.conf.Configuration conf, int probeCount) throws IOException
Throws:
IOException
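A sketch of the filename-based overload (file name assumed):

import org.apache.hadoop.conf.Configuration;

JsonDataSources.JsonProbeResult result =
        JsonDataSources.probeJsonFormat("data.json", new Configuration(), 10);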
-
bindingMapper
public static Function<org.apache.spark.api.java.JavaRDD<com.google.gson.JsonElement>, org.apache.spark.api.java.JavaRDD<org.apache.jena.sparql.engine.binding.Binding>> bindingMapper(org.apache.jena.sparql.core.Var outputVar)
Convert a JavaRDD<JsonElement> into a JavaRDD<Binding> by converting JSON elements into Nodes (primitive JSON values become native RDF terms) and adding them to bindings with the given outputVar.
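A sketch of applying the returned function, assuming Function here is java.util.function.Function (this page does not qualify it) and a JavaSparkContext jsc as in the createRddFromJson sketch above:

import com.google.gson.JsonElement;
import com.google.gson.JsonParser;
import java.util.Arrays;
import java.util.function.Function;
import org.apache.jena.sparql.core.Var;
import org.apache.jena.sparql.engine.binding.Binding;
import org.apache.spark.api.java.JavaRDD;

// Build a small RDD of JSON elements (illustrative data).
JavaRDD<JsonElement> elements = jsc
        .parallelize(Arrays.asList("1", "\"hello\"", "{\"a\": 1}"))
        .map(JsonParser::parseString);
// Each element is converted to a Node and bound to ?x; the primitives 1 and
// "hello" become native RDF literals per the description above.
Function<JavaRDD<JsonElement>, JavaRDD<Binding>> mapper =
        JsonDataSources.bindingMapper(Var.alloc("x"));
JavaRDD<Binding> bindings = mapper.apply(elements);
-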
probeJsonFormat
public static JsonDataSources.JsonProbeResult probeJsonFormat(Reader reader, com.google.gson.Gson gson, int probeCount) throws IOException
Detect whether the input is:
(a) a JSON array, identified by a starting open bracket [
(b) a sequence of JSON elements (with no special separator)
Parameters:
reader - A reader with mark support. The mark will be reset upon returning from this function.
gson - The Gson instance
probeCount - The number of JSON elements to read from the input stream for probing
Throws:
IOException
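A sketch of probing a local file directly; BufferedReader supplies the mark support the reader parameter requires (file name assumed):

import com.google.gson.Gson;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

try (BufferedReader reader = new BufferedReader(new FileReader("data.json"))) {
    JsonDataSources.JsonProbeResult result =
            JsonDataSources.probeJsonFormat(reader, new Gson(), 10);
    // JsonProbeResult's accessors are not listed on this page; the result
    // indicates whether the input is a JSON array or a sequence of elements.
    System.out.println(result);
} catch (IOException e) {
    e.printStackTrace();
}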
-