Package net.sansa_stack.spark.rdd.op.rdf
Class JavaRddOfDatasetsOps
java.lang.Object
net.sansa_stack.spark.rdd.op.rdf.JavaRddOfDatasetsOps
Constructor Summary
JavaRddOfDatasetsOps()
Method Summary
Modifier and Type, Method, Description

static org.apache.spark.api.java.JavaPairRDD<String,org.apache.jena.rdf.model.Model>
    flatMapToNamedModels(org.apache.spark.api.java.JavaRDD<? extends org.apache.jena.query.Dataset> rdd)

static org.apache.spark.api.java.JavaRDD<org.apache.jena.sparql.core.Quad>
    flatMapToQuads(org.apache.spark.api.java.JavaRDD<? extends org.apache.jena.query.Dataset> rdd)

static org.apache.spark.api.java.JavaRDD<org.apache.jena.graph.Triple>
    flatMapToTriples(org.apache.spark.api.java.JavaRDD<? extends org.apache.jena.query.Dataset> rdd)
    Maps a dataset to triples; emits quads from named graphs as triples by dropping the named graph.

static org.apache.spark.api.java.JavaRDD<org.aksw.jenax.arq.dataset.api.DatasetOneNg>
    groupNamedGraphsByGraphIri(org.apache.spark.api.java.JavaRDD<? extends org.apache.jena.query.Dataset> rdd, boolean distinct, boolean sortGraphsByIri, int numPartitions)
    Group all graphs by their named graph IRIs.
Constructor Details
JavaRddOfDatasetsOps
public JavaRddOfDatasetsOps()
Method Details
flatMapToQuads
public static org.apache.spark.api.java.JavaRDD<org.apache.jena.sparql.core.Quad> flatMapToQuads(org.apache.spark.api.java.JavaRDD<? extends org.apache.jena.query.Dataset> rdd)
flatMapToTriples
public static org.apache.spark.api.java.JavaRDD<org.apache.jena.graph.Triple> flatMapToTriples(org.apache.spark.api.java.JavaRDD<? extends org.apache.jena.query.Dataset> rdd)
Maps each dataset to its triples. Quads from named graphs are emitted as triples by dropping the graph name.
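The "dropping the named graph" step can be illustrated without Spark or Jena. The following is only a minimal stdlib model of the described behavior, not the SANSA implementation: the class QuadsToTriplesSketch, its helpers, and the encoding of a quad as a four-element list are all hypothetical. It keeps duplicates, since the description above does not mention deduplication.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stdlib-only model of the documented behavior; NOT the SANSA implementation.
class QuadsToTriplesSketch {

    // Hypothetical encoding: a quad is the list [graph, subject, predicate, object].
    static List<String> quad(String g, String s, String p, String o) {
        return Arrays.asList(g, s, p, o);
    }

    // Drop the graph component of every quad, keeping [subject, predicate, object].
    // Assumption: duplicates are kept, because the description does not mention
    // deduplication for this method.
    static List<List<String>> toTriples(List<List<String>> quads) {
        List<List<String>> triples = new ArrayList<>();
        for (List<String> q : quads) {
            triples.add(new ArrayList<>(q.subList(1, 4)));
        }
        return triples;
    }

    public static void main(String[] args) {
        List<List<String>> quads = Arrays.asList(
            quad("urn:g1", "urn:s", "urn:p", "a"),
            quad("urn:g2", "urn:s", "urn:p", "a"));
        // Both quads yield the same triple once the graph name is dropped.
        System.out.println(toTriples(quads));
    }
}
```

In this model, two quads that differ only in their graph name become indistinguishable triples, which is the information loss the method description implies.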
flatMapToNamedModels
public static org.apache.spark.api.java.JavaPairRDD<String,org.apache.jena.rdf.model.Model> flatMapToNamedModels(org.apache.spark.api.java.JavaRDD<? extends org.apache.jena.query.Dataset> rdd)
groupNamedGraphsByGraphIri
public static org.apache.spark.api.java.JavaRDD<org.aksw.jenax.arq.dataset.api.DatasetOneNg> groupNamedGraphsByGraphIri(org.apache.spark.api.java.JavaRDD<? extends org.apache.jena.query.Dataset> rdd, boolean distinct, boolean sortGraphsByIri, int numPartitions)
Groups all graphs by their named graph IRIs. This effectively merges the triples of all named graphs that share the same IRI and removes duplicate triples. Default graphs are ignored, so their triples are lost.
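The grouping semantics described above (merge by graph IRI, remove duplicate triples, ignore the default graph) can be modeled with plain collections. This is a hedged sketch only, not the SANSA implementation: the class GroupByGraphIriSketch, the list-based quad encoding, and the use of a null graph name for the default graph are assumptions made for illustration. The input is a flat list of quads standing in for the contents of all datasets. The parameters distinct, sortGraphsByIri and numPartitions are not modeled because their exact effect is not described here; the sorted map is used only to make the output deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Stdlib-only model of the documented semantics; NOT the SANSA implementation.
class GroupByGraphIriSketch {

    // Hypothetical encoding: [graph, subject, predicate, object];
    // a null graph stands for the default graph.
    static List<String> quad(String g, String s, String p, String o) {
        return Arrays.asList(g, s, p, o);
    }

    // Merge the triples of all named graphs that share an IRI.
    // A Set per graph removes duplicate triples; default-graph quads are skipped.
    static Map<String, Set<List<String>>> groupByGraphIri(List<List<String>> quads) {
        Map<String, Set<List<String>>> byGraph = new TreeMap<>();
        for (List<String> q : quads) {
            String graph = q.get(0);
            if (graph == null) {
                continue; // default graph: ignored, its triples are lost
            }
            byGraph.computeIfAbsent(graph, k -> new LinkedHashSet<>())
                   .add(new ArrayList<>(q.subList(1, 4)));
        }
        return byGraph;
    }

    public static void main(String[] args) {
        List<List<String>> quads = Arrays.asList(
            quad("urn:g2", "urn:s", "urn:p", "x"),
            quad("urn:g1", "urn:s", "urn:p", "a"),
            quad("urn:g1", "urn:s", "urn:p", "a"),  // duplicate, removed
            quad("urn:g1", "urn:s", "urn:p", "b"),
            quad(null, "urn:s", "urn:p", "lost"));  // default graph, dropped
        System.out.println(groupByGraphIri(quads));
    }
}
```

Each map entry in this model corresponds to one single-named-graph dataset in the result, which matches the DatasetOneNg element type of the returned RDD.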