Class RddRdfWriterSettings<SELF extends RddRdfWriterSettings>

java.lang.Object
net.sansa_stack.spark.io.rdf.output.RddWriterSettings<SELF>
net.sansa_stack.spark.io.rdf.output.RddRdfWriterSettings<SELF>
Direct Known Subclasses:
RddRdfWriter, RddRdfWriterFactory

public class RddRdfWriterSettings<SELF extends RddRdfWriterSettings> extends RddWriterSettings<SELF>
  • Field Details

    • globalPrefixMapping

      protected org.apache.jena.shared.PrefixMapping globalPrefixMapping
    • outputFormat

      protected org.apache.jena.riot.RDFFormat outputFormat
    • mapQuadsToTriplesForTripleLangs

      protected boolean mapQuadsToTriplesForTripleLangs
      Whether to convert quads to triples if a triple-based output format is requested.
    • deferOutputForUsedPrefixes

      protected long deferOutputForUsedPrefixes
      Only for console output: instead of writing tuples out immediately, collect up to this number of tuples in order to derive the used prefixes. Upon reaching this threshold, print all prefixes seen so far, emit the held-back data, and write any further data immediately.
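The deferred-output behavior described for deferOutputForUsedPrefixes can be pictured with a small, library-free sketch. Everything in it is an assumption made for illustration: the class DeferredPrefixOutputSketch, the method emit, the representation of tuples as whitespace-separated prefixed names, and the "@prefix" header lines are hypothetical and are not part of the SANSA or Jena API. The sketch also assumes that held-back data is flushed when the input ends before the threshold is reached, which the description above does not specify.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the SANSA implementation.
class DeferredPrefixOutputSketch {

    // Record the prefix part of every "prefix:local" token in a tuple.
    static void collectPrefixes(String tuple, Set<String> seen) {
        for (String token : tuple.trim().split("\\s+")) {
            int i = token.indexOf(':');
            if (i > 0) {
                seen.add(token.substring(0, i));
            }
        }
    }

    // Write the seen prefixes first, then the held-back tuples.
    static void flush(List<String> out, Set<String> seen, List<String> held) {
        for (String prefix : seen) {
            out.add("@prefix " + prefix + ":");
        }
        out.addAll(held);
        held.clear();
    }

    // Hold back up to 'threshold' tuples, then emit prefixes, the held-back
    // tuples, and every later tuple immediately.
    static List<String> emit(List<String> tuples, long threshold) {
        List<String> out = new ArrayList<>();
        List<String> held = new ArrayList<>();
        Set<String> seen = new LinkedHashSet<>();
        boolean flushed = false;
        for (String tuple : tuples) {
            if (flushed) {
                out.add(tuple); // past the threshold: no more buffering
                continue;
            }
            held.add(tuple);
            collectPrefixes(tuple, seen);
            if (held.size() >= threshold) {
                flush(out, seen, held);
                flushed = true;
            }
        }
        if (!flushed) {
            // Assumption: input ended before the threshold, so flush what we have.
            flush(out, seen, held);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tuples = List.of(
                "ex:a foaf:knows ex:b",
                "ex:b rdf:type foaf:Person",
                "ex:c foaf:name ex:d");
        emit(tuples, 2).forEach(System.out::println);
    }
}
```

With a threshold of 2, the prefixes derived from the first two tuples are written before any data. A prefix that first appears in a later tuple would not be part of that header, which is the trade-off a larger threshold reduces.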
  • Constructor Details

    • RddRdfWriterSettings

      public RddRdfWriterSettings()
  • Method Details

    • isMapQuadsToTriplesForTripleLangs

      public boolean isMapQuadsToTriplesForTripleLangs()
    • self

      protected SELF self()
      Overrides:
      self in class RddWriterSettings<SELF extends RddRdfWriterSettings>
    • mutate

      public SELF mutate(Consumer<? super SELF> action)
      Pass this object to a consumer. Useful to conditionally configure this object without breaking the fluent chain:
          rdd.configureSave().mutate(self -> { if (condition) { self.setX(); }}).run();
       
      Parameters:
      action - the consumer that receives this object
      Returns:
      this object, so that the fluent chain can continue
    • configureFrom

      public SELF configureFrom(RddRdfWriterSettings<?> other)
    • setMapQuadsToTriplesForTripleLangs

      public SELF setMapQuadsToTriplesForTripleLangs(boolean mapQuadsToTriplesForTripleLangs)
      Whether to convert quads to triples if a triple-based output format is requested. By default, Jena discards any quad outside of the default graph when writing to a triple format. Setting this flag to true maps each quad in a named graph to the default graph.
    • getGlobalPrefixMapping

      public org.apache.jena.shared.PrefixMapping getGlobalPrefixMapping()
    • setGlobalPrefixMapping

      public SELF setGlobalPrefixMapping(org.apache.jena.shared.PrefixMapping globalPrefixMapping)
      Set a prefix mapping to be used "globally" across all partitions.
      Parameters:
      globalPrefixMapping - the prefix mapping to use across all partitions
      Returns:
      this object, so that the fluent chain can continue
    • setGlobalPrefixMapping

      public SELF setGlobalPrefixMapping(Map<String,String> globalPrefixMap)
    • getOutputFormat

      public org.apache.jena.riot.RDFFormat getOutputFormat()
    • setOutputFormat

      public SELF setOutputFormat(org.apache.jena.riot.RDFFormat format)
    • setOutputFormat

      public SELF setOutputFormat(String formatName)
      Throws an exception if no format with the given name is found.
    • getFallbackOutputFormat

      public org.apache.jena.riot.RDFFormat getFallbackOutputFormat()
    • isPartitionsAsIndependentFiles

      public boolean isPartitionsAsIndependentFiles()
      Overrides:
      isPartitionsAsIndependentFiles in class RddWriterSettings<SELF extends RddRdfWriterSettings>
    • setPartitionsAsIndependentFiles

      public SELF setPartitionsAsIndependentFiles(boolean partitionsAsIndependentFiles)
      Overrides:
      setPartitionsAsIndependentFiles in class RddWriterSettings<SELF extends RddRdfWriterSettings>
    • setDeferOutputForUsedPrefixes

      public SELF setDeferOutputForUsedPrefixes(long deferOutputForUsedPrefixes)
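The SELF type parameter, self(), and mutate(...) together form a self-typed fluent settings pattern: setters declared in a base class return the concrete subclass type, so a chain never loses access to subclass methods. The following is a minimal, library-free sketch of that pattern. The classes BaseSettings, RdfSettings, and Writer and the method configure are hypothetical stand-ins, not the SANSA classes. The sketch uses a recursive bound (SELF extends BaseSettings<SELF>) where the signatures above use a raw bound, and it assumes that mutate returns the object it was called on.

```java
import java.util.function.Consumer;

// Hypothetical illustration of a self-typed fluent settings hierarchy.
class FluentSettingsSketch {

    // Stand-in for a generic writer-settings base class.
    static class BaseSettings<SELF extends BaseSettings<SELF>> {
        protected boolean partitionsAsIndependentFiles;

        @SuppressWarnings("unchecked")
        protected SELF self() {
            return (SELF) this; // safe as long as subclasses bind SELF to themselves
        }

        public SELF setPartitionsAsIndependentFiles(boolean value) {
            this.partitionsAsIndependentFiles = value;
            return self();
        }

        // Hand this object to a consumer, then continue the chain.
        public SELF mutate(Consumer<? super SELF> action) {
            action.accept(self());
            return self();
        }
    }

    // Stand-in for RDF-specific settings layered on top of the base.
    static class RdfSettings<SELF extends RdfSettings<SELF>> extends BaseSettings<SELF> {
        protected boolean mapQuadsToTriplesForTripleLangs;
        protected long deferOutputForUsedPrefixes;

        public SELF setMapQuadsToTriplesForTripleLangs(boolean value) {
            this.mapQuadsToTriplesForTripleLangs = value;
            return self();
        }

        public SELF setDeferOutputForUsedPrefixes(long value) {
            this.deferOutputForUsedPrefixes = value;
            return self();
        }
    }

    // Concrete leaf type that binds SELF to itself.
    static final class Writer extends RdfSettings<Writer> {
    }

    // Conditional configuration without breaking the chain.
    static Writer configure(boolean consoleOutput) {
        return new Writer()
                .setPartitionsAsIndependentFiles(false) // base-class setter, still returns Writer
                .mutate(w -> {
                    if (consoleOutput) {
                        w.setDeferOutputForUsedPrefixes(100);
                    }
                })
                .setMapQuadsToTriplesForTripleLangs(true);
    }

    public static void main(String[] args) {
        Writer w = configure(true);
        System.out.println(w.deferOutputForUsedPrefixes + " " + w.mapQuadsToTriplesForTripleLangs);
    }
}
```

Because every setter returns SELF rather than the declaring class, a call to a base-class setter can be followed directly by a subclass setter, and mutate gives a place for conditional logic inside the same expression.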