org.apache.jena.riot.lang
Class PipedRDFIterator<T>

java.lang.Object
  extended by org.apache.jena.riot.lang.PipedRDFIterator<T>
Type Parameters:
T - The type of the RDF primitive; should be one of Triple, Quad, or Tuple<Node>
All Implemented Interfaces:
Iterator<T>, Closeable

public class PipedRDFIterator<T>
extends Object
implements Iterator<T>, Closeable

A PipedRDFIterator should be connected to a PipedRDFStream implementation; the piped iterator then provides whatever RDF primitives are written to the PipedRDFStream.

Typically, data is read from a PipedRDFIterator by one thread (the consumer) and data is written to the corresponding PipedRDFStream by some other thread (the producer). Attempting to use both objects from a single thread is not recommended, as it may deadlock the thread. The PipedRDFIterator contains a buffer, decoupling read operations from write operations, within limits.

Inspired by Java's PipedInputStream and PipedOutputStream.
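
The two-thread pattern described above can be illustrated with the JDK piped streams that inspired this class. The following is a minimal sketch using only JDK classes; it does not use Jena, and the class and method names (PipedPairSketch, roundTrip) are hypothetical. A producer thread writes to one end while the calling thread consumes from the other.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Illustration only: JDK piped streams, not Jena's PipedRDFIterator. */
class PipedPairSketch {
    /** Writes text on a producer thread and reads it back on the calling (consumer) thread. */
    static String roundTrip(String text) {
        try {
            PipedInputStream in = new PipedInputStream();       // consumer end, holds the buffer
            PipedOutputStream out = new PipedOutputStream(in);  // producer end, connected to it

            Thread producer = new Thread(() -> {
                // Closing the writer is what lets the reader see end of stream.
                try (PipedOutputStream o = out) {
                    o.write(text.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            producer.start();

            // Consume on this thread while the producer runs on its own.
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                received.write(b);
            }
            producer.join();
            in.close();
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Because the pipe's buffer is bounded, writing more data than it can hold from the same thread that is supposed to read it would block forever, which is the single-thread deadlock the description warns about.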

See Also:
PipedTriplesStream, PipedQuadsStream, PipedTuplesStream

Field Summary
static int DEFAULT_BUFFER_SIZE
          Constant for default buffer size
static int DEFAULT_MAX_POLLS
          Constant for max number of failed poll attempts before the producer will be declared as dead
static int DEFAULT_POLL_TIMEOUT
          Constant for default poll timeout in milliseconds, used to stop the consumer deadlocking in certain circumstances
 
Constructor Summary
PipedRDFIterator()
          Creates a new piped RDF iterator with the default buffer size of DEFAULT_BUFFER_SIZE.
PipedRDFIterator(int bufferSize)
          Creates a new piped RDF iterator
PipedRDFIterator(int bufferSize, boolean fair)
          Creates a new piped RDF iterator
PipedRDFIterator(int bufferSize, boolean fair, int pollTimeout, int maxPolls)
          Creates a new piped RDF iterator
 
Method Summary
 void close()
          May be called by the consumer when it has finished reading from the iterator. If the producer thread has not finished, it will receive an error the next time it tries to write to the iterator.
 String getBaseIri()
          Gets the most recently seen Base IRI
 PrefixMap getPrefixes()
          Gets the prefix map which contains the prefixes seen so far in the stream
 boolean hasNext()
           
 T next()
           
 void remove()
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_BUFFER_SIZE

public static final int DEFAULT_BUFFER_SIZE
Constant for default buffer size

See Also:
Constant Field Values

DEFAULT_POLL_TIMEOUT

public static final int DEFAULT_POLL_TIMEOUT
Constant for default poll timeout in milliseconds, used to stop the consumer deadlocking in certain circumstances

See Also:
Constant Field Values

DEFAULT_MAX_POLLS

public static final int DEFAULT_MAX_POLLS
Constant for max number of failed poll attempts before the producer will be declared as dead

See Also:
Constant Field Values

Constructor Detail

PipedRDFIterator

public PipedRDFIterator()
Creates a new piped RDF iterator with the default buffer size of DEFAULT_BUFFER_SIZE.

Buffer size must be chosen carefully to avoid performance problems. If the buffer size is too low, many calls will block and it will take longer to consume the data from the iterator. For best performance the buffer size should be at least 10% of the expected input size, though you may need to tune this depending on how fast your consumer thread is.


PipedRDFIterator

public PipedRDFIterator(int bufferSize)
Creates a new piped RDF iterator

Buffer size must be chosen carefully to avoid performance problems. If the buffer size is too low, many calls will block and it will take longer to consume the data from the iterator. For best performance the buffer size should be roughly 10% of the expected input size, though you may need to tune this depending on how fast your consumer thread is.

Parameters:
bufferSize - Buffer size
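
The 10% guideline can be turned into a small calculation. The helper below is a hypothetical sketch, not part of Jena; it assumes the expected input size is measured as a count of RDF primitives (for example, triples) and clamps the result to a valid positive int.

```java
/** Hypothetical helper, not part of Jena. */
class BufferSizing {
    /**
     * Returns about 10% of the expected number of items,
     * never less than 1 and never more than Integer.MAX_VALUE.
     */
    static int tenPercentOf(long expectedItems) {
        long tenth = expectedItems / 10;
        return (int) Math.max(1, Math.min(tenth, Integer.MAX_VALUE));
    }
}
```

For an input of about one million items this suggests a buffer size of 100000, which could then be passed as the bufferSize argument of the constructor and tuned from there.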

PipedRDFIterator

public PipedRDFIterator(int bufferSize,
                        boolean fair)
Creates a new piped RDF iterator

Buffer size must be chosen carefully to avoid performance problems. If the buffer size is too low, many calls will block and it will take longer to consume the data from the iterator. For best performance the buffer size should be roughly 10% of the expected input size, though you may need to tune this depending on how fast your consumer thread is.

The fair parameter controls whether the locking policy used for the buffer is fair. When enabled, this reduces throughput but also reduces the chance of thread starvation. It likely only needs to be set to true if there will be multiple consumers.

Parameters:
bufferSize - Buffer size
fair - Whether the buffer should use a fair locking policy

PipedRDFIterator

public PipedRDFIterator(int bufferSize,
                        boolean fair,
                        int pollTimeout,
                        int maxPolls)
Creates a new piped RDF iterator

Buffer size must be chosen carefully to avoid performance problems. If the buffer size is too low, many calls will block and it will take longer to consume the data from the iterator. For best performance the buffer size should be roughly 10% of the expected input size, though you may need to tune this depending on how fast your consumer thread is.

The fair parameter controls whether the locking policy used for the buffer is fair. When enabled, this reduces throughput but also reduces the chance of thread starvation. It likely only needs to be set to true if there will be multiple consumers.

The pollTimeout parameter controls how long each poll attempt waits for data to be produced. This prevents the consumer thread from blocking indefinitely and allows it to detect potential deadlock conditions (e.g. a dead producer thread, or another consumer having closed the iterator) and error out accordingly. It is unlikely that you will ever need to adjust this from the default value provided by DEFAULT_POLL_TIMEOUT.

The maxPolls parameter controls how many poll attempts will be made by a single consumer thread within the context of a single call to hasNext() before the iterator declares the producer to be dead and errors out accordingly. You may need to adjust this if you have a slow producer thread or many consumer threads.
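
To make the interaction of these parameters concrete, here is a simplified sketch of a bounded buffer that is polled with a timeout. It is an illustration under assumptions, not Jena's implementation: the class PolledBufferSketch, its methods, the use of java.util.concurrent.ArrayBlockingQueue, and the IllegalStateException are all stand-ins chosen for the example.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Simplified stand-in for a polled, bounded buffer; not Jena's implementation. */
class PolledBufferSketch<T> {
    private static final Object END = new Object(); // end-of-stream marker
    private final BlockingQueue<Object> queue;
    private final int pollTimeoutMillis;
    private final int maxPolls;
    private Object slot; // fetched by hasNext(), not yet returned by next()

    PolledBufferSketch(int bufferSize, boolean fair, int pollTimeoutMillis, int maxPolls) {
        // With fair = true, blocked threads are served in arrival order.
        this.queue = new ArrayBlockingQueue<>(bufferSize, fair);
        this.pollTimeoutMillis = pollTimeoutMillis;
        this.maxPolls = maxPolls;
    }

    /** Producer side: blocks while the buffer is full. */
    void put(T item) { blockingPut(item); }

    /** Producer side: signals that no more items will arrive. */
    void finish() { blockingPut(END); }

    private void blockingPut(Object o) {
        try {
            queue.put(o);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Consumer side: polls up to maxPolls times, then presumes the producer dead. */
    boolean hasNext() {
        for (int attempt = 0; slot == null && attempt < maxPolls; attempt++) {
            try {
                slot = queue.poll(pollTimeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        if (slot == null) {
            throw new IllegalStateException("Producer presumed dead after " + maxPolls + " polls");
        }
        return slot != END;
    }

    @SuppressWarnings("unchecked")
    T next() {
        if (!hasNext()) throw new NoSuchElementException();
        T item = (T) slot;
        slot = null;
        return item;
    }
}
```

In this sketch a single hasNext() call waits at most pollTimeout × maxPolls milliseconds before giving up, which is why a slow producer may call for a larger maxPolls rather than a longer timeout.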

Parameters:
bufferSize - Buffer size
fair - Whether the buffer should use a fair locking policy
pollTimeout - Poll timeout in milliseconds
maxPolls - Max poll attempts

Method Detail

hasNext

public boolean hasNext()
Specified by:
hasNext in interface Iterator<T>

next

public T next()
Specified by:
next in interface Iterator<T>

remove

public void remove()
Specified by:
remove in interface Iterator<T>

getBaseIri

public String getBaseIri()
Gets the most recently seen Base IRI

Returns:
Base IRI

getPrefixes

public PrefixMap getPrefixes()
Gets the prefix map which contains the prefixes seen so far in the stream

Returns:
Prefix Map

close

public void close()
May be called by the consumer when it has finished reading from the iterator. If the producer thread has not finished, it will receive an error the next time it tries to write to the iterator.

Specified by:
close in interface Closeable
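
The JDK piped streams that inspired this class behave analogously, which makes the contract easy to see in isolation. The sketch below uses only JDK classes, not Jena; the class and method names are hypothetical, and the exception type a Jena producer receives may differ from the IOException shown here.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;

/** Illustration only: JDK piped streams, not Jena's PipedRDFIterator. */
class CloseSketch {
    /** Returns true if a write made after the reader has closed fails with an IOException. */
    static boolean writeFailsAfterReaderClose() {
        try {
            PipedInputStream in = new PipedInputStream();
            PipedOutputStream out = new PipedOutputStream(in);
            out.write(1);     // accepted: reader still open, buffer has room
            in.close();       // consumer declares it is finished
            try {
                out.write(2); // producer's next write
                return false;
            } catch (IOException expected) {
                return true;  // producer learns the consumer has gone away
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A producer should therefore be prepared for its write calls to fail once the consumer has closed its end, and treat that as a signal to stop producing.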


Licenced under the Apache License, Version 2.0