Class HugeGraph
- java.lang.Object
-
- org.neo4j.gds.core.huge.HugeGraph
-
- All Implemented Interfaces:
BatchNodeIterable, CSRGraph, Degrees, Graph, IdMapping, NodeIterator, NodeMapping, NodePropertyContainer, RelationshipAccess, RelationshipIterator, RelationshipPredicate, RelationshipProperties
public class HugeGraph extends java.lang.Object implements CSRGraph
HugeGraph contains two array-like data structures. The adjacency data is stored in a ByteArray, which is a byte[] addressable by long indices and capable of storing about 2^46 (~70k bn) bytes, or 64 TiB. The bytes are stored in byte[] pages of 32 KiB each.
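The long-addressable, paged byte array described above can be sketched as follows. This is a minimal illustration, not the GDS implementation: the class name PagedBytes and all of its members are hypothetical. With an int page index and 2^15-byte pages, such a layout can address up to 2^31 x 2^15 = 2^46 bytes, which is consistent with the capacity stated above.

```java
// Illustrative sketch only: PagedBytes is a hypothetical name, not a GDS class.
final class PagedBytes {
    static final int PAGE_SHIFT = 15;                // 2^15 bytes = 32 KiB per page
    static final int PAGE_SIZE = 1 << PAGE_SHIFT;
    static final long PAGE_MASK = PAGE_SIZE - 1;

    private final byte[][] pages;

    PagedBytes(long capacity) {
        // Round up so that a partially used last page is still allocated.
        int pageCount = (int) ((capacity + PAGE_MASK) >>> PAGE_SHIFT);
        pages = new byte[pageCount][PAGE_SIZE];
    }

    // The high bits of the long index select the page.
    static int pageIndex(long index) {
        return (int) (index >>> PAGE_SHIFT);
    }

    // The low 15 bits select the position inside the page.
    static int indexInPage(long index) {
        return (int) (index & PAGE_MASK);
    }

    byte get(long index) {
        return pages[pageIndex(index)][indexInPage(index)];
    }

    void set(long index, byte value) {
        pages[pageIndex(index)][indexInPage(index)] = value;
    }
}
```

Splitting the index with a shift and a mask keeps every lookup at two array accesses, regardless of the total size.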
The adjacency data is in the format:
degree ~ targetId_1 ~ targetId_2 ~ ... ~ targetId_n
The degree is stored as a full-sized 4-byte int (the Neo4j kernel API returns an int for Nodes.countAll(org.neo4j.internal.kernel.api.NodeCursor)). Every target ID is first sorted, then delta encoded, and finally written as a variable-length vlong. The delta encoding does not write the actual value but only the difference to the previous value, which plays very nicely with the vlong encoding.
The second data structure is a LongArray, which is a long[] addressable by longs and capable of storing about 2^43 (~9k bn) longs, or 64 TiB worth of 64-bit longs. The data is the offset address into the aforementioned adjacency array; the index is the respective source node id.
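The adjacency format described above (a 4-byte degree followed by sorted, delta-encoded vlongs) can be sketched as follows. This is an illustration under stated assumptions, not the GDS implementation: the class AdjacencySketch is hypothetical, the big-endian byte order of the degree is an assumption, and node ids are assumed to be non-negative.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Illustrative sketch only: AdjacencySketch is a hypothetical name, not a GDS class.
final class AdjacencySketch {

    // Encodes one adjacency list: 4-byte degree, then sorted, delta-encoded vlong targets.
    static byte[] encode(long[] targets) {
        long[] sorted = targets.clone();
        Arrays.sort(sorted);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int degree = sorted.length;
        // Degree as a full 4-byte int (big-endian is an assumption of this sketch).
        out.write(degree >>> 24);
        out.write(degree >>> 16);
        out.write(degree >>> 8);
        out.write(degree);
        long previous = 0L;
        for (long target : sorted) {
            long delta = target - previous; // small for sorted, clustered ids
            previous = target;
            // vlong: 7 payload bits per byte, high bit set while more bytes follow.
            while ((delta & ~0x7FL) != 0L) {
                out.write((int) ((delta & 0x7FL) | 0x80L));
                delta >>>= 7;
            }
            out.write((int) delta);
        }
        return out.toByteArray();
    }

    // Reads the degree at the given offset, then reads that many vlongs and undoes the deltas.
    static long[] decode(byte[] data, int offset) {
        int pos = offset;
        int degree = ((data[pos++] & 0xFF) << 24)
            | ((data[pos++] & 0xFF) << 16)
            | ((data[pos++] & 0xFF) << 8)
            | (data[pos++] & 0xFF);
        long[] targets = new long[degree];
        long previous = 0L;
        for (int i = 0; i < degree; i++) {
            long delta = 0L;
            int shift = 0;
            byte b;
            do {
                b = data[pos++];
                delta |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta;
            targets[i] = previous;
        }
        return targets;
    }
}
```

For the targets 1000, 3 and 1003, the sorted deltas are 3, 997 and 3, so the three targets take four bytes after the degree instead of 24 bytes as plain longs.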
To traverse all nodes, for each node first read its offset from the LongArray, then read 4 bytes from the ByteArray, starting at that offset, as the degree, then read degree vlongs as target IDs.
Reading the degree from the offset position not only removes the need for the offset array to be sorted but also allows the adjacency array to be sparse. This fact is used during the import: each thread pre-allocates a local chunk of some pages (512 KiB) and gives access to this data during import. Synchronization between threads only has to happen when a new chunk has to be pre-allocated. This is similar to what most garbage collectors do with TLAB allocations.
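The chunked pre-allocation described above can be sketched with a shared atomic counter and a per-thread bump pointer. This is a minimal illustration, not the GDS importer: ChunkAllocatorSketch, LocalAllocator and their methods are hypothetical names, and only address bookkeeping is modelled, not the backing pages.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: all names here are hypothetical, not GDS classes.
final class ChunkAllocatorSketch {
    static final long CHUNK_SIZE = 512 * 1024; // 512 KiB, as in the description

    // Shared high-water mark: the only place where threads synchronize.
    private final AtomicLong nextChunkStart = new AtomicLong();

    // One instance per importing thread; not thread-safe by itself.
    final class LocalAllocator {
        private long top;   // next free address in the local chunk
        private long limit; // end (exclusive) of the local chunk

        // Returns the start address of a region of the given size.
        long allocate(long size) {
            if (size <= 0 || size > CHUNK_SIZE) {
                throw new IllegalArgumentException("size must be in (0, CHUNK_SIZE]");
            }
            if (top + size > limit) {
                // Local chunk exhausted: claim a fresh one. The unused tail of the
                // old chunk stays empty, which is why the array may be sparse.
                top = nextChunkStart.getAndAdd(CHUNK_SIZE);
                limit = top + CHUNK_SIZE;
            }
            long address = top;
            top += size;
            return address;
        }
    }

    LocalAllocator newLocalAllocator() {
        return new LocalAllocator();
    }
}
```

Because every adjacency list is found through its stored offset, the gaps left at the end of each chunk do not need to be compacted.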
- See Also:
- more about vlong, more about TLAB allocation
-
-
Nested Class Summary
Nested Classes
Modifier and Type Class Description
static class
HugeGraph.GetTargetConsumer
-
Nested classes/interfaces inherited from interface org.neo4j.gds.api.BatchNodeIterable
BatchNodeIterable.IdIterable, BatchNodeIterable.IdIterator
-
Nested classes/interfaces inherited from interface org.neo4j.gds.api.NodeMapping
NodeMapping.NodeLabelConsumer
-
-
Field Summary
Fields
Modifier and Type Field Description
protected AdjacencyList
adjacency
protected AllocationTracker
allocationTracker
protected double
defaultPropertyValue
protected boolean
hasRelationshipProperty
protected NodeMapping
idMapping
protected boolean
isMultiGraph
static double
NO_PROPERTY_VALUE
protected java.util.Map<java.lang.String,NodeProperties>
nodeProperties
protected org.neo4j.gds.Orientation
orientation
protected @Nullable AdjacencyProperties
properties
protected long
relationshipCount
protected org.neo4j.gds.api.schema.GraphSchema
schema
-
Fields inherited from interface org.neo4j.gds.api.IdMapping
NOT_FOUND, START_NODE_ID
-
-
Constructor Summary
Constructors
Modifier Constructor Description
protected
HugeGraph(NodeMapping idMapping, org.neo4j.gds.api.schema.GraphSchema schema, java.util.Map<java.lang.String,NodeProperties> nodeProperties, long relationshipCount, @NotNull AdjacencyList adjacency, boolean hasRelationshipProperty, double defaultPropertyValue, @Nullable AdjacencyProperties properties, org.neo4j.gds.Orientation orientation, boolean isMultiGraph, AllocationTracker allocationTracker)
-
Method Summary
Modifier and Type Method Description
java.util.Optional<NodeFilteredGraph>
asNodeFilteredGraph()
If this graph is created using a node label filter, this will return a NodeFilteredGraph that represents the node set used in this graph.
java.util.Set<org.neo4j.gds.NodeLabel>
availableNodeLabels()
java.util.Set<java.lang.String>
availableNodeProperties()
java.util.Collection<PrimitiveLongIterable>
batchIterables(long batchSize)
void
canRelease(boolean canRelease)
HugeGraph
concurrentCopy()
boolean
contains(long nodeId)
Returns true iff the nodeId is mapped, otherwise false.
static HugeGraph
create(NodeMapping nodes, org.neo4j.gds.api.schema.GraphSchema schema, java.util.Map<java.lang.String,NodeProperties> nodeProperties, Relationships.Topology topology, java.util.Optional<Relationships.Properties> maybeProperties, AllocationTracker allocationTracker)
int
degree(long node)
int
degreeWithoutParallelRelationships(long nodeId)
Much slower than just degree() because it may have to look up all relationships.
boolean
exists(long sourceNodeId, long targetNodeId)
O(n)!
void
forEachNode(java.util.function.LongPredicate consumer)
Iterate over each nodeId.
void
forEachNodeLabel(long nodeId, NodeMapping.NodeLabelConsumer consumer)
void
forEachRelationship(long nodeId, double fallbackValue, RelationshipWithPropertyConsumer consumer)
Calls the given consumer function for every relationship of a given node.
void
forEachRelationship(long nodeId, RelationshipConsumer consumer)
Calls the given consumer function for every relationship of a given node.
long
getTarget(long sourceNodeId, long index)
boolean
hasLabel(long nodeId, org.neo4j.gds.NodeLabel label)
boolean
hasRelationshipProperty()
long
highestNeoId()
NodeMapping
idMap()
boolean
isMultiGraph()
Whether the graph is guaranteed to have no parallel relationships.
boolean
isUndirected()
long
nodeCount()
Number of mapped nodeIds.
PrimitiveLongIterator
nodeIterator()
java.util.Set<org.neo4j.gds.NodeLabel>
nodeLabels(long nodeId)
java.util.Map<java.lang.String,NodeProperties>
nodeProperties()
NodeProperties
nodeProperties(java.lang.String propertyKey)
Return the property values for a property key. NOTE: Avoid using this on the hot path; favor caching the NodeProperties object when possible.
long
relationshipCount()
double
relationshipProperty(long sourceNodeId, long targetNodeId)
Returns the property value for a relationship defined by its source and target nodes.
double
relationshipProperty(long sourceId, long targetId, double fallbackValue)
Gets the value of the property on the relationship between the source and target node ids.
Relationships
relationships()
java.util.Map<org.neo4j.gds.RelationshipType,Relationships.Topology>
relationshipTopologies()
Relationships.Topology
relationshipTopology()
Graph
relationshipTypeFilteredGraph(java.util.Set<org.neo4j.gds.RelationshipType> relationshipTypes)
void
releaseProperties()
Release only the properties associated with that graph.
void
releaseTopology()
Release only the topological data associated with that graph.
long
rootNodeCount()
Number of mapped node ids in the root mapping.
NodeMapping
rootNodeMapping()
Returns the original node mapping if the current node mapping is filtered, otherwise it returns itself.
org.neo4j.gds.api.schema.GraphSchema
schema()
java.util.stream.Stream<RelationshipCursor>
streamRelationships(long nodeId, double fallbackValue)
long
toMappedNodeId(long nodeId)
Map original nodeId to inner nodeId.
long
toOriginalNodeId(long nodeId)
Map inner nodeId back to original nodeId.
long
toRootNodeId(long nodeId)
Maps an internal id to its root internal node id.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.neo4j.gds.api.IdMapping
cloneIdMapping, safeToMappedNodeId
-
Methods inherited from interface org.neo4j.gds.api.NodeMapping
withFilteredLabels
-
-
-
-
Field Detail
-
NO_PROPERTY_VALUE
public static final double NO_PROPERTY_VALUE
- See Also:
- Constant Field Values
-
idMapping
protected final NodeMapping idMapping
-
allocationTracker
protected final AllocationTracker allocationTracker
-
schema
protected final org.neo4j.gds.api.schema.GraphSchema schema
-
nodeProperties
protected final java.util.Map<java.lang.String,NodeProperties> nodeProperties
-
orientation
protected final org.neo4j.gds.Orientation orientation
-
relationshipCount
protected final long relationshipCount
-
adjacency
protected AdjacencyList adjacency
-
defaultPropertyValue
protected final double defaultPropertyValue
-
properties
@Nullable protected AdjacencyProperties properties
-
hasRelationshipProperty
protected final boolean hasRelationshipProperty
-
isMultiGraph
protected final boolean isMultiGraph
-
-
Constructor Detail
-
HugeGraph
protected HugeGraph(NodeMapping idMapping, org.neo4j.gds.api.schema.GraphSchema schema, java.util.Map<java.lang.String,NodeProperties> nodeProperties, long relationshipCount, @NotNull AdjacencyList adjacency, boolean hasRelationshipProperty, double defaultPropertyValue, @Nullable AdjacencyProperties properties, org.neo4j.gds.Orientation orientation, boolean isMultiGraph, AllocationTracker allocationTracker)
-
-
Method Detail
-
create
public static HugeGraph create(NodeMapping nodes, org.neo4j.gds.api.schema.GraphSchema schema, java.util.Map<java.lang.String,NodeProperties> nodeProperties, Relationships.Topology topology, java.util.Optional<Relationships.Properties> maybeProperties, AllocationTracker allocationTracker)
-
nodeCount
public long nodeCount()
Description copied from interface: IdMapping
Number of mapped nodeIds.
-
rootNodeCount
public long rootNodeCount()
Description copied from interface: IdMapping
Number of mapped node ids in the root mapping. This is necessary for nested (filtered) id mappings.
- Specified by:
rootNodeCount
in interface IdMapping
-
highestNeoId
public long highestNeoId()
- Specified by:
highestNeoId
in interface IdMapping
-
idMap
public NodeMapping idMap()
-
rootNodeMapping
public NodeMapping rootNodeMapping()
Description copied from interface: NodeMapping
Returns the original node mapping if the current node mapping is filtered, otherwise it returns itself.
- Specified by:
rootNodeMapping
in interface NodeMapping
-
nodeProperties
public java.util.Map<java.lang.String,NodeProperties> nodeProperties()
-
relationshipCount
public long relationshipCount()
- Specified by:
relationshipCount
in interface Graph
- Returns:
- the total number of relationships in the graph.
-
batchIterables
public java.util.Collection<PrimitiveLongIterable> batchIterables(long batchSize)
- Specified by:
batchIterables
in interface BatchNodeIterable
- Returns:
- a collection of iterables over every node, partitioned by the given batch size.
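Partitioning nodes by a batch size can be sketched as follows. This is a hedged illustration, not the GDS implementation: it assumes that mapped node ids form the dense range [0, nodeCount), and BatchSketch with its [start, end) arrays is a hypothetical stand-in for the returned PrimitiveLongIterable instances.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: BatchSketch is a hypothetical name, not a GDS class.
final class BatchSketch {
    // Splits the id range [0, nodeCount) into consecutive [start, end) batches.
    static List<long[]> batches(long nodeCount, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<long[]> result = new ArrayList<>();
        for (long start = 0; start < nodeCount; start += batchSize) {
            // The last batch may be shorter than batchSize.
            long end = Math.min(start + batchSize, nodeCount);
            result.add(new long[]{start, end});
        }
        return result;
    }
}
```

Each batch can then be handed to a separate worker without any overlap between the id ranges.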
-
forEachNode
public void forEachNode(java.util.function.LongPredicate consumer)
Description copied from interface: NodeIterator
Iterate over each nodeId.
- Specified by:
forEachNode
in interface NodeIterator
-
nodeIterator
public PrimitiveLongIterator nodeIterator()
- Specified by:
nodeIterator
in interface NodeIterator
-
relationshipProperty
public double relationshipProperty(long sourceNodeId, long targetNodeId)
Description copied from interface: RelationshipProperties
Returns the property value for a relationship defined by its source and target nodes.
- Specified by:
relationshipProperty
in interface RelationshipProperties
-
relationshipProperty
public double relationshipProperty(long sourceId, long targetId, double fallbackValue)
Description copied from interface: RelationshipProperties
Gets the value of the property on the relationship between the source and target node ids.
- Specified by:
relationshipProperty
in interface RelationshipProperties
- Parameters:
sourceId - source node
targetId - target node
fallbackValue - value to use if relationship has no property value
- Returns:
- the property value
-
nodeProperties
public NodeProperties nodeProperties(java.lang.String propertyKey)
Description copied from interface: NodePropertyContainer
Return the property values for a property key. NOTE: Avoid using this on the hot path; favor caching the NodeProperties object when possible.
- Specified by:
nodeProperties
in interface NodePropertyContainer
- Parameters:
propertyKey - the node property key
- Returns:
- the values associated with that key
-
availableNodeProperties
public java.util.Set<java.lang.String> availableNodeProperties()
- Specified by:
availableNodeProperties
in interface NodePropertyContainer
-
forEachRelationship
public void forEachRelationship(long nodeId, RelationshipConsumer consumer)
Description copied from interface: RelationshipIterator
Calls the given consumer function for every relationship of a given node.
- Specified by:
forEachRelationship
in interface RelationshipIterator
- Parameters:
nodeId - id of the node for which to iterate relationships
consumer - relationship consumer function
-
forEachRelationship
public void forEachRelationship(long nodeId, double fallbackValue, RelationshipWithPropertyConsumer consumer)
Description copied from interface: RelationshipIterator
Calls the given consumer function for every relationship of a given node. If the graph was loaded with a relationship property, the property value of the relationship will be passed into the consumer. Otherwise the given fallback value will be used.
- Specified by:
forEachRelationship
in interface RelationshipIterator
- Parameters:
nodeId - id of the node for which to iterate relationships
fallbackValue - value used as relationship property if no properties were loaded
consumer - relationship consumer function
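The fallback behaviour described above can be sketched with plain arrays. This is an illustration only: FallbackSketch and WeightedConsumer are hypothetical stand-ins (not RelationshipWithPropertyConsumer), and the convention that returning false stops the iteration is an assumption of this sketch.

```java
// Illustrative sketch only: all names here are hypothetical, not GDS classes.
final class FallbackSketch {
    // Stand-in consumer; returning false stops the iteration (an assumption).
    interface WeightedConsumer {
        boolean accept(long source, long target, double property);
    }

    // targets[i] pairs with properties[i]; a null properties array models a
    // graph that was loaded without a relationship property.
    static void forEachRelationship(
        long source,
        long[] targets,
        double[] properties,
        double fallbackValue,
        WeightedConsumer consumer
    ) {
        for (int i = 0; i < targets.length; i++) {
            double value = properties != null ? properties[i] : fallbackValue;
            if (!consumer.accept(source, targets[i], value)) {
                return;
            }
        }
    }
}
```

Passing the fallback at the call site lets one algorithm run unchanged on graphs with and without relationship properties.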
-
streamRelationships
public java.util.stream.Stream<RelationshipCursor> streamRelationships(long nodeId, double fallbackValue)
- Specified by:
streamRelationships
in interface RelationshipIterator
-
relationshipTypeFilteredGraph
public Graph relationshipTypeFilteredGraph(java.util.Set<org.neo4j.gds.RelationshipType> relationshipTypes)
- Specified by:
relationshipTypeFilteredGraph
in interface Graph
-
relationshipTopologies
public java.util.Map<org.neo4j.gds.RelationshipType,Relationships.Topology> relationshipTopologies()
- Specified by:
relationshipTopologies
in interface CSRGraph
-
relationshipTopology
public Relationships.Topology relationshipTopology()
-
degreeWithoutParallelRelationships
public int degreeWithoutParallelRelationships(long nodeId)
Description copied from interface: Degrees
Much slower than just degree() because it may have to look up all relationships. This is not thread-safe, so if this is called concurrently please use RelationshipIterator.concurrentCopy().
- Specified by:
degreeWithoutParallelRelationships
in interface Degrees
- See Also:
Graph.isMultiGraph()
-
toMappedNodeId
public long toMappedNodeId(long nodeId)
Description copied from interface: IdMapping
Map original nodeId to inner nodeId.
- Specified by:
toMappedNodeId
in interface IdMapping
- Parameters:
nodeId - must be smaller or equal to the id returned by IdMapping.highestNeoId()
-
toOriginalNodeId
public long toOriginalNodeId(long nodeId)
Description copied from interface: IdMapping
Map inner nodeId back to original nodeId.
- Specified by:
toOriginalNodeId
in interface IdMapping
-
toRootNodeId
public long toRootNodeId(long nodeId)
Description copied from interface: IdMapping
Maps an internal id to its root internal node id. This is necessary for nested (filtered) id mappings. If this mapping is a nested mapping, this method returns the root node id of the parent mapping. For the root mapping this method returns the given node id.
- Specified by:
toRootNodeId
in interface IdMapping
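How original, mapped and root ids relate in a nested (filtered) mapping can be sketched as follows. This is an illustration under stated assumptions, not the GDS implementation: IdMappingSketch, of and filter are hypothetical names, only one level of nesting is modelled, and the value -1 for NOT_FOUND is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: IdMappingSketch is a hypothetical name, not a GDS class.
final class IdMappingSketch {
    static final long NOT_FOUND = -1L; // assumed value for this sketch

    private final IdMappingSketch root; // null if this is the root mapping
    private final long[] toParent;      // root: original ids; nested: root ids
    private final Map<Long, Long> fromParent = new HashMap<>();

    private IdMappingSketch(IdMappingSketch root, long[] toParent) {
        this.root = root;
        this.toParent = toParent;
        for (int i = 0; i < toParent.length; i++) {
            fromParent.put(toParent[i], (long) i);
        }
    }

    // Root mapping: original ids are assigned dense mapped ids in order.
    static IdMappingSketch of(long... originalIds) {
        return new IdMappingSketch(null, originalIds.clone());
    }

    // Nested mapping over a subset of this root mapping's ids.
    IdMappingSketch filter(long... rootIdsToKeep) {
        if (root != null) {
            throw new IllegalStateException("sketch supports one nesting level");
        }
        return new IdMappingSketch(this, rootIdsToKeep.clone());
    }

    long toMappedNodeId(long originalId) {
        if (root == null) {
            return fromParent.getOrDefault(originalId, NOT_FOUND);
        }
        return fromParent.getOrDefault(root.toMappedNodeId(originalId), NOT_FOUND);
    }

    long toOriginalNodeId(long mappedId) {
        long parentId = toParent[(int) mappedId];
        return root == null ? parentId : root.toOriginalNodeId(parentId);
    }

    // For the root mapping this returns the given id unchanged.
    long toRootNodeId(long mappedId) {
        return root == null ? mappedId : toParent[(int) mappedId];
    }
}
```

With original ids 100, 200 and 300 and a filter that keeps root ids 0 and 2, the filtered mapping renumbers original id 300 from root id 2 to mapped id 1.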
-
contains
public boolean contains(long nodeId)
Description copied from interface: IdMapping
Returns true iff the nodeId is mapped, otherwise false.
-
concurrentCopy
public HugeGraph concurrentCopy()
- Specified by:
concurrentCopy
in interface CSRGraph
- Specified by:
concurrentCopy
in interface Graph
- Specified by:
concurrentCopy
in interface RelationshipIterator
- Returns:
- a copy of this iterator that uses new cursors internally, so that iterations happen independently of other iterations.
-
asNodeFilteredGraph
public java.util.Optional<NodeFilteredGraph> asNodeFilteredGraph()
Description copied from interface: Graph
If this graph is created using a node label filter, this will return a NodeFilteredGraph that represents the node set used in this graph. Be aware that it is not guaranteed to contain all relationships of the graph. Otherwise, it will return an empty Optional.
- Specified by:
asNodeFilteredGraph
in interface Graph
-
exists
public boolean exists(long sourceNodeId, long targetNodeId)
O(n)!
- Specified by:
exists
in interface RelationshipPredicate
-
getTarget
public long getTarget(long sourceNodeId, long index)
- Specified by:
getTarget
in interface RelationshipAccess
-
canRelease
public void canRelease(boolean canRelease)
- Specified by:
canRelease
in interface Graph
-
releaseTopology
public void releaseTopology()
Description copied from interface: Graph
Release only the topological data associated with that graph.
- Specified by:
releaseTopology
in interface Graph
-
releaseProperties
public void releaseProperties()
Description copied from interface: Graph
Release only the properties associated with that graph.
- Specified by:
releaseProperties
in interface Graph
-
isUndirected
public boolean isUndirected()
- Specified by:
isUndirected
in interface Graph
-
isMultiGraph
public boolean isMultiGraph()
Description copied from interface: Graph
Whether the graph is guaranteed to have no parallel relationships. If this returns false it still may be parallel-free, but we do not know.
- Specified by:
isMultiGraph
in interface Graph
- Returns:
true iff the graph has at most one relationship between each pair of nodes.
-
relationships
public Relationships relationships()
-
hasRelationshipProperty
public boolean hasRelationshipProperty()
- Specified by:
hasRelationshipProperty
in interface Graph
-
nodeLabels
public java.util.Set<org.neo4j.gds.NodeLabel> nodeLabels(long nodeId)
- Specified by:
nodeLabels
in interface NodeMapping
-
forEachNodeLabel
public void forEachNodeLabel(long nodeId, NodeMapping.NodeLabelConsumer consumer)
- Specified by:
forEachNodeLabel
in interface NodeMapping
-
availableNodeLabels
public java.util.Set<org.neo4j.gds.NodeLabel> availableNodeLabels()
- Specified by:
availableNodeLabels
in interface NodeMapping
-
hasLabel
public boolean hasLabel(long nodeId, org.neo4j.gds.NodeLabel label)
- Specified by:
hasLabel
in interface NodeMapping
-
-