Class HugeGraph

  • All Implemented Interfaces:
    BatchNodeIterable, CSRGraph, Degrees, Graph, IdMap, NodeIterator, NodePropertyContainer, PartialIdMap, RelationshipIterator, RelationshipPredicate, RelationshipProperties

    public class HugeGraph
    extends java.lang.Object
    implements CSRGraph
    HugeGraph contains two array-like data structures.

    The adjacency data is stored in a ByteArray, which is a byte[] addressable by long indices and capable of storing about 2^46 (~70k bn) bytes, or 64 TiB. The bytes are stored in byte[] pages of 32 KiB each.
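
    The paged layout can be illustrated with a minimal sketch. It assumes a power-of-two page size of 32 KiB so that a long index splits into a page index and an in-page index via shift and mask. The class and method names are hypothetical and are not part of this API.

```java
/** Hypothetical sketch of a byte array addressed by long indices and backed by 32 KiB pages. */
final class PagedByteArraySketch {
    // 32 KiB pages; a power of two, so shift and mask replace division and modulo.
    static final int PAGE_SHIFT = 15;
    static final int PAGE_SIZE = 1 << PAGE_SHIFT;
    static final long PAGE_MASK = PAGE_SIZE - 1;

    private final byte[][] pages;

    PagedByteArraySketch(long capacity) {
        // Round up so that a partially used last page is still allocated.
        int pageCount = (int) ((capacity + PAGE_MASK) >>> PAGE_SHIFT);
        pages = new byte[pageCount][PAGE_SIZE];
    }

    /** Index of the page that holds the given global index. */
    static int pageIndex(long index) {
        return (int) (index >>> PAGE_SHIFT);
    }

    /** Position of the given global index inside its page. */
    static int indexInPage(long index) {
        return (int) (index & PAGE_MASK);
    }

    byte get(long index) {
        return pages[pageIndex(index)][indexInPage(index)];
    }

    void set(long index, byte value) {
        pages[pageIndex(index)][indexInPage(index)] = value;
    }
}
```

    In this sketch the global index 70000 falls into page 2 at in-page position 4464, since 70000 = 2 * 32768 + 4464.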

    The data is in the format:

    degree ~ targetId1 ~ targetId2 ~ targetIdn
    The degree is stored as a full-sized 4-byte int (the Neo4j kernel API returns an int for Nodes.countAll(org.neo4j.internal.kernel.api.NodeCursor)). The target IDs are first sorted, then delta encoded, and finally written as variable-length vlongs. Delta encoding does not write the actual value but only the difference to the previous value. Because sorted IDs yield small differences, this works well with the vlong encoding, which needs fewer bytes for smaller values.
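
    As an illustration of sort, delta, and vlong encoding, here is a minimal sketch. It assumes non-negative target ids and a 7-bits-per-byte encoding in which a set high bit means that more bytes follow. The exact bit convention used by this class may differ, and all names below are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Hypothetical sketch of sorting, delta encoding, and vlong encoding of target ids. */
final class DeltaVLongSketch {

    /** Sorts the ids, then writes each difference to the previous id as a vlong. */
    static byte[] encode(long[] targetIds) {
        long[] sorted = targetIds.clone();
        Arrays.sort(sorted);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long previous = 0L;
        for (long target : sorted) {
            writeVLong(out, target - previous);
            previous = target;
        }
        return out.toByteArray();
    }

    /** Writes 7 bits per byte, lowest bits first; a set high bit means "more bytes follow" (assumed convention). */
    static void writeVLong(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0L) {
            out.write((int) ((value & 0x7FL) | 0x80L));
            value >>>= 7;
        }
        out.write((int) value);
    }

    /** Reads count vlongs and sums the deltas back into absolute target ids. */
    static long[] decode(byte[] bytes, int count) {
        long[] targets = new long[count];
        int pos = 0;
        long previous = 0L;
        for (int i = 0; i < count; i++) {
            long delta = 0L;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta;
            targets[i] = previous;
        }
        return targets;
    }
}
```

    With this sketch the ids 1000, 1003, 1001 are sorted to 1000, 1001, 1003 and stored as the deltas 1000, 1, 2, which take 2 + 1 + 1 = 4 bytes instead of three 8-byte longs.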

    The second data structure is a LongArray, which is a long[] addressable by long indices and capable of storing about 2^43 (~9k bn) longs, or 64 TiB worth of 64-bit longs. Each value is the offset into the aforementioned adjacency array; the index is the respective source node id.

    To traverse all nodes, first read the offset from the LongArray, then read 4 bytes from the ByteArray, starting at the offset, as the degree, then read degree vlongs as target ids.
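
    The traversal steps can be sketched as follows. For brevity the sketch uses a plain long[] and byte[] instead of the paged structures. It also assumes a big-endian degree and the same hypothetical vlong convention (a set high bit means more bytes follow). Neither assumption is confirmed by this documentation.

```java
import java.nio.ByteBuffer;
import java.util.function.LongConsumer;

/** Hypothetical sketch of reading one node's targets from an offset array and an adjacency array. */
final class AdjacencyTraversalSketch {

    /** Passes every target id of the given node to the consumer. */
    static void forEachTarget(long[] offsets, byte[] adjacency, int nodeId, LongConsumer consumer) {
        // Step 1: the offset array maps the source node id to a position in the adjacency data.
        int pos = (int) offsets[nodeId];
        // Step 2: the first 4 bytes at that position hold the degree (byte order is an assumption).
        int degree = ByteBuffer.wrap(adjacency, pos, 4).getInt();
        pos += 4;
        // Step 3: read `degree` vlongs; each one is a delta to the previous target id.
        long target = 0L;
        for (int i = 0; i < degree; i++) {
            long delta = 0L;
            int shift = 0;
            byte b;
            do {
                b = adjacency[pos++];
                delta |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            target += delta;
            consumer.accept(target);
        }
    }
}
```

    Because every lookup starts from the stored offset, unused bytes between two adjacency lists are never read.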

    Because the degree is read from the offset position, the offset array does not need to be sorted and the adjacency array may be sparse. The import makes use of this: each thread pre-allocates a local chunk of some pages (512 KiB) and gives access to this data during import. Synchronization between threads only has to happen when a new chunk has to be pre-allocated. This is similar to what most garbage collectors do with TLAB allocations.
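
    The chunked pre-allocation can be sketched as follows. The sketch assumes a single shared counter that hands out 512 KiB chunks, with each thread bump-allocating inside its own chunk. All names are hypothetical, and the actual import code may be organized differently.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of TLAB-style chunk allocation over one shared address space. */
final class ChunkAllocatorSketch {
    // 512 KiB per chunk, i.e. 16 pages of 32 KiB.
    static final long CHUNK_SIZE = 512 * 1024;

    // The only state shared between threads: the start of the next unclaimed chunk.
    private final AtomicLong nextChunkStart = new AtomicLong();

    /** Per-thread allocator; each import thread owns one instance and never shares it. */
    final class LocalAllocator {
        private long position;
        private long limit;

        /** Reserves the given number of bytes and returns the global offset of the reservation. */
        long allocate(int bytes) {
            if (bytes <= 0 || bytes > CHUNK_SIZE) {
                throw new IllegalArgumentException("bytes must be in (0, CHUNK_SIZE]");
            }
            if (position + bytes > limit) {
                // The only synchronization point: claim a fresh chunk from the shared counter.
                // The unused tail of the old chunk stays empty, which leaves the data sparse.
                position = nextChunkStart.getAndAdd(CHUNK_SIZE);
                limit = position + CHUNK_SIZE;
            }
            long offset = position;
            position += bytes;
            return offset;
        }
    }

    LocalAllocator newLocalAllocator() {
        return new LocalAllocator();
    }
}
```

    Two local allocators created from the same ChunkAllocatorSketch never hand out overlapping offsets, and they touch the shared counter only once per chunk rather than once per allocation.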

    See Also:
    more about vlong, more about TLAB allocation
    • Field Detail

      • idMap

        protected final IdMap idMap
      • schema

        protected final org.neo4j.gds.api.schema.GraphSchema schema
      • nodeProperties

        protected final java.util.Map<java.lang.String,​NodeProperties> nodeProperties
      • orientation

        protected final org.neo4j.gds.Orientation orientation
      • relationshipCount

        protected final long relationshipCount
      • hasRelationshipProperty

        protected final boolean hasRelationshipProperty
      • isMultiGraph

        protected final boolean isMultiGraph
    • Constructor Detail

      • HugeGraph

        protected HugeGraph​(IdMap idMap,
                            org.neo4j.gds.api.schema.GraphSchema schema,
                            java.util.Map<java.lang.String,​NodeProperties> nodeProperties,
                            long relationshipCount,
                            @NotNull AdjacencyList adjacency,
                            boolean hasRelationshipProperty,
                            double defaultRelationshipPropertyValue,
                            @Nullable AdjacencyProperties relationshipProperty,
                            org.neo4j.gds.Orientation orientation,
                            boolean isMultiGraph)
    • Method Detail

      • nodeCount

        public long nodeCount()
        Description copied from interface: IdMap
        Number of mapped nodeIds.
        Specified by:
        nodeCount in interface IdMap
      • rootNodeCount

        public java.util.OptionalLong rootNodeCount()
        Description copied from interface: PartialIdMap
        Number of mapped node ids in the root mapping. This is necessary for nested (filtered) id mappings.
        Specified by:
        rootNodeCount in interface PartialIdMap
      • highestNeoId

        public long highestNeoId()
        Specified by:
        highestNeoId in interface IdMap
      • idMap

        public IdMap idMap()
      • rootIdMap

        public IdMap rootIdMap()
        Description copied from interface: IdMap
        Returns the original node mapping if the current node mapping is filtered, otherwise it returns itself.
        Specified by:
        rootIdMap in interface IdMap
      • schema

        public org.neo4j.gds.api.schema.GraphSchema schema()
        Specified by:
        schema in interface Graph
      • nodeProperties

        public java.util.Map<java.lang.String,​NodeProperties> nodeProperties()
      • relationshipCount

        public long relationshipCount()
        Specified by:
        relationshipCount in interface Graph
        Returns:
        the total number of relationships in the graph.
      • forEachNode

        public void forEachNode​(java.util.function.LongPredicate consumer)
        Description copied from interface: NodeIterator
        Iterate over each nodeId
        Specified by:
        forEachNode in interface NodeIterator
      • relationshipProperty

        public double relationshipProperty​(long sourceNodeId,
                                           long targetNodeId)
        Description copied from interface: RelationshipProperties
        Returns the property value for a relationship defined by its source and target nodes.
        Specified by:
        relationshipProperty in interface RelationshipProperties
      • relationshipProperty

        public double relationshipProperty​(long sourceId,
                                           long targetId,
                                           double fallbackValue)
        Description copied from interface: RelationshipProperties
        Returns the value of the property on the relationship between the source and target node ids.
        Specified by:
        relationshipProperty in interface RelationshipProperties
        Parameters:
        sourceId - source node
        targetId - target node
        fallbackValue - value to use if relationship has no property value
        Returns:
        the property value
      • nodeProperties

        public NodeProperties nodeProperties​(java.lang.String propertyKey)
        Description copied from interface: NodePropertyContainer
        Returns the property values for a property key. NOTE: Avoid using this on the hot path; favor caching the NodeProperties object when possible.
        Specified by:
        nodeProperties in interface NodePropertyContainer
        Parameters:
        propertyKey - the node property key
        Returns:
        the values associated with that key
      • forEachRelationship

        public void forEachRelationship​(long nodeId,
                                        RelationshipConsumer consumer)
        Description copied from interface: RelationshipIterator
        Calls the given consumer function for every relationship of a given node.
        Specified by:
        forEachRelationship in interface RelationshipIterator
        Parameters:
        nodeId - id of the node for which to iterate relationships
        consumer - relationship consumer function
      • forEachRelationship

        public void forEachRelationship​(long nodeId,
                                        double fallbackValue,
                                        RelationshipWithPropertyConsumer consumer)
        Description copied from interface: RelationshipIterator
        Calls the given consumer function for every relationship of a given node. If the graph was loaded with a relationship property, the property value of the relationship will be passed into the consumer. Otherwise the given fallback value will be used.
        Specified by:
        forEachRelationship in interface RelationshipIterator
        Parameters:
        nodeId - id of the node for which to iterate relationships
        fallbackValue - value used as relationship property if no properties were loaded
        consumer - relationship consumer function
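
        The fallback behavior can be illustrated with a self-contained sketch. The WithPropertyConsumer interface below is a hypothetical stand-in for RelationshipWithPropertyConsumer. Its three-argument signature, and the convention that returning false stops the iteration, are assumptions and are not taken from this page.

```java
/** Hypothetical sketch of iterating relationships with a fallback property value. */
final class FallbackPropertySketch {

    /** Stand-in for the library's consumer type; the signature is an assumption. */
    @FunctionalInterface
    interface WithPropertyConsumer {
        boolean accept(long sourceNodeId, long targetNodeId, double property);
    }

    /** Toy iteration over one node's targets; weights is null when no property was loaded. */
    static void forEachRelationship(
        long nodeId,
        long[] targets,
        double[] weights,
        double fallbackValue,
        WithPropertyConsumer consumer
    ) {
        for (int i = 0; i < targets.length; i++) {
            // Use the loaded property value if present, otherwise the caller's fallback.
            double property = weights != null ? weights[i] : fallbackValue;
            if (!consumer.accept(nodeId, targets[i], property)) {
                return; // the consumer asked to stop early
            }
        }
    }

    /** Sums the property values of a node's relationships, counting each as 1.0 without properties. */
    static double weightedDegree(long nodeId, long[] targets, double[] weights) {
        double[] sum = {0.0};
        forEachRelationship(nodeId, targets, weights, 1.0, (source, target, weight) -> {
            sum[0] += weight;
            return true;
        });
        return sum[0];
    }
}
```

        With a fallback of 1.0, the weighted degree of a node without loaded properties equals its plain degree.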
      • relationshipTypeFilteredGraph

        public Graph relationshipTypeFilteredGraph​(java.util.Set<org.neo4j.gds.RelationshipType> relationshipTypes)
        Specified by:
        relationshipTypeFilteredGraph in interface Graph
      • degree

        public int degree​(long node)
        Specified by:
        degree in interface Degrees
      • toMappedNodeId

        public long toMappedNodeId​(long nodeId)
        Description copied from interface: PartialIdMap
        Map original nodeId to inner nodeId
        Specified by:
        toMappedNodeId in interface PartialIdMap
        Parameters:
        nodeId - must be smaller than or equal to the id returned by IdMap.highestNeoId()
      • toOriginalNodeId

        public long toOriginalNodeId​(long nodeId)
        Description copied from interface: IdMap
        Map inner nodeId back to original nodeId
        Specified by:
        toOriginalNodeId in interface IdMap
      • toRootNodeId

        public long toRootNodeId​(long nodeId)
        Description copied from interface: IdMap
        Maps an internal id to its root internal node id. This is necessary for nested (filtered) id mappings. If this mapping is a nested mapping, this method returns the root node id of the parent mapping. For the root mapping this method returns the given node id.
        Specified by:
        toRootNodeId in interface IdMap
      • contains

        public boolean contains​(long nodeId)
        Description copied from interface: IdMap
        Returns true iff the nodeId is mapped, otherwise false.
        Specified by:
        contains in interface IdMap
      • asNodeFilteredGraph

        public java.util.Optional<NodeFilteredGraph> asNodeFilteredGraph()
        Description copied from interface: Graph
        If this graph is created using a node label filter, this will return a NodeFilteredGraph that represents the node set used in this graph. Be aware that it is not guaranteed to contain all relationships of the graph. Otherwise, it will return an empty Optional.
        Specified by:
        asNodeFilteredGraph in interface Graph
      • exists

        public boolean exists​(long sourceNodeId,
                              long targetNodeId)
        Note: this operation is O(n)!
        Specified by:
        exists in interface RelationshipPredicate
      • canRelease

        public void canRelease​(boolean canRelease)
        Specified by:
        canRelease in interface Graph
      • releaseTopology

        public void releaseTopology()
        Description copied from interface: Graph
        Release only the topological data associated with that graph.
        Specified by:
        releaseTopology in interface Graph
      • releaseProperties

        public void releaseProperties()
        Description copied from interface: Graph
        Release only the properties associated with that graph.
        Specified by:
        releaseProperties in interface Graph
      • isUndirected

        public boolean isUndirected()
        Specified by:
        isUndirected in interface Graph
      • isMultiGraph

        public boolean isMultiGraph()
        Description copied from interface: Graph
        Whether the graph is guaranteed to have no parallel relationships. If this returns false it still may be parallel-free, but we do not know.
        Specified by:
        isMultiGraph in interface Graph
        Returns:
        true iff the graph has at most one relationship between each pair of nodes.
      • nodeLabels

        public java.util.List<org.neo4j.gds.NodeLabel> nodeLabels​(long nodeId)
        Specified by:
        nodeLabels in interface IdMap
      • availableNodeLabels

        public java.util.Set<org.neo4j.gds.NodeLabel> availableNodeLabels()
        Specified by:
        availableNodeLabels in interface IdMap
      • hasLabel

        public boolean hasLabel​(long nodeId,
                                org.neo4j.gds.NodeLabel label)
        Specified by:
        hasLabel in interface IdMap