CopyOnWriteStateTable (flink-runtime 1.8.2 API)

java.lang.Object
- org.apache.flink.runtime.state.heap.StateTable<K,N,S>
- - org.apache.flink.runtime.state.heap.CopyOnWriteStateTable<K,N,S>

Type Parameters:

K - type of key.

N - type of namespace.

S - type of value.

All Implemented Interfaces:

Iterable<StateEntry<K,N,S>>, StateSnapshotRestore
```
public class CopyOnWriteStateTable<K,N,S>
extends StateTable<K,N,S>
implements Iterable<StateEntry<K,N,S>>
```
Implementation of Flink's in-memory state tables with copy-on-write support. This map does not support null values for key or namespace.
CopyOnWriteStateTable sacrifices some peak performance and memory efficiency for features like incremental rehashing and asynchronous snapshots through copy-on-write. Copy-on-write tries to minimize the amount of copying by maintaining version metadata for both the map structure and the state objects. However, we must often proactively copy state objects when we hand them to the user.
As with any state backend, users should not keep references to state objects obtained from the state backend outside the scope of the user function calls.
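The version bookkeeping described above can be illustrated with a minimal sketch. It is not the class's actual code: `CowStateSketch`, its fields, and the use of `List<String>` as the state type are hypothetical, and it only models copy-on-write for state objects, not for the map structure. A state object is copied on access only if its version shows it may still be shared with a snapshot.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified, hypothetical illustration of version-based copy-on-write for state objects. */
final class CowStateSketch {
    /** One entry; stateVersion is the table version at which the state object was last created or copied. */
    static final class Entry {
        List<String> state;
        int stateVersion;

        Entry(List<String> state, int stateVersion) {
            this.state = state;
            this.stateVersion = stateVersion;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private int tableVersion = 0;            // bumped on every snapshot
    private int highestRequiredVersion = 0;  // states older than this may be shared with a snapshot

    void put(String key, List<String> state) {
        entries.put(key, new Entry(state, tableVersion));
    }

    /** Hands the state to the caller, copying it first if a snapshot may still reference it. */
    List<String> get(String key) {
        Entry e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (e.stateVersion < highestRequiredVersion) {
            e.state = new ArrayList<>(e.state); // the snapshot keeps the old object
            e.stateVersion = tableVersion;      // later accesses reuse the copy
        }
        return e.state;
    }

    /** Takes a snapshot that shares the state objects with the live table (no deep copy here). */
    Map<String, List<String>> snapshot() {
        Map<String, List<String>> view = new HashMap<>();
        for (Map.Entry<String, Entry> e : entries.entrySet()) {
            view.put(e.getKey(), e.getValue().state);
        }
        tableVersion++;
        highestRequiredVersion = tableVersion;
        return view;
    }
}
```

In this sketch the first `get` after `snapshot()` pays for one copy and subsequent calls return the same copied object, which is why references obtained earlier must not be kept across calls.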
Some brief maintenance notes:
1) Flattening the underlying data structure from nested maps (namespace) -> (key) -> (state) to one flat map (key, namespace) -> (state) brings certain performance trade-offs. In theory, the flat map has one fewer level of indirection than the nested map. However, the nested map naturally de-duplicates namespace objects for which equals() is true, so the flattened version can end up with a lot of redundant namespace objects. Those, in turn, can introduce more cache misses, because we need to follow the namespace object on all operations to ensure entry identities. Copy-on-write can also add memory overhead, and so does the metadata that tracks copy-on-write requirements (state and entry versions on CopyOnWriteStateTable.StateTableEntry).
2) A flat map structure makes it a lot easier to track copy-on-write of the map structure.
3) The nested structure had the (never used) advantage that we can easily drop and iterate whole namespaces. This could give locality advantages for certain access patterns, e.g. iterating a namespace.
4) The serialization format changed from namespace-prefix compressed (as naturally provided by the old nested structure) to making all entries self-contained as (key, namespace, state).
5) We got rid of having multiple nested tables, one for each key-group. Instead, we partition state into key-groups on the fly, during the asynchronous part of a snapshot.
6) Currently, a state table can only grow and never shrinks under low load. We could easily add this if required.
7) Heap-based state backends like this can easily cause a lot of GC activity. Besides using G1 as the garbage collector, we should provide an additional state backend that operates on off-heap memory. This would sacrifice peak performance (due to de/serialization of objects) for a lower but more constant throughput, and potentially huge simplifications w.r.t. copy-on-write.
8) We could try a hybrid of serialized and object-based backends, where the key and namespace of each entry are both serialized in one byte array.
9) We could consider smaller types (e.g. short) for the version counting and think about a reset strategy before overflows, applied when no snapshot is running. However, this would have to touch all entries in the map.
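As a rough illustration of note 5, the sketch below groups flat (key, namespace, state) entries by key-group in two passes, using a counting sort. Everything in it is an assumption made for the example: the class and method names are hypothetical, and `Math.floorMod(key.hashCode(), numKeyGroups)` merely stands in for whatever key-group assignment the runtime really uses.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: group flat (key, namespace, state) entries by key-group at snapshot time. */
final class KeyGroupPartitionSketch {
    /** Self-contained entry, mirroring the (key, namespace, state) unit from note 4. */
    static final class Entry {
        final String key;
        final String namespace;
        final int state;

        Entry(String key, String namespace, int state) {
            this.key = key;
            this.namespace = namespace;
            this.state = state;
        }
    }

    /** Assumption: a plain floorMod stands in for the real key-group assignment. */
    static int keyGroupOf(Object key, int numKeyGroups) {
        return Math.floorMod(key.hashCode(), numKeyGroups);
    }

    /**
     * Counting sort: one pass counts entries per key-group, a second pass places them.
     * On return, groupOffsets[g] holds the start index of key-group g in the result.
     */
    static Entry[] partition(List<Entry> entries, int numKeyGroups, int[] groupOffsets) {
        int[] counts = new int[numKeyGroups];
        for (Entry e : entries) {
            counts[keyGroupOf(e.key, numKeyGroups)]++;
        }
        int offset = 0;
        for (int g = 0; g < numKeyGroups; g++) {
            groupOffsets[g] = offset;
            offset += counts[g];
        }
        int[] next = Arrays.copyOf(groupOffsets, numKeyGroups); // next free slot per key-group
        Entry[] out = new Entry[entries.size()];
        for (Entry e : entries) {
            out[next[keyGroupOf(e.key, numKeyGroups)]++] = e;
        }
        return out;
    }
}
```

Because the grouping happens on a snapshot of the entries, it can run off the processing thread, and the live table never needs one nested table per key-group.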
This class was initially based on the HashMap implementation of the Android JDK, but is now heavily customized toward the use case of a table for state entries. IMPORTANT: the contracts for this class rely on the user not holding any references to objects returned by this map beyond the life cycle of per-element operations. In other words, all get-update-put operations on a mapping should happen within one call of processElement. Otherwise, the user must take deep copies, e.g. for caching purposes.

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`protected static class`	`CopyOnWriteStateTable.StateTableEntry<K,N,S>` One entry in the `CopyOnWriteStateTable`.

Field Summary

Fields
Modifier and Type	Field and Description
`static int`	`DEFAULT_CAPACITY` Default capacity for a `CopyOnWriteStateTable`.

Fields inherited from class org.apache.flink.runtime.state.heap.StateTable
keyContext, metaInfo

Method Summary

Modifier and Type	Method and Description
`boolean`	`containsKey(N namespace)` Returns whether this table contains a mapping for the composite of active key and given namespace.
`S`	`get(K key, N namespace)` Returns the state for the composite of active key and given namespace.
`S`	`get(N namespace)` Returns the state of the mapping for the composite of active key and given namespace.
`java.util.stream.Stream<K>`	`getKeys(N namespace)`
`RegisteredKeyValueStateBackendMetaInfo<N,S>`	`getMetaInfo()`
`org.apache.flink.api.common.typeutils.TypeSerializer<N>`	`getNamespaceSerializer()`
`InternalKvState.StateIncrementalVisitor<K,N,S>`	`getStateIncrementalVisitor(int recommendedMaxNumberOfReturnedRecords)`
`org.apache.flink.api.common.typeutils.TypeSerializer<S>`	`getStateSerializer()`
`Iterator<StateEntry<K,N,S>>`	`iterator()`
`void`	`put(K key, int keyGroup, N namespace, S state)`
`void`	`put(N namespace, S state)` Maps the composite of active key and given namespace to the specified state.
`S`	`putAndGetOld(N namespace, S state)` Maps the composite of active key and given namespace to the specified state.
`void`	`remove(N namespace)` Removes the mapping for the composite of active key and given namespace.
`S`	`removeAndGetOld(N namespace)` Removes the mapping for the composite of active key and given namespace, returning the state that was found under the entry.
`void`	`setMetaInfo(RegisteredKeyValueStateBackendMetaInfo<N,S> metaInfo)`
`int`	`size()` Returns the total number of entries in this `CopyOnWriteStateTable`.
`int`	`sizeOfNamespace(Object namespace)`
`CopyOnWriteStateTableSnapshot<K,N,S>`	`stateSnapshot()` Creates a snapshot of this `CopyOnWriteStateTable`, to be written in checkpointing.
`<T> void`	`transform(N namespace, T value, StateTransformationFunction<S,T> transformation)` Applies the given `StateTransformationFunction` to the state (1st input argument), using the given value as second input argument.

Methods inherited from class org.apache.flink.runtime.state.heap.StateTable
isEmpty, keyGroupReader

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface java.lang.Iterable
forEach, spliterator

- Field Detail
  - DEFAULT_CAPACITY
```
public static final int DEFAULT_CAPACITY
```
    Default capacity for a CopyOnWriteStateTable. Must be a power of two, greater than MINIMUM_CAPACITY and less than MAXIMUM_CAPACITY.
    
    See Also:
    
    Constant Field Values
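    The power-of-two requirement is common for hash tables because it lets the bucket index be computed with a bit mask. The sketch below shows that idea in isolation; `CapacitySketch` and its methods are hypothetical helpers, not members of this class, and no concrete capacity values are implied.

```java
/** Hypothetical sketch of why a hash table capacity is kept at a power of two. */
final class CapacitySketch {
    /** A positive int is a power of two exactly when it has a single bit set. */
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** With a power-of-two capacity, the bucket index is a cheap bit mask instead of a modulo. */
    static int bucketIndex(int hash, int capacity) {
        if (!isPowerOfTwo(capacity)) {
            throw new IllegalArgumentException("capacity must be a power of two: " + capacity);
        }
        return hash & (capacity - 1); // always in [0, capacity), even for negative hashes
    }
}
```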
- Method Detail
  - size
```
public int size()
```
    Returns the total number of entries in this CopyOnWriteStateTable. This is the sum of both sub-tables.
    
    Specified by:
    
    size in class StateTable<K,N,S>
    
    Returns:
    
    the number of entries in this CopyOnWriteStateTable.
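    The "sum of both sub-tables" follows from incremental rehashing: while a resize is in progress, entries live in either the old table or the new, larger one. The following is a minimal sketch of that idea under simplifying assumptions (int keys, a load factor of 1.0, one bucket migrated per operation); `IncrementalRehashSketch` and all of its members are hypothetical and do not mirror this class's internals.

```java
/** Hypothetical sketch of incremental rehashing with two sub-tables. */
final class IncrementalRehashSketch {
    private static final class Node {
        final int key;
        int value;
        Node next;

        Node(int key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node[] primary = new Node[4]; // drained bucket by bucket while a rehash is in progress
    private Node[] rehash = null;         // larger target table, non-null only during a rehash
    private int primarySize = 0;
    private int rehashSize = 0;
    private int rehashIndex = 0;          // primary buckets below this index are already migrated

    /** Total number of entries: the sum of both sub-tables. */
    int size() {
        return primarySize + rehashSize;
    }

    boolean isRehashing() {
        return rehash != null;
    }

    void put(int key, int value) {
        migrateOneBucket();
        Node[] table = selectTable(key);
        int i = index(key, table.length);
        for (Node n = table[i]; n != null; n = n.next) {
            if (n.key == key) {
                n.value = value;
                return;
            }
        }
        table[i] = new Node(key, value, table[i]);
        if (table == primary) {
            primarySize++;
        } else {
            rehashSize++;
        }
        if (rehash == null && primarySize > primary.length) { // assumed load factor of 1.0
            rehash = new Node[primary.length * 2];
            rehashIndex = 0;
        }
    }

    Integer get(int key) {
        migrateOneBucket();
        Node[] table = selectTable(key);
        for (Node n = table[index(key, table.length)]; n != null; n = n.next) {
            if (n.key == key) {
                return n.value;
            }
        }
        return null;
    }

    /** A key lives in the rehash table once its primary bucket has been migrated. */
    private Node[] selectTable(int key) {
        if (rehash == null) {
            return primary;
        }
        return index(key, primary.length) < rehashIndex ? rehash : primary;
    }

    /** Moves a single bucket per operation, so no single call pays for the whole resize. */
    private void migrateOneBucket() {
        if (rehash == null) {
            return;
        }
        Node n = primary[rehashIndex];
        primary[rehashIndex] = null;
        while (n != null) {
            Node next = n.next;
            int i = index(n.key, rehash.length);
            n.next = rehash[i];
            rehash[i] = n;
            primarySize--;
            rehashSize++;
            n = next;
        }
        rehashIndex++;
        if (rehashIndex == primary.length) { // rehash finished: the new table becomes primary
            primary = rehash;
            primarySize = rehashSize;
            rehash = null;
            rehashSize = 0;
        }
    }

    private static int index(int key, int capacity) {
        return key & (capacity - 1); // capacity is always a power of two
    }
}
```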
  - get
```
public S get(K key,
             N namespace)
```
    Description copied from class: StateTable
    
    Returns the state for the composite of active key and given namespace. This is typically used by queryable state.
    
    Specified by:
    
    get in class StateTable<K,N,S>
    
    Parameters:
    
    key - the key. Not null.
    
    namespace - the namespace. Not null.
    
    Returns:
    
    the state of the mapping with the specified key/namespace composite key, or null if no mapping for the specified key is found.
  - getKeys
```
public java.util.stream.Stream<K> getKeys(N namespace)
```
    Specified by:
    
    getKeys in class StateTable<K,N,S>
  - put
```
public void put(K key,
                int keyGroup,
                N namespace,
                S state)
```
    Specified by:
    
    put in class StateTable<K,N,S>
  - get
```
public S get(N namespace)
```
    Description copied from class: StateTable
    
    Returns the state of the mapping for the composite of active key and given namespace.
    
    Specified by:
    
    get in class StateTable<K,N,S>
    
    Parameters:
    
    namespace - the namespace. Not null.
    
    Returns:
    
    the state of the mapping with the specified key/namespace composite key, or null if no mapping for the specified key is found.
  - containsKey
```
public boolean containsKey(N namespace)
```
    Description copied from class: StateTable
    
    Returns whether this table contains a mapping for the composite of active key and given namespace.
    
    Specified by:
    
    containsKey in class StateTable<K,N,S>
    
    Parameters:
    
    namespace - the namespace in the composite key to search for. Not null.
    
    Returns:
    
    true if this map contains the specified key/namespace composite key, false otherwise.
  - put
```
public void put(N namespace,
                S state)
```
    Description copied from class: StateTable
    
    Maps the composite of active key and given namespace to the specified state. This method should be preferred over putAndGetOld(N, S) when the caller is not interested in the old state.
    
    Specified by:
    
    put in class StateTable<K,N,S>
    
    Parameters:
    
    namespace - the namespace. Not null.
    
    state - the state. Can be null.
  - putAndGetOld
```
public S putAndGetOld(N namespace,
                      S state)
```
    Description copied from class: StateTable
    
    Maps the composite of active key and given namespace to the specified state. Returns the previous state that was registered under the composite key.
    
    Specified by:
    
    putAndGetOld in class StateTable<K,N,S>
    
    Parameters:
    
    namespace - the namespace. Not null.
    
    state - the state. Can be null.
    
    Returns:
    
    the state of any previous mapping with the specified key, or null if there was no such mapping.
  - remove
```
public void remove(N namespace)
```
    Description copied from class: StateTable
    
    Removes the mapping for the composite of active key and given namespace. This method should be preferred over removeAndGetOld(N) when the caller is not interested in the old state.
    
    Specified by:
    
    remove in class StateTable<K,N,S>
    
    Parameters:
    
    namespace - the namespace of the mapping to remove. Not null.
  - removeAndGetOld
```
public S removeAndGetOld(N namespace)
```
    Description copied from class: StateTable
    
    Removes the mapping for the composite of active key and given namespace, returning the state that was found under the entry.
    
    Specified by:
    
    removeAndGetOld in class StateTable<K,N,S>
    
    Parameters:
    
    namespace - the namespace of the mapping to remove. Not null.
    
    Returns:
    
    the state of the removed mapping, or null if no mapping for the specified key was found.
  - transform
```
public <T> void transform(N namespace,
                          T value,
                          StateTransformationFunction<S,T> transformation)
                   throws Exception
```
    Description copied from class: StateTable
    
    Applies the given StateTransformationFunction to the state (1st input argument), using the given value as the second input argument. The result of StateTransformationFunction.apply(Object, Object) is then stored as the new state. This function is essentially an optimization for the get-update-put pattern.
    
    Specified by:
    
    transform in class StateTable<K,N,S>
    
    Parameters:
    
    namespace - the namespace. Not null.
    
    value - the value to use in transforming the state. Can be null.
    
    transformation - the transformation function.
    
    Throws:
    
    Exception - if some exception happens in the transformation function.
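    To make the get-update-put optimization concrete, here is a small sketch that locates the entry once and updates it in place. It is an illustration only: `TransformFn` is a hypothetical stand-in shaped like the transformation function described above, and `TransformSketch` with its `Holder` is not how this class stores entries. The sketch assumes that a missing mapping is presented to the function as a null previous state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in shaped like the transformation function described above. */
interface TransformFn<S, T> {
    S apply(S previousState, T value) throws Exception;
}

/** Hypothetical sketch: transform as a single-lookup replacement for get-update-put. */
final class TransformSketch<N, S> {
    /** Mutable holder so the state can be replaced without a second map lookup. */
    private static final class Holder<S> {
        S state;
    }

    private final Map<N, Holder<S>> states = new HashMap<>();

    S get(N namespace) {
        Holder<S> h = states.get(namespace);
        return h == null ? null : h.state;
    }

    <T> void transform(N namespace, T value, TransformFn<S, T> fn) throws Exception {
        // One lookup finds or creates the entry; a new entry starts with a null state.
        Holder<S> h = states.computeIfAbsent(namespace, ns -> new Holder<>());
        // The function result becomes the new state, stored directly on the entry.
        h.state = fn.apply(h.state, value);
    }
}
```

    A running sum, for instance, could be kept with `table.transform(ns, delta, (prev, v) -> prev == null ? v : prev + v)`, where `table`, `ns`, and `delta` are placeholders.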
  - getStateSerializer
```
public org.apache.flink.api.common.typeutils.TypeSerializer<S> getStateSerializer()
```
    Overrides:
    
    getStateSerializer in class StateTable<K,N,S>
  - getNamespaceSerializer
```
public org.apache.flink.api.common.typeutils.TypeSerializer<N> getNamespaceSerializer()
```
    Overrides:
    
    getNamespaceSerializer in class StateTable<K,N,S>
  - getMetaInfo
```
public RegisteredKeyValueStateBackendMetaInfo<N,S> getMetaInfo()
```
    Overrides:
    
    getMetaInfo in class StateTable<K,N,S>
  - setMetaInfo
```
public void setMetaInfo(RegisteredKeyValueStateBackendMetaInfo<N,S> metaInfo)
```
    Overrides:
    
    setMetaInfo in class StateTable<K,N,S>
  - iterator
```
@Nonnull
public Iterator<StateEntry<K,N,S>> iterator()
```
    Specified by:
    
    iterator in interface Iterable<StateEntry<K,N,S>>
  - stateSnapshot
```
@Nonnull
public CopyOnWriteStateTableSnapshot<K,N,S> stateSnapshot()
```
    Creates a snapshot of this CopyOnWriteStateTable, to be written in checkpointing. The snapshot integrity is protected through copy-on-write from the CopyOnWriteStateTable. Users should call releaseSnapshot(CopyOnWriteStateTableSnapshot) after using the returned object.
    
    Specified by:
    
    stateSnapshot in interface StateSnapshotRestore
    
    Returns:
    
    a snapshot of this CopyOnWriteStateTable, for checkpointing.
  - sizeOfNamespace
```
public int sizeOfNamespace(Object namespace)
```
    Specified by:
    
    sizeOfNamespace in class StateTable<K,N,S>
  - getStateIncrementalVisitor
```
public InternalKvState.StateIncrementalVisitor<K,N,S> getStateIncrementalVisitor(int recommendedMaxNumberOfReturnedRecords)
```
    Specified by:
    
    getStateIncrementalVisitor in class StateTable<K,N,S>
