public class FileSystemConfiguration extends Object
IGFS configuration. More than one file system can be configured within a grid. IGFS configuration is provided via the IgniteConfiguration.getFileSystemConfiguration() method.

Modifier and Type | Field and Description |
---|---|
static int | DFLT_BLOCK_SIZE - Default file's data block size (bytes). |
static int | DFLT_BUF_SIZE - Default read/write buffers size (bytes). |
static boolean | DFLT_COLOCATE_META - Default value of metadata co-location flag. |
static int | DFLT_FRAGMENTIZER_CONCURRENT_FILES - Default fragmentizer concurrent files. |
static boolean | DFLT_FRAGMENTIZER_ENABLED - Fragmentizer enabled property. |
static long | DFLT_FRAGMENTIZER_THROTTLING_BLOCK_LENGTH - Default fragmentizer throttling block length. |
static long | DFLT_FRAGMENTIZER_THROTTLING_DELAY - Default fragmentizer throttling delay. |
static int | DFLT_IGFS_LOG_BATCH_SIZE - Default batch size for logging. |
static String | DFLT_IGFS_LOG_DIR - Default IGFS log directory. |
static boolean | DFLT_IPC_ENDPOINT_ENABLED - Default IPC endpoint enabled flag. |
static int | DFLT_MGMT_PORT - Default management port. |
static IgfsMode | DFLT_MODE - Default IGFS mode. |
static int | DFLT_PER_NODE_BATCH_SIZE - Default per node buffer size. |
static int | DFLT_PER_NODE_PARALLEL_BATCH_CNT - Default number of per node parallel operations. |
static boolean | DFLT_RELAXED_CONSISTENCY - Default value of relaxed consistency flag. |
static boolean | DFLT_UPDATE_FILE_LEN_ON_FLUSH - Default value of update file length on flush flag. |
static String | DFLT_USER_NAME - Default file system user name. |

Constructor and Description |
---|
FileSystemConfiguration() - Constructs default configuration. |
FileSystemConfiguration(FileSystemConfiguration cfg) - Constructs the copy of the configuration. |

Modifier and Type | Method and Description |
---|---|
int | getBlockSize() - Get file's data block size. |
int | getBufferSize() - Get read/write buffer size for IGFS stream operations in bytes. |
CacheConfiguration | getDataCacheConfiguration() - Cache config to store IGFS data. |
IgfsMode | getDefaultMode() - Gets mode to specify how IGFS interacts with Hadoop file system, like HDFS. |
int | getFragmentizerConcurrentFiles() - Gets number of files that can be processed by fragmentizer concurrently. |
long | getFragmentizerThrottlingBlockLength() - Gets the length of file chunk to send before delaying the fragmentizer. |
long | getFragmentizerThrottlingDelay() - Gets throttle delay for fragmentizer. |
IgfsIpcEndpointConfiguration | getIpcEndpointConfiguration() - Gets IPC endpoint configuration. |
int | getManagementPort() - Gets port number for management endpoint. |
long | getMaximumTaskRangeLength() - Get maximum default range size of a file being split during IGFS task execution. |
CacheConfiguration | getMetaCacheConfiguration() - Cache config to store IGFS meta information. |
String | getName() - Gets IGFS instance name. |
Map<String,IgfsMode> | getPathModes() - Gets map of path prefixes to IGFS modes used for them. |
int | getPerNodeBatchSize() - Gets number of file blocks buffered on local node before sending batch to remote node. |
int | getPerNodeParallelBatchCount() - Gets number of batches that can be concurrently sent to remote node. |
int | getPrefetchBlocks() - Get number of pre-fetched blocks if specific file's chunk is requested. |
IgfsSecondaryFileSystem | getSecondaryFileSystem() - Gets the secondary file system. |
int | getSequentialReadsBeforePrefetch() - Get amount of sequential block reads before prefetch is triggered. |
boolean | isColocateMetadata() - Get whether to co-locate metadata on a single node. |
boolean | isFragmentizerEnabled() - Gets flag indicating whether IGFS fragmentizer is enabled. |
boolean | isIpcEndpointEnabled() - Get IPC endpoint enabled flag. |
boolean | isRelaxedConsistency() - Get relaxed consistency flag. |
boolean | isUpdateFileLengthOnFlush() - Get whether to update file length on flush. |
FileSystemConfiguration | setBlockSize(int blockSize) - Sets file's data block size. |
FileSystemConfiguration | setBufferSize(int bufSize) - Sets read/write buffers size for IGFS stream operations (bytes). |
FileSystemConfiguration | setColocateMetadata(boolean colocateMeta) - Set metadata co-location flag. |
FileSystemConfiguration | setDataCacheConfiguration(CacheConfiguration dataCacheCfg) - Cache config to store IGFS data. |
FileSystemConfiguration | setDefaultMode(IgfsMode dfltMode) - Sets IGFS mode to specify how it should interact with secondary Hadoop file system, like HDFS. |
FileSystemConfiguration | setFragmentizerConcurrentFiles(int fragmentizerConcurrentFiles) - Sets number of files to process concurrently by fragmentizer. |
FileSystemConfiguration | setFragmentizerEnabled(boolean fragmentizerEnabled) - Sets property indicating whether fragmentizer is enabled. |
FileSystemConfiguration | setFragmentizerThrottlingBlockLength(long fragmentizerThrottlingBlockLen) - Sets length of file chunk to transmit before the fragmentizer is delayed. |
FileSystemConfiguration | setFragmentizerThrottlingDelay(long fragmentizerThrottlingDelay) - Sets delay in milliseconds for which fragmentizer is paused. |
FileSystemConfiguration | setIpcEndpointConfiguration(IgfsIpcEndpointConfiguration ipcEndpointCfg) - Sets IPC endpoint configuration. |
FileSystemConfiguration | setIpcEndpointEnabled(boolean ipcEndpointEnabled) - Set IPC endpoint enabled flag. |
FileSystemConfiguration | setManagementPort(int mgmtPort) - Sets management endpoint port. |
FileSystemConfiguration | setMaximumTaskRangeLength(long maxTaskRangeLen) - Set maximum default range size of a file being split during IGFS task execution. |
FileSystemConfiguration | setMetaCacheConfiguration(CacheConfiguration metaCacheCfg) - Cache config to store IGFS meta information. |
FileSystemConfiguration | setName(String name) - Sets IGFS instance name. |
FileSystemConfiguration | setPathModes(Map<String,IgfsMode> pathModes) - Sets map of path prefixes to IGFS modes used for them. |
FileSystemConfiguration | setPerNodeBatchSize(int perNodeBatchSize) - Sets number of file blocks collected on local node before sending batch to remote node. |
FileSystemConfiguration | setPerNodeParallelBatchCount(int perNodeParallelBatchCnt) - Sets number of file block batches that can be concurrently sent to remote node. |
FileSystemConfiguration | setPrefetchBlocks(int prefetchBlocks) - Sets the number of pre-fetched blocks if specific file's chunk is requested. |
FileSystemConfiguration | setRelaxedConsistency(boolean relaxedConsistency) - Set relaxed consistency flag. |
FileSystemConfiguration | setSecondaryFileSystem(IgfsSecondaryFileSystem fileSystem) - Sets the secondary file system. |
FileSystemConfiguration | setSequentialReadsBeforePrefetch(int seqReadsBeforePrefetch) - Sets amount of sequential block reads before prefetch is triggered. |
FileSystemConfiguration | setUpdateFileLengthOnFlush(boolean updateFileLenOnFlush) - Set whether to update file length on flush. |
String | toString() |

public static final String DFLT_USER_NAME
public static final long DFLT_FRAGMENTIZER_THROTTLING_BLOCK_LENGTH
public static final long DFLT_FRAGMENTIZER_THROTTLING_DELAY
public static final int DFLT_FRAGMENTIZER_CONCURRENT_FILES
public static final boolean DFLT_FRAGMENTIZER_ENABLED
public static final int DFLT_IGFS_LOG_BATCH_SIZE
public static final String DFLT_IGFS_LOG_DIR
Default IGFS log directory.
public static final int DFLT_PER_NODE_BATCH_SIZE
public static final int DFLT_PER_NODE_PARALLEL_BATCH_CNT
public static final IgfsMode DFLT_MODE
public static final int DFLT_BLOCK_SIZE
public static final int DFLT_BUF_SIZE
public static final int DFLT_MGMT_PORT
public static final boolean DFLT_IPC_ENDPOINT_ENABLED
public static final boolean DFLT_COLOCATE_META
public static final boolean DFLT_RELAXED_CONSISTENCY
public static final boolean DFLT_UPDATE_FILE_LEN_ON_FLUSH
public FileSystemConfiguration()
public FileSystemConfiguration(FileSystemConfiguration cfg)
Parameters: cfg - Configuration to copy.
public String getName()
public FileSystemConfiguration setName(String name)
Parameters: name - IGFS instance name.
Returns: this for chaining.
@Nullable public CacheConfiguration getMetaCacheConfiguration()
public FileSystemConfiguration setMetaCacheConfiguration(CacheConfiguration metaCacheCfg)
If null, then default config for meta-cache will be used.
Default configuration for the meta cache is:
Parameters: metaCacheCfg - Cache configuration object.
Returns: this for chaining.
@Nullable public CacheConfiguration getDataCacheConfiguration()
public FileSystemConfiguration setDataCacheConfiguration(CacheConfiguration dataCacheCfg)
If null, then default config for data cache will be used.
Default configuration for the data cache is:
Parameters: dataCacheCfg - Cache configuration object.
Returns: this for chaining.
public int getBlockSize()
public FileSystemConfiguration setBlockSize(int blockSize)
Parameters: blockSize - File's data block size (bytes) or 0 to reset default value.
Returns: this for chaining.
public int getPrefetchBlocks()
public FileSystemConfiguration setPrefetchBlocks(int prefetchBlocks)
Parameters: prefetchBlocks - New number of pre-fetched blocks.
Returns: this for chaining.
public int getSequentialReadsBeforePrefetch()
Default is 0, which means that pre-fetching will start right away.
This parameter can also be overridden by passing the fs.igfs.[name].open.sequential_reads_before_prefetch configuration property directly to a Hadoop MapReduce task.
NOTE: Integration with Hadoop is available only in In-Memory Accelerator For Hadoop edition.
public FileSystemConfiguration setSequentialReadsBeforePrefetch(int seqReadsBeforePrefetch)
Default is 0, which means that pre-fetching will start right away.
This parameter can also be overridden by passing the fs.igfs.[name].open.sequential_reads_before_prefetch configuration property directly to a Hadoop MapReduce task.
NOTE: Integration with Hadoop is available only in In-Memory Accelerator For Hadoop edition.
Parameters: seqReadsBeforePrefetch - Amount of sequential block reads before prefetch is triggered.
Returns: this for chaining.
public int getBufferSize()
Get read/write buffer size for IGFS stream operations in bytes.
public FileSystemConfiguration setBufferSize(int bufSize)
Sets read/write buffers size for IGFS stream operations (bytes).
Parameters: bufSize - Read/write buffers size for stream operations (bytes) or 0 to reset default value.
Returns: this for chaining.
public int getPerNodeBatchSize()
public FileSystemConfiguration setPerNodeBatchSize(int perNodeBatchSize)
Parameters: perNodeBatchSize - Per node buffer size.
Returns: this for chaining.
public int getPerNodeParallelBatchCount()
public FileSystemConfiguration setPerNodeParallelBatchCount(int perNodeParallelBatchCnt)
Parameters: perNodeParallelBatchCnt - Per node parallel load operations.
Returns: this for chaining.
@Nullable public IgfsIpcEndpointConfiguration getIpcEndpointConfiguration()
Endpoint is needed for communication between IGFS and IgniteHadoopFileSystem shipped with Ignite Hadoop Accelerator.
public FileSystemConfiguration setIpcEndpointConfiguration(@Nullable IgfsIpcEndpointConfiguration ipcEndpointCfg)
Endpoint is needed for communication between IGFS and IgniteHadoopFileSystem shipped with Ignite Hadoop Accelerator.
Parameters: ipcEndpointCfg - IPC endpoint configuration.
Returns: this for chaining.
public boolean isIpcEndpointEnabled()
If true, the endpoint will be created and bound to a specific port. Otherwise the endpoint will not be created. Default value is DFLT_IPC_ENDPOINT_ENABLED.
Endpoint is needed for communication between IGFS and IgniteHadoopFileSystem shipped with Ignite Hadoop Accelerator.
Returns: True in case endpoint is enabled.
public FileSystemConfiguration setIpcEndpointEnabled(boolean ipcEndpointEnabled)
See isIpcEndpointEnabled().
Endpoint is needed for communication between IGFS and IgniteHadoopFileSystem shipped with Ignite Hadoop Accelerator.
Parameters: ipcEndpointEnabled - IPC endpoint enabled flag.
Returns: this for chaining.
public int getManagementPort()
Default value is DFLT_MGMT_PORT.
Returns: Port number or -1 if management endpoint should be disabled.
public FileSystemConfiguration setManagementPort(int mgmtPort)
Parameters: mgmtPort - Port number or -1 to disable management endpoint.
Returns: this for chaining.
public IgfsMode getDefaultMode()
Gets mode to specify how IGFS interacts with Hadoop file system, like HDFS. Secondary Hadoop file system is provided for pass-through, write-through, and read-through purposes.
Default mode is IgfsMode.DUAL_ASYNC. If secondary Hadoop file system is not configured, this mode will work just like IgfsMode.PRIMARY mode.
public FileSystemConfiguration setDefaultMode(IgfsMode dfltMode)
Sets IGFS mode to specify how it should interact with secondary Hadoop file system, like HDFS. Secondary Hadoop file system is provided for pass-through, write-through, and read-through purposes.
Parameters: dfltMode - IGFS mode.
Returns: this for chaining.
public IgfsSecondaryFileSystem getSecondaryFileSystem()
public FileSystemConfiguration setSecondaryFileSystem(IgfsSecondaryFileSystem fileSystem)
Parameters: fileSystem - Secondary file system.
Returns: this for chaining.
@Nullable public Map<String,IgfsMode> getPathModes()
Gets map of path prefixes to IGFS modes used for them.
If path doesn't correspond to any specified prefix or mappings are not provided, then getDefaultMode() is used.
Returns: Map of paths to IGFS modes.
public FileSystemConfiguration setPathModes(Map<String,IgfsMode> pathModes)
Sets map of path prefixes to IGFS modes used for them.
If path doesn't correspond to any specified prefix or mappings are not provided, then getDefaultMode() is used.
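The fallback to the default mode can be sketched in plain Java. This is an illustration only, not Ignite's resolver: the Mode enum is a stand-in for IgfsMode so the sketch compiles without Ignite, resolve and underPrefix are hypothetical helpers, and both the longest-prefix choice and matching on path-component boundaries are assumptions that this page does not confirm.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PathModeSketch {
    /** Stand-in for IgfsMode (assumption: same constant names). */
    enum Mode { PRIMARY, PROXY, DUAL_SYNC, DUAL_ASYNC }

    /** True if path equals prefix or lies below it on a path-component boundary. */
    static boolean underPrefix(String path, String prefix) {
        if (path.equals(prefix))
            return true;
        String dir = prefix.endsWith("/") ? prefix : prefix + "/";
        return path.startsWith(dir);
    }

    /** Returns the mode of the longest matching prefix, or dfltMode if none matches. */
    static Mode resolve(String path, Map<String, Mode> pathModes, Mode dfltMode) {
        if (pathModes == null)
            return dfltMode; // mappings are not provided

        String best = null;
        for (String prefix : pathModes.keySet()) {
            if (underPrefix(path, prefix) && (best == null || prefix.length() > best.length()))
                best = prefix;
        }
        return best == null ? dfltMode : pathModes.get(best);
    }

    public static void main(String[] args) {
        Map<String, Mode> modes = new LinkedHashMap<>();
        modes.put("/tmp", Mode.PRIMARY);
        modes.put("/data/archive", Mode.PROXY);

        System.out.println(resolve("/tmp/job1/part-0", modes, Mode.DUAL_ASYNC)); // PRIMARY
        System.out.println(resolve("/data/archive", modes, Mode.DUAL_ASYNC));    // PROXY
        System.out.println(resolve("/tmpfiles/x", modes, Mode.DUAL_ASYNC));      // DUAL_ASYNC
        System.out.println(resolve("/home/a", null, Mode.DUAL_ASYNC));           // DUAL_ASYNC
    }
}
```

In the sketch, "/tmpfiles/x" falls back to the default mode because "/tmp" is only treated as a prefix of whole path components.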
Parameters: pathModes - Map of paths to IGFS modes.
Returns: this for chaining.
public long getFragmentizerThrottlingBlockLength()
public FileSystemConfiguration setFragmentizerThrottlingBlockLength(long fragmentizerThrottlingBlockLen)
Parameters: fragmentizerThrottlingBlockLen - Block length in bytes.
Returns: this for chaining.
public long getFragmentizerThrottlingDelay()
public FileSystemConfiguration setFragmentizerThrottlingDelay(long fragmentizerThrottlingDelay)
Parameters: fragmentizerThrottlingDelay - Delay in milliseconds.
Returns: this for chaining.
public int getFragmentizerConcurrentFiles()
public FileSystemConfiguration setFragmentizerConcurrentFiles(int fragmentizerConcurrentFiles)
Parameters: fragmentizerConcurrentFiles - Number of files to process concurrently.
Returns: this for chaining.
public boolean isFragmentizerEnabled()
public FileSystemConfiguration setFragmentizerEnabled(boolean fragmentizerEnabled)
Parameters: fragmentizerEnabled - True if fragmentizer is enabled.
Returns: this for chaining.
public long getMaximumTaskRangeLength()
Get maximum default range size of a file being split during IGFS task execution. Each file range is represented by an IgfsFileRange, which has a length. If this parameter is set to a positive value, IGFS will split a single file range into smaller ranges with length not greater than this parameter. The only exception is when the maximum task range length is smaller than the file block size. In this case the maximum task range size will be overridden and set to the file block size.
Note that this parameter is applied when a task is split into jobs, before IgfsRecordResolver is applied. Therefore, final file ranges assigned to particular jobs could be greater than the value of this parameter, depending on file data layout and selected resolver type.
Setting this parameter might be useful when a file is highly colocated and has very long consecutive data chunks, so that task execution suffers from insufficient parallelism. E.g., if you have one IGFS node in the topology and want to process a 1 GB file, then only a single range of length 1 GB will be returned. This will result in a single job which will be processed in one thread. But if you provide this configuration parameter and set the maximum range length to 16 MB, then 64 ranges will be returned, resulting in 64 jobs which could be executed in parallel.
Note that some IgniteFs.execute() methods can override the value of this parameter.
If the value of this parameter is set to 0 or a negative value, it is simply ignored. Default value is 0.
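The arithmetic behind the 1 GB / 16 MB example, and the block-size exception, can be checked with a small plain-Java sketch. It does not use Ignite and is not how IGFS computes ranges internally: effectiveMaxRangeLength and rangeCount are hypothetical helpers, the 64 KB block size is an assumed value, and the sketch ignores the later adjustment by IgfsRecordResolver.

```java
class TaskRangeSketch {
    /**
     * Effective maximum range length: non-positive values are ignored
     * (returned as 0, meaning "no limit" in this sketch), and a value
     * smaller than the block size is raised to the block size.
     */
    static long effectiveMaxRangeLength(long maxTaskRangeLen, int blockSize) {
        if (maxTaskRangeLen <= 0)
            return 0;
        return Math.max(maxTaskRangeLen, blockSize);
    }

    /** Number of ranges that one contiguous range of rangeLen bytes is split into. */
    static long rangeCount(long rangeLen, long maxTaskRangeLen, int blockSize) {
        long max = effectiveMaxRangeLength(maxTaskRangeLen, blockSize);
        if (max == 0 || rangeLen <= max)
            return 1;
        return (rangeLen + max - 1) / max; // ceiling division
    }

    public static void main(String[] args) {
        long oneGb = 1L << 30;
        long sixteenMb = 16L << 20;
        int blockSize = 64 * 1024; // assumed block size, for illustration only

        System.out.println(rangeCount(oneGb, 0, blockSize));          // 1
        System.out.println(rangeCount(oneGb, sixteenMb, blockSize));  // 64
        System.out.println(effectiveMaxRangeLength(1024, blockSize)); // 65536
    }
}
```

With the limit left at 0 the whole file stays in one range, which matches the single-job case described above.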
public FileSystemConfiguration setMaximumTaskRangeLength(long maxTaskRangeLen)
See getMaximumTaskRangeLength() for more details.
Parameters: maxTaskRangeLen - Maximum default range size of a file being split during IGFS task execution.
Returns: this for chaining.
public boolean isColocateMetadata()
Get whether to co-locate metadata on a single node.
Normally Ignite spreads ownership of particular keys among all cache nodes. A transaction with keys owned by different nodes will produce more network traffic and will require more time to complete compared to a transaction with keys owned only by a single node.
IGFS stores information about file system structure (metadata) inside a transactional cache configured through the getMetaCacheConfiguration() property. Metadata updates caused by operations on IGFS usually require several internal keys to be updated. As the IGFS metadata cache usually operates in CacheMode.REPLICATED mode, meaning that all nodes have all metadata locally, it makes sense to give a hint to Ignite to co-locate ownership of all metadata keys on a single node. This will decrease the number of network trips required to update metadata and hence could improve performance.
This property should be disabled if you see excessive CPU and network load on a single node, which degrades performance and cannot be explained by business logic of your application.
This setting is only used if the metadata cache is configured in CacheMode.REPLICATED mode. Otherwise it is ignored.
Defaults to DFLT_COLOCATE_META.
Returns: True if metadata co-location is enabled.
public FileSystemConfiguration setColocateMetadata(boolean colocateMeta)
See isColocateMetadata() for more information.
Parameters: colocateMeta - Whether metadata co-location is enabled.
Returns: this for chaining.
public boolean isRelaxedConsistency()
Get relaxed consistency flag.
Concurrent file system operations might conflict with each other, e.g. move("/a1/a2", "/b") and move("/b", "/a1"). Hence, it is necessary to atomically verify that participating paths are still in place to keep the file system in a consistent state in such cases. These checks are expensive in a distributed environment.
Real applications, e.g. Hadoop jobs, rarely produce conflicting operations. So additional checks could be skipped in these scenarios without any negative effect on file system integrity. Skipping them significantly increases performance of file system operations.
If the value of this flag is true, IGFS will skip expensive consistency checks. It is recommended to set this flag to false if your application has conflicting operations, or you do not know exactly how users will use your system.
This property affects only IgfsMode.PRIMARY paths.
Defaults to DFLT_RELAXED_CONSISTENCY.
Returns: True if relaxed consistency is enabled.
public FileSystemConfiguration setRelaxedConsistency(boolean relaxedConsistency)
See isRelaxedConsistency() for more information.
Parameters: relaxedConsistency - Whether to use relaxed consistency optimization.
Returns: this for chaining.
public boolean isUpdateFileLengthOnFlush()
Controls whether to update file length or not when OutputStream.flush() method is invoked. You may want to set this property to true in case you want to read from a file which is being written at the same time.
Defaults to DFLT_UPDATE_FILE_LEN_ON_FLUSH.
public FileSystemConfiguration setUpdateFileLengthOnFlush(boolean updateFileLenOnFlush)
See isUpdateFileLengthOnFlush() for more information.
Parameters: updateFileLenOnFlush - Whether to update file length on flush.
Returns: this for chaining.
Ignite Fabric : ver. 2.5.0 Release Date : May 23 2018