Package org.apache.beam.sdk.util
Class NumberedShardedFile
- java.lang.Object
  - org.apache.beam.sdk.util.NumberedShardedFile
- All Implemented Interfaces: java.io.Serializable, ShardedFile
@Internal public class NumberedShardedFile extends java.lang.Object implements ShardedFile
Utility methods for working with sharded files. For internal use only; many parameters are hardcoded so that existing uses keep working.
- See Also: Serialized Form
-
-
Constructor Summary
Constructors
- NumberedShardedFile(java.lang.String filePattern): Constructor that uses the default shard template.
- NumberedShardedFile(java.lang.String filePattern, java.util.regex.Pattern shardTemplate): Constructor that takes an explicit shard template.
-
Method Summary
Methods (modifier and type, method, description)
- java.lang.String getFilePattern()
- java.util.List<java.lang.String> readFilesWithRetries(): Discovers all shards of this file.
- java.util.List<java.lang.String> readFilesWithRetries(Sleeper sleeper, BackOff backOff)
- java.lang.String toString()
-
-
-
Constructor Detail
-
NumberedShardedFile
public NumberedShardedFile(java.lang.String filePattern)
Constructor that uses the default shard template.
- Parameters:
filePattern - path or glob of files to include
-
NumberedShardedFile
public NumberedShardedFile(java.lang.String filePattern, java.util.regex.Pattern shardTemplate)
Constructor.
- Parameters:
filePattern - path or glob of files to include
shardTemplate - template of the shard name, used to parse out the total number of shards. The total is used during I/O retries to guard against filesystem inconsistency. A custom template should assign the name "numshards" to the capturing group that holds the total shard count.
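As an illustration of the "numshards" requirement, the following is a minimal sketch of a custom template that uses only java.util.regex. The file naming scheme (part-3-of-12.txt), the class name ShardTemplateSketch, and the helper totalShards are hypothetical and not part of this class; the "shardnum" group is included only for readability.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShardTemplateSketch {
  // Hypothetical custom template for names such as "part-3-of-12.txt".
  // The named group "numshards" captures the total shard count.
  static final Pattern CUSTOM_TEMPLATE =
      Pattern.compile("part-(?<shardnum>\\d+)-of-(?<numshards>\\d+)\\.txt");

  // Returns the total shard count encoded in the name, or -1 if the name does not match.
  static int totalShards(String fileName) {
    Matcher m = CUSTOM_TEMPLATE.matcher(fileName);
    return m.matches() ? Integer.parseInt(m.group("numshards")) : -1;
  }

  public static void main(String[] args) {
    System.out.println(totalShards("part-3-of-12.txt")); // prints 12
    System.out.println(totalShards("notes.txt"));        // prints -1
  }
}
```

A pattern of this shape could then be passed as the shardTemplate argument of the two-argument constructor.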
-
-
Method Detail
-
getFilePattern
public java.lang.String getFilePattern()
-
readFilesWithRetries
public java.util.List<java.lang.String> readFilesWithRetries(Sleeper sleeper, BackOff backOff) throws java.io.IOException, java.lang.InterruptedException
Discovers all shards of this file using the provided Sleeper and BackOff.
Because of eventual consistency, reads may discover no files or fewer files than the shard template implies. In this case, the read is considered to have failed.
- Specified by: readFilesWithRetries in interface ShardedFile
- Throws:
java.io.IOException
java.lang.InterruptedException
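The retry behavior described above can be pictured with a small, self-contained sketch. It does not use Sleeper or BackOff and is not this class's implementation; the Attempt interface, the doubling delay, and all names are assumptions chosen for illustration. An attempt that returns an empty Optional stands for a listing that found no files or too few files.

```java
import java.io.IOException;
import java.util.List;
import java.util.Optional;

final class RetryReadSketch {
  // Hypothetical stand-in for one listing attempt; empty means "incomplete, try again".
  interface Attempt {
    Optional<List<String>> run() throws IOException;
  }

  // Retries the attempt up to maxAttempts times, doubling the delay between tries.
  static List<String> readWithRetries(Attempt attempt, int maxAttempts, long initialDelayMillis)
      throws IOException, InterruptedException {
    IOException last = null;
    long delay = initialDelayMillis;
    for (int i = 0; i < maxAttempts; i++) {
      try {
        Optional<List<String>> result = attempt.run();
        if (result.isPresent()) {
          return result.get();
        }
      } catch (IOException e) {
        last = e; // remember the failure and retry
      }
      if (i < maxAttempts - 1) {
        Thread.sleep(delay);
        delay *= 2;
      }
    }
    throw new IOException("Unable to read all shards after " + maxAttempts + " attempts", last);
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Simulated listing: incomplete on the first two attempts, complete on the third.
    List<String> shards = readWithRetries(
        () -> ++calls[0] < 3 ? Optional.empty() : Optional.of(List.of("out-00000-of-00001")),
        5, 10L);
    System.out.println(shards + " after " + calls[0] + " attempts");
  }
}
```

Treating an incomplete listing the same way as an I/O error keeps the loop simple: either outcome leads to another attempt until the attempts run out.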
-
readFilesWithRetries
public java.util.List<java.lang.String> readFilesWithRetries() throws java.io.IOException, java.lang.InterruptedException
Discovers all shards of this file.
Because of eventual consistency, reads may discover no files or fewer files than the shard template implies. In this case, the read is considered to have failed.
- Throws:
java.io.IOException
java.lang.InterruptedException
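One way to decide that a listing found "fewer files than the shard template implies" is to compare the number of listed files with the total encoded in a shard name. The sketch below assumes a template in the NNNNN-of-NNNNN style and uses hypothetical names throughout; it is not taken from this class's source.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShardCountCheckSketch {
  // Assumed template in the "NNNNN-of-NNNNN" style; the class's actual default may differ.
  static final Pattern TEMPLATE =
      Pattern.compile(".*-(?<shardnum>\\d+)-of-(?<numshards>\\d+)");

  // True when some name matches the template and the listing size equals its "numshards" value.
  static boolean isComplete(List<String> fileNames) {
    for (String name : fileNames) {
      Matcher m = TEMPLATE.matcher(name);
      if (m.matches()) {
        return fileNames.size() == Integer.parseInt(m.group("numshards"));
      }
    }
    return false; // empty listing or no recognizable shard name
  }

  public static void main(String[] args) {
    System.out.println(isComplete(List.of("out-00000-of-00002", "out-00001-of-00002"))); // prints true
    System.out.println(isComplete(List.of("out-00000-of-00002")));                       // prints false
  }
}
```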
-
toString
public java.lang.String toString()
- Overrides: toString in class java.lang.Object
-
-