Package org.apache.druid.segment.loading
Interface DataSegmentPusher

All Known Implementing Classes:
    NoopDataSegmentPusher

public interface DataSegmentPusher
Field Summary

static com.google.common.base.Joiner JOINER
-
Method Summary

static String generateUniquePath()

default List<String> getAllowedPropertyPrefixesForHadoop()
    Property prefixes that should be added to the "allowedHadoopPrefix" config for passing down to Hadoop jobs.

static String getDefaultStorageDir(DataSegment segment, boolean useUniquePath)

static String getDefaultStorageDirWithExistingUniquePath(DataSegment segment, String uniquePath)

String getPathForHadoop()

String getPathForHadoop(String dataSource)
    Deprecated.

default String getStorageDir(DataSegment dataSegment)
    Deprecated. Backward-compatibility shim that should be removed in the next major release; use getStorageDir(DataSegment, boolean) instead.

default String getStorageDir(DataSegment dataSegment, boolean useUniquePath)

default String makeIndexPathName(DataSegment dataSegment, String indexName)

Map<String,Object> makeLoadSpec(URI finalIndexZipFilePath)

DataSegment push(File file, DataSegment segment, boolean useUniquePath)
    Pushes index files and segment descriptor to deep storage.

default DataSegment pushToPath(File indexFilesDir, DataSegment segment, String storageDirSuffix)
Method Detail
-
getPathForHadoop
@Deprecated String getPathForHadoop(String dataSource)
Deprecated.
-
getPathForHadoop
String getPathForHadoop()
-
push
DataSegment push(File file, DataSegment segment, boolean useUniquePath) throws IOException
Pushes index files and segment descriptor to deep storage.

Parameters:
    file - directory containing index files
    segment - segment descriptor
    useUniquePath - if true, pushes to a unique file path. This prevents situations where task failures or replica tasks can either overwrite or fail to overwrite existing segments, leading to the possibility of different versions of the same segment ID containing different data. As an example, a Kafka indexing task starting at offset A and ending at offset B may push a segment to deep storage and then fail before writing the loadSpec to the metadata table, resulting in a replacement task being spawned. This replacement will also start at offset A but will read to offset C; it will then push a segment to deep storage and write the loadSpec metadata. Without unique file paths, this can only work correctly if new segments overwrite existing segments. Suppose that at this point the task fails, so that the supervisor retries again from offset A. This third attempt will overwrite the segments in deep storage before failing to write the loadSpec metadata, resulting in inconsistencies between the segment data now in deep storage and the copies of the segment already loaded by historicals. If unique paths are used, the caller is responsible for cleaning up segments that were pushed but never written to the metadata table (for example, when using replica tasks).

Returns:
    segment descriptor
- Throws:
IOException
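The retry scenario above can be sketched in a few lines. This is a minimal, self-contained illustration (all class and method names here are hypothetical, not part of the Druid API): with useUniquePath, each push attempt writes to its own suffixed path, so a retry can never clobber files a historical may already have loaded.

```java
import java.util.UUID;

public class UniquePathSketch {
    // The base path identifies the segment and is identical across retries
    // and replica tasks (hypothetical layout for illustration only).
    static String basePath(String dataSource, String interval, String version, int partitionNum) {
        return String.join("/", dataSource, interval, version, Integer.toString(partitionNum));
    }

    // With useUniquePath, each push attempt appends a fresh suffix, so two
    // attempts for the same segment ID never target the same files.
    static String pushPath(String base, boolean useUniquePath) {
        return useUniquePath ? base + "/" + UUID.randomUUID() : base;
    }

    public static void main(String[] args) {
        String base = basePath("wikipedia", "2024-01-01T00Z_2024-01-02T00Z", "v1", 0);
        String attempt1 = pushPath(base, true);
        String attempt2 = pushPath(base, true);
        // Distinct targets per attempt: a failed attempt leaves garbage to
        // clean up, but never corrupts an already-published segment.
        System.out.println(!attempt1.equals(attempt2));
    }
}
```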
-
pushToPath
default DataSegment pushToPath(File indexFilesDir, DataSegment segment, String storageDirSuffix) throws IOException
- Throws:
IOException
-
getStorageDir
@Deprecated default String getStorageDir(DataSegment dataSegment)
Deprecated. Backward-compatibility shim that should be removed in the next major release; use getStorageDir(DataSegment, boolean) instead.
-
getStorageDir
default String getStorageDir(DataSegment dataSegment, boolean useUniquePath)
-
makeIndexPathName
default String makeIndexPathName(DataSegment dataSegment, String indexName)
-
getAllowedPropertyPrefixesForHadoop
default List<String> getAllowedPropertyPrefixesForHadoop()
Property prefixes that should be added to the "allowedHadoopPrefix" config for passing down to Hadoop jobs. Each prefix, such as "druid.xxx", matches both the property "druid.xxx" itself and every property under "druid.xxx.*".
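The prefix-matching rule described above can be sketched as follows. This is an illustrative, standalone matcher (the class and method names are hypothetical, not the logic Druid ships): a prefix matches the property exactly or any dotted descendant of it, but not a mere string prefix.

```java
import java.util.Arrays;
import java.util.List;

public class HadoopPrefixSketch {
    // A prefix "druid.xxx" matches the property "druid.xxx" itself and
    // every property underneath it ("druid.xxx.*"). Note the appended "."
    // so that "druid.storageBucket" does NOT match the prefix "druid.storage".
    static boolean isAllowed(String property, List<String> prefixes) {
        for (String prefix : prefixes) {
            if (property.equals(prefix) || property.startsWith(prefix + ".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> prefixes = Arrays.asList("druid.storage");
        System.out.println(isAllowed("druid.storage", prefixes));       // exact match
        System.out.println(isAllowed("druid.storage.type", prefixes));  // dotted descendant
        System.out.println(isAllowed("druid.storageBucket", prefixes)); // not a match
    }
}
```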
-
getDefaultStorageDir
static String getDefaultStorageDir(DataSegment segment, boolean useUniquePath)
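As a rough illustration, Druid's conventional deep-storage layout places segment files under a path derived from the data source, interval, version, and partition number. The sketch below is an assumption about that layout (the exact interval formatting and separators may differ from what getDefaultStorageDir actually produces), shown only to convey the shape of the result:

```java
public class DefaultStorageDirSketch {
    // Hypothetical rendering of the layout: dataSource/start_end/version/partitionNum.
    // Real intervals are formatted from the segment's Interval; exact formatting
    // is an assumption here.
    static String defaultStorageDir(String dataSource, String intervalStart,
                                    String intervalEnd, String version, int partitionNum) {
        return String.join("/",
                dataSource,
                intervalStart + "_" + intervalEnd,
                version,
                Integer.toString(partitionNum));
    }

    public static void main(String[] args) {
        System.out.println(defaultStorageDir(
                "wikipedia",
                "2024-01-01T00:00:00.000Z",
                "2024-01-02T00:00:00.000Z",
                "v1",
                0));
    }
}
```

With useUniquePath set, a generated suffix (see generateUniquePath()) would be appended below this base directory.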
-
getDefaultStorageDirWithExistingUniquePath
static String getDefaultStorageDirWithExistingUniquePath(DataSegment segment, String uniquePath)
-
generateUniquePath
static String generateUniquePath()