public class HttpInputSource extends AbstractInputSource implements SplittableInputSource<URI>
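| Modifier and Type | Field and Description |
|---|---|
| static String | TYPE_KEY |

Fields inherited from interface SplittableInputSource: DEFAULT_SPLIT_HINT_SPEC
Fields inherited from interface InputSource: TYPE_PROPERTY

| Constructor and Description |
|---|
| HttpInputSource(List<URI> uris, String httpAuthenticationUsername, PasswordProvider httpAuthenticationPasswordProvider, HttpInputSourceConfig config) |

Methods inherited from class AbstractInputSource: fixedFormatReader, reader
Methods inherited from class Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface SplittableInputSource: getSplitHintSpecOrDefault, isSplittable
Methods inherited from interface InputSource: reader

public static final String TYPE_KEY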
public HttpInputSource(List<URI> uris, @Nullable String httpAuthenticationUsername, @Nullable PasswordProvider httpAuthenticationPasswordProvider, HttpInputSourceConfig config)
@Nonnull public Set<String> getTypes()
Specified by: getTypes in interface InputSource
public static void throwIfInvalidProtocols(HttpInputSourceConfig config, List<URI> uris)
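
This page gives no description for throwIfInvalidProtocols, so the following is only a minimal sketch of what its name suggests: rejecting URIs whose scheme is not permitted by the configuration. The class `ProtocolCheckSketch` and the `allowedProtocols` parameter are hypothetical stand-ins for whatever HttpInputSourceConfig exposes, and the exception type and message are assumptions, not Druid's actual behavior.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of Druid.
final class ProtocolCheckSketch {

    /**
     * Throws IllegalArgumentException if any URI has no scheme or a scheme
     * outside allowedProtocols (compared case-insensitively).
     * allowedProtocols is an assumed stand-in for the config object.
     */
    static void throwIfInvalidProtocols(Set<String> allowedProtocols, List<URI> uris) {
        for (URI uri : uris) {
            String scheme = uri.getScheme();
            if (scheme == null || !allowedProtocols.contains(scheme.toLowerCase(Locale.ROOT))) {
                throw new IllegalArgumentException(
                    "Only " + allowedProtocols + " protocols are allowed, got: " + uri);
            }
        }
    }

    public static void main(String[] args) {
        Set<String> allowed = Set.of("http", "https");
        // Passes silently: https is in the allowed set.
        throwIfInvalidProtocols(allowed, List.of(URI.create("https://example.com/data.json")));
        try {
            throwIfInvalidProtocols(allowed, List.of(URI.create("ftp://example.com/data.json")));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Validating the whole list up front means a bad URI fails fast, before any split is created or any connection is opened.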
@Nullable public PasswordProvider getHttpAuthenticationPasswordProvider()
public Stream<InputSplit<URI>> createSplits(InputFormat inputFormat, @Nullable SplitHintSpec splitHintSpec)
Description copied from interface: SplittableInputSource
Creates a Stream of InputSplits. The returned stream is supposed to be evaluated lazily to avoid
consuming too much memory.
Note that this interface also has the related method SplittableInputSource.estimateNumSplits(org.apache.druid.data.input.InputFormat, org.apache.druid.data.input.SplitHintSpec). Implementations
must be careful not to cache the created splits in memory.
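
The lazy-evaluation requirement can be illustrated with a small sketch, assuming an input source that produces one split per URI. `UriSplit` and `LazySplitSketch` are hypothetical stand-ins written for this example; they are not Druid's InputSplit or HttpInputSource, and the counter exists only to make the laziness observable.

```java
import java.net.URI;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical stand-in for InputSplit<URI>; not the real Druid class.
final class UriSplit {
    private final URI uri;

    UriSplit(URI uri) {
        this.uri = uri;
    }

    URI get() {
        return uri;
    }
}

// Hypothetical illustration only; not part of Druid.
final class LazySplitSketch {
    // Counts how many splits have actually been constructed.
    static final AtomicInteger CREATED = new AtomicInteger();

    /** Returns one split per URI. No split is built until the stream is consumed. */
    static Stream<UriSplit> createSplits(List<URI> uris) {
        return uris.stream().map(uri -> {
            CREATED.incrementAndGet();
            return new UriSplit(uri);
        });
    }

    public static void main(String[] args) {
        List<URI> uris = List.of(
            URI.create("http://example.com/a.json"),
            URI.create("http://example.com/b.json"));
        Stream<UriSplit> splits = createSplits(uris);
        System.out.println("created before consumption: " + CREATED.get());
        splits.forEach(split -> System.out.println(split.get()));
        System.out.println("created after consumption: " + CREATED.get());
    }
}
```

Because the stream is built directly from the URI list and nothing is collected into a field, each call produces fresh splits on demand and no split outlives its consumer.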
Implementations can consider InputFormat.isSplittable() and SplitHintSpec to create splits
in the same way as SplittableInputSource.estimateNumSplits(org.apache.druid.data.input.InputFormat, org.apache.druid.data.input.SplitHintSpec).
Specified by: createSplits in interface SplittableInputSource<URI>
public int estimateNumSplits(InputFormat inputFormat, @Nullable SplitHintSpec splitHintSpec)
Description copied from interface: SplittableInputSource
Returns an estimated total number of splits to be created via SplittableInputSource.createSplits(org.apache.druid.data.input.InputFormat, org.apache.druid.data.input.SplitHintSpec). The estimated number of splits
doesn't have to be accurate and can differ from the actual number of InputSplits returned from
SplittableInputSource.createSplits(org.apache.druid.data.input.InputFormat, org.apache.druid.data.input.SplitHintSpec). It is used to estimate the progress of a phase in parallel indexing.
See TaskMonitor for more details of the progress estimation.
This method can be expensive if an implementation iterates over all directories or other substructures
to find all input entities.
Implementations can consider InputFormat.isSplittable() and SplitHintSpec to find splits
in the same way as SplittableInputSource.createSplits(org.apache.druid.data.input.InputFormat, org.apache.druid.data.input.SplitHintSpec).
Specified by: estimateNumSplits in interface SplittableInputSource<URI>
public SplittableInputSource<URI> withSplit(InputSplit<URI> split)
Specified by: withSplit in interface SplittableInputSource<URI>
protected InputSourceReader formattableReader(InputRowSchema inputRowSchema, InputFormat inputFormat, @Nullable File temporaryDirectory)
Overrides: formattableReader in class AbstractInputSource
public boolean needsFormat()
Description copied from interface: InputSource
Returns true if this inputSource supports different InputFormats. Some inputSources such as
LocalInputSource can store files of any format. These storage types require an InputFormat
to be passed so that InputSourceReader can parse data properly. However, some storage types have
a fixed format. For example, the druid inputSource always reads segments. These inputSources should return false for
this method.
Specified by: needsFormat in interface InputSource
Copyright © 2011–2023 The Apache Software Foundation. All rights reserved.