Class RealtimeAudioInputTurnDetection.ServerVad
-
public final class RealtimeAudioInputTurnDetection.ServerVad
Server-side voice activity detection (VAD), which turns on when user speech is detected and off after a period of silence.
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
public final class | RealtimeAudioInputTurnDetection.ServerVad.Builder | A builder for ServerVad.
-
Method Summary
Modifier and Type | Method | Description
final JsonValue | _type() | Type of turn detection, server_vad to turn on simple Server VAD.
final Optional<Boolean> | createResponse() | Whether or not to automatically generate a response when a VAD stop event occurs.
final Optional<Long> | idleTimeoutMs() | Optional timeout after which a model response will be triggered automatically.
final Optional<Boolean> | interruptResponse() | Whether or not to automatically interrupt any ongoing response with output to the default conversation (i.e. conversation of auto) when a VAD start event occurs.
final Optional<Long> | prefixPaddingMs() | Used only for server_vad mode.
final Optional<Long> | silenceDurationMs() | Used only for server_vad mode.
final Optional<Double> | threshold() | Used only for server_vad mode.
final JsonField<Boolean> | _createResponse() | Returns the raw JSON value of createResponse.
final JsonField<Long> | _idleTimeoutMs() | Returns the raw JSON value of idleTimeoutMs.
final JsonField<Boolean> | _interruptResponse() | Returns the raw JSON value of interruptResponse.
final JsonField<Long> | _prefixPaddingMs() | Returns the raw JSON value of prefixPaddingMs.
final JsonField<Long> | _silenceDurationMs() | Returns the raw JSON value of silenceDurationMs.
final JsonField<Double> | _threshold() | Returns the raw JSON value of threshold.
final Map<String, JsonValue> | _additionalProperties() |
final RealtimeAudioInputTurnDetection.ServerVad.Builder | toBuilder() |
final RealtimeAudioInputTurnDetection.ServerVad | validate() |
final Boolean | isValid() |
Boolean | equals(Object other) |
Integer | hashCode() |
String | toString() |
final static RealtimeAudioInputTurnDetection.ServerVad.Builder | builder() | Returns a mutable builder for constructing an instance of ServerVad.
-
Method Detail
-
_type
final JsonValue _type()
Type of turn detection, server_vad to turn on simple Server VAD.
Expected to always return the following:
JsonValue.from("server_vad")
However, this method can be useful for debugging and logging (e.g. if the server responded with an unexpected value).
-
createResponse
final Optional<Boolean> createResponse()
Whether or not to automatically generate a response when a VAD stop event occurs.
-
idleTimeoutMs
final Optional<Long> idleTimeoutMs()
Optional timeout after which a model response will be triggered automatically. This is useful for situations in which a long pause from the user is unexpected, such as a phone call. The model will effectively prompt the user to continue the conversation based on the current context.
The timeout value will be applied after the last model response's audio has finished playing, i.e. it's set to the response.done time plus audio playback duration.
An input_audio_buffer.timeout_triggered event (plus events associated with the Response) will be emitted when the timeout is reached. Idle timeout is currently only supported for server_vad mode.
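The timing rule above can be sketched with plain java.time arithmetic. This is only an illustration of one reading of the description: the idle timer is assumed to start once the response.done time plus the playback duration has elapsed, and to fire idleTimeoutMs later. The class IdleTimeoutSketch and the helper timeoutDeadline are hypothetical names and are not part of the SDK.

```java
import java.time.Duration;
import java.time.Instant;

public final class IdleTimeoutSketch {

    /**
     * Hypothetical helper (not an SDK method): estimates when the idle
     * timeout would fire, assuming the timer starts at the response.done
     * time plus the audio playback duration.
     */
    static Instant timeoutDeadline(Instant responseDone, Duration playback, Duration idleTimeout) {
        Instant playbackFinished = responseDone.plus(playback);
        return playbackFinished.plus(idleTimeout);
    }

    public static void main(String[] args) {
        Instant responseDone = Instant.parse("2024-01-01T00:00:00Z");
        Duration playback = Duration.ofMillis(2500);     // assumed audio length
        Duration idleTimeout = Duration.ofMillis(5000);  // example idleTimeoutMs value
        System.out.println(timeoutDeadline(responseDone, playback, idleTimeout));
    }
}
```

Under these assumptions, a client that wants to anticipate the input_audio_buffer.timeout_triggered event could compare the current time against this deadline; the server remains the source of truth for when the event is actually emitted.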
-
interruptResponse
final Optional<Boolean> interruptResponse()
Whether or not to automatically interrupt any ongoing response with output to the default conversation (i.e. conversation of auto) when a VAD start event occurs.
-
prefixPaddingMs
final Optional<Long> prefixPaddingMs()
Used only for server_vad mode. Amount of audio to include before the VAD detected speech (in milliseconds). Defaults to 300ms.
-
silenceDurationMs
final Optional<Long> silenceDurationMs()
Used only for server_vad mode. Duration of silence to detect speech stop (in milliseconds). Defaults to 500ms. With shorter values the model will respond more quickly, but may jump in on short pauses from the user.
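Because these accessors return Optional values, a caller that wants the effective setting has to decide what an empty Optional means. The sketch below assumes an empty Optional means the field was not set and the documented default applies (300 ms for prefix padding, 500 ms for silence duration). The class VadDefaults and its helpers are hypothetical and use only java.util.Optional, not the SDK.

```java
import java.util.Optional;

public final class VadDefaults {

    // Defaults as stated in the documentation above.
    static final long DEFAULT_PREFIX_PADDING_MS = 300L;
    static final long DEFAULT_SILENCE_DURATION_MS = 500L;

    /** Hypothetical helper: the configured prefix padding, or the documented default. */
    static long effectivePrefixPaddingMs(Optional<Long> configured) {
        return configured.orElse(DEFAULT_PREFIX_PADDING_MS);
    }

    /** Hypothetical helper: the configured silence duration, or the documented default. */
    static long effectiveSilenceDurationMs(Optional<Long> configured) {
        return configured.orElse(DEFAULT_SILENCE_DURATION_MS);
    }

    public static void main(String[] args) {
        Optional<Long> unset = Optional.empty();
        Optional<Long> custom = Optional.of(200L);
        System.out.println(effectivePrefixPaddingMs(unset));    // falls back to the default
        System.out.println(effectiveSilenceDurationMs(custom)); // uses the configured value
    }
}
```

In real code the Optional arguments would come from prefixPaddingMs() and silenceDurationMs() on a ServerVad instance.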
-
threshold
final Optional<Double> threshold()
Used only for server_vad mode. Activation threshold for VAD (0.0 to 1.0); defaults to 0.5. A higher threshold will require louder audio to activate the model, and thus might perform better in noisy environments.
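To make the interaction between threshold, prefixPaddingMs, and silenceDurationMs concrete, here is a toy detector over per-frame audio levels. It is not the server's algorithm, which this page does not describe. It is a minimal sketch that assumes each frame has a level between 0.0 and 1.0, that speech starts when a level reaches the threshold, and that speech stops once enough consecutive quiet frames accumulate. ToyVad and detect are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public final class ToyVad {

    /**
     * Hypothetical illustration only. Scans fixed-length frames and returns
     * events such as "start@100" and "stop@1100" (times in milliseconds).
     */
    static List<String> detect(double[] levels, int frameMs, double threshold,
                               long prefixPaddingMs, long silenceDurationMs) {
        List<String> events = new ArrayList<>();
        boolean speaking = false;
        long silentMs = 0;
        for (int i = 0; i < levels.length; i++) {
            long frameStart = (long) i * frameMs;
            boolean loud = levels[i] >= threshold;
            if (!speaking && loud) {
                // Speech begins; include some audio from before the detection point.
                speaking = true;
                silentMs = 0;
                events.add("start@" + Math.max(0, frameStart - prefixPaddingMs));
            } else if (speaking && loud) {
                // Still speaking; reset the silence counter.
                silentMs = 0;
            } else if (speaking) {
                // Quiet frame during speech; stop once enough silence has accumulated.
                silentMs += frameMs;
                if (silentMs >= silenceDurationMs) {
                    speaking = false;
                    events.add("stop@" + (frameStart + frameMs));
                }
            }
        }
        return events;
    }

    public static void main(String[] args) {
        double[] levels = {0.1, 0.1, 0.1, 0.1, 0.8, 0.9, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1};
        // 100 ms frames, with the documented defaults for the three parameters.
        System.out.println(detect(levels, 100, 0.5, 300, 500));
        // A higher threshold ignores the 0.8 frame, so speech is detected later.
        System.out.println(detect(levels, 100, 0.85, 300, 500));
    }
}
```

In this toy model, raising the threshold delays or suppresses activation, and shortening the silence duration moves the stop event earlier, which mirrors the trade-offs described for the real parameters.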
-
_createResponse
final JsonField<Boolean> _createResponse()
Returns the raw JSON value of createResponse.
Unlike createResponse, this method doesn't throw if the JSON field has an unexpected type.
-
_idleTimeoutMs
final JsonField<Long> _idleTimeoutMs()
Returns the raw JSON value of idleTimeoutMs.
Unlike idleTimeoutMs, this method doesn't throw if the JSON field has an unexpected type.
-
_interruptResponse
final JsonField<Boolean> _interruptResponse()
Returns the raw JSON value of interruptResponse.
Unlike interruptResponse, this method doesn't throw if the JSON field has an unexpected type.
-
_prefixPaddingMs
final JsonField<Long> _prefixPaddingMs()
Returns the raw JSON value of prefixPaddingMs.
Unlike prefixPaddingMs, this method doesn't throw if the JSON field has an unexpected type.
-
_silenceDurationMs
final JsonField<Long> _silenceDurationMs()
Returns the raw JSON value of silenceDurationMs.
Unlike silenceDurationMs, this method doesn't throw if the JSON field has an unexpected type.
-
_threshold
final JsonField<Double> _threshold()
Returns the raw JSON value of threshold.
Unlike threshold, this method doesn't throw if the JSON field has an unexpected type.
-
_additionalProperties
final Map<String, JsonValue> _additionalProperties()
-
toBuilder
final RealtimeAudioInputTurnDetection.ServerVad.Builder toBuilder()
-
validate
final RealtimeAudioInputTurnDetection.ServerVad validate()
-
builder
final static RealtimeAudioInputTurnDetection.ServerVad.Builder builder()
Returns a mutable builder for constructing an instance of ServerVad.
-