Class RealtimeServerEvent
-
- All Implemented Interfaces:
public final class RealtimeServerEvent
A realtime server event.
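Example: a minimal sketch of inspecting an incoming RealtimeServerEvent with the Optional-returning accessors documented below. The import path follows the SDK's usual package layout but is an assumption; check it against your installed version.

    import com.openai.models.realtime.RealtimeServerEvent; // package path assumed

    final class ServerEventLogger {
        // Each accessor returns a non-empty Optional only for its own variant,
        // so variants not handled here simply fall through.
        static void log(RealtimeServerEvent event) {
            event.sessionCreated().ifPresent(session ->
                    System.out.println("session created: " + session));
            event.responseOutputTextDelta().ifPresent(delta ->
                    System.out.println("output text delta: " + delta));
            event.error().ifPresent(error ->
                    System.err.println("server reported an error: " + error));
        }
    }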
-
-
Nested Class Summary
public interface RealtimeServerEvent.Visitor
An interface that defines how to map each variant of RealtimeServerEvent to a value of type T.
public final class RealtimeServerEvent.ConversationItemRetrieved
Returned when a conversation item is retrieved with conversation.item.retrieve. This is provided as a way to fetch the server's representation of an item, for example to get access to the post-processed audio data after noise cancellation and VAD. It includes the full content of the Item, including audio data.
public final class RealtimeServerEvent.OutputAudioBufferStarted
WebRTC Only: Emitted when the server begins streaming audio to the client. This event is emitted after an audio content part has been added (response.content_part.added) to the response. Learn more.
public final class RealtimeServerEvent.OutputAudioBufferStopped
WebRTC Only: Emitted when the output audio buffer has been completely drained on the server, and no more audio is forthcoming. This event is emitted after the full response data has been sent to the client (response.done). Learn more.
public final class RealtimeServerEvent.OutputAudioBufferCleared
WebRTC Only: Emitted when the output audio buffer is cleared. This happens either in VAD mode when the user has interrupted (input_audio_buffer.speech_started), or when the client has emitted the output_audio_buffer.clear event to manually cut off the current audio response. Learn more.
-
Method Summary
final Optional<ConversationCreatedEvent> conversationCreated() - Returned when a conversation is created.
final Optional<ConversationItemCreatedEvent> conversationItemCreated() - Returned when a conversation item is created.
final Optional<ConversationItemDeletedEvent> conversationItemDeleted() - Returned when an item in the conversation is deleted by the client with a conversation.item.delete event.
final Optional<ConversationItemInputAudioTranscriptionCompletedEvent> conversationItemInputAudioTranscriptionCompleted() - This event is the output of audio transcription for user audio written to the user audio buffer.
final Optional<ConversationItemInputAudioTranscriptionDeltaEvent> conversationItemInputAudioTranscriptionDelta() - Returned when the text value of an input audio transcription content part is updated with incremental transcription results.
final Optional<ConversationItemInputAudioTranscriptionFailedEvent> conversationItemInputAudioTranscriptionFailed() - Returned when input audio transcription is configured, and a transcription request for a user message failed.
final Optional<RealtimeServerEvent.ConversationItemRetrieved> conversationItemRetrieved() - Returned when a conversation item is retrieved with conversation.item.retrieve.
final Optional<ConversationItemTruncatedEvent> conversationItemTruncated() - Returned when an earlier assistant audio message item is truncated by the client with a conversation.item.truncate event.
final Optional<RealtimeErrorEvent> error() - Returned when an error occurs, which could be a client problem or a server problem.
final Optional<InputAudioBufferClearedEvent> inputAudioBufferCleared() - Returned when the input audio buffer is cleared by the client with a input_audio_buffer.clear event.
final Optional<InputAudioBufferCommittedEvent> inputAudioBufferCommitted() - Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode.
final Optional<InputAudioBufferSpeechStartedEvent> inputAudioBufferSpeechStarted() - Sent by the server when in server_vad mode to indicate that speech has been detected in the audio buffer.
final Optional<InputAudioBufferSpeechStoppedEvent> inputAudioBufferSpeechStopped() - Returned in server_vad mode when the server detects the end of speech in the audio buffer.
final Optional<RateLimitsUpdatedEvent> rateLimitsUpdated() - Emitted at the beginning of a Response to indicate the updated rate limits.
final Optional<ResponseAudioDeltaEvent> responseOutputAudioDelta() - Returned when the model-generated audio is updated.
final Optional<ResponseAudioDoneEvent> responseOutputAudioDone() - Returned when the model-generated audio is done.
final Optional<ResponseAudioTranscriptDeltaEvent> responseOutputAudioTranscriptDelta() - Returned when the model-generated transcription of audio output is updated.
final Optional<ResponseAudioTranscriptDoneEvent> responseOutputAudioTranscriptDone() - Returned when the model-generated transcription of audio output is done streaming.
final Optional<ResponseContentPartAddedEvent> responseContentPartAdded() - Returned when a new content part is added to an assistant message item during response generation.
final Optional<ResponseContentPartDoneEvent> responseContentPartDone() - Returned when a content part is done streaming in an assistant message item.
final Optional<ResponseCreatedEvent> responseCreated() - Returned when a new Response is created.
final Optional<ResponseDoneEvent> responseDone() - Returned when a Response is done streaming.
final Optional<ResponseFunctionCallArgumentsDeltaEvent> responseFunctionCallArgumentsDelta() - Returned when the model-generated function call arguments are updated.
final Optional<ResponseFunctionCallArgumentsDoneEvent> responseFunctionCallArgumentsDone() - Returned when the model-generated function call arguments are done streaming.
final Optional<ResponseOutputItemAddedEvent> responseOutputItemAdded() - Returned when a new Item is created during Response generation.
final Optional<ResponseOutputItemDoneEvent> responseOutputItemDone() - Returned when an Item is done streaming.
final Optional<ResponseTextDeltaEvent> responseOutputTextDelta() - Returned when the text value of an "output_text" content part is updated.
final Optional<ResponseTextDoneEvent> responseOutputTextDone() - Returned when the text value of an "output_text" content part is done streaming.
final Optional<SessionCreatedEvent> sessionCreated() - Returned when a Session is created.
final Optional<SessionUpdatedEvent> sessionUpdated() - Returned when a session is updated with a session.update event, unless there is an error.
final Optional<RealtimeServerEvent.OutputAudioBufferStarted> outputAudioBufferStarted() - WebRTC Only: Emitted when the server begins streaming audio to the client.
final Optional<RealtimeServerEvent.OutputAudioBufferStopped> outputAudioBufferStopped() - WebRTC Only: Emitted when the output audio buffer has been completely drained on the server, and no more audio is forthcoming.
final Optional<RealtimeServerEvent.OutputAudioBufferCleared> outputAudioBufferCleared() - WebRTC Only: Emitted when the output audio buffer is cleared.
final Optional<ConversationItemAdded> conversationItemAdded() - Sent by the server when an Item is added to the default Conversation.
final Optional<ConversationItemDone> conversationItemDone() - Returned when a conversation item is finalized.
final Optional<InputAudioBufferTimeoutTriggered> inputAudioBufferTimeoutTriggered() - Returned when the Server VAD timeout is triggered for the input audio buffer.
final Optional<ConversationItemInputAudioTranscriptionSegment> conversationItemInputAudioTranscriptionSegment() - Returned when an input audio transcription segment is identified for an item.
final Optional<McpListToolsInProgress> mcpListToolsInProgress() - Returned when listing MCP tools is in progress for an item.
final Optional<McpListToolsCompleted> mcpListToolsCompleted() - Returned when listing MCP tools has completed for an item.
final Optional<McpListToolsFailed> mcpListToolsFailed() - Returned when listing MCP tools has failed for an item.
final Optional<ResponseMcpCallArgumentsDelta> responseMcpCallArgumentsDelta() - Returned when MCP tool call arguments are updated during response generation.
final Optional<ResponseMcpCallArgumentsDone> responseMcpCallArgumentsDone() - Returned when MCP tool call arguments are finalized during response generation.
final Optional<ResponseMcpCallInProgress> responseMcpCallInProgress() - Returned when an MCP tool call has started and is in progress.
final Optional<ResponseMcpCallCompleted> responseMcpCallCompleted() - Returned when an MCP tool call has completed successfully.
final Optional<ResponseMcpCallFailed> responseMcpCallFailed() - Returned when an MCP tool call has failed.
final Boolean
isConversationCreated()
final Boolean
isConversationItemCreated()
final Boolean
isConversationItemDeleted()
final Boolean
isConversationItemInputAudioTranscriptionCompleted()
final Boolean
isConversationItemInputAudioTranscriptionDelta()
final Boolean
isConversationItemInputAudioTranscriptionFailed()
final Boolean
isConversationItemRetrieved()
final Boolean
isConversationItemTruncated()
final Boolean
isError()
final Boolean
isInputAudioBufferCleared()
final Boolean
isInputAudioBufferCommitted()
final Boolean
isInputAudioBufferSpeechStarted()
final Boolean
isInputAudioBufferSpeechStopped()
final Boolean
isRateLimitsUpdated()
final Boolean
isResponseOutputAudioDelta()
final Boolean
isResponseOutputAudioDone()
final Boolean
isResponseOutputAudioTranscriptDelta()
final Boolean
isResponseOutputAudioTranscriptDone()
final Boolean
isResponseContentPartAdded()
final Boolean
isResponseContentPartDone()
final Boolean
isResponseCreated()
final Boolean
isResponseDone()
final Boolean
isResponseFunctionCallArgumentsDelta()
final Boolean
isResponseFunctionCallArgumentsDone()
final Boolean
isResponseOutputItemAdded()
final Boolean
isResponseOutputItemDone()
final Boolean
isResponseOutputTextDelta()
final Boolean
isResponseOutputTextDone()
final Boolean
isSessionCreated()
final Boolean
isSessionUpdated()
final Boolean
isOutputAudioBufferStarted()
final Boolean
isOutputAudioBufferStopped()
final Boolean
isOutputAudioBufferCleared()
final Boolean
isConversationItemAdded()
final Boolean
isConversationItemDone()
final Boolean
isInputAudioBufferTimeoutTriggered()
final Boolean
isConversationItemInputAudioTranscriptionSegment()
final Boolean
isMcpListToolsInProgress()
final Boolean
isMcpListToolsCompleted()
final Boolean
isMcpListToolsFailed()
final Boolean
isResponseMcpCallArgumentsDelta()
final Boolean
isResponseMcpCallArgumentsDone()
final Boolean
isResponseMcpCallInProgress()
final Boolean
isResponseMcpCallCompleted()
final Boolean
isResponseMcpCallFailed()
final ConversationCreatedEvent asConversationCreated() - Returned when a conversation is created.
final ConversationItemCreatedEvent asConversationItemCreated() - Returned when a conversation item is created.
final ConversationItemDeletedEvent asConversationItemDeleted() - Returned when an item in the conversation is deleted by the client with a conversation.item.delete event.
final ConversationItemInputAudioTranscriptionCompletedEvent asConversationItemInputAudioTranscriptionCompleted() - This event is the output of audio transcription for user audio written to the user audio buffer.
final ConversationItemInputAudioTranscriptionDeltaEvent asConversationItemInputAudioTranscriptionDelta() - Returned when the text value of an input audio transcription content part is updated with incremental transcription results.
final ConversationItemInputAudioTranscriptionFailedEvent asConversationItemInputAudioTranscriptionFailed() - Returned when input audio transcription is configured, and a transcription request for a user message failed.
final RealtimeServerEvent.ConversationItemRetrieved asConversationItemRetrieved() - Returned when a conversation item is retrieved with conversation.item.retrieve.
final ConversationItemTruncatedEvent asConversationItemTruncated() - Returned when an earlier assistant audio message item is truncated by the client with a conversation.item.truncate event.
final RealtimeErrorEvent asError() - Returned when an error occurs, which could be a client problem or a server problem.
final InputAudioBufferClearedEvent asInputAudioBufferCleared() - Returned when the input audio buffer is cleared by the client with a input_audio_buffer.clear event.
final InputAudioBufferCommittedEvent asInputAudioBufferCommitted() - Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode.
final InputAudioBufferSpeechStartedEvent asInputAudioBufferSpeechStarted() - Sent by the server when in server_vad mode to indicate that speech has been detected in the audio buffer.
final InputAudioBufferSpeechStoppedEvent asInputAudioBufferSpeechStopped() - Returned in server_vad mode when the server detects the end of speech in the audio buffer.
final RateLimitsUpdatedEvent asRateLimitsUpdated() - Emitted at the beginning of a Response to indicate the updated rate limits.
final ResponseAudioDeltaEvent asResponseOutputAudioDelta() - Returned when the model-generated audio is updated.
final ResponseAudioDoneEvent asResponseOutputAudioDone() - Returned when the model-generated audio is done.
final ResponseAudioTranscriptDeltaEvent asResponseOutputAudioTranscriptDelta() - Returned when the model-generated transcription of audio output is updated.
final ResponseAudioTranscriptDoneEvent asResponseOutputAudioTranscriptDone() - Returned when the model-generated transcription of audio output is done streaming.
final ResponseContentPartAddedEvent asResponseContentPartAdded() - Returned when a new content part is added to an assistant message item during response generation.
final ResponseContentPartDoneEvent asResponseContentPartDone() - Returned when a content part is done streaming in an assistant message item.
final ResponseCreatedEvent asResponseCreated() - Returned when a new Response is created.
final ResponseDoneEvent asResponseDone() - Returned when a Response is done streaming.
final ResponseFunctionCallArgumentsDeltaEvent asResponseFunctionCallArgumentsDelta() - Returned when the model-generated function call arguments are updated.
final ResponseFunctionCallArgumentsDoneEvent asResponseFunctionCallArgumentsDone() - Returned when the model-generated function call arguments are done streaming.
final ResponseOutputItemAddedEvent asResponseOutputItemAdded() - Returned when a new Item is created during Response generation.
final ResponseOutputItemDoneEvent asResponseOutputItemDone() - Returned when an Item is done streaming.
final ResponseTextDeltaEvent asResponseOutputTextDelta() - Returned when the text value of an "output_text" content part is updated.
final ResponseTextDoneEvent asResponseOutputTextDone() - Returned when the text value of an "output_text" content part is done streaming.
final SessionCreatedEvent asSessionCreated() - Returned when a Session is created.
final SessionUpdatedEvent asSessionUpdated() - Returned when a session is updated with a session.update event, unless there is an error.
final RealtimeServerEvent.OutputAudioBufferStarted asOutputAudioBufferStarted() - WebRTC Only: Emitted when the server begins streaming audio to the client.
final RealtimeServerEvent.OutputAudioBufferStopped asOutputAudioBufferStopped() - WebRTC Only: Emitted when the output audio buffer has been completely drained on the server, and no more audio is forthcoming.
final RealtimeServerEvent.OutputAudioBufferCleared asOutputAudioBufferCleared() - WebRTC Only: Emitted when the output audio buffer is cleared.
final ConversationItemAdded asConversationItemAdded() - Sent by the server when an Item is added to the default Conversation.
final ConversationItemDone asConversationItemDone() - Returned when a conversation item is finalized.
final InputAudioBufferTimeoutTriggered asInputAudioBufferTimeoutTriggered() - Returned when the Server VAD timeout is triggered for the input audio buffer.
final ConversationItemInputAudioTranscriptionSegment asConversationItemInputAudioTranscriptionSegment() - Returned when an input audio transcription segment is identified for an item.
final McpListToolsInProgress asMcpListToolsInProgress() - Returned when listing MCP tools is in progress for an item.
final McpListToolsCompleted asMcpListToolsCompleted() - Returned when listing MCP tools has completed for an item.
final McpListToolsFailed asMcpListToolsFailed() - Returned when listing MCP tools has failed for an item.
final ResponseMcpCallArgumentsDelta asResponseMcpCallArgumentsDelta() - Returned when MCP tool call arguments are updated during response generation.
final ResponseMcpCallArgumentsDone asResponseMcpCallArgumentsDone() - Returned when MCP tool call arguments are finalized during response generation.
final ResponseMcpCallInProgress asResponseMcpCallInProgress() - Returned when an MCP tool call has started and is in progress.
final ResponseMcpCallCompleted asResponseMcpCallCompleted() - Returned when an MCP tool call has completed successfully.
final ResponseMcpCallFailed asResponseMcpCallFailed() - Returned when an MCP tool call has failed.
final Optional<JsonValue>
_json()
final <T extends Any> T
accept(RealtimeServerEvent.Visitor<T> visitor)
final RealtimeServerEvent
validate()
final Boolean
isValid()
Boolean
equals(Object other)
Integer
hashCode()
String
toString()
final static RealtimeServerEvent ofConversationCreated(ConversationCreatedEvent conversationCreated) - Returned when a conversation is created.
final static RealtimeServerEvent ofConversationItemCreated(ConversationItemCreatedEvent conversationItemCreated) - Returned when a conversation item is created.
final static RealtimeServerEvent ofConversationItemDeleted(ConversationItemDeletedEvent conversationItemDeleted) - Returned when an item in the conversation is deleted by the client with a conversation.item.delete event.
final static RealtimeServerEvent ofConversationItemInputAudioTranscriptionCompleted(ConversationItemInputAudioTranscriptionCompletedEvent conversationItemInputAudioTranscriptionCompleted) - This event is the output of audio transcription for user audio written to the user audio buffer.
final static RealtimeServerEvent ofConversationItemInputAudioTranscriptionDelta(ConversationItemInputAudioTranscriptionDeltaEvent conversationItemInputAudioTranscriptionDelta) - Returned when the text value of an input audio transcription content part is updated with incremental transcription results.
final static RealtimeServerEvent ofConversationItemInputAudioTranscriptionFailed(ConversationItemInputAudioTranscriptionFailedEvent conversationItemInputAudioTranscriptionFailed) - Returned when input audio transcription is configured, and a transcription request for a user message failed.
final static RealtimeServerEvent ofConversationItemRetrieved(RealtimeServerEvent.ConversationItemRetrieved conversationItemRetrieved) - Returned when a conversation item is retrieved with conversation.item.retrieve.
final static RealtimeServerEvent ofConversationItemTruncated(ConversationItemTruncatedEvent conversationItemTruncated) - Returned when an earlier assistant audio message item is truncated by the client with a conversation.item.truncate event.
final static RealtimeServerEvent ofError(RealtimeErrorEvent error) - Returned when an error occurs, which could be a client problem or a server problem.
final static RealtimeServerEvent ofInputAudioBufferCleared(InputAudioBufferClearedEvent inputAudioBufferCleared) - Returned when the input audio buffer is cleared by the client with a input_audio_buffer.clear event.
final static RealtimeServerEvent ofInputAudioBufferCommitted(InputAudioBufferCommittedEvent inputAudioBufferCommitted) - Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode.
final static RealtimeServerEvent ofInputAudioBufferSpeechStarted(InputAudioBufferSpeechStartedEvent inputAudioBufferSpeechStarted) - Sent by the server when in server_vad mode to indicate that speech has been detected in the audio buffer.
final static RealtimeServerEvent ofInputAudioBufferSpeechStopped(InputAudioBufferSpeechStoppedEvent inputAudioBufferSpeechStopped) - Returned in server_vad mode when the server detects the end of speech in the audio buffer.
final static RealtimeServerEvent ofRateLimitsUpdated(RateLimitsUpdatedEvent rateLimitsUpdated) - Emitted at the beginning of a Response to indicate the updated rate limits.
final static RealtimeServerEvent ofResponseOutputAudioDelta(ResponseAudioDeltaEvent responseOutputAudioDelta) - Returned when the model-generated audio is updated.
final static RealtimeServerEvent ofResponseOutputAudioDone(ResponseAudioDoneEvent responseOutputAudioDone) - Returned when the model-generated audio is done.
final static RealtimeServerEvent ofResponseOutputAudioTranscriptDelta(ResponseAudioTranscriptDeltaEvent responseOutputAudioTranscriptDelta) - Returned when the model-generated transcription of audio output is updated.
final static RealtimeServerEvent ofResponseOutputAudioTranscriptDone(ResponseAudioTranscriptDoneEvent responseOutputAudioTranscriptDone) - Returned when the model-generated transcription of audio output is done streaming.
final static RealtimeServerEvent ofResponseContentPartAdded(ResponseContentPartAddedEvent responseContentPartAdded) - Returned when a new content part is added to an assistant message item during response generation.
final static RealtimeServerEvent ofResponseContentPartDone(ResponseContentPartDoneEvent responseContentPartDone) - Returned when a content part is done streaming in an assistant message item.
final static RealtimeServerEvent ofResponseCreated(ResponseCreatedEvent responseCreated) - Returned when a new Response is created.
final static RealtimeServerEvent ofResponseDone(ResponseDoneEvent responseDone) - Returned when a Response is done streaming.
final static RealtimeServerEvent ofResponseFunctionCallArgumentsDelta(ResponseFunctionCallArgumentsDeltaEvent responseFunctionCallArgumentsDelta) - Returned when the model-generated function call arguments are updated.
final static RealtimeServerEvent ofResponseFunctionCallArgumentsDone(ResponseFunctionCallArgumentsDoneEvent responseFunctionCallArgumentsDone) - Returned when the model-generated function call arguments are done streaming.
final static RealtimeServerEvent ofResponseOutputItemAdded(ResponseOutputItemAddedEvent responseOutputItemAdded) - Returned when a new Item is created during Response generation.
final static RealtimeServerEvent ofResponseOutputItemDone(ResponseOutputItemDoneEvent responseOutputItemDone) - Returned when an Item is done streaming.
final static RealtimeServerEvent ofResponseOutputTextDelta(ResponseTextDeltaEvent responseOutputTextDelta) - Returned when the text value of an "output_text" content part is updated.
final static RealtimeServerEvent ofResponseOutputTextDone(ResponseTextDoneEvent responseOutputTextDone) - Returned when the text value of an "output_text" content part is done streaming.
final static RealtimeServerEvent ofSessionCreated(SessionCreatedEvent sessionCreated) - Returned when a Session is created.
final static RealtimeServerEvent ofSessionUpdated(SessionUpdatedEvent sessionUpdated) - Returned when a session is updated with a session.update event, unless there is an error.
final static RealtimeServerEvent ofOutputAudioBufferStarted(RealtimeServerEvent.OutputAudioBufferStarted outputAudioBufferStarted) - WebRTC Only: Emitted when the server begins streaming audio to the client.
final static RealtimeServerEvent ofOutputAudioBufferStopped(RealtimeServerEvent.OutputAudioBufferStopped outputAudioBufferStopped) - WebRTC Only: Emitted when the output audio buffer has been completely drained on the server, and no more audio is forthcoming.
final static RealtimeServerEvent ofOutputAudioBufferCleared(RealtimeServerEvent.OutputAudioBufferCleared outputAudioBufferCleared) - WebRTC Only: Emitted when the output audio buffer is cleared.
final static RealtimeServerEvent ofConversationItemAdded(ConversationItemAdded conversationItemAdded) - Sent by the server when an Item is added to the default Conversation.
final static RealtimeServerEvent ofConversationItemDone(ConversationItemDone conversationItemDone) - Returned when a conversation item is finalized.
final static RealtimeServerEvent ofInputAudioBufferTimeoutTriggered(InputAudioBufferTimeoutTriggered inputAudioBufferTimeoutTriggered) - Returned when the Server VAD timeout is triggered for the input audio buffer.
final static RealtimeServerEvent ofConversationItemInputAudioTranscriptionSegment(ConversationItemInputAudioTranscriptionSegment conversationItemInputAudioTranscriptionSegment) - Returned when an input audio transcription segment is identified for an item.
final static RealtimeServerEvent ofMcpListToolsInProgress(McpListToolsInProgress mcpListToolsInProgress) - Returned when listing MCP tools is in progress for an item.
final static RealtimeServerEvent ofMcpListToolsCompleted(McpListToolsCompleted mcpListToolsCompleted) - Returned when listing MCP tools has completed for an item.
final static RealtimeServerEvent ofMcpListToolsFailed(McpListToolsFailed mcpListToolsFailed) - Returned when listing MCP tools has failed for an item.
final static RealtimeServerEvent ofResponseMcpCallArgumentsDelta(ResponseMcpCallArgumentsDelta responseMcpCallArgumentsDelta) - Returned when MCP tool call arguments are updated during response generation.
final static RealtimeServerEvent ofResponseMcpCallArgumentsDone(ResponseMcpCallArgumentsDone responseMcpCallArgumentsDone) - Returned when MCP tool call arguments are finalized during response generation.
final static RealtimeServerEvent ofResponseMcpCallInProgress(ResponseMcpCallInProgress responseMcpCallInProgress) - Returned when an MCP tool call has started and is in progress.
final static RealtimeServerEvent ofResponseMcpCallCompleted(ResponseMcpCallCompleted responseMcpCallCompleted) - Returned when an MCP tool call has completed successfully.
final static RealtimeServerEvent ofResponseMcpCallFailed(ResponseMcpCallFailed responseMcpCallFailed) - Returned when an MCP tool call has failed.
-
Method Detail
-
conversationCreated
final Optional<ConversationCreatedEvent> conversationCreated()
Returned when a conversation is created. Emitted right after session creation.
-
conversationItemCreated
final Optional<ConversationItemCreatedEvent> conversationItemCreated()
Returned when a conversation item is created. There are several scenarios that produce this event:
- The server is generating a Response, which if successful will produce either one or two Items, which will be of type message (role assistant) or type function_call.
- The input audio buffer has been committed, either by the client or the server (in server_vad mode). The server will take the content of the input audio buffer and add it to a new user message Item.
- The client has sent a conversation.item.create event to add a new Item to the Conversation.
-
conversationItemDeleted
final Optional<ConversationItemDeletedEvent> conversationItemDeleted()
Returned when an item in the conversation is deleted by the client with a
conversation.item.delete
event. This event is used to synchronize the server's understanding of the conversation history with the client's view.
-
conversationItemInputAudioTranscriptionCompleted
final Optional<ConversationItemInputAudioTranscriptionCompletedEvent> conversationItemInputAudioTranscriptionCompleted()
This event is the output of audio transcription for user audio written to the user audio buffer. Transcription begins when the input audio buffer is committed by the client or server (when VAD is enabled). Transcription runs asynchronously with Response creation, so this event may come before or after the Response events.
Realtime API models accept audio natively, and thus input transcription is a separate process run on a separate ASR (Automatic Speech Recognition) model. The transcript may diverge somewhat from the model's interpretation, and should be treated as a rough guide.
-
conversationItemInputAudioTranscriptionDelta
final Optional<ConversationItemInputAudioTranscriptionDeltaEvent> conversationItemInputAudioTranscriptionDelta()
Returned when the text value of an input audio transcription content part is updated with incremental transcription results.
-
conversationItemInputAudioTranscriptionFailed
final Optional<ConversationItemInputAudioTranscriptionFailedEvent> conversationItemInputAudioTranscriptionFailed()
Returned when input audio transcription is configured, and a transcription request for a user message failed. These events are separate from other
error
events so that the client can identify the related Item.
-
conversationItemRetrieved
final Optional<RealtimeServerEvent.ConversationItemRetrieved> conversationItemRetrieved()
Returned when a conversation item is retrieved with
conversation.item.retrieve
. This is provided as a way to fetch the server's representation of an item, for example to get access to the post-processed audio data after noise cancellation and VAD. It includes the full content of the Item, including audio data.
-
conversationItemTruncated
final Optional<ConversationItemTruncatedEvent> conversationItemTruncated()
Returned when an earlier assistant audio message item is truncated by the client with a conversation.item.truncate event. This event is used to synchronize the server's understanding of the audio with the client's playback.
This action will truncate the audio and remove the server-side text transcript to ensure there is no text in the context that hasn't been heard by the user.
-
error
final Optional<RealtimeErrorEvent> error()
Returned when an error occurs, which could be a client problem or a server problem. Most errors are recoverable and the session will stay open; we recommend that implementors monitor and log error messages by default.
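Example: in line with that recommendation, a minimal default logging hook built only on the error() accessor above (the import path is an assumption about the SDK's package layout).

    import com.openai.models.realtime.RealtimeServerEvent; // package path assumed

    final class RealtimeErrorLogger {
        // Log every error variant but keep processing events, since most errors
        // are recoverable and the session stays open.
        static void logServerErrors(RealtimeServerEvent event) {
            event.error().ifPresent(err ->
                    System.err.println("realtime error event: " + err));
        }
    }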
-
inputAudioBufferCleared
final Optional<InputAudioBufferClearedEvent> inputAudioBufferCleared()
Returned when the input audio buffer is cleared by the client with a
input_audio_buffer.clear
event.
-
inputAudioBufferCommitted
final Optional<InputAudioBufferCommittedEvent> inputAudioBufferCommitted()
Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode. The item_id property is the ID of the user message item that will be created, thus a conversation.item.created event will also be sent to the client.
-
inputAudioBufferSpeechStarted
final Optional<InputAudioBufferSpeechStartedEvent> inputAudioBufferSpeechStarted()
Sent by the server when in server_vad mode to indicate that speech has been detected in the audio buffer. This can happen any time audio is added to the buffer (unless speech is already detected). The client may want to use this event to interrupt audio playback or provide visual feedback to the user.
The client should expect to receive an input_audio_buffer.speech_stopped event when speech stops. The item_id property is the ID of the user message item that will be created when speech stops and will also be included in the input_audio_buffer.speech_stopped event (unless the client manually commits the audio buffer during VAD activation).
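Example: a client implementing barge-in might stop local playback as soon as this event arrives. The AudioPlayer interface below is a hypothetical stand-in for your playback pipeline; the import path is an assumption.

    import com.openai.models.realtime.RealtimeServerEvent; // package path assumed

    final class BargeInHandler {
        interface AudioPlayer { void stop(); } // hypothetical playback abstraction

        static void onEvent(RealtimeServerEvent event, AudioPlayer player) {
            if (event.isInputAudioBufferSpeechStarted()) {
                player.stop(); // cut assistant audio while the user is speaking
            }
        }
    }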
-
inputAudioBufferSpeechStopped
final Optional<InputAudioBufferSpeechStoppedEvent> inputAudioBufferSpeechStopped()
Returned in server_vad mode when the server detects the end of speech in the audio buffer. The server will also send a conversation.item.created event with the user message item that is created from the audio buffer.
-
rateLimitsUpdated
final Optional<RateLimitsUpdatedEvent> rateLimitsUpdated()
Emitted at the beginning of a Response to indicate the updated rate limits. When a Response is created, some tokens will be "reserved" for the output tokens; the rate limits shown here reflect that reservation, which is then adjusted accordingly once the Response is completed.
-
responseOutputAudioDelta
final Optional<ResponseAudioDeltaEvent> responseOutputAudioDelta()
Returned when the model-generated audio is updated.
-
responseOutputAudioDone
final Optional<ResponseAudioDoneEvent> responseOutputAudioDone()
Returned when the model-generated audio is done. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
responseOutputAudioTranscriptDelta
final Optional<ResponseAudioTranscriptDeltaEvent> responseOutputAudioTranscriptDelta()
Returned when the model-generated transcription of audio output is updated.
-
responseOutputAudioTranscriptDone
final Optional<ResponseAudioTranscriptDoneEvent> responseOutputAudioTranscriptDone()
Returned when the model-generated transcription of audio output is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
responseContentPartAdded
final Optional<ResponseContentPartAddedEvent> responseContentPartAdded()
Returned when a new content part is added to an assistant message item during response generation.
-
responseContentPartDone
final Optional<ResponseContentPartDoneEvent> responseContentPartDone()
Returned when a content part is done streaming in an assistant message item. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
responseCreated
final Optional<ResponseCreatedEvent> responseCreated()
Returned when a new Response is created. The first event of response creation, where the response is in an initial state of
in_progress
.
-
responseDone
final Optional<ResponseDoneEvent> responseDone()
Returned when a Response is done streaming. Always emitted, no matter the final state. The Response object included in the response.done event will include all output Items in the Response but will omit the raw audio data.
Clients should check the status field of the Response to determine if it was successful (completed) or if there was another outcome: cancelled, failed, or incomplete.
A response will contain all output items that were generated during the response, excluding any audio content.
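Example: a sketch of reacting to the end of a Response. Reading the final status goes through the ResponseDoneEvent payload, whose getters are not documented on this page, so that access is only indicated in a comment.

    import com.openai.models.realtime.RealtimeServerEvent; // package path assumed

    final class ResponseCompletionHandler {
        static void onEvent(RealtimeServerEvent event) {
            event.responseDone().ifPresent(done -> {
                System.out.println("response finished: " + done);
                // Inspect the payload's Response for the final status (completed,
                // cancelled, failed, or incomplete); the exact getter names on
                // ResponseDoneEvent are an assumption and should be verified.
            });
        }
    }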
-
responseFunctionCallArgumentsDelta
final Optional<ResponseFunctionCallArgumentsDeltaEvent> responseFunctionCallArgumentsDelta()
Returned when the model-generated function call arguments are updated.
-
responseFunctionCallArgumentsDone
final Optional<ResponseFunctionCallArgumentsDoneEvent> responseFunctionCallArgumentsDone()
Returned when the model-generated function call arguments are done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
responseOutputItemAdded
final Optional<ResponseOutputItemAddedEvent> responseOutputItemAdded()
Returned when a new Item is created during Response generation.
-
responseOutputItemDone
final Optional<ResponseOutputItemDoneEvent> responseOutputItemDone()
Returned when an Item is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
responseOutputTextDelta
final Optional<ResponseTextDeltaEvent> responseOutputTextDelta()
Returned when the text value of an "output_text" content part is updated.
-
responseOutputTextDone
final Optional<ResponseTextDoneEvent> responseOutputTextDone()
Returned when the text value of an "output_text" content part is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
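Example: assembling streamed assistant text from the delta and done events above. Appending the actual delta text requires a getter on ResponseTextDeltaEvent that is not documented here, so that call is left as a comment.

    import com.openai.models.realtime.RealtimeServerEvent; // package path assumed

    final class OutputTextAccumulator {
        private final StringBuilder transcript = new StringBuilder();

        void onEvent(RealtimeServerEvent event) {
            event.responseOutputTextDelta().ifPresent(delta -> {
                // transcript.append(delta.delta()); // getter name assumed
            });
            if (event.isResponseOutputTextDone()) {
                System.out.println("assistant text: " + transcript);
                transcript.setLength(0); // reset for the next response
            }
        }
    }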
-
sessionCreated
final Optional<SessionCreatedEvent> sessionCreated()
Returned when a Session is created. Emitted automatically when a new connection is established as the first server event. This event will contain the default Session configuration.
-
sessionUpdated
final Optional<SessionUpdatedEvent> sessionUpdated()
Returned when a session is updated with a
session.update
event, unless there is an error.
-
outputAudioBufferStarted
final Optional<RealtimeServerEvent.OutputAudioBufferStarted> outputAudioBufferStarted()
WebRTC Only: Emitted when the server begins streaming audio to the client. This event is emitted after an audio content part has been added (
response.content_part.added
) to the response. Learn more.
-
outputAudioBufferStopped
final Optional<RealtimeServerEvent.OutputAudioBufferStopped> outputAudioBufferStopped()
WebRTC Only: Emitted when the output audio buffer has been completely drained on the server, and no more audio is forthcoming. This event is emitted after the full response data has been sent to the client (
response.done
). Learn more.
-
outputAudioBufferCleared
final Optional<RealtimeServerEvent.OutputAudioBufferCleared> outputAudioBufferCleared()
WebRTC Only: Emitted when the output audio buffer is cleared. This happens either in VAD mode when the user has interrupted (input_audio_buffer.speech_started), or when the client has emitted the output_audio_buffer.clear event to manually cut off the current audio response. Learn more.
-
conversationItemAdded
final Optional<ConversationItemAdded> conversationItemAdded()
Sent by the server when an Item is added to the default Conversation. This can happen in several cases:
- When the client sends a conversation.item.create event.
- When the input audio buffer is committed. In this case the item will be a user message containing the audio from the buffer.
- When the model is generating a Response. In this case the conversation.item.added event will be sent when the model starts generating a specific Item, and thus it will not yet have any content (and status will be in_progress).
The event will include the full content of the Item (except when the model is generating a Response) except for audio data, which can be retrieved separately with a conversation.item.retrieve event if necessary.
-
conversationItemDone
final Optional<ConversationItemDone> conversationItemDone()
Returned when a conversation item is finalized.
The event will include the full content of the Item except for audio data, which can be retrieved separately with a
conversation.item.retrieve
event if needed.
-
inputAudioBufferTimeoutTriggered
final Optional<InputAudioBufferTimeoutTriggered> inputAudioBufferTimeoutTriggered()
Returned when the Server VAD timeout is triggered for the input audio buffer. This is configured with idle_timeout_ms in the turn_detection settings of the session, and it indicates that there hasn't been any speech detected for the configured duration.
The audio_start_ms and audio_end_ms fields indicate the segment of audio after the last model response up to the triggering time, as an offset from the beginning of audio written to the input audio buffer. This means it demarcates the segment of audio that was silent, and the difference between the start and end values will roughly match the configured timeout.
The empty audio will be committed to the conversation as an input_audio item (there will be an input_audio_buffer.committed event) and a model response will be generated. There may be speech that didn't trigger VAD but is still detected by the model, so the model may respond with something relevant to the conversation or a prompt to continue speaking.
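Example: reacting to the idle timeout, for instance to surface a hint in the UI while the model's follow-up response is generated (the import path is an assumption).

    import com.openai.models.realtime.RealtimeServerEvent; // package path assumed

    final class IdleTimeoutHandler {
        static void onEvent(RealtimeServerEvent event) {
            if (event.isInputAudioBufferTimeoutTriggered()) {
                // The server has already committed the silent audio and will respond;
                // a client might additionally show a local "still there?" prompt here.
                System.out.println("no speech detected within idle_timeout_ms");
            }
        }
    }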
-
conversationItemInputAudioTranscriptionSegment
final Optional<ConversationItemInputAudioTranscriptionSegment> conversationItemInputAudioTranscriptionSegment()
Returned when an input audio transcription segment is identified for an item.
-
mcpListToolsInProgress
final Optional<McpListToolsInProgress> mcpListToolsInProgress()
Returned when listing MCP tools is in progress for an item.
-
mcpListToolsCompleted
final Optional<McpListToolsCompleted> mcpListToolsCompleted()
Returned when listing MCP tools has completed for an item.
-
mcpListToolsFailed
final Optional<McpListToolsFailed> mcpListToolsFailed()
Returned when listing MCP tools has failed for an item.
-
responseMcpCallArgumentsDelta
final Optional<ResponseMcpCallArgumentsDelta> responseMcpCallArgumentsDelta()
Returned when MCP tool call arguments are updated during response generation.
-
responseMcpCallArgumentsDone
final Optional<ResponseMcpCallArgumentsDone> responseMcpCallArgumentsDone()
Returned when MCP tool call arguments are finalized during response generation.
-
responseMcpCallInProgress
final Optional<ResponseMcpCallInProgress> responseMcpCallInProgress()
Returned when an MCP tool call has started and is in progress.
-
responseMcpCallCompleted
final Optional<ResponseMcpCallCompleted> responseMcpCallCompleted()
Returned when an MCP tool call has completed successfully.
-
responseMcpCallFailed
final Optional<ResponseMcpCallFailed> responseMcpCallFailed()
Returned when an MCP tool call has failed.
-
isConversationCreated
final Boolean isConversationCreated()
-
isConversationItemCreated
final Boolean isConversationItemCreated()
-
isConversationItemDeleted
final Boolean isConversationItemDeleted()
-
isConversationItemInputAudioTranscriptionCompleted
final Boolean isConversationItemInputAudioTranscriptionCompleted()
-
isConversationItemInputAudioTranscriptionDelta
final Boolean isConversationItemInputAudioTranscriptionDelta()
-
isConversationItemInputAudioTranscriptionFailed
final Boolean isConversationItemInputAudioTranscriptionFailed()
-
isConversationItemRetrieved
final Boolean isConversationItemRetrieved()
-
isConversationItemTruncated
final Boolean isConversationItemTruncated()
-
isInputAudioBufferCleared
final Boolean isInputAudioBufferCleared()
-
isInputAudioBufferCommitted
final Boolean isInputAudioBufferCommitted()
-
isInputAudioBufferSpeechStarted
final Boolean isInputAudioBufferSpeechStarted()
-
isInputAudioBufferSpeechStopped
final Boolean isInputAudioBufferSpeechStopped()
-
isRateLimitsUpdated
final Boolean isRateLimitsUpdated()
-
isResponseOutputAudioDelta
final Boolean isResponseOutputAudioDelta()
-
isResponseOutputAudioDone
final Boolean isResponseOutputAudioDone()
-
isResponseOutputAudioTranscriptDelta
final Boolean isResponseOutputAudioTranscriptDelta()
-
isResponseOutputAudioTranscriptDone
final Boolean isResponseOutputAudioTranscriptDone()
-
isResponseContentPartAdded
final Boolean isResponseContentPartAdded()
-
isResponseContentPartDone
final Boolean isResponseContentPartDone()
-
isResponseCreated
final Boolean isResponseCreated()
-
isResponseDone
final Boolean isResponseDone()
-
isResponseFunctionCallArgumentsDelta
final Boolean isResponseFunctionCallArgumentsDelta()
-
isResponseFunctionCallArgumentsDone
final Boolean isResponseFunctionCallArgumentsDone()
-
isResponseOutputItemAdded
final Boolean isResponseOutputItemAdded()
-
isResponseOutputItemDone
final Boolean isResponseOutputItemDone()
-
isResponseOutputTextDelta
final Boolean isResponseOutputTextDelta()
-
isResponseOutputTextDone
final Boolean isResponseOutputTextDone()
-
isSessionCreated
final Boolean isSessionCreated()
-
isSessionUpdated
final Boolean isSessionUpdated()
-
isOutputAudioBufferStarted
final Boolean isOutputAudioBufferStarted()
-
isOutputAudioBufferStopped
final Boolean isOutputAudioBufferStopped()
-
isOutputAudioBufferCleared
final Boolean isOutputAudioBufferCleared()
-
isConversationItemAdded
final Boolean isConversationItemAdded()
-
isConversationItemDone
final Boolean isConversationItemDone()
-
isInputAudioBufferTimeoutTriggered
final Boolean isInputAudioBufferTimeoutTriggered()
-
isConversationItemInputAudioTranscriptionSegment
final Boolean isConversationItemInputAudioTranscriptionSegment()
-
isMcpListToolsInProgress
final Boolean isMcpListToolsInProgress()
-
isMcpListToolsCompleted
final Boolean isMcpListToolsCompleted()
-
isMcpListToolsFailed
final Boolean isMcpListToolsFailed()
-
isResponseMcpCallArgumentsDelta
final Boolean isResponseMcpCallArgumentsDelta()
-
isResponseMcpCallArgumentsDone
final Boolean isResponseMcpCallArgumentsDone()
-
isResponseMcpCallInProgress
final Boolean isResponseMcpCallInProgress()
-
isResponseMcpCallCompleted
final Boolean isResponseMcpCallCompleted()
-
isResponseMcpCallFailed
final Boolean isResponseMcpCallFailed()
-
asConversationCreated
final ConversationCreatedEvent asConversationCreated()
Returned when a conversation is created. Emitted right after session creation.
-
asConversationItemCreated
final ConversationItemCreatedEvent asConversationItemCreated()
Returned when a conversation item is created. There are several scenarios that produce this event:
- The server is generating a Response, which if successful will produce either one or two Items, which will be of type message (role assistant) or type function_call.
- The input audio buffer has been committed, either by the client or the server (in server_vad mode). The server will take the content of the input audio buffer and add it to a new user message Item.
- The client has sent a conversation.item.create event to add a new Item to the Conversation.
-
asConversationItemDeleted
final ConversationItemDeletedEvent asConversationItemDeleted()
Returned when an item in the conversation is deleted by the client with a
conversation.item.delete
event. This event is used to synchronize the server's understanding of the conversation history with the client's view.
-
asConversationItemInputAudioTranscriptionCompleted
final ConversationItemInputAudioTranscriptionCompletedEvent asConversationItemInputAudioTranscriptionCompleted()
This event is the output of audio transcription for user audio written to the user audio buffer. Transcription begins when the input audio buffer is committed by the client or server (when VAD is enabled). Transcription runs asynchronously with Response creation, so this event may come before or after the Response events.
Realtime API models accept audio natively, and thus input transcription is a separate process run on a separate ASR (Automatic Speech Recognition) model. The transcript may diverge somewhat from the model's interpretation, and should be treated as a rough guide.
-
asConversationItemInputAudioTranscriptionDelta
final ConversationItemInputAudioTranscriptionDeltaEvent asConversationItemInputAudioTranscriptionDelta()
Returned when the text value of an input audio transcription content part is updated with incremental transcription results.
-
asConversationItemInputAudioTranscriptionFailed
final ConversationItemInputAudioTranscriptionFailedEvent asConversationItemInputAudioTranscriptionFailed()
Returned when input audio transcription is configured, and a transcription request for a user message failed. These events are separate from other
error
events so that the client can identify the related Item.
-
asConversationItemRetrieved
final RealtimeServerEvent.ConversationItemRetrieved asConversationItemRetrieved()
Returned when a conversation item is retrieved with
conversation.item.retrieve
. This is provided as a way to fetch the server's representation of an item, for example to get access to the post-processed audio data after noise cancellation and VAD. It includes the full content of the Item, including audio data.
-
asConversationItemTruncated
final ConversationItemTruncatedEvent asConversationItemTruncated()
Returned when an earlier assistant audio message item is truncated by the client with a conversation.item.truncate event. This event is used to synchronize the server's understanding of the audio with the client's playback.
This action will truncate the audio and remove the server-side text transcript to ensure there is no text in the context that hasn't been heard by the user.
-
asError
final RealtimeErrorEvent asError()
Returned when an error occurs, which could be a client problem or a server problem. Most errors are recoverable and the session will stay open; we recommend that implementors monitor and log error messages by default.
-
asInputAudioBufferCleared
final InputAudioBufferClearedEvent asInputAudioBufferCleared()
Returned when the input audio buffer is cleared by the client with a
input_audio_buffer.clear
event.
-
asInputAudioBufferCommitted
final InputAudioBufferCommittedEvent asInputAudioBufferCommitted()
Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode. The item_id property is the ID of the user message item that will be created, thus a conversation.item.created event will also be sent to the client.
-
asInputAudioBufferSpeechStarted
final InputAudioBufferSpeechStartedEvent asInputAudioBufferSpeechStarted()
Sent by the server when in server_vad mode to indicate that speech has been detected in the audio buffer. This can happen any time audio is added to the buffer (unless speech is already detected). The client may want to use this event to interrupt audio playback or provide visual feedback to the user.
The client should expect to receive an input_audio_buffer.speech_stopped event when speech stops. The item_id property is the ID of the user message item that will be created when speech stops and will also be included in the input_audio_buffer.speech_stopped event (unless the client manually commits the audio buffer during VAD activation).
-
asInputAudioBufferSpeechStopped
final InputAudioBufferSpeechStoppedEvent asInputAudioBufferSpeechStopped()
Returned in server_vad mode when the server detects the end of speech in the audio buffer. The server will also send a conversation.item.created event with the user message item that is created from the audio buffer.
-
asRateLimitsUpdated
final RateLimitsUpdatedEvent asRateLimitsUpdated()
Emitted at the beginning of a Response to indicate the updated rate limits. When a Response is created, some tokens will be "reserved" for the output tokens; the rate limits shown here reflect that reservation, which is then adjusted accordingly once the Response is completed.
-
asResponseOutputAudioDelta
final ResponseAudioDeltaEvent asResponseOutputAudioDelta()
Returned when the model-generated audio is updated.
-
asResponseOutputAudioDone
final ResponseAudioDoneEvent asResponseOutputAudioDone()
Returned when the model-generated audio is done. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
asResponseOutputAudioTranscriptDelta
final ResponseAudioTranscriptDeltaEvent asResponseOutputAudioTranscriptDelta()
Returned when the model-generated transcription of audio output is updated.
-
asResponseOutputAudioTranscriptDone
final ResponseAudioTranscriptDoneEvent asResponseOutputAudioTranscriptDone()
Returned when the model-generated transcription of audio output is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
asResponseContentPartAdded
final ResponseContentPartAddedEvent asResponseContentPartAdded()
Returned when a new content part is added to an assistant message item during response generation.
-
asResponseContentPartDone
final ResponseContentPartDoneEvent asResponseContentPartDone()
Returned when a content part is done streaming in an assistant message item. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
asResponseCreated
final ResponseCreatedEvent asResponseCreated()
Returned when a new Response is created. The first event of response creation, where the response is in an initial state of
in_progress
.
-
asResponseDone
final ResponseDoneEvent asResponseDone()
Returned when a Response is done streaming. Always emitted, no matter the final state. The Response object included in the response.done event will include all output Items in the Response but will omit the raw audio data.
Clients should check the status field of the Response to determine if it was successful (completed) or if there was another outcome: cancelled, failed, or incomplete.
A response will contain all output items that were generated during the response, excluding any audio content.
-
asResponseFunctionCallArgumentsDelta
final ResponseFunctionCallArgumentsDeltaEvent asResponseFunctionCallArgumentsDelta()
Returned when the model-generated function call arguments are updated.
-
asResponseFunctionCallArgumentsDone
final ResponseFunctionCallArgumentsDoneEvent asResponseFunctionCallArgumentsDone()
Returned when the model-generated function call arguments are done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
asResponseOutputItemAdded
final ResponseOutputItemAddedEvent asResponseOutputItemAdded()
Returned when a new Item is created during Response generation.
-
asResponseOutputItemDone
final ResponseOutputItemDoneEvent asResponseOutputItemDone()
Returned when an Item is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
asResponseOutputTextDelta
final ResponseTextDeltaEvent asResponseOutputTextDelta()
Returned when the text value of an "output_text" content part is updated.
-
asResponseOutputTextDone
final ResponseTextDoneEvent asResponseOutputTextDone()
Returned when the text value of an "output_text" content part is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
asSessionCreated
final SessionCreatedEvent asSessionCreated()
Returned when a Session is created. Emitted automatically when a new connection is established as the first server event. This event will contain the default Session configuration.
-
asSessionUpdated
final SessionUpdatedEvent asSessionUpdated()
Returned when a session is updated with a
session.update
event, unless there is an error.
-
asOutputAudioBufferStarted
final RealtimeServerEvent.OutputAudioBufferStarted asOutputAudioBufferStarted()
WebRTC Only: Emitted when the server begins streaming audio to the client. This event is emitted after an audio content part has been added (
response.content_part.added
) to the response. Learn more.
-
asOutputAudioBufferStopped
final RealtimeServerEvent.OutputAudioBufferStopped asOutputAudioBufferStopped()
WebRTC Only: Emitted when the output audio buffer has been completely drained on the server, and no more audio is forthcoming. This event is emitted after the full response data has been sent to the client (
response.done
). Learn more.
-
asOutputAudioBufferCleared
final RealtimeServerEvent.OutputAudioBufferCleared asOutputAudioBufferCleared()
WebRTC Only: Emitted when the output audio buffer is cleared. This happens either in VAD mode when the user has interrupted (input_audio_buffer.speech_started), or when the client has emitted the output_audio_buffer.clear event to manually cut off the current audio response. Learn more.
-
asConversationItemAdded
final ConversationItemAdded asConversationItemAdded()
Sent by the server when an Item is added to the default Conversation. This can happen in several cases:
- When the client sends a conversation.item.create event.
- When the input audio buffer is committed. In this case the item will be a user message containing the audio from the buffer.
- When the model is generating a Response. In this case the conversation.item.added event will be sent when the model starts generating a specific Item, and thus it will not yet have any content (and status will be in_progress).
The event will include the full content of the Item (except when the model is generating a Response) except for audio data, which can be retrieved separately with a conversation.item.retrieve event if necessary.
-
asConversationItemDone
final ConversationItemDone asConversationItemDone()
Returned when a conversation item is finalized.
The event will include the full content of the Item except for audio data, which can be retrieved separately with a
conversation.item.retrieve
event if needed.
-
asInputAudioBufferTimeoutTriggered
final InputAudioBufferTimeoutTriggered asInputAudioBufferTimeoutTriggered()
Returned when the Server VAD timeout is triggered for the input audio buffer. This is configured with idle_timeout_ms in the turn_detection settings of the session, and it indicates that there hasn't been any speech detected for the configured duration.
The audio_start_ms and audio_end_ms fields indicate the segment of audio after the last model response up to the triggering time, as an offset from the beginning of audio written to the input audio buffer. This means it demarcates the segment of audio that was silent, and the difference between the start and end values will roughly match the configured timeout.
The empty audio will be committed to the conversation as an input_audio item (there will be an input_audio_buffer.committed event) and a model response will be generated. There may be speech that didn't trigger VAD but is still detected by the model, so the model may respond with something relevant to the conversation or a prompt to continue speaking.
-
asConversationItemInputAudioTranscriptionSegment
final ConversationItemInputAudioTranscriptionSegment asConversationItemInputAudioTranscriptionSegment()
Returned when an input audio transcription segment is identified for an item.
-
asMcpListToolsInProgress
final McpListToolsInProgress asMcpListToolsInProgress()
Returned when listing MCP tools is in progress for an item.
-
asMcpListToolsCompleted
final McpListToolsCompleted asMcpListToolsCompleted()
Returned when listing MCP tools has completed for an item.
-
asMcpListToolsFailed
final McpListToolsFailed asMcpListToolsFailed()
Returned when listing MCP tools has failed for an item.
-
asResponseMcpCallArgumentsDelta
final ResponseMcpCallArgumentsDelta asResponseMcpCallArgumentsDelta()
Returned when MCP tool call arguments are updated during response generation.
-
asResponseMcpCallArgumentsDone
final ResponseMcpCallArgumentsDone asResponseMcpCallArgumentsDone()
Returned when MCP tool call arguments are finalized during response generation.
-
asResponseMcpCallInProgress
final ResponseMcpCallInProgress asResponseMcpCallInProgress()
Returned when an MCP tool call has started and is in progress.
-
asResponseMcpCallCompleted
final ResponseMcpCallCompleted asResponseMcpCallCompleted()
Returned when an MCP tool call has completed successfully.
-
asResponseMcpCallFailed
final ResponseMcpCallFailed asResponseMcpCallFailed()
Returned when an MCP tool call has failed.
-
accept
final <T extends Any> T accept(RealtimeServerEvent.Visitor<T> visitor)
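accept dispatches on the concrete variant through RealtimeServerEvent.Visitor; for selective handling, the Optional accessors documented in the method summary can be used instead. A minimal sketch of the latter (the logging is illustrative only):

// Selective handling of a server event via the Optional accessors.
// For exhaustive handling of every variant, implement RealtimeServerEvent.Visitor
// and pass it to accept(visitor) instead.
final class SelectiveHandling {
    void handle(RealtimeServerEvent event) {
        event.error().ifPresent(err ->
                System.err.println("Realtime error event received"));
        event.conversationItemCreated().ifPresent(created ->
                System.out.println("Conversation item created"));
        event.inputAudioBufferCommitted().ifPresent(committed ->
                System.out.println("Input audio buffer committed"));
    }
}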
-
validate
final RealtimeServerEvent validate()
-
ofConversationCreated
final static RealtimeServerEvent ofConversationCreated(ConversationCreatedEvent conversationCreated)
Returned when a conversation is created. Emitted right after session creation.
-
ofConversationItemCreated
final static RealtimeServerEvent ofConversationItemCreated(ConversationItemCreatedEvent conversationItemCreated)
Returned when a conversation item is created. There are several scenarios that produce this event:
The server is generating a Response, which if successful will produce either one or two Items, which will be of type message (role assistant) or type function_call.
The input audio buffer has been committed, either by the client or the server (in server_vad mode). The server will take the content of the input audio buffer and add it to a new user message Item.
The client has sent a conversation.item.create event to add a new Item to the Conversation.
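The static of* factories wrap a concrete event in the union type, which can be useful when stubbing server traffic in tests; a minimal sketch that uses only the documented factory signature (how the inner ConversationItemCreatedEvent is built depends on the generated class):

// Wrap an already-constructed event in the RealtimeServerEvent union type.
final class EventStubbing {
    RealtimeServerEvent stubItemCreated(ConversationItemCreatedEvent inner) {
        return RealtimeServerEvent.ofConversationItemCreated(inner);
    }
}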
-
ofConversationItemDeleted
final static RealtimeServerEvent ofConversationItemDeleted(ConversationItemDeletedEvent conversationItemDeleted)
Returned when an item in the conversation is deleted by the client with a
conversation.item.delete
event. This event is used to synchronize the server's understanding of the conversation history with the client's view.
-
ofConversationItemInputAudioTranscriptionCompleted
final static RealtimeServerEvent ofConversationItemInputAudioTranscriptionCompleted(ConversationItemInputAudioTranscriptionCompletedEvent conversationItemInputAudioTranscriptionCompleted)
This event is the output of audio transcription for user audio written to the user audio buffer. Transcription begins when the input audio buffer is committed by the client or server (when VAD is enabled). Transcription runs asynchronously with Response creation, so this event may come before or after the Response events.
Realtime API models accept audio natively, and thus input transcription is a separate process run on a separate ASR (Automatic Speech Recognition) model. The transcript may diverge somewhat from the model's interpretation, and should be treated as a rough guide.
-
ofConversationItemInputAudioTranscriptionDelta
final static RealtimeServerEvent ofConversationItemInputAudioTranscriptionDelta(ConversationItemInputAudioTranscriptionDeltaEvent conversationItemInputAudioTranscriptionDelta)
Returned when the text value of an input audio transcription content part is updated with incremental transcription results.
-
ofConversationItemInputAudioTranscriptionFailed
final static RealtimeServerEvent ofConversationItemInputAudioTranscriptionFailed(ConversationItemInputAudioTranscriptionFailedEvent conversationItemInputAudioTranscriptionFailed)
Returned when input audio transcription is configured, and a transcription request for a user message failed. These events are separate from other
error
events so that the client can identify the related Item.
-
ofConversationItemRetrieved
final static RealtimeServerEvent ofConversationItemRetrieved(RealtimeServerEvent.ConversationItemRetrieved conversationItemRetrieved)
Returned when a conversation item is retrieved with
conversation.item.retrieve
. This is provided as a way to fetch the server's representation of an item, for example to get access to the post-processed audio data after noise cancellation and VAD. It includes the full content of the Item, including audio data.
-
ofConversationItemTruncated
final static RealtimeServerEvent ofConversationItemTruncated(ConversationItemTruncatedEvent conversationItemTruncated)
Returned when an earlier assistant audio message item is truncated by the client with a
conversation.item.truncate
event. This event is used to synchronize the server's understanding of the audio with the client's playback. This action will truncate the audio and remove the server-side text transcript to ensure there is no text in the context that hasn't been heard by the user.
-
ofError
final static RealtimeServerEvent ofError(RealtimeErrorEvent error)
Returned when an error occurs, which could be a client problem or a server problem. Most errors are recoverable and the session will stay open; we recommend that implementers monitor and log error messages by default.
-
ofInputAudioBufferCleared
final static RealtimeServerEvent ofInputAudioBufferCleared(InputAudioBufferClearedEvent inputAudioBufferCleared)
Returned when the input audio buffer is cleared by the client with an input_audio_buffer.clear event.
-
ofInputAudioBufferCommitted
final static RealtimeServerEvent ofInputAudioBufferCommitted(InputAudioBufferCommittedEvent inputAudioBufferCommitted)
Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode. The item_id property is the ID of the user message item that will be created; a conversation.item.created event will therefore also be sent to the client.
-
ofInputAudioBufferSpeechStarted
final static RealtimeServerEvent ofInputAudioBufferSpeechStarted(InputAudioBufferSpeechStartedEvent inputAudioBufferSpeechStarted)
Sent by the server when in server_vad mode to indicate that speech has been detected in the audio buffer. This can happen any time audio is added to the buffer (unless speech is already detected). The client may want to use this event to interrupt audio playback or provide visual feedback to the user.
The client should expect to receive an input_audio_buffer.speech_stopped event when speech stops. The item_id property is the ID of the user message item that will be created when speech stops; it will also be included in the input_audio_buffer.speech_stopped event (unless the client manually commits the audio buffer during VAD activation).
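A sketch of the interruption pattern this event enables; AudioPlayer is a placeholder for whatever playback layer the client uses, not part of the SDK.

// Placeholder playback abstraction for this sketch only.
interface AudioPlayer { void stop(); }

final class SpeechStartedHandling {
    void onSpeechStarted(InputAudioBufferSpeechStartedEvent event, AudioPlayer player) {
        // Interrupt assistant audio immediately so the user is not talked over.
        player.stop();
        // The event's item_id identifies the user message item that will be created
        // when speech stops; the same id appears on input_audio_buffer.speech_stopped.
    }
}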
-
ofInputAudioBufferSpeechStopped
final static RealtimeServerEvent ofInputAudioBufferSpeechStopped(InputAudioBufferSpeechStoppedEvent inputAudioBufferSpeechStopped)
Returned in server_vad mode when the server detects the end of speech in the audio buffer. The server will also send a conversation.item.created event with the user message item that is created from the audio buffer.
-
ofRateLimitsUpdated
final static RealtimeServerEvent ofRateLimitsUpdated(RateLimitsUpdatedEvent rateLimitsUpdated)
Emitted at the beginning of a Response to indicate the updated rate limits. When a Response is created, some tokens will be "reserved" for the output tokens; the rate limits shown here reflect that reservation, which is then adjusted accordingly once the Response is completed.
-
ofResponseOutputAudioDelta
final static RealtimeServerEvent ofResponseOutputAudioDelta(ResponseAudioDeltaEvent responseOutputAudioDelta)
Returned when the model-generated audio is updated.
-
ofResponseOutputAudioDone
final static RealtimeServerEvent ofResponseOutputAudioDone(ResponseAudioDoneEvent responseOutputAudioDone)
Returned when the model-generated audio is done. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
ofResponseOutputAudioTranscriptDelta
final static RealtimeServerEvent ofResponseOutputAudioTranscriptDelta(ResponseAudioTranscriptDeltaEvent responseOutputAudioTranscriptDelta)
Returned when the model-generated transcription of audio output is updated.
-
ofResponseOutputAudioTranscriptDone
final static RealtimeServerEvent ofResponseOutputAudioTranscriptDone(ResponseAudioTranscriptDoneEvent responseOutputAudioTranscriptDone)
Returned when the model-generated transcription of audio output is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
ofResponseContentPartAdded
final static RealtimeServerEvent ofResponseContentPartAdded(ResponseContentPartAddedEvent responseContentPartAdded)
Returned when a new content part is added to an assistant message item during response generation.
-
ofResponseContentPartDone
final static RealtimeServerEvent ofResponseContentPartDone(ResponseContentPartDoneEvent responseContentPartDone)
Returned when a content part is done streaming in an assistant message item. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
ofResponseCreated
final static RealtimeServerEvent ofResponseCreated(ResponseCreatedEvent responseCreated)
Returned when a new Response is created. The first event of response creation, where the response is in an initial state of
in_progress
.
-
ofResponseDone
final static RealtimeServerEvent ofResponseDone(ResponseDoneEvent responseDone)
Returned when a Response is done streaming. Always emitted, no matter the final state. The Response object included in the response.done event will include all output Items in the Response but will omit the raw audio data.
Clients should check the status field of the Response to determine if it was successful (completed) or if there was another outcome: cancelled, failed, or incomplete.
A response will contain all output items that were generated during the response, excluding any audio content.
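A sketch of checking the final outcome when response.done arrives; the response() and status() accessor names are assumptions about the generated ResponseDoneEvent and Response classes and may need adjusting.

// Hypothetical check of the terminal status of a Response.
final class ResponseDoneHandling {
    void onResponseDone(ResponseDoneEvent done) {
        // Assumed accessors mirroring the response and status properties.
        var status = done.response().status();
        // Possible outcomes: completed, cancelled, failed, or incomplete.
        System.out.println("Response finished with status: " + status);
    }
}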
-
ofResponseFunctionCallArgumentsDelta
final static RealtimeServerEvent ofResponseFunctionCallArgumentsDelta(ResponseFunctionCallArgumentsDeltaEvent responseFunctionCallArgumentsDelta)
Returned when the model-generated function call arguments are updated.
-
ofResponseFunctionCallArgumentsDone
final static RealtimeServerEvent ofResponseFunctionCallArgumentsDone(ResponseFunctionCallArgumentsDoneEvent responseFunctionCallArgumentsDone)
Returned when the model-generated function call arguments are done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
ofResponseOutputItemAdded
final static RealtimeServerEvent ofResponseOutputItemAdded(ResponseOutputItemAddedEvent responseOutputItemAdded)
Returned when a new Item is created during Response generation.
-
ofResponseOutputItemDone
final static RealtimeServerEvent ofResponseOutputItemDone(ResponseOutputItemDoneEvent responseOutputItemDone)
Returned when an Item is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
ofResponseOutputTextDelta
final static RealtimeServerEvent ofResponseOutputTextDelta(ResponseTextDeltaEvent responseOutputTextDelta)
Returned when the text value of an "output_text" content part is updated.
-
ofResponseOutputTextDone
final static RealtimeServerEvent ofResponseOutputTextDone(ResponseTextDoneEvent responseOutputTextDone)
Returned when the text value of an "output_text" content part is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
-
ofSessionCreated
final static RealtimeServerEvent ofSessionCreated(SessionCreatedEvent sessionCreated)
Returned when a Session is created. Emitted automatically when a new connection is established as the first server event. This event will contain the default Session configuration.
-
ofSessionUpdated
final static RealtimeServerEvent ofSessionUpdated(SessionUpdatedEvent sessionUpdated)
Returned when a session is updated with a
session.update
event, unless there is an error.
-
ofOutputAudioBufferStarted
final static RealtimeServerEvent ofOutputAudioBufferStarted(RealtimeServerEvent.OutputAudioBufferStarted outputAudioBufferStarted)
WebRTC Only: Emitted when the server begins streaming audio to the client. This event is emitted after an audio content part has been added (
response.content_part.added
) to the response. Learn more.
-
ofOutputAudioBufferStopped
final static RealtimeServerEvent ofOutputAudioBufferStopped(RealtimeServerEvent.OutputAudioBufferStopped outputAudioBufferStopped)
WebRTC Only: Emitted when the output audio buffer has been completely drained on the server, and no more audio is forthcoming. This event is emitted after the full response data has been sent to the client (
response.done
). Learn more.
-
ofOutputAudioBufferCleared
final static RealtimeServerEvent ofOutputAudioBufferCleared(RealtimeServerEvent.OutputAudioBufferCleared outputAudioBufferCleared)
WebRTC Only: Emitted when the output audio buffer is cleared. This happens either in VAD mode when the user has interrupted (
input_audio_buffer.speech_started
), or when the client has emitted the output_audio_buffer.clear
event to manually cut off the current audio response. Learn more.
-
ofConversationItemAdded
final static RealtimeServerEvent ofConversationItemAdded(ConversationItemAdded conversationItemAdded)
Sent by the server when an Item is added to the default Conversation. This can happen in several cases:
When the client sends a conversation.item.create event.
When the input audio buffer is committed. In this case the item will be a user message containing the audio from the buffer.
When the model is generating a Response. In this case the conversation.item.added event will be sent when the model starts generating a specific Item, and thus it will not yet have any content (and status will be in_progress).
Except while the model is still generating a Response, the event will include the full content of the Item. Audio data is omitted and can be retrieved separately with a conversation.item.retrieve event if necessary.
-
ofConversationItemDone
final static RealtimeServerEvent ofConversationItemDone(ConversationItemDone conversationItemDone)
Returned when a conversation item is finalized.
The event will include the full content of the Item except for audio data, which can be retrieved separately with a
conversation.item.retrieve
event if needed.
-
ofInputAudioBufferTimeoutTriggered
final static RealtimeServerEvent ofInputAudioBufferTimeoutTriggered(InputAudioBufferTimeoutTriggered inputAudioBufferTimeoutTriggered)
Returned when the Server VAD timeout is triggered for the input audio buffer. This is configured with idle_timeout_ms in the turn_detection settings of the session, and it indicates that no speech has been detected for the configured duration.
The audio_start_ms and audio_end_ms fields indicate the segment of audio after the last model response up to the triggering time, as an offset from the beginning of audio written to the input audio buffer. In other words, they demarcate the segment of audio that was silent, and the difference between the start and end values will roughly match the configured timeout.
The empty audio will be committed to the conversation as an input_audio item (there will be an input_audio_buffer.committed event) and a model response will be generated. There may be speech that didn't trigger VAD but is still detected by the model, so the model may respond with something relevant to the conversation or a prompt to continue speaking.
-
ofConversationItemInputAudioTranscriptionSegment
final static RealtimeServerEvent ofConversationItemInputAudioTranscriptionSegment(ConversationItemInputAudioTranscriptionSegment conversationItemInputAudioTranscriptionSegment)
Returned when an input audio transcription segment is identified for an item.
-
ofMcpListToolsInProgress
final static RealtimeServerEvent ofMcpListToolsInProgress(McpListToolsInProgress mcpListToolsInProgress)
Returned when listing MCP tools is in progress for an item.
-
ofMcpListToolsCompleted
final static RealtimeServerEvent ofMcpListToolsCompleted(McpListToolsCompleted mcpListToolsCompleted)
Returned when listing MCP tools has completed for an item.
-
ofMcpListToolsFailed
final static RealtimeServerEvent ofMcpListToolsFailed(McpListToolsFailed mcpListToolsFailed)
Returned when listing MCP tools has failed for an item.
-
ofResponseMcpCallArgumentsDelta
final static RealtimeServerEvent ofResponseMcpCallArgumentsDelta(ResponseMcpCallArgumentsDelta responseMcpCallArgumentsDelta)
Returned when MCP tool call arguments are updated during response generation.
-
ofResponseMcpCallArgumentsDone
final static RealtimeServerEvent ofResponseMcpCallArgumentsDone(ResponseMcpCallArgumentsDone responseMcpCallArgumentsDone)
Returned when MCP tool call arguments are finalized during response generation.
-
ofResponseMcpCallInProgress
final static RealtimeServerEvent ofResponseMcpCallInProgress(ResponseMcpCallInProgress responseMcpCallInProgress)
Returned when an MCP tool call has started and is in progress.
-
ofResponseMcpCallCompleted
final static RealtimeServerEvent ofResponseMcpCallCompleted(ResponseMcpCallCompleted responseMcpCallCompleted)
Returned when an MCP tool call has completed successfully.
-
ofResponseMcpCallFailed
final static RealtimeServerEvent ofResponseMcpCallFailed(ResponseMcpCallFailed responseMcpCallFailed)
Returned when an MCP tool call has failed.
-