public class CreateAICallRequest extends AbstractModel
header, skipSign
Constructor and Description |
---|
CreateAICallRequest() |
CreateAICallRequest(CreateAICallRequest source)
NOTE: Any ambiguous key set via .set("AnyKey", "value") will be a shallow copy,
and any explicit key, i.e Foo, set via .setFoo("value") will be a deep copy.
|
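The copy-constructor note above distinguishes deep copies from shallow copies. The following is a minimal sketch of that difference using a hypothetical stand-in class (`CopySketch`, with fields `callers` and `extra`); it is not the SDK implementation and only illustrates the semantics the note describes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in, not the SDK class: contrasts deep and shallow copying. */
class CopySketch {
    String[] callers;                                   // plays the role of an explicit key
    final Map<String, Object> extra = new HashMap<>();  // plays the role of keys added via set(...)

    CopySketch() {}

    CopySketch(CopySketch source) {
        if (source.callers != null) {
            // Deep copy: a new array, so later changes to the source array are not visible here.
            this.callers = source.callers.clone();
        }
        // Shallow copy: the map entries point at the same value objects as the source.
        this.extra.putAll(source.extra);
    }
}
```

With a shallow copy, mutating a shared value object after copying is visible through both instances; with a deep copy, it is not.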
Modifier and Type | Method and Description |
---|---|
String |
getAPIKey()
Get Model API key, for authentication information, please refer to the respective model's official website
- OpenAI protocol: [GPT](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key), [DeepSeek](https://api-docs.deepseek.com/zh-cn/);
- Azure protocol: [Azure GPT](https://learn.microsoft.com/en-us/azure/ai-services/openai/chatgpt-quickstart?tabs=command-line%2Ctypescript%2Cpython-new&pivots=programming-language-studio#key-settings);
- Minimax:[Minimax](https://platform.minimaxi.com/document/Fast%20access?key=66701cf51d57f38758d581b2)
|
String |
getAPIUrl()
Get Model interface address
- OpenAI protocol
GPT:"https://api.openai.com/v1/"
Deepseek:"https://api.deepseek.com/v1"
- Azure protocol
"https://{your-resource-name}.openai.azure.com?api-version={api-version}"
- Minimax protocol
"https://api.minimax.chat/v1"
|
String |
getCallee()
Get Called number.
|
String[] |
getCallers()
Get Caller number list
|
String |
getCustomTTSConfig()
Get Custom TTS configuration (JSON string). Choose either this field or VoiceType.
|
String |
getEndFunctionDesc()
Get Effective when EndFunctionEnable is true; the description of call_end function calling, default is "End the call when user has to leave (like says bye) or you are instructed to do so."
|
Boolean |
getEndFunctionEnable()
Get Whether the model supports (or enables) call_end function calling
|
AICallExtractConfigElement[] |
getExtractConfig()
Get Call content extraction configuration.
|
Long |
getInterruptMode()
Get Interrupt ai speaking mode.
|
Long |
getInterruptSpeechDuration()
Get Used when InterruptMode is 0, unit in milliseconds, default is 500ms.
|
String[] |
getLanguages()
Get ASR Supported Languages, default is "zh" Chinese,
Fill in the array with up to 4 languages, the first is the primary language for recognition, followed by optional languages,
Note: When the primary language is a Chinese dialect, optional languages are invalid
Currently, the supported languages are as follows.
|
String |
getLLMType()
Get Model interface protocol types, currently compatible with three protocol types:
- OpenAI protocol (including GPT, DeepSeek, etc.):"openai"
- Azure protocol:"azure"
- Minimax protocol:"minimax"
|
Long |
getMaxDuration()
Get Maximum Waiting Duration (milliseconds), default is 60 seconds, if the user does not speak within this time, the call is automatically terminated
|
String |
getModel()
Get Model name, such as
- OpenAI protocol
"gpt-4o-mini","gpt-4o","deepseek-chat";
- Azure protocol
"gpt-4o-mini", "gpt-4o";
- Minimax protocol
"deepseek-chat".
|
Long |
getNotifyDuration()
Get The duration after which the user hasn't spoken to trigger a notification, minimum 10 seconds, default 10 seconds
|
Long |
getNotifyMaxCount()
Get Maximum number of times to trigger ai prompt sound, unlimited by default.
|
String |
getNotifyMessage()
Get The AI prompt when NotifyDuration has passed without the user speaking, default is "Sorry, I didn't hear you clearly. Can you repeat that?"
|
Variable[] |
getPromptVariables()
Deprecated.
|
Long |
getSdkAppId()
Get Application ID (required) can be found at https://console.cloud.tencent.com/ccc.
|
String |
getSystemPrompt()
Get System prompt for the model, for example:
## Identity
You are Kate from the appointment department at Retell Health calling Cindy over the phone to prepare for the annual checkup coming up.
|
Boolean |
getTransferFunctionEnable()
Get Whether the model supports (or enables) transfer_to_human function calling.
|
AITransferItem[] |
getTransferItems()
Get Takes effect when TransferFunctionEnable is true: transfer to human configuration.
|
Long |
getVadSilenceTime()
Get Automatic speech recognition vad time ranges from 240 to 2000, with a default of 1000, measured in milliseconds.
|
String |
getVoiceType()
Get The following voice parameter values are available by default.
|
String |
getWelcomeMessage()
Get Used to set the AI Agent Welcome Message.
|
Long |
getWelcomeMessagePriority()
Get 0: interruptible by default, 1: high priority and not interruptible.
|
Long |
getWelcomeType()
Get 0: Use welcomeMessage (if empty, the callee speaks first; if not empty, the bot speaks first)
1: Use AI to automatically generate welcomeMessage and speak first based on the prompt
|
void |
setAPIKey(String APIKey)
Set Model API key, for authentication information, please refer to the respective model's official website
- OpenAI protocol: [GPT](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key), [DeepSeek](https://api-docs.deepseek.com/zh-cn/);
- Azure protocol: [Azure GPT](https://learn.microsoft.com/en-us/azure/ai-services/openai/chatgpt-quickstart?tabs=command-line%2Ctypescript%2Cpython-new&pivots=programming-language-studio#key-settings);
- Minimax:[Minimax](https://platform.minimaxi.com/document/Fast%20access?key=66701cf51d57f38758d581b2)
|
void |
setAPIUrl(String APIUrl)
Set Model interface address
- OpenAI protocol
GPT:"https://api.openai.com/v1/"
Deepseek:"https://api.deepseek.com/v1"
- Azure protocol
"https://{your-resource-name}.openai.azure.com?api-version={api-version}"
- Minimax protocol
"https://api.minimax.chat/v1"
|
void |
setCallee(String Callee)
Set Called number.
|
void |
setCallers(String[] Callers)
Set Caller number list
|
void |
setCustomTTSConfig(String CustomTTSConfig)
Set Custom TTS configuration (JSON string). Choose either this field or VoiceType.
|
void |
setEndFunctionDesc(String EndFunctionDesc)
Set Effective when EndFunctionEnable is true; the description of call_end function calling, default is "End the call when user has to leave (like says bye) or you are instructed to do so."
|
void |
setEndFunctionEnable(Boolean EndFunctionEnable)
Set Whether the model supports (or enables) call_end function calling
|
void |
setExtractConfig(AICallExtractConfigElement[] ExtractConfig)
Set Call content extraction configuration.
|
void |
setInterruptMode(Long InterruptMode)
Set Interrupt ai speaking mode.
|
void |
setInterruptSpeechDuration(Long InterruptSpeechDuration)
Set Used when InterruptMode is 0, unit in milliseconds, default is 500ms.
|
void |
setLanguages(String[] Languages)
Set ASR Supported Languages, default is "zh" Chinese,
Fill in the array with up to 4 languages, the first is the primary language for recognition, followed by optional languages,
Note: When the primary language is a Chinese dialect, optional languages are invalid
Currently, the supported languages are as follows.
|
void |
setLLMType(String LLMType)
Set Model interface protocol types, currently compatible with three protocol types:
- OpenAI protocol (including GPT, DeepSeek, etc.):"openai"
- Azure protocol:"azure"
- Minimax protocol:"minimax"
|
void |
setMaxDuration(Long MaxDuration)
Set Maximum Waiting Duration (milliseconds), default is 60 seconds, if the user does not speak within this time, the call is automatically terminated
|
void |
setModel(String Model)
Set Model name, such as
- OpenAI protocol
"gpt-4o-mini","gpt-4o","deepseek-chat";
- Azure protocol
"gpt-4o-mini", "gpt-4o";
- Minimax protocol
"deepseek-chat".
|
void |
setNotifyDuration(Long NotifyDuration)
Set The duration after which the user hasn't spoken to trigger a notification, minimum 10 seconds, default 10 seconds
|
void |
setNotifyMaxCount(Long NotifyMaxCount)
Set Maximum number of times to trigger ai prompt sound, unlimited by default.
|
void |
setNotifyMessage(String NotifyMessage)
Set The AI prompt when NotifyDuration has passed without the user speaking, default is "Sorry, I didn't hear you clearly. Can you repeat that?"
|
void |
setPromptVariables(Variable[] PromptVariables)
Deprecated.
|
void |
setSdkAppId(Long SdkAppId)
Set Application ID (required) can be found at https://console.cloud.tencent.com/ccc.
|
void |
setSystemPrompt(String SystemPrompt)
Set System prompt for the model, for example:
## Identity
You are Kate from the appointment department at Retell Health calling Cindy over the phone to prepare for the annual checkup coming up.
|
void |
setTransferFunctionEnable(Boolean TransferFunctionEnable)
Set Whether the model supports (or enables) transfer_to_human function calling.
|
void |
setTransferItems(AITransferItem[] TransferItems)
Set Takes effect when TransferFunctionEnable is true: transfer to human configuration.
|
void |
setVadSilenceTime(Long VadSilenceTime)
Set Automatic speech recognition vad time ranges from 240 to 2000, with a default of 1000, measured in milliseconds.
|
void |
setVoiceType(String VoiceType)
Set The following voice parameter values are available by default.
|
void |
setWelcomeMessage(String WelcomeMessage)
Set Used to set the AI Agent Welcome Message.
|
void |
setWelcomeMessagePriority(Long WelcomeMessagePriority)
Set 0: interruptible by default, 1: high priority and not interruptible.
|
void |
setWelcomeType(Long WelcomeType)
Set 0: Use welcomeMessage (if empty, the callee speaks first; if not empty, the bot speaks first)
1: Use AI to automatically generate welcomeMessage and speak first based on the prompt
|
void |
toMap(HashMap<String,String> map,
String prefix)
Internal implementation, normal users should not use it.
|
any, fromJsonString, getBinaryParams, GetHeader, getMultipartRequestParams, getSkipSign, set, SetHeader, setParamArrayObj, setParamArraySimple, setParamObj, setParamSimple, setSkipSign, toJsonString
public CreateAICallRequest()
public CreateAICallRequest(CreateAICallRequest source)
public Long getSdkAppId()
public void setSdkAppId(Long SdkAppId)
SdkAppId
- Application ID (required) can be found at https://console.cloud.tencent.com/ccc.
public String getCallee()
public void setCallee(String Callee)
Callee
- Called number.
public String getLLMType()
public void setLLMType(String LLMType)
LLMType
- Model interface protocol types, currently compatible with three protocol types:
- OpenAI protocol (including GPT, DeepSeek, etc.):"openai"
- Azure protocol:"azure"
- Minimax protocol:"minimax"
public String getAPIKey()
public void setAPIKey(String APIKey)
APIKey
- Model API key, for authentication information, please refer to the respective model's official website
- OpenAI protocol: [GPT](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key), [DeepSeek](https://api-docs.deepseek.com/zh-cn/);
- Azure protocol: [Azure GPT](https://learn.microsoft.com/en-us/azure/ai-services/openai/chatgpt-quickstart?tabs=command-line%2Ctypescript%2Cpython-new&pivots=programming-language-studio#key-settings);
- Minimax:[Minimax](https://platform.minimaxi.com/document/Fast%20access?key=66701cf51d57f38758d581b2)
public String getAPIUrl()
public void setAPIUrl(String APIUrl)
APIUrl
- Model interface address
- OpenAI protocol
GPT:"https://api.openai.com/v1/"
Deepseek:"https://api.deepseek.com/v1"
- Azure protocol
"https://{your-resource-name}.openai.azure.com?api-version={api-version}"
- Minimax protocol
"https://api.minimax.chat/v1"
public String getSystemPrompt()
public void setSystemPrompt(String SystemPrompt)
SystemPrompt
- System prompt for the model, for example:
## Identity
You are Kate from the appointment department at Retell Health calling Cindy over the phone to prepare for the annual checkup coming up. You are a pleasant and friendly receptionist caring deeply for the user. You don't provide medical advice but would use the medical knowledge to understand user responses.
## Style Guardrails
Be Concise: Respond succinctly, addressing one topic at most.
Embrace Variety: Use diverse language and rephrasing to enhance clarity without repeating content.
Be Conversational: Use everyday language, making the chat feel like talking to a friend.
Be Proactive: Lead the conversation, often wrapping up with a question or next-step suggestion.
Avoid multiple questions in a single response.
Get clarity: If the user only partially answers a question, or if the answer is unclear, keep asking to get clarity.
Use a colloquial way of referring to the date (like Friday, January 14th, or Tuesday, January 12th, 2024 at 8am).
## Response Guideline
Adapt and Guess: Try to understand transcripts that may contain transcription errors. Avoid mentioning "transcription error" in the response.
Stay in Character: Keep conversations within your role's scope, guiding them back creatively without repeating.
Ensure Fluid Dialogue: Respond in a role-appropriate, direct manner to maintain a smooth conversation flow.
## Task
You will follow the steps below, do not skip steps, and only ask up to one question in response.
If at any time the user showed anger or wanted a human agent, call transfer_call to transfer to a human representative.
1. Begin with a self-introduction and verify if callee is Cindy.
- if callee is not Cindy, call end_call to hang up, say sorry for the confusion when hanging up.
- if Cindy is not available, call end_call politely to hang up, say you will call back later when hanging up.
2. Inform Cindy she has an annual body check coming up on April 4th, 2024 at 10am PDT. Check if Cindy is available.
- If not, tell Cindy to reschedule online and jump to step 5.
3. Ask Cindy if there's anything that the doctor should know before the annual checkup.
- Ask followup questions as needed to assess the severity of the issue, and understand how it has progressed.
4. Tell Cindy to not eat or drink that day before the checkup. Also tell Cindy to give you a callback if there's any changes in health condition.
5. Ask Cindy if she has any questions, and if so, answer them until there are no questions.
- If user asks something you do not know, let them know you don't have the answer. Ask them if they have any other questions.
- If user do not have any questions, call function end_call to hang up.
public String getModel()
public void setModel(String Model)
Model
- Model name, such as
- OpenAI protocol
"gpt-4o-mini","gpt-4o","deepseek-chat";
- Azure protocol
"gpt-4o-mini", "gpt-4o";
- Minimax protocol
"deepseek-chat".
public String getVoiceType()
public void setVoiceType(String VoiceType)
VoiceType
- The following voice parameter values are available by default. If you wish to customize the voice type, please leave VoiceType blank and configure it in the CustomTTSConfig parameter.
Chinese:
ZhiMei: Zhimei, customer service female voice
ZhiXi: Zhixi, general female voice
ZhiQi: Zhiqi, customer service female voice
ZhiTian: Zhitian, female child voice
AiXiaoJing: Ai Xiaojing, dialogue female voice
English:
WeRose:English Female Voice
Monika:English Female Voice
Japanese:
Nanami
Korean:
SunHi
Indonesian (Indonesia):
Gadis
Malay (Malaysia):
Yasmin
Tamil (Malaysia):
Kani
Thai (Thailand):
Achara
Vietnamese (Vietnam):
HoaiMy
public String[] getCallers()
public void setCallers(String[] Callers)
Callers
- Caller number list
public String getWelcomeMessage()
public void setWelcomeMessage(String WelcomeMessage)
WelcomeMessage
- Used to set the AI Agent Welcome Message.
public Long getWelcomeType()
public void setWelcomeType(Long WelcomeType)
WelcomeType
- 0: Use welcomeMessage (if empty, the callee speaks first; if not empty, the bot speaks first)
1: Use AI to automatically generate welcomeMessage and speak first based on the prompt
public Long getWelcomeMessagePriority()
public void setWelcomeMessagePriority(Long WelcomeMessagePriority)
WelcomeMessagePriority
- 0: interruptible by default, 1: high priority and not interruptible.
public Long getMaxDuration()
public void setMaxDuration(Long MaxDuration)
MaxDuration
- Maximum Waiting Duration (milliseconds), default is 60 seconds. If the user does not speak within this time, the call is automatically terminated.
public String[] getLanguages()
public void setLanguages(String[] Languages)
Languages
- ASR Supported Languages, default is "zh" Chinese,
Fill in the array with up to 4 languages, the first is the primary language for recognition, followed by optional languages,
Note: When the primary language is a Chinese dialect, optional languages are invalid
Currently, the supported languages are as follows. The English name of the language is on the left side of the equals sign, and the value to be filled in the Language field is on the right side, following ISO639:
1. Chinese = "zh" # Chinese
2. Chinese_TW = "zh-TW" # Taiwan (China)
3. Chinese_DIALECT = "zh-dialect" # Chinese Dialect
4. English = "en" # English
5. Vietnamese = "vi" # Vietnamese
6. Japanese = "ja" # Japanese
7. Korean = "ko" # Korean
8. Indonesia = "id" # Indonesian
9. Thai = "th" # Thai
10. Portuguese = "pt" # Portuguese
11. Turkish = "tr" # Turkish
12. Arabic = "ar" # Arabic
13. Spanish = "es" # Spanish
14. Hindi = "hi" # Hindi
15. French = "fr" # French
16. Malay = "ms" # Malay
17. Filipino = "fil" # Filipino
18. German = "de" # German
19. Italian = "it" # Italian
20. Russian = "ru" # Russian
public Long getInterruptMode()
public void setInterruptMode(Long InterruptMode)
InterruptMode
- AI speech interruption mode. Default is 0. 0 indicates automatic interruption and 1 indicates no interruption.
public Long getInterruptSpeechDuration()
public void setInterruptSpeechDuration(Long InterruptSpeechDuration)
InterruptSpeechDuration
- Used when InterruptMode is 0; unit is milliseconds, default is 500 ms. The server interrupts once it detects continuous vocal input lasting InterruptSpeechDuration milliseconds.
public Boolean getEndFunctionEnable()
public void setEndFunctionEnable(Boolean EndFunctionEnable)
EndFunctionEnable
- Whether the model supports (or enables) call_end function calling
public String getEndFunctionDesc()
public void setEndFunctionDesc(String EndFunctionDesc)
EndFunctionDesc
- Effective when EndFunctionEnable is true; the description of call_end function calling, default is "End the call when user has to leave (like says bye) or you are instructed to do so."
public Boolean getTransferFunctionEnable()
public void setTransferFunctionEnable(Boolean TransferFunctionEnable)
TransferFunctionEnable
- Whether the model supports (or enables) transfer_to_human function calling.
public AITransferItem[] getTransferItems()
public void setTransferItems(AITransferItem[] TransferItems)
TransferItems
- Takes effect when TransferFunctionEnable is true: transfer to human configuration.
public Long getNotifyDuration()
public void setNotifyDuration(Long NotifyDuration)
NotifyDuration
- The duration after which the user hasn't spoken to trigger a notification, minimum 10 seconds, default 10 seconds
public String getNotifyMessage()
public void setNotifyMessage(String NotifyMessage)
NotifyMessage
- The AI prompt when NotifyDuration has passed without the user speaking, default is "Sorry, I didn't hear you clearly. Can you repeat that?"
public Long getNotifyMaxCount()
public void setNotifyMaxCount(Long NotifyMaxCount)
NotifyMaxCount
- Maximum number of times to trigger the AI prompt sound, unlimited by default.
public String getCustomTTSConfig()
Choose either this field or VoiceType: CustomTTSConfig uses your own custom TTS, while VoiceType selects one of the built-in voices.
{
"TTSType": "tencent", // String TTS type, currently supports "tencent" and "minimax"; support for other vendors is in progress
"AppId": "Your application ID", // String required
"SecretId": "Your Secret ID", // String Required
"SecretKey": "Your Secret Key", // String Required
"VoiceType": 101001, // Integer Required, Sound quality ID, includes standard and premium sound quality. Premium sound quality is more realistic and differently priced than standard sound quality. See TTS billing overview for details. For the full list of sound quality IDs, see the TTS sound quality list.
"Speed": 1.25, // Integer Optional, speech speed, range: [-2,6], corresponding to different speeds: -2: represents 0.6x -1: represents 0.8x 0: represents 1.0x (default) 1: represents 1.2x 2: represents 1.5x 6: represents 2.5x For more precise speed control, you can retain two decimal places, such as 0.5/1.25/2.81, etc. For parameter value to actual speed conversion, refer to Speed Conversion
"Volume": 5, // Integer Optional, Volume level, range: [0,10], corresponding to 11 levels of volume, default is 0, which represents normal volume.
"PrimaryLanguage": 1, // Integer Optional, Primary language 1- Chinese (default) 2- English 3- Japanese
"FastVoiceType": "xxxx" // Optional parameter, Fast VRS parameter
}
{
"TTSType": "azure", // Required: String TTS type
"SubscriptionKey": "xxxxxxxx", // Required: String subscription key
"Region": "chinanorth3", // Required: String subscription region
"VoiceName": "zh-CN-XiaoxiaoNeural", // Required: String Timbre Name required
"Language": "zh-CN", // Required: String Language for synthesis
"Rate": 1 // Optional: float Playback Speed 0.5-2 default is 1
}
Custom TTS
Please refer to the specific protocol standards in the Tencent documentation
{
"TTSType": "custom", // Required String
"APIKey": "ApiKey", // Required String for Authentication
"APIUrl": "http://0.0.0.0:8080/stream-audio", // Required String, TTS API URL
"AudioFormat": "wav", // String, optional, expected audio format, such as mp3, ogg_opus, pcm, wav; default is wav. Currently only pcm and wav are supported.
"SampleRate": 16000, // Integer, optional, audio sample rate, default is 16000 (16k), recommended value is 16000
"AudioChannel": 1 // Integer, optional, number of audio channels, values: 1 or 2, default is 1
}
public void setCustomTTSConfig(String CustomTTSConfig)
CustomTTSConfig
- Choose either this field or VoiceType: CustomTTSConfig uses your own custom TTS, while VoiceType selects one of the built-in voices.
{
"TTSType": "tencent", // String TTS type, currently supports "tencent" and "minimax"; support for other vendors is in progress
"AppId": "Your application ID", // String required
"SecretId": "Your Secret ID", // String Required
"SecretKey": "Your Secret Key", // String Required
"VoiceType": 101001, // Integer Required, Sound quality ID, includes standard and premium sound quality. Premium sound quality is more realistic and differently priced than standard sound quality. See TTS billing overview for details. For the full list of sound quality IDs, see the TTS sound quality list.
"Speed": 1.25, // Integer Optional, speech speed, range: [-2,6], corresponding to different speeds: -2: represents 0.6x -1: represents 0.8x 0: represents 1.0x (default) 1: represents 1.2x 2: represents 1.5x 6: represents 2.5x For more precise speed control, you can retain two decimal places, such as 0.5/1.25/2.81, etc. For parameter value to actual speed conversion, refer to Speed Conversion
"Volume": 5, // Integer Optional, Volume level, range: [0,10], corresponding to 11 levels of volume, default is 0, which represents normal volume.
"PrimaryLanguage": 1, // Integer Optional, Primary language 1- Chinese (default) 2- English 3- Japanese
"FastVoiceType": "xxxx" // Optional parameter, Fast VRS parameter
}
{
"TTSType": "azure", // Required: String TTS type
"SubscriptionKey": "xxxxxxxx", // Required: String subscription key
"Region": "chinanorth3", // Required: String subscription region
"VoiceName": "zh-CN-XiaoxiaoNeural", // Required: String Timbre Name required
"Language": "zh-CN", // Required: String Language for synthesis
"Rate": 1 // Optional: float Playback Speed 0.5-2 default is 1
}
Custom TTS
Please refer to the specific protocol standards in the Tencent documentation
{
"TTSType": "custom", // Required String
"APIKey": "ApiKey", // Required String for Authentication
"APIUrl": "http://0.0.0.0:8080/stream-audio", // Required String, TTS API URL
"AudioFormat": "wav", // String, optional, expected audio format, such as mp3, ogg_opus, pcm, wav; default is wav. Currently only pcm and wav are supported.
"SampleRate": 16000, // Integer, optional, audio sample rate, default is 16000 (16k), recommended value is 16000
"AudioChannel": 1 // Integer, optional, number of audio channels, values: 1 or 2, default is 1
}
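Because CustomTTSConfig is passed as a plain String, the JSON has to be assembled by the caller. The sketch below builds the "azure" variant documented above without any JSON library. The class name `AzureTtsConfigSketch` and its helper methods are hypothetical and not part of the SDK; the escaping covers only backslashes and double quotes, which is an assumption about what typical values contain.

```java
import java.util.Locale;

/** Sketch only: assembles the "azure" CustomTTSConfig JSON as a String. */
class AzureTtsConfigSketch {

    /** Wraps a value in quotes, escaping backslashes and double quotes (no control characters). */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static String build(String subscriptionKey, String region, String voiceName,
                        String language, double rate) {
        // The documented range for Rate is 0.5-2; this form of the check also rejects NaN.
        if (!(rate >= 0.5 && rate <= 2.0)) {
            throw new IllegalArgumentException("Rate must be within 0.5-2, got " + rate);
        }
        return "{"
            + "\"TTSType\":\"azure\","
            + "\"SubscriptionKey\":" + quote(subscriptionKey) + ","
            + "\"Region\":" + quote(region) + ","
            + "\"VoiceName\":" + quote(voiceName) + ","
            + "\"Language\":" + quote(language) + ","
            // Locale.ROOT keeps the decimal separator a dot regardless of the system locale.
            + "\"Rate\":" + String.format(Locale.ROOT, "%.2f", rate)
            + "}";
    }
}
```

The resulting string would then be handed to setCustomTTSConfig(String), with VoiceType left blank as described for the VoiceType parameter.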
@Deprecated public Variable[] getPromptVariables()
@Deprecated public void setPromptVariables(Variable[] PromptVariables)
PromptVariables
- Prompt word variable.
public Long getVadSilenceTime()
public void setVadSilenceTime(Long VadSilenceTime)
VadSilenceTime
- Automatic speech recognition VAD time ranges from 240 to 2000, with a default of 1000, measured in milliseconds. Smaller values make automatic speech recognition segment faster.
public AICallExtractConfigElement[] getExtractConfig()
public void setExtractConfig(AICallExtractConfigElement[] ExtractConfig)
ExtractConfig
- Call content extraction configuration.
public void toMap(HashMap<String,String> map, String prefix)
toMap
in class AbstractModel
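Several parameters on this page have documented limits: Languages takes at most 4 entries and ignores optional languages when the primary one is a Chinese dialect, VadSilenceTime ranges from 240 to 2000, and InterruptMode is 0 or 1. The following is a minimal sketch of checking those limits on the client before building a request. The class `AICallParamCheckSketch` is hypothetical and not part of the SDK, and treating the dialect case as a reportable problem is an assumption; the service may simply ignore the extra languages.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: client-side checks for limits stated in this reference. */
class AICallParamCheckSketch {

    /** Returns a list of human-readable problems; an empty list means no limit was violated. */
    static List<String> check(String[] languages, Long vadSilenceTime, Long interruptMode) {
        List<String> problems = new ArrayList<>();
        if (languages != null) {
            if (languages.length > 4) {
                problems.add("Languages: at most 4 entries are allowed");
            }
            // "zh-dialect" is the documented value for Chinese Dialect.
            if (languages.length > 1 && "zh-dialect".equals(languages[0])) {
                problems.add("Languages: optional languages are invalid when the primary is zh-dialect");
            }
        }
        if (vadSilenceTime != null && (vadSilenceTime < 240 || vadSilenceTime > 2000)) {
            problems.add("VadSilenceTime: must be within 240-2000 ms");
        }
        if (interruptMode != null && interruptMode != 0L && interruptMode != 1L) {
            problems.add("InterruptMode: must be 0 or 1");
        }
        return problems;
    }
}
```

A null argument is treated as "not set", so the corresponding default applies and no check is made.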
Copyright © 2025. All rights reserved.