The SAPI application programming interface (API) dramatically reduces the code overhead required for an application to use speech recognition and text-to-speech, making speech technology more accessible and robust for a wide range of applications.
SAPI provides a high-level interface between an application and speech engines. It implements all the low-level details needed to control and manage the real-time operations of various speech engines.
The two basic types of SAPI engines are text-to-speech (TTS) systems and speech recognizers. TTS systems synthesize text strings and files into spoken audio using synthetic voices. Speech recognizers convert human spoken audio into readable text strings and files.
The following example creates an instance of the speech engine and calls the Speak method.
' ########################################################################################
' SAPI example
' Creates an instance of the speech engine and calls the Speak method.
' ########################################################################################
#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "SAPI.INC"
' ========================================================================================
' Main
' ========================================================================================
FUNCTION PBMAIN () AS LONG
   LOCAL pISpVoice AS ISpVoice
   LOCAL wszText AS STRING
   LOCAL ulStreamNumber AS DWORD
   ' Create an instance of the ISpVoice interface
   pISpVoice = NEWCOM CLSID $CLSID_SpVoice
   IF ISNOTHING(pISpVoice) THEN EXIT FUNCTION
   ' Speak some text
   wszText = UCODE$("Hello everybody" & $NUL)
   pISpVoice.Speak(STRPTR(wszText), %SPF_DEFAULT, ulStreamNumber)
   pISpVoice.WaitUntilDone(&HFFFFFFFF)
   ' Release the interface
   pISpVoice = NOTHING
END FUNCTION
' ========================================================================================
The following example enumerates the voices collection.
' ########################################################################################
' SAPI example
' Enumerates the voices collection.
' ########################################################################################
#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "SAPIUTILS.INC"
#INCLUDE ONCE "OLE2UTILS.INC"
' ========================================================================================
' Main
' ========================================================================================
FUNCTION PBMAIN () AS LONG
   LOCAL hr AS LONG
   LOCAL pISpVoice AS ISpVoice
   LOCAL pIEnumSpObjectTokens AS IEnumSpObjectTokens
   LOCAL wszCategoryId AS STRING
   LOCAL nCount AS DWORD
   LOCAL i AS DWORD
   LOCAL pISpObjectToken AS ISpObjectToken
   LOCAL celtFetched AS DWORD
   LOCAL pszValue AS DWORD
   ' Create an instance of the ISpVoice interface
   pISpVoice = NEWCOM CLSID $CLSID_SpVoice
   IF ISNOTHING(pISpVoice) THEN EXIT FUNCTION
   ' Get a reference to an enumerator for the voices collection
   ' using the helper function SpEnumTokens
   wszCategoryId = UCODE$($SPCAT_VOICES & $NUL)
   hr = SpEnumTokens(STRPTR(wszCategoryId), %NULL, %NULL, pIEnumSpObjectTokens)
   ' Parse the collection
   IF SUCCEEDED(hr) THEN
      pIEnumSpObjectTokens.GetCount(nCount)
      FOR i = 0 TO nCount - 1
         hr = pIEnumSpObjectTokens.Next(1, pISpObjectToken, celtFetched)
         IF FAILED(hr) OR celtFetched = 0 THEN EXIT FOR
         hr = SpGetDescription(pISpObjectToken, pszValue)
         IF hr = %S_OK AND pszValue <> %NULL THEN
            ? W2A_(pszValue)
            CoTaskMemFree pszValue
         END IF
         pISpObjectToken = NOTHING
      NEXT
      pIEnumSpObjectTokens = NOTHING
   END IF
   pISpVoice = NOTHING
#IF %DEF(%PB_CC32)
   WAITKEY$
#ENDIF
END FUNCTION
' ========================================================================================
The following example sets the voice used by the speech engine.
' ########################################################################################
' SAPI example
' Changes the voice used by the speech engine.
' ########################################################################################
#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "SAPIUTILS.INC"
' ========================================================================================
' Main
' ========================================================================================
FUNCTION PBMAIN () AS LONG
   LOCAL hr AS LONG
   LOCAL pISpVoice AS ISpVoice
   LOCAL wszText AS STRING
   LOCAL ulStreamNumber AS DWORD
   ' Create an instance of the ISpVoice interface
   pISpVoice = NEWCOM CLSID $CLSID_SpVoice
   IF ISNOTHING(pISpVoice) THEN EXIT FUNCTION
   ' Set the voice using the helper function SpSetVoice
   hr = SpSetVoice(pISpVoice, "Microsoft Mary")
   ' Speak some text
   wszText = UCODE$("Hello everybody" & $NUL)
   pISpVoice.Speak(STRPTR(wszText), %SPF_DEFAULT, ulStreamNumber)
   pISpVoice.WaitUntilDone(&HFFFFFFFF)
   pISpVoice = NOTHING
END FUNCTION
' ========================================================================================
The following example (based 99% on José's code) shows a simple function I recently had a use for.
It can be called like this:
SAPI_SpeakToFile("IVONA 2 Amy - British English female voice [22kHz]", "Hello, I am SAPI voice", EXE.PATH$+"MyTest.WAV")
... and might come in handy when you need to:
- Select your preferred voice
- Save the output of the SAPI engine to a file instead of playing it
' -- You need to include SAPIUTILS.INC before using this
FUNCTION SAPI_SpeakToFile (BYVAL sActor AS STRING, BYVAL sText AS STRING, BYVAL sFileName AS STRING) AS LONG
   LOCAL hr AS LONG
   LOCAL voice AS ISpVoice
   LOCAL fileStream AS ISpeechFileStream
   LOCAL wszText AS STRING
   LOCAL ulStreamNumber AS DWORD
   ' Create an instance of the ISpVoice interface
   voice = NEWCOM CLSID $CLSID_SpVoice
   IF ISNOTHING(voice) THEN EXIT FUNCTION
   ' Set the voice using the helper function SpSetVoice
   hr = SpSetVoice(voice, sActor)
   ' Create an instance of the ISpeechFileStream interface
   fileStream = NEWCOM CLSID $CLSID_SpFileStream
   IF ISNOTHING(fileStream) THEN EXIT FUNCTION
   ' Assign file output
   fileStream.Open(UCODE$(sFileName+$NUL), %SSFMCreateForWrite)
   ' Redirect output to the file
   voice.SetOutput(fileStream, %FALSE)
   ' Speak some text
   wszText = UCODE$(sText & $NUL)
   voice.Speak(STRPTR(wszText), %SPF_DEFAULT, ulStreamNumber)
   voice.WaitUntilDone(&HFFFFFFFF)
   voice = NOTHING
END FUNCTION
Petr
Hello,
I'm trying to compile this example in PB Win 10, but I get a "string operand expected" error on this line:
voice.Speak( STRPTR(wszText), %SPF_DEFAULT, ulStreamNumber)
This was written for the PB9 compiler many years ago.
Try this:
#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "windows.inc"
#INCLUDE ONCE "sapi.inc"
FUNCTION SAPI_SpeakToFile (BYVAL sActor AS WSTRING, BYVAL sText AS WSTRING, BYVAL sFileName AS WSTRING) AS LONG
   LOCAL hr AS LONG
   LOCAL voice AS ISpVoice
   LOCAL fileStream AS ISpeechFileStream
   LOCAL wszText AS WSTRING
   LOCAL ulStreamNumber AS DWORD
   ' Create an instance of the ISpVoice interface
   voice = NEWCOM CLSID $CLSID_SpVoice
   IF ISNOTHING(voice) THEN EXIT FUNCTION
   ' Set the voice using the helper function SpSetVoice
   hr = SpSetVoice(voice, sActor)
   ' Create an instance of the ISpeechFileStream interface
   fileStream = NEWCOM CLSID $CLSID_SpFileStream
   IF ISNOTHING(fileStream) THEN EXIT FUNCTION
   ' Assign file output
   fileStream.Open(sFileName, %SSFMCreateForWrite)
   ' Redirect output to the file
   voice.SetOutput(fileStream, %FALSE)
   ' Speak some text
   wszText = sText
   voice.Speak(BYVAL STRPTR(wszText), %SPF_DEFAULT, ulStreamNumber)
   voice.WaitUntilDone(&HFFFFFFFF)
   voice = NOTHING
END FUNCTION
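The listing above has #COMPILE EXE but no PBMAIN, so it needs an entry point before you can build and try it. Here is a minimal sketch, assuming it is placed after SAPI_SpeakToFile in the same source file. The voice name and the output file name are only placeholders (the voice name is borrowed from the earlier SpSetVoice example), so substitute a voice that is actually installed on your system:

```powerbasic
FUNCTION PBMAIN () AS LONG
   ' Assumption: "Microsoft Mary" is a placeholder voice name;
   ' replace it with a voice installed on your machine.
   ' The WAV file is written to the folder that contains the executable.
   SAPI_SpeakToFile("Microsoft Mary", "Hello, I am SAPI voice", EXE.PATH$ + "MyTest.WAV")
END FUNCTION
```

Note that SAPI_SpeakToFile as posted never assigns a return value, so the caller cannot tell from the result whether the call succeeded; checking that the WAV file exists afterwards is one simple way to confirm it worked.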