SAPI: Microsoft Speech API

Started by José Roca, August 17, 2008, 04:15:13 AM


José Roca

 
The SAPI application programming interface (API) dramatically reduces the code overhead required for an application to use speech recognition and text-to-speech, making speech technology more accessible and robust for a wide range of applications.

SAPI provides a high-level interface between an application and speech engines. It implements all the low-level details needed to control and manage the real-time operations of various speech engines.

The two basic types of SAPI engines are text-to-speech (TTS) systems and speech recognizers. TTS systems synthesize text strings and files into spoken audio using synthetic voices. Speech recognizers convert human spoken audio into readable text strings and files.

The following example creates an instance of the speech engine and calls the Speak method.


' ########################################################################################
' SAPI example
' Creates an instance of the speech engine and calls the Speak method.
' ########################################################################################

#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "SAPI.INC"

' ========================================================================================
' Main
' ========================================================================================
FUNCTION PBMAIN () AS LONG

   LOCAL pISpVoice AS ISpVoice
   LOCAL wszText AS STRING
   LOCAL ulStreamNumber AS DWORD

   ' Create an instance of the ISpVoice interface
   pISpVoice = NEWCOM CLSID $CLSID_SpVoice
   IF ISNOTHING(pISpVoice) THEN EXIT FUNCTION

   ' Speak some text
   wszText = UCODE$("Hello everybody" & $NUL)
   pISpVoice.Speak(STRPTR(wszText), %SPF_DEFAULT, ulStreamNumber)
   pISpVoice.WaitUntilDone(&HFFFFFFFF)

   ' Release the interface
   pISpVoice = NOTHING

END FUNCTION
' ========================================================================================


José Roca

#1
 
The following example enumerates the voices collection.


' ########################################################################################
' SAPI example
' Enumerates the voices collection.
' ########################################################################################

#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "SAPIUTILS.INC"
#INCLUDE ONCE "OLE2UTILS.INC"

' ========================================================================================
' Main
' ========================================================================================
FUNCTION PBMAIN () AS LONG

   LOCAL hr AS LONG
   LOCAL pISpVoice AS ISpVoice
   LOCAL pIEnumSpObjectTokens AS IEnumSpObjectTokens
   LOCAL wszCategoryId AS STRING

   LOCAL nCount AS DWORD
   LOCAL i AS DWORD
   LOCAL pISpObjectToken AS ISpObjectToken
   LOCAL celtFetched AS DWORD
   LOCAL pszValue AS DWORD

   ' Create an instance of the ISpVoice interface
   pISpVoice = NEWCOM CLSID $CLSID_SpVoice
   IF ISNOTHING(pISpVoice) THEN EXIT FUNCTION

   ' Get a reference to an enumerator for the voices collection
   ' using the helper function SpEnumTokens
   wszCategoryId = UCODE$($SPCAT_VOICES & $NUL)
   hr = SpEnumTokens(STRPTR(wszCategoryId), %NULL, %NULL, pIEnumSpObjectTokens)
   ' Parse the collection
   IF SUCCEEDED(hr) THEN
      pIEnumSpObjectTokens.GetCount(nCount)
      FOR i = 1 TO nCount   ' 1-based bounds avoid evaluating nCount - 1 when the collection is empty
         hr = pIEnumSpObjectTokens.Next(1, pISpObjectToken, celtFetched)
         IF FAILED(hr) OR celtFetched = 0 THEN EXIT FOR
         hr = SpGetDescription(pISpObjectToken, pszValue)
         IF hr = %S_OK AND pszValue <> %NULL THEN
            ? W2A_(pszValue)
            CoTaskMemFree pszValue
         END IF
         pISpObjectToken = NOTHING
      NEXT
      pIEnumSpObjectTokens = NOTHING
   END IF

   pISpVoice = NOTHING

   #IF %DEF(%PB_CC32)
      WAITKEY$
   #ENDIF

END FUNCTION
' ========================================================================================


José Roca

#2
 
The following example sets the voice used by the speech engine.


' ########################################################################################
' SAPI example
' Changes the voice used by the speech engine.
' ########################################################################################

#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "SAPIUTILS.INC"

' ========================================================================================
' Main
' ========================================================================================
FUNCTION PBMAIN () AS LONG

   LOCAL hr AS LONG
   LOCAL pISpVoice AS ISpVoice
   LOCAL wszText AS STRING
   LOCAL ulStreamNumber AS DWORD

   ' Create an instance of the ISpVoice interface
   pISpVoice = NEWCOM CLSID $CLSID_SpVoice
   IF ISNOTHING(pISpVoice) THEN EXIT FUNCTION

   ' Set the voice using the helper function SpSetVoice
   hr = SpSetVoice(pISpVoice, "Microsoft Mary")

   ' Speak some text
   wszText = UCODE$("Hello everybody" & $NUL)
   pISpVoice.Speak(STRPTR(wszText), %SPF_DEFAULT, ulStreamNumber)
   pISpVoice.WaitUntilDone(&HFFFFFFFF)

   pISpVoice = NOTHING

END FUNCTION
' ========================================================================================


Petr Schreiber

#3
The following example (based 99% on José's code) shows a simple function I recently had a use for.
It can be called like this:

SAPI_SpeakToFile("IVONA 2 Amy - British English female voice [22kHz]", "Hello, I am SAPI voice", EXE.PATH$+"MyTest.WAV")


... and might come in handy if you need to:

  • Select your preferred voice
  • Save the output of the SAPI engine to a file instead of playing it


' -- You need to include SAPIUTILS.INC before using this
FUNCTION SAPI_SpeakToFile (BYVAL sActor AS STRING, BYVAL sText AS STRING, BYVAL sFileName AS STRING) AS LONG

 LOCAL hr AS LONG
 LOCAL voice AS ISpVoice
 LOCAL fileStream AS ISpeechFileStream
 LOCAL wszText AS STRING
 LOCAL ulStreamNumber AS DWORD

 ' Create an instance of the ISpVoice interface
 voice = NEWCOM CLSID $CLSID_SpVoice
 IF ISNOTHING(voice) THEN EXIT FUNCTION

 ' Set the voice using the helper function SpSetVoice
 hr = SpSetVoice(voice, sActor)

 ' Create an instance of the ISpeechFileStream interface
 fileStream = NEWCOM CLSID $CLSID_SpFileStream
 IF ISNOTHING(fileStream) THEN EXIT FUNCTION
 ' Assign file output
 fileStream.Open(UCODE$(sFileName+$NUL), %SSFMCreateForWrite)

 ' Redirect output to the file
 voice.SetOutput(fileStream, %FALSE)

 ' Speak some text
 wszText = UCODE$(sText & $NUL)
 voice.Speak(STRPTR(wszText), %SPF_DEFAULT, ulStreamNumber)
 voice.WaitUntilDone(&HFFFFFFFF)

 voice = NOTHING

END FUNCTION


Petr
AMD Sempron 3400+ | 1GB RAM @ 533MHz | GeForce 6200 / GeForce 9500GT | 32bit Windows XP SP3

psch.thinbasic.com

Steve Bouffe

Hello,

I'm trying to compile this example in PB Win 10, but I get a "String operand expected" error on this line:

voice.Speak( STRPTR(wszText), %SPF_DEFAULT, ulStreamNumber)

José Roca

This was written for the PB9 compiler many years ago.

Try this:


#COMPILE EXE
#DIM ALL
#INCLUDE ONCE "windows.inc"
#INCLUDE ONCE "sapi.inc"


FUNCTION SAPI_SpeakToFile (BYVAL sActor AS WSTRING, BYVAL sText AS WSTRING, BYVAL sFileName AS WSTRING) AS LONG

  LOCAL hr AS LONG
  LOCAL voice AS ISpVoice
  LOCAL fileStream AS ISpeechFileStream
  LOCAL wszText AS WSTRING
  LOCAL ulStreamNumber AS DWORD

  ' Create an instance of the ISpVoice interface
  voice = NEWCOM CLSID $CLSID_SpVoice
  IF ISNOTHING(voice) THEN EXIT FUNCTION

  ' Set the voice using the helper function SpSetVoice
  hr = SpSetVoice(voice, sActor)

  ' Create an instance of the ISpeechFileStream interface
  fileStream = NEWCOM CLSID $CLSID_SpFileStream
  IF ISNOTHING(fileStream) THEN EXIT FUNCTION
  ' Assign file output
  fileStream.Open(sFileName, %SSFMCreateForWrite)

  ' Redirect output to the file
  voice.SetOutput(fileStream, %FALSE)

  ' Speak some text
  wszText = sText
  voice.Speak(BYVAL STRPTR(wszText), %SPF_DEFAULT, ulStreamNumber)
  voice.WaitUntilDone(&HFFFFFFFF)

  voice = NOTHING

END FUNCTION
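
The listing above declares #COMPILE EXE but does not include an entry point, so a PBMAIN is still needed to build and try it. A minimal sketch follows, assuming it is placed after SAPI_SpeakToFile in the same source file. The voice name "Microsoft Mary" and the output file name are placeholders taken from the earlier posts, not values that are guaranteed to exist on your system. If the voice is not installed, SpSetVoice will presumably fail and the default voice is used, since the function does not check hr.

```powerbasic
FUNCTION PBMAIN () AS LONG

  ' Assumption: "Microsoft Mary" is installed; replace it with a voice
  ' name reported by the voice enumeration example on your machine.
  ' The WAV file is written next to the executable.
  SAPI_SpeakToFile("Microsoft Mary", "Hello, I am a SAPI voice", EXE.PATH$ & "MyTest.WAV")

END FUNCTION
```

The string literals and EXE.PATH$ are ordinary strings here; this sketch relies on the PB10 compiler converting them to the BYVAL WSTRING parameters.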