What's the best way to integrate Speech to text?

Hey all!

I’m looking to build a small app where you can interact with an avatar in XR. First step towards that is speech to text. What’s the best way to allow users to speak and convert what they say into text?

Thanks a ton!

Ahmed

Hey Ahmed, I’ve done this before in an 8th Wall project, using OpenAI’s speech-to-text API.

It’s quite straightforward: you call getUserMedia to request permission to record audio, then use the MediaRecorder API to do the recording.
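
For reference, here is a minimal sketch of that recording step in TypeScript. It assumes a browser that supports `navigator.mediaDevices.getUserMedia` and `MediaRecorder`. The names `pickMimeType` and `startRecording` and the list of candidate MIME types are placeholders of mine, not something taken from the original project.

```typescript
// Pick the first recording format the browser reports as supported.
// `isSupported` is passed in so this helper can also run outside a browser.
function pickMimeType(
  isSupported: (type: string) => boolean,
  candidates: string[] = ["audio/webm;codecs=opus", "audio/webm", "audio/mp4"],
): string | undefined {
  return candidates.find((type) => isSupported(type));
}

interface ActiveRecording {
  // Stops the recorder and resolves with everything recorded so far.
  stop: () => Promise<Blob>;
}

async function startRecording(): Promise<ActiveRecording> {
  // getUserMedia is what triggers the microphone permission prompt.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

  const mimeType = pickMimeType((type) => MediaRecorder.isTypeSupported(type));
  const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);

  // Collect the chunks delivered through ondataavailable.
  const chunks: Blob[] = [];
  recorder.ondataavailable = (event) => {
    if (event.data.size > 0) chunks.push(event.data);
  };
  recorder.start();

  return {
    stop: () =>
      new Promise<Blob>((resolve) => {
        recorder.onstop = () => {
          // Release the microphone so the browser's recording indicator goes away.
          stream.getTracks().forEach((track) => track.stop());
          resolve(new Blob(chunks, { type: recorder.mimeType }));
        };
        recorder.stop();
      }),
  };
}
```

You would call `startRecording()` from your "start" button handler, keep the returned object around, and call its `stop()` from your "stop" button handler to get the audio Blob.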

Ideally, include a UI that makes it clear to the user when to start recording and when to stop. Keep it simple.

Once the recording is done, you can listen for the ondataavailable event to capture the recorded data and convert it to a format that the Speech to Text API accepts (like mp3). POST the mp3 to the API and it will return a transcript.
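
A rough sketch of the upload step, again in TypeScript. It assumes OpenAI's `/v1/audio/transcriptions` endpoint with the `whisper-1` model, and a runtime where `fetch`, `FormData`, and `Blob` are globals. The function names and the `recording.*` file name are my own placeholders. As far as I know, the API also lists webm among its accepted formats, so you may be able to skip the mp3 conversion and send the MediaRecorder output as-is, but check the current docs.

```typescript
const TRANSCRIPTION_URL = "https://api.openai.com/v1/audio/transcriptions";

// Map a Blob MIME type to a file extension, so the uploaded file name
// hints at the audio format.
function extensionFor(mimeType: string): string {
  if (mimeType.includes("webm")) return "webm";
  if (mimeType.includes("mp4")) return "mp4";
  if (mimeType.includes("mpeg")) return "mp3";
  if (mimeType.includes("wav")) return "wav";
  return "webm"; // Assumption: fall back to MediaRecorder's most common output.
}

// Build the multipart body: the audio file plus the model name.
function buildTranscriptionForm(audio: Blob, model = "whisper-1"): FormData {
  const form = new FormData();
  form.append("file", audio, `recording.${extensionFor(audio.type)}`);
  form.append("model", model);
  return form;
}

// POST the audio and return the transcript text.
// Shipping an API key in client-side code exposes it to every visitor, so in a
// real app this request would normally go through your own backend instead.
async function transcribe(audio: Blob, apiKey: string): Promise<string> {
  const response = await fetch(TRANSCRIPTION_URL, {
    method: "POST",
    // Do not set Content-Type by hand; fetch adds the multipart boundary itself.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audio),
  });
  if (!response.ok) {
    throw new Error(`Transcription request failed with status ${response.status}`);
  }
  const result = (await response.json()) as { text: string };
  return result.text;
}
```

With the Blob from your recording in hand, `await transcribe(blob, apiKey)` should give you the text to feed to the avatar.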

I can vouch for how good it is: it spelled my name perfectly in the middle of an English sentence.

Alternatively, you could use 8th Wall’s Media Recorder and hack your way into extracting the audio from the videoBlob that it returns. I’m not sure whether it can record audio only.

Let me know how it works!

Awesome! Thank you Florencia
