Speech-to-text conversion

speech/recognize@1.0.0
Supported by 4 providers

Speech recognition

Real-time speech recognition.

Input
Audio content
Language code
Audio encoding
Maximum alternatives
Result
Results

1. Choose a provider

2. Use Recognize with mock in your code

npm i @superfaceai/one-sdk

const { SuperfaceClient } = require('@superfaceai/one-sdk');

const sdk = new SuperfaceClient();

async function run() {
  // Load the profile
  const profile = await sdk.getProfile('speech/recognize@1.0.0');

  // Use the profile
  const result = await profile
    .getUseCase('Recognize')
    .perform({
      audioContent: '<base64 encoded wav audio>',
      languageCode: 'en-US'
    }, {
      provider: 'mock'
    });

  // Handle the result
  try {
    const data = result.unwrap();
    console.log(data);
  } catch (error) {
    console.error(error);
  }
}

run();

Structure details

Input (object)

audioContent
Audio data in the encoding specified by the audioEncoding input parameter.
languageCode
The language (and potentially also the region) of the speech expressed as a BCP-47 language tag, e.g. 'en-US'.
audioEncoding
Encoding of the audio data sent. This input is optional for WAV audio files and required for other audio formats.
maxAlternatives
Maximum number of recognition hypotheses to be returned. The server may return fewer than maxAlternatives. Valid values are 0-30. Default value is 1.

Example

{
  "audioContent": "<base64 encoded wav audio>",
  "languageCode": "en-US"
}
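
The input object can be assembled and checked before it is passed to perform(). The sketch below is one possible way to do that, not part of the SDK: buildRecognizeInput is a hypothetical helper name, and it assumes the audio is already in a Node.js Buffer and that audioContent is expected as base64, as the example above suggests. It normalizes the language tag with the built-in Intl.getCanonicalLocales and enforces the documented 0-30 range for maxAlternatives.

```javascript
// Hypothetical helper (not part of @superfaceai/one-sdk): builds the
// input object for the Recognize use case from raw audio bytes.
function buildRecognizeInput(audioBuffer, languageCode, options = {}) {
  // Intl.getCanonicalLocales throws a RangeError for a malformed tag
  // and returns the canonical casing otherwise (e.g. 'en-us' -> 'en-US').
  const [canonicalTag] = Intl.getCanonicalLocales(languageCode);

  const input = {
    // Assumption: audioContent is base64 text, as in the example above.
    audioContent: audioBuffer.toString('base64'),
    languageCode: canonicalTag,
  };

  // audioEncoding is optional for WAV and required for other formats,
  // so it is only included when the caller supplies it.
  if (options.audioEncoding !== undefined) {
    input.audioEncoding = options.audioEncoding;
  }

  if (options.maxAlternatives !== undefined) {
    const n = options.maxAlternatives;
    if (!Number.isInteger(n) || n < 0 || n > 30) {
      throw new RangeError('maxAlternatives must be an integer from 0 to 30');
    }
    input.maxAlternatives = n;
  }

  return input;
}
```

With a file on disk, the call might look like buildRecognizeInput(fs.readFileSync('speech.wav'), 'en-US', { maxAlternatives: 3 }), where speech.wav is a placeholder path.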

Result (object)

results
Sequential list of transcription results corresponding to sequential portions of audio.

Example

{
  "results": [
    {
      "alternatives": [
        {
          "confidence": 0.8393012,
          "transcript": "hello world"
        }
      ]
    }
  ]
}
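
Because results is a sequential list and each entry may carry several alternatives, a full transcript has to be stitched together by the caller. The following sketch shows one way to do it, assuming every entry has the shape shown in the example above (an alternatives array whose items carry confidence and transcript). bestTranscript is a hypothetical helper name, and choosing the highest-confidence alternative is a design choice made here, not something the profile prescribes.

```javascript
// Hypothetical helper: joins the most confident alternative of each
// result into a single transcript string.
function bestTranscript(data) {
  const parts = [];

  for (const result of data.results || []) {
    let best = null;
    for (const alternative of result.alternatives || []) {
      // A missing confidence is treated as 0 (an assumption of this sketch).
      const score = alternative.confidence ?? 0;
      if (best === null || score > (best.confidence ?? 0)) {
        best = alternative;
      }
    }
    // Skip results that have no usable transcript.
    if (best !== null && typeof best.transcript === 'string' && best.transcript !== '') {
      parts.push(best.transcript);
    }
  }

  // Results cover sequential portions of the audio, so order is preserved.
  return parts.join(' ');
}
```

Applied to the example result above, bestTranscript returns "hello world".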

Implementation details

Provider
mock
Use case
Recognize
Author
@superface
Source
Verified