Synchronous API (Request-Response)

Introduction

This section describes the Synchronous HTTP API for Gowajee's speech-to-text (STT) service. With this API, clients send a request to the server and wait until the STT processing is complete to receive the result. This approach ensures that you receive the transcription data in a single, straightforward response.

Note: At the moment, the Synchronous API supports only the Pulse model.

Workflow

  1. Send Request: The client sends an HTTP request to the Gowajee API endpoint, including the audio data to be transcribed.

  2. Processing: The server processes the audio data using the specified STT model.

  3. Receive Response: The server responds with the transcription result once the processing is complete.
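
The steps above can be sketched with Python's standard library. This is a minimal illustration, not an official client: the `Content-Type` header and the exact JSON body shape are assumptions based on the parameter reference in this section.

```python
import base64
import json
import urllib.request

API_KEY = "your-api-key"  # placeholder; use your real key
MODEL = "pulse"
ENDPOINT = f"https://api.gowajee.ai/v1/speech-to-text/{MODEL}/transcribe"

def build_request(audio_bytes: bytes) -> urllib.request.Request:
    """Build the synchronous transcription request (step 1; not yet sent)."""
    body = json.dumps({
        # audioData is the base64-encoded audio content
        "audioData": base64.b64encode(audio_bytes).decode("ascii"),
    }).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "x-api-key": API_KEY,
            "Content-Type": "application/json",  # assumed for a JSON body
        },
    )

# Sending the request blocks until processing finishes (steps 2-3):
# with urllib.request.urlopen(build_request(audio_bytes)) as resp:
#     result = json.load(resp)
```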


Request

  • Method: POST

  • Endpoint: https://api.gowajee.ai/v1/speech-to-text/${MODEL}/transcribe

Supported Models

| Model  | Value    |
|--------|----------|
| Pulse  | `pulse`  |
| Cosmos | `cosmos` |

Headers

| Name      | Type   | Required | Description                      |
|-----------|--------|----------|----------------------------------|
| x-api-key | string | Yes      | An API key to access the service |

Body Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| audioData | string | Yes | Audio content, sent either as a base64-encoded string or via multipart/form-data. |
| getSpeakingRate | boolean | No | Return the speaking rate (syllables per second). |
| getWordTimestamps | boolean | No | Return timestamps for every word in the transcription. Available only for the Pulse model. |
| boostWordList | string[] | No | Specific words whose chance of appearing in the results should be increased. Available only for the Pulse and Cosmos models. Read more details. |
| boostScore | integer | No | An integer between 1 and 20 controlling how strongly the words in boostWordList are boosted. Available only for the Pulse and Cosmos models. Read more details. |
| multichannels | boolean | No | Set multichannels=true if your audioData is multichannel audio. This is useful for recordings where each speaker is on a separate channel. Read more details. |
| diarization | boolean | No | Set diarization=true to perform speaker separation with the diarization feature. Read more details. |
| numSpeakers | integer | No | Exact number of speakers in your audioData. Read more details. |
| minSpeakers | integer | No | Minimum number of speakers in your audioData. Read more details. |
| maxSpeakers | integer | No | Maximum number of speakers in your audioData. Read more details. |
| refSpeakers | RefSpeaker[] | No | A 4-5 second sample of each speaker's voice, used for diarization. Users can upload multiple audio files, and the service will assume each file corresponds to a different speaker. If the service cannot determine which speaker corresponds to a particular transcription, it labels the speaker as 'unknown'. Read more details. |
| sampleRate | integer | No (required for Raw Audio Format) | The number of audio samples carried per second, measured in Hertz (Hz). Read more about Raw Audio Format. |
| sampleWidth | integer | No (required for Raw Audio Format) | The sample width, also known as bit depth: the number of bytes used to represent each audio sample (1 means 8-bit, 2 means 16-bit, etc.). It directly affects the dynamic range of the audio signal. Read more about Raw Audio Format. |
| channels | integer | No (required for Raw Audio Format) | The number of independent audio channels in the file. Common values are mono (1 channel) and stereo (2 channels). Read more about Raw Audio Format. |
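
The three raw-audio parameters describe headerless PCM, which carries no metadata of its own. If your audio starts out as a WAV file, the standard-library `wave` module can recover them before you strip the header; a sketch (`raw_audio_params` is a hypothetical helper, not part of the API):

```python
import wave

def raw_audio_params(path: str) -> dict:
    """Read sampleRate, sampleWidth and channels from a WAV file's header."""
    with wave.open(path, "rb") as wf:
        return {
            "sampleRate": wf.getframerate(),   # samples per second (Hz)
            "sampleWidth": wf.getsampwidth(),  # bytes per sample (2 = 16-bit)
            "channels": wf.getnchannels(),     # 1 = mono, 2 = stereo
        }
```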

Response

{
  "type": "ASR_PULSE",
  "amount": 4.517,
  "output": {
    "results": [
      {
        "transcript": "วันนี้กินอะไรดี",
        "startTime": 0,
        "endTime": 4.517
      }
    ],
    "duration": 4.517,
    "version": "2.2.0"
  }
}
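
As a sketch of consuming this response, the snippet below parses the sample payload with Python's `json` module; field names are taken verbatim from the example above.

```python
import json

# The sample response body, verbatim:
response_text = """
{
  "type": "ASR_PULSE",
  "amount": 4.517,
  "output": {
    "results": [
      {"transcript": "วันนี้กินอะไรดี", "startTime": 0, "endTime": 4.517}
    ],
    "duration": 4.517,
    "version": "2.2.0"
  }
}
"""

resp = json.loads(response_text)
# Join the per-segment transcripts into a single string.
transcript = " ".join(seg["transcript"] for seg in resp["output"]["results"])
duration = resp["output"]["duration"]
```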
