
TypeScript - incorrect type when using verbose_json as the whisper transcription response_format #702

Open
jessebs opened this issue Mar 3, 2024 · 3 comments


jessebs commented Mar 3, 2024

Confirm this is a Node library issue and not an underlying OpenAI API issue

  • This is an issue with the Node library

Describe the bug

With Whisper, when using the verbose_json response_format parameter, audio.transcriptions.create returns the Transcription type, which does not include the extra fields that verbose_json adds.

To Reproduce

See the code snippet

Code snippets

const response = await openAIClient.audio.transcriptions.create({
  model: 'whisper-1',
  file: fileStream,
  response_format: 'verbose_json',
  timestamp_granularities: ['segment']
})

console.log(response.text)
// @ts-ignore
console.log(response['language']) // language isn't part of the Transcription interface

OS

macOS

Node version

18.16.1

Library version

openai v4.28.4

jessebs added the bug label Mar 3, 2024
rattrayalex (Collaborator) commented

We hope to add support for this in the coming months.

rattrayalex added the enhancement label and removed the bug label Mar 5, 2024

dereckmezquita commented Mar 16, 2024

I just ran this today, and the response I get back does include language:

const transcription = await this.client.audio.transcriptions.create({
    file: fs.createReadStream(tempFileName),
    response_format: 'verbose_json',
    model: 'whisper-1'
});
{
  task: "transcribe",
  language: "english",
  duration: 2.0399999618530273,
  text: "Hello World and all the bunnies!",
  segments: [
    {
      id: 0,
      seek: 0,
      start: 0,
      end: 2,
      text: " Hello World and all the bunnies!",
      tokens: [ 50364, 2425, 3937, 293, 439, 264, 6702, 40549, 0, 50464 ],
      temperature: 0,
      avg_logprob: -0.5200682878494263,
      compression_ratio: 0.8421052694320679,
      no_speech_prob: 0.017731403931975365,
    }
  ],
}

I'm piggybacking off this issue: I noticed that if I set response_format to 'text', the declared return type is still a Transcription object, but I actually receive a plain string. That's fine at runtime, but it confuses TypeScript, so I have to do:

return result as unknown as string;
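For context, here is a minimal sketch of that workaround in full (the client setup and tempFileName are assumed from the snippet above):

const result = await this.client.audio.transcriptions.create({
    file: fs.createReadStream(tempFileName),
    response_format: 'text',
    model: 'whisper-1'
});

// The SDK declares the return type as Transcription, but with
// response_format: 'text' the API actually returns a plain string,
// so a double assertion is needed to satisfy the compiler.
const text = result as unknown as string;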

Question: are more tokens required/credits used if I request 'verbose_json' vs 'text'?


wrogati commented Mar 24, 2024

Hello everyone,

I encountered the same issue regarding the return object. As a temporary workaround in my project, I added an interface based on the documentation to better handle the function's return.
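A minimal sketch of what such an interface might look like, with field names taken from the verbose_json response shown above (exact names and optionality are assumptions based on the API reference, not the SDK's own types):

interface TranscriptionSegment {
  id: number;
  seek: number;
  start: number;
  end: number;
  text: string;
  tokens: number[];
  temperature: number;
  avg_logprob: number;
  compression_ratio: number;
  no_speech_prob: number;
}

interface VerboseTranscription {
  task: string;
  language: string;
  duration: number;
  text: string;
  segments: TranscriptionSegment[];
}

// Cast the SDK's narrow Transcription result to the wider shape.
const response = await openAIClient.audio.transcriptions.create({
  model: 'whisper-1',
  file: fileStream,
  response_format: 'verbose_json'
}) as unknown as VerboseTranscription;

console.log(response.language); // type-checks without @ts-ignore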

Another approach could be to use a fork of the project and implement the fix there. However, that requires staying vigilant for updates and potential conflicts with the original repository.

To address this, I've submitted a Pull Request with the correction, hoping the maintainers will integrate this fix. Let's wait and see.

@dereckmezquita Regarding the question of whether the cost differs depending on the response format: I'm not certain, but I believe it does not. Billing is based on the data generated, counted in tokens, not on the size of the response. What the documentation does make clear is that requesting verbose_json with word-level timestamp granularity increases latency.
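For reference, the word-level request that note refers to would look like this (a sketch based on the reporter's snippet; per the docs, 'word' granularity requires verbose_json and adds latency, while 'segment' does not):

const response = await openAIClient.audio.transcriptions.create({
  model: 'whisper-1',
  file: fileStream,
  response_format: 'verbose_json',
  timestamp_granularities: ['word'] // word-level timestamps incur extra latency
});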
