This repository has been archived by the owner on May 10, 2024. It is now read-only.

Unable to do SpeakerDiarization #3894

Open
Adarsh1999 opened this issue Dec 21, 2019 · 0 comments

Adarsh1999 commented Dec 21, 2019

I have been trying the speaker diarisation offered by AWS, working only from the one piece of material I could find online: https://docs.aws.amazon.com/transcribe/latest/dg/how-diarization.html.
The docs are unclear and do not say where to start. I put together the code below by experimenting, but it always gives an internal error. Is there any way to get speaker diarisation working, or any guide to follow?


from __future__ import print_function
import time
import uuid

import boto3

transcribe = boto3.client('transcribe')
job_name = str(uuid.uuid4())  # random, unique job name
job_uri = "https://atris-bucket.s3.us-east-2.amazonaws.com/16000.wav"

# Start a transcription job with speaker labels requested.
transcribe.start_transcription_job(
    TranscriptionJobName=job_name,
    LanguageCode='en-US',
    MediaFormat='wav',
    MediaSampleRateHertz=16000,
    Media={
        'MediaFileUri': job_uri
    },
    Settings={
        'ShowSpeakerLabels': True,
        'MaxSpeakerLabels': 3,
        'ChannelIdentification': False,
        'ShowAlternatives': False,
        'VocabularyFilterName': 'string',
    })

# Poll every 5 seconds until the job completes or fails.
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    if status['TranscriptionJob']['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)
print(status)
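
One thing that may be worth checking in the snippet above is `'VocabularyFilterName': 'string'`. It looks like the placeholder value from the API reference's request syntax, and the job will probably be rejected unless a vocabulary filter with that exact name exists in the account. That is a guess, not a confirmed diagnosis. The sketch below builds the same request with only the two diarization settings and then groups the returned words by speaker. The helper names `build_diarization_request`, `words_by_speaker` and `main` are hypothetical, the minimum of 2 for `MaxSpeakerLabels` is taken from the API reference as I understand it, and the transcript layout (`results.speaker_labels.segments[].items[]` matched to `results.items[]` by `start_time`) is an assumption about the output JSON.

```python
import json


def build_diarization_request(job_name, media_uri, max_speakers=3):
    """Return keyword arguments for start_transcription_job with speaker labels on."""
    if max_speakers < 2:
        # Assumption: the API reference lists 2 as the minimum for MaxSpeakerLabels.
        raise ValueError("max_speakers must be at least 2")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",
        "MediaFormat": "wav",
        "MediaSampleRateHertz": 16000,
        "Media": {"MediaFileUri": media_uri},
        # Only the diarization settings; no placeholder VocabularyFilterName.
        "Settings": {"ShowSpeakerLabels": True, "MaxSpeakerLabels": max_speakers},
    }


def words_by_speaker(transcript):
    """Group transcript words into consecutive (speaker_label, text) turns."""
    results = transcript["results"]
    # Map each word's start_time to the speaker label assigned to it.
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for item in segment["items"]:
            speaker_at[item["start_time"]] = item["speaker_label"]

    turns = []
    for item in results["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation carries no timestamp; attach it to the current turn.
            if turns:
                turns[-1][1] += content
            continue
        speaker = speaker_at.get(item["start_time"], "unknown")
        if turns and turns[-1][0] == speaker:
            turns[-1][1] += " " + content
        else:
            turns.append([speaker, content])
    return [(speaker, text) for speaker, text in turns]


def main():
    """Run the job end to end; needs boto3 and configured AWS credentials."""
    import time
    import urllib.request
    import uuid

    import boto3  # imported here so the helpers above work without AWS access

    transcribe = boto3.client("transcribe")
    request = build_diarization_request(
        str(uuid.uuid4()),
        "https://atris-bucket.s3.us-east-2.amazonaws.com/16000.wav",
    )
    transcribe.start_transcription_job(**request)

    while True:
        job = transcribe.get_transcription_job(
            TranscriptionJobName=request["TranscriptionJobName"]
        )["TranscriptionJob"]
        if job["TranscriptionJobStatus"] in ("COMPLETED", "FAILED"):
            break
        time.sleep(5)

    if job["TranscriptionJobStatus"] == "FAILED":
        # FailureReason usually says more than the bare status.
        print(job.get("FailureReason"))
        return

    # The finished job points at a JSON transcript file.
    with urllib.request.urlopen(job["Transcript"]["TranscriptFileUri"]) as response:
        transcript = json.load(response)
    for speaker, text in words_by_speaker(transcript):
        print(speaker + ": " + text)
```

Calling `main()` with credentials configured should print one line per speaker turn. If the job still ends in `FAILED`, the `FailureReason` field is the first place to look.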