Rajeshshashank/Speaker-Diarization

Speaker Diarization with Flask App

This project implements speaker diarization, the process of automatically detecting and separating the different speakers in an audio recording, using Python and Flask.

The Flask app is built from two files: diarization.py, which contains the code for diarizing the audio file, and app.py, which contains the code for creating the web interface and displaying the results.

Usage

To use this project, you need Python installed on your machine. You can download it from the official website: https://www.python.org/downloads/

After installing Python, install the required packages by running the following command in your terminal:

pip install -r requirements.txt

Once the required packages are installed, start the Flask app with the following command:

python app.py

This starts the Flask app, which you can access by opening your web browser and navigating to http://localhost:5000.

Code Details

The diarization.py file contains the code for diarizing the audio file. It uses the pyAudioAnalysis library to extract audio features and the k-means algorithm to cluster the audio frames into speaker segments. The code returns the speaker segments and their corresponding labels.
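
The last step described above, turning per-frame cluster labels into speaker segments, can be sketched as follows. This is a minimal illustration, not the code in diarization.py: the function name frames_to_segments and the frame_step parameter are assumptions, and the labels are taken to be the output of a clustering step that has already run.

```python
from itertools import groupby


def frames_to_segments(labels, frame_step=0.1):
    """Collapse per-frame cluster labels into (start, end, speaker) segments.

    labels: one cluster id per analysis frame, in time order (assumed to
        come from a clustering step such as k-means).
    frame_step: assumed time in seconds between consecutive frames.
    """
    segments = []
    index = 0  # index of the first frame in the current run
    for speaker, run in groupby(labels):
        length = len(list(run))
        start = index * frame_step
        end = (index + length) * frame_step
        segments.append((round(start, 3), round(end, 3), speaker))
        index += length
    return segments


if __name__ == "__main__":
    # Six frames of 0.5 s each: speaker 0, then speaker 1, then speaker 0 again.
    for segment in frames_to_segments([0, 0, 1, 1, 1, 0], frame_step=0.5):
        print(segment)
```

Each run of identical consecutive labels becomes one segment, so a speaker who talks twice appears as two separate segments with the same label.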

The app.py file contains the code for creating the web interface and displaying the results. It uses Flask to run a web server and render the HTML templates. The code handles file uploads, calls the diarization function, and displays the results on the web page.
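
The upload-and-diarize flow described above might look roughly like the sketch below. It is not the project's app.py: the /diarize route, the "audio" form field, and the diarize stub are hypothetical, and the sketch returns JSON instead of rendering an HTML template to stay self-contained.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def diarize(file_storage):
    """Hypothetical stand-in for the diarization function in diarization.py.

    Returns a fixed segment so the route can be exercised without audio
    processing.
    """
    return [{"start": 0.0, "end": 1.5, "speaker": 0}]


@app.route("/diarize", methods=["POST"])
def upload():
    # "audio" is an assumed form field name for the uploaded file.
    audio = request.files.get("audio")
    if audio is None or audio.filename == "":
        return jsonify(error="no audio file uploaded"), 400
    return jsonify(segments=diarize(audio))
```

In a template-based app like the one described here, the last line would instead pass the segments to render_template so they can be shown in the results table.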

index.html

The index.html file contains the HTML markup and CSS styling for the web interface. It has a simple design with a file upload form, a submit button, and a table to display the speaker segments. The table shows the start and end times of each segment, the speaker label, and a play button to listen to the segment.

The abstract section of the web page provides a brief explanation of speaker diarization and its applications. It also highlights the features of the Flask app and how to use it to diarize an audio file.

Usage and Implementation

To diarize a recording, upload an audio file in one of the supported formats (mp3, wav, or ogg) using the file upload form, then click the "Submit" button. The app displays the speaker segments in a table, and you can listen to each segment by clicking its play button.
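
A check for the supported formats could be sketched as below. The helper name is_supported and the extension-based test are assumptions for illustration; the project may validate uploads differently, and a file extension alone does not guarantee the file's actual contents.

```python
from pathlib import Path

# Formats listed as supported in this README.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg"}


def is_supported(filename):
    """Return True if the filename ends in a supported audio extension."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("meeting.WAV", "notes.txt"):
        print(name, is_supported(name))
```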

You can use this project as a starting point for implementing speaker diarization in your own application. You can modify the code to use a different diarization algorithm or integrate it with other audio processing tasks. The Flask app can also be customized to match your application's design and requirements.