
Image captioning app that presents captions in both audio and visual formats in the Nepali language


Ayushma00/image-captioning


Nepali Image Captioning

Overview

Nepali Image Captioning is an AI and data science project that generates captions for images and presents them in both visual and audible formats in Nepali. It addresses inclusivity: the captions help visually impaired users and young learners understand what an image shows.

Demo Video

A demo of the Nepali Image Captioning project can be viewed here.

Problem Statement

The project recognizes the need for inclusivity in accessing visual content. Visually impaired individuals often face barriers in comprehending images, limiting their ability to engage with visual information. Additionally, young children who are still learning may benefit from image-based learning approaches, which are not always readily available.

Solution

The Nepali Image Captioning project offers a solution by providing image captions in the Nepali language. This feature aids visually impaired individuals and young learners in understanding the content of images. By converting visual information into accessible text and audio, the project promotes inclusivity and facilitates learning.
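The flow described above (image to caption, caption to Nepali text, text to audio) can be sketched as a small pipeline. This is a minimal illustration, not the project's actual code: the names `caption_pipeline`, `CaptionResult`, and the three `fake_*` stages are hypothetical, and the assumption that captions are first produced in English and then translated is inferred from the Google Translate entry in the technology list. How the project synthesizes audio is not stated, so the speech stage here is a placeholder.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CaptionResult:
    english: str  # caption as produced by the captioning model (assumed English)
    nepali: str   # translated caption shown to the user
    audio: bytes  # synthesized speech for the Nepali caption


def caption_pipeline(
    image_bytes: bytes,
    describe: Callable[[bytes], str],
    translate: Callable[[str], str],
    speak: Callable[[str], bytes],
) -> CaptionResult:
    """Run image -> English caption -> Nepali caption -> audio."""
    english = describe(image_bytes).strip()
    if not english:
        raise ValueError("captioning model returned an empty caption")
    nepali = translate(english)
    return CaptionResult(english=english, nepali=nepali, audio=speak(nepali))


# Hypothetical stand-in stages so the sketch runs without a model or network.
def fake_describe(image_bytes: bytes) -> str:
    return "a dog runs on the grass"


def fake_translate(text: str) -> str:
    # Illustrative lookup only; a real system would call a translation service.
    return {"a dog runs on the grass": "कुकुर घाँसमा दौडन्छ"}.get(text, text)


def fake_speak(text: str) -> bytes:
    return text.encode("utf-8")  # placeholder for real audio bytes


result = caption_pipeline(b"...", fake_describe, fake_translate, fake_speak)
print(result.nepali)
```

Passing the stages in as callables keeps the captioning model, translation service, and speech backend swappable, which also makes the flow easy to test without external services.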

The project's applications also extend beyond accessibility. Its output could feed recommendation systems, using data from previously viewed images to recommend new ones.
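One way such a recommendation step might work is to compare generated captions. The sketch below ranks candidate images by word overlap (Jaccard similarity) between their captions and the captions of previously viewed images. The README does not say how recommendations would be computed, so the similarity measure, the function names `jaccard` and `recommend`, and the sample captions are all assumptions made for illustration.

```python
from typing import Dict, List


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two captions."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def recommend(history: List[str], candidates: Dict[str, str], top_k: int = 2) -> List[str]:
    """Return the top_k candidate image names whose captions best match the history."""
    def score(caption: str) -> float:
        # A candidate scores as high as its closest match in the history.
        return max((jaccard(caption, h) for h in history), default=0.0)

    ranked = sorted(candidates.items(), key=lambda kv: score(kv[1]), reverse=True)
    return [name for name, _ in ranked[:top_k]]


history = ["a dog runs on the grass"]
candidates = {
    "img1": "a dog plays on the grass",
    "img2": "a car on the road",
    "img3": "two dogs run",
}
print(recommend(history, candidates))
```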

Technologies Used

  • JavaScript (JS): Utilized for frontend development, enhancing user interaction and experience.
  • Django: Employed as the backend framework to manage data and server-side operations.
  • Heroku: Deployed on the Heroku platform for hosting and deployment.
  • Google Translate API: Integrated for language translation, enabling the generation of Nepali captions.
  • Keras RNN: Utilized for training and implementing recurrent neural networks for image captioning.
  • Natural Language Processing (NLP): Applied for text processing and language generation tasks.
  • CSS3 and HTML5: Used for styling and structuring the frontend components, ensuring a visually appealing and user-friendly interface.
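To illustrate how an RNN-based captioner typically turns a trained model into a sentence, here is a minimal greedy-decoding sketch. It is not taken from the repository: the sentinel tokens `startseq` and `endseq`, the function `greedy_caption`, and the `toy_model` stand-in are assumptions. In a real setup, `next_word_probs` would wrap a trained Keras model's prediction for the next word given the image features and the words generated so far.

```python
from typing import Callable, Dict, List, Sequence

START, END = "startseq", "endseq"  # assumed sentinel tokens

NextWordFn = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(next_word_probs: NextWordFn,
                   image_features: Sequence[float],
                   max_len: int = 20) -> str:
    """Build a caption by repeatedly taking the most probable next word."""
    words = [START]
    for _ in range(max_len):
        probs = next_word_probs(image_features, words)
        best = max(probs, key=probs.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_model(image_features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained model: follows a fixed script."""
    script = ["a", "dog", "runs", END]
    target = script[min(len(words) - 1, len(script) - 1)]
    return {w: (0.9 if w == target else 0.1 / 3) for w in script}


print(greedy_caption(toy_model, [0.0]))
```

Greedy decoding is the simplest choice; beam search is a common alternative that keeps several candidate captions at each step.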

Output

[Images: three sample outputs (Image 1, Image 2, Image 3)]

Hackathon Participation

This project was developed as part of a hackathon, reflecting the collaborative efforts and innovative spirit of its creators. The hackathon environment provided a platform for rapid ideation and development, fostering creativity and teamwork in addressing real-world challenges.

