Muzeeglot

In this repository, we present Muzeeglot, a prototype illustrating how multilingual music genre embedding space representations can be leveraged to generate cross-lingual music genre annotations for DBpedia music entities (artists, albums, tracks, etc.).

Muzeeglot includes a web interface to visualize these multilingual music genre embeddings.

❓ How it works
⚙️ Architecture
🚀 Deployment
- 🔒 SSL support
💻 Development
📖 Cite

How it works

Based on annotations from one or several source languages, our system automatically predicts the corresponding annotations in a target language.
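As a rough illustration of the idea, and not the method actually implemented in Muzeeglot, the sketch below averages the embeddings of the source-language tags and ranks the target-language tags by cosine similarity to that average. The `EMBEDDINGS` table, its toy 2-D values, and the `predict_tags` helper are all hypothetical names introduced for this example.

```python
import math

# Toy 2-D embeddings keyed by (language, tag). The values are made up
# for illustration and do not come from the Muzeeglot data files.
EMBEDDINGS = {
    ("en", "rock"): (1.0, 0.1),
    ("en", "jazz"): (0.1, 1.0),
    ("fr", "rock"): (0.9, 0.2),
    ("fr", "jazz"): (0.2, 0.9),
    ("es", "rock"): (0.95, 0.15),
    ("es", "jazz"): (0.15, 0.95),
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_tags(source_tags, target_language, top_k=1):
    """Rank target-language tags by similarity to the mean source embedding."""
    if not source_tags:
        raise ValueError("at least one source tag is required")
    vectors = [EMBEDDINGS[key] for key in source_tags]
    dim = len(vectors[0])
    # Average the source embeddings component by component.
    centroid = tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))
    candidates = [
        (tag, cosine(centroid, vec))
        for (lang, tag), vec in EMBEDDINGS.items()
        if lang == target_language
    ]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [tag for tag, _ in candidates[:top_k]]


# Annotations in English and French are combined to predict a Spanish tag.
print(predict_tags([("en", "rock"), ("fr", "rock")], "es"))
```

Averaging is only one simple way to combine several source languages; the papers cited in the Cite section describe how the embeddings themselves are learned.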

Languages supported:

🇫🇷 French
🇬🇧 English
🇪🇸 Spanish
🇳🇱 Dutch
🇨🇿 Czech
🇯🇵 Japanese

You will find more information about application usage here.

Architecture

Muzeeglot is based on a classic N-tier architecture, including:

A Redis instance as storage engine.
A REST API developed in Python with FastAPI.
A frontend developed with VueJS, as a SPA (Single Page Application).

The overall stack is load-balanced using an Nginx web server.

Data such as entities, tags, and languages are stored in the Redis instance. Additionally, a text search index based on Whoosh is maintained using n-gram tokenization on entity names.
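To give an intuition for n-gram tokenization on entity names, here is a minimal pure-Python sketch. It does not use Whoosh and is not the project's indexing code; the function names and the n-gram sizes (2 to 3 characters) are assumptions chosen for the example.

```python
from collections import defaultdict


def ngrams(text, min_size=2, max_size=3):
    """Return the set of character n-grams of each word in `text`, lowercased."""
    grams = set()
    for word in text.lower().split():
        for size in range(min_size, max_size + 1):
            for start in range(len(word) - size + 1):
                grams.add(word[start:start + size])
    return grams


def build_index(names):
    """Map each n-gram to the set of entity names that contain it."""
    index = defaultdict(set)
    for name in names:
        for gram in ngrams(name):
            index[gram].add(name)
    return index


def search(index, query):
    """Return the names that share every n-gram of the query, sorted."""
    grams = ngrams(query)
    if not grams:
        return []
    matches = set.intersection(*(index.get(gram, set()) for gram in grams))
    return sorted(matches)


index = build_index(["Daft Punk", "Pink Floyd", "Punk Rock"])
print(search(index, "pun"))
```

Because names are indexed by fragments rather than whole words, a partial query such as "pun" can still match complete entity names, which is what makes this kind of index convenient for search-as-you-type.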

Deployment

Deploying Muzeeglot requires the following tools to be installed:

You can then clone this repository and start Muzeeglot¹:

git clone https://github.com/deezer/muzeeglot
cd muzeeglot
make start

Behind the scenes, this builds the required Docker images and runs a Compose file with everything required locally in daemon mode.

¹ The first deployment takes a while, as it requires data ingestion and indexing.

SSL support

If you want to deploy Muzeeglot with SSL using Let's Encrypt, you first need to create a certificate using the provided bot challenge. Start by editing the following configuration files to add your target domain:

frontend/nginx/certificate-builder.conf
frontend/nginx/muzeeglot-ssl.conf

Once you have done so, run the following command to generate the SSL certificates:

make letsencrypt DOMAIN=mydomain.tld

This creates a Docker volume and provisions it with the certificate. You can then run Muzeeglot as follows:

make ssl start

Development

The project can be managed using GNU Make through the following goals:

Goal	Description
api	Build api image
frontend	Build frontend image
run	Start the entire stack using docker-compose
start	Start the entire stack in daemon mode
stop	Stop the entire stack using docker-compose
logs	Display stack logs when running in daemon mode
clean	Clean docker volume for storage and indexes
letsencrypt	Generate certificate volume

Additional goals can be used to provide extra parameters:

Goal	Description
no-cache	Build images using `--no-cache` flag
ssl	Enable SSL support

If you want to use your own data, place the following files in the api/data directory²:

Tag embeddings, such as music genres, are expected in the embeddings.csv file.
Reduced embeddings for display are expected in the embeddings_reduced.csv file.
Supported languages are expected in the languages.csv file.
Indexed entities are expected in the entites.csv file.
The test corpus is expected in the corpus.csv file.

² You need to clean the data storage and index to force data ingestion when you redeploy.
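Before redeploying with custom data, it can help to check that all the files listed above are present. The following is a minimal sketch, not part of the repository; `missing_data_files` is a hypothetical helper, and the file names are copied as spelled in the list above.

```python
from pathlib import Path

# File names exactly as listed in this README (spelling preserved).
EXPECTED_FILES = (
    "embeddings.csv",
    "embeddings_reduced.csv",
    "languages.csv",
    "entites.csv",
    "corpus.csv",
)


def missing_data_files(data_dir):
    """Return the expected file names that are absent from `data_dir`."""
    directory = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    missing = missing_data_files("api/data")
    if missing:
        print("Missing data files:", ", ".join(missing))
    else:
        print("All expected data files are present.")
```

This only checks that the files exist; it does not validate their columns or contents.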

Cite

@inproceedings{epure2020muzeeglot,
  title={Muzeeglot: annotation multilingue et multi-sources d'entit{\'e}s musicales {\`a} partir de repr{\'e}sentations de genres musicaux},
  author={Epure, Elena V and Salha, Guillaume and Voituret, F{\'e}lix and Baranes, Marion and Hennequin, Romain},
  booktitle={Actes de la 6e conf{\'e}rence conjointe Journ{\'e}es d'{\'E}tudes sur la Parole (JEP, 31e {\'e}dition), Traitement Automatique des Langues Naturelles (TALN, 27e {\'e}dition), Rencontre des {\'E}tudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (R{\'E}CITAL, 22e {\'e}dition). Volume 4: D{\'e}monstrations et r{\'e}sum{\'e}s d'articles internationaux},
  pages={18--21},
  year={2020},
  organization={ATALA}
}

For more detail on how we learn multilingual music genre embeddings:

@inproceedings{epure2020modeling,
  title={Modeling the Music Genre Perception across Language-Bound Cultures},
  author={Epure, Elena V and Salha, Guillaume and Moussallam, Manuel and Hennequin, Romain},
  booktitle={The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)},
  month = nov,
  year={2020},
  publisher = {Association for Computational Linguistics},
}
