
Multimodal Keras Wrapper

Wrapper for Keras that eases the loading and handling of multimodal data and models.


Documentation

You can access the library documentation page at marcbs.github.io/multimodal_keras_wrapper/

Some code examples are available in demo.ipynb and test.py. Additionally, the Projects section below lists some practical examples of projects using this library.

Installation

Assuming that you have pip installed, run:

pip install multimodal-keras-wrapper

Alternatively, if you want to install the library from the source code, follow these steps:

  1. Clone this repository.

  2. Add the repository path to your PYTHONPATH:

export PYTHONPATH=$PYTHONPATH:/path/to/multimodal_keras_wrapper

  3. Install the dependencies (this will install our custom Keras fork):

pip install -r requirements.txt
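Taken together, a from-source installation might look like the following shell session. This is a sketch: the clone URL is assumed from the repository name, and the paths depend on where you run the commands.

```shell
# Clone the repository (URL assumed from the repository name above)
git clone https://github.com/MarcBS/multimodal_keras_wrapper.git

# Make the package importable by adding the repository to PYTHONPATH
export PYTHONPATH=$PYTHONPATH:$(pwd)/multimodal_keras_wrapper

# Install the dependencies, including the custom Keras fork
pip install -r multimodal_keras_wrapper/requirements.txt
```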

Additional dependencies

The following additional dependencies are required to fully exploit this library:

  • Keras - custom fork or original version
  • The cupy package can be used for performing numpy-like operations on the GPU. If cupy is not available, the library will fall back to numpy.
  • Coco-caption evaluation package (only required to perform COCO evaluation). This package requires Java (version 1.8.0 or newer).

An additional dependency is required only when using NMS for certain localization utilities.

Projects

You can see more practical examples in the following projects, which use this library:

  • TMA for Egocentric Video Captioning based on Temporally-linked Sequences.
  • NMT-Keras: Neural Machine Translation.
  • VIBIKNet for Visual Question Answering.
  • ABiViRNet for Video Description.
  • Sentence-SelectioNN for Domain Adaptation in SMT.

Keras

For additional information on the Deep Learning library, visit the official website www.keras.io or the GitHub repository https://github.com/keras-team/keras.

You can also use our custom Keras version, which provides several additional layers for Multimodal Learning.