Formalizing Multimedia Recommendation through Multimodal Deep Learning, accepted in ACM Transactions on Recommender Systems.
TasteRank: Personalized Image Search and Recommendation. This research project proposes an AI-based method for scoring photos on relevance to user interests. TasteRank leverages language and vision models, including Mistral LLMs and OpenAI’s CLIP, and applies multimodal machine-learning techniques.
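The core scoring idea described above (rating a photo by how well it matches a user's stated interests in a shared embedding space) can be sketched as follows. This is a minimal illustration, not TasteRank's actual code: the random vectors stand in for real CLIP image and text embeddings, and the scoring rule (best match over interests via cosine similarity) is one plausible choice.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def relevance_score(image_emb: np.ndarray, interest_embs: list) -> float:
    """Score an image by its best match against the user's interest embeddings."""
    return max(cosine_similarity(image_emb, e) for e in interest_embs)

# Stand-in embeddings: a real system would encode the photo with CLIP's image
# encoder and each interest phrase with CLIP's text encoder (both 512-d).
rng = np.random.default_rng(0)
photo = rng.normal(size=512)
interests = [
    photo + 0.1 * rng.normal(size=512),  # interest closely related to the photo
    rng.normal(size=512),                # unrelated interest
]
score = relevance_score(photo, interests)  # high: one interest matches well
```

In a full pipeline, an LLM could expand a user profile into candidate interest phrases before encoding them, which matches the description's pairing of language models with CLIP.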
A codebase dedicated to exploring multimodal learning approaches by integrating images of supernova host galaxies with their corresponding light curves and spectra.
A curated list of awesome Multimodal studies.
In-progress implementation of a GATO-style generalist multimodal model capable of image, text, RL, and robotics tasks.
Code for TGRS 2022 paper "Multilevel Spatial-Channel Feature Fusion Network for Urban Village Classification by Fusing Satellite and Streetview Images"
Annotations on a Budget: Leveraging Geo-Data Similarity to Balance Model Performance and Annotation Cost
This repository provides an official implementation for the paper MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild.
A flexible package for multimodal deep learning that combines tabular data with text and images using Wide and Deep models in PyTorch.
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
A collection of resources on applications of multi-modal learning in medical imaging.
Multimodal computer vision application that combines object detection, gesture recognition, and speech-to-text to help users ask questions about their environment.
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
This repository is cloned from https://github.com/HLR/LatentAlignmentProcedural. This is a potential baseline explored for the textual_cloze task on the RecipeQA Dataset - https://hucvl.github.io/recipeqa/
Corpus of resources for multimodal machine learning with physiological signals
Dynamic Multimodal Inference [CMU Spring 2024 Course Project : 11-785 Introduction to Deep Learning]
Code for Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities
VTC: Improving Video-Text Retrieval with User Comments
The `MKGCN` class, coupled with the Spotify API, implements a multi-modal knowledge graph convolutional network that improves music recommendation by integrating user interaction data with diverse music modalities.