LAVIS - A One-stop Library for Language-Vision Intelligence (Jupyter Notebook, updated Apr 19, 2024)
The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
A flexible package for multimodal deep learning that combines tabular data with text and images using Wide and Deep models in PyTorch
Collect the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos; recommendations are welcome!
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
awesome grounding: A curated list of research papers in visual grounding
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
A collection of resources on applications of multi-modal learning in medical imaging.
Reference mapping for single-cell genomics
Collect the latest ECCV (European Conference on Computer Vision) results, including papers, code, and demo videos; recommendations are welcome!
Multimodal Sarcasm Detection Dataset
Recent Advances in Vision and Language Pre-training (VLP)
Deep-learning-based content moderation for text, audio, video, and image input modalities.
A survey on multimodal learning research.