[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
CVPR 2023–2024 papers: dive into advanced research presented at the leading computer vision conference and keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ Support visual intelligence development!
Code for the ICML 2024 paper "EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence"
An open-source implementation of CLIP.
Build high-performance AI models with modular building blocks
[IVS'24] UniBEV: the official implementation of UniBEV
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
Multi-modal Object Re-identification
[CVPR 2024] Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
[CVPR 2024] Official PyTorch Code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"
This repository contains code to download data for the preprint "MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning"
🥂 Gracefully solve hCaptcha challenges with an embedded MoE (ONNX) solution.
Deep Symbolic Regression with Multimodal Pretraining
[ICLR 2024 Spotlight] This is the official code for the paper "SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training"
Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Public repository of our assessment work on missing views for EO applications
Official PyTorch repository for CG-DETR: "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"
Under review. [IROS 2024] PGA: Personalizing Grasping Agents with Single Human-Robot Interaction