[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
CVPR 2023–2024 papers: dive into advanced research presented at the leading computer vision conference and keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ Support visual intelligence development!
Code for the ICML 2024 paper "EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence"
An open-source implementation of CLIP.
Build high-performance AI models with modular building blocks
[IVS'24] UniBEV: the official implementation of UniBEV
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
Multi-modal Object Re-identification
[CVPR 2024] Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
[CVPR 2024] Official PyTorch Code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"
This repository contains code to download data for the preprint "MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning"
🥂 Gracefully solve hCaptcha challenges with an embedded MoE (ONNX) solution.
Deep Symbolic Regression with Multimodal Pretraining
[ICLR 2024 Spotlight] This is the official code for the paper "SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training"
Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Public repository of our assessment work on missing views for EO applications
Official PyTorch repository for CG-DETR: "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"
Under review. [IROS 2024] PGA: Personalizing Grasping Agents with Single Human-Robot Interaction