#

grounding

Here are 33 public repositories matching this topic...

TheShadow29 / awesome-grounding

awesome grounding: A curated list of research papers in visual grounding

natural-language-processing computer-vision paper awesome-list arxiv papers video-understanding captioning-images captioning-videos phrase-grounding language-grounding multimodal-deep-learning grounding visual-grounding embodied-agent video-grounding image-grounding paper-roadmap

Updated Apr 9, 2023

BAAI-Agents / Cradle

The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvment, and skill curation, in a standardized general environment with minimal requirements.

ai gcc multimodality vlm cradle computer-control lmm grounding ai-agent large-language-models llm generative-ai vision-language-model ai-agents-framework general-computer-control personoid foundation-agent

Updated Apr 15, 2024
Python

FoundationVision / Groma

Grounded Multimodal Large Language Model with Localized Visual Tokenization

llama multimodal grounding foundation-models large-language-models llm mllm vision-language-model llama2

Updated Jun 7, 2024
Python

cliport / cliport

CLIPort: What and Where Pathways for Robotic Manipulation

natural-language-processing computer-vision deep-learning robotics pytorch vision manipulation clip rearrangement grounding vision-language

Updated Nov 2, 2023
Jupyter Notebook

allenai / lumos

Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs"

decision-making planning question-answering maths reasoning grounding web-agent language-agent

Updated Mar 19, 2024
Python

mees / calvin

CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks

natural-language-processing computer-vision deep-learning robotics pytorch vision manipulation vision-and-language grounding vision-language

Updated May 17, 2024
Python

mbzuai-oryx / Video-LLaVA

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

video transcription lmm grounding video-grounding llm video-conversation

Updated Jan 2, 2024
Python

flowersteam / Grounding_LLMs_with_online_RL

We perform functional grounding of LLMs' knowledge in BabyAI-Text

reinforcement-learning language-model grounding interactive-agents

Updated May 21, 2024
Python

linhuixiao / CLIP-VG

Self-paced Curriculum Adapting of CLIP for Visual Grounding.

Updated May 13, 2024
Jupyter Notebook

TheShadow29 / zsgnet-pytorch

Official implementation of ICCV19 oral paper Zero-Shot grounding of Objects from Natural Language Queries (https://arxiv.org/abs/1908.07129)

nlp vision objects grounding

Updated Apr 22, 2020
Python

TheShadow29 / vognet-pytorch

[CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606)

nlp video vision captioning-videos vision-and-language grounding pytorch-implementation visual-grounding video-grounding video-object-grounding object-grounding

Updated Jun 10, 2020
Python

lukashermann / hulc

Hierarchical Universal Language Conditioned Policies

natural-language-processing computer-vision deep-learning robotics pytorch vision manipulation vision-and-language grounding vision-language

Updated Mar 19, 2024
Python

TheShadow29 / VidSitu

[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)

nlp video vision srl captioning captioning-videos vision-and-language grounding video-language event-relations semantic-roles

Updated Aug 17, 2021
Python

zjukg / DUET

[Paper][AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning

semantic pytorch transformer cross-modal zero-shot-learning knowledge-transfer grounding visual-grounding pretrained-language-model

Updated Feb 9, 2024
Python

mees / hulc2

[ICRA2023] Grounding Language with Visual Affordances over Unstructured Data

natural-language-processing computer-vision deep-learning robotics pytorch vision manipulation vision-and-language grounding vision-language

Updated Oct 29, 2023
Python

yuleiniu / vc

Code for CVPR'18 "Grounding Referring Expressions in Images by Variational Context"

tensorflow cvpr2018 grounding

Updated Jul 4, 2018
Python

MohitShridhar / ingress

Visual Grounding of Referring Expressions for Human-Robot Interaction

deep-learning object-detection expressions hri grounding referring

Updated Nov 16, 2018
Shell

eslambakr / LAR-Look-Around-and-Refer

This is the official implementation for our paper;"LAR:Look Around and Refer".

machine-learning deep-neural-networks geometry transformers cnn deeplearning 3d projective-geometry multimodal-deep-learning grounding neurips neurips-2022 3dvisualgrounding

Updated Dec 1, 2022
C++

linhuixiao / HiVG

Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.

Updated Apr 27, 2024
Python

uvavision / SelfEQ

[CVPR 2024] Code for "Improved Visual Grounding through Self-Consistent Explanations".

grounding visual-grounding cvpr2024

Updated Mar 1, 2024
Python

Improve this page

Add a description, image, and links to the grounding topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the grounding topic, visit your repo's landing page and select "manage topics."