Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-05-23 | Awesome Multi-modal Object Tracking | Chunhui Zhang et.al. | 2405.14200 | null |
2024-05-20 | DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM | Xuchen Li et.al. | 2405.12139 | null |
2024-05-16 | A Novel Bounding Box Regression Method for Single Object Tracking | Omar Abdelaziz et.al. | 2405.10444 | null |
2024-05-16 | Beyond Traditional Single Object Tracking: A Survey | Omar Abdelaziz et.al. | 2405.10439 | null |
2024-05-08 | TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking | Pengcheng Shao et.al. | 2405.05004 | link |
2024-04-22 | 360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos | Yinzhe Xu et.al. | 2404.13953 | null |
2024-04-18 | Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training | Jin Gao et.al. | 2404.12210 | link |
2024-04-16 | Attention-Aware Visualization: Tracking and Responding to User Perception Over Time | Arvind Srinivasan et.al. | 2404.10732 | null |
2024-04-15 | Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL | Fangwei Zhong et.al. | 2404.09857 | null |
2024-04-15 | Learning Tracking Representations from Single Point Annotations | Qiangqiang Wu et.al. | 2404.09504 | null |
2024-04-11 | PillarTrack: Redesigning Pillar-based Transformer Network for Single Object Tracking on Point Clouds | Weisheng Xu et.al. | 2404.07495 | link |
2024-05-02 | Longitudinal Analysis and Quantitative Assessment of Child Development through Mobile Interaction | Juan Carlos Ruiz-Garcia et.al. | 2404.06919 | null |
2024-04-09 | LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks | Jianlang Chen et.al. | 2404.06247 | link |
2024-04-08 | Semi-Supervised Novelty Detection for Precise Ultra-Wideband Error Signal Prediction | Umberto Albertin et.al. | 2404.05351 | null |
2024-03-29 | Context-Aware Integration of Language and Visual References for Natural Language Tracking | Yanyan Shao et.al. | 2403.19975 | null |
2024-03-27 | TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes | Liangyu Xu et.al. | 2403.18238 | null |
2024-03-26 | OmniVid: A Generative Framework for Universal Video Understanding | Junke Wang et.al. | 2403.17935 | link |
2024-03-26 | Exploring Dynamic Transformer for Efficient Object Tracking | Jiawen Zhu et.al. | 2403.17651 | null |
2024-03-29 | Elysium: Exploring Object-level Perception in Videos via MLLM | Han Wang et.al. | 2403.16558 | link |
2024-03-25 | Multi-attention Associate Prediction Network for Visual Tracking | Xinglong Sun et.al. | 2403.16395 | null |
2024-03-28 | SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking | Xiaojun Hou et.al. | 2403.16002 | link |
2024-03-23 | Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking | Shaoyu Sun et.al. | 2403.15831 | null |
2024-03-19 | TON-VIO: Online Time Offset Modeling Networks for Robust Temporal Alignment in High Dynamic Motion VIO | Chaoran Xiong et.al. | 2403.12504 | null |
2024-03-18 | Pedestrian Tracking with Monocular Camera using Unconstrained 3D Motion Model | Jan Krejčí et.al. | 2403.11978 | null |
2024-03-16 | A Spectrum-based Image Denoising Method with Edge Feature Enhancement | Peter Luvton et.al. | 2403.11036 | null |
2024-03-15 | Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers | Jinxia Xie et.al. | 2403.10574 | null |
2024-03-14 | OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning | Lingyi Hong et.al. | 2403.09634 | null |
2024-02-27 | ACTrack: Adding Spatio-Temporal Condition for Visual Object Tracking | Yushan Han et.al. | 2403.07914 | null |
2024-04-03 | Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline | Xiao Wang et.al. | 2403.05839 | link |
2024-03-08 | Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance | Liting Lin et.al. | 2403.05231 | null |
2024-03-08 | Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy | Yuelin Zhang et.al. | 2403.05146 | link |
2024-03-06 | VastTrack: Vast Category Visual Object Tracking | Liang Peng et.al. | 2403.03493 | link |
2024-02-28 | Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks | Zhewei Wu et.al. | 2402.17976 | null |
2024-02-26 | SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking | Yu Lin et.al. | 2402.16249 | link |
2024-02-26 | Reading Relevant Feature from Global Representation Memory for Visual Object Tracking | Xinyu Zhou et.al. | 2402.14392 | null |
2024-02-13 | Optimized Information Flow for Transformer Tracking | Janani Kugarajeevan et.al. | 2402.08195 | link |
2024-02-07 | BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision | Xin Zhao et.al. | 2402.04519 | null |
2024-02-04 | Spatio-temporal Prompting Network for Robust Video Feature Extraction | Guanxiong Sun et.al. | 2402.02574 | link |
2024-01-24 | Small Object Tracking in LiDAR Point Cloud: Learning the Target-awareness Prototype and Fine-grained Search Region | Shengjing Tian et.al. | 2401.13285 | null |
2024-01-23 | Correlation-Embedded Transformer Tracking: A Single-Branch Framework | Fei Xie et.al. | 2401.12743 | link |
2024-01-20 | Unifying Visual and Vision-Language Tracking via Contrastive Learning | Yinchao Ma et.al. | 2401.11228 | link |
2024-01-20 | Towards Category Unification of 3D Single Object Tracking on Point Clouds | Jiahao Nie et.al. | 2401.11204 | null |
2024-01-18 | Multi-task Learning for Joint Re-identification, Team Affiliation, and Role Classification for Sports Visual Tracking | Amir M. Mansourian et.al. | 2401.09942 | null |
2024-01-12 | Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements | Muhammad Wasim Nawaz et.al. | 2401.06396 | null |
2024-01-18 | Hold 'em and Fold 'em: Towards Human-scale, Feedback-Controlled Soft Origami Robots | Immanuel Ampomah Mensah et.al. | 2401.04650 | null |
2024-01-06 | Explicit Visual Prompts for Visual Object Tracking | Liangtao Shi et.al. | 2401.03142 | link |
2024-01-03 | ODTrack: Online Dense Temporal Token Learning for Visual Tracking | Yaozong Zheng et.al. | 2401.01686 | link |
2023-12-27 | X Modality Assisting RGBT Object Tracking | Zhaisheng Ding et.al. | 2312.17273 | null |
2023-12-22 | Cross-Modal Object Tracking via Modality-Aware Fusion Network and A Large-Scale Dataset | Lei Liu et.al. | 2312.14446 | link |
2023-12-18 | Multi-Correlation Siamese Transformer Network with Dense Connection for 3D Single Object Tracking | Shihao Feng et.al. | 2312.11051 | link |
2023-12-17 | Robust 3D Tracking with Quality-Aware Shape Completion | Jingwen Zhang et.al. | 2312.10608 | null |
2023-12-15 | Tracking Skiers from the Top to the Bottom | Matteo Dunnhofer et.al. | 2312.09723 | null |
2023-12-11 | M3SOT: Multi-frame, Multi-field, Multi-space 3D Single Object Tracking | Jiaming Liu et.al. | 2312.06117 | link |
2023-12-07 | Instance Tracking in 3D Scenes from Egocentric Videos | Yunhan Zhao et.al. | 2312.04117 | link |
2024-02-19 | Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking | Jiawei Ge et.al. | 2311.17085 | null |
2023-11-21 | Visual tracking brain computer interface | Changxing Huang et.al. | 2311.12592 | null |
2024-01-10 | ViKi-HyCo: A Hybrid-Control approach for complex car-like maneuvers | Edison P. Velasco Sánchez et.al. | 2311.07268 | null |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-05-23 | PuzzleAvatar: Assembling 3D Avatars from Personal Albums | Yuliang Xiu et.al. | 2405.14869 | null |
2024-05-23 | A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns | Asaf Yehudai et.al. | 2405.14863 | null |
2024-05-23 | Bitune: Bidirectional Instruction-Tuning | Dawid J. Kopiczko et.al. | 2405.14862 | null |
2024-05-23 | Not All Language Model Features Are Linear | Joshua Engels et.al. | 2405.14860 | link |
2024-05-23 | PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression | Vladimir Malinovskii et.al. | 2405.14852 | null |
2024-05-23 | A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis | Yue Yang et.al. | 2405.14839 | null |
2024-05-23 | From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step | Yuntian Deng et.al. | 2405.14838 | null |
2024-05-23 | HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models | Bernal Jiménez Gutiérrez et.al. | 2405.14831 | null |
2024-05-23 | Designing A Sustainable Marine Debris Clean-up Framework without Human Labels | Raymond Wang et.al. | 2405.14815 | null |
2024-05-23 | As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making | Shomik Jain et.al. | 2405.14812 | null |
2024-05-23 | Implicit Personalization in Language Models: A Systematic Study | Zhijing Jin et.al. | 2405.14808 | null |
2024-05-23 | Can LLMs Solve longer Math Word Problems Better? | Xin Xu et.al. | 2405.14804 | null |
2024-05-23 | Lessons from the Trenches on Reproducible Evaluation of Language Models | Stella Biderman et.al. | 2405.14782 | null |
2024-05-23 | WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models | Peng Wang et.al. | 2405.14768 | link |
2024-05-23 | FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models | Hongyang Yang et.al. | 2405.14767 | link |
2024-05-23 | Evaluating Large Language Models for Public Health Classification and Extraction Tasks | Joshua Harris et.al. | 2405.14766 | null |
2024-05-23 | Large language models can be zero-shot anomaly detectors for time series? | Sarah Alnegheimish et.al. | 2405.14755 | null |
2024-05-23 | A Transformer-Based Approach for Smart Invocation of Automatic Code Completion | Aral de Moor et.al. | 2405.14753 | null |
2024-05-23 | MultiCast: Zero-Shot Multivariate Time Series Forecasting Using LLMs | Georgios Chatzigeorgakidis et.al. | 2405.14748 | null |
2024-05-23 | Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View | Xuan Liu et.al. | 2405.14744 | null |
2024-05-21 | Reducing Transformer Key-Value Cache Size with Cross-Layer Attention | William Brandon et.al. | 2405.12981 | null |
2024-05-21 | OmniGlue: Generalizable Feature Matching with Foundation Model Guidance | Hanwen Jiang et.al. | 2405.12979 | null |
2024-05-21 | BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once | Theodore Zhao et.al. | 2405.12971 | null |
2024-05-21 | Energy Rank Alignment: Using Preference Optimization to Search Chemical Space at Scale | Shriram Chennakesavalu et.al. | 2405.12961 | null |
2024-05-21 | Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models | Zhangyue Yin et.al. | 2405.12939 | null |
2024-05-21 | Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs | Bilgehan Sel et.al. | 2405.12933 | null |
2024-05-21 | Code-mixed Sentiment and Hate-speech Prediction | Anjali Yadav et.al. | 2405.12929 | null |
2024-05-21 | Streamlining Software Reviews: Efficient Predictive Modeling with Minimal Examples | Tim Menzies et.al. | 2405.12920 | null |
2024-05-21 | G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation | Xingyuan Pan et.al. | 2405.12915 | null |
2024-05-21 | An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation | Zhiyu Tan et.al. | 2405.12914 | null |
2024-05-21 | Topic Modelling Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment | Holli Sargeant et.al. | 2405.12910 | link |
2024-05-21 | Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents | San Kim et.al. | 2405.12900 | null |
2024-05-21 | Investigating Persuasion Techniques in Arabic: An Empirical Study Leveraging Large Language Models | Abdurahmman Alzahrani et.al. | 2405.12884 | null |
2024-05-21 | LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language | James Requeima et.al. | 2405.12856 | null |
2024-05-21 | OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models | Zhaojian Yu et.al. | 2405.12843 | null |
2024-05-21 | SmartFlow: Robotic Process Automation using LLMs | Arushi Jain et.al. | 2405.12842 | null |
2024-05-21 | Large Language Models Meet NLP: A Survey | Libo Qin et.al. | 2405.12819 | null |
2024-05-21 | Test Oracle Automation in the era of LLMs | Facundo Molina et.al. | 2405.12766 | null |
2024-05-21 | C3L: Content Correlated Vision-Language Instruction Tuning Data Generation via Contrastive Learning | Ji Ma et.al. | 2405.12752 | null |
2024-05-21 | Generative AI and Large Language Models for Cyber Security: All Insights You Need | Mohamed Amine Ferrag et.al. | 2405.12750 | null |
2024-05-20 | Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning | Guanglin Zhou et.al. | 2405.12217 | link |
2024-05-20 | MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Hongwei Liu et.al. | 2405.12209 | link |
2024-05-20 | Developers' Perceptions on the Impact of ChatGPT in Software Development: A Survey | Thiago S. Vaillant et.al. | 2405.12195 | null |
2024-05-20 | CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models | Haoxiang Shi et.al. | 2405.12174 | null |
2024-05-20 | Fennec: Fine-grained Language Model Evaluation and Correction Extended through Branching and Bridging | Xiaobo Liang et.al. | 2405.12163 | link |
2024-05-20 | Eliciting Problem Specifications via Large Language Models | Robert E. Wray et.al. | 2405.12147 | null |
2024-05-20 | DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM | Xuchen Li et.al. | 2405.12139 | null |
2024-05-20 | MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning | Ting Jiang et.al. | 2405.12130 | link |
2024-05-20 | Reindex-Then-Adapt: Improving Large Language Models for Conversational Recommendation | Zhankui He et.al. | 2405.12119 | null |
2024-05-20 | Imp: Highly Capable Large Multimodal Models for Mobile Devices | Zhenwei Shao et.al. | 2405.12107 | link |
2024-05-20 | DOP: Diagnostic-Oriented Prompting for Large Language Models in Mathematical Correction | Hao Chen et.al. | 2405.12100 | null |
2024-05-20 | Distributional Semantics, Holism, and the Instability of Meaning | Jumbly Grindrod et.al. | 2405.12084 | null |
2024-05-20 | PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation | Zhuobin Huang et.al. | 2405.12079 | null |
2024-05-20 | CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models | Tong Zhang et.al. | 2405.12063 | link |
2024-05-20 | STYLE: Improving Domain Transferability of Asking Clarification Questions in Large Language Model Powered Conversational Agents | Yue Chen et.al. | 2405.12059 | null |
2024-05-20 | KG-RAG: Bridging the Gap Between Knowledge and Creativity | Diego Sanmartin et.al. | 2405.12035 | null |
2024-05-20 | Can AI Relate: Testing Large Language Model Response for Mental Health Support | Saadia Gabriel et.al. | 2405.12021 | null |
2024-05-20 | MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering | Jingqun Tang et.al. | 2405.11985 | null |
2024-05-20 | A review on the use of large language models as virtual tutors | Silvia García-Méndez et.al. | 2405.11983 | null |
2024-05-20 | Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays | Zhichao Sun et.al. | 2405.11976 | link |
2024-05-17 | Observational Scaling Laws and the Predictability of Language Model Performance | Yangjun Ruan et.al. | 2405.10938 | null |
2024-05-17 | A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers | Kaiyu Huang et.al. | 2405.10936 | link |
2024-05-17 | The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks | Lucius Bushnaq et.al. | 2405.10928 | null |
2024-05-17 | Blackbox Adaptation for Medical Image Segmentation | Jay N. Paranjape et.al. | 2405.10913 | link |
2024-05-17 | COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain | Dimitrios P. Panagoulias et.al. | 2405.10893 | null |
2024-05-17 | Application of Artificial Intelligence in Schizophrenia Rehabilitation Management: Systematic Literature Review | Hongyi Yang et.al. | 2405.10883 | null |
2024-05-17 | ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains | Zhaopei Huang et.al. | 2405.10860 | link |
2024-05-17 | The Future of Large Language Model Pre-training is Federated | Lorenzo Sani et.al. | 2405.10853 | null |
2024-05-17 | Open-Vocabulary Spatio-Temporal Action Detection | Tao Wu et.al. | 2405.10832 | null |
2024-05-17 | Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities | Hao Zhou et.al. | 2405.10825 | null |
2024-05-17 | ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios | Markus Bayer et.al. | 2405.10808 | null |
2024-05-17 | The Relational Machine Calculus | Chris Barrett et.al. | 2405.10801 | null |
2024-05-17 | Empowering Small-Scale Knowledge Graphs: A Strategy of Leveraging General-Purpose Knowledge Graphs for Enriched Embeddings | Albert Sawczyn et.al. | 2405.10745 | null |
2024-05-17 | Efficient Multimodal Large Language Models: A Survey | Yizhang Jin et.al. | 2405.10739 | link |
2024-05-17 | INDUS: Effective and Efficient Language Models for Scientific Applications | Bishwaranjan Bhattacharjee et.al. | 2405.10725 | null |
2024-05-17 | SignLLM: Sign Languages Production Large Language Models | Sen Fang et.al. | 2405.10718 | null |
2024-05-17 | Persian Pronoun Resolution: Leveraging Neural Networks and Language Models | Hassan Haji Mohammadi et.al. | 2405.10714 | null |
2024-05-17 | SynDy: Synthetic Dynamic Dataset Generation Framework for Misinformation Tasks | Michael Shliselberg et.al. | 2405.10700 | null |
2024-05-17 | Revolutionizing Process Mining: A Novel Architecture for ChatGPT Integration and Enhanced User Experience through Optimized Prompt Engineering | Mehrdad Agha Mohammad Ali Kermani et.al. | 2405.10689 | null |
2024-05-17 | Realistic Evaluation of Toxicity in Large Language Models | Tinh Son Luong et.al. | 2405.10659 | null |
2024-05-16 | UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models | Sahel Sharifymoghaddam et.al. | 2405.10311 | null |
2024-05-16 | 4D Panoptic Scene Graph Generation | Jingkang Yang et.al. | 2405.10305 | link |
2024-05-16 | Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees | Yu Gui et.al. | 2405.10301 | null |
2024-05-16 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models | Rhea Sanjay Sukthanker et.al. | 2405.10299 | link |
2024-05-17 | Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning | Yuexiang Zhai et.al. | 2405.10292 | null |
2024-05-16 | Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction | Jianhao Chen et.al. | 2405.10288 | null |
2024-05-16 | FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models | Adrian Bulat et.al. | 2405.10286 | null |
2024-05-16 | Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers | Tuo Zhang et.al. | 2405.10276 | null |
2024-05-16 | Keep It Private: Unsupervised Privatization of Online Text | Calvin Bao et.al. | 2405.10260 | link |
2024-05-16 | When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models | Xianzheng Ma et.al. | 2405.10255 | null |
2024-05-16 | PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology | George Shaikovski et.al. | 2405.10254 | null |
2024-05-16 | A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks | Xuanfan Ni et.al. | 2405.10251 | null |
2024-05-16 | IntelliExplain: Enhancing Interactive Code Generation through Natural Language Explanations for Non-Professional Programmers | Hao Yan et.al. | 2405.10250 | null |
2024-05-16 | A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts | Xinru Zhang et.al. | 2405.10246 | null |
2024-05-16 | DocuMint: Docstring Generation for Python using Small Language Models | Bibek Poudel et.al. | 2405.10243 | link |
2024-05-16 | Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting | Divij Gupta et.al. | 2405.10216 | null |
2024-05-16 | CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations | Jiahao Zhao et.al. | 2405.10212 | null |
2024-05-16 | LFED: A Literary Fiction Evaluation Dataset for Large Language Models | Linhao Yu et.al. | 2405.10166 | link |
2024-05-16 | PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning | Jiancheng Pan et.al. | 2405.10160 | link |
2024-05-16 | Speaker Verification in Agent-Generated Conversations | Yizhe Yang et.al. | 2405.10150 | null |
2024-05-15 | Modeling Bilingual Sentence Processing: Evaluating RNN and Transformer Architectures for Cross-Language Structural Priming | Bushi Xiao et.al. | 2405.09508 | null |
2024-05-15 | Constrained Learning for Causal Inference and Semiparametric Statistics | Tiffany Tianhui Cai et.al. | 2405.09493 | null |
2024-05-15 | Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts | Donya Rooein et.al. | 2405.09482 | null |
2024-05-15 | Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models | Majid Zarharan et.al. | 2405.09454 | link |
2024-05-15 | M | Yufeng Jiang et.al. | 2405.09446 | link |
2024-05-15 | Facilitating Opinion Diversity through Hybrid NLP Approaches | Michiel van der Meer et.al. | 2405.09439 | null |
2024-05-15 | A Survey On Text-to-3D Contents Generation In The Wild | Chenhan Jiang et.al. | 2405.09431 | null |
2024-05-15 | MicroPython Testbed for Federated Learning Algorithms | Miroslav Popovic et.al. | 2405.09423 | link |
2024-05-15 | Matching domain experts by training from scratch on domain knowledge | Xiaoliang Luo et.al. | 2405.09395 | null |
2024-05-15 | Compositional imprecise probability | Jack Liell-Cock et.al. | 2405.09391 | null |
2024-05-15 | PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models | Devansh Jain et.al. | 2405.09373 | null |
2024-05-15 | SARATR-X: A Foundation Model for Synthetic Aperture Radar Images Target Recognition | Weijie L et.al. | 2405.09365 | null |
2024-05-15 | Large Language Model Bias Mitigation from the Perspective of Knowledge Editing | Ruizhe Chen et.al. | 2405.09341 | null |
2024-05-15 | Prompting-based Synthetic Data Generation for Few-Shot Question Answering | Maximilian Schmidt et.al. | 2405.09335 | null |
2024-05-15 | Transfer Learning in Pre-Trained Large Language Models for Malware Detection Based on System Calls | Pedro Miguel Sánchez Sánchez et.al. | 2405.09318 | null |
2024-05-15 | Comparing the Efficacy of GPT-4 and Chat-GPT in Mental Health Care: A Blind Assessment of Large Language Models for Psychological Support | Birger Moell et.al. | 2405.09300 | null |
2024-05-15 | Do language models capture implied discourse meanings? An investigation with exhaustivity implicatures of Korean morphology | Hagyeong Shin et.al. | 2405.09293 | null |
2024-05-15 | Sign of the Times: Evaluating the use of Large Language Models for Idiomaticity Detection | Dylan Phelps et.al. | 2405.09279 | null |
2024-05-15 | Dynamic Activation Pitfalls in LLaMA Models: An Empirical Study | Chi Ma et.al. | 2405.09274 | null |
2024-05-15 | New Textual Corpora for Serbian Language Modeling | Mihailo Škorić et.al. | 2405.09250 | null |
2024-05-14 | Efficient Vision-Language Pre-training by Cluster Masking | Zihao Wei et.al. | 2405.08815 | link |
2024-05-14 | Towards Enhanced RAC Accessibility: Leveraging Datasets and LLMs | Edison Jair Bejarano Sepulveda et.al. | 2405.08792 | null |
2024-05-14 | Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring | Tiantian Zhang et.al. | 2405.08786 | null |
2024-05-14 | Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in LLMs | Akhila Yerukola et.al. | 2405.08760 | link |
2024-05-14 | Distributed Threat Intelligence at the Edge Devices: A Large Language Model-Driven Approach | Syed Mhamudul Hasan et.al. | 2405.08755 | null |
2024-05-14 | Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding | Zhimin Li et.al. | 2405.08748 | link |
2024-05-14 | Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory | Xueyan Niu et.al. | 2405.08707 | null |
2024-05-14 | EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera | Beilei Cui et.al. | 2405.08672 | link |
2024-05-14 | Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research | Qinglong Cao et.al. | 2405.08668 | link |
2024-05-14 | Thinking Tokens for Language Modeling | David Herel et.al. | 2405.08644 | null |
2024-05-15 | ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation | Dimitris Gkoumas et.al. | 2405.08619 | null |
2024-05-14 | A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine | Hanguang Xiao et.al. | 2405.08603 | null |
2024-05-15 | EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark | Xiaohui Zhang et.al. | 2405.08596 | null |
2024-05-14 | Open-Vocabulary Object Detection via Neighboring Region Attention Alignment | Sunyuan Qiang et.al. | 2405.08593 | null |
2024-05-14 | Improving Transformers with Dynamically Composable Multi-Head Attention | Da Xiao et.al. | 2405.08553 | link |
2024-05-14 | Self-Distillation Improves DNA Sequence Inference | Tong Yu et.al. | 2405.08538 | link |
2024-05-14 | Falcon 7b for Software Mention Detection in Scholarly Documents | AmeerAli Khan et.al. | 2405.08514 | null |
2024-05-14 | Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure | Odysseas S. Chlapanis et.al. | 2405.08502 | link |
2024-05-14 | Is Less More? Quality, Quantity and Context in Idiom Processing with Natural Language Models | Agne Knietaite et.al. | 2405.08497 | link |
2024-05-14 | Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models | Andrea Piergentili et.al. | 2405.08477 | null |
2024-05-13 | Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots | Chengyue Wu et.al. | 2405.07990 | null |
2024-05-13 | A Generalist Learner for Multifaceted Medical Image Interpretation | Hong-Yu Zhou et.al. | 2405.07988 | null |
2024-05-13 | The Platonic Representation Hypothesis | Minyoung Huh et.al. | 2405.07987 | link |
2024-05-13 | Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation | Kevin Stangl et.al. | 2405.07969 | null |
2024-05-13 | PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation | Suad Alshammari et.al. | 2405.07963 | null |
2024-05-13 | AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments | Samuel Schmidgall et.al. | 2405.07960 | null |
2024-05-13 | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | Yinzhu Quan et.al. | 2405.07938 | null |
2024-05-13 | PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition | Ziyang Zhang et.al. | 2405.07932 | link |
2024-05-13 | Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data | Mahdi Morafah et.al. | 2405.07925 | null |
2024-05-13 | Can Better Text Semantics in Prompt Tuning Improve VLM Generalization? | Hari Chandana Kuchibhotla et.al. | 2405.07921 | null |
2024-05-13 | A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking | Ferdinand Schlatt et.al. | 2405.07920 | link |
2024-05-13 | PLUTO: Pathology-Universal Transformer | Dinkar Juyal et.al. | 2405.07905 | null |
2024-05-13 | Russian-Language Multimodal Dataset for Automatic Summarization of Scientific Papers | Alena Tsanda et.al. | 2405.07886 | null |
2024-05-13 | Zero-Shot Tokenizer Transfer | Benjamin Minixhofer et.al. | 2405.07883 | link |
2024-05-13 | RLHF Workflow: From Reward Modeling to Online RLHF | Hanze Dong et.al. | 2405.07863 | link |
2024-05-13 | Can LLMs Help Predict Elections? (Counter)Evidence from the World's Largest Democracy | Pratik Gujral et.al. | 2405.07828 | null |
2024-05-13 | A View of How Language Models Will Transform Law | Frank Fagan et.al. | 2405.07826 | null |
2024-05-13 | FreeVA: Offline MLLM as Training-Free Video Assistant | Wenhao Wu et.al. | 2405.07798 | link |
2024-05-13 | DEPTH: Discourse Education through Pre-Training Hierarchically | Zachary Bamberger et.al. | 2405.07788 | link |
2024-05-13 | Generating Human Motion in 3D Scenes from Text Descriptions | Zhi Cen et.al. | 2405.07784 | null |
2024-05-10 | Linearizing Large Language Models | Jean Mercat et.al. | 2405.06640 | link |
2024-05-10 | Value Augmented Sampling for Language Model Alignment and Personalization | Seungwook Han et.al. | 2405.06639 | link |
2024-05-10 | Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark | Evan M. Williams et.al. | 2405.06634 | null |
2024-05-10 | Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models | Chakshu Moar et.al. | 2405.06626 | null |
2024-05-10 | Explaining Text Similarity in Transformer Models | Alexandros Vasileiou et.al. | 2405.06604 | link |
2024-05-10 | Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach | Elham Ravanbakhsh et.al. | 2405.06586 | null |
2024-05-10 | What Can Natural Language Processing Do for Peer Review? | Ilia Kuznetsov et.al. | 2405.06563 | link |
2024-05-10 | Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval | Mengjia Niu et.al. | 2405.06545 | null |
2024-05-10 | Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts | Wenyu Huang et.al. | 2405.06524 | null |
2024-05-10 | UniDM: A Unified Framework for Data Manipulation with Large Language Models | Yichen Qian et.al. | 2405.06510 | null |
2024-05-10 | Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling | Lyumanshan Ye et.al. | 2405.06495 | null |
2024-05-10 | Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification | Yaoqin Ye et.al. | 2405.06468 | null |
2024-05-10 | Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation | JoonHo Lee et.al. | 2405.06424 | link |
2024-05-10 | Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions? | Hunter McNichols et.al. | 2405.06414 | null |
2024-05-10 | Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL | Ning Cheng et.al. | 2405.06410 | null |
2024-05-10 | Program Synthesis using Inductive Logic Programming for the Abstraction and Reasoning Corpus | Filipe Marinho Rocha et.al. | 2405.06399 | null |
2024-05-10 | Memory Mosaics | Jianyu Zhang et.al. | 2405.06394 | link |
2024-05-10 | LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play | Li-Chun Lu et.al. | 2405.06373 | null |
2024-05-10 | LMD3: Language Model Data Density Dependence | John Kirchenbauer et.al. | 2405.06331 | null |
2024-05-10 | Correlation Dimension of Natural Language in a Statistical Manifold | Xin Du et.al. | 2405.06321 | null |
2024-05-09 | Natural Language Processing RELIES on Linguistics | Juri Opitz et.al. | 2405.05966 | null |
2024-05-09 | OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning | Dan Qiao et.al. | 2405.05957 | link |
2024-05-09 | Probing Multimodal LLMs as World Models for Driving | Shiva Sreeram et.al. | 2405.05956 | link |
2024-05-09 | Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning | Junzhi Chen et.al. | 2405.05955 | null |
2024-05-09 | CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | Jiachen Li et.al. | 2405.05949 | link |
2024-05-09 | DOLOMITES: Domain-Specific Long-Form Methodical Tasks | Chaitanya Malaviya et.al. | 2405.05938 | null |
2024-05-09 | Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness | Siyuan Li et.al. | 2405.05930 | null |
2024-05-09 | Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? | Zorik Gekhman et.al. | 2405.05904 | null |
2024-05-09 | Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes | Ziang Guo et.al. | 2405.05885 | null |
2024-05-09 | FlockGPT: Guiding UAV Flocking with Linguistic Orchestration | Artem Lykov et.al. | 2405.05872 | null |
2024-05-09 | Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control | Gunshi Gupta et.al. | 2405.05852 | link |
2024-05-09 | Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning | Artem Lykov et.al. | 2405.05824 | link |
2024-05-09 | Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference | Zhihang Lin et.al. | 2405.05803 | link |
2024-05-09 | Towards a More Inclusive AI: Progress and Perspectives in Large Language Model Training for the Sámi Language | Ronny Paul et.al. | 2405.05777 | null |
2024-05-09 | Experimental Pragmatics with Machines: Testing LLM Predictions for the Inferences of Plain and Embedded Disjunctions | Polina Tsvilodub et.al. | 2405.05776 | null |
2024-05-09 | Large Language Model-Aided Evolutionary Search for Constrained Multiobjective Optimization | Zeyi Wang et.al. | 2405.05767 | null |
2024-05-09 | Similarity Guided Multimodal Fusion Transformer for Semantic Location Prediction in Social Media | Zhizhen Zhang et.al. | 2405.05760 | null |
2024-05-09 | Exploring the Potential of Human-LLM Synergy in Advancing Qualitative Analysis: A Case Study on Mental-Illness Stigma | Han Meng et.al. | 2405.05758 | null |
2024-05-09 | Can large language models understand uncommon meanings of common words? | Jinyang Wu et.al. | 2405.05741 | null |
2024-05-09 | Evaluating Dialect Robustness of Language Models via Conversation Understanding | Dipankar Srirag et.al. | 2405.05688 | link |
2024-05-08 | THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models | Prannay Kaul et.al. | 2405.05256 | null |
2024-05-08 | You Only Cache Once: Decoder-Decoder Architectures for Language Models | Yutao Sun et.al. | 2405.05254 | null |
2024-05-08 | Open Source Language Models Can Provide Feedback: Evaluating LLMs' Ability to Help Students Using GPT-4-As-A-Judge | Charles Koutcheme et.al. | 2405.05253 | link |
2024-05-09 | LLMs with Personalities in Multi-issue Negotiation Games | Sean Noh et.al. | 2405.05248 | null |
2024-05-08 | EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning | Jingfeng Yao et.al. | 2405.05237 | link |
2024-05-08 | SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants | Masoud Moghani et.al. | 2405.05226 | null |
2024-05-08 | Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers | Jiuxiang Gu et.al. | 2405.05219 | null |
2024-05-08 | FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models | Jinglin Xu et.al. | 2405.05216 | link |
2024-05-08 | MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning | Inderjeet Nair et.al. | 2405.05189 | null |
2024-05-08 | Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming | Tommaso Pasini et.al. | 2405.05176 | null |
2024-05-08 | Air Gap: Protecting Privacy-Conscious Conversational Agents | Eugene Bagdasaryan et.al. | 2405.05175 | null |
2024-05-08 | XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples | Peiqin Lin et.al. | 2405.05116 | link |
2024-05-08 | QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs | Weijia Zhang et.al. | 2405.05109 | null |
2024-05-08 | Concerns on Bias in Large Language Models when Creating Synthetic Personae | Helena A. Haxvig et.al. | 2405.05080 | null |
2024-05-08 | Impact of Tone-Aware Explanations in Recommender Systems | Ayano Okoso et.al. | 2405.05061 | null |
2024-05-08 | Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models | Aylin Gunal et.al. | 2405.05060 | null |
2024-05-08 | Seeds of Stereotypes: A Large-Scale Textual Analysis of Race and Gender Associations with Diseases in Online Sources | Lasse Hyldig Hansen et.al. | 2405.05049 | null |
2024-05-08 | | Ning Wang et.al. | 2405.05010 | null |
2024-05-08 | ADELIE: Aligning Large Language Models on Information Extraction | Yunjia Qi et.al. | 2405.05008 | link |
2024-05-08 | NAVRepair: Node-type Aware C/C++ Code Vulnerability Repair | Ruoke Wang et.al. | 2405.04994 | null |
2024-05-07 | ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning | Jing Lin et.al. | 2405.04533 | null |
2024-05-07 | QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving | Yujun Lin et.al. | 2405.04532 | link |
2024-05-07 | NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts | Shudan Zhang et.al. | 2405.04520 | null |
2024-05-07 | xLSTM: Extended Long Short-Term Memory | Maximilian Beck et.al. | 2405.04517 | null |
2024-05-07 | A Transformer with Stack Attention | Jiaoda Li et.al. | 2405.04515 | link |
2024-05-08 | Unveiling Disparities in Web Task Handling Between Human and Web Agent | Kihoon Son et.al. | 2405.04497 | null |
2024-05-07 | Toward In-Context Teaching: Adapting Examples to Students' Misconceptions | Alexis Ross et.al. | 2405.04495 | null |
2024-05-07 | Representation Learning of Daily Movement Data Using Text Encoders | Alexander Capstick et.al. | 2405.04494 | link |
2024-05-08 | DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | DeepSeek-AI et.al. | 2405.04434 | link |
2024-05-07 | The Silicone Ceiling: Auditing GPT's Race and Gender Biases in Hiring | Lena Armstrong et.al. | 2405.04412 | null |
2024-05-07 | Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks | Georgios Pantazopoulos et.al. | 2405.04403 | link |
2024-05-07 | Large Language Models Cannot Explain Themselves | Advait Sarkar et.al. | 2405.04382 | null |
2024-05-07 | A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI | Hannah Chafetz et.al. | 2405.04333 | null |
2024-05-07 | Deception in Reinforced Autonomous Agents: The Unconventional Rabbit Hat Trick in Legislation | Atharvan Dogra et.al. | 2405.04325 | null |
2024-05-07 | Granite Code Models: A Family of Open Foundation Models for Code Intelligence | Mayank Mishra et.al. | 2405.04324 | link |
2024-05-07 | Accelerating Speculative Decoding using Dynamic Speculation Length | Jonathan Mamou et.al. | 2405.04304 | null |
2024-05-07 | Enhancing the Efficiency and Accuracy of Underlying Asset Reviews in Structured Finance: The Application of Multi-agent Framework | Xiangpeng Wan et.al. | 2405.04294 | link |
2024-05-07 | Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore | Junchao Wu et.al. | 2405.04286 | null |
2024-05-07 | On the Foundations of Earth and Climate Foundation Models | Xiao Xiang Zhu et.al. | 2405.04285 | null |
2024-05-07 | Semantic API Alignment: Linking High-level User Goals to APIs | Robert Feldt et.al. | 2405.04236 | null |
2024-05-06 | Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs | Muhammad Uzair Khattak et.al. | 2405.03690 | null |
2024-05-06 | Pose Priors from Language Models | Sanjay Subramanian et.al. | 2405.03689 | null |
2024-05-06 | Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames | Keith Burghardt et.al. | 2405.03688 | link |
2024-05-06 | Language-Image Models with 3D Understanding | Jang Hyun Cho et.al. | 2405.03685 | null |
2024-05-06 | AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design | Kamal Choudhary et.al. | 2405.03680 | null |
2024-05-06 | When LLMs Meet Cybersecurity: A Systematic Literature Review | Jie Zhang et.al. | 2405.03644 | link |
2024-05-06 | A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama | Vlad-Andrei Cursaru et.al. | 2405.03616 | null |
2024-05-06 | GREEN: Generative Radiology Report Evaluation and Error Notation | Sophie Ostmeier et.al. | 2405.03595 | null |
2024-05-06 | Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment | Abhinav Agarwalla et.al. | 2405.03594 | null |
2024-05-06 | Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing | Han Liu et.al. | 2405.03565 | null |
2024-05-07 | ID-centric Pre-training for Recommendation | Yiqing Wu et.al. | 2405.03562 | null |
2024-05-06 | AlphaMath Almost Zero: process Supervision without process | Guoxin Chen et.al. | 2405.03553 | link |
2024-05-06 | MAmmoTH2: Scaling Instructions from the Web | Xiang Yue et.al. | 2405.03548 | null |
2024-05-06 | Position Paper: Leveraging Foundational Models for Black-Box Optimization: Benefits, Challenges, and Future Directions | Xingyou Song et.al. | 2405.03547 | null |
2024-05-06 | Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning | Yubo Mai et.al. | 2405.03509 | null |
2024-05-06 | UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images | Yiting Qu et.al. | 2405.03486 | null |
2024-05-06 | LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model | Haowen Sun et.al. | 2405.03485 | link |
2024-05-06 | Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search | Hideaki Joko et.al. | 2405.03480 | link |
2024-05-07 | Large Language Models (LLMs) as Agents for Augmented Democracy | Jairo Gudiño-Rosero et.al. | 2405.03452 | null |
2024-05-06 | SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence | Hangyuan Ji et.al. | 2405.03446 | null |
2024-05-03 | Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models | Piotr Padlewski et.al. | 2405.02287 | link |
2024-05-03 | Structural Pruning of Pre-trained Language Models via Neural Architecture Search | Aaron Klein et.al. | 2405.02267 | null |
2024-05-03 | On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning? | Maxime Zanella et.al. | 2405.02266 | link |
2024-05-03 | Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows | Jasmine Y. Shih et.al. | 2405.02260 | null |
2024-05-03 | What matters when building vision-language models? | Hugo Laurençon et.al. | 2405.02246 | null |
2024-05-03 | REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs | Deepa Tilwani et.al. | 2405.02228 | null |
2024-05-03 | Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks | Lujing Zhang et.al. | 2405.02225 | null |
2024-05-03 | FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems | Yashar Deldjoo et.al. | 2405.02219 | null |
2024-05-03 | Automatic Programming: Large Language Models and Beyond | Michael R. Lyu et.al. | 2405.02213 | null |
2024-05-03 | Assessing and Verifying Task Utility in LLM-Powered Applications | Negar Arabzadeh et.al. | 2405.02178 | null |
2024-05-03 | Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset | Hsuvas Borkakoty et.al. | 2405.02175 | null |
2024-05-03 | Mapping the Unseen: Unified Promptable Panoptic Mapping with Dynamic Labeling using Foundation Models | Mohamad Al Mdfaa et.al. | 2405.02162 | null |
2024-05-03 | Neural Context Flows for Learning Generalizable Dynamical Systems | Roussel Desmond Nzoyem et.al. | 2405.02154 | link |
2024-05-03 | The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates | Giuseppe Russo Latona et.al. | 2405.02150 | link |
2024-05-03 | MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain | Chao Jiang et.al. | 2405.02144 | null |
2024-05-03 | Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection | Guillem Ramírez et.al. | 2405.02134 | null |
2024-05-03 | Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets | Xuelong Geng et.al. | 2405.02132 | null |
2024-05-03 | Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph | Vladyslav Nechakhin et.al. | 2405.02105 | null |
2024-05-03 | Argumentative Large Language Models for Explainable and Contestable Decision-Making | Gabriel Freedman et.al. | 2405.02079 | null |
2024-05-03 | Comparative Analysis of Retrieval Systems in the Real World | Dmytro Mozolevskyi et.al. | 2405.02048 | null |
2024-05-02 | Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models | Seungone Kim et.al. | 2405.01535 | link |
2024-05-02 | Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks | Murtaza Dalal et.al. | 2405.01534 | null |
2024-05-02 | OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning | Shihao Wang et.al. | 2405.01533 | link |
2024-05-02 | FLAME: Factuality-Aware Alignment for Large Language Models | Sheng-Chieh Lin et.al. | 2405.01525 | null |
2024-05-02 | A separability-based approach to quantifying generalization: which layer is best? | Luciano Dyballa et.al. | 2405.01524 | null |
2024-05-02 | Transformer-Aided Semantic Communications | Matin Mortaheb et.al. | 2405.01521 | null |
2024-05-02 | D2PO: Discriminator-Guided DPO with Response Evaluation Models | Prasann Singhal et.al. | 2405.01511 | link |
2024-05-02 | Analyzing the Role of Semantic Representations in the Era of Large Language Models | Zhijing Jin et.al. | 2405.01502 | link |
2024-05-02 | Supporting Business Document Workflows via Collection-Centric Information Foraging with Large Language Models | Raymond Fok et.al. | 2405.01501 | null |
2024-05-02 | Controllable Text Generation in the Instruction-Tuning Era | Dhananjay Ashok et.al. | 2405.01490 | null |
2024-05-02 | MANTIS: Interleaved Multi-Image Instruction Tuning | Dongfu Jiang et.al. | 2405.01483 | null |
2024-05-02 | NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment | Gerald Shen et.al. | 2405.01481 | link |
2024-05-02 | V-FLUTE: Visual Figurative Language Understanding with Textual Explanations | Arkadiy Saakyan et.al. | 2405.01474 | link |
2024-05-02 | Advancing human-centric AI for robust X-ray analysis through holistic self-supervised learning | Théo Moutakanni et.al. | 2405.01469 | null |
2024-05-02 | Understanding Retrieval-Augmented Task Adaptation for Vision-Language Models | Yifei Ming et.al. | 2405.01468 | null |
2024-05-02 | A Systematic Literature Review on Large Language Models for Automated Program Repair | Quanjun Zhang et.al. | 2405.01466 | link |
2024-05-02 | Natural Language to Verilog: Design of a Recurrent Spiking Neural Network using Large Language Models and ChatGPT | Paola Vitolo et.al. | 2405.01419 | null |
2024-05-02 | MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors | Yuan Tang et.al. | 2405.01413 | link |
2024-05-02 | Verification and Refinement of Natural Language Explanations through LLM-Symbolic Theorem Proving | Xin Quan et.al. | 2405.01379 | null |
2024-05-02 | GAIA: A General AI Assistant for Intelligent Accelerator Operations | Frank Mayet et.al. | 2405.01359 | null |
2024-05-01 | Self-Play Preference Optimization for Language Model Alignment | Yue Wu et.al. | 2405.00675 | null |
2024-05-01 | Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3 | Junsang Yoon et.al. | 2405.00664 | link |
2024-05-01 | HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models | Ningke Li et.al. | 2405.00648 | null |
2024-05-01 | When Quantization Affects Confidence of Large Language Models? | Irina Proskurina et.al. | 2405.00632 | link |
2024-05-01 | "I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust | Sunnie S. Y. Kim et.al. | 2405.00623 | null |
2024-05-01 | Causal Evaluation of Language Models | Sirui Chen et.al. | 2405.00622 | link |
2024-05-01 | Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling | Yida Mu et.al. | 2405.00611 | null |
2024-05-01 | Investigating Automatic Scoring and Feedback using Large Language Models | Gloria Ashiya Katuka et.al. | 2405.00602 | null |
2024-05-01 | Are Models Biased on Text without Gender-related Language? | Catarina G Belém et.al. | 2405.00588 | link |
2024-05-01 | The Real, the Better: Aligning Large Language Models with Online Human Behaviors | Guanying Jiang et.al. | 2405.00578 | null |
2024-05-01 | EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model | Deng Li et.al. | 2405.00574 | null |
2024-05-01 | NumLLM: Numeric-Sensitive Large Language Model for Chinese Finance | Huan-Yi Su et.al. | 2405.00566 | null |
2024-05-01 | Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment | Zhili Liu et.al. | 2405.00557 | null |
2024-05-01 | Long-Term Human Trajectory Prediction using 3D Dynamic Scene Graphs | Nicolas Gorlo et.al. | 2405.00552 | link |
2024-05-01 | ChatBI: Towards Natural Language to Complex Business Intelligence SQL | Jinqing Lian et.al. | 2405.00527 | null |
2024-05-01 | CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions | Donghee Choi et.al. | 2405.00523 | null |
2024-05-01 | Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning | Lucas-Andreï Thil et.al. | 2405.00516 | null |
2024-05-01 | GOLD: Geometry Problem Solver with Natural Language Description | Jiaxin Zhang et.al. | 2405.00494 | link |
2024-05-01 | Is Temperature the Creativity Parameter of Large Language Models? | Max Peeperkorn et.al. | 2405.00492 | null |
2024-05-01 | The Pyramid of Captions | Delong Chen et.al. | 2405.00485 | null |
2024-04-30 | Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation | Yunhao Ge et.al. | 2404.19752 | null |
2024-04-30 | PrivComp-KG : Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification | Leon Garza et.al. | 2404.19744 | null |
2024-04-30 | Better & Faster Large Language Models via Multi-token Prediction | Fabian Gloeckle et.al. | 2404.19737 | null |
2024-04-30 | A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications | Steph Buongiorno et.al. | 2404.19729 | null |
2024-04-30 | PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games | Steph Buongiorno et.al. | 2404.19721 | null |
2024-04-30 | Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns | Constantinos Patsakis et.al. | 2404.19715 | null |
2024-04-30 | Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models | Scott Sumpter et.al. | 2404.19713 | null |
2024-04-30 | When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively | Tiziano Labruna et.al. | 2404.19705 | link |
2024-04-30 | Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners | Chun Feng et.al. | 2404.19696 | null |
2024-04-30 | Towards Generalist Robot Learning from Internet Video: A Survey | Robert McCarthy et.al. | 2404.19664 | null |
2024-04-30 | MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation | Min Zhang et.al. | 2404.19644 | null |
2024-04-30 | On Training a Neural Network to Explain Binaries | Alexander Interrante-Grant et.al. | 2404.19631 | null |
2024-04-30 | Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model | Denys Godwin et.al. | 2404.19609 | null |
2024-04-30 | Transferring Troubles: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning | Xuanli He et.al. | 2404.19597 | null |
2024-04-30 | RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing | Yucheng Hu et.al. | 2404.19543 | link |
2024-04-30 | MoST: Multi-modality Scene Tokenization for Motion Prediction | Norman Mu et.al. | 2404.19531 | null |
2024-04-30 | Do Large Language Models Understand Conversational Implicature -- A case study with a chinese sitcom | Shisen Yue et.al. | 2404.19509 | link |
2024-04-30 | More Compute Is What You Need | Zhen Guo et.al. | 2404.19484 | null |
2024-05-01 | Neuro-Vision to Language: Image Reconstruction and Language enabled Interaction via Brain Recordings | Guobin Shen et.al. | 2404.19438 | null |
2024-04-30 | Can Large Language Models put 2 and 2 together? Probing for Entailed Arithmetical Relationships | D. Panas et.al. | 2404.19432 | null |
2024-04-29 | Hallucination of Multimodal Large Language Models: A Survey | Zechen Bai et.al. | 2404.18930 | link |
2024-04-29 | Holmes: Benchmark the Linguistic Competence of Language Models | Andreas Waldis et.al. | 2404.18923 | null |
2024-04-29 | DPO Meets PPO: Reinforced Token Optimization for RLHF | Han Zhong et.al. | 2404.18922 | null |
2024-04-29 | TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation | Junhao Cheng et.al. | 2404.18919 | link |
2024-04-29 | Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting | Fangcheng Liu et.al. | 2404.18911 | link |
2024-04-29 | Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking | Hong Jin Kang et.al. | 2404.18881 | link |
2024-04-29 | More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness | Aaron J. Li et.al. | 2404.18870 | link |
2024-04-29 | Truth-value judgment in language models: belief directions are context sensitive | Stefan F. Schouten et.al. | 2404.18865 | null |
2024-04-29 | Performance-Aligned LLMs for Generating Fast Code | Daniel Nichols et.al. | 2404.18864 | null |
2024-04-29 | A Survey on Vision Mamba: Models, Applications and Challenges | Rui Xu et.al. | 2404.18861 | link |
2024-04-29 | VERT: Verified Equivalent Rust Transpilation with Few-Shot Learning | Aidan Z. H. Yang et.al. | 2404.18852 | null |
2024-04-29 | FeDeRA:Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition | Yuxuan Yan et.al. | 2404.18848 | null |
2024-04-29 | It's Difficult to be Neutral -- Human and LLM-based Sentiment Annotation of Patient Comments | Petter Mæhlum et.al. | 2404.18832 | null |
2024-04-29 | Benchmarking Benchmark Leakage in Large Language Models | Ruijie Xu et.al. | 2404.18824 | link |
2024-04-29 | AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering | Wenxiang Zhao et.al. | 2404.18816 | null |
2024-04-29 | Unknown Script: Impact of Script on Cross-Lingual Transfer | Wondimagegnhue Tsegaye Tufa et.al. | 2404.18810 | link |
2024-04-29 | Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models | Pat Verga et.al. | 2404.18796 | null |
2024-04-29 | PECC: Problem Extraction and Coding Challenges | Patrick Haller et.al. | 2404.18766 | link |
2024-04-29 | Transitive Vision-Language Prompt Learning for Domain Generalization | Liyuan Wang et.al. | 2404.18758 | null |
2024-04-29 | Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models | Hongyi Zhu et.al. | 2404.18746 | null |
2024-04-26 | Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo | Stephen Zhao et.al. | 2404.17546 | link |
2024-04-26 | Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models | Yuhang Huang et.al. | 2404.17534 | null |
2024-04-26 | Large Language Model Agent as a Mechanical Designer | Yayati Jadhav et.al. | 2404.17525 | null |
2024-04-26 | On the Use of Large Language Models to Generate Capability Ontologies | Luis Miguel Vieira da Silva et.al. | 2404.17524 | null |
2024-04-26 | Enhancing Legal Compliance and Regulation Analysis with Large Language Models | Shabnam Hassani et.al. | 2404.17522 | null |
2024-04-26 | A Comprehensive Evaluation on Event Reasoning of Large Language Models | Zhengwei Tao et.al. | 2404.17513 | link |
2024-04-26 | CEval: A Benchmark for Evaluating Counterfactual Text Generation | Van Bach Nguyen et.al. | 2404.17475 | null |
2024-04-26 | Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System | Robin Schmucker et.al. | 2404.17460 | null |
2024-04-26 | "ChatGPT Is Here to Help, Not to Replace Anybody" -- An Evaluation of Students' Opinions On Integrating ChatGPT In CS Courses | Bruno Pereira Cipriano et.al. | 2404.17443 | null |
2024-04-26 | PromptCIR: Blind Compressed Image Restoration with Prompt Learning | Bingchen Li et.al. | 2404.17433 | link |
2024-04-26 | Evaluation of Geographical Distortions in Language Models: A Crucial Step Towards Equitable Representations | Rémy Decoupes et.al. | 2404.17401 | null |
2024-04-26 | UniRGB-IR: A Unified Framework for Visible-Infrared Downstream Tasks via Adapter Tuning | Maoxun Yuan et.al. | 2404.17360 | null |
2024-04-26 | InspectorRAGet: An Introspection Platform for RAG Evaluation | Kshitij Fadnis et.al. | 2404.17347 | link |
2024-04-26 | Introducing cosmosGPT: Monolingual Training for Turkish Language Models | H. Toprak Kesgin et.al. | 2404.17336 | null |
2024-04-26 | A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation | Xin Zhang et.al. | 2404.17335 | null |
2024-04-26 | An Extendable Cloud-Native Alloy Property Explorer | Zhuoyuan Li et.al. | 2404.17330 | link |
2024-04-26 | When to Trust LLMs: Aligning Confidence with Response Quality | Shuchang Tao et.al. | 2404.17287 | null |
2024-04-26 | Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM | Xuan Zhang et.al. | 2404.17283 | link |
2024-04-26 | Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT as a Pivot | Michelle Terblanche et.al. | 2404.17216 | null |
2024-04-26 | Low-Rank Knowledge Decomposition for Medical Foundation Models | Yuhang Zhou et.al. | 2404.17184 | null |
2024-04-25 | The Third Monocular Depth Estimation Challenge | Jaime Spencer et.al. | 2404.16831 | null |
2024-04-25 | Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials | Ye Fang et.al. | 2404.16829 | null |
2024-04-25 | V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection | Xuanyu Zhang et.al. | 2404.16824 | null |
2024-04-25 | How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites | Zhe Chen et.al. | 2404.16821 | link |
2024-04-25 | IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages | Harman Singh et.al. | 2404.16816 | link |
2024-04-26 | Make Your LLM Fully Utilize the Context | Shengnan An et.al. | 2404.16811 | link |
2024-04-25 | Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning | Tianhui Zhang et.al. | 2404.16807 | null |
2024-04-25 | AAPL: Adding Attributes to Prompt Learning for Vision-Language Models | Gahyeon Kim et.al. | 2404.16804 | link |
2024-04-25 | Weak-to-Strong Extrapolation Expedites Alignment | Chujie Zheng et.al. | 2404.16792 | link |
2024-04-25 | SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension | Bohao Li et.al. | 2404.16790 | link |
2024-04-25 | Continual Learning of Large Language Models: A Comprehensive Survey | Haizhou Shi et.al. | 2404.16789 | link |
2024-04-25 | Modeling Selective Feature Attention for Representation-based Siamese Text Matching | Jianxiang Zang et.al. | 2404.16776 | link |
2024-04-25 | REBEL: Reinforcement Learning via Regressing Relative Rewards | Zhaolin Gao et.al. | 2404.16767 | link |
2024-04-25 | Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model | Runzhe Zhan et.al. | 2404.16766 | null |
2024-04-25 | RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis | Xiaoman Zhang et.al. | 2404.16754 | null |
2024-04-25 | Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class | Mazda Moayeri et.al. | 2404.16717 | null |
2024-04-25 | Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding | Mostafa Elhoushi et.al. | 2404.16710 | null |
2024-04-25 | Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents | Giorgio Piatti et.al. | 2404.16698 | null |
2024-04-25 | Influence of Solution Efficiency and Valence of Instruction on Additive and Subtractive Solution Strategies in Humans and GPT-4 | Lydia Uhler et.al. | 2404.16692 | null |
2024-04-25 | EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning | Hongxia Xie et.al. | 2404.16670 | link |
2024-04-24 | Hybrid LLM/Rule-based Approaches to Business Insights Generation from Structured Data | Aliaksei Vertsel et.al. | 2404.15604 | null |
2024-04-24 | ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction | Henry Peng Zou et.al. | 2404.15592 | link |
2024-04-24 | MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis | Jiaxin Zhuang et.al. | 2404.15580 | null |
2024-04-24 | Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? | Hossein Salami et.al. | 2404.15578 | null |
2024-04-24 | Retrieval Head Mechanistically Explains Long-Context Factuality | Wenhao Wu et.al. | 2404.15574 | link |
2024-04-23 | PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models | Shashi Kant Gupta et.al. | 2404.15549 | null |
2024-04-23 | BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis | Shuhang Lin et.al. | 2404.15532 | link |
2024-04-23 | Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models | Mihir Parmar et.al. | 2404.15522 | link |
2024-04-23 | Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval | Young Kyun Jang et.al. | 2404.15516 | null |
2024-04-23 | ToM-LM: Delegating Theory Of Mind Reasoning to External Symbolic Executors in Large Language Models | Weizhi Tang et.al. | 2404.15515 | null |
2024-04-23 | IryoNLP at MEDIQA-CORR 2024: Tackling the Medical Error Detection & Correction Task On the Shoulders of Medical Agents | Jean-Philippe Corbeil et.al. | 2404.15488 | link |
2024-04-23 | Large Language Models Spot Phishing Emails with Surprising Accuracy: A Comparative Analysis of Performance | Het Patel et.al. | 2404.15485 | null |
2024-04-23 | Can Large Language Models Learn the Physics of Metamaterials? An Empirical Study with ChatGPT | Darui Lu et.al. | 2404.15458 | null |
2024-04-23 | XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference | João Monteiro et.al. | 2404.15420 | null |
2024-04-23 | Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs | Davide Caffagni et.al. | 2404.15406 | null |
2024-04-23 | Aligning LLM Agents by Learning Latent Preference from User Edits | Ge Gao et.al. | 2404.15269 | link |
2024-04-23 | XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts | Yifeng Ding et.al. | 2404.15247 | link |
2024-04-23 | CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies | Weiyan Shi et.al. | 2404.15238 | link |
2024-04-23 | Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models | Aidan Z. H. Yang et.al. | 2404.15236 | null |
2024-04-23 | Re-Thinking Inverse Graphics With Large Language Models | Peter Kulits et.al. | 2404.15228 | null |
2024-04-23 | Does Instruction Tuning Make LLMs More Consistent? | Constanza Fierro et.al. | 2404.15206 | null |
2024-04-23 | Setting up the Data Printer with Improved English to Ukrainian Machine Translation | Yurii Paniv et.al. | 2404.15196 | link |
2024-04-23 | Regressive Side Effects of Training Language Models to Mimic Student Misconceptions | Shashank Sonkar et.al. | 2404.15156 | null |
2024-04-23 | Bias patterns in the application of LLMs for clinical decision support: A comprehensive study | Raphael Poulain et.al. | 2404.15149 | link |
2024-04-23 | Rethinking LLM Memorization through the Lens of Adversarial Compression | Avi Schwarzschild et.al. | 2404.15146 | null |
2024-04-23 | MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning | Sunan He et.al. | 2404.15127 | null |
2024-04-23 | Identifying Fairness Issues in Automatically Generated Testing Content | Kevin Stowe et.al. | 2404.15104 | null |
2024-04-23 | Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation | Xun Wu et.al. | 2404.15100 | null |
2024-04-23 | Detection of circular permutations by Protein Language Models | Yue Hu et.al. | 2404.15087 | link |
2024-04-23 | Multi-Head Mixture-of-Experts | Xun Wu et.al. | 2404.15045 | null |
2024-04-23 | TAXI: Evaluating Categorical Knowledge Editing for Language Models | Derek Powell et.al. | 2404.15004 | link |
2024-04-23 | Transformers Can Represent | Anej Svete et.al. | 2404.14994 | null |
2024-04-23 | A Short Review for Ontology Learning from Text: Stride from Shallow Learning, Deep Learning to Large Language Models Trend | Rick Du et.al. | 2404.14991 | null |
2024-04-23 | | Kerstin Kläser et.al. | 2404.14986 | null |
2024-04-23 | Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case | Muhammad Asif Auyb et.al. | 2404.14977 | null |
2024-04-22 | AutoAD III: The Prequel -- Back to the Pixels | Tengda Han et.al. | 2404.14412 | null |
2024-04-22 | SpaceByte: Towards Deleting Tokenization from Large Language Modeling | Kevin Slagle et.al. | 2404.14408 | link |
2024-04-22 | RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios? | Adrian de Wynter et.al. | 2404.14397 | link |
2024-04-22 | SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation | Yuying Ge et.al. | 2404.14396 | link |
2024-04-22 | PARAMANU-GANITA: Language Model with Mathematical Capabilities | Mitodru Niyogi et.al. | 2404.14395 | null |
2024-04-22 | A Multimodal Automated Interpretability Agent | Tamar Rott Shaham et.al. | 2404.14394 | null |
2024-04-22 | A Survey on Self-Evolution of Large Language Models | Zhengwei Tao et.al. | 2404.14387 | link |
2024-04-22 | Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph | Xiaochen Kev Gao et.al. | 2404.14372 | link |
2024-04-23 | Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data | Fahim Tajwar et.al. | 2404.14367 | link |
2024-04-22 | Better Synthetic Data by Retrieving and Transforming Existing Datasets | Saumya Gandhi et.al. | 2404.14361 | link |
2024-04-22 | Rethinking Legal Compliance Automation: Opportunities with Large Language Models | Shabnam Hassani et.al. | 2404.14356 | null |
2024-04-22 | Calc-CMU at SemEval-2024 Task 7: Pre-Calc -- Learning to Use the Calculator Improves Numeracy in Language Models | Vishruth Veerendranath et.al. | 2404.14355 | link |
2024-04-22 | Automated Long Answer Grading with RiceChem Dataset | Shashank Sonkar et.al. | 2404.14316 | link |
2024-04-22 | Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels | Jan-Philipp Fränken et.al. | 2404.14313 | link |
2024-04-22 | Explaining Arguments' Strength: Unveiling the Role of Attacks and Supports (Technical Report) | Xiang Yin et.al. | 2404.14304 | link |
2024-04-22 | Marking: Visual Grading with Highlighting Errors and Annotating Missing Bits | Shashank Sonkar et.al. | 2404.14301 | null |
2024-04-22 | Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach | Yao Wan et.al. | 2404.14296 | link |
2024-04-22 | A Survey on Efficient Inference for Large Language Models | Zixuan Zhou et.al. | 2404.14294 | null |
2024-04-22 | LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots | Dongge Han et.al. | 2404.14285 | null |
2024-04-22 | Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback | Wenyi Xiao et.al. | 2404.14233 | null |
2024-04-19 | MoVA: Adapting Mixture of Vision Experts to Multimodal Context | Zhuofan Zong et.al. | 2404.13046 | link |
2024-04-19 | Unified Scene Representation and Reconstruction for 3D Large Language Models | Tao Chu et.al. | 2404.13044 | null |
2024-04-19 | Data Alignment for Zero-Shot Concept Generation in Dermatology AI | Soham Gadgil et.al. | 2404.13043 | null |
2024-04-19 | Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs | Biyang Guo et.al. | 2404.13033 | link |
2024-04-19 | When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering | Stephen Choi et.al. | 2404.13028 | null |
2024-04-19 | Stronger Random Baselines for In-Context Learning | Gregory Yauney et.al. | 2404.13020 | link |
2024-04-19 | Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models | Chuofan Ma et.al. | 2404.13013 | null |
2024-04-19 | Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs | Clemencia Siro et.al. | 2404.12994 | link |
2024-04-19 | FineRec: Exploring Fine-grained Sequential Recommendation | Xiaokun Zhang et.al. | 2404.12975 | link |
2024-04-19 | Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models | Yian Li et.al. | 2404.12966 | null |
2024-04-19 | Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction | Qinyuan Wu et.al. | 2404.12957 | null |
2024-04-19 | Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models | Konstantinos Vilouras et.al. | 2404.12920 | null |
2024-04-19 | Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models | Zhenyang Ni et.al. | 2404.12916 | link |
2024-04-19 | Large Language Models for Networking: Workflow, Advances and Challenges | Chang Liu et.al. | 2404.12901 | null |
2024-04-19 | Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning | Ahmed Elshabrawy et.al. | 2404.12897 | null |
2024-04-19 | Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation | Guanhua Chen et.al. | 2404.12879 | null |
2024-04-19 | LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency | Zhaodonghui Li et.al. | 2404.12872 | link |
2024-04-19 | How Does the Textual Information Affect the Retrieval of Multimodal In-Context Learning? | Yang Luo et.al. | 2404.12866 | null |
2024-04-19 | Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation | Yilong Chen et.al. | 2404.12861 | null |
2024-04-19 | TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages | Aleksei Dorkin et.al. | 2404.12845 | null |
2024-04-18 | BLINK: Multimodal Large Language Models Can See but Not Perceive | Xingyu Fu et.al. | 2404.12390 | null |
2024-04-18 | Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models | Aitor Ormazabal et.al. | 2404.12387 | null |
2024-04-18 | MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale | Xiaotang Gai et.al. | 2404.12372 | null |
2024-04-18 | When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes | Asaf Yehudai et.al. | 2404.12365 | link |
2024-04-18 | From | Rafael Rafailov et.al. | 2404.12358 | null |
2024-04-18 | Towards a Foundation Model for Partial Differential Equation: Multi-Operator Learning and Extrapolation | Jingmin Sun et.al. | 2404.12355 | link |
2024-04-18 | V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning | Hang Hua et.al. | 2404.12353 | null |
2024-04-18 | Evaluating AI for Law: Bridging the Gap with Open-Source Solutions | Rohan Bhambhoria et.al. | 2404.12349 | null |
2024-04-18 | Large Language Models in Targeted Sentiment Analysis | Nicolay Rusnachenko et.al. | 2404.12342 | link |
2024-04-18 | Normative Requirements Operationalization with Large Language Models | Nick Feng et.al. | 2404.12335 | null |
2024-04-18 | Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment | Zhaofeng Wu et.al. | 2404.12318 | null |
2024-04-18 | Large Language Models for Synthetic Participatory Planning of Shared Automated Electric Mobility Systems | Jiangbo Yu et.al. | 2404.12317 | null |
2024-04-18 | Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair | Yusuke Sakai et.al. | 2404.12299 | null |
2024-04-18 | Augmenting emotion features in irony detection with Large language modeling | Yucheng Lin et.al. | 2404.12291 | null |
2024-04-18 | Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery | Yona Falinie A. Gaus et.al. | 2404.12285 | null |
2024-04-18 | Enhancing Embedding Performance through Large Language Model-based Text Enrichment and Rewriting | Nicholas Harris et.al. | 2404.12283 | null |
2024-04-18 | Advancing the Robustness of Large Language Models through Self-Denoised Smoothing | Jiabao Ji et.al. | 2404.12274 | link |
2024-04-18 | FedEval-LLM: Federated Evaluation of Large Language Models on Downstream Tasks with Collective Wisdom | Yuanqin He et.al. | 2404.12273 | null |
2024-04-18 | Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences | Shreya Shankar et.al. | 2404.12272 | null |
2024-04-18 | Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM | Michelle S. Lam et.al. | 2404.12259 | link |
2024-04-17 | Private federated discovery of out-of-vocabulary words for Gboard | Ziteng Sun et.al. | 2404.11607 | null |
2024-04-17 | VG4D: Vision-Language Model Goes 4D Video Recognition | Zhichao Deng et.al. | 2404.11605 | link |
2024-04-17 | A Deep Dive into Large Language Models for Automated Bug Localization and Repair | Soneya Binta Hossain et.al. | 2404.11595 | null |
2024-04-17 | Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding | Zezhong Fan et.al. | 2404.11589 | null |
2024-04-17 | LLMTune: Accelerate Database Knob Tuning with Large Language Models | Xinmei Huang et.al. | 2404.11581 | link |
2024-04-17 | On the Scalability of GNNs for Molecular Graphs | Maciej Sypetkowski et.al. | 2404.11568 | null |
2024-04-17 | MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation | Kuan-Chieh et.al. | 2404.11565 | null |
2024-04-17 | Quantifying Multilingual Performance of Large Language Models Across Languages | Zihao Li et.al. | 2404.11553 | null |
2024-04-17 | Evaluating Span Extraction in Generative Paradigm: A Reflection on Aspect-Based Sentiment Analysis | Soyoung Yang et.al. | 2404.11539 | null |
2024-04-17 | FedPFT: Federated Proxy Fine-Tuning of Foundation Models | Zhaopeng Peng et.al. | 2404.11536 | link |
2024-04-17 | Select and Reorder: A Novel Approach for Neural Sign Language Production | Harry Walsh et.al. | 2404.11532 | null |
2024-04-17 | Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization | Costas Mavromatis et.al. | 2404.11531 | link |
2024-04-17 | Embedding Privacy in Computational Social Science and Artificial Intelligence Research | Keenan Jones et.al. | 2404.11515 | null |
2024-04-17 | Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models | Yushuo Chen et.al. | 2404.11502 | link |
2024-04-17 | Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models | Yue Zhou et.al. | 2404.11500 | link |
2024-04-18 | Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent | Wei Chen et.al. | 2404.11459 | null |
2024-04-17 | Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models | Sunhao Dai et.al. | 2404.11457 | link |
2024-04-17 | AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts | Meng Jiang et.al. | 2404.11449 | null |
2024-04-17 | Open-Ended Wargames with Large Language Models | Daniel P. Hogan et.al. | 2404.11446 | link |
2024-04-17 | DUPE: Detection Undermining via Prompt Engineering for Deepfake Text | James Weichert et.al. | 2404.11408 | null |
2024-04-16 | Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback | Qiwei Di et.al. | 2404.10776 | null |
2024-04-16 | COMBO: Compositional World Models for Embodied Multi-Agent Cooperation | Hongxin Zhang et.al. | 2404.10775 | null |
2024-04-16 | Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification | Yu-Yang Li et.al. | 2404.10757 | link |
2024-04-16 | Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study | Shusheng Xu et.al. | 2404.10719 | null |
2024-04-16 | Dual Modalities of Text: Visual and Textual Generative Pre-training | Yekun Chai et.al. | 2404.10710 | null |
2024-04-16 | Question Difficulty Ranking for Multiple-Choice Reading Comprehension | Vatsal Raina et.al. | 2404.10704 | null |
2024-04-16 | An empirical study on code review activity prediction in practice | Doriane Olewicki et.al. | 2404.10703 | null |
2024-04-16 | Automating REST API Postman Test Cases Using LLM | S Deepika Sri et.al. | 2404.10678 | null |
2024-04-16 | Self-playing Adversarial Language Game Enhances LLM Reasoning | Pengyu Cheng et.al. | 2404.10642 | link |
2024-04-16 | HLAT: High-quality Large Language Model Pre-trained on AWS Trainium | Haozheng Fan et.al. | 2404.10630 | null |
2024-04-16 | Private Attribute Inference from Images with Vision-Language Models | Batuhan Tömekçe et.al. | 2404.10618 | null |
2024-04-16 | Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases | Yanze Li et.al. | 2404.10595 | null |
2024-04-16 | Construction of Domain-specified Japanese Large Language Model for Finance through Continual Pre-training | Masanori Hirano et.al. | 2404.10555 | null |
2024-04-16 | Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning | Xiao Wang et.al. | 2404.10552 | null |
2024-04-16 | Capturing the Macroscopic Behaviour of Molecular Dynamics with Membership Functions | Alexander Sikorski et.al. | 2404.10523 | null |
2024-04-16 | CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity | Moshe Berchansky et.al. | 2404.10513 | null |
2024-04-16 | White Men Lead, Black Women Help: Uncovering Gender, Racial, and Intersectional Bias in Language Agency | Yixin Wan et.al. | 2404.10508 | null |
2024-04-16 | Self-Supervised Visual Preference Alignment | Ke Zhu et.al. | 2404.10501 | link |
2024-04-16 | When Emotional Stimuli meet Prompt Designing: An Auto-Prompt Graphical Paradigm | Chenggian Ma et.al. | 2404.10500 | null |
2024-04-16 | Spiral of Silences: How is Large Language Model Killing Information Retrieval? -- A Case Study on Open Domain Question Answering | Xiaoyang Chen et.al. | 2404.10496 | link |
2024-04-15 | KG-CTG: Citation Generation through Knowledge Graph-guided Large Language Models | Avinash Anand et.al. | 2404.09763 | null |
2024-04-15 | Resilience of Large Language Models for Noisy Instructions | Bin Wang et.al. | 2404.09754 | null |
2024-04-15 | Personalized Collaborative Fine-Tuning for On-Device Large Language Models | Nicolas Wagner et.al. | 2404.09753 | link |
2024-04-15 | AMPCliff: quantitative definition and benchmarking of activity cliffs in antimicrobial peptides | Kewei Li et.al. | 2404.09738 | link |
2024-04-15 | Quantization of Large Language Models with an Overdetermined Basis | Daniil Merkulov et.al. | 2404.09737 | null |
2024-04-15 | Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models | Ziwei Luo et.al. | 2404.09732 | link |
2024-04-15 | Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model | Hyunsoo Cho et.al. | 2404.09717 | null |
2024-04-15 | Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction | David Sobrín-Hidalgo et.al. | 2404.09705 | null |
2024-04-15 | Generative AI for Game Theory-based Mobile Networking | Long He et.al. | 2404.09699 | null |
2024-04-15 | Are Large Language Models Reliable Argument Quality Annotators? | Nailia Mirzakhmedova et.al. | 2404.09696 | null |
2024-04-15 | LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models | Guangyan Li et.al. | 2404.09695 | null |
2024-04-15 | Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation | Juhwan Choi et.al. | 2404.09682 | null |
2024-04-15 | Learn Your Reference Model for Real Good Alignment | Alexey Gorbatovski et.al. | 2404.09656 | null |
2024-04-15 | Do LLMs Understand Visual Anomalies? Uncovering LLM Capabilities in Zero-shot Anomaly Detection | Jiaqi Zhu et.al. | 2404.09654 | null |
2024-04-15 | Bridging Vision and Language Spaces with Assignment Prediction | Jungin Park et.al. | 2404.09632 | link |
2024-04-15 | AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception | Yipo Huang et.al. | 2404.09624 | link |
2024-04-15 | UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark | Zhaokun Zhou et.al. | 2404.09619 | null |
2024-04-15 | A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions | Pengfei Liu et.al. | 2404.09606 | link |
2024-04-15 | Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction | Zepeng Ding et.al. | 2404.09593 | null |
2024-04-15 | Modelling Language | Jumbly Grindrod et.al. | 2404.09579 | null |
2024-04-15 | Transformers, Contextualism, and Polysemy | Jumbly Grindrod et.al. | 2404.09577 | null |
2024-04-15 | Large language models and linguistic intentionality | Jumbly Grindrod et.al. | 2404.09576 | null |
2024-04-12 | Probing the 3D Awareness of Visual Foundation Models | Mohamed El Banani et.al. | 2404.08636 | link |
2024-04-12 | Pre-training Small Base LMs with Fewer Tokens | Sunny Sanyal et.al. | 2404.08634 | link |
2024-04-12 | FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models | Yanting Wang et.al. | 2404.08631 | link |
2024-04-12 | Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation | Yanhao Zheng et.al. | 2404.08603 | link |
2024-04-12 | Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts | Övgü Özdemir et.al. | 2404.08589 | link |
2024-04-12 | Pathological Primitive Segmentation Based on Visual Foundation Model with Zero-Shot Mask Generation | Abu Bakor Hayat Arnob et.al. | 2404.08584 | link |
2024-04-12 | FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation | Riza Velioglu et.al. | 2404.08582 | null |
2024-04-12 | Lossy Image Compression with Foundation Diffusion Models | Lucas Relic et.al. | 2404.08580 | null |
2024-04-12 | Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation | Hanlin Tian et.al. | 2404.08570 | null |
2024-04-12 | RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs | Shreyas Chaudhari et.al. | 2404.08555 | null |
2024-04-12 | Memory Traces: Are Transformers Tulving Machines? | Jean-Marie Chauvet et.al. | 2404.08543 | null |
2024-04-12 | Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward | Xuan Xie et.al. | 2404.08517 | null |
2024-04-12 | ChatGPT and general-purpose AI count fruits in pictures surprisingly well | Konlavach Mengsuwan et.al. | 2404.08515 | null |
2024-04-12 | Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | Haoran Qiu et.al. | 2404.08509 | link |
2024-04-12 | LaSagnA: Language-based Segmentation Assistant for Complex Queries | Cong Wei et.al. | 2404.08506 | link |
2024-04-12 | Strategic Interactions between Large Language Models-based Agents in Beauty Contests | Siting Lu et.al. | 2404.08492 | null |
2024-04-12 | Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation | Haozhe Zhao et.al. | 2404.08491 | link |
2024-04-12 | Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian | Stefano De Paoli et.al. | 2404.08488 | null |
2024-04-12 | Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task | Hassan Ali et.al. | 2404.08424 | null |
2024-04-12 | Adapting the Segment Anything Model During Usage in Novel Situations | Robin Schön et.al. | 2404.08421 | null |
2024-04-11 | OpenBias: Open-set Bias Detection in Text-to-Image Generative Models | Moreno D'Incà et.al. | 2404.07990 | link |
2024-04-11 | Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding | Yiwen Tang et.al. | 2404.07989 | link |
2024-04-11 | Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Representation Learning | Simon Schrodi et.al. | 2404.07983 | null |
2024-04-11 | Language Imbalance Can Boost Cross-lingual Generalisation | Anton Schäfer et.al. | 2404.07982 | link |
2024-04-11 | Manipulating Large Language Models to Increase Product Visibility | Aounon Kumar et.al. | 2404.07981 | link |
2024-04-11 | LLoCO: Learning Long Contexts Offline | Sijun Tan et.al. | 2404.07979 | link |
2024-04-11 | Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models | Haotian Zhang et.al. | 2404.07973 | null |
2024-04-11 | Rho-1: Not All Tokens Are What You Need | Zhenghao Lin et.al. | 2404.07965 | link |
2024-04-11 | On Unified Prompt Tuning for Request Quality Assurance in Public Code Review | Xinyu Chen et.al. | 2404.07942 | null |
2024-04-11 | Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation | Jinkyung Park et.al. | 2404.07926 | null |
2024-04-11 | LaVy: Vietnamese Multimodal Large Language Model | Chi Tran et.al. | 2404.07922 | link |
2024-04-11 | AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs | Zeyi Liao et.al. | 2404.07921 | link |
2024-04-11 | DesignQA: A Multimodal Benchmark for Evaluating Large Language Models' Understanding of Engineering Documentation | Anna C. Doris et.al. | 2404.07917 | link |
2024-04-11 | HGRN2: Gated Linear RNNs with State Expansion | Zhen Qin et.al. | 2404.07904 | link |
2024-04-11 | High-Dimension Human Value Representation in Large Language Models | Samuel Cahyawijaya et.al. | 2404.07900 | null |
2024-04-11 | Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations | Dayeon Ki et.al. | 2404.07851 | link |
2024-04-11 | On Training Data Influence of GPT Models | Qingyi Liu et.al. | 2404.07840 | link |
2024-04-11 | RecurrentGemma: Moving Past Transformers for Efficient Open Language Models | Aleksandar Botev et.al. | 2404.07839 | link |
2024-04-11 | Streamlined Photoacoustic Image Processing with Foundation Models: A Training-Free Solution | Handi Deng et.al. | 2404.07833 | null |
2024-04-11 | Heron-Bench: A Benchmark for Evaluating Vision Language Models in Japanese | Yuichi Inoue et.al. | 2404.07824 | link |
2024-04-10 | BRAVE: Broadening the visual encoding of vision-language models | Oğuzhan Fatih Kar et.al. | 2404.07204 | null |
2024-04-10 | UMBRAE: Unified Multimodal Decoding of Brain Signals | Weihao Xia et.al. | 2404.07202 | null |
2024-04-10 | Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic | Sachin Goyal et.al. | 2404.07177 | link |
2024-04-10 | Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention | Tsendsuren Munkhdalai et.al. | 2404.07143 | null |
2024-04-10 | Open reaction-diffusion systems: bridging probabilistic theory across scales | Mauricio J. del Razo et.al. | 2404.07119 | null |
2024-04-10 | Continuous Language Model Interpolation for Dynamic and Controllable Text Generation | Sara Kangaslahti et.al. | 2404.07117 | link |
2024-04-11 | From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications | Yongqiang Ma et.al. | 2404.07108 | null |
2024-04-10 | Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs | Bowen Jin et.al. | 2404.07103 | link |
2024-04-10 | Dynamic Generation of Personalities with Large Language Models | Jianzhi Liu et.al. | 2404.07084 | link |
2024-04-10 | VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning | Alexandros Xenos et.al. | 2404.07078 | link |
2024-04-10 | Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? | Mingyu Jin et.al. | 2404.07066 | link |
2024-04-10 | Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study | Alessandro Stolfo et.al. | 2404.07060 | null |
2024-04-10 | Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation | Elisa Sanchez-Bayona et.al. | 2404.07053 | link |
2024-04-10 | ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling | Ege Özsoy et.al. | 2404.07031 | null |
2024-04-10 | Improving Language Model Reasoning with Self-motivated Learning | Yunlong Feng et.al. | 2404.07017 | null |
2024-04-10 | A Mathematical Theory for Learning Semantic Languages by Abstract Learners | Kuo-Yu Liao et.al. | 2404.07009 | null |
2024-04-10 | WordDecipher: Enhancing Digital Workspace Communication with Explainable AI for Non-native English Speakers | Yuexi Chen et.al. | 2404.07005 | null |
2024-04-10 | LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models | Igor Tufanov et.al. | 2404.07004 | null |
2024-04-10 | Event Grounded Criminal Court View Generation with Cooperative (Large) Language Models | Linan Yue et.al. | 2404.07001 | link |
2024-04-10 | Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study | Hongru Du et.al. | 2404.06962 | link |
2024-04-09 | InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD | Xiaoyi Dong et.al. | 2404.06512 | link |
2024-04-09 | Can Feedback Enhance Semantic Grounding in Large Vision-Language Models? | Yuan-Hong Liao et.al. | 2404.06510 | null |
2024-04-09 | On the Effect of (Near) Duplicate Subwords in Language Modelling | Anton Schäfer et.al. | 2404.06508 | link |
2024-04-09 | Pitfalls of Conversational LLMs on News Debiasing | Ipek Baris Schlicht et.al. | 2404.06488 | null |
2024-04-10 | Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks | Chonghua Wang et.al. | 2404.06480 | link |
2024-04-10 | Text-Based Reasoning About Vector Graphics | Zhenhailong Wang et.al. | 2404.06479 | null |
2024-04-09 | Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models | Zihan Fang et.al. | 2404.06448 | null |
2024-04-09 | Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems | Kunal Garg et.al. | 2404.06413 | null |
2024-04-09 | AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents | Luca Gioacchini et.al. | 2404.06411 | link |
2024-04-09 | Take a Look at it! Rethinking How to Evaluate Language Model Jailbreak | Hongyu Cai et.al. | 2404.06407 | link |
2024-04-09 | Apprentices to Research Assistants: Advancing Research with Large Language Models | M. Namvarpour et.al. | 2404.06404 | null |
2024-04-09 | MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies | Shengding Hu et.al. | 2404.06395 | link |
2024-04-09 | MuPT: A Generative Symbolic Music Pretrained Transformer | Xingwei Qu et.al. | 2404.06393 | null |
2024-04-09 | Event Extraction in Basque: Typologically motivated Cross-Lingual Transfer-Learning Analysis | Mikel Zubillaga et.al. | 2404.06392 | null |
2024-04-09 | Latent Distance Guided Alignment Training for Large Language Models | Haotian Luo et.al. | 2404.06390 | null |
2024-04-09 | Model Generation from Requirements with LLMs: an Exploratory Study | Alessio Ferrari et.al. | 2404.06371 | null |
2024-04-09 | Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python | Valdecy Pereira et.al. | 2404.06370 | link |
2024-04-09 | VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs | Yi Gui et.al. | 2404.06369 | null |
2024-04-09 | ClinLinker: Medical Entity Linking of Clinical Concept Mentions in Spanish | Fernando Gallego et.al. | 2404.06367 | null |
2024-04-09 | Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero shot Medical Image Segmentation | Sidra Aleem et.al. | 2404.06362 | link |
2024-04-08 | MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Bo He et.al. | 2404.05726 | link |
2024-04-08 | Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs | Keen You et.al. | 2404.05719 | null |
2024-04-08 | Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding | Ahmad Idrissi-Yaghir et.al. | 2404.05694 | null |
2024-04-08 | Evaluating Mathematical Reasoning Beyond Accuracy | Shijie Xia et.al. | 2404.05692 | link |
2024-04-08 | Retrieval-Augmented Open-Vocabulary Object Detection | Jooyeon Kim et.al. | 2404.05687 | link |
2024-04-08 | MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | Kunpeng Song et.al. | 2404.05674 | link |
2024-04-08 | CoReS: Orchestrating the Dance of Reasoning and Segmentation | Xiaoyi Bao et.al. | 2404.05673 | null |
2024-04-08 | Fighting crime with Transformers: Empirical analysis of address parsing methods in payment data | Haitham Hammami et.al. | 2404.05632 | link |
2024-04-08 | LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking | Faren Yan et.al. | 2404.05624 | null |
2024-04-08 | MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning | Matteo Farina et.al. | 2404.05621 | link |
2024-04-08 | SpeechAlign: Aligning Speech Generation to Human Preferences | Dong Zhang et.al. | 2404.05600 | link |
2024-04-08 | MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering | Iñigo Alonso et.al. | 2404.05590 | null |
2024-04-08 | Enhancing Software Related Information Extraction with Generative Language Models through Single-Choice Question Answering | Wolfgang Otto et.al. | 2404.05587 | null |
2024-04-08 | Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model | Yue-Hua Han et.al. | 2404.05583 | null |
2024-04-08 | 360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System | Shen Gao et.al. | 2404.05569 | null |
2024-04-08 | Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models | Bowen Pan et.al. | 2404.05567 | null |
2024-04-08 | Chinese Sequence Labeling with Semi-Supervised Boundary-Aware Language Model Pre-training | Longhui Zhang et.al. | 2404.05560 | link |
2024-04-08 | Evaluating Interventional Reasoning Capabilities of Large Language Models | Tejas Kasetty et.al. | 2404.05545 | null |
2024-04-08 | OPSD: an Offensive Persian Social media Dataset and its baseline evaluations | Mehran Safayani et.al. | 2404.05540 | null |
2024-04-08 | Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data | Tim Baumgärtner et.al. | 2404.05530 | null |
2024-04-05 | Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2) | Michael Saxon et.al. | 2404.04251 | link |
2024-04-05 | Physical Property Understanding from Language-Embedded Feature Fields | Albert J. Zhai et.al. | 2404.04242 | null |
2024-04-05 | Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents | Harsh Kohli et.al. | 2404.04237 | null |
2024-04-05 | player2vec: A Language Modeling Approach to Understand Player Behavior in Games | Tianze Wang et.al. | 2404.04234 | null |
2024-04-05 | Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation | Ji-Jia Wu et.al. | 2404.04231 | link |
2024-04-05 | Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation | Tong Su et.al. | 2404.04212 | null |
2024-04-05 | Social Skill Training with Large Language Models | Diyi Yang et.al. | 2404.04204 | null |
2024-04-05 | Do Sentence Transformers Learn Quasi-Geospatial Concepts from General Text? | Ilya Ilyankou et.al. | 2404.04169 | null |
2024-04-05 | Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model | Xinrun Du et.al. | 2404.04167 | null |
2024-04-05 | Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval | João Coelho et.al. | 2404.04163 | null |
2024-04-05 | BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models | Jacek Wiland et.al. | 2404.04113 | link |
2024-04-05 | Large language models as oracles for instantiating ontologies with domain-specific knowledge | Giovanni Ciatto et.al. | 2404.04108 | link |
2024-04-05 | Robust Preference Optimization with Provable Noise Tolerance for LLMs | Xize Liang et.al. | 2404.04102 | null |
2024-04-05 | Label Propagation for Zero-shot Classification with Vision-Language Models | Vladan Stojnić et.al. | 2404.04072 | link |
2024-04-05 | Assessing the quality of information extraction | Filip Seitl et.al. | 2404.04068 | null |
2024-04-05 | CLUE: A Clinical Language Understanding Evaluation for LLMs | Amin Dada et.al. | 2404.04067 | link |
2024-04-05 | VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots | Akhil Padmanabha et.al. | 2404.04066 | null |
2024-04-05 | A Comparison of Methods for Evaluating Generative IR | Negar Arabzadeh et.al. | 2404.04044 | link |
2024-04-05 | Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer | Hele-Andra Kuulmets et.al. | 2404.04042 | null |
2024-04-05 | Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds | Annerose Eichel et.al. | 2404.04031 | null |
2024-04-04 | OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views | Francis Engelmann et.al. | 2404.03650 | null |
2024-04-04 | AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent | Hanyu Lai et.al. | 2404.03648 | link |
2024-04-04 | Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra | Darioush Kevian et.al. | 2404.03647 | null |
2024-04-04 | Locating and Editing Factual Associations in Mamba | Arnab Sen Sharma et.al. | 2404.03646 | link |
2024-04-04 | Training LLMs over Neurally Compressed Text | Brian Lester et.al. | 2404.03626 | null |
2024-04-04 | Standardizing Knowledge Engineering Practices with a Reference Architecture | Bradley P. Allen et.al. | 2404.03624 | null |
2024-04-04 | Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph | Marco Bronzini et.al. | 2404.03623 | null |
2024-04-04 | Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models | Wenshan Wu et.al. | 2404.03622 | null |
2024-04-04 | DeViDe: Faceted medical knowledge for improved medical vision-language pre-training | Haozhe Luo et.al. | 2404.03618 | null |
2024-04-04 | Sailor: Open Language Models for South-East Asia | Longxu Dou et.al. | 2404.03608 | link |
2024-04-04 | Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization | Aniruddha Nrusimha et.al. | 2404.03605 | link |
2024-04-04 | Evaluating LLMs at Detecting Errors in LLM Responses | Ryo Kamoi et.al. | 2404.03602 | link |
2024-04-04 | Intent Detection and Entity Extraction from BioMedical Literature | Ankan Mullick et.al. | 2404.03598 | link |
2024-04-04 | ReFT: Representation Finetuning for Language Models | Zhengxuan Wu et.al. | 2404.03592 | link |
2024-04-04 | SemGrasp: Semantic Grasp Generation via Language Aligned Discretization | Kailin Li et.al. | 2404.03590 | null |
2024-04-04 | Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models | Yantao Liu et.al. | 2404.03577 | link |
2024-04-04 | Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity | Jake Varley et.al. | 2404.03570 | null |
2024-04-04 | Personalized LLM Response Generation with Parameterized Memory Injection | Kai Zhang et.al. | 2404.03565 | null |
2024-04-04 | Select and Summarize: Scene Saliency for Movie Script Summarization | Rohit Saxena et.al. | 2404.03561 | null |
2024-04-04 | How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes | Harmon Bhasin et.al. | 2404.03558 | link |
2024-04-03 | ALOHa: A New Measure for Hallucination in Captioning Models | Suzanne Petryk et.al. | 2404.02904 | null |
2024-04-03 | MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment | Duygu Ceylan et.al. | 2404.02899 | null |
2024-04-03 | ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline | Yifan Xu et.al. | 2404.02893 | link |
2024-04-03 | MODNO: Multi Operator Learning With Distributed Neural Operators | Zecheng Zhang et.al. | 2404.02892 | null |
2024-04-03 | Linear Attention Sequence Parallelism | Weigao Sun et.al. | 2404.02882 | link |
2024-04-03 | Integrating Explanations in Learning LTL Specifications from Demonstrations | Ashutosh Gupta et.al. | 2404.02872 | null |
2024-04-03 | Toward Inference-optimal Mixture-of-Expert Large Language Models | Longfei Yun et.al. | 2404.02852 | null |
2024-04-03 | I-Design: Personalized LLM Interior Designer | Ata Çelen et.al. | 2404.02838 | null |
2024-04-03 | Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models | Wanyun Cui et.al. | 2404.02837 | null |
2024-04-03 | Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison | Maxime Bouthors et.al. | 2404.02835 | null |
2024-04-03 | Empowering Biomedical Discovery with AI Agents | Shanghua Gao et.al. | 2404.02831 | null |
2024-04-03 | BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models | Qijun Luo et.al. | 2404.02827 | link |
2024-04-03 | Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models | Haoran Sun et.al. | 2404.02823 | link |
2024-04-03 | A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches | Zhigen Zhao et.al. | 2404.02817 | null |
2024-04-03 | The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers | Hussein Mozannar et.al. | 2404.02806 | link |
2024-04-03 | Efficient Multi-Vector Dense Retrieval Using Bit Vectors | Franco Maria Nardini et.al. | 2404.02805 | link |
2024-04-03 | AI and personalized learning: bridging the gap with modern educational goals | Kristjan-Julius Laak et.al. | 2404.02798 | null |
2024-04-03 | CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech | Jaehyeon Kim et.al. | 2404.02781 | null |
2024-04-03 | FPT: Feature Prompt Tuning for Few-shot Readability Assessment | Ziyang Wang et.al. | 2404.02772 | link |
2024-04-03 | DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement | Hao Wu et.al. | 2404.02755 | null |
2024-04-02 | Segment Any 3D Object with Language | Seungjun Lee et.al. | 2404.02157 | null |
2024-04-02 | Iterated Learning Improves Compositionality in Large Vision-Language Models | Chenhao Zheng et.al. | 2404.02145 | null |
2024-04-02 | Topic-based Watermarks for LLM-Generated Text | Alexander Nemecek et.al. | 2404.02138 | null |
2024-04-02 | ViTamin: Designing Scalable Vision Models in the Vision-Language Era | Jieneng Chen et.al. | 2404.02132 | link |
2024-04-02 | FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning | Joel Niklaus et.al. | 2404.02127 | link |
2024-04-02 | Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models | Wanyong Feng et.al. | 2404.02124 | link |
2024-04-02 | GINopic: Topic Modeling with Graph Isomorphism Network | Suman Adhya et.al. | 2404.02115 | link |
2024-04-02 | CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems | Sara Rosenthal et.al. | 2404.02103 | link |
2024-04-02 | Advancing LLM Reasoning Generalists with Preference Trees | Lifan Yuan et.al. | 2404.02078 | link |
2024-04-02 | Red-Teaming Segment Anything Model | Krzysztof Jankowski et.al. | 2404.02067 | link |
2024-04-02 | Digital Forgetting in Large Language Models: A Survey of Unlearning Methods | Alberto Blanco-Justicia et.al. | 2404.02062 | null |
2024-04-02 | Long-context LLMs Struggle with Long In-context Learning | Tianle Li et.al. | 2404.02060 | link |
2024-04-02 | IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT | Junchen Fu et.al. | 2404.02059 | link |
2024-04-02 | Deconstructing In-Context Learning: Understanding Prompts via Corruption | Namrata Shivagunde et.al. | 2404.02054 | link |
2024-04-02 | A Survey on Large Language Model-Based Game Agents | Sihao Hu et.al. | 2404.02039 | link |
2024-04-02 | MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages | Daryna Dementieva et.al. | 2404.02037 | null |
2024-04-02 | Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts | Zhuo Chen et.al. | 2404.02022 | null |
2024-04-02 | Large Language Models for Orchestrating Bimanual Robots | Kun Chu et.al. | 2404.02018 | null |
2024-04-02 | MuxServe: Flexible Multiplexing for Efficient Multiple LLM Serving | Jiangfei Duan et.al. | 2404.02015 | null |
2024-04-02 | Dissecting Paraphrases: The Impact of Prompt Syntax and supplementary Information on Knowledge Retrieval from Pretrained Language Models | Stephan Linzbach et.al. | 2404.01992 | null |
2024-03-29 | Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models | Atsuyuki Miyai et.al. | 2403.20331 | link |
2024-03-29 | Are We on the Right Way for Evaluating Large Vision-Language Models? | Lin Chen et.al. | 2403.20330 | link |
2024-03-29 | ReALM: Reference Resolution As Language Modeling | Joel Ruben Antony Moniz et.al. | 2403.20329 | null |
2024-03-29 | Gecko: Versatile Text Embeddings Distilled from Large Language Models | Jinhyuk Lee et.al. | 2403.20327 | null |
2024-03-29 | Convolutional Prompting meets Language Models for Continual Learning | Anurag Roy et.al. | 2403.20317 | null |
2024-03-29 | Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations | Jaisidh Singh et.al. | 2403.20312 | link |
2024-03-29 | Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference | Jovan Stojkovic et.al. | 2403.20306 | null |
2024-03-29 | Can LLMs Correct Physicians, Yet? Investigating Effective Interaction Methods in the Medical Domain | Burcu Sayin et.al. | 2403.20288 | link |
2024-03-29 | LUQ: Long-text Uncertainty Quantification for LLMs | Caiqi Zhang et.al. | 2403.20279 | null |
2024-04-01 | Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want | Weifeng Lin et.al. | 2403.20271 | link |
2024-03-29 | Latxa: An Open Language Model and Evaluation Suite for Basque | Julen Etxaniz et.al. | 2403.20266 | link |
2024-03-29 | ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models | Thibaut Thonet et.al. | 2403.20262 | null |
2024-03-29 | MedCLIP-SAM: Bridging Text and Image Towards Universal Medical Image Segmentation | Taha Koleilat et.al. | 2403.20253 | null |
2024-03-29 | Using LLMs to Model the Beliefs and Preferences of Targeted Populations | Keiichi Namikoshi et.al. | 2403.20252 | null |
2024-03-29 | Long-Tailed Anomaly Detection with Learnable Class Names | Chih-Hui Ho et.al. | 2403.20236 | null |
2024-03-29 | H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model | Chao Pang et.al. | 2403.20213 | link |
2024-03-29 | Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science | Yazheng Yang et.al. | 2403.20208 | null |
2024-03-29 | The Future of Combating Rumors? Retrieval, Discrimination, and Generation | Junhao Xu et.al. | 2403.20204 | null |
2024-03-29 | ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Capability for Large Vision-Language Models | Shuo Liu et.al. | 2403.20194 | null |
2024-03-29 | HARMamba: Efficient Wearable Sensor Human Activity Recognition Based on Bidirectional Selective SSM | Shuangjian Li et.al. | 2403.20183 | null |
2024-03-28 | RSMamba: Remote Sensing Image Classification with State Space Model | Keyan Chen et.al. | 2403.19654 | link |
2024-03-28 | InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction | Sirui Xu et.al. | 2403.19652 | null |
2024-03-28 | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | Kai Zhang et.al. | 2403.19651 | null |
2024-03-28 | Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models | Samuel Marks et.al. | 2403.19647 | link |
2024-03-28 | Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning | Chenyang Liu et.al. | 2403.19646 | link |
2024-03-28 | Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models | Yucheng Shi et.al. | 2403.19631 | null |
2024-03-28 | RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents | Zeren Chen et.al. | 2403.19622 | null |
2024-03-28 | SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects | Avinash Ummadisingu et.al. | 2403.19607 | null |
2024-03-28 | Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation | Zhongliang Zhou et.al. | 2403.19584 | null |
2024-03-28 | Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics | Norman Di Palo et.al. | 2403.19578 | null |
2024-03-28 | WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models | Piotr Molenda et.al. | 2403.19548 | null |
2024-03-28 | Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models | Ang Lv et.al. | 2403.19521 | link |
2024-03-28 | Improving Clinical NLP Performance through Language Model-Generated Synthetic Clinical Data | Shan Chen et.al. | 2403.19511 | link |
2024-03-28 | LLMs as Academic Reading Companions: Extending HCI Through Synthetic Personae | Celia Chen et.al. | 2403.19506 | null |
2024-03-28 | Evolving Assembly Code in an Adversarial Environment | Irina Maliukov et.al. | 2403.19489 | null |
2024-03-28 | JDocQA: Japanese Document Question Answering Dataset for Generative Language Models | Eri Onami et.al. | 2403.19454 | link |
2024-03-28 | Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model | Qi Gou et.al. | 2403.19443 | null |
2024-03-28 | OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion | Xinyu Zhan et.al. | 2403.19417 | null |
2024-03-28 | BP4ER: Bootstrap Prompting for Explicit Reasoning in Medical Dialogue Generation | Yuhong He et.al. | 2403.19414 | null |
2024-03-28 | Checkpoint Merging via Bayesian Optimization in LLM Pretraining | Deyuan Liu et.al. | 2403.19390 | null |
2024-03-27 | Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | Yanwei Li et.al. | 2403.18814 | link |
2024-03-27 | ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation | Suraj Patni et.al. | 2403.18807 | link |
2024-03-27 | Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation | Mateusz Klimaszewski et.al. | 2403.18804 | link |
2024-03-27 | Projective Methods for Mitigating Gender Bias in Pre-trained Language Models | Hillary Dawkins et.al. | 2403.18803 | link |
2024-03-27 | Long-form factuality in large language models | Jerry Wei et.al. | 2403.18802 | link |
2024-03-27 | Towards a World-English Language Model for On-Device Virtual Assistants | Rricha Jalota et.al. | 2403.18783 | null |
2024-03-27 | 3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation | Ehsan Latif et.al. | 2403.18778 | null |
2024-03-27 | ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object | Chenshuang Zhang et.al. | 2403.18775 | link |
2024-03-27 | CheckEval: Robust Evaluation Framework using Large Language Model via Checklist | Yukyung Lee et.al. | 2403.18771 | null |
2024-03-27 | MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model | Yike Wu et.al. | 2403.18760 | link |
2024-03-27 | CYCLE: Learning to Self-Refine the Code Generation | Yangruibo Ding et.al. | 2403.18746 | link |
2024-03-27 | Understanding the Learning Dynamics of Alignment with Human Feedback | Shawn Im et.al. | 2403.18742 | link |
2024-03-27 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations | Ehsan Latif et.al. | 2403.18721 | null |
2024-03-27 | Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding | Xintong Wang et.al. | 2403.18715 | null |
2024-03-27 | The Invalsi Benchmark: measuring Language Models Mathematical and Language understanding in Italian | Andrea Esuli et.al. | 2403.18697 | null |
2024-03-27 | NL-ITI: Optimizing Probing and Intervention for Improvement of ITI Method | Jakub Hoscilowicz et.al. | 2403.18680 | link |
2024-03-27 | An Exploratory Study on Upper-Level Computing Students' Use of Large Language Models as Tools in a Semester-Long Project | Ben Arie Tanay et.al. | 2403.18679 | null |
2024-03-27 | SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens | Chengbo Liu et.al. | 2403.18647 | link |
2024-03-27 | To Recommend or Not: Recommendability Identification in Conversations with Pre-trained Language Models | Zhefan Wang et.al. | 2403.18628 | link |
2024-03-27 | Vulnerability Detection with Code Language Models: How Far Are We? | Yangruibo Ding et.al. | 2403.18624 | link |
2024-03-26 | OmniVid: A Generative Framework for Universal Video Understanding | Junke Wang et.al. | 2403.17935 | link |
2024-03-26 | Track Everything Everywhere Fast and Robustly | Yunzhou Song et.al. | 2403.17931 | null |
2024-03-26 | MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution | Wei Tao et.al. | 2403.17927 | null |
2024-03-26 | LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Rui Pan et.al. | 2403.17919 | link |
2024-03-26 | Large scale paired antibody language models | Henry Kenlay et.al. | 2403.17889 | null |
2024-03-26 | Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation | Carlos Gomes et.al. | 2403.17886 | null |
2024-03-26 | MIND Your Language: A Multilingual Dataset for Cross-lingual News Recommendation | Andreea Iana et.al. | 2403.17876 | link |
2024-03-26 | Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach | Andrea Ferrario et.al. | 2403.17873 | null |
2024-03-26 | Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications | Philip Lippmann et.al. | 2403.17860 | null |
2024-03-26 | ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages | Bhawna Piryani et.al. | 2403.17859 | link |
2024-03-26 | Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs | David R. Mortensen et.al. | 2403.17856 | null |
2024-03-26 | ArabicaQA: A Comprehensive Dataset for Arabic Question Answering | Abdelrahman Abdallah et.al. | 2403.17848 | link |
2024-03-26 | Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation | Abdelrhman Werby et.al. | 2403.17846 | null |
2024-03-26 | Mechanistic Design and Scaling of Hybrid Architectures | Michael Poli et.al. | 2403.17844 | null |
2024-03-26 | ReMamber: Referring Image Segmentation with Mamba Twister | Yuhuan Yang et.al. | 2403.17839 | null |
2024-03-26 | A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities | Ibrahim Ethem Hamamci et.al. | 2403.17834 | link |
2024-03-26 | Assessment of Multimodal Large Language Models in Alignment with Human Values | Zhelun Shi et.al. | 2403.17830 | null |
2024-03-26 | Accelerating Radio Spectrum Regulation Workflows with Large Language Models (LLMs) | Amir Ghasemi et.al. | 2403.17819 | null |
2024-03-26 | Graph Language Model (GLM): A new graph-based approach to detect social instabilities | Wallyson Lemes de Oliveira et.al. | 2403.17816 | null |
2024-03-26 | Are Compressed Language Models Less Subgroup Robust? | Leonidas Gee et.al. | 2403.17811 | link |
2024-03-25 | Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making | Shuai Ma et.al. | 2403.16812 | null |
2024-03-25 | An LLM-Based Digital Twin for Optimizing Human-in-the Loop Systems | Hanqing Yang et.al. | 2403.16809 | link |
2024-03-25 | Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback | Zhangqian Bi et.al. | 2403.16792 | null |
2024-03-25 | All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification | Deepak Narayan Gadde et.al. | 2403.16750 | null |
2024-03-25 | A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models | Nils Ingelhag et.al. | 2403.16730 | null |
2024-03-25 | ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search | Zehan Li et.al. | 2403.16702 | link |
2024-03-25 | Synapse: Learning Preferential Concepts from Visual Demonstrations | Sadanand Modak et.al. | 2403.16689 | null |
2024-03-25 | Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography | Jiayue Zhang et.al. | 2403.16687 | null |
2024-03-25 | RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict | Yirong Zeng et.al. | 2403.16662 | link |
2024-03-25 | Grammatical vs Spelling Error Correction: An Investigation into the Responsiveness of Transformer-based Language Models using BART and MarianMT | Rohit Raju et.al. | 2403.16655 | null |
2024-03-25 | CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment | Feiteng Fang et.al. | 2403.16649 | link |
2024-03-25 | Virtual Co-Pilot: Multimodal Large Language Model-enabled Quick-access Procedures for Single Pilot Operations | Fan Li et.al. | 2403.16645 | null |
2024-03-25 | Semantically Enriched Cross-Lingual Sentence Embeddings for Crisis-related Social Media Texts | Rabindra Lamsal et.al. | 2403.16614 | null |
2024-03-25 | Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units | Biswesh Mohapatra et.al. | 2403.16609 | null |
2024-03-25 | TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques | Ashok Urlana et.al. | 2403.16592 | null |
2024-03-25 | Can Large Language Models (or Humans) Distill Text? | Nicolas Audinet de Pieuchon et.al. | 2403.16584 | null |
2024-03-25 | NSINA: A News Corpus for Sinhala | Hansi Hettiarachchi et.al. | 2403.16571 | link |
2024-03-25 | Elysium: Exploring Object-level Perception in Videos via MLLM | Han Wang et.al. | 2403.16558 | link |
2024-03-25 | DOrA: 3D Visual Grounding with Order-Aware Referring | Tung-Yu Wu et.al. | 2403.16539 | null |
2024-03-25 | Open-Set Recognition in the Age of Vision-Language Models | Dimity Miller et.al. | 2403.16528 | null |
2024-03-25 | Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art | Neeloy Chakraborty et.al. | 2403.16527 | null |
2024-03-25 | Harnessing the power of LLMs for normative reasoning in MASs | Bastin Tony Roy Savarimuthu et.al. | 2403.16524 | null |
2024-03-25 | Norm Violation Detection in Multi-Agent Systems using Large Language Models: A Pilot Study | Shawn He et.al. | 2403.16517 | null |
2024-03-25 | Linguistically Differentiating Acts and Recalls of Racial Microaggressions on Social Media | Uma Sushmitha Gunturi et.al. | 2403.16514 | null |
2024-03-22 | LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models | Yuzhang Shang et.al. | 2403.15388 | null |
2024-03-22 | Long-CLIP: Unlocking the Long-Text Capability of CLIP | Beichen Zhang et.al. | 2403.15378 | link |
2024-03-22 | InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding | Yi Wang et.al. | 2403.15377 | link |
2024-03-22 | Can large language models explore in-context? | Akshay Krishnamurthy et.al. | 2403.15371 | null |
2024-03-22 | CoLLEGe: Concept Embedding Generation for Large Language Models | Ryan Teehan et.al. | 2403.15362 | null |
2024-03-22 | Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities | Zhitong Xiong et.al. | 2403.15356 | link |
2024-03-22 | Controlled Training Data Generation with Diffusion Models | Teresa Yeo et.al. | 2403.15309 | null |
2024-03-22 | Sphere Neural-Networks for Rational Reasoning | Tiansi Dong et.al. | 2403.15297 | null |
2024-03-22 | Measuring Gender and Racial Biases in Large Language Models | Jiafu An et.al. | 2403.15281 | null |
2024-03-22 | Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review | Jinge Wang et.al. | 2403.15274 | null |
2024-03-22 | Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs | Xiaobin Zhang et.al. | 2403.15273 | null |
2024-03-22 | Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models | Huanxuan Liao et.al. | 2403.15268 | link |
2024-03-22 | AI Exposure and Strategic Positioning on an Online Work Platform | Shun Yiu et.al. | 2403.15262 | null |
2024-03-22 | FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions | Orion Weller et.al. | 2403.15246 | link |
2024-03-22 | Shadow Generation for Composite Image Using Diffusion model | Qingyang Liu et.al. | 2403.15234 | link |
2024-03-22 | An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets | Jonathan Katzy et.al. | 2403.15230 | link |
2024-03-22 | Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models | Qiong Wu et.al. | 2403.15226 | null |
2024-03-22 | Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations | Pranav Kulkarni et.al. | 2403.15218 | link |
2024-03-22 | InstaSynth: Opportunities and Challenges in Generating Synthetic Instagram Data with ChatGPT for Sponsored Content Detection | Thales Bertaglia et.al. | 2403.15214 | link |
2024-03-22 | MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection | Taeheon Kim et.al. | 2403.15209 | null |
2024-03-21 | MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? | Renrui Zhang et.al. | 2403.14624 | null |
2024-03-21 | Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey | Zeyu Han et.al. | 2403.14608 | null |
2024-03-21 | MyVLM: Personalizing VLMs for User-Specific Queries | Yuval Alaluf et.al. | 2403.14599 | null |
2024-03-21 | ReAct Meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training | Zonghan Yang et.al. | 2403.14589 | null |
2024-03-21 | Large Language Models for Multi-Choice Question Classification of Medical Subjects | Víctor Ponce-López et.al. | 2403.14582 | null |
2024-03-21 | RAmBLA: A Framework for Evaluating the Reliability of LLMs as Assistants in the Biomedical Domain | William James Bolton et.al. | 2403.14578 | link |
2024-03-21 | A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science | Clayton Cohn et.al. | 2403.14565 | null |
2024-03-21 | The Era of Semantic Decoding | Maxime Peyrard et.al. | 2403.14562 | null |
2024-03-21 | Lexicon-Level Contrastive Visual-Grounding Improves Language Modeling | Chengxu Zhuang et.al. | 2403.14551 | null |
2024-03-21 | EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling | Shimao Zhang et.al. | 2403.14541 | link |
2024-03-21 | Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference | Han Zhao et.al. | 2403.14520 | null |
2024-03-21 | The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs) | Joschka Haltaufderheide et.al. | 2403.14473 | null |
2024-03-21 | Detoxifying Large Language Models via Knowledge Editing | Mengru Wang et.al. | 2403.14472 | link |
2024-03-21 | ChatGPT Alternative Solutions: Large Language Models Survey | Hanieh Alipour et.al. | 2403.14469 | null |
2024-03-21 | Recourse for reclamation: Chatting with generative language models | Jennifer Chien et.al. | 2403.14467 | null |
2024-03-21 | Towards Single-System Illusion in Software-Defined Vehicles -- Automated, AI-Powered Workflow | Krzysztof Lebioda et.al. | 2403.14460 | null |
2024-03-21 | Multi-Level Explanations for Generative Language Models | Lucas Monteiro Paes et.al. | 2403.14459 | null |
2024-03-21 | gTBLS: Generating Tables from Text by Conditional Question Answering | Anirudh Sundar et.al. | 2403.14457 | null |
2024-03-21 | Language Models Can Reduce Asymmetry in Information Markets | Nasim Rahaman et.al. | 2403.14443 | null |
2024-03-21 | A Multimodal Approach to Device-Directed Speech Detection with Large Language Models | Dominik Wagner et.al. | 2403.14438 | null |
2024-03-20 | RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition | Ziyu Liu et.al. | 2403.13805 | link |
2024-03-20 | Learning from Models and Data for Visual Grounding | Ruozhen He et.al. | 2403.13804 | null |
2024-03-20 | Reverse Training to Nurse the Reversal Curse | Olga Golovneva et.al. | 2403.13799 | null |
2024-03-20 | Bridge the Modality and Capacity Gaps in Vision-Language Model Selection | Chao Yi et.al. | 2403.13797 | null |
2024-03-20 | RewardBench: Evaluating Reward Models for Language Modeling | Nathan Lambert et.al. | 2403.13787 | link |
2024-03-20 | Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts | Guangzeng Han et.al. | 2403.13786 | link |
2024-03-20 | Information-Theoretic Distillation for Reference-less Summarization | Jaehun Jung et.al. | 2403.13780 | null |
2024-03-20 | Embedding Pose Graph, Enabling 3D Foundation Model Capabilities with a Compact Representation | Hugues Thomas et.al. | 2403.13777 | null |
2024-03-20 | Describe-and-Dissect: Interpreting Neurons in Vision Networks with Language Models | Nicholas Bai et.al. | 2403.13771 | link |
2024-03-20 | Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model | Diwei Wang et.al. | 2403.13756 | null |
2024-03-20 | Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement | Catherine Arnett et.al. | 2403.13754 | null |
2024-03-20 | EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation | Atnafu Lambebo Tonja et.al. | 2403.13737 | null |
2024-03-20 | Large Language Models meet Network Slicing Management and Orchestration | Abdulhalim Dandoush et.al. | 2403.13721 | null |
2024-03-20 | SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning | Hongjun Wang et.al. | 2403.13684 | null |
2024-03-20 | PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents | Mitodru Niyogi et.al. | 2403.13681 | null |
2024-03-20 | RoleInteract: Evaluating the Social Interaction of Role-Playing Agents | Hongzhan Chen et.al. | 2403.13679 | link |
2024-03-20 | Grounding Spatial Relations in Text-Only Language Models | Gorka Azkune et.al. | 2403.13666 | link |
2024-03-20 | Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese | Meet Doshi et.al. | 2403.13638 | null |
2024-03-20 | VL-Mamba: Exploring State Space Models for Multimodal Learning | Yanyuan Qiao et.al. | 2403.13600 | null |
2024-03-20 | No more optimization rules: LLM-enabled policy-based multi-modal query optimizer (version 1) | Yifan Wang et.al. | 2403.13597 | null |
2024-03-19 | LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Zhuoshi Pan et.al. | 2403.12968 | link |
2024-03-19 | Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models | Zuyan Liu et.al. | 2403.12966 | link |
2024-03-19 | Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models | Ce Zhang et.al. | 2403.12964 | link |
2024-03-19 | Dated Data: Tracing Knowledge Cutoffs in Large Language Models | Jeffrey Cheng et.al. | 2403.12958 | null |
2024-03-19 | Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models | Elaine Sui et.al. | 2403.12952 | link |
2024-03-19 | Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models | Joana Ribeiro de Faria et.al. | 2403.12936 | null |
2024-03-19 | Segment Anything for comprehensive analysis of grapevine cluster architecture and berry properties | Efrain Torres-Lomas et.al. | 2403.12935 | null |
2024-03-19 | Rapid AIdeation: Generating Ideas With the Self and in Collaboration With Large Language Models | Gionnieve Lim et.al. | 2403.12928 | null |
2024-03-19 | Supporting Energy Policy Research with Large Language Models | Grant Buster et.al. | 2403.12924 | null |
2024-03-19 | Contextual AD Narration with Interleaved Multimodal Sequence | Hanlin Wang et.al. | 2403.12922 | null |
2024-03-19 | Semantic Layering in Room Segmentation via LLMs | Taehyeon Kim et.al. | 2403.12920 | null |
2024-03-19 | Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts | Sai Ashish Somayajula et.al. | 2403.12918 | link |
2024-03-19 | Yell At Your Robot: Improving On-the-Fly from Language Corrections | Lucy Xiaoyang Shi et.al. | 2403.12910 | null |
2024-03-19 | Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference | Baolin Li et.al. | 2403.12900 | null |
2024-03-19 | mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding | Anwen Hu et.al. | 2403.12895 | link |
2024-03-20 | MEDBind: Unifying Language and Multimodal Medical Data Embeddings | Yuan Gao et.al. | 2403.12894 | null |
2024-03-19 | HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning | Fucai Ke et.al. | 2403.12884 | null |
2024-03-19 | Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models | Zehui Chen et.al. | 2403.12881 | link |
2024-03-19 | Epistemology of Language Models: Do Language Models Have Holistic Knowledge? | Minsu Kim et.al. | 2403.12862 | null |
2024-03-19 | RASP: A Drone-based Reconfigurable Actuation and Sensing Platform Towards Ambient Intelligent Systems | Minghui Zhao et.al. | 2403.12853 | null |
2024-03-18 | Modality-Agnostic fMRI Decoding of Vision and Language | Mitja Nikolaus et.al. | 2403.11771 | null |
2024-03-18 | Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs | M. Jehanzeb Mirza et.al. | 2403.11755 | link |
2024-03-18 | Revisiting The Classics: A Study on Identifying and Rectifying Gender Stereotypes in Rhymes and Poems | Aditya Narayan Sankaran et.al. | 2403.11752 | null |
2024-03-18 | Embedded Named Entity Recognition using Probing Classifiers | Nicholas Popovič et.al. | 2403.11747 | null |
2024-03-18 | TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models | Lisa Weijler et.al. | 2403.11691 | null |
2024-03-18 | HDLdebugger: Streamlining HDL debugging with Large Language Models | Xufeng Yao et.al. | 2403.11671 | null |
2024-03-18 | Prioritized Semantic Learning for Zero-shot Instance Navigation | Xander Sun et.al. | 2403.11650 | null |
2024-03-18 | Arc2Face: A Foundation Model of Human Faces | Foivos Paraperas Papantoniou et.al. | 2403.11641 | link |
2024-03-18 | Compositional Kronecker Context Optimization for Vision-Language Models | Kun Ding et.al. | 2403.11631 | null |
2024-03-18 | Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model | Haoyun Xu et.al. | 2403.11621 | null |
2024-03-18 | CRS-Diff: Controllable Generative Remote Sensing Foundation Model | Datao Tang et.al. | 2403.11614 | link |
2024-03-18 | Linguacodus: A Synergistic Framework for Transformative Code Generation in Machine Learning Pipelines | Ekaterina Trofimova et.al. | 2403.11585 | null |
2024-03-18 | Reinforcement Learning with Token-level Feedback for Controllable Text Generation | Wendi Li et.al. | 2403.11558 | link |
2024-03-18 | LLM^3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning | Shu Wang et.al. | 2403.11552 | link |
2024-03-18 | Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | Jiazuo Yu et.al. | 2403.11549 | link |
2024-03-18 | DEE: Dual-stage Explainable Evaluation Method for Text Generation | Shenyu Zhang et.al. | 2403.11509 | null |
2024-03-18 | Do CLIPs Always Generalize Better than ImageNet Models? | Qizhou Wang et.al. | 2403.11497 | null |
2024-03-18 | VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding | Yue Fan et.al. | 2403.11481 | null |
2024-03-18 | HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models | Huy Nghiem et.al. | 2403.11456 | link |
2024-03-18 | Zero-shot Compound Expression Recognition with Visual Language Model at the 6th ABAW Challenge | Jiahe Wang et.al. | 2403.11450 | null |
2024-03-18 | LLM Guided Evolution - The Automation of Models Advancing Models | Clint Morris et.al. | 2403.11446 | null |
2024-03-18 | StyleChat: Learning Recitation-Augmented Memory in LLMs for Stylized Dialogue Generation | Jinpeng Li et.al. | 2403.11439 | null |
2024-03-18 | InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with Instructions | Yifan Wang et.al. | 2403.11435 | null |
2024-03-18 | A Novel Paradigm Boosting Translation Capabilities of Large Language Models | Jiaxin Guo et.al. | 2403.11430 | null |
2024-03-15 | VideoAgent: Long-form Video Understanding with Large Language Model as Agent | Xiaohan Wang et.al. | 2403.10517 | null |
2024-03-15 | Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization | Ratnadira Widyasari et.al. | 2403.10507 | null |
2024-03-15 | ATOM: Asynchronous Training of Massive Models for Deep Learning in a Decentralized Environment | Xiaofeng Wu et.al. | 2403.10504 | null |
2024-03-15 | Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study | Chenguang Wang et.al. | 2403.10499 | link |
2024-03-15 | Reconfigurable Robot Identification from Motion Data | Yuhang Hu et.al. | 2403.10496 | null |
2024-03-15 | Can a GPT4-Powered AI Agent Be a Good Enough Performance Attribution Analyst? | Bruno de Melo et.al. | 2403.10482 | null |
2024-03-15 | Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases | Jiarui Li et.al. | 2403.10446 | link |
2024-03-15 | Optimal Block-Level Draft Verification for Accelerating Speculative Decoding | Ziteng Sun et.al. | 2403.10444 | null |
2024-03-15 | Using an LLM to Turn Sign Spottings into Spoken Language Sentences | Ozge Mercanoglu Sincan et.al. | 2403.10434 | null |
2024-03-15 | SocialGenPod: Privacy-Friendly Generative AI Social Web Applications with Decentralised Personal Data Stores | Vidminas Vizgirda et.al. | 2403.10408 | link |
2024-03-15 | A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE | Hervé Déjean et.al. | 2403.10407 | null |
2024-03-15 | Monotonic Representation of Numeric Properties in Language Models | Benjamin Heinzerling et.al. | 2403.10381 | link |
2024-03-15 | EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models | Rocktim Jyoti Das et.al. | 2403.10378 | link |
2024-03-15 | TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale | Pengcheng Jiang et.al. | 2403.10351 | null |
2024-03-15 | Investigating grammatical abstraction in language models using few-shot learning of novel noun gender | Priyanka Sukumaran et.al. | 2403.10338 | null |
2024-03-15 | CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model | Shang-Hsuan Chiang et.al. | 2403.10326 | link |
2024-03-15 | NetBench: A Large-Scale and Comprehensive Network Traffic Benchmark Dataset for Foundation Models | Chen Qian et.al. | 2403.10319 | link |
2024-03-15 | Uni-SMART: Universal Science Multimodal Analysis and Research Transformer | Hengxing Cai et.al. | 2403.10301 | null |
2024-03-15 | Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models | Tian Meng et.al. | 2403.10287 | null |
2024-03-15 | Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning | Shang-Hsuan Chiang et.al. | 2403.10281 | link |
2024-03-14 | GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping | Yuhang Zheng et.al. | 2403.09637 | link |
2024-03-14 | Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference | Piotr Nawrot et.al. | 2403.09636 | null |
2024-03-14 | Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models | Akhil Kedia et.al. | 2403.09635 | link |
2024-03-14 | OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning | Lingyi Hong et.al. | 2403.09634 | null |
2024-03-14 | 3D-VLA: A 3D Vision-Language-Action Generative World Model | Haoyu Zhen et.al. | 2403.09631 | null |
2024-03-14 | Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking | Eric Zelikman et.al. | 2403.09629 | link |
2024-03-14 | Explore In-Context Segmentation via Latent Diffusion Models | Chaoyang Wang et.al. | 2403.09616 | null |
2024-03-14 | MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training | Brandon McKinzie et.al. | 2403.09611 | null |
2024-03-14 | Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey | Xiaoyu Liu et.al. | 2403.09606 | null |
2024-03-14 | Logical Discrete Graphical Models Must Supplement Large Language Models for Information Synthesis | Gregory Coppola et.al. | 2403.09599 | null |
2024-03-14 | Renovating Names in Open-Vocabulary Segmentation Benchmarks | Haiwen Huang et.al. | 2403.09593 | null |
2024-03-14 | ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models | Runyu Ma et.al. | 2403.09583 | null |
2024-03-14 | Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation | Yunhao Gou et.al. | 2403.09572 | null |
2024-03-14 | Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models | Laura Fernández-Becerra et.al. | 2403.09567 | null |
2024-03-14 | Welcome Your New AI Teammate: On Safety Analysis by Leashing Large Language Models | Ali Nouri et.al. | 2403.09565 | null |
2024-03-14 | PreCurious: How Innocent Pre-Trained Language Models Turn into Privacy Traps | Ruixuan Liu et.al. | 2403.09562 | null |
2024-03-14 | Less is More: Data Value Estimation for Visual Instruction Tuning | Zikang Liu et.al. | 2403.09559 | null |
2024-03-15 | Logits of API-Protected LLMs Leak Proprietary Information | Matthew Finlayson et.al. | 2403.09539 | null |
2024-03-14 | VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding | Chris Kelly et.al. | 2403.09530 | null |
2024-03-15 | WavCraft: Audio Editing and Generation with Natural Language Prompts | Jinhua Liang et.al. | 2403.09527 | link |
2024-03-13 | Simple and Scalable Strategies to Continually Pre-train Large Language Models | Adam Ibrahim et.al. | 2403.08763 | link |
2024-03-13 | Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework | Jingling Li et.al. | 2403.08743 | null |
2024-03-13 | The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models | Carlo Nicolini et.al. | 2403.08739 | null |
2024-03-13 | ILCiteR: Evidence-grounded Interpretable Local Citation Recommendation | Sayar Ghosh Roy et.al. | 2403.08737 | link |
2024-03-13 | Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization | Renjie Pi et.al. | 2403.08730 | null |
2024-03-14 | SOTOPIA- | Ruiyi Wang et.al. | 2403.08715 | link |
2024-03-13 | Review of Generative AI Methods in Cybersecurity | Yagmur Yigit et.al. | 2403.08701 | null |
2024-03-13 | TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning | Shangding Gu et.al. | 2403.08694 | null |
2024-03-13 | Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages | Rik van Noord et.al. | 2403.08693 | null |
2024-03-13 | Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records | Erlend Frayling et.al. | 2403.08664 | null |
2024-03-13 | Self-Supervised Learning for Covariance Estimation | Tzvi Diskin et.al. | 2403.08662 | null |
2024-03-13 | Human Alignment of Large Language Models through Online Preference Optimisation | Daniele Calandriello et.al. | 2403.08635 | null |
2024-03-13 | MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models | Subash Neupane et.al. | 2403.08607 | null |
2024-03-13 | Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation | Daniel Honerkamp et.al. | 2403.08605 | link |
2024-03-13 | DevBench: A Comprehensive Benchmark for Software Development | Bowen Li et.al. | 2403.08604 | link |
2024-03-13 | Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments | Sitao Cheng et.al. | 2403.08593 | null |
2024-03-13 | Non-discrimination Criteria for Generative Language Models | Sara Sterlie et.al. | 2403.08564 | null |
2024-03-13 | AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models | Yifei Gao et.al. | 2403.08542 | null |
2024-03-13 | Language models scale reliably with over-training and on downstream tasks | Samir Yitzhak Gadre et.al. | 2403.08540 | link |
2024-03-13 | Masked Generative Story Transformer with Character Guidance and Caption Augmentation | Christos Papadimitriou et.al. | 2403.08502 | link |
2024-03-12 | Beyond Text: Frozen Large Language Models in Visual Signal Comprehension | Lei Zhu et.al. | 2403.07874 | link |
2024-03-12 | Rethinking Generative Large Language Model Evaluation for Semantic Comprehension | Fangyun Wei et.al. | 2403.07872 | null |
2024-03-12 | Exploring Safety Generalization Challenges of Large Language Models via Code | Qibing Ren et.al. | 2403.07865 | null |
2024-03-12 | Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Shihao Zhao et.al. | 2403.07860 | link |
2024-03-12 | MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric | Haokun Lin et.al. | 2403.07839 | null |
2024-03-12 | DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies | William Xie et.al. | 2403.07832 | null |
2024-03-12 | The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model Editing | Jianchen Wang et.al. | 2403.07825 | null |
2024-03-12 | Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM | Sainbayar Sukhbaatar et.al. | 2403.07816 | null |
2024-03-12 | Chronos: Learning the Language of Time Series | Abdul Fatir Ansari et.al. | 2403.07815 | link |
2024-03-12 | Beyond Memorization: The Challenge of Random Memory Access in Language Models | Tongyao Zhu et.al. | 2403.07805 | link |
2024-03-12 | Fine-tuning Large Language Models with Sequential Instructions | Hanxu Hu et.al. | 2403.07794 | link |
2024-03-12 | Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations | Carlos Jose Xavier Cruz et.al. | 2403.07769 | link |
2024-03-12 | Synth | Sahand Sharifzadeh et.al. | 2403.07750 | null |
2024-03-12 | FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models | Yan Liu et.al. | 2403.07747 | null |
2024-03-12 | Multi-modal Auto-regressive Modeling via Visual Words | Tianshuo Peng et.al. | 2403.07720 | link |
2024-03-12 | WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? | Alexandre Drouin et.al. | 2403.07718 | link |
2024-03-12 | StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models | Zhicheng Guo et.al. | 2403.07714 | link |
2024-03-12 | Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards | Wei Shen et.al. | 2403.07708 | null |
2024-03-12 | Large, Small or Both: A Novel Data Augmentation Framework Based on Language Models for Debiasing Opinion Summarization | Yanyue Zhang et.al. | 2403.07693 | null |
2024-03-12 | Reference-free Monolithic Preference Optimization with Odds Ratio | Jiwoo Hong et.al. | 2403.07691 | link |
2024-03-11 | Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena | Leonie Weissweiler et.al. | 2403.06965 | null |
2024-03-11 | Materials science in the era of large language models: a perspective | Ge Lei et.al. | 2403.06949 | null |
2024-03-11 | Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation | Xinyao Li et.al. | 2403.06946 | link |
2024-03-11 | Naming, Describing, and Quantifying Visual Objects in Humans and LLMs | Alberto Testoni et.al. | 2403.06935 | link |
2024-03-11 | ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis | Yanming Liu et.al. | 2403.06932 | link |
2024-03-11 | MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning | Yichuan Li et.al. | 2403.06914 | link |
2024-03-11 | Application of Quantum Tensor Networks for Protein Classification | Debarshi Kundu et.al. | 2403.06890 | null |
2024-03-11 | Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents | Nishchal Prasad et.al. | 2403.06872 | link |
2024-03-11 | Semantic Residual Prompts for Continual Learning | Martin Menabue et.al. | 2403.06870 | null |
2024-03-11 | Learning with Noisy Foundation Models | Hao Chen et.al. | 2403.06869 | null |
2024-03-11 | A Geospatial Approach to Predicting Desert Locust Breeding Grounds in Africa | Ibrahim Salihu Yusuf et.al. | 2403.06860 | null |
2024-03-11 | Development of a Reliable and Accessible Caregiving Language Model (CaLM) | Bambang Parmanto et.al. | 2403.06857 | null |
2024-03-11 | DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation | Guosheng Zhao et.al. | 2403.06845 | null |
2024-03-11 | RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback | Yanming Liu et.al. | 2403.06840 | link |
2024-03-11 | ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts | Lyuye Zhang et.al. | 2403.06838 | null |
2024-03-11 | Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? | Egor Zverev et.al. | 2403.06833 | link |
2024-03-11 | The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework | Zhuo Chen et.al. | 2403.06832 | link |
2024-03-11 | ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model | Zhiwei Liu et.al. | 2403.06765 | link |
2024-03-11 | An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models | Liang Chen et.al. | 2403.06764 | link |
2024-03-11 | ALaRM: Align Language Models via Hierarchical Rewards Modeling | Yuhang Lai et.al. | 2403.06754 | link |
2024-03-08 | Bayesian Preference Elicitation with Language Models | Kunal Handa et.al. | 2403.05534 | null |
2024-03-08 | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | Machel Reid et.al. | 2403.05530 | null |
2024-03-08 | GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM | Hao Kang et.al. | 2403.05527 | link |
2024-03-08 | DeepSeek-VL: Towards Real-World Vision-Language Understanding | Haoyu Lu et.al. | 2403.05525 | link |
2024-03-08 | Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola | Yijiang Li et.al. | 2403.05523 | null |
2024-03-08 | Authorship Attribution in Bangla Literature (AABL) via Transfer Learning using ULMFiT | Aisha Khatun et.al. | 2403.05519 | null |
2024-03-08 | Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought | James Chua et.al. | 2403.05518 | link |
2024-03-08 | To Err Is Human, but Llamas Can Learn It Too | Agnes Luhtaru et.al. | 2403.05493 | null |
2024-03-08 | Will GPT-4 Run DOOM? | Adrian de Wynter et.al. | 2403.05468 | null |
2024-03-08 | Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs | Arijit Nag et.al. | 2403.05434 | null |
2024-03-08 | Towards Real-World Stickers Use: A New Dataset for Multi-Tag Sticker Recognition | Bingbing Wang et.al. | 2403.05428 | null |
2024-03-08 | FedFMS: Exploring Federated Foundation Models for Medical Image Segmentation | Yuxi Liu et.al. | 2403.05408 | link |
2024-03-08 | Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery | Xavier Bou et.al. | 2403.05381 | link |
2024-03-08 | VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model | Junsu Kim et.al. | 2403.05346 | null |
2024-03-08 | Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings | Wei Zhou et.al. | 2403.05338 | null |
2024-03-08 | ChatASU: Evoking LLM's Reflexion to Truly Understand Aspect Sentiment in Dialogues | Yiding Liu et.al. | 2403.05326 | null |
2024-03-08 | RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation | Zihao Wang et.al. | 2403.05313 | null |
2024-03-08 | Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents | Jinyang Li et.al. | 2403.05307 | null |
2024-03-08 | ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications | Sotaro Takeshita et.al. | 2403.05303 | link |
2024-03-08 | Modeling Dynamic (De)Allocations of Local Memory for Translation Validation | Abhishek Rose et.al. | 2403.05302 | null |
2024-03-07 | iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries | Adam Coscia et.al. | 2403.04760 | link |
2024-03-07 | KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts | Adam Coscia et.al. | 2403.04758 | link |
2024-03-07 | LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error | Boshi Wang et.al. | 2403.04746 | link |
2024-03-08 | How Far Are We from Intelligent Visual Deductive Reasoning? | Yizhe Zhang et.al. | 2403.04732 | link |
2024-03-07 | Common 7B Language Models Already Possess Strong Math Capabilities | Chen Li et.al. | 2403.04706 | null |
2024-03-07 | ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes | Hashmat Shadab Malik et.al. | 2403.04701 | link |
2024-03-07 | Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification | Ekaterina Fadeeva et.al. | 2403.04696 | null |
2024-03-07 | Telecom Language Models: Must They Be Large? | Nicola Piovesan et.al. | 2403.04666 | null |
2024-03-07 | Yi: Open Foundation Models by 01.AI | 01.AI et.al. | 2403.04652 | link |
2024-03-07 | Teaching Large Language Models to Reason with Reinforcement Learning | Alex Havrilla et.al. | 2403.04642 | null |
2024-03-07 | CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios | Qilang Ye et.al. | 2403.04640 | link |
2024-03-07 | A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds | Xuenan Xu et.al. | 2403.04594 | null |
2024-03-07 | Embodied Understanding of Driving Scenarios | Yunsong Zhou et.al. | 2403.04593 | link |
2024-03-07 | Wiki-TabNER: Advancing Table Interpretation Through Named Entity Recognition | Aneta Koleva et.al. | 2403.04577 | link |
2024-03-07 | Reducing self-supervised learning complexity improves weakly-supervised classification performance in computational pathology | Tim Lenz et.al. | 2403.04558 | null |
2024-03-07 | Enhancing Data Quality in Federated Fine-Tuning of Foundation Models | Wanru Zhao et.al. | 2403.04529 | null |
2024-03-07 | Where does In-context Translation Happen in Large Language Models | Suzanna Sia et.al. | 2403.04510 | null |
2024-03-07 | GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability | Zihan Luo et.al. | 2403.04483 | link |
2024-03-08 | Do Large Language Model Understand Multi-Intent Spoken Language? | Shangjian Yin et.al. | 2403.04481 | link |
2024-03-08 | Pearl: A Review-driven Persona-Knowledge Grounded Conversational Recommendation Dataset | Minjin Kim et.al. | 2403.04460 | null |
2024-03-06 | Backtracing: Retrieving the Cause of the Query | Rose E. Wang et.al. | 2403.03956 | link |
2024-03-06 | Bridging Language and Items for Retrieval and Recommendation | Yupeng Hou et.al. | 2403.03952 | link |
2024-03-06 | The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models | Adithya Bhaskar et.al. | 2403.03942 | link |
2024-03-06 | Did Translation Models Get More Robust Without Anyone Even Noticing? | Ben Peters et.al. | 2403.03923 | null |
2024-03-06 | Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing | Asmita et.al. | 2403.03897 | link |
2024-03-06 | IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators | Indraneil Paul et.al. | 2403.03894 | link |
2024-03-06 | From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models | Luiza Pozzobon et.al. | 2403.03893 | link |
2024-03-06 | FaaF: Facts as a Function for the evaluation of RAG systems | Vasileios Katranidis et.al. | 2403.03888 | link |
2024-03-06 | SaulLM-7B: A pioneering Large Language Model for Law | Pierre Colombo et.al. | 2403.03883 | null |
2024-03-06 | Learning to Decode Collaboratively with Multiple Language Models | Shannon Zejiang Shen et.al. | 2403.03870 | link |
2024-03-06 | On the Origins of Linear Representations in Large Language Models | Yibo Jiang et.al. | 2403.03867 | null |
2024-03-06 | KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions | Fangyuan Xu et.al. | 2403.03866 | null |
2024-03-06 | Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning | Deepanway Ghosal et.al. | 2403.03864 | link |
2024-03-06 | X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification | Hanzi Xu et.al. | 2403.03863 | link |
2024-03-06 | Designing Informative Metrics for Few-Shot Example Selection | Rishabh Adiga et.al. | 2403.03861 | null |
2024-03-06 | Emojinize: Enriching Any Text with Emoji Translations | Lars Henning Klein et.al. | 2403.03857 | null |
2024-03-06 | ShortGPT: Layers in Large Language Models are More Redundant Than You Expect | Xin Men et.al. | 2403.03853 | null |
2024-03-06 | Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ | Carolin Holtermann et.al. | 2403.03814 | link |
2024-03-06 | Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery | Wei Zhang et.al. | 2403.03790 | null |
2024-03-06 | PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion | Zekai Zhang et.al. | 2403.03788 | link |
2024-03-05 | The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning | Nathaniel Li et.al. | 2403.03218 | null |
2024-03-05 | CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments | Savitha Sam Abraham et.al. | 2403.03203 | null |
2024-03-05 | Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement | Rafaela Martelo et.al. | 2403.03188 | link |
2024-03-05 | Reliable, Adaptable, and Attributable Language Models with Retrieval | Akari Asai et.al. | 2403.03187 | null |
2024-03-05 | MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting | Fangchen Liu et.al. | 2403.03174 | null |
2024-03-05 | SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection | Peng Qi et.al. | 2403.03170 | null |
2024-03-05 | PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset | Arda Uzunoğlu et.al. | 2403.03167 | link |
2024-03-05 | Quantum Many-Body Physics Calculations with Large Language Models | Haining Pan et.al. | 2403.03154 | null |
2024-03-05 | Language Guided Exploration for RL Agents in Text Environments | Hitesh Golchha et.al. | 2403.03141 | null |
2024-03-05 | CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following | Kaiyan Zhang et.al. | 2403.03129 | null |
2024-03-05 | Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution | Flor Miriam Plaza-del-Arco et.al. | 2403.03121 | null |
2024-03-05 | "In Dialogues We Learn": Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning | Chuanqi Cheng et.al. | 2403.03102 | null |
2024-03-05 | KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents | Yuqi Zhu et.al. | 2403.03101 | link |
2024-03-05 | Learning to Use Tools via Cooperative and Interactive Agents | Zhengliang Shi et.al. | 2403.03031 | null |
2024-03-05 | Socratic Reasoning Improves Positive Text Rewriting | Anmol Goel et.al. | 2403.03029 | null |
2024-03-05 | Word Importance Explains How Prompts Affect Language Model Outputs | Stefan Hackmann et.al. | 2403.03028 | null |
2024-03-05 | OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following | Haochen Shi et.al. | 2403.03017 | null |
2024-03-05 | Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations | Hasan Abu-Rasheed et.al. | 2403.03008 | null |
2024-03-05 | Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models | Gen Luo et.al. | 2403.03003 | link |
2024-03-05 | Localized Zeroth-Order Prompt Optimization | Wenyang Hu et.al. | 2403.02993 | null |
2024-03-02 | LM4OPT: Unveiling the Potential of Large Language Models in Formulating Mathematical Optimization Problems | Tasnim Ahmed et.al. | 2403.01342 | null |
2024-03-02 | Making Hybrid Languages: A Recipe | Leif Andersen et.al. | 2403.01335 | null |
2024-03-02 | Chaining thoughts and LLMs to learn DNA structural biophysics | Tyler D. Ross et.al. | 2403.01332 | link |
2024-03-02 | VBART: The Turkish LLM | Meliksah Turker et.al. | 2403.01308 | null |
2024-03-02 | ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation | Moran Yanuka et.al. | 2403.01306 | link |
2024-03-02 | Improving the Validity of Automatically Generated Feedback via Reinforcement Learning | Alexander Scarlatos et.al. | 2403.01304 | link |
2024-03-02 | NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention | Tianyi Zhang et.al. | 2403.01273 | link |
2024-03-02 | Employing LLMs for Incident Response Planning and Review | Sam Hays et.al. | 2403.01271 | null |
2024-03-02 | Dissecting Language Models: Machine Unlearning via Selective Pruning | Nicholas Pochinkov et.al. | 2403.01267 | null |
2024-03-02 | Accelerating Greedy Coordinate Gradient via Probe Sampling | Yiran Zhao et.al. | 2403.01251 | link |
2024-03-02 | SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code | Ziniu Hu et.al. | 2403.01248 | null |
2024-03-02 | Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal | Jianheng Huang et.al. | 2403.01244 | null |
2024-03-02 | IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact | Ruikang Liu et.al. | 2403.01241 | null |
2024-03-02 | Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy | Jamie Hayes et.al. | 2403.01218 | null |
2024-03-02 | API Is Enough: Conformal Prediction for Large Language Models Without Logit-Access | Jiayuan Su et.al. | 2403.01216 | null |
2024-03-02 | Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning | Shuo Yang et.al. | 2403.01209 | null |
2024-03-02 | The Case for Animal-Friendly AI | Sankalpa Ghose et.al. | 2403.01199 | null |
2024-03-02 | DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling | Shanghaoran Quan et.al. | 2403.01197 | link |
2024-03-02 | RAGged Edges: The Double-Edged Sword of Retrieval-Augmented Chatbots | Philip Feldman, James R. Foulds et.al. | 2403.01193 | null |
2024-03-02 | Balancing Exploration and Exploitation in LLM using Soft RLLF for Enhanced Negation Understanding | Ha-Thanh Nguyen et.al. | 2403.01185 | null |
2024-02-29 | The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations? | Alex Gu et.al. | 2402.19475 | null |
2024-02-29 | The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | Weiyun Wang et.al. | 2402.19474 | link |
2024-02-29 | Retrieval-Augmented Generation for AI-Generated Content: A Survey | Penghao Zhao et.al. | 2402.19473 | link |
2024-02-29 | Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling | Gabriel Grand et.al. | 2402.19471 | null |
2024-03-01 | TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning | Kate Sanders et.al. | 2402.19467 | null |
2024-02-29 | Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models | Chen Qian et.al. | 2402.19465 | link |
2024-02-29 | Curiosity-driven Red-teaming for Large Language Models | Zhang-Wei Hong et.al. | 2402.19464 | link |
2024-02-29 | Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap | Saurabh Srivastava et.al. | 2402.19450 | link |
2024-02-29 | Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models | Frederik Kunstner et.al. | 2402.19449 | null |
2024-02-29 | ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL | Yifei Zhou et.al. | 2402.19446 | link |
2024-02-29 | Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation | Jonathan Yang et.al. | 2402.19432 | null |
2024-02-29 | Compositional API Recommendation for Library-Oriented Code Generation | Zexiong Ma et.al. | 2402.19431 | null |
2024-02-29 | Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models | Soham De et.al. | 2402.19427 | null |
2024-02-29 | Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines | Lijia Ma et.al. | 2402.19421 | null |
2024-02-29 | PaECTER: Patent-level Representation Learning using Citation-informed Transformers | Mainak Ghosh et.al. | 2402.19411 | null |
2024-02-29 | On the Scaling Laws of Geographical Representation in Language Models | Nathan Godey et.al. | 2402.19406 | null |
2024-02-29 | Entity-Aware Multimodal Alignment Framework for News Image Captioning | Junzhe Zhang et.al. | 2402.19404 | null |
2024-02-29 | Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Match Human Crowd Accuracy | Philipp Schoenegger et.al. | 2402.19379 | null |
2024-02-29 | OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models | Jenish Maharjan et.al. | 2402.19371 | null |
2024-02-29 | SoK: Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency | Akila Wickramasekara et.al. | 2402.19366 | null |
2024-02-28 | Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards | Haoxiang Wang et.al. | 2402.18571 | link |
2024-02-28 | Diffusion Language Models Are Versatile Protein Learners | Xinyou Wang et.al. | 2402.18567 | null |
2024-02-28 | A Categorization of Complexity Classes for Information Retrieval and Synthesis Using Natural Logic | Gregory Coppola et.al. | 2402.18566 | null |
2024-02-28 | Approaching Human-Level Forecasting with Language Models | Danny Halawi et.al. | 2402.18563 | null |
2024-02-28 | Implicit Bias of Next-Token Prediction | Christos Thrampoulidis et.al. | 2402.18551 | null |
2024-02-28 | Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling | Mahdi Karami et.al. | 2402.18508 | null |
2024-02-28 | Few-Shot Fairness: Unveiling LLM's Potential for Fairness-Aware Classification | Garima Chhikara et.al. | 2402.18502 | null |
2024-02-28 | Language Models Represent Beliefs of Self and Others | Wentao Zhu et.al. | 2402.18496 | null |
2024-02-28 | IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding | Lanyun Zhu et.al. | 2402.18476 | null |
2024-02-28 | Meta-Task Prompting Elicits Embedding from Large Language Models | Yibin Lei et.al. | 2402.18458 | null |
2024-02-28 | Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization | Deng Li et.al. | 2402.18447 | null |
2024-02-28 | Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication | Weize Chen et.al. | 2402.18439 | link |
2024-02-28 | A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models | Xiujie Song et.al. | 2402.18409 | null |
2024-02-28 | Balanced Similarity with Auxiliary Prompts: Towards Alleviating Text-to-Image Retrieval Bias for CLIP in Zero-shot Learning | Hanyao Wang et.al. | 2402.18400 | null |
2024-02-28 | Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models | Ercong Nie et.al. | 2402.18397 | null |
2024-02-28 | The First Place Solution of WSDM Cup 2024: Leveraging Large Language Models for Conversational Multi-Doc QA | Yiming Li et.al. | 2402.18385 | link |
2024-02-28 | Large Language Models As Evolution Strategies | Robert Tjarko Lange et.al. | 2402.18381 | null |
2024-02-28 | Tokenization Is More Than Compression | Craig W. Schmidt et.al. | 2402.18376 | null |
2024-02-28 | VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models | Seoyeon Kim et.al. | 2402.18374 | null |
2024-02-28 | Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning | Jiachun Li et.al. | 2402.18344 | null |
2024-02-27 | ShapeLLM: Universal 3D Object Understanding for Embodied Interaction | Zekun Qi et.al. | 2402.17766 | link |
2024-02-27 | The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | Shuming Ma et.al. | 2402.17764 | null |
2024-02-27 | Massive Activations in Large Language Models | Mingjie Sun et.al. | 2402.17762 | link |
2024-02-27 | Towards Optimal Learning of Language Models | Yuxian Gu et.al. | 2402.17759 | null |
2024-02-27 | Evaluating Very Long-Term Conversational Memory of LLM Agents | Adyasha Maharana et.al. | 2402.17753 | null |
2024-02-27 | Tower: An Open Multilingual Large Language Model for Translation-Related Tasks | Duarte M. Alves et.al. | 2402.17733 | link |
2024-02-27 | AmbigNLG: Addressing Task Ambiguity in Instruction for NLG | Ayana Niwa et.al. | 2402.17717 | null |
2024-02-27 | Case-Based or Rule-Based: How Do Transformers Do the Math? | Yi Hu et.al. | 2402.17709 | link |
2024-02-27 | RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations | Jing Huang et.al. | 2402.17700 | link |
2024-02-27 | NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents | Tamara Czinczoll et.al. | 2402.17682 | link |
2024-02-27 | The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks | Ashwin Prasad Shivarpatna Venkatesh et.al. | 2402.17679 | null |
2024-02-27 | CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention | Mohammad Sadil Khan et.al. | 2402.17678 | null |
2024-02-27 | Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models | Yunpeng Huang et.al. | 2402.17671 | null |
2024-02-27 | Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs | Tanise Ceron et.al. | 2402.17649 | null |
2024-02-27 | SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation | Shuangrui Ding et.al. | 2402.17645 | link |
2024-02-27 | Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data | Xiao Liu et.al. | 2402.17644 | link |
2024-02-27 | Variational Learning is Effective for Large Deep Networks | Yuesong Shen et.al. | 2402.17641 | link |
2024-02-27 | Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling | David S. W. Williams et.al. | 2402.17622 | null |
2024-02-27 | Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization | Wenqi Zhang et.al. | 2402.17574 | link |
2024-02-27 | Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers | Xinyu Tang et.al. | 2402.17564 | link |
2024-02-26 | Integrating Large Language Models with Graphical Session-Based Recommendation | Naicheng Guo et.al. | 2402.16539 | null |
2024-02-26 | LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments | Junzhe Chen et.al. | 2402.16499 | null |
2024-02-26 | On Languaging a Simulation Engine | Han Liu et.al. | 2402.16482 | null |
2024-02-26 | Unveiling ChatGPT's Usage in Open Source Projects: A Mining-based Study | Rosalia Tufano et.al. | 2402.16480 | null |
2024-02-26 | mEdIT: Multilingual Text Editing via Instruction Tuning | Vipul Raheja et.al. | 2402.16472 | link |
2024-02-26 | Unveiling Vulnerability of Self-Attention | Khai Jiet Liong et.al. | 2402.16470 | link |
2024-02-26 | Defending LLMs against Jailbreaking Attacks via Backtranslation | Yihan Wang et.al. | 2402.16459 | link |
2024-02-26 | ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing | Liuzhenghao Lv et.al. | 2402.16445 | link |
2024-02-26 | ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors | Zhexin Zhang et.al. | 2402.16444 | link |
2024-02-26 | Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models | Tianyi Tang et.al. | 2402.16438 | null |
2024-02-26 | RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions | Yuansen Zhang et.al. | 2402.16431 | null |
2024-02-26 | Predicting Sustainable Development Goals Using Course Descriptions -- from LLMs to Conventional Foundation Models | Lev Kharlashkin et.al. | 2402.16420 | null |
2024-02-26 | From RAGs to riches: Using large language models to write documents for clinical trials | Nigel Markey et.al. | 2402.16406 | null |
2024-02-26 | MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property | Shiwen Ni et.al. | 2402.16389 | link |
2024-02-26 | Immunization against harmful fine-tuning attacks | Domenic Rosati et.al. | 2402.16382 | null |
2024-02-26 | Improving LLM-based Machine Translation with Systematic Self-Correction | Zhaopeng Feng et.al. | 2402.16379 | link |
2024-02-26 | Unraveling Babel: Exploring Multilingual Activation Patterns within Large Language Models | Weize Liu et.al. | 2402.16367 | null |
2024-02-26 | LLM Inference Unveiled: Survey and Roofline Model Insights | Zhihang Yuan et.al. | 2402.16363 | link |
2024-02-26 | Layer-wise Regularized Dropout for Neural Language Models | Shiwen Ni et.al. | 2402.16361 | null |
2024-02-26 | An Integrated Data Processing Framework for Pretraining Foundation Models | Yiding Sun et.al. | 2402.16358 | link |
2024-02-26 | Language-guided Skill Learning with Temporal Variational Inference | Haotian Fu et.al. | 2402.16354 | null |
2024-02-23 | AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning | Jianguo Zhang et.al. | 2402.15506 | link |
2024-02-23 | API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs | Kinjal Basu et.al. | 2402.15491 | null |
2024-02-23 | Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models | Yiran Liu et.al. | 2402.15481 | null |
2024-02-23 | Leveraging Domain Knowledge for Efficient Reward Modelling in RLHF: A Case-Study in E-Commerce Opinion Summarization | Swaroop Nath et.al. | 2402.15473 | link |
2024-02-23 | Repetition Improves Language Model Embeddings | Jacob Mitchell Springer et.al. | 2402.15449 | link |
2024-02-23 | A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models | Stefan Hegselmann et.al. | 2402.15422 | link |
2024-02-23 | PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning | Simon Holk et.al. | 2402.15420 | null |
2024-02-23 | Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy? | Nader Asadi et.al. | 2402.15414 | null |
2024-02-23 | Grasp, See and Place: Efficient Unknown Object Rearrangement with Policy Structure Prior | Kechun Xu et.al. | 2402.15402 | link |
2024-02-23 | Explorations of Self-Repair in Language Models | Cody Rushing et.al. | 2402.15390 | link |
2024-02-23 | Safe Task Planning for Language-Instructed Multi-Robot Systems using Conformal Prediction | Jun Wang et.al. | 2402.15368 | null |
2024-02-23 | Farsight: Fostering Responsible AI Awareness During AI Application Prototyping | Zijie J. Wang et.al. | 2402.15350 | link |
2024-02-23 | NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data | Sergei Bogdanov et.al. | 2402.15343 | link |
2024-02-23 | Ranking Entities along Conceptual Space Dimensions with LLMs: An Analysis of Fine-Tuning Strategies | Nitesh Kumar et.al. | 2402.15337 | null |
2024-02-23 | GPTVQ: The Blessing of Dimensionality for LLM Quantization | Mart van Baalen et.al. | 2402.15319 | null |
2024-02-23 | ArabianGPT: Native Arabic GPT-based Large Language | Anis Koubaa et.al. | 2402.15313 | null |
2024-02-23 | Counterfactual Generation with Identifiability Guarantees | Hanqi Yan et.al. | 2402.15309 | link |
2024-02-23 | Representing Online Handwriting for Recognition in Large Vision-Language Models | Anastasiia Fadeeva et.al. | 2402.15307 | null |
2024-02-23 | How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries | Somnath Banerjee et.al. | 2402.15302 | link |
2024-02-23 | Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models | Yuzhe Zhang et.al. | 2402.15301 | null |
2024-02-22 | PALO: A Polyglot Large Multimodal Model for 5B People | Muhammad Maaz et.al. | 2402.14818 | link |
2024-02-22 | Demographic Bias of Expert-Level Vision-Language Foundation Models in Medical Imaging | Yuzhe Yang et.al. | 2402.14815 | link |
2024-02-22 | WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition | Lianghui Zhu et.al. | 2402.14812 | link |
2024-02-22 | Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking | Nikhil Prakash et.al. | 2402.14811 | null |
2024-02-22 | CriticBench: Benchmarking LLMs for Critique-Correct Reasoning | Zicheng Lin et.al. | 2402.14809 | link |
2024-02-22 | RelayAttention for Efficient Large Language Model Serving with Long System Prompts | Lei Zhu et.al. | 2402.14808 | link |
2024-02-22 | A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health | Nikhil Behari et.al. | 2402.14807 | null |
2024-02-22 | Identifying Multiple Personalities in Large Language Models with External Evaluation | Xiaoyang Song et.al. | 2402.14805 | null |
2024-02-22 | Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models | Xudong Lu et.al. | 2402.14800 | link |
2024-02-22 | Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic | Nathaniel Weir et.al. | 2402.14798 | null |
2024-02-22 | Zero-shot cross-lingual transfer in instruction tuning of large language model | Nadezhda Chirkova et.al. | 2402.14778 | null |
2024-02-22 | 2D Matryoshka Sentence Embeddings | Xianming Li et.al. | 2402.14776 | null |
2024-02-22 | DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models | Yuhang Cao et.al. | 2402.14767 | link |
2024-02-22 | MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues | Ge Bai et.al. | 2402.14762 | null |
2024-02-22 | Generalizing Reward Modeling for Out-of-Distribution Preference Learning | Chen Jia et.al. | 2402.14760 | null |
2024-02-22 | Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation | Jiawei Wang et.al. | 2402.14744 | null |
2024-02-22 | Dependency Annotation of Ottoman Turkish with Multilingual BERT | Şaziye Betül Özateş et.al. | 2402.14743 | null |
2024-02-22 | Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs | Arash Ahmadian et.al. | 2402.14740 | null |
2024-02-22 | Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models | Seungduk Kim et.al. | 2402.14714 | link |
2024-02-22 | IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus | Honghao Gui et.al. | 2402.14710 | link |
2024-02-21 | Coercing LLMs to do and reveal (almost) anything | Jonas Geiping et.al. | 2402.14020 | link |
2024-02-21 | Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment | Vyas Raina et.al. | 2402.14016 | null |
2024-02-21 | OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems | Chaoqun He et.al. | 2402.14008 | link |
2024-02-21 | Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models | Zhiwei He et.al. | 2402.14007 | null |
2024-02-21 | Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models | Aline Ioste et.al. | 2402.14002 | null |
2024-02-21 | Analysing The Impact of Sequence Composition on Language Model Pre-Training | Yu Zhao et.al. | 2402.13991 | link |
2024-02-21 | Towards Building Multilingual Language Model for Medicine | Pengcheng Qiu et.al. | 2402.13963 | link |
2024-02-21 | Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality | Rahul Zalkikar et.al. | 2402.13954 | null |
2024-02-21 | Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning | Debjit Paul et.al. | 2402.13950 | null |
2024-02-21 | Do Efficient Transformers Really Save Computation? | Kai Yang et.al. | 2402.13934 | null |
2024-02-21 | Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content | Federico Bianchi et.al. | 2402.13926 | null |
2024-02-21 | SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization | Prakamya Mishra et.al. | 2402.13919 | link |
2024-02-21 | What Linguistic Features and Languages are Important in LLM Translation? | Ryandito Diandaru et.al. | 2402.13917 | null |
2024-02-21 | Calibrating Large Language Models with Sample Consistency | Qing Lyu et.al. | 2402.13904 | null |
2024-02-21 | Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models | Chenyang Lyu et.al. | 2402.13887 | null |
2024-02-21 | | Haoyu Liu et.al. | 2402.13874 | null |
2024-02-21 | An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach | Mohammad Amaz Uddin et.al. | 2402.13871 | null |
2024-02-21 | Kuaiji: the First Chinese Accounting Large Language Model | Jiayuan Luo et.al. | 2402.13866 | null |
2024-02-21 | RealDex: Towards Human-like Grasping for Robotic Dexterous Hand | Yumeng Liu et.al. | 2402.13853 | null |
2024-02-21 | VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models | Jiawei Liang et.al. | 2402.13851 | null |
2024-02-20 | Towards audio language modeling -- an overview | Haibin Wu et.al. | 2402.13236 | null |
2024-02-20 | Unlocking Insights: Semantic Search in Jupyter Notebooks | Lan Li et.al. | 2402.13234 | null |
2024-02-20 | A Touch, Vision, and Language Dataset for Multimodal Alignment | Letian Fu et.al. | 2402.13232 | link |
2024-02-20 | Investigating Cultural Alignment of Large Language Models | Badr AlKhamissi et.al. | 2402.13231 | link |
2024-02-20 | Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive | Arka Pal et.al. | 2402.13228 | link |
2024-02-20 | AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning | Qiao Jin et.al. | 2402.13225 | null |
2024-02-20 | RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian | Adrian Cosma et.al. | 2402.13222 | link |
2024-02-20 | How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts | Yusu Qian et.al. | 2402.13220 | null |
2024-02-20 | Softmax Probabilities (Mostly) Predict Large Language Model Correctness on Multiple-Choice Q&A | Benjamin Plaut et.al. | 2402.13213 | link |
2024-02-20 | Soft Self-Consistency Improves Language Model Agents | Han Wang et.al. | 2402.13212 | link |
2024-02-20 | Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation | Dongjin Kang et.al. | 2402.13211 | null |
2024-02-20 | Bayesian Reward Models for LLM Alignment | Adam X. Yang et.al. | 2402.13210 | null |
2024-02-20 | How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena | Marco Gaido et.al. | 2402.13208 | link |
2024-02-20 | Question Calibration and Multi-Hop Modeling for Temporal Question Answering | Chao Xue et.al. | 2402.13188 | null |
2024-02-20 | What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents | Mingyu Jin et.al. | 2402.13184 | null |
2024-02-20 | DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models | Norman Di Palo et.al. | 2402.13181 | null |
2024-02-20 | Benchmarking Retrieval-Augmented Generation for Medicine | Guangzhi Xiong et.al. | 2402.13178 | link |
2024-02-20 | Defending Jailbreak Prompts via In-Context Adversarial Game | Yujun Zhou et.al. | 2402.13148 | null |
2024-02-20 | OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog | Adnen Abdessaied et.al. | 2402.13146 | null |
2024-02-20 | The Hidden Space of Transformer Language Adapters | Jesujoba O. Alabi et.al. | 2402.13137 | null |
2024-02-19 | Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding | Zhuoming Chen et.al. | 2402.12374 | link |
2024-02-19 | AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies | Xiao Ye et.al. | 2402.12370 | link |
2024-02-19 | A Critical Evaluation of AI Feedback for Aligning Large Language Models | Archit Sharma et.al. | 2402.12366 | link |
2024-02-19 | Emergent Word Order Universals from Cognitively-Motivated Language Models | Tatsuki Kuribayashi et.al. | 2402.12363 | null |
2024-02-19 | Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge | Julien Delile et.al. | 2402.12352 | null |
2024-02-19 | GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations | Jinhao Duan et.al. | 2402.12348 | link |
2024-02-19 | Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! | Zhanhui Zhou et.al. | 2402.12343 | link |
2024-02-19 | Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models | Christian Schlarmann et.al. | 2402.12336 | link |
2024-02-19 | Query-Based Adversarial Prompt Generation | Jonathan Hayase et.al. | 2402.12329 | null |
2024-02-19 | Shall We Talk: Exploring Spontaneous Collaborations of Competing LLM Agents | Zengqing Wu et.al. | 2402.12327 | link |
2024-02-19 | ARKS: Active Retrieval in Knowledge Soup for Code Generation | Hongjin Su et.al. | 2402.12317 | link |
2024-02-19 | Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports | Felix J. Dorfner et.al. | 2402.12298 | null |
2024-02-19 | KARL: Knowledge-Aware Retrieval and Representations aid Retention and Learning in Students | Matthew Shu et.al. | 2402.12291 | null |
2024-02-19 | DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models | Xiaoyu Tian et.al. | 2402.12289 | null |
2024-02-19 | Adaptive Skeleton Graph Decoding | Shuowei Jin et.al. | 2402.12280 | null |
2024-02-19 | Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks | Nadezhda Chirkova et.al. | 2402.12279 | null |
2024-02-19 | Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from Large Language Models | Puxuan Yu et.al. | 2402.12276 | link |
2024-02-19 | High-quality Data-to-Text Generation for Severely Under-Resourced Languages with Out-of-the-box Large Language Models | Michela Lorandi et.al. | 2402.12267 | link |
2024-02-19 | Uncertainty quantification in fine-tuned LLMs using LoRA ensembles | Oleksandr Balabanov et.al. | 2402.12264 | null |
2024-02-19 | NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms | Jonathan Zheng et.al. | 2402.12261 | null |
2024-02-16 | PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter | Junfei Xiao et.al. | 2402.10896 | null |
2024-02-16 | RLVF: Learning from Verbal Feedback without Overgeneralization | Moritz Stephan et.al. | 2402.10893 | link |
2024-02-16 | Instruction Diversity Drives Generalization To Unseen Tasks | Dylan Zhang et.al. | 2402.10891 | null |
2024-02-16 | When is Tree Search Useful for LLM Planning? It Depends on the Discriminator | Ziru Chen et.al. | 2402.10890 | link |
2024-02-16 | Multi-modal preference alignment remedies regression of visual instruction tuning on language model | Shengzhi Li et.al. | 2402.10884 | link |
2024-02-16 | EcoRank: Budget-Constrained Text Re-ranking Using Large Language Models | Muhammad Shihab Rashid et.al. | 2402.10866 | null |
2024-02-16 | Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities | Mingyu Jin et.al. | 2402.10835 | null |
2024-02-16 | RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model | Jianhao Yuan et.al. | 2402.10828 | null |
2024-02-16 | Quantifying the Persona Effect in LLM Simulations | Tiancheng Hu et.al. | 2402.10811 | null |
2024-02-16 | Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond | Yongqi Li et.al. | 2402.10805 | null |
2024-02-16 | EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge | Xuan Shen et.al. | 2402.10787 | link |
2024-02-16 | A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models | Mingchen Li et.al. | 2402.10779 | null |
2024-02-16 | AutoGPT+P: Affordance-based Task Planning with Large Language Models | Timo Birr et.al. | 2402.10778 | null |
2024-02-16 | How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs? | Ehsan Doostmohammadi et.al. | 2402.10770 | null |
2024-02-16 | Distillation Enhanced Generative Retrieval | Yongqi Li et.al. | 2402.10769 | null |
2024-02-16 | Inference to the Best Explanation in Large Language Models | Dhairya Dalal et.al. | 2402.10767 | null |
2024-02-16 | When Dataflow Analysis Meets Large Language Models | Chengpeng Wang et.al. | 2402.10754 | null |
2024-02-16 | ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages | Junjie Ye et.al. | 2402.10753 | link |
2024-02-16 | GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models | Pengcheng Jiang et.al. | 2402.10744 | link |
2024-02-16 | Let's Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning | Yinpeng Liu et.al. | 2402.10738 | link |
2024-02-15 | Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation | Huizhuo Yuan et.al. | 2402.10210 | null |
2024-02-15 | Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment | Rui Yang et.al. | 2402.10207 | link |
2024-02-15 | Chain-of-Thought Reasoning Without Prompting | Xuezhi Wang et.al. | 2402.10200 | null |
2024-02-15 | A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents | Lingbo Mo et.al. | 2402.10196 | link |
2024-02-15 | BitDelta: Your Fine-Tune May Only Be Worth One Bit | James Liu et.al. | 2402.10193 | link |
2024-02-15 | Uncertainty Decomposition and Quantification for In-Context Learning of Large Language Models | Chen Ling et.al. | 2402.10189 | link |
2024-02-15 | Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective | Tianyi Qiu et.al. | 2402.10184 | null |
2024-02-15 | TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation | Yaoxiang Wang et.al. | 2402.10178 | null |
2024-02-15 | OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | Shubham Toshniwal et.al. | 2402.10176 | link |
2024-02-15 | Unlocking Structure Measuring: Introducing PDD, an Automatic Metric for Positional Discourse Coherence | Yinhong Liu et.al. | 2402.10175 | link |
2024-02-15 | OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models | Ali AhmadiTeshnizi et.al. | 2402.10172 | null |
2024-02-15 | Data Engineering for Scaling Language Models to 128K Context | Yao Fu et.al. | 2402.10171 | link |
2024-02-15 | Knowledge-Infused LLM-Powered Conversational Health Agent: A Case Study for Diabetes Patients | Mahyar Abbasian et.al. | 2402.10153 | null |
2024-02-15 | ControlLM: Crafting Diverse Personalities for Language Models | Yixuan Weng et.al. | 2402.10151 | link |
2024-02-15 | TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles | Yinhong Liu et.al. | 2402.10137 | null |
2024-02-15 | Zero-Shot Reasoning: Personalized Content Generation Without the Cold Start Problem | Davor Hafnar et.al. | 2402.10133 | null |
2024-02-15 | Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning | Ming Li et.al. | 2402.10110 | link |
2024-02-15 | Quantized Embedding Vectors for Controllable Diffusion Language Models | Cheng Kang et.al. | 2402.10107 | null |
2024-02-15 | GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving | Jiaxin Zhang et.al. | 2402.10104 | link |
2024-02-15 | Any-Shift Prompting for Generalization over Distributions | Zehao Xiao et.al. | 2402.10099 | null |
2024-02-14 | AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability | Siwei Yang et.al. | 2402.09404 | link |
2024-02-14 | Reinforcement Learning from Human Feedback with Active Queries | Kaixuan Ji et.al. | 2402.09401 | null |
2024-02-14 | Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference | Harry Dong et.al. | 2402.09398 | link |
2024-02-14 | LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset | Botao Yu et.al. | 2402.09391 | link |
2024-02-14 | HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation | Yihao Fang et.al. | 2402.09390 | link |
2024-02-14 | Transformers Can Achieve Length Generalization But Not Robustly | Yongchao Zhou et.al. | 2402.09371 | null |
2024-02-14 | Pseudorandom Error-Correcting Codes | Miranda Christ et.al. | 2402.09370 | null |
2024-02-14 | Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking | Yi Fung et.al. | 2402.09369 | link |
2024-02-14 | Copyright Traps for Large Language Models | Matthieu Meeus et.al. | 2402.09363 | null |
2024-02-14 | HiRE: High Recall Approximate Top- | Yashas Samaga B L et.al. | 2402.09360 | null |
2024-02-14 | Developing a Framework for Auditing Large Language Models Using Human-in-the-Loop | Maryam Amirizaniani et.al. | 2402.09346 | null |
2024-02-14 | Mitigating Reward Hacking via Information-Theoretic Reward Modeling | Yuchun Miao et.al. | 2402.09345 | null |
2024-02-14 | AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach | Maryam Amirizaniani et.al. | 2402.09334 | null |
2024-02-14 | ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization | Feifan Song et.al. | 2402.09320 | link |
2024-02-14 | Embracing the black box: Heading towards foundation models for causal discovery from time series data | Gideon Stein et.al. | 2402.09305 | link |
2024-02-14 | Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code | Vahid Majdinasab et.al. | 2402.09299 | link |
2024-02-14 | Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey | Zhichen Dong et.al. | 2402.09283 | link |
2024-02-14 | Leveraging Large Language Models for Enhanced NLP Task Performance through Knowledge Distillation and Optimized Training Strategies | Yining Huang et.al. | 2402.09282 | null |
2024-02-14 | Personalized Large Language Models | Stanisław Woźniak et.al. | 2402.09269 | null |
2024-02-14 | Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation | Xiaoying Zhang et.al. | 2402.09267 | null |
2024-02-13 | Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance | Linxi Zhao et.al. | 2402.08680 | null |
2024-02-13 | COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability | Xingang Guo et.al. | 2402.08679 | link |
2024-02-13 | Human Curriculum Effects Emerge with In-Context Learning in Neural Networks | Jacob Russin et.al. | 2402.08674 | null |
2024-02-13 | Rec-GPT4V: Multimodal Recommendation with Large Vision-Language Models | Yuqing Liu et.al. | 2402.08670 | null |
2024-02-13 | Improving Generalization in Semantic Parsing by Increasing Natural Language Variation | Irina Saparina et.al. | 2402.08666 | link |
2024-02-13 | The Last JITAI? The Unreasonable Effectiveness of Large Language Models in Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in a Prospective Cardiac Rehabilitation Setting | David Haag et.al. | 2402.08658 | null |
2024-02-13 | PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs | Michael Dorkenwald et.al. | 2402.08657 | null |
2024-02-13 | Tandem Transformers for Inference Efficient LLMs | Aishwarya P S et.al. | 2402.08644 | null |
2024-02-13 | SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages | Nedjma Ousidhoum et.al. | 2402.08638 | null |
2024-02-13 | Knowledge Editing on Black-box Large Language Models | Xiaoshuai Song et.al. | 2402.08631 | link |
2024-02-13 | Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning | Haeju Lee et.al. | 2402.08594 | link |
2024-02-13 | Test-Time Backdoor Attacks on Multimodal Large Language Models | Dong Lu et.al. | 2402.08577 | link |
2024-02-13 | Online Foundation Model Selection in Robotics | Po-han Li et.al. | 2402.08570 | null |
2024-02-13 | Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast | Xiangming Gu et.al. | 2402.08567 | link |
2024-02-13 | Artificial Intelligence for Literature Reviews: Opportunities and Challenges | Francisco Bolanos et.al. | 2402.08565 | null |
2024-02-13 | Higher Layers Need More LoRA Experts | Chongyang Gao et.al. | 2402.08562 | link |
2024-02-13 | Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback | Vineet Bhat et.al. | 2402.08546 | null |
2024-02-13 | The Application of ChatGPT in Responding to Questions Related to the Boston Bowel Preparation Scale | Xiaoqiang Liu et.al. | 2402.08492 | null |
2024-02-13 | Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models | Shaeke Salman et.al. | 2402.08473 | null |
2024-02-13 | Large Language Models for the Automated Analysis of Optimization Algorithms | Camilo Chacón Sartori et.al. | 2402.08472 | link |
2024-02-12 | A systematic investigation of learnability from single child linguistic input | Yulu Qin et.al. | 2402.07899 | link |
2024-02-12 | Suppressing Pink Elephants with Direct Principle Feedback | Louis Castricato et.al. | 2402.07896 | null |
2024-02-12 | WildfireGPT: Tailored Large Language Model for Wildfire Analysis | Yangxinyu Xie et.al. | 2402.07877 | null |
2024-02-12 | Policy Improvement using Language Feedback Models | Victor Zhong et.al. | 2402.07876 | null |
2024-02-12 | PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs | Soroush Nasiriany et.al. | 2402.07872 | null |
2024-02-12 | Scaling Laws for Fine-Grained Mixture of Experts | Jakub Krajewski et.al. | 2402.07871 | link |
2024-02-12 | PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models | Wei Zou et.al. | 2402.07867 | link |
2024-02-12 | Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models | Siddharth Karamcheti et.al. | 2402.07865 | link |
2024-02-12 | AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy | Philipp Schoenegger et.al. | 2402.07862 | null |
2024-02-12 | Lissard: Long and Simple Sequential Reasoning Datasets | Mirelle Bueno et.al. | 2402.07859 | null |
2024-02-12 | Mercury: An Efficiency Benchmark for LLM Code Synthesis | Mingzhe Du et.al. | 2402.07844 | link |
2024-02-12 | Do Membership Inference Attacks Work on Large Language Models? | Michael Duan et.al. | 2402.07841 | link |
2024-02-12 | Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model | Ahmet Üstün et.al. | 2402.07827 | null |
2024-02-12 | Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning | Z Liu et.al. | 2402.07818 | null |
2024-02-12 | Injecting Wiktionary to improve token-level contextual representations using contrastive learning | Anna Mosolova et.al. | 2402.07817 | null |
2024-02-12 | Retrieval-Augmented Thought Process as Sequential Decision Making | Thomas Pouplin et.al. | 2402.07812 | null |
2024-02-12 | Empowering Federated Learning for Massive Models with NVIDIA FLARE | Holger R. Roth et.al. | 2402.07792 | null |
2024-02-12 | TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection | Hui Liu et.al. | 2402.07776 | link |
2024-02-12 | Quantitative knowledge retrieval from large language models | David Selby et.al. | 2402.07770 | link |
2024-02-12 | Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model | Mikail Khona et.al. | 2402.07757 | null |
2024-02-09 | Feedback Loops With Language Models Drive In-Context Reward Hacking | Alexander Pan et.al. | 2402.06627 | link |
2024-02-09 | Understanding the Effects of Iterative Prompting on Truthfulness | Satyapriya Krishna et.al. | 2402.06625 | null |
2024-02-09 | Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning | Shivalika Singh et.al. | 2402.06619 | null |
2024-02-09 | FaBERT: Pre-training BERT on Persian Blogs | Mostafa Masumi et.al. | 2402.06617 | null |
2024-02-09 | On the Out-Of-Distribution Generalization of Multimodal Large Language Models | Xingxuan Zhang et.al. | 2402.06599 | null |
2024-02-09 | CigaR: Cost-efficient Program Repair with LLMs | Dávid Hidvégi et.al. | 2402.06598 | link |
2024-02-09 | Understanding the Weakness of Large Language Model Agents within a Complex Android Environment | Mingzhe Xing et.al. | 2402.06596 | link |
2024-02-09 | Self-consistent context aware conformer transducer for speech recognition | Konstantin Kolokolov et.al. | 2402.06592 | null |
2024-02-09 | G-SciEdBERT: A Contextualized LLM for Science Assessment Tasks in German | Ehsan Latif et.al. | 2402.06584 | null |
2024-02-09 | Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning | Amir Ziai et.al. | 2402.06560 | link |
2024-02-09 | The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model | Gregory Coppola et.al. | 2402.06557 | link |
2024-02-09 | Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA | Marek Šuppa et.al. | 2402.06549 | link |
2024-02-09 | Calibrating Long-form Generations from Large Language Models | Yukun Huang et.al. | 2402.06544 | null |
2024-02-09 | Introspective Planning: Guiding Language-Enabled Agents to Refine Their Own Uncertainty | Kaiqu Liang et.al. | 2402.06529 | link |
2024-02-09 | Multimodal Clinical Trial Outcome Prediction with Large Language Models | Wenhao Zheng et.al. | 2402.06512 | link |
2024-02-09 | Iris-SAM: Iris Segmentation Using a Foundational Model | Parisa Farmanifard et.al. | 2402.06497 | link |
2024-02-09 | Large Language Models for Captioning and Retrieving Remote Sensing Images | João Daniel Silva et.al. | 2402.06475 | null |
2024-02-09 | V-STaR: Training Verifiers for Self-Taught Reasoners | Arian Hosseini et.al. | 2402.06457 | null |
2024-02-09 | StruQ: Defending Against Prompt Injection with Structured Queries | Sizhe Chen et.al. | 2402.06363 | null |
2024-02-09 | CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models | Peiyuan Gong et.al. | 2402.06360 | link |
2024-02-08 | SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Peng Gao et.al. | 2402.05935 | link |
2024-02-08 | Driving Everywhere with Large Language Model Policy Adaptation | Boyi Li et.al. | 2402.05932 | null |
2024-02-08 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | Xing Han Lù et.al. | 2402.05930 | link |
2024-02-08 | An Interactive Agent Foundation Model | Zane Durante et.al. | 2402.05929 | null |
2024-02-08 | On the Convergence of Zeroth-Order Federated Tuning in Large Language Models | Zhenqing Ling et.al. | 2402.05926 | null |
2024-02-08 | Efficient Stagewise Pretraining via Progressive Subnetworks | Abhishek Panigrahi et.al. | 2402.05913 | null |
2024-02-08 | FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs | Eun Cheol Choi et.al. | 2402.05904 | link |
2024-02-08 | Large Language Model Meets Graph Neural Network in Knowledge Distillation | Shengxiang Hu et.al. | 2402.05894 | null |
2024-02-08 | Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking | Nikhil Sharma et.al. | 2402.05880 | null |
2024-02-08 | PromptCrypt: Prompt Encryption for Secure Communication with Large Language Models | Guo Lin et.al. | 2402.05868 | link |
2024-02-08 | How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis | Federico Bianchi et.al. | 2402.05863 | link |
2024-02-08 | Let Your Graph Do the Talking: Encoding Structured Data for LLMs | Bryan Perozzi et.al. | 2402.05862 | null |
2024-02-08 | Learning to Route Among Specialized Experts for Zero-Shot Generalization | Mohammed Muqeeth et.al. | 2402.05859 | link |
2024-02-08 | Limitations of Agents Simulated by Predictive Models | Raymond Douglas et.al. | 2402.05829 | null |
2024-02-08 | Is it Possible to Edit Large Language Models Robustly? | Xinbei Ma et.al. | 2402.05827 | link |
2024-02-08 | Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models | Lingzhi Wang et.al. | 2402.05813 | null |
2024-02-08 | Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning | Zhiheng Xi et.al. | 2402.05808 | link |
2024-02-08 | How do Transformers perform In-Context Autoregressive Learning? | Michael E. Sander et.al. | 2402.05787 | null |
2024-02-08 | Limits of Transformer Language Models on Algorithmic Learning | Jonathan Thomm et.al. | 2402.05785 | null |
2024-02-08 | Text-to-Code Generation with Modality-relative Pre-training | Fenia Christopoulou et.al. | 2402.05783 | null |
2024-02-07 | Opening the AI black box: program synthesis via mechanistic interpretability | Eric J. Michaud et.al. | 2402.05110 | link |
2024-02-07 | You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models | Alix Decrop et.al. | 2402.05102 | null |
2024-02-07 | Hydragen: High-Throughput LLM Inference with Shared Prefixes | Jordan Juravsky et.al. | 2402.05099 | link |
2024-02-07 | Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation | Dennis Hoftijzer et.al. | 2402.05090 | null |
2024-02-07 | A Roadmap to Pluralistic Alignment | Taylor Sorensen et.al. | 2402.05070 | link |
2024-02-07 | SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models | Lijun Li et.al. | 2402.05044 | link |
2024-02-07 | How BERT Speaks Shakespearean English? Evaluating Historical Bias in Contextual Language Models | Miriam Cuscito et.al. | 2402.05034 | null |
2024-02-07 | A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? | Agustinus Kristiadi et.al. | 2402.05015 | link |
2024-02-07 | Pedagogical Alignment of Large Language Models | Shashank Sonkar et.al. | 2402.05000 | null |
2024-02-07 | An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration | Yihao Li et.al. | 2402.04978 | null |
2024-02-07 | ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12 | Liuqing Chen et.al. | 2402.04975 | null |
2024-02-07 | Reconfidencing LLMs from the Grouping Loss Perspective | Lihu Chen et.al. | 2402.04957 | null |
2024-02-07 | Chatbots in Knowledge-Intensive Contexts: Comparing Intent and LLM-Based Systems | Samuel Kernan Freire et.al. | 2402.04955 | null |
2024-02-07 | Prompting Implicit Discourse Relation Annotation | Frances Yung et.al. | 2402.04918 | null |
2024-02-07 | Personalized Text Generation with Fine-Grained Linguistic Control | Bashar Alhafni et.al. | 2402.04914 | link |
2024-02-07 | L4Q: Parameter Efficient Quantization-Aware Training on Large Language Models via LoRA-wise LSQ | Hyesung Jeon et.al. | 2402.04902 | null |
2024-02-07 | Detecting Generated Native Ads in Conversational Search | Sebastian Schmidt et.al. | 2402.04889 | link |
2024-02-07 | Multimodal Query Suggestion with Multi-Agent Reinforcement Learning from Human Feedback | Zheng Wang et.al. | 2402.04867 | null |
2024-02-07 | Automated Smart Contract Summarization via LLMs | Yingjie Mao et.al. | 2402.04863 | null |
2024-02-07 | CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay | Natasha Butt et.al. | 2402.04858 | null |
2024-02-06 | AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls | Yu Du et.al. | 2402.04253 | link |
2024-02-06 | HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal | Mantas Mazeika et.al. | 2402.04249 | link |
2024-02-06 | Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks | Jongho Park et.al. | 2402.04248 | link |
2024-02-06 | Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science | Xiangru Tang et.al. | 2402.04247 | null |
2024-02-06 | CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations | Ji Qi et.al. | 2402.04236 | link |
2024-02-06 | Can Generative Agents Predict Emotion? | Ciaran Regan et.al. | 2402.04232 | null |
2024-02-06 | "Task Success" is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors | Lin Guan et.al. | 2402.04210 | null |
2024-02-06 | Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models | David Sobrín-Hidalgo et.al. | 2402.04206 | null |
2024-02-06 | SHIELD : An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models | Yichen Shi et.al. | 2402.04178 | link |
2024-02-06 | Scaling Laws for Downstream Task Performance of Large Language Models | Berivan Isik et.al. | 2402.04177 | null |
2024-02-06 | Harnessing the Plug-and-Play Controller by Prompting | Hao Wang et.al. | 2402.04160 | null |
2024-02-06 | Multi-line AI-assisted Code Authoring | Omer Dunay et.al. | 2402.04141 | null |
2024-02-06 | Advancing Legal Reasoning: The Integration of AI to Navigate Complexities and Biases in Global Jurisprudence with Semi-Automated Arbitration Processes (SAAPs) | Michael De'Shazer et.al. | 2402.04140 | null |
2024-02-06 | Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science | Pengfei Liu et.al. | 2402.04119 | link |
2024-02-06 | Measuring Implicit Bias in Explicitly Unbiased Large Language Models | Xuechunzi Bai et.al. | 2402.04105 | null |
2024-02-06 | The Use of a Large Language Model for Cyberbullying Detection | Bayode Ogunleye et.al. | 2402.04088 | null |
2024-02-06 | A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation | Zhengbo Wang et.al. | 2402.04087 | link |
2024-02-06 | Provably learning a multi-head attention layer | Sitan Chen et.al. | 2402.04084 | null |
2024-02-06 | Iterative Prompt Refinement for Radiation Oncology Symptom Extraction Using Teacher-Student Large Language Models | Reza Khanmohammadi et.al. | 2402.04075 | null |
2024-02-06 | Retrieve to Explain: Evidence-driven Predictions with Language Models | Ravi Patel et.al. | 2402.04068 | link |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-05-23 | MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models | Jiuming Liu et.al. | 2405.14338 | null |
2024-05-22 | Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline | Dingyi Yang et.al. | 2405.14040 | null |
2024-05-22 | TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment | Wei Li et.al. | 2405.13911 | null |
2024-05-22 | Dense Connector for MLLMs | Huanjin Yao et.al. | 2405.13800 | null |
2024-05-22 | VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding | Yongxin Guo et.al. | 2405.13382 | null |
2024-05-21 | Anticipating Object State Changes | Victoria Manousaki et.al. | 2405.12789 | null |
2024-05-17 | Open-Vocabulary Spatio-Temporal Action Detection | Tao Wu et.al. | 2405.10832 | null |
2024-05-14 | Challenges in Deploying Long-Context Transformers: A Theoretical Peak Performance Analysis | Yao Fu et.al. | 2405.08944 | null |
2024-05-14 | CinePile: A Long Video Question Answering Dataset and Benchmark | Ruchit Rawal et.al. | 2405.08813 | null |
2024-05-14 | No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding | Yingjie Zhai et.al. | 2405.08344 | link |
2024-05-13 | FreeVA: Offline MLLM as Training-Free Video Assistant | Wenhao Wu et.al. | 2405.07798 | link |
2024-05-11 | Memory-Maze: Scenario Driven Benchmark and Visual Language Navigation Model for Guiding Blind People | Masaki Kuribayashi et.al. | 2405.07060 | null |
2024-05-11 | Retrieval Enhanced Zero-Shot Video Captioning | Yunchuan Ma et.al. | 2405.07046 | null |
2024-05-11 | Global Motion Understanding in Large-Scale Video Object Segmentation | Volodymyr Fedynyak et.al. | 2405.07031 | null |
2024-05-09 | A Survey on Backbones for Deep Video Action Recognition | Zixuan Tang et.al. | 2405.05584 | null |
2024-05-08 | Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic Scenarios | Chirag Parikh et.al. | 2405.05354 | null |
2024-05-07 | Vision Mamba: A Comprehensive Survey and Taxonomy | Xiao Liu et.al. | 2405.04404 | link |
2024-05-06 | Foundation Models for Video Understanding: A Survey | Neelu Madan et.al. | 2405.03770 | link |
2024-05-08 | How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs | Muhammad Uzair Khattak et.al. | 2405.03690 | null |
2024-05-06 | WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning | Yuanhan Zhang et.al. | 2405.03272 | null |
2024-04-30 | Cross-Block Fine-Grained Semantic Cascade for Skeleton-Based Sports Action Recognition | Zhendong Liu et.al. | 2404.19383 | null |
2024-05-01 | Capabilities of Gemini Models in Medicine | Khaled Saab et.al. | 2404.18416 | null |
2024-04-26 | Learning text-to-video retrieval from image captioning | Lucas Ventura et.al. | 2404.17498 | null |
2024-04-26 | MovieChat+: Question-aware Sparse Memory for Long Video Question Answering | Enxin Song et.al. | 2404.17176 | link |
2024-04-26 | Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting | Yuanyuan Liu et.al. | 2404.17100 | null |
2024-04-29 | PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning | Lin Xu et.al. | 2404.16994 | link |
2024-04-25 | SFMViT: SlowFast Meet ViT in Chaotic World | Jiaying Lin et.al. | 2404.16609 | link |
2024-04-23 | IPAD: Industrial Process Anomaly Detection Dataset | Jinfan Liu et.al. | 2404.15033 | null |
2024-04-23 | Pegasus-v1 Technical Report | Raehyuk Jung et.al. | 2404.14687 | null |
2024-04-26 | Narrative Action Evaluation with Prompt-Guided Multimodal Interaction | Shiyi Zhang et.al. | 2404.14471 | link |
2024-04-20 | Movie101v2: Improved Movie Narration Benchmark | Zihao Yue et.al. | 2404.13370 | null |
2024-04-18 | Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models | Reka Team et.al. | 2404.12387 | null |
2024-04-18 | From Image to Video, what do we need in multimodal LLMs? | Suyuan Huang et.al. | 2404.11865 | null |
2024-04-17 | VG4D: Vision-Language Model Goes 4D Video Recognition | Zhichao Deng et.al. | 2404.11605 | link |
2024-04-15 | Leveraging Temporal Contextualization for Video Action Recognition | Minji Kim et.al. | 2404.09490 | null |
2024-04-15 | The 8th AI City Challenge | Shuo Wang et.al. | 2404.09432 | null |
2024-04-16 | Human-in-the-Loop Segmentation of Multi-species Coral Imagery | Scarlett Raine et.al. | 2404.09406 | link |
2024-04-14 | In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition | Wiktor Mucha et.al. | 2404.09308 | null |
2024-04-14 | TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning | Quang Minh Dinh et.al. | 2404.09275 | link |
2024-04-14 | Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection | Jin Yang et.al. | 2404.09263 | link |
2024-04-12 | Enhancing Traffic Safety with Parallel Dense Video Captioning for End-to-End Event Analysis | Maged Shoman et.al. | 2404.08229 | link |
2024-04-11 | Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval | Minkuk Kim et.al. | 2404.07610 | link |
2024-04-10 | A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos | Suleyman Ozdel et.al. | 2404.07351 | null |
2024-04-10 | Gaze-Guided Graph Neural Network for Action Anticipation Conditioned on Intention | Suleyman Ozdel et.al. | 2404.07347 | null |
2024-04-09 | MoReVQA: Exploring Modular Reasoning Models for Video Question Answering | Juhong Min et.al. | 2404.06511 | null |
2024-04-07 | X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model | Jan Held et.al. | 2404.06332 | null |
2024-04-24 | MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Bo He et.al. | 2404.05726 | link |
2024-04-06 | SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos | Tao Wu et.al. | 2404.04565 | null |
2024-04-19 | Koala: Key frame-conditioned long video-LLM | Reuben Tan et.al. | 2404.04346 | null |
2024-04-05 | Neural-Symbolic VideoQA: Learning Compositional Spatio-Temporal Reasoning for Real-world Video Question Answering | Lili Liang et.al. | 2404.04007 | null |
2024-04-04 | OW-VISCap: Open-World Video Instance Segmentation and Captioning | Anwesa Choudhuri et.al. | 2404.03657 | null |
2024-04-04 | MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens | Kirolos Ataallah et.al. | 2404.03413 | null |
2024-04-10 | LongVLM: Efficient Long Video Understanding via Large Language Models | Yuetian Weng et.al. | 2404.03384 | link |
2024-04-03 | DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement | Hao Wu et.al. | 2404.02755 | null |
2024-04-05 | SnAG: Scalable and Accurate Video Grounding | Fangzhou Mu et.al. | 2404.02257 | null |
2024-04-01 | TraveLER: A Multi-LMM Agent Framework for Video Question-Answering | Chuyi Shang et.al. | 2404.01476 | null |
2024-04-01 | CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes | Ting En Lam et.al. | 2404.01299 | null |
2024-04-01 | Streaming Dense Video Captioning | Xingyi Zhou et.al. | 2404.01297 | link |
2024-04-02 | Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward | Ruohong Zhang et.al. | 2404.01258 | link |
2024-04-01 | VideoDistill: Language-aware Vision Distillation for Video Question Answering | Bo Zou et.al. | 2404.00973 | null |
2024-03-31 | | Ye Liu et.al. | 2404.00801 | link |
2024-03-30 | Instrument-tissue Interaction Detection Framework for Surgical Video Understanding | Wenjun Lin et.al. | 2404.00322 | null |
2024-03-30 | ST-LLM: Large Language Models Are Effective Temporal Learners | Ruyang Liu et.al. | 2404.00308 | link |
2024-03-29 | A Unified Framework for Human-centric Point Cloud Video Understanding | Yiteng Xu et.al. | 2403.20031 | null |
2024-03-28 | Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality | Sishuo Chen et.al. | 2403.19221 | link |
2024-03-27 | An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM | Wonkyun Kim et.al. | 2403.18406 | link |
2024-03-26 | OmniVid: A Generative Framework for Universal Video Understanding | Junke Wang et.al. | 2403.17935 | link |
2024-03-25 | Understanding Long Videos in One Multimodal Language Model Pass | Kanchana Ranasinghe et.al. | 2403.16998 | link |
2024-03-24 | AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue | Yunlong Tang et.al. | 2403.16276 | null |
2024-03-22 | InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding | Yi Wang et.al. | 2403.15377 | link |
2024-03-25 | VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding | Ahmad Mahmood et.al. | 2403.14743 | null |
2024-03-21 | Language Repository for Long Video Understanding | Kumara Kahatapitiya et.al. | 2403.14622 | link |
2024-03-21 | Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels | Tianming Liang et.al. | 2403.14430 | null |
2024-03-18 | Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation | Zixin Zhu et.al. | 2403.12042 | link |
2024-03-18 | Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation | Wangbo Zhao et.al. | 2403.11808 | link |
2024-03-27 | LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model | Yuxin Cao et.al. | 2403.11656 | null |
2024-03-18 | VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding | Yue Fan et.al. | 2403.11481 | null |
2024-03-15 | VideoAgent: Long-form Video Understanding with Large Language Model as Agent | Xiaohan Wang et.al. | 2403.10517 | null |
2024-03-14 | Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding | Guo Chen et.al. | 2403.09626 | link |
2024-03-25 | Don't Judge by the Look: Towards Motion Coherent Video Representation | Yitian Zhang et.al. | 2403.09506 | link |
2024-03-13 | DAM: Dynamic Adapter Merging for Continual Video QA Learning | Feng Cheng et.al. | 2403.08755 | link |
2024-03-11 | Action Reimagined: Text-to-Pose Video Editing for Dynamic Human Actions | Lan Wang et.al. | 2403.07198 | null |
2024-03-12 | VideoMamba: State Space Model for Efficient Video Understanding | Kunchang Li et.al. | 2403.06977 | link |
2024-03-25 | An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models | Liang Chen et.al. | 2403.06764 | link |
2024-03-08 | Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation | Joseph Cho et.al. | 2403.05131 | null |
2024-03-11 | Beyond MOT: Semantic Multi-Object Tracking | Yunhao Li et.al. | 2403.05021 | null |
2024-03-08 | Pix2Gif: Motion-Guided Diffusion for GIF Generation | Hitesh Kandala et.al. | 2403.04634 | null |
2024-03-05 | A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives | Simone Alberto Peirone et.al. | 2403.03037 | null |
2024-03-03 | MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies | Zhende Song et.al. | 2403.01422 | null |
2024-03-01 | Abductive Ego-View Accident Video Understanding for Safe Driving Perception | Jianwu Fang et.al. | 2403.00436 | null |
2024-02-29 | Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Tsai-Shien Chen et.al. | 2402.19479 | null |
2024-03-11 | TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning | Kate Sanders et.al. | 2402.19467 | null |
2024-02-29 | Percept, Chat, and then Adapt: Multimodal Knowledge Transfer of Foundation Models for Open-World Video Recognition | Boyu Chen et.al. | 2402.18951 | null |
2024-02-27 | MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning | Huiyu Xiong et.al. | 2402.17680 | null |
2024-02-25 | LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding | Yuxuan Wang et.al. | 2402.16050 | link |
2024-02-22 | Think before You Leap: Content-Aware Low-Cost Edge-Assisted Video Semantic Segmentation | Mingxuan Yan et.al. | 2402.14326 | null |
2024-02-21 | LLMs Meet Long Video: Advancing Long Video Comprehension with An Interactive Visual Adapter in LLMs | Yunxin Li et.al. | 2402.13546 | null |
2024-02-28 | Video ReCap: Recursive Captioning of Hour-Long Videos | Md Mohaiminul Islam et.al. | 2402.13250 | null |
2024-02-20 | VideoPrism: A Foundational Visual Encoder for Video Understanding | Long Zhao et.al. | 2402.13217 | null |
2024-02-20 | Slot-VLM: SlowFast Slots for Video-Language Modeling | Jiaqi Xu et.al. | 2402.13088 | null |
2024-02-19 | System Identification of Neural Systems: Going Beyond Images to Modelling Dynamics | Mai Gamal et.al. | 2402.12519 | null |
2024-02-19 | LVCHAT: Facilitating Long Video Comprehension | Yu Wang et.al. | 2402.12079 | link |
2024-02-28 | Are you Struggling? Dataset and Baselines for Struggle Determination in Assembly Videos | Shijia Feng et.al. | 2402.11057 | null |
2024-02-16 | Question-Instructed Visual Descriptions for Zero-Shot Video Question Answering | David Romero et.al. | 2402.10698 | null |
2024-02-13 | World Model on Million-Length Video And Language With RingAttention | Hao Liu et.al. | 2402.08268 | link |
2024-02-12 | BDIQA: A New Dataset for Video Question Answering to Explore Cognitive Reasoning through Theory of Mind | Yuanyuan Mao et.al. | 2402.07402 | null |
2024-02-09 | Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning | Amir Ziai et.al. | 2402.06560 | link |
2024-02-09 | Dynamic swarms regulate the morphology and distribution of soft membrane domains | Aakanksha Gubbala et.al. | 2402.06518 | null |
2024-02-08 | Memory Consolidation Enables Long-Context Video Understanding | Ivana Balažević et.al. | 2402.05861 | null |
2024-02-06 | Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization | Yang Jin et.al. | 2402.03161 | null |
2024-02-04 | Spatio-temporal Prompting Network for Robust Video Feature Extraction | Guanxiong Sun et.al. | 2402.02574 | link |
2024-02-02 | Simulator-Free Visual Domain Randomization via Video Games | Chintan Trivedi et.al. | 2402.01335 | link |
2024-01-30 | YTCommentQA: Video Question Answerability in Instructional Videos | Saelyne Yang et.al. | 2401.17343 | link |
2024-01-30 | Multi-granularity Correspondence Learning from Long-term Noisy Videos | Yijie Lin et.al. | 2401.16702 | null |
2024-01-29 | Cutup and Detect: Human Fall Detection on Cutup Untrimmed Videos Using a Large Foundational Video Understanding Model | Till Grutschus et.al. | 2401.16280 | null |
2024-01-25 | Knowledge Graph Supported Benchmark and Video Captioning for Basketball | Zeyu Xi et.al. | 2401.13888 | null |
2024-01-22 | ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition | Jiaming Zhou et.al. | 2401.11654 | null |
2024-01-21 | Exploring Missing Modality in Multimodal Egocentric Datasets | Merey Ramazanova et.al. | 2401.11470 | null |
2024-01-19 | Learning to Visually Connect Actions and their Effects | Eric Peh et.al. | 2401.10805 | null |
2024-01-28 | Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering | Haibo Wang et.al. | 2401.10711 | null |
2024-01-17 | CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding | Yunze Liu et.al. | 2401.09057 | null |
2024-01-16 | Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data | Yuhui Zhang et.al. | 2401.08567 | link |
2024-01-16 | Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization | Chongzhi Zhang et.al. | 2401.08232 | null |
2024-01-11 | Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition | Yukun Zuo et.al. | 2401.06287 | null |
2024-01-10 | HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition | Qian Wu et.al. | 2401.04975 | link |
2024-01-10 | SnapCap: Efficient Snapshot Compressive Video Captioning | Jianqiao Sun et.al. | 2401.04903 | null |
2024-01-08 | Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification | Wentao Zhu et.al. | 2401.04154 | null |
2024-01-08 | Dr | Chen Zhao et.al. | 2401.04105 | link |
2024-01-08 | STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering | Yueqian Wang et.al. | 2401.03901 | link |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-05-20 | Biomarker Selection for Adaptive Systems | Joshua Pickard et.al. | 2405.09809 | null |
2024-05-14 | No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding | Yingjie Zhai et.al. | 2405.08344 | link |
2024-05-13 | Improved Bound for Robust Causal Bandits with Linear Models | Zirui Yan et.al. | 2405.07795 | null |
2024-05-10 | Residual-based Attention Physics-informed Neural Networks for Efficient Spatio-Temporal Lifetime Assessment of Transformers Operated in Renewable Power Plants | Ibai Ramirez et.al. | 2405.06443 | null |
2024-05-10 | A Multi-Channel Spatial-Temporal Transformer Model for Traffic Flow Forecasting | Jianli Xiao et.al. | 2405.06266 | null |
2024-05-07 | DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | Chen Min et.al. | 2405.04390 | null |
2024-05-07 | Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling | Jiawei Shi et.al. | 2405.04309 | null |
2024-05-06 | Hierarchical Space-Time Attention for Micro-Expression Recognition | Haihong Hao et.al. | 2405.03202 | link |
2024-05-21 | RSCaMa: Remote Sensing Image Change Captioning with State Space Model | Chenyang Liu et.al. | 2404.18895 | link |
2024-04-24 | Deep Predictive Model Learning with Parametric Bias: Handling Modeling Difficulties and Temporal Model Changes | Kento Kawaharazuka et.al. | 2404.15726 | null |
2024-04-19 | MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model | Kang Zeng et.al. | 2404.12794 | link |
2024-04-13 | Understanding Human-COVID-19 Dynamics using Geospatial Big Data: A Systematic Literature Review | Binbin Lin et.al. | 2404.10013 | null |
2024-04-15 | A spatio-temporal model to detect potential outliers in disease mapping | Victoire Michal et.al. | 2404.09882 | null |
2024-04-11 | Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos | Soumyabrata Chaudhuri et.al. | 2404.07645 | null |
2024-04-05 | Low-Rank Robust Subspace Tensor Clustering for Metro Passenger Flow Modeling | Jiuyun Hu et.al. | 2404.04403 | null |
2024-04-03 | Spatio-temporal Modeling of Count Data | Steffen Maletz et.al. | 2404.02982 | link |
2024-03-31 | | Ye Liu et.al. | 2404.00801 | link |
2024-03-30 | ST-LLM: Large Language Models Are Effective Temporal Learners | Ruyang Liu et.al. | 2404.00308 | link |
2024-03-28 | X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization | Anna Kukleva et.al. | 2403.19811 | link |
2024-03-25 | TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models | Zhongwei Zhang et.al. | 2403.17005 | null |
2024-04-13 | Recursive Joint Cross-Modal Attention for Multimodal Fusion in Dimensional Emotion Recognition | R. Gnana Praveen et.al. | 2403.13659 | link |
2024-03-19 | SUN Team's Contribution to ABAW 2024 Competition: Audio-visual Valence-Arousal Estimation and Expression Recognition | Denis Dresvyanskiy et.al. | 2403.12609 | null |
2024-03-18 | Bayesian Optimization Sequential Surrogate (BOSS) Algorithm: Fast Bayesian Inference for a Broad Class of Bayesian Hierarchical Models | Dayi Li et.al. | 2403.12250 | null |
2024-03-19 | Exploring Facial Expression Recognition through Semi-Supervised Pretraining and Temporal Modeling | Jun Yu et.al. | 2403.11942 | null |
2024-03-15 | Spatio-temporal Occupancy Models with INLA | Jafet Belmont et.al. | 2403.10680 | null |
2024-03-15 | Multivariate Bayesian models with flexible shared interactions for analyzing spatio-temporal patterns of rare cancers | Garazi Retegui et.al. | 2403.10440 | link |
2024-03-13 | Leveraging Non-Decimated Wavelet Packet Features and Transformer Models for Time Series Forecasting | Guy P Nason et.al. | 2403.08630 | null |
2024-03-10 | Coherent Temporal Synthesis for Incremental Action Segmentation | Guodong Ding et.al. | 2403.06102 | null |
2024-04-26 | Audio-Visual Person Verification based on Recursive Fusion of Joint Cross-Attention | R. Gnana Praveen et.al. | 2403.04654 | link |