
stoneyang/cv-arxiv-daily

 
 


[![Contributors][contributors-shield]][contributors-url] [![Forks][forks-shield]][forks-url] [![Stargazers][stars-shield]][stars-url] [![Issues][issues-shield]][issues-url]

Updated on 2024.05.24

Table of Contents
  1. pretrain
  2. downstream
  3. adaptor
  4. object detection

pretrain

Publish Date Title Authors PDF Code
2024-05-21 Personalized Residuals for Concept-Driven Text-to-Image Generation Cusuh Ham et.al. 2405.12978v1 null
2024-05-21 Transparency Distortion Robustness for SOTA Image Segmentation Tasks Volker Knauthe et.al. 2405.12864v1 null
2024-05-21 DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control Hong Chen et.al. 2405.12796v1 null
2024-05-21 EchoPT: A Pretrained Transformer Architecture that Predicts 2D In-Air Sonar Images for Mobile Robotics Jan Steckel et.al. 2405.12573v1 null
2024-05-21 ProtT3: Protein-to-Text Generation for Text-based Protein Understanding Zhiyuan Liu et.al. 2405.12564v1 link
2024-05-20 Octo: An Open-Source Generalist Robot Policy Octo Model Team et.al. 2405.12213v1 null
2024-05-20 Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices Nathaniel Cohen et.al. 2405.12211v1 null
2024-05-20 MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Ting Jiang et.al. 2405.12130v1 link
2024-05-21 Sheet Music Transformer ++: End-to-End Full-Page Optical Music Recognition for Pianoform Sheet Music Antonio Ríos-Vila et.al. 2405.12105v2 link
2024-05-20 Continuous Sign Language Recognition with Adapted Conformer via Unsupervised Pretraining Neena Aloysius et.al. 2405.12018v1 null
2024-05-20 Biomedical Entity Linking for Dutch: Fine-tuning a Self-alignment BERT Model on an Automatically Generated Wikipedia Corpus Fons Hartendorp et.al. 2405.11941v1 link
2024-05-20 Depth Prompting for Sensor-Agnostic Depth Estimation Jin-Hwi Park et.al. 2405.11867v1 null
2024-05-20 SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model Siavash Shams et.al. 2405.11831v1 null
2024-05-20 MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise Ruiqi Wu et.al. 2405.11793v1 link
2024-05-20 TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models Junlong Jia et.al. 2405.11788v1 link
2024-05-17 FA-Depth: Toward Fast and Accurate Self-supervised Monocular Depth Estimation Fei Wang et.al. 2405.10885v1 link
2024-05-17 Multicenter Privacy-Preserving Model Training for Deep Learning Brain Metastases Autosegmentation Yixing Huang et.al. 2405.10870v1 null
2024-05-17 Improving face generation quality and prompt following with synthetic captions Michail Tarasiou et.al. 2405.10864v1 null
2024-05-17 Open-Vocabulary Spatio-Temporal Action Detection Tao Wu et.al. 2405.10832v1 null
2024-05-17 Specialising and Analysing Instruction-Tuned and Byte-Level Language Models for Organic Reaction Prediction Jiayun Pang et.al. 2405.10625v1 null
2024-05-17 UniCL: A Universal Contrastive Learning Framework for Large Time Series Models Jiawei Li et.al. 2405.10597v1 null
2024-05-17 A Deep Learning Approach to Heterogeneous Consumer Aesthetics in Retail Fashion Pranjal Rawat et.al. 2405.10498v1 null
2024-05-16 Data Selection for Transfer Unlearning Nazanin Mohammadi Sepahvand et.al. 2405.10425v1 null
2024-05-16 Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model Zheng Gu et.al. 2405.10316v1 null
2024-05-16 Libra: Building Decoupled Vision System on Large Language Models Yifan Xu et.al. 2405.10140v1 link
2024-05-16 Continuous Transfer Learning for UAV Communication-aware Trajectory Design Chenrui Sun et.al. 2405.10087v1 null
2024-05-16 HecVL: Hierarchical Video-Language Pretraining for Zero-shot Surgical Phase Recognition Kun Yuan et.al. 2405.10075v1 null
2024-05-16 Natural Language Can Help Bridge the Sim2Real Gap Albert Yu et.al. 2405.10020v1 null
2024-05-16 Histopathology Foundation Models Enable Accurate Ovarian Cancer Subtype Classification Jack Breen et.al. 2405.09990v1 link
2024-05-16 Cross-sensor self-supervised training and alignment for remote sensing Valerio Marsocci et.al. 2405.09922v1 null
2024-05-16 TransMI: A Framework to Create Strong Baselines from Multilingual Pretrained Language Models for Transliterated Data Yihong Liu et.al. 2405.09913v1 link
2024-05-16 IGOT: Information Gain Optimized Tokenizer on Domain Adaptive Pretraining Dawei Feng et.al. 2405.09857v1 null
2024-05-15 LoRA Learns Less and Forgets Less Dan Biderman et.al. 2405.09673v1 null
2024-05-15 Time-Equivariant Contrastive Learning for Degenerative Disease Progression in Retinal OCT Taha Emre et.al. 2405.09404v1 null
2024-05-15 Matching domain experts by training from scratch on domain knowledge Xiaoliang Luo et.al. 2405.09395v1 null
2024-05-15 HumanRankEval: Automatic Evaluation of LMs as Conversational Assistants Milan Gritta et.al. 2405.09186v1 null
2024-05-14 Self-supervised vision-langage alignment of deep learning representations for bone X-rays analysis Alexandre Englebert et.al. 2405.08932v1 null
2024-05-14 CLIP with Quality Captions: A Strong Pretraining for Vision Tasks Pavan Kumar Anasosalu Vasu et.al. 2405.08911v1 null
2024-05-14 Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Zhimin Li et.al. 2405.08748v1 link
2024-05-14 Self-supervised learning improves robustness of deep learning lung tumor segmentation to CT imaging differences Jue Jiang et.al. 2405.08657v1 null
2024-05-14 Hearing Touch: Audio-Visual Pretraining for Contact-Rich Manipulation Jared Mejia et.al. 2405.08576v1 null
2024-05-14 Improving Transformers with Dynamically Composable Multi-Head Attention Da Xiao et.al. 2405.08553v1 link
2024-05-14 Self-Distillation Improves DNA Sequence Inference Tong Yu et.al. 2405.08538v1 link
2024-05-14 Parameter-Efficient Instance-Adaptive Neural Video Compression Hyunmo Yang et.al. 2405.08530v1 null
2024-05-14 Investigating the 'Autoencoder Behavior' in Speech Self-Supervised Models: a focus on HuBERT's Pretraining Valentin Vielzeuf et.al. 2405.08402v1 null
2024-05-14 Could Chemical LLMs benefit from Message Passing Jiaqing Xie et.al. 2405.08334v1 null
2024-05-13 Rethinking Histology Slide Digitization Workflows for Low-Resource Settings Talat Zehra et.al. 2405.08169v1 link
2024-05-13 Improving Breast Cancer Grade Prediction with Multiparametric MRI Created Using Optimized Synthetic Correlated Diffusion Imaging Chi-en Amy Tai et.al. 2405.07861v1 null
2024-05-13 SAR Image Synthesis with Diffusion Models Denisa Qosja et.al. 2405.07776v1 null
2024-05-13 LlamaTurk: Adapting Open-Source Generative Large Language Models for Low-Resource Language Cagri Toraman et.al. 2405.07745v1 link
2024-05-13 Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection Dehong Kong et.al. 2405.07595v1 null
2024-05-13 Thai Universal Dependency Treebank Panyut Sriwirote et.al. 2405.07586v1 null
2024-05-13 Consistency Policy: Accelerated Visuomotor Policies via Consistency Distillation Aaditya Prasad et.al. 2405.07503v1 null
2024-05-13 CLIP-Powered TASS: Target-Aware Single-Stream Network for Audio-Visual Question Answering Yuanyuan Jiang et.al. 2405.07451v1 null
2024-05-13 Sakuga-42M Dataset: Scaling Up Cartoon Research Zhenglin Pan et.al. 2405.07425v1 link
2024-05-13 MoVL:Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks Haijiang Tian et.al. 2405.07411v1 null
2024-05-12 Zero Shot Context-Based Object Segmentation using SLIP (SAM+CLIP) Saaketh Koundinya Gundavarapu et.al. 2405.07284v1 null
2024-05-10 Federated Document Visual Question Answering: A Pilot Study Khanh Nguyen et.al. 2405.06636v1 null
2024-05-10 LMD3: Language Model Data Density Dependence John Kirchenbauer et.al. 2405.06331v1 null
2024-05-10 Decoding Emotions in Abstract Art: Cognitive Plausibility of CLIP in Recognizing Color-Emotion Associations Hanna-Sophia Widhoelzl et.al. 2405.06319v1 null
2024-05-10 SaudiBERT: A Large Language Model Pretrained on Saudi Dialect Corpora Faisal Qarah et.al. 2405.06239v1 null
2024-05-10 VLSM-Adapter: Finetuning Vision-Language Segmentation Efficiently with Lightweight Blocks Manish Dhakal et.al. 2405.06196v1 null
2024-05-10 ACTION: Augmentation and Computation Toolbox for Brain Network Analysis with Functional MRI Yuqi Fang et.al. 2405.06178v1 null
2024-05-09 UnSegGNet: Unsupervised Image Segmentation using Graph Neural Networks Kovvuri Sai Gopal Reddy et.al. 2405.06057v1 link
2024-05-09 Efficient Pretraining Model based on Multi-Scale Local Visual Field Feature Reconstruction for PCB CT Image Element Segmentation Chen Chen et.al. 2405.05745v1 null
2024-05-09 Parameter-Efficient Fine-Tuning With Adapters Keyu Chen et.al. 2405.05493v1 null
2024-05-09 PLLM-CS: Pre-trained Large Language Model (LLM) for Cyber Threat Detection in Satellite Networks Mohammed Hassanin et.al. 2405.05469v1 null
2024-05-08 Deep Learning Method to Predict Wound Healing Progress Based on Collagen Fibers in Wound Tissue Juan He et.al. 2405.05297v1 null
2024-05-08 Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming Tommaso Pasini et.al. 2405.05176v1 null
2024-05-08 Seeds of Stereotypes: A Large-Scale Textual Analysis of Race and Gender Associations with Diseases in Online Sources Lasse Hyldig Hansen et.al. 2405.05049v1 null
2024-05-08 ${M^2D}$NeRF: Multi-Modal Decomposition NeRF with 3D Feature Fields Ning Wang et.al. 2405.05010v1 null
2024-05-08 ChuXin: 1.6B Technical Report Xiaomin Zhuang et.al. 2405.04828v1 null
2024-05-07 Remote Diffusion Kunal Sunil Kasodekar et.al. 2405.04717v1 null
2024-05-07 Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking Emre Can Acikgoz et.al. 2405.04685v1 null
2024-05-07 TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation Hritik Bansal et.al. 2405.04682v1 null
2024-05-07 S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling Minh Tran et.al. 2405.04489v1 null
2024-05-08 DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model DeepSeek-AI et.al. 2405.04434v2 link
2024-05-07 Cross-IQA: Unsupervised Learning for Image Quality Assessment Zhen Zhang et.al. 2405.04311v1 null
2024-05-07 Sign2GPT: Leveraging Large Language Models for Gloss-Free Sign Language Translation Ryan Wong et.al. 2405.04164v1 null
2024-05-07 Locally Differentially Private In-Context Learning Chunyan Zheng et.al. 2405.04032v1 null
2024-05-07 SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing Yuying Ge et.al. 2405.04007v1 null
2024-05-07 Knowledge Adaptation from Large Language Model to Recommendation for Practical Industrial Application Jian Jia et.al. 2405.03988v1 null
2024-05-07 Contextualization with SPLADE for High Recall Retrieval Eugene Yang et.al. 2405.03972v1 link
2024-05-07 AdsorbDiff: Adsorbate Placement via Conditional Denoising Diffusion Adeesh Kolluru et.al. 2405.03962v1 null
2024-05-06 Provable Preconditioned Plug-and-Play Approach for Compressed Sensing MRI Reconstruction Tao Hong et.al. 2405.03854v1 null
2024-05-06 Pose Priors from Language Models Sanjay Subramanian et.al. 2405.03689v1 null
2024-05-06 AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design Kamal Choudhary et.al. 2405.03680v1 null
2024-05-06 Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment Abhinav Agarwalla et.al. 2405.03594v1 null
2024-05-06 Whispy: Adapting STT Whisper Models to Real-Time Environments Antonio Bevilacqua et.al. 2405.03484v1 null
2024-05-06 Adapting Dual-encoder Vision-language Models for Paraphrased Retrieval Jiacheng Cheng et.al. 2405.03190v1 null
2024-05-06 GeoContrastNet: Contrastive Key-Value Edge Learning for Language-Agnostic Document Understanding Nil Biescas et.al. 2405.03104v1 null
2024-05-06 SketchGPT: Autoregressive Modeling for Sketch Generation and Recognition Adarsh Tiwari et.al. 2405.03099v1 null
2024-05-05 RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification June-Woo Kim et.al. 2405.02996v1 null
2024-05-05 Score-based Generative Priors Guided Model-driven Network for MRI Reconstruction Xiaoyu Qiao et.al. 2405.02958v1 null
2024-05-05 IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs Yuzhen Mao et.al. 2405.02842v1 null
2024-05-03 Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets Xuelong Geng et.al. 2405.02132v1 null
2024-05-03 A Mutual Information Perspective on Federated Contrastive Learning Christos Louizos et.al. 2405.02081v1 null
2024-05-03 SATO: Stable Text-to-Motion Framework Wenshuo Chen et.al. 2405.01461v2 link
2024-05-02 StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation Yupeng Zhou et.al. 2405.01434v1 link
2024-05-02 CromSS: Cross-modal pre-training with noisy labels for remote sensing image segmentation Chenying Liu et.al. 2405.01217v1 null
2024-05-02 Language Fairness in Multilingual Information Retrieval Eugene Yang et.al. 2405.00978v1 link
2024-05-02 PLAID SHIRTTT for Large-Scale Streaming Dense Retrieval Dawn Lawrie et.al. 2405.00975v1 link
2024-05-01 Transformer-Based Self-Supervised Learning for Histopathological Classification of Ischemic Stroke Clot Origin K. Yeh et.al. 2405.00908v1 null
2024-05-01 SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models Burak Can Biner et.al. 2405.00878v1 null
2024-05-01 Adapting Pretrained Networks for Image Quality Assessment on High Dynamic Range Displays Andrei Chubarau et.al. 2405.00670v1 null
2024-05-01 Are Models Biased on Text without Gender-related Language? Catarina G Belém et.al. 2405.00588v1 link
2024-05-01 Self-supervised Pre-training of Text Recognizers Martin Kišš et.al. 2405.00420v1 link
2024-05-01 Expert Insight-Enhanced Follow-up Chest X-Ray Summary Generation Zhichuan Wang et.al. 2405.00344v1 null
2024-04-30 PAODING: A High-fidelity Data-free Pruning Toolkit for Debloating Pre-trained Neural Networks Mark Huasong Meng et.al. 2405.00074v1 null
2024-04-30 Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model Denys Godwin et.al. 2404.19609v1 null
2024-04-30 Automatic Cardiac Pathology Recognition in Echocardiography Images Using Higher Order Dynamic Mode Decomposition and a Vision Transformer for Small Datasets Andrés Bell-Navas et.al. 2404.19579v1 null
2024-04-30 CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation Weiquan Huang et.al. 2404.19394v1 link
2024-04-30 Knowledge Distillation vs. Pretraining from Scratch under a Fixed (Computation) Budget Minh Duc Bui et.al. 2404.19319v1 null
2024-04-30 Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank Sungjune Park et.al. 2404.19299v1 null
2024-04-30 Revisiting the Adversarial Robustness of Vision Language Models: a Multimodal Perspective Wanqi Zhou et.al. 2404.19287v1 null
2024-04-30 Understanding Multimodal Contrastive Learning Through Pointwise Mutual Information Toshimitsu Uesaka et.al. 2404.19228v1 null
2024-04-29 What Drives Performance in Multilingual Language Models? Sina Bagheri Nezhad et.al. 2404.19159v1 link
2024-04-29 Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing Leonardo Rossi et.al. 2404.18924v1 null
2024-04-29 Overcoming Knowledge Barriers: Online Imitation Learning from Observation with Pretrained World Models Xingyuan Zhang et.al. 2404.18896v1 null
2024-04-29 It's Difficult to be Neutral -- Human and LLM-based Sentiment Annotation of Patient Comments Petter Mæhlum et.al. 2404.18832v1 null
2024-04-30 PatentGPT: A Large Language Model for Intellectual Property Zilong Bai et.al. 2404.18255v2 null
2024-04-28 Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment Tengjun Huang et.al. 2404.18253v1 link
2024-04-28 TextGram: Towards a better domain-adaptive pretraining Sharayu Hiwarkhedkar et.al. 2404.18228v1 null
2024-04-28 Can Perplexity Predict Fine-Tuning Performance? An Investigation of Tokenization Effects on Sequential Language Models for Nepali Nishant Luitel et.al. 2404.18071v1 null
2024-04-28 Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model Xiaolong Li et.al. 2404.18065v1 null
2024-04-27 Critical Review for One-class Classification: recent advances and the reality behind them Toshitaka Hayashi et.al. 2404.17931v1 null
2024-04-27 T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining Yi Yuan et.al. 2404.17806v1 null
2024-04-26 Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo Stephen Zhao et.al. 2404.17546v1 null
2024-04-26 Low Cost Machine Vision for Insect Classification Danja Brandt et.al. 2404.17488v1 null
2024-04-26 SAGHOG: Self-Supervised Autoencoder for Generating HOG Features for Writer Retrieval Marco Peer et.al. 2404.17221v1 link
2024-04-26 Self-supervised visual learning in the low-data regime: a comparative evaluation Sotirios Konstantakos et.al. 2404.17202v1 null
2024-04-26 Few-shot Calligraphy Style Learning Fangda Chen et.al. 2404.17199v1 link
2024-04-26 TIGQA:An Expert Annotated Question Answering Dataset in Tigrinya Hailay Teklehaymanot et.al. 2404.17194v1 null
2024-04-25 Türkçe Dil Modellerinin Performans Karşılaştırması Performance Comparison of Turkish Language Models Eren Dogan et.al. 2404.17010v1 null
2024-04-25 Constellation Dataset: Benchmarking High-Altitude Object Detection for an Urban Intersection Mehmet Kerem Turkcan et.al. 2404.16944v1 link
2024-04-25 A Short Survey of Human Mobility Prediction in Epidemic Modeling from Transformers to LLMs Christian N. Mayemba et.al. 2404.16921v1 null
2024-04-25 Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding Mostafa Elhoushi et.al. 2404.16710v1 null
2024-04-25 Road Surface Friction Estimation for Winter Conditions Utilising General Visual Features Risto Ojala et.al. 2404.16578v1 null
2024-04-25 Leveraging Pretrained Latent Representations for Few-Shot Imitation Learning on a Dexterous Robotic Hand Davide Liconti et.al. 2404.16483v1 null
2024-04-25 Leveraging tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics Ben Williams et.al. 2404.16436v1 null
2024-04-25 TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models Haomiao Ni et.al. 2404.16306v1 null
2024-04-24 Towards a Holistic Evaluation of LLMs on Factual Knowledge Recall Jiaqing Yuan et.al. 2404.16164v1 null
2024-04-24 FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication Eric Slyman et.al. 2404.16123v1 null
2024-04-24 MoDE: CLIP Data Experts via Clustering Jiawei Ma et.al. 2404.16030v1 link
2024-04-24 Representing Part-Whole Hierarchies in Foundation Models by Learning Localizability, Composability, and Decomposability from Anatomy via Self-Supervision Mohammad Reza Hosseinzadeh Taher et.al. 2404.15672v1 null
2024-04-24 HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts Xinlei Niu et.al. 2404.15637v1 null
2024-04-24 Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? Hossein Salami et.al. 2404.15578v1 null
2024-04-24 Retrieval Head Mechanistically Explains Long-Context Factuality Wenhao Wu et.al. 2404.15574v1 null
2024-04-23 SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation Xiangyu Xu et.al. 2404.15276v1 link
2024-04-23 CT-GLIP: 3D Grounded Language-Image Pretraining with CT Scans and Radiology Reports for Full-Body Scenarios Jingyang Lin et.al. 2404.15272v1 null
2024-04-23 Setting up the Data Printer with Improved English to Ukrainian Machine Translation Yurii Paniv et.al. 2404.15196v1 link
2024-04-23 Combating Missing Modalities in Egocentric Videos at Test Time Merey Ramazanova et.al. 2404.15161v1 null
2024-04-23 DP-Net: Learning Discriminative Parts for image recognition Ronan Sicre et.al. 2404.15037v1 null
2024-04-23 IPAD: Industrial Process Anomaly Detection Dataset Jinfan Liu et.al. 2404.15033v1 null
2024-04-23 Multi-Modal Prompt Learning on Blind Image Quality Assessment Wensheng Pan et.al. 2404.14949v1 null
2024-04-23 Driver Activity Classification Using Generalizable Representations from Vision-Language Models Ross Greer et.al. 2404.14906v1 null
2024-04-23 FMint: Bridging Human Designed and Data Pretrained Models for Differential Equation Foundation Model Zezheng Song et.al. 2404.14688v1 null
2024-04-23 Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers Elijah Pelofske et.al. 2404.14680v1 null
2024-04-22 PARAMANU-GANITA: Language Model with Mathematical Capabilities Mitodru Niyogi et.al. 2404.14395v1 null
2024-04-22 Calc-CMU at SemEval-2024 Task 7: Pre-Calc -- Learning to Use the Calculator Improves Numeracy in Language Models Vishruth Veerendranath et.al. 2404.14355v1 link
2024-04-22 Automatic Discovery of Visual Circuits Achyuta Rajaram et.al. 2404.14349v1 link
2024-04-22 Heterogeneous Face Recognition Using Domain Invariant Units Anjith George et.al. 2404.14343v1 null
2024-04-22 Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels Jan-Philipp Fränken et.al. 2404.14313v1 link
2024-04-22 OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks Sophia Sirko-Galouchenko et.al. 2404.14027v1 null
2024-04-22 EventLens: Leveraging Event-Aware Pretraining and Cross-modal Linking Enhances Visual Commonsense Reasoning Mingjie Ma et.al. 2404.13847v1 null
2024-04-21 FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization Zhaopeng Gu et.al. 2404.13671v1 null
2024-04-21 PEACH: Pretrained-embedding Explanation Across Contextual and Hierarchical Structure Feiqi Cao et.al. 2404.13645v1 link
2024-04-21 Lost in Space: Probing Fine-grained Spatial Understanding in Vision and Language Resamplers Georgios Pantazopoulos et.al. 2404.13594v1 link
2024-04-19 MoVA: Adapting Mixture of Vision Experts to Multimodal Context Zhuofan Zong et.al. 2404.13046v1 link
2024-04-19 Training-and-prompt-free General Painterly Harmonization Using Image-wise Attention Sharing Teng-Fang Hsiao et.al. 2404.12900v1 link
2024-04-19 Grasper: A Generalist Pursuer for Pursuit-Evasion Problems Pengdeng Li et.al. 2404.12626v1 link
2024-04-18 Towards Large Language Models as Copilots for Theorem Proving in Lean Peiyang Song et.al. 2404.12534v1 link
2024-04-18 Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis Yufan Li et.al. 2404.12481v1 null
2024-04-18 mOthello: When Do Cross-Lingual Representation Alignment and Cross-Lingual Transfer Emerge in Multilingual Models? Tianze Hua et.al. 2404.12444v1 null
2024-04-18 MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale Xiaotang Gai et.al. 2404.12372v1 null
2024-04-18 AniClipart: Clipart Animation with Text-to-Video Priors Ronghuan Wu et.al. 2404.12347v1 null
2024-04-18 GraFIQs: Face Image Quality Assessment Using Gradient Magnitudes Jan Niklas Kolf et.al. 2404.12203v1 link
2024-04-18 OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data Chandeepa Dissanayake et.al. 2404.12195v1 link
2024-04-18 How to Benchmark Vision Foundation Models for Semantic Segmentation? Tommie Kerssies et.al. 2404.12172v1 null
2024-04-18 Aligning language models with human preferences Tomasz Korbak et.al. 2404.12150v1 link
2024-04-18 MaskCD: A Remote Sensing Change Detection Network Based on Mask Classification Weikang Yu et.al. 2404.12081v1 link
2024-04-18 Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition Xunsong Li et.al. 2404.11903v1 null
2024-04-17 How often are errors in natural language reasoning due to paraphrastic variability? Neha Srikanth et.al. 2404.11717v1 null
2024-04-17 Pretraining Billion-scale Geospatial Foundational Models on Frontier Aristeidis Tsaris et.al. 2404.11706v1 null
2024-04-17 On the Scalability of GNNs for Molecular Graphs Maciej Sypetkowski et.al. 2404.11568v1 null
2024-04-17 Predicting Long-horizon Futures by Conditioning on Geometry and Time Tarasha Khurana et.al. 2404.11554v1 null
2024-04-17 ScaleFold: Reducing AlphaFold Initial Training Time to 10 Hours Feiwen Zhu et.al. 2404.11068v1 null
2024-04-17 Lightweight Unsupervised Federated Learning with Pretrained Vision Language Model Hao Yan et.al. 2404.11046v1 null
2024-04-17 Many-Shot In-Context Learning Rishabh Agarwal et.al. 2404.11018v1 null
2024-04-17 MaeFuse: Transferring Omni Features with Pretrained Masked Autoencoders for Infrared and Visible Image Fusion via Guided Training Jiayang Li et.al. 2404.11016v1 null
2024-04-16 More Room for Language: Investigating the Effect of Retrieval on Language Models David Samuel et.al. 2404.10939v1 null
2024-04-16 Retrieval Augmented Verification : Unveiling Disinformation with Structured Representations for Zero-Shot Real-Time Evidence-guided Fact-Checking of Multi-modal Social media posts Arka Ujjal Dey et.al. 2404.10702v1 null
2024-04-17 Do Counterfactual Examples Complicate Adversarial Training? Eric Yeats et.al. 2404.10588v2 null
2024-04-17 Optimization of Prompt Learning via Multi-Knowledge Representation for Vision-Language Models Enming Zhang et.al. 2404.10357v2 null
2024-04-16 From Data Deluge to Data Curation: A Filtering-WoRA Paradigm for Efficient Text-based Person Search Jintao Sun et.al. 2404.10292v1 null
2024-04-16 Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology Oren Kraus et.al. 2404.10242v1 link
2024-04-16 Compressible and Searchable: AI-native Multi-Modal Retrieval System with Learned Image Compression Jixiang Luo et.al. 2404.10234v1 null
2024-04-15 Self-Supervised Learning Featuring Small-Scale Image Dataset for Treatable Retinal Diseases Classification Luffina C. Huang et.al. 2404.10166v1 null
2024-04-15 NOISe: Nuclei-Aware Osteoclast Instance Segmentation for Mouse-to-Human Domain Transfer Sai Kumar Reddy Manne et.al. 2404.10130v1 link
2024-04-15 Explainable Light-Weight Deep Learning Pipeline for Improved Drought Stres Aswini Kumar Patra et.al. 2404.10073v1 null
2024-04-15 EgoPet: Egomotion and Interaction Data from an Animal's Perspective Amir Bar et.al. 2404.09991v1 null
2024-04-15 Contrastive Pretraining for Visual Concept Explanations of Socioeconomic Outcomes Ivica Obadic et.al. 2404.09768v1 null
2024-04-15 Bridging Vision and Language Spaces with Assignment Prediction Jungin Park et.al. 2404.09632v1 link
2024-04-15 Magic Clothing: Controllable Garment-Driven Image Synthesis Weifeng Chen et.al. 2404.09512v1 link
2024-04-15 Leveraging Temporal Contextualization for Video Action Recognition Minji Kim et.al. 2404.09490v1 null
2024-04-15 RankCLIP: Ranking-Consistent Language-Image Pretraining Yiming Zhang et.al. 2404.09387v1 null
2024-04-16 Text-to-Song: Towards Controllable Music Generation Incorporating Vocals and Accompaniment Zhiqing Hong et.al. 2404.09313v2 null
2024-04-13 MaSkel: A Model for Human Whole-body X-rays Generation from Human Masking Images Yingjie Xi et.al. 2404.09000v1 link
2024-04-13 DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detector Johan Edstedt et.al. 2404.08928v1 link
2024-04-13 Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension Mengnan Qi et.al. 2404.08885v1 null
2024-04-12 BERT-LSH: Reducing Absolute Compute For Attention Zezheng Li et.al. 2404.08836v1 null
2024-04-12 Probing the 3D Awareness of Visual Foundation Models Mohamed El Banani et.al. 2404.08636v1 link
2024-04-12 Pre-training Small Base LMs with Fewer Tokens Sunny Sanyal et.al. 2404.08634v1 link
2024-04-12 Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation Haozhe Zhao et.al. 2404.08491v1 link
2024-04-12 OTTER: Improving Zero-Shot Classification via Optimal Transport Changho Shin et.al. 2404.08461v1 null
2024-04-12 AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees William Fleshman et.al. 2404.08417v1 null
2024-04-12 Pretraining and Updating Language- and Domain-specific Large Language Model: A Case Study in Japanese Business Domain Kosuke Takahashi et.al. 2404.08262v1 null
2024-04-12 Improving Continuous Sign Language Recognition with Adapted Image Models Lianyu Hu et.al. 2404.08226v1 link
2024-04-12 Measuring Cross-lingual Transfer in Bytes Leandro Rodrigues de Souza et.al. 2404.08191v1 link
2024-04-11 Self-supervised Dataset Distillation: A Good Compression Is All You Need Muxin Zhou et.al. 2404.07976v1 link
2024-04-11 Rho-1: Not All Tokens Are What You Need Zhenghao Lin et.al. 2404.07965v1 link
2024-04-11 MindBridge: A Cross-Subject Brain Decoding Framework Shizun Wang et.al. 2404.07850v1 link
2024-04-11 Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck Nathan Godey et.al. 2404.07647v1 null
2024-04-11 Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval Minkuk Kim et.al. 2404.07610v1 link
2024-04-11 GLID: Pre-training a Generalist Encoder-Decoder Vision Model Jihao Liu et.al. 2404.07603v1 null
2024-04-11 A fine-tuning workflow for automatic first-break picking with deep learning Amir Mardan et.al. 2404.07400v1 link
2024-04-10 Accurate Tennis Court Line Detection on Amateur Recorded Matches Sameer Agrawal et.al. 2404.06977v1 null
2024-04-10 GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism Shuzhou Yuan et.al. 2404.06911v1 null
2024-04-10 Text-Based Reasoning About Vector Graphics Zhenhailong Wang et.al. 2404.06479v2 null
2024-04-10 MuPT: A Generative Symbolic Music Pretrained Transformer Xingwei Qu et.al. 2404.06393v2 null
2024-04-11 On adversarial training and the 1 Nearest Neighbor classifier Amir Hagai et.al. 2404.06313v2 link
2024-04-09 ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization Yixin Yang et.al. 2404.06251v1 link
2024-04-09 Anchor-based Robust Finetuning of Vision-Language Models Jinwei Han et.al. 2404.06244v1 null
2024-04-09 [Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus Leshem Choshen et.al. 2404.06214v1 null
2024-04-09 OmniFusion Technical Report Elizaveta Goncharova et.al. 2404.06212v1 link
2024-04-09 Unified Multi-modal Diagnostic Framework with Reconstruction Pre-training and Heterogeneity-combat Tuning Yupei Zhang et.al. 2404.06057v1 link
2024-04-09 Online/Offline Learning to Enable Robust Beamforming: Limited Feedback Meets Deep Generative Models Ying Li et.al. 2404.06055v1 null
2024-04-08 Language-Independent Representations Improve Zero-Shot Summarization Vladimir Solovyev et.al. 2404.05720v1 null
2024-04-08 MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning Matteo Farina et.al. 2404.05621v1 null
2024-04-08 Language Models on a Diet: Cost-Efficient Development of Encoders for Closely-Related Languages via Additional Pretraining Nikola Ljubešić et.al. 2404.05428v1 link
2024-04-08 Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Yiming Li et.al. 2404.05415v1 null
2024-04-07 StockGPT: A GenAI Model for Stock Prediction and Trading Dat Mai et.al. 2404.05101v1 null
2024-04-07 AUEditNet: Dual-Branch Facial Action Unit Intensity Manipulation with Implicit Disentanglement Shiwei Jin et.al. 2404.05063v1 null
2024-04-07 PagPassGPT: Pattern Guided Password Guessing via Generative Pretrained Transformer Xingyu Su et.al. 2404.04886v1 link
2024-04-07 Msmsfnet: a multi-stream and multi-scale fusion net for edge detection Chenguang Liu et.al. 2404.04856v1 null
2024-04-07 F-MALLOC: Feed-forward Memory Allocation for Continual Learning in Neural Machine Translation Junhong Wu et.al. 2404.04846v1 null
2024-04-07 Data Bias According to Bipol: Men are Naturally Right and It is the Role of Women to Follow Their Lead Irene Pagliai et.al. 2404.04838v1 null
2024-04-05 Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model Xinrun Du et.al. 2404.04167v1 null
2024-04-04 No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance Vishaal Udandarao et.al. 2404.04125v1 link
2024-04-05 Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation Mingyuan Zhou et.al. 2404.04057v1 null
2024-04-05 Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer Hele-Andra Kuulmets et.al. 2404.04042v1 null
2024-04-05 Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds Annerose Eichel et.al. 2404.04031v1 null
2024-04-04 Layerwise Early Stopping for Test Time Adaptation Sabyasachi Sahoo et.al. 2404.03784v1 null
2024-04-04 DiffBody: Human Body Restoration by Imagining with Generative Diffusion Prior Yiming Zhang et.al. 2404.03642v1 null
2024-04-04 Learn When (not) to Trust Language Models: A Privacy-Centric Adaptive Model-Aware Approach Chengkai Huang et.al. 2404.03514v1 null
2024-04-04 A Cause-Effect Look at Alleviating Hallucination of Knowledge-grounded Dialogue Generation Jifan Yu et.al. 2404.03491v1 null
2024-04-04 Scaling Up Video Summarization Pretraining with Large Language Models Dawit Mureja Argaw et.al. 2404.03398v1 null
2024-04-03 Scaling Laws for Galaxy Images Mike Walmsley et.al. 2404.02973v1 link
2024-04-03 MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation Petru-Daniel Tudosiu et.al. 2404.02790v1 null
2024-04-03 CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech Jaehyeon Kim et.al. 2404.02781v1 null
2024-04-03 DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement Hao Wu et.al. 2404.02755v1 null
2024-04-03 Cross-Architecture Transfer Learning for Linear-Cost Inference Transformers Sehyun Choi et.al. 2404.02684v1 null
2024-04-03 Large Language Models for Expansion of Spoken Language Understanding Systems to New Languages Jakub Hoscilowicz et.al. 2404.02588v1 link
2024-04-03 The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education Paiheng Xu et.al. 2404.02444v1 null
2024-04-03 What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases Anthony Meng Huat Tiong et.al. 2404.02415v1 link
2024-04-02 Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models Zeyu Yang et.al. 2404.02148v1 link
2024-04-02 Iterated Learning Improves Compositionality in Large Vision-Language Models Chenhao Zheng et.al. 2404.02145v1 null
2024-04-03 ViTamin: Designing Scalable Vision Models in the Vision-Language Era Jieneng Chen et.al. 2404.02132v2 link
2024-04-02 FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning Joel Niklaus et.al. 2404.02127v1 link
2024-04-02 Adaptive Feature Fusion Neural Network for Glaucoma Segmentation on Unseen Fundus Images Jiyuan Zhong et.al. 2404.02084v1 null
2024-04-02 Noise Masking Attacks and Defenses for Pretrained Speech Models Matthew Jagielski et.al. 2404.02052v1 null
2024-04-02 Dissecting Paraphrases: The Impact of Prompt Syntax and supplementary Information on Knowledge Retrieval from Pretrained Language Models Stephan Linzbach et.al. 2404.01992v1 null
2024-04-02 Activation Steering for Robust Type Prediction in CodeLLMs Francesca Lucchetti et.al. 2404.01903v1 null
2024-04-02 Poro 34B and the Blessing of Multilinguality Risto Luukkonen et.al. 2404.01856v1 null
2024-04-02 Where to Move Next: Zero-shot Generalization of LLMs for Next POI Recommendation Shanshan Feng et.al. 2404.01855v1 null
2024-03-29 Convolutional Prompting meets Language Models for Continual Learning Anurag Roy et.al. 2403.20317v1 null
2024-03-29 Latxa: An Open Language Model and Evaluation Suite for Basque Julen Etxaniz et.al. 2403.20266v1 link
2024-03-29 Long-Tailed Anomaly Detection with Learnable Class Names Chih-Hui Ho et.al. 2403.20236v1 null
2024-03-29 StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation Sidi Wu et.al. 2403.20142v1 null
2024-03-29 FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models Barbara Toniella Corradini et.al. 2403.20105v1 null
2024-03-29 Negative Label Guided OOD Detection with Pretrained Vision-Language Models Xue Jiang et.al. 2403.20078v1 link
2024-03-28 Siamese Vision Transformers are Scalable Audio-visual Learners Yan-Bo Lin et.al. 2403.19638v1 link
2024-03-28 SA-GS: Scale-Adaptive Gaussian Splatting for Training-Free Anti-Aliasing Xiaowei Song et.al. 2403.19615v1 link
2024-03-28 LocCa: Visual Pretraining with Location-aware Captioners Bo Wan et.al. 2403.19596v1 null
2024-03-28 Situation Awareness for Driver-Centric Driving Style Adaptation Johann Haselberger et.al. 2403.19595v1 link
2024-03-28 Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics Norman Di Palo et.al. 2403.19578v1 null
2024-03-28 Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment Alireza Ganjdanesh et.al. 2403.19490v1 null
2024-03-28 Checkpoint Merging via Bayesian Optimization in LLM Pretraining Deyuan Liu et.al. 2403.19390v1 null
2024-03-28 NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data Manuel Tonneau et.al. 2403.19260v1 link
2024-03-29 STaR-GATE: Teaching Language Models to Ask Clarifying Questions Chinmaya Andukuri et.al. 2403.19154v2 null
2024-03-28 Instruction-based Hypergraph Pretraining Mingdai Yang et.al. 2403.19063v1 null
2024-03-27 Bringing Textual Prompt to AI-Generated Image Quality Assessment Bowen Qu et.al. 2403.18714v1 null
2024-03-27 Noise-Robust Keyword Spotting through Self-supervised Pretraining Jacob Mørk et.al. 2403.18560v1 null
2024-03-27 OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning Noor Ahmed et.al. 2403.18550v1 null
2024-03-27 Enhanced Generative Recommendation via Content and Collaboration Integration Yidan Wang et.al. 2403.18480v1 null
2024-03-27 NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation Jingyang Huo et.al. 2403.18211v1 null
2024-03-26 Juru: Legal Brazilian Large Language Model from Reputable Sources Roseval Malaquias Junior et.al. 2403.18140v1 null
2024-03-26 The Impact of Syntactic and Semantic Proximity on Machine Translation with Back-Translation Nicolas Guerin et.al. 2403.18031v1 null
2024-03-26 The Unreasonable Ineffectiveness of the Deeper Layers Andrey Gromov et.al. 2403.17887v1 null
2024-03-26 GenesisTex: Adapting Image Denoising Diffusion to Texture Space Chenjian Gao et.al. 2403.17782v1 null
2024-03-26 Leave No Patient Behind: Enhancing Medication Recommendation for Rare Disease Patients Zihao Zhao et.al. 2403.17745v1 null
2024-03-26 Masked Autoencoders are PDE Learners Anthony Zhou et.al. 2403.17728v1 null
2024-03-26 REFeREE: A REference-FREE Model-Based Metric for Text Simplification Yichen Huang et.al. 2403.17640v1 link
2024-03-25 Exploring CausalWorld: Enhancing robotic manipulation via knowledge transfer and curriculum learning Xinrui Wang et.al. 2403.17266v1 null
2024-03-25 Joint chest X-ray diagnosis and clinical visual attention prediction with multi-stage cooperative learning: enhancing interpretability Zirui Qiu et.al. 2403.16970v1 null
2024-03-25 Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance Jiasheng Ye et.al. 2403.16952v1 link
2024-03-25 Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text Junshu Tang et.al. 2403.16897v1 null
2024-03-25 Can Machine Translation Bridge Multilingual Pretraining and Cross-lingual Transfer Learning? Shaoxiong Ji et.al. 2403.16777v1 null
2024-03-25 ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search Zehan Li et.al. 2403.16702v1 null
2024-03-25 A comparative analysis of embedding models for patent similarity Grazia Sveva Ascione et.al. 2403.16630v1 null
2024-03-25 Elysium: Exploring Object-level Perception in Videos via MLLM Han Wang et.al. 2403.16558v1 link
2024-03-25 An Intermediate Fusion ViT Enables Efficient Text-Image Alignment in Diffusion Models Zizhao Hu et.al. 2403.16530v1 null
2024-03-25 Self-Supervised Learning for Medical Image Data with Anatomy-Oriented Imaging Planes Tianwei Zhang et.al. 2403.16499v1 null
2024-03-25 PathoTune: Adapting Visual Foundation Model to Pathological Specialists Jiaxuan Lu et.al. 2403.16497v1 null
2024-03-25 LSTTN: A Long-Short Term Transformer-based Spatio-temporal Neural Network for Traffic Flow Forecasting Qinyao Luo et.al. 2403.16495v1 null
2024-03-25 DeepMachining: Online Prediction of Machining Errors of Lathe Machines Xiang-Li Lu et.al. 2403.16451v1 null
2024-03-25 KIT-19: A Comprehensive Korean Instruction Toolkit on 19 Tasks for Fine-Tuning Korean Large Language Models Dongjun Jang et.al. 2403.16444v1 null
2024-03-22 Long-CLIP: Unlocking the Long-Text Capability of CLIP Beichen Zhang et.al. 2403.15378v1 null
2024-03-22 CoLLEGe: Concept Embedding Generation for Large Language Models Ryan Teehan et.al. 2403.15362v1 null
2024-03-22 Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities Zhitong Xiong et.al. 2403.15356v1 null
2024-03-22 SFOD: Spiking Fusion Object Detector Yimeng Fan et.al. 2403.15192v1 link
2024-03-22 Brain-grounding of semantic vectors improves neural decoding of visual stimuli Shirin Vafaei et.al. 2403.15176v1 null
2024-03-22 LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement Nicholas Lee et.al. 2403.15042v1 null
2024-03-22 Risk and Response in Large Language Models: Evaluating Key Threat Categories Bahareh Harandizadeh et.al. 2403.14988v1 null
2024-03-22 CLIP-VQDiffusion : Langauge Free Training of Text To Image generation using CLIP and vector quantized diffusion model Seungdae Han et.al. 2403.14944v1 null
2024-03-21 VidLA: Video-Language Alignment at Scale Mamshad Nayeem Rizve et.al. 2403.14870v1 null
2024-03-21 TAMS: Translation-Assisted Morphological Segmentation Enora Rice et.al. 2403.14840v1 null
2024-03-21 ReNoise: Real Image Inversion Through Iterative Noising Daniel Garibi et.al. 2403.14602v1 null
2024-03-21 Towards Efficient Information Fusion: Concentric Dual Fusion Attention Based Multiple Instance Learning for Whole Slide Images Yujian Liu et.al. 2403.14346v1 null
2024-03-21 Beyond Surface Similarity: Detecting Subtle Semantic Shifts in Financial Narratives Jiaxin Liu et.al. 2403.14341v1 null
2024-03-21 Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition Sihyun Yu et.al. 2403.14148v1 null
2024-03-21 Text-Enhanced Data-free Approach for Federated Class-Incremental Learning Minh-Tuan Tran et.al. 2403.14101v1 link
2024-03-20 Evaluating Unsupervised Dimensionality Reduction Methods for Pretrained Sentence Embeddings Gaifan Zhang et.al. 2403.14001v1 null
2024-03-20 Visually Grounded Speech Models have a Mutual Exclusivity Bias Leanne Nortje et.al. 2403.13922v1 null
2024-03-20 Leveraging Linguistically Enhanced Embeddings for Open Information Extraction Fauzan Farooqui et.al. 2403.13903v1 null
2024-03-20 On Pretraining Data Diversity for Self-Supervised Learning Hasan Abed Al Kader Hammoud et.al. 2403.13808v1 link
2024-03-20 Learning from Models and Data for Visual Grounding Ruozhen He et.al. 2403.13804v1 null
2024-03-20 RewardBench: Evaluating Reward Models for Language Modeling Nathan Lambert et.al. 2403.13787v1 link
2024-03-20 When Cars meet Drones: Hyperbolic Federated Learning for Source-Free Domain Adaptation in Adverse Weather Giulia Rizzoli et.al. 2403.13762v1 null
2024-03-20 PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents Mitodru Niyogi et.al. 2403.13681v1 null
2024-03-20 Grounding Spatial Relations in Text-Only Language Models Gorka Azkune et.al. 2403.13666v1 link
2024-03-20 Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese Meet Doshi et.al. 2403.13638v1 null
2024-03-20 Bayesian Physics-informed Neural Networks for System Identification of Inverter-dominated Power Systems Simon Stock et.al. 2403.13602v1 null
2024-03-20 VL-Mamba: Exploring State Space Models for Multimodal Learning Yanyuan Qiao et.al. 2403.13600v1 null
2024-03-20 ReGround: Improving Textual and Spatial Grounding at No Cost Yuseung Lee et.al. 2403.13589v1 null
2024-03-19 Zero-Reference Low-Light Enhancement via Physical Quadruple Priors Wenjing Wang et.al. 2403.12933v1 null
2024-03-19 Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts Sai Ashish Somayajula et.al. 2403.12918v1 link
2024-03-19 Yell At Your Robot: Improving On-the-Fly from Language Corrections Lucy Xiaoyang Shi et.al. 2403.12910v1 null
2024-03-20 MEDBind: Unifying Language and Multimodal Medical Data Embeddings Yuan Gao et.al. 2403.12894v2 null
2024-03-19 Automated Data Curation for Robust Language Model Fine-Tuning Jiuhai Chen et.al. 2403.12776v1 null
2024-03-19 Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation Jingtao Sun et.al. 2403.12728v1 link
2024-03-19 Simple Hack for Transformers against Heavy Long-Text Classification on a Time- and Memory-Limited GPU Service Mirza Alim Mutasodirin et.al. 2403.12563v1 null
2024-03-19 Equity through Access: A Case for Small-scale Deep Learning Raghavendra Selvan et.al. 2403.12562v1 link
2024-03-19 Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs Md Ashiqur Rahman et.al. 2403.12553v1 null
2024-03-19 TT-BLIP: Enhancing Fake News Detection Using BLIP and Tri-Transformer Eunjee Choi et.al. 2403.12481v1 null
2024-03-18 Urban Scene Diffusion through Semantic Occupancy Map Junge Zhang et.al. 2403.11697v1 null
2024-03-18 Prioritized Semantic Learning for Zero-shot Instance Navigation Xander Sun et.al. 2403.11650v1 null
2024-03-18 Arc2Face: A Foundation Model of Human Faces Foivos Paraperas Papantoniou et.al. 2403.11641v1 null
2024-03-18 End-to-end multi-modal product matching in fashion e-commerce Sándor Tóth et.al. 2403.11593v1 null
2024-03-18 CasSR: Activating Image Power for Real-World Image Super-Resolution Haolan Chen et.al. 2403.11451v1 null
2024-03-18 Zero-shot Compound Expression Recognition with Visual Language Model at the 6th ABAW Challenge Jiahe Wang et.al. 2403.11450v1 null
2024-03-18 Boosting Continuous Emotion Recognition with Self-Pretraining using Masked Autoencoders, Temporal Convolutional Networks, and Transformers Weiwei Zhou et.al. 2403.11440v1 null
2024-03-18 X-LLaVA: Optimizing Bilingual Large Vision-Language Alignment Dongjae Shin et.al. 2403.11399v1 null
2024-03-17 Ensembling and Test Augmentation for Covid-19 Detection and Covid-19 Domain Adaptation from 3D CT-Scans Fares Bougourzi et.al. 2403.11338v1 null
2024-03-17 Stylized Face Sketch Extraction via Generative Prior with Limited Data Kwan Yun et.al. 2403.11263v1 null
2024-03-15 Frozen Feature Augmentation for Few-Shot Image Classification Andreas Bär et.al. 2403.10519v1 null
2024-03-15 Approximate Nullspace Augmented Finetuning for Robust Vision Transformers Haoyang Liu et.al. 2403.10476v1 null
2024-03-15 Using an LLM to Turn Sign Spottings into Spoken Language Sentences Ozge Mercanoglu Sincan et.al. 2403.10434v1 null
2024-03-15 Monotonic Representation of Numeric Properties in Language Models Benjamin Heinzerling et.al. 2403.10381v1 null
2024-03-15 Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder Jinseok Kim et.al. 2403.10255v1 null
2024-03-15 Generative Region-Language Pretraining for Open-Ended Object Detection Chuang Lin et.al. 2403.10191v1 link
2024-03-15 RAFT: Adapting Language Model to Domain Specific RAG Tianjun Zhang et.al. 2403.10131v1 link
2024-03-15 Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling Baoquan Zhang et.al. 2403.10071v1 null
2024-03-15 Boundary Matters: A Bi-Level Active Finetuning Framework Han Lu et.al. 2403.10069v1 null
2024-03-14 Adapting OC20-trained EquiformerV2 Models for High-Entropy Materials Christian M. Clausen et.al. 2403.09811v1 null
2024-03-14 OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning Lingyi Hong et.al. 2403.09634v1 null
2024-03-14 Holo-Relighting: Controllable Volumetric Portrait Relighting from a Single Image Yiqun Mei et.al. 2403.09632v1 null
2024-03-14 Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Eric Zelikman et.al. 2403.09629v1 null
2024-03-14 Counterfactual contrastive learning: robust representations via causal image synthesis Melanie Roschewitz et.al. 2403.09605v1 link
2024-03-14 uaMix-MAE: Efficient Tuning of Pretrained Audio Transformers with Unsupervised Audio Mixtures Afrina Tabassum et.al. 2403.09579v1 link
2024-03-14 Unsupervised Modality-Transferable Video Highlight Detection with Representation Activation Sequence Learning Tingtian Li et.al. 2403.09401v1 null
2024-03-14 PreConfig: A Pretrained Model for Automating Network Configuration Fuliang Li et.al. 2403.09369v1 null
2024-03-14 HeadEvolver: Text to Head Avatars via Locally Learnable Mesh Deformation Duotun Wang et.al. 2403.09326v1 null
2024-03-14 Annotation Free Semantic Segmentation with Vision Foundation Models Soroush Seifi et.al. 2403.09307v1 null
2024-03-14 CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise Classification Yiming Ma et.al. 2403.09281v1 null
2024-03-13 DAM: Dynamic Adapter Merging for Continual Video QA Learning Feng Cheng et.al. 2403.08755v1 link
2024-03-13 Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization Renjie Pi et.al. 2403.08730v1 null
2024-03-13 Data-Efficient Sleep Staging with Synthetic Time Series Pretraining Niklas Grieger et.al. 2403.08592v1 null
2024-03-13 Gaussian Splatting in Style Abhishek Saroha et.al. 2403.08498v1 null
2024-03-13 Towards Dense and Accurate Radar Perception Via Efficient Cross-Modal Diffusion Model Ruibin Zhang et.al. 2403.08460v1 null
2024-03-13 Gemma: Open Models Based on Gemini Research and Technology Gemma Team et.al. 2403.08295v1 null
2024-03-13 Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale Xiang Hu et.al. 2403.08293v1 null
2024-03-13 GPT, Ontology, and CAABAC: A Tripartite Personalized Access Control Model Anchored by Compliance, Context and Attribute Raza Nowrozy et.al. 2403.08264v1 null
2024-03-13 LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition Zhonglin Sun et.al. 2403.08161v1 link
2024-03-12 Learning Data Association for Multi-Object Tracking using Only Coordinates Mehdi Miah et.al. 2403.08018v1 null
2024-03-12 12 mJ per Class On-Device Online Few-Shot Class-Incremental Learning Yoga Esa Wibowo et.al. 2403.07851v1 link
2024-03-12 Chronos: Learning the Language of Time Series Abdul Fatir Ansari et.al. 2403.07815v1 link
2024-03-12 Boosting keyword spotting through on-device learnable user speech characteristics Cristian Cioflan et.al. 2403.07802v1 null
2024-03-12 Fine-tuning Neural Network Quantum States Riccardo Rende et.al. 2403.07795v1 null
2024-03-12 Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings Sahand Sharifzadeh et.al. 2403.07750v1 null
2024-03-12 MoralBERT: Detecting Moral Values in Social Discourse Vjosa Preniqi et.al. 2403.07678v1 null
2024-03-12 Characterization of Large Language Model Development in the Datacenter Qinghao Hu et.al. 2403.07648v1 link
2024-03-12 Triples-to-isiXhosa (T2X): Addressing the Challenges of Low-Resource Agglutinative Data-to-Text Generation Francois Meyer et.al. 2403.07567v1 link
2024-03-12 Matrix-Transformation Based Low-Rank Adaptation (MTLoRA): A Brain-Inspired Method for Parameter-Efficient Fine-Tuning Yao Liang et.al. 2403.07440v1 null
2024-03-12 In-context learning enables multimodal large language models to classify cancer pathology images Dyke Ferber et.al. 2403.07407v1 null
2024-03-11 VideoMamba: State Space Model for Efficient Video Understanding Kunchang Li et.al. 2403.06977v1 link
2024-03-11 MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning Yichuan Li et.al. 2403.06914v1 null
2024-03-11 FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks Muhammad Saif Ullah Khan et.al. 2403.06904v1 null
2024-03-11 On the Generalization Ability of Unsupervised Pretraining Yuyang Deng et.al. 2403.06871v1 null
2024-03-11 Data-Independent Operator: A Training-Free Artifact Representation Extractor for Generalizable Deepfake Detection Chuangchuang Tan et.al. 2403.06803v1 link
2024-03-11 PeerAiD: Improving Adversarial Distillation from a Specialized Peer Tutor Jaewon Jung et.al. 2403.06668v1 null
2024-03-11 Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers Alexander H. Berger et.al. 2403.06601v1 null
2024-03-11 SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection Yuxuan Li et.al. 2403.06534v1 link
2024-03-11 FontCLIP: A Semantic Typography Visual-Language Model for Multilingual Font Applications Yuki Tatsukawa et.al. 2403.06453v1 null
2024-03-11 Can LLMs' Tuning Methods Work in Medical Multimodal Domain? Jiawei Chen et.al. 2403.06407v1 null
2024-03-08 DeepSeek-VL: Towards Real-World Vision-Language Understanding Haoyu Lu et.al. 2403.05525v1 link
2024-03-08 Self-Supervised Multiple Instance Learning for Acute Myeloid Leukemia Classification Salome Kazeminia et.al. 2403.05379v1 null
2024-03-08 ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications Sotaro Takeshita et.al. 2403.05303v1 link
2024-03-08 CommitBench: A Benchmark for Commit Message Generation Maximilian Schall et.al. 2403.05188v1 link
2024-03-08 GSEdit: Efficient Text-Guided Editing of 3D Objects via Gaussian Splatting Francesco Palandra et.al. 2403.05154v1 null
2024-03-08 Face2Diffusion for Fast and Editable Face Personalization Kaede Shiohara et.al. 2403.05094v1 link
2024-03-08 Agile Multi-Source-Free Domain Adaptation Xinyao Li et.al. 2403.05062v1 link
2024-03-07 An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control Aosong Feng et.al. 2403.04880v1 null
2024-03-07 I Can't Believe It's Not Scene Flow! Ishan Khatri et.al. 2403.04739v1 link
2024-03-07 Masked Capsule Autoencoders Miles Everett et.al. 2403.04724v1 null
2024-03-07 Yi: Open Foundation Models by 01.AI 01. AI et.al. 2403.04652v1 link
2024-03-07 Teaching Large Language Models to Reason with Reinforcement Learning Alex Havrilla et.al. 2403.04642v1 null
2024-03-07 Pix2Gif: Motion-Guided Diffusion for GIF Generation Hitesh Kandala et.al. 2403.04634v1 null
2024-03-07 CLIP the Bias: How Useful is Balancing Data in Multimodal Learning? Ibrahim Alabdulmohsin et.al. 2403.04547v1 null
2024-03-07 Source Matters: Source Dataset Impact on Model Robustness in Medical Imaging Dovile Juodelyte et.al. 2403.04484v1 link
2024-03-07 Enhancing Court View Generation with Knowledge Injection and Guidance Ang Li et.al. 2403.04366v1 link
2024-03-07 Federated Recommendation via Hybrid Retrieval Augmented Generation Huimin Zeng et.al. 2403.04256v1 link
2024-03-07 DEEP-ICL: Definition-Enriched Experts for Language Model In-Context Learning Xingwei Qu et.al. 2403.04233v1 null
2024-03-06 Bridging Language and Items for Retrieval and Recommendation Yupeng Hou et.al. 2403.03952v1 link
2024-03-06 The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models Adithya Bhaskar et.al. 2403.03942v1 link
2024-03-06 Designing Informative Metrics for Few-Shot Example Selection Rishabh Adiga et.al. 2403.03861v1 null
2024-03-06 MeaCap: Memory-Augmented Zero-shot Image Captioning Zequn Zeng et.al. 2403.03715v1 null
2024-03-06 On Transfer in Classification: How Well do Subsets of Classes Generalize? Raphael Baena et.al. 2403.03569v1 null
2024-03-06 Low-Dose CT Image Reconstruction by Fine-Tuning a UNet Pretrained for Gaussian Denoising for the Downstream Task of Image Enhancement Tim Selig et.al. 2403.03551v1 null
2024-03-06 CNN-based End-to-End Adaptive Controller with Stability Guarantees Myeongseok Ryu et.al. 2403.03499v1 null
2024-03-06 Multi-modal Deep Learning Chen Yuhua et.al. 2403.03385v1 null
2024-03-05 XAI-Based Detection of Adversarial Attacks on Deepfake Detectors Ben Pinhasov et.al. 2403.02955v1 null
2024-03-05 Enhancing Conceptual Understanding in Multimodal Contrastive Learning through Hard Negative Samples Philipp J. Rösch et.al. 2403.02875v1 null
2024-03-05 Crossing Linguistic Horizons: Finetuning and Comprehensive Evaluation of Vietnamese Large Language Models Sang T. Truong et.al. 2403.02715v1 null
2024-03-05 Breeze-7B Technical Report Chan-Jan Hsu et.al. 2403.02712v1 null
2024-03-04 A Tutorial on the Pretrain-Finetune Paradigm for Natural Language Processing Yu Wang et.al. 2403.02504v1 null
2024-03-04 Encodings for Prediction-based Neural Architecture Search Yash Akhauri et.al. 2403.02484v1 link
2024-03-04 Transformers Provably Learn Feature-Position Correlations in Masked Image Modeling Yu Huang et.al. 2403.02233v1 null
2024-03-04 TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models Yilong Ren et.al. 2403.02221v1 null
2024-03-04 What has LeBenchmark Learnt about French Syntax? Zdravko Dugonjić et.al. 2403.02173v1 null
2024-03-04 Enhancing Information Maximization with Distance-Aware Contrastive Learning for Source-Free Cross-Domain Few-Shot Learning Huali Xu et.al. 2403.01966v1 link
2024-03-02 Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning Shuo Yang et.al. 2403.01209v1 null
2024-03-01 Tree-Regularized Tabular Embeddings Xuan Li et.al. 2403.00963v1 link
2024-03-01 G3DR: Generative 3D Reconstruction in ImageNet Pradyumna Reddy et.al. 2403.00939v1 null
2024-03-01 Word Order and World Knowledge Qinghua Zhao et.al. 2403.00876v1 null
2024-03-01 Hierarchical Indexing for Retrieval-Augmented Opinion Summarization Tom Hosking et.al. 2403.00435v1 null
2024-03-01 Private Benchmarking to Prevent Contamination and Improve Comparative Evaluation of LLMs Nishanth Chandran et.al. 2403.00393v1 null
2024-03-01 MaskLRF: Self-supervised Pretraining via Masked Autoencoding of Local Reference Frames for Rotation-invariant 3D Point Set Analysis Takahiko Furuya et.al. 2403.00206v1 link
2024-02-29 Ask Your Distribution Shift if Pre-Training is Right for You Benjamin Cohen-Wang et.al. 2403.00194v1 link
2024-02-29 Non-Invasive Medical Digital Twins using Physics-Informed Self-Supervised Learning Keying Kuang et.al. 2403.00177v1 link
2024-02-29 UniTS: Building a Unified Time Series Model Shanghua Gao et.al. 2403.00131v1 link
2024-02-29 SeD: Semantic-Aware Discriminator for Image Super-Resolution Bingchen Li et.al. 2402.19387v1 null
2024-02-29 OzMAC: An Energy-Efficient Sparsity-Exploiting Multiply-Accumulate-Unit Design for DL Inference Harideep Nair et.al. 2402.19376v1 null
2024-02-29 Compact Speech Translation Models via Discrete Speech Units Pretraining Tsz Kin Lam et.al. 2402.19333v1 null
2024-02-29 Mirage: Cross-Embodiment Zero-Shot Policy Transfer with Cross-Painting Lawrence Yunliang Chen et.al. 2402.19249v1 null
2024-02-29 PeLLE: Encoder-based language models for Brazilian Portuguese based on open data Guilherme Lamartine de Mello et.al. 2402.19204v1 null
2024-02-29 VIXEN: Visual Text Comparison Network for Image Difference Captioning Alexander Black et.al. 2402.19119v1 null
2024-02-29 Improving Group Connectivity for Generalization of Federated Deep Learning Zexi Li et.al. 2402.18949v1 null
2024-02-29 Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data Takaaki Saeki et.al. 2402.18932v1 null
2024-02-29 Reducing Hallucinations in Entity Abstract Summarization with Facts-Template Decomposition Fangwei Zhu et.al. 2402.18873v1 link
2024-02-29 Dual Operating Modes of In-Context Learning Ziqian Lin et.al. 2402.18819v1 null
2024-02-28 Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation Nihal V. Nayak et.al. 2402.18334v1 link
2024-02-28 How to think step-by-step: A mechanistic understanding of chain-of-thought reasoning Subhabrata Dutta et.al. 2402.18312v1 link
2024-02-28 Feature Denoising For Low-Light Instance Segmentation Using Weighted Non-Local Blocks Joanne Lin et.al. 2402.18307v1 null
2024-02-28 Self-Supervised Learning in Electron Microscopy: Towards a Foundation Model for Advanced Image Analysis Bashir Kazimi et.al. 2402.18286v1 null
2024-02-28 NToP: NeRF-Powered Large-scale Dataset Generation for 2D and 3D Human Pose Estimation in Top-View Fisheye Images Jingrui Yu et.al. 2402.18196v1 null
2024-02-28 Diffusion-based Neural Network Weights Generation Bedionita Soro et.al. 2402.18153v1 null
2024-02-28 DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning Jianxiong Li et.al. 2402.18137v1 null
2024-02-28 Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization Han Guo et.al. 2402.18128v1 link
2024-02-28 Collaborative decoding of critical tokens for boosting factuality of large language models Lifeng Jin et.al. 2402.17982v1 null
2024-02-27 Acquiring Linguistic Knowledge from Multimodal Input Theodor Amariucai et.al. 2402.17936v1 null
2024-02-27 Tower: An Open Multilingual Large Language Model for Translation-Related Tasks Duarte M. Alves et.al. 2402.17733v1 null
2024-02-27 MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation Hanan Gani et.al. 2402.17725v1 link
2024-02-27 NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents Tamara Czinczoll et.al. 2402.17682v1 null
2024-02-27 SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation Shuangrui Ding et.al. 2402.17645v1 null
2024-02-27 Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data Xiao Liu et.al. 2402.17644v1 link
2024-02-27 Adapt Before Comparison: A New Perspective on Cross-Domain Few-Shot Segmentation Jonas Herzog et.al. 2402.17614v1 null
2024-02-27 A Large-scale Evaluation of Pretraining Paradigms for the Detection of Defects in Electroluminescence Solar Cell Images David Torpey et.al. 2402.17611v1 null
2024-02-27 Training-Free Long-Context Scaling of Large Language Models Chenxin An et.al. 2402.17463v1 link
2024-02-27 Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder Jiaqi Wang et.al. 2402.17433v1 null
2024-02-27 Investigating Continual Pretraining in Large Language Models: Insights and Implications Çağatay Yıldız et.al. 2402.17400v1 null
2024-02-26 Immunization against harmful fine-tuning attacks Domenic Rosati et.al. 2402.16382v1 null
2024-02-26 An Integrated Data Processing Framework for Pretraining Foundation Models Yiding Sun et.al. 2402.16358v1 link
2024-02-26 MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs Zimu Lu et.al. 2402.16352v1 null
2024-02-26 BLO-SAM: Bi-level Optimization Based Overfitting-Preventing Finetuning of SAM Li Zhang et.al. 2402.16338v1 null
2024-02-26 Learning Translations: Emergent Communication Pretraining for Cooperative Language Acquisition Dylan Cope et.al. 2402.16247v1 null
2024-02-26 High-Frequency-aware Hierarchical Contrastive Selective Coding for Representation Learning on Text-attributed Graphs Peiyan Zhang et.al. 2402.16240v1 null
2024-02-25 Task Specific Pretraining with Noisy Labels for Remote sensing Image Segmentation Chenying Liu et.al. 2402.16164v1 null
2024-02-25 StochCA: A Novel Approach for Exploiting Pretrained Models with Cross-Attention Seungwon Seo et.al. 2402.16092v1 link
2024-02-25 LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding Yuxuan Wang et.al. 2402.16050v1 link
2024-02-25 Adversarial-Robust Transfer Learning for Medical Imaging via Domain Assimilation Xiaohui Chen et.al. 2402.16005v1 null
2024-02-23 Repetition Improves Language Model Embeddings Jacob Mitchell Springer et.al. 2402.15449v1 link
2024-02-23 PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning Simon Holk et.al. 2402.15420v1 null
2024-02-23 United We Pretrain, Divided We Fail! Representation Learning for Time Series by Pretraining on 75 Datasets at Once Maurice Kraus et.al. 2402.15404v1 null
2024-02-23 Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control Masatoshi Uehara et.al. 2402.15194v1 null
2024-02-23 The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling Jiajun Ma et.al. 2402.15170v1 null
2024-02-23 Self-Adaptive Reconstruction with Contrastive Learning for Unsupervised Sentence Embeddings Junlong Liu et.al. 2402.15153v1 null
2024-02-23 ColBERT-XM: A Modular Multi-Vector Representation Model for Zero-Shot Multilingual Information Retrieval Antoine Louis et.al. 2402.15059v1 null
2024-02-23 CARBD-Ko: A Contextually Annotated Review Benchmark Dataset for Aspect-Level Sentiment Classification in Korean Dongjun Jang et.al. 2402.15046v1 null
2024-02-22 Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning Zhuoyan Xu et.al. 2402.15017v1 link
2024-02-22 Zero-shot cross-lingual transfer in instruction tuning of large language model Nadezhda Chirkova et.al. 2402.14778v1 null
2024-02-22 Prompting a Pretrained Transformer Can Be a Universal Approximator Aleksandar Petrov et.al. 2402.14753v1 null
2024-02-22 Dependency Annotation of Ottoman Turkish with Multilingual BERT Şaziye Betül Özateş et.al. 2402.14743v1 null
2024-02-22 Cleaner Pretraining Corpus Curation with Neural Web Scraping Zhipeng Xu et.al. 2402.14652v1 link
2024-02-22 Rethinking Scientific Summarization Evaluation: Grounding Explainable Metrics on Facet-aware Benchmark Xiuying Chen et.al. 2402.14359v1 null
2024-02-22 GAM-Depth: Self-Supervised Indoor Depth Estimation Leveraging a Gradient-Aware Mask and Semantic Constraints Anqi Cheng et.al. 2402.14354v1 null
2024-02-22 MVD$^2$: Efficient Multiview 3D Reconstruction for Multiview Diffusion Xin-Yang Zheng et.al. 2402.14253v1 null
2024-02-22 Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding Yu-Qi Yang et.al. 2402.14215v1 link
2024-02-22 BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay Catherine Weaver et.al. 2402.14194v1 null
2024-02-21 T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching Zizheng Pan et.al. 2402.14167v1 link
2024-02-21 User-LLM: Efficient LLM Contextualization with User Embeddings Lin Ning et.al. 2402.13598v1 null
2024-02-21 Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment Yunxin Li et.al. 2402.13561v1 null
2024-02-21 LLMs Meet Long Video: Advancing Long Video Comprehension with An Interactive Visual Adapter in LLMs Yunxin Li et.al. 2402.13546v1 null
2024-02-21 FinGPT-HPC: Efficient Pretraining and Finetuning Large Language Models for Financial Applications with High-Performance Computing Xiao-Yang Liu et.al. 2402.13533v1 null
2024-02-21 How Important is Domain Specificity in Language Models and Instruction Finetuning for Biomedical Relation Extraction? Aviv Brokman et.al. 2402.13470v1 null
2024-02-20 Investigating Cultural Alignment of Large Language Models Badr AlKhamissi et.al. 2402.13231v1 link
2024-02-20 RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian Adrian Cosma et.al. 2402.13222v1 link
2024-02-20 VideoPrism: A Foundational Visual Encoder for Video Understanding Long Zhao et.al. 2402.13217v1 null
2024-02-20 Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables Haisong Gong et.al. 2402.13028v1 link
2024-02-20 Cell Graph Transformer for Nuclei Classification Wei Lou et.al. 2402.12946v1 link
2024-02-20 More Discriminative Sentence Embeddings via Semantic Graph Smoothing Chakib Fettal et.al. 2402.12890v1 link
2024-02-20 ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic Fajri Koto et.al. 2402.12840v1 link
2024-02-20 Equivariant Pretrained Transformer for Unified Geometric Learning on Multi-Domain 3D Molecules Rui Jiao et.al. 2402.12714v1 null
2024-02-20 PDEformer: Towards a Foundation Model for One-Dimensional Partial Differential Equations Zhanhong Ye et.al. 2402.12652v1 null
2024-02-19 GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations Jinhao Duan et.al. 2402.12348v1 link
2024-02-19 Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks Nadezhda Chirkova et.al. 2402.12279v1 null
2024-02-19 High-quality Data-to-Text Generation for Severely Under-Resourced Languages with Out-of-the-box Large Language Models Michela Lorandi et.al. 2402.12267v1 link
2024-02-19 Is It a Free Lunch for Removing Outliers during Pretraining? Baohao Liao et.al. 2402.12102v1 null
2024-02-19 Direct Consistency Optimization for Compositional Text-to-Image Personalization Kyungmin Lee et.al. 2402.12004v1 null
2024-02-19 DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation Chong Zeng et.al. 2402.11929v1 null
2024-02-19 MRKE: The Multi-hop Reasoning Evaluation of LLMs by Knowledge Edition Jian Wu et.al. 2402.11924v1 null
2024-02-19 ComFusion: Personalized Subject Generation in Multiple Specific Scenes From Single Image Yan Hong et.al. 2402.11849v1 null
2024-02-19 UniST: A Prompt-Empowered Universal Model for Urban Spatio-Temporal Prediction Yuan Yuan et.al. 2402.11838v1 null
2024-02-19 LLM as Prompter: Low-resource Inductive Reasoning on Arbitrary Knowledge Graphs Kai Wang et.al. 2402.11804v1 null
2024-02-16 Proving membership in LLM pretraining data via data watermarks Johnny Tian-Zheng Wei et.al. 2402.10892v1 null
2024-02-16 Enhancement-Driven Pretraining for Robust Fingerprint Representation Learning Ekta Gavas et.al. 2402.10847v1 null
2024-02-16 Associative Memories in the Feature Space Tommaso Salvatori et.al. 2402.10814v1 null
2024-02-16 BioFusionNet: Deep Learning-Based Survival Risk Stratification in ER+ Breast Cancer Through Multifeature and Multimodal Data Fusion Raktim Kumar Mondol et.al. 2402.10717v1 null
2024-02-16 Are ID Embeddings Necessary? Whitening Pre-trained Text Embeddings for Effective Sequential Recommendation Lingzi Zhang et.al. 2402.10602v1 null
2024-02-16 SPAR: Personalized Content-Based Recommendation via Long Engagement Attention Chiyu Zhang et.al. 2402.10555v1 null
2024-02-16 MFBind: a Multi-Fidelity Approach for Evaluating Drug Compounds in Practical Generative Modeling Peter Eckmann et.al. 2402.10387v1 null
2024-02-15 BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains Yanis Labrak et.al. 2402.10373v1 null
2024-02-15 Euclid preparation. Measuring detailed galaxy morphologies for Euclid with Machine Learning Euclid Collaboration et.al. 2402.10187v1 link
2024-02-15 Data Engineering for Scaling Language Models to 128K Context Yao Fu et.al. 2402.10171v1 link
2024-02-15 Towards Safer Large Language Models through Machine Unlearning Zheyuan Liu et.al. 2402.10058v1 null
2024-02-15 LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition Jinyuan Li et.al. 2402.09989v1 null
2024-02-15 Data Augmentation and Transfer Learning Approaches Applied to Facial Expressions Recognition Enrico Randellini et.al. 2402.09982v1 null
2024-02-15 All in One and One for All: A Simple yet Effective Method towards Cross-domain Graph Pretraining Haihong Zhao et.al. 2402.09834v1 null
2024-02-15 Knowledge of Pretrained Language Models on Surface Information of Tokens Tatsuya Hiraoka et.al. 2402.09808v1 null
2024-02-14 Towards Privacy-Aware Sign Language Translation at Scale Phillip Rust et.al. 2402.09611v1 null
2024-02-14 DeepATLAS: One-Shot Localization for Biomedical Data Peter D. Chang et.al. 2402.09587v1 null
2024-02-14 Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge Jiancheng Yang et.al. 2402.09372v1 null
2024-02-14 Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking Yi Fung et.al. 2402.09369v1 null
2024-02-14 HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference Yashas Samaga B L et.al. 2402.09360v1 null
2024-02-14 Few-Shot Object Detection with Sparse Context Transformers Jie Mei et.al. 2402.09315v1 null
2024-02-14 Embracing the black box: Heading towards foundation models for causal discovery from time series data Gideon Stein et.al. 2402.09305v1 null
2024-02-14 Spectral Filters, Dark Signals, and Attention Sinks Nicola Cancedda et.al. 2402.09221v1 null
2024-02-14 MPIrigen: MPI Code Generation through Domain-Specific Language Models Nadav Schneider et.al. 2402.09126v1 link
2024-02-14 I can't see it but I can Fine-tune it: On Encrypted Fine-tuning of Transformers using Fully Homomorphic Encryption Prajwal Panzade et.al. 2402.09059v1 null
2024-02-14 Pretraining Vision-Language Model for Difference Visual Question Answering in Longitudinal Chest X-rays Yeongjae Cho et.al. 2402.08966v1 null
2024-02-14 Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation Ge Shi et.al. 2402.08882v1 null
2024-02-13 Human Curriculum Effects Emerge with In-Context Learning in Neural Networks Jacob Russin et.al. 2402.08674v1 null
2024-02-13 Tandem Transformers for Inference Efficient LLMs Aishwarya P S et.al. 2402.08644v1 null
2024-02-13 Captions Are Worth a Thousand Words: Enhancing Product Retrieval with Pretrained Image-to-Text Models Jason Tang et.al. 2402.08532v1 null
2024-02-13 Concept-1K: A Novel Benchmark for Instance Incremental Learning Junhao Zheng et.al. 2402.08526v1 link
2024-02-13 Pixel Sentence Representation Learning Chenghao Xiao et.al. 2402.08183v1 null
2024-02-12 Which Pretrain Samples to Rehearse when Finetuning Pretrained Models? Andrew Bai et.al. 2402.08096v1 null
2024-02-12 Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models Siddharth Karamcheti et.al. 2402.07865v1 link
2024-02-12 Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning Z Liu et.al. 2402.07818v1 null
2024-02-12 AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts Yifan Zhang et.al. 2402.07625v1 link
2024-02-12 Foundational Inference Models for Dynamical Systems Patrick Seifner et.al. 2402.07594v1 null
2024-02-12 Only the Curve Shape Matters: Training Foundation Models for Zero-Shot Multivariate Time Series Forecasting through Next Curve Shape Prediction Cheng Feng et.al. 2402.07570v1 link
2024-02-12 MAFIA: Multi-Adapter Fused Inclusive LanguAge Models Prachi Jain et.al. 2402.07519v1 null
2024-02-12 SLIT: Boosting Audio-Text Pre-Training via Multi-Stage Learning and Instruction Tuning Hang Zhao et.al. 2402.07485v1 null
2024-02-12 Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Jon Saad-Falcon et.al. 2402.07440v1 null
2024-02-12 SemTra: A Semantic Skill Translator for Cross-Domain Zero-Shot Policy Adaptation Sangwoo Shin et.al. 2402.07418v1 null
2024-02-11 Multi-Modal Emotion Recognition by Text, Speech and Video Using Pretrained Transformers Minoo Shayaninasab et.al. 2402.07327v1 null
2024-02-09 Feature Density Estimation for Out-of-Distribution Detection via Normalizing Flows Evan D. Cook et.al. 2402.06537v1 null
2024-02-09 GS-CLIP: Gaussian Splatting for Contrastive Language-Image-3D Pretraining from Real-World Data Haoyuan Li et.al. 2402.06198v1 null
2024-02-09 Premier-TACO: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss Ruijie Zheng et.al. 2402.06187v1 null
2024-02-09 MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models Yixiao Zhang et.al. 2402.06178v1 null
2024-02-08 Early Fusion of Features for Semantic Segmentation Anupam Gupta et.al. 2402.06091v1 null
2024-02-08 Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing Yong Cao et.al. 2402.06015v1 null
2024-02-08 WebLINX: Real-World Website Navigation with Multi-Turn Dialogue Xing Han Lù et.al. 2402.05930v1 null
2024-02-08 Collaborative Control for Geometry-Conditioned PBR Image Generation Shimon Vainer et.al. 2402.05919v1 null
2024-02-08 Efficient Stagewise Pretraining via Progressive Subnetworks Abhishek Panigrahi et.al. 2402.05913v1 null
2024-02-08 SpiRit-LM: Interleaved Spoken and Written Language Model Tu Anh Nguyen et.al. 2402.05755v1 null
2024-02-08 Unified Speech-Text Pretraining for Spoken Dialog Modeling Heeseung Kim et.al. 2402.05706v1 null
2024-02-08 Pretrained Generative Language Models as General Learning Frameworks for Sequence-Based Tasks Ben Fauber et.al. 2402.05616v1 null
2024-02-08 Establishing degrees of closeness between audio recordings along different dimensions using large-scale cross-lingual models Maxime Fily et.al. 2402.05581v1 null
2024-02-07 BIKED++: A Multimodal Dataset of 1.4 Million Bicycle Image and Parametric CAD Designs Lyle Regenwetter et.al. 2402.05301v1 null
2024-02-07 SPAD : Spatially Aware Multiview Diffusers Yash Kant et.al. 2402.05235v1 null
2024-02-07 A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? Agustinus Kristiadi et.al. 2402.05015v1 link
2024-02-07 Personalized Text Generation with Fine-Grained Linguistic Control Bashar Alhafni et.al. 2402.04914v1 link
2024-02-07 OV-NeRF: Open-vocabulary Neural Radiance Fields with Vision and Language Foundation Models for 3D Semantic Understanding Guibiao Liao et.al. 2402.04648v1 null
2024-02-06 PreGIP: Watermarking the Pretraining of Graph Neural Networks for Deep Intellectual Property Protection Enyan Dai et.al. 2402.04435v1 null
2024-02-06 Fine-Tuned Language Models Generate Stable Inorganic Materials as Text Nate Gruver et.al. 2402.04379v1 link
2024-02-06 The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry Michael Zhang et.al. 2402.04347v1 null
2024-02-06 EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters Quan Sun et.al. 2402.04252v1 link
2024-02-06 MusicRL: Aligning Music Generation to Human Preferences Geoffrey Cideron et.al. 2402.04229v1 null
2024-02-06 Scaling Laws for Downstream Task Performance of Large Language Models Berivan Isik et.al. 2402.04177v1 null
2024-02-06 Attention with Markov: A Framework for Principled Analysis of Transformers via Markov Chains Ashok Vardhan Makkuva et.al. 2402.04161v1 link
2024-02-06 A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation Zhengbo Wang et.al. 2402.04087v1 link
2024-02-06 Connecting the Dots: Collaborative Fine-tuning for Black-Box Vision-Language Models Zhengbo Wang et.al. 2402.04050v1 null
2024-02-06 Polyp-DDPM: Diffusion-Based Semantic Polyp Synthesis for Enhanced Segmentation Zolnamar Dorjsembe et.al. 2402.04031v1 link
2024-02-06 Low-rank Attention Side-Tuning for Parameter-Efficient Fine-Tuning Ningyuan Tang et.al. 2402.04009v1 null
2024-02-06 Understanding the Effect of Noise in LLM Training Data with Algorithmic Chains of Thought Alex Havrilla et.al. 2402.04004v1 null
2024-02-06 Humans Beat Deep Networks at Recognizing Objects in Unusual Poses, Given Enough Time Netta Ollikka et.al. 2402.03973v1 null
2024-02-05 Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining Jiarun Liu et.al. 2402.03302v1 link
2024-02-05 Training-Free Consistent Text-to-Image Generation Yoad Tewel et.al. 2402.03286v1 null
2024-02-05 CLIP Can Understand Depth Dunam Kim et.al. 2402.03251v1 null
2024-02-05 FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition Xiaohu Huang et.al. 2402.03241v1 null
2024-02-05 Towards mitigating uncann(eye)ness in face swaps via gaze-centric loss terms Ethan Wilson et.al. 2402.03188v1 null
2024-02-05 Time-, Memory- and Parameter-Efficient Visual Adaptation Otniel-Bogdan Mercea et.al. 2402.02887v1 null
2024-02-05 Enhancing Compositional Generalization via Compositional Feature Alignment Haoxiang Wang et.al. 2402.02851v1 link
2024-02-04 Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning Haoyi Zhu et.al. 2402.02500v1 null
2024-02-04 A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer Zhangyang Gao et.al. 2402.02464v1 null
2024-02-04 BECLR: Batch Enhanced Contrastive Few-Shot Learning Stylianos Poulakakis-Daktylidis et.al. 2402.02444v1 link
2024-02-02 From Words to Molecules: A Survey of Large Language Models in Chemistry Chang Liao et.al. 2402.01439v1 null
2024-02-02 Continual Learning for Large Language Models: A Survey Tongtong Wu et.al. 2402.01364v1 null
2024-02-02 Describing Images $\textit{Fast and Slow}$: Quantifying and Predicting the Variation in Human Signals during Visuo-Linguistic Processes Ece Takmaz et.al. 2402.01352v1 null
2024-02-02 Training-time Neuron Alignment through Permutation Subspace for Improving Linear Mode Connectivity and Model Fusion Zexi Li et.al. 2402.01342v1 null
2024-02-02 On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification Calum Heggan et.al. 2402.01274v1 null
2024-02-02 Can Shape-Infused Joint Embeddings Improve Image-Conditioned 3D Diffusion? Cristian Sbrolli et.al. 2402.01241v1 null
2024-02-02 In-Context Learning for Few-Shot Nested Named Entity Recognition Meishan Zhang et.al. 2402.01182v1 null
2024-02-02 Interpretation of Intracardiac Electrograms Through Textual Representations William Jongwon Han et.al. 2402.01115v1 null
2024-02-02 Double-Dip: Thwarting Label-Only Membership Inference Attacks with Transfer Learning and Randomization Arezoo Rajabi et.al. 2402.01114v1 null
2024-02-02 Specialized Language Models with Cheap Inference from Limited Domain Data David Grangier et.al. 2402.01093v1 null
2024-02-01 Can Large Language Models Understand Context? Yilun Zhu et.al. 2402.00858v1 null
2024-02-01 LLMs learn governing principles of dynamical systems, revealing an in-context neural scaling law Toni J. B. Liu et.al. 2402.00795v1 null
2024-02-01 CroissantLLM: A Truly Bilingual French-English Language Model Manuel Faysse et.al. 2402.00786v1 link
2024-02-01 AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning Fu-Yun Wang et.al. 2402.00769v1 link
2024-02-01 Unlearnable Algorithms for In-context Learning Andrei Muresanu et.al. 2402.00751v1 null
2024-02-01 Approximating Optimal Morphing Attacks using Template Inversion Laurent Colbois et.al. 2402.00695v1 null
2024-02-01 Improving Critical Node Detection Using Neural Network-based Initialization in a Genetic Algorithm Chanjuan Liu et.al. 2402.00404v1 null
2024-02-01 Real-time Stereo Speech Enhancement with Spatial-Cue Preservation based on Dual-Path Structure Masahito Togami et.al. 2402.00337v1 null
2024-02-01 Towards AI-Assisted Synthesis of Verified Dafny Methods Md Rakib Hossain Misu et.al. 2402.00247v1 link
2024-01-31 Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research Luca Soldaini et.al. 2402.00159v1 link
2024-01-31 Binding Touch to Everything: Learning Unified Multimodal Tactile Representations Fengyu Yang et.al. 2401.18084v1 null
2024-01-31 Paramanu: A Family of Novel Efficient Indic Generative Foundation Language Models Mitodru Niyogi et.al. 2401.18034v1 null
2024-01-31 Efficient Subseasonal Weather Forecast using Teleconnection-informed Transformers Shan Zhao et.al. 2401.17870v1 null
2024-01-31 Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model Zihan Zhong et.al. 2401.17868v1 null
2024-01-31 Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction Xueyuan Chen et.al. 2401.17796v1 null
2024-01-31 EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning Jaeyeon Kim et.al. 2401.17690v1 link
2024-01-31 Towards Efficient and Reliable LLM Serving: A Real-World Workload Study Yuxin Wang et.al. 2401.17644v1 null
2024-01-31 Local and Global Contexts for Conversation Zuoquan Lin et.al. 2401.17588v1 link
2024-01-30 Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens Jiacheng Liu et.al. 2401.17377v1 null
2024-01-30 Transfer Learning for Text Diffusion Models Kehang Han et.al. 2401.17181v1 null
2024-01-29 Unsupervised Discovery of Steerable Factors When Graph Deep Generative Models Are Entangled Shengchao Liu et.al. 2401.17123v1 link
2024-01-30 Finetuning Large Language Models for Vulnerability Detection Alexey Shestov et.al. 2401.17010v1 null
2024-01-30 Distinguishing Fictional Voices: a Study of Authorship Verification Models for Quotation Attribution Gaspard Michel et.al. 2401.16968v1 link
2024-01-30 PBSCSR: The Piano Bootleg Score Composer Style Recognition Dataset Arhan Jain et.al. 2401.16803v1 link
2024-01-30 MolPLA: A Molecular Pretraining Framework for Learning Cores, R-Groups and their Linker Joints Mogan Gim et.al. 2401.16771v1 null
2024-01-30 Gradient-Based Language Model Red Teaming Nevan Wichers et.al. 2401.16656v1 link
2024-01-30 IRCoCo: Immediate Rewards-Guided Deep Reinforcement Learning for Code Completion Bolun Li et.al. 2401.16637v1 link
2024-01-29 ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks Bolei Ma et.al. 2401.16589v1 link
2024-01-29 Massively Multilingual Text Translation For Low-Resource Languages Zhong Zhou et.al. 2401.16582v1 null
2024-01-29 Scaling Sparse Fine-Tuning to Large Language Models Alan Ansell et.al. 2401.16405v1 null
2024-01-29 PICL: Physics Informed Contrastive Learning for Partial Differential Equations Cooper Lorsung et.al. 2401.16327v1 null
2024-01-29 Enhancing Molecular Property Prediction with Auxiliary Learning and Task-Specific Adaptation Vishal Dey et.al. 2401.16299v1 null
2024-01-29 Textual Entailment for Effective Triple Validation in Object Prediction Andrés García-Silva et.al. 2401.16293v1 null
2024-01-29 Cutup and Detect: Human Fall Detection on Cutup Untrimmed Videos Using a Large Foundational Video Understanding Model Till Grutschus et.al. 2401.16280v1 null
2024-01-29 Type-based Neural Link Prediction Adapter for Complex Query Answering Lingning Song et.al. 2401.16045v1 null
2024-01-29 Finding Challenging Metaphors that Confuse Pretrained Language Models Yucheng Li et.al. 2401.16012v1 null
2024-01-29 StableIdentity: Inserting Anybody into Anywhere at First Sight Qinghe Wang et.al. 2401.15975v1 null
2024-01-29 Masked Audio Modeling with CLAP and Multi-Objective Learning Yifei Xin et.al. 2401.15953v1 null
2024-01-29 HICH Image/Text (HICH-IT): Comprehensive Text and Image Datasets for Hypertensive Intracerebral Hemorrhage Research Jie Li et.al. 2401.15934v1 null
2024-01-26 RESPRECT: Speeding-up Multi-fingered Grasping with Residual Reinforcement Learning Federico Ceola et.al. 2401.14858v1 null
2024-01-26 Endowing Protein Language Models with Structural Knowledge Dexiong Chen et.al. 2401.14819v1 null
2024-01-26 MaLLaM -- Malaysia Large Language Model Husein Zolkepli et.al. 2401.14680v1 null
2024-01-26 An Empirical Investigation of Domain Adaptation Ability for Chinese Spelling Check Models Xi Wang et.al. 2401.14630v1 null
2024-01-26 Towards Lifelong Scene Graph Generation with Knowledge-ware In-context Prompt Learning Tao He et.al. 2401.14626v1 null
2024-01-25 MResT: Multi-Resolution Sensing for Real-Time Control with Vision-Language Models Saumya Saxena et.al. 2401.14502v1 null
2024-01-25 Rethinking Patch Dependence for Masked Autoencoders Letian Fu et.al. 2401.14391v1 null
2024-01-25 TURNA: A Turkish Encoder-Decoder Language Model for Enhanced Understanding and Generation Gökçe Uludoğan et.al. 2401.14373v1 link
2024-01-25 Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation Minglin Chen et.al. 2401.14257v1 null
2024-01-25 Assessing the Portability of Parameter Matrices Trained by Parameter-Efficient Finetuning Methods Mohammed Sabry et.al. 2401.14228v1 null
2024-01-25 BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models Senthil Purushwalkam et.al. 2401.13974v1 null
2024-01-24 S2TPVFormer: Spatio-Temporal Tri-Perspective View for temporally coherent 3D Semantic Occupancy Prediction Sathira Silva et.al. 2401.13785v1 null
2024-01-24 Enhancing Image Retrieval : A Comprehensive Study on Photo Search using the CLIP Mode Naresh Kumar Lahajal et.al. 2401.13613v1 null
2024-01-24 Large Malaysian Language Model Based on Mistral for Enhanced Local Language Understanding Husein Zolkepli et.al. 2401.13565v1 null
2024-01-25 Finetuning Foundation Models for Joint Analysis Optimization Matthias Vigl et.al. 2401.13536v2 null
2024-01-24 Generative Human Motion Stylization in Latent Space Chuan Guo et.al. 2401.13505v1 null
2024-01-24 MaLA-500: Massive Language Adaptation of Large Language Models Peiqin Lin et.al. 2401.13303v1 null
2024-01-24 Audio-Infused Automatic Image Colorization by Exploiting Audio Scene Semantics Pengcheng Zhao et.al. 2401.13270v1 null
2024-01-24 Segment Any Cell: A SAM-based Auto-prompting Fine-tuning Framework for Nuclei Segmentation Saiyang Na et.al. 2401.13220v1 null
2024-01-24 AdCorDA: Classifier Refinement via Adversarial Correction and Domain Adaptation Lulan Shen et.al. 2401.13212v1 null
2024-01-23 The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts Lingfeng Shen et.al. 2401.13136v1 null
2024-01-23 Digital Divides in Scene Recognition: Uncovering Socioeconomic Biases in Deep Learning Systems Michelle R. Greene et.al. 2401.13097v1 null
2024-01-23 GALA: Generating Animatable Layered Assets from a Single Scan Taeksoo Kim et.al. 2401.12979v1 null
2024-01-23 Pretraining and the Lasso Erin Craig et.al. 2401.12911v1 null
2024-01-23 PSDF: Prior-Driven Neural Implicit Surface Learning for Multi-view Reconstruction Wanjuan Su et.al. 2401.12751v1 null
2024-01-23 Evaluation of large language models for assessing code maintainability Marc Dillmann et.al. 2401.12714v1 null
2024-01-23 Persona-centric Metamorphic Relation guided Robustness Evaluation for Multi-turn Dialogue Modelling Yanbing Chen et.al. 2401.12483v1 null
2024-01-23 The Neglected Tails of Vision-Language Models Shubham Parashar et.al. 2401.12425v1 null
2024-01-22 OCT-SelfNet: A Self-Supervised Framework with Multi-Modal Datasets for Generalized and Robust Retinal Disease Detection Fatema-E Jannat et.al. 2401.12344v1 null
2024-01-22 Contrastive Learning and Cycle Consistency-based Transductive Transfer Learning for Target Annotation Shoaib Meraj Sami et.al. 2401.12340v1 null
2024-01-22 APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference Bowen Zhao et.al. 2401.12200v1 null
2024-01-22 An Empirical Analysis of In-context Learning Abilities of LLMs for MT Pranjal A. Chitale et.al. 2401.12097v1 null
2024-01-22 Multi-level Cross-modal Alignment for Image Clustering Liping Qiu et.al. 2401.11740v1 null
2024-01-22 M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition Mengmeng Wang et.al. 2401.11649v1 null
2024-01-21 MolTailor: Tailoring Chemical Molecular Representation to Specific Tasks via Text Prompts Haoqiang Guo et.al. 2401.11403v1 link
2024-01-21 LLMRA: Multi-modal Large Language Model based Restoration Assistant Xiaoyu Jin et.al. 2401.11401v1 null
2024-01-19 Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition Ismail Rasim Ulgen et.al. 2401.11017v1 null
2024-01-19 Mitigating Hallucinations of Large Language Models via Knowledge Consistent Alignment Fanqi Wan et.al. 2401.10768v1 link
2024-01-19 DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval Xiangpeng Yang et.al. 2401.10588v1 null
2024-01-19 Name Tagging Under Domain Shift via Metric Learning for Life Sciences Hongyi Liu et.al. 2401.10472v1 null
2024-01-19 Investigating Training Strategies and Model Robustness of Low-Rank Adaptation for Language Modeling in Speech Recognition Yu Yu et.al. 2401.10447v1 null
2024-01-18 Supervised Fine-tuning in turn Improves Visual Foundation Models Xiaohu Jiang et.al. 2401.10222v1 link
2024-01-18 Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap Xingyu Wu et.al. 2401.10034v1 null
2024-01-18 Gender Bias in Machine Translation and The Era of Large Language Models Eva Vanmassenhove et.al. 2401.10016v1 null
2024-01-18 Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations Prince Jha et.al. 2401.09899v1 link
2024-01-18 Improving fine-grained understanding in image-text pre-training Ioana Bica et.al. 2401.09865v1 null
2024-01-18 Improving the Accuracy of Analog-Based In-Memory Computing Accelerators Post-Training Corey Lammie et.al. 2401.09859v1 null
2024-01-18 Simple and effective data augmentation for compositional generalization Yuekun Yao et.al. 2401.09815v1 null
2024-01-18 Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation Zesen Cheng et.al. 2401.09732v1 link
2024-01-17 CT Liver Segmentation via PVT-based Encoding and Refined Decoding Debesh Jha et.al. 2401.09630v1 link
2024-01-17 Aligning Large Language Models with Counterfactual DPO Bradley Butcher et.al. 2401.09566v1 null
2024-01-17 Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics for Detecting Human vs. Machine-Generated Text Mazal Bethany et.al. 2401.09407v1 null
2024-01-17 Machines Do See Color: A Guideline to Classify Different Forms of Racist Discourse in Large Corpora Diana Davila Gordillo et.al. 2401.09333v1 null
2024-01-17 An Efficient Generalizable Framework for Visuomotor Policies via Control-aware Augmentation and Privilege-guided Distillation Yinuo Zhao et.al. 2401.09258v1 null
2024-01-17 Preparing Lessons for Progressive Training on Language Models Yu Pan et.al. 2401.09192v1 null
2024-01-17 Visual Robotic Manipulation with Depth-Aware Pretraining Wanying Wang et.al. 2401.09038v1 null
2024-01-16 Fast Dynamic 3D Object Generation from a Single-view Video Zijie Pan et.al. 2401.08742v1 null
2024-01-16 Fixed Point Diffusion Models Xingjian Bai et.al. 2401.08741v1 null
2024-01-16 Tuning Language Models by Proxy Alisa Liu et.al. 2401.08565v1 null
2024-01-16 GATS: Gather-Attend-Scatter Konrad Zolna et.al. 2401.08525v1 null
2024-01-17 Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models Jianhui Pang et.al. 2401.08350v2 null
2024-01-16 MCRPL: A Pretrain, Prompt & Fine-tune Paradigm for Non-overlapping Many-to-one Cross-domain Recommendation Hao Liu et.al. 2401.08228v1 null
2024-01-16 SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation Zhixuan Liu et.al. 2401.08053v1 null
2024-01-15 How does self-supervised pretraining improve robustness against noisy labels across various medical image classification datasets? Bidur Khanal et.al. 2401.07990v1 null
2024-01-15 Word Boundary Information Isn't Useful for Encoder Language Models Edward Gow-Smith et.al. 2401.07923v1 null
2024-01-15 EMBRE: Entity-aware Masking for Biomedical Relation Extraction Mingjie Li et.al. 2401.07877v1 null
2024-01-15 VeCAF: VLM-empowered Collaborative Active Finetuning with Training Objective Awareness Rongyu Zhang et.al. 2401.07853v1 null
2024-01-15 Fusing Echocardiography Images and Medical Records for Continuous Patient Stratification Nathan Painchaud et.al. 2401.07796v1 null
2024-01-15 On the importance of Data Scale in Pretraining Arabic Language Models Abbas Ghaddar et.al. 2401.07760v1 link
2024-01-15 HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation Antoine Mercier et.al. 2401.07727v1 null
2024-01-12 Scalable 3D Panoptic Segmentation With Superpoint Graph Clustering Damien Robert et.al. 2401.06704v1 link
2024-01-12 TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models Yihong Liu et.al. 2401.06620v1 null
2024-01-12 BOK-VQA: Bilingual Outside Knowledge-based Visual Question Answering via Graph Representation Pretraining Minjun Kim et.al. 2401.06443v1 null
2024-01-12 AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters Li Lucy et.al. 2401.06408v1 link
2024-01-12 AffordanceLLM: Grounding Affordance from Vision Language Models Shengyi Qian et.al. 2401.06341v1 null
2024-01-12 Application Of Vision-Language Models For Assessing Osteoarthritis Disease Severity Banafshe Felfeliyan et.al. 2401.06331v1 null
2024-01-11 A Study on Self-Supervised Pretraining for Vision Problems in Gastrointestinal Endoscopy Edward Sanderson et.al. 2401.06278v1 null
2024-01-11 Transformers are Multi-State RNNs Matanel Oren et.al. 2401.06104v1 null
2024-01-11 Autocompletion of Chief Complaints in the Electronic Health Records using Large Language Models K M Sajjadul Islam et.al. 2401.06088v1 null
2024-01-11 LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization Muhammad Farid Adilazuarda et.al. 2401.06034v1 null
2024-01-11 DiffDA: a diffusion model for weather-scale data assimilation Langwen Huang et.al. 2401.05932v1 null
2024-01-11 Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models Pengzhi Gao et.al. 2401.05861v1 link
2024-01-11 Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations Zhihui Xie et.al. 2401.05792v1 link
2024-01-11 Zero Resource Cross-Lingual Part Of Speech Tagging Sahil Chopra et.al. 2401.05727v1 null
2024-01-10 Diffusion Priors for Dynamic View Synthesis from Monocular Videos Chaoyang Wang et.al. 2401.05583v1 null
2024-01-10 Siamese Networks with Soft Labels for Unsupervised Lesion Detection and Patch Pretraining on Screening Mammograms Kevin Van Vorst et.al. 2401.05570v1 null
2024-01-10 Physics guided dual Self-supervised learning for structure-based materials property prediction Nihang Fu et.al. 2401.05223v1 link
2024-01-10 Pre-trained Large Language Models for Financial Sentiment Analysis Wei Luo et.al. 2401.05215v1 null
2024-01-10 MISS: A Generative Pretraining and Finetuning Approach for Med-VQA Jiawei Chen et.al. 2401.05163v1 null
2024-01-09 Phishing Website Detection through Multi-Model Analysis of HTML Content Furkan Çolhak et.al. 2401.04820v1 null
2024-01-10 RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation Mahdi Nikdan et.al. 2401.04679v2 null
2024-01-09 DepressionEmo: A novel dataset for multilabel classification of depression emotions Abu Bakar Siddiqur Rahman et.al. 2401.04655v1 link
2024-01-09 Representative Feature Extraction During Diffusion Process for Sketch Extraction with One Example Kwan Yun et.al. 2401.04362v1 null
2024-01-09 Private Fine-tuning of Large Language Models with Zeroth-order Optimization Xinyu Tang et.al. 2401.04343v1 null
2024-01-08 Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning Chen Zhao et.al. 2401.04105v1 null
2024-01-08 FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Inference Zirui Liu et.al. 2401.04044v1 null
2024-01-08 TTMs: Fast Multi-level Tiny Time Mixers for Improved Zero-shot and Few-shot Forecasting of Multivariate Time Series Vijay Ekambaram et.al. 2401.03955v1 null
2024-01-08 TeleChat Technical Report Zihan Wang et.al. 2401.03804v1 null
2024-01-08 Anatomy of Neural Language Models Majd Saleh et.al. 2401.03797v1 link
2024-01-07 Transfer the linguistic representations from TTS to accent conversion with non-parallel data Xi Chen et.al. 2401.03538v1 null
2024-01-05 Locally Adaptive Neural 3D Morphable Models Michail Tarasiou et.al. 2401.02937v1 link
2024-01-05 MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance Renjie Pi et.al. 2401.02906v1 link
2024-01-05 Pheme: Efficient and Conversational Speech Generation Paweł Budzianowski et.al. 2401.02839v1 null
2024-01-05 Fus-MAE: A cross-attention-based data fusion approach for Masked Autoencoders in remote sensing Hugo Chan-To-Hing et.al. 2401.02764v1 null
2024-01-05 Detection and Classification of Diabetic Retinopathy using Deep Learning Algorithms for Segmentation to Facilitate Referral Recommendation for Test and Treatment Prediction Manoj S H et.al. 2401.02759v1 link
2024-01-05 MAMI: Multi-Attentional Mutual-Information for Long Sequence Neuron Captioning Alfirsa Damasyifa Fauzulhaq et.al. 2401.02744v1 null
2024-01-05 Synergistic Formulaic Alpha Generation for Quantitative Trading based on Reinforcement Learning Hong-Gi Shin et.al. 2401.02710v1 null
2024-01-05 Benchmarking PathCLIP for Pathology Image Analysis Sunyi Zheng et.al. 2401.02651v1 null
2024-01-05 MOODv2: Masked Image Modeling for Out-of-Distribution Detection Jingyao Li et.al. 2401.02611v1 null
2024-01-04 Vulnerabilities Unveiled: Adversarially Attacking a Multimodal Vision Langauge Model for Pathology Imaging Jai Prakash Veerla et.al. 2401.02565v1 null
2024-01-04 LLaMA Pro: Progressive LLaMA with Block Expansion Chengyue Wu et.al. 2401.02415v1 link
2024-01-04 TinyLlama: An Open-Source Small Language Model Peiyuan Zhang et.al. 2401.02385v1 link
2024-01-04 DIALIGHT: Lightweight Multilingual Development and Evaluation of Task-Oriented Dialogue Systems with Large Language Models Songbo Hu et.al. 2401.02208v1 null
2024-01-04 Location Aware Modular Biencoder for Tourism Question Answering Haonan Li et.al. 2401.02187v1 link
2024-01-04 SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment Ziping Ma et.al. 2401.02137v1 null
2024-01-04 Text2MDT: Extracting Medical Decision Trees from Medical Texts Wei Zhu et.al. 2401.02034v1 null
2024-01-03 Revisiting Zero-Shot Abstractive Summarization in the Era of Large Language Models from the Perspective of Position Bias Anshuman Chhabra et.al. 2401.01989v1 link
2024-01-03 The Power of Training: How Different Neural Network Setups Influence the Energy Demand Daniel Geißler et.al. 2401.01851v1 null
2024-01-03 FullLoRA-AT: Efficiently Boosting the Robustness of Pretrained Vision Transformers Zheng Yuan et.al. 2401.01752v1 null
2024-01-03 De-Confusing Pseudo-Labels in Source-Free Domain Adaptation Idit Diamant et.al. 2401.01650v1 null
2024-01-03 Towards a Foundation Purchasing Model: Pretrained Generative Autoregression on Transaction Sequences Piotr Skalski et.al. 2401.01641v1 link
2024-01-02 Deep-ELA: Deep Exploratory Landscape Analysis with Self-Supervised Pretrained Transformers for Single- and Multi-Objective Continuous Optimization Problems Moritz Vinzent Seiler et.al. 2401.01192v1 null
2024-01-02 Query-Based Knowledge Sharing for Open-Vocabulary Multi-Label Classification Xuelin Zhu et.al. 2401.01181v1 null
2024-01-02 Quokka: An Open-source Large Language Model ChatBot for Material Science Xianjun Yang et.al. 2401.01089v1 link
2024-01-02 LLaMA Beyond English: An Empirical Study on Language Capability Transfer Jun Zhao et.al. 2401.01055v1 null
2024-01-02 Cheetah: Natural Language Generation for 517 African Languages Ife Adebara et.al. 2401.01053v1 null
2024-01-01 Multi-Lattice Sampling of Quantum Field Theories via Neural Operators Bálint Máté et.al. 2401.00828v1 null
2024-01-01 Self-supervised learning for skin cancer diagnosis with limited training data Hamish Haggerty et.al. 2401.00692v1 null
2024-01-01 Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models Guangji Bai et.al. 2401.00625v1 null
2023-12-31 Neural Networks Against (and For) Self-Training: Classification with Small Labeled and Large Unlabeled Sets Payam Karisani et.al. 2401.00575v1 link
2023-12-31 A Generalist FaceX via Learning Unified Facial Representation Yue Han et.al. 2401.00551v1 link
2023-12-29 MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining Jacob Portes et.al. 2312.17482v1 null
2023-12-29 FerKD: Surgical Label Adaptation for Efficient Distillation Zhiqiang Shen et.al. 2312.17473v1 link
2023-12-29 Video Understanding with Large Language Models: A Survey Yunlong Tang et.al. 2312.17432v1 link
2023-12-28 The LLM Surgeon Tycho F. A. van der Ouderaa et.al. 2312.17244v1 null
2023-12-28 Unsupervised Universal Image Segmentation Dantong Niu et.al. 2312.17243v1 link
2023-12-28 Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution Ying Wang et.al. 2312.17174v1 link
2023-12-28 Non-Vacuous Generalization Bounds for Large Language Models Sanae Lotfi et.al. 2312.17173v1 null
2023-12-28 Restoration by Generation with Constrained Priors Zheng Ding et.al. 2312.17161v1 null
2023-12-28 Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math Zengzhi Wang et.al. 2312.17120v1 link
2023-12-29 Length Extrapolation of Transformers: A Survey from the Perspective of Position Encoding Liang Zhao et.al. 2312.17044v2 null
2023-12-28 3DTINC: Time-Equivariant Non-Contrastive Learning for Predicting Disease Progression from Longitudinal OCTs Taha Emre et.al. 2312.16980v1 null
2023-12-27 I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models Xun Guo et.al. 2312.16693v1 null
2023-12-27 LIP-Loc: LiDAR Image Pretraining for Cross-Modal Localization Sai Shubodh Puligilla et.al. 2312.16648v1 null
2023-12-22 DRStageNet: Deep Learning for Diabetic Retinopathy Staging from Fundus Images Yevgeniy Men et.al. 2312.14891v1 null
2023-12-22 Hazards from Increasingly Accessible Fine-Tuning of Downloadable Foundation Models Alan Chan et.al. 2312.14751v1 null
2023-12-22 Harnessing Diffusion Models for Visual Perception with Meta Prompts Qiang Wan et.al. 2312.14733v1 link
2023-12-22 Inclusive normalization of face images to passport format Hongliu Cao et.al. 2312.14544v1 null
2023-12-22 ADA-GAD: Anomaly-Denoised Autoencoders for Graph Anomaly Detection Junwei He et.al. 2312.14535v1 null
2023-12-22 Generative Pretraining at Scale: Transformer-Based Encoding of Transactional Behavior for Fraud Detection Ze Yu Zhao et.al. 2312.14406v1 null
2023-12-22 Unveiling Backbone Effects in CLIP: Exploring Representational Synergies and Variances Cristian Rodriguez-Opazo et.al. 2312.14400v1 null
2023-12-22 StyleRetoucher: Generalized Portrait Image Retouching with GAN Priors Wanchao Su et.al. 2312.14389v1 null
2023-12-21 Crystal Growth Characterization of WSe$_2$ Thin Film Using Machine Learning Isaiah A. Moses et.al. 2312.14311v1 null
2023-12-21 DUSt3R: Geometric 3D Vision Made Easy Shuzhe Wang et.al. 2312.14132v1 null
2023-12-21 VideoPoet: A Large Language Model for Zero-Shot Video Generation Dan Kondratyuk et.al. 2312.14125v1 null
2023-12-21 Typhoon: Thai Large Language Models Kunat Pipatanakul et.al. 2312.13951v1 null
2023-12-21 TinySAM: Pushing the Envelope for Efficient Segment Anything Model Han Shu et.al. 2312.13789v1 link
2023-12-21 DreamTuner: Single Image is Enough for Subject-Driven Generation Miao Hua et.al. 2312.13691v1 null
2023-12-21 DyBluRF: Dynamic Deblurring Neural Radiance Fields for Blurry Monocular Video Minh-Quan Viet Bui et.al. 2312.13528v1 null
2023-12-20 Time is Encoded in the Weights of Finetuned Language Models Kai Nylund et.al. 2312.13401v1 null
2023-12-20 Conditional Image Generation with Pretrained Generative Model Rajesh Shrestha et.al. 2312.13253v1 null
2023-12-20 A 3D super-resolution of wind fields via physics-informed pixel-wise self-attention generative adversarial network Takuya Kurihana et.al. 2312.13212v1 null
2023-12-21 Molecular Hypergraph Neural Networks Junwu Chen et.al. 2312.13136v2 link
2023-12-19 Value Explicit Pretraining for Goal-Based Transfer Learning Kiran Lekkala et.al. 2312.12339v1 null
2023-12-19 Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment Lingling Xu et.al. 2312.12148v1 null
2023-12-19 ZS-SRT: An Efficient Zero-Shot Super-Resolution Training Method for Neural Radiance Fields Xiang Feng et.al. 2312.12122v1 null
2023-12-19 DMT: Comprehensive Distillation with Multiple Self-supervised Teachers Yuang Liu et.al. 2312.11938v1 null
2023-12-19 Empowering Dual-Level Graph Self-Supervised Pretraining with Motif Discovery Pengwei Yan et.al. 2312.11927v1 link
2023-12-18 Ultrasound Image Enhancement using CycleGAN and Perceptual Loss Shreeram Athreya et.al. 2312.11748v1 link
2023-12-18 Evaluating Language-Model Agents on Realistic Autonomous Tasks Megan Kinniment et.al. 2312.11671v1 null
2023-12-18 Implicit Affordance Acquisition via Causal Action-Effect Modeling in the Video Domain Hsiu-Yu Yang et.al. 2312.11345v1 null
2023-12-18 UniDCP: Unifying Multiple Medical Vision-language Tasks via Dynamic Cross-modal Learnable Prompts Chenlu Zhan et.al. 2312.11171v1 null
2023-12-19 Split and Rephrase with Large Language Models David Ponce et.al. 2312.11075v2 null
2023-12-17 CEIR: Concept-based Explainable Image Representation Learning Yan Cui et.al. 2312.10747v1 null
2023-12-17 Addressing Sample Inefficiency in Multi-View Representation Learning Kumar Krishna Agrawal et.al. 2312.10725v1 null
2023-12-17 T2M-HiFiGPT: Generating High Quality Human Motion from Textual Descriptions with Residual Discrete Representations Congyi Wang et.al. 2312.10628v1 null
2023-12-17 Do LLMs Work on Charts? Designing Few-Shot Prompts for Chart Question Answering and Summarization Xuan Long Do et.al. 2312.10610v1 null
2023-12-16 Paloma: A Benchmark for Evaluating Language Model Fit Ian Magnusson et.al. 2312.10523v1 null
2023-12-16 Enhancing Person Re-Identification through Tensor Feature Fusion Akram Abderraouf Gharbi et.al. 2312.10470v1 null
2023-12-16 RetailKLIP : Finetuning OpenCLIP backbone using metric learning on a single GPU for Zero-shot retail product image classification Muktabh Mayank Srivastava et.al. 2312.10282v1 null
2023-12-15 Bayesian Estimate of Mean Proper Scores for Diversity-Enhanced Active Learning Wei Tan et.al. 2312.10116v1 null
2023-12-15 PathoDuet: Foundation Models for Pathological Slide Analysis of H&E and IHC Stains Shengyi Hua et.al. 2312.09894v1 link
2023-12-15 Probing Pretrained Language Models with Hierarchy Properties Jesús Lovón-Melgarejo et.al. 2312.09670v1 null
2023-12-15 Vectorizing string entries for data processing on tables: when are larger language models better? Léo Grinsztajn et.al. 2312.09634v1 null
2023-12-15 Image Deblurring using GAN Zhengdong Li et.al. 2312.09496v1 null
2023-12-14 Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision Collin Burns et.al. 2312.09390v1 null
2023-12-14 Weight subcloning: direct initialization of transformers using larger pretrained ones Mohammad Samragh et.al. 2312.09299v1 null
2023-12-14 ZeroRF: Fast Sparse View 360° Reconstruction with Zero Pretraining Ruoxi Shi et.al. 2312.09249v1 null
2023-12-14 Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking Jacob Eisenstein et.al. 2312.09244v1 null
2023-12-14 OccNeRF: Self-Supervised Multi-Camera Occupancy Prediction with Neural Radiance Fields Chubin Zhang et.al. 2312.09243v1 link
2023-12-14 Reliability in Semantic Segmentation: Can We Use Synthetic Data? Thibaut Loiseau et.al. 2312.09231v1 null
2023-12-14 WIT-UAS: A Wildland-fire Infrared Thermal Dataset to Detect Crew Assets From Aerial Views Andrew Jong et.al. 2312.09159v1 link
2023-12-14 Exploring Transferability for Randomized Smoothing Kai Qiu et.al. 2312.09020v1 null
2023-12-14 OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers Han Liang et.al. 2312.08985v1 null
2023-12-14 BiPFT: Binary Pre-trained Foundation Transformer with Low-rank Estimation of Binarization Residual Polynomials Xingrun Xing et.al. 2312.08937v1 link
2023-12-14 Guided Diffusion from Self-Supervised Diffusion Features Vincent Tao Hu et.al. 2312.08825v1 null
2023-12-14 Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention Kaiqiang Song et.al. 2312.08618v1 null
2023-12-13 Enhancing Robot Program Synthesis Through Environmental Context Tianyi Chen et.al. 2312.08250v1 null
2023-12-13 Patch-wise Graph Contrastive Learning for Image Translation Chanyong Jung et.al. 2312.08223v1 null
2023-12-13 Knowledge-Aware Artifact Image Synthesis with LLM-Enhanced Prompting and Multi-Source Supervision Shengguang Wu et.al. 2312.08056v1 null
2023-12-13 SLJP: Semantic Extraction based Legal Judgment Prediction Prameela Madambakam et.al. 2312.07979v1 null
2023-12-13 CoIE: Chain-of-Instruct Editing for Multi-Attribute Face Manipulation Zhenduo Zhang et.al. 2312.07879v1 null
2023-12-13 Foundation Models in Robotics: Applications, Challenges, and the Future Roya Firoozi et.al. 2312.07843v1 null
2023-12-13 A Foundational Multimodal Vision Language AI Assistant for Human Pathology Ming Y. Lu et.al. 2312.07814v1 null
2023-12-12 Tell, don't show: Declarative facts influence how LLMs generalize Alexander Meinke et.al. 2312.07779v1 null
2023-12-12 A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning Yinmin Zhang et.al. 2312.07685v1 null
2023-12-12 GMTalker: Gaussian Mixture based Emotional talking video Portraits Yibo Xia et.al. 2312.07669v1 null
2023-12-12 Double-Flow GAN model for the reconstruction of perceived faces from brain activities Zihao Wang et.al. 2312.07478v1 null
2023-12-12 Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval Love Panta et.al. 2312.07435v1 null
2023-12-12 How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation Zhongyi Han et.al. 2312.07424v1 null
2023-12-12 ICL Markup: Structuring In-Context Learning using Soft-Token Tags Marc-Etienne Brunet et.al. 2312.07405v1 null
2023-12-12 Benchmarking Pretrained Vision Embeddings for Near- and Duplicate Detection in Medical Images Tuan Truong et.al. 2312.07273v1 null
2023-12-12 ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open Vocabulary Object Detection Joonhyun Jeong et.al. 2312.07266v1 null
2023-12-12 Dynamic Corrective Self-Distillation for Better Fine-Tuning of Pretrained Models Ibtihel Amara et.al. 2312.07028v1 null
2023-12-12 CCM: Adding Conditional Controls to Text-to-Image Consistency Models Jie Xiao et.al. 2312.06971v1 null
2023-12-12 READ-PVLA: Recurrent Adapter with Partial Video-Language Alignment for Parameter-Efficient Transfer Learning in Low-Resource Video-Language Modeling Thong Nguyen et.al. 2312.06950v1 null
2023-12-11 DYAD: A Descriptive Yet Abjuring Density efficient approximation to linear neural network layers Sarin Chandy et.al. 2312.06881v1 link
2023-12-11 De novo Design of Polymer Electrolytes with High Conductivity using GPT-based and Diffusion-based Generative Models Zhenze Yang et.al. 2312.06470v1 null
2023-12-11 PointVoxel: A Simple and Effective Pipeline for Multi-View Multi-Modal 3D Human Pose Estimation Zhiyu Pan et.al. 2312.06409v1 null
2023-12-11 MMDesign: Multi-Modality Transfer Learning for Generative Protein Design Jiangbin Zheng et.al. 2312.06297v1 null
2023-12-11 Medical Vision Language Pretraining: A survey Prashant Shrestha et.al. 2312.06224v1 null
2023-12-10 NovaCOMET: Open Commonsense Foundation Models with Symbolic Knowledge Distillation Peter West et.al. 2312.05979v1 null
2023-12-10 A Comprehensive Dataset and Automated Pipeline for Nailfold Capillary Analysis Linxi Zhao et.al. 2312.05930v1 link
2023-12-10 Building Variable-sized Models via Learngene Pool Boyu Shi et.al. 2312.05743v1 null
2023-12-10 Initialization Matters for Adversarial Transfer Learning Andong Hua et.al. 2312.05716v1 null
2023-12-09 Understanding the Effect of Model Compression on Social Bias in Large Language Models Gustavo Gonçalves et.al. 2312.05662v1 link
2023-12-09 Enhancing Medical Specialty Assignment to Patients using NLP Techniques Chris Solomou et.al. 2312.05585v1 null
2023-12-08 SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation Thuan Hoang Nguyen et.al. 2312.05239v1 null
2023-12-08 Datasets, Models, and Algorithms for Multi-Sensor, Multi-agent Autonomy Using AVstack R. Spencer Hallyburton et.al. 2312.04970v1 null
2023-12-08 Zoology: Measuring and Improving Recall in Efficient Language Models Simran Arora et.al. 2312.04927v1 link
2023-12-08 Cross-BERT for Point Cloud Pretraining Xin Li et.al. 2312.04891v1 null
2023-12-08 Adapting Vision Transformer for Efficient Change Detection Yang Zhao et.al. 2312.04869v1 null
2023-12-08 HuRef: HUman-REadable Fingerprint for Large Language Models Boyi Zeng et.al. 2312.04828v1 null
2023-12-07 STraceBERT: Source Code Retrieval using Semantic Application Traces Claudio Spiess et.al. 2312.04731v1 null
2023-12-07 Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models Victor Agostinelli et.al. 2312.04691v1 null
2023-12-07 ConVRT: Consistent Video Restoration Through Turbulence with Test-time Optimization of Neural Video Representations Haoming Cai et.al. 2312.04679v1 null
2023-12-07 On Sarcasm Detection with OpenAI GPT-based Models Montgomery Gole et.al. 2312.04642v1 null
2023-12-07 Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning Yongqi Dong et.al. 2312.04398v1 null
2023-12-07 Multi-View Unsupervised Image Generation with Cross Attention Guidance Llukman Cerkezi et.al. 2312.04337v1 null
2023-12-07 Diffusing Colors: Image Colorization with Text Guided Diffusion Nir Zabari et.al. 2312.04145v1 null
2023-12-07 Instance Tracking in 3D Scenes from Egocentric Videos Yunhan Zhao et.al. 2312.04117v1 link
2023-12-07 Enhancing the Rationale-Input Alignment for Self-explaining Rationalization Wei Liu et.al. 2312.04103v1 null
2023-12-05 DiffusionAtlas: High-Fidelity Consistent Diffusion Video Editing Shao-Yu Chang et.al. 2312.03772v1 null
2023-12-06 Blueprinting the Future: Automatic Item Categorization using Hierarchical Zero-Shot and Few-Shot Classifiers Ting Wang et.al. 2312.03561v1 null
2023-12-06 PneumoLLM: Harnessing the Power of Large Language Model for Pneumoconiosis Diagnosis Meiyue Song et.al. 2312.03490v1 link
2023-12-06 Molecule Joint Auto-Encoding: Trajectory Pretraining with 2D and 3D Diffusion Weitao Du et.al. 2312.03475v1 null
2023-12-05 Leveraging Laryngograph Data for Robust Voicing Detection in Speech Yixuan Zhang et.al. 2312.03129v1 link
2023-12-05 DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control Yuru Jia et.al. 2312.03048v1 null
2023-12-05 MagicStick: Controllable Video Editing via Control Handle Transformations Yue Ma et.al. 2312.03047v1 link
2023-12-05 Zero-Shot Point Cloud Registration Weijie Wang et.al. 2312.03032v1 null
2023-12-05 WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words Lukas Wolf et.al. 2312.02931v1 null
2023-12-05 Rare Galaxy Classes Identified In Foundation Model Representations Mike Walmsley et.al. 2312.02910v1 null
2023-12-05 Large Knowledge Model: Perspectives and Challenges Huajun Chen et.al. 2312.02706v1 null
2023-12-05 Prompt2NeRF-PIL: Fast NeRF Generation via Pretrained Implicit Latent Jianmeng Liu et.al. 2312.02568v1 null
2023-12-05 Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation Shanshan Zhong et.al. 2312.02439v1 link
2023-12-05 Visually Grounded Language Learning: a review of language games, datasets, tasks, and models Alessandro Suglia et.al. 2312.02431v1 null
2023-12-05 Efficient Online Data Mixing For Language Model Pre-Training Alon Albalak et.al. 2312.02406v1 null
2023-12-04 FaultFormer: Transformer-based Prediction of Bearing Faults Anthony Zhou et.al. 2312.02380v1 null
2023-12-04 Rejuvenating image-GPT as Strong Visual Representation Learners Sucheng Ren et.al. 2312.02147v1 link
2023-12-04 Object Recognition as Next Token Prediction Kaiyu Yue et.al. 2312.02142v1 link
2023-12-04 TPPoet: Transformer-Based Persian Poem Generation using Minimal Data and Advanced Decoding Techniques Amir Panahandeh et.al. 2312.02125v1 null
2023-12-04 Open-DDVM: A Reproduction and Extension of Diffusion Model for Optical Flow Estimation Qiaole Dong et.al. 2312.01746v1 link
2023-12-04 Data Management For Large Language Models: A Survey Zige Wang et.al. 2312.01700v1 null
2023-12-04 SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference Feng Wang et.al. 2312.01597v1 null
2023-12-04 APoLLo: Unified Adapter and Prompt Learning for Vision Language Models Sanjoy Chowdhury et.al. 2312.01564v1 null
2023-12-03 Effectively Fine-tune to Improve Large Multimodal Models for Radiology Report Generation Yuzhe Lu et.al. 2312.01504v1 null
2023-12-03 Improving In-Context Learning in Diffusion Models with Visual Context-Modulated Prompts Tianqi Chen et.al. 2312.01408v1 null
2023-12-03 Stable Messenger: Steganography for Message-Concealed Image Generation Quang Nguyen et.al. 2312.01284v1 null
2023-12-01 Mamba: Linear-Time Sequence Modeling with Selective State Spaces Albert Gu et.al. 2312.00752v1 null
2023-12-01 GIFT: Generative Interpretable Fine-Tuning Transformers Chinmay Savadikar et.al. 2312.00700v1 link
2023-12-01 Nonparametric Variational Regularisation of Pretrained Transformers Fabio Fehr et.al. 2312.00662v1 null
2023-12-01 Summarization-based Data Augmentation for Document Classification Yueguan Wang et.al. 2312.00513v1 link
2023-12-01 PyraTrans: Learning Attention-Enriched Multi-Scale Pyramid Network from Pre-Trained Transformers for Effective Malicious URL Detection Ruitong Liu et.al. 2312.00508v1 null
2023-12-01 On the Out-Of-Distribution Robustness of Self-Supervised Representation Learning for Phonocardiogram Signals Aristotelis Ballas et.al. 2312.00502v1 link
2023-12-01 Towards Generalizable Referring Image Segmentation via Target Prompt and Visual Coherence Yajie Liu et.al. 2312.00452v1 null
2023-12-01 Dolphins: Multimodal Language Model for Driving Yingzi Ma et.al. 2312.00438v1 null
2023-12-01 PEFTDebias : Capturing debiasing information using PEFTs Sumit Agarwal et.al. 2312.00434v1 null
2023-12-01 SynFundus: Generating a synthetic fundus images dataset with millions of samples and multi-disease annotations Fangxin Shang et.al. 2312.00377v1 null
2023-11-30 Initializing Models with Larger Ones Zhiqiu Xu et.al. 2311.18823v1 link
2023-11-30 ElasticDiffusion: Training-free Arbitrary Size Image Generation Moayed Haji-Ali et.al. 2311.18822v1 link
2023-11-30 ArthModel: Enhance Arithmetic Skills to Large Language Model Yingdi Guo et.al. 2311.18609v1 null
2023-11-30 Dataset Distillation via the Wasserstein Metric Haoyang Liu et.al. 2311.18531v1 null
2023-11-30 MV-CLIP: Multi-View CLIP for Zero-shot 3D Shape Recognition Dan Song et.al. 2311.18402v1 null
2023-11-30 Transfer Learning across Different Chemical Domains: Virtual Screening of Organic Materials with Deep Learning Models Pretrained on Small Molecule and Chemical Reaction Data Chengwei Zhang et.al. 2311.18377v1 null
2023-11-30 Hubness Reduction Improves Sentence-BERT Semantic Spaces Beatrix M. G. Nielsen et.al. 2311.18364v1 link
2023-11-30 OmniMotionGPT: Animal Motion Generation with Limited Data Zhangsihao Yang et.al. 2311.18303v1 null
2023-11-30 HKUST at SemEval-2023 Task 1: Visual Word Sense Disambiguation with Context Augmentation and Visual Assistance Zhuohao Yin et.al. 2311.18273v1 link
2023-11-30 LLVMs4Protest: Harnessing the Power of Large Language and Vision Models for Deciphering Protests in the News Yongjun Zhang et.al. 2311.18241v1 link
2023-11-29 A Simple Recipe for Language-guided Domain Generalized Segmentation Mohammad Fahes et.al. 2311.17922v1 null
2023-11-29 Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation Shuangrui Ding et.al. 2311.17893v1 null
2023-11-29 Evaluating VLMs for Score-Based, Multi-Probe Annotation of 3D Objects Rishabh Kabra et.al. 2311.17851v1 null
2023-11-29 SPiC-E : Structural Priors in 3D Diffusion Models using Cross Entity Attention Etai Sella et.al. 2311.17834v1 null
2023-11-29 DAP: Domain-aware Prompt Learning for Vision-and-Language Navigation Ting Liu et.al. 2311.17812v1 null
2023-11-29 PillarNeSt: Embracing Backbone Scaling and Pretraining for Pillar-based 3D Object Detection Weixin Mao et.al. 2311.17770v1 null
2023-11-29 SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Scene Segmentation Mutian Xu et.al. 2311.17707v1 null
2023-11-29 Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes Pavel Korshunov et.al. 2311.17655v1 null
2023-11-29 Continual Learning with Low Rank Adaptation Martin Wistuba et.al. 2311.17601v1 null
2023-11-29 HiDiffusion: Unlocking High-Resolution Creativity and Efficiency in Low-Resolution Trained Diffusion Models Shen Zhang et.al. 2311.17528v1 null
2023-11-28 Self-Supervised Motion Magnification by Backpropagating Through Optical Flow Zhaoying Pan et.al. 2311.17056v1 null
2023-11-28 Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models Zhengming Yu et.al. 2311.17050v1 null
2023-11-28 MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training Pavan Kumar Anasosalu Vasu et.al. 2311.17049v1 null
2023-11-28 Debiasing Multimodal Models via Causal Information Minimization Vaidehi Patil et.al. 2311.16941v1 link
2023-11-28 LLaFS: When Large-Language Models Meet Few-Shot Segmentation Lanyun Zhu et.al. 2311.16926v1 link
2023-11-28 The Falcon Series of Open Language Models Ebtesam Almazrouei et.al. 2311.16867v1 null
2023-11-28 As-Plausible-As-Possible: Plausibility-Aware Mesh Deformation Using 2D Diffusion Priors Seungwoo Yoo et.al. 2311.16739v1 null
2023-11-28 CLAP: Contrastive Learning with Augmented Prompts for Robustness on Pretrained Vision-Language Models Yichao Cai et.al. 2311.16445v1 null
2023-11-28 Text-Driven Image Editing via Learnable Regions Yuanze Lin et.al. 2311.16432v1 link
2023-11-28 Manifold Preserving Guided Diffusion Yutong He et.al. 2311.16424v1 null
2023-11-27 On Bringing Robots Home Nur Muhammad Mahi Shafiullah et.al. 2311.16098v1 link
2023-11-27 ViT-Lens-2: Gateway to Omni-modal Intelligence Weixian Lei et.al. 2311.16081v1 link
2023-11-27 MEDITRON-70B: Scaling Medical Pretraining for Large Language Models Zeming Chen et.al. 2311.16079v1 link
2023-11-27 Exploring Attribute Variations in Style-based GANs using Diffusion Models Rishubh Parihar et.al. 2311.16052v1 null
2023-11-27 Sparsify-then-Classify: From Internal Neurons of Large Language Models To Efficient Text Classifiers Yilun Liu et.al. 2311.15983v1 null
2023-11-27 YUAN 2.0: A Large Language Model with Localized Filtering-based Attention Shaohua Wu et.al. 2311.15786v1 link
2023-11-27 Enhancing Diffusion Models with Text-Encoder Reinforcement Learning Chaofeng Chen et.al. 2311.15657v1 link
2023-11-27 ET3D: Efficient Text-to-3D Generation via Multi-View Distillation Yiming Chen et.al. 2311.15561v1 null
2023-11-27 Dataset Distillation in Latent Space Yuxuan Duan et.al. 2311.15547v1 null
2023-11-27 AerialBooth: Mutual Information Guidance for Text Controlled Aerial View Synthesis from a Single Image Divya Kothandaraman et.al. 2311.15478v1 null
2023-11-24 Calibrated Language Models Must Hallucinate Adam Tauman Kalai et.al. 2311.14648v1 null
2023-11-24 tinyCLAP: Distilling Constrastive Language-Audio Pretrained Models Francesco Paissan et.al. 2311.14517v1 null
2023-11-24 LLamol: A Dynamic Multi-Conditional Generative Transformer for De Novo Molecular Design Niklas Dobberstein et.al. 2311.14407v1 null
2023-11-24 ÚFAL CorPipe at CRAC 2023: Larger Context Improves Multilingual Coreference Resolution Milan Straka et.al. 2311.14391v1 null
2023-11-24 Binarized 3D Whole-body Human Mesh Recovery Zhiteng Li et.al. 2311.14323v1 link
2023-11-24 ZeroPS: High-quality Cross-modal Knowledge Transfer for Zero-Shot 3D Part Segmentation Yuheng Xue et.al. 2311.14262v1 null
2023-11-23 Learning to Solve Inverse Problems for Perceptual Sound Matching Han Han et.al. 2311.14213v1 null
2023-11-23 Hardware Resilience Properties of Text-Guided Image Classifiers Syed Talal Wasim et.al. 2311.14062v1 link
2023-11-23 Dialogue Quality and Emotion Annotations for Customer Support Conversations John Mendonça et.al. 2311.13910v1 link
2023-11-23 General Phrase Debiaser: Debiasing Masked Language Models at a Multi-Token Level Bingkang Shi et.al. 2311.13892v1 link
2023-11-22 Medical Image Retrieval Using Pretrained Embeddings Farnaz Khun Jush et.al. 2311.13547v1 null
2023-11-22 Revisiting Machine Learning based Test Case Prioritization for Continuous Integration Yifan Zhao et.al. 2311.13413v1 link
2023-11-22 High-Quality Face Caricature via Style Translation Lamyanba Laishram et.al. 2311.13338v1 null
2023-11-22 FedFN: Feature Normalization for Alleviating Data Heterogeneity Problem in Federated Learning Seongyoon Kim et.al. 2311.13267v1 null
2023-11-22 On the Calibration of Large Language Models and Alignment Chiwei Zhu et.al. 2311.13240v1 null
2023-11-22 Volumetric Reconstruction Resolves Off-Resonance Artifacts in Static and Dynamic PROPELLER MRI Annesha Ghosh et.al. 2311.13177v1 link
2023-11-22 GENET: Unleashing the Power of Side Information for Recommendation via Hypergraph Pre-training Yang Li et.al. 2311.13121v1 null
2023-11-21 Diffusion Model Alignment Using Direct Preference Optimization Bram Wallace et.al. 2311.12908v1 null
2023-11-21 Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks Samyak Jain et.al. 2311.12786v1 null
2023-11-21 Oasis: Data Curation and Assessment System for Pretraining of Large Language Models Tong Zhou et.al. 2311.12537v1 link
2023-11-21 Adapting pretrained speech model for Mandarin lyrics transcription and alignment Jun-You Wang et.al. 2311.12488v1 null
2023-11-21 PhayaThaiBERT: Enhancing a Pretrained Thai Language Model with Unassimilated Loanwords Panyut Sriwirote et.al. 2311.12475v1 null
2023-11-21 Malicious URL Detection via Pretrained Language Model Guided Multi-Level Feature Attention Network Ruitong Liu et.al. 2311.12372v1 null
2023-11-21 A Supervised Contrastive Learning Pretrain-Finetune Approach for Time Series Trang H. Tran et.al. 2311.12290v1 null
2023-11-21 ATLANTIC: Structure-Aware Retrieval-Augmented Language Model for Interdisciplinary Science Sai Munikoti et.al. 2311.12289v1 null
2023-11-21 Equipping Pretrained Unconditional Music Transformers with Instrument and Genre Controls Weihan Xu et.al. 2311.12257v1 null
2023-11-20 LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning Han Guo et.al. 2311.12023v1 link
2023-11-20 Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues Sumire Honda et.al. 2311.11976v1 link
2023-11-20 Adaptive Training Distributions with Scalable Online Bilevel Optimization David Grangier et.al. 2311.11973v1 null
2023-11-20 LION : Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge Gongwei Chen et.al. 2311.11860v1 null
2023-11-20 Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule Andrey Bout et.al. 2311.11813v1 null
2023-11-20 KBioXLM: A Knowledge-anchored Biomedical Multilingual Pretrained Language Model Lei Geng et.al. 2311.11564v1 link
2023-11-20 A Multi-Center Study on the Adaptability of a Shared Foundation Model for Electronic Health Records Lin Lawrence Guo et.al. 2311.11483v1 null
2023-11-19 Self-Supervised Pretraining for Heterogeneous Hypergraph Neural Networks Abdalgader Abubaker et.al. 2311.11368v1 null
2023-11-19 Pair-wise Layer Attention with Spatial Masking for Video Prediction Ping Li et.al. 2311.11289v1 link
2023-11-19 Uncertainty quantification for noisy inputs-outputs in physics-informed neural networks and neural operators Zongren Zou et.al. 2311.11262v1 null
2023-11-17 Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 Hamish Ivison et.al. 2311.10702v1 null
2023-11-17 Bias A-head? Analyzing Bias in Transformer-Based Language Model Attention Heads Yi Yang et.al. 2311.10395v1 null
2023-11-17 Leveraging Function Space Aggregation for Federated Learning at Scale Nikita Dhawan et.al. 2311.10291v1 null
2023-11-17 Diagnosing and Debiasing Corpus-Based Political Bias and Insults in GPT2 Ambri Ma et.al. 2311.10266v1 null
2023-11-16 Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study Maike Züfle et.al. 2311.10236v1 link
2023-11-16 Self-supervised learning of multi-omics embeddings in the low-label, high-data regime Christian John Hurry et.al. 2311.09962v1 null
2023-11-16 Overcoming Data Scarcity in Biomedical Imaging with a Foundational Multi-Task Model Raphael Schäfer et.al. 2311.09847v1 null
2023-11-16 Investigating Data Contamination in Modern Benchmarks for Large Language Models Chunyuan Deng et.al. 2311.09783v1 null
2023-11-16 Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders Hyunji Lee et.al. 2311.09765v1 link
2023-11-16 Translation Aligned Sentence Embeddings for Turkish Language Eren Unlu et.al. 2311.09748v1 null
2023-11-16 Whispers of Doubt Amidst Echoes of Triumph in NLP Robustness Ashim Gupta et.al. 2311.09694v1 null
2023-11-16 Augmenting Unsupervised Reinforcement Learning with Self-Reference Andrew Zhao et.al. 2311.09692v1 null
2023-11-16 Evolving Domain Adaptation of Pretrained Language Models for Text Classification Yun-Shiuan Chuang et.al. 2311.09661v1 null
2023-11-16 Efficient End-to-End Visual Document Understanding with Rationale Distillation Wang Zhu et.al. 2311.09612v1 null
2023-11-16 Enchancing Semi-Supervised Learning for Extractive Summarization with an LLM-based pseudolabeler Gaurav Sahu et.al. 2311.09559v1 null
2023-11-15 Single-Image 3D Human Digitization with Shape-Guided Diffusion Badour AlBahar et.al. 2311.09221v1 null
2023-11-15 TableLlama: Towards Open Large Generalist Models for Tables Tianshu Zhang et.al. 2311.09206v1 null
2023-11-15 Do Localization Methods Actually Localize Memorized Data in LLMs? Ting-Yun Chang et.al. 2311.09060v1 null
2023-11-15 Data Similarity is Not Enough to Explain Language Model Performance Gregory Yauney et.al. 2311.09006v1 link
2023-11-15 Incremental Object-Based Novelty Detection with Feedback Loop Simone Caldarella et.al. 2311.09004v1 null
2023-11-15 OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining Yihong Liu et.al. 2311.08849v1 null
2023-11-15 An Eye on Clinical BERT: Investigating Language Model Generalization for Diabetic Eye Disease Phenotyping Keith Harrigian et.al. 2311.08687v1 null
2023-11-15 Multistage Collaborative Knowledge Distillation from Large Language Models Jiachen Zhao et.al. 2311.08640v1 null
2023-11-14 Unsupervised segmentation of irradiation-induced order-disorder phase transitions in electron microscopy Arman H Ter-Petrosyan et.al. 2311.08585v1 null
2023-11-14 UT5: Pretraining Non autoregressive T5 with unrolled denoising Mahmoud G. Salem et.al. 2311.08552v1 null
2023-11-14 Open-vocabulary keyword spotting in any language through multilingual contrastive speech-phoneme pretraining Jian Zhu et.al. 2311.08323v1 null
2023-11-14 ARTEMIS: Using GANs with Multiple Discriminators to Generate Art James Baker et.al. 2311.08278v1 null
2023-11-14 Investigating the Encoding of Words in BERT's Neurons using Feature Textualization Tanja Baeumel et.al. 2311.08240v1 null
2023-11-14 Unlock the Power: Competitive Distillation for Multi-Modal Large Language Models Xinwei Li et.al. 2311.08213v1 null
2023-11-14 A Survey on Language Models for Code Ziyin Zhang et.al. 2311.07989v1 link
2023-11-14 Test-Time Training for Semantic Segmentation with Output Contrastive Loss Yunlong Zhang et.al. 2311.07877v1 link
2023-11-14 Probing clustering in neural network representations Thao Nguyen et.al. 2311.07864v1 null
2023-11-14 Overview of the TREC 2023 Product Product Search Track Daniel Campos et.al. 2311.07861v1 null
2023-11-14 Learning Mutually Informed Representations for Characters and Subwords Yilin Wang et.al. 2311.07853v1 null
2023-11-13 IruMozhi: Automatically classifying diglossia in Tamil Kabilan Prasanna et.al. 2311.07804v1 null
2023-11-13 Masked Face Dataset Generation and Masked Face Recognition Rui Cai et.al. 2311.07475v1 link
2023-11-13 Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse Ang Lv et.al. 2311.07468v1 null
2023-11-13 Language Grounded QFormer for Efficient Vision Language Understanding Moulik Choraria et.al. 2311.07449v1 null
2023-11-13 Hallucination Augmented Recitations for Language Models Abdullatif Köksal et.al. 2311.07424v1 null
2023-11-13 Fine-Tuning the Retrieval Mechanism for Tabular Deep Learning Felix den Breejen et.al. 2311.07343v1 null
2023-11-13 On Elastic Language Models Chen Zhang et.al. 2311.07204v1 null
2023-11-13 Developing a Named Entity Recognition Dataset for Tagalog Lester James V. Miranda et.al. 2311.07161v1 null
2023-11-13 SpectralGPT: Spectral Foundation Model Danfeng Hong et.al. 2311.07113v1 null
2023-11-13 ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models Ilker Kesen et.al. 2311.07022v1 null
2023-11-12 Concept-wise Fine-tuning Matters in Preventing Negative Transfer Yunqiao Yang et.al. 2311.06868v1 null
2023-11-10 BanglaBait: Semi-Supervised Adversarial Approach for Clickbait Detection on Bangla Clickbait Dataset Md. Motahar Mahtab et.al. 2311.06204v1 link
2023-11-10 FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores Daniel Y. Fu et.al. 2311.05908v1 null
2023-11-10 AI-native Interconnect Framework for Integration of Large Language Model Technologies in 6G Systems Sasu Tarkoma et.al. 2311.05842v1 null
2023-11-09 Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter Georgios Tziafas et.al. 2311.05779v1 link
2023-11-09 Efficiently Adapting Pretrained Language Models To New Languages Zoltan Csaki et.al. 2311.05741v1 null
2023-11-09 Window Attention is Bugged: How not to Interpolate Position Embeddings Daniel Bolya et.al. 2311.05613v1 null
2023-11-09 Mirror: A Universal Framework for Various Information Extraction Tasks Tong Zhu et.al. 2311.05419v1 link
2023-11-09 Improving Hand Recognition in Uncontrolled and Uncooperative Environments using Multiple Spatial Transformers and Loss Functions Wojciech Michal Matkowski et.al. 2311.05383v1 null
2023-11-09 ConRad: Image Constrained Radiance Fields for 3D Generation from a Single Image Senthil Purushwalkam et.al. 2311.05230v1 null
2023-11-08 Interpreting Pretrained Language Models via Concept Bottlenecks Zhen Tan et.al. 2311.05014v1 null
2023-11-08 Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search Siao Tang et.al. 2311.04950v1 null
2023-11-08 Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models Rocktim Jyoti Das et.al. 2311.04902v1 link
2023-11-08 Towards Few-Annotation Learning in Computer Vision: Application to Image Classification and Object Detection tasks Quentin Bouniot et.al. 2311.04888v1 null
2023-11-08 DACBERT: Leveraging Dependency Agreement for Cost-Efficient Bert Pretraining Martin Kuo et.al. 2311.04799v1 null
2023-11-08 Training CLIP models on Data from Scientific Papers Calvin Metzger et.al. 2311.04711v1 link
2023-11-08 Army of Thieves: Enhancing Black-Box Model Extraction via Ensemble based sample selection Akshit Jindal et.al. 2311.04588v1 link
2023-11-08 Large GPT-like Models are Bad Babies: A Closer Look at the Relationship between Linguistic Competence and Psycholinguistic Measures Julius Steuer et.al. 2311.04547v1 null
2023-11-07 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features Chenfeng Xu et.al. 2311.04391v1 null
2023-11-07 Evaluating the Effectiveness of Retrieval-Augmented Large Language Models in Scientific Document Reasoning Sai Munikoti et.al. 2311.04348v1 null
2023-11-07 Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning Rishabh Jain et.al. 2311.04313v1 null
2023-11-07 Selective Visual Representations Improve Convergence and Generalization for Embodied AI Ainaz Eftekhar et.al. 2311.04193v1 null
2023-11-07 Do Language Models Learn Semantics of Code? A Case Study in Vulnerability Detection Benjamin Steenhoek et.al. 2311.04109v1 null
2023-11-07 Language Representation Projection: Can We Transfer Factual Knowledge across Languages in Multilingual Language Models? Shaoyang Xu et.al. 2311.03788v1 null
2023-11-07 Unified Low-Resource Sequence Labeling by Sample-Aware Dynamic Sparse Finetuning Sarkar Snigdha Sarathi Das et.al. 2311.03748v1 link
2023-11-07 Analysis of the User Perception of Chatbots in Education Using A Partial Least Squares Structural Equation Modeling Approach Md Rabiul Hasan et.al. 2311.03636v1 null
2023-11-06 Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization Kun Lei et.al. 2311.03351v1 null
2023-11-06 A Foundation Model for Music Informatics Minz Won et.al. 2311.03318v1 link
2023-11-07 S-LoRA: Serving Thousands of Concurrent LoRA Adapters Ying Sheng et.al. 2311.03285v2 link
2023-11-06 An Efficient Self-Supervised Cross-View Training For Sentence Embedding Peerat Limkonchotiwat et.al. 2311.03228v1 link
2023-11-06 LDM3D-VR: Latent Diffusion Model for 3D VR Gabriela Ben Melech Stan et.al. 2311.03226v1 null
2023-11-06 A Simple yet Efficient Ensemble Approach for AI-generated Text Detection Harika Abburi et.al. 2311.03084v1 null
2023-11-06 CogVLM: Visual Expert for Pretrained Language Models Weihan Wang et.al. 2311.03079v1 link
2023-11-06 SugarViT -- Multi-objective Regression of UAV Images with Vision Transformers and Deep Label Distribution Learning Demonstrated on Disease Severity Prediction in Sugar Beet Maurice Günder et.al. 2311.03076v1 null
2023-11-06 Masking Hyperspectral Imaging Data with Pretrained Models Elias Arbash et.al. 2311.03053v1 null
2023-11-06 The Pursuit of Human Labeling: A New Perspective on Unsupervised Learning Artyom Gadetsky et.al. 2311.02940v1 link
2023-11-03 Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation Shichao Dong et.al. 2311.01989v1 null
2023-11-03 Generalization of Graph-Based Active Learning Relaxation Strategies Across Materials Xiaoxiao Wang et.al. 2311.01987v1 null
2023-11-03 The language of prompting: What linguistic properties make a prompt successful? Alina Leidinger et.al. 2311.01967v1 null
2023-11-03 ForecastPFN: Synthetically-Trained Zero-Shot Forecasting Samuel Dooley et.al. 2311.01933v1 link
2023-11-03 Towards Concept-Aware Large Language Models Chen Shani et.al. 2311.01866v1 null
2023-11-03 TCM-GPT: Efficient Pre-training of Large Language Models for Domain Adaptation in Traditional Chinese Medicine Guoxing Yang et.al. 2311.01786v1 null
2023-11-03 Data-Free Distillation of Language Model by Text-to-Text Transfer Zheyuan Bai et.al. 2311.01689v1 null
2023-11-02 Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models Andy Zhou et.al. 2311.01441v1 link
2023-11-02 Recognize Any Regions Haosen Yang et.al. 2311.01373v1 null
2023-11-02 Collaborative Large Language Model for Recommender Systems Yaochen Zhu et.al. 2311.01343v1 link
2023-11-02 Terrain-Informed Self-Supervised Learning: Enhancing Building Footprint Extraction from LiDAR Data with Limited Annotations Anuja Vats et.al. 2311.01188v1 null
2023-11-02 Noise-Robust Fine-Tuning of Pretrained Language Models via External Guidance Song Wang et.al. 2311.01108v1 null
2023-11-02 Expanding Expressiveness of Diffusion Models with Limited Data via Self-Distillation based Fine-Tuning Jiwan Hur et.al. 2311.01018v1 null
2023-11-02 VideoDreamer: Customized Multi-Subject Text-to-Video Generation with Disen-Mix Finetuning Hong Chen et.al. 2311.00990v1 null
2023-11-02 MAAIG: Motion Analysis And Instruction Generation Wei-Hsin Yeh et.al. 2311.00980v1 null
2023-11-02 Blending Reward Functions via Few Expert Demonstrations for Faithful and Accurate Knowledge-Grounded Dialogue Generation Wanyu Du et.al. 2311.00953v1 null
2023-11-02 Learning Defect Prediction from Unrealistic Data Kamel Alrashedy et.al. 2311.00931v1 null
2023-11-01 Crosslingual Retrieval Augmented In-context Learning for Bangla Xiaoqian Li et.al. 2311.00587v1 null
2023-11-01 MNN: Mixed Nearest-Neighbors for Self-Supervised Learning Chen Peng et.al. 2311.00562v1 link
2023-11-01 Form follows Function: Text-to-Text Conditional Graph Generation based on Functional Requirements Peter A. Zachares et.al. 2311.00444v1 null
2023-11-01 An analysis of large speech models-based representations for speech emotion recognition Adrian Bogdan Stânea et.al. 2311.00394v1 null
2023-11-01 fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for Multi-Subject Brain Activity Decoding Xuelin Qian et.al. 2311.00342v1 null
2023-11-01 Syntactic Inductive Bias in Transformer Language Models: Especially Helpful for Low-Resource Languages? Luke Gessler et.al. 2311.00268v1 null
2023-11-01 ChatGPT-Powered Hierarchical Comparisons for Image Classification Zhiyuan Ren et.al. 2311.00206v1 link
2023-10-31 Object-centric Video Representation for Long-term Action Anticipation Ce Zhang et.al. 2311.00180v1 link
2023-10-31 ChipNeMo: Domain-Adapted LLMs for Chip Design Mingjie Liu et.al. 2311.00176v1 null
2023-10-31 Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data Antonis Antoniades et.al. 2311.00136v1 null
2023-10-31 Vanishing Gradients in Reinforcement Finetuning of Language Models Noam Razin et.al. 2310.20703v1 null
2023-10-31 The Unreasonable Effectiveness of Random Target Embeddings for Continuous-Output Neural Machine Translation Evgeniia Tokarchuk et.al. 2310.20620v1 null
2023-10-31 Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building Omar Momen et.al. 2310.20589v1 null
2023-10-31 Breaking the Token Barrier: Chunking and Convolution for Efficient Long Text Classification with BERT Aman Jaiswal et.al. 2310.20558v1 null
2023-10-31 CapsFusion: Rethinking Image-Text Data at Scale Qiying Yu et.al. 2310.20550v1 null
2023-10-31 HWD: A Novel Evaluation Score for Styled Handwritten Text Generation Vittorio Pippi et.al. 2310.20316v1 link
2023-10-31 AutoMixer for Improved Multivariate Time-Series Forecasting on BizITOps Data Santosh Palaskar et.al. 2310.20280v1 null
2023-10-31 DEPN: Detecting and Editing Privacy Neurons in Pretrained Language Models Xinwei Wu et.al. 2310.20138v1 null
2023-10-31 Improving Prompt Tuning with Learned Prompting Layers Wei Zhu et.al. 2310.20127v1 null
2023-10-30 GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models Mianchu Wang et.al. 2310.20025v1 null
2023-10-30 LitCab: Lightweight Calibration of Language Models on Outputs of Varied Lengths Xin Liu et.al. 2310.19208v1 link
2023-10-29 Deep Audio Analyzer: a Framework to Industrialize the Research on Audio Forensics Valerio Francesco Puglisi et.al. 2310.19081v1 null
2023-10-29 Prompt-Engineering and Transformer-based Question Generation and Evaluation Rubaba Amyeen et.al. 2310.18867v1 null
2023-10-28 Rethinking Semi-Supervised Federated Learning: How to co-train fully-labeled and fully-unlabeled client imaging data Pramit Saha et.al. 2310.18815v1 null
2023-10-28 ProMap: Effective Bilingual Lexicon Induction via Language Model Prompting Abdellah El Mekki et.al. 2310.18778v1 link
2023-10-28 Integration of persistent Laplacian and pre-trained transformer for protein solubility changes upon mutation Jiahui Chen et.al. 2310.18760v1 null
2023-10-28 Probing LLMs for Joint Encoding of Linguistic Categories Giulio Starace et.al. 2310.18696v1 link
2023-10-28 Feature Guided Masked Autoencoder for Self-supervised Learning in Remote Sensing Yi Wang et.al. 2310.18653v1 link
2023-10-28 Local-Global Self-Supervised Visual Representation Learning Ali Javidani et.al. 2310.18651v1 link
2023-10-28 Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots Ruixiang Tang et.al. 2310.18633v1 null
2023-10-27 Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN Neeraj Kumar et.al. 2310.18169v1 null
2023-10-27 Does Role-Playing Chatbots Capture the Character Personalities? Assessing Personality Traits for Role-Playing Chatbots Xintao Wang et.al. 2310.17976v1 link
2023-10-27 FaultSeg Swin-UNETR: Transformer-Based Self-Supervised Pretraining Model for Fault Recognition Zeren Zhang et.al. 2310.17974v1 null
2023-10-27 Multivessel Coronary Artery Segmentation and Stenosis Localisation using Ensemble Learning Muhammad Bilal et.al. 2310.17954v1 null
2023-10-27 Transformers as Graph-to-Graph Models James Henderson et.al. 2310.17936v1 link
2023-10-27 Grid Jigsaw Representation with CLIP: A New Perspective on Image Clustering Zijie Song et.al. 2310.17869v1 null
2023-10-26 StyleBART: Decorate Pretrained Model with Style Adapters for Unsupervised Stylistic Headline Generation Hanqing Wang et.al. 2310.17743v1 null
2023-10-26 Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model Karsten Roth et.al. 2310.17653v1 null
2023-10-26 InstOptima: Evolutionary Multi-objective Instruction Optimization via Large Language Model-based Instruction Operators Heng Yang et.al. 2310.17630v1 link
2023-10-26 Proving Test Set Contamination in Black Box Language Models Yonatan Oren et.al. 2310.17623v1 null
2023-10-26 Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways Venkata S Govindarajan et.al. 2310.17591v1 link
2023-10-26 PAC-tuning:Fine-tuning Pretrained Language Models with PAC-driven Perturbed Gradient Descent Guangliang Liu et.al. 2310.17588v1 null
2023-10-26 Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models Laura Cabello et.al. 2310.17530v1 link
2023-10-26 Dialect Adaptation and Data Augmentation for Low-Resource ASR: TalTech Systems for the MADASR 2023 Challenge Tanel Alumäe et.al. 2310.17448v1 null
2023-10-26 AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors You-Ming Chang et.al. 2310.17419v1 null
2023-10-26 CADS: Unleashing the Diversity of Diffusion Models through Condition-Annealed Sampling Seyedmorteza Sadat et.al. 2310.17347v1 null
2023-10-26 Prototypical Contrastive Learning-based CLIP Fine-tuning for Object Re-identification Jiachen Li et.al. 2310.17218v1 null
2023-10-25 Proposal-Contrastive Pretraining for Object Detection from Fewer Data Quentin Bouniot et.al. 2310.16835v1 null
2023-10-25 Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation Yongxin Shi et.al. 2310.16809v1 link
2023-10-25 Detecting Pretraining Data from Large Language Models Weijia Shi et.al. 2310.16789v1 null
2023-10-25 BabyStories: Can Reinforcement Learning Teach Baby Language Models to Write Better Stories? Xingmeng Zhao et.al. 2310.16681v1 link
2023-10-25 Learning to Explain: A Model-Agnostic Framework for Explaining Black Box Models Oren Barkan et.al. 2310.16584v1 link
2023-10-25 The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining Ting-Rui Chiang et.al. 2310.16261v1 null
2023-10-24 Octopus: A Multitask Model and Toolkit for Arabic Natural Language Generation AbdelRahim Elmadany et.al. 2310.16127v1 null
2023-10-24 Locally Differentially Private Document Generation Using Zero Shot Prompting Saiteja Utpala et.al. 2310.16111v1 null
2023-10-24 A Unified, Scalable Framework for Neural Population Decoding Mehdi Azabou et.al. 2310.16046v1 null
2023-10-24 Finetuning Offline World Models in the Real World Yunhai Feng et.al. 2310.16029v1 null
2023-10-24 Characterizing Mechanisms for Factual Recall in Language Models Qinan Yu et.al. 2310.15910v1 null
2023-10-24 Do Stochastic Parrots have Feelings Too? Improving Neural Detection of Synthetic Text via Emotion Recognition Alan Cowap et.al. 2310.15904v1 link
2023-10-24 Automatic Aorta Segmentation with Heavily Augmented, High-Resolution 3-D ResUNet: Contribution to the SEG.A Challenge Marek Wodzinski et.al. 2310.15827v1 null
2023-10-24 Discriminator Guidance for Autoregressive Diffusion Models Filip Ekström Kelvinius et.al. 2310.15817v1 null
2023-10-24 Improving generalization in large language models by learning prefix subspaces Louis Falissard et.al. 2310.15793v1 null
2023-10-24 Mean Teacher DETR with Masked Feature Alignment: A Robust Domain Adaptive Detection Transformer Framework Weixi Weng et.al. 2310.15646v1 null
2023-10-24 Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks Sunit Bhattacharya et.al. 2310.15552v1 null
2023-10-24 Let the Pretrained Language Models "Imagine" for Short Texts Topic Modeling Pritom Saha Akash et.al. 2310.15420v1 null
2023-10-23 FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling Haonan Qiu et.al. 2310.15169v1 null
2023-10-23 Novel-View Acoustic Synthesis from 3D Reconstructed Rooms Byeongjoo Ahn et.al. 2310.15130v1 link
2023-10-23 Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model Ruoxi Shi et.al. 2310.15110v1 link
2023-10-23 E4S: Fine-grained Face Swapping via Editing With Regional GAN Inversion Maomao Li et.al. 2310.15081v1 link
2023-10-23 SLOG: A Structural Generalization Benchmark for Semantic Parsing Bingzhi Li et.al. 2310.15040v1 null
2023-10-23 Fast 2D Bicephalous Convolutional Autoencoder for Compressing 3D Time Projection Chamber Data Yi Huang et.al. 2310.15026v1 null
2023-10-23 Once Upon a Time in Graph: Relative-Time Pretraining for Complex Temporal Reasoning Sen Yang et.al. 2310.14709v1 null
2023-10-23 Extending Input Contexts of Language Models through Training on Segmented Sequences Petros Karypis et.al. 2310.14633v1 null
2023-10-23 Generative Pre-trained Transformer for Vietnamese Community-based COVID-19 Question Answering Tam Minh Vo et.al. 2310.14602v1 null
2023-10-23 The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages Chiyu Zhang et.al. 2310.14557v1 null
2023-10-20 Technical Report for ICCV 2023 Visual Continual Learning Challenge: Continuous Test-time Adaptation for Semantic Segmentation Damian Sójka et.al. 2310.13533v1 null
2023-10-20 Cache me if you Can: an Online Cost-aware Teacher-Student framework to Reduce the Calls to Large Language Models Ilias Stogiannidis et.al. 2310.13395v1 null
2023-10-20 SILC: Improving Vision Language Pretraining with Self-Distillation Muhammad Ferjad Naeem et.al. 2310.13355v1 null
2023-10-20 Exploring the Impact of Corpus Diversity on Financial Pretrained Language Models Jaeyoung Choe et.al. 2310.13312v1 null
2023-10-20 Unified Pretraining for Recommendation via Task Hypergraphs Mingdai Yang et.al. 2310.13286v1 link
2023-10-20 DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics Kaiwen Zheng et.al. 2310.13268v1 link
2023-10-20 On the Language Encoder of Contrastive Cross-modal Models Mengjie Zhao et.al. 2310.13267v1 null
2023-10-20 Anomaly Detection of Command Shell Sessions based on DistilBERT: Unsupervised and Supervised Approaches Zefang Liu et.al. 2310.13247v1 null
2023-10-19 A Car Model Identification System for Streamlining the Automobile Sales Process Said Togru et.al. 2310.13198v1 null
2023-10-19 Do Language Models Learn about Legal Entity Types during Pretraining? Claire Barale et.al. 2310.13092v1 link
2023-10-19 A Predictive Factor Analysis of Social Biases and Task-Performance in Pretrained Masked Language Models Yi Zhou et.al. 2310.12936v1 null
2023-10-19 Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning Juan Rocamonde et.al. 2310.12921v1 null
2023-10-19 A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems Songbo Hu et.al. 2310.12892v1 null
2023-10-19 Predicting Ovarian Cancer Treatment Response in Histopathology using Hierarchical Vision Transformers and Multiple Instance Learning Jack Breen et.al. 2310.12866v1 link
2023-10-19 Survival of the Most Influential Prompts: Efficient Black-Box Prompt Search via Clustering and Pruning Han Zhou et.al. 2310.12774v1 link
2023-10-19 Query-aware Long Video Localization and Relation Discrimination for Deep Video Understanding Yuanxing Xu et.al. 2310.12724v1 null
2023-10-19 Reliable and Efficient In-Memory Fault Tolerance of Large Language Model Pretraining Yuxin Wang et.al. 2310.12670v1 null
2023-10-19 Pretraining Language Models with Text-Attributed Heterogeneous Graphs Tao Zou et.al. 2310.12580v1 link
2023-10-19 Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models Wenxuan Wang et.al. 2310.12481v1 null
2023-10-19 Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping Zijie Pan et.al. 2310.12474v1 link
2023-10-18 Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture Daniel Y. Fu et.al. 2310.12109v1 null
2023-10-18 Evaluating the Fairness of Discriminative Foundation Models in Computer Vision Junaid Ali et.al. 2310.11867v1 null
2023-10-18 Masked Pretraining for Multi-Agent Decision Making Jie Liu et.al. 2310.11846v1 null
2023-10-18 Subject-specific Deep Neural Networks for Count Data with High-cardinality Categorical Features Hangbin Lee et.al. 2310.11654v1 null
2023-10-18 Systematic Assessment of Factual Knowledge in Large Language Models Linhao Luo et.al. 2310.11638v1 null
2023-10-17 GenEval: An Object-Focused Framework for Evaluating Text-to-Image Alignment Dhruba Ghosh et.al. 2310.11513v1 link
2023-10-17 Hybrid quantum-classical graph neural networks for tumor classification in digital pathology Anupama Ray et.al. 2310.11353v1 null
2023-10-17 Elucidating The Design Space of Classifier-Guided Diffusion Generation Jiajun Ma et.al. 2310.11311v1 null
2023-10-17 Utilizing Weak Supervision To Generate Indonesian Conservation Dataset Mega Fransiska et.al. 2310.11258v1 null
2023-10-17 Query2Triple: Unified Query Encoding for Answering Diverse Complex Queries over Knowledge Graphs Yao Xu et.al. 2310.11246v1 link
2023-10-17 Leveraging Content-based Features from Multiple Acoustic Models for Singing Voice Conversion Xueyao Zhang et.al. 2310.11160v1 null
2023-10-17 MeKB-Rec: Personal Knowledge Graph Learning for Cross-Domain Recommendation Xin Su et.al. 2310.11088v1 null
2023-10-17 Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters Gyuseong Lee et.al. 2310.11031v1 null
2023-10-16 SD-HuBERT: Self-Distillation Induces Syllabic Organization in HuBERT Cheol Jun Cho et.al. 2310.10803v1 null
2023-10-16 LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation Ruiqi Wu et.al. 2310.10769v1 null
2023-10-16 BiomedJourney: Counterfactual Biomedical Image Generation by Instruction-Learning from Multimodal Patient Journeys Yu Gu et.al. 2310.10765v1 null
2023-10-16 Interactive Task Planning with Language Models Boyi Li et.al. 2310.10645v1 null
2023-10-16 Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models Kevin Black et.al. 2310.10639v1 null
2023-10-16 In-Context Pretraining: Language Modeling Beyond Document Boundaries Weijia Shi et.al. 2310.10638v1 null
2023-10-16 Llemma: An Open Language Model For Mathematics Zhangir Azerbayev et.al. 2310.10631v1 link
2023-10-16 Video Language Planning Yilun Du et.al. 2310.10625v1 null
2023-10-16 One For All & All For One: Bypassing Hyperparameter Tuning with Model Averaging For Cross-Lingual Transfer Fabian David Schmidt et.al. 2310.10532v1 link
2023-10-16 Unifying Image Processing as Visual Prompting Question Answering Yihao Liu et.al. 2310.10513v1 null
2023-10-16 Can Word Sense Distribution Detect Semantic Changes of Words? Xiaohang Tang et.al. 2310.10400v1 link
2023-10-16 Swap and Predict -- Predicting the Semantic Changes in Words across Corpora by Context Swapping Taichi Aida et.al. 2310.10397v1 link
2023-10-16 Cross-Lingual Consistency of Factual Knowledge in Multilingual Language Models Jirui Qi et.al. 2310.10378v1 link
2023-10-13 PromptRE: Weakly-Supervised Document-Level Relation Extraction via Prompting-Based Data Programming Chufan Gao et.al. 2310.09265v1 null
2023-10-13 Hypernymy Understanding Evaluation of Text-to-Image Models via WordNet Hierarchy Anton Baryshnikov et.al. 2310.09247v1 link
2023-10-13 ClickPrompt: CTR Models are Strong Prompt Generators for Adapting Language Models to CTR Prediction Jianghao Lin et.al. 2310.09234v1 null
2023-10-13 PaLI-3 Vision Language Models: Smaller, Faster, Stronger Xi Chen et.al. 2310.09199v1 null
2023-10-13 Jointly-Learned Exit and Inference for a Dynamic Neural Network: JEI-DNN Florence Regol et.al. 2310.09163v1 null
2023-10-13 UniParser: Multi-Human Parsing with Unified Correlation Representation Learning Jiaming Chu et.al. 2310.08984v1 link
2023-10-13 Exploration with Principles for Diverse AI Supervision Hao Liu et.al. 2310.08899v1 null
2023-10-13 Open X-Embodiment: Robotic Learning Datasets and RT-X Models Abhishek Padalkar et.al. 2310.08864v1 null
2023-10-13 Speaking rate attention-based duration prediction for speed control TTS Jesuraj Bandekar et.al. 2310.08846v1 null
2023-10-13 From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models Dongsheng Jiang et.al. 2310.08825v1 link
2023-10-12 Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video Shashanka Venkataramanan et.al. 2310.08584v1 null
2023-10-12 Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining Licong Lin et.al. 2310.08566v1 null
2023-10-12 Do pretrained Transformers Really Learn In-context by Gradient Descent? Lingfeng Shen et.al. 2310.08540v1 null
2023-10-12 "SegLoc": Study on Novel Visual Self-supervised Learning Scheme (Segment Localization) Tailored for Dense Prediction Tasks of Security Inspection X-ray Images Shervin Halat et.al. 2310.08421v1 null
2023-10-12 How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression? Jingfeng Wu et.al. 2310.08391v1 null
2023-10-12 Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment Boyang Xue et.al. 2310.08372v1 null
2023-10-12 GePSAn: Generative Procedure Step Anticipation in Cooking Videos Mohamed Ashraf Abdelsalam et.al. 2310.08312v1 null
2023-10-12 CHIP: Contrastive Hierarchical Image Pretraining Arpit Mittal et.al. 2310.08304v1 null
2023-10-12 Expanding the Vocabulary of BERT for Knowledge Base Construction Dong Yang et.al. 2310.08291v1 link
2023-10-12 Tailored Visions: Enhancing Text-to-Image Generation with Personalized Prompt Rewriting Zijie Chen et.al. 2310.08129v1 null
2023-10-11 InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining Boxin Wang et.al. 2310.07713v1 null
2023-10-12 Rethinking the BERT-like Pretraining for DNA Sequences Chaoqi Liang et.al. 2310.07644v2 null
2023-10-12 Multimodal Graph Learning for Generative Tasks Minji Yoon et.al. 2310.07478v2 link
2023-10-12 NuTime: Numerically Multi-Scaled Embedding for Large-Scale Time Series Pretraining Chenguo Lin et.al. 2310.07402v2 null
2023-10-11 CLIP for Lightweight Semantic Segmentation Ke Jin et.al. 2310.07394v1 null
2023-10-11 Beyond Memorization: Violating Privacy Via Inference with Large Language Models Robin Staab et.al. 2310.07298v1 null
2023-10-12 Score Regularized Policy Optimization through Diffusion Behavior Huayu Chen et.al. 2310.07297v2 link
2023-10-11 IBoxCLA: Towards Robust Box-supervised Segmentation of Polyp via Improved Box-dice and Contrastive Latent-anchors Zhiwei Wang et.al. 2310.07248v1 null
2023-10-11 Crowd Counting in Harsh Weather using Image Denoising with Pix2Pix GANs Muhammad Asif Khan et.al. 2310.07245v1 null
2023-10-11 Self-supervised Pocket Pretraining via Protein Fragment-Surroundings Alignment Bowen Gao et.al. 2310.07229v1 null
2023-10-10 OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text Keiran Paster et.al. 2310.06786v1 null
2023-10-10 Uni3D: Exploring Unified 3D Representation at Scale Junsheng Zhou et.al. 2310.06773v1 link
2023-10-10 Tweedie Moment Projected Diffusions For Inverse Problems Benjamin Boys et.al. 2310.06721v1 null
2023-10-10 Learning Multiplex Embeddings on Text-rich Networks with One Text Encoder Bowen Jin et.al. 2310.06684v1 null
2023-10-10 Self-Supervised Representation Learning for Online Handwriting Text Classification Pouya Mehralian et.al. 2310.06645v1 null
2023-10-10 SpikeCLIP: A Contrastive Language-Image Pretrained Spiking Neural Network Tianlong Li et.al. 2310.06488v1 null
2023-10-10 CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model Peng Di et.al. 2310.06266v1 null
2023-10-10 Model Tuning or Prompt Tuning? A Study of Large Language Models for Clinical Concept and Relation Extraction Cheng Peng et.al. 2310.06239v1 null
2023-10-10 Domain Expansion via Network Adaptation for Solving Inverse Problems Nebiyou Yismaw et.al. 2310.06235v1 null
2023-10-10 GeoLLM: Extracting Geospatial Knowledge from Large Language Models Rohin Manvi et.al. 2310.06213v1 null
2023-10-09 TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models Zuxin Liu et.al. 2310.05905v1 null
2023-10-09 Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning Trevor McInroe et.al. 2310.05723v1 null
2023-10-09 A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics Kai He et.al. 2310.05694v1 link
2023-10-09 No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling Xuwei Xu et.al. 2310.05654v1 null
2023-10-09 UAVs and Neural Networks for search and rescue missions Hartmut Surmann et.al. 2310.05512v1 null
2023-10-09 Sentence-level Prompts Benefit Composed Image Retrieval Yang Bai et.al. 2310.05473v1 link
2023-10-09 Augmented Embeddings for Custom Retrievals Anirudh Khatry et.al. 2310.05380v1 null
2023-10-08 Visual Storytelling with Question-Answer Plans Danyang Liu et.al. 2310.05295v1 null
2023-10-08 Do Large Language Models Know about Facts? Xuming Hu et.al. 2310.05177v1 null
2023-10-08 UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model Jiabo Ye et.al. 2310.05126v1 link
2023-10-06 Transferring speech-generic and depression-specific knowledge for Alzheimer's disease detection Ziyun Cui et.al. 2310.04358v1 null
2023-10-06 A Comprehensive Evaluation of Large Language Models on Benchmark Biomedical Text Processing Tasks Israt Jahan et.al. 2310.04270v1 null
2023-10-06 Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning Yinda Chen et.al. 2310.04148v1 link
2023-10-06 Robust Multimodal Learning with Missing Modalities via Parameter-Efficient Adaptation Md Kaykobad Reza et.al. 2310.03986v1 null
2023-10-05 Hard View Selection for Contrastive Learning Fabio Ferreira et.al. 2310.03940v1 null
2023-10-05 Bridging Low-level Geometry to High-level Concepts in Visual Servoing of Robot Manipulation Task Using Event Knowledge Graphs and Vision-Language Models Chen Jiang et.al. 2310.03932v1 null
2023-10-05 Less is More: On the Feature Redundancy of Pretrained Models When Transferring to Few-shot Tasks Xu Luo et.al. 2310.03843v1 null
2023-10-05 PrIeD-KIE: Towards Privacy Preserved Document Key Information Extraction Saifullah Saifullah et.al. 2310.03777v1 null
2023-10-05 Stylist: Style-Driven Feature Ranking for Robust Novelty Detection Stefan Smeu et.al. 2310.03738v1 null
2023-10-05 Tik-to-Tok: Translating Language Models One Token at a Time: An Embedding Initialization Strategy for Efficient Language Adaptation François Remy et.al. 2310.03477v1 null
2023-10-05 FreeReg: Image-to-Point Cloud Registration Leveraging Pretrained Diffusion Models and Monocular Depth Estimators Haiping Wang et.al. 2310.03420v1 null
2023-10-05 Procedural Text Mining with Large Language Models Anisa Rula et.al. 2310.03376v1 link
2023-10-05 Benchmarking Large Language Models As AI Research Agents Qian Huang et.al. 2310.03302v1 link
2023-10-05 SimVLG: Simple and Efficient Pretraining of Visual Language Generative Models Yiren Jian et.al. 2310.03291v1 null
2023-10-05 Fragment-based Pretraining and Finetuning on Molecular Graphs Kha-Dinh Luong et.al. 2310.03274v1 null
2023-10-04 On the Performance of Multimodal Language Models Utsav Garg et.al. 2310.03211v1 null
2023-10-04 Enhancing Accuracy in Deep Learning Using Random Matrix Theory Leonid Berlyand et.al. 2310.03165v1 null
2023-10-04 OpenMM 8: Molecular Dynamics Simulation with Machine Learning Potentials Peter Eastman et.al. 2310.03121v1 null
2023-10-04 Retrieval meets Long Context Large Language Models Peng Xu et.al. 2310.03025v1 null
2023-10-04 AstroCLIP: Cross-Modal Pre-Training for Astronomical Foundation Models Francois Lanusse et.al. 2310.03024v1 null
2023-10-04 Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions Satwik Bhattamishra et.al. 2310.03016v1 null
2023-10-04 Multiple Physics Pretraining for Physical Surrogate Models Michael McCabe et.al. 2310.02994v1 null
2023-10-04 Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors Ido Amos et.al. 2310.02980v1 null
2023-10-04 T$^3$Bench: Benchmarking Current Progress in Text-to-3D Generation Yuze He et.al. 2310.02977v1 null
2023-10-04 Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation Chen Dun et.al. 2310.02842v1 null
2023-10-03 Implicit regularization of multi-task learning and finetuning in overparameterized neural networks Jack W. Lindsey et.al. 2310.02396v1 null
2023-10-04 Who's Harry Potter? Approximate Unlearning in LLMs Ronen Eldan et.al. 2310.02238v2 null
2023-10-03 Think before you speak: Training Language Models With Pause Tokens Sachin Goyal et.al. 2310.02226v1 null
2023-10-03 SIEVE: Multimodal Dataset Pruning Using Image Captioning Models Anas Mahmoud et.al. 2310.02110v1 null
2023-10-03 Understanding Masked Autoencoders From a Local Contrastive Perspective Xiaoyu Yue et.al. 2310.01994v1 null
2023-10-03 Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving Long Chen et.al. 2310.01957v1 link
2023-10-03 MFOS: Model-Free & One-Shot Object Pose Estimation JongMin Lee et.al. 2310.01897v1 null
2023-10-04 LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment Bin Zhu et.al. 2310.01852v2 link
2023-10-03 MIMO-NeRF: Fast Neural Rendering with Multi-input Multi-output Neural Radiance Fields Takuhiro Kaneko et.al. 2310.01821v1 null
2023-10-03 SEA: Sparse Linear Attention with Estimated Attention Mask Heejun Lee et.al. 2310.01777v1 null
2023-10-03 Backdiff: a diffusion model for generalized transferable protein backmapping Yikai Liu et.al. 2310.01768v1 null
2023-10-02 L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models Ansong Ni et.al. 2309.17446v2 null
2023-09-29 Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks Vaidehi Patil et.al. 2309.17410v1 link
2023-09-29 Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency Zhihan Liu et.al. 2309.17382v1 null
2023-09-29 Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation Shih-Lun Wu et.al. 2309.17352v1 null
2023-09-29 Scaling Experiments in Self-Supervised Cross-Table Representation Learning Maximilian Schambach et.al. 2309.17339v1 null
2023-09-29 Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors Yukang Lin et.al. 2309.17261v1 null
2023-09-29 Glioma subtype classification from histopathological images using in-domain and out-of-domain transfer learning: An experimental study Vladimir Despotovic et.al. 2309.17223v1 null
2023-09-29 Reconstruction of Patient-Specific Confounders in AI-based Radiologic Image Interpretation using Generative Pretraining Tianyu Han et.al. 2309.17123v1 link
2023-09-28 Qwen Technical Report Jinze Bai et.al. 2309.16609v1 link
2023-09-28 Tensor Factorization for Leveraging Cross-Modal Knowledge in Data-Constrained Infrared Object Detection Manish Sharma et.al. 2309.16592v1 null
2023-09-28 Universal Sleep Decoder: Aligning awake and sleep neural representation across subjects Hui Zheng et.al. 2309.16457v1 null
2023-09-28 Predicting performance difficulty from piano sheet music images Pedro Ramoneda et.al. 2309.16287v1 null
2023-09-28 Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-based ASR Xugang Lu et.al. 2309.16093v1 null
2023-09-27 Effective Long-Context Scaling of Foundation Models Wenhan Xiong et.al. 2309.16039v1 null
2023-09-27 Graph-level Representation Learning with Joint-Embedding Predictive Architectures Geri Skenderi et.al. 2309.16014v1 null
2023-09-27 Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts Deniz Engin et.al. 2309.15915v1 null
2023-09-27 One For All: Video Conversation is Feasible Without Video Instruction Tuning Ruyang Liu et.al. 2309.15785v1 null
2023-09-27 Question answering using deep learning in low resource Indian language Marathi Dhiraj Amin et.al. 2309.15779v1 null
2023-09-27 ChatGPT-BCI: Word-Level Neural State Classification Using GPT, EEG, and Eye-Tracking Biomarkers in Semantic Inference Reading Comprehension Yuhong Zhang et.al. 2309.15714v1 null
2023-09-27 Jointly Training Large Autoregressive Multimodal Models Emanuele Aiello et.al. 2309.15564v1 null
2023-09-27 High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models Chunyu Qiang et.al. 2309.15512v1 null
2023-09-27 DreamCom: Finetuning Text-guided Inpainting Model for Image Composition Lingxiao Lu et.al. 2309.15508v1 null
2023-09-27 VideoAdviser: Video Knowledge Distillation for Multimodal Transfer Learning Yanan Wang et.al. 2309.15494v1 null
2023-09-27 Tackling VQA with Pretrained Foundation Models without Further Training Alvin De Jun Tan et.al. 2309.15487v1 null
2023-09-27 Towards Foundation Models Learned from Anatomy in Medical Imaging via Self-Supervision Mohammad Reza Hosseinzadeh Taher et.al. 2309.15358v1 null
2023-09-26 SEPT: Towards Efficient Scene Representation Learning for Motion Prediction Zhiqian Lan et.al. 2309.15289v1 null
2023-09-26 Language-EXtended Indoor SLAM (LEXIS): A Versatile System for Real-time Visual Scene Understanding Christina Kassab et.al. 2309.15065v1 null
2023-09-25 When Automated Assessment Meets Automated Content Generation: Examining Text Quality in the Era of GPTs Marialena Bevilacqua et.al. 2309.14488v1 null
2023-09-25 SINCERE: Supervised Information Noise-Contrastive Estimation REvisited Patrick Feeney et.al. 2309.14277v1 link
2023-09-26 Species196: A One-Million Semi-supervised Dataset for Fine-grained Species Recognition Wei He et.al. 2309.14183v2 null
2023-09-25 VidChapters-7M: Video Chapters at Scale Antoine Yang et.al. 2309.13952v1 link
2023-09-25 TouchUp-G: Improving Feature Representation through Graph-Centric Finetuning Jing Zhu et.al. 2309.13885v1 null
2023-09-24 Accelerating Large Batch Training via Gradient Signal to Noise Ratio (GSNR) Guo-qing Jiang et.al. 2309.13681v1 null
2023-09-24 VoiceLDM: Text-to-Speech with Environmental Context Yeonghyeon Lee et.al. 2309.13664v1 null
2023-09-24 Cross-modal Alignment with Optimal Transport for CTC-based ASR Xugang Lu et.al. 2309.13650v1 null
2023-09-24 Robust data driven discovery of a seismic wave equation Shijun Cheng et.al. 2309.13645v1 null
2023-09-24 Towards Robust Robot 3D Perception in Urban Environments: The UT Campus Object Dataset Arthur Zhang et.al. 2309.13549v1 null
2023-09-24 InSpaceType: Reconsider Space Type in Indoor Monocular Depth Estimation Cho-Ying Wu et.al. 2309.13516v1 null
2023-09-22 A matter of attitude: Focusing on positive and active gradients to boost saliency maps Oscar Llorente et.al. 2309.12913v1 link
2023-09-22 SRFNet: Monocular Depth Estimation with Fine-grained Structure via Spatial Reliability-oriented Fusion of Frames and Events Tianbo Pan et.al. 2309.12842v1 null
2023-09-22 Synthetic Boost: Leveraging Synthetic Data for Enhanced Vision-Language Segmentation in Echocardiography Rabin Adhikari et.al. 2309.12829v1 link
2023-09-22 Unsupervised Representations Improve Supervised Learning in Speech Emotion Recognition Amirali Soltani Tehrani et.al. 2309.12714v1 null
2023-09-21 Studying and improving reasoning in humans and machines Nicolas Yax et.al. 2309.12485v1 null
2023-09-21 Environment-biased Feature Ranking for Novelty Detection Robustness Stefan Smeu et.al. 2309.12301v1 null
2023-09-21 Weakly-supervised Automated Audio Captioning via text only training Theodoros Kouzelis et.al. 2309.12242v1 link
2023-09-21 Exploiting CLIP-based Multi-modal Approach for Artwork Classification and Retrieval Alberto Baldrati et.al. 2309.12110v1 null
2023-09-21 Accelerating Thematic Investment with Prompt Tuned Pretrained Language Models Valentin Leonhard Buchner et.al. 2309.12075v1 null
2023-09-21 BELT: Bootstrapping Electroencephalography-to-Language Decoding and Zero-Shot Sentiment Classification by Natural Language Supervision Jinzhao Zhou et.al. 2309.12056v1 null
2023-09-21 Beyond Image Borders: Learning Feature Extrapolation for Unbounded Image Composition Xiaoyu Liu et.al. 2309.12042v1 link
2023-09-21 DEYOv3: DETR with YOLO for Real-time Object Detection Haodong Ouyang et.al. 2309.11851v1 null
2023-09-21 Evaluating Large Language Models for Document-grounded Response Generation in Information-Seeking Dialogues Norbert Braunschweiler et.al. 2309.11838v1 null
2023-09-21 Multimodal Transformers for Wireless Communications: A Case Study in Beam Prediction Yu Tian et.al. 2309.11811v1 link
2023-09-21 SLHCat: Mapping Wikipedia Categories and Lists to DBpedia by Leveraging Semantic, Lexical, and Hierarchical Features Zhaoyi Wang et.al. 2309.11791v1 null
2023-09-20 Galaxy Zoo DESI: Detailed Morphology Measurements for 8.7M Galaxies in the DESI Legacy Imaging Surveys Mike Walmsley et.al. 2309.11425v1 link
2023-09-20 GECTurk: Grammatical Error Correction and Detection Dataset for Turkish Atakan Kara et.al. 2309.11346v1 link
2023-09-21 Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism Chengcheng Wang et.al. 2309.11331v2 link
2023-09-20 Uncovering the effects of model initialization on deep model generalization: A study with adult and pediatric Chest X-ray images Sivaramakrishnan Rajaraman et.al. 2309.11318v1 null
2023-09-20 Using Artificial Intelligence for the Automation of Knitting Patterns Uduak Uboh et.al. 2309.11202v1 null
2023-09-20 Assessment of Pre-Trained Models Across Languages and Grammars Alberto Muñoz-Ortiz et.al. 2309.11165v1 link
2023-09-20 Hyperspectral Benchmark: Bridging the Gap between HSI Applications through Comprehensive Dataset and Pretraining Hannah Frank et.al. 2309.11122v1 link
2023-09-20 Visual Question Answering in the Medical Domain Louisa Canepa et.al. 2309.11080v1 null
2023-09-20 Weak Supervision for Label Efficient Visual Bug Detection Farrukh Rahman et.al. 2309.11077v1 null
2023-09-20 3D-U-SAM Network For Few-shot Tooth Segmentation in CBCT Images Yifu Zhang et.al. 2309.11015v1 null
2023-09-19 Motif-Centric Representation Learning for Symbolic Music Yuxuan Wu et.al. 2309.10597v1 null
2023-09-19 A Neighbourhood-Aware Differential Privacy Mechanism for Static Word Embeddings Danushka Bollegala et.al. 2309.10551v1 null
2023-09-19 OpenMSD: Towards Multilingual Scientific Documents Similarity Measurement Yang Gao et.al. 2309.10539v1 link
2023-09-19 FoleyGen: Visually-Guided Audio Generation Xinhao Mei et.al. 2309.10537v1 null
2023-09-19 Improving CLIP Robustness with Knowledge Distillation and Self-Training Clement Laroudie et.al. 2309.10361v1 null
2023-09-19 KoBigBird-large: Transformation of Transformer for Korean Language Understanding Kisu Yang et.al. 2309.10339v1 null
2023-09-19 Mixed-Distil-BERT: Code-mixed Language Modeling for Bangla, English, and Hindi Md Nishat Raihan et.al. 2309.10272v1 null
2023-09-18 Generative modeling, design and analysis of spider silk protein sequences for enhanced mechanical properties Wei Lu et.al. 2309.10170v1 null
2023-09-18 Understanding Catastrophic Forgetting in Language Models via Implicit Inference Suhas Kotha et.al. 2309.10105v1 link
2023-09-18 Plug in the Safety Chip: Enforcing Constraints for LLM-driven Robot Agents Ziyi Yang et.al. 2309.09919v1 null
2023-09-19 Harnessing Collective Intelligence Under a Lack of Cultural Consensus Necdet Gürkan et.al. 2309.09787v2 null
2023-09-18 DGM-DR: Domain Generalization with Mutual Information Regularized Diabetic Retinopathy Classification Aleksandr Matsun et.al. 2309.09670v1 null
2023-09-18 DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation Bowen Yin et.al. 2309.09668v1 link
2023-09-18 Electrolaryngeal Speech Intelligibility Enhancement Through Robust Linguistic Encoders Lester Phillip Violeta et.al. 2309.09627v1 null
2023-09-18 PromptST: Prompt-Enhanced Spatio-Temporal Multi-Attribute Prediction Zijian Zhang et.al. 2309.09500v1 null
2023-09-18 Self-supervised TransUNet for Ultrasound regional segmentation of the distal radius in children Yuyue Zhou et.al. 2309.09490v1 null
2023-09-18 Face-Driven Zero-Shot Voice Conversion with Memory-based Face-Voice Alignment Zheng-Yan Sheng et.al. 2309.09470v1 null
2023-09-18 Investigating Zero- and Few-shot Generalization in Fact Verification Liangming Pan et.al. 2309.09444v1 link
2023-09-18 Unified Pretraining Target Based Video-music Retrieval With Music Rhythm And Video Optical Flow Information Tianjun Mao et.al. 2309.09421v1 null
2023-09-15 How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models? Danni Liu et.al. 2309.08565v1 link
2023-09-15 Breathing New Life into 3D Assets with Generative Repainting Tianfu Wang et.al. 2309.08523v1 link
2023-09-15 Scaling Laws for Sparsely-Connected Foundation Models Elias Frantar et.al. 2309.08520v1 null
2023-09-15 Audio-free Prompt Tuning for Language-Audio Models Yiming Li et.al. 2309.08357v1 null
2023-09-15 Headless Language Models: Learning without Predicting with Contrastive Weight Tying Nathan Godey et.al. 2309.08351v1 null
2023-09-15 Leveraging the Power of Data Augmentation for Transformer-based Tracking Jie Zhao et.al. 2309.08264v1 null
2023-09-15 BROW: Better featuRes fOr Whole slide image based on self-distillation Yuanfeng Wu et.al. 2309.08259v1 null
2023-09-15 Fine-tune the pretrained ATST model for sound event detection Nian Shao et.al. 2309.08153v1 link
2023-09-15 Multi-Scale Estimation for Omni-Directional Saliency Maps Using Learnable Equator Bias Takao Yamanaka et.al. 2309.08139v1 link
2023-09-15 AnyOKP: One-Shot and Instance-Aware Object Keypoint Extraction with Pretrained ViT Fangbo Qin et.al. 2309.08134v1 null
2023-09-14 Physically Plausible Full-Body Hand-Object Interaction Synthesis Jona Braun et.al. 2309.07907v1 null
2023-09-15 Virchow: A Million-Slide Digital Pathology Foundation Model Eugene Vorontsov et.al. 2309.07778v2 null
2023-09-14 PerPLM: Personalized Fine-tuning of Pretrained Language Models via Writer-specific Intermediate Learning and Prompts Daisuke Oba et.al. 2309.07727v1 null
2023-09-14 L1-aware Multilingual Mispronunciation Detection Framework Yassine El Kheir et.al. 2309.07719v1 null
2023-09-14 NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches Chi-en Amy Tai et.al. 2309.07704v1 null
2023-09-14 SwitchGPT: Adapting Large Language Models for Non-Text Outputs Xinyu Wang et.al. 2309.07623v1 null
2023-09-14 VerilogEval: Evaluating Large Language Models for Verilog Code Generation Mingjie Liu et.al. 2309.07544v1 null
2023-09-14 DePT: Decoupled Prompt Tuning Ji Zhang et.al. 2309.07439v1 link
2023-09-14 Nucleus-aware Self-supervised Pretraining Using Unpaired Image-to-image Translation for Histopathology Images Zhiyun Song et.al. 2309.07394v1 link
2023-09-14 Training Audio Captioning Models without Audio Soham Deshmukh et.al. 2309.07372v1 link
2023-09-13 TransNet: A Transfer Learning-Based Network for Human Action Recognition K. Alomar et.al. 2309.06951v1 null
2023-09-13 Enhancing the Performance of Multi-Agent Reinforcement Learning for Controlling HVAC Systems Daniel Bayer et.al. 2309.06940v1 null
2023-09-14 VEATIC: Video-based Emotion and Affect Tracking in Context Dataset Zhihang Ren et.al. 2309.06745v2 null
2023-09-13 VLSlice: Interactive Vision-and-Language Slice Discovery Eric Slyman et.al. 2309.06703v1 link
2023-09-13 STUPD: A Synthetic Dataset for Spatial and Temporal Relation Reasoning Palaash Agrawal et.al. 2309.06680v1 null
2023-09-12 Zero-Shot Visual Classification with Guided Cropping Piyapat Saranrittichai et.al. 2309.06581v1 null
2023-09-12 Attention De-sparsification Matters: Inducing Diversity in Digital Pathology Representation Learning Saarthak Kapse et.al. 2309.06439v1 null
2023-09-12 Learning to Predict Concept Ordering for Common Sense Generation Tianhui Zhang et.al. 2309.06363v1 link
2023-09-12 360$^\circ$ from a Single Camera: A Few-Shot Approach for LiDAR Segmentation Laurenz Reichardt et.al. 2309.06197v1 null
2023-09-12 Active Label Refinement for Semantic Segmentation of Satellite Images Tuan Pham Minh et.al. 2309.06159v1 null
2023-09-12 Annotating Data for Fine-Tuning a Neural Ranker? Current Active Learning Strategies are not Better than Random Selection Sophia Althammer et.al. 2309.06131v1 null
2023-09-12 Do PLMs Know and Understand Ontological Knowledge? Weiqi Wu et.al. 2309.05936v1 link
2023-09-12 Frequency-Aware Masked Autoencoders for Multimodal Pretraining on Biosignals Ran Liu et.al. 2309.05927v1 null
2023-09-11 Natural Language Supervision for General-Purpose Audio Representations Benjamin Elizalde et.al. 2309.05767v1 null
2023-09-11 Learning the Geodesic Embedding with Graph Neural Networks Bo Pang et.al. 2309.05613v1 null
2023-09-11 Temporal Action Localization with Enhanced Instant Discriminability Dingfeng Shi et.al. 2309.05590v1 link
2023-09-11 Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation Anna Deichler et.al. 2309.05455v1 null
2023-09-11 Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP Jinzuomu Zhong et.al. 2309.05423v1 null
2023-09-11 Towards generalisable and calibrated synthetic speech detection with self-supervised representations Dan Oneata et.al. 2309.05384v1 null
2023-09-11 DeCUR: decoupling common & unique representations for multimodal self-supervision Yi Wang et.al. 2309.05300v1 link
2023-09-11 Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis Li Du et.al. 2309.05217v1 null
2023-09-11 SIM-Sync: From Certifiably Optimal Synchronization over the 3D Similarity Group to Scene Reconstruction with Learned Depth Xihang Yu et.al. 2309.05184v1 null
2023-09-10 Anatomy Completor: A Multi-class Completion Framework for 3D Anatomy Reconstruction Jianning Li et.al. 2309.04956v1 null
2023-09-10 Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation Yuan Gan et.al. 2309.04946v1 link
2023-09-08 Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning David Yunis et.al. 2309.04459v1 null
2023-09-08 Zero-Shot Robustification of Zero-Shot Models With Foundation Models Dyah Adila et.al. 2309.04344v1 null
2023-09-08 Enhancing Hierarchical Transformers for Whole Brain Segmentation with Intracranial Measurements Integration Xin Yu et.al. 2309.04071v1 null
2023-09-08 3D Denoisers are Good 2D Teachers: Molecular Pretraining via Denoising and Cross-Modal Distillation Sungjun Cho et.al. 2309.04062v1 null
2023-09-07 Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems Takuma Udagawa et.al. 2309.04031v1 null
2023-09-07 Multimodal Transformer for Material Segmentation Md Kaykobad Reza et.al. 2309.04001v1 link
2023-09-07 DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models Yung-Sung Chuang et.al. 2309.03883v1 link
2023-09-07 Prompt-based Context- and Domain-aware Pretraining for Vision and Language Navigation Ting Liu et.al. 2309.03661v1 null
2023-09-08 All Labels Together: Low-shot Intent Detection with an Efficient Label Semantic Encoding Paradigm Jiangshu Du et.al. 2309.03563v2 null
2023-09-07 SyncDreamer: Generating Multiview-consistent Images from a Single-view Image Yuan Liu et.al. 2309.03453v1 null
2023-09-06 Parameter Efficient Audio Captioning With Faithful Guidance Using Audio-text Shared Latent Representation Arvind Krishna Sridhar et.al. 2309.03340v1 null
2023-09-06 EvoCLINICAL: Evolving Cyber-Cyber Digital Twin with Active Transfer Learning for Automated Cancer Registry System Chengjie Lu et.al. 2309.03246v1 null
2023-09-06 Leveraging ASR Pretrained Conformers for Speaker Verification through Transfer Learning and Knowledge Distillation Danwei Cai et.al. 2309.03019v1 null
2023-09-07 HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models Guijin Son et.al. 2309.02706v2 null
2023-09-05 Self-Supervised Pretraining Improves Performance and Inference Efficiency in Multiple Lung Ultrasound Interpretation Tasks Blake VanBerlo et.al. 2309.02596v1 null
2023-09-05 Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning Lili Yu et.al. 2309.02591v1 null
2023-09-05 A Survey of the Impact of Self-Supervised Pretraining for Diagnostic Tasks with Radiological Images Blake VanBerlo et.al. 2309.02555v1 null
2023-09-05 Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach Vimal K B et.al. 2309.02429v1 null
2023-09-05 Weigh Your Own Words: Improving Hate Speech Counter Narrative Generation via Attention Regularization Helena Bonaldi et.al. 2309.02311v1 null
2023-09-05 Bring the Noise: Introducing Noise Robustness to Pretrained Automatic Speech Recognition Patrick Eickhoff et.al. 2309.02145v1 null
2023-09-04 Uncertainty in AI: Evaluating Deep Neural Networks on Out-of-Distribution Images Jamiu Idowu et.al. 2309.01850v1 null
2023-09-06 An Empirical Analysis for Zero-Shot Multi-Label Classification on COVID-19 CT Scans and Uncurated Reports Ethan Dack et.al. 2309.01740v2 null
2023-09-04 A Comparative Analysis of Pretrained Language Models for Text-to-Speech Marcel Granero-Moya et.al. 2309.01576v1 null
2023-09-04 DiverseMotion: Towards Diverse Human Motion Generation via Discrete Diffusion Yunhong Lou et.al. 2309.01372v1 null
2023-09-04 Can I Trust Your Answer? Visually Grounded Video Question Answering Junbin Xiao et.al. 2309.01327v1 null
2023-09-03 COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers Julien Denize et.al. 2309.01270v1 link
2023-09-03 Optimizing Mobile-Edge AI-Generated Everything (AIGX) Services by Prompt Engineering: Fundamental, Framework, and Case Study Yinqiu Liu et.al. 2309.01065v1 null
2023-09-01 Catalyst Property Prediction with CatBERTa: Unveiling Feature Exploration Strategies through Large Language Models Janghoon Ock et.al. 2309.00563v1 link
2023-09-01 Trust your Good Friends: Source-free Domain Adaptation by Reciprocal Neighborhood Clustering Shiqi Yang et.al. 2309.00528v1 null
2023-09-01 CPSP: Learning Speech Concepts From Phoneme Supervision Chunyu Qiang et.al. 2309.00424v1 null
2023-09-01 FactLLaMA: Optimizing Instruction-Following Language Models with External Knowledge for Automated Fact-Checking Tsun-Hin Cheung et.al. 2309.00240v1 null
2023-08-31 A Sequential Framework for Detection and Classification of Abnormal Teeth in Panoramic X-rays Tudor Dascalu et.al. 2309.00027v1 link
2023-08-31 StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation Yuhan Wang et.al. 2308.16909v1 link
2023-08-31 The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants Lucas Bandarkar et.al. 2308.16884v1 link
2023-08-31 Towards Multilingual Automatic Dialogue Evaluation John Mendonça et.al. 2308.16795v1 null
2023-08-31 Generate Your Own Scotland: Satellite Image Generation Conditioned on Maps Miguel Espinosa et.al. 2308.16648v1 link
2023-08-31 Expanding Frozen Vision-Language Models without Retraining: Towards Improved Robot Perception Riley Tavassoli et.al. 2308.16493v1 null
2023-08-31 Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff Satoshi Suzuki et.al. 2308.16454v1 null
2023-08-30 ToddlerBERTa: Exploiting BabyBERTa for Grammar Learning and Language Understanding Omer Veysel Cagatan et.al. 2308.16336v1 null
2023-08-30 Can Prompt Learning Benefit Radiology Report Generation? Jun Wang et.al. 2308.16269v1 null
2023-08-30 SAM-Med2D Junlong Cheng et.al. 2308.16184v1 link
2023-08-30 Quantifying Uncertainty in Answers from any Language Model via Intrinsic and Extrinsic Confidence Assessment Jiuhai Chen et.al. 2308.16175v1 null
2023-08-30 Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models Neha Sengupta et.al. 2308.16149v1 null
2023-08-30 MerA: Merging Pretrained Adapters For Few-Shot Learning Shwai He et.al. 2308.15982v1 null
2023-08-29 A General-Purpose Self-Supervised Model for Computational Pathology Richard J. Chen et.al. 2308.15474v1 null
2023-08-29 DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior Xinqi Lin et.al. 2308.15070v1 link
2023-08-29 Generative Model for Models: Rapid DNN Customization for Diverse Tasks and Resource Constraints Wenxing Xu et.al. 2308.15003v1 null
2023-08-28 SynthDistill: Face Recognition with Knowledge Distillation from Synthetic Data Hatef Otroshi Shahreza et.al. 2308.14852v1 null
2023-08-28 VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation Xudong Wang et.al. 2308.14710v1 link
2023-08-28 Fine-Tuning Llama 2 Large Language Models for Detecting Online Sexual Predatory Chats and Abusive Texts Thanh Thi Nguyen et.al. 2308.14683v1 null
2023-08-28 Adversarial Attacks on Foundational Vision Models Nathan Inkawhich et.al. 2308.14597v1 null
2023-08-28 Multimodal Detection of Social Spambots in Twitter using Transformers Loukas Ilias et.al. 2308.14484v1 null
2023-08-28 Self-Supervision for Tackling Unsupervised Anomaly Detection: Pitfalls and Opportunities Leman Akoglu et.al. 2308.14380v1 null
2023-08-28 FonMTL: Towards Multitask Learning for the Fon Language Bonaventure F. P. Dossou et.al. 2308.14280v1 link
2023-08-28 Parameter-Efficient Transfer Learning for Audio-Visual-Language Tasks Hongye Liu et.al. 2308.14274v1 null
2023-08-27 SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation Zhiyu Qu et.al. 2308.14191v1 null
2023-08-27 Only Encode Once: Making Content-based News Recommender Greener Qijiong Liu et.al. 2308.14155v1 null
2023-08-27 Situated Natural Language Explanations Zining Zhu et.al. 2308.14115v1 null
2023-08-25 In-context learning for model-free system identification Marco Forgione et.al. 2308.13380v1 link
2023-08-25 Refine Neutrino Events Reconstruction with BEiT-3 Chen Li et.al. 2308.13285v1 link
2023-08-25 Self-supervised Scene Text Segmentation with Object-centric Layered Representations Augmented by Text Regions Yibo Wang et.al. 2308.13178v1 null
2023-08-25 Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining? Fei Wang et.al. 2308.12898v2 link
2023-08-24 A Parse-Then-Place Approach for Generating Graphic Layouts from Textual Descriptions Jiawei Lin et.al. 2308.12700v1 null
2023-08-25 Masked Feature Modelling: Feature Masking for the Unsupervised Pre-training of a Graph Attention Network Block for Bottom-up Video Event Recognition Dimitrios Daskalakis et.al. 2308.12673v2 null
2023-08-24 A Small and Fast BERT for Chinese Medical Punctuation Restoration Tongtao Ling et.al. 2308.12568v1 link
2023-08-24 Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval Yuan Yuan et.al. 2308.12509v1 null
2023-08-24 Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis Yuqi Fang et.al. 2308.12495v1 link
2023-08-23 D4: Improving LLM Pretraining via Document De-Duplication and Diversification Kushal Tirumala et.al. 2308.12284v1 null
2023-08-23 Language Reward Modulation for Pretraining Reinforcement Learning Ademi Adeniji et.al. 2308.12270v1 link
2023-08-23 Prompt2Model: Generating Deployable Models from Natural Language Instructions Vijay Viswanathan et.al. 2308.12261v1 link
2023-08-25 Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning Jiasheng Ye et.al. 2308.12219v2 link
2023-08-23 DR-Tune: Improving Fine-tuning of Pretrained Visual Models by Distribution Regularization with Semantic Calibration Nan Zhou et.al. 2308.12058v1 link
2023-08-23 Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages Jinyi Hu et.al. 2308.12038v1 link
2023-08-23 Local Distortion Aware Efficient Transformer Adaptation for Image Quality Assessment Kangmin Xu et.al. 2308.12001v1 null
2023-08-23 Blending-NeRF: Text-Driven Localized Editing in Neural Radiance Fields Hyeonseop Song et.al. 2308.11974v1 null
2023-08-23 CED: Consistent ensemble distillation for audio tagging Heinrich Dinkel et.al. 2308.11957v1 link
2023-08-22 Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations Mohammadreza Salehi et.al. 2308.11796v1 link
2023-08-22 Open Set Synthetic Image Source Attribution Shengbang Fang et.al. 2308.11557v1 null
2023-08-22 Masked Momentum Contrastive Learning for Zero-shot Semantic Understanding Jiantao Wu et.al. 2308.11448v1 null
2023-08-22 Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning Shansong Liu et.al. 2308.11276v1 null
2023-08-21 UnLoc: A Unified Framework for Video Localization Tasks Shen Yan et.al. 2308.11062v1 link
2023-08-21 SupEuclid: Extremely Simple, High Quality OoD Detection with Supervised Contrastive Learning and Euclidean Distance Jarrod Haas et.al. 2308.10973v1 null
2023-08-21 DocPrompt: Large-scale continue pretrain for zero-shot and few-shot document question answering Sijin Wu et.al. 2308.10959v1 null
2023-08-21 EALink: An Efficient and Accurate Pre-trained Framework for Issue-Commit Link Recovery Chenyuan Zhang et.al. 2308.10759v1 link
2023-08-23 Foundation Model-oriented Robustness: Robust Image Model Evaluation with Pretrained Models Peiyan Zhang et.al. 2308.10632v2 null
2023-08-21 When Prompt-based Incremental Learning Does Not Meet Strong Pretraining Yu-Ming Tang et.al. 2308.10445v1 link
2023-08-21 Turning a CLIP Model into a Scene Text Spotter Wenwen Yu et.al. 2308.10408v1 link
2023-08-20 cantnlp@LT-EDI@RANLP-2023: Homophobia/Transphobia Detection in Social Media Comments using Spatio-Temporally Retrained Language Models Sidney G.-J. Wong et.al. 2308.10370v1 null
2023-08-22 Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting Qidong Huang et.al. 2308.10315v2 link
2023-08-20 Make-It-4D: Synthesizing a Consistent Long-Term Dynamic Scene Video from a Single Image Liao Shen et.al. 2308.10257v1 null
2023-08-20 From Global to Local: Multi-scale Out-of-distribution Detection Ji Zhang et.al. 2308.10239v1 link
2023-08-20 ViT-Lens: Towards Omni-modal Representations Weixian Lei et.al. 2308.10185v1 link
2023-08-19 Efficient Representation Learning for Healthcare with Cross-Architectural Self-Supervision Pranav Singh et.al. 2308.10064v1 link
2023-08-18 Artificial-Spiking Hierarchical Networks for Vision-Language Representation Learning Yeming Chen et.al. 2308.09455v1 null
2023-08-18 Accelerated materials language processing enabled by GPT Jaewoong Choi et.al. 2308.09354v1 null
2023-08-18 DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability Runhui Huang et.al. 2308.09306v1 null
2023-08-18 V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models Heng Wang et.al. 2308.09300v1 null
2023-08-18 Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model Ryandhimas E. Zezario et.al. 2308.09262v1 null
2023-08-17 Semantic Consistency for Assuring Reliability of Large Language Models Harsh Raj et.al. 2308.09138v1 null
2023-08-17 Edit Temporal-Consistent Videos with Image Diffusion Model Yuanzhi Wang et.al. 2308.09091v1 null
2023-08-17 On the Evaluation of Neural Code Translation: Taxonomy and Benchmark Mingsheng Jiao et.al. 2308.08961v1 null
2023-08-17 Bag of Tricks for Long-Tailed Multi-Label Classification on Chest X-Rays Feng Hong et.al. 2308.08853v1 null
2023-08-16 Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer Guangyi Chen et.al. 2308.08414v1 null
2023-08-16 Advancing continual lifelong learning in neural information retrieval: definition, dataset, framework, and empirical evaluation Jingrui Hou et.al. 2308.08378v1 null
2023-08-16 Boosting Commit Classification with Contrastive Learning Jiajun Tong et.al. 2308.08263v1 null
2023-08-16 Is Self-Supervised Pretraining Good for Extrapolation in Molecular Property Prediction? Shun Takashige et.al. 2308.08129v1 null
2023-08-15 End-to-End Open Vocabulary Keyword Search With Multilingual Neural Representations Bolaji Yusuf et.al. 2308.08027v1 null
2023-08-15 RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models Jie Huang et.al. 2308.07922v1 null
2023-08-15 Dancing Avatar: Pose and Text-Guided Human Motion Videos Synthesis with Image Diffusion Model Bosheng Qin et.al. 2308.07749v1 null
2023-08-16 SPM: Structured Pretraining and Matching Architectures for Relevance Modeling in Meituan Search Wen Zan et.al. 2308.07711v2 null
2023-08-15 Self-supervised Hypergraphs for Learning Multiple World Interpretations Alina Marcu et.al. 2308.07615v1 null
2023-08-15 SGDiff: A Style Guided Diffusion Model for Fashion Synthesis Zhengwentai Sun et.al. 2308.07605v1 link
2023-08-15 AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model Jeong Hun Yeo et.al. 2308.07593v1 null
2023-08-14 Semantic Similarity Loss for Neural Source Code Summarization Chia-Yi Su et.al. 2308.07429v1 link
2023-08-14 Platypus: Quick, Cheap, and Powerful Refinement of LLMs Ariel N. Lee et.al. 2308.07317v1 link
2023-08-15 SEMI-CenterNet: A Machine Learning Facilitated Approach for Semiconductor Defect Inspection Vic De Ridder et.al. 2308.07180v2 null
2023-08-14 CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation Hongguang Zhu et.al. 2308.07146v1 link
2023-08-14 On the Importance of Spatial Relations for Few-shot Action Recognition Yilun Zhang et.al. 2308.07119v1 null
2023-08-14 A One Stop 3D Target Reconstruction and multilevel Segmentation Method Jiexiong Xu et.al. 2308.06974v1 link
2023-08-14 Robustness Stress Testing in Medical Image Classification Mobarakol Islam et.al. 2308.06889v1 link
2023-08-14 Towards Open-Set Test-Time Adaptation Utilizing the Wisdom of Crowds in Entropy Minimization Jungsoo Lee et.al. 2308.06879v1 null
2023-08-13 Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks David Junhao Zhang et.al. 2308.06739v1 null
2023-08-13 IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models Hu Ye et.al. 2308.06721v1 null
2023-08-12 GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher Youliang Yuan et.al. 2308.06463v1 link
2023-08-11 Zero-shot Text-driven Physically Interpretable Face Editing Yapeng Meng et.al. 2308.05976v1 null
2023-08-10 AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining Haohe Liu et.al. 2308.05734v1 null
2023-08-10 Generative Diffusion Models for Radio Wireless Channel Modelling and Sampling Ushnish Sengupta et.al. 2308.05583v1 null
2023-08-10 Fine-grained building roof instance segmentation based on domain adapted pretraining and composite dual-backbone Guozhang Liu et.al. 2308.05358v1 null
2023-08-10 Multimodal Pretrained Models for Sequential Decision-Making: Synthesis, Verification, Grounding, and Perception Yunhao Yang et.al. 2308.05295v1 null
2023-08-09 Deep Learning Model Transfer in Forest Mapping using Multi-source Satellite SAR and Optical Images Shaojia Ge et.al. 2308.05005v1 null
2023-08-09 Transferable Models for Bioacoustics with Human Language Supervision David Robinson et.al. 2308.04978v1 link
2023-08-09 JEDI: Joint Expert Distillation in a Semi-Supervised Multi-Dataset Student-Teacher Scenario for Video Action Recognition Lucian Bicsi et.al. 2308.04934v1 null
2023-08-09 Deep Generative Networks for Heterogeneous Augmentation of Cranial Defects Kamil Kwarciak et.al. 2308.04883v1 null
2023-08-09 Optimizing a Transformer-based network for a deep learning seismic processing workflow Randy Harsuko et.al. 2308.04739v1 null
2023-08-08 Temporal DINO: A Self-supervised Video Strategy to Enhance Action Prediction Izzeddin Teeti et.al. 2308.04589v1 null
2023-08-08 Semi-Supervised Semantic Segmentation of Cell Nuclei via Diffusion-based Large-Scale Pre-Training and Collaborative Learning Zhuchen Shao et.al. 2308.04578v1 null
2023-08-08 Improving Medical Image Classification in Noisy Labels Using Only Self-supervised Pretraining Bidur Khanal et.al. 2308.04551v1 link
2023-08-08 Pengembangan Model untuk Mendeteksi Kerusakan pada Terumbu Karang dengan Klasifikasi Citra Fadhil Muhammad et.al. 2308.04337v1 null
2023-08-08 In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning Xiaochuang Han et.al. 2308.04275v1 link
2023-08-08 Prompted Contrast with Masked Motion Modeling: Towards Versatile 3D Action Representation Learning Jiahang Zhang et.al. 2308.03975v1 null
2023-08-07 AdaptiveSAM: Towards Efficient Tuning of SAM for Surgical Scene Segmentation Jay N. Paranjape et.al. 2308.03726v1 link
2023-08-07 Exploring Visual Pre-training for Robot Manipulation: Datasets, Models and Methods Ya Jing et.al. 2308.03620v1 null
2023-08-07 When GPT Meets Program Analysis: Towards Intelligent Detection of Smart Contract Logic Vulnerabilities in GPTScan Yuqiang Sun et.al. 2308.03314v1 null
2023-08-06 Introducing Feature Attention Module on Convolutional Neural Network for Diabetic Retinopathy Detection Susmita Ghosh et.al. 2308.02985v1 null
2023-08-05 DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation Qiaosong Qi et.al. 2308.02915v1 null
2023-08-05 Improving Generalization of Image Captioning with Unsupervised Prompt Learning Hongchen Wei et.al. 2308.02862v1 null
2023-08-05 Dual Degradation-Inspired Deep Unfolding Network for Low-Light Image Enhancement Huake Wang et.al. 2308.02776v1 null
2023-08-04 Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP Qihang Yu et.al. 2308.02487v1 link
2023-08-04 A Parameter-efficient Multi-subject Model for Predicting fMRI Activity Connor Lane et.al. 2308.02351v1 link
2023-08-04 Explaining Relation Classification Models with Semantic Extents Lars Klöser et.al. 2308.02193v1 link
2023-08-03 DualCoOp++: Fast and Effective Adaptation to Multi-Label Recognition with Limited Annotations Ping Hu et.al. 2308.01890v1 null
2023-08-03 MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction Jianghao Lin et.al. 2308.01737v1 link
2023-08-03 Baby's CoThought: Leveraging Large Language Models for Enhanced Reasoning in Compact Models Zheyu Zhang et.al. 2308.01684v1 link
2023-08-03 MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies Ke Chen et.al. 2308.01546v1 null
2023-08-03 Multimodal Neurons in Pretrained Text-Only Transformers Sarah Schwettmann et.al. 2308.01544v1 null
2023-08-03 MFIM: Megapixel Facial Identity Manipulation Sanghyeon Na et.al. 2308.01536v1 null
2023-08-02 Teaching Smaller Language Models To Generalise To Unseen Compositional Questions Tim Hartill et.al. 2308.00946v1 null
2023-08-01 Ada-DQA: Adaptive Diverse Quality-aware Feature Acquisition for Video Quality Assessment Hongbo Liu et.al. 2308.00729v1 null
2023-08-01 Adaptive Semantic Consistency for Cross-domain Few-shot Classification Hengchu Lu et.al. 2308.00727v1 null
2023-08-01 CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code Nadezhda Chirkova et.al. 2308.00683v1 null
2023-08-01 An L2-Normalized Spatial Attention Network For Accurate And Fast Classification Of Brain Tumors In 2D T1-Weighted CE-MRI Images Grace Billingsley et.al. 2308.00491v1 link
2023-08-01 DINO-CXR: A self supervised method based on vision transformer for chest X-ray classification Mohammadreza Shakouri et.al. 2308.00475v1 null
2023-08-01 ViT2EEG: Leveraging Hybrid Pretrained Vision Transformers for EEG Data Ruiqi Yang et.al. 2308.00454v1 link
2023-08-01 Fountain -- an intelligent contextual assistant combining knowledge representation and language models for manufacturing risk identification Saurabh Kumar et.al. 2308.00364v1 null
2023-08-01 Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models Jiaao Chen et.al. 2308.00304v1 null
2023-08-01 The Algonauts Project 2023 Challenge: UARK-UAlbany Team Solution Xuan-Bac Nguyen et.al. 2308.00262v1 link
2023-08-01 Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias Itay Itzhak et.al. 2308.00225v1 null
2023-07-31 Generative Models as a Complex Systems Science: How can we make sense of large language model behavior? Ari Holtzman et.al. 2308.00189v1 null
2023-07-31 Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity Charlie Hou et.al. 2308.00177v1 null
2023-07-31 Towards Trustworthy and Aligned Machine Learning: A Data-centric Survey with Causality Perspectives Haoyang Liu et.al. 2307.16851v1 null
2023-07-31 UniVTG: Towards Unified Video-Language Temporal Grounding Kevin Qinghong Lin et.al. 2307.16715v1 link
2023-07-31 DDG-Net: Discriminability-Driven Graph Network for Weakly-supervised Temporal Action Localization Xiaojun Tang et.al. 2307.16415v1 link
2023-07-31 MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text Junchen Zhu et.al. 2307.16371v1 null
2023-07-31 AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos? Qi Zhao et.al. 2307.16368v1 null
2023-07-30 Unified Model for Image, Video, Audio and Language Tasks Mustafa Shukor et.al. 2307.16184v1 link
2023-07-30 HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation Jinbo Wu et.al. 2307.16183v1 null
2023-08-01 Motion Degeneracy in Self-supervised Learning of Elevation Angle Estimation for 2D Forward-Looking Sonar Yusheng Wang et.al. 2307.16160v2 null
2023-07-29 Instance-Wise Adaptive Tuning and Caching for Vision-Language Models Chunjin Yang et.al. 2307.15983v1 null
2023-07-29 GeneMask: Fast Pretraining of Gene Sequences to Enable Few-Shot Learning Soumyadeep Roy et.al. 2307.15933v1 link
2023-07-28 SimDETR: Simplifying self-supervised pretraining for DETR Ioannis Maniadis Metaxas et.al. 2307.15697v1 null
2023-07-28 The FlySpeech Audio-Visual Speaker Diarization System for MISP Challenge 2022 Li Zhang et.al. 2307.15400v1 null
2023-07-28 Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions Yifei Xin et.al. 2307.15344v1 null
2023-07-28 ChatHome: Development and Evaluation of a Domain-Specific Language Model for Home Renovation Cheng Wen et.al. 2307.15290v1 link
2023-07-28 Multilingual Lexical Simplification via Paraphrase Generation Kang Liu et.al. 2307.15286v1 link
2023-07-28 AC-Norm: Effective Tuning for Medical Image Analysis via Affine Collaborative Normalization Chuyan Zhang et.al. 2307.15282v1 link
2023-07-28 A deep transfer learning network for structural condition identification with limited real-world training data Nengxin Bao et.al. 2307.15249v1 null
2023-07-27 Seal-3D: Interactive Pixel-Level Editing for Neural Radiance Fields Xiangyu Wang et.al. 2307.15131v1 link
2023-07-27 Sample Less, Learn More: Efficient Action Recognition via Frame Feature Restoration Harry Cheng et.al. 2307.14866v1 null
2023-07-27 Gloss-free Sign Language Translation: Improving from Visual-Language Pretraining Benjia Zhou et.al. 2307.14768v1 null
2023-07-27 A Weakly Supervised Segmentation Network Embedding Cross-scale Attention Guidance and Noise-sensitive Constraint for Detecting Tertiary Lymphoid Structures of Pancreatic Tumors Bingxue Wang et.al. 2307.14603v1 null
2023-07-26 MiDaS v3.1 -- A Model Zoo for Robust Monocular Relative Depth Estimation Reiner Birkl et.al. 2307.14460v1 link
2023-07-26 Controllable Generation of Dialogue Acts for Dialogue Systems via Few-Shot Response Generation and Ranking Angela Ramirez et.al. 2307.14440v1 link
2023-07-26 Visual Instruction Inversion: Image Editing via Visual Prompting Thao Nguyen et.al. 2307.14331v1 null
2023-07-26 Comparative Analysis of Libraries for the Sentimental Analysis Wendy Ccoya et.al. 2307.14311v1 null
2023-07-27 RPG-Palm: Realistic Pseudo-data Generation for Palmprint Recognition Lei Shen et.al. 2307.14016v2 null
2023-07-26 ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution Mingjin Zhang et.al. 2307.14010v1 null
2023-07-26 Tracking Anything in High Quality Jiawen Zhu et.al. 2307.13974v1 link
2023-07-26 How Does Diffusion Influence Pretrained Language Models on Out-of-Distribution Data? Huazheng Wang et.al. 2307.13949v1 link
2023-07-26 FinTree: Financial Dataset Pretrain Transformer Encoder for Relation Extraction Hyunjong Ok et.al. 2307.13900v1 null
2023-07-25 Pretrained Deep 2.5D Models for Efficient Predictive Modeling from Retinal OCT Taha Emre et.al. 2307.13865v1 null
2023-07-25 E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning Cheng Han et.al. 2307.13770v1 link
2023-07-25 QuickQual: Lightweight, convenient retinal image quality scoring with off-the-shelf pretrained models Justin Engelmann et.al. 2307.13646v1 link
2023-07-25 XDLM: Cross-lingual Diffusion Language Model for Machine Translation Linyao Chen et.al. 2307.13560v1 null
2023-07-25 Zshot: An Open-source Framework for Zero-Shot Named Entity Recognition and Relation Extraction Gabriele Picco et.al. 2307.13497v1 null
2023-07-24 DeepGATGO: A Hierarchical Pretraining-Based Graph-Attention Model for Automatic Protein Function Prediction Zihao Li et.al. 2307.13004v1 null
2023-07-25 Towards a Visual-Language Foundation Model for Computational Pathology Ming Y. Lu et.al. 2307.12914v2 null
2023-07-24 Multiscale Video Pretraining for Long-Term Activity Forecasting Reuben Tan et.al. 2307.12854v1 null
2023-07-24 Predicting Ordinary Differential Equations with Transformers Sören Becker et.al. 2307.12617v1 null
2023-07-25 TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition Shilin Lu et.al. 2307.12493v2 link
2023-07-23 CommonsenseVIS: Visualizing and Understanding Commonsense Reasoning Capabilities of Natural Language Models Xingbo Wang et.al. 2307.12382v1 null
2023-07-23 Self-Supervised Learning for Audio-Based Emotion Recognition Peranut Nimitsurachat et.al. 2307.12343v1 null
2023-07-23 Geometry-Aware Adaptation for Pretrained Models Nicholas Roberts et.al. 2307.12226v1 null
2023-07-22 Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction Kexin Ding et.al. 2307.11952v1 link
2023-07-21 Bibliometric Analysis of Publisher and Journal Instructions to Authors on Generative-AI in Academic and Scientific Publishing Conner Ganjavi et.al. 2307.11918v1 null
2023-07-21 Enhancing CLIP with GPT-4: Harnessing Visual Descriptions as Prompts Mayug Maniparambil et.al. 2307.11661v1 null
2023-07-21 Advancing Visual Grounding with Scene Knowledge: Benchmark and Method Zhihong Chen et.al. 2307.11558v1 link
2023-07-21 Generating Image-Specific Text Improves Fine-grained Image Classification Emily Mu et.al. 2307.11315v1 null
2023-07-20 Heuristic Hyperparameter Choice for Image Anomaly Detection Zeyu Jiang et.al. 2307.11197v1 null
2023-07-20 Integrating Pretrained ASR and LM to Perform Sequence Generation for Spoken Language Understanding Siddhant Arora et.al. 2307.11005v1 null
2023-07-20 PASTA: Pretrained Action-State Transformer Agents Raphael Boige et.al. 2307.10936v1 null
2023-07-20 BlendFace: Re-designing Identity Encoders for Face-Swapping Kaede Shiohara et.al. 2307.10854v1 link
2023-07-20 HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces Stella Bounareli et.al. 2307.10797v1 link
2023-07-20 Vesper: A Compact and Effective Pretrained Model for Speech Emotion Recognition Weidong Chen et.al. 2307.10757v1 link
2023-07-20 Learning Discriminative Visual-Text Representation for Polyp Re-Identification Suncheng Xiang et.al. 2307.10625v1 link
2023-07-20 Deep fused flow and topology features for botnet detection basing on pretrained GCN Meng Xiaoyuan et.al. 2307.10583v1 null
2023-07-20 SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer Daegyeom Kim et.al. 2307.10550v1 null
2023-07-19 Interpreting and Correcting Medical Image Classification with PIP-Net Meike Nauta et.al. 2307.10404v1 null
2023-07-19 Gradient Sparsification For Masked Fine-Tuning of Transformers James O'Neill et.al. 2307.10098v1 null
2023-07-20 Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition Jia-Xin Zhuang et.al. 2307.10036v2 null
2023-07-19 An analysis on the effects of speaker embedding choice in non auto-regressive TTS Adriana Stan et.al. 2307.09898v1 null
2023-07-19 Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers Jaeyoung Kim et.al. 2307.09455v2 null
2023-07-19 Llama 2: Open Foundation and Fine-Tuned Chat Models Hugo Touvron et.al. 2307.09288v2 link
2023-07-18 UniTabE: Pretraining a Unified Tabular Encoder for Heterogeneous Tabular Data Yazheng Yang et.al. 2307.09249v1 null
2023-07-18 Division Gets Better: Learning Brightness-Aware and Detail-Sensitive Representations for Low-Light Image Enhancement Huake Wang et.al. 2307.09104v1 null
2023-07-18 Multimodal Machine Learning for Extraction of Theorems and Proofs in the Scientific Literature Shrey Mishra et.al. 2307.09047v1 link
2023-07-18 Accuracy versus time frontiers of semi-supervised and self-supervised learning on medical images Zhe Huang et.al. 2307.08919v1 link
2023-07-17 Flow Matching in Latent Space Quan Dao et.al. 2307.08698v1 link
2023-07-17 Deficiency-Aware Masked Transformer for Video Inpainting Yongsheng Yu et.al. 2307.08629v1 link
2023-07-17 Scale-Aware Modulation Meet Transformer Weifeng Lin et.al. 2307.08579v1 link
2023-07-17 Does Visual Pretraining Help End-to-End Reasoning? Chen Sun et.al. 2307.08506v1 null
2023-07-17 Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts Rebekka Hubert et.al. 2307.08426v1 link
2023-07-18 CLIP-Guided StyleGAN Inversion for Text-Driven Real Image Editing Ahmet Canberk Baykal et.al. 2307.08397v2 null
2023-07-17 Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition Shaoshi Ling et.al. 2307.08234v1 null
2023-07-17 Zero-Shot Image Harmonization with Generative Model Prior Jianqi Chen et.al. 2307.08182v1 link
2023-07-16 Diffusion to Confusion: Naturalistic Adversarial Patch Generation Based on Diffusion Model for Object Detector Shuo-Yen Lin et.al. 2307.08076v1 null
2023-07-16 Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling Longyue Wang et.al. 2307.08074v1 null
2023-07-14 DreamTeacher: Pretraining Image Backbones with Deep Generative Models Daiqing Li et.al. 2307.07487v1 null
2023-07-14 Towards spoken dialect identification of Irish Liam Lonergan et.al. 2307.07436v1 null
2023-07-14 Improving Zero-Shot Generalization for CLIP with Synthesized Prompts Zhengbo Wang et.al. 2307.07397v1 link
2023-07-14 Using Large Language Models for Zero-Shot Natural Language Generation from Knowledge Graphs Agnes Axelsson et.al. 2307.07312v1 null
2023-07-13 Leveraging Pretrained ASR Encoders for Effective and Efficient End-to-End Speech Intent Classification and Slot Filling He Huang et.al. 2307.07057v1 null
2023-07-13 In-context Autoencoder for Context Compression in a Large Language Model Tao Ge et.al. 2307.06945v1 null
2023-07-13 mBLIP: Efficient Bootstrapping of Multilingual Vision-LLMs Gregor Geigle et.al. 2307.06930v1 link
2023-07-13 Explainable 2D Vision Models for 3D Medical Data Alexander Ziller et.al. 2307.06614v1 null
2023-07-12 T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation Kaiyi Huang et.al. 2307.06350v1 null
2023-07-12 Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution Mostafa Dehghani et.al. 2307.06304v1 null
2023-07-12 Instruction Mining: High-Quality Instruction Data Selection for Large Language Models Yihan Cao et.al. 2307.06290v1 null
2023-07-12 Pluggable Neural Machine Translation Models via Memory-augmented Adapters Yuzhuang Xu et.al. 2307.06029v1 link
2023-07-12 What Happens During Finetuning of Vision Transformers: An Invariance Based Investigation Gabriele Merlin et.al. 2307.06006v1 null
2023-07-13 PIGEON: Predicting Image Geolocations Lukas Haas et.al. 2307.05845v2 null
2023-07-11 EgoAdapt: A multi-stream evaluation study of adaptation to real-world egocentric user video Matthias De Lange et.al. 2307.05784v1 link
2023-07-13 Rad-ReStruct: A Novel VQA Benchmark and Method for Structured Radiology Reporting Chantal Pellegrini et.al. 2307.05766v2 link
2023-07-11 Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering Pengfei Li et.al. 2307.05314v1 null
2023-07-11 Attribute Controlled Dialogue Prompting Runcheng Liu et.al. 2307.05228v1 null
2023-07-11 Generative Pretraining in Multimodality Quan Sun et.al. 2307.05222v1 link
2023-07-11 ExFaceGAN: Exploring Identity Directions in GAN's Learned Latent Space for Synthetic Identity Generation Fadi Boutros et.al. 2307.05151v1 null
2023-07-11 Uni-Removal: A Semi-Supervised Framework for Simultaneously Addressing Multiple Degradations in Real-World Images Yongheng Zhang et.al. 2307.05075v1 null
2023-07-10 FedYolo: Augmenting Federated Learning with Pretrained Transformers Xuechen Zhang et.al. 2307.04905v1 null
2023-07-10 Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos Sagnik Majumder et.al. 2307.04760v1 null
2023-07-10 Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback Jaskirat Singh et.al. 2307.04749v1 null
2023-07-12 Weakly-supervised positional contrastive learning: application to cirrhosis classification Emma Sarfati et.al. 2307.04617v2 link
2023-07-10 Q-YOLOP: Quantization-aware You Only Look Once for Panoptic Driving Perception Chi-Chih Chang et.al. 2307.04537v1 null
2023-07-06 Structure Guided Multi-modal Pre-trained Transformer for Knowledge Graph Reasoning Ke Liang et.al. 2307.03591v1 null
2023-07-07 Derivative Free Weight-space Ensembling Dean Ninalga et.al. 2307.03506v1 null
2023-07-07 Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation Dahyun Kang et.al. 2307.03407v1 null
2023-07-07 Teaching Arithmetic to Small Transformers Nayoung Lee et.al. 2307.03381v1 null
2023-07-06 Encoder-Decoder Networks for Self-Supervised Pretraining and Downstream Signal Bandwidth Regression on Digital Antenna Arrays Rajib Bhattacharjea et.al. 2307.03327v1 null
2023-07-06 To pretrain or not to pretrain? A case study of domain-specific pretraining for semantic segmentation in histopathology Tushar Kataria et.al. 2307.03275v1 null
2023-07-06 Vision Language Transformers: A Survey Clayton Fields et.al. 2307.03254v1 null
2023-07-06 VideoGLUE: Video General Understanding Evaluation of Foundation Models Liangzhe Yuan et.al. 2307.03166v1 null
2023-07-06 Parameter-Efficient Fine-Tuning of LLaMA for the Clinical Domain Aryo Gema et.al. 2307.03042v1 null
2023-07-06 A Critical Look at the Current Usage of Foundation Model for Dense Recognition Task Shiqi Yang et.al. 2307.02862v1 null
2023-07-06 Large Language Models Empowered Autonomous Edge AI for Connected Intelligence Yifei Shen et.al. 2307.02779v1 null
2023-07-05 ODD: A Benchmark Dataset for the NLP-based Opioid Related Aberrant Behavior Detection Sunjae Kwon et.al. 2307.02591v1 null
2023-07-05 Named Entity Inclusion in Abstractive Text Summarization Sergey Berezin et.al. 2307.02570v1 null
2023-07-05 Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks Zhaofeng Wu et.al. 2307.02477v1 null
2023-07-05 Interactive Image Segmentation with Cross-Modality Vision Transformers Kun Li et.al. 2307.02280v1 link
2023-07-05 LOAF-M2L: Joint Learning of Wording and Formatting for Singable Melody-to-Lyric Generation Longshen Ou et.al. 2307.02146v1 null
2023-07-05 Prompting Diffusion Representations for Cross-Domain Semantic Segmentation Rui Gong et.al. 2307.02138v1 null
2023-07-05 EHRSHOT: An EHR Benchmark for Few-Shot Evaluation of Foundation Models Michael Wornow et.al. 2307.02028v1 link
2023-07-05 A ChatGPT Aided Explainable Framework for Zero-Shot Medical Image Diagnosis Jiaxiang Liu et.al. 2307.01981v1 null
2023-07-04 KDSTM: Neural Semi-supervised Topic Modeling with Knowledge Distillation Weijie Xu et.al. 2307.01878v1 null
2023-07-04 Pretraining is All You Need: A Multi-Atlas Enhanced Transformer Framework for Autism Spectrum Disorder Classification Lucas Mahler et.al. 2307.01759v1 link
2023-07-04 Pretraining Conformer with ASR or ASV for Anti-Spoofing Countermeasure Yikang Wang et.al. 2307.01546v1 null
2023-07-04 Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation Jian Guan et.al. 2307.01542v1 null
2023-07-03 Don't freeze: Finetune encoders for better Self-Supervised HAR Vitor Fortes Rey et.al. 2307.01168v1 null
2023-07-04 Improving Language Plasticity via Pretraining with Active Forgetting Yihong Chen et.al. 2307.01163v2 null
2023-07-03 Generating Reliable Pixel-Level Labels for Source Free Domain Adaptation Gabriel Tjio et.al. 2307.00893v1 null
2023-07-03 Augmenting Deep Learning Adaptation for Wearable Sensor Data through Combined Temporal-Frequency Image Encoding Yidong Zhu et.al. 2307.00883v1 null
2023-07-03 Analysis of Task Transferability in Large Pre-trained Classifiers Akshay Mehra et.al. 2307.00823v1 link
2023-07-01 Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model Jiong Cai et.al. 2307.00370v1 null
2023-07-01 Improving Multitask Retrieval by Promoting Task Specialization Wenzheng Zhang et.al. 2307.00342v1 null
2023-07-01 Hierarchical Pretraining for Biomedical Term Embeddings Bryan Cai et.al. 2307.00266v1 null
2023-06-30 Multiscale Progressive Text Prompt Network for Medical Image Segmentation Xianjun Han et.al. 2307.00174v1 null
2023-06-30 Stitched ViTs are Flexible Vision Backbones Zizheng Pan et.al. 2307.00154v1 link
2023-06-30 Class-Incremental Learning using Diffusion Model for Distillation and Replay Quentin Jodelet et.al. 2306.17560v1 null
2023-06-30 Why does my medical AI look at pictures of birds? Exploring the efficacy of transfer learning across domain boundaries Frederic Jonske et.al. 2306.17555v1 null
2023-06-30 MeLM, a generative pretrained language modeling framework that solves forward and inverse mechanics problems Markus J. Buehler et.al. 2306.17525v1 null
2023-07-03 LMBot: Distilling Graph Knowledge into Language Model for Graph-less Deployment in Twitter Bot Detection Zijian Cai et.al. 2306.17408v2 null
2023-06-29 Towards Open-Domain Topic Classification Hantian Ding et.al. 2306.17290v1 null
2023-06-29 Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models Simian Luo et.al. 2306.17203v1 link
2023-06-29 An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training Zitian Chen et.al. 2306.17165v1 null
2023-06-29 Classifying Crime Types using Judgment Documents from Social Media Haoxuan Xu et.al. 2306.17020v1 null
2023-06-29 MIS-FM: 3D Medical Image Segmentation using Foundation Models Pretrained on a Large-Scale Unannotated Dataset Guotai Wang et.al. 2306.16925v1 link
2023-07-03 Probabilistic Linguistic Knowledge and Token-level Text Augmentation Zhengxiang Wang et.al. 2306.16644v2 null
2023-06-29 Representation learning of vertex heatmaps for 3D human mesh reconstruction from multi-view images Sungho Chun et.al. 2306.16615v1 null
2023-06-28 Multi-Site Clinical Federated Learning using Recursive and Attentive Models and NVFlare Won Joon Yun et.al. 2306.16367v1 null
2023-06-28 S2SNet: A Pretrained Neural Network for Superconductivity Discovery Ke Liu et.al. 2306.16270v1 link
2023-06-28 Effective Transfer of Pretrained Large Visual Model for Fabric Defect Segmentation via Specific Knowledge Injection Zhewei Chen et.al. 2306.16186v1 null
2023-06-27 Classification of Infant Sleep/Wake States: Cross-Attention among Large Scale Pretrained Transformer Networks using Audio, ECG, and IMU Data Kai Chieh Chang et.al. 2306.15808v1 null
2023-06-27 ConKI: Contrastive Knowledge Injection for Multimodal Sentiment Analysis Yakun Yu et.al. 2306.15796v1 null
2023-06-27 HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution Eric Nguyen et.al. 2306.15794v1 null
2023-06-27 Evidential Detection and Tracking Collaboration: New Problem, Benchmark and Algorithm for Robust Anti-UAV System Xue-Feng Zhu et.al. 2306.15767v1 null
2023-06-27 Semi-supervised Multimodal Representation Learning through a Global Workspace Benjamin Devillers et.al. 2306.15711v1 link
2023-06-28 Extending Context Window of Large Language Models via Positional Interpolation Shouyuan Chen et.al. 2306.15595v2 null
2023-06-28 TrickVOS: A Bag of Tricks for Video Object Segmentation Evangelos Skartados et.al. 2306.15377v2 null
2023-06-27 Gender Bias in BERT -- Measuring and Analysing Biases through Sentiment Rating in a Realistic Downstream Classification Task Sophie Jentzsch et.al. 2306.15298v1 null
2023-06-27 Can Pretrained Language Models Derive Correct Semantics from Corrupt Subwords under Noise? Xinzhe Li et.al. 2306.15268v1 link
2023-06-28 Wespeaker baselines for VoxSRC2023 Shuai Wang et.al. 2306.15161v2 null
2023-06-28 MIMIC: Masked Image Modeling with Image Correspondences Kalyani Marathe et.al. 2306.15128v2 link
2023-06-26 Understanding In-Context Learning via Supportive Pretraining Data Xiaochuang Han et.al. 2306.15091v1 null
2023-06-26 Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression Allan Raventós et.al. 2306.15063v1 link
2023-06-26 Supervised Pretraining Can Learn In-Context Reinforcement Learning Jonathan N. Lee et.al. 2306.14892v1 null
2023-06-26 Composing Parameter-Efficient Modules with Arithmetic Operations Jinghan Zhang et.al. 2306.14870v1 link
2023-06-27 Kosmos-2: Grounding Multimodal Large Language Models to the World Zhiliang Peng et.al. 2306.14824v2 null
2023-06-26 DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models Ximing Xing et.al. 2306.14685v1 null
2023-06-26 Improved Bayes Risk Can Yield Reduced Social Welfare Under Competition Meena Jagadeesan et.al. 2306.14670v1 null
2023-06-26 Localized Text-to-Image Generation for Free via Cross Attention Control Yutong He et.al. 2306.14636v1 null
2023-06-26 Transfer Learning across Several Centuries: Machine and Historian Integrated Method to Decipher Royal Secretary's Diary Sojung Lucia Kim et.al. 2306.14592v1 null
2023-06-26 A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis Aishwarya Agarwal et.al. 2306.14544v1 null
2023-06-26 ParameterNet: Parameters Are All You Need for Large-scale Visual Pretraining of Mobile Networks Kai Han et.al. 2306.14525v1 null
2023-06-27 DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing Yujun Shi et.al. 2306.14435v2 null
2023-06-23 Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation Massimiliano Patacchiola et.al. 2306.13554v1 link
2023-06-23 DreamEditor: Text-Driven 3D Scene Editing with Neural Fields Jingyu Zhuang et.al. 2306.13455v1 null
2023-06-23 Long-range Language Modeling with Self-retrieval Ohad Rubin et.al. 2306.13421v1 null
2023-06-23 Variance-Covariance Regularization Improves Representation Learning Jiachen Zhu et.al. 2306.13292v1 null
2023-06-22 PromptIR: Prompting for All-in-One Blind Image Restoration Vaishnav Potlapalli et.al. 2306.13090v1 link
2023-06-22 Can a single image processing algorithm work equally well across all phases of DCE-MRI? Adam G. Tattersall et.al. 2306.12988v1 null
2023-06-22 AudioPaLM: A Large Language Model That Can Speak and Listen Paul K. Rubenstein et.al. 2306.12925v1 null
2023-06-22 Learning from Visual Observation via Offline Pretrained State-to-Go Transformer Bohan Zhou et.al. 2306.12860v1 null
2023-06-23 Otter-Knowledge: benchmarks of multimodal knowledge graph representation learning from different sources for drug discovery Hoang Thanh Lam et.al. 2306.12802v2 link
2023-06-22 Blended-NeRF: Zero-Shot Object Generation and Blending in Existing Neural Radiance Fields Ori Gordon et.al. 2306.12760v1 null
2023-06-22 Restoration of the JPEG Maximum Lossy Compressed Face Images with Hourglass Block based on Early Stopping Discriminator Jongwook Si et.al. 2306.12757v1 null
2023-06-22 FlowFace++: Explicit Semantic Flow-supervised End-to-End Face Swapping Yu Zhang et.al. 2306.12686v1 null
2023-06-22 Identifying and Disentangling Spurious Features in Pretrained Image Representations Rafayel Darbinyan et.al. 2306.12673v1 null
2023-06-21 Comparative Analysis of Segment Anything Model and U-Net for Breast Tumor Detection in Ultrasound and Mammography Images Mohsen Ahmadi et.al. 2306.12510v1 null
2023-06-21 LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models Shizhe Diao et.al. 2306.12420v1 link
2023-06-21 Introspective Action Advising for Interpretable Transfer Learning Joseph Campbell et.al. 2306.12314v1 null
2023-06-21 ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining Dezhi Peng et.al. 2306.12106v1 null
2023-06-20 Exploring New Frontiers in Agricultural NLP: Investigating the Potential of Large Language Models for Food Applications Saed Rezayi et.al. 2306.11892v1 null
2023-06-20 Unsupervised Deep Unfolded PGD for Transmit Power Allocation in Wireless Systems Ramoni Adeogun et.al. 2306.11865v1 null
2023-06-20 A Simple and Effective Pruning Approach for Large Language Models Mingjie Sun et.al. 2306.11695v1 link
2023-06-20 Inter-Cell Network Slicing With Transfer Learning Empowered Multi-Agent Deep Reinforcement Learning Tianlun Hu et.al. 2306.11552v1 null
2023-06-20 MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian Willy Fitra Hendria et.al. 2306.11341v1 link
2023-06-19 RemoteCLIP: A Vision Language Foundation Model for Remote Sensing Fan Liu et.al. 2306.11029v1 null
2023-06-19 Semi-Supervised Learning for hyperspectral images by non parametrically predicting view assignment Shivam Pande et.al. 2306.10955v1 null
2023-06-19 Detailed retinal vessel segmentation without human annotations using simulated optical coherence tomography angiographs Linus Kreitner et.al. 2306.10941v1 link
2023-06-19 Vocal Timbre Effects with Differentiable Digital Signal Processing David Südholt et.al. 2306.10886v1 link
2023-06-19 A deep dive into explainable self-supervised transformers for point clouds Ioannis Romanelis et.al. 2306.10798v1 link
2023-06-19 Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference Junhao Zheng et.al. 2306.10790v1 null
2023-06-18 Point-Cloud Completion with Pretrained Text-to-image Diffusion Models Yoni Kasten et.al. 2306.10533v1 null
2023-06-16 CLIP2Protect: Protecting Facial Privacy using Text-Guided Makeup via Adversarial Latent Search Fahad Shamshad et.al. 2306.10008v1 link
2023-06-16 Robot Learning with Sensorimotor Pre-training Ilija Radosavovic et.al. 2306.10007v1 null
2023-06-16 SLACK: Stable Learning of Augmentations with Cold-start and KL regularization Juliette Marrie et.al. 2306.09998v1 null
2023-06-16 LabelBench: A Comprehensive Framework for Benchmarking Label-Efficient Learning Jifan Zhang et.al. 2306.09910v1 link
2023-06-16 Revealing the impact of social circumstances on the selection of cancer therapy through natural language processing of social work notes Shenghuan Sun et.al. 2306.09877v1 null
2023-06-16 MixedTeacher : Knowledge Distillation for fast inference textural anomaly detection Simon Thomine et.al. 2306.09859v1 null
2023-06-16 The Big Data Myth: Using Diffusion Models for Dataset Generation to Train Deep Detection Models Roy Voetman et.al. 2306.09762v1 null
2023-06-16 Scaling Open-Vocabulary Object Detection Matthias Minderer et.al. 2306.09683v1 null
2023-06-16 CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models Hao-Wen Dong et.al. 2306.09635v1 null
2023-06-16 CMLM-CSE: Based on Conditional MLM Contrastive Learning for Sentence Embeddings Wei Zhang et.al. 2306.09594v1 null
2023-06-15 Segment Any Point Cloud Sequences by Distilling Vision Foundation Models Youquan Liu et.al. 2306.09347v1 link
2023-06-15 Semantic HELM: An Interpretable Memory for Reinforcement Learning Fabian Paischer et.al. 2306.09312v1 link
2023-06-15 Text Promptable Surgical Instrument Segmentation with Vision-Language Models Zijian Zhou et.al. 2306.09244v1 null
2023-06-15 SCALE: Scaling up the Complexity for Advanced Language Model Evaluation Vishvaksenan Rasiah et.al. 2306.09237v1 null
2023-06-15 Audio Tagging on an Embedded Hardware Platform Gabriel Bibbo et.al. 2306.09106v1 null
2023-06-15 Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration Chenyang Lyu et.al. 2306.09093v1 link
2023-06-15 COSA: Concatenated Sample Pretrained Vision-Language Foundation Model Sihan Chen et.al. 2306.09085v1 link
2023-06-15 Behavioral Cloning via Search in Embedded Demonstration Dataset Federico Malato et.al. 2306.09082v1 null
2023-06-15 When Hyperspectral Image Classification Meets Diffusion Models: An Unsupervised Feature Learning Framework Jingyi Zhou et.al. 2306.08964v1 null
2023-06-15 A Comparison of Self-Supervised Pretraining Approaches for Predicting Disease Risk from Chest Radiograph Images Yanru Chen et.al. 2306.08955v1 null
2023-06-13 Image Captioners Are Scalable Vision Learners Too Michael Tschannen et.al. 2306.07915v1 null
2023-06-13 GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Speech Emotion Recognition Yu Pan et.al. 2306.07848v1 null
2023-06-13 Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images Ming Y. Lu et.al. 2306.07831v1 null
2023-06-13 Monolingual and Cross-Lingual Knowledge Transfer for Topic Classification Dmitry Karpov et.al. 2306.07797v1 null
2023-06-13 Multi-objective Molecular Optimization for Opioid Use Disorder Treatment Using Generative Network Complex Hongsong Feng et.al. 2306.07484v1 null
2023-06-13 Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard Ehsan Kamalloo et.al. 2306.07471v1 null
2023-06-12 Scalable 3D Captioning with Pretrained Models Tiange Luo et.al. 2306.07279v1 null
2023-06-12 MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images Junchen Zhu et.al. 2306.07257v1 null
2023-06-13 Fair Learning to Rank with Distribution-free Risk Control Ruocheng Guo et.al. 2306.07188v2 null
2023-06-12 Gradient Ascent Post-training Enhances Language Model Generalization Dongkeun Yoon et.al. 2306.07052v1 link
2023-06-12 Generating Synthetic Datasets by Interpolating along Generalized Geodesics Jiaojiao Fan et.al. 2306.06866v1 null
2023-06-11 Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability Jiacheng Ye et.al. 2306.06688v1 null
2023-06-10 Bootstrapping Code-Text Pretrained Language Model to Detect Inconsistency Between Code and Comment Anh T. V. Dau et.al. 2306.06347v1 null
2023-06-10 Improving Non-autoregressive Translation Quality with Pretrained Language Model, Embedding Distillation and Upsampling Strategy for CTC Shen-sian Syu et.al. 2306.06345v1 null
2023-06-09 DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents Fuxiao Liu et.al. 2306.06306v1 link
2023-06-09 $FPDM$: Domain-Specific Fast Pre-training Technique using Document-Level Metadata Abhilash Nandy et.al. 2306.06190v1 null
2023-06-09 Virtual Node Tuning for Few-shot Node Classification Zhen Tan et.al. 2306.06063v1 null
2023-06-09 Benchmarking self-supervised video representation learning Akash Kumar et.al. 2306.06010v1 null
2023-06-09 Exploring Effective Mask Sampling Modeling for Neural Image Compression Lin Liu et.al. 2306.05704v1 null
2023-06-09 Embodied Executable Policy Learning with Language-based Scene Summarization Jielin Qiu et.al. 2306.05696v1 null
2023-06-09 On the Importance of Feature Decorrelation for Unsupervised Representation Learning in Reinforcement Learning Hojoon Lee et.al. 2306.05637v1 link
2023-06-08 Hexatagging: Projective Dependency Parsing as Tagging Afra Amini et.al. 2306.05477v1 null
2023-06-08 Tracking Objects with 3D Representation from Videos Jiawei He et.al. 2306.05416v1 null
2023-06-08 Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models Memories Shizhe Diao et.al. 2306.05406v1 link
2023-06-08 RDumb: A simple approach that questions our progress in continual test-time adaptation Ori Press et.al. 2306.05401v1 link
2023-06-08 Extensive Evaluation of Transformer-based Architectures for Adverse Drug Events Extraction Simone Scaboro et.al. 2306.05276v1 link
2023-06-09 Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models Tianzhe Chu et.al. 2306.05272v2 link
2023-06-08 SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions Yuseung Lee et.al. 2306.05178v1 null
2023-06-08 Variable Radiance Field for Real-Life Category-Specific Reconstruction from Single Image Kun Wang et.al. 2306.05145v1 null
2023-06-08 DLAMA: A Framework for Curating Culturally Diverse Facts for Probing the Knowledge of Pretrained Language Models Amr Keleg et.al. 2306.05076v1 null
2023-06-08 Improving Visual Prompt Tuning for Self-supervised Vision Transformers Seungryong Yoo et.al. 2306.05067v1 link
2023-06-08 Learning A Foundation Language Model for Geoscience Knowledge Understanding and Utilization Cheng Deng et.al. 2306.05064v1 link
2023-06-07 Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection Yu Bai et.al. 2306.04637v1 null
2023-06-07 Proximity-Informed Calibration for Deep Neural Networks Miao Xiong et.al. 2306.04590v1 link
2023-06-07 Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages Claytone Sikasote et.al. 2306.04428v1 link
2023-06-07 SF-FSDA: Source-Free Few-Shot Domain Adaptive Object Detection with Efficient Labeled Data Factory Han Sun et.al. 2306.04385v1 null
2023-06-07 Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks Haiyang Xu et.al. 2306.04362v1 link
2023-06-08 GPT Self-Supervision for a Better Data Annotator Xiaohuan Pei et.al. 2306.04349v2 null
2023-06-08 Coarse Is Better? A New Pipeline Towards Self-Supervised Learning with Uncurated Images Ke Zhu et.al. 2306.04244v2 null
2023-06-07 Leveraging Knowledge Graph Embeddings to Enhance Contextual Representations for Relation Extraction Fréjus A. A. Laleye et.al. 2306.04203v1 null
2023-06-07 From the One, Judge of the Whole: Typed Entailment Graph Construction with Predicate Generation Zhibin Chen et.al. 2306.04170v1 link
2023-06-07 Matte Anything: Interactive Natural Image Matting with Segment Anything Models Jingfeng Yao et.al. 2306.04121v1 null
2023-06-06 Learning Human Mesh Recovery in 3D Scenes Zehong Shen et.al. 2306.03847v1 null
2023-06-06 Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How Sebastian Pineda Arango et.al. 2306.03828v1 null
2023-06-06 On the Difference of BERT-style and CLIP-style Text Encoders Zhihong Chen et.al. 2306.03678v1 link
2023-06-06 BioBLP: A Modular Framework for Learning on Multimodal Biomedical Knowledge Graphs Daniel Daza et.al. 2306.03606v1 link
2023-06-06 LegoNet: Alternating Model Blocks for Medical Image Segmentation Ikboljon Sobirov et.al. 2306.03494v1 null
2023-06-06 Alzheimer Disease Classification through ASR-based Transcriptions: Exploring the Impact of Punctuation and Pauses Lucía Gómez-Zaragozá et.al. 2306.03443v1 null
2023-06-06 Quantifying the Variability Collapse of Neural Networks Jing Xu et.al. 2306.03440v1 null
2023-06-06 Towards Alleviating the Object Bias in Prompt Tuning-based Factual Knowledge Extraction Yuhang Wang et.al. 2306.03378v1 link
2023-06-06 Identifying Shared Decodable Concepts in the Human Brain Using Image-Language Foundation Models Cory Efird et.al. 2306.03375v1 null
2023-06-07 Vid2Act: Activate Offline Videos for Visual RL Minting Pan et.al. 2306.03360v2 null
2023-06-05 SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression Tim Dettmers et.al. 2306.03078v1 null
2023-06-05 Sensitivity-Aware Finetuning for Accuracy Recovery on Deep Learning Hardware Lakshmi Nair et.al. 2306.03076v1 null
2023-06-05 Continual Learning with Pretrained Backbones by Tuning in the Input Space Simone Marullo et.al. 2306.02947v1 null
2023-06-05 Second Language Acquisition of Neural Language Models Miyu Oba et.al. 2306.02920v1 null
2023-06-05 SelfEvolve: A Code Evolution Framework via Large Language Models Shuyang Jiang et.al. 2306.02907v1 null
2023-06-05 Learning Probabilistic Symmetrization for Architecture Agnostic Equivariance Jinwoo Kim et.al. 2306.02866v1 link
2023-06-05 Transformer-Based UNet with Multi-Headed Cross-Attention Skip Connections to Eliminate Artifacts in Scanned Documents David Kreuzer et.al. 2306.02815v1 null
2023-06-05 Explore and Exploit the Diverse Knowledge in Model Zoo for Domain Generalization Yimeng Chen et.al. 2306.02595v1 null
2023-06-05 Improved Active Multi-Task Representation Learning via Lasso Yiping Wang et.al. 2306.02556v1 null
2023-06-04 RadLing: Towards Efficient Radiology Report Understanding Rikhiya Ghosh et.al. 2306.02492v1 null
2023-06-02 Distilling Efficient Language-Specific Models for Cross-Lingual Transfer Alan Ansell et.al. 2306.01709v1 link
2023-06-02 Towards In-context Scene Understanding Ivana Balažević et.al. 2306.01667v1 null
2023-06-02 Pretrained Language Model based Web Search Ranking: From Relevance to Satisfaction Canjia Li et.al. 2306.01599v1 null
2023-06-02 Evaluating The Robustness of Self-Supervised Representations to Background/Foreground Removal Xavier F. Cadet et.al. 2306.01398v1 null
2023-06-02 Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23 Ioannis Tsiamas et.al. 2306.01327v1 null
2023-06-01 Systematic Evaluation of GPT-3 for Zero-Shot Personality Estimation Adithya V Ganesan et.al. 2306.01183v1 null
2023-06-01 TMI! Finetuned Models Leak Private Information from their Pretraining Data John Abascal et.al. 2306.01181v1 null
2023-06-01 The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only Guilherme Penedo et.al. 2306.01116v1 null
2023-06-01 Exploring the Versatility of Zero-Shot CLIP for Interstitial Lung Disease Classification Cara Van Uden et.al. 2306.01111v1 null
2023-06-01 Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles Chaitanya Ryali et.al. 2306.00989v1 link
2023-06-01 Continual Learning for Abdominal Multi-Organ and Tumor Segmentation Yixiao Zhang et.al. 2306.00988v1 link
2023-06-01 StyleGAN knows Normal, Depth, Albedo, and More Anand Bhattad et.al. 2306.00987v1 null
2023-06-02 Diffusion Self-Guidance for Controllable Image Generation Dave Epstein et.al. 2306.00986v2 null
2023-06-01 Train Offline, Test Online: A Real Robot Learning Benchmark Gaoyue Zhou et.al. 2306.00942v1 link
2023-06-01 STEVE-1: A Generative Model for Text-to-Behavior in Minecraft Shalev Lifshitz et.al. 2306.00937v1 null
2023-06-01 "Let's not Quote out of Context": Unified Vision-Language Pretraining for Context Assisted Image Captioning Abisek Rajakumar Kalarani et.al. 2306.00931v1 null
2023-06-01 Inserting Anybody in Diffusion Models via Celeb Basis Ge Yuan et.al. 2306.00926v1 link
2023-06-01 Adapting a ConvNeXt model to audio classification on AudioSet Thomas Pellegrini et.al. 2306.00830v1 null
2023-06-01 In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation Julian Bitterwolf et.al. 2306.00826v1 link
2023-06-01 Too Large; Data Reduction for Vision-Language Pre-Training Alex Jinpeng Wang et.al. 2305.20087v2 link
2023-05-31 Efficient Shapley Values Estimation by Amortization for Text Classification Chenghao Yang et.al. 2305.19998v1 link
2023-06-01 A Global Context Mechanism for Sequence Labeling Conglei Xu et.al. 2305.19928v2 link
2023-05-31 Structure-Aware Language Model Pretraining Improves Dense Retrieval on Structured Data Xinze Li et.al. 2305.19912v1 link
2023-05-31 How Does Pretraining Improve Discourse-Aware Translation? Zhihong Huang et.al. 2305.19847v1 null
2023-05-31 A Survey of Label-Efficient Deep Learning for 3D Point Clouds Aoran Xiao et.al. 2305.19812v1 link
2023-05-31 Automatic Discrimination of Human and Neural Machine Translation in Multilingual Scenarios Malina Chichirau et.al. 2305.19757v1 null
2023-05-31 Investigation of the Robustness of Neural Density Fields Jonas Schuhmacher et.al. 2305.19698v1 null
2023-05-31 End-to-end Training of Deep Boltzmann Machines by Unbiased Contrastive Divergence with Local Mode Initialization Shohei Taniguchi et.al. 2305.19684v1 link
2023-05-31 LAIT: Efficient Multi-Segment Encoding in Transformers with Layer-Adjustable Interaction Jeremiah Milbauer et.al. 2305.19585v1 null
2023-05-30 Jointly Reparametrized Multi-Layer Adaptation for Efficient and Private Tuning Umang Gupta et.al. 2305.19264v1 link
2023-05-30 DäRF: Boosting Radiance Fields from Sparse Inputs with Monocular Depth Adaptation Jiuhn Song et.al. 2305.19201v1 null
2023-05-30 Strategic Reasoning with Language Models Kanishk Gandhi et.al. 2305.19165v1 null
2023-05-30 LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images Viraj Prabhu et.al. 2305.19164v1 null
2023-05-30 Together We Make Sense -- Learning Meta-Sense Embeddings from Pretrained Static Sense Embeddings Haochen Luo et.al. 2305.19092v1 null
2023-05-30 Nested Diffusion Processes for Anytime Image Generation Noam Elata et.al. 2305.19066v1 link
2023-05-30 Voice Conversion With Just Nearest Neighbors Matthew Baas et.al. 2305.18975v1 link
2023-05-30 Prompt-based Tuning of Transformer Models for Multi-Center Medical Image Segmentation Numan Saeed et.al. 2305.18948v1 null
2023-05-30 Empirical Sufficiency Lower Bounds for Language Modeling with Locally-Bootstrapped Semantic Structures Jakob Prange et.al. 2305.18915v1 link
2023-05-30 Dissecting Chain-of-Thought: A Study on Compositional In-Context Learning of MLPs Yingcong Li et.al. 2305.18869v1 null
2023-05-29 CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice Juan Zuluaga-Gomez et.al. 2305.18283v1 link
2023-05-29 Concept Decomposition for Visual Exploration and Inspiration Yael Vinker et.al. 2305.18203v1 null
2023-05-29 Multiscale Positive-Unlabeled Detection of AI-Generated Texts Yuchuan Tian et.al. 2305.18149v1 link
2023-05-29 Conditional Score Guidance for Text-Driven Image-to-Image Translation Hyunsoo Lee et.al. 2305.18007v1 null
2023-05-29 Data Augmentation for Low-Resource Keyphrase Generation Krishna Garg et.al. 2305.17968v1 link
2023-05-28 Transfer Learning for Power Outage Detection Task with Limited Training Data Olukunle Owolabi et.al. 2305.17817v1 null
2023-05-28 Adapting Language-Audio Models as Few-Shot Audio Learners Jinhua Liang et.al. 2305.17719v1 null
2023-05-28 Z-GMOT: Zero-shot Generic Multiple Object Tracking Kim Hoang Tran et.al. 2305.17648v1 null
2023-05-30 Learning from Children: Improving Image-Caption Pretraining via Curriculum Hammad A. Ayyubi et.al. 2305.17540v2 link
2023-05-27 Text-to-image Editing by Image Information Removal Zhongping Zhang et.al. 2305.17489v1 null
2023-05-26 BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks Kai Zhang et.al. 2305.17100v1 link
2023-05-26 Learning and Leveraging Verifiers to Improve Planning Capabilities of Pre-trained Language Models Daman Arora et.al. 2305.17077v1 null
2023-05-26 Exploiting Abstract Meaning Representation for Open-Domain Question Answering Cunxiang Wang et.al. 2305.17050v1 null
2023-05-26 Commonsense Knowledge Graph Completion Via Contrastive Pretraining and Node Clustering Siwei Wu et.al. 2305.17019v1 null
2023-05-26 D-CALM: A Dynamic Clustering-based Active Learning Approach for Mitigating Bias Sabit Hassan et.al. 2305.17013v1 null
2023-05-29 Three Towers: Flexible Contrastive Learning with Pretrained Image Models Jannik Kossen et.al. 2305.16999v2 null
2023-05-26 Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation David Brandfonbrener et.al. 2305.16985v1 null
2023-05-26 Compositional Generalization without Trees using Multiset Tagging and Latent Permutations Matthias Lindemann et.al. 2305.16954v1 null
2023-05-26 On Evaluating Adversarial Robustness of Large Vision-Language Models Yunqing Zhao et.al. 2305.16934v1 link
2023-05-26 Calibration of Transformer-based Models for Identifying Stress and Depression in Social Media Loukas Ilias et.al. 2305.16797v1 null
2023-05-25 Parallel Sampling of Diffusion Models Andy Shih et.al. 2305.16317v1 link
2023-05-25 Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages Shivanshu Gupta et.al. 2305.16302v1 null
2023-05-25 Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation Lisa Dunlap et.al. 2305.16289v1 link
2023-05-25 ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation Zhengyi Wang et.al. 2305.16213v1 link
2023-05-26 Diversity-Aware Coherence Loss for Improving Neural Topic Models Raymond Li et.al. 2305.16199v2 link
2023-05-25 Explainability Techniques for Chemical Language Models Stefan Hödl et.al. 2305.16192v1 link
2023-05-25 Language Models Implement Simple Word2Vec-style Vector Arithmetic Jack Merullo et.al. 2305.16130v1 link
2023-05-25 Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data Takafumi Moriya et.al. 2305.15971v1 null
2023-05-25 Latent Diffusion Model Based Foley Sound Generation System For DCASE Challenge 2023 Task 7 Yi Yuan et.al. 2305.15905v1 null
2023-05-25 On Architectural Compression of Text-to-Image Diffusion Models Bo-Kyeong Kim et.al. 2305.15798v1 null
2023-05-24 What can generic neural networks learn from a child's visual experience? A. Emin Orhan et.al. 2305.15372v1 null
2023-05-24 Solving Diffusion ODEs with Optimal Boundary Conditions for Better Image Super-Resolution Yiyang Ma et.al. 2305.15357v1 null
2023-05-24 Visual Programming for Text-to-Image Generation and Evaluation Jaemin Cho et.al. 2305.15328v1 null
2023-05-24 Self-Evolution Learning for Discriminative Language Model Pretraining Qihuang Zhong et.al. 2305.15275v1 null
2023-05-24 Revisiting Token Dropping Strategy in Efficient BERT Pretraining Qihuang Zhong et.al. 2305.15273v1 null
2023-05-24 ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers Jingfeng Yao et.al. 2305.15272v1 link
2023-05-24 Rethinking the Evaluation Protocol of Domain Generalization Han Yu et.al. 2305.15253v1 null
2023-05-24 L-CAD: Language-based Colorization with Any-level Descriptions Zheng Chang et.al. 2305.15217v1 null
2023-05-24 Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator Ziwei He et.al. 2305.15099v1 null
2023-05-24 Dynamic Masking Rate Schedules for MLM Pretraining Zachary Ankner et.al. 2305.15096v1 null
2023-05-23 Video Prediction Models as Rewards for Reinforcement Learning Alejandro Escontrela et.al. 2305.14343v1 null
2023-05-23 ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings William Brannon et.al. 2305.14321v1 link
2023-05-23 QLoRA: Efficient Finetuning of Quantized LLMs Tim Dettmers et.al. 2305.14314v1 link
2023-05-23 Weakly-Supervised Learning of Visual Relations in Multimodal Pretraining Emanuele Bugliarello et.al. 2305.14281v1 null
2023-05-23 Masked Path Modeling for Vision-and-Language Navigation Zi-Yi Dou et.al. 2305.14268v1 null
2023-05-24 DUBLIN -- Document Understanding By Language-Image Network Kriti Aggarwal et.al. 2305.14218v2 null
2023-05-23 Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks Tiedong Liu et.al. 2305.14201v1 null
2023-05-23 Accessing Higher Dimensions for Unsupervised Word Translation Sida I. Wang et.al. 2305.14200v1 null
2023-05-23 Evaluating Factual Consistency of Summaries with Large Language Models Shiqi Chen et.al. 2305.14069v1 link
2023-05-23 Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification Sangmin Bae et.al. 2305.14032v1 link
2023-05-22 Language-Agnostic Bias Detection in Language Models Abdullatif Köksal et.al. 2305.13302v1 null
2023-05-22 U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech Xin Jing et.al. 2305.13195v1 null
2023-05-22 A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity Shayne Longpre et.al. 2305.13169v1 null
2023-05-22 LMGQS: A Large-scale Dataset for Query-focused Summarization Ruochen Xu et.al. 2305.13086v1 null
2023-05-22 Textually Pretrained Speech Language Models Michael Hassid et.al. 2305.13009v1 null
2023-05-22 Rethinking Semi-supervised Learning with Language Models Zhengxiang Shi et.al. 2305.13002v1 link
2023-05-22 Text-based Person Search without Parallel Image-Text Data Yang Bai et.al. 2305.12964v1 null
2023-05-22 Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model Xiao Wang et.al. 2305.12816v1 null
2023-05-22 In-Context Learning of Large Language Models Explained as Kernel Regression Chi Han et.al. 2305.12766v1 null
2023-05-22 LEAN: Light and Efficient Audio Classification Network Shwetank Choudhary et.al. 2305.12712v1 null
2023-05-19 Neural Foundations of Mental Simulation: Future Prediction of Latent Representations on Dynamic Scenes Aran Nayebi et.al. 2305.11772v1 null
2023-05-19 Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition Siyuan Feng et.al. 2305.11569v1 null
2023-05-19 JOINEDTrans: Prior Guided Multi-task Transformer for Joint Optic Disc/Cup Segmentation and Fovea Detection Huaqing He et.al. 2305.11504v1 null
2023-05-19 TreePrompt: Learning to Compose Tree Prompts for Explainable Visual Grounding Chenchi Zhang et.al. 2305.11497v1 null
2023-05-19 ReDirTrans: Latent-to-Latent Translation for Gaze and Head Redirection Shiwei Jin et.al. 2305.11452v1 null
2023-05-18 CHBias: Bias Evaluation and Mitigation of Chinese Conversational Language Models Jiaxu Zhao et.al. 2305.11262v1 null
2023-05-18 Comparing Biases and the Impact of Multilingual Training across Multiple Languages Sharon Levy et.al. 2305.11242v1 null
2023-05-18 LIMA: Less Is More for Alignment Chunting Zhou et.al. 2305.11206v1 null
2023-05-18 ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities Peng Wang et.al. 2305.11172v1 link
2023-05-18 Exploring the Carbon Footprint of Hugging Face's ML Models: A Repository Mining Study Joel Castaño et.al. 2305.11164v1 null
2023-05-18 UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild Can Qin et.al. 2305.11147v1 null
2023-05-18 mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences David Uthus et.al. 2305.11129v1 null
2023-05-18 Generalized Planning in PDDL Domains with Pretrained Large Language Models Tom Silver et.al. 2305.11014v1 link
2023-05-18 The Web Can Be Your Oyster for Improving Large Language Models Junyi Li et.al. 2305.10998v1 null
2023-05-18 How does the task complexity of masked pretraining objectives affect downstream performance? Atsuki Yamaguchi et.al. 2305.10992v1 link
2023-05-18 FLIGHT Mode On: A Feather-Light Network for Low-Light Image Enhancement Mustafa Ozcan et.al. 2305.10889v1 null
2023-05-18 VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation Wenjing Wang et.al. 2305.10874v1 null
2023-05-18 Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning Wenhao Li et.al. 2305.10865v1 null
2023-05-17 DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining Sang Michael Xie et.al. 2305.10429v1 null
2023-05-17 What You See is What You Read? Improving Text-Image Alignment Evaluation Michal Yarom et.al. 2305.10400v1 link
2023-05-17 OpenSLU: A Unified, Modularized, and Extensible Toolkit for Spoken Language Understanding Libo Qin et.al. 2305.10231v1 link
2023-05-17 Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks Alon Jacovi et.al. 2305.10160v1 null
2023-05-17 Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models Alvin Heng et.al. 2305.10120v1 null
2023-05-17 CWD30: A Comprehensive and Holistic Dataset for Crop Weed Recognition in Precision Agriculture Talha Ilyas et.al. 2305.10084v1 null
2023-05-17 Dynamic Structural Brain Network Construction by Hierarchical Prototype Embedding GCN using T1-MRI Yilin Leng et.al. 2305.10077v1 null
2023-05-17 Equivariant Few-Shot Learning from Pretrained Models Sourya Basu et.al. 2305.09900v1 null
2023-05-16 The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation Mutian He et.al. 2305.09652v1 null
2023-05-16 Concurrent Misclassification and Out-of-Distribution Detection for Semantic Segmentation via Energy-Based Normalizing Flow Denis Gudovskiy et.al. 2305.09610v1 link
2023-05-16 An Empirical Study on Google Research Football Multi-agent Scenarios Yan Song et.al. 2305.09458v1 link
2023-05-16 Consistent Multi-Granular Rationale Extraction for Explainable Multi-hop Fact Verification Jiasheng Si et.al. 2305.09400v1 null
2023-05-16 Deep Ensembling for Perceptual Image Quality Assessment Nisar Ahmed et.al. 2305.09141v1 null
2023-05-15 Self-Supervised Pretraining on Paired Sequences of fMRI Data for Transfer Learning to Brain Decoding Tasks Sean Paulsen et.al. 2305.09057v1 null
2023-05-15 CLIP-VG: Self-paced Curriculum Adapting of CLIP via Exploiting Pseudo-Language Labels for Visual Grounding Linhui Xiao et.al. 2305.08685v1 null
2023-05-15 DarkBERT: A Language Model for the Dark Side of the Internet Youngjin Jin et.al. 2305.08596v1 null
2023-05-15 What's the Meaning of Superhuman Performance in Today's NLU? Simone Tedeschi et.al. 2305.08414v1 null
2023-05-15 TESS: Text-to-Text Self-Conditioned Simplex Diffusion Rabeeh Karimi Mahabadi et.al. 2305.08379v1 null
2023-05-15 "Nothing Abnormal": Disambiguating Medical Reports via Contrastive Knowledge Infusion Zexue He et.al. 2305.08300v1 null
2023-05-15 From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models Shangbin Feng et.al. 2305.08283v1 null
2023-05-14 FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge Shangbin Feng et.al. 2305.08281v1 null
2023-05-14 MatSci-NLP: Evaluating Scientific Language Models on Materials Science Language Tasks Using Text-to-Schema Modeling Yu Song et.al. 2305.08264v1 link
2023-05-14 Evaluating the roughness of structure-property relationships using pretrained molecular representations David E. Graff et.al. 2305.08238v1 null
2023-05-14 DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement Hendrik Schröter et.al. 2305.08227v1 null
2023-05-12 Measuring Progress in Fine-grained Vision-and-Language Understanding Emanuele Bugliarello et.al. 2305.07558v1 link
2023-05-12 Comprehensive Solution Program Centric Pretraining for Table-and-Text Hybrid Numerical Reasoning Qianying Liu et.al. 2305.07475v1 null
2023-05-12 CLIP-Count: Towards Text-Guided Zero-Shot Object Counting Ruixiang Jiang et.al. 2305.07304v1 link
2023-05-11 Simple Token-Level Confidence Improves Caption Correctness Suzanne Petryk et.al. 2305.07021v1 null
2023-05-11 A General-Purpose Multilingual Document Encoder Onur Galoğlu et.al. 2305.07016v1 link
2023-05-11 Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers Dahun Kim et.al. 2305.07011v1 null
2023-05-11 Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks Eshaan Nichani et.al. 2305.06986v1 null
2023-05-11 IUST_NLP at SemEval-2023 Task 10: Explainable Detecting Sexism with Transformers and Task-adaptive Pretraining Hadiseh Mahmoudi et.al. 2305.06892v1 null
2023-05-11 Extending Audio Masked Autoencoders Toward Audio Restoration Zhi Zhong et.al. 2305.06701v1 null
2023-05-11 WeditGAN: Few-shot Image Generation via Latent Space Relocation Yuxuan Duan et.al. 2305.06671v1 null
2023-05-11 A First Look at LLM-Powered Generative News Recommendation Qijiong Liu et.al. 2305.06566v1 link
2023-05-11 Undercover Deepfakes: Detecting Fake Segments in Videos Sanjay Saha et.al. 2305.06564v1 link
2023-05-11 How Good are Commercial Large Language Models on African Languages? Jessica Ojo et.al. 2305.06530v1 null
2023-05-10 Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs Roei Herzig et.al. 2305.06343v1 null
2023-05-10 XTab: Cross-table Pretraining for Tabular Transformers Bingzhao Zhu et.al. 2305.06090v1 link
2023-05-10 A Survey of Deep Code Search Yutao Xie et.al. 2305.05959v1 null
2023-05-10 Mover: Mask and Recovery based Facial Part Consistency Aware Method for Deepfake Video Detection Juan Hu et.al. 2305.05943v1 null
2023-05-10 SHS-Net: Learning Signed Hyper Surfaces for Oriented Normal Estimation of Point Clouds Qing Li et.al. 2305.05873v1 link
2023-05-10 Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? An Examination on Several Typical Tasks Xianzhi Li et.al. 2305.05862v1 null
2023-05-10 Vārta: A Large-Scale Headline-Generation Dataset for Indic Languages Rahul Aralikatte et.al. 2305.05858v1 link
2023-05-09 Region-based Contrastive Pretraining for Medical Image Retrieval with Anatomic Query Ho Hin Lee et.al. 2305.05598v1 null
2023-05-09 Recursions Are All You Need: Towards Efficient Deep Unfolding Networks Rawwad Alhejaili et.al. 2305.05505v1 link
2023-05-09 BadCS: A Backdoor Attack Framework for Code search Shiyi Qi et.al. 2305.05503v1 null
2023-05-09 Exploiting Pseudo Image Captions for Multimodal Summarization Chaoya Jiang et.al. 2305.05496v1 link
2023-05-09 What is the best recipe for character-level encoder-only modelling? Kris Cao et.al. 2305.05461v1 null
2023-05-09 MSVQ: Self-Supervised Learning with Multiple Sample Views and Queues Chen Peng et.al. 2305.05370v1 link
2023-05-09 A Framework for Designing Foundation Model based Systems Qinghua Lu et.al. 2305.05352v1 null
2023-05-09 Application of Artificial Intelligence in the Classification of Microscopical Starch Images for Drug Formulation Marvellous Ajala et.al. 2305.05321v1 null
2023-05-09 Robust Acoustic and Semantic Contextual Biasing in Neural Transducers for Speech Recognition Xuandi Fu et.al. 2305.05271v1 null
2023-05-09 Boosting Visual-Language Models by Exploiting Hard Samples Haonan Wang et.al. 2305.05208v1 null
2023-05-08 Toeplitz Neural Network for Sequence Modeling Zhen Qin et.al. 2305.04749v1 link
2023-05-08 Enhancing Knowledge Graph Construction Using Large Language Models Milena Trajanoska et.al. 2305.04676v1 null
2023-05-08 MultiTACRED: A Multilingual Version of the TAC Relation Extraction Dataset Leonhard Hennig et.al. 2305.04582v1 link
2023-05-08 A Multi-Modal Context Reasoning Approach for Conditional Inference on Joint Textual and Visual Clues Yunxin Li et.al. 2305.04530v1 link
2023-05-08 SNT: Sharpness-Minimizing Network Transformation for Fast Compression-friendly Pretraining Jung Hwan Heo et.al. 2305.04526v1 null
2023-05-08 Retriever and Ranker Framework with Probabilistic Hard Negative Sampling for Code Search Hande Dong et.al. 2305.04508v1 null
2023-05-08 Token-level Fitting Issues of Seq2seq Models Guangsheng Bao et.al. 2305.04493v1 null
2023-05-09 Vision Language Pre-training by Contrastive Learning with Cross-Modal Similarity Regulation Chaoya Jiang et.al. 2305.04474v2 null
2023-05-08 Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting Zhicheng Wang et.al. 2305.04440v1 null
2023-05-08 Breaking Through the Haze: An Advanced Non-Homogeneous Dehazing Method based on Fast Fourier Convolution and ConvNeXt Han Zhou et.al. 2305.04430v1 link
2023-05-05 Otter: A Multi-Modal Model with In-Context Instruction Tuning Bo Li et.al. 2305.03726v1 null
2023-05-05 COLA: How to adapt vision-language models to Compose Objects Localized with Attributes? Arijit Ray et.al. 2305.03689v1 null
2023-05-05 Retrieval Augmented Chest X-Ray Report Generation using OpenAI GPT models Mercy Ranjit et.al. 2305.03660v1 null
2023-05-05 Data Curation for Image Captioning with Text-to-Image Generative Models Wenyan Li et.al. 2305.03610v1 null
2023-05-05 DisenBooth: Disentangled Parameter-Efficient Tuning for Subject-Driven Text-to-Image Generation Hong Chen et.al. 2305.03374v1 null
2023-05-05 HiPool: Modeling Long Documents Using Graph Neural Networks Irene Li et.al. 2305.03319v1 link
2023-05-04 Chain-of-Skills: A Configurable Model for Open-domain Question Answering Kaixin Ma et.al. 2305.03130v1 null
2023-05-04 Adversarially-Guided Portrait Matting Sergej Chicherin et.al. 2305.02981v1 link
2023-05-04 End-to-end spoken language understanding using joint CTC loss and self-supervised, pretrained acoustic encoders Jixuan Wang et.al. 2305.02937v1 null
2023-05-04 Forward-Forward Contrastive Learning Md. Atik Ahamed et.al. 2305.02927v1 null
2023-05-04 DN at SemEval-2023 Task 12: Low-Resource Language Text Classification via Multilingual Pretrained Language Model Fine-tuning Daniil Homskiy et.al. 2305.02607v1 null
2023-05-04 How to Choose Pretrained Handwriting Recognition Models for Single Writer Fine-Tuning Vittorio Pippi et.al. 2305.02593v1 null
2023-05-03 Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations Vasudha Kowtha et.al. 2305.02382v1 null
2023-05-03 PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives Silin Gao et.al. 2305.02364v1 link
2023-05-03 Entity Tracking in Language Models Najoung Kim et.al. 2305.02363v1 null
2023-05-03 Real-Time Radiance Fields for Single-Image Portrait View Synthesis Alex Trevithick et.al. 2305.02310v1 null
2023-05-05 A Neural Divide-and-Conquer Reasoning Framework for Image Retrieval from Linguistically Complex Text Yunxin Li et.al. 2305.02265v2 link
2023-05-03 Explaining Language Models' Predictions with High-Impact Concepts Ruochen Zhao et.al. 2305.02160v1 null
2023-05-02 KEPLET: Knowledge-Enhanced Pretrained Language Model with Topic Entity Awareness Yichuan Li et.al. 2305.01810v1 null
2023-05-02 Don't Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner Zhengxiang Shi et.al. 2305.01711v1 link
2023-05-02 SIA-FTP: A Spoken Instruction Aware Flight Trajectory Prediction Framework Dongyue Guo et.al. 2305.01661v1 null
2023-05-02 Unlimiformer: Long-Range Transformers with Unlimited Length Input Amanda Bertsch et.al. 2305.01625v1 link
2023-05-02 A Study on the Integration of Pipeline and E2E SLU systems for Spoken Semantic Parsing toward STOP Quality Challenge Siddhant Arora et.al. 2305.01620v1 null
2023-05-02 RadAdapt: Radiology Report Summarization via Lightweight Domain Adaptation of Large Language Models Dave Van Veen et.al. 2305.01146v1 null
2023-05-01 Interpreting Pretrained Source-code Models using Neuron Redundancy Analyses Arushi Sharma et.al. 2305.00875v1 null
2023-04-30 Transfer of knowledge among instruments in automatic music transcription Michał Leś et.al. 2305.00426v1 null
2023-04-30 Cross-Shaped Windows Transformer with Self-supervised Pretraining for Clinically Significant Prostate Cancer Detection in Bi-parametric MRI Yuheng Li et.al. 2305.00385v1 null
2023-04-29 LD-GAN: Low-Dimensional Generative Adversarial Network for Spectral Image Generation with Variance Regularization Emmanuel Martinez et.al. 2305.00132v1 link
2023-04-28 Towards Better Domain Adaptation for Self-supervised Models: A Case Study of Child ASR Ruchao Fan et.al. 2305.00115v1 null
2023-04-28 NLNDE at SemEval-2023 Task 12: Adaptive Pretraining and Source Language Selection for Low-Resource Multilingual Sentiment Analysis Mingyang Wang et.al. 2305.00090v1 null
2023-04-28 Unsupervised Discovery of 3D Hierarchical Structure with Generative Diffusion Features Nurislam Tursynbek et.al. 2305.00067v1 null
2023-04-28 CCpdf: Building a High Quality Corpus for Visually Rich Documents from Web Crawl Data Michał Turski et.al. 2304.14953v1 link
2023-04-28 Made of Steel? Learning Plausible Materials for Components in the Vehicle Repair Domain Annerose Eichel et.al. 2304.14745v1 link
2023-04-28 DIAMANT: Dual Image-Attention Map Encoders For Medical Image Segmentation Yousef Yeganeh et.al. 2304.14571v1 null
2023-04-27 Greybox Penetration Testing on Cloud Access Control with IAM Modeling and Deep Reinforcement Learning Yang Hu et.al. 2304.14540v1 null
2023-04-27 Gradient-based Maximally Interfered Retrieval for Domain Incremental 3D Object Detection Barza Nisar et.al. 2304.14460v1 link
2023-04-27 We're Afraid Language Models Aren't Modeling Ambiguity Alisa Liu et.al. 2304.14399v1 link
2023-04-27 UIO at SemEval-2023 Task 12: Multilingual fine-tuning for sentiment classification in low-resource languages Egil Rønningstad et.al. 2304.14189v1 null
2023-04-27 Lightweight, Pre-trained Transformers for Remote Sensing Timeseries Gabriel Tseng et.al. 2304.14065v1 link
2023-04-27 Retrieval-based Knowledge Augmented Vision Language Pre-training Jiahua Rao et.al. 2304.13923v1 null
2023-04-27 Neural Keyphrase Generation: Analysis and Evaluation Tuhin Kundu et.al. 2304.13883v1 null
2023-04-26 highway2vec -- representing OpenStreetMap microregions with respect to their road network characteristics Kacper Leśniara et.al. 2304.13865v1 link
2023-04-26 A Deep Learning Framework for Verilog Autocompletion Towards Design and Verification Automation Enrique Dehaerne et.al. 2304.13840v1 null
2023-04-26 Programmatically Grounded, Compositionally Generalizable Robotic Manipulation Renhao Wang et.al. 2304.13826v1 null
2023-04-26 Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models Haoqiang Kang et.al. 2304.13803v1 null
2023-04-26 Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation Lukas Hoyer et.al. 2304.13615v1 link
2023-04-26 Tissue Classification During Needle Insertion Using Self-Supervised Contrastive Learning and Optical Coherence Tomography Debayan Bhattacharya et.al. 2304.13574v1 null
2023-04-26 Self-Supervised Multi-Modal Sequential Recommendation Kunzhe Song et.al. 2304.13277v1 null
2023-04-25 Towards Compute-Optimal Transfer Learning Massimo Caccia et.al. 2304.13164v1 null
2023-04-25 Hypernymization of named entity-rich captions for grounding-based multi-modal pretraining Giacomo Nebbia et.al. 2304.13130v1 null
2023-04-25 Pretrain on just structure: Understanding linguistic inductive biases using transfer learning Isabel Papadimitriou et.al. 2304.13060v1 null
2023-04-25 On the Generalization of Learned Structured Representations Andrea Dittadi et.al. 2304.13001v1 null
2023-04-25 CitePrompt: Using Prompts to Identify Citation Intent in Scientific Papers Avishek Lahiri et.al. 2304.12730v1 link
2023-04-26 Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation Junde Wu et.al. 2304.12620v2 null
2023-04-26 OFAR: A Multimodal Evidence Retrieval Framework for Illegal Live-streaming Identification Lin Dengtian et.al. 2304.12608v2 null
2023-04-25 Model Conversion via Differentially Private Data-Free Distillation Bochao Liu et.al. 2304.12528v1 null
2023-04-25 Hint-Aug: Drawing Hints from Foundation Vision Transformers Towards Boosted Few-Shot Parameter-Efficient Tuning Zhongzhi Yu et.al. 2304.12520v1 null
2023-04-25 RenderDiffusion: Text Generation as Image Generation Junyi Li et.al. 2304.12519v1 null
2023-04-24 PEFT-Ref: A Modular Reference Architecture and Typology for Parameter-Efficient Finetuning Techniques Mohammed Sabry et.al. 2304.12410v1 null
2023-04-24 Generative Discovery of Novel Chemical Designs using Diffusion Modeling and Transformer Deep Neural Networks with Application to Deep Eutectic Solvents Rachel K. Luu et.al. 2304.12400v1 null
2023-04-24 Uni-QSAR: an Auto-ML Tool for Molecular Property Prediction Zhifeng Gao et.al. 2304.12239v1 null
2023-04-24 Deep Audio-Visual Singing Voice Transcription based on Self-Supervised Learning Models Xiangming Gu et.al. 2304.12082v1 null
2023-04-24 Robust Tickets Can Transfer Better: Drawing More Transferable Subnetworks in Transfer Learning Yonggan Fu et.al. 2304.11834v1 null
2023-04-22 Incomplete Multimodal Learning for Remote Sensing Data Fusion Yuxing Chen et.al. 2304.11381v1 null
2023-04-22 Single-stage Multi-human Parsing via Point Sets and Center-based Offsets Jiaming Chu et.al. 2304.11356v1 null
2023-04-22 Self-supervised Learning by View Synthesis Shaoteng Liu et.al. 2304.11330v1 null
2023-04-22 EEE, Remediating the failure of machine learning models via a network-based optimization patch Ruiyuan Kang et.al. 2304.11321v1 null
2023-04-21 Factored Neural Representation for Scene Understanding Yu-Shiang Wong et.al. 2304.10950v1 null
2023-04-24 Text2Time: Transformer-based Article Time Period Prediction Karthick Prasad Gunasekaran et.al. 2304.10859v2 null
2023-04-21 Rethinking Benchmarks for Cross-modal Image-text Retrieval Weijing Chen et.al. 2304.10824v1 link
2023-04-21 Deep Multiview Clustering by Contrasting Cluster Assignments Jie Chen et.al. 2304.10769v1 link
2023-04-20 MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models Deyao Zhu et.al. 2304.10592v1 link
2023-04-20 Implicit Temporal Modeling with Learnable Alignment for Video Recognition Shuyuan Tu et.al. 2304.10465v1 link
2023-04-20 Domain-specific Continued Pretraining of Language Models for Capturing Long Context in Mental Health Shaoxiong Ji et.al. 2304.10447v1 null
2023-04-20 Movie Box Office Prediction With Self-Supervised and Visually Grounded Pretraining Qin Chao et.al. 2304.10311v1 null
2023-04-20 OptoGPT: A Foundation Model for Inverse Design in Optical Multilayer Thin Film Structures Taigao Ma et.al. 2304.10294v1 null
2023-04-20 PREIM3D: 3D Consistent Precise Image Attribute Editing from a Single Image Jianhui Li et.al. 2304.10263v1 null
2023-04-20 Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on POS Tagging for Non-Standardized Languages Verena Blaschke et.al. 2304.10158v1 link
2023-04-19 DCN-T: Dual Context Network with Transformer for Hyperspectral Image Classification Di Wang et.al. 2304.09915v1 link
2023-04-19 Domain Adaptable Self-supervised Representation Learning on Remote Sensing Satellite Imagery Muskaan Chopra et.al. 2304.09874v1 link
2023-04-19 NetGPT: Generative Pretrained Transformer for Network Traffic Xuying Meng et.al. 2304.09513v1 null
2023-04-20 Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes Simran Arora et.al. 2304.09433v2 link
2023-04-18 UniMax: Fairer and more Effective Language Sampling for Large-Scale Multilingual Pretraining Hyung Won Chung et.al. 2304.09151v1 null
2023-04-18 Decoding Neural Activity to Assess Individual Latent State in Ecologically Valid Contexts Stephen M. Gordon et.al. 2304.09050v1 null
2023-04-18 Adapter Learning in Pretrained Feature Extractor for Continual Learning of Diseases Wentao Zhang et.al. 2304.09042v1 null
2023-04-18 D2CSE: Difference-aware Deep continuous prompts for Contrastive Sentence Embeddings Hyunjae Lee et.al. 2304.08991v1 null
2023-04-18 Deep Collective Knowledge Distillation Jihyeon Seo et.al. 2304.08878v1 null
2023-04-18 Romanization-based Large-scale Adaptation of Multilingual Language Models Sukannya Purkayastha et.al. 2304.08865v1 null
2023-04-19 Self-Supervised 3D Action Representation Learning with Skeleton Cloud Colorization Siyuan Yang et.al. 2304.08799v2 null
2023-04-18 Sparks of GPTs in Edge Intelligence for Metaverse: Caching and Inference for Mobile AIGC Services Minrui Xu et.al. 2304.08782v1 null
2023-04-17 Delving into Shape-aware Zero-shot Semantic Segmentation Xinyu Liu et.al. 2304.08491v1 link
2023-04-17 BenchMD: A Benchmark for Modality-Agnostic Learning on Medical Images and Sensors Kathryn Wantlin et.al. 2304.08486v1 link
2023-04-18 Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation Jie An et.al. 2304.08477v2 null
2023-04-18 Inverse design of next-generation superconductors using data-driven deep generative models Daniel Wines et.al. 2304.08446v2 null
2023-04-17 VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset Sihan Chen et.al. 2304.08345v1 link
2023-04-17 Human Pose Estimation in Monocular Omnidirectional Top-View Images Jingrui Yu et.al. 2304.08186v1 null
2023-04-17 DETRs Beat YOLOs on Real-time Object Detection Wenyu Lv et.al. 2304.08069v1 link
2023-04-17 Self-Supervised Learning from Non-Object Centric Images with a Geometric Transformation Sensitive Architecture Taeho Kim Jong-Min Lee et.al. 2304.08014v1 null
2023-04-17 Learning to "Segment Anything" in Thermal Infrared Images through Knowledge Distillation with a Large Scale Dataset SATIR Junzhang Chen et.al. 2304.07969v1 link
2023-04-16 Sabiá: Portuguese Large Language Models Ramon Pires et.al. 2304.07880v1 null
2023-04-14 DINOv2: Learning Robust Visual Features without Supervision Maxime Oquab et.al. 2304.07193v1 link
2023-04-14 The Second Monocular Depth Estimation Challenge Jaime Spencer et.al. 2304.07051v1 null
2023-04-14 MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation Jie Guo et.al. 2304.06957v1 null
2023-04-14 Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text Wanrong Zhu et.al. 2304.06939v1 link
2023-04-14 3D Feature Prediction for Masked-AutoEncoder-Based Point Cloud Pretraining Siming Yan et.al. 2304.06911v1 null
2023-04-14 Generating Adversarial Examples with Better Transferability via Masking Unimportant Parameters of Surrogate Model Dingcheng Yang et.al. 2304.06908v1 null
2023-04-14 Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding Yu-Qi Yang et.al. 2304.06906v1 null
2023-04-17 A Contrastive Method Based on Elevation Data for Remote Sensing with Scarce and High Level Semantic Labels Omar A. Castaño-Idarraga et.al. 2304.06857v2 null
2023-04-13 Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study Boxin Wang et.al. 2304.06762v1 link
2023-04-13 Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction Hansheng Chen et.al. 2304.06714v1 null
2023-04-13 Verbs in Action: Improving verb understanding in video-language models Liliane Momeni et.al. 2304.06708v1 null
2023-04-14 G2T: A Simple but Effective Framework for Topic Modeling based on Pretrained Language Model and Community Detection Leihang Zhang et.al. 2304.06653v2 null
2023-04-13 Lossless Adaptation of Pretrained Vision Models For Robotic Manipulation Mohit Sharma et.al. 2304.06600v1 null
2023-04-12 RECLIP: Resource-efficient CLIP by Training with Small Images Runze Li et.al. 2304.06028v1 null
2023-04-14 DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion Johanna Karras et.al. 2304.06025v2 null
2023-04-12 HaDR: Applying Domain Randomization for Generating Synthetic Multimodal Dataset for Hand Instance Segmentation in Cluttered Industrial Environments Stefan Grushko et.al. 2304.05826v1 null
2023-04-12 Impact of Pseudo Depth on Open World Object Segmentation with Minimal User Guidance Robin Schön et.al. 2304.05716v1 null
2023-04-12 Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning Nikhil Singh et.al. 2304.05600v1 null
2023-04-11 A surprisingly simple technique to control the pretraining bias for better transfer: Expand or Narrow your representation Florian Bordes et.al. 2304.05369v1 null
2023-04-11 A Billion-scale Foundation Model for Remote Sensing Images Keumgang Cha et.al. 2304.05215v1 null
2023-04-11 MRVM-NeRF: Mask-Based Pretraining for Neural Radiance Fields Ganlin Yang et.al. 2304.04962v1 null
2023-04-11 Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference Tao Lei et.al. 2304.04947v1 null
2023-04-10 Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition Shuhuai Ren et.al. 2304.04704v1 link
2023-04-10 Transfer Learning for Low-Resource Sentiment Analysis Razhan Hameed et.al. 2304.04703v1 link
2023-04-10 Attention at SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS) Debashish Roy et.al. 2304.04610v1 link
2023-04-10 hist2RNA: An efficient deep learning architecture to predict gene expression from breast cancer histopathology images Raktim Kumar Mondol et.al. 2304.04507v1 null
2023-04-10 Instance Neural Radiance Field Benran Hu et.al. 2304.04395v1 null
2023-04-10 Leveraging Neural Representations for Audio Manipulation Scott H. Hawley et.al. 2304.04394v1 null
2023-04-10 Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models Nikita Starodubcev et.al. 2304.04344v1 link
2023-04-09 Pretrained Embeddings for E-commerce Machine Learning: When it Fails and Why? Da Xu et.al. 2304.04330v1 null
2023-04-08 Unsupervised Story Discovery from Continuous News Streams via Scalable Thematic Embedding Susik Yoon et.al. 2304.04099v1 null
2023-04-08 WikiGoldSK: Annotated Dataset, Baselines and Few-Shot Learning Experiments for Slovak Named Entity Recognition Dávid Šuba et.al. 2304.04026v1 link
2023-04-07 Zero-shot CT Field-of-view Completion with Unconditional Generative Diffusion Prior Kaiwen Xu et.al. 2304.03760v1 null
2023-04-10 Anomalous Sound Detection using Audio Representation with Machine ID based Contrastive Learning Pretraining Jian Guan et.al. 2304.03588v2 null
2023-04-10 Graph Attention for Automated Audio Captioning Feiyang Xiao et.al. 2304.03586v2 link
2023-04-07 Language-aware Multiple Datasets Detection Pretraining for DETRs Jing Hao et.al. 2304.03580v1 null
2023-04-07 Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4 Hanmeng Liu et.al. 2304.03439v1 link
2023-04-06 RoSteALS: Robust Steganography using Autoencoder Latent Space Tu Bui et.al. 2304.03400v1 link
2023-04-06 Self-Supervised Video Similarity Learning Giorgos Kordopatis-Zilos et.al. 2304.03378v1 link
2023-04-06 Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting Syed Talal Wasim et.al. 2304.03307v1 link
2023-04-06 Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention Mingyu Ding et.al. 2304.03282v1 link
2023-04-06 When do you need Chain-of-Thought Prompting for ChatGPT? Jiuhai Chen et.al. 2304.03262v1 null
2023-04-06 Zero-Shot Next-Item Recommendation using Large Pretrained Language Models Lei Wang et.al. 2304.03153v1 null
2023-04-07 Geometric-aware Pretraining for Vision-centric 3D Object Detection Linyan Huang et.al. 2304.03105v2 link
2023-04-06 Convolutional neural networks for crack detection on flexible road pavements Hermann Tapamo et.al. 2304.02933v1 null
2023-04-06 Mask Detection and Classification in Thermal Face Images Natalia Kowalczyk et.al. 2304.02931v1 link
2023-04-06 Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-commerce Yang Jin et.al. 2304.02853v1 null
2023-04-06 Longitudinal Multimodal Transformer Integrating Imaging and Latent Clinical Signatures From Routine EHRs for Pulmonary Nodule Classification Thomas Z. Li et.al. 2304.02836v1 null
2023-04-05 Bengali Fake Review Detection using Semi-supervised Generative Adversarial Networks Md. Tanvir Rouf Shawon et.al. 2304.02739v1 null
2023-04-05 Exploring the Utility of Self-Supervised Pretraining Strategies for the Detection of Absent Lung Sliding in M-Mode Lung Ultrasound Blake VanBerlo et.al. 2304.02724v1 null
2023-04-05 VicTR: Video-conditioned Text Representations for Activity Recognition Kumara Kahatapitiya et.al. 2304.02560v1 null
2023-04-05 Deep Perceptual Similarity is Adaptable to Ambiguous Contexts Gustav Grund Pihlgren et.al. 2304.02265v1 null
2023-04-05 Towards Efficient Task-Driven Model Reprogramming with Foundation Models Shoukai Xu et.al. 2304.02263v1 null
2023-04-04 Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT Ke Chen et.al. 2304.02160v1 null
2023-04-04 Optimal operating MR contrast for brain ventricle parcellation Savannah P. Hays et.al. 2304.02056v1 null
2023-04-04 Online augmentation of learned grasp sequence policies for more adaptable and data-efficient in-hand manipulation Ethan K. Gordon et.al. 2304.02052v1 null
2023-04-04 AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation Jheng-Hong Yang et.al. 2304.01961v1 link
2023-04-04 Unsupervised Improvement of Factual Knowledge in Language Models Nafis Sadeq et.al. 2304.01597v1 link
2023-04-03 Creating Custom Event Data Without Dictionaries: A Bag-of-Tricks Andrew Halterman et.al. 2304.01331v1 link
2023-04-03 Burstormer: Burst Image Restoration and Enhancement Transformer Akshay Dudhane et.al. 2304.01194v1 link
2023-04-03 ScandEval: A Benchmark for Scandinavian Natural Language Processing Dan Saattrup Nielsen et.al. 2304.00906v1 link
2023-04-03 GreekBART: The First Pretrained Greek Sequence-to-Sequence Model Iakovos Evdaimon et.al. 2304.00869v1 null
2023-04-03 Few-shot Fine-tuning is All You Need for Source-free Domain Adaptation Suho Lee et.al. 2304.00792v1 link
2023-04-03 Multi-Modal Representation Learning with Text-Driven Soft Masks Jaeyoo Park et.al. 2304.00719v1 null
2023-04-03 A Post-Training Framework for Improving Heterogeneous Graph Neural Networks Cheng Yang et.al. 2304.00698v1 null
2023-04-02 PK-Chat: Pointer Network Guided Knowledge Driven Generative Dialogue Model Cheng Deng et.al. 2304.00592v1 link
2023-04-02 DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks Qiangqiang Wu et.al. 2304.00571v1 link
2023-04-02 Video Pretraining Advances 3D Deep Learning on Chest CT Tasks Alexander Ke et.al. 2304.00546v1 link
2023-04-02 Instance-level Trojan Attacks on Visual Question Answering via Adversarial Learning in Neuron Activation Space Yuwei Sun et.al. 2304.00436v1 null
2023-03-31 Procedure-Aware Pretraining for Instructional Video Understanding Honglu Zhou et.al. 2303.18230v1 link
2023-03-31 Siamese DETR Zeren Chen et.al. 2303.18144v1 null
2023-03-31 INoD: Injected Noise Discriminator for Self-Supervised Representation Learning in Agricultural Fields Julia Hindel et.al. 2303.18101v1 null
2023-03-31 LaCViT: A Label-aware Contrastive Training Framework for Vision Transformers Zijun Long et.al. 2303.18013v1 null
2023-03-31 Knowledge Distillation for Feature Extraction in Underwater VSLAM Jinghe Yang et.al. 2303.17981v1 link
2023-03-31 Exploring the Limits of Deep Image Clustering using Pretrained Models Nikolas Adaloglou et.al. 2303.17896v1 null
2023-03-30 Learning Garment DensePose for Robust Warping in Virtual Try-On Aiyu Cui et.al. 2303.17688v1 null
2023-03-30 Whether and When does Endoscopy Domain Pretraining Make Sense? Dominik Batić et.al. 2303.17636v1 null
2023-03-30 Anatomically aware dual-hop learning for pulmonary embolism detection in CT pulmonary angiograms Florin Condrea et.al. 2303.17593v1 null
2023-03-30 DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder Chenpeng Du et.al. 2303.17550v1 null
2023-03-30 Finetuning from Offline Reinforcement Learning: Challenges, Trade-offs and Practical Solutions Yicheng Luo et.al. 2303.17396v1 null
2023-03-30 A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision Lucas Beyer et.al. 2303.17376v1 null
2023-03-30 PMatch: Paired Masked Image Modeling for Dense Geometric Matching Shengjie Zhu et.al. 2303.17342v1 null
2023-03-30 Discriminative Class Tokens for Text-to-Image Diffusion Models Idan Schwartz et.al. 2303.17155v1 null
2023-03-29 Transductive few-shot adapters for medical image segmentation Julio Silva-Rodríguez et.al. 2303.17051v1 link
2023-03-29 AutoAD: Movie Description in Context Tengda Han et.al. 2303.16899v1 link
2023-03-29 Towards Understanding the Effect of Pretraining Label Granularity Guan Zhe Hong et.al. 2303.16887v1 null
2023-03-28 Training Language Models with Language Feedback at Scale Jérémy Scheurer et.al. 2303.16755v1 null
2023-03-29 Visibility Aware Human-Object Interaction Tracking from Single RGB Camera Xianghui Xie et.al. 2303.16479v1 null
2023-03-28 Variational Distribution Learning for Unsupervised Text-to-Image Generation Minsoo Kang et.al. 2303.16105v1 null
2023-03-28 Soft-prompt tuning to predict lung cancer using primary care free-text Dutch medical notes Auke Elfrink et.al. 2303.15846v1 null
2023-03-28 Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion Hiromichi Kamata et.al. 2303.15780v1 null
2023-03-28 SVD-DIP: Overcoming the Overfitting Problem in DIP-based CT Reconstruction Marco Nittscher et.al. 2303.15748v1 link
2023-03-28 Large-scale pretraining on pathological images for fine-tuning of small pathological benchmarks Masataka Kawai et.al. 2303.15693v1 null
2023-03-28 Pre-training Transformers for Knowledge Graph Completion Sanxing Chen et.al. 2303.15682v1 null
2023-03-28 StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing Senmao Li et.al. 2303.15649v1 null
2023-03-27 Training-free Style Transfer Emerges from h-space in Diffusion models Jaeseok Jeong et.al. 2303.15403v1 null
2023-03-27 Generalizable Neural Voxels for Fast Human Radiance Fields Taoran Yi et.al. 2303.15387v1 null
2023-03-27 Improving Neural Topic Models with Wasserstein Knowledge Distillation Suman Adhya et.al. 2303.15350v1 link
2023-03-27 Prompt-Guided Zero-Shot Anomaly Action Recognition using Pretrained Deep Skeleton Features Fumiaki Sato et.al. 2303.15167v1 null
2023-03-27 Parameter Efficient Local Implicit Image Function Network for Face Segmentation Mausoom Sarkar et.al. 2303.15122v1 null
2023-03-27 Adapting Pretrained Language Models for Solving Tabular Prediction Problems in the Electronic Health Record Christopher McMaster et.al. 2303.14920v1 null
2023-03-27 Seer: Language Instructed Video Prediction with Latent Diffusion Models Xianfan Gu et.al. 2303.14897v1 null
2023-03-25 Indian Language Summarization using Pretrained Sequence-to-Sequence Models Ashok Urlana et.al. 2303.14461v1 null
2023-03-25 Sem4SAP: Synonymous Expression Mining From Open Knowledge Graph For Language Model Synonym-Aware Pretraining Zhouhong Gu et.al. 2303.14425v1 null
2023-03-25 Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression Denis Kuznedelev et.al. 2303.14409v1 null
2023-03-27 Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data Paul Hager et.al. 2303.14080v2 link
2023-03-24 Accelerating Vision-Language Pretraining with Free Language Modeling Teng Wang et.al. 2303.14038v1 link
2023-03-24 SPEC: Summary Preference Decomposition for Low-Resource Abstractive Summarization Yi-Syuan Chen et.al. 2303.14011v1 null
2023-03-24 Robust Test-Time Adaptation in Dynamic Scenarios Longhui Yuan et.al. 2303.13899v1 link
2023-03-23 Three ways to improve feature alignment for open vocabulary detection Relja Arandjelović et.al. 2303.13518v1 null
2023-03-23 Ablating Concepts in Text-to-Image Diffusion Models Nupur Kumari et.al. 2303.13516v1 link
2023-03-23 A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias Puja Trivedi et.al. 2303.13500v1 null
2023-03-23 The effectiveness of MAE pre-pretraining for billion-scale pretraining Mannat Singh et.al. 2303.13496v1 null
2023-03-23 Increasing Textual Context Size Boosts Medical Image-Text Matching Idan Glassberg et.al. 2303.13340v1 null
2023-03-23 Parameter-Efficient Sparse Retrievers and Rerankers using Adapters Vaishali Pal et.al. 2303.13220v1 link
2023-03-23 Retrieval-Augmented Classification with Decoupled Representation Xinnian Liang et.al. 2303.13065v1 link
2023-03-23 gDoc: Automatic Generation of Structured API Documentation Shujun Wang et.al. 2303.13041v1 null
2023-03-23 MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models Dohwan Ko et.al. 2303.13009v1 link
2023-03-22 JaCoText: A Pretrained Model for Java Code-Text Generation Jessica López Espejel et.al. 2303.12869v1 null
2023-03-21 Affordance Diffusion: Synthesizing Hand-Object Interactions Yufei Ye et.al. 2303.12538v1 null
2023-03-21 Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding Morris Alper et.al. 2303.12513v1 link
2023-03-22 CLIP^2: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data Yihan Zeng et.al. 2303.12417v1 null
2023-03-21 Prompt-MIL: Boosting Multi-Instance Learning Schemes via Task-specific Prompt Tuning Jingwei Zhang et.al. 2303.12214v1 null
2023-03-21 Toward Accurate Interpretable Predictions of Materials Properties within Transformer Language Models Vadim Korolev et.al. 2303.12188v1 null
2023-03-21 MV-MR: multi-views and multi-representations for self-supervised learning and knowledge distillation Vitaliy Kinakh et.al. 2303.12130v1 link
2023-03-21 Logical Reasoning over Natural Language as Knowledge Representation: A Survey Zonglin Yang et.al. 2303.12023v1 null
2023-03-21 A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need? Chaoning Zhang et.al. 2303.11717v1 null
2023-03-21 Manipulating Transfer Learning for Property Inference Yulong Tian et.al. 2303.11643v1 link
2023-03-21 Large AI Models in Health Informatics: Applications, Challenges, and the Future Jianing Qiu et.al. 2303.11568v1 null
2023-03-20 eP-ALM: Efficient Perceptual Augmentation of Language Models Mustafa Shukor et.al. 2303.11403v1 link
2023-03-20 Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding Jihao Liu et.al. 2303.11325v1 null
2023-03-20 Conversation Modeling to Predict Derailment Jiaqing Yuan et.al. 2303.11184v1 null
2023-03-20 Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning Sungnyun Kim et.al. 2303.11101v1 null
2023-03-20 Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models René Haas et.al. 2303.11073v1 null
2023-03-20 Tubelet-Contrastive Self-Supervision for Video-Efficient Generalization Fida Mohammad Thoker et.al. 2303.11003v1 null
2023-03-20 EMC2-Net: Joint Equalization and Modulation Classification based on Constellation Network Hyun Ryu et.al. 2303.10934v1 link
2023-03-20 Exploring Representation Learning for Small-Footprint Keyword Spotting Fan Cui et.al. 2303.10912v1 null
2023-03-21 Actionlet-Dependent Contrastive Learning for Unsupervised Skeleton-Based Action Recognition Lilang Lin et.al. 2303.10904v2 null
2023-03-20 Character, Word, or Both? Revisiting the Segmentation Granularity for Chinese Pre-trained Language Models Xinnian Liang et.al. 2303.10893v1 null
2023-03-20 A Global Model Approach to Robust Few-Shot SAR Automatic Target Recognition Nathan Inkawhich et.al. 2303.10800v1 null
2023-03-17 Enhancing the Role of Context in Region-Word Alignment for Object Detection Kyle Buettner et.al. 2303.10093v1 null
2023-03-17 DialogPaint: A Dialog-based Image Editing Model Jingxuan Wei et.al. 2303.10073v1 null
2023-03-17 Breast Cancer Histopathology Image based Gene Expression Prediction using Spatial Transcriptomics data and Deep Learning Md Mamunur Rahaman et.al. 2303.09987v1 null
2023-03-17 Dual-path Adaptation from Image to Video Transformers Jungin Park et.al. 2303.09857v1 link
2023-03-17 CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos Seungju Han et.al. 2303.09713v1 null
2023-03-16 VEIL: Vetting Extracted Image Labels from In-the-Wild Captions for Weakly-Supervised Object Detection Arushi Rai et.al. 2303.09608v1 null
2023-03-16 DiffIR: Efficient Diffusion Model for Image Restoration Bin Xia et.al. 2303.09472v1 null
2023-03-16 Team SheffieldVeraAI at SemEval-2023 Task 3: Mono and multilingual approaches for news genre, topic and persuasion technique classification Ben Wu et.al. 2303.09421v1 null
2023-03-16 3D Masked Autoencoding and Pseudo-labeling for Domain Adaptive Segmentation of Heterogeneous Infant Brain MRI Xuzhe Zhang et.al. 2303.09373v1 null
2023-03-16 StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model Zipeng Xu et.al. 2303.09268v1 link
2023-03-16 GridCLIP: One-Stage Object Detection by Grid-Level CLIP Representation Learning Jiayi Lin et.al. 2303.09252v1 null
2023-03-16 Emotional Reaction Intensity Estimation Based on Multimodal Data Shangfei Wang et.al. 2303.09167v1 null
2023-03-15 Deep Learning Weight Pruning with RMT-SVD: Increasing Accuracy and Reducing Overfitting Yitzchak Shmalo et.al. 2303.08986v1 link
2023-03-15 Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement Fartash Faghri et.al. 2303.08983v1 null
2023-03-15 PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining Garrett Thomas et.al. 2303.08789v1 null
2023-03-15 2D and 3D CNN-Based Fusion Approach for COVID-19 Severity Prediction from 3D CT-Scans Fares Bougourzi et.al. 2303.08740v1 link
2023-03-15 Mapping Urban Population Growth from Sentinel-2 MSI and Census Data Using Deep Learning: A Case Study in Kigali, Rwanda Sebastian Hafner et.al. 2303.08511v1 link
2023-03-15 Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification Honglin Li et.al. 2303.08446v1 null
2023-03-15 Lana: A Language-Capable Navigator for Instruction Following and Generation Xiaohan Wang et.al. 2303.08409v1 link
2023-03-15 SegPrompt: Using Segmentation Map as a Better Prompt to Finetune Deep Models for Kidney Stone Classification Wei Zhu et.al. 2303.08303v1 null
2023-03-14 Contextualized Medication Information Extraction Using Transformer-based Deep Learning Architectures Aokun Chen et.al. 2303.08259v1 null
2023-03-14 Diversity-Aware Meta Visual Prompting Qidong Huang et.al. 2303.08138v1 link
2023-03-15 Eliciting Latent Predictions from Transformers with the Tuned Lens Nora Belrose et.al. 2303.08112v2 link
2023-03-14 Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection Jinchao Li et.al. 2303.08019v1 null
2023-03-14 A Theory of Emergent In-Context Learning as Implicit Structure Induction Michael Hahn et.al. 2303.07971v1 null
2023-03-14 Edit-A-Video: Single Video Editing with Object-Aware Consistency Chaehun Shin et.al. 2303.07945v1 null
2023-03-15 Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation Junyoung Seo et.al. 2303.07937v2 null
2023-03-14 The Learnability of In-Context Learning Noam Wies et.al. 2303.07895v1 null
2023-03-14 Geolocation Predicting of Tweets Using BERT-Based Models Kateryna Lutsai et.al. 2303.07865v1 null
2023-03-14 Feature representations useful for predicting image memorability Takumi Harada et.al. 2303.07679v1 null
2023-03-14 Variation of Gender Biases in Visual Recognition Models Before and After Finetuning Jaspreet Ranjit et.al. 2303.07615v1 null
2023-03-13 Model-tuning Via Prompts Makes NLP Models Adversarially Robust Mrigank Raman et.al. 2303.07320v1 null
2023-03-13 Vision-Language Models as Success Detectors Yuqing Du et.al. 2303.07280v1 null
2023-03-13 InferFix: End-to-End Program Repair with LLMs Matthew Jin et.al. 2303.07263v1 null
2023-03-13 PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents Weixiong Lin et.al. 2303.07240v1 null
2023-03-13 AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments Hao Wen et.al. 2303.07129v1 null
2023-03-13 Generating multiple-choice questions for medical question answering with distractors and cue-masking Damien Sileo et.al. 2303.07069v1 null
2023-03-14 Pretrained ViTs Yield Versatile Representations For Medical Images Christos Matsoukas et.al. 2303.07034v2 link
2023-03-13 Self-supervised based general laboratory progress pretrained model for cardiovascular event detection Li-Chin Chen et.al. 2303.06980v1 null
2023-03-14 Uni-RXN: A Unified Framework Bridging the Gap between Chemical Reaction Pretraining and Conditional Molecule Generation Bo Qiang et.al. 2303.06965v2 link
2023-03-13 Contextually-rich human affect perception using multimodal scene information Digbalay Bose et.al. 2303.06904v1 link
2023-03-10 Rewarding Chatbots for Real-World Engagement with Millions of Users Robert Irvine et.al. 2303.06135v1 null
2023-03-13 Improving Domain-Invariance in Self-Supervised Learning via Batch Styles Standardization Marin Scalbert et.al. 2303.06088v2 null
2023-03-10 MVImgNet: A Large-scale Dataset of Multi-view Images Xianggang Yu et.al. 2303.06042v1 null
2023-03-10 Marginalia and machine learning: Handwritten text recognition for Marginalia Collections Adam Axelsson et.al. 2303.05929v1 link
2023-03-10 Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection Luting Wang et.al. 2303.05892v1 null
2023-03-10 3D Masked Autoencoders with Application to Anomaly Detection in Non-Contrast Enhanced Breast MRI Daniel M. Lang et.al. 2303.05861v1 null
2023-03-10 Contrastive Language-Image Pretrained (CLIP) Models are Powerful Out-of-Distribution Detectors Felix Michels et.al. 2303.05828v1 null
2023-03-10 Scaling Up 3D Kernels with Bayesian Frequency Re-parameterization for Medical Image Segmentation Ho Hin Lee et.al. 2303.05785v1 null
2023-03-10 CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment Jiangbin Zheng et.al. 2303.05725v1 null
2023-03-10 MuLTI: Efficient Video-and-Language Understanding with MultiWay-Sampler and Multiple Choice Modeling Jiaqi Xu et.al. 2303.05707v1 null
2023-03-09 FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning Kazi Injamamul Haque et.al. 2303.05416v1 link
2023-03-09 Greener yet Powerful: Taming Large Code Generation Models with Quantization Xiaokai Wei et.al. 2303.05378v1 null
2023-03-09 Can a Frozen Pretrained Language Model be used for Zero-shot Neural Retrieval on Entity-centric Questions? Yasuto Hoshi et.al. 2303.05153v1 null
2023-03-08 Enhancing Low-resolution Face Recognition with Feature Similarity Knowledge Distillation Sungho Shin et.al. 2303.04681v1 null
2023-03-08 Aberration-Aware Depth-from-Focus Xinge Yang et.al. 2303.04654v1 null
2023-03-08 FastSurf: Fast Neural RGB-D Surface Reconstruction using Per-Frame Intrinsic Refinement and TSDF Fusion Prior Learning Seunghwan Lee et.al. 2303.04508v1 null
2023-03-08 Onsets and Velocities: Affordable Real-Time Piano Transcription Using Convolutional Neural Networks Andres Fernandez et.al. 2303.04485v1 link
2023-03-07 PSDNet: Determination of Particle Size Distributions Using Synthetic Soil Images and Convolutional Neural Networks Javad Manashti et.al. 2303.04269v1 null
2023-03-07 Comparing PSDNet, pretrained networks, and traditional feature extraction for predicting the particle size distribution of granular materials from photographs Javad Manashti et.al. 2303.04265v1 null
2023-03-09 Patch of Invisibility: Naturalistic Black-Box Adversarial Attacks on Object Detectors Raz Lapid et.al. 2303.04238v2 null
2023-03-07 Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models? Boris Knyazev et.al. 2303.04143v1 link
2023-03-07 Foundation Models for Decision Making: Problems, Methods, and Opportunities Sherry Yang et.al. 2303.04129v1 null
2023-03-07 CroCoSum: A Benchmark Dataset for Cross-Lingual Code-Switched Summarization Ruochen Zhang et.al. 2303.04092v1 null
2023-03-07 Larger language models do in-context learning differently Jerry Wei et.al. 2303.03846v1 null
2023-03-07 Lformer: Text-to-Image Generation with L-shape Block Parallel Decoding Jiacheng Li et.al. 2303.03800v1 null
2023-03-07 Prediction of transonic flow over supercritical airfoils using geometric-encoding and deep-learning strategies Zhiwen Deng et.al. 2303.03695v1 null
2023-03-07 AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer Kang Li et.al. 2303.03689v1 null
2023-03-06 Structured Kernel Estimation for Photon-Limited Deconvolution Yash Sanghvi et.al. 2303.03472v1 link
2023-03-06 CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning Hritik Bansal et.al. 2303.03323v1 null
2023-03-06 IPA-CLIP: Integrating Phonetic Priors into Vision and Language Pretraining Chihaya Matsuhira et.al. 2303.03144v1 null
2023-03-06 ST-KeyS: Self-Supervised Transformer for Keyword Spotting in Historical Handwritten Documents Sana Khamekhem Jemni et.al. 2303.03127v1 null
2023-03-06 HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention Shijie Geng et.al. 2303.02995v1 link
2023-03-06 Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning Zhen Wang et.al. 2303.02861v1 null
2023-03-05 Learning to Localize in Unseen Scenes with Relative Pose Regressors Ofer Idan et.al. 2303.02717v1 link
2023-03-05 Knowledge-Enhanced Semi-Supervised Federated Learning for Aggregating Heterogeneous Lightweight Clients in IoT Jiaqi Wang et.al. 2303.02668v1 null
2023-03-05 Comparative study of Transformer and LSTM Network with attention mechanism on Image Captioning Pranav Dandwate et.al. 2303.02648v1 null
2023-03-05 Effectiveness of Data Augmentation for Prefix Tuning with Limited Data Stephen Obadinma et.al. 2303.02577v1 null
2023-03-04 CapDet: Unifying Dense Captioning and Open-World Detection Pretraining Yanxin Long et.al. 2303.02489v1 null
2023-03-03 Data-Efficient Training of CNNs and Transformers with Coresets: A Stability Perspective Animesh Gupta et.al. 2303.02095v1 link
2023-03-03 Bespoke: A Block-Level Neural Network Optimization Framework for Low-Cost Deployment Jong-Ryul Lee et.al. 2303.01913v1 null
2023-03-03 Word-As-Image for Semantic Typography Shir Iluz et.al. 2303.01818v1 null
2023-03-03 Team Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task Strategies for Genre and Framing Detection in Online News Yuta Koreeda et.al. 2303.01794v1 null
2023-03-03 DeepfakeMAE: Facial Part Consistency Aware Masked Autoencoder for Deepfake Video Detection Juan Hu et.al. 2303.01740v1 null
2023-03-02 Hierarchical discriminative learning improves visual representations of biomedical microscopy Cheng Jiang et.al. 2303.01605v1 null
2023-03-02 Evolutionary Augmentation Policy Optimization for Self-supervised Learning Noah Barrett et.al. 2303.01584v1 null
2023-03-02 On the Provable Advantage of Unsupervised Pretraining Jiawei Ge et.al. 2303.01566v1 null
2023-03-02 Transferring Models Trained on Natural Images to 3D MRI via Position Encoded Slice Models Umang Gupta et.al. 2303.01491v1 link
2023-03-02 Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning Bo Wan et.al. 2303.01313v1 null
2023-03-02 MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering Jingjing Jiang et.al. 2303.01239v1 link
2023-03-02 FlowFormer++: Masked Cost Volume Autoencoding for Pretraining Optical Flow Estimation Xiaoyu Shi et.al. 2303.01237v1 null
2023-03-02 Unsupervised Meta-Learning via Few-shot Pseudo-supervised Contrastive Learning Huiwon Jang et.al. 2303.00996v1 link
2023-03-02 Learning to Grow Pretrained Models for Efficient Transformer Training Peihao Wang et.al. 2303.00980v1 null
2023-03-02 Image Labels Are All You Need for Coarse Seagrass Segmentation Scarlett Raine et.al. 2303.00973v1 link
2023-03-02 Enhancing General Face Forgery Detection via Vision Transformer with Low-Rank Adaptation Chenqi Kong et.al. 2303.00917v1 null
2023-03-02 Large-Scale Domain-Specific Pretraining for Biomedical Vision-Language Processing Sheng Zhang et.al. 2303.00915v1 null
2023-03-01 RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training Zheng Yuan et.al. 2303.00534v1 null
2023-03-01 Competence-Based Analysis of Language Models Adam Davies et.al. 2303.00333v1 null
2023-03-01 Speeding Up EfficientNet: Selecting Update Blocks of Convolutional Neural Networks using Genetic Algorithm in Transfer Learning Md. Mehedi Hasana et.al. 2303.00261v1 null
2023-02-28 In-Context Instruction Learning Seonghyeon Ye et.al. 2302.14691v1 link
2023-02-28 Fast as CHITA: Neural Network Pruning with Combinatorial Optimization Riade Benbaki et.al. 2302.14623v1 null
2023-02-28 IQ-Flow: Mechanism Design for Inducing Cooperative Behavior to Self-Interested Agents in Sequential Social Dilemmas Bengisu Guresti et.al. 2302.14604v1 link
2023-02-28 BrainBERT: Self-supervised representation learning for intracranial recordings Christopher Wang et.al. 2302.14367v1 link
2023-03-01 Turning a CLIP Model into a Scene Text Detector Wenwen Yu et.al. 2302.14338v2 null
2023-02-28 Remote Sensing Scene Classification with Masked Image Modeling (MIM) Liya Wang et.al. 2302.14256v1 null
2023-02-28 CHGNet: Pretrained universal neural network potential for charge-informed atomistic modeling Bowen Deng et.al. 2302.14231v1 link
2023-02-28 Weighted Sampling for Masked Language Modeling Linhan Zhang et.al. 2302.14225v1 null
2023-02-28 Are Character-level Translations Worth the Wait? An Extensive Comparison of Character- and Subword-level Models for Machine Translation Lukas Edman et.al. 2302.14220v1 null
2023-02-27 Linear pretraining in recurrent mixture density networks Hubert Normandin-Taillon et.al. 2302.14141v1 null
2023-02-27 EDMAE: An Efficient Decoupled Masked Autoencoder for Standard View Identification in Pediatric Echocardiography Yiman Liu et.al. 2302.13869v1 null
2023-02-27 Contrastive Video Question Answering via Video Graph Transformer Junbin Xiao et.al. 2302.13668v1 null
2023-02-27 Using Auxiliary Tasks In Multimodal Fusion Of Wav2vec 2.0 And BERT For Multimodal Emotion Recognition Dekai Sun et.al. 2302.13661v1 null
2023-02-27 Pretraining De-Biased Language Model with Large-scale Click Logs for Document Ranking Xiangsheng Li et.al. 2302.13498v1 null
2023-02-27 FedCLIP: Fast Generalization and Personalization for CLIP in Federated Learning Wang Lu et.al. 2302.13485v1 link
2023-02-26 Improving Representational Continuity via Continued Pretraining Michael Sun et.al. 2302.13289v1 null
2023-02-25 Toward Fairness in Text Generation via Mutual Information Minimization based on Importance Sampling Rui Wang et.al. 2302.13136v1 null
2023-02-25 Self-Supervised and Supervised Deep Learning for PET Image Reconstruction Andrew J. Reader et.al. 2302.13086v1 null
2023-02-24 Modulating Pretrained Diffusion Models for Multimodal Image Synthesis Cusuh Ham et.al. 2302.12764v1 null
2023-02-27 Language Models are Few-shot Learners for Prognostic Prediction Zekai Chen et.al. 2302.12692v2 null
2023-02-24 A Knowledge Distillation framework for Multi-Organ Segmentation of Medaka Fish in Tomographic Image Jwalin Bhatt et.al. 2302.12562v1 null
2023-02-23 Dynamic Benchmarking of Masked Language Models on Temporal Concept Drift with Multiple Views Katerina Margatina et.al. 2302.12297v1 null
2023-02-23 Improving Adaptive Conformal Prediction Using Self-Supervised Learning Nabeel Seedat et.al. 2302.12238v1 link
2023-02-23 Uncertainty Guided Ensemble Self-Training for Semi-Supervised Global Field Reconstruction Yunyang Zhang et.al. 2302.11940v1 link
2023-02-23 Power Time Series Forecasting by Pretrained LM Tian Zhou et.al. 2302.11939v1 null
2023-02-23 A framework for benchmarking class-out-of-distribution detection and its application to ImageNet Ido Galil et.al. 2302.11893v1 link
2023-02-23 What Can We Learn From The Selective Prediction And Uncertainty Estimation Performance Of 523 Imagenet Classifiers Ido Galil et.al. 2302.11874v1 link
2023-02-23 Detachedly Learn a Classifier for Class-Incremental Learning Ziheng Li et.al. 2302.11730v1 null
2023-02-23 An efficient method for Out-of-Distribution Detection Mingyu Xu et.al. 2302.11716v1 null
2023-02-22 ACE: Zero-Shot Image to Image Translation via Pretrained Auto-Contrastive-Encoder Sihan Xu et.al. 2302.11705v1 link
2023-02-22 Lang2LTL: Translating Natural Language Commands to Temporal Robot Task Specification Jason Xinyu Liu et.al. 2302.11649v1 null
2023-02-22 Transfer Learning Enhanced Full Waveform Inversion Stefan Kollmannsberger et.al. 2302.11259v1 null
2023-02-22 Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification Meng Liu et.al. 2302.11254v1 link
2023-02-22 Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities Hexiang Hu et.al. 2302.11154v1 null
2023-02-21 A Reinforcement Learning Framework for Online Speaker Diarization Baihan Lin et.al. 2302.10924v1 null
2023-02-21 On Calibrating Diffusion Probabilistic Models Tianyu Pang et.al. 2302.10688v1 link
2023-02-21 Playing the Werewolf game with artificial intelligence for language understanding Hisaichi Shibata et.al. 2302.10646v1 null
2023-02-21 Few-Shot Point Cloud Semantic Segmentation via Contrastive Self-Supervision and Multi-Resolution Attention Jiahui Wang et.al. 2302.10501v1 null
2023-02-20 Towards Universal Fake Image Detectors that Generalize Across Generative Models Utkarsh Ojha et.al. 2302.10174v1 null
2023-02-20 Cross-domain Compositing with Pretrained Diffusion Models Roy Hachnochi et.al. 2302.10167v1 link
2023-02-20 Progressive Knowledge Distillation: Building Ensembles for Efficient Inference Don Kurian Dennis et.al. 2302.10093v1 null
2023-02-20 Domain-Specific Pretraining Improves Confidence in Whole Slide Image Classification Soham Rohit Chitnis et.al. 2302.09833v1 null
2023-02-20 Composer: Creative and Controllable Image Synthesis with Composable Conditions Lianghua Huang et.al. 2302.09778v1 null
2023-02-19 Text Classification in the Wild: a Large-scale Long-tailed Name Normalization Dataset Jiexing Qi et.al. 2302.09509v1 link
2023-02-19 Self-supervised Cloth Reconstruction via Action-conditioned Cloth Tracking Zixuan Huang et.al. 2302.09502v1 null
2023-02-19 Why Is Public Pretraining Necessary for Private Model Training? Arun Ganesh et.al. 2302.09483v1 null
2023-02-19 Learning Language Representations with Logical Inductive Bias Jianshu Chen et.al. 2302.09458v1 null
2023-02-18 A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT Ce Zhou et.al. 2302.09419v1 null
2023-02-17 Privately Customizing Prefinetuning to Better Match User Data in Federated Learning Charlie Hou et.al. 2302.09042v1 null
2023-02-16 LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation Jiaxin Cheng et.al. 2302.08908v1 null
2023-02-16 Pretraining Language Models with Human Preferences Tomasz Korbak et.al. 2302.08582v1 null
2023-02-16 Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies Emma Sarfati et.al. 2302.08427v1 null
2023-02-16 Retrieval-augmented Image Captioning Rita Ramos et.al. 2302.08268v1 link
2023-02-16 Do We Still Need Clinical Language Models? Eric Lehman et.al. 2302.08091v1 null
2023-02-16 Shared Microexponents: A Little Shifting Goes a Long Way Bita Rouhani et.al. 2302.08007v1 null
2023-02-15 The Expressive Power of Tuning Only the Norm Layers Angeliki Giannou et.al. 2302.07937v1 null
2023-02-15 Commonsense Reasoning for Conversational AI: A Survey of the State of the Art Christopher Richardson et.al. 2302.07926v1 null
2023-02-15 Meeting the Needs of Low-Resource Languages: The Value of Automatic Alignments via Pretrained Models Abteen Ebrahimi et.al. 2302.07912v1 null
2023-02-14 Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models Shrimai Prabhumoye et.al. 2302.07388v1 null
2023-02-14 AutoBiasTest: Controllable Sentence Generation for Automated and Open-Ended Social Bias Testing in Language Models Rafal Kocielnik et.al. 2302.07371v1 null
2023-02-14 READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises Chenglei Si et.al. 2302.07324v1 link
2022-12-02 ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning Shachar Don-Yehiya et.al. 2212.01378v1 null
2022-12-02 DWRSeg: Dilation-wise Residual Network for Real-time Semantic Segmentation Haoran Wei et.al. 2212.01173v1 null
2022-12-02 Tackling Low-Resourced Sign Language Translation: UPC at WMT-SLT 22 Laia Tarrés et.al. 2212.01140v1 null
2022-12-02 Masked Contrastive Pre-Training for Efficient Video-Text Retrieval Fangxun Shu et.al. 2212.00986v1 null
2022-12-01 Weakly Supervised Annotations for Multi-modal Greeting Cards Dataset Sidra Hanif et.al. 2212.00847v1 null
2022-12-01 Navigating an Ocean of Video Data: Deep Learning for Humpback Whale Classification in YouTube Videos Michelle Ramirez et.al. 2212.00822v1 null
2022-12-01 Sparsity Agnostic Depth Completion Andrea Conti et.al. 2212.00790v1 null
2022-12-01 Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation Haochen Wang et.al. 2212.00774v1 link
2022-12-01 Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis Odysseas S. Chlapanis et.al. 2212.00678v1 null
2022-12-01 Hyperbolic Contrastive Learning for Visual Representations beyond Objects Songwei Ge et.al. 2212.00653v1 link
2022-12-01 Finetune like you pretrain: Improved finetuning of zero-shot vision models Sachin Goyal et.al. 2212.00638v1 link
2022-12-01 Language models and brain alignment: beyond word-level semantics and prediction Gabriele Merlin et.al. 2212.00596v1 null
2022-12-01 IRRGN: An Implicit Relational Reasoning Graph Network for Multi-turn Response Selection Jingcheng Deng et.al. 2212.00482v1 link
2022-12-01 Localization vs. Semantics: How Can Language Benefit Visual Representation Learning? Zhuowan Li et.al. 2212.00281v1 null
2022-11-30 AIO-P: Expanding Neural Performance Predictors Beyond Image Classification Keith G. Mills et.al. 2211.17228v1 link
2022-11-30 GENNAPE: Towards Generalized Neural Architecture Performance Estimators Keith G. Mills et.al. 2211.17226v1 link
2022-11-30 Topological Data Analysis for Speech Processing Eduard Tulchinskii et.al. 2211.17223v1 null
2022-11-30 ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT Rui Pan et.al. 2211.17201v1 link
2022-11-30 Learning Label Modular Prompts for Text Classification in the Wild Hailin Chen et.al. 2211.17142v1 null
2022-11-30 BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch? Joel Niklaus et.al. 2211.17135v1 null
2022-11-30 Quadapter: Adapter for GPT-2 Quantization Minseop Park et.al. 2211.16912v1 null
2022-11-30 Geoclidean: Few-Shot Generalization in Euclidean Geometry Joy Hsu et.al. 2211.16663v1 link
2022-11-29 Exploiting Category Names for Few-Shot Classification with Vision-Language Models Taihong Xiao et.al. 2211.16594v1 null
2022-11-29 BARTSmiles: Generative Masked Language Models for Molecular Representations Gayane Chilingaryan et.al. 2211.16349v1 link
2022-11-29 Few-shot Query-Focused Summarization with Prefix-Merging Ruifeng Yuan et.al. 2211.16164v1 null
2022-11-29 Better Generalized Few-Shot Learning Even Without Base Data Seongwoong Kim et.al. 2211.16095v1 null
2022-11-29 Syntactic Substitutability as Unsupervised Dependency Syntax Jasper Jian et.al. 2211.16031v1 null
2022-11-29 MoDA: Map style transfer for self-supervised Domain Adaptation of embodied agents Eun Sun Lee et.al. 2211.15992v1 null
2022-11-29 Extending the Subwording Model of Multilingual Pretrained Models for New Languages Kenji Imamura et.al. 2211.15965v1 link
2022-11-29 PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals Zhihao Zhang et.al. 2211.15940v1 null
2022-11-29 ClueWeb22: 10 Billion Web Documents with Rich Information Arnold Overwijk et.al. 2211.15848v1 null
2022-11-28 COVID-19 Classification Using Deep Learning Two-Stage Approach Mostapha Alsaidi et.al. 2211.15817v1 null
2022-11-28 Handling Image and Label Resolution Mismatch in Remote Sensing Scott Workman et.al. 2211.15790v1 null
2022-11-28 Action-GPT: Leveraging Large-scale Language Models for Improved and Generalized Zero Shot Action Generation Sai Shashank Kalakonda et.al. 2211.15603v1 null
2022-11-28 Scientific and Creative Analogies in Pretrained Language Models Tamara Czinczoll et.al. 2211.15268v1 null
2022-11-27 Multi-Modal Few-Shot Temporal Action Detection via Vision-Language Meta-Adaptation Sauradip Nag et.al. 2211.14905v1 null
2022-11-27 Dense Text Retrieval based on Pretrained Language Models: A Survey Wayne Xin Zhao et.al. 2211.14876v1 link
2022-11-27 Detect-Localize-Repair: A Unified Framework for Learning to Debug with CodeT5 Nghi D. Q. Bui et.al. 2211.14875v1 null
2022-11-27 Foiling Explanations in Deep Neural Networks Snir Vitrack Tamam et.al. 2211.14860v1 null
2022-11-27 Training Deep Learning Models for Massive MIMO CSI Feedback with Small Datasets in New Environments Zhenyu Liu et.al. 2211.14785v1 null
2022-11-27 A Theoretical Study of Inductive Biases in Contrastive Learning Jeff Z. HaoChen et.al. 2211.14699v1 null
2022-11-27 DigGAN: Discriminator gradIent Gap Regularization for GAN Training with Limited Data Tiantian Fang et.al. 2211.14694v1 link
2022-11-26 Randomized Conditional Flow Matching for Video Prediction Aram Davtyan et.al. 2211.14575v1 null
2022-11-25 WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction Guillaume Le Moing et.al. 2211.14308v1 null
2022-11-25 PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices Kazuki Osawa et.al. 2211.14133v1 null
2022-11-25 Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization Chen Zhao et.al. 2211.14053v1 null
2022-11-25 Efficient Feature Extraction for High-resolution Video Frame Interpolation Moritz Nottebaum et.al. 2211.14005v1 null
2022-11-25 Learning with Silver Standard Data for Zero-shot Relation Extraction Tianyin Wang et.al. 2211.13883v1 null
2022-11-25 ComCLIP: Training-Free Compositional Image and Text Matching Kenan Jiang et.al. 2211.13854v1 null
2022-11-24 Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt Zhichao Yang et.al. 2211.13813v1 link
2022-11-24 Contrastive pretraining for semantic segmentation is robust to noisy positive pairs Sebastian Gerard et.al. 2211.13756v1 null
2022-11-24 Sketch-Guided Text-to-Image Diffusion Models Andrey Voynov et.al. 2211.13752v1 null
2022-11-24 Prototypical Fine-tuning: Towards Robust Performance Under Varying Data Sizes Yiqiao Jin et.al. 2211.13638v1 null
2022-11-23 ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation Sara Atito et.al. 2211.13189v1 null
2022-11-22 Pyrocast: a Machine Learning Pipeline to Forecast Pyrocumulonimbus (PyroCb) Clouds Kenza Tazi et.al. 2211.13052v1 null
2022-11-23 Can we Adopt Self-supervised Pretraining for Chest X-Rays? Arsh Verma et.al. 2211.12931v1 null
2022-11-23 An ensemble of VisNet, Transformer-M, and pretraining models for molecular property prediction in OGB Large-Scale Challenge @ NeurIPS 2022 Yusong Wang et.al. 2211.12791v1 null
2022-11-23 Masked Autoencoding for Scalable and Generalizable Decision Making Fangchen Liu et.al. 2211.12740v1 link
2022-11-23 FRE: A Fast Method For Anomaly Detection And Segmentation Ibrahima Ndiour et.al. 2211.12650v1 null
2022-11-22 Retrieval-Augmented Multimodal Language Modeling Michihiro Yasunaga et.al. 2211.12561v1 null
2022-11-22 EDICT: Exact Diffusion Inversion via Coupled Transformations Bram Wallace et.al. 2211.12446v1 null
2022-11-21 LoopDA: Constructing Self-loops to Adapt Nighttime Semantic Segmentation Fengyi Shen et.al. 2211.11870v1 link
2022-11-22 Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models Ted Xiao et.al. 2211.11736v2 null
2022-11-22 Multitask Vision-Language Prompt Tuning Sheng Shen et.al. 2211.11720v2 link
2022-11-21 Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention Zineng Tang et.al. 2211.11701v1 link
2022-11-21 Understanding and Improving Visual Prompting: A Label-Mapping Perspective Aochuan Chen et.al. 2211.11635v1 link
2022-11-21 Deanthropomorphising NLP: Can a Language Model Be Conscious? Matthew Shardlow et.al. 2211.11483v1 null
2022-11-21 CBEAF-Adapting: Enhanced Continual Pretraining for Building Chinese Biomedical Language Model Yongyu Yan et.al. 2211.11363v1 null
2022-11-21 VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models Ajay Jain et.al. 2211.11319v1 null
2022-11-21 Spatiotemporal Modeling of Multivariate Signals With Graph Neural Networks and Structured State Space Models Siyi Tang et.al. 2211.11176v1 link
2022-11-21 Unifying Vision-Language Representation Space with Single-tower Transformer Jiho Jang et.al. 2211.11153v1 null
2022-11-21 Doubly Contrastive End-to-End Semantic Segmentation for Autonomous Driving under Adverse Weather Jongoh Jeong et.al. 2211.11131v1 null
2022-11-18 Context Variance Evaluation of Pretrained Language Models for Prompt-based Biomedical Knowledge Probing Zonghai Yao et.al. 2211.10265v1 null
2022-11-18 3d human motion generation from the text via gesture action classification and the autoregressive model Gwantae Kim et.al. 2211.10003v1 null
2022-11-17 SAR-based landslide classification pretraining leads to better segmentation Vanessa Böhm et.al. 2211.09927v1 link
2022-11-17 SPACEx: Speech-driven Portrait Animation with Controllable Expression Siddharth Gururani et.al. 2211.09809v1 null
2022-11-17 InstructPix2Pix: Learning to Follow Image Editing Instructions Tim Brooks et.al. 2211.09800v1 null
2022-11-17 Null-text Inversion for Editing Real Images using Guided Diffusion Models Ron Mokady et.al. 2211.09794v1 null
2022-11-17 MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors Yuang Zhang et.al. 2211.09791v1 link
2022-11-17 Assessing Neural Network Robustness via Adversarial Pivotal Tuning Peter Ebert Christensen et.al. 2211.09782v1 null
2022-11-17 3DLatNav: Navigating Generative Latent Spaces for Semantic-Aware 3D Object Manipulation Amaya Dharmasiri et.al. 2211.09770v1 link
2022-11-17 Style Classification of Rabbinic Literature for Detection of Lost Midrash Tanhuma Material Shlomo Tannor et.al. 2211.09710v1 null
2022-11-17 UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer Kunchang Li et.al. 2211.09552v1 link
2022-11-17 CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge Linli Yao et.al. 2211.09371v1 null
2022-11-16 Technical Report on Neural Language Models and Few-Shot Learning for Systematic Requirements Processing in MDSE Vincent Bertram et.al. 2211.09084v1 null
2022-11-16 Self-supervised Egomotion and Depth Learning via Bi-directional Coarse-to-Fine Scale Recovery Hao Qu et.al. 2211.08904v1 null
2022-11-16 Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations Linlin Liu et.al. 2211.08794v1 null
2022-11-15 Dynamic-Pix2Pix: Noise Injected cGAN for Modeling Input and Target Domain Joint Distributions with Limited Training Data Mohammadreza Naderi et.al. 2211.08570v1 null
2022-11-15 A Point in the Right Direction: Vector Prediction for Spatially-aware Self-supervised Volumetric Representation Learning Yejia Zhang et.al. 2211.08533v1 null
2022-11-15 On the Compositional Generalization Gap of In-Context Learning Arian Hosseini et.al. 2211.08473v1 null
2022-11-15 Mechanistic Mode Connectivity Ekdeep Singh Lubana et.al. 2211.08422v1 link
2022-11-15 Towards an objective characterization of an individual's facial movements using Self-Supervised Person-Specific-Models Yanis Tazi et.al. 2211.08279v1 link
2022-11-15 A Universal Discriminator for Zero-Shot Generalization Haike Xu et.al. 2211.08099v1 null
2022-11-15 Autonomous Golf Putting with Data-Driven and Physics-Based Methods Annika Junker et.al. 2211.08081v1 null
2022-11-15 Multilingual and Multimodal Topic Modelling with Pretrained Embeddings Elaine Zosa et.al. 2211.08057v1 link
2022-11-15 Persian Emotion Detection using ParsBERT and Imbalanced Data Handling Approaches Amirhossein Abaskohi et.al. 2211.08029v1 null
2022-11-15 Contextual Transformer for Offline Meta Reinforcement Learning Runji Lin et.al. 2211.08016v1 null
2022-11-15 DIGEST: Deeply supervIsed knowledGE tranSfer neTwork learning for brain tumor segmentation with incomplete multi-modal MRI scans Haoran Li et.al. 2211.07993v1 null
2022-11-15 NeRFFaceEditing: Disentangled Face Editing in Neural Radiance Fields Kaiwen Jiang et.al. 2211.07968v1 null
2022-11-15 False: False Negative Samples Aware Contrastive Learning for Semantic Segmentation of High-Resolution Remote Sensing Image Zhaoyang Zhang et.al. 2211.07928v1 null
2022-11-14 Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures Gal Metzer et.al. 2211.07600v1 link
2022-11-14 Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations Swarnadeep Saha et.al. 2211.07517v1 link
2022-11-14 Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification Juan I. Pisula et.al. 2211.07384v1 link
2022-11-14 Fast Text-Conditional Discrete Denoising on Vector-Quantized Latent Spaces Dominic Rampas et.al. 2211.07292v1 link
2022-11-14 The Role of Local Alignment and Uniformity in Image-Text Contrastive Learning on Medical Images Philip Müller et.al. 2211.07254v1 null
2022-11-14 Towards A Unified Conformer Structure: from ASR to ASV Task Dexin Liao et.al. 2211.07201v1 link
2022-11-14 Unsupervised Galaxy Morphological Visual Representation with Deep Contrastive Learning Shoulin Wei et.al. 2211.07168v1 null
2022-11-14 Information-guided pixel augmentation for pixel-wise contrastive learning Quan Quan et.al. 2211.07118v1 null
2022-11-14 BiViT: Extremely Compressed Binary Vision Transformer Yefei He et.al. 2211.07091v1 null
2022-11-14 SPE: Symmetrical Prompt Enhancement for Fact Probing Yiyuan Li et.al. 2211.07078v1 null
2022-11-11 Physically-Based Face Rendering for NIR-VIS Face Recognition Yunqi Miao et.al. 2211.06408v1 link
2022-11-11 From Competition to Collaboration: Making Toy Datasets on Kaggle Clinically Useful for Chest X-Ray Diagnosis Using Federated Learning Pranav Kulkarni et.al. 2211.06212v1 null
2022-11-11 How Much Hate with #china? A Preliminary Analysis on China-related Hateful Tweets Two Years After the Covid Pandemic Began Jinghua Xu et.al. 2211.06116v1 link
2022-11-11 Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks Hyolim Kang et.al. 2211.06023v1 null
2022-11-10 Measuring Reliability of Large Language Models through Semantic Consistency Harsh Raj et.al. 2211.05853v1 null
2022-11-10 Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control Xiang Fan et.al. 2211.05750v1 link
2022-11-10 3D-CSL: self-supervised 3D context similarity learning for Near-Duplicate Video Retrieval Rui Deng et.al. 2211.05352v1 null
2022-11-09 Flaky Performances when Pretraining on Relational Databases Shengchao Liu et.al. 2211.05213v1 null
2022-11-09 An Empirical Study on Clustering Pretrained Embeddings: Is Deep Strictly Better? Tyler R. Scott et.al. 2211.05183v1 null
2022-11-09 Large Language Models with Controllable Working Memory Daliang Li et.al. 2211.05110v1 null
2022-11-09 Detecting Languages Unintelligible to Multilingual Models through Local Structure Probes Louis Clouâtre et.al. 2211.05015v1 null
2022-11-09 Optimized Global Perturbation Attacks For Brain Tumour ROI Extraction From Binary Classification Models Sajith Rajapaksa et.al. 2211.04926v1 null
2022-11-09 Foundation Models for Semantic Novelty in Reinforcement Learning Tarun Gupta et.al. 2211.04878v1 null
2022-11-09 Masked Vision-Language Transformers for Scene Text Recognition Jie Wu et.al. 2211.04785v1 link
2022-11-08 Reducing Down(stream)time: Pretraining Molecular GNNs using Heterogeneous AI Accelerators Jenna A. Bilbrey et.al. 2211.04598v1 link
2022-11-08 Unsupervised Domain Adaptation for Sparse Retrieval by Filling Vocabulary and Word Frequency Gaps Hiroki Iida et.al. 2211.03988v1 null
2022-11-08 Parameter and Data Efficient Continual Pre-training for Robustness to Dialectal Variance in Arabic Soumajyoti Sarkar et.al. 2211.03966v1 null
2022-11-08 Pretraining in Deep Reinforcement Learning: A Survey Zhihui Xie et.al. 2211.03959v1 null
2022-11-08 From fat droplets to floating forests: cross-domain transfer learning using a PatchGAN-based segmentation model Kameswara Bharadwaj Mantha et.al. 2211.03937v1 null
2022-11-07 ToDD: Topological Compound Fingerprinting in Computer-Aided Drug Discovery Andac Demir et.al. 2211.03808v1 null
2022-11-08 Generalizable Re-Identification from Videos with Cycle Association Zhongdao Wang et.al. 2211.03663v2 null
2022-11-07 SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes Libo Sun et.al. 2211.03660v1 link
2022-11-07 Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining Qiang Chen et.al. 2211.03594v1 null
2022-11-07 ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech Xiaoran Fan et.al. 2211.03545v1 link
2022-11-07 C3PO: Learning to Achieve Arbitrary Goals via Massively Entropic Pretraining Alexis Jacq et.al. 2211.03521v1 null
2022-11-07 How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers Michael Hassid et.al. 2211.03495v1 link
2022-11-07 Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization Zhengkun Tian et.al. 2211.03284v1 null
2022-11-07 AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages Bonaventure F. P. Dossou et.al. 2211.03263v1 link
2022-11-06 On the Domain Adaptation and Generalization of Pretrained Language Models: A Survey Xu Guo et.al. 2211.03154v1 null
2022-11-06 Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning Yu Meng et.al. 2211.03044v1 link
2022-11-04 SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers A. Arezzo et.al. 2211.02366v1 link
2022-11-04 Logits are predictive of network type Ali Borji et.al. 2211.02272v1 link
2022-11-04 Late Fusion with Triplet Margin Objective for Multimodal Ideology Prediction and Analysis Changyuan Qiu et.al. 2211.02269v1 null
2022-11-03 Zero-shot Video Moment Retrieval With Off-the-Shelf Models Anuj Diwan et.al. 2211.02178v1 null
2022-11-03 Could Giant Pretrained Image Models Extract Universal Representations? Yutong Lin et.al. 2211.02043v1 null
2022-11-03 Contextual information integration for stance detection via cross-attention Tilman Beck et.al. 2211.01874v1 null
2022-11-03 Crosslingual Generalization through Multitask Finetuning Niklas Muennighoff et.al. 2211.01786v1 link
2022-11-03 Exploring the State-of-the-Art Language Modeling Methods and Data Augmentation Techniques for Multilingual Clause-Level Morphology Emre Can Acikgoz et.al. 2211.01736v1 link
2022-11-03 Channel-Aware Pretraining of Joint Encoder-Decoder Self-Supervised Model for Telephonic-Speech ASR Vrunda N. Sukhadia et.al. 2211.01669v1 null
2022-11-03 Revisiting Grammatical Error Correction Evaluation and Beyond Peiyuan Gong et.al. 2211.01635v1 null
2022-11-03 Robust Few-shot Learning Without Using any Adversarial Samples Gaurav Kumar Nayak et.al. 2211.01598v1 link
2022-11-03 PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales Peifeng Wang et.al. 2211.01562v1 null
2022-11-03 Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions Shuhao Gu et.al. 2211.01542v1 link
2022-11-02 On the Informativeness of Supervision Signals Ilia Sucholutsky et.al. 2211.01407v1 null
2022-11-03 Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese An Yang et.al. 2211.01335v2 link
2022-11-02 RegCLR: A Self-Supervised Framework for Tabular Representation Learning in the Wild Weiyao Wang et.al. 2211.01165v1 null
2022-11-02 Intermediate Fine-Tuning Using Imperfect Synthetic Speech for Improving Electrolaryngeal Speech Recognition Lester Phillip Violeta et.al. 2211.01079v1 null
2022-11-02 Dialect-robust Evaluation of Generated Text Jiao Sun et.al. 2211.00922v1 null
2022-11-02 P$^3$OVD: Fine-grained Visual-Text Prompt-Driven Self-Training for Open-Vocabulary Object Detection Yanxin Long et.al. 2211.00849v1 null
2022-11-01 Preserving In-Context Learning ability in Large Language Model Fine-tuning Yihan Wang et.al. 2211.00635v1 null
2022-11-01 T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5 Chan-Jan Hsu et.al. 2211.00586v1 link
2022-11-01 Modelling black-box audio effects with time-varying feature modulation Marco Comunità et.al. 2211.00497v1 null
2022-11-01 Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features Alexandra Vioni et.al. 2211.00342v1 null
2022-11-01 Speech-text based multi-modal training with bidirectional attention for improved speech recognition Yuhang Yang et.al. 2211.00325v1 link
2022-11-01 DensePure: Understanding Diffusion Models towards Adversarial Robustness Chaowei Xiao et.al. 2211.00322v1 null
2022-11-01 Training Vision-Language Models with Less Bimodal Supervision Elad Segal et.al. 2211.00262v1 link
2022-11-01 End-to-End Optimization and Learning for Multiagent Ensembles James Kotary et.al. 2211.00251v1 null
2022-10-31 A Close Look into the Calibration of Pre-trained Language Models Yangyi Chen et.al. 2211.00151v1 link
2022-10-31 Active Learning of Non-semantic Speech Tasks with Pretrained Models Harlin Lee et.al. 2211.00119v1 null
2022-10-31 Zero-Shot Text Classification with Self-Training Ariel Gera et.al. 2210.17541v1 link
2022-10-31 Domain Curricula for Code-Switched MT at MixMT 2022 Lekan Raheem et.al. 2210.17463v1 null
2022-10-31 Generative Negative Text Replay for Continual Vision-Language Pretraining Shipeng Yan et.al. 2210.17322v1 null
2022-10-31 GPS: Genetic Prompt Search for Efficient Few-shot Learning Hanwei Xu et.al. 2210.17041v1 link
2022-10-30 Improving Bilingual Lexicon Induction with Cross-Encoder Reranking Yaoyiran Li et.al. 2210.16953v1 link
2022-10-30 Transfer Learning with Synthetic Corpora for Spatial Role Labeling and Reasoning Roshanak Mirzaee et.al. 2210.16952v1 null
2022-10-30 ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis Xu Cao et.al. 2210.16943v1 link
2022-10-30 Real-Time MRI Video synthesis from time aligned phonemes with sequence-to-sequence networks Sathvik Udupa et.al. 2210.16881v1 null
2022-10-30 Improved acoustic-to-articulatory inversion using representations from pretrained self-supervised learning models Sathvik Udupa et.al. 2210.16871v1 null
2022-10-30 Parameter-Efficient Tuning Makes a Good Classification Head Zhuoyi Yang et.al. 2210.16771v1 link
2022-10-28 LOFT: Finding Lottery Tickets through Filter-wise Training Qihan Wang et.al. 2210.16169v1 null
2022-10-28 Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning Lisheng Wu et.al. 2210.16058v1 link
2022-10-28 Towards zero-shot Text-based voice editing using acoustic context conditioning, utterance embeddings, and reference encoders Jason Fong et.al. 2210.16045v1 null
2022-10-31 UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance Wei Li et.al. 2210.16031v2 null
2022-10-28 OhMG: Zero-shot Open-vocabulary Human Motion Generation Junfan Lin et.al. 2210.15929v1 null
2022-10-28 Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation Nobuyuki Morioka et.al. 2210.15868v1 null
2022-10-27 An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design Mingjie Liu et.al. 2210.15765v1 null
2022-10-27 Nearest Neighbor Language Models for Stylistic Controllable Generation Severino Trotta et.al. 2210.15762v1 null
2022-10-27 Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech Takaaki Saeki et.al. 2210.15447v1 null
2022-10-26 Efficient Use of Large Pre-Trained Models for Low Resource ASR Peter Vieting et.al. 2210.15445v1 null
2022-10-27 2T-UNET: A Two-Tower UNet with Depth Clues for Robust Stereo Depth Estimation Rohit Choudhary et.al. 2210.15374v1 null
2022-10-27 Drug repositioning for Alzheimer's disease with transfer learning Yetao Wu et.al. 2210.15271v1 null
2022-10-27 Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling Peijie Jiang et.al. 2210.15231v1 null
2022-10-27 COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning Yue Yu et.al. 2210.15212v1 link
2022-10-27 Learning on the Job: Self-Rewarding Offline-to-Online Finetuning for Industrial Insertion of Novel Connectors from Vision Ashvin Nair et.al. 2210.15206v1 null
2022-10-27 Dictionary-Assisted Supervised Contrastive Learning Patrick Y. Wu et.al. 2210.15172v1 link
2022-10-28 Learning Variational Motion Prior for Video-based Motion Capture Xin Chen et.al. 2210.15134v2 null
2022-10-26 Don't Prompt, Search! Mining-based Zero-Shot Learning with Language Models Mozes van de Kar et.al. 2210.14803v1 null
2022-10-26 Pretrained audio neural networks for Speech emotion recognition in Portuguese Marcelo Matheus Gauy et.al. 2210.14716v1 link
2022-10-26 Autoregressive Structured Prediction with Language Models Tianyu Liu et.al. 2210.14698v1 link
2022-10-26 Investigating the Role of Centering Theory in the Context of Neural Coreference Resolution Systems Yuchen Eleanor Jiang et.al. 2210.14678v1 null
2022-10-26 Fast Yet Effective Speech Emotion Recognition with Self-distillation Zhao Ren et.al. 2210.14636v1 null
2022-10-26 Improving Speech-to-Speech Translation Through Unlabeled Text Xuan-Phi Nguyen et.al. 2210.14514v1 null
2022-10-26 AVES: Animal Vocalization Encoder based on Self-Supervision Masato Hagiwara et.al. 2210.14493v1 link
2022-10-26 Leveraging Affirmative Interpretations from Negation Improves Natural Language Understanding Md Mosharaf Hossain et.al. 2210.14486v1 link
2022-10-26 Benchmarking Language Models for Code Syntax Understanding Da Shen et.al. 2210.14473v1 link
2022-10-26 Bi-Link: Bridging Inductive Link Predictions from Text via Contrastive Learning of Transformers and Prompts Bohua Peng et.al. 2210.14463v1 null
2022-10-25 MOFormer: Self-Supervised Transformer model for Metal-Organic Framework Property Prediction Zhonglin Cao et.al. 2210.14188v1 null
2022-10-25 Audio MFCC-gram Transformers for respiratory insufficiency detection in COVID-19 Marcelo Matheus Gauy et.al. 2210.14085v1 link
2022-10-25 Adapitch: Adaption Multi-Speaker Text-to-Speech Conditioned on Pitch Disentangling with Untranscribed Data Xulong Zhang et.al. 2210.13803v1 null
2022-10-25 DEMETR: Diagnosing Evaluation Metrics for Translation Marzena Karpinska et.al. 2210.13746v1 link
2022-10-25 PALT: Parameter-Lite Transfer of Language Models for Knowledge Graph Completion Jianhao Shen et.al. 2210.13715v1 link
2022-10-24 The Robustness Limits of SoTA Vision Models to Natural Variation Mark Ibrahim et.al. 2210.13604v1 null
2022-10-24 Video based Object 6D Pose Estimation using Transformers Apoorva Beedu et.al. 2210.13540v1 **[link](h

About

🎓 Automatically update CV papers daily using GitHub Actions (updated every 24 hours)

Languages

  • Python 100.0%