Multi-modal

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-11-29 | Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark | Joseph Heyward et al. | 2411.19941v1 | null |
| 2024-11-29 | Dynamic EEG-fMRI mapping: Revealing the relationship between brain connectivity and cognitive state | Guiran Liu et al. | 2411.19922v1 | null |
| 2024-11-29 | Handling irresolvable conflicts in the Semantic Web: an RDF-based conflict-tolerant version of the Deontic Traditional Scheme | Livio Robaldo et al. | 2411.19918v1 | link |
| 2024-11-29 | Nonparametric Estimation for a Log-concave Distribution Function with Interval-censored Data | Chi Wing Chu et al. | 2411.19878v1 | null |
| 2024-11-29 | SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection | Philipp Wolters et al. | 2411.19860v1 | null |
| 2024-11-29 | SDR-GNN: Spectral Domain Reconstruction Graph Neural Network for Incomplete Multimodal Learning in Conversational Emotion Recognition | Fangze Fu et al. | 2411.19822v1 | null |
| 2024-11-29 | CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives | Armin Saghafian et al. | 2411.19787v1 | link |
| 2024-11-29 | MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks | Yiming Wu et al. | 2411.19786v1 | null |
| 2024-11-29 | LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos | Tiantian Geng et al. | 2411.19772v1 | null |
| 2024-11-29 | JetFormer: An Autoregressive Generative Model of Raw Images and Text | Michael Tschannen et al. | 2411.19722v1 | null |
| 2024-11-29 | Multimodal Whole Slide Foundation Model for Pathology | Tong Ding et al. | 2411.19666v1 | link |
| 2024-11-29 | Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings | Qiong Wu et al. | 2411.19628v1 | link |
| 2024-11-29 | Self-Supervised Denoiser Framework | Emilien Valat et al. | 2411.19593v1 | null |
| 2024-11-29 | Enhancing AI microscopy for foodborne bacterial classification via adversarial domain adaptation across optical and biological variability | Siddhartha Bhattacharya et al. | 2411.19514v1 | null |
| 2024-11-29 | Interleaved-Modal Chain-of-Thought | Jun Gao et al. | 2411.19488v1 | null |
| 2024-11-29 | Effective Fine-Tuning of Vision-Language Models for Accurate Galaxy Morphology Analysis | Ruoqi Wang et al. | 2411.19475v1 | null |
| 2024-11-29 | Look Every Frame All at Once: Video-Ma$^2$mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing | Hosu Lee et al. | 2411.19460v1 | null |
| 2024-11-29 | Adaptive Interactive Segmentation for Multimodal Medical Imaging via Selection Engine | Zhi Li et al. | 2411.19447v1 | link |
| 2024-11-28 | CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections | Mohamed Fazli Imam et al. | 2411.19346v1 | link |
| 2024-11-28 | Beyond Logit Lens: Contextual Embeddings for Robust Hallucination Detection & Grounding in VLMs | Anirudh Phukan et al. | 2411.19187v1 | null |
| 2024-11-28 | HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos | Prithviraj Banerjee et al. | 2411.19167v1 | null |
| 2024-11-28 | On Moving Object Segmentation from Monocular Video with Transformers | Christian Homeyer et al. | 2411.19141v1 | null |
| 2024-11-28 | Headache to Overstock? Promoting Long-tail Items through Debiased Product Bundling | Shuo Xu et al. | 2411.19107v1 | null |
| 2024-11-28 | PCDreamer: Point Cloud Completion Through Multi-view Diffusion Priors | Guangshun Wei et al. | 2411.19036v1 | null |
| 2024-11-28 | Perception of Visual Content: Differences Between Humans and Foundation Models | Nardiena A. Pratama et al. | 2411.18968v1 | null |
| 2024-11-28 | Second harmonic generation with 48% conversion efficiency from cavity polygon modes in a monocrystalline lithium niobate microdisk resonator | Chao Sun et al. | 2411.18870v1 | null |
| 2024-11-28 | CrossTracker: Robust Multi-modal 3D Multi-Object Tracking via Cross Correction | Lipeng Gu et al. | 2411.18850v1 | null |
| 2024-11-27 | Stratified Non-Negative Tensor Factorization | Alexander Sietsema et al. | 2411.18805v1 | null |
| 2024-11-27 | MRI Breast tissue segmentation using nnU-Net for biomechanical modeling | Melika Pooyan et al. | 2411.18784v1 | null |
| 2024-11-27 | Decoding Non-Linearity and Complexity: Deep Tabular Learning Approaches for Materials Science | Vahid Attari et al. | 2411.18717v1 | null |