2025-02-20 |
LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models |
Shangqing Tu et.al. |
2502.14834v1 |
null |
2025-02-20 |
Fundamental Limitations in Defending LLM Finetuning APIs |
Xander Davies et.al. |
2502.14828v1 |
null |
2025-02-20 |
FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis |
Fadillah Maani et.al. |
2502.14807v1 |
null |
2025-02-20 |
Humanoid-VLA: Towards Universal Humanoid Control with Visual Integration |
Pengxiang Ding et.al. |
2502.14795v1 |
null |
2025-02-20 |
RendBEV: Semantic Novel View Synthesis for Self-Supervised Bird's Eye View Segmentation |
Henrique Piñeiro Monteagudo et.al. |
2502.14792v1 |
null |
2025-02-20 |
Structurally Disentangled Feature Fields Distillation for 3D Understanding and Editing |
Yoel Levy et.al. |
2502.14789v1 |
null |
2025-02-20 |
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features |
Michael Tschannen et.al. |
2502.14786v1 |
null |
2025-02-20 |
ReVision: A Dataset and Baseline VLM for Privacy-Preserving Task-Oriented Visual Instruction Rewriting |
Abhijit Mishra et.al. |
2502.14780v1 |
null |
2025-02-20 |
On the Influence of Context Size and Model Choice in Retrieval-Augmented Generation Systems |
Juraj Vladika et.al. |
2502.14759v1 |
null |
2025-02-20 |
The formation of a soliton gas condensate for the focusing Nonlinear Schrödinger equation |
Aikaterini Gkogkou et.al. |
2502.14749v1 |
null |
2025-02-20 |
EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration |
Minjie Hong et.al. |
2502.14735v1 |
null |
2025-02-20 |
Sentence Smith: Formally Controllable Text Transformation and its Application to Evaluation of Text Embedding Models |
Hongji Li et.al. |
2502.14734v1 |
null |
2025-02-20 |
Multi-dataset synergistic in supervised learning to pre-label structural components in point clouds from shell construction scenes |
Lukas Rauch et.al. |
2502.14721v1 |
null |
2025-02-20 |
From Knowledge Generation to Knowledge Verification: Examining the BioMedical Generative Capabilities of ChatGPT |
Ahmed Abdeen Hamed et.al. |
2502.14714v1 |
null |
2025-02-20 |
SegAug: CTC-Aligned Segmented Augmentation For Robust RNN-Transducer Based Speech Recognition |
Khanh Le et.al. |
2502.14685v1 |
null |
2025-02-20 |
MAGO-SP: Detection and Correction of Water-Fat Swaps in Magnitude-Only VIBE MRI |
Robert Graf et.al. |
2502.14659v1 |
null |
2025-02-20 |
Exploring RWKV for Sentence Embeddings: Layer-wise Analysis and Baseline Comparison for Semantic Similarity |
Xinghan Pan et.al. |
2502.14620v1 |
link |
2025-02-20 |
Monocular Depth Estimation and Segmentation for Transparent Object with Iterative Semantic and Geometric Fusion |
Jiangyuan Liu et.al. |
2502.14616v1 |
null |
2025-02-20 |
Vision Foundation Models in Medical Image Analysis: Advances and Challenges |
Pengchen Liang et.al. |
2502.14584v1 |
null |
2025-02-20 |
Dynamic Preference-based Multi-modal Trip Planning of Public Transport and Shared Mobility |
Yimeng Zhang et.al. |
2502.14528v1 |
null |
2025-02-20 |
Learning Temporal 3D Semantic Scene Completion via Optical Flow Guidance |
Meng Wang et.al. |
2502.14520v1 |
null |
2025-02-20 |
Stories that (are) Move(d by) Markets: A Causal Exploration of Market Shocks and Semantic Shifts across Different Partisan Groups |
Felix Drinkall et.al. |
2502.14497v1 |
null |
2025-02-20 |
Integrating Extra Modality Helps Segmentor Find Camouflaged Objects Well |
Chengyu Fang et.al. |
2502.14471v1 |
null |
2025-02-20 |
An Enhancement of Jiang, Z., et al.'s Compression-Based Classification Algorithm Applied to News Article Categorization |
Sean Lester C. Benavides et.al. |
2502.14444v1 |
null |
2025-02-20 |
From Bugs to Breakthroughs: Novice Errors in CS2 |
Nadja Just et.al. |
2502.14438v1 |
null |
2025-02-20 |
Role of the Pretraining and the Adaptation data sizes for low-resource real-time MRI video segmentation |
Masoud Thajudeen Tholan et.al. |
2502.14418v1 |
null |
2025-02-20 |
Reliable Explainability of Deep Learning Spatial-Spectral Classifiers for Improved Semantic Segmentation in Autonomous Driving |
Jon Gutiérrez-Zaballa et.al. |
2502.14416v1 |
null |
2025-02-20 |
Leveraging Small LLMs for Argument Mining in Education: Argument Component Identification, Classification, and Assessment |
Lucile Favero et.al. |
2502.14389v1 |
null |
2025-02-20 |
Topology-Aware Wavelet Mamba for Airway Structure Segmentation in Postoperative Recurrent Nasopharyngeal Carcinoma CT Scans |
Haishan Huang et.al. |
2502.14363v1 |
null |
2025-02-20 |
Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning |
Jiachen Zhu et.al. |
2502.14361v1 |
null |