2025-02-20 | Benchmarking Multimodal RAG through a Chart-based Document Question-Answering Generation Framework | Yuming Yang et al. | 2502.14864v1 | null |
2025-02-20 | Spatial and Temporal Periodic Density Patterns in Driven Bose-Einstein Condensates | A. del Río-Lima et al. | 2502.14849v1 | null |
2025-02-20 | Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation | Yue Yang et al. | 2502.14846v1 | null |
2025-02-20 | Dynamic Concepts Personalization from Single Videos | Rameen Abdal et al. | 2502.14844v1 | null |
2025-02-20 | LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models | Shangqing Tu et al. | 2502.14834v1 | null |
2025-02-20 | Improving the Diffusability of Autoencoders | Ivan Skorokhodov et al. | 2502.14831v1 | null |
2025-02-20 | Turning on the Light: Polymorphism-Induced Photoluminescence in Cysteine Crystals | Debarshi Banerjee et al. | 2502.14826v1 | null |
2025-02-20 | PREM: Privately Answering Statistical Queries with Relative Error | Badih Ghazi et al. | 2502.14809v1 | null |
2025-02-20 | FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis | Fadillah Maani et al. | 2502.14807v1 | null |
2025-02-20 | A Survey on Text-Driven 360-Degree Panorama Generation | Hai Wang et al. | 2502.14799v1 | null |
2025-02-20 | Micro Blossom: Accelerated Minimum-Weight Perfect Matching Decoding for Quantum Error Correction | Yue Wu et al. | 2502.14787v1 | null |
2025-02-20 | SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features | Michael Tschannen et al. | 2502.14786v1 | null |
2025-02-20 | ReVision: A Dataset and Baseline VLM for Privacy-Preserving Task-Oriented Visual Instruction Rewriting | Abhijit Mishra et al. | 2502.14780v1 | null |
2025-02-20 | DC-ControlNet: Decoupling Inter- and Intra-Element Conditions in Image Generation with Diffusion Models | Hongji Yang et al. | 2502.14779v1 | null |
2025-02-20 | Harnessing PDF Data for Improving Japanese Large Multimodal Models | Jeonghun Baek et al. | 2502.14778v1 | null |
2025-02-20 | MedVAE: Efficient Automated Interpretation of Medical Images with Large-Scale Generalizable Autoencoders | Maya Varma et al. | 2502.14753v1 | null |
2025-02-20 | AIdeation: Designing a Human-AI Collaborative Ideation System for Concept Designers | Wen-Fan Wang et al. | 2502.14747v1 | null |
2025-02-20 | H$α$ Variability of AB Aur b with the Hubble Space Telescope: Probing the Nature of a Protoplanet Candidate with Accretion Light Echoes | Brendan P. Bowler et al. | 2502.14736v1 | null |
2025-02-20 | Model-based time super-sampling of turbulent flow field sequences | Qihong Lorena Li-Hu et al. | 2502.14722v1 | null |
2025-02-20 | Instrumented mouthguards in elite sports: Validity and head acceleration event (HAE) incidence in NCAA American Football | Mario Rotundo et al. | 2502.14710v1 | null |
2025-02-20 | TRUSWorthy: Toward Clinically Applicable Deep Learning for Confident Detection of Prostate Cancer in Micro-Ultrasound | Mohamed Harmanani et al. | 2502.14707v1 | null |
2025-02-20 | Two-Sided Matching with Resource-Regional Caps | Felipe Garrido-Lucero et al. | 2502.14690v1 | null |
2025-02-20 | Constraints on optical and near-infrared variability in the localisation of the long-period radio transient GLEAM-X J1627-52 | J. D. Lyman et al. | 2502.14688v1 | null |
2025-02-20 | Lopsided and Bulging Distribution of Satellites around Paired Halos. I. Observational Measurements and Comparison with Halo-based Models | Yanhan Guo et al. | 2502.14666v1 | null |
2025-02-20 | MAGO-SP: Detection and Correction of Water-Fat Swaps in Magnitude-Only VIBE MRI | Robert Graf et al. | 2502.14659v1 | null |
2025-02-20 | Variance Reduction Methods Do Not Need to Compute Full Gradients: Improved Efficiency through Shuffling | Daniil Medyakov et al. | 2502.14648v1 | null |
2025-02-20 | NAVIG: Natural Language-guided Analysis with Vision Language Models for Image Geo-localization | Zheyuan Zhang et al. | 2502.14638v1 | null |
2025-02-20 | ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation | Angxiao Yue et al. | 2502.14637v1 | link |
2025-02-20 | ATRI: Mitigating Multilingual Audio Text Retrieval Inconsistencies by Reducing Data Distribution Errors | Yuguo Yin et al. | 2502.14627v1 | null |
2025-02-20 | Online Envy Minimization and Multicolor Discrepancy: Equivalences and Separations | Daniel Halpern et al. | 2502.14624v1 | null |