Image Classification

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2025-02-20 | Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts | Sara Ghaboura et.al. | 2502.14865v1 | null |
| 2025-02-20 | Benchmarking Multimodal RAG through a Chart-based Document Question-Answering Generation Framework | Yuming Yang et.al. | 2502.14864v1 | null |
| 2025-02-20 | Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation | Yue Yang et.al. | 2502.14846v1 | null |
| 2025-02-20 | Dynamic Concepts Personalization from Single Videos | Rameen Abdal et.al. | 2502.14844v1 | null |
| 2025-02-20 | LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models | Shangqing Tu et.al. | 2502.14834v1 | null |
| 2025-02-20 | Improving the Diffusability of Autoencoders | Ivan Skorokhodov et.al. | 2502.14831v1 | null |
| 2025-02-20 | Turning on the Light: Polymorphism-Induced Photoluminescence in Cysteine Crystals | Debarshi Banerjee et.al. | 2502.14826v1 | null |
| 2025-02-20 | Cross Validation for Correlated Data in Regression and Classification Models, with Applications to Deep Learning | Oren Yuval et.al. | 2502.14808v1 | null |
| 2025-02-20 | FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis | Fadillah Maani et.al. | 2502.14807v1 | null |
| 2025-02-20 | A Survey on Text-Driven 360-Degree Panorama Generation | Hai Wang et.al. | 2502.14799v1 | null |
| 2025-02-20 | SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features | Michael Tschannen et.al. | 2502.14786v1 | null |
| 2025-02-20 | ReVision: A Dataset and Baseline VLM for Privacy-Preserving Task-Oriented Visual Instruction Rewriting | Abhijit Mishra et.al. | 2502.14780v1 | null |
| 2025-02-20 | DC-ControlNet: Decoupling Inter- and Intra-Element Conditions in Image Generation with Diffusion Models | Hongji Yang et.al. | 2502.14779v1 | null |
| 2025-02-20 | Harnessing PDF Data for Improving Japanese Large Multimodal Models | Jeonghun Baek et.al. | 2502.14778v1 | null |
| 2025-02-20 | Sparse Activations as Conformal Predictors | Margarida M. Campos et.al. | 2502.14773v1 | null |
| 2025-02-20 | MedVAE: Efficient Automated Interpretation of Medical Images with Large-Scale Generalizable Autoencoders | Maya Varma et.al. | 2502.14753v1 | null |
| 2025-02-20 | AIdeation: Designing a Human-AI Collaborative Ideation System for Concept Designers | Wen-Fan Wang et.al. | 2502.14747v1 | null |
| 2025-02-20 | Robust Information Selection for Hypothesis Testing with Misclassification Penalties | Jayanth Bhargav et.al. | 2502.14738v1 | null |
| 2025-02-20 | H$α$ Variability of AB Aur b with the Hubble Space Telescope: Probing the Nature of a Protoplanet Candidate with Accretion Light Echoes | Brendan P. Bowler et.al. | 2502.14736v1 | null |
| 2025-02-20 | Model-based time super-sampling of turbulent flow field sequences | Qihong Lorena Li-Hu et.al. | 2502.14722v1 | null |
| 2025-02-20 | Internal Incoherency Scores for Constraint-based Causal Discovery Algorithms | Sofia Faltenbacher et.al. | 2502.14719v1 | null |
| 2025-02-20 | TRUSWorthy: Toward Clinically Applicable Deep Learning for Confident Detection of Prostate Cancer in Micro-Ultrasound | Mohamed Harmanani et.al. | 2502.14707v1 | null |
| 2025-02-20 | Constraints on optical and near-infrared variability in the localisation of the long-period radio transient GLEAM-X J1627-52 | J. D. Lyman et.al. | 2502.14688v1 | null |
| 2025-02-20 | Beyond the Surface: Uncovering Implicit Locations with LLMs for Personalized Local News | Gali Katz et.al. | 2502.14660v1 | null |
| 2025-02-20 | MAGO-SP: Detection and Correction of Water-Fat Swaps in Magnitude-Only VIBE MRI | Robert Graf et.al. | 2502.14659v1 | null |
| 2025-02-20 | NAVIG: Natural Language-guided Analysis with Vision Language Models for Image Geo-localization | Zheyuan Zhang et.al. | 2502.14638v1 | null |
| 2025-02-20 | Monocular Depth Estimation and Segmentation for Transparent Object with Iterative Semantic and Geometric Fusion | Jiangyuan Liu et.al. | 2502.14616v1 | null |
| 2025-02-20 | A Millimeter-Wave Photometric Camera for Long-Range Imaging Through Optical Obscurants Using Kinetic Inductance Detectors | Jack Sayers et.al. | 2502.14607v1 | null |
| 2025-02-20 | Emergent Goldstone flat bands and spontaneous symmetry breaking with type-B Goldstone modes | Huan-Qiang Zhou et.al. | 2502.14605v1 | null |
| 2025-02-20 | Noisy Test-Time Adaptation in Vision-Language Models | Chentao Cao et.al. | 2502.14604v1 | null |