| 2025-01-16 | Distilling Multi-modal Large Language Models for Autonomous Driving | Deepti Hegde et al. | 2501.09757v1 | null |
| 2025-01-16 | SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces | Sumit Chaturvedi et al. | 2501.09756v1 | null |
| 2025-01-16 | Learnings from Scaling Visual Tokenizers for Reconstruction and Generation | Philippe Hansen-Estruch et al. | 2501.09755v1 | null |
| 2025-01-16 | Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues | Youngjoon Jang et al. | 2501.09754v1 | null |
| 2025-01-16 | SRE-Conv: Symmetric Rotation Equivariant Convolution for Biomedical Image Classification | Yuexi Du et al. | 2501.09753v1 | link |
| 2025-01-16 | FAST: Efficient Action Tokenization for Vision-Language-Action Models | Karl Pertsch et al. | 2501.09747v1 | null |
| 2025-01-16 | Improvement of Data Analytics Techniques in Reflection High Energy Electron Diffraction to Enable Machine Learning | Patrick T. Gemperline et al. | 2501.09743v1 | link |
| 2025-01-16 | ComplexVAD: Detecting Interaction Anomalies in Video | Furkan Mumcu et al. | 2501.09733v1 | null |
| 2025-01-16 | Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps | Nanye Ma et al. | 2501.09732v1 | null |
| 2025-01-16 | Generating particle physics Lagrangians with transformers | Yong Sheng Koay et al. | 2501.09729v1 | null |
| 2025-01-16 | A Simple Aerial Detection Baseline of Multimodal Language Models | Qingyun Li et al. | 2501.09720v1 | link |
| 2025-01-16 | FLOL: Fast Baselines for Real-World Low-Light Enhancement | Juan C. Benito et al. | 2501.09718v1 | null |
| 2025-01-16 | Practical Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao et al. | 2501.09705v1 | link |
| 2025-01-16 | Infinity norm bounds for the inverse of Nekrasov matrices using scaling matrices | Héctor Orera et al. | 2501.09704v1 | null |
| 2025-01-16 | Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key | Zhihe Yang et al. | 2501.09695v1 | null |
| 2025-01-16 | Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation | Jiho Choi et al. | 2501.09688v1 | null |
| 2025-01-16 | Robin: a Suite of Multi-Scale Vision-Language Models and the CHIRP Evaluation Benchmark | Alexis Roger et al. | 2501.09672v1 | null |
| 2025-01-16 | Unitary Expressions: A Necessary Abstraction for Extensible Quantum Programming Languages and Systems | Ed Younis et al. | 2501.09667v1 | null |
| 2025-01-16 | Approaching optimal microwave-acoustic transduction on lithium niobate using SQUID arrays | A. Hugot et al. | 2501.09661v1 | null |
| 2025-01-16 | A Survey of Research in Large Language Models for Electronic Design Automation | Jingyu Pan et al. | 2501.09655v1 | null |
| 2025-01-16 | NS-Gym: Open-Source Simulation Environments and Benchmarks for Non-Stationary Markov Decision Processes | Nathaniel S. Keplinger et al. | 2501.09646v1 | link |
| 2025-01-16 | Supersolid dipolar phases in planar geometry: effects of tilted polarization | Daniel Lima et al. | 2501.09641v1 | null |
| 2025-01-16 | Unified Face Matching and Physical-Digital Spoofing Attack Detection | Arun Kunwar et al. | 2501.09635v1 | null |
| 2025-01-16 | Optimal paths and dynamical symmetry breaking in the current fluctuations of driven diffusive media | Pablo I. Hurtado et al. | 2501.09629v1 | null |
| 2025-01-16 | WMamba: Wavelet-based Mamba for Face Forgery Detection | Siran Peng et al. | 2501.09617v1 | null |
| 2025-01-16 | Metric Learning with Progressive Self-Distillation for Audio-Visual Embedding Learning | Donghuo Zeng et al. | 2501.09608v1 | null |
| 2025-01-16 | From Scarcity to Capability: Empowering Fake News Detection in Low-Resource Languages with LLMs | Hrithik Majumdar Shibu et al. | 2501.09604v1 | link |
| 2025-01-16 | Mesh2SLAM in VR: A Fast Geometry-Based SLAM Framework for Rapid Prototyping in Virtual Reality Applications | Carlos Augusto Pinheiro de Sousa et al. | 2501.09600v1 | null |
| 2025-01-16 | Atleus: Accelerating Transformers on the Edge Enabled by 3D Heterogeneous Manycore Architectures | Pratyush Dhingra et al. | 2501.09588v1 | null |
| 2025-01-16 | Sequential PatchCore: Anomaly Detection for Surface Inspection using Synthetic Impurities | Runzhou Mao et al. | 2501.09579v1 | null |