CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing.- Noise-assisted Prompt Learning for Image Forgery Detection and Localization.- Data Collection-free Masked Video Modeling.- Protecting NeRFs' Copyright via Plug-And-Play Watermarking Base Model.- Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization.- AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation.- SEED: A Simple and Effective 3D DETR in Point Clouds.- AEDNet: Adaptive Embedding and Multiview-Aware Disentanglement for Point Cloud Completion.
- Synergy of Sight and Semantics: Visual Intention Understanding with CLIP.- Intrinsic Single-Image HDR Reconstruction.- T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning.- Pathology-knowledge Enhanced Multi-instance Prompt Learning for Few-shot Whole Slide Image Classification.- Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching.- BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models.- Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene.- DATENeRF: Depth-Aware Text-based Editing of NeRFs.
- XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution.- ABC Easy as 123: A Blind Counter for Exemplar-Free Multi-Class Class-agnostic Counting.- Category Adaptation Meets Projected Distillation in Generalized Continual Category Discovery.- LaRa: Efficient Large-Baseline Radiance Fields.- Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement.- MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment.- Grounding Language Models for Visual Entity Recognition.- ELSE: Efficient Deep Neural Network Inference through Line-based Sparsity Exploration.
- DiffusionDepth: Diffusion Denoising Approach for Monocular Depth Estimation.- DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation.- TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos.