- MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection
- Self-Supervised Video Copy Localization with Regional Token Representation
- Enhancing Perceptual Quality in Video Super-Resolution through Temporally-Consistent Detail Synthesis using Diffusion Models
- RoGUENeRF: A Robust Geometry-Consistent Universal Enhancer for NeRF
- Bridging the Gap: Studio-like Avatar Creation from a Monocular Phone Capture
- ControlLLM: Augment Language Models with Tools by Searching on Graphs
- UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction
- DreamDissector: Learning Disentangled Text-to-3D Generation from 2D Diffusion Priors
- Vamos: Versatile Action Models for Video Understanding
- Prioritized Semantic Learning for Zero-shot Instance Navigation
- RoadPainter: Points Are Ideal Navigators for Topology transformER
- FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis
- Can OOD Object Detectors Learn from Foundation Models?
- Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion
- MERLiN: Single-Shot Material Estimation and Relighting for Photometric Stereo
- Boosting 3D Single Object Tracking with 2D Matching Distillation and 3D Pre-training
- Diffusion-Based Image-to-Image Translation by Noise Correction via Prompt Interpolation
- Real-data-driven 2000 FPS Color Video from Mosaicked Chromatic Spikes
- Brain-ID: Learning Contrast-agnostic Anatomical Representations for Brain Imaging
- TTT-MIM: Test-Time Training with Masked Image Modeling for Denoising Distribution Shifts
- RadEdit: stress-testing biomedical vision models via diffusion image editing
- SPAMming Labels: Efficient Annotations for the Trackers of Tomorrow
- AdaDiffSR: Adaptive Region-aware Dynamic acceleration Diffusion Model for Real-World Image Super-Resolution
- Explicitly Guided Information Interaction Network for Cross-modal Point Cloud Completion
- Towards Real-world Event-guided Low-light Video Enhancement and Deblurring
- Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
- TrackNeRF: Bundle Adjusting NeRF from Sparse and Noisy Views via Feature Tracks