- InstructIR: High-Quality Image Restoration Following Human Instructions
- Asynchronous Large Language Model Enhanced Planner for Autonomous Driving
- Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
- LayoutFlow: Flow Matching for Layout Generation
- Making Large Language Models Better Planners with Reasoning-Decision Alignment
- R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection
- Representation Enhancement-Stabilization: Reducing Bias-Variance of Domain Generalization
- Continual Learning for Remote Physiological Measurement: Minimize Forgetting and Simplify Inference
- An Optimization Framework to Enforce Multi-View Consistency for Texturing 3D Meshes
- STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians
- RGBD GS-ICP SLAM
- Efficient NeRF Optimization - Not All Samples Remain Equally Hard
- Revisiting Calibration of Wide-Angle Radially Symmetric Cameras
- Rawformer: Unpaired Raw-to-Raw Translation for Learnable Camera ISPs
- Robust Incremental Structure-from-Motion with Hybrid Features
- Revisiting Domain-Adaptive Object Detection in Adverse Weather by the Generation and Composition of High-Quality Pseudo-Labels
- Prediction Exposes Your Face: Black-box Model Inversion via Prediction Alignment
- Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models
- UniCal: Unified Neural Sensor Calibration
- Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models
- Urban Waterlogging Detection: A Challenging Benchmark and Large-Small Model Co-Adapter
- Pseudo-Embedding for Generalized Few-Shot Point Cloud Segmentation
- WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering
- ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions
- Statewide Visual Geolocalization in the Wild
- Any2Point: Empowering Any-modality Transformers for Efficient 3D Understanding
- Trajectory-aligned Space-time Tokens for Few-shot Action Recognition