Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection.- Training-free Video Temporal Grounding using Large-scale Pre-trained Models.- Revisit Self-supervision with Local Structure-from-Motion.- FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis.- Efficient Learning of Event-based Dense Representation using Hierarchical Memories with Adaptive Update.- SNP: Structured Neuron-level Pruning to Preserve Attention Scores.- Multi-Granularity Sparse Relationship Matrix Prediction Network for End-to-End Scene Graph Generation.- Flash-Splat: 3D Reflection Removal with Flash Cues and Gaussian Splats.
- PALM: Predicting Actions through Language Models.- Motion Keyframe Interpolation for Any Human Skeleton using Point Cloud-based Human Motion Data Homogenisation.- SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher.- Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment.- Improving Hyperbolic Representations via Gromov-Wasserstein Regularization.- VSViG: Real-time Video-based Seizure Detection via Skeleton-based Spatiotemporal ViG.- DiffSurf: A Transformer-based Diffusion Model for Generating and Reconstructing 3D Surfaces in Pose.- Exploiting Supervised Poison Vulnerability to Strengthen Self-Supervised Defense.
- Dense Hand-Object(HO) GraspNet with Full Grasping Taxonomy and Dynamics.- Human Pose Recognition via Occlusion-Preserving Abstract Images.- DA-BEV: Unsupervised Domain Adaptation for Bird's Eye View Perception.- SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow.- PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation.- Depth-Aware Blind Image Decomposition for Real-World Adverse Weather Recovery.- DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation.- Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation.
- Personalized Privacy Protection Mask Against Unauthorized Facial Recognition.- PosterLlama: Bridging Design Ability of Language Model to Content-Aware Layout Generation.- PreciseControl: Enhancing Text-To-Image Diffusion Models with Fine-Grained Attribute Control.