Teach CLIP to Develop a Number Sense for Ordinal Regression.- Compact 3D Scene Representation via Self-Organizing Gaussian Grids.- Pix2Gif: Motion-Guided Diffusion for GIF Generation.- VETRA: A Dataset for Vehicle Tracking in Aerial Imagery - New Challenges for Multi-Object Tracking.- SelfGeo: Self-supervised and Geodesic-consistent Estimation of Keypoints on Deformable Shapes.- Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning.- T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models.- ExMatch: Self-guided Exploitation for Semi-Supervised Learning with Scarce Labeled Samples.
- Towards Certifiably Robust Face Recognition.- Linking in Style: Understanding learned features in deep learning models.- Stable Video Portraits.- UDA-Bench: Revisiting Common Assumptions in Unsupervised Domain Adaptation Using a Standardized Framework.- CliffPhys: Camera-based Respiratory Measurement using Clifford Neural Networks.- Learned Rate Control for Frame-Level Adaptive Neural Video Compression via Dynamic Neural Network.- PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers.- Vision-Language Dual-Pattern Matching for Out-of-Distribution Detection.
- Synthesizing Environment-Specific People in Photographs.- Weight Conditioning for Smooth Optimization of Neural Networks.- Energy-Calibrated VAE with Test Time Free Lunch.- MoEAD: A Parameter-efficient Model for Multi-class Anomaly Detection.- SceneTeller: Language-to-3D Scene Generation.- MagMax: Leveraging Model Merging for Seamless Continual Learning.- InternVideo2: Scaling Foundation Models for Multimodal Video Understanding.- DiffusionPen: Towards Controlling the Style of Handwritten Text Generation.
- Debiasing surgeon: fantastic weights and how to find them.- Denoising Vision Transformers.- Differentiable Product Quantization for Memory Efficient Camera Relocalization.