- CS-Prompt: Learning Prompt to Rearrange Class Space for Prompt-based Continual Learning
- Text-Anchored Score Composition: Tackling Condition Misalignment in Text-to-Image Diffusion Models
- Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
- Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression
- OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation
- CatchBackdoor: Backdoor Detection via Critical Trojan Neural Path Fuzzing
- UCIP: A Universal Framework for Compressed Image Super-Resolution using Dynamic Prompt
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
- ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference
- Two-Stage Active Learning for Efficient Temporal Action Segmentation
- TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation
- MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views
- Domain-Adaptive 2D Human Pose Estimation via Dual Teachers in Extremely Low-Light Conditions
- Towards More Practical Group Activity Detection: A New Benchmark and Model
- Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models
- Zero-Shot Image Feature Consensus with Deep Functional Maps
- WindPoly: Polygonal Mesh Reconstruction via Winding Numbers
- MinD-3D: Reconstruct High-quality 3D objects in Human Brain
- Tokenize Anything via Prompting
- Geospecific View Generation - Geometry-Context Aware High-resolution Ground View Inference from Satellite Views
- Scissorhands: Scrub Data Influence via Connection Sensitivity in Networks
- City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web
- GRAPE: Generalizable and Robust Multi-view Facial Capture
- Training-Free Model Merging for Multi-target Domain Adaptation
- Multi-RoI Human Mesh Recovery with Camera Consistency and Contrastive Losses
- Co-Student: Collaborating Strong and Weak Students for Sparsely Annotated Object Detection
- Open-Vocabulary Camouflaged Object Segmentation