- VideoMamba: Spatio-Temporal Selective State Space Model
- Text to Layer-wise 3D Clothed Human Generation
- Texture-GS: Disentangle the Geometry and Texture for 3D Gaussian Splatting Editing
- Fully Sparse 3D Occupancy Prediction
- Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data
- CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field
- Shifted Autoencoders for Point Annotation Restoration in Object Counting
- PointLLM: Empowering Large Language Models to Understand Point Clouds
- GarmentAligner: Text-to-Garment Generation via Retrieval-augmented Multi-level Corrections
- Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving
- Enhancing Diffusion Models with Text-Encoder Reinforcement Learning
- Asymmetric Mask Scheme for Self-Supervised Real Image Denoising
- Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation
- BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting
- Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis
- BaSIC: BayesNet Structure Learning for Computational Scalable Neural Image Compression
- FlexAttention for Efficient High-Resolution Vision-Language Models
- Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable Repainting
- AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation
- Spatially-Variant Degradation Model for Dataset-free Super-resolution
- DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation
- Learning Exhaustive Correlation for Spectral Super-Resolution: Where Spatial-Spectral Attention Meets Linear Dependence
- Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
- EAFormer: Scene Text Segmentation with Edge-Aware Transformers
- Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects
- DetailSemNet: Elevating Signature Verification through Detail-Semantic Integration
- LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation