https://github.com/cowqer/eccv2024-papers-with-code
A collection of ECCV 2024 papers and open-source projects. Issues sharing ECCV 2024 papers and open-source projects are welcome.
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (4.6%) to scientific vocabulary
Last synced: 10 months ago
Repository
A collection of ECCV 2024 papers and open-source projects. Issues sharing ECCV 2024 papers and open-source projects are welcome.
Basic Info
- Host: GitHub
- Owner: cowqer
- Default Branch: master
- Homepage: https://mp.weixin.qq.com/s/NRjCfZxJF2Z0Ugbhj-8G4g
- Size: 300 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of amusi/ECCV2024-Papers-with-Code
Created over 1 year ago
Last pushed over 1 year ago
https://github.com/cowqer/ECCV2024-Papers-with-Code/blob/master/
# ECCV 2024 (Papers with Code)

ECCV 2024 decisions are now available

> Note 1: Issues sharing ECCV 2024 papers and open-source projects are welcome.
>
> Note 2: See also https://github.com/amusi/daily-paper-computer-vision
>
> - [CVPR 2024](https://github.com/amusi/CVPR2024-Papers-with-Code)
> - [ECCV 2022](ECCV2022-Papers-with-Code.md)
> - [ECCV 2020](ECCV2020-Papers-with-Code.md)

# ECCV 2024

- [3DGS (Gaussian Splatting)](#3DGS)
- [Mamba / SSM](#Mamba)
- [Avatars](#Avatars)
- [Backbone](#Backbone)
- [CLIP](#CLIP)
- [MAE](#MAE)
- [Embodied AI](#Embodied-AI)
- [GAN](#GAN)
- [GNN](#GNN)
- [MLLM](#MLLM)
- [LLM](#LLM)
- [NAS](#NAS)
- [OCR](#OCR)
- [NeRF](#NeRF)
- [DETR](#DETR)
- [Prompt](#Prompt)
- [Diffusion Models](#Diffusion)
- [ReID](#ReID)
- [Long-Tail](#Long-Tail)
- [Vision Transformer](#Vision-Transformer)
- [Vision-Language](#VL)
- [Self-supervised Learning](#SSL)
- [Data Augmentation](#DA)
- [Object Detection](#Object-Detection)
- [Anomaly Detection](#Anomaly-Detection)
- [Visual Tracking](#VT)
- [Semantic Segmentation](#Semantic-Segmentation)
- [Instance Segmentation](#Instance-Segmentation)
- [Panoptic Segmentation](#Panoptic-Segmentation)
- [Medical Image](#MI)
- [Medical Image Segmentation](#MIS)
- [Video Object Segmentation](#VOS)
- [Video Instance Segmentation](#VIS)
- [Referring Image Segmentation](#RIS)
- [Image Matting](#Matting)
- [Image Editing](#Image-Editing)
- [Low-level Vision](#LLV)
- [Super-Resolution](#SR)
- [Denoising](#Denoising)
- [Deblur](#Deblur)
- [Autonomous Driving](#Autonomous-Driving)
- [3D Point Cloud](#3D-Point-Cloud)
- [3D Object Detection](#3DOD)
- [3D Semantic Segmentation](#3DSS)
- [3D Object Tracking](#3D-Object-Tracking)
- [3D Semantic Scene Completion](#3DSSC)
- [3D Registration](#3D-Registration)
- [3D Human Pose Estimation](#3D-Human-Pose-Estimation)
- [3D Human Mesh Estimation](#3D-Human-Pose-Estimation)
- [Medical Image](#Medical-Image)
- [Image Generation](#Image-Generation)
- [Video Generation](#Video-Generation)
- [3D Generation](#3D-Generation)
- [Video Understanding](#Video-Understanding)
- [Action Recognition](#Action-Recognition)
- [Action Detection](#Action-Detection)
- [Text Detection](#Text-Detection)
- [Knowledge Distillation](#KD)
- [Model Pruning](#Pruning)
- [Image Compression](#IC)
- [3D Reconstruction](#3D-Reconstruction)
- [Depth Estimation](#Depth-Estimation)
- [Trajectory Prediction](#TP)
- [Lane Detection](#Lane-Detection)
- [Image Captioning](#Image-Captioning)
- [Visual Question Answering](#VQA)
- [Sign Language Recognition](#SLR)
- [Video Prediction](#Video-Prediction)
- [Novel View Synthesis](#NVS)
- [Zero-Shot Learning](#ZSL)
- [Stereo Matching](#Stereo-Matching)
- [Feature Matching](#Feature-Matching)
- [Scene Graph Generation](#SGG)
- [Counting](#Counting)
- [Implicit Neural Representations](#INR)
- [Image Quality Assessment](#IQA)
- [Video Quality Assessment](#Video-Quality-Assessment)
- [Datasets](#Datasets)
- [New Tasks](#New-Tasks)
- [Others](#Others)

# 3DGS (Gaussian Splatting)

**MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images**

- Project: https://donydchen.github.io/mvsplat
- Paper: https://arxiv.org/abs/2403.14627
- Code: https://github.com/donydchen/mvsplat

**CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians**

- Paper: https://arxiv.org/abs/2404.01133
- Code: https://github.com/DekuLiuTesla/CityGaussian

**FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting**

- Project: https://zehaozhu.github.io/FSGS/
- Paper: https://arxiv.org/abs/2312.00451
- Code: https://github.com/VITA-Group/FSGS

# Mamba / SSM

**VideoMamba: State Space Model for Efficient Video Understanding**

- Paper: https://arxiv.org/abs/2403.06977
- Code: https://github.com/OpenGVLab/VideoMamba

**ZIGMA: A DiT-style Zigzag Mamba Diffusion Model**

- Paper: https://arxiv.org/abs/2403.13802
- Code: https://taohu.me/zigma/

# Avatars

# Backbone

# CLIP

# MAE

# Embodied AI

# GAN

# OCR

**Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors**

- Paper: https://arxiv.org/pdf/2312.05286
- Code: https://github.com/SJTU-DeepVisionLab/FreeReal

**PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer**

- Paper: https://arxiv.org/abs/2407.07764
- Code: https://github.com/SJTU-DeepVisionLab/PosFormer

# Occupancy

**Fully Sparse 3D Occupancy Prediction**

- Paper: https://arxiv.org/abs/2312.17118
- Code: https://github.com/MCG-NJU/SparseOcc

# NeRF

**NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields**

- Project: https://nerf-mae.github.io/
- Paper: https://arxiv.org/pdf/2404.01300
- Code: https://github.com/zubair-irshad/NeRF-MAE

# DETR

# Prompt

# MLLM

**SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant**

- Paper: https://arxiv.org/abs/2403.11299
- Code: https://github.com/heliossun/SQ-LLaVA

**ControlCap: Controllable Region-level Captioning**

- Paper: https://arxiv.org/abs/2401.17910
- Code: https://github.com/callsys/ControlCap

# LLM

# NAS

# ReID

# Diffusion Models

**ZIGMA: A DiT-style Zigzag Mamba Diffusion Model**

- Paper: https://arxiv.org/abs/2403.13802
- Code: https://taohu.me/zigma/

**Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation**

- Paper: https://arxiv.org/abs/2403.16394
- Code: https://github.com/zdxdsw/skewed_relations_T2I

**The Lottery Ticket Hypothesis in Denoising: Towards Semantic-Driven Initialization**

- Project: https://ut-mao.github.io/noise.github.io/
- Paper: https://arxiv.org/abs/2312.08872
- Code: https://github.com/UT-Mao/Initial-Noise-Construction

# Vision Transformer

**GiT: Towards Generalist Vision Transformer through Universal Language Interface**

- Paper: https://arxiv.org/abs/2403.09394
- Code: https://github.com/Haiyang-W/GiT

# Vision-Language

**GalLoP: Learning Global and Local Prompts for Vision-Language Models**

- Paper: https://arxiv.org/abs/2407.01400

# Object Detection

**Relation DETR: Exploring Explicit Position Relation Prior for Object Detection**

- Paper: https://arxiv.org/abs/2407.11699v1
- Code: https://github.com/xiuqhou/Relation-DETR
- Dataset: https://huggingface.co/datasets/xiuqhou/SA-Det-100k

**GRA: Detecting Oriented Objects through Group-wise Rotating and Attention**

- Paper: https://arxiv.org/pdf/2403.11127
- Code: https://github.com/wangjiangshan0725/GRA

**Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector**

- Project: http://yuqianfu.com/CDFSOD-benchmark/
- Paper: https://arxiv.org/pdf/2402.03094
- Code: https://github.com/lovelyqian/CDFSOD-benchmark

# Anomaly Detection

# Object Tracking

# Semantic Segmentation

**Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation**

- Paper: https://arxiv.org/abs/2405.06228
- Code: https://github.com/nizhenliang/CGRSeg

# Medical Image

**Brain-ID: Learning Contrast-agnostic Anatomical Representations for Brain Imaging**

- Paper: https://arxiv.org/abs/2311.16914
- Code: https://github.com/peirong26/Brain-ID

**FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification**

- Project: https://ophai.hms.harvard.edu/datasets/harvard-fairdomain20k
- Paper: https://arxiv.org/abs/2407.08813
- Dataset: https://drive.google.com/drive/u/1/folders/1huH93JVeXMj9rK6p1OZRub868vv0UK0O
- Code: https://github.com/Harvard-Ophthalmology-AI-Lab/FairDomain

# Medical Image Segmentation

**ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image**

- Project: https://scribbleprompt.csail.mit.edu/
- Paper: https://arxiv.org/abs/2312.07381
- Code: https://github.com/halleewong/ScribblePrompt

**AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking**

- Paper: https://arxiv.org/abs/2407.06468
- Code: https://github.com/ricklisz/AnatoMask

**Representing Topological Self-Similarity Using Fractal Feature Maps for Accurate Segmentation of Tubular Structures**

- Paper: https://arxiv.org/abs/2407.14754
- Code: https://github.com/cbmi-group/FFM-Multi-Decoder-Network

# Video Object Segmentation

**DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries**

- Project: https://zhang-tao-whu.github.io/projects/DVIS_DAQ/
- Paper: https://arxiv.org/abs/2404.00086
- Code: https://github.com/zhang-tao-whu/DVIS_Plus

# Autonomous Driving

**Fully Sparse 3D Occupancy Prediction**

- Paper: https://arxiv.org/abs/2312.17118
- Code: https://github.com/MCG-NJU/SparseOcc

**milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing**

- Paper: https://arxiv.org/abs/2306.17010
- Code: https://github.com/Toytiny/milliFlow/

**4D Contrastive Superflows are Dense 3D Representation Learners**

- Paper: https://arxiv.org/abs/2407.06190
- Code: https://github.com/Xiangxu-0103/SuperFlow

# 3D Point Cloud

# 3D Object Detection

**3D Small Object Detection with Dynamic Spatial Pruning**

- Project: https://xuxw98.github.io/DSPDet3D/
- Paper: https://arxiv.org/abs/2305.03716
- Code: https://github.com/xuxw98/DSPDet3D

**Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection**

- Paper: https://arxiv.org/abs/2402.03634
- Code: https://github.com/LiewFeng/RayDN

# 3D Semantic Segmentation

# Image Editing

# Image Inpainting

**BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion**

- Project: https://tencentarc.github.io/BrushNet/
- Paper: https://arxiv.org/abs/2403.06976
- Code: https://github.com/TencentARC/BrushNet

# Video Editing

# Low-level Vision

**Restoring Images in Adverse Weather Conditions via Histogram Transformer**

- Paper: https://arxiv.org/abs/2407.10172
- Code: https://github.com/sunshangquan/Histoformer

**OneRestore: A Universal Restoration Framework for Composite Degradation**

- Project: https://gy65896.github.io/projects/ECCV2024_OneRestore
- Paper: https://arxiv.org/abs/2407.04621
- Code: https://github.com/gy65896/OneRestore

# Super-Resolution

# Denoising

## Image Denoising

# 3D Human Pose Estimation

# Image Generation

**Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models**

- Paper: https://arxiv.org/abs/2404.07389
- Code: https://github.com/YasminZhang/EBAMA

**Every Pixel Has its Moments: Ultra-High-Resolution Unpaired Image-to-Image Translation via Dense Normalization**

- Project: https://kaminyou.com/Dense-Normalization/
- Paper: https://arxiv.org/abs/2407.04245
- Code: https://github.com/Kaminyou/Dense-Normalization

**ZIGMA: A DiT-style Zigzag Mamba Diffusion Model**

- Paper: https://arxiv.org/abs/2403.13802
- Code: https://taohu.me/zigma/

**Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation**

- Paper: https://arxiv.org/abs/2403.16394
- Code: https://github.com/zdxdsw/skewed_relations_T2I

# Video Generation

**VideoStudio: Generating Consistent-Content and Multi-Scene Videos**

- Project: https://vidstudio.github.io/
- Code: https://github.com/FuchenUSTC/VideoStudio

# 3D

# Video Understanding

**VideoMamba: State Space Model for Efficient Video Understanding**

- Paper: https://arxiv.org/abs/2403.06977
- Code: https://github.com/OpenGVLab/VideoMamba

**C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition**

- Paper: https://arxiv.org/abs/2407.06113
- Code: https://github.com/RongchangLi/ZSCAR_C2C

# Action Recognition

**SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders**

- Paper: https://arxiv.org/abs/2407.13460
- Code: https://github.com/pha123661/SA-DVAE

# Knowledge Distillation

# Image Compression

**Image Compression for Machine and Human Vision With Spatial-Frequency Adaptation**

- Code: https://github.com/qingshi9974/ECCV2024-AdpatICMH
- Paper: http://arxiv.org/abs/2407.09853

# Stereo Matching

# Scene Graph Generation

# Counting

**Zero-shot Object Counting with Good Exemplars**

- Paper: https://arxiv.org/abs/2407.04948
- Code: https://github.com/HopooLinZ/VA-Count

# Video Quality Assessment

# Datasets

# Others

**Multi-branch Collaborative Learning Network for 3D Visual Grounding**

- Paper: https://arxiv.org/abs/2407.05363v2
- Code: https://github.com/qzp2018/MCLN

**PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers**

- Code: https://github.com/ananthu-aniraj/pdiscoformer
- Paper: https://arxiv.org/abs/2407.04538

**SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments**

- Project: https://fraunhoferhhi.github.io/spvloc/
- Paper: https://arxiv.org/abs/2404.10527
- Code: https://github.com/fraunhoferhhi/spvloc

**REFRAME: Reflective Surface Real-Time Rendering for Mobile Devices**

- Project: https://xdimlab.github.io/REFRAME/
- Paper: https://arxiv.org/abs/2403.16481
- Code: https://github.com/MARVELOUSJI/REFRAME
Owner
- Name: coqqer
- Login: cowqer
- Kind: user
- Repositories: 1
- Profile: https://github.com/cowqer
GitHub Events
Total
- Push event: 1
- Create event: 1
Last Year
- Push event: 1
- Create event: 1