https://github.com/cowqer/eccv2024-papers-with-code

A collection of ECCV 2024 papers and open-source projects. Contributions are welcome: open an issue to share ECCV 2024 papers and open-source projects.

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: arxiv.org)
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 4.6%, to scientific vocabulary)

Last synced: 10 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: cowqer
Default Branch: master
Homepage: https://mp.weixin.qq.com/s/NRjCfZxJF2Z0Ugbhj-8G4g
Size: 300 KB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Fork of amusi/ECCV2024-Papers-with-Code

Created over 1 year ago · Last pushed over 1 year ago

https://github.com/cowqer/ECCV2024-Papers-with-Code/blob/master/

# ECCV 2024 Papers with Code

ECCV 2024 decisions are now available


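> Note 1: Contributions are welcome: open an issue to share ECCV 2024 papers and open-source projects.
>
> Note 2: For papers from past top computer vision conferences and other noteworthy CV papers, see https://github.com/amusi/daily-paper-computer-vision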
>
> - [CVPR 2024](https://github.com/amusi/CVPR2024-Papers-with-Code)
> - [ECCV 2022](ECCV2022-Papers-with-Code.md)
> - [ECCV 2020](ECCV2020-Papers-with-Code.md)

# ECCV 2024 Papers and Open-Source Projects: Contents

- [3DGS (Gaussian Splatting)](#3DGS)
- [Mamba / SSM](#Mamba)
- [Avatars](#Avatars)
- [Backbone](#Backbone)
- [CLIP](#CLIP)
- [MAE](#MAE)
- [Embodied AI](#Embodied-AI)
- [GAN](#GAN)
- [GNN](#GNN)
- [Multimodal Large Language Models (MLLM)](#MLLM)
- [Large Language Models (LLM)](#LLM)
- [NAS](#NAS)
- [OCR](#OCR)
- [NeRF](#NeRF)
- [DETR](#DETR)
- [Prompt](#Prompt)
- [Diffusion Models](#Diffusion)
- [ReID (Re-Identification)](#ReID)
- [Long-Tail](#Long-Tail)
- [Vision Transformer](#Vision-Transformer)
- [Vision-Language](#VL)
- [Self-supervised Learning](#SSL)
- [Data Augmentation](#DA)
- [Object Detection](#Object-Detection)
- [Anomaly Detection](#Anomaly-Detection)
- [Visual Tracking](#VT)
- [Semantic Segmentation](#Semantic-Segmentation)
- [Instance Segmentation](#Instance-Segmentation)
- [Panoptic Segmentation](#Panoptic-Segmentation)
- [Medical Image](#MI)
- [Medical Image Segmentation](#MIS)
- [Video Object Segmentation](#VOS)
- [Video Instance Segmentation](#VIS)
- [Referring Image Segmentation](#RIS)
- [Image Matting](#Matting)
- [Image Editing](#Image-Editing)
- [Low-level Vision](#LLV)
- [Super-Resolution](#SR)
- [Denoising](#Denoising)
- [Deblur](#Deblur)
- [Autonomous Driving](#Autonomous-Driving)
- [3D Point Cloud](#3D-Point-Cloud)
- [3D Object Detection](#3DOD)
- [3D Semantic Segmentation](#3DSS)
- [3D Object Tracking](#3D-Object-Tracking)
- [3D Semantic Scene Completion](#3DSSC)
- [3D Registration](#3D-Registration)
- [3D Human Pose Estimation](#3D-Human-Pose-Estimation)
- [3D Human Mesh Estimation](#3D-Human-Pose-Estimation)
- [Medical Image](#Medical-Image)
- [Image Generation](#Image-Generation)
- [Video Generation](#Video-Generation)
- [3D Generation](#3D-Generation)
- [Video Understanding](#Video-Understanding)
- [Action Recognition](#Action-Recognition)
- [Action Detection](#Action-Detection)
- [Text Detection](#Text-Detection)
- [Knowledge Distillation](#KD)
- [Model Pruning](#Pruning)
- [Image Compression](#IC)
- [3D Reconstruction](#3D-Reconstruction)
- [Depth Estimation](#Depth-Estimation)
- [Trajectory Prediction](#TP)
- [Lane Detection](#Lane-Detection)
- [Image Captioning](#Image-Captioning)
- [Visual Question Answering](#VQA)
- [Sign Language Recognition](#SLR)
- [Video Prediction](#Video-Prediction)
- [Novel View Synthesis](#NVS)
- [Zero-Shot Learning](#ZSL)
- [Stereo Matching](#Stereo-Matching)
- [Feature Matching](#Feature-Matching)
- [Scene Graph Generation](#SGG)
- [Counting](#Counting)
- [Implicit Neural Representations](#INR)
- [Image Quality Assessment](#IQA)
- [Video Quality Assessment](#Video-Quality-Assessment)
- [Datasets](#Datasets)
- [New Tasks](#New-Tasks)
- [Others](#Others)



# 3DGS (Gaussian Splatting)

**MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images**

- Project: https://donydchen.github.io/mvsplat
- Paper: https://arxiv.org/abs/2403.14627
- Code: https://github.com/donydchen/mvsplat

**CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians**

- Paper: https://arxiv.org/abs/2404.01133
- Code: https://github.com/DekuLiuTesla/CityGaussian

**FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting**

- Project: https://zehaozhu.github.io/FSGS/
- Paper: https://arxiv.org/abs/2312.00451
- Code: https://github.com/VITA-Group/FSGS





# Mamba / SSM

**VideoMamba: State Space Model for Efficient Video Understanding**

- Paper: https://arxiv.org/abs/2403.06977
- Code: https://github.com/OpenGVLab/VideoMamba

**ZIGMA: A DiT-style Zigzag Mamba Diffusion Model**

- Paper: https://arxiv.org/abs/2403.13802
- Code: https://taohu.me/zigma/



# Avatars







# Backbone





# CLIP







# MAE



# Embodied AI





# GAN



# OCR

**Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors**

- Paper: https://arxiv.org/pdf/2312.05286
- Code: https://github.com/SJTU-DeepVisionLab/FreeReal

**PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer**

- Paper: https://arxiv.org/abs/2407.07764
- Code: https://github.com/SJTU-DeepVisionLab/PosFormer



# Occupancy

**Fully Sparse 3D Occupancy Prediction**

- Paper: https://arxiv.org/abs/2312.17118
- Code: https://github.com/MCG-NJU/SparseOcc





# NeRF

**NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields**

- Project: https://nerf-mae.github.io/
- Paper: https://arxiv.org/pdf/2404.01300
- Code: https://github.com/zubair-irshad/NeRF-MAE 



# DETR





# Prompt



# Multimodal Large Language Models (MLLM)

**SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant**

- Paper: https://arxiv.org/abs/2403.11299
- Code: https://github.com/heliossun/SQ-LLaVA

**ControlCap: Controllable Region-level Captioning**

- Paper: https://arxiv.org/abs/2401.17910
- Code: https://github.com/callsys/ControlCap 



# Large Language Models (LLM)





# NAS



# ReID (Re-Identification)





# Diffusion Models

**ZIGMA: A DiT-style Zigzag Mamba Diffusion Model**

- Paper: https://arxiv.org/abs/2403.13802
- Code: https://taohu.me/zigma/

**Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation**

- Paper: https://arxiv.org/abs/2403.16394
- Code: https://github.com/zdxdsw/skewed_relations_T2I

**The Lottery Ticket Hypothesis in Denoising: Towards Semantic-Driven Initialization**

- Project: https://ut-mao.github.io/noise.github.io/
- Paper: https://arxiv.org/abs/2312.08872
- Code: https://github.com/UT-Mao/Initial-Noise-Construction



# Vision Transformer

**GiT: Towards Generalist Vision Transformer through Universal Language Interface**

- Paper: https://arxiv.org/abs/2403.09394
- Code: https://github.com/Haiyang-W/GiT



# Vision-Language

**GalLoP: Learning Global and Local Prompts for Vision-Language Models**

- Paper: https://arxiv.org/abs/2407.01400



# Object Detection

**Relation DETR: Exploring Explicit Position Relation Prior for Object Detection**

- Paper: https://arxiv.org/abs/2407.11699v1
- Code: https://github.com/xiuqhou/Relation-DETR
- Dataset: https://huggingface.co/datasets/xiuqhou/SA-Det-100k 

**GRA: Detecting Oriented Objects through Group-wise Rotating and Attention**

- Paper: https://arxiv.org/pdf/2403.11127
- Code: https://github.com/wangjiangshan0725/GRA

**Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector**

- Project: http://yuqianfu.com/CDFSOD-benchmark/
- Paper: https://arxiv.org/pdf/2402.03094
- Code: https://github.com/lovelyqian/CDFSOD-benchmark 



# Anomaly Detection





# Object Tracking







# Semantic Segmentation

**Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation**

- Paper: https://arxiv.org/abs/2405.06228
- Code: https://github.com/nizhenliang/CGRSeg



# Medical Image

**Brain-ID: Learning Contrast-agnostic Anatomical Representations for Brain Imaging**

- Paper: https://arxiv.org/abs/2311.16914
- Code: https://github.com/peirong26/Brain-ID 

**FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification**

- Project: https://ophai.hms.harvard.edu/datasets/harvard-fairdomain20k
- Paper: https://arxiv.org/abs/2407.08813
- Dataset: https://drive.google.com/drive/u/1/folders/1huH93JVeXMj9rK6p1OZRub868vv0UK0O
- Code: https://github.com/Harvard-Ophthalmology-AI-Lab/FairDomain



# Medical Image Segmentation

**ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image**

- Project: https://scribbleprompt.csail.mit.edu/
- Paper: https://arxiv.org/abs/2312.07381
- Code: https://github.com/halleewong/ScribblePrompt

**AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking**

- Paper: https://arxiv.org/abs/2407.06468
- Code: https://github.com/ricklisz/AnatoMask

**Representing Topological Self-Similarity Using Fractal Feature Maps for Accurate Segmentation of Tubular Structures**

- Paper: https://arxiv.org/abs/2407.14754
- Code: https://github.com/cbmi-group/FFM-Multi-Decoder-Network 



# Video Object Segmentation

**DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries**

- Project: https://zhang-tao-whu.github.io/projects/DVIS_DAQ/
- Paper: https://arxiv.org/abs/2404.00086
- Code: https://github.com/zhang-tao-whu/DVIS_Plus 



# Autonomous Driving

**Fully Sparse 3D Occupancy Prediction**

- Paper: https://arxiv.org/abs/2312.17118
- Code: https://github.com/MCG-NJU/SparseOcc

**milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing**

- Paper: https://arxiv.org/abs/2306.17010
- Code: https://github.com/Toytiny/milliFlow/

**4D Contrastive Superflows are Dense 3D Representation Learners**

- Paper: https://arxiv.org/abs/2407.06190
- Code: https://github.com/Xiangxu-0103/SuperFlow 



# 3D Point Cloud





# 3D Object Detection

**3D Small Object Detection with Dynamic Spatial Pruning**

- Project: https://xuxw98.github.io/DSPDet3D/
- Paper: https://arxiv.org/abs/2305.03716
- Code: https://github.com/xuxw98/DSPDet3D

**Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection**

- Paper: https://arxiv.org/abs/2402.03634
- Code: https://github.com/LiewFeng/RayDN 



# 3D Semantic Segmentation



# Image Editing







# Image Inpainting

**BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion**

- Project: https://tencentarc.github.io/BrushNet/
- Paper: https://arxiv.org/abs/2403.06976
- Code: https://github.com/TencentARC/BrushNet



# Video Editing





# Low-level Vision

**Restoring Images in Adverse Weather Conditions via Histogram Transformer**

- Paper: https://arxiv.org/abs/2407.10172
- Code: https://github.com/sunshangquan/Histoformer

**OneRestore: A Universal Restoration Framework for Composite Degradation**

- Project: https://gy65896.github.io/projects/ECCV2024_OneRestore
- Paper: https://arxiv.org/abs/2407.04621
- Code: https://github.com/gy65896/OneRestore 

# Super-Resolution





# Denoising

## Image Denoising



# 3D Human Pose Estimation





# Image Generation

**Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models**

- Paper: https://arxiv.org/abs/2404.07389
- Code: https://github.com/YasminZhang/EBAMA

**Every Pixel Has its Moments: Ultra-High-Resolution Unpaired Image-to-Image Translation via Dense Normalization**

- Project: https://kaminyou.com/Dense-Normalization/
- Paper: https://arxiv.org/abs/2407.04245
- Code: https://github.com/Kaminyou/Dense-Normalization 

**ZIGMA: A DiT-style Zigzag Mamba Diffusion Model**

- Paper: https://arxiv.org/abs/2403.13802
- Code: https://taohu.me/zigma/

**Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation**

- Paper: https://arxiv.org/abs/2403.16394
- Code: https://github.com/zdxdsw/skewed_relations_T2I 



# Video Generation

**VideoStudio: Generating Consistent-Content and Multi-Scene Videos**

- Project: https://vidstudio.github.io/
- Code: https://github.com/FuchenUSTC/VideoStudio 





# 3D Generation





# Video Understanding

**VideoMamba: State Space Model for Efficient Video Understanding**

- Paper: https://arxiv.org/abs/2403.06977
- Code: https://github.com/OpenGVLab/VideoMamba

**C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition**

- Paper: https://arxiv.org/abs/2407.06113
- Code: https://github.com/RongchangLi/ZSCAR_C2C



# Action Recognition

**SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders**

- Paper: https://arxiv.org/abs/2407.13460
- Code: https://github.com/pha123661/SA-DVAE 



# Knowledge Distillation



# Image Compression

**Image Compression for Machine and Human Vision With Spatial-Frequency Adaptation**

- Code: https://github.com/qingshi9974/ECCV2024-AdpatICMH
- Paper: http://arxiv.org/abs/2407.09853 



# Stereo Matching





# Scene Graph Generation





# Counting

**Zero-shot Object Counting with Good Exemplars**

- Paper: https://arxiv.org/abs/2407.04948
- Code: https://github.com/HopooLinZ/VA-Count 





# Video Quality Assessment



# Datasets



# Others

**Multi-branch Collaborative Learning Network for 3D Visual Grounding**

- Paper: https://arxiv.org/abs/2407.05363v2
- Code: https://github.com/qzp2018/MCLN 

**PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers**

- Code: https://github.com/ananthu-aniraj/pdiscoformer
- Paper: https://arxiv.org/abs/2407.04538

**SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments**

- Project: https://fraunhoferhhi.github.io/spvloc/ 
- Paper: https://arxiv.org/abs/2404.10527
- Code: https://github.com/fraunhoferhhi/spvloc

**REFRAME: Reflective Surface Real-Time Rendering for Mobile Devices**

- Project: https://xdimlab.github.io/REFRAME/
- Paper: https://arxiv.org/abs/2403.16481
- Code: https://github.com/MARVELOUSJI/REFRAME

Owner

Name: coqqer
Login: cowqer
Kind: user

Repositories: 1
Profile: https://github.com/cowqer

GitHub Events

Total

Push event: 1
Create event: 1

Last Year

Push event: 1
Create event: 1
