Updated 6 months ago
pyramidtabnet
Official PyTorch implementation of PyramidTabNet: Transformer-based Table Recognition in Image-based Documents
Updated 6 months ago
most-different-text-selection
Use embedding data from LLMs to determine the most different text in a given corpus.
Updated 5 months ago
https://github.com/bytedance/dolphin
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.