https://github.com/chris10m/rfb-text-detection

A Dense Text Detection model using Receptive Field Blocks

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: arxiv.org)
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 11.6%, to scientific vocabulary)

Keywords

cnn-keras east east-detection keras keras-models rfb-text text text-detect text-detecting text-detection text-detector

Last synced: 5 months ago

Repository

Basic Info

Host: GitHub
Owner: Chris10M
License: MIT
Language: C++
Default Branch: master
Size: 20.2 MB

Statistics

Stars: 31
Watchers: 5
Forks: 11
Open Issues: 9
Releases: 0

Created almost 7 years ago · Last pushed about 3 years ago

Metadata Files

Readme License

Receptive Field Blocks Text Detection Module (RFBTD Module)

A Dense Text Detection model using Receptive Field Blocks

Introduction

This repo contains a text detection model based on Receptive Field Blocks. The model provides a dense receptive field for predicting text boxes in dense natural scene images such as documents and articles.

The model is also inspired by EAST: An Efficient and Accurate Scene Text Detector, from which the RRBOX part and the loss function are taken.
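For reference, the geometry loss described in the EAST paper for rotated boxes combines an IoU loss over the four edge distances with a cosine loss on the rotation angle. The sketch below evaluates it for a single pixel in plain Python. The function names are hypothetical, and the angle weight of 10 is the value reported in the EAST paper; whether this repo uses the same weighting is an assumption.

```python
import math


def aabb_iou(pred, gt):
    """IoU of two axis-aligned boxes, each given as (top, right, bottom, left)
    distances from the same pixel to the box edges. Distances must be > 0."""
    p_top, p_right, p_bottom, p_left = pred
    g_top, g_right, g_bottom, g_left = gt
    area_pred = (p_top + p_bottom) * (p_right + p_left)
    area_gt = (g_top + g_bottom) * (g_right + g_left)
    # Both boxes contain the pixel, so the overlap is the per-side minimum.
    inter_h = min(p_top, g_top) + min(p_bottom, g_bottom)
    inter_w = min(p_right, g_right) + min(p_left, g_left)
    inter = inter_h * inter_w
    union = area_pred + area_gt - inter
    return inter / union


def rbox_geometry_loss(pred, gt, theta_pred, theta_gt, lambda_theta=10.0):
    """Geometry loss = -log(IoU) + lambda_theta * (1 - cos(angle error)).
    lambda_theta=10 follows the EAST paper (assumed, not read from this repo)."""
    loss_aabb = -math.log(aabb_iou(pred, gt))
    loss_theta = 1.0 - math.cos(theta_pred - theta_gt)
    return loss_aabb + lambda_theta * loss_theta
```

A perfect prediction gives a loss of zero, and the IoU term is invariant to box scale, which is why the EAST authors prefer it over a plain L1 or L2 distance for text of varying size.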

The features of the model are summarized below:
+ Keras implementation for lucid and clean code.
+ Backbone: ResNet50
+ Inference time for 720p images:

**GPU VERSION**
+ Graphics Card: MX130
+ Inference Time: 700ms
+ Batch Size: 1

**CPU VERSION**
+ CPU: Intel i7-8550U CPU @ 1.80GHz
+ Inference Time: 1750ms
+ Batch Size: 1
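
One way such per-image latencies could be measured is sketched below. This is an illustration, not the procedure used for the numbers above: `predict` stands for any hypothetical callable that runs the model on one image, and the warm-up and run counts are arbitrary.

```python
import time


def mean_latency_ms(predict, image, warmup=2, runs=10):
    """Average wall-clock time of predict(image) in milliseconds.
    Warm-up calls are excluded so one-time setup cost is not counted."""
    for _ in range(warmup):
        predict(image)
    start = time.perf_counter()
    for _ in range(runs):
        predict(image)
    return (time.perf_counter() - start) * 1000.0 / runs


def throughput_fps(latency_ms, batch_size=1):
    """Images per second implied by a per-batch latency."""
    return batch_size * 1000.0 / latency_ms
```

At batch size 1, 700 ms per image corresponds to roughly 1.4 images per second, and 1750 ms to roughly 0.6.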

The provided pre-trained model achieves an F1-score of 47.09 (single crop, resize only) on ICDAR 2015, although it was not trained on ICDAR 2015. To improve accuracy, fine-tune on ICDAR 2015 and predict with multiple crops.
The model is tuned for predicting text boxes in natural scene documents such as bank statements, forms, and receipts, with the eventual goal of running OCR on these text boxes.
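
As a reminder of what the F1-score measures, the sketch below computes it from detection counts. The helper names and the counts in the usage note are hypothetical and are not results from this model.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_f1(true_positives, num_predicted, num_ground_truth):
    """F1 from counts: matched boxes, predicted boxes, ground-truth boxes."""
    precision = true_positives / num_predicted if num_predicted else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return f1_score(precision, recall)
```

For example, 50 matched boxes out of 100 predictions and 125 ground-truth boxes give a precision of 0.5, a recall of 0.4, and an F1 of about 0.444.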

If you find this useful, please cite these papers.

Contents

Live Demo
Installation
Download Pre-Trained Model
Demo
Eval
Train
Examples

LiveDemo

The model is hosted on my server, so if you want to try it out live, click here. The hosted model resizes input images to 480p, because the full-resolution model does not fit on my single-core, 512 MB RAM server. Accuracy will suffer because of the small input size.
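
To illustrate the effect of the downscaling, the sketch below computes the resized dimensions and maps a detected point back to the original image. It assumes "480p" means a target height of 480 pixels with the aspect ratio preserved; the function names are hypothetical and the server may resize differently.

```python
def resize_to_height(width, height, target_height=480):
    """Return (new_width, new_height, scale) for an aspect-preserving resize."""
    scale = target_height / height
    new_width = max(1, round(width * scale))
    return new_width, target_height, scale


def point_to_original(x, y, scale):
    """Map a point found in the resized image back to original coordinates."""
    return x / scale, y / scale
```

A 1280x720 image becomes 853x480, so text that was 12 pixels tall shrinks to 8 pixels, which is one reason small text is harder to detect at this resolution.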

Installation

Requirements from requirements.txt

tensorflow==1.13.1

Keras==2.2.4

numpy==1.16.2

opencv_contrib_python==4.0.0.21

plumbum==1.6.7
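
Because every dependency is pinned to an exact version, it can help to check an environment against the pins before running the model. The sketch below parses the pins and reports mismatches. The helper names are hypothetical, and the version lookup is passed in as a callable so that the example does not depend on any particular packaging API.

```python
PINS = """\
tensorflow==1.13.1
Keras==2.2.4
numpy==1.16.2
opencv_contrib_python==4.0.0.21
plumbum==1.6.7
"""


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins, installed_version):
    """Return {name: (wanted, found)} for every pin that is not satisfied.
    installed_version(name) must return a version string, or None if missing."""
    mismatches = {}
    for name, wanted in pins.items():
        found = installed_version(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

The lookup could wrap `pkg_resources.get_distribution(name).version` from setuptools or, on Python 3.8 and later, `importlib.metadata.version(name)`, returning `None` when the package is not installed.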

Download

The Pre-trained Model is available at this Link GoogleDrive

Demo

If you've downloaded the pre-trained model, run python3 run_demo.py. Input images are read from testimages/inputimages and the predictions are written to testimages/predictedimages.

Eval

If you want to benchmark the model on ICDAR 2015, run python eval.py. Input images are read from evalimages/evaluationimages and the predicted boxes are written as text files to evalimages/predictedboxes. The output conforms to the ICDAR text detection challenge format.
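
For orientation, the ICDAR 2015 text localization task expects one result file per image, named `res_` plus the image name, with one comma-separated line of eight corner coordinates per detected box. The sketch below formats boxes in that style. The helper names are hypothetical and the exact output of eval.py is not shown here, so treat this as an assumption about the format rather than a description of the script.

```python
def format_icdar_line(quad):
    """Format four (x, y) corners, in clockwise order, as 'x1,y1,...,x4,y4'."""
    if len(quad) != 4:
        raise ValueError("expected exactly four corner points")
    return ",".join(str(int(round(v))) for point in quad for v in point)


def result_filename(image_name):
    """Map an image file name such as 'img_1.jpg' to 'res_img_1.txt'."""
    stem = image_name.rsplit(".", 1)[0]
    return "res_{}.txt".format(stem)
```

Coordinates are rounded to integers here because the challenge ground truth uses integer pixel positions.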

Train

The training code has not been uploaded yet, but I will post it in the upcoming days.

Examples

Here are some examples. The images on the left are the source images, and the images on the right are the same images overlaid with the predicted bounding boxes.

Issues

If you encounter any issues, please open an issue on the issue tracker.

Model Architecture

For more insight into the RF Block, see the model architecture.
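
As background, a Receptive Field Block generally runs several parallel branches that pair a convolution of a given kernel size with a dilated convolution, so that each branch sees a different amount of context. The sketch below computes the receptive field of such branches. The branch configuration is illustrative, following the general RFB pattern, and is not read from this repo's architecture.

```python
def effective_kernel(kernel, dilation):
    """Span of a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def branch_receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.
    layers is a list of (kernel, dilation) pairs, applied in order."""
    rf = 1
    for kernel, dilation in layers:
        rf += effective_kernel(kernel, dilation) - 1
    return rf


# Illustrative branches (assumed): a k x k convolution followed by
# a 3x3 convolution whose dilation equals k.
BRANCHES = {
    "branch_1": [(1, 1), (3, 1)],
    "branch_3": [(3, 1), (3, 3)],
    "branch_5": [(5, 1), (3, 5)],
}
```

With this configuration the three branches cover 3, 9, and 15 pixels respectively, and concatenating them gives each output location a mix of fine and wide context.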

A huge shout-out to argman/EAST for providing the C++ NMS.
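
To show what non-maximum suppression does, here is a deliberately simplified sketch: greedy NMS over axis-aligned boxes in plain Python. It is not the C++ code from argman/EAST, which implements locality-aware NMS over rotated quadrilaterals; the function names and the 0.5 threshold are assumptions chosen for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first.
    A box is dropped if it overlaps an already kept box above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Given boxes `(0, 0, 10, 10)`, `(1, 1, 11, 11)`, and `(20, 20, 30, 30)` with scores 0.9, 0.8, and 0.7, the second box overlaps the first with an IoU of about 0.68 and is suppressed, leaving indices 0 and 2.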

Owner

Name: Christen Millerdurai
Login: Chris10M
Kind: user

Repositories: 25
Profile: https://github.com/Chris10M

PhD & Researcher @ AV DFKI-Kaiserslautern.

Dependencies

requirements.txt pypi

Keras ==2.2.4
numpy ==1.16.2
opencv_contrib_python ==4.0.0.21
plumbum ==1.6.7
tensorflow ==1.13.1
