# Comparing the performance of LSTM and GRU for Text Summarization using Pointer Generator Networks
Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.6%) to scientific vocabulary
Keywords
Repository
Basic Info
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
Comparing the performance of LSTM and GRU for Text Summarization using Pointer Generator Networks
This project compares the performance of LSTM and GRU in text summarization using Pointer Generator Networks, as described at https://github.com/abisee/pointer-generator
Based on the code at https://github.com/steph1793/PointerGeneratorSummarizer
Prerequisites
- Python 3.7+
- TensorFlow 2.8+ (any 2.x release should work, but only 2.8 has been tested)
- rouge 1.0.1 (`pip install rouge`)
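The rouge package computes the ROUGE scores reported during evaluation. As an illustration of what the ROUGE-1 metric measures (a simplified self-contained sketch, not the package's actual implementation):

```python
from collections import Counter

def rouge1_f1(hypothesis: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between a predicted and a reference summary."""
    hyp = Counter(hypothesis.split())
    ref = Counter(reference.split())
    overlap = sum((hyp & ref).values())  # each word counted at most min(hyp, ref) times
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat on the mat", "the cat is on the mat"))  # ≈ 0.8333
```

The installed rouge package additionally reports ROUGE-2 and ROUGE-L, but the precision/recall/F1 structure is the same.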
Dataset
We use the CNN/DailyMail dataset. The application reads data from files in TFRecord format.
The dataset can be created and processed following the instructions at https://github.com/abisee/cnn-dailymail
Alternatively, a pre-processed dataset can be downloaded from https://github.com/JafferWilson/Process-Data-of-CNN-DailyMail
Easiest of all, download the dataset zip file and extract it into the dataset directory (by default, "./dataset/").
Pre-Trained Models (Optional)
If you want to skip training and only evaluate the models, download the checkpoint zip file and extract it into the checkpoint directory (by default, "./checkpoint/").
Usage
Training and evaluation run on all four models by default. To train or evaluate a single model, pass its name via --model_name.
Training
~~~
python main.py --mode="train" --vocab_path="./dataset/vocab" --data_dir="./dataset/chunked_train"
~~~
Evaluation
~~~
python main.py --mode="eval" --vocab_path="./dataset/vocab" --data_dir="./dataset/chunked_val"
~~~
Parameters
Most of the parameters have defaults and can be omitted. The table below lists the parameters you can tweak, along with their defaults.
| Parameter | Default | Description |
| --- | --- | --- |
| max_enc_len | 512 | Maximum encoder input sequence length |
| max_dec_len | 128 | Maximum decoder input sequence length |
| max_dec_steps | 128 | Maximum number of words in the predicted abstract |
| min_dec_steps | 32 | Minimum number of words in the predicted abstract |
| batch_size | 4 | Batch size |
| beam_size | 4 | Beam size for beam search decoding (must equal batch size in decode mode) |
| vocab_size | 50000 | Vocabulary size |
| embed_size | 128 | Word embedding dimension |
| enc_units | 256 | Number of encoder LSTM/GRU cell units |
| dec_units | 256 | Number of decoder LSTM/GRU cell units |
| attn_units | 512 | Dimension of the feedforward result over [context vector, decoder state, decoder input], used to compute the attention weights |
| learning_rate | 0.15 | Learning rate |
| adagrad_init_acc | 0.1 | Adagrad optimizer initial accumulator value; see the Adagrad optimizer API documentation on the TensorFlow site |
| max_grad_norm | 0.8 | Gradient norm above which gradients are clipped |
| checkpoints_save_steps | 1000 | Save checkpoints every N steps |
| max_checkpoints | 10 | Maximum number of checkpoints to keep; older ones are deleted |
| max_steps | 50000 | Maximum number of training iterations |
| max_num_to_eval | 100 | Maximum number of examples to evaluate |
| checkpoint_dir | "./checkpoint" | Checkpoint directory |
| log_dir | "./log" | Directory in which to write logs |
| results_dir | None | Directory for the intermediate results (article, actual summary, and predicted summary) written during evaluation |
| data_dir | None | Data folder |
| vocab_path | None | Path to the vocab file |
| mode | None | Must be "train" or "eval" |
| model_name | None | Name of a specific model; if empty, all models are used |
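For context on how these pieces fit together: a pointer-generator network (per the abisee repository linked above) mixes the decoder's vocabulary distribution with the attention distribution over the source tokens, so source words can be copied into the summary. A minimal NumPy sketch of that final output distribution, with toy sizes and hypothetical names:

```python
import numpy as np

def final_distribution(vocab_dist, attn_dist, src_ids, p_gen):
    """Pointer-generator mix: P(w) = p_gen * P_vocab(w) + (1 - p_gen) * attn mass on w."""
    final = p_gen * vocab_dist
    # Scatter-add the copy probabilities onto the vocab ids of the source tokens.
    np.add.at(final, src_ids, (1.0 - p_gen) * attn_dist)
    return final

vocab_dist = np.array([0.1, 0.2, 0.3, 0.4])  # toy 4-word vocabulary distribution
attn_dist = np.array([0.5, 0.5])             # attention over 2 source tokens
src_ids = np.array([2, 3])                   # vocab ids of those source tokens
p = final_distribution(vocab_dist, attn_dist, src_ids, p_gen=0.8)
print(p)          # ≈ [0.08 0.16 0.34 0.42]
print(p.sum())    # still a valid distribution, summing to 1
```

In the real model p_gen is itself a learned sigmoid of the context vector, decoder state, and decoder input (the same triple that attn_units parameterizes), and the scatter-add is what lets out-of-vocabulary source words receive probability mass.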
Owner
- Name: Sajith Madhavan
- Login: sajithm
- Kind: user
- Location: Bangalore, India
- Repositories: 2
- Profile: https://github.com/sajithm
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: >-
  Comparing the performance of LSTM and GRU for Text
  Summarization using Pointer Generator Networks
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Sajith
    email: sajithm@msn.com
    family-names: Madhavan
    orcid: 'https://orcid.org/0000-0002-8227-6144'
repository-code: >-
  https://github.com/sajithm/text_summarization_lstm_gru
license: Apache-2.0