https://github.com/apachecn-archive/planning-based-hierarchical-variational-model

Science Score: 23.0%

This score indicates how likely this project is to be science-related, based on the following indicators:

- CITATION.cff file: not found
- codemeta.json file: found
- .zenodo.json file: not found
- DOI references: not found
- Academic publication links: found (links to arxiv.org)
- Academic email domains: not found
- Institutional organization owner: not found
- JOSS paper metadata: not found
- Scientific vocabulary similarity: not met; low similarity (11.4%) to scientific vocabulary

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: apachecn-archive
Language: Python
Default Branch: master
Size: 18.6 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created about 3 years ago · Last pushed about 3 years ago

Metadata Files

Readme

Long and Diverse Text Generation with Planning-based Hierarchical Variational Model

Introduction

Existing neural methods for data-to-text generation are still struggling to produce long and diverse texts: they are insufficient to model input data dynamically during generation, to capture inter-sentence coherence, or to generate diversified expressions. To address these issues, we propose a Planning-based Hierarchical Variational Model (PHVM). Our model first plans a sequence of groups (each group is a subset of input items to be covered by a sentence) and then realizes each sentence conditioned on the planning result and the previously generated context, thereby decomposing long text generation into dependent sentence generation sub-tasks. To capture expression diversity, we devise a hierarchical latent structure where a global planning latent variable models the diversity of reasonable planning and a sequence of local latent variables controls sentence realization.

This project is a TensorFlow implementation of our work.
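
The plan-then-realize decomposition described in the introduction can be illustrated with a toy sketch. This is not the code in this repository and not the PHVM model itself: random choices stand in for the learned global and local latent variables, the planner simply partitions the items (a simplification of "a sequence of groups"), and every name here (`plan_groups`, `realize_sentence`, `generate`, `TEMPLATES`) is hypothetical.

```python
import random

# Hypothetical surface templates; choosing one stands in for a local latent variable.
TEMPLATES = ["the {attr} is {val}", "it has a {val} {attr}"]


def plan_groups(items, rng):
    """Toy planner: order the items and cut them into non-empty groups.

    Each group is the subset of input items that one sentence should cover.
    The random draws stand in for the global planning latent variable.
    """
    shuffled = list(items)
    rng.shuffle(shuffled)
    groups = []
    i = 0
    while i < len(shuffled):
        size = rng.randint(1, 2)  # one or two items per sentence
        groups.append(shuffled[i:i + size])
        i += size
    return groups


def realize_sentence(group, context, rng):
    """Toy realizer: write one sentence for `group`, given earlier sentences."""
    template = rng.choice(TEMPLATES)
    clauses = [template.format(attr=attr, val=val) for attr, val in group]
    # The only use of context here: later sentences get a connective.
    prefix = "Also, " if context else ""
    text = prefix + "; ".join(clauses) + "."
    return text[0].upper() + text[1:]


def generate(items, seed=0):
    """Plan first, then realize sentence by sentence, conditioning on the plan and context."""
    rng = random.Random(seed)
    groups = plan_groups(items, rng)
    sentences = []
    for group in groups:
        sentences.append(realize_sentence(group, sentences, rng))
    return groups, sentences


if __name__ == "__main__":
    demo_items = [("color", "red"), ("material", "cotton"), ("style", "casual")]
    plan, text = generate(demo_items, seed=1)
    print(plan)
    print(" ".join(text))
```

Changing `seed` changes both the plan and the wording, which loosely mirrors how sampling different latent variables is meant to yield diverse outputs for the same input.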

Requirements

- Python 3.6
- NumPy
- TensorFlow 1.4.0

Quick Start

Dataset

Our dataset contains 119K pairs of product specifications and the corresponding advertising text. For more information, please refer to our paper.

Preprocess

- Download the data from https://drive.google.com/open?id=1vB0fT1ex2Tsid-i5s-jqdz9QUFbCh0CO and unzip the file, which creates a new directory named data. The path to our dataset is ./data/data.jsonl.
- We provide most of the preprocessed data under ./data/processed/, except for the pre-trained word embeddings, which can be generated with the following command:

bash preprocess.sh
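
To check that the download unpacked correctly, the dataset file can be inspected with a few lines of Python. This is a minimal sketch that assumes ./data/data.jsonl is a standard JSON Lines file (one JSON object per line) encoded as UTF-8; the record fields are not documented here, so the sketch makes no assumption about them. The helper names `read_jsonl` and `summarize` are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def summarize(path, limit=3):
    """Return the total record count and the first `limit` records."""
    count = 0
    head = []
    for record in read_jsonl(path):
        if count < limit:
            head.append(record)
        count += 1
    return count, head


if __name__ == "__main__":
    dataset = Path("./data/data.jsonl")
    if dataset.exists():
        total, first = summarize(dataset)
        print("records:", total)
        for record in first:
            print(record)
    else:
        print("dataset not found at", dataset)
```

Reading line by line keeps memory use flat, which matters little at this size but avoids loading the whole file just to look at a few records.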

Train

./run.sh

Test

./test.sh

Citation

Our paper is available at https://arxiv.org/abs/1908.06605v2.

If you find the paper or the code helpful, please cite our paper.

Owner

Name: ApacheCN 归档
Login: apachecn-archive
Kind: organization
Email: wizard.z@qq.com

Repositories: 180
Profile: https://github.com/apachecn-archive

An archive set up to prevent important projects from being lost.
