https://github.com/dimits-ts/large-text-nlp-survey
A survey paper exploring the use of state-of-the-art deep neural network architectures in NLP problems featuring very large documents.
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
✓Academic publication links
Links to: arxiv.org -
○Committers with academic emails
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (8.5%) to scientific vocabulary
Keywords
Repository
A survey paper exploring the use of state-of-the-art deep neural network architectures in NLP problems featuring very large documents.
Basic Info
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
Neural Natural Language Processing for Long Texts: A Survey of the State-of-the-Art
by Dimitrios Tsirmpas, Ioannis Gkionis, Ioannis Mademlis, Georgios Papadopoulos
This paper has been submitted in Engineering Applications of Artificial Intelligence.
Abstract
The adoption of Deep Neural Networks (DNNs) has greatly benefited Natural Language Processing (NLP) during the past decade. However, the demands of long document analysis are quite different from those of shorter texts, while the ever increasing size of documents uploaded on-line renders automated understanding of lengthy texts a critical issue. Relevant applications include automated Web mining, legal document review, medical records analysis, financial reports analysis, contract management, environmental impact assessment, news aggregation, etc. Despite the relatively recent development of efficient algorithms for analyzing long documents, practical tools in this field are currently flourishing. This article serves as an entry point into this dynamic domain and aims to achieve two objectives. Firstly, it provides an overview of the relevant neural building blocks, serving as a concise tutorial for the field. Secondly, it offers a brief examination of the current state-of-the-art in long document NLP, with a primary focus on two key tasks: document classification and document summarization. Sentiment analysis for long texts is also covered, since it is typically treated as a particular case of document classification. Consequently, this article presents an introductory exploration of document-level analysis, addressing the primary challenges, concerns, and existing solutions. Finally, the article presents publicly available annotated datasets that can facilitate further research in this area.
Owner
- Name: Dimitris Tsirmpas
- Login: dimits-ts
- Kind: user
- Repositories: 1
- Profile: https://github.com/dimits-ts
I like playing around with data and building stuff.
GitHub Events
Total
Last Year
Committers
Last synced: 9 months ago
Top Committers
| Name | Commits | |
|---|---|---|
| Dimitris Tsirbas | m****a@g****m | 5 |
| dimits-exe | 7****e | 1 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- dimits-ts (1)