https://github.com/centre-for-humanities-computing/literary_evocation
Contains data for the Fiction4 corpus and for our experiment on literary sentiment evocation
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (4.9%) to scientific vocabulary
Repository
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Fiction4 sentiment evocation
Data and code for studying the influence of textual features on human sentiment perception in literary texts
🔬 Data
| | No. texts | No. annotations | No. words | Period |
|-------------|-----|------|--------|------------|
| Fairy tales | 3 | 772 | 18,597 | 1837-1847 |
| Hymns | 65 | 2,026 | 12,798 | 1798-1873 |
| Prose | 1 | 1,923 | 30,279 | 1952 |
| Poetry | 40 | 1,579 | 11,576 | 1965 |
We present the Fiction4 corpus of literary texts, spanning 109 individual texts across 4 genres and two languages (English and Danish) in the 19th and 20th centuries. The corpus has 3 main authors: Sylvia Plath for poetry, Ernest Hemingway for prose, and H.C. Andersen for fairy tales. The hymns are a heterogeneous collection from official Danish church hymnbooks from 1798-1873. The corpus was annotated for valence at the sentence level, with at least 2 annotators per sentence.
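As a quick consistency check, the per-genre figures in the table can be summed to confirm the stated total of 109 texts. This is a minimal sketch; the dictionary and function names are illustrative and are not taken from the repository.

```python
# Per-genre figures copied from the table above: (texts, annotations, words).
FICTION4 = {
    "Fairy tales": (3, 772, 18_597),
    "Hymns": (65, 2_026, 12_798),
    "Prose": (1, 1_923, 30_279),
    "Poetry": (40, 1_579, 11_576),
}


def corpus_totals(table):
    """Sum texts, annotations, and words over all genres."""
    texts = sum(row[0] for row in table.values())
    annotations = sum(row[1] for row in table.values())
    words = sum(row[2] for row in table.values())
    return texts, annotations, words


texts, annotations, words = corpus_totals(FICTION4)
print(f"texts={texts}, annotations={annotations}, words={words}")
# prints: texts=109, annotations=6300, words=73250
```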
The full Fiction4 corpus data is in \data\fiction4_data.json
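One way to take a first look at the data file is sketched below. The README does not document the JSON schema, so the sketch only reports the top-level shape rather than assuming any field names. The helper names `load_corpus` and `describe` are hypothetical, and the path assumes the script is run from the repository root.

```python
import json
from pathlib import Path


def load_corpus(path):
    """Read a JSON file and return the parsed object."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def describe(obj):
    """Summarise a parsed JSON object without assuming its schema."""
    if isinstance(obj, dict):
        return f"dict with {len(obj)} keys: {sorted(obj)[:5]}"
    if isinstance(obj, list):
        return f"list with {len(obj)} items"
    return type(obj).__name__


if __name__ == "__main__":
    # Assumption: the file sits under data/ relative to the repository root.
    corpus_path = Path("data") / "fiction4_data.json"
    if corpus_path.exists():
        print(describe(load_corpus(corpus_path)))
    else:
        print(f"{corpus_path} not found; run this from the repository root")
```

Building the path with `pathlib` avoids depending on a platform-specific separator.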
We compare this fiction corpus against nonfiction texts (across genres).
The nonliterary corpora considered are:
1. EmoBank (from this paper: https://aclanthology.org/E17-2092/; repo here): multigenre sentences (n=10,062; range=1 to 674 tokens; mean length=87.8 tokens).
2. Facebook posts (from this paper: https://aclanthology.org/W16-0404.pdf; repo here): posts that can span multiple sentences (n=2,895; range=2 to 445 tokens; mean length=86.7 tokens).
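Summary figures of this kind (n, token range, mean length) can be computed along the following lines. This is a sketch only: it assumes whitespace tokenization, which may differ from the tokenizer used to produce the numbers above, and the sample texts are invented for illustration.

```python
def length_stats(texts):
    """Return count, (min, max) token range, and mean token length.

    Assumption: tokens are whitespace-separated; the study may have
    used a different tokenizer.
    """
    if not texts:
        raise ValueError("texts must not be empty")
    lengths = [len(text.split()) for text in texts]
    return {
        "n": len(lengths),
        "range": (min(lengths), max(lengths)),
        "mean_length": round(sum(lengths) / len(lengths), 1),
    }


# Hypothetical sample, not taken from EmoBank or the Facebook corpus.
sample = ["What a lovely day", "No", "I am not sure about this one"]
print(length_stats(sample))
```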
💻 Code
All code for our study on human/model sentiment perception across these corpora is available in this repository; see primarily feature extraction (get_features.py) and analysis (analysis.py).
Annotator agreement calculation for each subcategory of the Fiction4 corpus is in /annotation/annotator_agreement.py
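For orientation, a per-genre agreement computation might look like the sketch below. The README does not say which agreement metric annotator_agreement.py uses, so Pearson correlation between two annotators is used here purely as a stand-in. The function names, the valence scale, and the scores are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def agreement_by_genre(scores):
    """Map each genre to the correlation between its two annotators."""
    return {genre: pearson(a, b) for genre, (a, b) in scores.items()}


# Hypothetical valence scores for two annotators, one pair of lists per genre.
example = {
    "Hymns": ([5, 7, 3, 8], [6, 7, 2, 9]),
    "Poetry": ([2, 4, 4, 1], [3, 5, 3, 1]),
}
for genre, r in agreement_by_genre(example).items():
    print(f"{genre}: r = {r:.2f}")
```

With more than two annotators per sentence, a chance-corrected measure such as Krippendorff's alpha would be a common alternative.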
Owner
- Name: Center for Humanities Computing Aarhus
- Login: centre-for-humanities-computing
- Kind: organization
- Email: chcaa@cas.au.dk
- Location: Aarhus, Denmark
- Website: https://chc.au.dk/
- Repositories: 130
- Profile: https://github.com/centre-for-humanities-computing
Issues and Pull Requests
Last synced: 12 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Dependencies
- importlib *
- json *
- matplotlib *
- nltk *
- numpy *
- os *
- pandas *
- plotly *
- scikit-learn *
- scipy *
- seaborn *
- transformers *