align

Python library for extracting quantitative, reproducible metrics of multi-level alignment between speakers in naturalistic language corpora.

https://github.com/nickduran/align-linguistic-alignment

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
✓ DOI references: Found 12 DOI reference(s) in README
○ Academic publication links
✓ Committers with academic emails: 3 of 12 committers (25.0%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (12.0%) to scientific vocabulary

Keywords

conversation-analysis corpus-tools linguistic-alignment linguistic-analysis ngram-analysis nltk notebooks python text-analysis word2vec

Keywords from Contributors

autograding interactive serializer emulation packaging network-simulation shellcodes hacking gridding observability

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: nickduran
License: mit
Language: Python
Default Branch: master
Homepage:
Size: 54.8 MB

Statistics

Stars: 52
Watchers: 3
Forks: 16
Open Issues: 18
Releases: 8

Topics

conversation-analysis corpus-tools linguistic-alignment linguistic-analysis ngram-analysis nltk notebooks python text-analysis word2vec

Created over 8 years ago · Last pushed about 1 year ago

Metadata Files

Readme License

ALIGN, a computational tool for multi-level language analysis (optimized for Python 3.10)

align is a Python library for extracting quantitative, reproducible metrics of multi-level alignment between two speakers in naturalistic language corpora. The method was introduced in "ALIGN: Analyzing Linguistic Interactions with Generalizable techNiques" (Duran, Paxton, & Fusaroli, 2019; Psychological Methods).

Examples of papers relying on the ALIGN library:

Duran, N. D., Paige, A., & D'Mello, S. K. (2024). Multi‐Level Linguistic Alignment in a Dynamic Collaborative Problem‐Solving Task. Cognitive Science, 48(1), e13398. https://doi.org/10.1111/cogs.13398
Dideriksen, C., Christiansen, M. H., Tylén, K., Dingemanse, M., & Fusaroli, R. (2023). Quantifying the interplay of conversational devices in building mutual understanding. Journal of Experimental Psychology: General, 152(3), 864. Pre-print: https://doi.org/10.31234/osf.io/a5r74
Dideriksen, C., Christiansen, M. H., Dingemanse, M., Højmark‐Bertelsen, M., Johansson, C., Tylén, K., & Fusaroli, R. (2023). Language‐Specific Constraints on Conversation: Evidence from Danish and Norwegian. Cognitive Science, 47(11), e13387. Pre-print: https://doi.org/10.31234/osf.io/t3s6c.
Fusaroli, R., Weed, E., Rocca, R., Fein, D., & Naigles, L. (2023). Caregiver linguistic alignment to autistic and typically developing children: A natural language processing approach illuminates the interactive components of language development. Cognition, 236, 105422. Pre-print: https://doi.org/10.31234/osf.io/ysjec
Fusaroli, R., Weed, E., Rocca, R., Fein, D., & Naigles, L. (2023). Repeat After Me? Both Children With and Without Autism Commonly Align Their Language With That of Their Caregivers. Cognitive Science, 47(11), e13369. Pre-print: https://doi.org/10.31234/osf.io/m8fhk
Tylén, K., Fusaroli, R., Østergaard, S. M., Smith, P., & Arnoldi, J. (2023). The Social Route to Abstraction: Interaction and Diversity Enhance Performance and Transfer in a Rule‐Based Categorization Task. Cognitive Science, 47(9), e13338.
Trujillo, J. P., Dideriksen, C., Tylén, K., Christiansen, M. H., & Fusaroli, R. (2023). The dynamic interplay of kinetic and linguistic coordination in Danish and Norwegian conversation. Cognitive Science, 47(6), e13298.

Installation

align may be downloaded directly using pip.

To download the stable version released on PyPI:

pip install align

Or to update:

pip install align --upgrade

Because align has several dependencies (see requirements.txt), it is good practice to install it in a virtual environment.

Anaconda users: The above should work in the vast majority of cases. However, if you prefer an easy way to install align within a virtual environment in one go, or you are experiencing problems with trying to update align, a YAML file has been provided to streamline things. Just follow these simple steps:

Download the environment.yml file and navigate to the folder where it has been downloaded

Run the following command in Terminal: conda env create -f environment.yml

Be sure to activate the new environment (i.e., conda activate align0.1.1) before running any align analyses (such as the tutorials; see below)

If you experience any problems, please report them in the "Issues" section of this repository.
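Before running an analysis, it can help to confirm which interpreter and environment you are in and whether align is visible from it. The sketch below uses only the Python standard library; `describe_environment` is a hypothetical helper written for this illustration and is not part of align.

```python
import os
import sys
from importlib import metadata


def describe_environment(package="align"):
    """Summarize the current interpreter and whether `package` is installed in it."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None  # the package is not installed for this interpreter
    return {
        "python": ".".join(str(n) for n in sys.version_info[:3]),
        # True inside a venv/virtualenv; conda environments do not set this.
        "in_venv": sys.prefix != sys.base_prefix,
        # conda exports this variable while an environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "package_version": installed,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `package_version` is `None`, or `conda_env` is not the environment you created from environment.yml, the most likely cause is that the environment was not activated in the current shell.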

Quick documentation

ALIGN consists of two primary modules for conducting analyses, prepare_transcripts and calculate_alignment. For a quick overview of the functions contained in each module, see the following:

prepare_transcripts: https://nickduran.github.io/align-linguistic-alignment/prepare_transcripts.html
calculate_alignment: https://nickduran.github.io/align-linguistic-alignment/calculate_alignment.html
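For readers new to the idea, here is a toy sketch of one kind of turn-by-turn measure: the cosine similarity between the n-gram counts of adjacent turns. It does not call align and is not the library's implementation; `ngrams` and `ngram_cosine` are hypothetical helpers, the example conversation is made up, and tokenization is a plain whitespace split. See the module documentation linked above for what align actually computes.

```python
from collections import Counter
from math import sqrt


def ngrams(tokens, n):
    """Return the n-grams (as tuples) found in a sequence of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_cosine(turn_a, turn_b, n=2):
    """Cosine similarity between the n-gram count vectors of two turns."""
    counts_a = Counter(ngrams(turn_a.lower().split(), n))
    counts_b = Counter(ngrams(turn_b.lower().split(), n))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[g] * counts_b[g] for g in shared)
    norm_a = sqrt(sum(c * c for c in counts_a.values()))
    norm_b = sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a turn too short to contain any n-gram
    return dot / (norm_a * norm_b)


# Made-up three-turn exchange, used only to exercise the helper.
conversation = [
    ("parent", "do you want the red ball"),
    ("child", "want the red ball now"),
    ("parent", "okay here it is"),
]

# Score each turn against the turn that immediately follows it.
for (spk_a, text_a), (spk_b, text_b) in zip(conversation, conversation[1:]):
    score = ngram_cosine(text_a, text_b, n=2)
    print(f"{spk_a} -> {spk_b}: {score:.2f}")
```

The same function could be applied to sequences of part-of-speech tags instead of words to get a rough syntactic analogue.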

Additional tools required for some `align` options

The Google News pre-trained word2vec vectors (GoogleNews-vectors-negative300.bin) and the Stanford part-of-speech tagger (stanford-postagger-full-2020-11-17) are required for some optional align parameters but must be downloaded separately. Please see the tutorials for more information.

Google News: https://code.google.com/archive/p/word2vec/ (page) or https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing (direct download)
Stanford POS tagger: https://nlp.stanford.edu/software/tagger.shtml#Download (page) or https://nlp.stanford.edu/software/stanford-tagger-4.2.0.zip (direct download)
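Once the Google News file has been downloaded and unpacked, it can be sanity-checked outside of align with gensim, which appears in the dependency list. The sketch below is an illustration only: `load_google_news_vectors` is a hypothetical helper, the file path is an assumption about where you saved the download, and the tutorials remain the reference for how align itself expects the file to be supplied.

```python
from pathlib import Path


def load_google_news_vectors(path, limit=None):
    """Load pre-trained word2vec vectors from a local .bin file using gensim."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"No word2vec file at {path}; download it first.")
    # Imported lazily so the path check works even if gensim is not installed.
    from gensim.models import KeyedVectors
    # binary=True because the Google News file uses word2vec's binary format;
    # limit=N reads only the first N vectors, which reduces memory use.
    return KeyedVectors.load_word2vec_format(str(path), binary=True, limit=limit)


# Example usage (assumed local path; adjust to wherever the file was saved):
# vectors = load_google_news_vectors("GoogleNews-vectors-negative300.bin", limit=200_000)
# print(vectors.similarity("dog", "puppy"))
```

Passing a `limit` is a convenient way to check that the file loads without reading the full set of vectors into memory.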

Tutorials

We created Jupyter Notebook tutorials to provide an accessible step-by-step walkthrough of how to use align. Below are descriptions of the current tutorials, which can be found in the examples directory within this repository. If you are unfamiliar with Jupyter Notebooks, instructions for installing and running them can be found here: http://jupyter.org/install. We recommend installing Jupyter using Anaconda, a widely used Python data science platform that helps streamline workflows.

Jupyter Notebook 1: CHILDES
- This tutorial walks users through an analysis of conversations from a single English corpus from the CHILDES database (MacWhinney, 2000): Kuczaj’s Abe corpus (Kuczaj, 1976). We analyze the last 20 conversations in the corpus to explore how ALIGN can be used to track multi-level linguistic alignment between a parent and child over time, which may be of interest to developmental language researchers. Specifically, we explore how alignment between a parent and a child changes over a brief span of development.
Jupyter Notebook 2: Devil's Advocate
- This tutorial walks users through the analysis reported in Duran, Paxton, & Fusaroli (2019). The corpus consists of 94 written transcripts of conversations, lasting eight minutes each, collected from an experimental study of truthful and deceptive communication. The goal of the study was to examine interpersonal linguistic alignment between dyads across two conversations where participants either agreed or disagreed with each other (as a randomly assigned between-dyads condition) and where one of the conversations involved the truth and the other deception (as a within-subjects condition).

We are in the process of adding more tutorials and would welcome additional tutorials by interested contributors.

Attribution

If you find the package useful, please cite our manuscript:

Duran, N., Paxton, A., & Fusaroli, R. (2019). ALIGN: Analyzing Linguistic Interactions with Generalizable techNiques. Psychological Methods. http://dynamicog.org/papers/

Licensing of example data

CHILDES
- Example corpus "Kuczaj Corpus" by Stan Kuczaj is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License (https://childes.talkbank.org/access/Eng-NA/Kuczaj.html):

Kuczaj, S. (1977). The acquisition of regular and irregular past tense forms. Journal of Verbal Learning and Verbal Behavior, 16, 589–600.

Devil's Advocate
- The complete de-identified dataset of raw conversational transcripts is hosted on a secure protected-access repository provided by the Inter-university Consortium for Political and Social Research (ICPSR). Please click on the link to access: http://dx.doi.org/10.3886/ICPSR37124.v1. Due to the requirements of our IRB, please note that users interested in obtaining these data must complete a Restricted Data Use Agreement, specify the reason for the request, and obtain IRB approval or notice of exemption for their research.

Duran, Nicholas, Alexandra Paxton, and Riccardo Fusaroli. Conversational Transcripts of Truthful and Deceptive Speech Involving Controversial Topics, Central California, 2012. ICPSR37124-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2018-08-29.

Owner

Name: Nicholas Duran
Login: nickduran
Kind: user
Location: Glendale, AZ
Company: Arizona State University

Website: dynamicog.org
Repositories: 5
Profile: https://github.com/nickduran

Nicholas Duran is an associate professor in the Social and Behavioral Sciences division of the New College of Interdisciplinary Arts and Sciences at ASU

GitHub Events

Total

Issues event: 2
Watch event: 12
Issue comment event: 2
Member event: 1
Push event: 9
Pull request event: 2
Fork event: 2
Create event: 1

Last Year

Issues event: 2
Watch event: 12
Issue comment event: 2
Member event: 1
Push event: 9
Pull request event: 2
Fork event: 2
Create event: 1

Committers

Last synced: over 2 years ago

All Time

Total Commits: 446
Total Committers: 12
Avg Commits per committer: 37.167
Development Distribution Score (DDS): 0.617

Past Year

Commits: 1
Committers: 1
Avg Commits per committer: 1.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Alexandra Paxton	p**a@g**m	171
nickduran	n**1@g**m	132
Nick Duran	n**4@a**u	60
Nicholas Duran	n**n@a**u	44
Nick Duran	n**n@n**u	13
Alexandra Paxton	a****n	10
yuvipanda	y**a@g**m	8
Nick Duran	n**n@N**l	3
Saul Kohn	s**n@S**l	2
Ludvig Renbo Olsen	m**l@l**k	1
dependabot[bot]	4****]	1
Riccardo Fusaroli	f**i@g**m	1
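The All Time figures can be reproduced from the commit counts in the table above. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; this page does not state that definition, but the assumption matches the reported value.

```python
# Commit counts per committer identity, copied from the Top Committers table.
commits = [171, 132, 60, 44, 13, 10, 8, 3, 2, 1, 1, 1]

total = sum(commits)
average = total / len(commits)
# Assumed definition: 1 minus the share of commits made by the top committer.
dds = 1 - max(commits) / total

print(total, round(average, 3), round(dds, 3))
```

Under this assumption the script gives 446 total commits, an average of 37.167 per committer, and a DDS of 0.617, in line with the values listed under All Time.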

Committer Domains (Top 20 + Academic)

asu.edu: 2 ludvigolsen.dk: 1 nc-4175056.3500.dhcp.asu.edu: 1

Issues and Pull Requests

Last synced: almost 2 years ago

All Time

Total issues: 23
Total pull requests: 40
Average time to close issues: 3 months
Average time to close pull requests: 20 days
Total issue authors: 9
Total pull request authors: 6
Average comments per issue: 0.91
Average comments per pull request: 0.63
Merged pull requests: 33
Bot issues: 0
Bot pull requests: 6

Past Year

Issues: 0
Pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: less than a minute
Issue authors: 0
Pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

fusaroli (9)
nickduran (4)
AdrianaChieng (3)
a-paxton (2)
douggetty (1)
jseale-asapp (1)
pavelgold (1)
akhilraheja (1)
katrinsgr (1)
LudvigOlsen (1)

Pull Request Authors

nickduran (24)
a-paxton (7)
dependabot[bot] (6)
yuvipanda (2)
LudvigOlsen (1)
SaulAryehKohn (1)

Top Labels

Issue Labels

enhancement (11) bug (3) help wanted (1) dependencies (1) documentation (1)

Pull Request Labels

dependencies (6)

Packages

Total packages: 1
Total downloads:
- pypi 562 last-month

Total dependent packages: 0
Total dependent repositories: 8
Total versions: 10
Total maintainers: 2

pypi.org: align

Analyzing Linguistic Interaction with Generalizable techNiques. Read the latest ALIGN tutorials.

Homepage: https://github.com/nickduran/align-linguistic-alignment
Documentation: https://align.readthedocs.io/
License: LICENSE
Latest release: 0.1.1
published almost 4 years ago

Versions: 10
Dependent Packages: 0
Dependent Repositories: 8
Downloads: 562 Last month
Docker Downloads: 0

Rankings

Docker downloads count: 3.9%

Dependent repos count: 5.2%

Average: 9.3%

Dependent packages count: 10.0%

Forks count: 10.5%

Stargazers count: 10.8%

Downloads: 15.1%

Maintainers (2)

a-paxton nickd3ps

Last synced: 9 months ago

Dependencies

requirements.txt pypi

alabaster =0.7.12=pyhd3eb1b0_0
align =0.1.0=dev_0
appnope =0.1.2=py310hecd8cb5_1001
asttokens =2.0.5=pyhd3eb1b0_0
babel =2.9.1=pyhd3eb1b0_0
backcall =0.2.0=pyhd3eb1b0_0
blas =1.0=mkl
bleach =4.1.0=pyhd3eb1b0_0
bottleneck =1.3.4=py310h4e76f89_0
brotlipy =0.7.0=py310hca72f7f_1002
build =0.7.0=pyhd8ed1ab_0
bzip2 =1.0.8=h1de35cc_0
ca-certificates =2022.6.15=h033912b_0
certifi =2022.6.15=py310h2ec42d9_0
cffi =1.15.0=py310hc55c11b_1
charset-normalizer =2.0.4=pyhd3eb1b0_0
check-manifest =0.48=pyhd8ed1ab_0
click =8.0.4=py310hecd8cb5_0
cmarkgfm =0.8.0=py310h1961e1f_1
colorama =0.4.4=pyhd3eb1b0_0
commonmark =0.9.1=pyhd3eb1b0_0
cryptography =37.0.1=py310hf6deb26_0
cython =0.29.28=py310he9d5cce_0
dataclasses =0.8=pyh6d0b6a4_7
decorator =5.1.1=pyhd3eb1b0_0
docutils =0.17.1=pypi_0
executing =0.8.3=pyhd3eb1b0_0
future =0.18.2=py310hecd8cb5_1
gensim =4.1.2=py310he9d5cce_0
idna =3.3=pyhd3eb1b0_0
imagesize =1.3.0=pyhd3eb1b0_0
importlib-metadata =4.11.3=py310hecd8cb5_0
importlib_metadata =4.11.3=hd3eb1b0_0
intel-openmp =2021.4.0=hecd8cb5_3538
ipython =8.3.0=py310hecd8cb5_0
jedi =0.18.1=py310hecd8cb5_1
jinja2 =3.0.3=pyhd3eb1b0_0
joblib =1.1.0=pyhd3eb1b0_0
keyring =23.4.0=py310hecd8cb5_0
libcxx =12.0.0=h2f01273_0
libffi =3.3=hb1e8313_2
libgfortran =3.0.1=h93005f0_2
markupsafe =2.1.1=py310hca72f7f_0
matplotlib-inline =0.1.2=pyhd3eb1b0_2
mkl =2021.4.0=hecd8cb5_637
mkl-service =2.4.0=py310hca72f7f_0
mkl_fft =1.3.1=py310hf879493_0
mkl_random =1.2.2=py310hc081a56_0
ncurses =6.3=hca72f7f_2
nltk =3.7=pyhd3eb1b0_0
numexpr =2.8.1=py310hdcd3fac_2
numpy =1.22.3=py310hdcd3fac_0
numpy-base =1.22.3=py310hfd2de13_0
openssl =1.1.1p=hfe4f2af_0
packaging =21.3=pyhd3eb1b0_0
pandas =1.4.2=py310he9d5cce_0
parso =0.8.3=pyhd3eb1b0_0
pep517 =0.12.0=py310hecd8cb5_0
pexpect =4.8.0=pyhd3eb1b0_3
pickleshare =0.7.5=pyhd3eb1b0_1003
pip =21.2.4=py310hecd8cb5_0
pkginfo =1.8.2=pyhd3eb1b0_0
prompt-toolkit =3.0.20=pyhd3eb1b0_0
ptyprocess =0.7.0=pyhd3eb1b0_2
pure_eval =0.2.2=pyhd3eb1b0_0
pycparser =2.21=pyhd3eb1b0_0
pygments =2.11.2=pyhd3eb1b0_0
pyopenssl =22.0.0=pyhd3eb1b0_0
pyparsing =3.0.4=pyhd3eb1b0_0
pysocks =1.7.1=py310hecd8cb5_0
python =3.10.4=hdfd78df_0
python-build =0.8.0=pyhd8ed1ab_0
python-dateutil =2.8.2=pyhd3eb1b0_0
python_abi =3.10=2_cp310
pytz =2022.1=py310hecd8cb5_0
readline =8.1.2=hca72f7f_1
readme_renderer =35.0=pyhd8ed1ab_0
regex =2022.3.15=py310hca72f7f_0
requests =2.28.0=py310hecd8cb5_0
requests-toolbelt =0.9.1=pyhd3eb1b0_0
rfc3986 =1.4.0=pyhd3eb1b0_0
rich =12.4.4=pyhd8ed1ab_0
scipy =1.7.3=py310h3dd3380_0
setuptools =61.2.0=py310hecd8cb5_0
six =1.16.0=pyhd3eb1b0_1
smart_open =5.2.1=py310hecd8cb5_0
snowballstemmer =2.2.0=pyhd3eb1b0_0
sphinx =3.5.3=pyhd3eb1b0_0
sphinx-rtd-theme =1.0.0=pypi_0
sphinxcontrib-applehelp =1.0.2=pyhd3eb1b0_0
sphinxcontrib-devhelp =1.0.2=pyhd3eb1b0_0
sphinxcontrib-htmlhelp =2.0.0=pyhd3eb1b0_0
sphinxcontrib-jsmath =1.0.1=pyhd3eb1b0_0
sphinxcontrib-qthelp =1.0.3=pyhd3eb1b0_0
sphinxcontrib-serializinghtml =1.1.5=pyhd3eb1b0_0
sqlite =3.38.5=h707629a_0
stack_data =0.2.0=pyhd3eb1b0_0
tk =8.6.12=h5d9f67b_0
toml =0.10.2=pyhd8ed1ab_0
tomli =1.2.2=pyhd3eb1b0_0
tqdm =4.64.0=py310hecd8cb5_0
traitlets =5.1.1=pyhd3eb1b0_0
twine =4.0.1=pyhd8ed1ab_1
typing_extensions =4.1.1=pyh06a4308_0
tzdata =2022a=hda174b7_0
urllib3 =1.26.9=py310hecd8cb5_0
wcwidth =0.2.5=pyhd3eb1b0_0
webencodings =0.5.1=py310hecd8cb5_1
wheel =0.37.1=pyhd3eb1b0_0
xz =5.2.5=hca72f7f_1
zipp =3.8.0=py310hecd8cb5_0
zlib =1.2.12=h4dc903c_2

environment.yml pypi

align ==0.1.1
pip ==21.2.4
