bigbang

Scientific analysis of collaborative communities

https://github.com/datactive/bigbang

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README
  • Academic publication links
    Links to: ieee.org
  • Committers with academic emails
    11 of 42 committers (26.2%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.0%) to scientific vocabulary

Keywords

datatracker listserv mailman mbox
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 156
  • Watchers: 15
  • Forks: 54
  • Open Issues: 104
  • Releases: 9
Topics
datatracker listserv mailman mbox
Created over 11 years ago · Last pushed 10 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md



BigBang

BigBang is a toolkit for studying communications data from collaborative projects. It currently supports analyzing mailing lists from Sourceforge, Mailman, ListServ (versions 16.5 and 17), Pipermail (version 0.09), and Hypermail (version 2.4.0), as well as .mbox files.

Complete documentation for BigBang can be found on ReadTheDocs.

Background

Many Standards Development Organizations (SDOs) have working groups that organize themselves through mailing lists. This mailing list data is a valuable source of research insights but can be challenging to gather and analyze. BigBang is an open source toolkit for studying processes of open collaboration and deliberation via analysis of the communications records. Its tools for collecting, analyzing, and visualizing mailing list data are used by a community of information policy researchers to study participation trends and interaction in these settings.

Three things BigBang Does

  • Ingress. Tools for collecting data from SDOs, especially their mailing lists.
  • Analysis. Tools for (pre)processing the data to produce useful insights.
  • Usability/Visualization. Tools for visualizing and interacting with data.

Institutional Collaboration

BigBang has been developed by a growing team of researchers spread across many universities and institutions, including UC Berkeley, University of Amsterdam, and New York University. Its development has been funded by Article 19 and Germany's Prototype Fund.

In addition to its scholarly use, BigBang has been building relationships with SDOs themselves. In 2021, the Internet Architecture Board hosted a workshop on Analyzing IETF Data, in which BigBang was featured as a tool for IAB to develop insights into internet governance.

BigBang as Research Software

BigBang is research software, written by scholars for our research purposes.

It is part of the Scientific Python ecosystem, drawing on many other open source scientific software libraries, such as NumPy, Matplotlib, Pandas, and Jupyter Notebook.

BigBang is a reflexive process. Several of the core developers are also qualitative scholars of socio-technical systems and institutions. Researchers commonly combine BigBang with participant observation in the SDOs they are studying. BigBang is governed by a steering committee of its core developers.

Installation

You need to have Git and Pip (for Python3) installed.

Clone the repository and create a virtualenv:

```sh
git clone https://github.com/datactive/bigbang.git
cd bigbang
python3 -m venv env

# activate the virtualenv
. env/bin/activate
```

Inside the virtualenv, install BigBang:

```sh
pip install ".[dev]"
```

When you're done, you can deactivate the virtualenv:

```sh
deactivate
```

This video tutorial shows how to install BigBang.

Usage

There are several Jupyter notebooks in the examples/ directory of this repository. To open them and begin exploring, run the following commands in the root directory of this repository:

```bash
source activate bigbang
jupyter notebook --notebook-dir=examples/
```

BigBang contains scripts that make it easy to collect data from a variety of sources. For example, to collect data from an open mailing list archive hosted by Mailman, use:

```bash
bigbang collect-mail --url https://mail.python.org/pipermail/scipy-dev/
```

You can also give this command a file with several urls, one per line. One of these is provided in the examples/ directory.

```bash
bigbang collect-mail --file examples/urls.txt
```
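The one-URL-per-line file format is simple enough to handle by hand if you ever need to. The following is a minimal sketch, not BigBang's own implementation: `read_url_list` and `collect_commands` are hypothetical helper names, and the only assumption about the file is that blank lines and surrounding whitespace can be ignored. It builds the equivalent `bigbang collect-mail --url ...` command for each entry without running anything.

```python
def read_url_list(text):
    """Return the non-empty lines of a one-URL-per-line file, stripped."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # assumption: blank lines carry no meaning and are skipped
            urls.append(line)
    return urls


def collect_commands(urls):
    """Build one `bigbang collect-mail --url <URL>` argument list per URL.

    The commands are only constructed here, not executed.
    """
    return [["bigbang", "collect-mail", "--url", url] for url in urls]


if __name__ == "__main__":
    sample = "https://mail.python.org/pipermail/scipy-dev/\n\n"
    for command in collect_commands(read_url_list(sample)):
        print(" ".join(command))
```

Each argument list could be handed to `subprocess.run` if you wanted to collect the lists one at a time instead of passing the whole file with `--file`.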

Once the data has been collected, BigBang has functions to support analysis.

You can read more about the data sources supported by BigBang in the documentation.
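To give a feel for the kind of question collected mailing-list data can answer, here is a small sketch that counts messages per sender in an .mbox file. It deliberately uses only Python's standard-library `mailbox` module rather than BigBang's own functions, whose API is described in the documentation. The helper names `count_senders` and `build_demo_mbox` and the example addresses are made up for illustration.

```python
import mailbox
import os
import tempfile
from collections import Counter
from email.message import EmailMessage
from email.utils import parseaddr


def count_senders(mbox_path):
    """Count messages per (lower-cased) sender address in an mbox file."""
    counts = Counter()
    box = mailbox.mbox(mbox_path)
    try:
        for message in box:
            # parseaddr splits "Name <addr>" into (name, addr).
            _, address = parseaddr(message.get("From", ""))
            counts[address.lower()] += 1
    finally:
        box.close()
    return counts


def build_demo_mbox(path, senders):
    """Write a tiny, made-up mbox file so the example is self-contained."""
    box = mailbox.mbox(path)
    try:
        for index, sender in enumerate(senders):
            msg = EmailMessage()
            msg["From"] = sender
            msg["To"] = "list@example.org"
            msg["Subject"] = f"Demo message {index}"
            msg.set_content("Hello, list.")
            box.add(msg)
        box.flush()
    finally:
        box.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "demo.mbox")
        build_demo_mbox(
            demo_path,
            ["Alice <alice@example.org>", "bob@example.org", "alice@example.org"],
        )
        for address, n in count_senders(demo_path).most_common():
            print(address, n)
```

For real archives, point `count_senders` at an .mbox file you have already collected instead of the generated demo file.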

Development

Unit tests

To run the automated unit tests, use: `pytest tests/unit`.

Our current goal is code coverage of 60%. Add new unit tests within tests/unit. Unit tests run quickly, without relying on network requests.
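As an illustration of what a fast, network-free unit test can look like, here is a minimal sketch in pytest style. The function `normalize_subject` is a hypothetical helper invented for this example and is not part of BigBang; the point is that the test exercises pure logic on in-memory strings, so it needs no network access and runs in milliseconds.

```python
import re

# Hypothetical helper, defined here only so the example is self-contained.
_REPLY_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip leading reply/forward prefixes such as "Re:" or "Fwd:"."""
    return _REPLY_PREFIX.sub("", subject).strip()


# pytest collects functions whose names start with "test_".
def test_strips_repeated_reply_prefixes():
    assert normalize_subject("Re: RE: Agenda for Monday") == "Agenda for Monday"


def test_leaves_plain_subject_unchanged():
    assert normalize_subject("Agenda for Monday") == "Agenda for Monday"


def test_does_not_strip_words_starting_with_re():
    assert normalize_subject("Review: draft charter") == "Review: draft charter"
```

In a real test module under tests/unit, the helper would be imported from the package under test rather than defined alongside the tests.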

Documentation

Docstrings are preferred, so that auto-generated web-based documentation will be possible (#412). You can follow the Google style guide for docstrings.
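For reference, a Google-style docstring has `Args`, `Returns`, and (where relevant) `Raises` sections. The sketch below shows the layout on a hypothetical function, `messages_per_year`, which is invented for this example and is not taken from the BigBang codebase.

```python
from collections import Counter
from datetime import datetime


def messages_per_year(dates):
    """Count messages per calendar year.

    Args:
        dates: An iterable of ``datetime`` objects, one per message.

    Returns:
        A dict mapping each year (int) to the number of messages sent
        in that year, with keys in ascending order.

    Raises:
        TypeError: If an item in ``dates`` has no ``year`` attribute.
    """
    counts = Counter()
    for date in dates:
        if not hasattr(date, "year"):
            raise TypeError(f"expected a datetime-like object, got {date!r}")
        counts[date.year] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [datetime(2021, 5, 1), datetime(2020, 1, 1), datetime(2021, 6, 1)]
    print(messages_per_year(sample))
```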

Formatting

Run `pre-commit install` to have black, flake8, and isort applied automatically to all Python files, so that formatting stays consistent across developers. We try to follow the PEP 8 style guide.

Community

If you are interested in participating in BigBang development or would like support from the core development team, please subscribe to the bigbang-dev mailing list and let us know your suggestions, questions, requests and comments. A development chatroom is also available.

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to make participation in our project and our community a harassment-free experience for everyone.

Publications

These academic publications use BigBang as part of their methods:

  • Becker, Christoph, Niels ten Oever, and Riccardo Nanni. 2022. “The standardisation of lawful interception technologies in the 3GPP: interrogating 5G and surveillance amid US-China competition.” TPRC2022, Washington DC. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4167105
  • Benthall, Sebastian. 2015. “Testing Generative Models of Online Collaboration with BigBang.” In , 182–89. https://conference.scipy.org/proceedings/scipy2015/sebastian_benthall.html.
  • Doty, Nick. 2015. “Reviewing for Privacy in Internet and Web Standard-Setting.” In Security and Privacy Workshops (SPW), 2015 IEEE, 185–192. IEEE. https://ieeexplore.ieee.org/document/7163224/
  • Milan, Stefania, and Niels ten Oever. 2017. “Coding and Encoding Rights in Internet Infrastructure.” Internet Policy Review 6 (1)
  • ten Oever, Niels. 2018. “Productive Contestation, Civil Society, and Global Governance: Human Rights as a Boundary Object in ICANN.” Policy & Internet, June. https://doi.org/10.1002/poi3.172.
  • Nanni, Riccardo. “Digital Sovereignty and Internet Standards: Normative Implications of Public-Private Relations among Chinese Stakeholders in the Internet Engineering Task Force.” Information, Communication & Society 0, no. 0 (October 1, 2022): 1–21. https://doi.org/10.1080/1369118X.2022.2129270.
  • ten Oever, Niels. 2021. “‘This Is Not How We Imagined It’ - Technological Affordances, Economic Drivers and the Internet Architecture Imaginary.” New Media & Society. https://journals.sagepub.com/doi/full/10.1177/1461444820929320
  • ten Oever, N., Milan, S., & Beraldo, D. (2020). Studying Discourse in Internet Governance through Mailing-list Analysis. In D. L. Cogburn, L. DeNardis, N. S. Levinson, & F. Musiani (Eds.), Research Methods in Internet Governance. Cambridge, MA: MIT Press. https://direct.mit.edu/books/oa-monograph/4936/chapter/625914/Studying-Discourse-in-Internet-Governance-through

License

MIT, see LICENSE for its text. This license may be changed at any time according to the principles of the project Governance.

Acknowledgements

This project is funded by:



Owner

  • Name: Datactive
  • Login: datactive
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Benthall"
  given-names: "Sebastian"
  orcid: "https://orcid.org/0000-0002-1789-5109"
- family-names: "Doty"
  given-names: "Nick"
- family-names: "ten Oever"
  given-names: "Niels"
  orcid: "https://orcid.org/0000-0001-5134-2199"
- family-names: "Becker"
  given-names: "Christoph"
  orcid: "https://orcid.org/0000-0002-3324-3880"
title: "BigBang"
version: 0.3.0
doi: 10.5281/zenodo.5243261
date-released: 2021-08-24
url: "https://github.com/datactive/bigbang"

GitHub Events

Total
  • Create event: 2
  • Release event: 2
  • Issues event: 1
  • Watch event: 3
  • Push event: 16
  • Pull request event: 1
  • Fork event: 2
Last Year
  • Create event: 2
  • Release event: 2
  • Issues event: 1
  • Watch event: 3
  • Push event: 16
  • Pull request event: 1
  • Fork event: 2

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 1,109
  • Total Committers: 42
  • Avg Commits per committer: 26.405
  • Development Distribution Score (DDS): 0.674
Past Year
  • Commits: 70
  • Committers: 6
  • Avg Commits per committer: 11.667
  • Development Distribution Score (DDS): 0.7
Top Committers
Name Email Commits
sb s****l@g****m 362
Christovis c****r@d****k 168
Christovis c****1@g****m 100
Nick Doty n****y@i****u 95
Aryan Falahatpisheh a****h@b****u 78
Niels ten Oever n****s@d****g 66
Sebastian Benthall sb@u****e 34
sb sb@i****u 25
Davide Beraldo D****o 23
Christovis 3****s 21
Micah Lee m****h@m****m 21
Mridul Seth g****t@m****m 13
Effy e****x@g****m 9
Venkata Poreddy v****y@g****m 9
Émilien Schultz e****z@g****m 7
Shreyas s****s@g****m 7
Niels ten Oever n****s@a****g 7
sb sb@p****n 7
Spiros Eliopoulos s****u@g****m 7
davidberra d****o@l****m 6
Jack005 c****5@g****m 5
Dave Lester d****r@g****m 5
berra d****e@d****t 4
priyankaiitg p****g@g****m 3
Ki Deuk Kim k****m@b****u 3
Raj Agrawal r****l@b****u 3
unknown l****e@C****x 2
François Garillot f****s@g****t 2
Harsh Gupta g****6@g****m 2
Jessica Xu j****u@b****u 2
and 12 more...
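
The summary figures above can be reproduced from the listed counts. The sketch below assumes the common definition of the Development Distribution Score, 1 minus the share of commits made by the single most active committer; this page does not state the formula, so treat that as an assumption. It also takes the largest single row in the table (362 commits) as the top committer, even though the same person may appear under several email addresses.

```python
def average_commits(total_commits, total_committers):
    """Mean number of commits per committer, rounded to three decimals."""
    return round(total_commits / total_committers, 3)


def development_distribution_score(top_committer_commits, total_commits):
    """DDS under the assumed definition: 1 - (top committer's share of commits)."""
    return round(1 - top_committer_commits / total_commits, 3)


if __name__ == "__main__":
    # All-time figures from the listing: 1,109 commits, 42 committers,
    # and 362 commits in the largest single committer row.
    print(average_commits(1109, 42))                  # 26.405
    print(development_distribution_score(362, 1109))  # 0.674
    # Past-year figures: 70 commits by 6 committers.
    print(average_commits(70, 6))                     # 11.667
```

A score near 0 would mean one committer wrote almost everything; values around 0.67 to 0.7, as here, indicate that roughly a third of commits come from the most active committer.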

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 87
  • Total pull requests: 41
  • Average time to close issues: almost 2 years
  • Average time to close pull requests: 20 days
  • Total issue authors: 14
  • Total pull request authors: 8
  • Average comments per issue: 1.74
  • Average comments per pull request: 1.71
  • Merged pull requests: 36
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: 2 days
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.5
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • sbenthall (62)
  • nllz (4)
  • npdoty (3)
  • micahflee (3)
  • emilienschultz (3)
  • MridulS (2)
  • Christovis (2)
  • laurenmarietta (2)
  • agt24 (1)
  • priyankaiitg (1)
  • ccosborne (1)
  • falahat (1)
  • danielsgriffin (1)
Pull Request Authors
  • nllz (12)
  • Christovis (12)
  • sbenthall (4)
  • micahflee (3)
  • u451f (2)
  • effyli (2)
  • MridulS (1)
  • priyankaiitg (1)
Top Labels
Issue Labels
  • prototypefund (8)
  • data source (7)
  • documentation (7)
  • enhancement (6)
  • Examples (6)
  • Archive (5)
  • Git Repo (3)
  • analysis (2)
  • CLI (2)
  • bug (1)
  • help wanted (1)
  • installation (1)
  • BBIP (1)
  • wontfix (1)
  • tests (1)
Pull Request Labels
  • enhancement (1)

Dependencies

.github/workflows/workflow-main.yaml actions
  • actions/checkout v2 composite
  • actions/setup-python main composite
  • codecov/codecov-action v1 composite