seesus

seesus: a social, environmental, and economic sustainability classifier for Python - Published in JOSS (2024)

https://github.com/caimeng2/seesus

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 8 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    1 of 4 committers (25.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

classification regular-expressions sdg sustainability sustainability-developoment-goals text-mining

Scientific Fields

Engineering Computer Science - 60% confidence
Last synced: 6 months ago

Repository

A Python package that identifies 17 Sustainable Development Goals and their 169 Targets in text, and classifies into social, environmental, and economic sustainability.

Basic Info
  • Host: GitHub
  • Owner: caimeng2
  • License: gpl-3.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 827 KB
Statistics
  • Stars: 8
  • Watchers: 1
  • Forks: 3
  • Open Issues: 0
  • Releases: 2
Topics
classification regular-expressions sdg sustainability sustainability-developoment-goals text-mining
Created almost 3 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

seesus: a social, environmental, and economic sustainability classifier

seesus is a Python package that evaluates whether a textual expression aligns with the concept of sustainability as defined by the United Nations Sustainable Development Goals (SDGs). It labels a statement with the 17 SDGs as well as 169 specific targets and categorizes the statement into social, environmental, or economic sustainability. For analysis in R, please check SDGdetector.

seesus currently has four main functions:

  1. Evaluating whether a statement aligns with the concept of sustainability
  2. Identifying SDGs and associated targets in a statement
  3. Classifying a statement into social, environmental, and economic sustainability
  4. Customizing match syntax

Installation

Install seesus from PyPI by running the following command in your terminal:

pip install seesus

Example

Analyzing an individual sentence

```python
from seesus import SeeSus

text1 = "We aim to contribute to the mitigation of climate change by reducing carbon emissions in the city."
result1 = SeeSus(text1)

# print a summary of the results
print(result1)

# print whether the statement aligns with sustainability (True or False)
print(result1.sus)

# print the names of identified SDGs
print(result1.sdg)

# print the descriptions of identified SDGs
print(result1.sdg_desc)

# print the names of identified SDG targets
print(result1.target)

# print the descriptions of identified SDG targets
print(result1.target_desc)

# determine which dimension of sustainability (social, environmental, or economic) a statement belongs to
print(result1.see)
```

Analyzing a paragraph or a longer document

To achieve the best results, split a paragraph or a whole document into individual sentences, so that each sentence is the basic unit that seesus analyzes. Tools such as nltk.tokenize and re.split can do this.

```python
import re

from seesus import SeeSus

# source: https://www.nyc.gov/site/planning/about/dcp-priorities/resiliency-sustainability.page
text2 = "By working with communities in the floodplain and facilitating flood-resistant building design, DCP is reducing the city’s risks to sea level rise and coastal flooding. Hurricane Sandy was a stark reminder of these risks. The City, led by the Mayor’s Office of Recovery and Resiliency (ORR), has developed a multifaceted plan for recovering from Sandy and improving the city’s resiliency–the ability of its neighborhoods, buildings and infrastructure to withstand and recover quickly from flooding and climate events. As part of this effort, DCP has initiated a series of projects to identify and implement land use and zoning changes as well as other actions needed to support the short-term recovery and long-term vitality of communities affected by Hurricane Sandy and other areas at risk of coastal flooding."

# split the paragraph into sentences, then analyze each sentence separately
for sent in re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s', text2):
    result = SeeSus(sent)
    print('"', sent, '"', sep="")
    print("Is the sentence related to the concept of sustainability?", result.sus)
    print("Which SDGs?", result.sdg)
    print("Which SDG targets specifically?", result.target)
    print("Which dimensions of sustainability?", result.see)
    print("----------------")
```
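As a rough illustration of the sentence-splitting step on its own, the sketch below applies a regular expression of this kind using only the standard library. The helper name `split_sentences` and the sample text are assumptions made for this example and are not part of seesus.

```python
import re

# Split on whitespace that follows "." or "?", unless the period belongs to
# an abbreviation such as "e.g." or a title such as "Mr.".
SENTENCE_BOUNDARY = re.compile(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s')


def split_sentences(text):
    """Split text into sentences (hypothetical helper, not part of seesus)."""
    return SENTENCE_BOUNDARY.split(text)


sample = "The plan covers hazards, e.g. flooding. Mr. Lee leads it. Is the city ready?"
for sentence in split_sentences(sample):
    print(sentence)
```

The pattern leaves "e.g." and "Mr." intact, but it is a heuristic: other abbreviations, or sentences ending in "!", may still be split incorrectly, in which case a dedicated tokenizer is likely the safer choice.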

Customizing match syntax

```python
from seesus import SeeSus

# print match syntax
SeeSus.showsyntax("SDG1general")

# customize match syntax
SeeSus.editsyntax("SDG1general", "my match terms")
```

Please run example.ipynb to see more example usage.

Methodology

In an era of large language models, seesus uses predefined regular expression patterns instead of machine learning, because this method is more transparent, replicable, and controllable. The regular expression syntax was developed for the 17 SDGs and the 169 SDG targets, including both direct and indirect matching. The accuracy of the matching syntax was manually tested, reviewed, and improved using randomly selected statements from corporate reports. Three rounds of adjustments were conducted to finalize the syntax. seesus achieves an accuracy rate of 76%, as determined by alignment with manual coding. Human intercoder agreement on the same text stands at 83%. Considering the inherent ambiguity and complexity of language, as well as the interconnected nature of the SDGs, the accuracy of seesus is rather high. Please see SDGdetector for detailed information on the accuracy evaluation and manual refinement.
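To make the pattern-based approach concrete, here is a minimal sketch of regular expression matching against two goals. The patterns, the `TOY_PATTERNS` name, and the `match_goals` helper are hypothetical and far simpler than the syntax seesus actually ships; `SeeSus.showsyntax` shows the real patterns.

```python
import re

# Hypothetical, heavily simplified patterns for illustration only.
# These are NOT the patterns used by seesus.
TOY_PATTERNS = {
    "SDG7": re.compile(r"\b(renewable|clean|solar|wind)\s+(energy|power)\b", re.IGNORECASE),
    "SDG13": re.compile(r"\bclimate\s+change\b|\bcarbon\s+emissions?\b", re.IGNORECASE),
}


def match_goals(sentence):
    """Return the sorted labels whose toy pattern occurs in the sentence."""
    return sorted(
        label for label, pattern in TOY_PATTERNS.items() if pattern.search(sentence)
    )


print(match_goals("We aim to mitigate climate change by reducing carbon emissions."))
print(match_goals("The city is investing in solar power."))
print(match_goals("The meeting starts at noon."))
```

Because every label traces back to an explicit pattern, a match can be inspected and reproduced, which is the transparency argument made above.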

How to cite

Cai, M., Li, Y., Colbry, D., Frans, V. F., & Zhang, Y. (2024). seesus: a social, environmental, and economic sustainability classifier for Python. Journal of Open Source Software, 9(96), 6244. https://doi.org/10.21105/joss.06244

@article{Cai_seesus_a_social_2024,
  author = {Cai, Meng and Li, Yingjie and Colbry, Dirk and Frans, Veronica F. and Zhang, Yuqian},
  doi = {10.21105/joss.06244},
  journal = {Journal of Open Source Software},
  month = apr,
  number = {96},
  pages = {6244},
  title = {{seesus: a social, environmental, and economic sustainability classifier for Python}},
  url = {https://joss.theoj.org/papers/10.21105/joss.06244},
  volume = {9},
  year = {2024}
}

Maintenance

Please report any issues if you find that a matching syntax is not accurate or can be improved. We welcome contributions to enhance the classification accuracy of seesus.

Owner

  • Name: Meng Cai
  • Login: caimeng2
  • Kind: user

JOSS Publication

seesus: a social, environmental, and economic sustainability classifier for Python
Published
April 08, 2024
Volume 9, Issue 96, Page 6244
Authors
Meng Cai
School of Planning, Design and Construction, Michigan State University, East Lansing, MI 48824, United States; Department of Civil and Environmental Engineering, Technical University of Darmstadt, Darmstadt 64287, Germany
Yingjie Li
Natural Capital Project, Woods Institute for the Environment, Stanford University, Stanford, CA 94305, United States
Dirk Colbry
Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, United States
Veronica F. Frans
Center for Systems Integration and Sustainability, Department of Fisheries and Wildlife, Michigan State University, East Lansing, MI 48823, United States; Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI 48824, United States; W.K. Kellogg Biological Station, Michigan State University, Hickory Corners, MI 49060, United States
Yuqian Zhang
Center for Systems Integration and Sustainability, Department of Fisheries and Wildlife, Michigan State University, East Lansing, MI 48823, United States; Environmental Science and Policy Program, Michigan State University, East Lansing, MI 48823, United States
Editor
Olivia Guest
Tags
Sustainability Sustainable Development Goals (SDGs) Text mining Text analysis classification Regular expressions

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Cai
  given-names: Meng
  orcid: "https://orcid.org/0000-0002-8318-572X"
- family-names: Li
  given-names: Yingjie
  orcid: "https://orcid.org/0000-0002-8401-0649"
- family-names: Colbry
  given-names: Dirk
  orcid: "https://orcid.org/0000-0003-0666-9883"
- family-names: Frans
  given-names: Veronica F.
  orcid: "https://orcid.org/0000-0002-5634-3956"
- family-names: Zhang
  given-names: Yuqian
  orcid: "https://orcid.org/0000-0001-7576-2526"
contact:
- family-names: Cai
  given-names: Meng
  orcid: "https://orcid.org/0000-0002-8318-572X"
doi: 10.5281/zenodo.10854083
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Cai
    given-names: Meng
    orcid: "https://orcid.org/0000-0002-8318-572X"
  - family-names: Li
    given-names: Yingjie
    orcid: "https://orcid.org/0000-0002-8401-0649"
  - family-names: Colbry
    given-names: Dirk
    orcid: "https://orcid.org/0000-0003-0666-9883"
  - family-names: Frans
    given-names: Veronica F.
    orcid: "https://orcid.org/0000-0002-5634-3956"
  - family-names: Zhang
    given-names: Yuqian
    orcid: "https://orcid.org/0000-0001-7576-2526"
  date-published: 2024-04-08
  doi: 10.21105/joss.06244
  issn: 2475-9066
  issue: 96
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 6244
  title: "seesus: a social, environmental, and economic sustainability
    classifier for Python"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.06244"
  volume: 9
title: "seesus: a social, environmental, and economic sustainability
  classifier for Python"

GitHub Events

Total
  • Watch event: 2
Last Year
  • Watch event: 2

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 110
  • Total Committers: 4
  • Avg Commits per committer: 27.5
  • Development Distribution Score (DDS): 0.055
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Cai, Meng m****1@g****m 104
Dirk Colbry c****i@m****u 3
Yingjie Li y****u@g****m 2
Olivia Guest o****t 1
Committer Domains (Top 20 + Academic)
msu.edu: 1
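
The all-time summary figures above can be reproduced from the per-committer counts. The sketch below assumes that DDS is defined as one minus the top committer's share of all commits, which is consistent with the values listed; the variable names are illustrative.

```python
# Commit counts per committer, taken from the Top Committers table.
commits = {"Cai, Meng": 104, "Dirk Colbry": 3, "Yingjie Li": 2, "Olivia Guest": 1}

total = sum(commits.values())
average = total / len(commits)

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits.values()) / total

print("Total commits:", total)
print("Avg commits per committer:", average)
print("DDS:", round(dds, 3))
```

A DDS close to 0 indicates that a single committer accounts for nearly all commits.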

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: 6 days
  • Total issue authors: 0
  • Total pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • oliviaguest (2)
  • colbrydi (1)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 30 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 4
  • Total maintainers: 1
pypi.org: seesus

a social, environmental, and economic sustainability classifier based on the UN Sustainable Development Goals

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 30 Last month
Rankings
Dependent packages count: 7.3%
Average: 37.9%
Dependent repos count: 68.5%
Maintainers (1)
Last synced: 6 months ago

Dependencies

pyproject.toml pypi
  • pytest *
  • regex *