Council Data Project

Council Data Project: Software for Municipal Data Collection, Analysis, and Publication - Published in JOSS (2021)

https://github.com/councildataproject/cookiecutter-cdp-deployment

Science Score: 98.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 9 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

cdp-deployments civic-tech cookiecutter-template government-data local-government open-government

Keywords from Contributors

audio-classification speaker-id speaker-identification transformers mesh

Scientific Fields

Earth and Environmental Sciences Physical Sciences - 40% confidence

Repository

Cookiecutter template for creating new CDP instances.

Basic Info
  • Host: GitHub
  • Owner: CouncilDataProject
  • License: mpl-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 11 MB
Statistics
  • Stars: 27
  • Watchers: 3
  • Forks: 9
  • Open Issues: 10
  • Releases: 17
Created over 5 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct Citation Zenodo

README.md

cookiecutter-cdp-deployment

Cookiecutter template for creating new Council Data Project deployments.


Council Data Project

Council Data Project is an open-source project dedicated to providing journalists, activists, researchers, and all members of each community we serve with the tools they need to stay informed and hold their Council Members accountable.

For more information about Council Data Project, please visit our website.

About

This repository is a "cookiecutter template" for an entirely new Council Data Project (CDP) Instance. If you follow the steps defined in the Usage section, our tools will create and manage all the database, file storage, and processing infrastructure needed to serve the CDP web application.

While our tools will set up and manage all processing and storage infrastructure, you (or your team) must provide and maintain the custom Python code that gathers event information, and must handle billing for the costs of the deployment.

For more information about costs and billing, see Cost.

CDP Instance Features

  • Plain text search of past events and meeting items
    (search for "missing middle housing" or "bike lanes")
  • Filter and sort event and meeting item search results
    (filter by date range, committee, etc.)
  • Automatic timestamped-transcript generation
    (jump right to a specific public comment or debate)
  • Meeting item and amendment tracking
    (check for amendment passage, upcoming meetings, etc.)
  • Share event at timepoint
    (jump right to the point in the meeting you want to share)
  • Full event minutes details
    (view all documents and presentations related to each event)

See the current Seattle CDP Instance for a live example.

Note: Some features depend on how much data is provided during event gather. For more information, see our ingestion models documentation.

Usage

Regardless of your deployment strategy, you may find reading the Things to Know section helpful prior to deployment.

Note: While this cookiecutter will help you set up a repository and CDP infrastructure, you will still need to write your own custom data ingestion function. Writing a basic data ingestion function takes anywhere from a couple of hours to a couple of days, depending on how much data you want to provide to our system.
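
As a rough illustration of the shape such a function can take, the sketch below returns the meetings that fall inside a requested time window. The `Body`, `Session`, and `Event` dataclasses are simplified stand-ins written for this example, not the real ingestion models. The `get_events` name, its `from_dt` and `to_dt` arguments, the `RAW_MEETINGS` records, and the example.com video URLs are likewise assumptions. Check the ingestion models documentation for the actual classes and required fields.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


# Simplified stand-ins for the real ingestion models (hypothetical fields).
@dataclass
class Body:
    name: str


@dataclass
class Session:
    session_datetime: datetime
    video_uri: str
    session_index: int


@dataclass
class Event:
    body: Body
    sessions: List[Session] = field(default_factory=list)


# Hypothetical scraped records; a real function would fetch these from
# the municipality's website or API.
RAW_MEETINGS = [
    {
        "body": "City Council",
        "start": datetime(2024, 1, 8, 14, 0),
        "video": "https://example.com/videos/2024-01-08.mp4",
    },
    {
        "body": "Transportation Committee",
        "start": datetime(2024, 1, 2, 9, 30),
        "video": "https://example.com/videos/2024-01-02.mp4",
    },
]


def get_events(from_dt: datetime, to_dt: datetime) -> List[Event]:
    """Return every meeting that started within [from_dt, to_dt]."""
    events = []
    for raw in RAW_MEETINGS:
        if from_dt <= raw["start"] <= to_dt:
            session = Session(raw["start"], raw["video"], session_index=0)
            events.append(Event(body=Body(raw["body"]), sessions=[session]))
    return events


# Example: gather with a two-day lookback from a fixed "now".
now = datetime(2024, 1, 9, 0, 0)
recent = get_events(now - timedelta(days=2), now)
print([event.body.name for event in recent])  # ['City Council']
```

Providing only a body name, a session datetime, and a video link is the minimal end of the range; adding more detail per event is what moves the effort from hours toward days.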

Deploying Under the councildataproject.org Domain

If you want your deployment under the councildataproject.org domain (e.g. https://councildataproject.org/seattle), you will need to fill out the "New Instance Deployment" Issue Form.

The Council Data Project team will help you along in the process on the issue from there.

Deploying Under Your Own Domain

If you want to host your deployment under a different domain (e.g. Your-Org-Name.github.io/your-municipality), you will need to install cookiecutter and use this template.

Follow along with the video walkthrough

Before you begin, please note that you will need to install or have available the following:

Once all tools are installed, the rest of the infrastructure setup process should take an hour or two.

In a terminal with Python 3.10+ installed:

```bash
pip install cookiecutter
cookiecutter gh:CouncilDataProject/cookiecutter-cdp-deployment
```

Follow the prompts in your terminal and fill in the details for the instance deployment. At the end of the process a new directory will have been created with all required files and further instructions to set up your new deployment.
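
If you would rather script the answers than type them at the prompts, cookiecutter can also be called from Python. The sketch below is one possible approach, not part of the official instructions. The `build_context` and `generate` helpers are hypothetical, the parameter names are assumed to be the underscore-separated names listed in Cookiecutter Parameters, and the values are the Seattle examples from that section. With `no_input=True`, any parameter left out of `extra_context` falls back to the template default.

```python
from typing import Dict


def build_context(municipality: str, github_owner: str, repo_name: str) -> Dict[str, str]:
    """Collect answers that would otherwise be typed at the prompts."""
    return {
        "municipality": municipality,
        "hosting_github_username_or_org": github_owner,
        "hosting_github_repo_name": repo_name,
    }


def generate(context: Dict[str, str], output_dir: str = ".") -> str:
    """Render the template without interactive prompts."""
    # Imported lazily so build_context works without cookiecutter installed.
    from cookiecutter.main import cookiecutter

    return cookiecutter(
        "gh:CouncilDataProject/cookiecutter-cdp-deployment",
        no_input=True,          # skip the interactive prompts
        extra_context=context,  # override template defaults
        output_dir=output_dir,
    )


context = build_context("Seattle", "evamaxfield", "cdp-seattle")
print(context)
# To actually generate the repository (requires network access):
# generate(context)
```

Review the generated files before pushing them, since values derived from defaults (such as slugs) may not be what you want.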

For more details and examples on each parameter of this cookiecutter template, see Cookiecutter Parameters.

Follow the steps in the "Initial Repository Setup" section of the README.md file within the generated SETUP directory.

For more details on what is created from using this cookiecutter template, see Cookiecutter Repo Generation.

In short, the remaining setup tasks are:

  • Create a new GitHub repository for the instance.
  • Log in to Google Cloud, or create an account.
  • Initialize the basic infrastructure.
  • Assign a billing account to the created Google Cloud project.
  • Generate credentials for the Google Project for use in automated scripts.
  • Attach credentials as secrets to the GitHub repository.
  • Push the cookiecutter generated files to the GitHub repository.
  • Set up web hosting through GitHub Pages.
  • Enable open access for data stored by Google Cloud and Firebase.
  • Write a data ingestion function for your municipality (it may be useful to build off of cdp-scrapers).

You can also see an example generated repository and the full steps listed here.

Cookiecutter Parameters

| Parameter | Description | Example 1 | Example 2 |
| --- | --- | --- | --- |
| `municipality` | The name of the municipality (town, city, county, etc.) that this CDP Instance will store data for. | Seattle | King County |
| `iana_timezone` | The IANA Timezone string of the municipality that this CDP instance is for. | America/Los_Angeles | America/Chicago |
| `governing_body_type` | What type of governing body this instance is for. | city council | county council |
| `municipality_slug` | The name of the municipality cleaned for use in the web application and parts of repository naming. | seattle | king-county |
| `python_municipality_slug` | The name of the municipality cleaned for use in specifically Python parts of the application. | seattle | king_county |
| `infrastructure_slug` | The name of the municipality cleaned for use in specifically application infrastructure. Must be globally unique to GCP. | cdp-seattle-abasjkqy | cdp-king-county-uiqmsbaw |
| `maintainer_or_org_full_name` | The full name of the primary maintainer or organization that will be managing this instance deployment. | Eva Maxfield Brown | Council Data Project |
| `hosting_github_username_or_org` | The GitHub username or organization that will host this instance's repository. (Used in the web application's domain name) | evamaxfield | CouncilDataProject |
| `hosting_github_repo_name` | A specific name to give to the repository. (Used in the web application's full address) | cdp-seattle | king-county |
| `hosting_github_url` | From the provided information, the expected URL of the GitHub repository. | https://github.com/evamaxfield/cdp-seattle | https://github.com/CouncilDataProject/king-county |
| `hosting_web_app_address` | From the provided information, the expected URL of the web application. | https://evamaxfield.github.io/cdp-seattle | https://councildataproject.org/king-county |
| `firestore_region` | The desired region to host the firestore instance. (Firestore docs) | us-west1 | europe-central2 |
| `event_gather_timedelta_lookback_days` | The number of days to look back from the current date every time the event scraper runs. | 2 | 6 |
| `event_gather_cron` | The event gather CRON configuration. (GitHub Actions CRON Details) | `26 0,6,12,18 * * *` | `17 3,9,15,21 * * *` |
| `event_gather_runner_timeout_minutes` | Minutes to wait before an attempt to create a CML runner fails. | 15 | 16 |
| `event_gather_runner_max_attempts` | Number of times to attempt to create a CML runner. | 4 | 36 |
| `event_gather_runner_retry_wait_seconds` | Number of seconds to wait between CML runner create attempts. | 600 | 600 |
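
The three slug parameters follow a visible pattern in the examples above. The following sketch shows one way to derive comparable values from a municipality name. The helper functions are written for this illustration only, the Python slug is assumed to use underscores between words, and the random eight-letter suffix is an assumption about how a globally unique name might be produced; the template's own defaults may be computed differently.

```python
import random
import re
import string


def municipality_slug(name: str) -> str:
    """Lowercase the name and join its words with hyphens."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return "-".join(words)


def python_municipality_slug(name: str) -> str:
    """Same words as municipality_slug, joined with underscores instead."""
    return municipality_slug(name).replace("-", "_")


def infrastructure_slug(name: str, rng: random.Random) -> str:
    """Prefix with 'cdp-' and append eight random lowercase letters."""
    suffix = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    return f"cdp-{municipality_slug(name)}-{suffix}"


print(municipality_slug("King County"))         # king-county
print(python_municipality_slug("King County"))  # king_county
print(infrastructure_slug("King County", random.Random()))
```

A random suffix lowers the chance of a name collision but cannot guarantee global uniqueness, so be prepared to pick another value if the chosen name is already taken.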

Things to Know

Much of Council Data Project processing and resource management can be handled for free and purely on GitHub. However, we do rely on a select few resources outside of GitHub to manage all services and applications.

The only service that requires a billing account is Google Cloud. Google Cloud will manage all databases, file storage, and heavy compute such as speech-to-text for transcription. You can see more about the average monthly cost of running a CDP Instance in Cost.

For more details, see Cookiecutter Repo Generation. After you create the repo, the generated repository's README will contain instructions and links specific to your deployment for the remaining steps.

Cookiecutter Repo Generation

Cookiecutter is a Python package for generating templated projects. This repository is a cookiecutter template that generates a CDP deployment repository containing the following:

  • A directory structure for your project
  • A directory for your web application to build and deploy from
  • A directory for infrastructure management
  • A directory for your Python event gather function and its requirements
  • Continuous integration
    • Preconfigured for your web application to fully deploy
    • Preconfigured to deploy all required CDP infrastructure
    • Preconfigured to run CDP pipelines using GitHub Actions

To generate a new repository from this template, in a terminal with Python 3.5+ installed, run:

```bash
pip install cookiecutter
cookiecutter gh:CouncilDataProject/cookiecutter-cdp-deployment
```

Note: This will only create the basic repository. You will still need to set up a Google Cloud account.

Google Cloud

All of your deployment's data storage and some data processing will be handled by Google Cloud Platform (GCP).

  • Your deployment's provided and generated data (meeting dates, committee names, councilmember details, etc.) will live in Firestore.
  • Your deployment's generated files (audio clips, transcripts, etc.) will live in Google Cloud Storage.
  • The audio from the provided video will be processed using Whisper on Google Compute Engine.

Cost

CDP was created and is maintained by a group of people working on it in their free time. We didn't want to pay extreme amounts of money, so why should you?

To that end, we try to make CDP as low cost as possible. Many of the current features are entirely free as long as the repo is open source:

Free Resources and Infrastructure:

  • Event Processing (GitHub Actions)
  • Event and Legislation Indexing (GitHub Actions)
  • Web Hosting (GitHub Pages)

The backend resources and processing are the only real costs, and they depend on usage. The more people use your web application, the more the database and file storage cost. The CDP-Seattle monthly averages below are from the most heavily used months of its existence, so treat them as close to upper bounds.

Billed Resources and Infrastructure:

Total Average Monthly Cost: $61.00

This is the ongoing cost of storing new meetings as they occur once your instance is deployed. You may have an additional upfront cost if you are seeding your database with older videos and using speech-to-text to transcribe them.

Future Processing Features

As we add more features to CDP that require additional processing or resources, we will continue to try to minimize their costs wherever possible. Further, if a feature is optional, we will create a flag that maintainers can set to include or exclude the additional processing or resource usage. See Upgrades and New Features for more information.

Upgrades and New Features

In general, all upgrades, bugfixes, new features, and more will be delivered to your CDP repository via Dependabot.

After releasing a new version of cdp-backend or cdp-frontend, GitHub and Dependabot will automatically create a pull request to your instance repository which updates the version requirements of the pipelines, infrastructure, and/or web application.

These pull requests will contain the release notes for each version they upgrade through. For example, an upgrade from 3.0.7 to 3.0.9 will contain the release notes for both 3.0.8 and 3.0.9. This should help you as a maintainer understand what each upgrade fixes or adds.
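
To make the "upgrades through" rule concrete, here is a small sketch that lists which releases' notes a given upgrade would cover. The `versions_covered` helper and the `released` list are hypothetical and written only for this illustration; Dependabot assembles the notes itself, and this is not its implementation.

```python
from typing import List, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '3.0.9' into (3, 0, 9) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def versions_covered(current: str, target: str, released: List[str]) -> List[str]:
    """Releases after `current` up to and including `target`, oldest first."""
    low, high = parse(current), parse(target)
    return sorted((v for v in released if low < parse(v) <= high), key=parse)


# Hypothetical release history.
released = ["3.0.6", "3.0.7", "3.0.8", "3.0.9", "3.1.0"]
print(versions_covered("3.0.7", "3.0.9", released))  # ['3.0.8', '3.0.9']
```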

An example of such an automated pull request can be seen here.

Finally, if an upgrade requires additional work from the maintainer (e.g. "regenerate the latest cookiecutter" or "run this script"), we will explicitly say so in our release notes. Those additional tasks are usually quite simple; we just haven't fully automated them yet.

One reason we may ask a maintainer to run a script after merging is to backfill the data needed for a new feature. For example, if we update our data model to support a new feature, data gathered from then on will be complete, but historical data will be missing values. Running the backfill script would be optional but recommended, so that the new feature is available for all historical data.
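
The idea behind a backfill can be sketched in a few lines. This is a generic illustration under stated assumptions, not an actual CDP script: the records are plain dictionaries, and the `word_count` field, the `backfill` helper, and the sample data are all hypothetical.

```python
from typing import Any, Callable, Dict, List


def backfill(
    records: List[Dict[str, Any]],
    field_name: str,
    compute: Callable[[Dict[str, Any]], Any],
) -> int:
    """Fill `field_name` on records that lack it; return how many were updated."""
    updated = 0
    for record in records:
        if record.get(field_name) is None:
            record[field_name] = compute(record)
            updated += 1
    return updated


events = [
    {"id": "a", "transcript": "hello council", "word_count": None},  # historical, empty value
    {"id": "b", "transcript": "public comment period", "word_count": 3},  # already complete
    {"id": "c", "transcript": "motion passes"},  # historical, field absent
]
changed = backfill(events, "word_count", lambda r: len(r["transcript"].split()))
print(changed, [e["word_count"] for e in events])  # 2 [2, 3, 2]
```

Records that already carry the value are left untouched, so running such a script more than once does no harm.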

Citation

If you have found CDP software, data, or ideas useful in your own work, please consider citing us:

Brown et al., (2021). Council Data Project: Software for Municipal Data Collection, Analysis, and Publication. Journal of Open Source Software, 6(68), 3904, https://doi.org/10.21105/joss.03904

```bibtex
@article{Brown2021,
  doi = {10.21105/joss.03904},
  url = {https://doi.org/10.21105/joss.03904},
  year = {2021},
  publisher = {The Open Journal},
  volume = {6},
  number = {68},
  pages = {3904},
  author = {Eva Maxfield Brown and To Huynh and Isaac Na and Brian Ledbetter and Hawk Ticehurst and Sarah Liu and Emily Gilles and Katlyn M. f. Greene and Sung Cho and Shak Ragoler and Nicholas Weber},
  title = {{Council Data Project: Software for Municipal Data Collection, Analysis, and Publication}},
  journal = {Journal of Open Source Software}
}
```

License

MIT

Owner

  • Name: CouncilDataProject
  • Login: CouncilDataProject
  • Kind: organization

Tools for transparency and accessibility in council action.

JOSS Publication

Council Data Project: Software for Municipal Data Collection, Analysis, and Publication
Published
December 02, 2021
Volume 6, Issue 68, Page 3904
Authors
Eva Maxfield Brown ORCID
University of Washington Information School, University of Washington, Seattle
To Huynh ORCID
University of Washington, Seattle
Isaac Na ORCID
Washington University, St. Louis
Brian Ledbetter
University of Washington Information School, University of Washington, Seattle
Hawk Ticehurst
University of Washington Information School, University of Washington, Seattle
Sarah Liu
Independent Contributor
Emily Gilles
Independent Contributor
Katlyn M. f. Greene
Independent Contributor
Sung Cho
Independent Contributor
Shak Ragoler
Independent Contributor
Nicholas Weber ORCID
University of Washington Information School, University of Washington, Seattle
Editor
Chris Hartgerink ORCID
Tags
open government open data open infrastructure municipal governance data archival civic technology natural language processing

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: >-
  Council Data Project: Software for Municipal Data
  Collection, Analysis, and Publication
message: Please cite this software using these metadata.
type: software
authors:
  - given-names: Eva Maxfield
    family-names: Brown
    email: jmxbrown@uw.edu
    affiliation: >-
      University of Washington Information School,
      University of Washington, Seattle
    orcid: 'https://orcid.org/0000-0003-2564-0373'
  - given-names: To
    family-names: Huynh
    affiliation: 'University of Washington, Seattle'
    orcid: 'https://orcid.org/0000-0002-9664-3662'
  - given-names: Isaac
    family-names: Na
    affiliation: 'Washington University, St. Louis'
    orcid: 'https://orcid.org/0000-0002-0182-1615'
  - given-names: Brian
    family-names: Ledbetter
    affiliation: >-
      University of Washington Information School,
      University of Washington, Seattle
  - given-names: Hawk
    family-names: Ticehurst
    affiliation: >-
      University of Washington Information School,
      University of Washington, Seattle
  - given-names: Sarah
    family-names: Liu
  - given-names: Emily
    family-names: Gilles
  - given-names: Katlyn M. F.
    family-names: Greene
  - given-names: Sung
    family-names: Cho
  - given-names: Shak
    family-names: Ragoler
  - given-names: Nicholas
    family-names: Weber
    email: nmweber@uw.edu
    affiliation: >-
      University of Washington Information School,
      University of Washington, Seattle
    orcid: 'https://orcid.org/0000-0002-6008-3763'
identifiers:
  - type: doi
    value: 10.21105/joss.03904
  - type: url
    value: 'https://doi.org/10.21105/joss.03904'
repository-code: >-
  https://github.com/CouncilDataProject/cookiecutter-cdp-deployment
url: 'https://councildataproject.org'
abstract: >-
  Cities, counties, and states throughout the USA are
  bound by law to archive recordings of public
  meetings. Most local governments comply with these
  laws by posting documents, audio, or video
  recordings online. As there is no set standard for
  municipal data archives however, parsing and
  processing such data is typically time consuming
  and highly dependent on each municipality. Council
  Data Project (CDP) is a set of open-source tools
  that improve the accessibility of local government
  data by systematically collecting, transforming,
  and re-publishing this data to the web. The data
  re-published by CDP is packaged and presented
  within a searchable web application that vastly
  simplifies the process of finding specific
  information within the archived data. We envision
  this project being used by a variety of groups
  including civic technologists hoping to promote
  government transparency, researchers focused on
  public policy, natural language processing, machine
  learning, or information retrieval and discovery,
  and many others.
keywords:
  - public interest technology
  - open government
  - open data
  - data archival
  - municipal governance
  - Python
  - JavaScript
  - natural language processing
license: MIT
commit: 59c30846b40c289b1a8b6ab10b0b46b5bdfc3c59
version: 3.0.0
date-released: '2021-12-02'

GitHub Events

Total
  • Watch event: 1
  • Push event: 40
Last Year
  • Watch event: 1
  • Push event: 40

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 203
  • Total Committers: 12
  • Avg Commits per committer: 16.917
  • Development Distribution Score (DDS): 0.483
Past Year
  • Commits: 19
  • Committers: 4
  • Avg Commits per committer: 4.75
  • Development Distribution Score (DDS): 0.526
Top Committers
Name Email Commits
JacksonMaxfield j****n@g****m 105
Eva Maxfield Brown e****n@g****m 58
dependabot[bot] 4****] 16
Hawk Ticehurst 3****t 7
Isaac Na i****9@g****m 5
Smai Fullerton s****n@g****m 3
Gregory Foster g****r 3
Sung Cho 6****a 2
Nic n****r@g****m 1
sarahjliu s****8@g****m 1
Toni Wells 7****e 1
Wes Hargrove 1****e 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 62
  • Total pull requests: 55
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 5 days
  • Total issue authors: 10
  • Total pull request authors: 9
  • Average comments per issue: 3.79
  • Average comments per pull request: 1.07
  • Merged pull requests: 47
  • Bot issues: 0
  • Bot pull requests: 23
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • evamaxfield (41)
  • gregoryfoster (6)
  • sarahjliu (5)
  • dphoria (3)
  • nniiicc (2)
  • nathansgithub (1)
  • argusSecurityBot (1)
  • johnfelipe (1)
  • phildini (1)
  • isaacna (1)
Pull Request Authors
  • dependabot[bot] (23)
  • evamaxfield (18)
  • gregoryfoster (4)
  • smai-f (3)
  • isaacna (3)
  • nniiicc (1)
  • whargrove (1)
  • dphoria (1)
  • sarahjliu (1)
Top Labels
Issue Labels
new instance (34) enhancement (12) bug (8) good first issue (1) documentation (1)
Pull Request Labels
dependencies (23) github_actions (14) documentation (10) python (9) enhancement (6) bug (2)

Dependencies

{{ cookiecutter.hosting_github_repo_name }}/web/package.json npm
  • gh-pages ^2.2.0 development
  • react-scripts ^4.0.3 development
  • rimraf ^3.0.2 development
  • @councildataproject/cdp-frontend 3.1.2
  • react ^16.13.1
  • react-dom ^16.13.1
.github/workflows/scripts/requirements.txt pypi
  • cdp-backend >=3.1.1
  • cdp-scrapers *
  • cookiecutter *
  • requests *
{{ cookiecutter.hosting_github_repo_name }}/infra/requirements.txt pypi
  • cdp-backend ==3.1.1
  • pulumi *
.github/workflows/build.yml actions
  • JamesIves/github-pages-deploy-action v4 composite
  • actions/checkout v3 composite
  • actions/setup-node v3 composite
  • actions/setup-python v4 composite
.github/workflows/deployment-management-bot.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • peter-evans/create-or-update-comment v2 composite
  • peter-evans/find-comment v2 composite
.github/workflows/instance-configuration-validation-bot.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • peter-evans/create-or-update-comment v2 composite
  • peter-evans/find-comment v2 composite
.github/workflows/paper.yml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v3 composite
  • openjournals/openjournals-draft-action master composite
.github/workflows/test.yml actions
  • actions/checkout v3 composite
  • actions/setup-node v3 composite
  • actions/setup-python v4 composite