https://github.com/civicdatalab/up-fiscal-data-backend

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.9%) to scientific vocabulary

Keywords

budget data-mining data-pipeline open-data selenium spending
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: CivicDataLab
  • License: MIT
  • Default Branch: master
  • Homepage:
  • Size: 17.6 KB
Statistics
  • Stars: 0
  • Watchers: 7
  • Forks: 1
  • Open Issues: 7
  • Releases: 0
Topics
budget data-mining data-pipeline open-data selenium spending
Created over 5 years ago · Last pushed almost 3 years ago
Metadata Files
Readme License

README.md

Uttar Pradesh Fiscal Data Backend

A data scraping pipeline was set up to mine relevant data and sources from Koshvani, the Uttar Pradesh fiscal data portal.

Table of Contents

Platform

Tools

Setup

Challenges

Contributions

Repo Structure

Platform

Platform Name: Koshvani web -- A Gateway to Finance Activities in the State of Uttar Pradesh

Platform URL: http://koshvani.up.nic.in/

A more detailed analysis of the platform and in-scope data can be found here.

Tools

Though the data on the Koshvani platform is available in a structured format to use and analyse, scraping it through traditional methods turned out to be a challenge.

Keeping the platform's structure and behaviour in mind, Selenium was selected as the tool for mining and storing the data. The Selenium framework automates browser actions, which makes it possible to extract the in-scope datasets.
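As an illustration of the approach described above, here is a minimal sketch of browser-driven table extraction with Selenium 4 in Python. It is not the project's actual pipeline: the function names `rows_to_records` and `scrape_first_table` are hypothetical, and the sketch assumes that the target page renders its data as a plain HTML table and that Chrome is installed locally.

```python
from typing import Dict, List


def rows_to_records(header: List[str], rows: List[List[str]]) -> List[Dict[str, str]]:
    """Pair each row's cells with the header names, skipping malformed rows."""
    records = []
    for cells in rows:
        if len(cells) != len(header):
            continue  # e.g. subtotal or spacer rows with merged cells
        records.append(dict(zip(header, (c.strip() for c in cells))))
    return records


def scrape_first_table(url: str, timeout: int = 30) -> List[Dict[str, str]]:
    """Open `url` in headless Chrome and return the first HTML table as records.

    Hypothetical helper: assumes the data of interest is in the first <table>.
    """
    # Imported here so the pure helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the browser has rendered a table before reading it.
        table = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.TAG_NAME, "table"))
        )
        rows = [
            [cell.text for cell in tr.find_elements(By.XPATH, "./th | ./td")]
            for tr in table.find_elements(By.TAG_NAME, "tr")
        ]
    finally:
        driver.quit()  # always release the browser process
    if not rows:
        return []
    # Treat the first row as the header (an assumption about the page layout).
    return rows_to_records(rows[0], rows[1:])
```

Calling `scrape_first_table("http://koshvani.up.nic.in/")` would need a working Chrome installation and network access. The pure helper `rows_to_records` can be tried on its own with hand-written rows.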

Setup

Instructions for setting up the data pipeline.

<<TBD>>

Challenges

The following challenges were faced while mining the data. The respective resolutions for those challenges are also documented here.

| Challenge | Resolution |
|---|---|
| | |
| | |
| | |

Contributions

Refer to the contributing guidelines to understand how to contribute.

Repo Structure

root
└── contribute/
    └── CODE-OF-CONDUCT.md
    └── CONTRIBUTING.md
└── LICENSE.md
└── README.md

Owner

  • Name: CivicDataLab
  • Login: CivicDataLab
  • Kind: organization
  • Email: info@civicdatalab.in
  • Location: India

Harnessing Data, Tech, Design and Social Science to strengthen the course of Civic Engagements in India.

Issues and Pull Requests

Last synced: 12 months ago

All Time
  • Total issues: 7
  • Total pull requests: 6
  • Average time to close issues: 23 days
  • Average time to close pull requests: about 12 hours
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 1.14
  • Average comments per pull request: 0.33
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • TheDataAreClean (6)
  • gggodhwani (1)
Pull Request Authors
  • shreyaagrawal0809 (5)
  • heaven00 (1)
Top Labels
Issue Labels
  • enhancement (6)
  • documentation (1)
Pull Request Labels
  • documentation (1)