searchseco-miner

v2 js miner implementation for SearchSECO

https://github.com/secureseco/searchseco-miner

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: ieee.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.5%) to scientific vocabulary

Keywords

jest miner search-seco typescript

Repository

Statistics
  • Stars: 4
  • Watchers: 3
  • Forks: 3
  • Open Issues: 19
  • Releases: 1
Created almost 3 years ago · Last pushed 6 months ago

README.md

SecureSECO

The goal of the SecureSECO initiative is to secure and increase trust in the software ecosystem, through the use of distributed ledger technology and empirical software engineering research.

The software ecosystem is a trust-rich part of the world. Collaboratively, software engineers put their trust in major hubs in the ecosystem, such as package managers, repository services, and programming language ecosystems. However, there are many parts of the chain in which this trust can be broken. We present a vision for a trust ensuring mechanism in the software ecosystem that mitigates the presented risks. If our community manages to implement this mechanism, we can create an urgently needed secure software ecosystem.

The initiative is an academic initiative with partners from several universities and companies.

Website

https://secureseco.org/

SearchSECO Miner

This is the repository for the SecureSECO DAO miner, built to scrape GitHub, upload project data to the SecureSECO database, and connect with the DAO to facilitate the claiming of rewards.

Initial Setup

This project uses Node v18.

Environment variables

All environment variables are listed in src/config/.env.example. This example file serves as a template: create a .env file in the same folder and specify the variables there.

Setting up environment variables and installing dependencies

Variables

The following environment variables are specified by the user:

  • MINER_NAME: The optional name of the miner. Defaults to 'client'.
  • GITHUB_TOKEN: The GitHub token supplied by the user, used to fetch author and project data from GitHub. The token needs only minimal access rights.
  • PERSONAL_WALLET_ADDRESS: The wallet address of the user. To link to the DAO successfully, this must be the same address as the one linked to the DAO.
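As an illustration of how these three variables fit together, here is a minimal TypeScript sketch that reads them from a plain key/value record (such as process.env after the .env file has been loaded). The MinerEnv interface and the readMinerEnv function are hypothetical names, not part of the miner's code, and treating a missing GITHUB_TOKEN or PERSONAL_WALLET_ADDRESS as an error is an assumption of this sketch.

```typescript
// Hypothetical shape for the three documented variables; not the miner's actual types.
interface MinerEnv {
  minerName: string;
  githubToken: string;
  personalWalletAddress: string;
}

// Reads the variables from a key/value record and applies the documented default.
// Assumption: a missing token or wallet address is reported as an error here.
function readMinerEnv(env: Record<string, string | undefined>): MinerEnv {
  const githubToken = (env["GITHUB_TOKEN"] ?? "").trim();
  const personalWalletAddress = (env["PERSONAL_WALLET_ADDRESS"] ?? "").trim();
  if (githubToken === "") {
    throw new Error("GITHUB_TOKEN is not set");
  }
  if (personalWalletAddress === "") {
    throw new Error("PERSONAL_WALLET_ADDRESS is not set");
  }
  // MINER_NAME is optional and defaults to 'client'.
  const minerName = (env["MINER_NAME"] ?? "").trim() || "client";
  return { minerName, githubToken, personalWalletAddress };
}
```

Passing the record in as an argument, rather than reading a global, keeps the function easy to check with different inputs.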

Dependencies

  • The miner uses srcML to parse some languages to XML. Install the relevant executable. If srcML is not installed, the miner skips all files that have to be parsed with it.
  • The miner also uses Git to interface with GitHub. Make sure it is installed and run the following commands in a terminal with admin rights:
    • git config --system core.longpaths true - Some filenames are too long to be accessed with git, and this flag enables long filenames.
    • git config --system core.protectNTFS false - Some file paths are incorrectly formatted for NTFS filesystems (e.g. they contain symbols such as : or *), and this flag disables the check for those file paths.

Library Dependencies

searchSECO-miner uses the following external libraries and modules:

  • cassandra-driver: ^4.6.4
  • copyfiles: ^2.4.1
  • dotenv: ^16.0.3
  • prompt-sync: ^4.2.0
  • uuid: ^9.0.0
  • yargs: ^17.7.2
  • searchseco-crawler: "file:src/modules/searchSECO-crawler"
  • searchseco-databaseapi: "file:src/modules/searchSECO-databaseAPI"
  • searchseco-logger: "file:src/modules/searchSECO-logger"
  • searchseco-parser: "file:src/modules/searchSECO-parser"
  • searchseco-spider: "file:src/modules/searchSECO-spider"

Installing and running the miner

Run using npm

  • Install submodules: git submodule init
  • Update the submodules: git submodule update --init --recursive
  • Fill in the relevant variables in the .env file and install dependencies: npm i
  • Build the miner for the target operating system: npm run build-win or npm run build-unix
  • Run the miner with the following command structure: npm run execute -- <command> [options]
  • To get a list of all commands and options, run: npm run execute -- --help
  • For example: npm run execute -- check https://github.com/SecureSECO/searchSECO-miner -V 5
  • For help with npm's run command: npm help run-script
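The command structure relies on the -- separator: everything after it is passed on to the miner rather than interpreted by npm. The following TypeScript sketch assembles such an argument list. The helper name buildExecuteArgs is hypothetical, and the only command and option used are the ones from the example above.

```typescript
// Hypothetical helper: builds the argument list for `npm run execute -- <command> [options]`.
// The "--" entry marks where npm's own arguments end and the miner's arguments begin.
function buildExecuteArgs(command: string, options: string[] = []): string[] {
  return ["run", "execute", "--", command, ...options];
}

// Reproduces the documented example invocation.
const exampleArgs = buildExecuteArgs("check", [
  "https://github.com/SecureSECO/searchSECO-miner",
  "-V",
  "5",
]);

// The full command line, as it would be typed in a terminal.
const exampleCommandLine = ["npm", ...exampleArgs].join(" ");
```

Keeping the arguments as an array avoids quoting problems when the list is later handed to a process-spawning API.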

Build from source

Optionally fill in all relevant variables in .env, then run the following command, choosing the target that matches your operating system:

npm run package-[win|linux|mac]

This creates a folder called ./build that contains the executable. The executable can be run the same way as the one in the latest release, but the github_token option does not have to be set if the .env file has been created and filled in.

Verbosity

The miner can be set to be more or less verbose. Each command can be suffixed with a --verbose [VERBOSITY] flag. The verbosity values are:

  • 1: Silent. Only [INFO] messages are shown
  • 2: Errors only
  • 3: Errors and warnings only
  • 4: Everything
  • 5: Everything including [DEBUG] statements
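The five levels can be modelled as a small enum with a validating parser. This is only a sketch: the member names and the parseVerbosity function are assumptions made for illustration and do not come from the miner's source. Only the numeric values 1 to 5 and their meanings are taken from the list above.

```typescript
// Verbosity levels as documented; the member names are this sketch's own.
enum Verbosity {
  Silent = 1,            // only [INFO] messages are shown
  ErrorsOnly = 2,
  ErrorsAndWarnings = 3,
  Everything = 4,
  Debug = 5,             // everything including [DEBUG] statements
}

// Converts the raw flag value to a Verbosity, rejecting anything outside 1..5.
function parseVerbosity(raw: string): Verbosity {
  const level = Number(raw);
  if (!Number.isInteger(level) || level < Verbosity.Silent || level > Verbosity.Debug) {
    throw new Error(`Verbosity must be an integer from 1 to 5, got "${raw}"`);
  }
  return level as Verbosity;
}
```

For instance, parseVerbosity("5") yields Verbosity.Debug, while a value such as "0" or "loud" is rejected with an error instead of being silently accepted.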

License

This project is licensed under the MIT license. See LICENSE for more info.

This program has been developed by students from the bachelor Computer Science at Utrecht University within the Software Project course. © Copyright Utrecht University (Department of Information and Computing Sciences)

Related Articles

Jansen, S., Farshidi, S., Gousios, G., Visser, J., Storm, T. V. D., & Bruntink, M. (2020). SearchSECO: A Worldwide Index of the Open Source Software Ecosystem. In M. Papadakis, & M. Cordy (Eds.), Proceedings of the 19th Belgium-Netherlands Software Evolution Workshop, BENEVOL 2020, Luxembourg, December 3-4, 2020 (Vol. 2912). (CEUR Workshop Proceedings). CEUR-WS.org. http://ceurws.org/Vol-2912/paper3.pdf

Deekshitha, S. Farshidi, J. Maassen, R. Bakhshi, R. Van Nieuwpoort and S. Jansen, "FAIRSECO: An Extensible Framework for Impact Measurement of Research Software," 2023 IEEE 19th International Conference on e-Science (e-Science), Limassol, Cyprus, 2023, pp. 1-10, doi: 10.1109/e-Science58273.2023.10254664. https://ieeexplore.ieee.org/document/10254664

Islam Aminul, Jansen Slinger. (2024). Securing Software Ecosystems through Repository Mining. The 15th International Conference on Software Business (ICSOB 2024), November 18-20, 2024, Utrecht, The Netherlands, Vol-3921. https://ceur-ws.org/Vol-3921/phd-paper8.pdf

Islam Aminul, Krishna Kaipa, Jansen Slinger. (2024). Work in Progress Paper: Detecting Method Level License Conflicts in the Worldwide Software Ecosystem. BENEVOL'24: Belgium-Dutch Software Evolution Workshop. Conference link: https://benevol2024.github.io/

Owner

  • Name: SecureSECO
  • Login: SecureSECO
  • Kind: organization
  • Email: slinger.jansen@uu.nl
  • Location: Netherlands

Citation (CITATION.cff)

cff-version: 1.2.0
message: ' If you use this software or the associated data,
  feel free to cite us.'
title: 'searchSECO-miner'
type: software
authors:
  - family-names: Jansen
    given-names: Slinger
    affiliation: PI and coordinator of the project
    orcid: '0000-0003-3752-2868'
  - family-names: Giezeman
    given-names: Geert-Jan
    affiliation: RSE
  - family-names: Voordouw
    given-names: Martijn
    affiliation: RSE
  - family-names: Beffers
    given-names: Wouter
  - family-names: Islam
    given-names: Aminul
    orcid: '0009-0005-4792-4256'
  - given-names: Deekshitha
    orcid: '0000-0003-1831-8941'
version: 'V1.0.0'
url: 'https://github.com/SecureSECO/searchSECO-miner'
license: 'https://opensource.org/licenses/MIT'

CodeMeta (codemeta.json)

{
  "$schema": "http://codemeta.github.io/codemeta.json",
  "name": "searchSECO-miner",
  "description": "This is the repository for the SecureSECO DAO miner built to scrape Github, upload project data to the SecureSECO database and to connect with the DAO to facilitate claiming of rewards.",
  "version": "V1.0.0",
  "url": "https://github.com/SecureSECO/searchSECO-miner",
  "license": "https://opensource.org/licenses/MIT",
  "author": [
    {
      "@type": "Person",
      "name": "Slinger Jansen",
      "affiliation": "PI and coordinator of the project",
      "identifier": {
        "@type": "PropertyValue",
        "propertyID": "ORCID",
        "value": "https://orcid.org/0000-0003-3752-2868"
      }
    },
    {
      "@type": "Person",
      "name": "Geert-Jan Giezeman",
      "affiliation": "RSE"
    },
    {
      "@type": "Person",
      "name": "Martijn Voordouw",
      "affiliation": "RSE"
    },
    {
      "@type": "Person",
      "name": "Wouter Beffers"
    },
    {
      "@type": "Person",
      "name": "Aminul Islam",
      "identifier": {
        "@type": "PropertyValue",
        "propertyID": "ORCID",
        "value": "https://orcid.org/0009-0005-4792-4256"
      }
    },
    {
      "@type": "Person",
      "name": "Deekshitha",
      "identifier": {
        "@type": "PropertyValue",
        "propertyID": "ORCID",
        "value": "https://orcid.org/0000-0003-1831-8941"
      }
    }
  ],
  "keywords": [
    "data mining",
    "research software"
  ],
  "repository": "https://github.com/SecureSECO/searchSECO-miner",
  "referencePublication": [
    {
      "@type": "ScholarlyArticle",
      "name": "SearchSECO: A Worldwide Index of the Open Source Software Ecosystem",
      "publisher": "CEUR-WS.org",
      "url": "http://ceurws.org/Vol-2912/./paper3.pdf"
    },
    {
      "@type": "ScholarlyArticle",
      "name": "FAIRSECO: An Extensible Framework for Impact Measurement of Research Software",
      "publisher": "IEEE",
      "url": "https://ieeexplore.ieee.org/document/10254664"
    }
  ],
  "identifier": {
    "@type": "PropertyValue",
    "propertyID": "DOI",
    "value": "10.5281/zenodo.13710367"
  }
}

GitHub Events

Total
  • Issues event: 3
  • Watch event: 1
  • Issue comment event: 3
  • Push event: 60
  • Pull request event: 6
  • Create event: 1
Last Year
  • Issues event: 3
  • Watch event: 1
  • Issue comment event: 3
  • Push event: 60
  • Pull request event: 6
  • Create event: 1