https://github.com/chaoss/grimoirelab-graal

A Generic Repository AnALyzer

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.4%) to scientific vocabulary

Keywords

agnostic analysis generic source-code

Keywords from Contributors

grimoirelab orchestration software-analytics data-enrichment chaoss community handbook mentorship project-governance
Last synced: 5 months ago

Repository

A Generic Repository AnALyzer

Basic Info
  • Host: GitHub
  • Owner: chaoss
  • License: gpl-3.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 4.44 MB
Statistics
  • Stars: 25
  • Watchers: 9
  • Forks: 64
  • Open Issues: 18
  • Releases: 0
Topics
agnostic analysis generic source-code
Created almost 8 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog License Authors

README.md

Graal: a Generic Repository AnALyzer

Graal builds on the Git backend of Perceval and extends it to set up ad-hoc source code analysis. It fetches the commits from a Git repository and provides a mechanism to plug in third-party tools and libraries focused on source code analysis.

How it works

The Perceval Git backend creates a local mirror of a Git repository (local or remote), fetches the metadata of commits in chronological order and returns them as a list of JSON documents (one per commit). Graal builds on the incremental functionalities provided by the Git backend and extends the logic for handling Git repositories by creating a working tree to perform checkout operations (which are not possible on a Git mirror). Graal intercepts each JSON document and enables the user to perform the following steps:

  • Filter. The filtering is used to select or discard commits based on the information available in the JSON document and/or via the Graal parameters. For any selected commit, Graal executes a checkout on the working tree using the commit hash, thus setting the state of the working tree at that given revision.
  • Analyze. The analysis takes the JSON document and the current working tree and enables the user to set up ad-hoc source code analysis by plugging in existing tools through system calls or their Python interfaces, when possible. The results of the analysis are parsed and manipulated by the user and then automatically embedded in the JSON document. In this step the user can rely on some predefined functionalities of Graal to deal with the repository snapshot (e.g., listing files, creating archives).
  • Post-process. In the final step, the inflated JSON document can optionally be processed to alter its attributes (e.g., renaming or removing them), thus granting the user complete control over the output of Graal executions.
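The three steps can be pictured as a small pipeline over per-commit JSON documents. The sketch below is not Graal's code: `run_pipeline`, the three step functions and the sample documents are hypothetical, and the checkout on the working tree is only marked with a comment.

```python
def keep_commit(doc):
    """Filter: keep commits that touch at least one Python file."""
    return any(path.endswith(".py") for path in doc["files"])


def analyze(doc):
    """Analyze: stand-in for a real tool, counts files per extension."""
    counts = {}
    for path in doc["files"]:
        ext = path.rsplit(".", 1)[-1] if "." in path else ""
        counts[ext] = counts.get(ext, 0) + 1
    return counts


def post_process(doc):
    """Post-process: drop an attribute that is no longer needed."""
    doc.pop("files", None)
    return doc


def run_pipeline(docs):
    """Apply filter, analyze and post-process to each document in order."""
    for doc in docs:
        if not keep_commit(doc):
            continue
        # A real run would check out doc["commit"] on the working tree here.
        doc["analysis"] = analyze(doc)
        yield post_process(doc)


# Invented sample documents, far smaller than real commit documents.
commits = [
    {"commit": "a1", "files": ["setup.py", "README.md"]},
    {"commit": "b2", "files": ["docs/index.rst"]},
]
results = list(run_pipeline(commits))
print(results)
```

Only the first sample document survives the filter; it leaves the pipeline with an `analysis` attribute and without its `files` attribute.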

Several parameters (inherited from the Git backend) are available to control the execution; for instance, from_date and to_date select commits authored since and before a given date, branches fetches commits only from specific branches, and latest_items returns only those commits which are new since the last fetch operation. Graal includes additional parameters to drive the analysis: in_paths and out_paths filter files and directories of the repository in or out, while others set the entrypoint and define the level of detail of the analysis (useful when analyzing large software projects).
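As an illustration of how in_paths and out_paths could narrow an analysis, here is a minimal sketch. It assumes simple prefix matching on repository-relative paths, which may differ from the matching rules a given Graal backend applies; `select_paths` and the sample paths are hypothetical and not part of Graal.

```python
def select_paths(paths, in_paths=None, out_paths=None):
    """Keep paths under any in_paths prefix and under no out_paths prefix.

    Assumption: plain prefix matching; a missing filter means "no restriction".
    """
    selected = []
    for path in paths:
        if in_paths and not any(path.startswith(p) for p in in_paths):
            continue
        if out_paths and any(path.startswith(p) for p in out_paths):
            continue
        selected.append(path)
    return selected


# Invented repository-relative paths.
files = ["src/app.py", "src/vendor/lib.py", "tests/test_app.py"]
print(select_paths(files, in_paths=["src/"], out_paths=["src/vendor/"]))
```

With both filters set, only `src/app.py` remains; with neither set, every path is kept.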

Requirements

You will also need some other Python libraries to run the tool; the whole list of dependencies is in the pyproject.toml file.

How to install and create the executables:

  • github-linguist

Library used to detect blob languages, ignore binary or vendored files, suppress generated files in diffs, and generate language breakdown graphs.

$ gem install github-linguist -v 7.15

  • FOSSology

Open source license compliance software system and toolkit. You can run license, copyright, and export control scans from the command line.

$ wget https://github.com/fossology/fossology/releases/download/3.11.0/FOSSology-3.11.0-ubuntu-focal.tar.gz
$ tar -xzf FOSSology-3.11.0-ubuntu-focal.tar.gz
$ sudo apt-get -y install ./packages/fossology-common_3.11.0-1_amd64.deb \
    ./packages/fossology-nomos_3.11.0-1_amd64.deb

  • Cloc

Count blank lines, comment lines, and physical lines of source code in many programming languages.

$ sudo apt-get install cloc

  • SCC

A tool similar to cloc, written in pure Go, that counts lines of code, blank lines, comment lines, and physical lines of source code in many programming languages, and produces COCOMO estimates.

$ go install github.com/boyter/scc@latest

  • ScanCode Toolkit

ScanCode detects licenses, copyrights, package manifests and dependencies, and more by scanning code.

$ mkdir exec
$ cd exec
$ git clone https://github.com/nexB/scancode-toolkit.git
$ cd scancode-toolkit
$ git checkout -b test_scancli 96069fd84066c97549d54f66bd2fe8c7813c6b52
$ ./scancode --help

Note: We now use a clone of scancode-toolkit instead of a release, because the latest release dates from 15th February 2019, the scancli.py script (required for the execution of scancode_cli) was incorporated later, on 5th March 2019, and there has not been a release since.

  • crossJadolint

This is a Dockerfile linter implemented in Java; it is essentially a port of Hadolint (Haskell Dockerfile Linter).

$ cd exec
$ wget https://github.com/crossminer/crossJadolint/releases/download/Pre-releasev2/jadolint.jar

Installation

There are several ways to install Graal on your system: packages or source code using Poetry or pip.

PyPI

Graal can be installed using pip, a tool for installing Python packages. To do so, run the following command:

$ pip install graal

Source code

To install from the source code you will need to clone the repository first:

$ git clone https://github.com/chaoss/grimoirelab-graal
$ cd grimoirelab-graal

Then use pip or Poetry to install the package along with its dependencies.

Pip

To install the package from the local directory, run the following command:

$ pip install .

If you are a developer, you should install graal in editable mode:

$ pip install -e .

Poetry

We use Poetry for dependency management and packaging. You can install it following its documentation. Once you have installed it, you can install graal and its dependencies in a project-isolated environment using:

$ poetry install

To spawn a new shell within the virtual environment use:

$ poetry shell

Backends

Several backends have been developed to assess the genericity of Graal. These backends rely on source code analysis tools, whose executions are triggered via system calls or their Python interfaces. In their current state, the backends mostly target Python code; however, other backends can easily be developed to cover other programming languages. The currently available backends are:

  • CoCom gathers data about code complexity (e.g., cyclomatic complexity, LOC) from projects written in popular programming languages such as C/C++, Java, Scala, JavaScript, Ruby, Python, Lua and Golang. It relies on Cloc, Lizard and scc. The tool can be executed at file and repository levels, selected through the category: code_complexity_lizard_file or code_complexity_lizard_repository.
  • CoDep extracts package and class dependencies of a Python module and serializes them as JSON structures composed of edges and nodes, thus easing the bridging with front-end technologies for graph visualizations. It combines PyReverse and NetworkX.
  • CoQua retrieves code quality insights, such as checks on line length, well-formed variable names, unused imported modules and code clones. It uses PyLint and Flake8. The tools can be activated by passing the corresponding category: code_quality_pylint or code_quality_flake8.
  • CoVuln scans the code to identify security vulnerabilities such as potential SQL and shell injections, hard-coded passwords and weak cryptographic key sizes. It relies on Bandit.
  • CoLic scans the code to extract license and copyright information. It currently supports Nomos and ScanCode. They can be activated by passing the corresponding category: code_license_nomos, code_license_scancode, or code_license_scancode_cli.
  • CoLang gathers insights about the code language distribution of a Git repository. It relies on the Linguist and Cloc tools. They can be activated by passing the corresponding category: code_language_linguist or code_language_cloc.

How to develop a backend

Creating your own backend is pretty easy: you only need to redefine the following methods of Graal:

  • _filter_commit. This method is used to select or discard commits based on the information available in the JSON document and/or via the Graal parameters (e.g., the commits authored by a given user or targeting a given software component). For any selected commit, Graal executes a checkout on the working tree using the commit hash, thus setting the state of the working tree at that given revision.
  • _analyze. This method takes the document and the current working tree and lets you connect existing tools through system calls or their Python interfaces, when possible. The results of the analysis, parsed and manipulated by the user, are automatically embedded in the JSON document.
  • _post. This method lets you alter the attributes of the inflated JSON documents (e.g., renaming or removing them).
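The override pattern can be sketched without Graal installed. In the toy below, `MiniGraal` is a hypothetical stand-in for the Graal base class that only reproduces the order in which the three hooks are called. The hook signatures (one commit dictionary in, a value out), the convention that a true value from `_filter_commit` means "skip this commit", and the `message` and `analysis` keys are all assumptions made for illustration; check the Graal source before porting this to a real backend.

```python
class MiniGraal:
    """Hypothetical stand-in for the Graal base class (hook order only)."""

    def fetch(self, commits):
        for commit in commits:
            # Assumed convention: a true value means "discard this commit".
            if self._filter_commit(commit):
                continue
            commit["analysis"] = self._analyze(commit)
            yield self._post(commit)

    def _filter_commit(self, commit):
        return False  # by default, discard nothing

    def _analyze(self, commit):
        return {}

    def _post(self, commit):
        return commit


class MessageStats(MiniGraal):
    """Toy backend: measures commit messages, skipping merge commits."""

    def _filter_commit(self, commit):
        return commit["message"].startswith("Merge")

    def _analyze(self, commit):
        lines = commit["message"].splitlines()
        return {"subject_length": len(lines[0]), "body_lines": len(lines) - 1}

    def _post(self, commit):
        commit.pop("message")  # drop the raw message from the output
        return commit


# Invented sample commits.
sample = [
    {"commit": "a1", "message": "Fix parser\n\nHandle empty input."},
    {"commit": "b2", "message": "Merge branch 'dev'"},
]
inflated = list(MessageStats().fetch(sample))
print(inflated)
```

The subclass touches nothing but the three hooks; iteration, the embedding of the analysis result and the ordering stay in the base class, which is the division of labour the list above describes.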

How to use

From command line

Launching Graal from the command line requires only some basic knowledge of GNU/Linux shell commands.

The example below shows how easy it is to fetch code complexity information from a Git repository. The CoCom backend requires the URL where the repository is located (https://github.com/chaoss/grimoirelab-perceval) and the local path where to mirror the repository (/tmp/graal-cocom). Then, the JSON documents produced are redirected to the file graal-cocom.test.

  • CoCom Backend

$ graal cocom https://github.com/chaoss/grimoirelab-perceval --git-path /tmp/graal-cocom > /graal-cocom.test
Starting the quest for the Graal.
Git worktree /tmp/... created!
Fetching commits: ...
Git worktree /tmp/... deleted!
Fetch process completed: .. commits inspected
Quest completed.

  • CoLic Backend

$ graal colic https://github.com/chaoss/grimoirelab-toolkit --git-path /tmp/scancode_cli --exec-path /home/scancode-toolkit/etc/scripts/scancli.py --category code_license_scancode_cli
Starting the quest for the Graal.
Git worktree /tmp/... created!
Fetching commits: ...
Git worktree /tmp/... deleted!
Fetch process completed: .. commits inspected
Quest completed.

In the above example, we're using the scancode_cli analyzer. Similarly, we can use the scancode analyzer by providing the category `code_license_scancode` and its corresponding executable path.
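The JSON documents redirected to a file in the CoCom example can be loaded back for further processing. The sketch below assumes the file holds JSON documents written one after another, possibly pretty-printed over several lines; `iter_json_documents` is a hypothetical helper built on the standard library's `json.JSONDecoder.raw_decode`, and the sample text is invented for the demonstration.

```python
import json


def iter_json_documents(text):
    """Yield each JSON document found in a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between documents; raw_decode does not accept it.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        document, pos = decoder.raw_decode(text, pos)
        yield document


# Two invented documents, the first one spread over two lines.
sample = '{"commit": "a1",\n "analysis": []}\n{"commit": "b2", "analysis": []}\n'
documents = list(iter_json_documents(sample))
print([doc["commit"] for doc in documents])
```

For a real run, the string would come from reading the output file in full. Because the helper works on document boundaries rather than line boundaries, it handles both one-document-per-line and indented output.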

From Python

Graal's functionalities can be embedded in Python scripts. Again, the effort of using Graal is minimal: the user only needs some knowledge of Python scripting. The example below shows how to use Graal in a script.

The graal.backends.core.cocom module is imported at the beginning of the file, then the repo_uri and repo_dir variables are set to the URI of the Git repository and the local path where to mirror it. These variables are used to initialize a CoCom class object. In the last line of the script, the commits inflated with the result of the analysis are retrieved using the fetch method. The fetch method inherits its arguments from Perceval, thus it optionally accepts two datetime objects to gather only those commits after and before a given date, a list of branches to focus on specific development activities, and a flag to collect the commits available after the last execution.

```
#! /usr/bin/env python3

from graal.backends.core.cocom import CoCom

# URL for the git repo to analyze
repo_uri = 'http://github.com/chaoss/grimoirelab-perceval'

# directory where to mirror the repo
repo_dir = '/tmp/graal-cocom'

# CoCom object initialization
cc = CoCom(uri=repo_uri, git_path=repo_dir)

# fetch all commits
commits = [commit for commit in cc.fetch()]
```

How to integrate it with Arthur

Arthur is another tool of the GrimoireLab ecosystem. It was originally designed to schedule and run Perceval executions at scale through distributed Redis queues, and to store the obtained results in an ElasticSearch database.

Arthur has been extended to allow handling Graal tasks, which inherit from Perceval Git tasks. The code to make this extension possible is available at: https://github.com/chaoss/grimoirelab-kingarthur/pull/33.

Information about Arthur is available at https://github.com/chaoss/grimoirelab-kingarthur.

Owner

  • Name: CHAOSS
  • Login: chaoss
  • Kind: organization

GitHub Events

Total
  • Release event: 18
  • Watch event: 5
  • Delete event: 3
  • Push event: 21
  • Pull request review event: 2
  • Pull request event: 5
  • Fork event: 3
  • Create event: 17
Last Year
  • Release event: 18
  • Watch event: 5
  • Delete event: 3
  • Push event: 21
  • Pull request review event: 2
  • Pull request event: 5
  • Fork event: 3
  • Create event: 17

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 437
  • Total Committers: 10
  • Avg Commits per committer: 43.7
  • Development Distribution Score (DDS): 0.6
Past Year
  • Commits: 67
  • Committers: 2
  • Avg Commits per committer: 33.5
  • Development Distribution Score (DDS): 0.358
Top Committers
Name Email Commits
Santiago Dueñas s****s@b****m 175
Valerio Cosentino v****s@b****m 142
Jose Javier Merchante j****e@b****m 67
inishchith i****h@g****m 44
Venu Vardhan Reddy Tekula v****u@b****m 2
Jesus M. Gonzalez-Barahona j****b@g****s 2
pranjal.aswani p****i@h****m 2
SunflowerPKU 7****1@q****m 1
Sanjana Nayar s****1@g****m 1
Eva Millán e****n@b****m 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 38
  • Total pull requests: 75
  • Average time to close issues: 26 days
  • Average time to close pull requests: 5 days
  • Total issue authors: 14
  • Total pull request authors: 12
  • Average comments per issue: 2.89
  • Average comments per pull request: 1.29
  • Merged pull requests: 56
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 0
  • Pull requests: 5
  • Average time to close issues: N/A
  • Average time to close pull requests: about 7 hours
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • inishchith (11)
  • GeorgLink (7)
  • valeriocos (5)
  • svdo (5)
  • kelson42 (1)
  • vchrombie (1)
  • leonvisscher (1)
  • jjmerchante (1)
  • vinhbt (1)
  • apoorvaanand1998 (1)
  • FredaPinto (1)
  • nhasabni (1)
  • florentk (1)
  • altsalt (1)
Pull Request Authors
  • valeriocos (26)
  • jjmerchante (26)
  • inishchith (11)
  • wmeijer221 (3)
  • vchrombie (3)
  • sduenas (3)
  • jwalden (2)
  • dependabot[bot] (2)
  • sanjana091001 (1)
  • jgbarah (1)
  • altsalt (1)
  • evamillan (1)
Top Labels
Issue Labels
Pull Request Labels
dependencies (2)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 2,063 last-month
  • Total docker downloads: 69
  • Total dependent packages: 3
  • Total dependent repositories: 19
  • Total versions: 112
  • Total maintainers: 2
pypi.org: graal

A generic source code analyzer

  • Versions: 112
  • Dependent Packages: 3
  • Dependent Repositories: 19
  • Downloads: 2,063 Last month
  • Docker Downloads: 69
Rankings
Dependent packages count: 2.4%
Dependent repos count: 3.3%
Docker downloads count: 4.3%
Forks count: 5.5%
Average: 6.0%
Downloads: 6.7%
Stargazers count: 13.6%
Maintainers (2)
Last synced: 6 months ago

Dependencies

poetry.lock pypi
  • coverage 6.4.1 develop
  • astroid 2.11.5
  • bandit 1.7.4
  • beautifulsoup4 4.11.1
  • certifi 2022.5.18.1
  • cffi 1.15.0
  • charset-normalizer 2.0.12
  • cloc 0.2.5
  • colorama 0.4.4
  • colored 1.4.3
  • cryptography 3.4.8
  • dill 0.3.5.1
  • dulwich 0.20.42
  • execnet 1.9.0
  • feedparser 6.0.10
  • flake8 4.0.1
  • gitdb 4.0.9
  • gitpython 3.1.27
  • grimoirelab-toolkit 0.3.0
  • idna 3.3
  • importlib-metadata 4.2.0
  • isort 5.10.1
  • lazy-object-proxy 1.7.1
  • lizard 1.16.6
  • mccabe 0.6.1
  • networkx 2.6.3
  • pbr 5.9.0
  • perceval 0.20.0rc1
  • platformdirs 2.5.2
  • pycodestyle 2.8.0
  • pycparser 2.21
  • pydot 1.4.2
  • pyflakes 2.4.0
  • pyjwt 2.4.0
  • pylint 2.13.9
  • pyparsing 3.0.9
  • python-dateutil 2.8.2
  • pyyaml 6.0
  • requests 2.27.1
  • sgmllib3k 1.0.0
  • six 1.16.0
  • smmap 5.0.0
  • soupsieve 2.3.2.post1
  • stevedore 3.5.0
  • tomli 2.0.1
  • typed-ast 1.5.4
  • typing-extensions 4.2.0
  • urllib3 1.26.9
  • wrapt 1.14.1
  • zipp 3.8.0
pyproject.toml pypi
  • coverage ^6.3.2 develop
  • flake8 ^4.0.1 develop
  • bandit >=1.4.0
  • cloc ^0.2.5
  • execnet ^1.9.0
  • flake8 >=3.7.7
  • grimoirelab-toolkit >=0.3
  • lizard 1.16.6
  • networkx >=2.1
  • perceval >=0.19
  • pydot >=1.2.4
  • pylint >=1.8.4
  • python ^3.7
.github/workflows/changelog.yml actions
  • bitergia/release-tools-check-changelog master composite
.github/workflows/release.yml actions
  • actions/checkout 93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc8 composite
  • actions/download-artifact fb598a63ae348fa914e94cd0ff38f362e927b741 composite
  • actions/setup-go c4a742cab115ed795e34d4513e2cf7d472deb55f composite
  • actions/setup-python 13ae5bb136fac2878aff31522b9efb785519f984 composite
  • chaoss/grimoirelab-github-actions/build master composite
  • chaoss/grimoirelab-github-actions/publish master composite
  • chaoss/grimoirelab-github-actions/release master composite
  • ruby/setup-ruby eae47962baca661befdfd24e4d6c34ade04858f7 composite
.github/workflows/tests.yml actions
  • actions/checkout 93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc8 composite
  • actions/setup-go c4a742cab115ed795e34d4513e2cf7d472deb55f composite
  • actions/setup-python 13ae5bb136fac2878aff31522b9efb785519f984 composite
  • ruby/setup-ruby eae47962baca661befdfd24e4d6c34ade04858f7 composite
tests/data/Dockerfile docker
  • debian stretch-slim build
requirements_dev.txt pypi