https://github.com/chaoss/grimoirelab-perceval
Send Sir Perceval on a quest to retrieve and gather data from software repositories.
Science Score: 46.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to researchgate.net, acm.org
- ✓ Committers with academic emails: 1 of 44 committers (2.3%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.1%) to scientific vocabulary
Repository
Send Sir Perceval on a quest to retrieve and gather data from software repositories.
Basic Info
- Host: GitHub
- Owner: chaoss
- License: gpl-3.0
- Language: Python
- Default Branch: main
- Homepage: http://perceval.readthedocs.io/
- Size: 4.04 MB
Statistics
- Stars: 303
- Watchers: 28
- Forks: 181
- Open Issues: 34
- Releases: 119
Metadata Files
README.md
Perceval

Send Sir Perceval on a quest to retrieve and gather data from software repositories.
Usage
```
usage: perceval [-g]
Send Sir Perceval on a quest to retrieve and gather data from software repositories.
Repositories are reached using specific backends. The most common backends are:
    askbot          Fetch questions and answers from Askbot site
    bugzilla        Fetch bugs from a Bugzilla server
    bugzillarest    Fetch bugs from a Bugzilla server (>=5.0) using its REST API
    confluence      Fetch contents from a Confluence server
    discourse       Fetch posts from Discourse site
    dockerhub       Fetch repository data from Docker Hub site
    gerrit          Fetch reviews from a Gerrit server
    git             Fetch commits from Git
    github          Fetch issues, pull requests and repository information from GitHub
    gitlab          Fetch issues, merge requests from GitLab
    gitter          Fetch messages from a Gitter room
    googlehits      Fetch hits from Google API
    groupsio        Fetch messages from Groups.io
    hyperkitty      Fetch messages from a HyperKitty archiver
    jenkins         Fetch builds from a Jenkins server
    jira            Fetch issues from JIRA issue tracker
    launchpad       Fetch issues from Launchpad issue tracker
    mattermost      Fetch posts from a Mattermost server
    mbox            Fetch messages from MBox files
    mediawiki       Fetch pages and revisions from a MediaWiki site
    meetup          Fetch events from a Meetup group
    nntp            Fetch articles from a NNTP news group
    pagure          Fetch issues from Pagure
    phabricator     Fetch tasks from a Phabricator site
    pipermail       Fetch messages from a Pipermail archiver
    redmine         Fetch issues from a Redmine server
    rocketchat      Fetch messages from a Rocket.Chat channel
    rss             Fetch entries from a RSS feed server
    slack           Fetch messages from a Slack channel
    stackexchange   Fetch questions from StackExchange sites
    supybot         Fetch messages from Supybot log files
    telegram        Fetch messages from the Telegram server
    twitter         Fetch tweets from the Twitter Search API
optional arguments:
  -h, --help     show this help message and exit
  -v, --version  show version
  -g, --debug    set debug mode on
  -l, --list     show available backends
Run 'perceval
```
Requirements
- Python >= 3.10
- Poetry >= 1.2
- git
- build-essential
You will also need some other libraries to run the tool. The whole list of dependencies is in the pyproject.toml file.
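The requirements above can be checked before installing. The following is a minimal sketch, not part of Perceval: the function name `check_requirements` is hypothetical, and it assumes that Git and Poetry are exposed on `PATH` as `git` and `poetry`. It does not check build-essential, which is a set of packages rather than a single executable.

```python
import shutil
import sys

# Minimum interpreter version, taken from the requirements list above.
MIN_PYTHON = (3, 10)

# Assumption: these are the executable names of the listed tools on PATH.
REQUIRED_TOOLS = ("git", "poetry")


def check_requirements(version_info=None, which=shutil.which):
    """Return a list of problems found; an empty list means all checks passed."""
    if version_info is None:
        version_info = sys.version_info
    problems = []
    # Compare only (major, minor) against the minimum version.
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python %d.%d or later is required" % MIN_PYTHON)
    # shutil.which returns None when the executable is not on PATH.
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append("%s was not found on PATH" % tool)
    return problems
```

Calling `check_requirements()` with no arguments inspects the running interpreter and the current `PATH`; the parameters exist only so the checks can be exercised with fixed values.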
How to install
- build-essential
Build-essential is a package that contains a set of tools to compile and build software. It is required to work with Debian packages.
$ sudo apt-get install build-essential
- git
Git is a version control system that allows you to keep track of changes in your code. It is required to work with Git repositories.
$ sudo apt-get install git
Installation
There are several ways to install Perceval on your system: from packages, from source code using Poetry or pip, or using Docker.
PyPI
Perceval can be installed using pip, a tool for installing Python packages. To
do so, run the following command:
$ pip install perceval
Source code
To install from the source code you will need to clone the repository first:
$ git clone https://github.com/chaoss/grimoirelab-perceval
$ cd grimoirelab-perceval
Then use pip or Poetry to install the package along with its dependencies.
Pip
To install the package from the local directory, run the following command:
$ pip install .
In case you are a developer, you should install perceval in editable mode:
$ pip install -e .
Poetry
We use Poetry for dependency management and
packaging. You can install it by following its
documentation. Once it is
installed, you can install Perceval and its dependencies in an isolated
project environment using:
$ poetry install
To spawn a new shell within the virtual environment, use:
$ poetry shell
Docker
A Perceval Docker image is available at DockerHub.
Detailed information on how to run and/or build this image can be found here.
Documentation
Documentation is generated automatically and published on the Perceval ReadTheDocs site.
References
If you use Perceval in your research papers, please cite "Perceval: software project data at your will" (a pre-print is available):
APA style
Dueñas, S., Cosentino, V., Robles, G., & Gonzalez-Barahona, J. M. (2018, May). Perceval: software project data at your will. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings (pp. 1-4). ACM.
BibTeX
@inproceedings{duenas2018perceval,
title={Perceval: software project data at your will},
author={Due{\~n}as, Santiago and Cosentino, Valerio and Robles, Gregorio and Gonzalez-Barahona, Jesus M},
booktitle={Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings},
pages={1--4},
year={2018},
organization={ACM}
}
Examples
Askbot
$ perceval askbot 'http://askbot.org/' --from-date '2016-01-01'
Bugzilla
To fetch bugs from Bugzilla, you have two options:
a) Use the traditional backend
$ perceval bugzilla 'https://bugzilla.redhat.com/' --backend-user user --backend-password pass --from-date '2016-01-01'
b) Use the REST API backend for Bugzilla 5.0 (or higher) servers. We strongly recommend this backend when data is fetched from servers running version 5.0 or later because the retrieval process is much faster.
$ perceval bugzillarest 'https://bugzilla.mozilla.org/' --backend-user user --backend-password pass --from-date '2016-01-01'
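The choice between the two Bugzilla backends can be sketched as a small helper. This is an illustration only: the function name `choose_bugzilla_backend` is hypothetical, and it assumes you already know the server's version string (how to obtain it is not covered here).

```python
import re


def choose_bugzilla_backend(server_version):
    """Return the backend name for a version string such as "5.0.4"."""
    # Read the leading "major.minor" numbers and ignore any suffix.
    match = re.match(r"\s*(\d+)(?:\.(\d+))?", server_version)
    if match is None:
        raise ValueError("unrecognized version: %r" % server_version)
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    # The REST backend is the recommended one for 5.0 and later.
    return "bugzillarest" if (major, minor) >= (5, 0) else "bugzilla"
```

For instance, `choose_bugzilla_backend("4.4.13")` selects the traditional backend, while any `5.x` version selects the REST one.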
Confluence
$ perceval confluence 'https://wiki.opnfv.org/' --from-date '2016-01-01'
Discourse
$ perceval discourse 'https://foro.mozilla-hispano.org/' --from-date '2016-01-01'
Docker Hub
$ perceval dockerhub grimoirelab perceval
Gerrit
To run gerrit, you will need an authorized SSH private key:
$ eval `ssh-agent -s`
$ ssh-add ~/.ssh/id_rsa
Identity added: /home/user/.ssh/id_rsa (/home/user/.ssh/id_rsa)
To run the backend, execute the following command:
$ perceval gerrit --user user 'review.openstack.org' --from-date '2016-01-01'
Git
To run this backend, execute the following command. Note that Git must be installed on your system.
$ perceval git 'https://github.com/chaoss/grimoirelab-perceval.git' --from-date '2016-01-01'
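Perceval can also be used as a Python library. The sketch below is hedged: the import path `perceval.backends.core.git.Git`, the `uri` and `gitpath` arguments, the `fetch()` method, and the `data` and `Author` fields of each item are not described in this README and should be checked against the Perceval documentation. The helper `commits_per_author` is hypothetical and works on any iterable of items with that shape.

```python
from collections import Counter


def fetch_commits(uri, gitpath):
    """Yield commit items through Perceval's Git backend (needs perceval)."""
    # Imported lazily so the helper below can be used without Perceval.
    from perceval.backends.core.git import Git

    # gitpath is the local directory where the repository is mirrored.
    repo = Git(uri=uri, gitpath=gitpath)
    yield from repo.fetch()


def commits_per_author(items):
    """Count items per author, assuming each item has data["Author"]."""
    counts = Counter()
    for item in items:
        counts[item["data"]["Author"]] += 1
    return dict(counts)
```

A possible use is `commits_per_author(fetch_commits('https://github.com/chaoss/grimoirelab-perceval.git', '/tmp/perceval.git'))`, where the local path is only an example.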
To run the backend against a private git repository, you must pass the credentials directly in the URL:
$ perceval git https://<username>:<password>@repository-url
For example, for private GitHub repositories:
$ perceval git https://<username>:<api-token>@github.com/chaoss/grimoirelab-perceval
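Usernames, passwords and tokens may contain characters such as `@` or `:` that would break a URL if pasted in directly. The following sketch percent-encodes them first, using only the Python standard library. The function name `add_credentials` is hypothetical, and the sketch assumes Git decodes percent-encoded credentials in URLs.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def add_credentials(url, username, secret):
    """Return url with percent-encoded username and secret embedded."""
    parts = urlsplit(url)
    # safe="" forces every reserved character to be encoded.
    userinfo = "%s:%s" % (quote(username, safe=""), quote(secret, safe=""))
    # Drop any credentials already present and keep host and port.
    host = parts.netloc.rpartition("@")[2]
    netloc = "%s@%s" % (userinfo, host)
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

The resulting string can be passed as the repository URL to the `git` backend.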
The Git backend can also work with a Git log file as input. We recommend using the following command to get the most complete log file.
$ git log --raw --numstat --pretty=fuller --decorate=full --parents --reverse --topo-order -M -C -c --remotes=origin --all > /tmp/gitlog.log
Then, to run the backend, execute either of the following commands:
$ perceval git --git-log '/tmp/gitlog.log' 'file:///myrepo.git'
or
$ perceval git '/tmp/gitlog.log'
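The log file can also be produced from a script. This sketch runs the recommended `git log` command with Python's `subprocess` module; the function names are hypothetical, and `git -C <dir>` is used to run the command inside the repository directory.

```python
import subprocess

# Options copied from the recommended "git log" command above.
GIT_LOG_ARGS = [
    "log", "--raw", "--numstat", "--pretty=fuller", "--decorate=full",
    "--parents", "--reverse", "--topo-order", "-M", "-C", "-c",
    "--remotes=origin", "--all",
]


def git_log_command(repo_dir):
    """Build the argument list; "git -C repo_dir" selects the repository."""
    return ["git", "-C", repo_dir] + GIT_LOG_ARGS


def write_git_log(repo_dir, log_path):
    """Run git log in repo_dir and write its output to log_path."""
    with open(log_path, "wb") as log_file:
        # check=True raises CalledProcessError if git exits with an error.
        subprocess.run(git_log_command(repo_dir), stdout=log_file, check=True)
    return log_path
```

Passing the arguments as a list avoids shell quoting issues, and the file written by `write_git_log` can then be given to `perceval git` as shown above.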
GitHub
$ perceval github elastic logstash --from-date '2016-01-01'
The GitHub backend accepts the categories issue, pull_request and
repository, which select the type of data to fetch.
$ perceval github --category issue elastic logstash
GitLab
$ perceval gitlab fdroid fdroiddata -t $GITLAB_TOKEN --from-date '2016-01-01'
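The command above reads the token from the `GITLAB_TOKEN` environment variable rather than writing it on the command line. The same idea can be sketched in Python when Perceval is launched from a script. The function name `gitlab_command` is hypothetical, and only the options shown in the example above are used.

```python
import os


def gitlab_command(owner, repository, from_date, env=None,
                   token_var="GITLAB_TOKEN"):
    """Build the perceval argument list, reading the token from env."""
    env = os.environ if env is None else env
    token = env.get(token_var)
    if not token:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError("set %s before running perceval" % token_var)
    return [
        "perceval", "gitlab", owner, repository,
        "-t", token, "--from-date", from_date,
    ]
```

The returned list can be passed to `subprocess.run`, which avoids shell quoting and keeps the token out of the script itself.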
Gitter
$ perceval gitter -t 'abcdefghi' --from-date '2020-03-18' 'jenkinsci' 'jenkins'
GoogleHits
$ perceval googlehits "bitergia grimoirelab"
Groups.io
$ perceval groupsio 'updates' -e '<me@example.com>' -p 'my-password' --from-date '2016-01-01'
To fetch the data from a group, you should first subscribe to it via
the Groups.io website. To list the names of the groups you are
subscribed to, you can use the following script:
https://gist.github.com/valeriocos/ad33a0b9b2d13a8336230c8c59df3c55
HyperKitty
$ perceval hyperkitty 'https://lists.mailman3.org/archives/list/mailman-users@mailman3.org' --from-date 2017-01-01
Jenkins
$ perceval jenkins 'https://build.opnfv.org/ci/'
JIRA
$ perceval jira 'https://tickets.puppetlabs.com' --project PUP --from-date '2016-01-01'
Launchpad
$ perceval launchpad ubuntu --from-date '2016-01-01'
Mattermost
$ perceval mattermost 'http://mattermost.example.com' jgw7jdmjkjf19ffkwnw59i5f9e --from-date '2016-01-01' -t 'abcdefghijk'
MBox
$ perceval mbox 'http://example.com' /tmp/mboxes/
MediaWiki
$ perceval mediawiki 'https://wiki.mozilla.org' --from-date '2016-06-30'
Meetup
$ perceval meetup 'Software-Development-Analytics' --from-date '2016-06-01' -t abcdefghijk
NNTP
$ perceval nntp 'news.mozilla.org' 'mozilla.dev.project-link' --offset 10
Pagure
$ perceval pagure '389-ds-base' --from-date '2020-03-06'
Phabricator
$ perceval phabricator 'https://secure.phabricator.com/' -t 123456789abcefe
Pipermail
$ perceval pipermail 'https://mail.gnome.org/archives/libart-hackers/'
Pipermail is also able to fetch data from Apache's mod_mbox interface:
$ perceval pipermail 'http://mail-archives.apache.org/mod_mbox/httpd-dev/'
Redmine
$ perceval redmine 'https://www.redmine.org/' --from-date '2016-01-01' -t abcdefghijk
Rocket.Chat
The Rocket.Chat backend needs an API token and a user ID to authenticate to the
server.
$ perceval rocketchat -t 'abchdefghij' -u '1234abcd' --from-date '2020-05-02' https://open.rocket.chat general
RSS
$ perceval rss 'https://blog.bitergia.com/feed/'
Slack
The Slack backend requires an API token for authentication. Slack apps can be used
to generate and configure this API token. The scopes required by a Slack app for
the backend are channels:history, channels:read and users:read. To learn
more about Slack apps and their integration, please refer to the Slack apps
documentation. For more information about
the scopes required by a Slack app, please refer to the Scopes and permissions
documentation.
The following script can also be used to generate an OAuth2 token to access the Slack API.
$ perceval slack C0001 --from-date 2016-01-12 -t abcedefghijk
StackExchange
$ perceval stackexchange --site stackoverflow --tagged python --from-date '2016-01-01' -t abcdabcdabcdabcd
Supybot
$ perceval supybot 'http://channel.example.com' /tmp/supybot/
Telegram
The Telegram backend needs an API token to authenticate the bot. In addition, to fetch messages from a group or channel, the bot's privacy settings must be disabled. To learn how to create a bot, obtain its token and configure it, please read the Telegram Bots documentation.
Note that messages remain available on the Telegram server until the bot fetches them, but they are not kept longer than 24 hours.
$ perceval telegram mybot -t 12345678abcdefgh --chats 1 2 -10
Twitter
The Twitter backend needs a bearer token to authenticate the requests. It can be obtained using the code available on GitHub Gist: https://gist.github.com/valeriocos/7d4d28f72f53fbce49f1512ba77ef5f6
$ perceval twitter grimoirelab -t 12345678abcdefgh
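The output of the commands in this section can be post-processed with a short script. The sketch below assumes that the retrieved items are written to standard output as a sequence of JSON objects, one after another, and that each item carries a `category` field; both points are assumptions to verify against the Perceval documentation. The function names are hypothetical.

```python
import json


def iter_json_documents(text):
    """Yield each JSON document found in a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed object and the index where it ended.
        document, pos = decoder.raw_decode(text, pos)
        yield document


def count_by_category(text):
    """Count documents per value of their (assumed) "category" field."""
    counts = {}
    for item in iter_json_documents(text):
        category = item.get("category", "unknown")
        counts[category] = counts.get(category, 0) + 1
    return counts
```

Because `raw_decode` is applied repeatedly, the same code handles both pretty-printed objects spanning several lines and one object per line.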
Community Backends
Some backends are implemented in separate repositories and are not merged into chaoss/grimoirelab-perceval for long-term maintenance reasons. Please feel free to check these backends and contact their maintainers about any issues or questions related to them.
- Bundle for Puppet, Inc. ecosystem: chaoss/grimoirelab-perceval-puppet
- Bundle for OPNFV ecosystem: chaoss/grimoirelab-perceval-opnfv
- Bundle for Mozilla ecosystem: chaoss/grimoirelab-perceval-mozilla
- Bundle for FINOS ecosystem: Bitergia/grimoirelab-perceval-finos
- Weblate backend: chaoss/grimoirelab-perceval-weblate
- Zulip backend: vchrombie/grimoirelab-perceval-zulip
- OSF backend: gitlab.com/open-rit/perceval-osf
- Gitee backend: grimoirelab-gitee/grimoirelab-perceval-gitee
- Airtable backend: perceval-backends/grimoirelab-perceval-airtable
- Bitbucket backend: perceval-backends/grimoirelab-perceval-bitbucket
Running tests
Perceval comes with a comprehensive list of unit tests. To run them, in addition
to the dependencies installed with Perceval, you need httpretty.
License
Licensed under GNU General Public License (GPL), version 3 or later.
Owner
- Name: CHAOSS
- Login: chaoss
- Kind: organization
- Website: https://chaoss.community/
- Twitter: chaossproj
- Repositories: 64
- Profile: https://github.com/chaoss
GitHub Events
Total
- Create event: 19
- Release event: 16
- Issues event: 9
- Watch event: 12
- Delete event: 5
- Issue comment event: 20
- Push event: 27
- Pull request review comment event: 9
- Pull request review event: 20
- Pull request event: 42
- Fork event: 4
Last Year
- Create event: 19
- Release event: 16
- Issues event: 9
- Watch event: 12
- Delete event: 5
- Issue comment event: 20
- Push event: 27
- Pull request review comment event: 9
- Pull request review event: 20
- Pull request event: 42
- Fork event: 4
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Santiago Dueñas | s****s@b****m | 751 |
| Valerio Cosentino | v****s@b****m | 497 |
| Jose Javier Merchante | j****e@b****m | 77 |
| Alvaro del Castillo | a****s@b****m | 59 |
| Alberto Martín | a****n@b****m | 51 |
| Quan Zhou | q****n@b****m | 37 |
| Venu Vardhan Reddy Tekula | v****u@c****y | 24 |
| Jesus M. Gonzalez-Barahona | j****b@g****s | 20 |
| dependabot[bot] | 4****] | 17 |
| Animesh Kumar | a****1@g****m | 14 |
| Harshal Mittal | h****4@g****m | 4 |
| Miguel Ángel Fernández | m****n@b****m | 3 |
| Emi Simpson | e****i@a****v | 3 |
| Chris Burgess | c****s@c****z | 2 |
| David Pose Fernández | d****e@b****m | 2 |
| Fil Maj | m****l@g****m | 2 |
| Nitish Gupta | i****g@g****m | 2 |
| Ria Gupta | r****5@i****n | 2 |
| david | d****d@s****o | 2 |
| camillem | c****m | 2 |
| sevagenv | s****v@g****m | 2 |
| zhifeiyue | z****e@g****m | 2 |
| Gregorio | g****x@g****s | 2 |
| Lukasz Gryglicki | l****i@o****l | 2 |
| stevekola | k****9@g****m | 1 |
| nworb95 | n****5@g****m | 1 |
| Israel Herraiz | i****z@b****m | 1 |
| J. Manrique Lopez de la Fuente | j****e@b****m | 1 |
| Stephan Barth | s****h@g****m | 1 |
| mjgaughan | m****n@p****e | 1 |
| and 14 more... | | |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 76
- Total pull requests: 128
- Average time to close issues: about 3 years
- Average time to close pull requests: 3 months
- Total issue authors: 56
- Total pull request authors: 28
- Average comments per issue: 3.8
- Average comments per pull request: 2.45
- Merged pull requests: 86
- Bot issues: 0
- Bot pull requests: 28
Past Year
- Issues: 6
- Pull requests: 37
- Average time to close issues: about 1 hour
- Average time to close pull requests: 9 days
- Issue authors: 6
- Pull request authors: 5
- Average comments per issue: 0.33
- Average comments per pull request: 0.59
- Merged pull requests: 25
- Bot issues: 0
- Bot pull requests: 9
Top Authors
Issue Authors
- jgbarah (8)
- vchrombie (4)
- xiao623 (3)
- lukaszgryglicki (3)
- zhquan (3)
- canasdiaz (2)
- MalloZup (2)
- albertinisg (2)
- animeshk08 (2)
- akshatmalik (1)
- abhiandthetruth (1)
- RCheesley (1)
- marcofranssen (1)
- pwnfoo (1)
- isikozsoy (1)
Pull Request Authors
- jjmerchante (42)
- dependabot[bot] (34)
- sduenas (13)
- zhquan (12)
- vchrombie (11)
- grittyashish (4)
- eyehwan (2)
- animeshk08 (2)
- xurizaemon (2)
- jorgegarciarey (2)
- Alch-Emi (2)
- zhifeiyue (2)
- mjgaughan (2)
- VSevagen (2)
- mafesan (1)
Packages
- Total packages: 1
- Total downloads: 6,185 last month (PyPI)
- Total docker downloads: 273
- Total dependent packages: 14
- Total dependent repositories: 54
- Total versions: 163
- Total maintainers: 2
pypi.org: perceval
Send Sir Perceval on a quest to fetch and gather data from software repositories.
- Homepage: https://chaoss.github.io/grimoirelab/
- Documentation: https://perceval.readthedocs.io/
- License: GPL-3.0+
- Latest release: 1.3.3 (published 6 months ago)
Dependencies
- bitergia/release-tools-check-changelog master composite
- actions/checkout 93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc8 composite
- actions/download-artifact fb598a63ae348fa914e94cd0ff38f362e927b741 composite
- actions/setup-python 13ae5bb136fac2878aff31522b9efb785519f984 composite
- chaoss/grimoirelab-github-actions/build master composite
- chaoss/grimoirelab-github-actions/publish master composite
- chaoss/grimoirelab-github-actions/release master composite
- actions/checkout 93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc8 composite
- actions/setup-python 13ae5bb136fac2878aff31522b9efb785519f984 composite
- python 3.4-slim build
- poetry ==1.1.9
- coverage 5.5 develop
- flake8 4.0.1 develop
- httpretty 1.1.4 develop
- importlib-metadata 4.2.0 develop
- mccabe 0.6.1 develop
- pycodestyle 2.8.0 develop
- pyflakes 2.4.0 develop
- zipp 3.12.0 develop
- alabaster 0.7.13
- attrs 21.4.0
- babel 2.11.0
- beautifulsoup4 4.11.2
- certifi 2022.12.7
- cffi 1.15.1
- charset-normalizer 3.0.1
- colorama 0.4.6
- cryptography 3.4.8
- docutils 0.17.1
- dulwich 0.20.50
- feedparser 6.0.10
- furo 2021.11.23
- grimoirelab-toolkit 0.3.3
- idna 3.4
- imagesize 1.4.1
- jinja2 3.1.2
- markdown-it-py 1.1.0
- markupsafe 2.1.2
- mdit-py-plugins 0.2.8
- myst-parser 0.15.2
- packaging 23.0
- pycparser 2.21
- pygments 2.14.0
- pyjwt 2.6.0
- python-dateutil 2.8.2
- pytz 2022.7.1
- pyyaml 6.0
- requests 2.28.2
- setuptools 67.1.0
- sgmllib3k 1.0.0
- six 1.16.0
- snowballstemmer 2.2.0
- soupsieve 2.3.2.post1
- sphinx 4.3.2
- sphinxcontrib-applehelp 1.0.2
- sphinxcontrib-devhelp 1.0.2
- sphinxcontrib-htmlhelp 2.0.0
- sphinxcontrib-jsmath 1.0.1
- sphinxcontrib-qthelp 1.0.3
- sphinxcontrib-serializinghtml 1.1.5
- typing-extensions 4.4.0
- urllib3 1.26.14
- coverage ^5.5 develop
- flake8 ^4.0.1 develop
- httpretty ^1.1.4 develop
- PyJWT ^2.4.0
- beautifulsoup4 ^4.3.2
- cryptography ^3.3.2
- dulwich ^0.20.0
- feedparser ^6.0.8
- furo ^2021.8.31
- grimoirelab-toolkit >=0.3
- myst-parser ^0.15.2
- python ^3.7
- python-dateutil ^2.6.0
- requests ^2.7.0
- urllib3 ^1.26