code-release

A repository analyzing the impact of open-source code in machine learning, robotics, and controls research

https://github.com/utiasdsl/code-release

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in README
✓ Academic publication links: Links to arxiv.org, ieee.org
✓ Committers with academic emails: 2 of 2 committers (100.0%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.7%) to scientific vocabulary

Keywords

control-systems open-source reproducible-research robotics

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: utiasDSL
Language: Python
Default Branch: main
Homepage: https://arxiv.org/abs/2308.10008
Size: 5.43 MB

Statistics

Stars: 10
Watchers: 3
Forks: 0
Open Issues: 0
Releases: 0

Topics

control-systems open-source reproducible-research robotics

Created almost 3 years ago · Last pushed almost 2 years ago

Metadata Files

Readme Citation

code-release

This repository contains all the data and plotting scripts required to reproduce the plots in our paper "What is the Impact of Releasing Code with Publications? Statistics from the Machine Learning, Robotics, and Control Communities." The preprint is available at https://arxiv.org/abs/2308.10008.

Installation

Install/upgrade Python3 dependencies:

    pip3 install --upgrade pip
    pip3 install pyyaml
    pip3 install tikzplotlib
    pip3 install matplotlib --upgrade

This was tested on macOS 13.3 with the following:

    anaconda 2022.10
    matplotlib 3.7.1
    pip 23.0.1
    python 3.9.13
    pyyaml 6.0
    tikzplotlib 0.10.1
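To see how a local environment compares with the tested versions listed above, a small check along the following lines may help. This is only a sketch and not part of the repository: the `TESTED_VERSIONS` dictionary and the `compare_versions` helper are hypothetical names introduced here, and only the pip-installable packages from the list are covered (Anaconda and Python itself are left out).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported as tested in the README (pip-installable packages only).
# The dictionary name is an assumption made for this sketch.
TESTED_VERSIONS = {
    "matplotlib": "3.7.1",
    "pip": "23.0.1",
    "pyyaml": "6.0",
    "tikzplotlib": "0.10.1",
}


def compare_versions(tested=TESTED_VERSIONS):
    """Return {package: (tested_version, installed_version or None)}."""
    report = {}
    for name, tested_version in tested.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (tested_version, installed)
    return report


if __name__ == "__main__":
    for name, (tested_version, installed) in compare_versions().items():
        status = "ok" if installed == tested_version else "differs"
        print(f"{name}: tested {tested_version}, installed {installed} ({status})")
```

A differing version does not necessarily mean the scripts will fail; it only flags where the environment departs from the one that was tested.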

Use

Clone this repository and run its main.py script:

    git clone https://github.com/utiasDSL/code-release.git
    cd code-release/
    python3 main.py

Output

The script will sequentially generate the following figures:

fig1, fig2, fig3, fig4, fig5, fig6, fig7

Contribution

Our determination of available open-source code for publications is not perfect. If we incorrectly associated your publication with or without code, please open a pull request with the correction. We appreciate your contributions!

Citation

Please cite our work (paper or preprint) as:

    @ARTICLE{oscrelease2024,
      author={Zhou, Siqi and Brunke, Lukas and Tao, Allen and Hall, Adam W. and Bejarano, Federico Pizarro and Panerati, Jacopo and Schoellig, Angela P.},
      journal={IEEE Control Systems Magazine},
      title={What Is the Impact of Releasing Code With Publications? Statistics from the Machine Learning, Robotics, and Control Communities},
      year={2024},
      volume={44},
      number={4},
      pages={38-46},
      doi={10.1109/MCS.2024.3402888}
    }

Learning Systems and Robotics Lab at the Technical University of Munich (TUM) and the University of Toronto

Owner

Name: Dynamic Systems Lab
Login: utiasDSL
Kind: organization

Website: http://www.dynsyslab.org/
Repositories: 4
Profile: https://github.com/utiasDSL

Citation (citation_count_box_plot.py)

'''Plotting functions for box plots

'''
import matplotlib.pyplot as plt
import matplotlib.patheffects as path_effects
import seaborn as sns
import pandas as pd
import tikzplotlib


# Fixes AttributeError when using a legend in matplotlib for tikzplotlib
from matplotlib.lines import Line2D
from matplotlib.legend import Legend
Line2D._us_dashSeq    = property(lambda self: self._dash_pattern[1])
Line2D._us_dashOffset = property(lambda self: self._dash_pattern[0])
Legend._ncol = property(lambda self: self._ncols)


def calc_percentage(minuend, subtrahend, denom):
    '''Return (minuend - subtrahend) / denom in percent, or inf if denom is near zero.'''
    if abs(denom) <= 1e-8:
        return float('inf')
    return (minuend - subtrahend) / denom * 100.0


def percentual_change(percentages):
    '''Return the percent change between each pair of consecutive entries.'''
    percentage_diff = []
    for i in range(len(percentages) - 1):
        percentage_diff.append(calc_percentage(percentages[i + 1], percentages[i], percentages[i]))

    return percentage_diff


def print_percentages(stats_code, stats_no_code, name=''):
    '''Print the statistics with and without code and their consecutive percent changes.'''
    percent_no_code = percentual_change(stats_no_code)
    percent_code = percentual_change(stats_code)
    print('{} NO code: {}'.format(name, stats_no_code))
    print('Changes: ', ['{:.2f} %'.format(percent) for percent in percent_no_code])
    print('{} W/ code: {}'.format(name, stats_code))
    print('Changes: ', ['{:.2f} %'.format(percent) for percent in percent_code])
    # Relative difference between the with-code and without-code changes
    diff_percentage_quartile = [calc_percentage(percent_code[i], percent_no_code[i], percent_no_code[i]) for i in range(len(percent_no_code))]
    print('diff Changes: ', ['{:.2f} %'.format(percent) for percent in diff_percentage_quartile])


def add_percentile_labels(ax, percentile_name='Median', fmt='.1f'):
    '''Label each box in `ax` with its median or third quartile.

    Returns two lists of the labeled values: (code, no_code).
    '''
    # adapted from https://stackoverflow.com/a/63295846 by Christian Karcher
    no_code = []
    code = []

    # Get the lines and boxes in the box plot
    lines = ax.get_lines()
    boxes = [c for c in ax.get_children() if type(c).__name__ == 'PathPatch']
    lines_per_box = len(lines) // len(boxes)

    # Determine the elements to loop over (lines for medians, boxes for quartiles)
    if percentile_name == 'Median':
        # The median is taken as the line at offset 4 within each box's group of lines
        plot_elements = lines[4:len(lines):lines_per_box]
    elif percentile_name == 'Third Quartile':
        plot_elements = boxes
    else:
        raise NotImplementedError
    
    for plot_element in plot_elements:
        # Determine x and y data for median or quartile
        if percentile_name == 'Median':
            x, y = (data.mean() for data in plot_element.get_data())
            foreground = plot_element.get_color()
        elif percentile_name == 'Third Quartile':
            top_left = plot_element.get_path().vertices[2, :]
            top_right = plot_element.get_path().vertices[3, :]
            y = top_left[1]
            x = (top_right[0] - top_left[0]) / 2.0 + top_left[0]
            foreground = plot_element.get_edgecolor()
        else:
            raise NotImplementedError

        # Alternate between without OSC and with OSC data
        if len(code) >= len(no_code):
            no_code.append(y)
        else:
            code.append(y)
        
        # Add text for percentile
        text = ax.text(x, y, f'{y:{fmt}}', ha='center', va='center', 
                       fontweight='bold', color='white')
        
        # Create colored border around white text for contrast
        text.set_path_effects([
            path_effects.Stroke(linewidth=3, foreground=foreground),
            path_effects.Normal(),
        ])
    
    return code, no_code
        

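def citation_count_w_wo_code(conf, cfg):
    '''Box-plot citations per year for publications with and without code.

    `conf` is the conference name (also the directory holding ALL_DATA.csv);
    `cfg` is a dict whose 'ADD_STATISTICS' entry toggles the percentile labels.
    '''
    add_statistics = cfg['ADD_STATISTICS']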

    # Print conference name
    print('Conference {}'.format(conf))

    # Set paths for data and plotting
    csv_file_name = conf + '/ALL_DATA.csv'
    figure_output_file = 'plots/Code Availability vs Citations in {} Box Plot.'.format(conf)

    # Read CSV file
    data_plt = pd.read_csv(csv_file_name)

    # The number of years for plotting the data
    years = sorted(set(data_plt.Year))  # unique years in ascending order (a bare set has no guaranteed order)
    reversed_years = years[::-1]  # most recent year first

    # Create box plots
    ax = sns.boxplot(x='Year', y='Citations', hue='With Code', data=data_plt, showfliers=False, order=reversed_years) 

    # Set labels
    ax.set_xticklabels(['{} ({})'.format(reversed_years[0] + 1 - year, year) for year in reversed_years])
    ax.set_xlabel('Years since Publication (from {})'.format(reversed_years[0] + 1))
    ax.set_ylabel('Semantic Scholar Citations')
    plt.title(conf)

    if add_statistics:
        percentile_names = ['Median', 'Third Quartile']
        for percentile_name in percentile_names:
            print(percentile_name)
            # label the median and third quartiles on the plot
            code, no_code = add_percentile_labels(ax, percentile_name=percentile_name)
            # print changes in percentile for publications with and without OSC over the years 
            print_percentages(code, no_code, name=percentile_name)

    # Create tikz figure and save PNG
    tikzplotlib.clean_figure()
    tikzplotlib.save(figure_output_file + 'tex')
    plt.savefig(figure_output_file + 'png')
    plt.show()
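
The percentage helpers in the listing above can be tried without any plotting dependencies. The sketch below copies their logic in a condensed form and applies it to made-up numbers; the `medians` values are hypothetical and are not taken from the paper's data.

```python
def calc_percentage(minuend, subtrahend, denom):
    # Same logic as in the listing above: a near-zero denominator yields inf.
    if abs(denom) <= 1e-8:
        return float('inf')
    return (minuend - subtrahend) / denom * 100.0


def percentual_change(values):
    # Relative change from each entry to the next, in percent.
    return [calc_percentage(values[i + 1], values[i], values[i])
            for i in range(len(values) - 1)]


# Hypothetical medians for three consecutive x-axis positions (not real data).
medians = [10.0, 15.0, 30.0]
changes = percentual_change(medians)
print(['{:.2f} %'.format(c) for c in changes])  # ['50.00 %', '100.00 %']
```

Each change is measured relative to the preceding entry, so a list of n values yields n - 1 changes.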

GitHub Events

Total

Watch event: 1

Last Year

Watch event: 1

Committers

Last synced: over 2 years ago

All Time

Total Commits: 5
Total Committers: 2
Avg Commits per committer: 2.5
Development Distribution Score (DDS): 0.4

Past Year

Commits: 5
Committers: 2
Avg Commits per committer: 2.5
Development Distribution Score (DDS): 0.4

Top Committers

Name	Email	Commits
SiQi Zhou	s**u@r**a	3
Jacopo Panerati	j**i@u**a	2
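
The committer statistics above can be reproduced from the two commit counts in the table. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; this page does not state the formula, so treat that definition and the helper name `development_distribution_score` as assumptions.

```python
def development_distribution_score(commits_per_committer):
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    total = sum(commits_per_committer)
    if total == 0:
        return 0.0  # no commits, nothing to distribute
    return 1.0 - max(commits_per_committer) / total


# Commit counts from the Top Committers table (3 and 2).
commits = [3, 2]
average = sum(commits) / len(commits)
dds = development_distribution_score(commits)

print(average)        # 2.5
print(round(dds, 2))  # 0.4
```

Under this assumed definition, a score of 0 would mean a single committer authored every commit, and values closer to 1 indicate work spread across more people.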

Committer Domains (Top 20 + Academic)

utoronto.ca: 1 robotics.utias.utoronto.ca: 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
