code-release

A repository analyzing the impact of open-source code in machine learning, robotics, and controls research

https://github.com/utiasdsl/code-release

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, ieee.org
  • Committers with academic emails
    2 of 2 committers (100.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.7%) to scientific vocabulary

Keywords

control-systems open-source reproducible-research robotics

Repository

Basic Info
Statistics
  • Stars: 10
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 2 years ago · Last pushed over 1 year ago

README.md

code-release

This repository contains all the data and plotting scripts required to reproduce the plots in our paper "What is the Impact of Releasing Code with Publications? Statistics from the Machine Learning, Robotics, and Control Communities." The preprint is available here.

Installation

Install/upgrade Python3 dependencies:

    pip3 install --upgrade pip
    pip3 install pyyaml
    pip3 install tikzplotlib
    pip3 install matplotlib --upgrade

This was tested on macOS 13.3 with the following:

    anaconda 2022.10
    matplotlib 3.7.1
    pip 23.0.1
    python 3.9.13
    pyyaml 6.0
    tikzplotlib 0.10.1
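To compare a local environment against the versions listed above, a short check along these lines may help. It is only a sketch: the helper name `compare_versions` and the `TESTED_VERSIONS` table are hypothetical and not part of this repository, and the check covers only the pip-installable packages from the list.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported as tested in this README (assumption: exact matches
# are not required, this only reports differences).
TESTED_VERSIONS = {
    'matplotlib': '3.7.1',
    'pyyaml': '6.0',
    'tikzplotlib': '0.10.1',
}


def compare_versions(tested=TESTED_VERSIONS):
    '''Return {package: (tested_version, installed_version_or_None)}.'''
    report = {}
    for package, tested_version in tested.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (tested_version, installed)
    return report


if __name__ == '__main__':
    for package, (tested_version, installed) in compare_versions().items():
        status = 'ok' if installed == tested_version else 'differs'
        print(f'{package}: tested {tested_version}, installed {installed} ({status})')
```

A differing version does not necessarily mean the plots will fail to generate; it only flags where the local setup departs from the tested one.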

Use

Clone this repository and run its main.py script:

    git clone https://github.com/utiasDSL/code-release.git
    cd code-release/
    python3 main.py

Output

The script will sequentially generate the following figures:

[Figures: fig1, fig2, fig3, fig4, fig5, fig6, fig7]

Contribution

Our determination of whether open-source code is available for a publication is not perfect. If we have incorrectly marked your publication as having or lacking code, please open a pull request with the correction. We appreciate your contributions!

Citation

Please cite our work (paper) or (preprint) as:

    @ARTICLE{oscrelease2024,
      author={Zhou, Siqi and Brunke, Lukas and Tao, Allen and Hall, Adam W. and Bejarano, Federico Pizarro and Panerati, Jacopo and Schoellig, Angela P.},
      journal={IEEE Control Systems Magazine},
      title={What Is the Impact of Releasing Code With Publications? Statistics from the Machine Learning, Robotics, and Control Communities},
      year={2024},
      volume={44},
      number={4},
      pages={38-46},
      doi={10.1109/MCS.2024.3402888}
    }


Learning Systems and Robotics Lab at the Technical University of Munich (TUM) and the University of Toronto

Owner

  • Name: Dynamic Systems Lab
  • Login: utiasDSL
  • Kind: organization

Citation (citation_count_box_plot.py)

'''Plotting functions for box plots

'''
import matplotlib.pyplot as plt
import matplotlib.patheffects as path_effects
import seaborn as sns
import pandas as pd
import tikzplotlib


# Fixes AttributeError when using a legend in matplotlib for tikzplotlib
from matplotlib.lines import Line2D
from matplotlib.legend import Legend
Line2D._us_dashSeq    = property(lambda self: self._dash_pattern[1])
Line2D._us_dashOffset = property(lambda self: self._dash_pattern[0])
Legend._ncol = property(lambda self: self._ncols)


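def calc_percentage(minuend, subtrahend, denom):
    '''Return (minuend - subtrahend) / denom in percent, or inf if denom is (near) zero.'''
    if abs(denom) <= 1e-8:
        return float('inf')
    return (minuend - subtrahend) / denom * 100.0


def percentual_change(percentages):
    '''Return the percentage change between each pair of consecutive entries.'''
    percentage_diff = []
    for i in range(len(percentages) - 1):
        # Change from entry i to entry i + 1, relative to entry i
        percentage_diff.append(calc_percentage(percentages[i + 1], percentages[i], percentages[i]))

    return percentage_diff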


def print_percentages(stats_code, stats_no_code, name=''):
    '''Print the statistics with and without code, their consecutive changes, and how the changes differ.'''
    percent_no_code = percentual_change(stats_no_code)
    percent_code = percentual_change(stats_code)
    print('{} NO code: {}'.format(name, stats_no_code))
    print('Changes: ', ['{:.2f} %'.format(percent) for percent in percent_no_code])
    print('{} W/ code: {}'.format(name, stats_code))
    print('Changes: ', ['{:.2f} %'.format(percent) for percent in percent_code])
    # Difference between the changes with and without code, relative to the change without code
    diff_percentage_quartile = [
        calc_percentage(change_code, change_no_code, change_no_code)
        for change_code, change_no_code in zip(percent_code, percent_no_code)
    ]
    print('diff Changes: ', ['{:.2f} %'.format(percent) for percent in diff_percentage_quartile])


def add_percentile_labels(ax, percentile_name='Median', fmt='.1f'):
    '''Label each box with its median or third quartile.

    Returns two lists (code, no_code) holding the labeled values for
    publications with and without code, in plotting order.
    '''
    # adapted from https://stackoverflow.com/a/63295846 by Christian Karcher
    no_code = []
    code = []

    # Get the lines and boxes in the box plot
    lines = ax.get_lines()
    boxes = [c for c in ax.get_children() if type(c).__name__ == 'PathPatch']
    lines_per_box = len(lines) // len(boxes)

    # Determine the elements to loop over (lines for medians, boxes for quartiles)
    if percentile_name == 'Median':
        plot_elements = lines[4:len(lines):lines_per_box]
    elif percentile_name == 'Third Quartile':
        plot_elements = boxes
    else:
        raise NotImplementedError
    
    for plot_element in plot_elements:
        # Determine x and y data for median or quartile
        if percentile_name == 'Median':
            x, y = (data.mean() for data in plot_element.get_data())
            foreground = plot_element.get_color()
        elif percentile_name == 'Third Quartile':
            top_left = plot_element.get_path().vertices[2, :]
            top_right = plot_element.get_path().vertices[3, :]
            y = top_left[1]
            x = (top_right[0] - top_left[0]) / 2.0 + top_left[0]
            foreground = plot_element.get_edgecolor()
        else:
            raise NotImplementedError

        # Alternate between without OSC and with OSC data
        if len(code) >= len(no_code):
            no_code.append(y)
        else:
            code.append(y)
        
        # Add text for percentile
        text = ax.text(x, y, f'{y:{fmt}}', ha='center', va='center', 
                       fontweight='bold', color='white')
        
        # Create colored border around white text for contrast
        text.set_path_effects([
            path_effects.Stroke(linewidth=3, foreground=foreground),
            path_effects.Normal(),
        ])
    
    return code, no_code
        

def citation_count_w_wo_code(conf, cfg):
    add_statistics = cfg['ADD_STATISTICS']

    # Print conference name
    print('Conference {}'.format(conf))

    # Set paths for data and plotting
    csv_file_name = conf + '/ALL_DATA.csv'
    figure_output_file = 'plots/Code Availability vs Citations in {} Box Plot.'.format(conf)

    # Read CSV file
    data_plt = pd.read_csv(csv_file_name)

    # The years covered by the data; a set has no guaranteed order, so sort explicitly
    years = sorted(set(data_plt.Year))  # unique years in ascending order
    reversed_years = years[::-1]  # most recent year first

    # Create box plots
    ax = sns.boxplot(x='Year', y='Citations', hue='With Code', data=data_plt, showfliers=False, order=reversed_years) 

    # Set labels
    ax.set_xticklabels(['{} ({})'.format(reversed_years[0] + 1 - year, year) for year in reversed_years])
    ax.set_xlabel('Years since Publication (from {})'.format(reversed_years[0] + 1))
    ax.set_ylabel('Semantic Scholar Citations')
    plt.title(conf)

    if add_statistics:
        percentile_names = ['Median', 'Third Quartile']
        for percentile_name in percentile_names:
            print(percentile_name)
            # label the median and third quartiles on the plot
            code, no_code = add_percentile_labels(ax, percentile_name=percentile_name)
            # print changes in percentile for publications with and without OSC over the years 
            print_percentages(code, no_code, name=percentile_name)

    # Create tikz figure and save PNG
    tikzplotlib.clean_figure()
    tikzplotlib.save(figure_output_file + 'tex')
    plt.savefig(figure_output_file + 'png')
    plt.show()
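
The statistics printed by the script can be checked by hand on a small example. The sketch below re-implements the percentage-change computation in a self-contained form and mirrors the "years since publication" tick labels; the input numbers are made up for illustration (they are not values from the paper), and `year_tick_labels` is a hypothetical helper that does not exist in this repository.

```python
def calc_percentage(minuend, subtrahend, denom):
    '''Relative difference in percent; inf when the denominator is (near) zero.'''
    if abs(denom) <= 1e-8:
        return float('inf')
    return (minuend - subtrahend) / denom * 100.0


def percentual_change(values):
    '''Percentage change between each pair of consecutive values.'''
    return [calc_percentage(nxt, cur, cur) for cur, nxt in zip(values, values[1:])]


def year_tick_labels(years):
    '''Hypothetical helper: "<years since publication> (<year>)", newest year first.'''
    newest_first = sorted(set(years), reverse=True)
    reference_year = newest_first[0] + 1  # one year after the most recent data
    return ['{} ({})'.format(reference_year - year, year) for year in newest_first]


# Made-up medians, one per year group (illustrative only)
example_medians = [10.0, 15.0, 30.0]
changes = percentual_change(example_medians)
print(changes)  # [50.0, 100.0]

# A zero baseline yields inf instead of raising ZeroDivisionError
print(percentual_change([0.0, 5.0]))  # [inf]

print(year_tick_labels([2020, 2022, 2021, 2022]))  # ['1 (2022)', '2 (2021)', '3 (2020)']
```

Returning `inf` for a near-zero baseline keeps the printout going when a group has a median of zero citations, at the cost of an entry that cannot be compared numerically.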

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 5
  • Total Committers: 2
  • Avg Commits per committer: 2.5
  • Development Distribution Score (DDS): 0.4
Past Year
  • Commits: 5
  • Committers: 2
  • Avg Commits per committer: 2.5
  • Development Distribution Score (DDS): 0.4
Top Committers
Name Email Commits
SiQi Zhou s****u@r****a 3
Jacopo Panerati j****i@u****a 2
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0