code-release
A repository analyzing the impact of open-source code in machine learning, robotics, and controls research
Science Score: 77.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ✓ Academic publication links: Links to arxiv.org, ieee.org
- ✓ Committers with academic emails: 2 of 2 committers (100.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: utiasDSL
- Language: Python
- Default Branch: main
- Homepage: https://arxiv.org/abs/2308.10008
- Size: 5.43 MB
Statistics
- Stars: 10
- Watchers: 3
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
code-release
This repository contains all the data and plotting scripts required to reproduce the plots in our paper "What is the Impact of Releasing Code with Publications? Statistics from the Machine Learning, Robotics, and Control Communities." The preprint is available on arXiv: https://arxiv.org/abs/2308.10008
Installation
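Install or upgrade the Python 3 dependencies. The plotting scripts also import pandas and seaborn, which are not in the list below and must be available in your environment:
```sh
pip3 install --upgrade pip
pip3 install pyyaml
pip3 install tikzplotlib
pip3 install matplotlib --upgrade
```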
This was tested on macOS 13.3 with the following:
```sh
anaconda 2022.10
matplotlib 3.7.1
pip 23.0.1
python 3.9.13
pyyaml 6.0
tikzplotlib 0.10.1
```
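To compare a local environment against the tested versions listed above, a small check along the following lines may help. This is a minimal sketch, not part of the repository: the `TESTED` dictionary and the helper names `installed_version` and `version_report` are assumptions introduced here, and only the standard-library `importlib.metadata` module (Python 3.8+) is used.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported as tested in this README (pip-installable packages only).
TESTED = {
    "matplotlib": "3.7.1",
    "pip": "23.0.1",
    "pyyaml": "6.0",
    "tikzplotlib": "0.10.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_report(tested=TESTED):
    """Return (package, tested_version, installed_version) tuples sorted by name."""
    return [(name, wanted, installed_version(name))
            for name, wanted in sorted(tested.items())]


if __name__ == "__main__":
    for name, wanted, found in version_report():
        status = "ok" if found == wanted else "differs"
        print(f"{name:12s} tested={wanted:8s} installed={found} [{status}]")
```

A differing version does not necessarily mean the scripts will fail; the list above only records one configuration that is known to work.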
Use
Clone this repository and run its main.py script:
```sh
git clone https://github.com/utiasDSL/code-release.git
cd code-release/
python3 main.py
```
Output
The script will sequentially generate the following figures:




Contribution
Our determination of available open-source code for publications is not perfect. If we incorrectly associated your publication with or without code, please open a pull request with the correction. We appreciate your contributions!
Citation
Please cite our work (paper) or (preprint) as:
```bibtex
@ARTICLE{oscrelease2024,
  author={Zhou, Siqi and Brunke, Lukas and Tao, Allen and Hall, Adam W. and Bejarano, Federico Pizarro and Panerati, Jacopo and Schoellig, Angela P.},
  journal={IEEE Control Systems Magazine},
  title={What Is the Impact of Releasing Code With Publications? Statistics from the Machine Learning, Robotics, and Control Communities},
  year={2024},
  volume={44},
  number={4},
  pages={38-46},
  doi={10.1109/MCS.2024.3402888}
}
```
Learning Systems and Robotics Lab at the Technical University of Munich (TUM) and the University of Toronto
Owner
- Name: Dynamic Systems Lab
- Login: utiasDSL
- Kind: organization
- Website: http://www.dynsyslab.org/
- Repositories: 4
- Profile: https://github.com/utiasDSL
Citation (citation_count_box_plot.py)
'''Plotting functions for box plots
'''
import matplotlib.pyplot as plt
import matplotlib.patheffects as path_effects
import seaborn as sns
import pandas as pd
import tikzplotlib
# Fixes AttributeError when using a legend in matplotlib for tikzplotlib
from matplotlib.lines import Line2D
from matplotlib.legend import Legend
Line2D._us_dashSeq = property(lambda self: self._dash_pattern[1])
Line2D._us_dashOffset = property(lambda self: self._dash_pattern[0])
Legend._ncol = property(lambda self: self._ncols)


def calc_percentage(minuend, subtrahend, denom):
    '''Return (minuend - subtrahend) / denom in percent, or inf if denom is close to zero.'''
    if abs(denom) <= 1e-8:
        return float('inf')
    return (minuend - subtrahend) / denom * 100.0


def percentual_change(percentages):
    '''Return the relative change in percent between consecutive entries.'''
    percentage_diff = []
    for i in range(len(percentages)-1):
        percentage_diff.append(calc_percentage(percentages[i + 1], percentages[i], percentages[i]))
    return percentage_diff


def print_percentages(stats_code, stats_no_code, name=''):
    '''Print the statistics and their changes for publications without and with code.'''
    percent_no_code = percentual_change(stats_no_code)
    percent_code = percentual_change(stats_code)
    print('{} NO code: {}'.format(name, stats_no_code))
    print('Changes: ', ['{:.2f} %'.format(percent) for percent in percent_no_code])
    print('{} W/ code: {}'.format(name, stats_code))
    print('Changes: ', ['{:.2f} %'.format(percent) for percent in percent_code])
    # Difference between the two changes, relative to the change without code
    diff_percentage_quartile = [calc_percentage(percent_code[i], percent_no_code[i], percent_no_code[i]) for i in range(len(percent_no_code))]
    print('diff Changes: ', ['{:.2f} %'.format(percent) for percent in diff_percentage_quartile])


def add_percentile_labels(ax, percentile_name='Median', fmt='.1f'):
    # adapted from https://stackoverflow.com/a/63295846 by Christian Karcher
    no_code = []
    code = []

    # Get the lines and boxes in the box plot
    lines = ax.get_lines()
    boxes = [c for c in ax.get_children() if type(c).__name__ == 'PathPatch']
    lines_per_box = int(len(lines) / len(boxes))

    # Determine the elements to loop over (lines for medians, boxes for quartiles)
    if percentile_name == 'Median':
        plot_elements = lines[4:len(lines):lines_per_box]
    elif percentile_name == 'Third Quartile':
        plot_elements = boxes
    else:
        raise NotImplementedError

    for plot_element in plot_elements:
        # Determine x and y data for median or quartile
        if percentile_name == 'Median':
            x, y = (data.mean() for data in plot_element.get_data())
            foreground = plot_element.get_color()
        elif percentile_name == 'Third Quartile':
            top_left = plot_element.get_path().vertices[2, :]
            top_right = plot_element.get_path().vertices[3, :]
            y = top_left[1]
            x = (top_right[0] - top_left[0]) / 2.0 + top_left[0]
            foreground = plot_element.get_edgecolor()
        else:
            raise NotImplementedError

        # Alternate between without OSC and with OSC data
        if len(code) >= len(no_code):
            no_code.append(y)
        else:
            code.append(y)

        # Add text for percentile
        text = ax.text(x, y, f'{y:{fmt}}', ha='center', va='center',
                       fontweight='bold', color='white')

        # Create colored border around white text for contrast
        text.set_path_effects([
            path_effects.Stroke(linewidth=3, foreground=foreground),
            path_effects.Normal(),
        ])

    return code, no_code


def citation_count_w_wo_code(conf, cfg):
    add_statistics = cfg['ADD_STATISTICS']

    # Print conference name
    print('Conference {}'.format(conf))

    # Set paths for data and plotting
    csv_file_name = conf + '/ALL_DATA.csv'
    figure_output_file = 'plots/Code Availability vs Citations in {} Box Plot.'.format(conf)

    # Read CSV file
    data_plt = pd.read_csv(csv_file_name)

    # The years to plot; sorted explicitly because set iteration order is not guaranteed
    years = sorted(set(data_plt.Year))  # unique years in ascending order
    reversed_years = years[::-1]  # most recent year first

    # Create box plots
    ax = sns.boxplot(x='Year', y='Citations', hue='With Code', data=data_plt, showfliers=False, order=reversed_years)

    # Set labels
    ax.set_xticklabels(['{} ({})'.format(reversed_years[0] + 1 - year, year) for year in reversed_years])
    ax.set_xlabel('Years since Publication (from {})'.format(reversed_years[0] + 1))
    ax.set_ylabel('Semantic Scholar Citations')
    plt.title(conf)

    if add_statistics:
        percentile_names = ['Median', 'Third Quartile']
        for percentile_name in percentile_names:
            print(percentile_name)
            # label the median and third quartiles on the plot
            code, no_code = add_percentile_labels(ax, percentile_name=percentile_name)
            # print changes in percentile for publications with and without OSC over the years
            print_percentages(code, no_code, name=percentile_name)

    # Create tikz figure and save PNG
    tikzplotlib.clean_figure()
    tikzplotlib.save(figure_output_file + 'tex')
    plt.savefig(figure_output_file + 'png')
    plt.show()
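
The comparison that `print_percentages` reports can be traced by hand. The sketch below restates `calc_percentage` and `percentual_change` from the file above and applies them to made-up median citation counts. The numbers are illustrative assumptions, not values from the paper's data, and the variable names outside the two functions are hypothetical.

```python
def calc_percentage(minuend, subtrahend, denom):
    """Return (minuend - subtrahend) / denom in percent, or inf for a ~0 denominator."""
    if abs(denom) <= 1e-8:
        return float('inf')
    return (minuend - subtrahend) / denom * 100.0


def percentual_change(values):
    """Relative change in percent between each pair of consecutive entries."""
    return [calc_percentage(values[i + 1], values[i], values[i])
            for i in range(len(values) - 1)]


# Hypothetical medians, one entry per box in plotting order.
medians_no_code = [2.0, 4.0, 5.0]
medians_code = [4.0, 10.0, 15.0]

change_no_code = percentual_change(medians_no_code)  # [100.0, 25.0]
change_code = percentual_change(medians_code)        # [150.0, 50.0]

# How much larger each change is with code, relative to the change without code.
diff_changes = [calc_percentage(c, n, n)
                for c, n in zip(change_code, change_no_code)]  # [50.0, 100.0]

print(change_no_code, change_code, diff_changes)
```

A baseline of zero yields `inf` instead of raising `ZeroDivisionError`, so a zero median in the first position is printed as `inf %` rather than stopping the script.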
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| SiQi Zhou | s****u@r****a | 3 |
| Jacopo Panerati | j****i@u****a | 2 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0