Science Score: 52.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ✓ Institutional organization owner: Organization bodleian has institutional domain (www.bodleian.ox.ac.uk)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.1%) to scientific vocabulary
Repository
An experimental library for writing WACZ files
Basic Info
- Host: GitHub
- Owner: bodleian
- License: mit
- Language: Rust
- Default Branch: main
- Homepage: https://lib.rs/crates/wacksy
- Size: 270 KB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 13
- Releases: 4
Metadata Files
README.md
Wacksy
An experimental Rust library for ~~reading and~~ writing ᴡᴀᴄᴢ files.
Install
With cargo installed, run the following command in your project directory:
cargo add wacksy
Example
This library provides two main ᴀᴘɪ functions.
from_file() takes a ᴡᴀʀᴄ file and returns a structured representation of a ᴡᴀᴄᴢ object.
zip() takes a ᴡᴀᴄᴢ object and zips it up to a byte array using rawzip.
```rust
use std::error::Error;
use std::fs;
use std::path::Path;
use wacksy::WACZ; // import path assumed here; check the crate documentation

fn main() -> Result<(), Box<dyn Error>> {
    let warc_file_path = Path::new("example.warc.gz"); // set path to your ᴡᴀʀᴄ file
    let wacz_object = WACZ::from_file(warc_file_path)?; // index the ᴡᴀʀᴄ and create a ᴡᴀᴄᴢ object
    let zipped_wacz: Vec<u8> = wacz_object.zip()?; // zip up the ᴡᴀᴄᴢ
    fs::write("example.wacz", zipped_wacz)?; // write out to file
    Ok(())
}
```
See the documentation for more details.
Background
According to Ed Summers, a ᴡᴀᴄᴢ file is "really just a ᴢɪᴘ file that contains ᴡᴀʀᴄ data and metadata at predictable file locations."[^code4lib_talk]
The example in the spec outlines what a ᴡᴀᴄᴢ file should contain:
archive
└── data.warc.gz
datapackage.json
datapackage-digest.json
indexes
└── index.cdx.gz
pages
└── pages.jsonl
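
Because the layout is just a set of predictable paths inside a ᴢɪᴘ file, comparing an archive against it comes down to string matching. The following is a minimal sketch of that idea and is not part of Wacksy's ᴀᴘɪ: the `EXAMPLE_LAYOUT` constant and the `missing_entries` function are hypothetical names, and the entry list is assumed to come from whichever ᴢɪᴘ reader you already use. It checks only against the example layout shown here and makes no claim about which entries the spec treats as mandatory.

```rust
/// Paths from the example layout, written as they would appear
/// inside the ZIP archive (hypothetical constant, for illustration).
const EXAMPLE_LAYOUT: [&str; 5] = [
    "archive/data.warc.gz",
    "datapackage.json",
    "datapackage-digest.json",
    "indexes/index.cdx.gz",
    "pages/pages.jsonl",
];

/// Return every path from the example layout that is absent from `entries`.
/// `entries` is assumed to be the list of file names already read from
/// the archive by a ZIP reader of your choice.
pub fn missing_entries(entries: &[&str]) -> Vec<&'static str> {
    EXAMPLE_LAYOUT
        .iter()
        .copied()
        .filter(|expected| !entries.iter().any(|entry| *entry == *expected))
        .collect()
}
```

Calling `missing_entries(&["archive/data.warc.gz", "datapackage.json"])` returns the three remaining paths in layout order, and an empty vector means every path from the example is present.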
[^code4lib_talk]: For more discussion of the concept, see the talk "Web Archives in Digital Repositories" by Ilya Kreymer and Ed Summers at Code4Lib 2022.
Similar libraries
License
MIT © Bodleian Libraries and contributors
Owner
- Name: Bodleian Libraries
- Login: bodleian
- Kind: organization
- Location: Oxford, UK
- Website: https://www.bodleian.ox.ac.uk
- Repositories: 56
- Profile: https://github.com/bodleian
The Bodleian Libraries of the University of Oxford
Citation (CITATION.cff)
cff-version: 1.2.0
title: Wacksy
message: >-
  'If you use this software, please cite it using the
  metadata from this file.'
references:
  - authors:
      - name: Webrecorder
    title: 'Web Archive Collection Zipped (WACZ)'
    type: standard
    version: 1.1.1
    date-published: '2021-06-03'
    url: 'https://specs.webrecorder.net/wacz/1.1.1/'
type: software
authors:
  - given-names: Pierre
    family-names: Marshall
    email: 'pierre.marshall@bodleian.ox.ac.uk'
    affiliation: Oxford University
    orcid: 'https://orcid.org/0000-0001-9245-7670'
repository-code: 'https://github.com/bodleian/wacksy'
abstract: 'An experimental library for writing WACZ files.'
keywords:
  - wacz
  - warc
  - cdxj
  - archive
license: MIT
version: '0.0.2'
date-released: '2025-08-06'
GitHub Events
Total
- Issues event: 9
- Delete event: 12
- Issue comment event: 10
- Push event: 37
- Pull request event: 3
- Create event: 13
Last Year
- Issues event: 9
- Delete event: 12
- Issue comment event: 10
- Push event: 37
- Pull request event: 3
- Create event: 13
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 6
- Total pull requests: 2
- Average time to close issues: about 2 months
- Average time to close pull requests: 9 days
- Total issue authors: 1
- Total pull request authors: 2
- Average comments per issue: 0.5
- Average comments per pull request: 0.5
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 6
- Pull requests: 2
- Average time to close issues: about 2 months
- Average time to close pull requests: 9 days
- Issue authors: 1
- Pull request authors: 2
- Average comments per issue: 0.5
- Average comments per pull request: 0.5
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 1
Top Authors
Issue Authors
- extua (5)
Pull Request Authors
- github-actions[bot] (1)
- extua (1)