Recent Releases of aac-datasets
aac-datasets - Version 0.7.0
[0.7.0] 2025-07-19
Added
- `ytdlp_opts` argument to AudioCaps download.
- `num_dl_attempts` argument to AudioCaps download.
- `load_dataset` function to load a dataset from name.
- `list_datasets_names` function to get dataset names.
- `to_hf_dataset` method to convert to HuggingFace `Dataset` instance.
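The changelog does not spell out the semantics of `num_dl_attempts`; a plausible reading is that it bounds how many times a failed download is tried. The sketch below illustrates that general retry pattern only. The helper `call_with_attempts` and the function `flaky_download` are hypothetical names invented for this example and are not part of aac-datasets.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_attempts(func: Callable[[], T], num_attempts: int = 3, delay_s: float = 0.0) -> T:
    """Call func until it succeeds or num_attempts calls have failed (hypothetical helper)."""
    if num_attempts < 1:
        raise ValueError("num_attempts must be >= 1")
    last_error = None
    for attempt in range(1, num_attempts + 1):
        try:
            return func()
        except Exception as err:  # a real downloader would catch narrower error types
            last_error = err
            if attempt < num_attempts:
                time.sleep(delay_s)
    raise RuntimeError(f"failed after {num_attempts} attempts") from last_error


# Hypothetical flaky download: fails twice, then succeeds on the third call.
calls = {"count": 0}


def flaky_download() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "clip.flac"


result = call_with_attempts(flaky_download, num_attempts=3)
```

With `num_attempts=2` the same call would raise `RuntimeError`, since the stand-in download only succeeds on its third call.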
- Python
Published by Labbeti 8 months ago
aac-datasets - Version 0.6.0
[0.6.0] 2025-05-26
Added
- AudioCaps support for version `v2`.
- Methods `to_dict` and `to_list` to datasets classes.
Changed
- Rename AudioCaps v1 `train_v2` subset to `train_fixed` to avoid confusion with AudioCaps v2 `train` subset.
- Rename WavCaps `audioset_no_audiocaps` subset to `audioset_no_audiocaps_v1` to specify which AudioCaps version is excluded.
Fixed
- Remove invalid warning when using WavCaps subset `freesound_no_clotho_v2`.
- Download link for AudioCaps V1 subset `train_fixed`.
Removed
- Remove subset `freesound_no_clotho` for WavCaps since it is easily confused with `freesound_no_clotho_v2` and should not be used.
- Python
Published by Labbeti 9 months ago
aac-datasets - Version 0.5.2
[0.5.2] 2024-03-23
Added
- `freesound_no_clotho_v2` subset to WavCaps to avoid all bias with Clotho test and analysis subsets.
- Python
Published by Labbeti almost 2 years ago
aac-datasets - Version 0.5.1
[0.5.1] 2024-03-04
Fixed
- WavCaps download preparation (#3).
- `safe_rmdir` function when sub-directories are deleted.
- Python
Published by Labbeti almost 2 years ago
aac-datasets - Version 0.5.0
[0.5.0] 2024-01-05
Changed
- Update typing for paths with Python class `Path`.
- Refactor functional interface to load raw metadata for each dataset.
- Refactor class variables to init arguments.
- Faster AudioCaps download with `ThreadPoolExecutor`.
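As a rough illustration of why a thread pool speeds up I/O-bound downloads, here is a minimal sketch using the standard library's `ThreadPoolExecutor`. It is not the library's implementation: `fake_download` and `download_all` are hypothetical stand-ins, and no network access is performed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def fake_download(clip_id: str) -> str:
    """Stand-in for one I/O-bound download; returns a hypothetical output file name."""
    return f"{clip_id}.flac"


def download_all(clip_ids: List[str], max_workers: int = 4) -> Dict[str, str]:
    """Run the downloads concurrently and map each id to its result."""
    # executor.map yields results in input order, so they can be zipped with the ids.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        file_names = list(executor.map(fake_download, clip_ids))
    return dict(zip(clip_ids, file_names))


downloaded = download_all(["a", "b", "c"])
```

Threads suit this kind of workload because each worker spends most of its time waiting on the network rather than on the CPU.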
- Python
Published by Labbeti about 2 years ago
aac-datasets - Version 0.4.1
[0.4.1] 2023-10-25
Added
- `AudioCaps.DOWNLOAD_AUDIO` class variable for compatibility with audiocaps-download 1.0.
Changed
- Set log level to WARNING if verbose<=0 in check.py and download.py scripts.
- Use `yt-dlp` instead of `youtube-dl` as backend to download AudioCaps audio files. (#1)
- Update default download message for AudioCaps. (#1)
- Update error message when checksum is invalid for Clotho and MACS datasets. (#2)
- Python
Published by Labbeti over 2 years ago
aac-datasets - Version 0.4.0
[0.4.0] 2023-09-25
Added
- First experimental implementation of WavCaps dataset.
- Subsets `dcase_t2a_audio` and `dcase_t2a_captions` from the DCASE Challenge task 6b, in Clotho dataset.
- Subset `train_v2` for AudioCaps dataset.
- Dataset cards as separate dataclasses for each dataset.
- Get and set global user paths for root, ffmpeg and ytdl.
- Base class for all datasets to simplify manipulation of loaded data.
Changed
- Rename `test` subset to `dcase_aac_test`, `analysis` subset to `dcase_aac_analysis` from the DCASE Challenge task 6a, in Clotho dataset.
- Function `get_install_info` now returns `package_path`.
- Python
Published by Labbeti over 2 years ago
aac-datasets - Version 0.3.3
[0.3.3] 2023-05-11
Added
- Script check.py now checks if the audio files exist.
- Option `VERIFY_FILES` for Clotho and MACS datasets to validate checksums.
- `CITATION` global constant for each dataset.
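Checksum validation of a downloaded file generally means hashing the file on disk and comparing the digest with a published value. The sketch below shows that idea with the standard library. The choice of MD5 and the helper names `file_md5` and `is_valid_file` are assumptions made for illustration; the changelog does not say which hash or functions the library uses.

```python
import hashlib
import os
import tempfile


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    hasher = hashlib.md5()
    with open(path, "rb") as file:
        for chunk in iter(lambda: file.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def is_valid_file(path: str, expected_digest: str) -> bool:
    """Return True when the file's digest matches the expected one (hypothetical helper)."""
    return file_md5(path) == expected_digest


# Demonstration on a temporary file standing in for a downloaded archive.
with tempfile.TemporaryDirectory() as tmp_dir:
    archive_path = os.path.join(tmp_dir, "archive.bin")
    with open(archive_path, "wb") as file:
        file.write(b"hello")
    expected = hashlib.md5(b"hello").hexdigest()
    ok = is_valid_file(archive_path, expected)
    bad = is_valid_file(archive_path, "0" * 32)
```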
Changed
- Methods `at` and `__getitem__` now use correct typing when passing an integer, list, slice or None.
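One common way to give an accessor precise return types for different index kinds is `typing.overload`. The toy class below sketches that technique only. `ToyDataset` is hypothetical, and its return shapes (one row for an integer, a list of rows otherwise) are an assumption that does not necessarily mirror what the library's `at` returns.

```python
from typing import Any, Dict, Iterable, List, Union, overload

Row = Dict[str, Any]


class ToyDataset:
    """Hypothetical in-memory dataset used only to illustrate typed indexing."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows

    @overload
    def at(self, index: int) -> Row: ...

    @overload
    def at(self, index: Union[slice, Iterable[int], None] = None) -> List[Row]: ...

    def at(self, index=None):
        if index is None:
            # None selects every row.
            return list(self._rows)
        if isinstance(index, int):
            return self._rows[index]
        if isinstance(index, slice):
            return self._rows[index]
        # Any other iterable of integers selects rows one by one.
        return [self._rows[i] for i in index]

    def __getitem__(self, index):
        return self.at(index)


toy = ToyDataset([{"fname": "a.wav"}, {"fname": "b.wav"}, {"fname": "c.wav"}])
single = toy.at(0)
sliced = toy[1:]
picked = toy.at([2, 0])
everything = toy.at(None)
```

A type checker can then infer a single row for `toy.at(0)` and a list of rows for the other calls, while the runtime behaviour comes from the one untyped implementation.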
Fixed
- Python minimal version in README and pyproject.toml.
- Transform applied in `__getitem__` method when argument is not an integer.
- Incompatibility with `torchaudio>=2.0`.
- Remove 'tags' from AudioCaps columns when with_tags=False.
- Python
Published by Labbeti almost 3 years ago
aac-datasets - Version 0.3.2
[0.3.2] 2023-01-30
Added
- `AudioCaps.load_class_labels_indices` to load AudioSet classes map externally.
- Compatibility and tests from Python 3.7 to 3.10.
Changed
- Attributes in datasets classes are now weakly private.
- Documentation theme and descriptions.
Fixed
- Workflow badge with Github changes. (https://github.com/badges/shields/issues/8671)
- Python
Published by Labbeti about 3 years ago
aac-datasets - Version 0.3.1
[0.3.1] 2022-10-31
Changed
- The order of items in AudioCaps, Clotho and MACS is now defined by their order in the corresponding captions CSV files when available.
- Update documentation usage and main page.
Fixed
- Workflow when requirements cache is invalid.
- Python
Published by Labbeti over 3 years ago
aac-datasets - Version 0.3.0
[0.3.0] 2022-09-28
Added
- Add `column_names`, `info` and `shape` properties in datasets.
- Add `is_loaded` and `set_transform` methods in datasets.
- Add column argument for method `__getitem__` in datasets.
- Entrypoints for command line scripts `aac-datasets-check`, `aac-datasets-download` and `aac-datasets-info`.
Changed
- Enforce datasets order to sort by filename to avoid different orders returned by `os.listdir`.
- Function `check_directory` now returns the length of each dataset found in directory.
- Rename `get_field` methods in datasets to `at` and add support for Iterable of keys and None key.
- Change `at` arguments order and names.
- Split `BasicCollate` into 2 classes: `BasicCollate` without padding and `AdvancedCollate` with padding options.
- Weak private methods are now strongly private in datasets.
- Rename `item_transform` to `transform` in datasets.
- Rename `load_tags` to `with_tags` in `AudioCaps`.
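The first item in this list relies on the fact that `os.listdir` returns entries in arbitrary order, so item indices can differ between machines unless the listing is sorted. A minimal sketch of the idea follows. `list_audio_files` and the `.wav` filter are hypothetical choices for this example, not the library's code.

```python
import os
import tempfile
from typing import List


def list_audio_files(directory: str) -> List[str]:
    """Return audio file names in a deterministic, sorted order."""
    # os.listdir gives no ordering guarantee, so sort explicitly.
    return sorted(name for name in os.listdir(directory) if name.endswith(".wav"))


# Demonstration on a temporary directory with files created out of order.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("b.wav", "a.wav", "c.wav", "notes.txt"):
        open(os.path.join(tmp_dir, name), "w").close()
    file_names = list_audio_files(tmp_dir)
```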
Fixed
- AudioCaps loading when `with_tags` is False.
- Clotho files download.
- Python
Published by Labbeti over 3 years ago
aac-datasets - Version 0.2.0
[0.2.0] 2022-08-30
Added
- CHANGELOG file.
- First version of the API documentation.
- Supports slicing and list indexing for the three datasets.
- Competence values for MACS annotators.
- Fields scene_label and identifier from TAU Urban acoustic scene dataset in MACS.
- Add `examples/dataloader.ipynb` notebook.
Changed
- Update README with PyPI install and software citation.
- Download functions now return the downloaded datasets.
- MACS now has a subset parameter.
- Underscores in function names to avoid importing private functions.
- Function `aac_datasets.check.check_directory` now returns only the list of subsets loaded.
- Replace function `torchaudio.datasets.utils.download_url` with `torch.hub.download_url_to_file` to keep compatibility with future torchaudio version v0.12.
- Rename `get_raw` methods in datasets to `get_field` and add support for slicing and multi-indexing.
Fixed
- LICENCE.txt and MACS_competence.yaml download for MACS dataset.
- Clotho download archives files.
Removed
- Transforms dictionary in datasets.
- Argument item_type in datasets.
- Method `get` in datasets.
- Python
Published by Labbeti over 3 years ago
aac-datasets - Version 0.1.1
Added
- CITATION file
Changed
- MACS now downloads only the required TAU Urban Sound archive files
- Documentation for arguments in dataset constructors
Fixed
- Clotho analysis subset download and preparation
- Python
Published by Labbeti over 3 years ago
aac-datasets - Version 0.1.0
Added
- Initial versions of Clotho, AudioCaps and MACS pytorch dataset code
- Download & check scripts
- Python
Published by Labbeti over 3 years ago