Recent Releases of dataverse-metadata-crawler

dataverse-metadata-crawler - v0.1.5

What's Changed

1. Bug fixes

  • Fixed subject columns (CM_Subject_*) wrongly showing False in the spreadsheet #20
  • Fixed the spelling of RequestAccess and CM_ProdAbbrev in the spreadsheet #20

Full Changelog: https://github.com/scholarsportal/dataverse-metadata-crawler/compare/v0.1.4...v0.1.5

- Python
Published by kenlhlui 11 months ago

dataverse-metadata-crawler - v0.1.4

1. Feature updates

  1. Added a count of the deaccessioned/draft datasets crawled to the log.
  2. Added an end-of-crawl message (✅ Crawling process completed successfully.)

2. Bug fixes

  1. Removed deaccessioned/draft dataset metadata from failed_metadata_uris_yyyymmdd-HHMMSS.json. These metadata records will now only be shown in pid_dict_dd_yyyymmdd-HHMMSS.json.
  2. Removed log entries that listed JSON output files which were never created.

Full Changelog: https://github.com/scholarsportal/dataverse-metadata-crawler/compare/v0.1.3...v0.1.4

- Python
Published by kenlhlui 12 months ago

dataverse-metadata-crawler - v0.1.3

1. Feature updates

  1. Renamed example.ipynb to colud_cli.ipynb to better represent the use of the notebook.
  2. Updated colud_cli.ipynb to support interactive BASE_URL and API_KEY input for creating the .env file.
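
As a rough illustration of what creating the .env file from user input involves, here is a minimal sketch. It is not the notebook's actual code: the function name `write_env_file`, the quoted `KEY="value"` layout, and the placeholder URL are all assumptions made for the example.

```python
from pathlib import Path
import tempfile


def write_env_file(path: Path, base_url: str, api_key: str = "") -> None:
    """Write BASE_URL and API_KEY to a .env file, one KEY="value" pair per line.

    Hypothetical helper; the quoted layout is an assumption, not the tool's format.
    """
    lines = [f'BASE_URL="{base_url}"', f'API_KEY="{api_key}"']
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


# Demonstrate in a temporary directory so nothing is written to the working tree.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    write_env_file(env_path, "https://example.org", "")  # placeholder URL, empty key
    env_text = env_path.read_text(encoding="utf-8")

print(env_text)
```

In an interactive session the two values could come from the standard library's `input()` and `getpass.getpass()`, the latter so the API key is not echoed to the screen.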

2. Others

  1. Updated the poetry-export_dependencies.yml (GitHub workflow file) to update the requirements.txt and poetry.lock files in a CI/CD manner.

Full Changelog: https://github.com/scholarsportal/dataverse-metadata-crawler/compare/v0.1.2...v0.1.3

- Python
Published by kenlhlui about 1 year ago

dataverse-metadata-crawler - v0.1.2

1. Feature updates

  1. Added example.ipynb for launching the tool in Binder; no Git or Python install is required.
  2. Updated the connection check. If the API_KEY supplied by the user is invalid, the tool now falls back to an unauthenticated connection for crawling.
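
The fallback described in item 2 can be sketched as follows. This is an illustration only, not the tool's implementation: `build_headers` and `key_is_valid` are hypothetical names, and how the crawler actually validates a key is not stated in these notes. The `X-Dataverse-key` header is the one the Dataverse API accepts for token authentication.

```python
from typing import Callable, Dict, Optional


def build_headers(
    api_key: Optional[str], key_is_valid: Callable[[str], bool]
) -> Dict[str, str]:
    """Return request headers, omitting the API key when it is missing or invalid.

    Hypothetical helper: key_is_valid stands in for whatever check the tool runs.
    """
    if api_key and key_is_valid(api_key):
        return {"X-Dataverse-key": api_key}
    # Fall back to an unauthenticated request: no auth header at all.
    return {}


def fake_check(key: str) -> bool:
    """Stand-in validator for the example; a real check would query the server."""
    return key == "good-key"


print(build_headers("good-key", fake_check))  # {'X-Dataverse-key': 'good-key'}
print(build_headers("bad-key", fake_check))   # {}
print(build_headers(None, fake_check))        # {}
```

Passing the validator in as a callable keeps the decision logic testable without a network connection.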

2. Others

  1. Moved the definition of the headers used for GET requests to MetaDataCrawler.

Full Changelog: https://github.com/scholarsportal/dataverse-metadata-crawler/compare/v0.1.1...v0.1.2

- Python
Published by kenlhlui about 1 year ago

dataverse-metadata-crawler - v0.1.1

1. Schema changes

  1. The key for ds_metadata in the dataset will now use dataset IDs (unique identifiers for each dataset version in the Dataverse system). Example:

```
# Old version
"doi:10.5072/FK2/DUGFC4": {  # datasetPersistentId
  "status": "OK",
  "data": {
    "id": 850,
    "datasetId": 2663,
    "datasetPersistentId": "doi:10.5072/FK2/DUGFC4",
    ...

# New version
{
  "2663": {  # datasetId
    "status": "OK",
    "data": {
      "id": 850,
      "datasetId": 2663,
      "datasetPersistentId": "doi:10.5072/FK2/DUGFC4",
      ...
```

  2. `ds_metadata_yyyymmdd-HHMMSS.json` now contains `data`, `path_info` and `permission_info` at the second level.

```
{
  ...
  "status": "OK",
  "data": { ... },
  "path_info": { ... },
  "permission_info": { ... },
```

  3. Changes to the following fields in `path_info` for consistency with the new schema:

    • collection_alias -> CollectionAlias
    • collection_id -> CollectionID
    • pid -> datasetPersistentId
    • ds_id -> datasetId
    • path_ids -> pathIds

```
# Old version
...
"path_info": {
  "collection_alias": "toronto",
  "collection_id": 22,
  "pid": "doi:10.5072/FK2/DUGFC4",
  "ds_id": 2663,
  "path": "/Nick Field Dataverse",
  "path_ids": [
    2641
  ]
}

# New version
...
"path_info": {
  "CollectionAlias": "toronto",
  "CollectionID": 22,
  "datasetPersistentId": "doi:10.5072/FK2/DUGFC4",
  "datasetId": 2663,
  "path": "/Nick Field Dataverse",
  "pathIds": [
    2641
  ]
}
```
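
For scripts that still hold output keyed by persistent ID, the re-keying described in the first schema change can be reproduced with a short helper. This is a sketch under assumptions, not code from the tool: `rekey_by_dataset_id` and `index_by_pid` are hypothetical names, and the sample record contains only the fields shown in the example above.

```python
def rekey_by_dataset_id(old_metadata: dict) -> dict:
    """Re-key records from datasetPersistentId to the string form of datasetId."""
    return {
        str(record["data"]["datasetId"]): record
        for record in old_metadata.values()
    }


def index_by_pid(new_metadata: dict) -> dict:
    """Build a lookup from datasetPersistentId back to the record."""
    return {
        record["data"]["datasetPersistentId"]: record
        for record in new_metadata.values()
    }


# Sample record limited to the fields shown in the release note.
old_style = {
    "doi:10.5072/FK2/DUGFC4": {
        "status": "OK",
        "data": {
            "id": 850,
            "datasetId": 2663,
            "datasetPersistentId": "doi:10.5072/FK2/DUGFC4",
        },
    }
}

new_style = rekey_by_dataset_id(old_style)
print(list(new_style))                # ['2663']
print(list(index_by_pid(new_style)))  # ['doi:10.5072/FK2/DUGFC4']
```

Because each record still carries `datasetPersistentId` inside `data`, a DOI-based lookup remains possible after the switch to dataset ID keys.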

2. Feature updates

  1. Combined the representation (-d) and permission (-p) metadata into ds_metadata_yyyymmdd-HHMMSS.json as a single JSON file.
  2. Added per-dataset counts of the following permission roles to the spreadsheet output: DS_Collab, DS_Admin, DS_Contrib, DS_ContribPlus, DS_Curator, DS_FileDown, DS_Member. Only available if -p is enabled.

3. Bug Fixes

  1. Corrected spelling mistakes in the README file.
  2. Restored missing fields for representation metadata in the spreadsheet:
    • TermsOfUse
    • CM_AuthorAff
    • CM_TimeEnd
    • CM_CollectionStart
    • CM_CollectionEnd
  3. Fixed handling of -f responses with None objects.

- Python
Published by kenlhlui about 1 year ago

dataverse-metadata-crawler - v0.1.0

  1. Initial release

Full Changelog: https://github.com/scholarsportal/dataverse-metadata-crawler/commits/v0.1.0

- Python
Published by kenlhlui about 1 year ago
