Science Score: 62.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 4 committers (25.0%) from academic institutions
- ✓ Institutional organization owner: Organization caltechlibrary has institutional domain (www.library.caltech.edu)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (16.3%) to scientific vocabulary
Keywords from Contributors
Repository
Automated Metadata Service
Basic Info
- Host: GitHub
- Owner: caltechlibrary
- License: other
- Language: Python
- Default Branch: main
- Size: 66.3 MB
Statistics
- Stars: 5
- Watchers: 7
- Forks: 4
- Open Issues: 9
- Releases: 35
Metadata Files
README.md
ames
Automated Metadata Service
Manage metadata from different sources. The examples in the package are specific to Caltech repositories, but could be generalized. This package is currently in development and will have additional sources and matchers added over time.
Install
You need to have Python 3.7 or later on your machine.
If you just need the Python functions to write your own code
(like codemetatodatacite), open a terminal and type pip install ames.
Full Install
The full install includes all the example scripts. You need Python 3.7 or later and git on your machine.
Clone ames
A full install starts by downloading this software using git. Find where you
want the ames folder to live on your computer in File Explorer or Finder
(This could be the Desktop or Documents folder, for example). Type cd
in anaconda prompt or terminal and drag the location from the file browser into
the terminal window. The path to the location
will show up, so your terminal will show a command like
cd /Users/tmorrell/Desktop. Hit enter. Then type
git clone https://github.com/caltechlibrary/ames.git. Once you
hit enter, you'll see an ames folder. Type cd ames to move into it.
Install
Now that you're in the ames folder, type python setup.py install. You can
now run all the different operations described below.
Updating
When there is a new version of the software, go to the ames
folder in anaconda prompt or terminal and type git pull. You shouldn't need to re-do
the installation steps unless there are major updates.
Organization
Harvesters
- crossref_refs - Harvest references in datacite metadata from crossref event data
- caltechdata - Harvest metadata from CaltechDATA
- cd_github - Harvest GitHub repos and codemeta files from CaltechDATA
- matomo - Harvest web statistics from matomo
- caltechfeeds - Harvest Caltech Library metadata from feeds.library.caltech.edu
Matchers
- caltechdata - Match content in CaltechDATA
- update_datacite - Match content in DataCite
Example Operations
The run scripts show examples of using ames to perform a specific update operation.
CodeMeta management
In the test directory there is an example of using the codemetatodatacite function to convert a codemeta file to DataCite standard metadata.
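As a rough illustration of what such a conversion involves, the sketch below maps a handful of codemeta.json fields onto DataCite-style keys. It is not the ames function itself: the name codemeta_to_datacite_sketch and the selection of fields are assumptions made for this example, and the real converter may handle more fields and edge cases.

```python
import json


def codemeta_to_datacite_sketch(codemeta: dict) -> dict:
    """Map a few codemeta.json fields onto DataCite-style keys (illustrative only)."""
    creators = []
    for person in codemeta.get("author", []):
        given = person.get("givenName", "")
        family = person.get("familyName", "")
        creator = {
            "name": f"{family}, {given}",
            "givenName": given,
            "familyName": family,
        }
        # An ORCID URL stored in "@id" becomes a DataCite name identifier.
        orcid = person.get("@id", "")
        if "orcid.org/" in orcid:
            creator["nameIdentifiers"] = [
                {
                    "nameIdentifier": orcid.rsplit("/", 1)[-1],
                    "nameIdentifierScheme": "ORCID",
                }
            ]
        creators.append(creator)

    datacite = {
        "titles": [{"title": codemeta.get("name", "")}],
        "creators": creators,
        "subjects": [{"subject": kw} for kw in codemeta.get("keywords", [])],
    }
    # Optional fields are only copied when present in the source file.
    if "description" in codemeta:
        datacite["descriptions"] = [
            {"description": codemeta["description"], "descriptionType": "Abstract"}
        ]
    if "version" in codemeta:
        datacite["version"] = codemeta["version"]
    return datacite


if __name__ == "__main__":
    example = {
        "name": "ames",
        "description": "Automated Metadata Service: Manage metadata from different sources.",
        "version": "1.2.2",
        "keywords": ["GitHub", "metadata", "software"],
        "author": [
            {
                "givenName": "Thomas E",
                "familyName": "Morrell",
                "@id": "https://orcid.org/0000-0001-9266-5146",
            }
        ],
    }
    print(json.dumps(codemeta_to_datacite_sketch(example), indent=2))
```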
CodeMeta Updating
Collect GitHub records in CaltechDATA, search for a codemeta.json file, and update CaltechDATA with new metadata.
CodeMeta Setup
You need to set an environment variable with your token to access
CaltechDATA: export TINDTOK=
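If you write your own scripts against these services, it can help to fail early with a clear message when a token is missing. The helper below is a minimal sketch of that idea; get_token is a hypothetical name for this example, not a function provided by ames.

```python
import os


def get_token(name: str = "TINDTOK") -> str:
    """Return the token stored in the environment variable `name`.

    Raises RuntimeError with a hint when the variable is unset or empty.
    """
    token = os.environ.get(name, "")
    if not token:
        raise RuntimeError(
            f"{name} is not set. Run 'export {name}=<your token>' first."
        )
    return token


if __name__ == "__main__":
    try:
        token = get_token("TINDTOK")
        print("Token found with length", len(token))
    except RuntimeError as err:
        print(err)
```

The same helper would work for the other variables named in this README (MAILTOK, DATACITE, MATTOK) by passing a different name.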
CodeMeta Usage
Type python run_codemeta.py.
CaltechDATA Citation Alerts
Harvest citation data from the Crossref Event Data API and records from CaltechDATA, match the records, update metadata in CaltechDATA, and send an email to the user.
Citation Alerts Setup
You need to set environment variables with your tokens to access
CaltechDATA (export TINDTOK=) and Mailgun (export MAILTOK=).
Citation Alerts Usage
Type python run_event_data.py. You'll be prompted for confirmation if any
new citations are found.
Media Updates
Update media records in DataCite that indicate the files associated with a DOI.
Media Setup
You need to set an environment variable with the password for your DataCite
account using export DATACITE=
Media Usage
Type python run_media_update.py.
CaltechDATA metadata checks
This will run checks on the quality of metadata in CaltechDATA. Currently this
verifies whether redundant links are present in the related identifier section.
It can also update metadata with DataCite.
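To make the idea of a redundant link concrete, here is a minimal sketch of one possible check. It assumes DataCite-style metadata with a relatedIdentifiers list, and it assumes "redundant" means the same identifier appearing more than once with the same relation type. The function name find_redundant_links and the example DOIs are hypothetical; the actual check in ames may use different rules.

```python
def find_redundant_links(metadata: dict) -> list:
    """Return related identifier entries that repeat an earlier entry.

    Two entries count as the same link when their identifier (compared
    case-insensitively) and relation type both match.
    """
    seen = set()
    redundant = []
    for entry in metadata.get("relatedIdentifiers", []):
        key = (
            entry.get("relatedIdentifier", "").strip().lower(),
            entry.get("relationType"),
        )
        if key in seen:
            redundant.append(entry)
        else:
            seen.add(key)
    return redundant


if __name__ == "__main__":
    record = {
        "relatedIdentifiers": [
            {"relatedIdentifier": "10.1234/abc", "relationType": "IsCitedBy"},
            {"relatedIdentifier": "10.1234/ABC", "relationType": "IsCitedBy"},
            {"relatedIdentifier": "10.1234/abc", "relationType": "IsSupplementTo"},
        ]
    }
    for link in find_redundant_links(record):
        print("Redundant:", link)
```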
Metadata Checks Setup
You need to set an environment variable with your token to access
CaltechDATA: export TINDTOK=
Metadata Checks Usage
Type python run_caltechdata_checks.py.
CaltechDATA Metadata Updates
This will improve the quality of metadata in CaltechDATA. This option is broken up into updates that should run frequently (currently every 10 minutes) and daily. Frequent updates include adding a recommended citation to the descriptions, and daily updates include adding CaltechTHESIS DOIs to CaltechDATA.
Metadata Updates Setup
You need to set an environment variable with your token to access
CaltechDATA: export TINDTOK=
Metadata Updates Usage
Type python run_caltechdata_updates.py or python run_caltechdata_daily.py.
CaltechDATA COUNTER Usage Reports
This will harvest download and view information from matomo and format it into a COUNTER report. This feature is still being tested.
Usage Report Setup
You need to set an environment variable with your token to access
Matomo: export MATTOK=
Usage Report Usage
Type python run_usage.py.
Archives Reports
Runs reports on ArchivesSpace. Current reports:
- accession_report: Returns accession records that match a certain subject
- format_report: Returns large report on accessions with certain media formats
Example usage:
python runarchivesreport.py accession_report accession.csv -subject "Manuscript Collection"
Update Eprints
Perform update operations using the Eprints API. Supports URL updates to https for the resolver field, special character updates, and adjusting the item modified date (which also regenerates the public view of the page).
Example usage:
python runeprintsupdates.py update_date authors -recid 83420 -user tmorrell -password
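The https update for the resolver field can be pictured as a small string transformation. The sketch below only shows that transformation; it does not call the Eprints API, and both the function name to_https and the example URL are hypothetical.

```python
def to_https(url: str) -> str:
    """Rewrite an http:// URL to https://, leaving other URLs unchanged."""
    prefix = "http://"
    if url.startswith(prefix):
        return "https://" + url[len(prefix):]
    return url


if __name__ == "__main__":
    # Hypothetical resolver URL used only for illustration.
    print(to_https("http://resolver.example.edu/record-123"))
```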
CODA Reports
Runs reports on Caltech Library repositories. Current reports:
- doi_report: Records (optionally filtered by year) and their DOIs.
- thesis_report: Matches Eprints tsv export for CaltechTHESIS
- thesis_metadata: Matches Eprints metadata tsv export for CaltechTHESIS
- creator_report: Finds records where an Eprints Creator ID has an ORCID but it is not included on all records. Also lists cases where an author has two ORCIDs.
- creator_search: Export a Google Sheet with the author lists of all publications associated with an author id. Requires -creator argument
- people_search: Search across the CaltechPEOPLE collection by division
- file_report: Records that have potential problems with the attached files
- status_report: Reports on any records with an incorrect status in feeds
- recordnumberreport: Reports on records where the record number and resolver URL don't match
- alturlreport: Reports on records with the discontinued alt_url field
- license_report: Report out the license types in CaltechDATA
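The logic behind a report like creator_report can be sketched in a few lines. The example below assumes each record is a dictionary with a record_id and a list of creators carrying an id and an optional orcid; that record layout and the function name creator_orcid_report are assumptions for illustration, not the structure ames actually uses.

```python
from collections import defaultdict


def creator_orcid_report(records: list) -> tuple:
    """Find creators with inconsistent ORCID coverage.

    Returns (incomplete, conflicting):
    - incomplete: creator id -> record ids where a known ORCID is missing
    - conflicting: creator id -> sorted list of ORCIDs when more than one is seen
    """
    orcids = defaultdict(set)    # creator id -> ORCIDs seen on any record
    missing = defaultdict(list)  # creator id -> record ids without an ORCID
    for record in records:
        for creator in record.get("creators", []):
            creator_id = creator.get("id")
            if not creator_id:
                continue
            if creator.get("orcid"):
                orcids[creator_id].add(creator["orcid"])
            else:
                missing[creator_id].append(record.get("record_id"))
    # Only report missing ORCIDs for creators that have one somewhere else.
    incomplete = {cid: ids for cid, ids in missing.items() if cid in orcids}
    conflicting = {cid: sorted(v) for cid, v in orcids.items() if len(v) > 1}
    return incomplete, conflicting


if __name__ == "__main__":
    sample = [
        {"record_id": 1, "creators": [{"id": "Doe-J", "orcid": "0000-0000-0000-0001"}]},
        {"record_id": 2, "creators": [{"id": "Doe-J"}]},
        {"record_id": 3, "creators": [{"id": "Doe-J", "orcid": "0000-0000-0000-0002"}]},
    ]
    print(creator_orcid_report(sample))
```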
Report Usage
Type something like python run_coda_report.py doi_report thesis report.tsv -year 1977-1978
- The first option is the report type
- Next is the repository (thesis or authors)
- Next is the output file name (include .csv or .tsv extension, will show up in current directory)
Report Options
Some reports include a -year option to return just the records from a specific year (1977) or a range (1977-1978)
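A year value in either form can be handled by splitting on the hyphen. The following is a minimal sketch of that parsing, assuming the value is always a four-digit year or two years joined by a hyphen; parse_year_option is a hypothetical helper and not necessarily how the report script does it.

```python
def parse_year_option(value: str) -> tuple:
    """Parse '1977' or '1977-1978' into an inclusive (first, last) year pair."""
    start, sep, end = value.strip().partition("-")
    first = int(start)
    last = int(end) if sep else first
    if last < first:
        raise ValueError(f"Year range is reversed: {value}")
    return first, last


def year_matches(record_year: int, value: str) -> bool:
    """Return True when record_year falls inside the requested year or range."""
    first, last = parse_year_option(value)
    return first <= record_year <= last


if __name__ == "__main__":
    print(parse_year_option("1977"))
    print(parse_year_option("1977-1978"))
    print(year_matches(1978, "1977-1978"))
```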
Some reports include a -group option to return just the records with a specific group name. Surround long names with quotes (e.g. "Keck Institute for Space Studies")
Some reports include a -item option to return just records with a specific item type. Supported types include:
- CaltechDATA item types (Dataset, Software, ...)
- CaltechAUTHORS item types (article, monograph, ...)
- CaltechAUTHORS monograph sub-types
- discussion_paper
- documentation
- manual
- other
- project_report
- report
- technical_report
- white_paper
- working_paper
There are some additional technical arguments if you want to change the default behavior.
- Adding -source eprints will pull report data from Eprints instead of feeds. This is very slow. You may need to add -username and -password to provide login credentials.
- Adding -sample XXX allows you to select a number of randomly selected records. This makes it more reasonable to pull data directly from Eprints.
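Random sampling of this kind can be sketched with the standard library. The example assumes you already have a list of record identifiers; sample_records and its seed parameter are hypothetical names introduced here, and the real option may select records differently.

```python
import random


def sample_records(record_ids, sample_size: int, seed=None) -> list:
    """Pick sample_size record ids at random, without repeats.

    If fewer ids are available than requested, all of them are returned.
    """
    ids = list(record_ids)
    if sample_size >= len(ids):
        return ids
    rng = random.Random(seed)  # a seed makes the selection repeatable
    return rng.sample(ids, sample_size)


if __name__ == "__main__":
    all_ids = range(1000, 1100)
    print(sample_records(all_ids, 5, seed=42))
```

Sampling before querying keeps the number of slow per-record requests small, which is the reason the option pairs well with pulling data directly from Eprints.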
You can combine multiple options to build more complex queries, such as this request for reports from a group:
python run_coda_report.py doi_report authors keck_tech_reports.csv -group "Keck Institute for Space Studies" -item technical_report project_report discussion_paper
python run_coda_report.py people_search people chem.csv -search "Chemistry and Chemical Engineering Division"
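The command-line shape used in these examples (three positional values followed by optional filters) can be modelled with argparse. The parser below is a sketch reconstructed from the options described in this README; it is an assumption about the interface, not the actual contents of run_coda_report.py, and the destination names are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the report options described above."""
    parser = argparse.ArgumentParser(description="Run a repository report (sketch)")
    parser.add_argument("report_name", help="report type, e.g. doi_report")
    parser.add_argument("repository", help="repository, e.g. thesis or authors")
    parser.add_argument("output", help="output file name ending in .csv or .tsv")
    parser.add_argument("-year", help="single year (1977) or range (1977-1978)")
    parser.add_argument("-group", nargs="+", help="group name(s) to filter on")
    parser.add_argument("-item", nargs="+", help="item type(s) to filter on")
    parser.add_argument("-search", help="search term for search-style reports")
    return parser


if __name__ == "__main__":
    # Parse a fixed argument list so the example runs without a real command line.
    args = build_parser().parse_args(
        [
            "doi_report",
            "authors",
            "keck_tech_reports.csv",
            "-group",
            "Keck Institute for Space Studies",
            "-item",
            "technical_report",
            "project_report",
            "discussion_paper",
        ]
    )
    print(args.report_name, args.repository, args.output)
    print(args.group, args.item)
```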
Owner
- Name: Caltech Library
- Login: caltechlibrary
- Kind: organization
- Email: helpdesk@library.caltech.edu
- Location: Pasadena, CA 91125
- Website: https://www.library.caltech.edu/
- Repositories: 84
- Profile: https://github.com/caltechlibrary
We manage the physical and digital holdings of the California Institute of Technology, provide services and training, and develop open-source software.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: ames
authors:
  - family-names: Morrell
    given-names: Thomas E
    orcid: https://orcid.org/0000-0001-9266-5146
  - family-names: Doiel
    given-names: Robert
    orcid: https://orcid.org/0000-0003-0900-6903
  - family-names: Bhattarai
    given-names: Rohan
    orcid: https://orcid.org/0009-0007-0323-4733
  - family-names: Won
    given-names: Elizabeth
    orcid: https://orcid.org/0009-0002-2450-6471
  - family-names: Abakah
    given-names: Alexander
    orcid: https://orcid.org/0009-0003-5640-6691
abstract: Automated Metadata Service: Manage metadata from different sources.
repository-code: "https://github.com/caltechlibrary/ames"
type: software
doi: 10.22002/4d93n-c8q12
version: 1.2.2
license-url: "https://data.caltech.edu/license"
keywords:
  - GitHub
  - metadata
  - software
date-released: 2025-07-30
CodeMeta (codemeta.json)
{
"@context": "https://doi.org/10.5063/schema/codemeta-2.0",
"@type": "SoftwareSourceCode",
"description": "Automated Metadata Service: Manage metadata from different sources.",
"name": "ames",
"codeRepository": "https://github.com/caltechlibrary/ames",
"issueTracker": "https://github.com/caltechlibrary/ames/issues",
"license": "https://data.caltech.edu/license",
"version": "1.2.2",
"author": [
{
"@type": "Person",
"givenName": "Thomas E",
"familyName": "Morrell",
"affiliation": {
"@type": "Organization",
"name": "Caltech Library"
},
"email": "tmorrell@caltech.edu",
"@id": "https://orcid.org/0000-0001-9266-5146"
},
{
"@type": "Person",
"givenName": "Robert",
"familyName": "Doiel",
"affiliation": {
"@type": "Organization",
"name": "Caltech Library"
},
"email": "rsdoiel@caltech.edu",
"@id": "https://orcid.org/0000-0003-0900-6903"
},
{
"@type": "Person",
"givenName": "Rohan",
"familyName": "Bhattarai",
"affiliation": {
"@type": "Organization",
"name": "Caltech"
},
"email": "rbhattar@caltech.edu",
"@id": "https://orcid.org/0009-0007-0323-4733"
},
{
"@type": "Person",
"givenName": "Elizabeth",
"familyName": "Won",
"affiliation": {
"@type": "Organization",
"name": "Caltech"
},
"@id": "https://orcid.org/0009-0002-2450-6471"
},
{
"@type": "Person",
"givenName": "Alexander",
"familyName": "Abakah",
"affiliation": {
"@type": "Organization",
"name": "Caltech"
},
"@id": "https://orcid.org/0009-0003-5640-6691"
}
],
"developmentStatus": "active",
"downloadUrl": "https://github.com/caltechlibrary/ames/archive/main.zip",
"keywords": [
"GitHub",
"metadata",
"software"
],
"programmingLanguage": "Python",
"maintainer": [
{
"@id": "https://orcid.org/0000-0001-9266-5146",
"@type": "Person",
"affiliation": {
"@type": "Organization",
"name": "Caltech Library"
},
"familyName": "Morrell",
"givenName": "Thomas E."
}
],
"identifier": "10.22002/4d93n-c8q12",
"funding": {
"@type": "Grant",
"identifier": "2322420",
"name": "CC* Data Storage: Closing Caltech's data storage gap: from ad-hoc to well-managed stewardship of large-scale datasets",
"funder": {
"@id": "https://doi.org/10.13039/100000001",
"@type": "Organization",
"name": "National Science Foundation"
}
}
}
GitHub Events
Total
- Create event: 2
- Release event: 2
- Issues event: 5
- Watch event: 1
- Issue comment event: 4
- Push event: 32
- Pull request review comment event: 10
- Pull request review event: 11
- Pull request event: 6
- Fork event: 2
Last Year
- Create event: 2
- Release event: 2
- Issues event: 5
- Watch event: 1
- Issue comment event: 4
- Push event: 32
- Pull request review comment event: 10
- Pull request review event: 11
- Pull request event: 6
- Fork event: 2
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 346
- Total Committers: 4
- Avg Commits per committer: 86.5
- Development Distribution Score (DDS): 0.145
Top Committers
| Name | Email | Commits |
|---|---|---|
| Tom Morrell | t****l@c****u | 296 |
| Thomas Morrell | t****l@u****m | 36 |
| R. S. Doiel | r****l@g****m | 12 |
| Katrin Leinweber | 9****r@u****m | 2 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 18
- Total pull requests: 12
- Average time to close issues: 3 months
- Average time to close pull requests: 10 days
- Total issue authors: 3
- Total pull request authors: 6
- Average comments per issue: 1.0
- Average comments per pull request: 0.58
- Merged pull requests: 9
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 4
- Pull requests: 7
- Average time to close issues: about 1 month
- Average time to close pull requests: 13 days
- Issue authors: 1
- Pull request authors: 3
- Average comments per issue: 0.25
- Average comments per pull request: 0.43
- Merged pull requests: 6
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- tmorrell (16)
- rsdoiel (1)
- WardLT (1)
Pull Request Authors
- RohanBhattaraiNP (3)
- rsdoiel (3)
- elizabethjhwon (2)
- AbakahAlexander (2)
- katrinleinweber (1)
- tmorrell (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 667 last month
- Total docker downloads: 81
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 25
- Total maintainers: 1
pypi.org: ames
Automated Metadata Service: Manage metadata from different sources.
- Homepage: https://github.com/caltechlibrary/ames
- Documentation: https://ames.readthedocs.io/
- License: https://data.caltech.edu/license
- Latest release: 1.2.2 (published 7 months ago)