https://github.com/appeler/clean-names
Deduplicate and parse a list of 'dirty names'
Statistics
- Stars: 20
- Watchers: 2
- Forks: 3
- Open Issues: 0
- Releases: 0
ReadMe.md
Clean Names
The script takes a CSV file with a column 'Name' containing 'dirty names': names written in a variety of formats, such as lastname firstname, firstname lastname, middlename lastname firstname, etc. (see sample input file). It produces a CSV file that contains all the columns of the original file plus the following columns: 'uniqid', 'FirstName', 'MiddleInitial/Name', 'LastName', 'RomanNumeral', 'Title', 'Suffix'. By default, the script removes duplicate names (see sample output file).
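To make the input and output shape concrete, here is a minimal standard-library sketch of the same idea: read rows with a 'Name' column, split each name into parts, drop duplicates, and append the new columns. It is not the script's actual logic. The helpers `split_name` and `clean_rows`, the tiny Roman numeral set, the sequential `uniqid`, and the sample rows are all hypothetical, and 'Title' and 'Suffix' are left blank here.

```python
import csv
import io
import sys

# Illustrative subset only; the real script may recognise more numerals.
ROMAN_NUMERALS = {"II", "III", "IV"}


def split_name(raw):
    """Naively split a dirty name into the output columns (hypothetical helper)."""
    text = " ".join(raw.replace(".", " ").split())  # drop periods, collapse spaces
    if "," in text:
        # "Last, First Middle" form: move the last name to the end.
        last, _, rest = text.partition(",")
        tokens = rest.split() + last.split()
    else:
        tokens = text.split()
    roman = ""
    if tokens and tokens[-1].upper() in ROMAN_NUMERALS:
        roman = tokens.pop().upper()
    return {
        "FirstName": tokens[0] if tokens else "",
        "MiddleInitial/Name": " ".join(tokens[1:-1]),
        "LastName": tokens[-1] if len(tokens) > 1 else "",
        "RomanNumeral": roman,
        "Title": "",   # not handled in this sketch
        "Suffix": "",  # not handled in this sketch
    }


def clean_rows(rows, column="Name", keep_all=False):
    """Append parsed columns to each row; skip duplicate names unless keep_all."""
    seen = set()
    out = []
    for row in rows:
        parsed = split_name(row[column])
        key = tuple(value.lower() for value in parsed.values())
        if not keep_all and key in seen:
            continue
        seen.add(key)
        # uniqid is assumed to be a simple running counter.
        out.append({**row, "uniqid": len(out) + 1, **parsed})
    return out


sample = io.StringIO(
    'Name,State\n'
    '"Smith, John Q.",NY\n'
    'John Q Smith,NY\n'
    '"Doe III, Jane",CA\n'
)
rows = list(csv.DictReader(sample))
cleaned = clean_rows(rows)

writer = csv.DictWriter(sys.stdout, fieldnames=list(cleaned[0].keys()))
writer.writeheader()
writer.writerows(cleaned)
```

In this sketch the first two sample rows parse to the same parts, so only one of them is kept unless `keep_all=True` is passed.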
Application
The script was used to fix names in the CF-Scores from the Database on Ideology, Money in Politics, and Elections. The processed database with clean names is posted on Harvard DVN.
Installation
- Clone this repository:
  git clone https://github.com/soodoku/clean-names.git
- Navigate to clean-names
- Run:
  python setup.py install
Using Clean Names
Usage: process_names.py [options]
Command Line Options
-h, --help                   show this help message and exit
-o OUTFILE, --out=OUTFILE    Output file in CSV (default: sample_output.csv)
-c COLUMN, --column=COLUMN   Column name in CSV that contains Names (default: Name)
-a, --all                    Export all names (do not take duplicate names out) (default: False)
Example
python process_names.py -a sample_input.csv
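The options above have the shape that Python's standard `optparse` module produces. Whether process_names.py actually uses `optparse` is an assumption; the sketch below only shows how the documented flags and defaults could be declared, and how the example command line would be parsed. The function name `build_parser` and the `dest` names are hypothetical.

```python
from optparse import OptionParser


def build_parser():
    """Declare the documented options (hypothetical reconstruction)."""
    parser = OptionParser(usage="%prog [options]", prog="process_names.py")
    parser.add_option(
        "-o", "--out", dest="outfile", default="sample_output.csv",
        help="Output file in CSV (default: %default)",
    )
    parser.add_option(
        "-c", "--column", dest="column", default="Name",
        help="Column name in CSV that contains Names (default: %default)",
    )
    parser.add_option(
        "-a", "--all", dest="all", action="store_true", default=False,
        help="Export all names (do not take duplicate names out) "
             "(default: %default)",
    )
    return parser


# Parse the arguments from the example command line.
options, args = build_parser().parse_args(["-a", "sample_input.csv"])
print(options.all, options.column, options.outfile, args)
```

With these declarations, `-a` switches off duplicate removal, the input file arrives as a positional argument, and the column and output file fall back to their documented defaults.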
License
The scripts are released under the MIT License.
Owner
- Name: appeler
- Login: appeler
- Kind: organization
- Website: https://appeler.github.io/
- Repositories: 24
- Profile: https://github.com/appeler
Making sense of names.
GitHub Events
Total
- Watch event: 4
Last Year
- Watch event: 4
Dependencies
- nameparser ==0.3.10
- nameparser *