csvanalyser
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (15.9%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
- Host: GitHub
- Owner: yzhu27
- License: mit
- Language: Python
- Default Branch: main
- Size: 146 KB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 2
- Open Issues: 0
- Releases: 0
Created almost 4 years ago · Last pushed over 3 years ago
Metadata Files
Readme
Contributing
License
Code of conduct
Citation
README.md
CSV-Analyser
Welcome to Group 7's repository for Fall 2022 Software Engineering homework 2 & 3!
This project reads and analyzes CSV files. Based on example source code written in Lua, we implemented the functions listed below in Python. To support these functions, we defined 5 classes with specific methods, described below.
Installation
git clone https://github.com/yzhu27/CSVAnalyser.git
cd ./CSVAnalyser
python ./main.py -e ALL
Note: run main.py directly from the repository root directory.
Functions
Read CSV
- Import the input file into a dictionary line by line, splitting each line on the given separator.
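As a rough illustration of the reading step described above, the following sketch loads delimited text into a dictionary keyed by line index. The function name `read_csv`, the type coercion of cells, and the sample column names are assumptions made for this example and are not taken from the repository.

```python
import csv
import io


def read_csv(stream, sep=","):
    """Read delimited text into a dict keyed by 0-based line index.

    Each value is the list of cells on that line. Cells that look
    numeric are converted to int or float (an assumption of this sketch).
    """
    def coerce(cell):
        cell = cell.strip()
        for cast in (int, float):
            try:
                return cast(cell)
            except ValueError:
                pass
        return cell  # leave non-numeric cells as strings

    rows = {}
    for index, cells in enumerate(csv.reader(stream, delimiter=sep)):
        if cells:  # skip blank lines
            rows[index] = [coerce(c) for c in cells]
    return rows


# Hypothetical two-column input, used only to exercise the function.
sample = io.StringIO("Mpg,origin\n18,1\n24.5,3\n")
table = read_csv(sample)
print(table[0])  # header row
print(table[1])  # first data row
```

Keying by line index keeps the header at key 0 and the data rows after it, which is one simple way to satisfy "a dictionary line by line"; the project itself may organize the dictionary differently.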
CLI
- Update option values from the command line. Running with "-h" prints the help string.
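One possible way to update options from command-line arguments is sketched below. Only the `-e` and `-h` flags appear in this README; the helper name `update_options`, the default values, and the help text are hypothetical and stand in for whatever main.py actually defines.

```python
def update_options(options, argv):
    """Return a copy of `options` updated from flag/value pairs in argv.

    `options` maps a flag name (without the dash) to its current value.
    Boolean options are toggled by the flag alone; any other option
    consumes the next token and converts it to the type of its default.
    """
    updated = dict(options)
    i = 0
    while i < len(argv):
        token = argv[i]
        flag = token.lstrip("-")
        if token.startswith("-") and flag in updated:
            current = updated[flag]
            if isinstance(current, bool):
                updated[flag] = not current
            elif i + 1 < len(argv):
                i += 1
                updated[flag] = type(current)(argv[i])
        i += 1
    return updated


HELP = "usage: python main.py [-e NAME] [-h]"  # hypothetical help string
defaults = {"e": "nothing", "h": False}        # hypothetical default values
opts = update_options(defaults, ["-e", "ALL", "-h"])
if opts["h"]:
    print(HELP)
```

Returning a copy rather than mutating the defaults makes it easy to reset options between test runs.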
Generate Statistical Summaries
- This function operates on column data. Each column is either numeric (denoted by a leading uppercase letter in its name) or symbolic (denoted by a leading lowercase letter). Different statistical measures are used to describe each type of data.
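The naming convention above can be checked with a one-line test on the first character of a column name. This is a minimal sketch; the function name `is_numeric_column` and the sample header are illustrative assumptions.

```python
def is_numeric_column(name):
    """Return True when the column name starts with an uppercase letter."""
    return name[:1].isupper()


# Hypothetical header row, used only to demonstrate the convention.
header = ["Mpg", "Cylinders", "origin"]
kinds = {
    name: "numeric" if is_numeric_column(name) else "symbolic"
    for name in header
}
print(kinds)
```

Slicing with `name[:1]` instead of indexing with `name[0]` keeps the check safe for an empty column name, which is then treated as symbolic.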
Classes
Cols
- Record column names and variables, differentiating dependent variables and independent variables by leading letters of column names.
Rows
- Record data by row.
Num
- The Num class computes features of numeric data. It provides the methods add, mid, and div: mid is the middle value of the sorted data (the median), and div is the standard deviation of the column.
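A minimal sketch of a class with this interface follows. It stores every value and uses the sample standard deviation from Python's `statistics` module; both choices are assumptions of this example, and the repository's Num class may compute these quantities differently.

```python
import statistics


class Num:
    """Summarize a column of numbers (illustrative sketch only)."""

    def __init__(self):
        self.values = []

    def add(self, x):
        """Record one numeric value."""
        self.values.append(float(x))

    def mid(self):
        """Central tendency: the median of the recorded values."""
        return statistics.median(self.values)

    def div(self):
        """Spread: the sample standard deviation (0.0 for fewer than 2 values)."""
        if len(self.values) < 2:
            return 0.0
        return statistics.stdev(self.values)


num = Num()
for x in [1, 2, 3, 4, 5]:
    num.add(x)
print(num.mid(), num.div())
```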
Sym
- The Sym class computes features of symbolic data. It provides the methods add, mid, and div: mid is the most common symbol in the set (the mode), and div is the entropy of the symbols.
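The same interface for symbolic data can be sketched with a counter of symbol frequencies. The use of `collections.Counter` and of base-2 logarithms for the entropy are assumptions of this example rather than details confirmed by the repository.

```python
import math
from collections import Counter


class Sym:
    """Summarize a column of symbols (illustrative sketch only)."""

    def __init__(self):
        self.counts = Counter()

    def add(self, x):
        """Record one symbol."""
        self.counts[x] += 1

    def mid(self):
        """Central tendency: the most common symbol, or None if empty."""
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def div(self):
        """Spread: Shannon entropy of the symbol frequencies, in bits."""
        total = sum(self.counts.values())
        return -sum(
            (n / total) * math.log2(n / total) for n in self.counts.values()
        )


sym = Sym()
for s in ["a", "a", "b", "c"]:
    sym.add(s)
print(sym.mid(), sym.div())
```

With frequencies 1/2, 1/4, and 1/4, the entropy is 0.5·1 + 0.25·2 + 0.25·2 = 1.5 bits, which is a quick hand check for the formula.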
Data
Test
The test cases come from the Lua source code and https://github.com/yzhu27/CSVAnalyser/blob/main/data/auto93.csv. Test coverage is:
Owner
- Name: Yang
- Login: yzhu27
- Kind: user
- Location: Raleigh, NC
- Company: NCSU
- Repositories: 8
- Profile: https://github.com/yzhu27
NCSU MCS graduate student, seeking a 2024 NG SDE position.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Zhu"
    given-names: "Yuheng"
    orcid: "https://orcid.org/0000-0002-5976-6954"
  - family-names: "Zhu"
    given-names: "Yiran"
    orcid: "https://orcid.org/0000-0002-6260-5254"
  - family-names: "Wang"
    given-names: "Mengzhe"
    orcid: "https://orcid.org/0000-0003-3684-7150"
  - family-names: "Wang"
    given-names: "Pinxiang"
    orcid: "https://orcid.org/0000-0000-0000-0000"
  - family-names: "Huang"
    given-names: "Jiayuan"
    orcid: "https://orcid.org/0000-0002-8597-4861"
title: "CSVAnalyser"
version: 1.0.0
doi: 10.5281/zenodo.7093577
date-released: 2022-09-19
url: "https://github.com/yzhu27/CSVAnalyser"
Dependencies
.github/workflows/Update-Coverage-on-Readme.yml
actions
- MishaKav/pytest-coverage-comment v1.1.29 composite
- actions-js/push master composite
- actions/checkout v3 composite
.github/workflows/pytest-coverage-comment.yml
actions
- MishaKav/pytest-coverage-comment v1.1.29 composite
- actions/checkout v3 composite
- actions/setup-python v2 composite