https://github.com/anand-kamble/data-filtering
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: anand-kamble
- Language: Python
- Default Branch: main
- Size: 346 KB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 2
- Releases: 0
Metadata Files
Readme.md
Use of LLM to Improve Plane Maintenance
Data Processor
This project uses a Large Language Model (LLM) to improve the efficiency and accuracy of plane maintenance by processing and analyzing maintenance-related data.
Parameters
- config (data_config or None): The configuration object containing parameters for data processing. If None, a CopaError will be raised.
- base_path (str, optional): The base path where the data is located, relative to the run.sh script. Defaults to an empty string, which means the data is located in the current directory.
- test_mode (bool, optional): If True, the DataProcessor will run in test mode. Defaults to False.
- test_rows (int, optional): The number of rows to process in test mode. Defaults to 100.
- drop_duplicates (bool, optional): If True, duplicate rows will be dropped during data processing. Defaults to False.
- no_cache (bool, optional): If True, caching will be disabled during data processing. Defaults to False.
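The parameter list above can be summarized in a small sketch. This is not the project's DataProcessor: ProcessorOptions and select_rows are hypothetical names introduced here for illustration, CopaError is a local stand-in for the project's exception, and the order in which test-mode truncation and duplicate removal are applied is an assumption. Only the parameter names, defaults, and the "None raises CopaError" rule come from the README.

```python
from dataclasses import dataclass
from typing import Any, Optional


class CopaError(Exception):
    """Local stand-in for the project's CopaError (the real class lives in the repository)."""


@dataclass
class ProcessorOptions:
    """Hypothetical container mirroring the documented parameters and their defaults."""

    config: Optional[Any]
    base_path: str = ""
    test_mode: bool = False
    test_rows: int = 100
    drop_duplicates: bool = False
    no_cache: bool = False

    def __post_init__(self) -> None:
        # The README states that passing None as the config raises CopaError.
        if self.config is None:
            raise CopaError("config must not be None")


def select_rows(rows: list[tuple], options: ProcessorOptions) -> list[tuple]:
    """Toy illustration of test_mode/test_rows and drop_duplicates on plain tuples."""
    if options.test_mode:
        # Assumption: test mode keeps only the first test_rows rows.
        rows = rows[: options.test_rows]
    if options.drop_duplicates:
        # dict.fromkeys keeps the first occurrence of each row and preserves order.
        rows = list(dict.fromkeys(rows))
    return rows
```

With these assumptions, constructing ProcessorOptions(config=None) raises CopaError, while ProcessorOptions(config={"name": "demo"}) picks up the documented defaults such as test_rows equal to 100.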
Installation
To set up the Python virtual environment for this project, you'll need Poetry. Poetry is a tool for dependency management and packaging in Python. Follow the steps below to install Poetry and run the project:
Step 1: Install Poetry
You can find instructions for installing Poetry in the official Poetry documentation.
Step 2: Set Up the Environment
Once Poetry is installed, follow these steps to generate the filtered data:
Install Dependencies:
    poetry install
Activate the Virtual Environment:
    poetry shell
Install Additional Tools (First-Time Setup Only): Required only for development. You can skip this step if you do not plan to contribute to the repository.
    pip install -U commitizen
Run the Data Processor:
    ./run.sh
Alternatively, if you have exited the Poetry shell, you can run:
    poetry run ./run.sh
Default Execution
If no arguments are provided, run.sh uses its built-in default parameters.
Custom Execution
You can provide custom parameters when running the script. For example:
    ./run.sh --config "custom_config.json" --test_mode --test_rows 500
This flexibility allows you to tailor the data processing to your specific needs.
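As a rough sketch of how flags like those in the custom invocation might be parsed on the Python side, the snippet below uses the standard argparse module. It is an assumption, not the project's actual entry point: build_parser is a hypothetical helper, only --config, --test_mode, and --test_rows appear in the README's example, and the remaining flags are guessed by analogy with the documented parameter names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical command-line front end mirroring the documented parameters."""
    parser = argparse.ArgumentParser(description="Sketch of a data processor CLI")
    # Flags shown in the README's custom execution example.
    parser.add_argument("--config", default=None, help="Path to a configuration file")
    parser.add_argument("--test_mode", action="store_true", help="Run in test mode")
    parser.add_argument("--test_rows", type=int, default=100, help="Rows to process in test mode")
    # Flags assumed by analogy with the parameter list; not confirmed by the README.
    parser.add_argument("--base_path", default="", help="Base path of the data")
    parser.add_argument("--drop_duplicates", action="store_true", help="Drop duplicate rows")
    parser.add_argument("--no_cache", action="store_true", help="Disable caching")
    return parser
```

Calling build_parser().parse_args(["--config", "custom_config.json", "--test_mode", "--test_rows", "500"]) yields a namespace with config set to "custom_config.json", test_mode set to True, and test_rows set to 500, while the unspecified options keep their defaults.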
Owner
- Name: Anand Kamble
- Login: anand-kamble
- Kind: user
- Location: Tallahassee, FL
- Company: Florida State University
- Website: https://anand-kamble.github.io/
- Repositories: 24
- Profile: https://github.com/anand-kamble
Graduate student in FSU Scientific Computing
GitHub Events
Total
- Public event: 1
Last Year
- Public event: 1
Dependencies
- jupyter ^1.0.0 develop
- dfply ^0.3.3
- fletcher ^0.7.2
- halo ^0.0.31
- modin ^0.29.0
- pandas ^2.2.2
- python >=3.11, < 3.13
- static-frame ^2.6.0
- tidypandas ^0.3.0
- ydata-profiling ^4.7.0