https://github.com/anand-kamble/data-filtering

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found codemeta.json file)
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (12.9%) to scientific vocabulary

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: anand-kamble
Language: Python
Default Branch: main
Size: 346 KB

Statistics

Stars: 0
Watchers: 2
Forks: 0
Open Issues: 2
Releases: 0

Created about 2 years ago · Last pushed about 2 years ago

Metadata Files

Readme Changelog

Use of LLM to Improve Plane Maintenance

Data Processor

This project uses a Large Language Model (LLM) to improve the efficiency and accuracy of plane maintenance by processing and analyzing relevant data.

Parameters

config (data_config or None): The configuration object containing parameters for data processing. If None, a CopaError will be raised.
base_path (str, optional): The base path where the data is located relative to the run.sh script. Defaults to an empty string, which means the data is located in the current directory.
test_mode (bool, optional): If True, the DataProcessor will run in test mode. Defaults to False.
test_rows (int, optional): The number of rows to process in test mode. Defaults to 100.
drop_duplicates (bool, optional): If True, duplicate rows will be dropped during data processing. Defaults to False.
no_cache (bool, optional): If True, caching will be disabled during data processing. Defaults to False.
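
The parameters above can be illustrated with a minimal sketch. It is not the project's actual DataProcessor: `ProcessorOptions` and `select_rows` are hypothetical names, `CopaError` is redefined locally as a stand-in, a plain dict stands in for the `data_config` type, and applying deduplication before the test-mode row limit is an assumption. Only the parameter names, their defaults, and the error on a missing config come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


class CopaError(Exception):
    """Stand-in for the project's error type; the real class may differ."""


@dataclass
class ProcessorOptions:
    """Hypothetical container mirroring the documented parameters."""

    config: Optional[dict]         # the README names this type data_config
    base_path: str = ""            # data location relative to run.sh
    test_mode: bool = False
    test_rows: int = 100
    drop_duplicates: bool = False
    no_cache: bool = False         # recorded only; caching is not modeled here

    def __post_init__(self) -> None:
        # The README states that a missing config raises CopaError.
        if self.config is None:
            raise CopaError("a configuration object is required")


def select_rows(rows: list[tuple], opts: ProcessorOptions) -> list[tuple]:
    """Apply the drop_duplicates and test_mode options to in-memory rows."""
    if opts.drop_duplicates:
        # dict.fromkeys keeps the first occurrence and preserves order.
        rows = list(dict.fromkeys(rows))
    if opts.test_mode:
        # Assumed behavior: test mode limits processing to the first test_rows rows.
        rows = rows[: opts.test_rows]
    return rows


opts = ProcessorOptions(
    config={"source": "example.csv"},  # hypothetical config content
    test_mode=True,
    test_rows=2,
    drop_duplicates=True,
)
sample = [("a", 1), ("a", 1), ("b", 2), ("c", 3)]
print(select_rows(sample, opts))
```

With these options the duplicate row is dropped first and the result is then limited to two rows; the real processor may order these steps differently.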

Installation

To set up the Python virtual environment for this project, you'll need Poetry. Poetry is a tool for dependency management and packaging in Python. Follow the steps below to install Poetry and run the project:

Step 1: Install Poetry

Installation instructions for Poetry are available in the official Poetry documentation.

Step 2: Set Up the Environment

Once Poetry is installed, follow these steps to generate the filtered data:

Install Dependencies:
    poetry install
Activate the Virtual Environment:
    poetry shell
Install Additional Tools (First-Time Setup Only): Required only for development. You can skip this step if you do not plan to contribute to the repository.
    pip install -U commitizen
Run the Data Processor:
    ./run.sh
Alternatively, if you have exited the Poetry shell, you can run:
    poetry run ./run.sh

Default Execution

If no arguments are provided, the script will use the default parameters specified in the script.

Custom Execution

You can provide custom parameters when running the script. For example:
    ./run.sh --config "custom_config.json" --test_mode --test_rows 500

This flexibility allows you to tailor the data processing to your specific needs.
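
One way such flags could map onto the documented parameters is sketched below with Python's standard `argparse` module. This is an assumption about how the script might parse its arguments, not the project's code: only `--config`, `--test_mode`, and `--test_rows` appear in the example command, while `--base_path`, `--drop_duplicates`, and `--no_cache` are hypothetical flags named after the documented parameters.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical CLI mirroring the documented parameters."""
    parser = argparse.ArgumentParser(description="Data processor (illustrative sketch)")
    # Flags shown in the README example.
    parser.add_argument("--config", default=None, help="path to a configuration file")
    parser.add_argument("--test_mode", action="store_true", help="process a limited number of rows")
    parser.add_argument("--test_rows", type=int, default=100, help="rows to process in test mode")
    # Assumed flags, named after the remaining documented parameters.
    parser.add_argument("--base_path", default="", help="data location relative to run.sh")
    parser.add_argument("--drop_duplicates", action="store_true", help="drop duplicate rows")
    parser.add_argument("--no_cache", action="store_true", help="disable caching")
    return parser


# Parse the arguments from the README example instead of reading sys.argv.
args = build_parser().parse_args(
    ["--config", "custom_config.json", "--test_mode", "--test_rows", "500"]
)
print(vars(args))
```

Boolean options use `store_true`, so omitting a flag leaves it at the documented default of False.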

Owner

Name: Anand Kamble
Login: anand-kamble
Kind: user
Location: Tallahassee, FL
Company: Florida State University

Website: https://anand-kamble.github.io/
Repositories: 24
Profile: https://github.com/anand-kamble

Graduate student in FSU Scientific Computing

GitHub Events

Total

Public event: 1

Last Year

Public event: 1

Dependencies

pyproject.toml pypi

jupyter ^1.0.0 develop
dfply ^0.3.3
fletcher ^0.7.2
halo ^0.0.31
modin ^0.29.0
pandas ^2.2.2
python >=3.11, < 3.13
static-frame ^2.6.0
tidypandas ^0.3.0
ydata-profiling ^4.7.0
