time-series-forcasting-benchmark-dataset-preprocessing
Benchmark Datasets for Time Series Forecasting Preprocessing - NASA HTTP Dataset, WorldCup98 Dataset
https://github.com/pasanbhanu/time-series-forcasting-benchmark-dataset-preprocessing
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (6.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: PasanBhanu
- License: mit
- Language: Jupyter Notebook
- Default Branch: main
- Homepage: https://ita.ee.lbl.gov/html/traces.html
- Size: 2.95 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Data Preprocessor for Time Series Forecasting
This is a data preprocessing algorithm for widely used data sets provided by "The Internet Traffic Archive".
The supported datasets are:
- WorldCup98 Dataset: 1,352,804,107 web requests recorded at servers for the 1998 World Cup.
- NASA HTTP Logs Dataset: 3,461,612 HTTP logs from a busy WWW server for two months.
This algorithm processes both datasets and creates a CSV file for time series analysis. The CSV file format is given below.
| minute | count |
|--------|-------|
| 1995-07-01 00:00:00 | 42 |
| 1995-07-01 00:01:00 | 61 |
| 1995-07-01 00:02:00 | 57 |
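The per-minute aggregation behind this format can be sketched as follows. This is a minimal illustration, not the repository's actual notebook code: it assumes the NASA HTTP logs use the Common Log Format with a bracketed timestamp, and the function name `count_requests_per_minute` and the sample lines (including the `example.com` hosts) are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: each log line carries a Common Log Format timestamp,
# e.g. [01/Jul/1995:00:00:01 -0400]. The UTC offset is matched but ignored.
TIMESTAMP_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]")


def count_requests_per_minute(lines):
    """Return a sorted list of (minute, request_count) tuples."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP_RE.search(line)
        if match is None:
            continue  # skip lines without a parsable timestamp
        # %b parses English month abbreviations under the default C locale.
        ts = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S")
        counts[ts.replace(second=0)] += 1  # truncate to the minute
    return sorted(counts.items())


# Made-up sample lines, for illustration only.
sample_lines = [
    'host1.example.com - - [01/Jul/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 1024',
    'host2.example.com - - [01/Jul/1995:00:00:09 -0400] "GET /a.gif HTTP/1.0" 200 512',
    'host1.example.com - - [01/Jul/1995:00:01:15 -0400] "GET /b.gif HTTP/1.0" 304 0',
    "a malformed line without a timestamp",
]

rows = count_requests_per_minute(sample_lines)
for minute, count in rows:
    print(minute.strftime("%Y-%m-%d %H:%M:%S"), count)
```

Note that this sketch only emits minutes that contain at least one request; whether the repository fills empty minutes with zero is not stated in the README.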
Features of Algorithm
- Automatic FTP download of the WorldCup98 dataset
- Cross-check of the WorldCup98 record count against the original files
- Visualization of the processed data
- Time-series-ready CSV output
- Reduced dataset size for easier processing
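One plausible way to check a record count, sketched under assumptions: sum the `count` column of the generated CSV and compare it with the number of records in the raw files. The README does not describe how the repository performs this check, so the helper names `total_from_csv` and `check_record_count` are hypothetical, as is the comma-separated `minute,count` header.

```python
import csv
import io


def total_from_csv(csv_text):
    """Sum the 'count' column of a minute,count CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(int(row["count"]) for row in reader)


def check_record_count(csv_text, expected_total):
    """Raise ValueError if the aggregated total differs from the raw record count."""
    actual = total_from_csv(csv_text)
    if actual != expected_total:
        raise ValueError(
            f"record count mismatch: CSV total {actual}, expected {expected_total}"
        )
    return actual


# Rows taken from the format example in the README: 42 + 61 + 57 requests.
sample_csv = (
    "minute,count\n"
    "1995-07-01 00:00:00,42\n"
    "1995-07-01 00:01:00,61\n"
    "1995-07-01 00:02:00,57\n"
)

verified_total = check_record_count(sample_csv, expected_total=160)
print("verified total:", verified_total)
```

A mismatch would suggest that some raw lines were dropped or double-counted during aggregation.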
Preprocessed Files
If you are interested in the preprocessed files, check the processeddata folder for the CSV files.
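A preprocessed CSV might be loaded for forecasting along these lines. This is a sketch using only the standard library; it assumes a comma-separated file with a `minute,count` header matching the format shown earlier, and the helper names `load_minute_series` and `fill_missing_minutes` are hypothetical. An in-memory sample stands in for a real file from the processeddata folder, whose file names are not listed in the README.

```python
import csv
import io
from datetime import datetime, timedelta


def load_minute_series(fileobj):
    """Read (minute, count) pairs from an open minute,count CSV file."""
    reader = csv.DictReader(fileobj)
    return [
        (datetime.strptime(row["minute"], "%Y-%m-%d %H:%M:%S"), int(row["count"]))
        for row in reader
    ]


def fill_missing_minutes(series):
    """Insert zero counts for minutes absent between the first and last timestamps."""
    if not series:
        return []
    counts = dict(series)
    current, end = min(counts), max(counts)
    filled = []
    while current <= end:
        filled.append((current, counts.get(current, 0)))
        current += timedelta(minutes=1)
    return filled


# In-memory stand-in for a preprocessed file; minute 00:01 is deliberately missing.
sample_file = io.StringIO(
    "minute,count\n"
    "1995-07-01 00:00:00,42\n"
    "1995-07-01 00:02:00,57\n"
)

series = fill_missing_minutes(load_minute_series(sample_file))
for minute, count in series:
    print(minute, count)
```

Filling gaps with zeros gives a regularly spaced series, which many forecasting models expect; skip that step if the files already contain every minute.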
Owner
- Name: Pasan Bhanu Guruge
- Login: PasanBhanu
- Kind: user
- Location: Sri Lanka
- Company: @Azbow
- Website: pasanbhanu.me
- Repositories: 7
- Profile: https://github.com/PasanBhanu
Tech Lead 🧑💻 | AI/ML Enthusiast 🤖 | K8 ☸️
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this code in your research or software, please cite it as below."
authors:
  - family-names: "Guruge"
    given-names: "Pasan Bhanu"
    orcid: "https://orcid.org/0009-0008-2481-673X"
  - family-names: "Priyadarshana"
    given-names: "Y H P P"
    orcid: "https://orcid.org/0000-0002-4319-3944"
title: "Time Series Forecasting Benchmark Dataset - NASA HTTP, WorldCup98"
version: 1.0.0
date-released: 2025-02-19
url: "https://github.com/PasanBhanu/time-series-forcasting-benchmark-dataset-preprocessing"
preferred-citation:
  type: article
  authors:
    - family-names: "Guruge"
      given-names: "Pasan Bhanu"
      orcid: "https://orcid.org/0009-0008-2481-673X"
    - family-names: "Priyadarshana"
      given-names: "Y H P P"
      orcid: "https://orcid.org/0000-0002-4319-3944"
  doi: "10.3389/fcomp.2025.1509165"
  journal: "Frontiers in Computer Science"
  title: "Time series forecasting-based Kubernetes autoscaling using Facebook Prophet and Long Short-Term Memory"
  volume: 7
  year: 2025
GitHub Events
Total
- Push event: 2
Last Year
- Push event: 2
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Pasan Bhanu Guruge | p****e@h****m | 11 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0