signalslite

A small library to efficiently store and process global equity data, especially for Numerai's Signals tournament (WIP)

https://github.com/parmarsuraj99/signalslite

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.5%) to scientific vocabulary

Keywords

data-engineering finance quant stock
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: parmarsuraj99
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: master
  • Homepage:
  • Size: 2.19 MB
Statistics
  • Stars: 19
  • Watchers: 2
  • Forks: 4
  • Open Issues: 1
  • Releases: 4
Created over 2 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation

README.md

pip install signalslite

Why:

  • I wanted a pipeline that generates features quickly, so I can add, remove, and build features whenever needed. It should be able to rebuild everything from scratch in a couple of hours. A relational DB would add setup work, so I decided to store the data as Parquet files split by day. This is fast and needs no additional setup.

  • Least friction to get started. It should run effortlessly on consumer-grade laptops. The whole pipeline should also be easy to automate in the cloud, so it makes sense to keep it "lite": use parallelization where possible and allow free data sources. It can use CUDA if available, but also runs on CPU.

  • It should be able to run in the default Colab runtime. One way to set up a pipeline is to save all data to a mounted drive with more storage.

  • Under 1000 LoC, if possible. The goal is not to build the best pipeline, but a thin wrapper around flexible code that new users can easily understand and modify as needed.
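
The daily Parquet layout described above can be sketched roughly as follows. This is only an illustration, not signalslite's actual code: the helper names (`daily_path`, `split_by_day`, `write_daily`), the `date` column, and the `YYYY-MM-DD.parquet` file naming are all assumptions made for the example. Writing the files needs a Parquet engine such as pyarrow.

```python
from pathlib import Path

import pandas as pd


def daily_path(root, day) -> Path:
    """Hypothetical naming scheme: one file per trading day, <root>/YYYY-MM-DD.parquet."""
    return Path(root) / (pd.Timestamp(day).strftime("%Y-%m-%d") + ".parquet")


def split_by_day(df: pd.DataFrame) -> dict:
    """Split a long (date, ticker, ...) frame into one frame per calendar day."""
    days = df["date"].dt.normalize()  # drop any intraday time component
    return {
        pd.Timestamp(day): frame.reset_index(drop=True)
        for day, frame in df.groupby(days)
    }


def write_daily(df: pd.DataFrame, root) -> list:
    """Write (or overwrite) one Parquet file per day and return the paths written."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    written = []
    for day, frame in split_by_day(df).items():
        path = daily_path(root, day)
        frame.to_parquet(path, index=False)  # requires pyarrow or fastparquet
        written.append(path)
    return written
```

With a layout like this, a daily update only has to write the files for the new days, and later stages can read a date range by listing file names, with no database to set up.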

Stages:

  1. Daily data collection/update:
    • Yahoo/EODHD (Thanks to https://github.com/degerhan/dsignals)
    • Save in daily parquet files
    • Update daily parquet files
    • Colab seems to be slow at loading data from Yahoo. Will update.
  2. Generate primary features:
    • Technical indicators (RSI, MACD, SMA, EMA, etc. on various timeframes)
    • Flexible enough to accommodate fundamental data and news vector data, since the feature sources are independent of each other
  3. Secondary features:
    • Generate features from primary features
    • For example, crossovers and ratios between technical features
  4. Scaling:
    • Bring the cross-sectional features to the same scale, [0, 1]
    • The data now looks similar to Numerai Classic data
  5. Targets:
    • Generate your own targets for trading strategies
    • or use Numerai Signals targets
  6. Modelling:
    • Your best models from Numerai Classic should work here
  7. Scheduling:
    • Run the pipeline daily
    • Should be able to run in the cloud
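
As a rough illustration of stages 2 to 4, the sketch below computes a primary feature per ticker, derives a secondary ratio feature from it, and then scales every feature within each date. It is not taken from signalslite: the function names, the `date`/`ticker`/`close` columns, and the use of percentile ranks for scaling are assumptions chosen for the example, and the library may scale differently.

```python
import pandas as pd


def add_primary_features(df: pd.DataFrame, window: int = 3) -> pd.DataFrame:
    """Per-ticker simple moving average and trailing return over `window` days."""
    df = df.sort_values(["ticker", "date"]).copy()
    close = df.groupby("ticker")["close"]
    df[f"sma_{window}"] = close.transform(lambda s: s.rolling(window).mean())
    df[f"ret_{window}"] = close.transform(lambda s: s / s.shift(window) - 1)
    return df


def add_secondary_features(df: pd.DataFrame, window: int = 3) -> pd.DataFrame:
    """Example secondary feature: ratio of the close to its moving average."""
    df = df.copy()
    df[f"close_to_sma_{window}"] = df["close"] / df[f"sma_{window}"]
    return df


def scale_cross_section(df: pd.DataFrame, feature_cols: list) -> pd.DataFrame:
    """Replace each feature with its percentile rank within the same date.

    Ranks fall in (0, 1]; missing values stay missing.
    """
    df = df.copy()
    for col in feature_cols:
        df[col] = df.groupby("date")[col].rank(pct=True)
    return df
```

Ranking within each date makes features comparable across tickers on the same day regardless of price level, which is what brings the data close in shape to Numerai Classic features. Rows inside the warm-up window have no value and remain NaN after scaling.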

Notes:

  • This is a work in progress. I will keep adding more features and examples.
  • more tests,
  • more documentation,
  • more examples,
  • more flexibility,
  • more speed,
  • more parallelization,
  • more cloud support,
  • more data sources,
  • more targets,
  • more models,
  • more everything

Hope you like it and find it useful. Please let me know if you have any suggestions or feedback. Thanks!

Owner

  • Name: Suraj Parmar
  • Login: parmarsuraj99
  • Kind: user
  • Location: Ontario, Canada

Trying to solve problems...

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Parmar"
  given-names: "Surajsinh"
  orcid: "https://orcid.org/0000-0000-0000-0000"
title: "signalslite"
version: 0.1-alpha.1
date-released: 2023-08-19
url: "https://github.com/parmarsuraj99/signalslite"

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 14 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 4
  • Total maintainers: 1
pypi.org: signalslite

A small package for Numerai Signals locally

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 14 Last month
Rankings
Dependent packages count: 7.5%
Average: 38.6%
Dependent repos count: 69.6%
Maintainers (1)
Last synced: 5 months ago