tclf

A scikit-learn compatible classifier to perform trade classification in Python.

https://github.com/karelze/tclf

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 24 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.7%) to scientific vocabulary

Keywords

empirical finance microstructure python rule-based-classifier scikit-learn trade-classification

Keywords from Contributors

mesh interpretability standardization energy-system-model energy-system hacking data-profilers pipeline-testing datacleaner animations
Last synced: 4 months ago

Repository

Statistics
  • Stars: 19
  • Watchers: 3
  • Forks: 1
  • Open Issues: 1
  • Releases: 7
Topics
empirical finance microstructure python rule-based-classifier scikit-learn trade-classification
Created over 2 years ago · Last pushed 4 months ago
Metadata Files
Readme Changelog License Citation Codeowners

README.md

Trade Classification With Python

Documentation ✒️: https://karelze.github.io/tclf/

Source Code 🐍: https://github.com/KarelZe/tclf

`tclf` is a scikit-learn-compatible implementation of trade classification algorithms that classify financial market transactions into buyer- and seller-initiated trades.

The key features are:

  • Easy: Easy to use and learn.
  • Sklearn-compatible: Compatible with the sklearn API. Use sklearn metrics and visualizations.
  • Feature complete: Wide range of supported algorithms. Use the algorithms individually or stack them like LEGO blocks.

Installation

pip
```console
pip install tclf
```

uv⚡
```console
uv add tclf
```

Supported Algorithms

  • (Rev.) CLNV rule[^1]
  • (Rev.) EMO rule[^2]
  • (Rev.) LR algorithm[^6]
  • (Rev.) Tick test[^5]
  • Depth rule[^3]
  • Quote rule[^4]
  • Tradesize rule[^3]

For a primer on trade classification rules visit the rules section 🆕 in our docs.

Minimal Example

Let's start simple: classify trades with the quote rule, and assign a random side to every trade the quote rule cannot classify.

Create a `main.py` with:
```python title="main.py"
import numpy as np
import pandas as pd

from tclf.classical_classifier import ClassicalClassifier

X = pd.DataFrame(
    [
        [1.5, 1, 3],
        [2.5, 1, 3],
        [1.5, 3, 1],
        [2.5, 3, 1],
        [1, np.nan, 1],
        [3, np.nan, np.nan],
    ],
    columns=["trade_price", "bid_ex", "ask_ex"],
)

clf = ClassicalClassifier(layers=[("quote", "ex")], strategy="random")
clf.fit(X)
probs = clf.predict_proba(X)
```
Run your script with
```console
$ python main.py
```
In this example, input data is available as a `pd.DataFrame` with columns conforming to our naming conventions.

The parameter `layers=[("quote", "ex")]` sets the quote rule at the exchange level and `strategy="random"` specifies the fallback strategy for unclassified trades.
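To make the behaviour described above concrete, here is a minimal sketch in plain NumPy. It is not tclf's implementation: the helper names `quote_rule` and `random_fallback` are hypothetical, and the sketch assumes the textbook form of the quote rule, where a trade above the quote midpoint is labelled a buy (1), a trade below it a sell (-1), and trades at the midpoint or with missing quotes stay unclassified.

```python
import numpy as np


def quote_rule(trade_price, bid, ask):
    """Hypothetical helper: 1 above the midpoint, -1 below, NaN otherwise."""
    mid = 0.5 * (bid + ask)  # NaN whenever bid or ask is missing
    side = np.full(trade_price.shape, np.nan)
    side[trade_price > mid] = 1.0  # comparisons with NaN evaluate to False
    side[trade_price < mid] = -1.0
    return side


def random_fallback(side, seed=0):
    """Hypothetical helper: fill unclassified trades with a random side."""
    rng = np.random.default_rng(seed)
    out = side.copy()
    missing = np.isnan(out)
    out[missing] = rng.choice([-1.0, 1.0], size=missing.sum())
    return out


# Same values as the DataFrame in the example above.
trade_price = np.array([1.5, 2.5, 1.5, 2.5, 1.0, 3.0])
bid_ex = np.array([1.0, 1.0, 3.0, 3.0, np.nan, np.nan])
ask_ex = np.array([3.0, 3.0, 1.0, 1.0, 1.0, np.nan])

side = quote_rule(trade_price, bid_ex, ask_ex)
filled = random_fallback(side)
```

In this sketch the first four trades receive a side from the quote rule, while the last two have missing quotes and are only resolved by the random fallback.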

Advanced Example

Often it is desirable to classify on both exchange-level data and NBBO data. Also, data might only be available as a NumPy array. So let's extend the previous example by classifying with the quote rule at the exchange level, then at the NBBO, and all remaining trades randomly.

```python title="main.py" hl_lines="6 16 17 20"
import numpy as np
from sklearn.metrics import accuracy_score

from tclf.classical_classifier import ClassicalClassifier

X = np.array(
    [
        [1.5, 1, 3, 2, 2.5],
        [2.5, 1, 3, 1, 3],
        [1.5, 3, 1, 1, 3],
        [2.5, 3, 1, 1, 3],
        [1, np.nan, 1, 1, 3],
        [3, np.nan, np.nan, 1, 3],
    ]
)
y_true = np.array([-1, 1, 1, -1, -1, 1])
features = ["trade_price", "bid_ex", "ask_ex", "bid_best", "ask_best"]

clf = ClassicalClassifier(
    layers=[("quote", "ex"), ("quote", "best")], strategy="random", features=features
)
clf.fit(X)
acc = accuracy_score(y_true, clf.predict(X))
```
In this example, input data is available as `np.array`s with both exchange (`"ex"`) and NBBO data (`"best"`). We set the `layers` parameter to `layers=[("quote", "ex"), ("quote", "best")]` to classify trades first on subset `"ex"` and remaining trades on subset `"best"`. Additionally, we have to set `ClassicalClassifier(..., features=features)` to pass column information to the classifier.

Like before, column/feature names must follow our naming conventions.
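The layering idea can be sketched in plain NumPy: apply a rule on one subset, then hand only the still-unclassified trades to the next layer. This is a hedged illustration, not the library's code. `quote_rule` and `classify_layered` are hypothetical helpers, the quote rule is assumed to compare the trade price with the quote midpoint, and the column order is assumed to match the `features` list from the example above.

```python
import numpy as np


def quote_rule(trade_price, bid, ask):
    """Hypothetical helper: 1 above the midpoint, -1 below, NaN otherwise."""
    mid = 0.5 * (bid + ask)
    side = np.full(trade_price.shape, np.nan)
    side[trade_price > mid] = 1.0
    side[trade_price < mid] = -1.0
    return side


def classify_layered(X, features, subsets):
    """Hypothetical helper: apply the quote rule subset by subset."""
    col = {name: i for i, name in enumerate(features)}
    price = X[:, col["trade_price"]]
    side = np.full(price.shape, np.nan)
    for subset in subsets:
        bid = X[:, col[f"bid_{subset}"]]
        ask = X[:, col[f"ask_{subset}"]]
        layer = quote_rule(price, bid, ask)
        unclassified = np.isnan(side)  # later layers only fill the gaps
        side[unclassified] = layer[unclassified]
    return side


X = np.array(
    [
        [1.5, 1, 3, 2, 2.5],
        [2.5, 1, 3, 1, 3],
        [1.5, 3, 1, 1, 3],
        [2.5, 3, 1, 1, 3],
        [1, np.nan, 1, 1, 3],
        [3, np.nan, np.nan, 1, 3],
    ]
)
features = ["trade_price", "bid_ex", "ask_ex", "bid_best", "ask_best"]

side = classify_layered(X, features, subsets=["ex", "best"])
```

In this sketch the last two trades, which have missing exchange quotes, are resolved from the NBBO columns in the second layer, so no trade is left for a random fallback.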

Other Examples

For more practical examples, see our examples section.

Development

We are using tox with uv for development.

```bash
tox -e lint
tox -e format
tox -e test
tox -e build
```

Citation

If you are using the package in publications, please cite as:

```latex
@software{bilz_tclf_2023,
    author = {Bilz, Markus},
    license = {BSD 3},
    month = nov,
    title = {{tclf} -- trade classification with python},
    url = {https://github.com/KarelZe/tclf},
    version = {0.0.1},
    year = {2023}
}
```

Footnotes

[^1]: Chakrabarty, B., Li, B., Nguyen, V., & Van Ness, R. A. (2007). Trade classification algorithms for electronic communications network trades. Journal of Banking & Finance, 31(12), 3806–3821. https://doi.org/10.1016/j.jbankfin.2007.03.003
[^2]: Ellis, K., Michaely, R., & O’Hara, M. (2000). The accuracy of trade classification rules: Evidence from Nasdaq. The Journal of Financial and Quantitative Analysis, 35(4), 529–551. https://doi.org/10.2307/2676254
[^3]: Grauer, C., Schuster, P., & Uhrig-Homburg, M. (2023). Option trade classification. https://doi.org/10.2139/ssrn.4098475
[^4]: Harris, L. (1989). A day-end transaction price anomaly. The Journal of Financial and Quantitative Analysis, 24(1), 29. https://doi.org/10.2307/2330746
[^5]: Hasbrouck, J. (2009). Trading costs and returns for U.S. equities: Estimating effective costs from daily data. The Journal of Finance, 64(3), 1445–1477. https://doi.org/10.1111/j.1540-6261.2009.01469.x
[^6]: Lee, C., & Ready, M. J. (1991). Inferring trade direction from intraday data. The Journal of Finance, 46(2), 733–746. https://doi.org/10.1111/j.1540-6261.1991.tb02683.x

Owner

  • Name: Markus Bilz
  • Login: KarelZe
  • Kind: user
  • Location: Germany
  • Company: Atruvia

Citation (CITATION.cff)

cff-version: 1.2.0
title: tclf
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Markus
    family-names: Bilz
    email: web@markusbilz.com
    orcid: 'https://orcid.org/0009-0009-6833-4393'
repository-code: 'https://github.com/KarelZe/tclf'
url: 'https://karelze.github.io/tclf/'
abstract: >-
  `tclf` is a scikit-learn-compatible implementation of
  trade classification algorithms to classify financial
  markets transactions into buyer- and seller-initiated
  trades.
keywords:
  - clnv
  - lee-ready
  - rule-based classifier
  - scikit-learn
  - trade-classification
license: BSD 3
version: 0.0.3
date-released: '2024-01-06'

GitHub Events

Total
  • Watch event: 2
  • Delete event: 11
  • Issue comment event: 89
  • Push event: 71
  • Pull request review comment event: 4
  • Pull request review event: 5
  • Pull request event: 55
  • Create event: 13
Last Year
  • Watch event: 2
  • Delete event: 11
  • Issue comment event: 89
  • Push event: 71
  • Pull request review comment event: 4
  • Pull request review event: 5
  • Pull request event: 55
  • Create event: 13

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 106
  • Total Committers: 5
  • Avg Commits per committer: 21.2
  • Development Distribution Score (DDS): 0.443
Past Year
  • Commits: 97
  • Committers: 4
  • Avg Commits per committer: 24.25
  • Development Distribution Score (DDS): 0.412
Top Committers
Name Email Commits
Markus Bilz m****l@m****m 59
pre-commit-ci[bot] 6****] 21
dependabot[bot] 4****] 18
Markus Bilz g****b@m****m 7
github-actions[bot] 4****] 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 14
  • Total pull requests: 221
  • Average time to close issues: 6 days
  • Average time to close pull requests: 6 days
  • Total issue authors: 3
  • Total pull request authors: 4
  • Average comments per issue: 0.21
  • Average comments per pull request: 2.11
  • Merged pull requests: 190
  • Bot issues: 0
  • Bot pull requests: 130
Past Year
  • Issues: 0
  • Pull requests: 56
  • Average time to close issues: N/A
  • Average time to close pull requests: 14 days
  • Issue authors: 0
  • Pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 1.95
  • Merged pull requests: 40
  • Bot issues: 0
  • Bot pull requests: 54
Top Authors
Issue Authors
  • KarelZe (12)
  • strcat32 (1)
  • michibau (1)
Pull Request Authors
  • KarelZe (91)
  • pre-commit-ci[bot] (70)
  • dependabot[bot] (54)
  • github-actions[bot] (6)
Top Labels
Issue Labels
enhancement (4) wontfix (2) dependencies (2) documentation (1) python (1)
Pull Request Labels
dependencies (65) github_actions (64) documentation (22) python (9) enhancement (6)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 51 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 9
  • Total maintainers: 1
pypi.org: tclf

Classify trades using trade classification algorithms 🐍

  • Versions: 9
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 51 Last month
Rankings
Dependent packages count: 10.1%
Average: 38.3%
Dependent repos count: 66.6%
Maintainers (1)
Last synced: 4 months ago

Dependencies

.github/workflows/tests.yaml actions
  • actions/checkout v2 composite
  • conda-incubator/setup-miniconda v2 composite
pyproject.toml pypi
  • numpy ^1.25.1
  • pandas ^2.0.3
  • python ^3.10
  • scikit-learn ^1.3.0
.github/workflows/publish.yaml actions
  • actions/checkout v4 composite
  • actions/download-artifact v3 composite
  • actions/setup-python v5 composite
  • actions/upload-artifact v3 composite
  • pypa/gh-action-pypi-publish release/v1 composite
  • sigstore/gh-action-sigstore-python v2.1.0 composite