recurrence-mimicking-learning

Hyper-efficient Offline Recurrent Reinforcement Learning Algorithm. It solves decision paths of any length without sequential processing. Implemented for Sharpe Ratio optimization as a base problem.

https://github.com/tomwitkowski/recurrence-mimicking-learning

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.4%) to scientific vocabulary

Keywords

deep-learning offline-reinforcement-learning recurrent-reinforcement-learning reinforcement-learning sharpe-ratio-optimization trajectory-optimization
Last synced: 6 months ago

Repository

Hyper-efficient Offline Recurrent Reinforcement Learning Algorithm. It solves decision paths of any length without sequential processing. Implemented for Sharpe Ratio optimization as a base problem.

Basic Info
  • Host: GitHub
  • Owner: tomWitkowski
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 28.3 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
deep-learning offline-reinforcement-learning recurrent-reinforcement-learning reinforcement-learning sharpe-ratio-optimization trajectory-optimization
Created about 1 year ago · Last pushed 10 months ago
Metadata Files
Readme License Citation

README.md

Recurrence Mimicking Learning (RML)

This repository contains code for Recurrence Mimicking Learning (RML) experiments described in our article. The method aims to optimize a global reward (such as the Sharpe Ratio) by mimicking how recurrent decisions would unfold over a time series, but without incurring the usual high cost of repeated model executions.
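For context, the sketch below shows the Sharpe Ratio computed as a trajectory-level (global) reward from per-step strategy returns; the `sharpe_ratio` helper and the toy returns are illustrative assumptions, not code from this repository.

```python
import numpy as np

def sharpe_ratio(returns: np.ndarray, eps: float = 1e-8) -> float:
    """Global reward: mean per-step return divided by its standard deviation.

    The reward depends on the whole decision path, so it can only be
    evaluated once the full trajectory of actions (and returns) is known.
    """
    return float(np.mean(returns) / (np.std(returns) + eps))

# Per-step returns of a hypothetical trading trajectory
returns = np.array([0.01, -0.005, 0.02, 0.0, 0.015])
print(sharpe_ratio(returns))
```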

Method Overview

In brief, RML uses a single feedforward pass to generate actions for each time step as if they were generated recurrently. It does so by stacking the input $X$ multiple times along all possible previous actions $a_{t-1}$. After generating a stacked output, a lightweight re-indexing step ($\phi$-processing) reconstructs a trajectory of decisions that mirrors the recurrent process. This allows a direct calculation of the global reward (e.g., Sharpe Ratio) with only two forward passes, rather than $T$ passes in a traditional offline RRL.
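Below is a minimal, self-contained sketch of the stacking and $\phi$-processing idea, assuming a discrete action space and a policy that maps (features, previous action) rows to actions; the function and variable names are illustrative, and the article's exact two-pass procedure may differ in detail.

```python
import numpy as np

def rml_trajectory(policy, X, actions=(0, 1), a_init=0):
    """Illustrative RML-style decoding (not the repository's exact code).

    1. Stack the inputs along every possible previous action and query the
       policy once on the whole stack, instead of T sequential calls.
    2. phi-processing: walk the resulting action table with cheap indexing
       to reconstruct the trajectory a recurrent rollout would produce.
    """
    T, A = len(X), len(actions)

    # One batched call: one row per (x_t, a_prev) combination.
    stacked = np.array([[*x, a_prev] for x in X for a_prev in actions])
    action_table = policy(stacked).reshape(T, A)  # chosen action per row

    # phi-processing: pure re-indexing, no further model calls.
    trajectory, a_prev = [], a_init
    for t in range(T):
        a_prev = int(action_table[t, a_prev])
        trajectory.append(a_prev)
    return trajectory

# Toy policy: go long (1) when the feature plus previous action is positive.
toy_policy = lambda rows: (rows.sum(axis=1) > 0).astype(int)
X = np.array([[0.3], [-0.8], [0.1], [-0.4]])
print(rml_trajectory(toy_policy, X))
```

The reconstructed trajectory can then be fed directly into a global reward such as the Sharpe Ratio above.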

Repository Structure

  • src/
    Source code with modules for data loading/preprocessing, different reinforcement learning methods (offline RRL, online RRL, RML), and separate training scripts.
  • pyproject.toml
    Project dependencies.
  • .gitignore
    Standard Python and OS ignore patterns.

Minimal Usage Example

  1. Install dependencies: bash install.sh
  2. Configure experiments via config.py
  3. To run the time comparison with Offline RRL (RLSTM-A): python experiment_time.py
  4. To run the exactness comparison with Offline RRL (RLSTM-A): python experiment_exactness.py (see the sketch after this list)
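As a toy counterpart to the exactness comparison, the sketch below reuses `rml_trajectory`, `toy_policy`, and `X` from the Method Overview sketch and checks that the reconstructed trajectory matches a step-by-step recurrent rollout; it is an illustration of the idea, not the repository's experiment_exactness.py.

```python
import numpy as np

def recurrent_rollout(policy, X, a_init=0):
    """Baseline: call the policy sequentially, once per time step (T calls)."""
    trajectory, a_prev = [], a_init
    for x in X:
        a_prev = int(policy(np.array([[*x, a_prev]]))[0])
        trajectory.append(a_prev)
    return trajectory

# The two decision paths should coincide on the toy data.
assert rml_trajectory(toy_policy, X) == recurrent_rollout(toy_policy, X)
print("RML reconstruction matches the sequential rollout")
```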

Reference

For the complete description of the method, mathematical details, and experiments, see our article: THE REFERENCE TO ADD

Owner

  • Login: tomWitkowski
  • Kind: user

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Recurrence Mimicking Learning
message: >-
  This repository contains an implementation of Recurrence
  Mimicking Learning (RML).
type: software
authors:
  - given-names: Tomasz
    family-names: Witkowski
    email: tomasz.witkowski1@edu.uekat.pl
    affiliation: University of Economics in Katowice
    orcid: 'https://orcid.org/0000-0001-9648-9098'
repository-code: >-
  https://github.com/tomWitkowski/recurrence-mimicking-learning
abstract: >+
  In brief, RML uses a single feedforward pass to generate
  actions for each time step as if they were generated
  recurrently. It does so by stacking the input X multiple
  times along all possible previous actions a_(t-1). After generating a stacked output, a lightweight
  re-indexing step (ϕ-processing) reconstructs a trajectory
  of decisions that mirrors the recurrent process. This
  allows a direct calculation of the global reward (e.g.,
  Sharpe Ratio) with only two forward passes, rather than T
  passes in a traditional offline RRL.
keywords:
  - recurrence mimicking learning
  - recurrent reinforcement learning
  - offline reinforcement learning
  - recurrent classification
  - sharpe ratio
license: MIT

GitHub Events

Total
  • Push event: 7
  • Create event: 2
Last Year
  • Push event: 7
  • Create event: 2