Recent Releases of msdm
msdm - v0.11 Release
- Major fix to A* implementation in 7a52fa71d7734a3968463919c40cfb3ba18a1627
- Additional table support
- ImplicitDistribution implementation
- Implementation of Options Framework (Sutton, Precup & Singh, 1999)
- Python
Published by markkho over 2 years ago
msdm - v0.10 Release
Summary of changes/additions:
- Implemented a Table class that allows for a dict and numpy-like interface with numpy array backend
- MarkovDecisionProcess and PartiallyObservableMDP algorithms return Results objects with attributes in the form of Tables (e.g., state_value, action_value, policy) - note that this is a breaking change
- For all MDPs and derived problem classes, is_terminal has been changed to is_absorbing
- FunctionalPolicy and TabularPolicy classes introduced
- PolicyIteration, ValueIteration, and MultichainPolicyIteration have been (re-)implemented
- Tests have been streamlined
- Organization of core modules has been streamlined
Published by markkho about 3 years ago
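To make the v0.10 terminology concrete, here is a minimal sketch of value iteration on a toy MDP in which one state is absorbing. It is not msdm's API: the `transitions` layout, the `value_iteration` function, and the `is_absorbing` callback are hypothetical names chosen for illustration, and the sketch assumes the textbook meaning of an absorbing state (no further reward can be collected once it is reached).

```python
# Toy MDP, not msdm's data model:
# transitions[state][action] -> list of (probability, next_state, reward)
def value_iteration(transitions, is_absorbing, discount=0.9, tol=1e-9):
    """Return a dict of state values; absorbing states keep a value of 0."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if is_absorbing(s):
                continue  # nothing more to gain from an absorbing state
            # Bellman optimality backup: best expected return over actions
            best = max(
                sum(p * (r + discount * values[ns]) for p, ns, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


# Three-state corridor: start -> middle -> goal, with reward 1 on reaching goal.
corridor = {
    "start": {"go": [(1.0, "middle", 0.0)]},
    "middle": {"go": [(1.0, "goal", 1.0)]},
    "goal": {"stay": [(1.0, "goal", 0.0)]},
}
values = value_iteration(corridor, is_absorbing=lambda s: s == "goal")
```

With a discount of 0.9, the value of "middle" is the immediate reward and the value of "start" is that reward discounted once.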
msdm - v0.9 Release
Summary of changes/additions:
- RMAX implementation
- Fix TD Learning bug
- Fix TabularMDP.reachable_states
- New tests
Published by markkho over 3 years ago
msdm - v0.8 Release
Summary of changes/additions:
- LAOStar error handling
- New DictDistribution methods
- New condition, chain, and is_normalized methods in FiniteDistribution
Published by markkho over 3 years ago
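As a rough illustration of what conditioning and a normalization check mean for a finite distribution (v0.8), the sketch below works on a plain dict of probabilities. The free functions `condition` and `is_normalized` only borrow the method names from the notes above; their signatures are assumptions and not msdm's. The chain method is left out because the notes do not describe its behavior.

```python
import math


def is_normalized(dist, tol=1e-9):
    """True if the probabilities in {event: probability} sum to 1."""
    return math.isclose(sum(dist.values()), 1.0, abs_tol=tol)


def condition(dist, predicate):
    """Keep the events satisfying `predicate` and renormalize."""
    kept = {e: p for e, p in dist.items() if predicate(e) and p > 0}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("conditioning event has zero probability")
    return {e: p / total for e, p in kept.items()}


# A fair die, conditioned on the roll being even.
die = {face: 1 / 6 for face in range(1, 7)}
even = condition(die, lambda face: face % 2 == 0)
```

Conditioning on an event renormalizes the surviving probabilities, so each even face ends up with probability one third.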
msdm - v0.7 Release
Summary of changes/additions:
- POMDP solvers:
  - FSCBoundedPolicyIteration (new)
  - FSCGradientAscent (minor changes)
- Planning algorithms
  - Major refactor of LAOStar to support event listener pattern (note interface changes)
  - Minor refactor of LRTDP to support event listener pattern
- Core classes
  - Fix to TabularPolicy.from_q_matrices calculation of softmax distribution
  - Minor changes to core POMDP implementation
- New domains
  - GridMDP base class and plotting tools
  - WindyGridWorld MDP
- Clean up
Published by markkho about 4 years ago
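The v0.7 notes do not say what was wrong with the softmax calculation in TabularPolicy.from_q_matrices. For reference, the standard form of a softmax policy over action values can be sketched as follows. The function name `softmax_policy` and the `inverse_temperature` parameter are hypothetical and not taken from msdm.

```python
import math


def softmax_policy(q_values, inverse_temperature=1.0):
    """Map {action: Q-value} to {action: probability} with a softmax."""
    scaled = {a: inverse_temperature * q for a, q in q_values.items()}
    shift = max(scaled.values())  # subtracting the max avoids overflow in exp
    weights = {a: math.exp(v - shift) for a, v in scaled.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


policy = softmax_policy({"left": 1.0, "right": 2.0})
# Large Q-values stay finite because of the shift.
large = softmax_policy({"a": 1000.0, "b": 1000.0})
```

Subtracting the maximum before exponentiating leaves the resulting probabilities unchanged, since the common factor cancels in the ratio.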
msdm - v0.5 Release
This release mainly includes interfaces, algorithms, and test domains for tabular partially observable Markov decision processes (POMDPs).
Summary of changes:
- Core POMDP classes:
  - PartiallyObservableMDP
  - TabularPOMDP
  - BeliefMDP
  - POMDPPolicy
  - ValueBasedTabularPOMDPPolicy
  - AlphaVectorPolicy
  - FiniteStateController
  - StochasticFiniteStateController
- Domains:
  - HeavenOrHell
  - LoadUnload
  - Tiger
- Algorithms:
  - PointBasedValueIteration
  - QMDP
  - FSCGradientAscent
  - JuliaPOMDPs wrapper
- Fixes to Policy Iteration and Value Iteration
- Updated README.md
Published by markkho over 4 years ago
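For readers unfamiliar with QMDP, one of the algorithms listed for v0.5, the idea is to score each action by the belief-weighted average of the fully observable MDP's action values and pick the best one. The sketch below is a generic illustration and does not use msdm's classes. The function `qmdp_action`, the state and action labels (loosely modelled on the classic Tiger problem), and the Q-values are placeholder assumptions, not output from the library.

```python
def qmdp_action(belief, q_mdp):
    """Pick the action maximizing sum_s belief[s] * q_mdp[s][a].

    belief: {state: probability}
    q_mdp:  {state: {action: Q-value of the underlying MDP}}
    """
    actions = next(iter(q_mdp.values())).keys()
    scores = {
        a: sum(p * q_mdp[s][a] for s, p in belief.items())
        for a in actions
    }
    return max(scores, key=scores.get), scores


# Placeholder Q-values, for illustration only.
q_mdp = {
    "tiger-left": {"open-left": -100.0, "open-right": 10.0, "listen": -1.0},
    "tiger-right": {"open-left": 10.0, "open-right": -100.0, "listen": -1.0},
}
uncertain_choice, _ = qmdp_action({"tiger-left": 0.5, "tiger-right": 0.5}, q_mdp)
confident_choice, _ = qmdp_action({"tiger-left": 0.05, "tiger-right": 0.95}, q_mdp)
```

Under the uniform belief, opening either door has a strongly negative expected score, so the sketch selects listening. Once the belief concentrates on one side, opening the other door scores highest.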
msdm - v0.4 Release
New Features:
- QLearning, SARSA, Expected SARSA, DoubleQLearning
- Policy Iteration
- Entropy Regularized Policy Iteration
- Works with Python 3.9
- QuickMDP and QuickTabularMDP constructors
- Construction of TabularMDPs from matrices
- New domains: CliffWalking, GridMDP generic class, Russell & Norvig gridworld example
- Gridworld plotting of action values
Published by markkho over 4 years ago
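The temporal-difference learners added in v0.4 differ mainly in the target they bootstrap from. The following sketch contrasts the textbook Q-learning and SARSA updates on a plain dictionary of action values. It is not msdm's implementation, and the function names, the `lr` parameter, and the state labels are hypothetical.

```python
from collections import defaultdict


def q_learning_update(q, s, a, r, s_next, actions, lr=0.5, discount=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r + discount * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += lr * (target - q[(s, a)])


def sarsa_update(q, s, a, r, s_next, a_next, lr=0.5, discount=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + discount * q[(s_next, a_next)]
    q[(s, a)] += lr * (target - q[(s, a)])


actions = ["left", "right"]

q_off = defaultdict(float)
q_off[("s1", "right")] = 1.0
q_learning_update(q_off, "s0", "right", 0.0, "s1", actions)

q_on = defaultdict(float)
q_on[("s1", "right")] = 1.0
# The agent happens to take "left" next, whose value is still 0.
sarsa_update(q_on, "s0", "right", 0.0, "s1", "left")
```

Starting from the same values, the Q-learning update moves the estimate for ("s0", "right") toward the discounted value of the best next action, while the SARSA update leaves it at zero because the sampled next action has no value yet.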
msdm - Refactoring of core
Major overhaul of core and tabular methods:
- States/actions are assumed to be hashable (e.g., Gridworld now uses frozendict; no built-in hashing functions; dictionaries are the main way to create maps)
- The distribution classes have been streamlined (Multinomial has been removed and DictDistribution is the main way to represent categorical distributions; .sample() takes a random number generator)
- Policy classes have been simplified
- More thorough type hints
Published by markkho almost 5 years ago
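Two points from the core refactoring, hashable states and sampling with an explicit random number generator, can be illustrated with the standard library alone. This is a sketch under assumptions and not msdm code: a frozen dataclass stands in for the frozendict mentioned above, and `GridState` and the free function `sample` are hypothetical names.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GridState:
    """Immutable, hashable state, so it can be used as a dictionary key."""
    x: int
    y: int


def sample(dist, rng):
    """Draw one event from {event: probability} using the supplied generator."""
    events = list(dist)
    weights = [dist[e] for e in events]
    return rng.choices(events, weights=weights, k=1)[0]


# States serve directly as keys of a dict-based categorical distribution.
next_state_dist = {GridState(0, 1): 0.8, GridState(1, 0): 0.2}

rng = random.Random(0)
draws = [sample(next_state_dist, rng) for _ in range(5)]
```

Passing the generator in explicitly keeps sampling reproducible: two generators created with the same seed produce the same sequence of draws.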