https://github.com/vexives/caspian

A flexible deep learning/machine learning research library using NumPy.

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.4%) to scientific vocabulary
Last synced: 10 months ago · JSON representation

Repository

A flexible deep learning/machine learning research library using NumPy.

Basic Info
  • Host: GitHub
  • Owner: Vexives
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 411 KB
Statistics
  • Stars: 1
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created over 1 year ago · Last pushed 10 months ago
Metadata Files
Readme License

README.md

Caspian - Deep Learning Architectures

A flexible deep learning/machine learning research library using NumPy.

Information

Caspian is written entirely in base Python and NumPy, so no other library or framework is required. It contains many of the basic tools needed to create machine learning models such as neural networks, regressions, image processors, and more.

Its general structure and functionality are inspired by popular frameworks such as PyTorch and TensorFlow.

On top of providing necessary layers and functions, Caspian also allows for simple creation of new layers and tools that the user may need. Each part of the Caspian architecture has its own abstraction that the user can inherit from, including:

  • cspn.Layer
  • cspn.Activation
  • cspn.PoolFunc
  • cspn.Loss
  • cspn.Optimizer
  • cspn.Scheduler

Caspian also provides support for CUDA parallel processing, using CuPy as an optional secondary import.

Installation

Before installing, this library requires:

  • Python 3.10+
  • NumPy v1.23.5+
  • CuPy (12x) v13.0.0+ (Optional)

```bash
$ pip install caspian-ml
```

Getting Started

Caspian architectures are split into 6 different class types:

  • Layers, the backbone behind any model and the main processors & learners.
  • Activations, non-linear functions which assist layers in learning and processing data.
  • PoolFuncs, similar to activations, but to be used with pooling layers and work on strided data rather than standard arrays.
  • Losses, functions which describe the loss, or error, of a model.
  • Optimizers, functions which assist in layer weight updating and learning.
  • Schedulers, functions which define the learning rate at a particular step in a model's learning process.

The structure of a network differs slightly from that of PyTorch or TensorFlow, where each layer, activation, optimizer, and scheduler is separate. With Caspian, layers can contain an activation or pooling function, as well as an optimizer. Optimizers contain a scheduler, which controls the learning rate of the optimizer and of the layer as a whole. Some layers, like Dropout and Upsampling1D, contain neither optimizers nor activations, as they have no learnable parameters and perform no non-linear transformations.

Some types have default classes that allow that operation to be skipped or performed at a base level, like Linear for activations, StandardGD for optimizers, and SchedulerLR for schedulers. If an optimizer is required for a layer but not provided in the initialization, a default StandardGD optimizer with a SchedulerLR scheduler will automatically be assigned. Activation and pooling functions will not be defaulted if not provided, so they must be manually provided by the user.
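The ownership chain described above (a layer holds an optimizer, which in turn holds a scheduler) can be pictured with a minimal sketch. The `Toy*` classes below are hypothetical stand-ins written for illustration, not Caspian's own classes, and the way the layer applies its update is an assumption.

```python
class ToyScheduler:
    """Stand-in scheduler: halves the rate every `every` steps."""
    def __init__(self, every: int = 2) -> None:
        self.every = every
        self.epoch = 0

    def __call__(self, init_rate: float) -> float:
        return init_rate * 0.5 ** (self.epoch // self.every)

    def step(self) -> None:
        self.epoch += 1


class ToyOptimizer:
    """Stand-in optimizer: plain gradient descent that owns a scheduler."""
    def __init__(self, learn_rate: float, sched: ToyScheduler) -> None:
        self.learn_rate = learn_rate
        self.scheduler = sched

    def process_grad(self, grad: float) -> float:
        return -self.scheduler(self.learn_rate) * grad

    def step(self) -> None:
        self.scheduler.step()


class ToyLayer:
    """Stand-in layer with one scalar weight that owns an optimizer."""
    def __init__(self, weight: float, optimizer: ToyOptimizer) -> None:
        self.weight = weight
        self.opt = optimizer
        self.grad = 0.0

    def backward(self, grad: float) -> None:
        self.grad = grad

    def step(self) -> None:
        # Assumption: the layer adds the processed gradient to its weight.
        self.weight += self.opt.process_grad(self.grad)
        self.opt.step()


layer = ToyLayer(1.0, ToyOptimizer(0.1, ToyScheduler(every=2)))
rates = []
for _ in range(4):
    rates.append(layer.opt.scheduler(layer.opt.learn_rate))
    layer.backward(1.0)
    layer.step()
print(rates)  # [0.1, 0.1, 0.05, 0.05]
```

Stepping the layer advances the optimizer, which advances the scheduler, so the learning rate is controlled for the layer as a whole.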

GPU Computing

Caspian and its tools can also be used with CUDA through CuPy to increase speeds by a significant amount. Before importing Caspian or any of its tools, place this segment of Python code above the other imports:

```python
import os
os.environ["CSPN_CUDA"] = "cuda"
```

This ensures that all modules and tools from Caspian are synced with CUDA, and CUDA-supported GPU computing should be enabled as long as CuPy and the CUDA toolkit are both properly installed.

If a custom tool for Caspian is expected to use both CPU and GPU computing, then use this import instead of directly importing NumPy or CuPy:

```python
from caspian.cudalib import np
```

This will automatically import the library that Caspian is currently using. This allows for easier compatibility and saves the user from having to switch between the two libraries manually within their tool.
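To see why the environment variable has to be set before anything from Caspian is imported, here is a minimal sketch of how an import-time backend switch could work. The functions `backend_name` and `load_backend` are hypothetical and written only for illustration; the actual contents of `caspian.cudalib` may differ.

```python
import importlib
import os


def backend_name(environ=None) -> str:
    """Pick "cupy" when CSPN_CUDA is set to "cuda", otherwise "numpy"."""
    env = os.environ if environ is None else environ
    return "cupy" if env.get("CSPN_CUDA") == "cuda" else "numpy"


def load_backend(environ=None):
    """Import and return the selected array module (hypothetical helper)."""
    return importlib.import_module(backend_name(environ))


print(backend_name({}))                     # numpy
print(backend_name({"CSPN_CUDA": "cuda"}))  # cupy
```

If the choice is made once at import time, as assumed here, setting the variable after the first Caspian import would come too late to have any effect.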

Examples

Setting up and training a model in Caspian is similar to other deep learning libraries of its kind. Here is a quick training example of a neural network to provide more information:

Creation of a Model:

```python
from caspian.layers import Layer, Dense
from caspian.activations import Activation, Softmax
from caspian.optimizers import Optimizer
import numpy as np

class NeuralNet(Layer):
    def __init__(self, inputs: int, hiddens: int, outputs: int,
                 activation: Activation, opt: Optimizer):
        in_size = (inputs,)
        out_size = (outputs,)
        super().__init__(in_size, out_size)

        # Each Dense layer receives its own copy of the optimizer
        self.x_1 = Dense(activation, inputs, hiddens, optimizer=opt.deepcopy())
        self.x_2 = Dense(activation, hiddens, outputs, optimizer=opt.deepcopy())
        self.softmax = Softmax()

    def forward(self, data: np.ndarray, training: bool = False) -> np.ndarray:
        self.training = training
        step_1 = self.x_1(data, training)
        step_2 = self.x_2(step_1, training)
        return self.softmax(step_2)

    def backward(self, dx: np.ndarray) -> np.ndarray:
        assert self.training is True
        # Propagate the gradient through the layers in reverse order
        d_sm = self.softmax.backward(dx)
        d_2 = self.x_2.backward(d_sm)
        d_1 = self.x_1.backward(d_2)
        return d_1

    def step(self) -> None:
        self.x_1.step()
        self.x_2.step()
```

This is a simple neural network model containing two Dense layers, each with the same activation function and optimizer (separate instances are highly recommended) as provided. The variables in_size and out_size are a part of every layer class, and can be set for a layer using super().__init__(), which expects the input size and output size as tuples. If constructed like this, it can also be used inside of Sequence layers (similar to PyTorch's Sequential).

Creation of an Activation Function:

```python
from caspian.activations import Activation
import numpy as np

class ReLU(Activation):
    def forward(self, data: np.ndarray) -> np.ndarray:
        return np.maximum(0, data)

    def backward(self, data: np.ndarray) -> np.ndarray:
        return (data >= 0) * 1
```

Creating a new activation function is quite simple as well, and only expects two functions, forward() and backward(), which take and return a NumPy array. Activations should return an array of the same size as the input for both functions, and can also have an __init__() if any internal variables are necessary. The abstract class cspn.Activation also provides default functionality for __call__(), which allows it to act like a standard Python function.
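The contract described above (two methods, same-shaped output, optional `__init__()`, and a `__call__()` that makes the object behave like a function) can be sketched without Caspian installed. `ActivationSketch` is a hypothetical stand-in for the abstract class, and `LeakyReLU` with its `slope` parameter is an illustrative example, not necessarily a class the library ships.

```python
from abc import ABC, abstractmethod

import numpy as np


class ActivationSketch(ABC):
    """Hypothetical stand-in for the abstract activation class."""

    def __call__(self, data: np.ndarray) -> np.ndarray:
        # Calling the object simply runs the forward pass.
        return self.forward(data)

    @abstractmethod
    def forward(self, data: np.ndarray) -> np.ndarray: ...

    @abstractmethod
    def backward(self, data: np.ndarray) -> np.ndarray: ...


class LeakyReLU(ActivationSketch):
    def __init__(self, slope: float = 0.01) -> None:
        self.slope = slope  # internal variable set at construction

    def forward(self, data: np.ndarray) -> np.ndarray:
        return np.where(data >= 0, data, self.slope * data)

    def backward(self, data: np.ndarray) -> np.ndarray:
        # Element-wise derivative of forward(), same shape as the input.
        return np.where(data >= 0, 1.0, self.slope)


act = LeakyReLU(0.1)
x = np.array([[-2.0, 0.0, 3.0]])
print(act(x))           # same result as act.forward(x)
print(act.backward(x))
```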

Creation of a Pooling Function:

```python
from caspian.pooling import PoolFunc
import numpy as np

class Average(PoolFunc):
    def forward(self, partition: np.ndarray) -> np.ndarray:
        return np.average(partition)

    def backward(self, partition: np.ndarray) -> np.ndarray:
        return partition * (1.0 / partition.shape[self.axis])
```

Pooling functions are similar in structure to activation functions, but they return an ndarray that is smaller than the partition rather than one of the same size. Like activations, they can be called like a standard Python function if they inherit from the PoolFunc abstract class. Each pooling function has an internal variable self.axis (which can be set during initialization) that can be used at any point in both the forward and backward passes.
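As a rough picture of what "working on strided data" can mean, the sketch below cuts a 1D signal into windows with NumPy's `sliding_window_view` and reduces each window along one axis. This only illustrates the idea; how Caspian's pooling layers actually build their partitions is not shown in this README, so the windowing here is an assumption.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

signal = np.array([1.0, 3.0, 2.0, 6.0, 5.0, 7.0])

# All length-2 windows, then keep every second one (stride 2).
windows = sliding_window_view(signal, 2)[::2]   # shape (3, 2)

# Reduce each window along its last axis, as a pooling function would.
pooled = windows.mean(axis=-1)
print(windows.shape, pooled)   # (3, 2) [2. 4. 6.]
```

The reduced axis plays the role that `self.axis` plays in a pooling function: the output keeps one value per window and is therefore smaller than the partition.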

Creation of a Loss Function:

```python
from caspian.losses import Loss
import numpy as np

class CrossEntropy(Loss):
    @staticmethod
    def forward(actual: np.ndarray, prediction: np.ndarray) -> float:
        clip_pred = np.clip(prediction, 1e-10, 1 - 1e-10)
        return -np.sum(actual * np.log(clip_pred))

    @staticmethod
    def backward(actual: np.ndarray, prediction: np.ndarray) -> np.ndarray:
        return prediction - actual
```

Loss functions quantify the error of a model's predictions and provide the partial derivative with respect to the output (the gradient array) that a model can use to learn. Losses are not part of any layer or other class and, except in some special cases, do not store any internal variables. Because of this, losses can be written either as static classes or as instantiable ones, depending on the user's choice.
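The expression `prediction - actual` is the well-known gradient of the cross-entropy with respect to the raw scores fed into a softmax, provided the target sums to 1. The sketch below checks that identity numerically with plain NumPy. It is a general mathematical check, not a statement about how Caspian chains its loss and softmax internally; the helper names are illustrative.

```python
import numpy as np


def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()


def cross_entropy(actual: np.ndarray, prediction: np.ndarray) -> float:
    clipped = np.clip(prediction, 1e-10, 1 - 1e-10)
    return float(-np.sum(actual * np.log(clipped)))


z = np.array([0.5, -1.0, 2.0])       # raw scores before softmax
actual = np.array([0.0, 0.0, 1.0])   # one-hot target

analytic = softmax(z) - actual       # same form as `prediction - actual`

# Central finite differences of the loss with respect to each score.
eps = 1e-6
numeric = np.zeros_like(z)
for i in range(z.size):
    bump = np.zeros_like(z)
    bump[i] = eps
    numeric[i] = (cross_entropy(actual, softmax(z + bump))
                  - cross_entropy(actual, softmax(z - bump))) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-6))
```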

Creation of an Optimizer:

```python
from caspian.optimizers import Optimizer
from caspian.schedulers import Scheduler, SchedulerLR
import numpy as np

class Momentum(Optimizer):
    def __init__(self, momentum: float = 0.9, learn_rate: float = 0.01,
                 sched: Scheduler | None = None) -> None:
        # A parameter without a default cannot follow ones that have a default.
        # Falling back to SchedulerLR() assumes it can be built with no arguments.
        super().__init__(learn_rate, SchedulerLR() if sched is None else sched)
        self.momentum = momentum
        self.previous = 0.0

    def process_grad(self, grad: np.ndarray) -> np.ndarray:
        learn_rate = self.scheduler(self.learn_rate)
        velocity_grad = self.momentum * self.previous - learn_rate * grad
        self.previous = velocity_grad
        return velocity_grad

    def step(self) -> None:
        self.scheduler.step()

    def reset_grad(self) -> None:
        self.previous = 0.0
        self.scheduler.reset()

    def deepcopy(self) -> 'Momentum':
        return Momentum(self.momentum, self.learn_rate, self.scheduler.deepcopy())
```

The general framework for an optimizer is a little more complex, but still easy to assemble. The abstract Optimizer class initialization takes two parameters: learn_rate, a float, and sched, a scheduler instance.

The function process_grad() is the main transformation of the optimizer. It should process the given gradient array, apply the learning rate (if applicable), and return an array with the same size as the input.

The function step() is meant to keep track of the epoch or training iteration of the model that the optimizer is a part of. For the example above, it only calls the internal scheduler's step() function and does not modify any variables. However, some more advanced optimizers like ADAM may require an internal variable to be kept for this purpose.

Another function expected from optimizers is reset_grad(), which clears all previous gradient information and resets the learning rate scheduler for that optimizer.

The function deepcopy() is highly recommended if being used on multiple layers of a model, as each layer contains its own version of an optimizer and scheduler. It should pass a deep copy of whatever data structures it contains or needs into the initialization of a new instance.
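To illustrate why separate optimizer instances matter, the sketch below reuses the momentum update rule in a stand-alone class and compares one shared instance with two copies. `MomentumSketch` is a hypothetical stand-in with scalar gradients and no scheduler, written only so the example runs without Caspian.

```python
class MomentumSketch:
    """Hypothetical stand-in mirroring the momentum update rule."""
    def __init__(self, momentum: float = 0.9, learn_rate: float = 0.01) -> None:
        self.momentum = momentum
        self.learn_rate = learn_rate
        self.previous = 0.0

    def process_grad(self, grad: float) -> float:
        velocity = self.momentum * self.previous - self.learn_rate * grad
        self.previous = velocity
        return velocity

    def deepcopy(self) -> "MomentumSketch":
        # A fresh instance with the same settings and no gradient history.
        return MomentumSketch(self.momentum, self.learn_rate)


base = MomentumSketch(momentum=0.5, learn_rate=0.1)

# Shared instance: the second layer inherits the first layer's velocity.
shared_a = base.process_grad(1.0)
shared_b = base.process_grad(0.0)   # non-zero update despite a zero gradient

# Separate copies: each layer keeps its own velocity.
opt_a, opt_b = base.deepcopy(), base.deepcopy()
own_a = opt_a.process_grad(1.0)
own_b = opt_b.process_grad(0.0)     # zero gradient, zero update
print(shared_b, own_b)
```

With a shared instance, the stored velocity from one layer leaks into the update of the next, which is the kind of interference that per-layer copies avoid.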

Creation of a Learning Rate Scheduler:

```python
from caspian.schedulers import Scheduler
import numpy as np

class ConstantLR(Scheduler):
    def __init__(self, steps: int, const: float = 0.1) -> None:
        self.steps = steps
        self.const = const
        self.epoch = 0

    def __call__(self, init_rate: float) -> float:
        return init_rate * self.const if self.epoch < self.steps else init_rate

    def step(self) -> None:
        self.epoch += 1

    def reset(self) -> None:
        self.epoch = 0

    def deepcopy(self) -> 'ConstantLR':
        return ConstantLR(self.steps, self.const)
```

This is a basic scheduler that multiplies the initial learning rate by a set constant for a specific number of steps. The __call__() function is how a scheduler is called to process a learning rate, while __init__() takes the custom parameters that are unique to that subclass. As with optimizers, schedulers also have step(), reset(), and deepcopy() functions, which perform the same operations as described for optimizers above.
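A short trace makes the schedule concrete. The class below repeats the same logic without the Caspian base class (the name `ConstantLRSketch` is hypothetical) so that it runs on its own.

```python
class ConstantLRSketch:
    """Stand-alone copy of the ConstantLR logic, without the Caspian base class."""
    def __init__(self, steps: int, const: float = 0.1) -> None:
        self.steps = steps
        self.const = const
        self.epoch = 0

    def __call__(self, init_rate: float) -> float:
        return init_rate * self.const if self.epoch < self.steps else init_rate

    def step(self) -> None:
        self.epoch += 1

    def reset(self) -> None:
        self.epoch = 0


sched = ConstantLRSketch(steps=2, const=0.5)
rates = []
for _ in range(4):
    rates.append(sched(0.01))
    sched.step()
print(rates)        # [0.005, 0.005, 0.01, 0.01]

sched.reset()
print(sched(0.01))  # 0.005
```

The scaled rate applies for the first two steps, the initial rate is used afterwards, and `reset()` returns the schedule to its starting point.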

Training and Using a Model:

Now, here's an example on how to create a neural network which can recognize digits from the MNIST data set using only Caspian tools:

```python
import numpy as np

from caspian.layers import Conv2D, Pooling2D, Reshape, Dense, Container, Sequence
from caspian.activations import Sigmoid, ReLU, Softmax
from caspian.pooling import Maximum
from caspian.losses import BinCrossEntropy
from caspian.optimizers import StandardGD
from keras.datasets import mnist

# Import the dataset and reshape
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = np.array(x_train).reshape(x_train.shape[0], 784)
x_test = np.array(x_test).reshape(x_test.shape[0], 784)

# One-hot encode the training labels
y_onehot = np.zeros((y_train.shape[0], y_train.max()+1), dtype=np.float32)
for i in range(len(y_train)):
    y_onehot[i][int(y_train[i])] = 1

# Group the samples into batches of 60
xt = x_train.reshape(-1, 60, 1, 28, 28) / 255.0
yt = y_onehot.reshape(-1, 60, 10)
print(xt.shape)
print(yt.shape)

# Create the model to be trained
optim = StandardGD(0.001)

d1 = Conv2D(Sigmoid(), 32, 3, (1, 28, 28))
d2 = Pooling2D(Maximum(), 2, (32, 26, 26), 2)
d3 = Conv2D(Sigmoid(), 12, 3, (32, 13, 13))
d4 = Pooling2D(Maximum(), 2, (12, 11, 11), 2)
d5 = Reshape((-1, 12, 5, 5), (-1, 12*5*5))
d6 = Dense(ReLU(), 12*5*5, 100)
d7 = Dense(Sigmoid(), 100, 10)
d8 = Container(Softmax())  # defined here but not added to the sequence below

Seq1 = Sequence([d1, d2, d3, d4, d5, d6, d7])
Seq1.set_optimizer(optim)

ent = BinCrossEntropy()

# Training
losses = 0.0
for ep in range(50):
    for x, y in zip(xt, yt):
        x_r = Seq1.forward(x, True)

        err_grad = ent.backward(y, x_r)
        loss = ent.forward(y, x_r)

        Seq1.backward(err_grad)
        Seq1.step()

        losses += loss
    print(f"Epoch {ep+1} - {losses / xt.shape[0]}")
    losses = 0.0
```

The example above uses Caspian's built-in tools to train, stochastically and in batches of 60 images, a basic convolutional neural network that can recognize the digits 0 through 9 in a 28x28 image. Improvements and changes can be made to the model for greater accuracy using other tools in the Caspian library.
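The input shapes passed to each layer in the example follow from the usual output-size formula for unpadded convolution and pooling. The helper below is a small sketch for checking them by hand; `out_size` is a hypothetical helper, and it assumes no padding and a convolution stride of 1.

```python
def out_size(size: int, kernel: int, stride: int = 1) -> int:
    """Output length of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


shape = (1, 28, 28)            # (channels, height, width)
stages = [
    ("conv", 32, 3, 1),        # 32 filters, 3x3 kernel, stride 1
    ("pool", None, 2, 2),      # 2x2 window, stride 2
    ("conv", 12, 3, 1),        # 12 filters, 3x3 kernel, stride 1
    ("pool", None, 2, 2),      # 2x2 window, stride 2
]

shapes = [shape]
for kind, channels, kernel, stride in stages:
    c, h, w = shapes[-1]
    c = channels if kind == "conv" else c   # pooling keeps the channel count
    shapes.append((c, out_size(h, kernel, stride), out_size(w, kernel, stride)))

flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(shapes)
print(flat)
```

The trace ends at a (12, 5, 5) volume, which flattens to 300 features for the first Dense layer.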

Saving and Loading Layers:

[!NOTE] The format used for saving and loading models may change in the future at an unknown time. If it does, .cspn files written in the previous format will no longer work with newer versions. Any such change will be specified in the update that introduces it.

Once a model has been trained (or in the process of training), each layer can be exported and loaded at a different time. Layers, activations, pooling functions, optimizers, and schedulers all have methods which allow them to be encoded into strings and/or saved to files (of type .cspn).

Saving

Layers can be encoded into a string or saved to a file using the save_to_file() method, as shown here:

```python
d1 = Conv2D(ReLU(), 32, 3, (1, 28, 28))
d1.save_to_file("layer1.cspn")
```

If the file name is not specified and no parameters are given to this method, then a string is returned which contains the information of that layer. This includes the activation or pooling function, optimizer, and scheduler of that layer (if any are applicable).

For other tools like optimizers, schedulers, or functions, the repr() function is used in place of a set saving method. It returns a string with the name of the class and all initialized attributes of the object, in the order of the initialization function, with / as a separator (except for schedulers, which use :). A quick example:

```python
from caspian.optimizers import ADAM
from caspian.schedulers import StepLR

opt = ADAM(learn_rate = 0.001, sched = StepLR(10))
opt_info = repr(opt)

# Returns "ADAM/0.9/0.99/1e-8/0.001/StepLR:10:0.1"
```
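The string layout described above can be reproduced with a couple of stand-in classes. This is a minimal sketch of the format only: `StepLRSketch`, `MomentumSketch`, and the `gamma` parameter are hypothetical names chosen for illustration, not Caspian classes.

```python
class StepLRSketch:
    """Hypothetical scheduler stand-in; fields are joined with ':'."""
    def __init__(self, steps: int, gamma: float = 0.1) -> None:
        self.steps = steps
        self.gamma = gamma

    def __repr__(self) -> str:
        return f"StepLRSketch:{self.steps}:{self.gamma}"


class MomentumSketch:
    """Hypothetical optimizer stand-in; fields are joined with '/'."""
    def __init__(self, momentum: float, learn_rate: float,
                 sched: StepLRSketch) -> None:
        self.momentum = momentum
        self.learn_rate = learn_rate
        self.scheduler = sched

    def __repr__(self) -> str:
        # Attributes appear in the same order as the __init__ parameters.
        return f"MomentumSketch/{self.momentum}/{self.learn_rate}/{self.scheduler!r}"


info = repr(MomentumSketch(0.9, 0.001, StepLRSketch(10)))
print(info)  # MomentumSketch/0.9/0.001/StepLRSketch:10:0.1
```

Using a different separator for the nested scheduler keeps its fields distinguishable from the optimizer's own fields when the string is split again.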

Loading

Once a layer has been saved to a file or encoded in a string, it can be re-loaded and re-instantiated from where it was saved before. Each layer has a static from_save() method, which takes two parameters. The first is a string context, which is either the name of the file to be loaded from or the encoded string containing the appropriate information. The second is a boolean file_load, which determines whether the context is a file name or the encoded string itself. To use the method on a file:

```python
new_layer = Conv2D.from_save("layer1.cspn", True)
```

If the file provided is incorrectly formatted/modified or the file imported is not an appropriate .cspn file, an exception is thrown instead.

For all other saveable tools in the Caspian library, each tool folder has a function which takes the repr string and returns a class instance of the encoded object. The functions that correspond to each class type include:

  • Activations -> activations.parse_act_info()
  • Optimizers -> optimizers.parse_opt_info()
  • Pooling -> pooling.parse_pool_info()
  • Schedulers -> schedulers.parse_sched_info()

These classes do not have options to save directly to a file, but the user can export and import them manually if absolutely needed. If the user creates a custom sub-class and wishes to save or load it, they will need to write an appropriate __repr__() following the same procedure as outlined above, and add the class to the tool folder dictionary:

  • Activations -> activations.act_funct_dict
  • Optimizers -> optimizers.opt_dict
  • Pooling -> pooling.pool_funct_dict
  • Schedulers -> schedulers.sched_dict

Loading a class in these categories will look similar to the following:

```python
from caspian import activations as act

class CustomFunct(act.Activation):
    ...
    def __repr__(self):
        ...

# Create instance
a_1 = CustomFunct(...)
saved_str = repr(a_1)

# Load from context string
act.act_funct_dict["CustomFunct"] = CustomFunct
a_2 = act.parse_act_info(saved_str)
```
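The reason a custom class has to be added to the dictionary first can be seen in a minimal sketch of a name-based parser. Everything below (`ScaleSketch`, `registry`, `parse_info`) is hypothetical and only mimics the pattern; Caspian's own parse functions may handle arguments differently.

```python
class ScaleSketch:
    """Hypothetical custom tool with one numeric attribute."""
    def __init__(self, factor: float = 1.0) -> None:
        self.factor = factor

    def __repr__(self) -> str:
        return f"ScaleSketch/{self.factor}"


# Maps class names to classes, playing the role of a tool dictionary.
registry = {}


def parse_info(context: str):
    """Rebuild an object from 'Name/arg1/arg2/...' (all args read as floats)."""
    name, *args = context.split("/")
    if name not in registry:
        raise KeyError(f"{name} is not registered")
    return registry[name](*(float(a) for a in args))


registry["ScaleSketch"] = ScaleSketch
saved = repr(ScaleSketch(2.5))
restored = parse_info(saved)
print(saved, restored.factor)  # ScaleSketch/2.5 2.5
```

The parser only sees a string, so the class name at the front is its sole way to find the right constructor; an unregistered name cannot be resolved.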

Notes

It's important to note that this library is still a work in progress. Because it relies on very few framework resources, it prioritizes both efficiency and utility over heavy safety. Here are a few things to keep in mind while using Caspian:

Memory Safety

[!CAUTION] While most functions and classes in this library are perfectly safe to use and modify, some use unsafe memory operations to greatly increase their speed. Examples include any convolutional or pooling layers, like Conv1D, Conv1DTranspose, or Pooling1D. For the safety of any machine that uses Caspian, DO NOT modify the internal variables or functions of these unsafe layers. Any memory-unsafe layer or function contains a warning in its in-line documentation. Changes to necessary variables may cause harmful effects such as segmentation faults.

General Usability

All classes in this library fit into specific categories of tools, each of which inherits from a basic abstraction (see Information above) and follows specific functionality guidelines that allow them to work seamlessly with one another. To keep this functionality working as intended, users are encouraged not to modify any variables inside a class that has already been initialized. Some variables, like the weights of a layer, may be changed safely as long as their shape and integrity are kept the same.

Gradient Calculation

Because NumPy does not have any integrated automatic differentiation functionality, all gradient calculations performed by each class are done manually. For any new layers that users create, they may use an automatic differentiation library to perform backward passes as long as it is compatible with NumPy.
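When a backward pass is derived by hand, a finite-difference check is a common way to catch mistakes. The sketch below applies one to a plain matrix-multiply layer. The functions are illustrative helpers written for this example and are not part of Caspian.

```python
import numpy as np


def dense_forward(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    return x @ w


def dense_backward(x: np.ndarray, w: np.ndarray, dout: np.ndarray):
    """Hand-derived gradients of sum(dout * (x @ w)) w.r.t. x and w."""
    return dout @ w.T, x.T @ dout


def numeric_grad(f, arr: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Central finite differences of scalar f() with respect to arr."""
    grad = np.zeros_like(arr)
    for idx in np.ndindex(arr.shape):
        old = arr[idx]
        arr[idx] = old + eps
        hi = f()
        arr[idx] = old - eps
        lo = f()
        arr[idx] = old          # restore the original entry
        grad[idx] = (hi - lo) / (2 * eps)
    return grad


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
w = rng.normal(size=(3, 2))
dout = rng.normal(size=(4, 2))  # gradient arriving from the next layer

dx, dw = dense_backward(x, w, dout)
loss = lambda: float(np.sum(dout * dense_forward(x, w)))
print(np.allclose(dx, numeric_grad(loss, x), atol=1e-6))
print(np.allclose(dw, numeric_grad(loss, w), atol=1e-6))
```

The same check can be pointed at any custom layer's forward and backward methods before it is used in training.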

Further Compatibility

Caspian only requires Python and NumPy, so its installation neither requires nor affects any other libraries that the user wishes to use alongside it. As mentioned previously in Gradient Calculation, any custom class which inherits from a Caspian abstract class may use any helper libraries or frameworks as long as they are NumPy compatible.

Future Plans and Developments

  • Transformer grade layers (Attention, Encoders, Decoders, etc.)
  • More activation functions, base layers, and optimizers.
  • Improved model saving and loading.
  • More utilities, like train/test data splitting, etc.

Owner

  • Name: Vexives
  • Login: Vexives
  • Kind: user

GitHub Events

Total
  • Release event: 1
  • Watch event: 1
  • Delete event: 2
  • Public event: 1
  • Push event: 35
  • Pull request event: 2
  • Create event: 4
Last Year
  • Release event: 1
  • Watch event: 1
  • Delete event: 2
  • Public event: 1
  • Push event: 35
  • Pull request event: 2
  • Create event: 4

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 2 minutes
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 2 minutes
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • Vexives (1)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 21 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 2
  • Total maintainers: 1
pypi.org: caspian-ml

A deep learning library focused entirely around NumPy.

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 21 Last month
Rankings
Dependent packages count: 9.3%
Average: 31.0%
Dependent repos count: 52.7%
Maintainers (1)
Last synced: 10 months ago