https://github.com/3mcloud/cwest-polymer

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.3%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: 3mcloud
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 3.11 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 7
Created 10 months ago · Last pushed 9 months ago
Metadata Files
Readme License

README.md

'cwest-polymer' Polymer Analysis Package

This Python package reads, analyzes, and interprets polymer species in mass spectrometry data using fractional mass remainder (fMR), a generalized Kendrick mass defect (KMD) algorithm. Using circular distance metrics with cluster analysis, we can classify polymer groups rapidly and effectively.
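To make the idea concrete, here is a minimal sketch of a fractional mass remainder and a circular distance. It assumes that the fMR of a mass is its remainder after division by the repeat unit mass, scaled to the range [0, 1); this is one plausible reading of the description above, not the package's confirmed implementation. The function names and the example end group are hypothetical.

```python
import math

# Monoisotopic mass of the PEG repeat unit C2H4O, in Da.
PEG_REPEAT_MASS = 44.026215


def fractional_mass_remainder(mass: float, repeat_mass: float) -> float:
    """Return (mass mod repeat_mass) / repeat_mass, a value in [0, 1)."""
    return math.fmod(mass, repeat_mass) / repeat_mass


def circular_distance(a: float, b: float) -> float:
    """Shortest distance between two fractions on a circle of circumference 1."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


# Hypothetical PEG series: an H2O end group (18.010565 Da) plus n repeat units.
series = [18.010565 + n * PEG_REPEAT_MASS for n in range(5, 9)]
fmr_values = [fractional_mass_remainder(m, PEG_REPEAT_MASS) for m in series]
```

Every member of the series maps to the same fraction, so the series collapses to a single point. The circular distance matters because fractions just below 1 and just above 0 are neighbours: a plain absolute difference would place them far apart.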

cwest-polymer (Cymraeg for "polymer quest") uses the data science Python package piblin (Cymraeg for "pipeline"), which can comprehensively capture analytical data from a variety of sources along with their metadata. cwest-polymer provides basic implementations of file readers and transforms for polymer analysis, but can be extended with more complex data processing and analysis pipelines to determine polymer groupings. Further examples of piblin implementations can be found in hermes rheo, a rheological data analysis package.

More details on these concepts can be found in the reprint below, which also includes literature references to KMD for further background.

ASMS 2025 Poster Reprint: Improvements and Analysis of KMD
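For background, the classical Kendrick mass defect can be sketched in a few lines. This uses the common convention in which the repeat unit serves as the Kendrick base and the defect is the rounded Kendrick mass minus the Kendrick mass; sign conventions vary in the literature, and the function name is an assumption for illustration, not part of cwest-polymer.

```python
def kendrick_mass_defect(mass: float, repeat_mass: float) -> float:
    """KMD with the repeat unit as the Kendrick base.

    The Kendrick mass rescales the mass so that one repeat unit weighs exactly
    its nominal (integer) mass; the defect is the offset from the nearest integer.
    """
    kendrick_mass = mass * round(repeat_mass) / repeat_mass
    return round(kendrick_mass) - kendrick_mass


# Hypothetical PEG series: H2O end group plus n units of C2H4O (44.026215 Da).
peg_series = [18.010565 + n * 44.026215 for n in range(5, 9)]
kmd_values = [kendrick_mass_defect(m, 44.026215) for m in peg_series]
```

All members of a homologous series share one KMD value, which is the property that both KMD and remainder-based analyses exploit.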

Installation

cwest-polymer is available on PyPI. You can install it using pip: `pip install cwest-polymer`

Fractional Mass Remainder (fMR) Transforms and Clustering Results

Cluster analysis was performed on imported .csv files. The results were generated by applying the data transformation pipelines described below:

Theoretical PEG Cluster Analysis: data folder here

PEG mass values were calculated based on theoretical repeat unit and arbitrary end-groups. Transformed with scripts shown below, colored by grouping.
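A theoretical input file of this kind could be produced along the following lines. This is a sketch only: the end groups, chain lengths, and helper name are hypothetical choices, and the single `mass` column simply reuses one of the accepted headers listed later in this README.

```python
import csv
import io

# Monoisotopic atomic masses in Da.
ATOMIC_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def composition_mass(counts: dict) -> float:
    """Sum the atomic masses for an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * n for element, n in counts.items())


peg_unit = composition_mass({"C": 2, "H": 4, "O": 1})

# Two arbitrary end groups, chosen only for illustration.
end_groups = {
    "diol": composition_mass({"H": 2, "O": 1}),
    "methyl ether": composition_mass({"C": 1, "H": 4, "O": 1}),
}

# (mass, label) pairs for 10 to 14 repeat units per end group.
rows = [
    (round(end + n * peg_unit, 5), label)
    for label, end in end_groups.items()
    for n in range(10, 15)
]

# Write a one-column spreadsheet to an in-memory buffer.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["mass"])
writer.writerows([[mass] for mass, _ in rows])
csv_text = buffer.getvalue()
```

Writing `csv_text` to a file gives a spreadsheet with two interleaved PEG series that differ only in end group.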

Lipids Clustered by Alkanes (CH2) / Alkenes (C2H2): data folder here

Fatty acid (FA) mass values were calculated by molecular formula. Transformed with scripts shown below, colored by grouping.
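As an illustration of why a CH2 repeat unit separates lipid classes, the sketch below computes fatty acid masses from their molecular formula and compares their CH2 remainders. The helper names are hypothetical, and the remainder definition is the same assumed one used throughout these sketches.

```python
# Monoisotopic atomic masses in Da.
ATOMIC_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
CH2_MASS = ATOMIC_MASS["C"] + 2 * ATOMIC_MASS["H"]


def fatty_acid_mass(carbons: int, double_bonds: int = 0) -> float:
    """Mass of a fatty acid C_n H_(2n - 2d) O_2 with d C=C double bonds."""
    hydrogens = 2 * carbons - 2 * double_bonds
    return (
        carbons * ATOMIC_MASS["C"]
        + hydrogens * ATOMIC_MASS["H"]
        + 2 * ATOMIC_MASS["O"]
    )


def ch2_fraction(mass: float) -> float:
    """Remainder of the mass after removing whole CH2 units, as a fraction."""
    return (mass % CH2_MASS) / CH2_MASS


palmitic = ch2_fraction(fatty_acid_mass(16))    # C16:0
stearic = ch2_fraction(fatty_acid_mass(18))     # C18:0
oleic = ch2_fraction(fatty_acid_mass(18, 1))    # C18:1
```

Palmitic and stearic acid differ by two CH2 units and share a fraction, while oleic acid, with one double bond, lands at a clearly different value.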

Package Parameters

Default repeat units based on package parameters.

```
from cwest_polymer import DEFAULT_REPEAT_UNITS

DEFAULT_REPEAT_UNITS = {
    "PEG": "C2 H4 O",
    "PPG": "C3 H6 O",
    "PTHF": "C4 H8 O",
    "PET": "C10 H8 O4",
    "PE": "C2 H4",
    "PP": "C3 H6",
    "Perfluoro": "C F2",
    "PDMS": "C2 H6 Si O",
    "BPA": "C18 H20 O3",
    "Acrylamide": "C3 H5 N O",
    "Acrylic acid": "C3 H4 O2",
    "Nylon 6 6": "C12 H22 N2 O2",
}
```

Create transforms with different repeat units, parsed by molmass or simply float values.

The repeat_units parameter can be a list or dictionary supplying different repeat units for the transformation pipeline. The list or dictionary values can be any combination of formulas (str) and mass values (float). Dictionaries enable custom repeat unit labels.

Fractional values, fractional_values, default to 1 for singly charged ions, but can be a list of integers for multiply charged species. The default_list parameter adds all repeat units from DEFAULT_REPEAT_UNITS to the supplied values.

```
from cwest_polymer import transforms

fmr_transform_1 = transforms.FractionalMRTransform.create(repeat_units=['C1 H2 O3'], fractional_values=1, default_list=True)
fmr_transform_2 = transforms.FractionalMRTransform.create(repeat_units=[123.45, 67.89], fractional_values=[1, 2, 3], default_list=False, kmd=True)
```
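The sketch below shows one way a mixed list or dictionary of formulas and floats could be resolved into repeat masses. It is not the package's code: `formula_mass` is a simplified stand-in for molmass that only handles space-separated formulas with a handful of elements, `resolve_repeat_units` is a hypothetical helper, and dividing the repeat mass by each fractional value (as the m/z spacing for charge z) is an assumed interpretation that this README does not confirm.

```python
import re

# Monoisotopic atomic masses in Da for the elements used in this README.
ATOMIC_MASS = {
    "H": 1.00782503, "C": 12.0, "N": 14.00307401,
    "O": 15.99491462, "F": 18.99840316, "Si": 27.97692653,
}


def formula_mass(formula: str) -> float:
    """Mass of a formula such as 'C2 H4 O' (a missing count means 1)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


def resolve_repeat_units(repeat_units, fractional_values=1):
    """Map (label, fractional value) to an effective repeat spacing."""
    if not isinstance(repeat_units, dict):
        # Lists get their own string form as the label.
        repeat_units = {str(unit): unit for unit in repeat_units}
    if isinstance(fractional_values, int):
        fractional_values = [fractional_values]
    resolved = {}
    for label, unit in repeat_units.items():
        mass = formula_mass(unit) if isinstance(unit, str) else float(unit)
        for z in fractional_values:
            resolved[(label, z)] = mass / z  # assumed meaning of fractional values
    return resolved


spacings = resolve_repeat_units({"PEG": "C2 H4 O", "custom": 100.0}, [1, 2])
```

A dictionary input keeps the custom labels, so `spacings` here holds four entries keyed by `("PEG", 1)`, `("PEG", 2)`, `("custom", 1)`, and `("custom", 2)`.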

The following headers can be detected within a given spreadsheet (.csv and .xlsx)

Column headers are read using the cwest_polymer.fmr_filereaders.fmr_mass_spreadsheet_reader.MassSpreadsheetReader class. Custom column fields can be used to modify the reader; an example is shown here.

```
from cwest_polymer import ACCEPTED_COLUMN_HEADERS

ACCEPTED_COLUMN_HEADERS = ['mass', 'mz', 'm/z', 'rt', 'retention time', 'abundance', 'intensity', 'area', 'xpos', 'ypos']
```
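Header detection of this kind can be sketched as a simple lookup. The helper below is hypothetical, and its case-insensitive, whitespace-tolerant matching is an assumption; the actual reader may match headers differently.

```python
ACCEPTED = ['mass', 'mz', 'm/z', 'rt', 'retention time',
            'abundance', 'intensity', 'area', 'xpos', 'ypos']


def detect_columns(headers, accepted=ACCEPTED):
    """Map the index of each recognised column to its normalised header."""
    found = {}
    for index, header in enumerate(headers):
        key = header.strip().lower()
        if key in accepted:
            found[index] = key
    return found


columns = detect_columns(["Mass", " Intensity ", "comment"])
```

Unrecognised columns such as `comment` are skipped rather than treated as errors in this sketch.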

Implementing fractional mass-remainder (fMR) polymer detection algorithm

Python imports of cwest_polymer and supporting packages for file reading, transform setup, and the subsequent data transforms

```
from cwest_polymer import MassSpreadsheetReader, transforms
from cwest_polymer import fmr_parameters as p

from pathlib import Path
import os
import pandas as pd
import numpy as np
```

Set parameters

```
spreadsheet_path = r"PATH/TO/DATA/FOLDER"
result_path = r"PATH/TO/RESULT/FOLDER"
ppm_tol = 10
mz_tol = 0.005
min_samples = 3

repeat_units = {
    'alkanes': 'C H2',
    'alkenes': 'C2 H2'
}
```
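The two tolerance parameters are on different scales: a ppm tolerance grows with m/z, while an absolute tolerance does not. The small sketch below converts between them; the helper name is hypothetical, and how the package combines the two tolerances is not stated here.

```python
def ppm_to_da(mz: float, ppm: float) -> float:
    """Absolute mass window (Da) that corresponds to a ppm tolerance at a given m/z."""
    return mz * ppm * 1e-6


# Window width for a 10 ppm tolerance at several m/z values.
windows = {mz: ppm_to_da(mz, 10) for mz in (200, 500, 1000, 2000)}
```

With the values above, 10 ppm equals the 0.005 absolute tolerance at m/z 500, is tighter than it below that point, and is looser above it.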

Read directories with the spreadsheet data file reader (.csv and .xlsx files)

`data = MassSpreadsheetReader().data_from_filepath(filepath=spreadsheet_path)`

Create transform classes to calculate fMR values and determine polymer clusters

```
# transform to fMR
calc_fmr = transforms.FractionalMRTransform.create(repeat_units=repeat_units, fractional_values=1, default_list=False)

# cluster based on fMR
cluster_data = transforms.ClusterTransform.create(mz_tol=mz_tol, ppm_tol=ppm_tol, min_samples=min_samples)

# filter by group size
filter_cluster = transforms.FilterByClusterSize.create(min_samples=min_samples)

# create transformation pipeline (thus the name piblin)
pipeline = calc_fmr + cluster_data + filter_cluster
```
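To illustrate what clustering on a circular scale followed by a size filter can look like, here is a deliberately simple gap-based grouping. It is not the algorithm used by ClusterTransform, whose internals are not described in this README; the function name and the `eps` parameter are assumptions for illustration.

```python
def group_on_circle(fractions, eps, min_samples):
    """Group fractions in [0, 1) whose sorted neighbours lie within eps.

    Returns lists of indices into `fractions`, keeping only groups with
    at least `min_samples` members.
    """
    if not fractions:
        return []
    order = sorted(range(len(fractions)), key=lambda i: fractions[i])
    groups = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if fractions[cur] - fractions[prev] <= eps:
            groups[-1].append(cur)
        else:
            groups.append([cur])
    # Wrap-around: values near 1.0 and near 0.0 are neighbours on the circle.
    if len(groups) > 1 and (fractions[order[0]] + 1.0) - fractions[order[-1]] <= eps:
        groups[0] = groups.pop() + groups[0]
    return [group for group in groups if len(group) >= min_samples]


fractions = [0.995, 0.002, 0.001, 0.50, 0.501, 0.30]
clusters = group_on_circle(fractions, eps=0.01, min_samples=3)
```

In this example the values 0.995, 0.001, and 0.002 form one group across the wrap-around point, while the pair near 0.5 and the isolated value at 0.30 are dropped by the size filter.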

Apply transforms to data

```
# run pipeline on data
fmr_clusters = pipeline(data)
```
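The `+` composition used to build the pipeline can be pictured with a toy model. The classes below are hypothetical and far simpler than piblin's transforms; they only show how operator overloading can chain steps that are then applied in order.

```python
class Step:
    """A single data transformation wrapped so that steps can be added."""

    def __init__(self, func):
        self.func = func

    def __call__(self, data):
        return self.func(data)

    def __add__(self, other):
        return Pipeline([self]) + other


class Pipeline:
    """An ordered list of steps applied one after another."""

    def __init__(self, steps):
        self.steps = list(steps)

    def __add__(self, other):
        extra = other.steps if isinstance(other, Pipeline) else [other]
        return Pipeline(self.steps + extra)

    def __call__(self, data):
        for step in self.steps:
            data = step(data)
        return data


# Toy steps standing in for "transform", "order", and "filter".
toy_pipeline = (
    Step(lambda xs: [x % 1 for x in xs])
    + Step(sorted)
    + Step(lambda xs: xs[:2])
)
result = toy_pipeline([2.5, 1.25, 3.75])
```

Calling the composed object runs each step on the output of the previous one, which mirrors how the pipeline above is applied to the data in a single call.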

Creating results files

  • Export .csv results
  • Generate .png fmr plots

```
results = fmr_clusters.split_by_condition_name('file_name')

for result in results:
    for measurement in result.measurements:
        # skip measurements with no remaining data points
        if measurement.datasets[0].number_of_points() == 0:
            continue
        ru_name = measurement.details['repeat_unit_information'][0]
        # label for the output files; assumes the 'file_name' condition used
        # for the split is available through measurement.conditions
        name = Path(str(measurement.conditions['file_name'])).stem

        # export data to .csv file
        df = pd.DataFrame(np.array(measurement.datasets[0].data_arrays).T, columns=measurement.datasets[0].data_array_names)
        df.to_csv(os.path.join(result_path, f'result_{name}_{ru_name}_filtered.csv'))

        # generate fmr plot figure
        fig, _ = measurement.visualize()
        fig.savefig(os.path.join(result_path, f'plot_{name}_{ru_name}.png'), dpi=1000, bbox_inches='tight')
```

Owner

  • Name: 3M
  • Login: 3mcloud
  • Kind: organization
  • Location: Maplewood, MN

Science. Applied to life.

GitHub Events

Total
  • Release event: 4
  • Public event: 1
  • Push event: 5
  • Create event: 3
Last Year
  • Release event: 4
  • Public event: 1
  • Push event: 5
  • Create event: 3

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 121 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 7
  • Total maintainers: 1
pypi.org: cwest-polymer

Functions to determine polymer groups within datasets using kendrick mass defect and mass remainder analysis.

  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 121 Last month
Rankings
Dependent packages count: 8.7%
Average: 29.0%
Dependent repos count: 49.2%
Maintainers (1)
Last synced: 9 months ago

Dependencies

.github/workflows/release.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
  • ncipollo/release-action v1 composite
pyproject.toml pypi
  • bokeh *
  • matplotlib *
  • molmass ==2023.4.10
  • numpy *
  • openpyxl *
  • pandas *
  • piblin ==0.0.0a1
  • plotly >=6.0.1
  • scikit-learn *
  • scipy *