ganx
The GitHub repository for the Python package ganX - generate artificially new XRF
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
README.md
ganX - a python library to generate MA-XRF raw data out of RGB images
ganX (generate artificially new XRF) is a small library that generates Macro mapping X-ray fluorescence (MA-XRF) raw data from an RGB input image and a dictionary of pigments' RGB values and characteristic XRF signals.
To generate synthetic MA-XRF data, it performs the following steps:
- Run an iterative, unsupervised K-means clustering in RGB space to extract a set of RGB clusters, replacing the original RGB image with a clustered one.
- Associate a pigment (or a list of pigments) with each cluster colour by computing the distance between the cluster RGB and the pigments' RGB in CIELAB colour space.
- Build a distribution as a weighted average of the XRF distributions of the pigments found; this distribution is then used to randomly generate the XRF signal via a Monte Carlo simulation.
Additionally, the library offers other classes and methods to explore the XRF dataset.
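The last generation step can be illustrated with a minimal NumPy sketch. It is not the library's code: the helper names `mix_distributions` and `sample_xrf`, the toy four-bin spectra and the weights are all assumptions made for illustration, and a multinomial draw is used here as one simple way to perform the Monte Carlo sampling.

```python
import numpy as np

def mix_distributions(distributions, weights):
    """Weighted average of pigment XRF distributions, normalised to sum to 1."""
    distributions = np.asarray(distributions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mixed = np.average(distributions, axis=0, weights=weights)
    return mixed / mixed.sum()

def sample_xrf(distr, num_of_counts, size, seed=None):
    """Draw num_of_counts events per pixel; output shape is size + (n_bins,)."""
    rng = np.random.default_rng(seed)
    return rng.multinomial(num_of_counts, distr, size=size)

# Two hypothetical pigment spectra over 4 energy bins (toy values).
pigment_a = [0.7, 0.1, 0.1, 0.1]
pigment_b = [0.1, 0.1, 0.1, 0.7]

# Hypothetical weights, e.g. derived from colour similarity.
distr = mix_distributions([pigment_a, pigment_b], weights=[3.0, 1.0])

# A 2 x 3 pixel patch with 400 counts per pixel.
xrf = sample_xrf(distr, num_of_counts=400, size=(2, 3), seed=0)
```

Each pixel of `xrf` is a histogram over the energy bins whose entries sum to `num_of_counts`, so the result has the rank-3 layout `(height, width, n_bins)` used for MA-XRF data throughout this README.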
Usage
To generate MA-XRF raw data out of an RGB image, you may call the XRFGenerator as
```python
# generator
from ganx.XRF_generator_classes import XRFGenerator
# utils
from ganx.RGB_segmentation_utils import RGBMethods

# file, subdir, pigments_dict_data and generation_threshold
# must be defined by the caller before this point

# open RGB
ex_rgb = RGBMethods.open_image(file_name=file, root_path_to_rgb=subdir)

# init XRF Generator class
xrf_generator = XRFGenerator(
    # PigmentDataBaseUtils
    pigments_dict_data = pigments_dict_data,
    # RGBKMeansClustering
    rgb_img = ex_rgb,
    N_start = 10,
    # PigmentDataBaseUtils
    E_int = [0.5, 38.5],
    rebin_size = 1024,
    # RGBKMeansClustering
    delta_N = 1,
    N_patience = -1,
    score_batch = 1024,
    # Threshold
    generation_threshold = generation_threshold
)

# generate XRF
generated_XRF = xrf_generator.generate_xrf()
```
In this example, `RGBMethods.open_image` was used to open the RGB image; users may open RGB images with any tool they prefer, since `XRFGenerator` only needs an `np.array` of shape `(height, width, 3|4)`.
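Since any loader can be used, it may help to check the array before handing it to the generator. The helper below is a hypothetical convenience, not part of ganX; it only verifies the `(height, width, 3|4)` layout mentioned above and, as an assumption about what a caller might want, returns the first three channels.

```python
import numpy as np

def as_rgb_array(img):
    """Validate an image array and return its RGB channels (alpha dropped)."""
    arr = np.asarray(img)
    if arr.ndim != 3 or arr.shape[2] not in (3, 4):
        raise ValueError(f"expected shape (height, width, 3|4), got {arr.shape}")
    return arr[..., :3]

# A blank 4 x 5 RGBA image standing in for a real picture.
rgba = np.zeros((4, 5, 4), dtype=np.uint8)
rgb = as_rgb_array(rgba)
```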
If users want to generate a set of MA-XRF out of a set of RGBs, they may use the function generate_all as
```python
from ganx import generate_all

rootdir = 'path/to/RGB/'
storedir = 'path/to/XRF/'
pigments_dict_data = 'path/to/pigments_dict.json'

generate_all(
    rootdir = rootdir,
    storedir = storedir,
    pigments_dict_data = pigments_dict_data,
)
```
Code Contents
The ganX project defines three main classes:
- MAXRF: A class defining a MA-XRF object.
- RGBKMeansClustering: Class for computing iterative KMeans.
- XRFGenerator : Class to perform the MA-XRF generations out of a RGB image.
These are based on, and use, other classes:
- XRFUtils: A class exposing static methods for manipulating XRF data.
- IterativeKMeans: Class for computing iterative KMeans Clustering.
- RGBMethods: Class offering static methods for plotting of RGB images, clustered or not.
- PigmentDataBaseUtils: Class to initialise and use the Pigment XRF - RGB database.
- Distances: Class providing static methods to compute distances in RGB space.
In depth description
MAXRF_class
File containing the Python classes for manipulating MA-XRF np.array data.
It contains the following classes:
1. XRFUtils
A class exposing static methods for manipulating XRF data
2. MAXRF(XRFUtils)
A class extending XRFUtils; it defines the MA-XRF object and provides a few methods for analysing MA-XRF data.
NB: work in progress; it currently has only the basic methods and arguments needed by XRF_generator_classes.
The class architecture is (From Parent to child):
XRFUtils
|
|
MAXRF
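As an illustration of the kind of manipulation XRFUtils provides, here is a minimal sketch of a strided-sum rebin along the energy axis and of an energy-to-index lookup, following the steps documented for rebin_ma_xrf and get_index_from_energy. The function names `rebin_energy_axis` and `index_from_energy` are hypothetical, and the nearest-bin lookup is an assumption about how the index is chosen; this is not the library's implementation.

```python
import numpy as np

def rebin_energy_axis(img, n_bins=500):
    """Sum groups of `divisor` energy bins of a rank-3 array using strided views."""
    img = np.asarray(img)
    if img.ndim != 3:
        raise ValueError("expected a rank-3 array (height, width, energy bins)")
    divisor = img.shape[2] // n_bins
    if divisor <= 1:
        # No rebinning needed: return the input unchanged.
        return img
    rebinned = np.zeros((img.shape[0], img.shape[1], n_bins), dtype=img.dtype)
    for step in range(divisor):
        # Strided view: bins step, step + divisor, step + 2*divisor, ...
        # truncated to n_bins so every step contributes the same shape.
        rebinned += img[:, :, step::divisor][:, :, :n_bins]
    return rebinned

def index_from_energy(en, energies):
    """Index of the bin in `energies` closest to `en` (assumed nearest-bin rule)."""
    return int(np.argmin(np.abs(np.asarray(energies) - en)))

# Toy MA-XRF array: 2 x 2 pixels, 8 energy bins, one count per bin.
toy = np.ones((2, 2, 8))
rebinned = rebin_energy_axis(toy, n_bins=4)

# Energy axis built as np.arange(E_i, E_f, delta_E), in keV.
energies = np.arange(0.5, 38.5, 1.0)
idx = index_from_energy(10.4, energies)
```

The loop runs only `divisor` times regardless of image size, which is why slicing is faster than iterating over output bins.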
XRFUtils
A class exposing static methods for manipulating XRF data.
Static Methods
- rebin_ma_xrf(img: np.array, n_bins: int = 500): Function to rebin a rank-3 MA-XRF np.array.
- get_index_from_energy(en: float, _x: np.array): Static method to get the index out of an energy arange.
- convolve_xrf(xrf: np.array, kernel: np.array = _default): Static method to spatially convolve a MA-XRF np.array (i.e., along axis = 0,1).
- open_file(path_to_file: str, key: str = 'img'): Method to open a .h5 or .npz file and initialise a MA-XRF np.array.

XRFUtils.rebin_ma_xrf(img: np.array, n_bins: int = 500) -> np.array
Function to rebin a rank-3 MA-XRF np.array. It relies on numpy slicing as much as possible to speed up the rebinning.
How it works:
1. Compute the divisor, i.e. the integer division of the original number of bins by the wanted number of bins.
2. If divisor > 1, i.e. rebinning is needed, proceed; otherwise, return the original.
3. Do the rebinning:
   i. Initialise the rebinned tensor with np.zeros( shape = [img.shape[0], img.shape[1], n_bins] ).
   ii. Iterate over range(divisor) = [0, 1, ..., divisor-1]:
      a. at each step, get the view of the original MA-XRF keeping only the bins step, step + 1*divisor, step + 2*divisor, ...
      b. add it to the rebinned tensor.
   iii. Return the rebinned tensor.
Args:
- img (np.array): XRF rank-3 tensor.
- n_bins (int, optional): Wanted number of energy bins in output. Defaults to 500.
Raises:
- Exception: Raised if the XRF has no valid shape, i.e. is not a rank-3 tensor.
Returns:
- np.array: Rebinned MA-XRF.

XRFUtils.get_index_from_energy(en: float, _x: np.array) -> int
Static method to get the index out of an energy arange.
Args:
- en (float): Energy value (in keV) whose index is to be extracted.
- _x (np.array): Energy np.arange ndarray representing the energy region, _x = np.arange(E_i, E_f, delta_E).
Returns:
- int: Index of en in _x.

XRFUtils.convolve_xrf(xrf: np.array, kernel: np.array = np.array([[1,2,1],[2,4,2],[1,2,1]])) -> np.array
Static method to spatially convolve a MA-XRF np.array (i.e., along axis = 0,1).
Args:
- xrf (np.array): Input MA-XRF np.array.
- kernel (np.array, optional): 2D kernel used for the 2D convolution. Defaults to np.array( [ [1,2,1], [2,4,2], [1,2,1] ] ).
Returns:
- np.array: Convolved MA-XRF np.array.

XRFUtils.open_file(path_to_file: str, key: str = 'img')
Method to open a .h5 or .npz file and initialise a MA-XRF np.array.
Args:
- path_to_file (str): Path to the MA-XRF HDF5 or npz file.
- key (str, optional): Dataset key. Defaults to 'img'. NB: the standard LABEC HDF5 file (or NPZ file) is a dataset with metadata and data. MA-XRF data are stored as a rank-3 tensor under the 'img' name.
Raises:
- Exception: If os.path.isfile returns false, i.e. no file is found.
- Exception: If the extension is neither .h5 nor .npz.
Returns:
- np.array: Loaded np.array.
MAXRF(XRFUtils)
A class defining a MA-XRF object. It extends XRFUtils, adding internal args (the MA-XRF) and methods to analyse the MA-XRF.
Attributes
- img (np.array): XRF rank-3 tensor.
- n_bins (int, optional): Wanted number of energy bins in output. Defaults to 1024.
Additional Args
- XRFLines (None | list): Dictionary of the XRF Lines. Note: it has to have the form of list(dict), where each list item has to be
    {
        "element" : (str)    # element line name - Siegbahn notation
        "value" : (float)    # element line value (keV)
    }
- _E_int (None | list): List list(float) of the energy interval, with len(_E_int) = 2. Note: _E_int = [E_i, E_f].
- _delta_E (None | float): Bin size in energy.
- _E_range_x_axis (None | np.array): np.array describing the energy range; np.arange(_E_int[0], _E_int[1], _delta_E).
Methods
- init_XRFLines(self, path_to_json: str): Method to open the JSON file containing the XRFLines and set the self.XRFLines arg.
- init_calibration_data(self, E_min: float, E_max: float): Method to initialise the calibration data _E_int, _delta_E, _E_range_x_axis.
- get_X_line_image(self, el: str, delta_Energy_plot: float = 0.5): Method to compute the integrated image of a selected XRF element line, e.g. Pb (La).
Static methods:
- get_key_from_value(mydict: dict, value): Static method to extract a key from a value.
- get_element_name_from_value(XRFvalues: list, value: float): Utils for extracting the element name from the element value in XRFLines lines(dict).
- Utils for extracting the element value from the element name.
RGB_segmentation_utils
File containing the Python classes for performing iterative KMeans clustering on an RGB image.
It contains the following classes:
1. IterativeKMeans
Class for computing iterative KMeans.
2. RGBMethods
Class offering static methods for plotting of RGB images, clustered or not.
3. RGBKMeansClustering(RGBMethods, IterativeKMeans)
Class for computing iterative KMeans.
More details on the classes are reported in the classes docstrings.
The class architecture is (From Parent to child):
1 2
\ /
3
or
IterativeKMeans RGBMethods
\ /
RGBKMeansClustering
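The iteration over the number of clusters that IterativeKMeans performs can be pictured with the following pure-Python sketch. It is only an assumption-laden outline: `search_n_clusters` and `score_fn` are hypothetical names, the scan is assumed to include both ends of the range, and the early-stopping rule is one plausible reading of the patience parameter. In practice `score_fn` would fit a K-means model with `n` clusters and return its Silhouette score.

```python
def search_n_clusters(score_fn, n_start, delta_n=3, patience=-1):
    """Scan n_clusters around n_start and return (best_n, scores).

    score_fn(n) -> float is a hypothetical callable; higher is better.
    patience <= 0 disables early stopping.
    """
    # Lower bound is clamped to 2, since a score needs at least 2 clusters.
    n_min = max(2, n_start - delta_n)
    n_max = n_start + delta_n
    best_n, best_score = None, float("-inf")
    scores, stale = [], 0
    for n in range(n_min, n_max + 1):
        score = score_fn(n)
        scores.append(score)
        if score > best_score:
            best_n, best_score, stale = n, score, 0
        else:
            stale += 1
            # Stop after `patience` consecutive steps without improvement.
            if patience > 0 and stale >= patience:
                break
    return best_n, scores

# Toy score peaking at 5 clusters, standing in for a Silhouette score.
toy_score = lambda n: -(n - 5) ** 2

best_full, scores_full = search_n_clusters(toy_score, n_start=4, delta_n=3)
best_early, scores_early = search_n_clusters(toy_score, n_start=4, delta_n=3, patience=1)
```

With the toy score, both calls select 5 clusters; the second one stops one step after the peak instead of scanning the whole range.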
IterativeKMeans
Class for computing iterative KMeans clustering. The iteration is performed over the number of clusters. The performance of each iteration is computed using the Silhouette score.
Attributes
Init values:
- _N_start (int): Central value for the iteration, i.e. the central number of clusters.
- _delta_N (int): (Optional; default = 3) Delta value; the iteration is performed from _N_start - _delta_N to _N_start + _delta_N.
- _N_patience (int): (Optional; default = -1) Patience, in iteration steps, before breaking the iteration. If there is no improvement after _N_patience epochs, the cycle is broken.
- score_batch (int): (Optional; default = 1024) Batch value used in MiniBatchKMeans to speed up the process. If a value < 0 is given, it is set to +inf; in this case, MiniBatchKMeans becomes a standard KMeans.
Additional attributes:
Internal params:
- _N_min (int): Minimal n_cluster parameter used in the iteration; it is computed as self._N_start - self._delta_N if self._N_start - self._delta_N > 2 else 2.
- _N_max (int): Maximal n_cluster parameter used in the iteration.
- _X (None | np.array): Internal input array X used in fit & prediction.
Results:
- _segmented (None | np.array): Result of the training process.
- _idx_best (None | int): Iteration index of the best result.
- _best_KMeans (None | MiniBatchKMeans): Best performing MiniBatchKMeans.
- __scores (None | list): List of each epoch's score.
Methods
Extension of sklearn KMeans:
- fit(X: np.array) -> None: Compute KMeans fit.
- fit_predict(X: np.array) -> None | np.array: Compute KMeans fit_predict and return the prediction.
- predict(X: np.array, use_best: bool = True) -> None | np.array: Compute KMeans predict.
Custom methods:
- set_X(X: np.array) -> None: Set X.
- compute_score(X: np.array) -> float: Compute the Silhouette score.
- iter_step(n_clusters: int) -> None: Method to perform a single iteration step.
- cluster_train(X: np.array) -> None: Iteration method.
- cluster_train_predict(X: np.array) -> None | np.array: Perform both iteration and prediction.
Visualization utils:
- show_training_stats(_figsize: tuple = (12, 8), axis_fontsize: int = 15, title_fontsize: int = 18) -> None: Method to plot the training history.
RGBMethods
Class offering static methods for plotting of RGB images, clustered or not.
Static Methods:
- open_image(file_name: str, root_path_to_rgb: str = './Synthetic_data/RGB/'): Open the image with filename file_name located in the root path root_path_to_rgb.
- show_image(_img: np.array, _figsize: tuple = (12, 8)): Method to plot an RGB image.
- return_label_image(segmented: np.array, cluster_idx: int): Method to get the greyscale cluster image with index cluster_idx from segmented.
- show_label_image(segmented: np.array, cluster_idx: int): Method to plot the greyscale cluster image with index cluster_idx from segmented.
- return_single_rgb_cluster(_img: np.array, segmented: np.array, cluster_idx: int): Method to get a single cluster in RGB space.
- show_single_rgb_cluster(_img: np.array, segmented: np.array, cluster_idx: int): Method to plot a single cluster in RGB space.
RGBKMeansClustering
Class for computing iterative KMeans. The iteration is performed over the number of clusters. The performance of each iteration is computed using the Silhouette score.
Attributes
IterativeKMeans __init__() Attributes:
- _N_start (int): Central value for the iteration, i.e. the central number of clusters.
- _delta_N (int): (Optional; default = 3) Delta value; the iteration is performed from _N_start - _delta_N to _N_start + _delta_N.
- _N_patience (int): (Optional; default = -1) Patience, in iteration steps, before breaking the iteration. If there is no improvement after _N_patience epochs, the cycle is broken.
- score_batch (int): (Optional; default = 1024) Batch value used in MiniBatchKMeans to speed up the process. If a value < 0 is given, it is set to +inf; in this case, MiniBatchKMeans becomes a standard KMeans.
New Attributes:
Init Params
- rgb_img (np.array): RGB image to be clustered.
Internal params:
- _rgb_shape (tuple): Shape of the RGB image to be clustered.
Result params:
- segmented_iter (None | np.array): Result of the IterativeKMeans iteration.
- reshaped_segmented_iter (None | np.array): Its reshaped version.
- clustered_rgb (None | np.array): Clustered RGB.
- list_of_rgbs (None | np.array): List of single clusters in RGB space.
Methods
Class methods:
- cluster_rgb() -> np.array: Main method. It performs the whole pipeline, returning the clustered RGB image.
- compute_clustered_rgb_from_segmented() -> None: Method to compute the clustered RGB out of the segmented tensor.
Viz Methods:
- show_average_rgb_clusters(_figsize: tuple = (12,8), _title_fontsize: int = 18): Method to plot the computed average RGB clusters.
- confront_clustered_with_unclustered(plot_diff: bool = True, _figsize: tuple = (12,8), _title_fontsize: int = 18): Method to compare the original RGB with the clustered one.
Static methods:
- plot_grey_scale_confront(A: np.array, B: np.array, _figsize: tuple = (12,8), _title_fontsize: int = 18): Method to compare the original RGB with the clustered one in greyscale.
XRF_generator_classes
File containing the Python classes for generating a synthetic MA-XRF .h5 file starting from an RGB.
It imports
A. RGBKMeansClustering ( from RGB_segmentation_utils )
B. XRFUtils ( from MAXRF_class )
It contains the following classes:
1. PigmentDataBaseUtils(XRFUtils)
2. Distances
3. XRFGenerator
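To give a feel for how these three classes cooperate, the sketch below selects the pigments similar enough to a cluster colour and turns their similarities into mixing weights. Everything here is an assumption for illustration: `select_pigments`, the toy pigment names and the normalisation rule are hypothetical, and plain cosine similarity in RGB is swapped in for the CIELAB-based similarities that the library's Distances class also offers.

```python
import numpy as np

def cosine_similarity(x, y):
    """Cosine of the angle between two colour vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def select_pigments(rgb, pigments, threshold=0.2):
    """Return {name: weight} for pigments with similarity >= threshold.

    Weights are the similarities rescaled to sum to 1 (an assumed rule).
    """
    sims = {name: cosine_similarity(rgb, data["RGB"]) for name, data in pigments.items()}
    kept = {name: s for name, s in sims.items() if s >= threshold}
    total = sum(kept.values())
    return {name: s / total for name, s in kept.items()} if kept else {}

# Toy database using the "RGB" key of the pigment dictionary format;
# the pigment names are hypothetical and the "xrf" entries are omitted.
pigments = {
    "white_like": {"RGB": [240, 235, 229]},
    "blue_like": {"RGB": [0, 0, 255]},
}

cluster_rgb = [250, 240, 230]
strict = select_pigments(cluster_rgb, pigments, threshold=0.9)
loose = select_pigments(cluster_rgb, pigments, threshold=0.2)
```

A high threshold keeps only the closest pigment, while a low one lets several pigments contribute to the mixed distribution.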
PigmentDataBaseUtils(XRFUtils)
Class to initialise and use the Pigment XRF - RGB database. To be initialised, it needs the path to a JSON file containing the database as a nested dict. The JSON dict must have the form
    {
        "pigment_name" : {
            "xrf" : path_to_xrf_h5_file,
            "RGB" : RGB color as list
        },
    }
e.g.
    {
        ...
        "LeadWithe" : {
            "xrf": "./infraart_db/XRFSpectrum/MetallicLead.h5",
            "RGB": [240, 235, 229]
        }
    }
It extends XRFUtils.
Attributes
- pigments_dict_data (dict): Nested dictionary containing the RGB and XRF data of all pigments in the DB.
Additional Attributes
- E_int (list, optional): Energy interval (in keV). Defaults to [0.5, 38.5].
- rebin_size (int, optional): Final size of the XRF histogram (in bins). Defaults to 1024.
Methods
- set_pigments_dict_data(self, pigments_dict_data: dict): Setter method for the pigments_dict_data attribute.
Static Methods
- open_pigment_dict_json(path_to_pigments_dict_json: str = './utils/pigments_dict.json') -> dict: Open the JSON file, parse it and create the nested dict object.
- get_distr_from_infraart_h5(path_to_infraart_h5: str, E: list = [0.5, 38.5], rebin_size: int = 500): Static method to get a distribution from an h5 file.
Distances
Class providing static methods to compute distances in RGB space.
Static Methods
- cosine_similarity(x, y): returns the cosine similarity.
- rgb2lab(rgb): returns the CIELAB image.
- CIEdelta1994_similarity(rgb1, rgb2): returns the similarity using the CIEdelta1994 distance.
- CIEdelta2000_similarity(rgb1, rgb2): returns the similarity using the CIEdelta2000 distance.
XRFGenerator
Class to perform the MA-XRF generation out of an RGB image. It generates the MA-XRF np.array by randomly extracting a certain number of counts, pixel by pixel, from an XRF signal probability distribution obtained, pixel by pixel, by similarity with the pigments in a passed database. The RGB image is first segmented to reduce the number of RGB triples, and thus the noise.
Attributes
- _distances (Distances): Distances class instance. It is used to compute distances in colour space between the RGB cluster and the RGB in the database.
- _pigmentDataBaseUtils (PigmentDataBaseUtils): PigmentDataBaseUtils class instance to handle the database.
- _rgbKMeansClustering (RGBKMeansClustering): RGBKMeansClustering class instance to cluster the RGB image.
- _num_of_counts (int): Final XRF histogram number of pixel counts. Defaults to 400.
- _lambda (int): XRF pixel noise lambda (to be used).
- generation_threshold (float): Threshold for the distance in XRF generation. It has to be in the (0, 1) range, where 0 selects every pigment and 1 possibly none. Defaults to 0.2.
- _list_of_rgbs (np.array | None): Result of the iterative K-means on RGB; list of RGB clusters.
- _clustered_rgb (np.array | None): Result of the iterative K-means on RGB; clustered RGB.
- _reshaped_segmented_iter (np.array | None): Result of the iterative K-means on RGB; list of cluster masks.
- _generated_XRF (np.array): Generated MA-XRF np.array.
Methods
- do_cluster(): Performs the whole RGB clustering process.
- generate_xrf() -> np.array: Method to generate the MA-XRF out of an RGB image.
- get_distribution_from_rgb(rgb: np.array, pigments_dict: dict, threshold: float = 0.2, debug: bool = False, use_cie_similarity: bool = True) -> np.array: Method to get a synthetic XRF histogram out of an RGB colour.
Static Methods
- get_xrf_distr_2D(num_of_counts: int, distr: np.array, size: tuple) -> np.array: Static method to randomly generate a fake MA-XRF rank-3 tensor out of a unitary distribution.
- bincount2d(arr: np.array, bins=None) -> np.array: Static method to compute a 2D bincount.
Owner
- Login: androbomb
- Kind: user
- Repositories: 2
- Profile: https://github.com/androbomb
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Bombini"
    given-names: "Alessandro"
    orcid: "https://orcid.org/0000-0001-7225-3355"
title: "ganX - generate artificially new XRF"
version: 0.0.1
date-released: 2023-01-12
url: "https://github.com/androbomb/ganX"
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 11
- Total Committers: 1
- Avg Commits per committer: 11.0
- Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|---|---|---|
| androbomb | a****i@g****m | 11 |
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0