https://github.com/ashdehghan/neext

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.6%) to scientific vocabulary

Last synced: 10 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: ashdehghan
License: mit
Language: Jupyter Notebook
Default Branch: main
Size: 48.2 MB

Statistics

Stars: 3
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 5

Created over 2 years ago · Last pushed 10 months ago

Metadata Files

Readme Changelog License

NEExT: Network Embedding Experimentation Toolkit

NEExT is a powerful Python framework for graph analysis, embedding computation, and machine learning on graph-structured data. It provides a unified interface for working with different graph backends (NetworkX and iGraph), computing node features, generating graph embeddings, and training machine learning models.

📚 Documentation

Detailed documentation is available in the docs directory. Build it locally or visit the online documentation at NEExT Documentation.

🌟 Features

Flexible Graph Handling
- Support for both NetworkX and iGraph backends
- Automatic graph reindexing and largest component filtering
- Node sampling capabilities for large graphs
- Rich attribute support for nodes and edges

Comprehensive Node Features
- PageRank
- Degree Centrality
- Closeness Centrality
- Betweenness Centrality
- Eigenvector Centrality
- Clustering Coefficient
- Local Efficiency
- LSME (Local Structural Motif Embeddings)

Graph Embeddings
- Approximate Wasserstein
- Exact Wasserstein
- Sinkhorn Vectorizer
- Customizable embedding dimensions

Machine Learning Integration
- Classification and regression support
- Dataset balancing options
- Cross-validation with customizable splits
- Feature importance analysis
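
To make "graph reindexing and largest component filtering" concrete, here is a minimal pure-Python sketch of the idea. It is not NEExT's implementation: the function names `largest_component` and `filter_and_reindex` are hypothetical, the graph is assumed undirected, and nodes without any edge are ignored because the sketch only sees an edge list.

```python
from collections import defaultdict, deque


def largest_component(edges):
    """Return the node set of the largest connected component (undirected)."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)

    seen, best = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects one component at a time.
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def filter_and_reindex(edges):
    """Keep edges inside the largest component and relabel nodes 0..n-1."""
    keep = largest_component(edges)
    new_id = {old: new for new, old in enumerate(sorted(keep))}
    return [(new_id[s], new_id[d]) for s, d in edges if s in keep and d in keep]


edges = [(10, 20), (20, 30), (30, 10), (40, 50)]
print(filter_and_reindex(edges))  # [(0, 1), (1, 2), (2, 0)]
```

The two-node component (40, 50) is dropped, and the surviving node ids 10, 20, 30 become 0, 1, 2.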

Custom Node Feature Functions

NEExT allows you to define and compute your own custom node feature functions alongside the built-in ones. This provides great flexibility for experimenting with novel graph metrics.

Defining a Custom Feature Function:

Your custom feature function must adhere to the following structure:

Input: It must accept a single argument, which will be a graph object. This object provides access to the graph's structure (nodes, edges) and properties (e.g., graph.nodes, graph.graph_id, graph.G which is the underlying NetworkX or iGraph object).
Output: It must return a pandas.DataFrame with the following specific columns in order:
- "node_id": Identifiers for the nodes for which features are computed.
- "graph_id": The identifier of the graph to which these nodes belong.
- One or more feature columns: These columns should contain the computed feature values. The naming convention for these columns should ideally follow the pattern your_feature_name_0, your_feature_name_1, etc., if your feature has multiple components or is expanded over hops (though a single feature column like your_feature_name is also acceptable).
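
The column layout above can be checked before a custom function returns. The helper below is a hypothetical sketch, not part of NEExT: `check_feature_columns` is an invented name, and it works on a plain list of column names (for example `list(df.columns)`) so it runs without pandas.

```python
def check_feature_columns(columns):
    """Raise ValueError if `columns` does not follow the required layout.

    Expected layout: "node_id", "graph_id", then one or more feature columns.
    Returns the list of feature column names when the layout is valid.
    """
    columns = list(columns)
    if columns[:2] != ["node_id", "graph_id"]:
        raise ValueError(
            f"first two columns must be node_id, graph_id; got {columns[:2]}"
        )
    feature_columns = columns[2:]
    if not feature_columns:
        raise ValueError("at least one feature column is required")
    return feature_columns


print(check_feature_columns(["node_id", "graph_id", "degree_sq_0"]))  # ['degree_sq_0']
```

Inside a custom feature function, calling `check_feature_columns(df.columns)` just before `return df` would surface an ordering mistake early.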

Example:

Here's how you can define a simple custom feature function and use it:

```python
import networkx as nx
import pandas as pd

# 1. Define your custom feature function
# This function must be defined at the top level of your script/module
# if you plan to use multiprocessing (n_jobs != 1).
def my_node_degree_squared(graph):
    nodes = list(graph.nodes)  # or range(graph.G.vcount()) for igraph if nodes are 0-indexed
    graph_id = graph.graph_id

    if hasattr(graph.G, 'degree'):  # Handles both NetworkX and iGraph
        if isinstance(graph.G, nx.Graph):  # NetworkX
            degrees = [graph.G.degree(n) for n in nodes]
        else:  # iGraph
            degrees = graph.G.degree(nodes)
    else:
        raise TypeError("Graph object does not have a degree method.")

    degree_squared_values = [d**2 for d in degrees]

    df = pd.DataFrame({
        'node_id': nodes,
        'graph_id': graph_id,
        'degree_sq_0': degree_squared_values
    })
    # Ensure the correct column order
    return df[['node_id', 'graph_id', 'degree_sq_0']]

# 2. Prepare the list of custom feature methods
my_feature_methods = [
    {"feature_name": "my_degree_squared", "feature_function": my_node_degree_squared}
]

# 3. Pass it to compute_node_features
# Initialize NEExT and load your graph_collection as shown in the Quick Start
# nxt = NEExT()
# graph_collection = nxt.read_from_csv(...)

features = nxt.compute_node_features(
    graph_collection=graph_collection,
    feature_list=["page_rank", "my_degree_squared"],  # Include your custom feature name
    feature_vector_length=3,  # Applies to built-in features that use it
    my_feature_methods=my_feature_methods
)

print(features.features_df.head())
```

When you include "my_degree_squared" in the feature_list and provide my_feature_methods, NEExT will automatically register and compute your custom function. If "all" is in feature_list, your custom registered function will also be included in the computation.
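
The example notes that the function must live at the top level of a module when `n_jobs != 1`. A likely reason, stated here as an assumption rather than a documented fact about NEExT, is that worker processes receive the function via `pickle`, which serializes functions by importable name. The sketch below uses only the standard library; `is_picklable` and `make_nested_feature` are hypothetical names for illustration.

```python
import pickle


def is_picklable(func):
    """Return True if `func` can be serialized with pickle."""
    try:
        pickle.dumps(func)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


def make_nested_feature():
    # A function defined inside another function has no importable name.
    def nested_feature(graph):
        return graph
    return nested_feature


print(is_picklable(len))                    # True: found by name in its module
print(is_picklable(lambda graph: graph))    # False: lambdas have no importable name
print(is_picklable(make_nested_feature()))  # False: local functions cannot be looked up
```

Running `is_picklable(my_node_degree_squared)` in your own script is a quick check before enabling multiple jobs.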

📦 Installation

Basic Installation

```bash
pip install NEExT
```

Development Installation

```bash
# Clone the repository
git clone https://github.com/ashdehghan/NEExT.git
cd NEExT

# Install with development dependencies
pip install -e ".[dev]"
```

Additional Components

```bash
# For running tests
pip install -e ".[test]"

# For building documentation
pip install -e ".[docs]"

# For running experiments
pip install -e ".[experiments]"

# Install all components
pip install -e ".[dev,test,docs,experiments]"
```

🚀 Quick Start

Basic Usage

```python
from NEExT import NEExT

# Initialize the framework
nxt = NEExT()
nxt.set_log_level("INFO")

# Load graph data
graph_collection = nxt.read_from_csv(
    edges_path="edges.csv",
    node_graph_mapping_path="node_graph_mapping.csv",
    graph_label_path="graph_labels.csv",
    reindex_nodes=True,
    filter_largest_component=True,
    graph_type="igraph"
)

# Compute node features
features = nxt.compute_node_features(
    graph_collection=graph_collection,
    feature_list=["all"],
    feature_vector_length=3
)

# Compute graph embeddings
embeddings = nxt.compute_graph_embeddings(
    graph_collection=graph_collection,
    features=features,
    embedding_algorithm="approx_wasserstein",
    embedding_dimension=3
)

# Train a classifier
model_results = nxt.train_ml_model(
    graph_collection=graph_collection,
    embeddings=embeddings,
    model_type="classifier",
    sample_size=50
)
```

Working with Large Graphs

NEExT supports node sampling for handling large graphs:

```python
# Load graphs with 70% of nodes
graph_collection = nxt.read_from_csv(
    edges_path="edges.csv",
    node_graph_mapping_path="node_graph_mapping.csv",
    node_sample_rate=0.7  # Use 70% of nodes
)
```
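
As a rough illustration of what sampling 70% of the nodes means for a graph, the sketch below keeps a random subset of nodes and only those edges whose two endpoints both survive. This is an assumption about the general technique, not a description of how NEExT samples internally; `sample_nodes` and its `seed` parameter are hypothetical.

```python
import random


def sample_nodes(edges, sample_rate, seed=0):
    """Keep a random fraction of nodes and the edges between kept nodes."""
    nodes = sorted({node for edge in edges for node in edge})
    keep_count = round(sample_rate * len(nodes))
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    kept = set(rng.sample(nodes, keep_count))
    # An edge survives only if both of its endpoints were sampled.
    kept_edges = [(s, d) for s, d in edges if s in kept and d in kept]
    return kept, kept_edges


edges = [(i, i + 1) for i in range(9)]  # path graph on 10 nodes
kept, kept_edges = sample_nodes(edges, sample_rate=0.7)
print(len(kept))  # 7
print(kept_edges)
```

Because removing nodes also removes their incident edges, a sampled graph can split into several components, which is one reason sampling and largest-component filtering interact.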

Feature Importance Analysis

```python
# Compute feature importance
importance_df = nxt.compute_feature_importance(
    graph_collection=graph_collection,
    features=features,
    feature_importance_algorithm="supervised_fast",
    embedding_algorithm="approx_wasserstein"
)
```

📊 Experiments

NEExT includes several pre-built experiments in the examples/experiments directory:

Node Sampling Experiment

Investigates the effect of node sampling on classifier accuracy:

```bash
cd examples/experiments
python node_sampling_experiments.py
```

📝 Input File Formats

edges.csv

```csv
src_node_id,dest_node_id
0,1
1,2
...
```

node_graph_mapping.csv

```csv
node_id,graph_id
0,1
1,1
2,2
...
```

graph_labels.csv

```csv
graph_id,graph_label
1,0
2,1
...
```
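
A small, self-contained way to produce files in this three-file layout is sketched below using only the standard library. The toy data, the helper names `write_toy_dataset` and `unmapped_nodes`, and the spelling `node_graph_mapping.csv` for the mapping file are assumptions made for illustration.

```python
import csv
import tempfile
from pathlib import Path


def write_toy_dataset(directory):
    """Write two tiny graphs using the column headers described above."""
    directory = Path(directory)
    tables = {
        "edges.csv": [("src_node_id", "dest_node_id"), (0, 1), (1, 2), (3, 4)],
        "node_graph_mapping.csv": [
            ("node_id", "graph_id"), (0, 1), (1, 1), (2, 1), (3, 2), (4, 2),
        ],
        "graph_labels.csv": [("graph_id", "graph_label"), (1, 0), (2, 1)],
    }
    for name, rows in tables.items():
        with open(directory / name, "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
    return sorted(tables)


def unmapped_nodes(directory):
    """Return node ids used in edges.csv but missing from the mapping file."""
    directory = Path(directory)
    with open(directory / "node_graph_mapping.csv", newline="") as handle:
        mapped = {row["node_id"] for row in csv.DictReader(handle)}
    with open(directory / "edges.csv", newline="") as handle:
        used = {
            row[column]
            for row in csv.DictReader(handle)
            for column in ("src_node_id", "dest_node_id")
        }
    return used - mapped


with tempfile.TemporaryDirectory() as tmp:
    print(write_toy_dataset(tmp))
    print(unmapped_nodes(tmp))  # set()
```

Checking for unmapped nodes before loading is a cheap sanity test: every node referenced by an edge should be assigned to a graph.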

🛠️ Development

Running Tests

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=NEExT

# Run specific test file
pytest tests/test_node_sampling.py
```

Building Documentation

```bash
cd docs
make html
```

Code Style

The project uses several tools for code quality:

```bash
# Format code
black .

# Sort imports
isort .

# Check style
flake8 .

# Type checking
mypy .
```

🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests
5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👥 Authors

Ash Dehghan - ash.dehghan@gmail.com

🙏 Acknowledgments

- NetworkX team for the graph algorithms
- iGraph team for the efficient graph operations
- Scikit-learn team for machine learning components

📧 Contact

For questions and support:
- Email: ash@anomalypoint.com
- GitHub Issues: NEExT Issues

🔄 Version History

0.1.0
- Initial release
- Basic graph operations
- Node feature computation
- Graph embeddings
- Machine learning integration

Owner

Name: ashdehghan
Login: ashdehghan
Kind: organization

Repositories: 1
Profile: https://github.com/ashdehghan

GitHub Events

Total

Create event: 7
Issues event: 1
Release event: 1
Watch event: 1
Delete event: 1
Push event: 65
Pull request review event: 1
Pull request review comment event: 1
Pull request event: 8
Fork event: 1

Last Year

Create event: 7
Issues event: 1
Release event: 1
Watch event: 1
Delete event: 1
Push event: 65
Pull request review event: 1
Pull request review comment event: 1
Pull request event: 8
Fork event: 1

Packages

Total packages: 1
Total downloads:
- pypi 35 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 11
Total maintainers: 3

pypi.org: neext

Network Embedding Experimentation Toolkit - A powerful framework for graph analysis, embedding computation, and machine learning on graph-structured data

Homepage: https://github.com/ashdehghan/NEExT
Documentation: https://neext.readthedocs.io
License: MIT
Latest release: 0.2.10
published 12 months ago

Versions: 11
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 35 Last month

Rankings

Dependent packages count: 10.1%

Average: 38.2%

Dependent repos count: 66.4%

Maintainers (3)

elmspace LourensT CptQuak

Last synced: 10 months ago

Dependencies

.github/workflows/python-publish.yml actions

actions/checkout v3 composite
actions/setup-python v3 composite
pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite

setup.py pypi

arrow ==1.2.3
igraph ==0.11.3
jupyter ==1.0.0
karateclub ==1.2.2
matplotlib ==3.7.2
networkx ==2.8.8
node2vec ==0.4.6
numpy ==1.25.2
pandas ==2.0.3
plotly ==5.18.0
scikit-learn ==1.3.0
scipy ==1.11.2
tqdm ==4.65.0
umap-learn ==0.5.4
vectorizers ==0.2
xgboost ==2.0.2
