Structural features
NEExT describes every node by a vector of structural features — graph-theoretic metrics computed from the graph’s topology. These per-node features are the raw material for graph embeddings.

features = nxt.compute_node_features(
    graph_collection=graphs,
    feature_list=["all"],        # or a subset of the names below
    feature_vector_length=3,     # k-hop aggregation depth
    normalize_features=True,
    n_jobs=1,
)

The 11 built-in features
Pass any subset by name, or ["all"] for every one:
| Name | Description |
|---|---|
| page_rank | PageRank: influence from the link structure |
| degree_centrality | Fraction of nodes a node connects to |
| closeness_centrality | Inverse mean shortest-path distance to all nodes |
| betweenness_centrality | Fraction of shortest paths passing through the node |
| eigenvector_centrality | Influence weighted by neighbors’ influence |
| clustering_coefficient | How tightly a node’s neighbors interconnect |
| local_efficiency | Efficiency of information flow in the node’s neighborhood |
| lsme | Local Spectral Method Embedding: local connectivity signature |
| load_centrality | Shortest-path load through the node |
| basic_expansion | Neighborhood expansion structure |
| betastar | Community-aware node metric (βstar) |

["all"] expands to all eleven features. You can also mix "all" with custom feature names in the same feature_list.
The community-aware betastar feature is based on Kamiński, Prałat, Théberge, and Zając,
“Predicting Properties of Nodes via Community-Aware Features” (Social Network Analysis
and Mining 14(1), 2024) — arXiv:2311.04730,
doi:10.1007/s13278-024-01281-2.
k-hop neighborhood aggregation
feature_vector_length controls how far each feature reaches. With the default of 3,
every feature is computed for the node itself and aggregated over its k-hop neighborhood,
producing a short vector per feature that captures multi-scale structure. Larger values
capture wider context at higher cost.
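
The idea of a per-hop vector can be sketched in plain Python. This is an illustration only, not NEExT's implementation: the helper names (hop_distances, khop_feature_vector), the choice of a mean as the aggregate, and the convention that a length of 3 yields entries for hops 0, 1, and 2 are all assumptions made for the example.

```python
from collections import deque
from statistics import mean


def hop_distances(adjacency, source, max_hops):
    """Breadth-first search: map each node within max_hops of source to its hop count."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the requested depth
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def khop_feature_vector(adjacency, values, node, length):
    """Hypothetical aggregation: entry k is the mean of `values` over nodes exactly k hops away."""
    dist = hop_distances(adjacency, node, length - 1)
    vector = []
    for k in range(length):
        ring = [values[n] for n, d in dist.items() if d == k]
        # Assumption: an empty ring (no nodes at that distance) contributes 0.0.
        vector.append(mean(ring) if ring else 0.0)
    return vector


# Path graph 0 - 1 - 2 - 3 - 4, with node degree as the base feature.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
degree = {n: len(nbrs) for n, nbrs in adjacency.items()}

end_node = khop_feature_vector(adjacency, degree, node=0, length=3)
middle_node = khop_feature_vector(adjacency, degree, node=2, length=3)
print(end_node, middle_node)
```

In this toy graph the end node and the middle node get different vectors even though some individual entries coincide, which is the multi-scale signal the per-hop layout is meant to carry. Each extra hop needs a deeper search from every node, which is where the added cost of larger values comes from.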
Normalization
With normalize_features=True (default), features are scaled across all nodes. You can
also normalize a Features object directly with a chosen scaler:

features.normalize(type="StandardScaler")  # or "MinMaxScaler", "RobustScaler"

Parallelism
Feature computation parallelizes across graphs with joblib:
- n_jobs: number of parallel workers (default 1).
- parallel_backend: "loky" (default, process-based; serializes notebook-defined functions) or "threading".
- joblib_kwargs: advanced options forwarded to joblib.Parallel (you may not pass NEExT-owned keys like n_jobs or backend here).
- profile_features: log per graph-feature timing at INFO level.
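
The per-graph fan-out can be illustrated with a small standalone sketch. It deliberately uses the standard library's concurrent.futures rather than joblib, so it is a stand-in and not how NEExT dispatches work: ThreadPoolExecutor plays the role of the "threading" backend, max_workers plays the role of n_jobs, and edge_count and compute_all are hypothetical names invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def edge_count(adjacency):
    """Toy per-graph computation: number of undirected edges."""
    return sum(len(nbrs) for nbrs in adjacency.values()) // 2


def compute_all(graphs, n_jobs=1):
    """Run edge_count on every graph, serially or across n_jobs worker threads."""
    if n_jobs == 1:
        return [edge_count(g) for g in graphs]
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # Executor.map yields results in input order, so output i belongs to graph i.
        return list(pool.map(edge_count, graphs))


graphs = [
    {0: [1], 1: [0]},                          # single edge
    {0: [1, 2], 1: [0, 2], 2: [0, 1]},         # triangle
    {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]},    # path with three edges
]

serial = compute_all(graphs, n_jobs=1)
parallel = compute_all(graphs, n_jobs=2)
print(serial, parallel)
```

Because each graph is processed independently, the serial and parallel runs return the same list. Thread workers share memory and need no serialization, but pure-Python CPU-bound work gains little from threads under CPython's global interpreter lock; process-based workers avoid that limit at the price of serializing functions and data.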
Custom features
Register your own metric with my_feature_methods. Your function receives a graph and
must return a DataFrame with node_id, graph_id, then one column per feature dimension
(named <feature_name>_0, <feature_name>_1, …).

import pandas as pd

def degree_squared(graph):
    G = graph.G
    return pd.DataFrame({
        "node_id": graph.nodes,
        "graph_id": graph.graph_id,
        "degree_squared_0": [G.degree(n) ** 2 for n in graph.nodes],
    })[["node_id", "graph_id", "degree_squared_0"]]

features = nxt.compute_node_features(
    graph_collection=graphs,
    feature_list=["page_rank", "degree_squared"],
    my_feature_methods=[
        {"feature_name": "degree_squared", "feature_function": degree_squared},
    ],
)

The Features container
compute_node_features returns a Features object:
- features.features_df: the underlying DataFrame (node_id, graph_id, feature columns).
- features.feature_columns: the feature column names.
- features.normalize(type=...): re-scale in place.
- features_a + features_b: merge two feature sets on (node_id, graph_id).
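
The scaler names accepted by normalize match scikit-learn's scaler classes. Assuming NEExT applies them column by column across all nodes (an assumption, not something this page confirms), the arithmetic behind each option can be sketched with the standard library alone. The function names and the sample column below are invented for the illustration.

```python
from statistics import mean, median, pstdev, quantiles


def standard_scale(column):
    """Zero mean, unit variance (population standard deviation)."""
    mu, sigma = mean(column), pstdev(column)
    sigma = sigma or 1.0  # constant column: avoid dividing by zero
    return [(x - mu) / sigma for x in column]


def min_max_scale(column):
    """Map the column onto the range [0, 1]."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0
    return [(x - lo) / span for x in column]


def robust_scale(column):
    """Center on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(column, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0
    med = median(column)
    return [(x - med) / iqr for x in column]


# One hypothetical feature column, one value per node.
column = [1.0, 2.0, 3.0, 4.0, 5.0]

scaled_standard = standard_scale(column)
scaled_min_max = min_max_scale(column)
scaled_robust = robust_scale(column)
print(scaled_standard, scaled_min_max, scaled_robust)
```

The median and interquartile range move little when a few extreme values are present, so the robust variant is the usual choice when a feature has heavy-tailed values, as centrality scores on hub-dominated graphs often do.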
Next: turn these node features into one vector per graph with Embeddings.