Feature importance

Feature importance answers “which structural features actually matter for this task?” It scores each feature by how much it contributes to the predictive signal in the embeddings.

importance.py
importance = nxt.compute_feature_importance(
  graph_collection=graphs,
  features=features,
  feature_importance_algorithm="supervised_fast",
  embedding_algorithm="approx_wasserstein",
  n_iterations=5,
)

print(importance.head())

The call returns a pandas.DataFrame ranking the features.

Algorithms

| feature_importance_algorithm | Type | Description |
| --- | --- | --- |
| supervised_greedy | Supervised | Iteratively selects the best features by predictive gain; thorough but slower |
| supervised_fast | Supervised | Fast greedy variant |
| unsupervised | Unsupervised | Ranks features without using labels |
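
The idea of greedy selection by predictive gain can be illustrated with a minimal sketch. This is not the library's implementation: the function `greedy_forward_selection`, the toy scorer, and the feature names `degree`, `clustering`, and `diameter` are all hypothetical, chosen only to show how a feature's rank depends on what has already been selected.

```python
from typing import Callable, Sequence


def greedy_forward_selection(
    features: Sequence[str],
    score_subset: Callable[[tuple], float],
) -> list:
    """Rank features by the order in which greedy forward selection adds them.

    Returns (feature, gain) pairs, where gain is the score improvement
    obtained when the feature was added to the already-selected set.
    """
    selected = []
    remaining = list(features)
    ranking = []
    current_score = score_subset(())

    while remaining:
        # Try each remaining feature and keep the one giving the best score.
        best_feature = max(
            remaining, key=lambda f: score_subset(tuple(selected + [f]))
        )
        best_score = score_subset(tuple(selected + [best_feature]))
        ranking.append((best_feature, best_score - current_score))
        selected.append(best_feature)
        remaining.remove(best_feature)
        current_score = best_score

    return ranking


# Toy scorer (hypothetical): each feature has a fixed standalone value, and
# "degree" and "clustering" are partly redundant with each other.
VALUES = {"degree": 0.30, "clustering": 0.25, "diameter": 0.10}


def toy_score(subset):
    score = 0.5 + sum(VALUES[f] for f in subset)
    if "degree" in subset and "clustering" in subset:
        score -= 0.20  # overlap penalty for redundant features
    return score


ranking = greedy_forward_selection(list(VALUES), toy_score)
for name, gain in ranking:
    print(f"{name}: gain {gain:+.2f}")
```

In this toy setup `clustering` looks strong on its own but is ranked last, because most of its signal is already covered once `degree` has been selected. In a real supervised run the scorer would be a trained model's performance on labeled graphs rather than a lookup table.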

Parameters

| Parameter | Default | Meaning |
| --- | --- | --- |
| feature_importance_algorithm | (none) | One of the three algorithms above (required) |
| embedding_algorithm | "approx_wasserstein" | Embedding used to evaluate feature subsets |
| n_iterations | 5 | Iterations to average performance over |
| random_state | 42 | Seed for reproducibility |
| n_jobs | -1 | Parallel workers (-1 = all CPUs) |
| parallel_backend | "process" | "process" or "thread" |
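
How `n_iterations` and `random_state` might interact can be sketched as follows. This is an assumption about the general pattern (average several randomized evaluations, each seeded deterministically), not a description of the library's internals; `noisy_evaluation` and `averaged_score` are hypothetical helpers, and the Gaussian noise stands in for the variation of a real train/test evaluation.

```python
import random
import statistics


def noisy_evaluation(subset, rng):
    """Stand-in for one train/test evaluation (hypothetical).

    The underlying score grows with subset size; Gaussian noise mimics the
    variation caused by a random split or a randomized model.
    """
    true_score = 0.6 + 0.05 * len(subset)
    return true_score + rng.gauss(0.0, 0.02)


def averaged_score(subset, n_iterations=5, random_state=42):
    """Average n_iterations evaluations, each with its own derived seed."""
    scores = []
    for i in range(n_iterations):
        rng = random.Random(random_state + i)  # deterministic per-iteration seed
        scores.append(noisy_evaluation(subset, rng))
    return statistics.mean(scores)


subset = ("degree", "clustering")  # hypothetical feature names
first = averaged_score(subset)
second = averaged_score(subset)
print(first == second)  # identical seed and iteration count reproduce the score
```

Averaging over more iterations reduces the noise in each subset's score at the cost of proportionally more evaluations, which is the usual trade-off when choosing the iteration count.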

The supervised algorithms evaluate feature subsets by training a model internally (random forest by default). The two supervised options require labeled graphs; the unsupervised option does not.