Feature importance

Feature importance answers “which structural features actually matter for this task?” It scores each feature by how much it contributes to the predictive signal in the embeddings.

importance.py

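```python
# Assumes `nxt`, `graphs`, and `features` were created in earlier steps.
importance = nxt.compute_feature_importance(
    graph_collection=graphs,
    features=features,
    feature_importance_algorithm="supervised_fast",
    embedding_algorithm="approx_wasserstein",
    n_iterations=5,
)

# Show the first rows of the returned ranking.
print(importance.head())
```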

The call returns a `pandas.DataFrame` that ranks the features.

Algorithms

| `feature_importance_algorithm` | Type | Description |
| --- | --- | --- |
| `supervised_greedy` | Supervised | Iteratively selects the best features by predictive gain; thorough but slower |
| `supervised_fast` | Supervised | Fast greedy variant |
| `unsupervised` | Unsupervised | Ranks features without using labels |
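
To compare how the three algorithms rank the same features, one option is to loop over the documented values. The sketch below assumes `nxt`, `graphs`, and `features` already exist as in the example above. The helper name `rank_with_each_algorithm` is hypothetical and not part of the library. Note that the two supervised values need labeled graphs.

```python
# Algorithm names taken from the table above.
ALGORITHMS = ("supervised_greedy", "supervised_fast", "unsupervised")


def rank_with_each_algorithm(nxt, graphs, features, **options):
    """Hypothetical helper: call compute_feature_importance once per algorithm.

    Extra keyword arguments (for example n_iterations) are passed through
    unchanged to every call.
    """
    results = {}
    for algorithm in ALGORITHMS:
        results[algorithm] = nxt.compute_feature_importance(
            graph_collection=graphs,
            features=features,
            feature_importance_algorithm=algorithm,
            **options,
        )
    return results


# Usage, assuming the objects from the earlier example:
# rankings = rank_with_each_algorithm(nxt, graphs, features, n_iterations=5)
# print(rankings["unsupervised"].head())
```

Keeping the results in a dictionary keyed by algorithm name makes it easy to inspect the rankings side by side.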

Parameters

| Parameter | Default | Meaning |
| --- | --- | --- |
| `feature_importance_algorithm` | none (required) | One of the three above |
| `embedding_algorithm` | `"approx_wasserstein"` | Embedding used to evaluate feature subsets |
| `n_iterations` | `5` | Iterations to average performance over |
| `random_state` | `42` | Seed for reproducibility |
| `n_jobs` | `-1` | Parallel workers (`-1` = all CPUs) |
| `parallel_backend` | `"process"` | `"process"` or `"thread"` |
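
When only one or two settings differ from the defaults, it can help to spell out the documented defaults once and override them per call. The following is a minimal sketch: the default values are copied from the table above, while `DOCUMENTED_DEFAULTS` and `importance_kwargs` are hypothetical names introduced here for illustration, not library API.

```python
# Defaults copied from the parameter table above.
DOCUMENTED_DEFAULTS = {
    "embedding_algorithm": "approx_wasserstein",
    "n_iterations": 5,
    "random_state": 42,
    "n_jobs": -1,
    "parallel_backend": "process",
}


def importance_kwargs(algorithm, **overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        # Reject names that do not appear in the parameter table.
        raise TypeError(f"unknown parameter(s): {sorted(unknown)}")
    return {
        "feature_importance_algorithm": algorithm,
        **DOCUMENTED_DEFAULTS,
        **overrides,
    }


# Usage, assuming `nxt`, `graphs`, and `features` from the earlier example:
# importance = nxt.compute_feature_importance(
#     graph_collection=graphs,
#     features=features,
#     **importance_kwargs("supervised_fast", n_jobs=2, parallel_backend="thread"),
# )
```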

The supervised algorithms evaluate feature subsets by training a model internally (random forest by default). The two supervised options require labeled graphs; the unsupervised option does not.
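
The idea of selecting features by predictive gain can be illustrated with a small, dependency-free sketch. This is not the library's implementation: it swaps the internal random forest for a leave-one-out nearest-centroid classifier, works on a plain feature table instead of graph embeddings, and uses made-up toy data. All names in it are hypothetical.

```python
from statistics import mean


def loo_accuracy(rows, labels, cols):
    """Leave-one-out accuracy of a nearest-centroid classifier on `cols`."""
    hits = 0
    for i, row in enumerate(rows):
        # Build one centroid per class from every sample except sample i.
        centroids = {}
        for label in set(labels):
            members = [r for j, r in enumerate(rows) if j != i and labels[j] == label]
            if members:
                centroids[label] = [mean(r[c] for r in members) for c in cols]

        def squared_distance(label):
            return sum((row[c] - m) ** 2 for c, m in zip(cols, centroids[label]))

        predicted = min(sorted(centroids), key=squared_distance)
        hits += predicted == labels[i]
    return hits / len(rows)


def greedy_forward_selection(rows, labels, feature_names, score=loo_accuracy):
    """Add one feature per round: the one that gives the best score so far."""
    remaining = list(range(len(feature_names)))
    selected, ranking, best_so_far = [], [], 0.0
    while remaining:
        scored = [(score(rows, labels, selected + [c]), c) for c in remaining]
        # Prefer the higher score; break ties by the lower column index.
        best_score, best_col = max(scored, key=lambda t: (t[0], -t[1]))
        ranking.append((feature_names[best_col], best_score - best_so_far))
        selected.append(best_col)
        remaining.remove(best_col)
        best_so_far = best_score
    return ranking


# Toy data (invented): "signal" separates the classes, "noise" does not.
feature_names = ["noise", "signal"]
rows = [
    [0.5, 0.10], [0.1, 0.20], [0.9, 0.15],  # class 0
    [0.4, 0.90], [0.2, 1.00], [0.8, 0.95],  # class 1
]
labels = [0, 0, 0, 1, 1, 1]

ranking = greedy_forward_selection(rows, labels, feature_names)
for name, gain in ranking:
    print(name, gain)
```

Each entry in `ranking` pairs a feature with the change in score observed when it was added, so the informative feature is picked first. A supervised ranking of this kind needs labels to compute the score, which is why only the unsupervised option works on unlabeled graphs.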