Permutation feature importance is a great way to get feature importance in a model-agnostic fashion. All our algorithms (except Stacked Ensemble at the moment) have built-in feature importance, but it would still be great to have this feature. It makes sense to have it as a separate function that does not run automatically as part of the model building process. This can also be used as a new method for doing metalearning (model selection) inside a Stacked Ensemble.
Here is the methodology:
A) You have a hold-out dataset (or you use k-fold cross-validation).
B) You make predictions using the ensemble model and measure AUC or whichever other metric you prefer (you have already computed these for the leaderboard). Let's say this gives an AUC of 0.8.
C) For each column in the data:
   1) You randomly shuffle it.
   2) You repeat the scoring with that column randomized (and everything else left correct).
   3) You measure AUC again. Now let's say the AUC is 0.7. The difference between the original AUC and this one (where one feature is wrong) is the importance of that column (0.8 - 0.7 = 0.1 with the example numbers).
   4) You bring this column back to normal and repeat for the next column.
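The steps above can be sketched in plain Python as follows. This is only a minimal illustration under stated assumptions, not our implementation: `auc`, `permutation_importance`, and `toy_predict` are hypothetical names, the "model" is a toy scoring function that looks only at column 0, and the data is randomly generated.

```python
import random


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, rows, y_true, metric=auc, seed=0):
    """Return (baseline, {column index: baseline metric - shuffled metric})."""
    rng = random.Random(seed)
    baseline = metric(y_true, predict(rows))  # step B: score the intact data
    importances = {}
    for col in range(len(rows[0])):  # step C: one column at a time
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        # Build copies of the rows with only this column permuted. The original
        # rows are never modified, so the column is "back to normal" next time.
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        importances[col] = baseline - metric(y_true, predict(permuted))
    return baseline, importances


# Hypothetical toy model: the score is simply the value in column 0.
def toy_predict(rows):
    return [r[0] for r in rows]


data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]

baseline, imp = permutation_importance(toy_predict, rows, labels)
print("baseline AUC:", baseline)
print("importance per column:", imp)
```

Because the toy model ignores column 1, shuffling it leaves the AUC unchanged and its importance is zero, while shuffling column 0 destroys the signal. In practice a single shuffle is noisy, so averaging over several shuffles per column is probably worth considering.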
The permutation feature importance measurement was introduced by Breiman (2001) for random forests. Based on this idea, Fisher, Rudin, and Dominici (2018) proposed a model-agnostic version of the feature importance and called it model reliance.