Boruta
Feature selection is the process of filtering variables according to some method or criterion (Wiki). It often improves the performance of a machine learning model and helps with data exploration. Boruta [1] is a feature selection method that identifies all relevant variables instead of selecting only a minimal subset. Boruta.js is an almost line-by-line port of the R package Boruta to JavaScript. It depends on the random-forest package, but it can be used with other models as well.
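To make the all-relevant idea concrete, here is a minimal sketch of the shadow-feature scheme described in [1]. It is not the Boruta.js implementation. It assumes absolute Pearson correlation as a stand-in for random-forest importance, runs a single Bonferroni-corrected binomial test at the end rather than testing after every run, and all names (borutaSketch, absCorrelation, maxRuns, alpha, seed) are hypothetical.

```javascript
// Simplified illustration of the Boruta idea, NOT the Boruta.js code.
// Assumption: importance = |Pearson correlation| with y (stand-in for random forest).

// Small seeded PRNG (mulberry32) so that runs are repeatable
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle that returns a permuted copy
function shuffled(arr, rand) {
  const out = arr.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = out[i];
    out[i] = out[j];
    out[j] = tmp;
  }
  return out;
}

// Stand-in importance measure: absolute Pearson correlation (0 for constant input)
function absCorrelation(x, y) {
  const n = x.length;
  const mx = x.reduce((s, v) => s + v, 0) / n;
  const my = y.reduce((s, v) => s + v, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - mx) * (y[i] - my);
    sxx += (x[i] - mx) ** 2;
    syy += (y[i] - my) ** 2;
  }
  return sxx === 0 || syy === 0 ? 0 : Math.abs(sxy) / Math.sqrt(sxx * syy);
}

// P(K >= k) for K ~ Binomial(n, 0.5)
function binomialUpperTail(k, n) {
  let prob = Math.pow(0.5, n); // P(K = 0)
  let tail = 0;
  for (let i = 0; i <= n; i++) {
    if (i >= k) tail += prob;
    prob = prob * (n - i) / (i + 1); // P(K = i + 1) from P(K = i)
  }
  return tail;
}

function borutaSketch(X, y, { maxRuns = 50, alpha = 0.01, seed = 1 } = {}) {
  const rand = mulberry32(seed);
  const p = X[0].length;
  const columns = Array.from({ length: p }, (_, j) => X.map(row => row[j]));
  const importance = columns.map(col => absCorrelation(col, y));
  const hits = new Array(p).fill(0);
  for (let run = 0; run < maxRuns; run++) {
    // Shadow features: each column permuted on its own, which breaks any link to y
    const shadowMax = Math.max(...columns.map(col => absCorrelation(shuffled(col, rand), y)));
    // A feature scores a hit when it beats the best shadow feature
    importance.forEach((imp, j) => { if (imp > shadowMax) hits[j]++; });
  }
  const threshold = alpha / p; // Bonferroni correction over features
  const decisions = {};
  hits.forEach((h, j) => {
    if (binomialUpperTail(h, maxRuns) < threshold) decisions[j] = 'Confirmed';
    // By symmetry at p = 0.5, P(K <= h) equals P(K >= maxRuns - h)
    else if (binomialUpperTail(maxRuns - h, maxRuns) < threshold) decisions[j] = 'Rejected';
    else decisions[j] = 'Tentative';
  });
  return decisions;
}

// Toy data: column 0 determines y, column 1 is constant
const demoX = Array.from({ length: 30 }, (_, i) => [i, 5]);
const demoY = demoX.map(row => 2 * row[0] + 1);
console.log(borutaSketch(demoX, demoY));
```

On this toy data the informative column should come out as Confirmed and the constant column as Rejected. Because the stand-in importance is deterministic, this sketch is cruder than the real method, where the model is refit on every run and the importance of each feature varies between runs.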
Example
// Load boruta
const boruta =
// Generate synthetic data
const make =
const X y = make
// Run boruta
const bor =
// Print results
console
Results:
'0': 'Confirmed'
'1': 'Confirmed'
'2': 'Rejected'
'3': 'Confirmed'
'4': 'Rejected'
'5': 'Rejected'
'6': 'Rejected'
'7': 'Rejected'
'8': 'Rejected'
'9': 'Rejected'
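A typical next step is to keep only the confirmed columns. The sketch below assumes the decisions are available as a plain object keyed by column index, shaped like the printed results above. The helper names (confirmedIndices, selectColumns) and the sample values are hypothetical and not part of Boruta.js.

```javascript
// Hypothetical decisions object, keyed by column index
const sampleDecisions = { 0: 'Confirmed', 1: 'Confirmed', 2: 'Rejected', 3: 'Confirmed' };

// Indices of columns marked 'Confirmed', as numbers
function confirmedIndices(decisions) {
  return Object.keys(decisions)
    .filter(key => decisions[key] === 'Confirmed')
    .map(Number);
}

// Keep only the given columns of a row-major matrix
function selectColumns(X, indices) {
  return X.map(row => indices.map(j => row[j]));
}

const sampleX = [
  [1, 2, 3, 4],
  [5, 6, 7, 8]
];
const keep = confirmedIndices(sampleDecisions);
console.log(keep); // [ 0, 1, 3 ]
console.log(selectColumns(sampleX, keep)); // [ [ 1, 2, 4 ], [ 5, 6, 8 ] ]
```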
Web demo
You can try Boruta in the StatSim app: https://statsim.com/select/. It visualizes importance scores alongside the final decisions and also supports multiple base models (linear regression, logistic regression, KNN, random forest).
References
- [1] Miron B. Kursa, Witold R. Rudnicki (2010). Feature Selection with the Boruta Package.