rFerns is an extended random ferns implementation for R; compared to the original, it can handle standard information systems containing both categorical and continuous attributes. Moreover, it generates an OOB error approximation and a permutation-based attribute importance measure similar to randomForest. Here is a paper with all the details.
rFerns is good for training fast, in predictable time, while utilising many CPU cores; in general it is less accurate than Random Forest, though not substantially, and there are cases in which it is better.
It is also nice as a very fast variable importance source; in fact it was created to speed up the
Boruta all-relevant feature selector, and it did pretty well.
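A minimal sketch of what this looks like in practice, assuming the package is installed from CRAN. The built-in `iris` data, the seed, and the `ferns` / `depth` values are arbitrary choices for illustration; the `oobErr` and `importance` fields are named as I read them in the package documentation, so check `?rFerns` for your installed version.

```r
library(rFerns)

set.seed(77)  # arbitrary seed, only to make the illustration repeatable

# Train a fern ensemble on iris and ask for permutation importance.
# importance = TRUE requests the default importance measure.
model <- rFerns(Species ~ ., data = iris,
                ferns = 1000, depth = 5, importance = TRUE)

# OOB error approximation, similar in spirit to randomForest
print(model$oobErr)

# Per-attribute importance scores, one row per attribute
print(model$importance)

# Predict classes for (here, the training) data and cross-tabulate
pred <- predict(model, iris[, -5])
print(table(predicted = pred, actual = iris$Species))
```

The confusion table here is computed on the training data, so it is optimistic; the OOB error is the more honest figure.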
Even more, it can now produce shadow importance, i.e., a heuristic way to reason about the significance of importance scores; the package contains a
naiveWrapper function which implements a simple feature selector based on it, and you can read all about it here.
Finally, it is a very stochastic method, practically doing no optimisation at all; basically it is crazy that it works.
Hence, it is theoretically interesting (;
There is also a Spark version (not mine), Sparkling Ferns.
Contributions are welcome, but please make pull requests against the devel branch.
A fairly recent version should be on CRAN; you can also install directly from NotABug with:
For a bleeding-edge (but working) version, install from the devel branch:
If you want to use it or test it apart from R, that is quite possible -- consult
side_src/test.c to see how this may work.
Yet don't expect this to ever become a standalone library.