% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/naivewrapper.R
\name{naiveWrapper}
\alias{naiveWrapper}
\title{Naive feature selection method utilising the rFerns shadow importance}
\usage{
naiveWrapper(x, y, iterations = 1000, depth = 5, ferns = 100, size = 30,
  lambda = 5, threads = 0, saveHistory = FALSE)
}
\arguments{
\item{x}{Data frame containing attributes; must have unique names and contain only numeric, integer or (ordered) factor columns.
Factors must have fewer than 31 levels. No \code{NA} values are permitted.}
\item{y}{A decision vector. Must be a factor of the same length as \code{nrow(x)} for ordinary multi-class classification, or a logical matrix with each column corresponding to a class for multi-label classification.}
\item{iterations}{Number of iterations, i.e., the number of sub-models built.}
\item{depth}{The depth of the ferns; must be in the 1--16 range. Note that time and memory requirements scale with \code{2^depth}.}
\item{ferns}{Number of ferns to be built in each sub-model. This should be a small number, around 3-5 times \code{size}.}
\item{size}{Number of attributes considered by each sub-model.}
\item{lambda}{Lambda parameter driving the re-weighting step of the method.}
\item{threads}{Number of parallel threads, copied to the underlying \code{rFerns} call.}
\item{saveHistory}{Should the weight history be stored?}
}
\value{
An object of class \code{naiveWrapper}, which is a list with the following components:
\item{found}{Names of all selected attributes.}
\item{weights}{Vector of weights indicating the confidence that a given feature is relevant.}
\item{timeTaken}{Time of computation.}
\item{weightHistory}{History of weights over all iterations, present if \code{saveHistory} was \code{TRUE}.}
\item{params}{Copies of algorithm parameters, \code{iterations}, \code{depth}, \code{ferns} and \code{size}, as a named vector.}
}
\description{
Proof-of-concept ensemble of rFerns models, built to stabilise and improve selection based on shadow importance.
It employs a super-ensemble of \code{iterations} small rFerns forests, each built on a subspace of \code{size} attributes, which is selected randomly, but with a higher selection probability for attributes deemed important by previous sub-models.
The final selection is the group of attributes which hold a substantial weight at the end of the procedure.
}
\examples{
set.seed(77)
#Fetch Iris data
data(iris)
#Extend with random noise
noisyIris<-cbind(iris[,-5],apply(iris[,-5],2,sample))
names(noisyIris)[5:8]<-sprintf("Nonsense\%d",1:4)
#Execute selection
naiveWrapper(noisyIris,iris$Species,iterations=50,ferns=20,size=8)
}
\references{
Kursa MB (2017). \emph{Efficient all relevant feature selection with random ferns}. In: Kryszkiewicz M., Appice A., Slezak D., Rybinski H., Skowron A., Ras Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham.
}