= svm-toolkit
home:: https://peterlane.netlify.app/svm-toolkit/
== Description
Support-vector machines are a popular tool in data mining. This package
includes an amended version of the Java implementation of the libsvm library
(version 3.11). Additional methods and examples are provided to support
standard training techniques, such as cross-validation, as well as simple
visualisations. Training/testing of models can use a variety of built-in or
user-defined evaluation methods, including overall accuracy, geometric mean,
precision and recall.
== Features
- All features of LibSVM 3.11 are supported, and many are augmented with Ruby wrappers.
- Loading Problem definitions from file in SVMLight, CSV or ARFF (simple subset) formats.
- Creating Problem definitions from values supplied programmatically in arrays.
- Rescaling of feature values.
- Integrated cost/gamma search for models with an RBF kernel, taking advantage of multiple cores (a sketch follows this list).
- Contour plot visualisation of cost/gamma search results.
- Model provides the value of w-squared for the separating hyperplane.
- svm-demo application, a version of the svm_toy applet which comes with libsvm.
- Model stores indices of training instances used as support vectors.
- User-selected evaluation techniques supported in Model#evaluate_dataset and Svm.cross_validation_search.
- The library provides evaluation classes for OverallAccuracy, GeometricMean, ClassPrecision, ClassRecall and MatthewsCorrelationCoefficient (see the sketches following this list).
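
The cost/gamma search can be driven from a short script. The following is a
minimal sketch only: the argument order and return value of
Svm.cross_validation_search are assumptions inferred from the feature list
above, and the datasets are toy placeholders.

  require "svm-toolkit"
  include SvmToolkit

  # Two small Problem instances: one to train on, one to validate
  # the candidate cost/gamma pairs against.
  train = Problem.from_array([[0.0, 1.0], [1.0, 0.0], [0.0, 0.0], [1.0, 1.0]],
                             [0, 1, 0, 1])
  valid = Problem.from_array([[0.9, 0.1], [0.1, 0.9]], [1, 0])

  # Candidate values for cost and gamma; an RBF-kernel model is trained
  # for each pair, using multiple cores where available.
  costs  = [-2, -1, 0, 1, 2, 3].collect { |i| 2.0**i }
  gammas = [-4, -3, -2, -1, 0, 1].collect { |i| 2.0**i }

  # NOTE: the argument order and return value below are assumptions
  # made for this sketch; check the API of the installed gem.
  best_model = Svm.cross_validation_search(train, valid, costs, gammas)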
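
The evaluation classes can be plugged into Model#evaluate_dataset (and into
Svm.cross_validation_search) to change how a model is scored. The sketch below
is illustrative only: the :evaluator option name and the Evaluator namespace
are assumptions based on the class names listed above; consult the installed
gem's documentation for the exact names.

  require "svm-toolkit"
  include SvmToolkit

  train = Problem.from_array([[0.0, 1.0], [1.0, 0.0], [0.0, 0.0], [1.0, 1.0]],
                             [0, 1, 0, 1])

  # Train a model using the first available kernel type (Parameter.kernels).
  params = Parameter.new(
    :svm_type => Parameter::C_SVC,
    :kernel_type => Parameter.kernels.first,
    :cost => 10
  )
  model = Svm.svm_train(train, params)

  # NOTE: the :evaluator option and Evaluator::GeometricMean constant are
  # assumptions; substitute the names used by the installed version.
  score = model.evaluate_dataset(train, :evaluator => Evaluator::GeometricMean)
  puts "Geometric mean on the training set: #{score}"
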
== Example
The following example illustrates how a dataset can be constructed in code, and
how an SVM model can be created and tested with each of the available kernel types.
require "svm-toolkit"
include SvmToolkit
puts "Classification with LIBSVM"
puts "--------------------------"
# Sample dataset: the 'Play Tennis' dataset
# from T. Mitchell, Machine Learning (1997)
# --------------------------------------------
# Labels for each instance in the training set
# 1 = Play, 0 = Not
Labels = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0]
# Recoding the attribute values into range [0, 1]
Instances = [
[0.0,1.0,1.0,0.0],
[0.0,1.0,1.0,1.0],
[0.5,1.0,1.0,0.0],
[1.0,0.5,1.0,0.0],
[1.0,0.0,0.0,0.0],
[1.0,0.0,0.0,1.0],
[0.5,0.0,0.0,1.0],
[0.0,0.5,1.0,0.0],
[0.0,0.0,0.0,0.0],
[1.0,0.5,0.0,0.0],
[0.0,0.5,0.0,1.0],
[0.5,0.5,1.0,1.0],
[0.5,1.0,0.0,0.0],
[1.0,0.5,1.0,1.0]
]
# create some arbitrary train/test split
TrainingSet = Problem.from_array(Instances.slice(0, 10), Labels.slice(0, 10))
TestSet = Problem.from_array(Instances.slice(10, 4), Labels.slice(10, 4))
# Iterate over each kernel type
Parameter.kernels.each do |kernel|
# -- train model for this kernel type
params = Parameter.new(
:svm_type => Parameter::C_SVC,
:kernel_type => kernel,
:cost => 10,
:degree => 1,
:gamma => 100
)
model = Svm.svm_train(TrainingSet, params)
# -- test kernel performance on the training set
errors = model.evaluate_dataset(TrainingSet, :print_results => true)
puts "Kernel #{Parameter.kernel_name(kernel)} has #{errors} on the training set"
# -- test kernel performance on the test set
errors = model.evaluate_dataset(TestSet, :print_results => true)
puts "Kernel #{Parameter.kernel_name(kernel)} has #{errors} on the test set"
end
== Acknowledgements
The svm-toolkit is based on LibSVM, which is available from:
http://www.csie.ntu.edu.tw/~cjlin/libsvm/
The contour plot uses the PlotPackage library, available from:
http://thehuwaldtfamily.org/java/Packages/Plot/PlotPackage.html
Contributor:
* {Knut Hellan}[https://github.com/khellan], who contributed the Matthews Correlation Coefficient.