(This readme file might not render 100% precisely on many git host websites. If something seems strange, you can look at the raw version of this file.)
This is a collection of machine learning algorithms implemented in GNU Guile. Currently implemented algorithms are:
- decision tree
Nope, sorry, currently that is it.
The decision tree algorithm is implemented in a purely functional way, which helps immensely with parallelizing the algorithm and unit testing the code. My vision for this repository is that more purely functional machine learning algorithms will be added later. My preference is strongly for purely functional algorithms.
Feel free to report bugs, suggest improvements, contribute machine learning algorithms in GNU Guile or in standard Scheme runnable in GNU Guile (perhaps even non-purely functional ones, so that we can later transform them into purely functional ones), fork, or use this code in any license-adhering way.
There are many things that could be done. A few are listed here; the list is not comprehensive:
The plan is to add more machine learning algorithms. Before implementing anything very complex or anything with very specific application areas, basic algorithms should probably be implemented first:
Personally I would also like to understand the implementations. Code readability is valued strongly. Without readable code, I would definitely hesitate to add an algorithm. If a lot of mathematics is involved, a companion document explaining the mathematical details might be needed.
Many algorithms lend themselves well to matrix multiplication. A choice has to be made about which library to use for that. Until a fast matrix multiplication library is chosen, a basic implementation written in Guile, cleanly abstracted away, could act as a fallback.
Here are some candidates for matrix multiplication:
guile-ffi-cblas
When installing Guile via the Guix package manager, it is important that CBLAS or BLIS is also available in a place where programs installed via Guix look for it. The easiest way to achieve this is to install CBLAS or BLIS through Guix as well. There are currently 3 related packages available in the Guix package repositories: ~blis~, ~openblas~ and ~lapack~. In my case I installed all 3 of them:
#+BEGIN_SRC shell
guix environment --ad-hoc guile blis openblas lapack
#+END_SRC
It seems that using BLIS works best. However, it took almost 2 hours to build it and run its checks.
Note the following:
In my case the directory where Guix-installed programs should look is ~"${HOME}/.guix-profile/lib"~. To generalize, it will probably be the ~lib~ directory of whatever Guix profile or environment you installed GNU Guile in.
To tell Guile to look for the CBLAS or BLIS library there, the environment variable ~LTDL_LIBRARY_PATH~ (it stands for libtool dynamic link library path) needs to be set to the path of that ~lib~ directory. The following shell commands show how to run the test suites of ~guile-ffi-cblas~ for various backends.
#+BEGIN_SRC shell
# switch to a temporary directory
pushd $(mktemp --directory)
# clone the git repository of guile-ffi-cblas
git clone https://notabug.org/lloda/guile-ffi-cblas.git
# switch to cloned git repository
pushd guile-ffi-cblas
# create an environment with the required libraries installed
guix environment --ad-hoc guile blis openblas lapack
# for BLIS (it is the default of the guile-ffi-cblas library)
# TODO: Perhaps there is a better way to get the lib folder of the environment?
LTDL_LIBRARY_PATH="$(which guile | rev | cut --delimiter '/' --fields 3- | rev)/lib" \
guile -L mod -s test/test-ffi-blis.scm
# or
# GUILE_FFI_BLIS_LIBNAME=libblis \
# GUILE_FFI_BLIS_LIBPATH="$(which guile | rev | cut --delimiter '/' --fields 3- | rev)/lib" \
# guile -L mod -s test/test-ffi-blis.scm
popd
popd
#+END_SRC
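Regarding the TODO about finding the lib folder of the environment: the following is a minimal sketch, not a tested part of this project. It assumes that Guix sets the ~GUIX_ENVIRONMENT~ variable to the profile path inside an environment created by ~guix environment~, and it falls back to the path of the ~guile~ executable otherwise. The function name ~guix_lib_dir~ is made up for illustration.

```shell
#!/bin/sh
# Print the lib directory of the Guix profile or environment that provides Guile.
# Assumption: inside `guix environment`, GUIX_ENVIRONMENT holds the path of the
# environment's profile. `guix_lib_dir` is a hypothetical helper name.
guix_lib_dir() {
    if [ -n "${GUIX_ENVIRONMENT:-}" ]; then
        printf '%s/lib\n' "${GUIX_ENVIRONMENT}"
        return 0
    fi
    # Fallback: strip the trailing "/bin/guile" from the path of the guile
    # executable, which is what the rev/cut pipeline above does.
    guile_bin="$(command -v guile)" || return 1
    printf '%s/lib\n' "$(dirname "$(dirname "${guile_bin}")")"
}
```

With that function defined, the test run could be written as ~LTDL_LIBRARY_PATH="$(guix_lib_dir)" guile -L mod -s test/test-ffi-blis.scm~.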
#+BEGIN_SRC shell
pushd $(mktemp --directory)
git clone https://notabug.org/lloda/guile-ffi-cblas.git
pushd guile-ffi-cblas
guix environment --ad-hoc guile blis openblas lapack
LTDL_LIBRARY_PATH="$(which guile | rev | cut --delimiter '/' --fields 3- | rev)/lib" \
GUILE_FFI_CBLAS_LIBNAME=libopenblas \
guile -L mod -s test/test-ffi-cblas.scm
popd
popd
#+END_SRC
When running a program that makes use of ~guile-ffi-cblas~, one probably needs to set the required environment variables as well, just as for the test suites of ~guile-ffi-cblas~:
#+BEGIN_SRC shell
GUILE_FFI_BLIS_LIBNAME=libblis \
GUILE_FFI_BLIS_LIBPATH="$(which guile | rev | cut --delimiter '/' --fields 3- | rev)/lib" \
guile -L mod -s test/test-ffi-blis.scm
#+END_SRC
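To avoid repeating these variables for every program, they could be set by a small wrapper. This is only a sketch: the function name ~with_blis_env~ is made up, and it assumes that ~GUILE_FFI_BLIS_LIBNAME~ and ~GUILE_FFI_BLIS_LIBPATH~, as used above, are all a program needs.

```shell
#!/bin/sh
# Run a command with the BLIS-related variables for guile-ffi-cblas set.
# Usage: with_blis_env LIB_DIR COMMAND [ARGS...]
# `with_blis_env` is a hypothetical helper, not part of guile-ffi-cblas.
with_blis_env() {
    if [ "$#" -lt 2 ]; then
        printf 'usage: with_blis_env LIB_DIR COMMAND [ARGS...]\n' >&2
        return 2
    fi
    lib_dir="$1"
    shift
    # Refuse to run if the given lib directory does not exist.
    if [ ! -d "${lib_dir}" ]; then
        printf 'with_blis_env: no such directory: %s\n' "${lib_dir}" >&2
        return 1
    fi
    # Set the variables only for the command being run, not for the calling shell.
    env GUILE_FFI_BLIS_LIBNAME=libblis \
        GUILE_FFI_BLIS_LIBPATH="${lib_dir}" \
        "$@"
}
```

For example: ~with_blis_env "$(which guile | rev | cut --delimiter '/' --fields 3- | rev)/lib" guile -L mod -s test/test-ffi-blis.scm~.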
You can run the tests by using the Makefile as follows:
#+BEGIN_SRC shell
# from the root directory of this project:
make test
#+END_SRC
The tests cover a lot of the functionality. Future development should try to maintain this level of coverage or exceed it.
In general, the idea is to implement machine learning algorithms in a purely functional way, which will help with parallelization, testability and avoiding bugs due to mutable state.