<HTML> <HEAD> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252"> <META NAME="Generator" CONTENT="Microsoft FrontPage 6.0"> <TITLE>SVM-Light Support Vector Machine</TITLE> <META NAME="Version" CONTENT="8.0.3514"> <META NAME="Date" CONTENT="11/26/96"> <META NAME="Template" CONTENT="C:\Programme\Microsoft Office\Office\HTML.DOT"> </HEAD> <BODY LINK="#0000ff" VLINK="#800080" BGCOLOR="#ffffff"> <TABLE CELLSPACING=0 BORDER=0 CELLPADDING=5> <TR><TD WIDTH="14%" VALIGN="top"> <H2><a TARGET="_top" HREF="http://www-ai.cs.uni-dortmund.de/"><IMG height=81 src="http://www.joachims.org/images/eier_graybg.gif" width=100 border=0></a></H2></TD> <TD WIDTH="75%" VALIGN="top"> <H1 ALIGN="center">SVM<I><SUP>light</SUP> </H1></I><H1 ALIGN="center">Support Vector Machine</H1> <P ALIGN="center">Author: <a TARGET="_top" HREF="http://www.joachims.org/">Thorsten Joachims</a> <<A href="mailto:thorsten@joachims.org">thorsten@joachims.org</A>> <BR> <a TARGET="_top" HREF="http://www.cornell.edu/">Cornell University</a> <BR> <a TARGET="_top" HREF="http://www.cs.cornell.edu/">Department of Computer Science</a> </P> <P ALIGN="center">Developed at: <BR> <a TARGET="_top" HREF="http://www.uni-dortmund.de/">University of Dortmund</a>, <a TARGET="_top" HREF="http://www.informatik.uni-dortmund.de/">Informatik</a>, <a TARGET="_top" HREF="http://www-ai.informatik.uni-dortmund.de/">AI-Unit</a> <BR> <a TARGET="_top" HREF="http://www.sfb475.uni-dortmund.de/">Collaborative Research Center on 'Complexity Reduction in Multivariate Data' (SFB475)</a> </P> <P ALIGN="center">Version: 6.01 <BR> Date: 02.09.2004</P></TD> <TD WIDTH="11%" VALIGN="top"> <H2><IMG SRC="http://www.joachims.org/images/culogo_125.gif" WIDTH=80 HEIGHT=80></H2></TD> </TR> </TABLE> <H2>Overview</H2> <P>SVM<I><SUP>light</I></SUP> is an implementation of Support Vector Machines (SVMs) in C. 
The main features of the program are the following: </P> <UL> <LI>fast optimization algorithm <UL> <LI>working set selection based on steepest feasible descent <LI>"shrinking" heuristic <LI>caching of kernel evaluations <LI>use of folding in the linear case </LI></UL> <LI>solves classification and regression problems. For multivariate and structured outputs use <a href=svm_struct.html>SVM<I><SUP>struct</I></SUP></a>. <LI>solves ranking problems (e.g. learning retrieval functions in the <a href="http://striver.joachims.org/"><I>STRIVER</I></a> search engine). <LI>computes XiAlpha-estimates of the error rate, the precision, and the recall <LI>efficiently computes Leave-One-Out estimates of the error rate, the precision, and the recall <LI>includes an algorithm for approximately training large transductive SVMs (TSVMs) (see also <a href="http://sgt.joachims.org">Spectral Graph Transducer</a>) <LI>can train SVMs with cost models and example dependent costs <LI>allows restarts from a specified vector of dual variables <LI>handles many thousands of support vectors <LI>handles several hundred thousand training examples <LI>supports standard kernel functions and lets you define your own <LI>uses sparse vector representation </LI></UL> <P><img border=0 src="../../images/new.gif" width="32" height="16"> <a href=../svm_struct.html>SVM<I><SUP>struct</I></SUP></a>: SVM learning for multivariate and structured outputs like trees, sequences, and sets (available <a href=../svm_struct.html>here</a>).</P> <P><img border=0 src="../../images/new.gif" width="32" height="16"> <a href=../svm_perf.html>SVM<sup><i>perf</i></sup></a>: New training algorithm for linear classification SVMs that can be much faster than SVM<sup><i>light</i></sup> for large datasets. It also lets you directly optimize multivariate performance measures like F1-Score, ROC-Area, and the Precision/Recall Break-Even Point. 
(available <a href=../svm_perf.html>here</a>).</P> <H2>Description</H2> <P>SVM<I><SUP>light</I></SUP> is an implementation of Vapnik's Support Vector Machine [<A href="#References">Vapnik, 1995</a>] for the problem of pattern recognition, for the problem of regression, and for the problem of learning a ranking function. The optimization algorithms used in SVM<I><SUP>light</I></SUP> are described in [<A href="#References">Joachims, 2002a</A>] and [<A href="#References">Joachims, 1999a</a>]. The algorithm has scalable memory requirements and can handle problems with many thousands of support vectors efficiently. </P> <P>The software also provides methods for assessing the generalization performance efficiently. It includes two efficient estimation methods for both error rate and precision/recall. XiAlpha-estimates [<A href="#References">Joachims, 2002a</A>, <A href="#References">Joachims, 2000b</A>] can be computed at essentially no computational expense, but they are conservatively biased. Leave-one-out testing provides almost unbiased estimates. SVM<I><SUP>light</I></SUP> exploits the fact that the results of most leave-one-out tests (often more than 99%) are predetermined and need not be computed [<A href="#References">Joachims, 2002a</A>].</P> <P>New in this version is an algorithm for learning ranking functions [<A href="#References">Joachims, 2002c</A>]. The goal is to learn a function from preference examples, so that it orders a new set of objects as accurately as possible. Such ranking problems naturally occur in applications like search engines and recommender systems.</P> <P>Furthermore, this version includes an algorithm for training large-scale transductive SVMs. The algorithm proceeds by solving a sequence of optimization problems lower-bounding the solution using a form of local search. A detailed description of the algorithm can be found in [<A href="#References">Joachims, 1999c</A>]. 
A similar transductive learner, which can be thought of as a transductive version of k-Nearest Neighbor, is the <a href="http://sgt.joachims.org">Spectral Graph Transducer</a>. </P> <P>SVM<I><SUP>light</I></SUP> can also train SVMs with cost models (see [<A href="#References">Morik et al., 1999</A>]).</P> <P>The code has been used on a wide range of problems, including text classification [<A href="#References">Joachims, 1999c</A>][<A href="#References">Joachims, 1998a</A>], image recognition tasks, bioinformatics and medical applications. Many tasks have the property of sparse instance vectors. This implementation makes use of this property, which leads to a very compact and efficient representation.</P> <H2>Source Code and Binaries</H2> <P>The program is free for scientific use. Please contact me if you are planning to use the software for commercial purposes. The software must not be further distributed without prior permission of the author. If you use SVM<I><SUP>light</I></SUP> in your scientific work, please cite as </P> <UL> <LI>T. Joachims, Making large-Scale SVM Learning Practical. Advances in Kernel Methods - Support Vector Learning, B. Schölkopf and C. Burges and A. Smola (ed.), MIT-Press, 1999. <BR> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_99a.pdf">[PDF]</a><a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_99a.ps.gz">[Postscript (gz)]</a></LI> </UL> <P>I would also appreciate it if you sent me (a link to) your papers so that I can learn about your research. The implementation was developed on Solaris 2.5 with gcc, but also compiles on SunOS 3.1.4, Solaris 2.7, Linux, IRIX, Windows NT, and Powermac (after small modifications, see <A href="svm_light_faq.html">FAQ</A>). 
The source code is available at the following location: </P><DIR> <P> <a target="_top" href="http://download.joachims.org/svm_light/v6.01/svm_light.tar.gz"> http://download.joachims.org/svm_light/v6.01/svm_light.tar.gz</a></P></DIR> <P>If you just want the binaries, you can download them for the following systems:</P> <ul> <li>Solaris: <a target="_top" href="http://download.joachims.org/svm_light/v6.01/svm_light_solaris.tar.gz"> http://download.joachims.org/svm_light/v6.01/svm_light_solaris.tar.gz</a> <li>Windows: <a target="_top" href="http://download.joachims.org/svm_light/v6.01/svm_light_windows.zip"> http://download.joachims.org/svm_light/v6.01/svm_light_windows.zip</a> <li>Cygwin: <a target="_top" href="http://download.joachims.org/svm_light/v6.01/svm_light_cygwin.tar.gz"> http://download.joachims.org/svm_light/v6.01/svm_light_cygwin.tar.gz</a> <li>Linux: <a target="_top" href="http://download.joachims.org/svm_light/v6.01/svm_light_linux.tar.gz"> http://download.joachims.org/svm_light/v6.01/svm_light_linux.tar.gz</a></li> </ul> <P><A href="mailto:svm-light@ls8.cs.uni-dortmund.de">Please send me email</A> and let me know that you got svm-light. I will put you on my mailing list to inform you about new versions and bug-fixes. SVM<I><SUP>light</I></SUP> comes with a quadratic programming tool for solving small intermediate quadratic programming problems. It is based on the method of Hildreth and D'Espo and solves small quadratic programs very efficiently. Nevertheless, if for some reason you want to use another solver, the new version still comes with an interface to PR_LOQO. The <a TARGET="_top" HREF="http://www.first.gmd.de/~smola/">PR_LOQO optimizer</a> was written by <a TARGET="_top" HREF="http://www.first.gmd.de/~smola/">A. Smola</a>. It can be requested from <a TARGET="_top" HREF="http://www.kernel-machines.org/code/prloqo.tar.gz">http://www.kernel-machines.org/code/prloqo.tar.gz</a>. 
</P> <H2>Installation</H2> <P>To install SVM<I><SUP>light</I></SUP> you need to download <TT>svm_light.tar.gz</TT>. Create a new directory:</P> <DIR> <TT><P>mkdir svm_light</P></TT> </DIR> <P>Move <TT>svm_light.tar.gz</TT> to this directory and unpack it with </P> <DIR> <TT><P>gunzip -c svm_light.tar.gz | tar xvf -</P></TT> </DIR> <P>Now execute </P> <DIR> <TT><P>make or make all</P></TT> </DIR> <P>which compiles the system and creates the two executables </P> <DIR> <TT>svm_learn (learning module)</TT><BR> <TT>svm_classify (classification module)</TT> </DIR> <P>If you do not want to use the built-in optimizer but PR_LOQO instead, create a subdirectory in the svm_light directory with </P> <DIR> <TT><P>mkdir pr_loqo</P></TT> </DIR> <P>and copy the files <TT>pr_loqo.c</TT> and <TT>pr_loqo.h</TT> in there. Now execute </P> <DIR> <TT><P>make svm_learn_loqo</P></TT> </DIR> <P>If the system does not compile properly, check this <A href="svm_light_faq.html">FAQ</A>.</P> <H2>How to use</H2> <P>This section explains how to use the SVM<I><SUP>light</I></SUP> software. A good introduction to the theory of SVMs is Chris Burges' <a TARGET="_top" HREF="http://www.kernel-machines.org/papers/Burges98.ps.gz">tutorial</a>. </P> <P>SVM<I><SUP>light</I></SUP> consists of a learning module (<TT>svm_learn</TT>) and a classification module (<TT>svm_classify</TT>). The classification module can be used to apply the learned model to new examples. See also the examples below for how to use <TT>svm_learn</TT> and <TT>svm_classify</TT>. </P> <P><TT>svm_learn</TT> is called with the following parameters:</P> <DIR> <TT><P>svm_learn [options] example_file model_file</P></TT> </DIR> <P>Available options are: </P> <DIR> <PRE>General options:
         -?          - this help
         -v [0..3]   - verbosity level (default 1)
Learning options:
         -z {c,r,p}  - select between classification (c), regression (r), and
                       preference ranking (p) (see [<A href="#References">Joachims, 2002c</A>])
                       (default classification)
         -c float    - C: trade-off between training error
                       and margin (default [avg. x*x]^-1)
         -w [0..]    - epsilon width of tube for regression
                       (default 0.1)
         -j float    - Cost: cost-factor, by which training errors on
                       positive examples outweigh errors on negative
                       examples (default 1) (see [<A href="#References">Morik et al., 1999</A>])
         -b [0,1]    - use biased hyperplane (i.e. x*w+b&gt;0) instead
                       of unbiased hyperplane (i.e. x*w&gt;0) (default 1)
         -i [0,1]    - remove inconsistent training examples
                       and retrain (default 0)
Performance estimation options:
         -x [0,1]    - compute leave-one-out estimates (default 0)
                       (see [5])
         -o ]0..2]   - value of rho for XiAlpha-estimator and for pruning
                       leave-one-out computation (default 1.0)
                       (see [<A href="#References">Joachims, 2002a</A>])
         -k [0..100] - search depth for extended XiAlpha-estimator
                       (default 0)
Transduction options (see [<A href="#References">Joachims, 1999c</A>], [<A href="#References">Joachims, 2002a</A>]):
         -p [0..1]   - fraction of unlabeled examples to be classified
                       into the positive class (default is the ratio of
                       positive and negative examples in the training data)
Kernel options:
         -t int      - type of kernel function:
                        0: linear (default)
                        1: polynomial (s a*b+c)^d
                        2: radial basis function exp(-gamma ||a-b||^2)
                        3: sigmoid tanh(s a*b + c)
                        4: user defined kernel from kernel.h
         -d int      - parameter d in polynomial kernel
         -g float    - parameter gamma in rbf kernel
         -s float    - parameter s in sigmoid/poly kernel
         -r float    - parameter c in sigmoid/poly kernel
         -u string   - parameter of user defined kernel
Optimization options (see [<A href="#References">Joachims, 1999a</A>], [<A href="#References">Joachims, 2002a</A>]):
         -q [2..]    - maximum size of QP-subproblems (default 10)
         -n [2..q]   - number of new variables entering the working set
                       in each iteration (default n = q). Set n&lt;q to prevent
                       zig-zagging.
         -m [5..]    - size of cache for kernel evaluations in MB (default 40)
                       The larger the faster...
         -e float    - eps: Allow that error for termination criterion
                       [y [w*x+b] - 1] &gt;= eps (default 0.001)
         -h [5..]    - number of iterations a variable needs to be
                       optimal before considered for shrinking (default 100)
         -f [0,1]    - do final optimality check for variables removed by
                       shrinking. Although this test is usually positive, there
                       is no guarantee that the optimum was found if the test is
                       omitted. (default 1)
         -y string   - if option is given, reads alphas from the given file
                       and uses them as starting point. (default 'disabled')
         -# int      - terminate optimization, if no progress after this
                       number of iterations. (default 100000)
Output options:
         -l char     - file to write predicted labels of unlabeled examples
                       into after transductive learning
         -a char     - write all alphas to this file after learning (in the
                       same order as in the training set)</PRE></DIR> <P>A more detailed description of the parameters and how they link to the respective algorithms is given in the appendix of [<A href="#References">Joachims, 2002a</A>]. </P> <P>The input file <TT>example_file</TT> contains the training examples. The first lines may contain comments and are ignored if they start with #. Each of the following lines represents one training example and is of the following format: </P> <DIR> <TT>&lt;line&gt; .=. &lt;target&gt; &lt;feature&gt;:&lt;value&gt; &lt;feature&gt;:&lt;value&gt; ... &lt;feature&gt;:&lt;value&gt; # &lt;info&gt;</TT><BR> <TT>&lt;target&gt; .=. +1 | -1 | 0 | &lt;float&gt;</TT><BR> <TT>&lt;feature&gt; .=. &lt;integer&gt; | "qid"</TT><BR> <TT>&lt;value&gt; .=. &lt;float&gt;</TT><BR> <TT>&lt;info&gt; .=. &lt;string&gt;</TT> </DIR> <P>The target value and each of the feature/value pairs are separated by a space character. Feature/value pairs MUST be ordered by increasing feature number. 
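</P> <P>The format can be illustrated with a minimal Python sketch that builds one example line from a sparse feature mapping. The helper name <TT>format_example</TT> and its arguments are assumptions made for illustration and are not part of SVM<I><SUP>light</SUP></I>. The sketch handles numeric feature numbers only (not the special "qid" feature), sorts the pairs by increasing feature number, and leaves out zero-valued features, which the format permits.</P>
<PRE>
```python
def format_example(target, features, info=None):
    """Return one example line in the SVM-light input format.

    target   -- class label (+1, -1, 0) or a real-valued target
    features -- dict mapping integer feature numbers to values
    info     -- optional string stored after the '#' marker

    Hypothetical helper for illustration; not part of SVM-light.
    """
    # Feature/value pairs must be ordered by increasing feature number.
    pairs = sorted((int(k), float(v)) for k, v in features.items())
    parts = [str(target)]
    # Zero-valued features may be skipped in the sparse representation.
    parts.extend("%d:%g" % (k, v) for k, v in pairs if v != 0.0)
    line = " ".join(parts)
    if info is not None:
        line += " # " + info
    return line


print(format_example(-1, {9284: 0.2, 1: 0.43, 3: 0.12, 7: 0.0}, "abcdef"))
```
</PRE>
<P>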
Features with value zero can be skipped. The string <TT>&lt;info&gt;</TT> can be used to pass additional information to the kernel (e.g. non feature vector data).</P> <P>In classification mode, the target value denotes the class of the example. A target value of +1 marks a positive example, and -1 a negative example. So, for example, the line </P> <blockquote> <P><tt>-1 1:0.43 3:0.12 9284:0.2 # abcdef</tt> </P> </blockquote> <P>specifies a negative example for which feature number 1 has the value 0.43, feature number 3 has the value 0.12, feature number 9284 has the value 0.2, and all the other features have value 0. In addition, the string <tt>abcdef</tt> is stored with the vector, which can serve as a way of providing additional information for user defined kernels. A class label of 0 indicates that this example should be classified using transduction. The predictions for the examples classified by transduction are written to the file specified through the -l option. The order of the predictions is the same as in the training data. </P> <P>In regression mode, <TT>&lt;target&gt;</TT> contains the real-valued target value.</P> <P>In ranking mode [<A href="#References">Joachims, 2002c</A>], the target value is used to generate pairwise preference constraints (see <a href="http://striver.joachims.org">STRIVER</a>). A preference constraint is included for all pairs of examples in the <TT>example_file</TT> for which the target value differs. The special feature "qid" can be used to restrict the generation of constraints. Two examples are considered for a pairwise preference constraint only if the value of "qid" is the same. For example, given the <TT>example_file</TT></P> <BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px"> <P><TT>3 qid:1 1:0.53 2:0.12<BR> 2 qid:1 1:0.13 2:0.1<BR> 7 qid:2 1:0.87 2:0.12 </TT></P> </BLOCKQUOTE> <P>a preference constraint is included only for the first and the second example (i.e. 
the first should be ranked higher than the second), but none involving the third example, since it has a different "qid".</P> <P>In all modes, the result of <TT>svm_learn</TT> is the model which is learned from the training data in <TT>example_file</TT>. The model is written to <TT>model_file</TT>. To make predictions on test examples, <TT>svm_classify</TT> reads this file. <TT>svm_classify</TT> is called with the following parameters: </P> <DIR> <TT><P>svm_classify [options] example_file model_file output_file</P></TT> </DIR> <P>Available options are: </P> <blockquote> <PRE>-h         Help.
-v [0..3]  Verbosity level (default 2).
-f [0,1]   0: old output format of V1.0
           1: output the value of decision function (default)</PRE> </blockquote> <P>The test examples in <TT>example_file</TT> are given in the same format as the training examples for <TT>svm_learn</TT>; the class label may be 0 to indicate that it is unknown. For all test examples in <TT>example_file</TT> the predicted values are written to <TT>output_file</TT>. There is one line per test example in <TT>output_file</TT> containing the value of the decision function on that example. For classification, the sign of this value determines the predicted class. For regression, it is the predicted value itself, and for ranking the value can be used to order the test examples. </P> <P>If you want to find out more, try this <A href="svm_light_faq.html">FAQ</A>. 
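</P> <P>As a rough sketch of how <TT>output_file</TT> might be post-processed for classification, the following Python snippet maps each decision-function value to a class label by its sign. The function name <TT>predicted_classes</TT> and the sample values are hypothetical, and treating a value of exactly zero as negative is an assumption of this sketch, not something stated above.</P>
<PRE>
```python
def predicted_classes(lines):
    """Map decision-function values (one per line) to +1 / -1 labels.

    Hypothetical helper for illustration; not part of SVM-light.
    """
    labels = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        value = float(line)
        # The sign of the decision value determines the predicted class.
        # Assumption: a value of exactly zero is counted as negative.
        labels.append(1 if value > 0 else -1)
    return labels


# Made-up values standing in for the lines of an output_file.
sample = ["1.2043\n", "-0.3317\n", "0.0518\n"]
print(predicted_classes(sample))
```
</PRE>
<P>For regression and ranking the values would be used directly, as predictions or for ordering, rather than thresholded.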
</P> <H2>Getting started: some Example Problems</H2> <H3>Inductive SVM</H3> <P>You will find an example text classification problem at </P> <DIR> <P><a href="http://download.joachims.org/svm_light/examples/example1.tar.gz" target="_top">http://download.joachims.org/svm_light/examples/example1.tar.gz</a></P> </DIR> <P>Download this file into your svm_light directory and unpack it with </P> <DIR> <TT><P>gunzip -c example1.tar.gz | tar xvf -</P></TT> </DIR> <P>This will create a subdirectory <TT>example1</TT>. Documents are represented as feature vectors. Each feature corresponds to a word stem (9947 features). The task is to learn which <a TARGET="_top" HREF="http://www.daviddlewis.com/resources/testcollections/reuters21578/">Reuters articles</a> are about "corporate acquisitions". There are 1000 positive and 1000 negative examples in the file <TT>train.dat</TT>. The file <TT>test.dat</TT> contains 600 test examples. The feature numbers correspond to the line numbers in the file <TT>words</TT>. To run the example, execute the commands: </P> <DIR> <TT><P>svm_learn example1/train.dat example1/model<BR></TT> <TT>svm_classify example1/test.dat example1/model example1/predictions</P></TT> </DIR> <P>The accuracy on the test set is printed to stdout. </P> <H3>Transductive SVM</H3> <P>To try out the transductive learner, you can use the following dataset (see also <a href="http://sgt.joachims.org">Spectral Graph Transducer</a>). I compiled it from the same Reuters articles as used in the example for the inductive SVM. The dataset consists of only 10 training examples (5 positive and 5 negative) and the same 600 test examples as above. 
You can find it at </P> <DIR> <P><a href="http://download.joachims.org/svm_light/examples/example2.tar.gz" target="_top">http://download.joachims.org/svm_light/examples/example2.tar.gz</a></P> </DIR> <P>Download this file into your svm_light directory and unpack it with </P> <DIR> <TT><P>gunzip -c example2.tar.gz | tar xvf -</P></TT> </DIR> <P>This will create a subdirectory <TT>example2</TT>. To run the example, execute the commands: </P> <DIR> <P><TT>svm_learn example2/train_transduction.dat example2/model</TT> <BR> <TT>svm_classify example2/test.dat example2/model example2/predictions</TT></P> </DIR> <P>The classification module is called only to get the accuracy printed. The transductive learner is invoked automatically, since <TT>train_transduction.dat</TT> contains unlabeled examples (i.e. the 600 test examples). You can compare the results to those of the inductive SVM by running: </P> <BLOCKQUOTE> <TT>svm_learn example2/train_induction.dat example2/model</TT> <BR> <TT>svm_classify example2/test.dat example2/model example2/predictions</TT> </BLOCKQUOTE> <P>The file <TT>train_induction.dat</TT> contains the same 10 (labeled) training examples as <TT>train_transduction.dat</TT>. </P> <H3> Ranking SVM</H3> <P>For the ranking SVM [<A href="#References">Joachims, 2002c</A>], I created a toy example. It consists of only 12 training examples in 3 groups and 4 test examples. You can find it at </P> <DIR> <P><a href="http://download.joachims.org/svm_light/examples/example3.tar.gz" target="_top">http://download.joachims.org/svm_light/examples/example3.tar.gz</a></P> </DIR> <P>Download this file into your svm_light directory and unpack it with </P> <DIR> <TT><P>gunzip -c example3.tar.gz | tar xvf -</P></TT> </DIR> <P>This will create a subdirectory <TT>example3</TT>. 
To run the example, execute the commands: </P> <DIR> <P><TT>svm_learn -z p example3/train.dat example3/model</TT> <BR> <TT>svm_classify example3/test.dat example3/model example3/predictions</TT></P> </DIR> <P>The output in the predictions file can be used to rank the test examples. If you do so, you will see that it predicts the correct ranking. The values in the predictions file do not have a meaning in an absolute sense. They are only used for ordering. </P><P>It can also be interesting to look at the "training error" of the ranking SVM. The equivalent of training error for a ranking SVM is the number of training pairs that are misordered by the learned model. To find those pairs, one can apply the model to the training file: </P> <BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px"><P> <TT> svm_classify example3/train.dat example3/model example3/predictions.train </TT></P></BLOCKQUOTE> <P>Again, the predictions file shows the ordering implied by the model. The model ranks all training examples correctly. </P><P>Note that ranks are comparable only between examples with the same qid. Note also that the target value (first value in each line of the data files) is only used to define the order of the examples. 
Its absolute value does not matter, as long as the ordering relative to the other examples with the same qid remains the same.</P> <H2>Extensions and Additions</H2> <UL> <LI><a TARGET="_top" HREF="http://ai-nlp.info.uniroma2.it/moschitti/">Tree Kernels</a>: kernel for classifying trees with SVM<I><SUP>light</I></SUP> written by <a TARGET="_top" HREF="http://ai-nlp.info.uniroma2.it/moschitti/">Alessandro Moschitti</a> </LI> <LI><a TARGET="_top" HREF="http://search.cpan.org/~kwilliams/Algorithm-SVMLight/lib/Algorithm/SVMLight.pm">PERL Interface</a>: a PERL interface to SVM<I><SUP>light</I></SUP> written by <a TARGET="_top" HREF="mailto:kwilliams@cpan.org">Ken Williams</a> </LI> <LI><a TARGET="_top" HREF="http://www.cis.TUGraz.at/igi/aschwaig/software.html">Matlab Interface</a>: a MATLAB interface to SVM<I><SUP>light</I></SUP> written by <a TARGET="_top" HREF="http://www.cis.TUGraz.at/igi/aschwaig/index.html">Anton Schwaighofer</a> (for <A HREF="http://svmlight.joachims.org/old/svm_light_v4.00.html">SVM<I><SUP>light</SUP> </I> V4.00</a>) </LI> <LI><a TARGET="_top" HREF="http://www.ship.edu/~thb/mexsvm/">Matlab Interface</a>: a MATLAB MEX-interface to SVM<I><SUP>light</I></SUP> written by <a TARGET="_top" HREF="http://www.ship.edu/~thb/">Tom Briggs</a></LI> <LI><a TARGET="_top" HREF="http://www-cad.eecs.berkeley.edu/~hwawen/research/projects/jsvm/doc/manual/index.html">jSVM</a>: a JAVA interface to SVM<I><SUP>light</I></SUP> written by <a TARGET="_top" HREF="http://www-cad.eecs.berkeley.edu/~hwawen/">Heloise Hwawen Hse</a> (for <A HREF="http://www-ai.cs.uni-dortmund.de/SOFTWARE/SVM_LIGHT/svm_light_v2.01.eng.html">SVM<I><SUP>light</I></SUP> V2.01</A>) </LI> <LI>A <a href="http://sourceforge.net/project/showfiles.php?group_id=16036">special version of SVM<I><SUP>light</SUP> </I></a> is integrated into the virtual file system <a href="http://witme.sourceforge.net/libferris.web/">libferris</a> by Ben Martin </LI> <LI><a 
href="http://www.quantderivatives.com/svm_light_data_helper.html"> LightDataAgent</a>: tool to translate comma/tab-delimited data into SVM<SUP><I>light</I></SUP> format, written by Ophir Gottlieb</LI> <LI><a href="http://www.aifb.uni-karlsruhe.de/WBS/sbl/software/jnikernel/">JNI Kernel</a>: interface for SVM<SUP><I>light</I></SUP> to access kernel functions implemented in Java, written by Stephan Bloehdorn (for <A HREF="http://www-ai.cs.uni-dortmund.de/SOFTWARE/SVM_LIGHT/svm_light_v6.01.eng.html">SVM<SUP><I>light</I></SUP> V6.01</A>) </LI> </UL> <H2>Questions and Bug Reports</H2> <P>If you find bugs or you have problems with the code you cannot solve by yourself, please contact me via email <<A href="mailto:svm-light@ls8.cs.uni-dortmund.de">svm-light@ls8.cs.uni-dortmund.de</A>>. </P> <H2>Disclaimer</H2> <P>This software is free only for non-commercial use. It must not be distributed without prior permission of the author. The author is not responsible for implications from the use of this software. </P> <H2>History</H2> <H4>V6.00 - V6.01</H4> <UL> <LI>Small bug fixes in HIDEO optimizer. </UL> <H4>V5.00 - V6.00</H4> <UL> <LI>Allows restarts from a particular vector of dual variables (option y). <LI>Time out for exceeding number of iterations without progress (option #). <LI>Allows the use of Kernels for learning ranking functions. <LI>Support for non-vectorial data like strings. <LI>Improved robustness and convergence especially for regression problems. <LI>Cleaned up code, which makes it easier to integrate it into other programs. <LI>Interface to SVM<I><SUP>struct</SUP></I>. <LI>Source code for <A HREF="./old/svm_light_v5.00.html">SVM<I><SUP>light</I></SUP> V5.00</A></LI> </UL> <H4>V4.00 - V5.00</H4> <UL> <LI>Can now solve ranking problems in addition to classification and regression. <LI>Fixed bug in kernel cache that could lead to segmentation fault on some platforms. <LI>Fixed bug in transductive SVM that was introduced in version V4.00. 
<LI>Improved robustness.</LI> <LI>Source code for <A HREF="./old/svm_light_v4.00.html">SVM<I><SUP>light</I></SUP> V4.00</A></LI> </UL> <H4>V3.50 - V4.00</H4> <UL> <LI>Can now solve regression problems in addition to classification. <LI>Bug fixes and improved numerical stability. </LI> <LI>Source code for <A HREF="./old/svm_light_v3.50.html">SVM<I><SUP>light</I></SUP> V3.50</A></LI> </UL> <H4>V3.02 - V3.50</H4> <UL> <LI>Computes XiAlpha estimates of the error rate, the precision, and the recall. <LI>Efficiently computes Leave-One-Out estimates of the error rate, the precision, and the recall. <LI>Improved Hildreth and D'Espo optimizer especially for low-dimensional data sets. <LI>Easier to link into other C and C++ code. Easier compilation under Windows. <LI>Faster classification of new examples for linear SVMs. </LI> </UL> <H4>V3.01 - V3.02</H4> <UL> <LI>Now examples can be read in correctly on SGIs. </LI> </UL> <H4>V3.00 - V3.01</H4> <UL> <LI>Fixed convergence bug for Hildreth and D'Espo solver. </LI> </UL> <H4>V2.01 - V3.00</H4> <UL> <LI>Training algorithm for transductive Support Vector Machines. <LI>Integrated core QP-solver based on the method of Hildreth and D'Espo. <LI>Uses folding in the linear case, which speeds up linear SVM training by an order of magnitude. <LI>Allows linear cost models. <LI>Faster in general. </LI> </UL> <H4>V2.00 - V2.01</H4> <UL> <LI>Improved interface to PR_LOQO <LI>Source code for <A HREF="http://www-ai.cs.uni-dortmund.de/SOFTWARE/SVM_LIGHT/svm_light_v2.01.eng.html">SVM<I><SUP>light</I></SUP> V2.01</A> </LI> </UL> <H4>V1.00 - V2.00</H4> <UL> <LI>Learning is much faster especially for large training sets. <LI>Working set selection based on steepest feasible descent. <LI>"Shrinking" heuristic. <LI>Improved caching. <LI>New solver for intermediate QPs. <LI>Lets you set the size of the cache in MB. <LI>Simplified output format of svm_classify. <LI>Data files may contain comments. 
</LI> </UL> <H4>V0.91 - V1.00</H4> <UL> <LI>Learning is more than 4 times faster. <LI>Smarter caching and optimization. <LI>You can define your own kernel function. <LI>Lets you set the size of the cache. <LI>VCdim is now estimated based on the radius of the support vectors. <LI>The classification module is more memory efficient. <LI>The f2c library is available from <A href="ftp://ftp-ai.cs.uni-dortmund.de/pub/Users/thorsten/svm_light/f2c/">here</A>. <LI>Adaptive precision tuning makes optimization more robust. <LI>Includes some small bug fixes and is more robust. <LI>Source code for <A HREF="http://www-ai.cs.uni-dortmund.de/SOFTWARE/SVM_LIGHT/svm_light_v1.00.eng.html">SVM<I><SUP>light</I></SUP> V1.00</A> </LI> </UL> <H4>V0.9 - V0.91</H4> <UL> <LI>Fixed bug which appears for very small C. Optimization did not converge. </LI> </UL> <a NAME="References"></a><H2>References</H2> <TABLE CELLSPACING=0 BORDER=0 CELLPADDING=5> <TR><TD WIDTH="34%" VALIGN="top"> <P>[Joachims, 2002a]</P></TD> <TD WIDTH="66%" VALIGN="top"> <P>Thorsten Joachims, <a href="http://textclassification.joachims.org"><i>Learning to Classify Text Using Support Vector Machines</i></a>. Dissertation, Kluwer, 2002.<br> [<a href="http://search.barnesandnoble.com/booksearch/isbninquiry.asp?isbn=079237679X">B&N</a>] [<a href="http://www.amazon.com/exec/obidos/ASIN/079237679X">Amazon</a>] [<a href="http://www.wkap.nl/prod/b/0-7923-7679-X">Kluwer</a>] </P></TD> </TR> <TR><TD WIDTH="34%" VALIGN="top"> [Joachims, 2002c]</TD> <TD WIDTH="66%" VALIGN="top"> <span lang="EN-GB" style="mso-ansi-language: EN-GB">T. 
Joachims, <i>Optimizing Search Engines Using Clickthrough Data</i>, Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2002.<br> </span><a href="http://www.joachims.org/publications/joachims_02c.ps.gz"><span lang="EN-GB" style="mso-ansi-language: EN-GB">Online [Postscript]</span></a><span lang="EN-GB" style="mso-ansi-language: EN-GB"> </span><a href="http://www.joachims.org/publications/joachims_02c.pdf"><span lang="EN-GB" style="mso-ansi-language: EN-GB">[PDF]</span></a><span lang="EN-GB" style="mso-ansi-language: EN-GB"> </span></TD> </TR> <TR><TD WIDTH="34%" VALIGN="top"> <P>[Klinkenberg, Joachims, 2000a]</P></TD> <TD WIDTH="66%" VALIGN="top"> <P>R. Klinkenberg and T. Joachims, <I>Detecting Concept Drift with Support Vector Machines</I>. Proceedings of the Seventeenth International Conference on Machine Learning (ICML), Morgan Kaufmann, 2000. <BR> <a TARGET="_top" HREF="http://www.joachims.org/publications/klinkenberg_joachims_2000a.ps.gz">Online [Postscript (gz)]</a> <a TARGET="_top" HREF="http://www.joachims.org/publications/klinkenberg_joachims_2000a.pdf.gz">[PDF (gz)]</a></P></TD> </TR> <TR><TD WIDTH="34%" VALIGN="top"> <P>[Joachims, 2000b]</P></TD> <TD WIDTH="66%" VALIGN="top"> <P>T. Joachims, <I>Estimating the Generalization Performance of a SVM Efficiently</I>. Proceedings of the International Conference on Machine Learning, Morgan Kaufmann, 2000. <BR> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_00a.ps.gz">Online [Postscript (gz)]</a> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_00a.pdf">[PDF]</a></P></TD> </TR> <TR><TD WIDTH="34%" VALIGN="top"> <P>[Joachims, 1999a]</P></TD> <TD WIDTH="66%" VALIGN="top"> <P>T. Joachims, <I>Making large-Scale SVM Learning Practical</I>. Chapter 11 in: Advances in Kernel Methods - Support Vector Learning, B. Schölkopf and C. Burges and A. Smola (ed.), MIT Press, 1999. 
<BR> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_99a.ps.gz">Online [Postscript (gz)]</a> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_99a.pdf">[PDF]</a></P></TD> </TR> <TR><TD WIDTH="34%" VALIGN="top"> <P>[Joachims, 1999c]</P></TD> <TD WIDTH="66%" VALIGN="top"> <P>Thorsten Joachims, <I>Transductive Inference for Text Classification using Support Vector Machines</I>. International Conference on Machine Learning (ICML), 1999. <BR> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_99c.ps.gz">Online [Postscript (gz)]</a> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_99c.pdf">[PDF]</a></P></TD> </TR> <TR><TD WIDTH="34%" VALIGN="top"> <P>[Morik et al., 1999a]</P></TD> <TD WIDTH="66%" VALIGN="top"> <P>K. Morik, P. Brockhausen, and T. Joachims, <I>Combining statistical learning with a knowledge-based approach - A case study in intensive care monitoring</I>. Proc. 16th Int'l Conf. on Machine Learning (ICML-99), 1999. <BR> <a TARGET="_top" HREF="http://www.joachims.org/publications/morik_etal_99a.ps.gz">Online [Postscript (gz)]</a> <a TARGET="_top" HREF="http://www.joachims.org/publications/morik_etal_99a.pdf">[PDF]</a></P></TD> </TR> <TR><TD WIDTH="34%" VALIGN="top"> <P>[Joachims, 1998a]</P></TD> <TD WIDTH="66%" VALIGN="top"> <P>T. Joachims, <I>Text Categorization with Support Vector Machines: Learning with Many Relevant Features</I>. Proceedings of the European Conference on Machine Learning, Springer, 1998. <BR> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_98a.ps.gz">Online [Postscript (gz)]</a> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_98a.pdf">[PDF]</a></P></TD> </TR> <TR><TD WIDTH="34%" VALIGN="top"> <P>[Joachims, 1998c]</P></TD> <TD WIDTH="66%" VALIGN="top"> <P>Thorsten Joachims, <I>Making Large-Scale SVM Learning Practical</I>. LS8-Report, 24, Universität Dortmund, LS VIII-Report, 1998. 
<BR> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_98c.ps.gz">Online [Postscript (gz)]</a> <a TARGET="_top" HREF="http://www.joachims.org/publications/joachims_98c.pdf">[PDF]</a></P></TD> </TR> <TR><TD WIDTH="34%" VALIGN="top"> <P>[Vapnik, 1995a]</P></TD> <TD WIDTH="66%" VALIGN="top"> <P>Vladimir N. Vapnik, <I>The Nature of Statistical Learning Theory</I>. Springer, 1995.</P></TD> </TR> </TABLE> <H2>Other SVM Resources</H2> <UL> <LI><a TARGET="_top" HREF="http://www.first.gmd.de/">GMD-First Berlin</a> <LI><a TARGET="_top" HREF="http://www.kernel-machines.org/">Kernel-Machines Web Site</a> <LI><a TARGET="_top" HREF="http://svm.research.bell-labs.com/">Bell Labs</a> <LI><a TARGET="_top" HREF="http://www.research.microsoft.com/~jplatt/svm.html">Microsoft Research</a> <LI><a TARGET="_top" HREF="http://www.dcs.rhbnc.ac.uk/research/compint/areas/comp_learn/sv/index.shtml">Royal Holloway College</a> <LI><a TARGET="_top" HREF="http://wwwsyseng.anu.edu.au/lsg/">ANU Canberra</a> <LI><a TARGET="_top" HREF="http://www.ai.mit.edu/projects/cbcl/res-area/theory/index-theory-learning.html">MIT</a> <LI><a TARGET="_top" HREF="http://lara.enm.bris.ac.uk/cig/">Bristol CI-Group</a> </LI></UL> <P>Last modified July 20th, 2004 by <a TARGET="_top" HREF="http://www.joachims.org/">Thorsten Joachims</a> <<A href="mailto:thorsten@joachims.org">thorsten@joachims.org</A>></P></BODY> </HTML>