LIBSVM: Scripts (subset.py, grid.py, checkdata.py)

2023-11-15

1. Scripts

This directory includes some useful codes:

1. Subset selection tool: subset.py
2. Parameter selection tool: grid.py
3. LIBSVM format checking tool: checkdata.py

Part I: Subset selection tools

Introduction

============

Training on large data sets is time consuming. Sometimes one should work on a smaller subset first. The python script subset.py randomly selects a specified number of samples. For classification data, we provide a stratified selection to ensure the same class distribution in the subset.

Usage: subset.py [options] dataset number [output1] [output2]

This script selects a subset of the given data set.

options: 
-s method : method of selection (default 0) 
0 -- stratified selection (classification only)
1 -- random selection

output1 : the subset (optional) 
output2 : the rest of data (optional)

If output1 is omitted, the subset will be printed on the screen.

Example

=======

> python subset.py heart_scale 100 file1 file2

From heart_scale, 100 samples are randomly selected and stored in file1. All remaining instances are stored in file2.

Because no -s option is given here, the default -s 0 (stratified selection) is used.
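The idea of stratified selection can be sketched in a few lines of Python 3. This is only an illustration of the technique, not the code of subset.py: the function name stratified_subset, the seed argument, and the rounding rule are assumptions made for this sketch, and because of rounding the subset size may differ slightly from the requested number.

```python
import random
from collections import defaultdict


def stratified_subset(lines, number, seed=0):
    """Pick about `number` lines while keeping the class proportions.

    `lines` are LIBSVM-format strings; the first token of each line is
    taken as the class label. Blank lines are not handled in this sketch.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, line in enumerate(lines):
        label = line.split(None, 1)[0]
        by_label[label].append(i)

    total = len(lines)
    chosen = set()
    for label, indices in by_label.items():
        # Each class contributes in proportion to its share of the data.
        k = int(round(number * len(indices) / total))
        chosen.update(rng.sample(indices, min(k, len(indices))))

    subset = [lines[i] for i in sorted(chosen)]
    rest = [line for i, line in enumerate(lines) if i not in chosen]
    return subset, rest


# Hypothetical toy data: 6 positive and 4 negative instances.
data = ["+1 1:0.1"] * 6 + ["-1 1:0.9"] * 4
subset, rest = stratified_subset(data, 5)
```

With this toy data the subset keeps the 6:4 class ratio, so it holds three positive and two negative lines, and the other five lines end up in rest.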

Part II: Parameter Selection Tools

Introduction

============

grid.py is a parameter selection tool for C-SVM classification using the RBF (radial basis function) kernel. It uses the cross validation (CV) technique to estimate the accuracy of each parameter combination in the specified range and helps you to decide the best parameters for your problem.

grid.py directly executes libsvm binaries (so no python binding is needed) for cross validation and then draws a contour plot of CV accuracy using gnuplot. You must have libsvm and gnuplot installed before using it. The package gnuplot is available at http://www.gnuplot.info/

On Mac OS X, the precompiled gnuplot file needs the library AquaTerm, which thus must be installed as well. In addition, this version of gnuplot does not support png, so you need to change "set term png transparent small" and use other image formats. For example, you may have "set term pbm small color".

Usage: grid.py [grid_options] [svm_options] dataset

grid_options :

-log2c {begin,end,step | "null"} : set the range of c (default -5,15,2) 
    begin,end,step -- c_range = 2^{begin,...,begin+k*step,...,end} 
    "null"         -- do not grid with c 

-log2g {begin,end,step | "null"} : set the range of g (default 3,-15,-2) 
    begin,end,step -- g_range = 2^{begin,...,begin+k*step,...,end} 
    "null"         -- do not grid with g 

-v n : n-fold cross validation (default 5) 

-svmtrain pathname : set svm executable path and name 

-gnuplot {pathname | "null"} : 
    pathname -- set gnuplot executable path and name 
    "null"   -- do not plot 

-out {pathname | "null"} : (default dataset.out) 
    pathname -- set output file path and name 
    "null"   -- do not output file 

-png pathname : set graphic output file path and name (default dataset.png) 

-resume [pathname] : resume the grid task using an existing output file (default pathname is dataset.out)

Use this option only if some parameters have been checked for the SAME data.

svm_options : additional options for svm-train (for example, the -m option used in the example below sets the cache size)

The program conducts v-fold cross validation using parameter C (and gamma) = 2^begin, 2^(begin+step), ..., 2^end.
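As a small illustration of how begin,end,step expand into the values of C and gamma, here is a Python sketch. The helper name grid_values is made up for this example and is not a function of grid.py; the order in which grid.py actually visits the grid points may differ.

```python
def grid_values(begin, end, step):
    """Return the exponents begin, begin+step, ... that do not pass end."""
    values = []
    v = begin
    # A positive step counts up to end, a negative step counts down to end.
    while (step > 0 and v <= end) or (step < 0 and v >= end):
        values.append(v)
        v += step
    return values


# Default ranges listed under grid_options.
log2c = grid_values(-5, 15, 2)    # -5, -3, ..., 15
log2g = grid_values(3, -15, -2)   # 3, 1, ..., -15

# Every (C, gamma) pair that a full grid over these ranges contains.
grid = [(2.0 ** c, 2.0 ** g) for c in log2c for g in log2g]
```

With the default ranges there are 11 values of C and 10 values of gamma, so a full search runs cross validation 110 times.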

You can specify where the libsvm executable and gnuplot are using the -svmtrain and -gnuplot parameters.

For Windows users, please use pgnuplot.exe. If you are using gnuplot 3.7.1, please upgrade to version 3.7.3 or higher. The version 3.7.1 has a bug. If you use cygwin on Windows, please use gnuplot-x11.

If the task is terminated accidentally or you would like to change the range of parameters, you can apply '-resume' to save time by re-using previous results. You may specify the output file of a previous run or use the default (i.e., dataset.out) without giving a name. Please note that the same conditions must be used in both runs. For example, you cannot use '-v 10' earlier and resume the task with '-v 5'.

The value of some options can be "null." For example, `-log2c -1,0,1 -log2g "null"' means that C=2^-1,2^0,2^1 and g=LIBSVM's default gamma value. That is, you do not conduct parameter selection on gamma.

Example

=======

> python grid.py -log2c -5,5,1 -log2g -4,0,1 -v 5 -m 300 heart_scale

Users (in particular MS Windows users) may need to specify the path of executable files. You can either change paths in the beginning of grid.py (recommended) or specify them in the command line. For example,

> grid.py -log2c -5,5,1 -svmtrain "c:\Program Files\libsvm\windows\svm-train.exe" -gnuplot c:\tmp\gnuplot\binary\pgnuplot.exe -v 10 heart_scale

Output: two files
dataset.png: the CV accuracy contour plot generated by gnuplot
dataset.out: the CV accuracy at each (log2(C),log2(gamma))

The following example saves running time by loading the output file of a previous run.

> python grid.py -log2c -7,7,1 -log2g -5,2,1 -v 5 -resume heart_scale.out heart_scale

Parallel grid search

====================

You can conduct a parallel grid search by dispatching jobs to a cluster of computers which share the same file system. First, you add machine names in grid.py:

ssh_workers = ["linux1", "linux5", "linux5"]

and then set up your ssh so that the authentication works without asking for a password.

The same machine (e.g., linux5 here) can be listed more than once if it has multiple CPUs or has more RAM. If the local machine is the best, you can also enlarge the nr_local_worker. For example:

nr_local_worker = 2

Example:

> python grid.py heart_scale 
[local] -1 -1 78.8889  (best c=0.5, g=0.5, rate=78.8889) 
[linux5] -1 -7 83.3333  (best c=0.5, g=0.0078125, rate=83.3333) 
[linux5] 5 -1 77.037  (best c=0.5, g=0.0078125, rate=83.3333) 
[linux1] 5 -7 83.3333  (best c=0.5, g=0.0078125, rate=83.3333) 
.

If -log2c, -log2g, or -v is not specified, default values are used.

If your system uses telnet instead of ssh, you list the computer names in telnet_workers.
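If you want to post-process the progress lines printed in the example above, a small parser is enough. The sketch below is an assumption-laden illustration: the regular expression is written only against the line shapes shown in this document ("[worker] log2c log2g rate (best ...)", with or without a "rate=" prefix), and parse_progress is a hypothetical helper, not part of grid.py.

```python
import re

# [worker] log2c log2g rate   or   [worker] log2c log2g rate=...
LINE = re.compile(
    r"^\[(?P<worker>[^\]]+)\]\s+"
    r"(?P<log2c>-?\d+(?:\.\d+)?)\s+"
    r"(?P<log2g>-?\d+(?:\.\d+)?)\s+"
    r"(?:rate=)?(?P<rate>\d+(?:\.\d+)?)"
)


def parse_progress(text):
    """Return (worker, log2c, log2g, rate) tuples for every matching line."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            rows.append((m.group("worker"), float(m.group("log2c")),
                         float(m.group("log2g")), float(m.group("rate"))))
    return rows


# Lines copied from the example output above.
sample = """\
[local] -1 -1 78.8889  (best c=0.5, g=0.5, rate=78.8889)
[linux5] -1 -7 83.3333  (best c=0.5, g=0.0078125, rate=83.3333)
[linux5] 5 -1 77.037  (best c=0.5, g=0.0078125, rate=83.3333)
[linux1] 5 -7 83.3333  (best c=0.5, g=0.0078125, rate=83.3333)
"""
rows = parse_progress(sample)
# max() keeps the first of several rows that share the highest rate.
best = max(rows, key=lambda r: r[3])
```

Here best holds log2c = -1 and log2g = -7, i.e. c = 2^-1 = 0.5 and g = 2^-7 = 0.0078125, which agrees with the "best" values printed by grid.py.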

Calling grid in Python

======================

In addition to using grid.py as a command-line tool, you can use it as a Python module.

>>> rate, param = find_parameters(dataset, options)

You need to specify `dataset' and `options' (default ''). See the following example.

> python

>>> from grid import * 
>>> rate, param = find_parameters('../heart_scale', '-log2c -1,1,1 -log2g -1,1,1') 
[local] 0.0 0.0 rate=74.8148 (best c=1.0, g=1.0, rate=74.8148) 
[local] 0.0 -1.0 rate=77.037 (best c=1.0, g=0.5, rate=77.037) 
. 
. 
[local] -1.0 -1.0 rate=78.8889 (best c=0.5, g=0.5, rate=78.8889) 
. 
. 
>>> rate 
78.8889 
>>> param 
{'c': 0.5, 'g': 0.5}
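The returned dictionary can be turned into an option string for a later training run. The helper below is a hypothetical convenience written for this example (it is not provided by grid.py) and assumes that every key of param is an svm-train option letter, as 'c' and 'g' are in the session above.

```python
def to_svm_options(param, extra=""):
    """Turn {'c': ..., 'g': ...} into an svm-train style option string."""
    parts = ["-%s %s" % (name, param[name]) for name in sorted(param)]
    if extra:
        parts.append(extra)
    return " ".join(parts)


param = {'c': 0.5, 'g': 0.5}            # values copied from the session above
options = to_svm_options(param, "-m 300")
```

The resulting string is "-c 0.5 -g 0.5 -m 300", which has the same form as the svm_options accepted on the command line.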

Part III: LIBSVM format checking tools

Introduction

============

`svm-train' conducts only a simple check of the input data. To do a detailed check, we provide a python script `checkdata.py.'

Usage: checkdata.py dataset

Exit status (returned value): 1 if there are errors, 0 otherwise.

This tool was written by Rong-En Fan at National Taiwan University.

Example

=======

> cat bad_data 
1 3:1 2:4 
> python checkdata.py bad_data 
line 1: feature indices must be in an ascending order, previous/current features 3:1 2:4 
Found 1 lines with error.
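The kind of check reported in the example above can be sketched in Python as follows. This is a simplified illustration and not the code of checkdata.py: check_line is a hypothetical helper, it accepts only a single numeric label, and its messages are worded for this sketch.

```python
def check_line(line):
    """Return a list of problems found in one LIBSVM-format line."""
    tokens = line.split()
    if not tokens:
        return ["empty line"]

    problems = []
    try:
        float(tokens[0])
    except ValueError:
        problems.append("label %r is not a number" % tokens[0])

    prev = None
    for tok in tokens[1:]:
        index, sep, value = tok.partition(":")
        try:
            if not sep:
                raise ValueError(tok)
            index = int(index)
            float(value)
        except ValueError:
            problems.append("feature %r is not index:value" % tok)
            continue
        # Indices must strictly increase from left to right.
        if prev is not None and index <= prev:
            problems.append("feature indices must be in ascending order")
        prev = index
    return problems


bad = check_line("1 3:1 2:4")    # the line from bad_data above
good = check_line("1 2:4 3:1")
```

For the line from bad_data the sketch reports one problem (indices not ascending); with the two features swapped it reports none.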

2. MATLAB/OCTAVE interface of LIBSVM

=================

- Introduction
- Installation
- Usage
- Returned Model Structure
- Other Utilities
- Examples
- Additional Information

Introduction

============

This tool provides a simple interface to LIBSVM, a library for support vector machines (http://www.csie.ntu.edu.tw/~cjlin/libsvm). It is very easy to use as the usage and the way of specifying parameters are the same as that of LIBSVM.

Installation

============

On Windows systems, pre-built binary files are already in the directory '..\windows', so no installation is needed. Now we provide binary files only for 64bit MATLAB on Windows. If you would like to re-build the package, please follow the steps below.

We recommend using make.m on both MATLAB and OCTAVE. Just type 'make' to build 'libsvmread.mex', 'libsvmwrite.mex', 'svmtrain.mex', and 'svmpredict.mex'.

On MATLAB or Octave:

>> make

If make.m does not work on MATLAB (especially for Windows), try 'mex -setup' to choose a suitable compiler for mex. Make sure your compiler is accessible and workable. Then type 'make' to start the installation.

Example:

    matlab>> mex -setup
    (ps: MATLAB will show the following messages to setup default compiler.)
    Please choose your compiler for building external interface (MEX) files:
    Would you like mex to locate installed compilers [y]/n? y
    Select a compiler:
    [1] Microsoft Visual C/C++ version 7.1 in C:\Program Files\Microsoft Visual Studio
    [0] None
    Compiler: 1
    Please verify your choices:
    Compiler: Microsoft Visual C/C++ 7.1
    Location: C:\Program Files\Microsoft Visual Studio
    Are these correct?([y]/n): y

matlab>> make

On Unix systems, if neither make.m nor 'mex -setup' works, please use Makefile and type 'make' in a command window. Note that we assume your MATLAB is installed in '/usr/local/matlab'. If not, please change MATLABDIR in Makefile.

Example:
linux> make

To use octave, type 'make octave':

Example:
linux> make octave

For a list of supported/compatible compilers for MATLAB, please check the following page:

http://www.mathworks.com/support/compilers/current_release/

Usage

=====

matlab> model = svmtrain(training_label_vector, training_instance_matrix [, 'libsvm_options']);

        -training_label_vector:
            An m by 1 vector of training labels (type must be double).
        -training_instance_matrix:
            An m by n matrix of m training instances with n features.
            It can be dense or sparse (type must be double).
        -libsvm_options:
            A string of training options in the same format as that of LIBSVM.

matlab> [predicted_label, accuracy, decision_values/prob_estimates] = svmpredict(testing_label_vector, testing_instance_matrix, model [, 'libsvm_options']);
matlab> [predicted_label] = svmpredict(testing_label_vector, testing_instance_matrix, model [, 'libsvm_options']);

        -testing_label_vector:
            An m by 1 vector of prediction labels. If labels of test
            data are unknown, simply use any random values. (type must be double)
        -testing_instance_matrix:
            An m by n matrix of m testing instances with n features.
            It can be dense or sparse. (type must be double)
        -model:
            The output of svmtrain.
        -libsvm_options:
            A string of testing options in the same format as that of LIBSVM.

Returned Model Structure

========================

The 'svmtrain' function returns a model which can be used for future prediction. It is a structure and is organized as [Parameters, nr_class, totalSV, rho, Label, sv_indices, ProbA, ProbB, nSV, sv_coef, SVs]:

        -Parameters: parameters
        -nr_class: number of classes; = 2 for regression/one-class svm
        -totalSV: total #SV
        -rho: -b of the decision function(s) wx+b
        -Label: label of each class; empty for regression/one-class SVM
        -sv_indices: values in [1,...,num_training_data] to indicate SVs in the training set
        -ProbA: pairwise probability information; empty if -b 0 or in one-class SVM
        -ProbB: pairwise probability information; empty if -b 0 or in one-class SVM
        -nSV: number of SVs for each class; empty for regression/one-class SVM
        -sv_coef: coefficients for SVs in decision functions
        -SVs: support vectors

If you do not use the option '-b 1', ProbA and ProbB are empty matrices. If the '-v' option is specified, cross validation is conducted and the returned model is just a scalar: cross-validation accuracy for classification and mean-squared error for regression.

More details about this model can be found in LIBSVM FAQ (http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html) and LIBSVM
implementation document (http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf).

Result of Prediction

====================

The function 'svmpredict' has three outputs. The first one, predicted_label, is a vector of predicted labels. The second output, accuracy, is a vector including accuracy (for classification), mean squared error, and squared correlation coefficient (for regression).
The third is a matrix containing decision values or probability estimates (if '-b 1' is specified). If k is the number of classes in training data, for decision values, each row includes results of predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a special case. Decision value +1 is returned for each testing instance, instead of an empty vector. For probabilities, each row contains k values indicating the probability that the testing instance is in each class.
Note that the order of classes here is the same as 'Label' field in the model structure.
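To make the k(k-1)/2 count concrete, the short Python sketch below enumerates the class pairs for a given label order. It assumes that the binary problems are ordered as (first vs. second), (first vs. third), ..., (second vs. third), ... following the order of the 'Label' field; please check the LIBSVM FAQ before relying on this ordering. The helper name pairwise_columns and the label values are made up for the example.

```python
from itertools import combinations


def pairwise_columns(labels):
    """List the (label_i, label_j) pair assumed for each decision-value column."""
    return list(combinations(labels, 2))


labels = [3, 1, 2]                 # hypothetical content of the Label field
columns = pairwise_columns(labels)
k = len(labels)
```

For k = 3 classes there are 3 pairs, so each row of decision values has three entries under this assumption.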

Other Utilities

===============

A matlab function libsvmread reads files in LIBSVM format:

[label_vector, instance_matrix] = libsvmread('data.txt');

Two outputs are labels and instances, which can then be used as inputs of svmtrain or svmpredict.
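For readers unfamiliar with the LIBSVM file format ("label index:value index:value ..."), the following Python sketch shows how such lines map to labels and sparse instances. It is an illustration only, not the implementation of libsvmread; the helper names are hypothetical and only single numeric labels are handled.

```python
def parse_libsvm_line(line):
    """Split one 'label idx:val idx:val ...' line into (label, {idx: val})."""
    tokens = line.split()
    label = float(tokens[0])
    features = {}
    for tok in tokens[1:]:
        index, value = tok.split(":")
        features[int(index)] = float(value)
    return label, features


def read_libsvm(lines):
    """Parse an iterable of lines, skipping blank ones."""
    labels, instances = [], []
    for line in lines:
        if not line.strip():
            continue
        y, x = parse_libsvm_line(line)
        labels.append(y)
        instances.append(x)
    return labels, instances


sample = ["+1 1:0.5 3:-1", "-1 2:1"]   # hypothetical two-line data set
labels, instances = read_libsvm(sample)
```

Features that do not appear on a line (feature 2 in the first instance here) are implicitly zero, which is why the format suits sparse data.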

A matlab function libsvmwrite writes Matlab matrix to a file in LIBSVM format:

libsvmwrite('data.txt', label_vector, instance_matrix)

The instance_matrix must be a sparse matrix (type must be double).

For 32bit and 64bit MATLAB on Windows, pre-built binary files are ready in the directory `..\windows', but in future releases, we will only include 64bit MATLAB binary files.

These codes are prepared by Rong-En Fan and Kai-Wei Chang from National Taiwan University.

Examples

========

Train and test on the provided data heart_scale:

matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab> model = svmtrain(heart_scale_label, heart_scale_inst, '-c 1 -g 0.07');
matlab> [predict_label, accuracy, dec_values] = svmpredict(heart_scale_label, heart_scale_inst, model); % test the training data

For probability estimates, you need '-b 1' for training and testing:

matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab> model = svmtrain(heart_scale_label, heart_scale_inst, '-c 1 -g 0.07 -b 1');
matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab> [predict_label, accuracy, prob_estimates] = svmpredict(heart_scale_label, heart_scale_inst, model, '-b 1');

To use precomputed kernel, you must include sample serial number as the first column of the training and testing data (assume your kernel matrix is K, # of instances is n):

matlab> K1 = [(1:n)', K]; % include sample serial number as first column
matlab> model = svmtrain(label_vector, K1, '-t 4');
matlab> [predict_label, accuracy, dec_values] = svmpredict(label_vector, K1, model); % test the training data

We give the following detailed example by splitting heart_scale into 150 training and 120 testing data. Constructing a linear kernel matrix and then using the precomputed kernel gives exactly the same testing error as using the LIBSVM built-in linear kernel.

matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab>
matlab> % Split Data
matlab> train_data = heart_scale_inst(1:150,:);
matlab> train_label = heart_scale_label(1:150,:);
matlab> test_data = heart_scale_inst(151:270,:);
matlab> test_label = heart_scale_label(151:270,:);
matlab>
matlab> % Linear Kernel
matlab> model_linear = svmtrain(train_label, train_data, '-t 0');
matlab> [predict_label_L, accuracy_L, dec_values_L] = svmpredict(test_label, test_data, model_linear);
matlab>
matlab> % Precomputed Kernel
matlab> model_precomputed = svmtrain(train_label, [(1:150)', train_data*train_data'], '-t 4');
matlab> [predict_label_P, accuracy_P, dec_values_P] = svmpredict(test_label, [(1:120)', test_data*train_data'], model_precomputed);
matlab>
matlab> accuracy_L % Display the accuracy using linear kernel
matlab> accuracy_P % Display the accuracy using precomputed kernel

Note that for testing, you can put anything in the testing_label_vector. For more details of precomputed kernels, please read the section ``Precomputed Kernels'' in the README of the LIBSVM package.

Additional Information

======================

This interface was initially written by Jun-Cheng Chen, Kuan-Jen Peng, Chih-Yuan Yang and Chih-Huai Cheng from Department of Computer Science, National Taiwan University. The current version was prepared by Rong-En Fan and Ting-Fan Wu. If you find this tool useful, please cite LIBSVM as follows

Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

For any question, please contact Chih-Jen Lin cjlin@csie.ntu.edu.tw, or check the FAQ page: http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#/Q10:_MATLAB_interface

3. Python interface of LIBSVM

=================

- Introduction
- Installation
- Quick Start
- Design Description
- Data Structures
- Utility Functions
- Additional Information

Introduction

============

Python (http://www.python.org/) is a programming language suitable for rapid development. This tool provides a simple Python interface to LIBSVM, a library for support vector machines (http://www.csie.ntu.edu.tw/~cjlin/libsvm). The interface is very easy to use as the usage is the same as that of LIBSVM. The interface is developed with the built-in Python library "ctypes."

Installation

============

On Unix systems, type

> make

The interface needs only LIBSVM shared library, which is generated by the above command. We assume that the shared library is on the LIBSVM main directory or in the system path.

On Windows, the shared library libsvm.dll for 32-bit python is ready in the directory `..\windows'. You can also copy it to the system directory (e.g., `C:\WINDOWS\system32\' for Windows XP). To regenerate the shared library, please follow the instructions for building Windows binaries in LIBSVM README.

Quick Start

===========

There are two levels of usage. The high-level one uses utility functions in svmutil.py and the usage is the same as the LIBSVM MATLAB interface.

>>> from svmutil import *
# Read data in LIBSVM format
>>> y, x = svm_read_problem('../heart_scale')
>>> m = svm_train(y[:200], x[:200], '-c 4')
>>> p_label, p_acc, p_val = svm_predict(y[200:], x[200:], m)

# Construct problem in python format
# Dense data
>>> y, x = [1,-1], [[1,0,1], [-1,0,-1]]
# Sparse data
>>> y, x = [1,-1], [{1:1, 3:1}, {1:-1,3:-1}]
>>> prob  = svm_problem(y, x)
>>> param = svm_parameter('-t 0 -c 4 -b 1')
>>> m = svm_train(prob, param)

# Precomputed kernel data (-t 4)
# Dense data
>>> y, x = [1,-1], [[1, 2, -2], [2, -2, 2]]
# Sparse data
>>> y, x = [1,-1], [{0:1, 1:2, 2:-2}, {0:2, 1:-2, 2:2}]
# isKernel=True must be set for precomputed kernel
>>> prob  = svm_problem(y, x, isKernel=True)
>>> param = svm_parameter('-t 4 -c 4 -b 1')
>>> m = svm_train(prob, param)
# For the format of precomputed kernel, please read LIBSVM README.

# Other utility functions
>>> svm_save_model('heart_scale.model', m)
>>> m = svm_load_model('heart_scale.model')
>>> p_label, p_acc, p_val = svm_predict(y, x, m, '-b 1')
>>> ACC, MSE, SCC = evaluations(y, p_label)

# Getting online help
>>> help(svm_train)

The low-level use directly calls C interfaces imported by svm.py. Note that all arguments and return values are in ctypes format. You need to handle them carefully.

>>> from svm import *
>>> prob = svm_problem([1,-1], [{1:1, 3:1}, {1:-1,3:-1}])
>>> param = svm_parameter('-c 4')
>>> m = libsvm.svm_train(prob, param) # m is a ctype pointer to an svm_model
# Convert a Python-format instance to svm_nodearray, a ctypes structure
>>> x0, max_idx = gen_svm_nodearray({1:1, 3:1})
>>> label = libsvm.svm_predict(m, x0)

Design Description

==================

There are two files svm.py and svmutil.py, which respectively correspond to low-level and high-level use of the interface.

In svm.py, we adopt the Python built-in library "ctypes," so that Python can directly access C structures and interface functions defined in svm.h.

While advanced users can use structures/functions in svm.py, to avoid handling ctypes structures, in svmutil.py we provide some easy-to-use functions. The usage is similar to LIBSVM MATLAB interface.

Data Structures

===============

Four data structures derived from svm.h are svm_node, svm_problem, svm_parameter, and svm_model. They all contain fields with the same names in svm.h. Access these fields carefully because you directly use a C structure instead of a Python object. For svm_model, accessing the field directly is not recommended.
Programmers should use the interface functions or methods of svm_model class in Python to get the values. The following description introduces additional fields and methods.

Before using the data structures, execute the following command to load the LIBSVM shared library:

>>> from svm import *

- class svm_node:

Construct an svm_node.

>>> node = svm_node(idx, val)

idx: an integer indicating the feature index.

val: a float indicating the feature value.

Show the index and the value of a node.

>>> print(node)

- Function: gen_svm_nodearray(xi [,feature_max=None [,isKernel=False]])

Generate a feature vector from a Python list/tuple or a dictionary:

>>> xi, max_idx = gen_svm_nodearray({1:1, 3:1, 5:-2})

xi: the returned svm_nodearray (a ctypes structure)

max_idx: the maximal feature index of xi

    feature_max: if feature_max is assigned, features with indices larger than
                 feature_max are removed.

    isKernel: if isKernel == True, the list index starts from 0 for precomputed
              kernel. Otherwise, the list index starts from 1. The default
          value is False.
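The relation between a dense list and a sparse dictionary (list positions counted from 1 when isKernel is False) can be illustrated without ctypes. The helper dense_to_sparse below is hypothetical and only mimics the behaviour described above in plain Python; it is not gen_svm_nodearray and does not cover the isKernel=True case.

```python
def dense_to_sparse(xi, feature_max=None):
    """Map a dense list to {index: value}, with indices starting from 1."""
    sparse = {}
    for index, value in enumerate(xi, start=1):
        if value == 0:
            continue                      # zero features are not stored
        if feature_max is not None and index > feature_max:
            continue                      # drop indices above feature_max
        sparse[index] = value
    return sparse


full = dense_to_sparse([1, 0, 1])
clipped = dense_to_sparse([1, 0, 1], feature_max=2)
```

The dense instance [1, 0, 1] therefore corresponds to the sparse instance {1: 1, 3: 1} used in the Quick Start examples, and with feature_max=2 only {1: 1} remains.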

- class svm_problem:

Construct an svm_problem instance

>>> prob = svm_problem(y, x)

y: a Python list/tuple of l labels (type must be int/double).

x: a Python list/tuple of l data instances. Each element of x must be
an instance of list/tuple/dictionary type.

Note that if your x contains sparse data (i.e., dictionary), the internal
ctypes data format is still sparse.

For pre-computed kernel, the isKernel flag should be set to True:

>>> prob = svm_problem(y, x, isKernel=True)

Please read LIBSVM README for more details of pre-computed kernel.
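As an illustration of the dense precomputed-kernel format (a serial number followed by the kernel values against every training instance), the sketch below builds such rows for a linear kernel in plain Python. The helpers dot and linear_kernel_rows are written for this example only and are not part of svm.py or svmutil.py.

```python
def dot(u, v):
    """Inner product of two equally long dense vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel_rows(rows, basis):
    """Return [serial, K(row, basis[0]), ..., K(row, basis[-1])] per row.

    `basis` is the list of training instances; serial numbers start at 1.
    """
    return [[i] + [dot(r, b) for b in basis]
            for i, r in enumerate(rows, start=1)]


x = [[1, 0, 1], [-1, 0, -1]]          # dense data from the Quick Start
train_rows = linear_kernel_rows(x, x)
```

For this data the result is [[1, 2, -2], [2, -2, 2]], the same dense precomputed-kernel input that appears in the Quick Start. For testing data, the rows would be computed against the training instances, e.g. linear_kernel_rows(x_test, x), where x_test is a hypothetical list of test vectors.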

- class svm_parameter:

Construct an svm_parameter instance

>>> param = svm_parameter('training_options')

If 'training_options' is empty, LIBSVM default values are applied.

Set param to LIBSVM default values.

>>> param.set_to_default_values()

Parse a string of options.

>>> param.parse_options('training_options')

Show values of parameters.

>>> print(param)

- class svm_model:

There are two ways to obtain an instance of svm_model:

>>> model = svm_train(y, x)
>>> model = svm_load_model('model_file_name')

    Note that the returned structure of interface functions
    libsvm.svm_train and libsvm.svm_load_model is a ctypes pointer of
    svm_model, which is different from the svm_model object returned
    by svm_train and svm_load_model in svmutil.py. We provide a
    function toPyModel for the conversion:

>>> model_ptr = libsvm.svm_train(prob, param)
>>> model = toPyModel(model_ptr)

If you obtain a model in a way other than the above approaches,
handle it carefully to avoid memory leak or segmentation fault.

Some interface functions to access LIBSVM models are wrapped as
members of the class svm_model:

    >>> svm_type = model.get_svm_type()
    >>> nr_class = model.get_nr_class()
    >>> svr_probability = model.get_svr_probability()
    >>> class_labels = model.get_labels()
    >>> sv_indices = model.get_sv_indices()
    >>> nr_sv = model.get_nr_sv()
    >>> is_prob_model = model.is_probability_model()
    >>> support_vector_coefficients = model.get_sv_coef()
    >>> support_vectors = model.get_SV()

Utility Functions

=================

To use utility functions, type

>>> from svmutil import *

The above command loads
    svm_train()        : train an SVM model
    svm_predict()      : predict testing data
    svm_read_problem() : read the data from a LIBSVM-format file.
    svm_load_model()   : load a LIBSVM model.
    svm_save_model()   : save model to a file.
    evaluations()      : evaluate prediction results.

- Function: svm_train

There are three ways to call svm_train()

    >>> model = svm_train(y, x [, 'training_options'])
    >>> model = svm_train(prob [, 'training_options'])
    >>> model = svm_train(prob, param)

y: a list/tuple of l training labels (type must be int/double).

x: a list/tuple of l training instances. The feature vector of
each training instance is an instance of list/tuple or dictionary.

training_options: a string in the same form as that for LIBSVM command
mode.

    prob: an svm_problem instance generated by calling
          svm_problem(y, x).
      For pre-computed kernel, you should use
      svm_problem(y, x, isKernel=True)

param: an svm_parameter instance generated by calling
svm_parameter('training_options')

    model: the returned svm_model instance. See svm.h for details of this
           structure. If '-v' is specified, cross validation is
           conducted and the returned model is just a scalar: cross-validation
           accuracy for classification and mean-squared error for regression.

To train the same data many times with different
parameters, the second and the third ways should be faster.

Examples:

    >>> y, x = svm_read_problem('../heart_scale')
    >>> prob = svm_problem(y, x)
    >>> param = svm_parameter('-s 3 -c 5 -h 0')
    >>> m = svm_train(y, x, '-c 5')
    >>> m = svm_train(prob, '-t 2 -c 5')
    >>> m = svm_train(prob, param)
    >>> CV_ACC = svm_train(y, x, '-v 3')

- Function: svm_predict

To predict testing data with a model, use

>>> p_labels, p_acc, p_vals = svm_predict(y, x, model [,'predicting_options'])

    y: a list/tuple of l true labels (type must be int/double). It is used
       for calculating the accuracy. Use [0]*len(x) if true labels are
       unavailable.

x: a list/tuple of l predicting instances. The feature vector of
each predicting instance is an instance of list/tuple or dictionary.

predicting_options: a string of predicting options in the same format as
that of LIBSVM.

model: an svm_model instance.

p_labels: a list of predicted labels

    p_acc: a tuple including accuracy (for classification), mean
           squared error, and squared correlation coefficient (for
           regression).

    p_vals: a list of decision values or probability estimates (if '-b 1'
            is specified). If k is the number of classes in training data,
        for decision values, each element includes results of predicting
        k(k-1)/2 binary-class SVMs. For classification, k = 1 is a
        special case. Decision value [+1] is returned for each testing
        instance, instead of an empty list.
        For probabilities, each element contains k values indicating
            the probability that the testing instance is in each class.
            Note that the order of classes is the same as the 'model.label'
            field in the model structure.
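Because each element of the probability output follows the class order of the model, mapping a row of probabilities back to a class label is a matter of indexing. The sketch below uses made-up numbers; labels and p_vals are hypothetical stand-ins for the model's label order and for the third output of svm_predict with '-b 1', and most_probable_labels is not a LIBSVM function.

```python
def most_probable_labels(prob_rows, labels):
    """For each row of k probabilities, return the label with the largest one."""
    result = []
    for row in prob_rows:
        best = max(range(len(row)), key=lambda j: row[j])
        result.append(labels[best])
    return result


labels = [1, -1]                        # hypothetical class order of the model
p_vals = [[0.2, 0.8], [0.9, 0.1]]       # hypothetical probability estimates
picked = most_probable_labels(p_vals, labels)
```

With these numbers the first instance is assigned to class -1 and the second to class 1.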

Example:

>>> m = svm_train(y, x, '-c 5')
>>> p_labels, p_acc, p_vals = svm_predict(y, x, m)

- Functions: svm_read_problem/svm_load_model/svm_save_model

See the usage by examples:

    >>> y, x = svm_read_problem('data.txt')
    >>> m = svm_load_model('model_file')
    >>> svm_save_model('model_file', m)

- Function: evaluations

Calculate some evaluations using the true values (ty) and predicted
values (pv):

>>> (ACC, MSE, SCC) = evaluations(ty, pv)

ty: a list of true values.

pv: a list of predicted values.

ACC: accuracy.

MSE: mean squared error.

SCC: squared correlation coefficient.
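The three measures can be written out in plain Python 3 as follows. This is a sketch based on the usual definitions, not a copy of the code in svmutil.py; in particular, reporting ACC as a percentage and returning NaN for a zero denominator are assumptions of this example, and the name evaluations_sketch is hypothetical.

```python
def evaluations_sketch(ty, pv):
    """Return (ACC in percent, MSE, SCC) for true values ty and predictions pv."""
    if len(ty) != len(pv):
        raise ValueError("len(ty) must equal len(pv)")
    n = len(ty)
    correct = sum(1 for y, v in zip(ty, pv) if y == v)
    sq_err = sum((v - y) ** 2 for y, v in zip(ty, pv))
    sum_v, sum_y = sum(pv), sum(ty)
    sum_vv = sum(v * v for v in pv)
    sum_yy = sum(y * y for y in ty)
    sum_vy = sum(v * y for y, v in zip(ty, pv))

    acc = 100.0 * correct / n
    mse = sq_err / float(n)
    # Squared Pearson correlation between ty and pv.
    num = (n * sum_vy - sum_v * sum_y) ** 2
    den = (n * sum_vv - sum_v ** 2) * (n * sum_yy - sum_y ** 2)
    scc = num / float(den) if den != 0 else float("nan")
    return acc, mse, scc


ty = [1, -1, 1, -1]      # hypothetical true labels
pv = [1, -1, -1, -1]     # hypothetical predictions (one mistake)
ACC, MSE, SCC = evaluations_sketch(ty, pv)
```

For this toy input three of four predictions are correct, so ACC is 75.0, MSE is 1.0, and SCC is 1/3.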

Additional Information

======================

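This interface was written by Hsiang-Fu Yu from Department of Computer Science, National Taiwan University. If you find this tool useful, please cite LIBSVM as follows

Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm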

For any question, please contact Chih-Jen Lin cjlin@csie.ntu.edu.tw, or check the FAQ page: http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html
