LIBSVM: Scripts (subset.py, grid.py, checkdata.py)

2023-11-15

1. Scripts

This directory includes some useful tools:

1. Subset selection tool: subset.py
2. Parameter selection tool: grid.py
3. LIBSVM format checking tool: checkdata.py

Part I: Subset selection tools

Introduction

============

Training on large data sets is time consuming. Sometimes one should work on a smaller subset first. The Python script subset.py randomly selects a specified number of samples. For classification data, we provide a stratified selection to ensure the same class distribution in the subset.

Usage: subset.py [options] dataset number [output1] [output2]

This script selects a subset of the given data set.

options: 
-s method : method of selection (default 0) 
0 -- stratified selection (classification only) 
1 -- random selection

output1 : the subset (optional) 
output2 : the rest of data (optional)

If output1 is omitted, the subset will be printed on the screen.

Example

=======

> python subset.py heart_scale 100 file1 file2

From heart_scale, 100 samples are randomly selected and stored in file1. All remaining instances are stored in file2.

No -s option is given here, so the default -s 0 (stratified selection) applies.
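To make "stratified" concrete, here is a minimal Python sketch of the idea: each class contributes a share of the subset proportional to its frequency. The helper name `stratified_subset` and its rounding rule are assumptions for illustration; subset.py itself may round and break ties differently.

```python
import random
from collections import defaultdict


def stratified_subset(labels, n, seed=0):
    """Return sorted indices of about n samples, keeping class proportions.

    Hypothetical helper for illustration only; not the code in subset.py.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    total = len(labels)
    chosen = []
    for y, idx in by_class.items():
        # Give each class a share of n proportional to its frequency.
        k = round(n * len(idx) / total)
        chosen.extend(rng.sample(idx, min(k, len(idx))))
    return sorted(chosen)


labels = [1] * 60 + [-1] * 40
subset = stratified_subset(labels, 10)
picked = set(subset)
rest = [i for i in range(len(labels)) if i not in picked]
print(len(subset), sum(labels[i] == 1 for i in subset))
```

With 60 positive and 40 negative labels, a subset of 10 contains 6 positives and 4 negatives; `subset` and `rest` play the roles of output1 and output2.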

 

Part II: Parameter Selection Tools

Introduction

============

grid.py is a parameter selection tool for C-SVM classification using the RBF (radial basis function) kernel. It uses the cross validation (CV) technique to estimate the accuracy of each parameter combination in the specified range and helps you decide the best parameters for your problem.

grid.py directly executes libsvm binaries (so no python binding is needed) for cross validation and then draws a contour plot of CV accuracy using gnuplot. You must have libsvm and gnuplot installed before using it. The package gnuplot is available at http://www.gnuplot.info/

On Mac OS X, the precompiled gnuplot file needs the library AquaTerm, which thus must be installed as well. In addition, this version of gnuplot does not support png, so you need to change "set term png transparent small" and use other image formats. For example, you may have "set term pbm small color".

Usage: grid.py [grid_options] [svm_options] dataset

grid_options

-log2c {begin,end,step | "null"} : set the range of c (default -5,15,2) 
    begin,end,step -- c_range = 2^{begin,...,begin+k*step,...,end} 
    "null"         -- do not grid with c 

-log2g {begin,end,step | "null"} : set the range of g (default 3,-15,-2) 
    begin,end,step -- g_range = 2^{begin,...,begin+k*step,...,end} 
    "null"         -- do not grid with g 

-v n : n-fold cross validation (default 5) 

-svmtrain pathname : set svm executable path and name 

-gnuplot {pathname | "null"} : 
    pathname -- set gnuplot executable path and name 
    "null"   -- do not plot 

-out {pathname | "null"} : (default dataset.out) 
    pathname -- set output file path and name 
    "null"   -- do not output file 

-png pathname : set graphic output file path and name (default dataset.png) 

-resume [pathname] : resume the grid task using an existing output file (default pathname is dataset.out)

Use this option only if some parameters have been checked for the SAME data.

 

svm_options : additional options for svm-train (for example, the -m option used in the example below sets the cache size)

 

The program conducts v-fold cross validation using parameter C (and gamma) = 2^begin, 2^(begin+step), ..., 2^end.
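As a small illustration of how a `begin,end,step` triple expands into parameter values, the following Python sketch builds the default C and gamma grids. The helper name `log2_range` is an assumption made for this example and is not taken from grid.py.

```python
def log2_range(begin, end, step):
    """Exponents begin, begin+step, ... that do not pass end (step may be negative)."""
    if step == 0:
        raise ValueError("step must be nonzero")
    values = []
    x = begin
    while (step > 0 and x <= end) or (step < 0 and x >= end):
        values.append(x)
        x += step
    return values


c_exponents = log2_range(-5, 15, 2)    # default -log2c
g_exponents = log2_range(3, -15, -2)   # default -log2g
# Every (C, gamma) pair that would be cross-validated.
grid = [(2.0 ** c, 2.0 ** g) for c in c_exponents for g in g_exponents]
print(len(c_exponents), len(g_exponents), len(grid))
```

With the defaults this gives 11 values of C and 10 values of gamma, so 110 cross-validation runs in total.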

You can specify where the libsvm executable and gnuplot are using the -svmtrain and -gnuplot parameters.

For Windows users, please use pgnuplot.exe. If you are using gnuplot 3.7.1, please upgrade to version 3.7.3 or higher. The version 3.7.1 has a bug. If you use Cygwin on Windows, please use gnuplot-x11.

If the task is terminated accidentally or you would like to change the range of parameters, you can apply '-resume' to save time by re-using previous results. You may specify the output file of a previous run or use the default (i.e., dataset.out) without giving a name. Please note that the same conditions must be used in the two runs. For example, you cannot use '-v 10' earlier and resume the task with '-v 5'.

The value of some options can be "null." For example, `-log2c -1,0,1 -log2g "null"' means that C=2^-1,2^0,2^1 and g=LIBSVM's default gamma value. That is, you do not conduct parameter selection on gamma.

Example

=======

> python grid.py -log2c -5,5,1 -log2g -4,0,1 -v 5 -m 300 heart_scale

Users (in particular MS Windows users) may need to specify the path of executable files. You can either change paths at the beginning of grid.py (recommended) or specify them on the command line. For example,

> grid.py -log2c -5,5,1 -svmtrain "c:\Program Files\libsvm\windows\svm-train.exe" -gnuplot c:\tmp\gnuplot\binary\pgnuplot.exe -v 10 heart_scale

Output: two files
dataset.png: the CV accuracy contour plot generated by gnuplot
dataset.out: the CV accuracy at each (log2(C),log2(gamma))

The following example saves running time by loading the output file of a previous run.

> python grid.py -log2c -7,7,1 -log2g -5,2,1 -v 5 -resume heart_scale.out heart_scale

 

Parallel grid search

====================

You can conduct a parallel grid search by dispatching jobs to a cluster of computers which share the same file system. First, you add machine names in grid.py:

ssh_workers = ["linux1", "linux5", "linux5"]

and then set up your ssh so that the authentication works without asking for a password.

The same machine (e.g., linux5 here) can be listed more than once if it has multiple CPUs or has more RAM. If the local machine is the best, you can also enlarge the nr_local_worker. For example:

nr_local_worker = 2

Example:

> python grid.py heart_scale 
[local] -1 -1 78.8889  (best c=0.5, g=0.5, rate=78.8889) 
[linux5] -1 -7 83.3333  (best c=0.5, g=0.0078125, rate=83.3333) 
[linux5] 5 -1 77.037  (best c=0.5, g=0.0078125, rate=83.3333) 
[linux1] 5 -7 83.3333  (best c=0.5, g=0.0078125, rate=83.3333) 
.

If -log2c, -log2g, or -v is not specified, default values are used.

If your system uses telnet instead of ssh, you list the computer names in telnet_workers.
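The worker idea can be sketched locally with a thread pool: each (log2c, log2g) job is handed to whichever worker is free, and the best rate is picked at the end. This is only an analogy under stated assumptions: it uses Python threads instead of ssh, `fake_cv_rate` is a made-up stand-in for a real cross-validation run, and none of it is the code in grid.py.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_cv_rate(job):
    """Stand-in for one cross-validation run; returns (log2c, log2g, rate)."""
    log2c, log2g = job
    # Made-up score that peaks at log2c = -1, log2g = -7.
    rate = 80.0 - abs(log2c + 1) - abs(log2g + 7)
    return log2c, log2g, rate


jobs = [(c, g) for c in (-1, 5) for g in (-1, -7)]
nr_workers = 2  # plays the role of nr_local_worker

with ThreadPoolExecutor(max_workers=nr_workers) as pool:
    results = list(pool.map(fake_cv_rate, jobs))

best = max(results, key=lambda r: r[2])
print("best log2c=%d log2g=%d rate=%.1f" % best)
```

Because every job is independent, adding workers shortens wall-clock time without changing which parameters win.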

Calling grid in Python

======================

In addition to using grid.py as a command-line tool, you can use it as a Python module.

>>> rate, param = find_parameters(dataset, options)

You need to specify `dataset' and `options' (default ''). See the following example.

> python

>>> from grid import * 
>>> rate, param = find_parameters('../heart_scale', '-log2c -1,1,1 -log2g -1,1,1') 
[local] 0.0 0.0 rate=74.8148 (best c=1.0, g=1.0, rate=74.8148) 
[local] 0.0 -1.0 rate=77.037 (best c=1.0, g=0.5, rate=77.037) 
. 
. 
[local] -1.0 -1.0 rate=78.8889 (best c=0.5, g=0.5, rate=78.8889) 
. 
. 
>>> rate 
78.8889 
>>> param 
{'c': 0.5, 'g': 0.5}
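The returned `param` dictionary maps option letters to values, so it can be turned back into an option string for a later training call. The helper `to_svm_options` below is hypothetical and written for this example only; it is not part of grid.py or svmutil.py.

```python
def to_svm_options(param, extra=""):
    """Turn {'c': 0.5, 'g': 0.5} into '-c 0.5 -g 0.5' (hypothetical helper)."""
    parts = ["-%s %s" % (key, param[key]) for key in sorted(param)]
    if extra:
        parts.append(extra)
    return " ".join(parts)


param = {'c': 0.5, 'g': 0.5}
print(to_svm_options(param, "-b 1"))
```

The resulting string has the same form as the 'training_options' argument accepted by the LIBSVM interfaces.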

 

Part III: LIBSVM format checking tools

Introduction

============

`svm-train' conducts only a simple check of the input data. To do a detailed check, we provide a python script `checkdata.py.'

Usage: checkdata.py dataset

Exit status (returned value): 1 if there are errors, 0 otherwise.

This tool is written by Rong-En Fan at National Taiwan University.

Example

=======

> cat bad_data 
1 3:1 2:4 
> python checkdata.py bad_data 
line 1: feature indices must be in an ascending order, previous/current features 3:1 2:4 
Found 1 lines with error.
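The ascending-index rule reported above can be illustrated with a few lines of Python. This is a simplified sketch, assuming only the `label index:value ...` layout; the function name `check_line` and its messages are invented here, and checkdata.py performs more checks than this.

```python
def check_line(line):
    """Return a list of problems found in one LIBSVM-format line.

    Simplified sketch: it checks only the 'index:value' shape and the
    ascending-index rule, which is less than checkdata.py covers.
    """
    errors = []
    tokens = line.split()
    if not tokens:
        return ["empty line"]
    try:
        float(tokens[0])
    except ValueError:
        errors.append("label %r is not a number" % tokens[0])

    prev_index = None
    for tok in tokens[1:]:
        index_str, sep, value_str = tok.partition(":")
        try:
            index = int(index_str)
            float(value_str)
        except ValueError:
            errors.append("feature %r is not index:value" % tok)
            continue
        if prev_index is not None and index <= prev_index:
            errors.append("indices not ascending at %r" % tok)
        prev_index = index
    return errors


print(check_line("1 3:1 2:4"))   # the bad_data line from the example
print(check_line("1 2:4 3:1"))   # same features in ascending order
```

The first call reports one problem at `2:4`; the second returns an empty list.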

 

2. MATLAB/OCTAVE interface of LIBSVM

Table of Contents

=================

- Introduction
- Installation
- Usage
- Returned Model Structure
- Other Utilities
- Examples
- Additional Information

 

Introduction

============

This tool provides a simple interface to LIBSVM, a library for support vector machines (http://www.csie.ntu.edu.tw/~cjlin/libsvm). It is very easy to use as the usage and the way of specifying parameters are the same as that of LIBSVM.

Installation

============

On Windows systems, pre-built binary files are already in the directory '..\windows', so there is no need to conduct installation. Now we provide binary files only for 64bit MATLAB on Windows. If you would like to re-build the package, please follow the steps below.

We recommend using make.m on both MATLAB and OCTAVE. Just type 'make' to build 'libsvmread.mex', 'libsvmwrite.mex', 'svmtrain.mex', and 'svmpredict.mex'.

On MATLAB or Octave:

        >> make

If make.m does not work on MATLAB (especially for Windows), try 'mex -setup' to choose a suitable compiler for mex. Make sure your compiler is accessible and workable. Then type 'make' to start the installation.

Example:

    matlab>> mex -setup
    (ps: MATLAB will show the following messages to setup default compiler.)
    Please choose your compiler for building external interface (MEX) files:
    Would you like mex to locate installed compilers [y]/n? y
    Select a compiler:
    [1] Microsoft Visual C/C++ version 7.1 in C:\Program Files\Microsoft Visual Studio
    [0] None
    Compiler: 1
    Please verify your choices:
    Compiler: Microsoft Visual C/C++ 7.1
    Location: C:\Program Files\Microsoft Visual Studio
    Are these correct?([y]/n): y

    matlab>> make

On Unix systems, if neither make.m nor 'mex -setup' works, please use Makefile and type 'make' in a command window. Note that we assume your MATLAB is installed in '/usr/local/matlab'. If not, please change MATLABDIR in Makefile.

Example:
        linux> make

To use octave, type 'make octave':

Example:
    linux> make octave

For a list of supported/compatible compilers for MATLAB, please check the following page:

http://www.mathworks.com/support/compilers/current_release/

Usage

=====

matlab> model = svmtrain(training_label_vector, training_instance_matrix [, 'libsvm_options']);

        -training_label_vector:
            An m by 1 vector of training labels (type must be double).
        -training_instance_matrix:
            An m by n matrix of m training instances with n features.
            It can be dense or sparse (type must be double).
        -libsvm_options:
            A string of training options in the same format as that of LIBSVM.

matlab> [predicted_label, accuracy, decision_values/prob_estimates] = svmpredict(testing_label_vector, testing_instance_matrix, model [, 'libsvm_options']);
matlab> [predicted_label] = svmpredict(testing_label_vector, testing_instance_matrix, model [, 'libsvm_options']);

        -testing_label_vector:
            An m by 1 vector of prediction labels. If labels of test
            data are unknown, simply use any random values. (type must be double)
        -testing_instance_matrix:
            An m by n matrix of m testing instances with n features.
            It can be dense or sparse. (type must be double)
        -model:
            The output of svmtrain.
        -libsvm_options:
            A string of testing options in the same format as that of LIBSVM.

Returned Model Structure

========================

The 'svmtrain' function returns a model which can be used for future prediction. It is a structure and is organized as [Parameters, nr_class, totalSV, rho, Label, sv_indices, ProbA, ProbB, nSV, sv_coef, SVs]:

        -Parameters: parameters
        -nr_class: number of classes; = 2 for regression/one-class svm
        -totalSV: total #SV
        -rho: -b of the decision function(s) wx+b
        -Label: label of each class; empty for regression/one-class SVM
        -sv_indices: values in [1,...,num_training_data] to indicate SVs in the training set
        -ProbA: pairwise probability information; empty if -b 0 or in one-class SVM
        -ProbB: pairwise probability information; empty if -b 0 or in one-class SVM
        -nSV: number of SVs for each class; empty for regression/one-class SVM
        -sv_coef: coefficients for SVs in decision functions
        -SVs: support vectors

If you do not use the option '-b 1', ProbA and ProbB are empty matrices. If the '-v' option is specified, cross validation is conducted and the returned model is just a scalar: cross-validation accuracy for classification and mean-squared error for regression.

More details about this model can be found in LIBSVM FAQ (http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html) and LIBSVM
implementation document (http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf).
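To show how rho, sv_coef, and SVs fit together, here is a minimal Python sketch of a two-class decision value, f(x) = sum_i sv_coef_i * K(SV_i, x) - rho, which matches the note that rho is -b. It assumes dense vectors and an RBF kernel, and the numbers are toy values rather than a trained model; the function names are chosen for this example.

```python
import math


def rbf_kernel(u, v, gamma):
    """RBF kernel exp(-gamma * ||u - v||^2) for two dense vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def decision_value(x, svs, sv_coef, rho, gamma):
    """Binary decision value: sum_i sv_coef[i] * K(SV_i, x) - rho."""
    return sum(c * rbf_kernel(sv, x, gamma) for sv, c in zip(svs, sv_coef)) - rho


# Toy numbers, not taken from a trained model.
svs = [[1.0, 0.0], [-1.0, 0.0]]
sv_coef = [1.0, -1.0]
rho = 0.0
print(decision_value([1.0, 0.0], svs, sv_coef, rho, gamma=0.5) > 0)
print(decision_value([-1.0, 0.0], svs, sv_coef, rho, gamma=0.5) < 0)
```

A positive value predicts the first class in Label and a negative value the second.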

Result of Prediction

====================

The function 'svmpredict' has three outputs. The first one, predicted_label, is a vector of predicted labels. The second output, accuracy, is a vector including accuracy (for classification), mean squared error, and squared correlation coefficient (for regression).
The third is a matrix containing decision values or probability estimates (if '-b 1' is specified). If k is the number of classes in training data, for decision values, each row includes results of predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a special case. Decision value +1 is returned for each testing instance, instead of an empty vector. For probabilities, each row contains k values indicating the probability that the testing instance is in each class.
Note that the order of classes here is the same as 'Label' field in the model structure.
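The k(k-1)/2 count comes from training one binary SVM per pair of classes. The Python sketch below enumerates those pairs. It assumes the decision-value columns follow the order Label(1) vs. Label(2), Label(1) vs. Label(3), ..., Label(2) vs. Label(3), ...; that ordering is not stated in this text and should be confirmed against the LIBSVM FAQ.

```python
from itertools import combinations


def pairwise_columns(labels):
    """List the (label_i, label_j) pair behind each decision-value column."""
    return list(combinations(labels, 2))


labels = [3, 1, 2]          # order as it would appear in model.Label
pairs = pairwise_columns(labels)
print(len(pairs), pairs)
```

For three classes this yields three pairs, and for five classes ten, matching k(k-1)/2.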

Other Utilities

===============

A matlab function libsvmread reads files in LIBSVM format:

[label_vector, instance_matrix] = libsvmread('data.txt');

Two outputs are labels and instances, which can then be used as inputs of svmtrain or svmpredict.

A matlab function libsvmwrite writes Matlab matrix to a file in LIBSVM format:

libsvmwrite('data.txt', label_vector, instance_matrix)

The instance_matrix must be a sparse matrix. (type must be double)

For 32bit and 64bit MATLAB on Windows, pre-built binary files are ready in the directory `..\windows', but in future releases, we will only include 64bit MATLAB binary files.

These codes are prepared by Rong-En Fan and Kai-Wei Chang from National Taiwan University.

Examples

========

Train and test on the provided data heart_scale:

matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab> model = svmtrain(heart_scale_label, heart_scale_inst, '-c 1 -g 0.07');
matlab> [predict_label, accuracy, dec_values] = svmpredict(heart_scale_label, heart_scale_inst, model); % test the training data

For probability estimates, you need '-b 1' for training and testing:

matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab> model = svmtrain(heart_scale_label, heart_scale_inst, '-c 1 -g 0.07 -b 1');
matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab> [predict_label, accuracy, prob_estimates] = svmpredict(heart_scale_label, heart_scale_inst, model, '-b 1');

To use precomputed kernel, you must include sample serial number as the first column of the training and testing data (assume your kernel matrix is K, # of instances is n):

matlab> K1 = [(1:n)', K]; % include sample serial number as first column
matlab> model = svmtrain(label_vector, K1, '-t 4');
matlab> [predict_label, accuracy, dec_values] = svmpredict(label_vector, K1, model); % test the training data

We give the following detailed example by splitting heart_scale into 150 training and 120 testing data.  Constructing a linear kernel matrix and then using the precomputed kernel gives exactly the same testing error as using the LIBSVM built-in linear kernel.

matlab> [heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
matlab>
matlab> % Split Data
matlab> train_data = heart_scale_inst(1:150,:);
matlab> train_label = heart_scale_label(1:150,:);
matlab> test_data = heart_scale_inst(151:270,:);
matlab> test_label = heart_scale_label(151:270,:);
matlab>
matlab> % Linear Kernel
matlab> model_linear = svmtrain(train_label, train_data, '-t 0');
matlab> [predict_label_L, accuracy_L, dec_values_L] = svmpredict(test_label, test_data, model_linear);
matlab>
matlab> % Precomputed Kernel
matlab> model_precomputed = svmtrain(train_label, [(1:150)', train_data*train_data'], '-t 4');
matlab> [predict_label_P, accuracy_P, dec_values_P] = svmpredict(test_label, [(1:120)', test_data*train_data'], model_precomputed);
matlab>
matlab> accuracy_L % Display the accuracy using linear kernel
matlab> accuracy_P % Display the accuracy using precomputed kernel

Note that for testing, you can put anything in the testing_label_vector.  For more details of precomputed kernels, please read the section ``Precomputed Kernels'' in the README of the LIBSVM package.
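The layout of a precomputed kernel input (serial number first, then one kernel value per training instance) can be sketched in plain Python. The helper `linear_kernel_rows` is an illustrative assumption that mirrors `[(1:150)', train_data*train_data']` and `[(1:120)', test_data*train_data']` above on toy data; it does not call LIBSVM.

```python
def linear_kernel_rows(rows, train):
    """Build precomputed-kernel input: [serial, K(x, t_1), ..., K(x, t_n)].

    rows  : instances to encode (training or testing), dense lists
    train : the training instances that define the kernel columns
    """
    out = []
    for serial, x in enumerate(rows, start=1):
        # Linear kernel: inner product of x with each training instance.
        k_values = [sum(a * b for a, b in zip(x, t)) for t in train]
        out.append([serial] + k_values)
    return out


train = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
test = [[2.0, 1.0]]
print(linear_kernel_rows(train, train))
print(linear_kernel_rows(test, train))
```

Note that testing rows are also computed against the training instances, so every row has as many kernel columns as there are training instances.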

Additional Information

======================

This interface was initially written by Jun-Cheng Chen, Kuan-Jen Peng, Chih-Yuan Yang and Chih-Huai Cheng from Department of Computer Science, National Taiwan University. The current version was prepared by Rong-En Fan and Ting-Fan Wu. If you find this tool useful, please cite LIBSVM as follows

Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

For any question, please contact Chih-Jen Lin cjlin@csie.ntu.edu.tw, or check the FAQ page: http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#/Q10:_MATLAB_interface

 

3. Python interface of LIBSVM

Table of Contents

=================

- Introduction
- Installation
- Quick Start
- Design Description
- Data Structures
- Utility Functions
- Additional Information

Introduction

============

Python (http://www.python.org/) is a programming language suitable for rapid development. This tool provides a simple Python interface to LIBSVM, a library for support vector machines (http://www.csie.ntu.edu.tw/~cjlin/libsvm). The interface is very easy to use as the usage is the same as that of LIBSVM. The interface is developed with the built-in Python library "ctypes."

Installation

============

On Unix systems, type

> make

The interface needs only LIBSVM shared library, which is generated by the above command. We assume that the shared library is on the LIBSVM main directory or in the system path.

For Windows, the shared library libsvm.dll for 32-bit Python is ready in the directory `..\windows'. You can also copy it to the system directory (e.g., `C:\WINDOWS\system32\' for Windows XP). To regenerate the shared library, please follow the instructions for building Windows binaries in the LIBSVM README.

Quick Start

===========

There are two levels of usage. The high-level one uses utility functions in svmutil.py and the usage is the same as the LIBSVM MATLAB interface.

>>> from svmutil import *
# Read data in LIBSVM format
>>> y, x = svm_read_problem('../heart_scale')
>>> m = svm_train(y[:200], x[:200], '-c 4')
>>> p_label, p_acc, p_val = svm_predict(y[200:], x[200:], m)

# Construct problem in python format
# Dense data
>>> y, x = [1,-1], [[1,0,1], [-1,0,-1]]
# Sparse data
>>> y, x = [1,-1], [{1:1, 3:1}, {1:-1,3:-1}]
>>> prob  = svm_problem(y, x)
>>> param = svm_parameter('-t 0 -c 4 -b 1')
>>> m = svm_train(prob, param)

# Precomputed kernel data (-t 4)
# Dense data
>>> y, x = [1,-1], [[1, 2, -2], [2, -2, 2]]
# Sparse data
>>> y, x = [1,-1], [{0:1, 1:2, 2:-2}, {0:2, 1:-2, 2:2}]
# isKernel=True must be set for precomputed kernel
>>> prob  = svm_problem(y, x, isKernel=True)
>>> param = svm_parameter('-t 4 -c 4 -b 1')
>>> m = svm_train(prob, param)
# For the format of precomputed kernel, please read LIBSVM README.
# Other utility functions
>>> svm_save_model('heart_scale.model', m)
>>> m = svm_load_model('heart_scale.model')
>>> p_label, p_acc, p_val = svm_predict(y, x, m, '-b 1')
>>> ACC, MSE, SCC = evaluations(y, p_label)

# Getting online help
>>> help(svm_train)

The low-level use directly calls C interfaces imported by svm.py. Note that all arguments and return values are in ctypes format. You need to handle them carefully.

>>> from svm import *
>>> prob = svm_problem([1,-1], [{1:1, 3:1}, {1:-1,3:-1}])
>>> param = svm_parameter('-c 4')
>>> m = libsvm.svm_train(prob, param) # m is a ctype pointer to an svm_model
# Convert a Python-format instance to svm_nodearray, a ctypes structure
>>> x0, max_idx = gen_svm_nodearray({1:1, 3:1})
>>> label = libsvm.svm_predict(m, x0)
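The dense and sparse forms in the Quick Start describe the same data: position i of a dense list corresponds to key i of the dictionary, counting from 1, and zeros are simply left out. A minimal sketch of the conversion follows; the helper names are made up for this example and are not part of svm.py or svmutil.py.

```python
def dense_to_sparse(xi):
    """Convert a dense list to a {index: value} dict with 1-based indices."""
    return {i: v for i, v in enumerate(xi, start=1) if v != 0}


def sparse_to_dense(xi, n_features):
    """Inverse conversion; missing indices become 0."""
    return [xi.get(i, 0) for i in range(1, n_features + 1)]


x_dense = [[1, 0, 1], [-1, 0, -1]]
x_sparse = [dense_to_sparse(xi) for xi in x_dense]
print(x_sparse)
```

The result reproduces the sparse example from the Quick Start, `[{1:1, 3:1}, {1:-1, 3:-1}]`.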

Design Description

==================

There are two files svm.py and svmutil.py, which respectively correspond to low-level and high-level use of the interface.

In svm.py, we adopt the Python built-in library "ctypes," so that Python can directly access C structures and interface functions defined in svm.h.

While advanced users can use structures/functions in svm.py, to avoid handling ctypes structures, in svmutil.py we provide some easy-to-use functions. The usage is similar to LIBSVM MATLAB interface.

Data Structures

===============

Four data structures derived from svm.h are svm_node, svm_problem, svm_parameter, and svm_model. They all contain fields with the same names as in svm.h. Access these fields carefully because you directly use a C structure instead of a Python object. For svm_model, accessing the fields directly is not recommended. Programmers should use the interface functions or methods of the svm_model class in Python to get the values. The following description introduces additional fields and methods.

Before using the data structures, execute the following command to load the LIBSVM shared library:

>>> from svm import *

- class svm_node:

    Construct an svm_node.

    >>> node = svm_node(idx, val)

    idx: an integer indicating the feature index.

    val: a float indicating the feature value.

    Show the index and the value of a node.

    >>> print(node)

- Function: gen_svm_nodearray(xi [,feature_max=None [,isKernel=False]])

    Generate a feature vector from a Python list/tuple or a dictionary:

    >>> xi, max_idx = gen_svm_nodearray({1:1, 3:1, 5:-2})

    xi: the returned svm_nodearray (a ctypes structure)

    max_idx: the maximal feature index of xi

    feature_max: if feature_max is assigned, features with indices larger than
                 feature_max are removed.
   
    isKernel: if isKernel == True, the list index starts from 0 for precomputed
              kernel. Otherwise, the list index starts from 1. The default
              value is False.

- class svm_problem:

    Construct an svm_problem instance

    >>> prob = svm_problem(y, x)

    y: a Python list/tuple of l labels (type must be int/double).

    x: a Python list/tuple of l data instances. Each element of x must be
       an instance of list/tuple/dictionary type.

    Note that if your x contains sparse data (i.e., dictionary), the internal
    ctypes data format is still sparse.

    For pre-computed kernel, the isKernel flag should be set to True:

    >>> prob = svm_problem(y, x, isKernel=True)

    Please read LIBSVM README for more details of pre-computed kernel.

- class svm_parameter:

    Construct an svm_parameter instance

    >>> param = svm_parameter('training_options')

    If 'training_options' is empty, LIBSVM default values are applied.

    Set param to LIBSVM default values.

    >>> param.set_to_default_values()

    Parse a string of options.

    >>> param.parse_options('training_options')

    Show values of parameters.

    >>> print(param)

- class svm_model:

    There are two ways to obtain an instance of svm_model:

    >>> model = svm_train(y, x)
    >>> model = svm_load_model('model_file_name')

    Note that the returned structure of interface functions
    libsvm.svm_train and libsvm.svm_load_model is a ctypes pointer of
    svm_model, which is different from the svm_model object returned
    by svm_train and svm_load_model in svmutil.py. We provide a
    function toPyModel for the conversion:

    >>> model_ptr = libsvm.svm_train(prob, param)
    >>> model = toPyModel(model_ptr)

    If you obtain a model in a way other than the above approaches,
    handle it carefully to avoid memory leak or segmentation fault.

    Some interface functions to access LIBSVM models are wrapped as
    members of the class svm_model:

    >>> svm_type = model.get_svm_type()
    >>> nr_class = model.get_nr_class()
    >>> svr_probability = model.get_svr_probability()
    >>> class_labels = model.get_labels()
    >>> sv_indices = model.get_sv_indices()
    >>> nr_sv = model.get_nr_sv()
    >>> is_prob_model = model.is_probability_model()
    >>> support_vector_coefficients = model.get_sv_coef()
    >>> support_vectors = model.get_SV()

Utility Functions

=================

To use utility functions, type

>>> from svmutil import *

The above command loads
    svm_train()        : train an SVM model
    svm_predict()      : predict testing data
    svm_read_problem() : read the data from a LIBSVM-format file.
    svm_load_model()   : load a LIBSVM model.
    svm_save_model()   : save model to a file.
    evaluations()      : evaluate prediction results.
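Of these, svm_read_problem() is the easiest to illustrate without the library: it returns a list of labels and a list of instances. The sketch below parses LIBSVM-format lines into that shape. It is a hypothetical stand-in written for this example, assuming one numeric label per line and sparse dictionaries for instances; it is not the implementation in svmutil.py.

```python
def read_problem_lines(lines):
    """Parse LIBSVM-format lines into (labels, instances).

    Each line is '<label> <index>:<value> ...'. Hypothetical helper that
    mirrors the shape of svm_read_problem's result, not its implementation.
    """
    y, x = [], []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        y.append(float(parts[0]))
        xi = {}
        for tok in parts[1:]:
            idx, val = tok.split(":")
            xi[int(idx)] = float(val)
        x.append(xi)
    return y, x


y, x = read_problem_lines(["+1 1:0.5 3:-1", "-1 2:1"])
print(y, x)
```

The returned `y` and `x` have the same form as the arguments expected by svm_train(y, x).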

- Function: svm_train

    There are three ways to call svm_train()

    >>> model = svm_train(y, x [, 'training_options'])
    >>> model = svm_train(prob [, 'training_options'])
    >>> model = svm_train(prob, param)

    y: a list/tuple of l training labels (type must be int/double).

    x: a list/tuple of l training instances. The feature vector of
       each training instance is an instance of list/tuple or dictionary.

    training_options: a string in the same form as that for LIBSVM command
                      mode.

    prob: an svm_problem instance generated by calling
          svm_problem(y, x).
          For pre-computed kernel, you should use
          svm_problem(y, x, isKernel=True)

    param: an svm_parameter instance generated by calling
           svm_parameter('training_options')

    model: the returned svm_model instance. See svm.h for details of this
           structure. If '-v' is specified, cross validation is
           conducted and the returned model is just a scalar: cross-validation
           accuracy for classification and mean-squared error for regression.

    To train the same data many times with different
    parameters, the second and the third ways should be faster.

    Examples:

    >>> y, x = svm_read_problem('../heart_scale')
    >>> prob = svm_problem(y, x)
    >>> param = svm_parameter('-s 3 -c 5 -h 0')
    >>> m = svm_train(y, x, '-c 5')
    >>> m = svm_train(prob, '-t 2 -c 5')
    >>> m = svm_train(prob, param)
    >>> CV_ACC = svm_train(y, x, '-v 3')

- Function: svm_predict

    To predict testing data with a model, use

    >>> p_labels, p_acc, p_vals = svm_predict(y, x, model [,'predicting_options'])

    y: a list/tuple of l true labels (type must be int/double). It is used
       for calculating the accuracy. Use [0]*len(x) if true labels are
       unavailable.

    x: a list/tuple of l predicting instances. The feature vector of
       each predicting instance is an instance of list/tuple or dictionary.

    predicting_options: a string of predicting options in the same format as
                        that of LIBSVM.

    model: an svm_model instance.

    p_labels: a list of predicted labels

    p_acc: a tuple including accuracy (for classification), mean
           squared error, and squared correlation coefficient (for
           regression).

    p_vals: a list of decision values or probability estimates (if '-b 1'
            is specified). If k is the number of classes in training data,
            for decision values, each element includes results of predicting
            k(k-1)/2 binary-class SVMs. For classification, k = 1 is a
            special case. Decision value [+1] is returned for each testing
            instance, instead of an empty list.
            For probabilities, each element contains k values indicating
            the probability that the testing instance is in each class.
            Note that the order of classes is the same as the 'model.label'
            field in the model structure.

    Example:

    >>> m = svm_train(y, x, '-c 5')
    >>> p_labels, p_acc, p_vals = svm_predict(y, x, m)

- Functions:  svm_read_problem/svm_load_model/svm_save_model

    See the usage by examples:

    >>> y, x = svm_read_problem('data.txt')
    >>> m = svm_load_model('model_file')
    >>> svm_save_model('model_file', m)

- Function: evaluations

    Calculate some evaluations using the true values (ty) and predicted
    values (pv):

    >>> (ACC, MSE, SCC) = evaluations(ty, pv)

    ty: a list of true values.

    pv: a list of predicted values.

    ACC: accuracy.

    MSE: mean squared error.

    SCC: squared correlation coefficient.
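The three measures can be written out in a few lines of Python. This is a sketch under stated assumptions: accuracy is reported as a percentage, SCC is the squared Pearson correlation between the two lists, and the function name `evaluate_sketch` is chosen here so that it is not mistaken for the library's evaluations().

```python
def evaluate_sketch(ty, pv):
    """Return (ACC, MSE, SCC) for true values ty and predicted values pv."""
    if len(ty) != len(pv) or not ty:
        raise ValueError("ty and pv must be non-empty and of equal length")
    n = len(ty)
    correct = sum(1 for t, p in zip(ty, pv) if t == p)
    sum_sq_err = sum((p - t) ** 2 for t, p in zip(ty, pv))
    sum_t, sum_p = sum(ty), sum(pv)
    sum_tt = sum(t * t for t in ty)
    sum_pp = sum(p * p for p in pv)
    sum_tp = sum(t * p for t, p in zip(ty, pv))

    acc = 100.0 * correct / n
    mse = sum_sq_err / n
    # Squared Pearson correlation; undefined if either list is constant.
    denom = (n * sum_pp - sum_p ** 2) * (n * sum_tt - sum_t ** 2)
    scc = (n * sum_tp - sum_p * sum_t) ** 2 / denom if denom else float("nan")
    return acc, mse, scc


print(evaluate_sketch([1, -1, 1, -1], [1, -1, -1, -1]))
```

For this toy input, three of four predictions match, so ACC is 75.0, MSE is 1.0, and SCC is 1/3.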

 

Additional Information

======================

This interface was written by Hsiang-Fu Yu from Department of Computer Science, National Taiwan University. If you find this tool useful, please cite LIBSVM as follows

Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

For any question, please contact Chih-Jen Lin cjlin@csie.ntu.edu.tw, or check the FAQ page: http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html
