Simple Linear Regression and Logistic Regression with TensorFlow

2023-11-16

Abstract
Single-variable linear regression in TensorFlow
- The multivariate case
Logistic regression classification with TensorFlow
References

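Abstract

Linear regression and logistic regression are among the most common algorithms for regression and classification problems, respectively. Many environments, such as Python and R, provide ready-made functions for them. Here we show how to perform simple linear regression and logistic regression with TensorFlow.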

Single-variable linear regression in TensorFlow

Suppose our data are $(x_i, \, y_i), \, i = 1, 2, \cdots, m$, that is, $m$ data points in total. We need to predict $y_i$ from $x_i$.

In the single-variable linear regression model, we fit $y = w x + b$. Here $y$ is the value to predict and $x$ is the feature value. We want to use TensorFlow to solve for $w$ and $b$ from the given data $(x_i, \, y_i)$. By the least-squares criterion, we need to find the $w$ and $b$ that minimize
$$\text{RSS} = \sum_{i = 1}^m \left( y_i - (b + w x_i) \right)^2.$$
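
For a single feature, the least-squares minimizer also has a closed form, which is handy as a sanity check on any iterative fit. The following is a minimal NumPy sketch, not part of the original code; the function name `fit_line_least_squares` and the noise-free demo data are assumptions made for illustration.

```python
import numpy as np

def fit_line_least_squares(x, y):
    """Return (w, b) minimizing sum((y - (b + w * x)) ** 2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    w = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    b = y_mean - w * x_mean
    return w, b

# Hypothetical noise-free data on the line y = 1.5 x + 0.5.
x_demo = np.linspace(0, 2, 100)
y_demo = 1.5 * x_demo + 0.5
w_demo, b_demo = fit_line_least_squares(x_demo, y_demo)
```

On noise-free data the sketch recovers the slope and intercept up to floating-point error; on noisy data it gives the values that a converged gradient-descent fit should approach.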

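In TensorFlow, we define a cost function and minimize it to obtain the $w, b$ with the smallest RSS (residual sum of squares). The code below uses the mean of the squared residuals, $\text{RSS}/m$, as the cost; it has the same minimizer as RSS. The code is as follows.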

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior() 
import numpy as np
import matplotlib.pyplot as plt

class demo_tensorflow:
    
    def __init__(self, X, Y):
        self.X = X
        self.Y = Y
        
    def tf_linear_regression(self, learning_rate, training_epoches):
        """
        Use tensorflow for linear regression. 
        """
        # define X and Y. We will fit Y using X.  
        X = tf.placeholder(tf.float32)
        Y = tf.placeholder(tf.float32)
        
        # w and b are the parameters that we need to fit 
        w = tf.Variable(np.random.normal(), name="weights")
        b = tf.Variable(np.random.normal(), name='intercept')
        
        # define the model
        model = tf.add(tf.multiply(X, w), b)
        
        # define the cost function
        cost = tf.reduce_mean(tf.square(model - Y))
        
        train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
        
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for epoch in range(training_epoches):
                sess.run(train_op, feed_dict={X: self.X, Y: self.Y})
                cur_cost = sess.run(cost, feed_dict={X: self.X, Y: self.Y})
                if (epoch + 1) % 500 == 1:
                    print("Epoch", (epoch + 1), ": cost=", cur_cost)
            weight = sess.run(w)
            intercept = sess.run(b)
        return weight, intercept

x_batch = np.linspace(0, 2, 100) # training x data
y_batch = 1.5 * x_batch + np.random.randn(*x_batch.shape) * 0.2 + 0.5 # training y data
learning_rate = 0.01
training_epoches = 5000

a = demo_tensorflow(x_batch, y_batch)
a.tf_linear_regression(learning_rate, training_epoches)

Epoch 1 : cost= 2.3156583
Epoch 501 : cost= 0.045759827
Epoch 1001 : cost= 0.045451876
Epoch 1501 : cost= 0.04543826
Epoch 2001 : cost= 0.04543766
Epoch 2501 : cost= 0.04543763
Epoch 3001 : cost= 0.045437638
Epoch 3501 : cost= 0.045437627
Epoch 4001 : cost= 0.045437627
Epoch 4501 : cost= 0.045437627

The line fitted by the trained model is shown in the figure below.

[Figure: tensorflow_linear_regression]
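
To make concrete what one gradient-descent step on the mean-squared cost does, here is a minimal NumPy sketch. It is not the TensorFlow optimizer itself; the function name `gradient_descent_line`, the zero initial values, and the noise-free demo data are assumptions for illustration.

```python
import numpy as np

def gradient_descent_line(x, y, learning_rate, n_steps, w=0.0, b=0.0):
    """Minimize mean((w * x + b - y) ** 2) by plain gradient descent."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    for _ in range(n_steps):
        err = w * x + b - y              # residuals of the current fit
        grad_w = 2.0 * np.mean(err * x)  # d(cost)/dw
        grad_b = 2.0 * np.mean(err)      # d(cost)/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Same x range, learning rate, and step count as the text, without noise.
x_gd = np.linspace(0, 2, 100)
y_gd = 1.5 * x_gd + 0.5
w_gd, b_gd = gradient_descent_line(x_gd, y_gd, learning_rate=0.01, n_steps=5000)
```

Each iteration uses the whole data set, which mirrors how the TensorFlow loop above feeds all of `self.X` and `self.Y` on every epoch.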

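The multivariate case

In the multivariate case, we only need to pay attention to the matrix dimensions when defining tf.placeholder and tf.Variable; the rest is the same. Note that we should use tf.matmul for matrix multiplication. The code is as follows [1].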

    # This method belongs to the demo_tensorflow class defined above.
    def tf_multi_linear_regression(self, learning_rate, training_epoches):
        """
        multivariate case for linear regression.
        """
        # m is the number of training data, p is the number of features
        m, p = self.X.shape
        #print(m, p)
        # define X and Y. We will fit Y using X.  
        X = tf.placeholder(tf.float32, shape=(None, p))
        Y = tf.placeholder(tf.float32, shape=(None, 1))
        
        # w and b are the parameters that we need to fit 
        w = tf.Variable(tf.random_normal([p, 1], stddev=0.01), dtype=np.float32, name="weights")
        b = tf.Variable(np.random.normal(), dtype=np.float32, name='intercept')
        # define the model
        model = tf.add(tf.matmul(X, w), b) # Note that we use tf.matmul for matrix multiplication.
        
        # define the cost function
        cost = tf.reduce_mean(tf.square(model - Y))
        
        train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
        
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for epoch in range(training_epoches):
                sess.run(train_op, feed_dict={X: np.asarray(self.X), Y: np.asarray(self.Y).reshape(m, 1)})
                cur_cost = sess.run(cost, feed_dict={X: np.asarray(self.X), Y: np.asarray(self.Y).reshape(m, 1)})
                if (epoch + 1) % 500 == 1:
                    print("Epoch", (epoch + 1), ": cost=", cur_cost)
            weight = sess.run(w)
            intercept = sess.run(b)
        return weight, intercept
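
The `reshape(m, 1)` on the targets matters because the model output has shape `(m, 1)`. In the listing above, the placeholder shape `(None, 1)` guards against feeding a flat array; the NumPy sketch below only illustrates what broadcasting would do if the shapes were left unconstrained. The array names are hypothetical.

```python
import numpy as np

m = 5
pred = np.zeros((m, 1))             # same shape as the model output: (m, 1)
y_flat = np.arange(m, dtype=float)  # targets stored as a flat array: (m,)

# Subtracting a (m,) array from a (m, 1) array broadcasts to (m, m),
# so a mean over it would no longer be the per-sample squared error.
diff_wrong = pred - y_flat

# Reshaping the targets to (m, 1) keeps the difference element-wise.
diff_right = pred - y_flat.reshape(m, 1)

print(diff_wrong.shape, diff_right.shape)
```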

m = 10 ** 2
x1 = np.linspace(0, 2, m)
x2 = np.linspace(-1, 2, m) + np.random.normal(0, 2, m)
x_multi = np.vstack((x1, x2)).T
y = -2 * x1 + 3 * x2 + 2 + np.random.normal(0, 1, m) * 0.2

a = demo_tensorflow(x_multi, y)
a.tf_multi_linear_regression(learning_rate, training_epoches)

Epoch 1 : cost= 36.237904
Epoch 501 : cost= 0.1619576
Epoch 1001 : cost= 0.03932593
Epoch 1501 : cost= 0.033134725
Epoch 2001 : cost= 0.03282215
Epoch 2501 : cost= 0.032806374
Epoch 3001 : cost= 0.032805584
Epoch 3501 : cost= 0.032805543
Epoch 4001 : cost= 0.032805547
Epoch 4501 : cost= 0.032805547
(array([[-2.016268 ],
        [ 3.0005507]], dtype=float32), 2.020361)
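
The fitted weights are close to the coefficients -2, 3 and the intercept 2 used to generate the data. As an independent cross-check, a least-squares solve with `np.linalg.lstsq` on data generated the same way should give similar values. This is a sketch under assumptions: the seed and the use of `np.random.default_rng` are not from the original, so the numbers will not match the run above digit for digit.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumed seed, for reproducibility only
m = 100
x1 = np.linspace(0, 2, m)
x2 = np.linspace(-1, 2, m) + rng.normal(0, 2, m)
y = -2 * x1 + 3 * x2 + 2 + rng.normal(0, 1, m) * 0.2

# Append a column of ones so the intercept is estimated as a third coefficient.
design = np.column_stack((x1, x2, np.ones(m)))
coef, residuals, rank, singular_values = np.linalg.lstsq(design, y, rcond=None)
w1_hat, w2_hat, b_hat = coef
print(w1_hat, w2_hat, b_hat)
```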

Logistic regression classification with TensorFlow

Having seen linear regression in TensorFlow, classification with logistic regression only requires defining a new loss function; most of the other code is the same as in the linear regression case. Consider data $(x_i, \, y_i), \, i = 1, 2, \cdots, m$, with $y_i \in \{0, \, 1\}$. The logistic regression loss function is defined as:
$$\text{cost} = -\frac{1}{m} \sum_{i = 1}^m \left( y_i \log(\hat{y_i}) + (1 - y_i) \log(1 - \hat{y_i}) \right)$$

Here $\hat{y_i}$ is the predicted probability that the data point $x_i$ belongs to class 1. Its expression is:
$$\hat{y_i} = \sigma(w x_i + b).$$

Here $\sigma$ is the sigmoid function, $\sigma(x) = \frac{1}{1 + e^{-x}}$. Since $0 < \sigma(x) < 1, \forall x \in \mathbb{R}$, $\hat{y_i}$ is always a valid probability.
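
The two formulas above can be sketched directly in NumPy. This is an illustration rather than the TensorFlow code; the function names and the clipping constant `eps`, added to avoid `log(0)` when a predicted probability saturates at 0 or 1, are assumptions.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean logistic loss for labels in {0, 1} and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from exactly 0 and 1 so the logs stay finite.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# An uninformative prediction of 0.5 for every point costs log(2) per sample.
cost_uninformative = cross_entropy([0, 1], [0.5, 0.5])
```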

With the cost function in hand, we can use TensorFlow's optimization methods to solve for the parameters $w$ and $b$. The code is as follows:

class demo_tf_logisticRegression:
    
    def __init__(self, X, Y):
        self.X = X
        self.Y = Y
    
    
    def tf_logistic_regression(self, learning_rate, training_epoches):
        """
        Train the logistic regression model using tensorflow. 
        """
        # m is the number of training data points, p is the number of features.
        m, p = self.X.shape
        X = tf.placeholder(tf.float32, shape=(None, p))
        Y = tf.placeholder(tf.float32, shape=(None, 1))
        
        w = tf.Variable(tf.random_normal([p, 1]), dtype=tf.float32, name='weights')
        b = tf.Variable(np.random.normal(), dtype=tf.float32, name='bias')
        
        model = tf.sigmoid(tf.add(tf.matmul(X, w), b))
        cost = -tf.reduce_mean(Y * tf.log(model) + (1 - Y) * tf.log(1 - model))
        
        train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
        
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for epoch in range(training_epoches):
                sess.run(train_op, feed_dict={X: np.asarray(self.X), Y: np.asarray(self.Y).reshape(m, 1)})
                cur_cost = sess.run(cost, feed_dict={X: np.asarray(self.X), Y: np.asarray(self.Y).reshape(m, 1)})
                if (epoch + 1) % 500 == 1:
                    print("Epoch", (epoch + 1), ": cost=", cur_cost)
            weight = sess.run(w)
            intercept = sess.run(b)
        return weight, intercept
    
    def get_boundary(self, w, b):
        """
        Obtain the classification boundary using the weight and bias that we got from the
        training.
        """
        x_boundary, y_boundary = [], []
        for x in np.linspace(0, 3, 100):
            for y in np.linspace(-1, 5, 100):
                prob = self.sigmoid(w[0] * x + w[1] * y + b)
                if np.abs(prob - 0.5) < 0.01:
                    x_boundary.append(x)
                    y_boundary.append(y)
        return x_boundary, y_boundary
    
    def sigmoid(self, x):
        """
        define the sigmoid function
        """
        return 1 / (1 + np.exp(-x))
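
The `get_boundary` method finds boundary points by scanning a grid for probabilities near 0.5. Since the sigmoid equals 0.5 exactly when its argument is zero, the boundary for two features is also the line $w_1 x + w_2 y + b = 0$. The following NumPy sketch computes it directly; `boundary_line` is a hypothetical helper, not part of the original class, and the demo weights are made up.

```python
import numpy as np

def boundary_line(w, b, x_values):
    """Return the y values where w[0] * x + w[1] * y + b = 0."""
    w = np.asarray(w, dtype=float).reshape(-1)  # accepts shape (2,) or (2, 1)
    b = float(b)
    if w[1] == 0:
        raise ValueError("w[1] is zero: the boundary is vertical")
    x_values = np.asarray(x_values, dtype=float)
    return -(w[0] * x_values + b) / w[1]

# Hypothetical weights in the (p, 1) layout returned by the training method.
w_demo = np.array([[1.0], [-2.0]])
b_demo = 0.5
xs = np.linspace(0, 3, 4)
ys = boundary_line(w_demo, b_demo, xs)
```

Compared with the grid scan, this gives one y value per x and does not depend on a tolerance such as 0.01.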

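Suppose we want to classify the red points (labeled 0) and the blue points (labeled 1) in the figure below. The classification boundary obtained from logistic regression is shown as the green line in the figure.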

# Define the training data
np.random.seed(2)
m = 100
x1 = np.random.uniform(0, 3, m)
y1 = x1 - 1 + np.random.normal(0, 1, m) # with label 0
x2 = np.random.uniform(0, 3, m)
y2 = x2 + 1 + np.random.normal(0, 1, m) # with label 1
x_train_0 = np.vstack((x1, y1))
x_train_1 = np.vstack((x2, y2))
x_train = np.hstack((x_train_0, x_train_1)).T
y_train = np.asarray([0] * m + [1] * m)

# train the logistic model with tensorflow
b = demo_tf_logisticRegression(x_train, y_train)
w, bias = b.tf_logistic_regression(learning_rate, training_epoches)
x_b, y_b = b.get_boundary(w, bias)

Epoch 1 : cost= 1.7830517
Epoch 501 : cost= 0.5220195
Epoch 1001 : cost= 0.44926476
Epoch 1501 : cost= 0.4294767
Epoch 2001 : cost= 0.4218148
Epoch 2501 : cost= 0.41830078
Epoch 3001 : cost= 0.41651848
Epoch 3501 : cost= 0.41554946
Epoch 4001 : cost= 0.41499367
Epoch 4501 : cost= 0.41466048

# plot the data points and the classification boundary
# learned from tensorflow
plt.figure(figsize=(8, 6), dpi=100)
plt.plot(x1, y1, 'o', color='red', markersize=10)
plt.plot(x2, y2, 's', color='blue', markersize=10)
plt.plot(x_b, y_b, '-', color='green', linewidth = 5)
plt.xlabel('x', fontsize = 20)
plt.ylabel('y', fontsize = 20)
plt.xticks(fontsize=20)
plt.yticks(fontsize=20)

[Figure: tf_logistic_regression]
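
Besides inspecting the boundary visually, the learned parameters can be turned into class predictions by thresholding the predicted probability at 0.5. The sketch below is an assumption-laden illustration: `predict_labels` and `accuracy` are hypothetical helpers, and the small data set and weights are made up rather than taken from the training run above.

```python
import numpy as np

def predict_labels(X, w, b, threshold=0.5):
    """Predict 0/1 labels from features X of shape (n, p) and weights w of shape (p, 1)."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float).reshape(-1, 1)
    prob = 1.0 / (1.0 + np.exp(-(X @ w + float(b))))
    return (prob.reshape(-1) >= threshold).astype(int)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# Made-up points and weights: the boundary here is the line y = x + 1.
X_demo = np.array([[0.0, -1.0], [0.0, 3.0]])
w_demo = np.array([[-1.0], [1.0]])
labels_demo = predict_labels(X_demo, w_demo, b=-1.0)
acc_demo = accuracy([0, 1], labels_demo)
```

With the variables from the training script, the same helpers could be called as `predict_labels(x_train, w, bias)` and `accuracy(y_train, ...)`.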

References

[1] Nishant Shukla, Machine Learning with TensorFlow, Manning Publications, 2017.
