Logistic Regression
As mentioned earlier, classification problems can be divided into binary problems and multi-class problems. Logistic regression is typically used to solve binary problems.
In binary classification, the target variable yi belongs to one of two classes: 0 or 1. These classes are often referred to as “negative” and “positive,” and they represent two mutually exclusive outcomes. In the context of churn prediction, “no churn” and “churn” are examples of binary classes. Similarly, in email classification, “no spam” and “spam” are also binary classes.
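As a quick illustration, here is a minimal sketch of how such a binary target is typically encoded as 0 and 1. The DataFrame and its churn column below are hypothetical, made up only for this example:

import pandas as pd

# hypothetical data: a "churn" column with "yes"/"no" values
df = pd.DataFrame({'churn': ['no', 'yes', 'no', 'no', 'yes']})

# encode the positive class ("churn") as 1 and the negative class ("no churn") as 0
y = (df.churn == 'yes').astype(int).values
y
# Output:
# array([0, 1, 0, 0, 1])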
Logistic regression models this directly: its output g(xi) is a number between 0 and 1 that we can treat as the probability of xi belonging to the positive class.
Formula:
Linear regression: g(xi) = w0 + wᵀxi → outputs a real number in (−∞, ∞)
- w0 – bias term
- w – vector of weights
- xi – vector of features
Logistic regression: g(xi) = sigmoid(w0 + wᵀxi) → outputs a number between 0 and 1
sigmoid(z) = 1 / (1 + exp(−z))
This function maps any real number z to the range of 0 to 1, making it suitable for modeling probabilities in logistic regression. We’ll use this function to convert a score into a probability.
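For example, with hypothetical weights w0 = −0.5, w = (0.3, −0.2) and features xi = (1, 2), the score is w0 + wᵀxi = −0.5 + 0.3·1 − 0.2·2 = −0.6, and sigmoid(−0.6) = 1 / (1 + exp(0.6)) ≈ 0.354, so the model would predict roughly a 35% probability of the positive class.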
Let’s see how to implement the sigmoid function and use it. We can create an array with 51 values between -7 and 7 using np.linspace(-7, 7, 51). This is our z in the next snippet.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.linspace(-7, 7, 51)
z
# Output:
# array([-7.0000000e+00, -6.7200000e+00, -6.4400000e+00, -6.1600000e+00,
# -5.8800000e+00, -5.6000000e+00, -5.3200000e+00, -5.0400000e+00,
# -4.7600000e+00, -4.4800000e+00, -4.2000000e+00, -3.9200000e+00,
# -3.6400000e+00, -3.3600000e+00, -3.0800000e+00, -2.8000000e+00,
# -2.5200000e+00, -2.2400000e+00, -1.9600000e+00, -1.6800000e+00,
# -1.4000000e+00, -1.1200000e+00, -8.4000000e-01, -5.6000000e-01,
# -2.8000000e-01, 8.8817842e-16, 2.8000000e-01, 5.6000000e-01,
# 8.4000000e-01, 1.1200000e+00, 1.4000000e+00, 1.6800000e+00,
# 1.9600000e+00, 2.2400000e+00, 2.5200000e+00, 2.8000000e+00,
# 3.0800000e+00, 3.3600000e+00, 3.6400000e+00, 3.9200000e+00,
# 4.2000000e+00, 4.4800000e+00, 4.7600000e+00, 5.0400000e+00,
# 5.3200000e+00, 5.6000000e+00, 5.8800000e+00, 6.1600000e+00,
# 6.4400000e+00, 6.7200000e+00, 7.0000000e+00])
We can apply the sigmoid function to our array z:
sigmoid(z)
# Output:
# array([9.11051194e-04, 1.20508423e-03, 1.59386223e-03, 2.10780106e-03,
# 2.78699622e-03, 3.68423990e-03, 4.86893124e-03, 6.43210847e-03,
# 8.49286285e-03, 1.12064063e-02, 1.47740317e-02, 1.94550846e-02,
# 2.55807883e-02, 3.35692233e-02, 4.39398154e-02, 5.73241759e-02,
# 7.44679452e-02, 9.62155417e-02, 1.23467048e-01, 1.57095469e-01,
# 1.97816111e-01, 2.46011284e-01, 3.01534784e-01, 3.63547460e-01,
# 4.30453776e-01, 5.00000000e-01, 5.69546224e-01, 6.36452540e-01,
# 6.98465216e-01, 7.53988716e-01, 8.02183889e-01, 8.42904531e-01,
# 8.76532952e-01, 9.03784458e-01, 9.25532055e-01, 9.42675824e-01,
# 9.56060185e-01, 9.66430777e-01, 9.74419212e-01, 9.80544915e-01,
# 9.85225968e-01, 9.88793594e-01, 9.91507137e-01, 9.93567892e-01,
# 9.95131069e-01, 9.96315760e-01, 9.97213004e-01, 9.97892199e-01,
# 9.98406138e-01, 9.98794916e-01, 9.99088949e-01])
Notice that z ≈ 0 in the middle of the array maps to 0.5, while the values approach 0 and 1 at the ends. Let’s visualize how the graph of the sigmoid function looks.
import matplotlib.pyplot as plt
plt.plot(z, sigmoid(z))

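As a quick sanity check of what the plot shows, we can reuse the sigmoid function defined above and evaluate it at a neutral score and at two extreme scores (a small optional sketch):

sigmoid(0), sigmoid(-100), sigmoid(100)
# sigmoid(0) is exactly 0.5; a very negative score gives a probability close to 0,
# and a very positive score gives a probability close to 1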
To wrap up, both implementations are presented below for comparison. The first snippet shows the familiar linear regression, while the second shows logistic regression. There is essentially only one difference between the two: in logistic regression, the sigmoid function is applied to the result of the linear regression to transform it into a probability between 0 and 1.
def linear_regression(xi):
    # w0 is the bias term and w is the vector of weights,
    # both assumed to be already trained
    result = w0
    for j in range(len(w)):
        result = result + xi[j] * w[j]
    return result


def logistic_regression(xi):
    # same linear score as above, passed through the sigmoid
    score = w0
    for j in range(len(w)):
        score = score + xi[j] * w[j]
    result = sigmoid(score)
    return result
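As a usage sketch, both functions can be called on the same feature vector; the logistic version simply squeezes the linear score through the sigmoid. The weights below are made up for illustration, not trained on any data:

w0 = -0.5
w = [0.3, -0.2]

xi = [1, 2]

linear_regression(xi)    # -0.6, the raw score
logistic_regression(xi)  # approximately 0.354, the probability of the positive class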
Linear regression and logistic regression are both called linear models because they are based on the dot product, which is a linear operation. Linear models are fast to train and fast to apply.
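To make the connection to the dot product explicit, here is an equivalent vectorized sketch of the two functions above using NumPy (the _vec names are only illustrative; the behavior matches the loop-based versions, assuming w0, w, and np are defined as before):

def linear_regression_vec(xi):
    # w0 plus the dot product of the weights and the features
    return w0 + np.dot(w, xi)

def logistic_regression_vec(xi):
    # the same linear score, passed through the sigmoid
    return sigmoid(w0 + np.dot(w, xi))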