# Forward and Backward Propagation
## Introduction
Forward and backward propagation are the two essential phases of training an artificial neural network. Together they allow the model to learn from the training data by adjusting its weights to minimize the error between the predicted output and the actual output.

## Forward Propagation
### What is Forward Propagation?
Forward propagation is the process of passing the input data through the neural network to obtain an output. It involves several steps:

1. Input Layer: The input data is fed into the network.
2. Weighted Sum Calculation: Each neuron calculates a weighted sum of its inputs:
   $$ z = w_1x_1 + w_2x_2 + ... + w_nx_n + b $$
   Where:
   - $z$ is the weighted sum,
   - $w_i$ are the weights,
   - $x_i$ are the input features, and
   - $b$ is the bias term.
3. Activation Function: The weighted sum is then passed through an activation function (like sigmoid, ReLU, etc.) to introduce non-linearity (see the sketch after this list). This can be expressed as:
   $$ a = f(z) $$
   Where:
   - $a$ is the activation output, and
   - $f$ is the activation function.
4. Output Layer: This process continues through all the layers until the output layer is reached, which gives the predicted output.
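As a small illustration of step 3, here is a minimal sketch (using only NumPy) of two common choices of $f$, the sigmoid and ReLU activations:

```python
import numpy as np

def sigmoid(z):
    """Squashes any real-valued weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Keeps positive values and sets negative values to 0."""
    return np.maximum(0, z)

# a = f(z) for a few sample weighted sums
z = np.array([-1.0, 0.5, 2.0])
print(sigmoid(z))  # approximately [0.269, 0.622, 0.881]
print(relu(z))     # the negative entry becomes 0, the rest are unchanged
```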
### Example of Forward Propagation
Consider a simple neural network with one input layer, one hidden layer, and one output layer.

```python
import numpy as np

# Input features
X = np.array([0.5, 0.2])  # Example input

# Weights and biases
W1 = np.array([[0.4, 0.6], [0.3, 0.2]])  # Weights for hidden layer
b1 = np.array([0.1, 0.1])                # Biases for hidden layer
W2 = np.array([0.7, 0.9])                # Weights for output layer
b2 = 0.2                                 # Bias for output layer

# Forward propagation
z1 = np.dot(W1, X) + b1   # Weighted sum for hidden layer
a1 = np.maximum(0, z1)    # Activation (using ReLU)
z2 = np.dot(W2, a1) + b2  # Weighted sum for output layer
a2 = z2                   # Output (linear activation)

print('Output:', a2)      # Predicted output
```
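With these numbers, the hidden activations work out to `a1 = [0.42, 0.29]` (both positive, so the ReLU leaves them unchanged) and the predicted output to `a2 = 0.755`, up to floating-point rounding.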
## Backward Propagation
### What is Backward Propagation?
Backward propagation is the process of updating the weights of the neural network based on the error of the output during training. The main goal is to minimize the loss function, which measures the difference between the predicted and actual output.

### Steps in Backward Propagation

1. Calculate the Loss: Use a loss function (like Mean Squared Error) to compute the error between the predicted output and the actual target:
   $$ L = (y - \tilde{y})^2 $$
   Where:
   - $L$ is the loss,
   - $y$ is the actual output, and
   - $\tilde{y}$ is the predicted output.
2. Calculate Gradients: Compute the gradient of the loss with respect to each weight using the chain rule. This tells us how much to change the weights to reduce the error (a worked instance for the output layer follows this list).
3. Update Weights: Adjust the weights using the gradients and a learning rate ($\eta$):
   $$ w_i = w_i - \eta \frac{\text{d}L}{\text{d}w_i} $$
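To make step 2 concrete for the network in this article (squared-error loss and a linear output), the chain rule for an output-layer weight $w_i$ expands as

$$ \frac{\text{d}L}{\text{d}w_i} = \frac{\text{d}L}{\text{d}\tilde{y}} \cdot \frac{\text{d}\tilde{y}}{\text{d}w_i} = -2\,(y - \tilde{y}) \cdot a_i $$

where $a_i$ is the hidden activation that $w_i$ multiplies. This is exactly the quantity computed as `W2_gradient` in the example below.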
### Example of Backward Propagation
Continuing from our previous example, let's assume the actual target value is 0.8. We will perform one step of backward propagation:

```python
# Actual target value
y = 0.8

# Calculate the loss
loss = (y - a2) ** 2

# Gradient of the loss w.r.t. the output
loss_gradient = -2 * (y - a2)

# Backpropagate through the output layer: gradient for the weights W2
W2_gradient = loss_gradient * a1

# Update the weights (assuming a learning rate of 0.01)
learning_rate = 0.01
W2 -= learning_rate * W2_gradient

print('Updated W2:', W2)
```
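The example above only updates the output-layer weights `W2`. As a rough sketch of how the same idea extends one layer further back, the remaining parameters can be updated by pushing the gradient through the ReLU hidden layer. This reuses `X`, `z1`, `a1`, `W2`, `b2`, `W1`, `b1`, `loss_gradient`, and `learning_rate` from the snippets above; note that a full implementation would compute all gradients before updating any weights.

```python
# Output bias: since a2 = z2, dL/db2 is just the gradient at the output
b2_gradient = loss_gradient

# Gradient w.r.t. the hidden activations: dL/da1 = dL/dz2 * W2
# (strictly, this should use the value of W2 from before the update above)
a1_gradient = loss_gradient * W2

# ReLU derivative: 1 where z1 > 0, 0 elsewhere
z1_gradient = a1_gradient * (z1 > 0)

# Hidden-layer gradients: dL/dW1 is the outer product of dL/dz1 and the input
W1_gradient = np.outer(z1_gradient, X)
b1_gradient = z1_gradient

# Gradient-descent updates with the same learning rate
b2 -= learning_rate * b2_gradient
W1 -= learning_rate * W1_gradient
b1 -= learning_rate * b1_gradient

print('Updated W1:', W1)
print('Updated b1:', b1)
print('Updated b2:', b2)
```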