Activation Function vs Cost Function
Both activation functions and cost functions play crucial roles in neural networks, but they serve different purposes in training deep learning models.
1️⃣ Activation Function
🔹 Purpose:
- Transforms neuron outputs to introduce non-linearity.
- Helps the network learn complex patterns.
- Applied at each neuron in the hidden and output layers.
🔹 Examples (sketched in code after this list):
- ReLU → Used in hidden layers.
- Sigmoid → Used for binary classification.
- Softmax → Used for multi-class classification.
- Tanh → Used in hidden layers; produces zero-centered outputs in (-1, 1).
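Each of these is a one-line call in PyTorch. Below is a minimal sketch of ReLU, Tanh, and Softmax on arbitrary example values (Sigmoid is shown separately further down):

import torch

x = torch.tensor([-1.0, 0.0, 2.0])
print(torch.relu(x))            # tensor([0., 0., 2.]): negative values clipped to zero
print(torch.tanh(x))            # zero-centered values in (-1, 1)
print(torch.softmax(x, dim=0))  # non-negative values that sum to 1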
🔹 Mathematical Example:
Sigmoid Activation Function:
$$f(x) = \frac{1}{1 + e^{-x}}$$
🔹 Example in PyTorch:
import torch

x = torch.tensor([-1.0, 0.0, 2.0])
sigmoid_output = torch.sigmoid(x)  # torch.sigmoid replaces the deprecated F.sigmoid
print(sigmoid_output)  # tensor([0.2689, 0.5000, 0.8808])
2️⃣ Cost Function (Loss Function)
🔹 Purpose:
- Measures the difference between the model's predictions and actual values.
- Guides the optimizer in adjusting the model's weights.
- Used after forward propagation to compute error.
🔹 Examples (MSE is sketched in code after this list):
- Mean Squared Error (MSE) → For regression.
- Cross-Entropy Loss → For classification.
- Hinge Loss → Used for SVM models.
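For instance, MSE in PyTorch (a minimal sketch with nn.MSELoss; the prediction and target values here are made up for illustration):

import torch
import torch.nn as nn

pred = torch.tensor([2.5, 0.0, 2.0])      # Hypothetical regression predictions
target = torch.tensor([3.0, -0.5, 2.0])   # Hypothetical ground-truth values
mse = nn.MSELoss()
print(mse(pred, target))  # tensor(0.1667): mean of the squared differences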
🔹 Mathematical Example:
Cross-Entropy Loss (for classification):
$$L = -\sum y \log(\hat{y})$$
Where y is the actual label and ŷ is the predicted probability.
🔹 Example in PyTorch:
import torch
import torch.nn as nn

logits = torch.tensor([[0.2, 0.8]])  # Raw model outputs (logits) for a batch of 1
target = torch.tensor([1])           # Actual class index (not one-hot)
loss_fn = nn.CrossEntropyLoss()      # Applies softmax to the logits internally
loss = loss_fn(logits, target)
print(loss)  # tensor(0.4375)
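To connect this back to the formula above, the same number can be computed by hand: apply softmax to the logits to get ŷ, then take -log(ŷ) for the true class (a sketch using F.log_softmax):

import torch
import torch.nn.functional as F

logits = torch.tensor([[0.2, 0.8]])
log_probs = F.log_softmax(logits, dim=1)  # log(ŷ) for each class
manual_loss = -log_probs[0, 1]            # -log(ŷ) for the true class (index 1)
print(manual_loss)  # tensor(0.4375), matching nn.CrossEntropyLoss above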
📌 Key Differences
| Feature | Activation Function | Cost Function |
|---|---|---|
| Purpose | Introduces non-linearity into neuron outputs | Measures the error in predictions |
| Used in | Hidden & output layers | After forward pass (during training) |
| Affects | Non-linearity & learning ability | Model optimization & weight updates |
| Output | Transformed neuron values | A single scalar loss value |
| Examples | ReLU, Sigmoid, Softmax | MSE, Cross-Entropy, Hinge Loss |
🛠️ When to Use Each?
- Use an activation function to ensure neurons can model complex relationships.
- Use a cost function to evaluate how well the model is performing.
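To see both pieces working together, here is a minimal sketch of a single training step on a toy binary-classification example (the model, input, and label are all made up for illustration):

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(2, 1)           # Toy model: 2 features in, 1 logit out
x = torch.tensor([[0.5, -1.0]])   # Hypothetical input
y = torch.tensor([[1.0]])         # Hypothetical binary label

logit = model(x)              # Forward pass
prob = torch.sigmoid(logit)   # Activation function shapes the output
loss = nn.BCELoss()(prob, y)  # Cost function measures the error
loss.backward()               # Gradients from the loss drive weight updates
print(loss.item())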
🚀 Final Thought
✅ Activation functions shape neuron outputs.
✅ Cost functions evaluate model accuracy and guide optimization.