Neural Networks & Deep Learning
Swish
A smooth, non-monotonic activation function, f(x) = x * sigmoid(x) (also known as SiLU), that often matches or outperforms ReLU; it was found through automated search over candidate activation functions.
Unlike ReLU, Swish has no hard kink at zero and keeps a non-zero gradient for negative inputs, properties that can help optimization in deep networks.
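A minimal NumPy sketch of the definition above, assuming the common beta = 1 form; the general Swish is f(x) = x * sigmoid(beta * x), and the clipping bound here is just an illustrative choice to avoid overflow warnings:

```python
import numpy as np

def swish(x: np.ndarray, beta: float = 1.0) -> np.ndarray:
    """Swish activation: x * sigmoid(beta * x); beta = 1 gives SiLU."""
    z = np.clip(beta * x, -60.0, 60.0)      # keep exp() in a safe range for float64
    return x / (1.0 + np.exp(-z))           # equivalent to x * sigmoid(beta * x)

x = np.linspace(-4.0, 4.0, 9)
print(swish(x))   # smooth curve: ~0 for large negative x, ~x for large positive x
```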
Related Concepts
- Activation Function
- ReLU
- GELU
Tags
neural-networks-deep-learning activation-function relu gelu
Related Terms
Activation Function
A function applied to a neuron's output to introduce non-linearity, enabling networks to learn patterns that a purely linear model cannot represent.
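A small illustration (with hypothetical weight matrices) of why the non-linearity matters: two stacked linear layers with no activation collapse to a single linear map, while inserting an elementwise non-linearity breaks that equivalence:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))                # hypothetical first-layer weights
W2 = rng.normal(size=(2, 4))                # hypothetical second-layer weights
x = rng.normal(size=3)

# Without an activation, the two layers are exactly one linear map (W2 @ W1).
print(np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x))   # True

# With an elementwise non-linearity (ReLU here), no single matrix reproduces the map.
h = np.maximum(0.0, W1 @ x)
print(W2 @ h)
```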
GELU
Gaussian Error Linear Unit - a smooth activation function, f(x) = x * Φ(x) with Φ the standard normal CDF, combining properties of dropout and ReLU; used in BERT and GPT.
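A sketch of the tanh approximation of GELU that many Transformer implementations use; the exact form is x * Φ(x), and the constants below come from that standard approximation:

```python
import numpy as np

def gelu_tanh(x: np.ndarray) -> np.ndarray:
    # Tanh approximation of GELU(x) = x * Phi(x).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

print(gelu_tanh(np.array([-2.0, 0.0, 2.0])))   # ~[-0.0454, 0.0, 1.9546]
```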
ReLU
Rectified Linear Unit - an activation function that outputs its input when positive and zero otherwise: f(x) = max(0, x).
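A direct NumPy rendering of the definition above, applied elementwise:

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    # Elementwise max(0, x): positives pass through, negatives become zero.
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))   # [0.  0.  0.  1.5]
```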