Neural Networks & Deep Learning
Mish
A smooth, non-monotonic activation function (x * tanh(softplus(x))) that is differentiable everywhere and keeps a small non-zero gradient for negative inputs, avoiding ReLU's dead-neuron problem.
Like Swish, Mish is often used as a drop-in replacement for ReLU and has been reported to improve accuracy in some image-classification and object-detection models.
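A minimal NumPy sketch of Mish (the function names and test input are illustrative); np.logaddexp(0, x) gives a numerically stable softplus:

```python
import numpy as np

def softplus(x):
    # softplus(x) = log(1 + exp(x)), computed stably via log(exp(0) + exp(x))
    return np.logaddexp(0.0, x)

def mish(x):
    # Mish: x * tanh(softplus(x))
    return x * np.tanh(softplus(x))

# Sample inputs spanning negative and positive values
x = np.linspace(-5.0, 5.0, 11)
print(mish(x))
```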
Related Concepts
- Activation Function
- ReLU
- Swish
Tags
neural-networks-deep-learning activation-function relu swish
Related Terms
Activation Function
A function applied to a neuron's output that introduces non-linearity, enabling networks to learn complex patterns; an illustrative sketch follows below.
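As an illustrative sketch (the layer sizes and random data are assumptions), an activation is applied elementwise to a layer's linear pre-activation:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # batch of 4 inputs with 3 features
W = rng.normal(size=(3, 2))   # weights of a dense layer with 2 units
b = np.zeros(2)               # biases

z = x @ W + b                 # linear pre-activation
a = np.maximum(z, 0.0)        # non-linearity (here ReLU) applied elementwise
print(a)
```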
ReLU
Rectified Linear Unit - an activation function that outputs the input if positive, zero otherwise. f(x) = max(0, x).
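A one-line NumPy sketch of ReLU (the sample input is illustrative):

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x), applied elementwise
    return np.maximum(x, 0.0)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0.  0.  0.  1.5]
```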
Swish
A smooth activation function (x * sigmoid(x)) that often outperforms ReLU, discovered through neural architecture search.
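A minimal NumPy sketch of Swish (the sample input is illustrative; for very large negative inputs a numerically stable sigmoid would be preferable):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):
    # Swish (also known as SiLU): x * sigmoid(x)
    return x * sigmoid(x)

print(swish(np.array([-2.0, 0.0, 2.0])))
```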