Neural Networks & Deep Learning
Swish
A smooth, non-monotonic activation function, f(x) = x * sigmoid(x) (also known as SiLU), that often matches or outperforms ReLU; it was found through automated search over candidate activation functions.
Unlike ReLU, Swish has no hard kink at zero and keeps a non-zero gradient for negative inputs, properties that can help optimization in deep networks.
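A minimal NumPy sketch of the definition above, assuming the common beta = 1 form; the general Swish is f(x) = x * sigmoid(beta * x), and the clipping bound here is just an illustrative choice to avoid overflow warnings:

```python
import numpy as np

def swish(x: np.ndarray, beta: float = 1.0) -> np.ndarray:
    """Swish activation: x * sigmoid(beta * x); beta = 1 gives SiLU."""
    z = np.clip(beta * x, -60.0, 60.0)      # keep exp() in a safe range for float64
    return x / (1.0 + np.exp(-z))           # equivalent to x * sigmoid(beta * x)

x = np.linspace(-4.0, 4.0, 9)
print(swish(x))   # smooth curve: ~0 for large negative x, ~x for large positive x
```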
Related Concepts
- Activation Function
- ReLU
- GELU
Tags
neural-networks-deep-learning activation-function relu gelu
Related Terms
Activation Function
A function applied to a neuron's output to introduce non-linearity, enabling networks to learn patterns that a purely linear model cannot represent.
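A small illustration (with hypothetical weight matrices) of why the non-linearity matters: two stacked linear layers with no activation collapse to a single linear map, while inserting an elementwise non-linearity breaks that equivalence:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))                # hypothetical first-layer weights
W2 = rng.normal(size=(2, 4))                # hypothetical second-layer weights
x = rng.normal(size=3)

# Without an activation, the two layers are exactly one linear map (W2 @ W1).
print(np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x))   # True

# With an elementwise non-linearity (ReLU here), no single matrix reproduces the map.
h = np.maximum(0.0, W1 @ x)
print(W2 @ h)
```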
GELU
Gaussian Error Linear Unit - a smooth activation function, f(x) = x * Φ(x) with Φ the standard normal CDF, combining properties of dropout and ReLU; used in BERT and GPT.
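A sketch of the tanh approximation of GELU that many Transformer implementations use; the exact form is x * Φ(x), and the constants below come from that standard approximation:

```python
import numpy as np

def gelu_tanh(x: np.ndarray) -> np.ndarray:
    # Tanh approximation of GELU(x) = x * Phi(x).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

print(gelu_tanh(np.array([-2.0, 0.0, 2.0])))   # ~[-0.0454, 0.0, 1.9546]
```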
ReLU
Rectified Linear Unit - an activation function that outputs its input when positive and zero otherwise: f(x) = max(0, x).
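A direct NumPy rendering of the definition above, applied elementwise:

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    # Elementwise max(0, x): positives pass through, negatives become zero.
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))   # [0.  0.  0.  1.5]
```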