# Why does ReLU equal max(0, x)?

I read that the ReLU activation can be computed as max(0, x). Why is this true?

ReLU can be written as follows:

$$R(x) = \begin{cases} x & \text{if } x \geq 0 \\ 0 & \text{otherwise} \end{cases}$$

As you can see, ReLU outputs zero whenever the input is $$< 0$$, and it outputs $$x$$ whenever the input is $$\geq 0$$.
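The piecewise definition translates directly into code. A minimal sketch (the function name is ours, not from the source):

```python
def relu_piecewise(x):
    # Piecewise definition: return x when x >= 0, and 0 otherwise
    if x >= 0:
        return x
    return 0
```

For example, `relu_piecewise(3)` returns `3`, while `relu_piecewise(-2)` returns `0`.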

If you look at what

max(0, x)

does, you’ll find the following:

• For all $$x \geq 0$$, max(0, x) returns its second argument, $$x$$.
• For all $$x < 0$$, max(0, x) returns its first argument, $$0$$.

In other words, max(0, x) produces exactly the same output as the piecewise formula above. It is also cheap to compute, which is one reason ReLU is among the most popular activation functions today.
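You can check this equivalence numerically. A quick sketch (function names are ours) that compares both formulations over a range of inputs:

```python
def relu_max(x):
    # ReLU via the built-in max: the larger of 0 and x
    return max(0, x)

def relu_piecewise(x):
    # The same function, written out as the piecewise definition
    return x if x >= 0 else 0

# The two definitions agree for negative, zero, and positive inputs
for x in [-5.0, -0.1, 0.0, 0.1, 5.0]:
    assert relu_max(x) == relu_piecewise(x)
```

Since max(0, x) is a single comparison, deep learning frameworks can apply it elementwise very cheaply.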

MachineCurve. (2019, September 4). *ReLU, sigmoid and tanh: Today's most used activation functions*. https://www.machinecurve.com/index.php/2019/09/04/relu-sigmoid-and-tanh-todays-most-used-activation-functions/