Why does ReLU equal max(0, x)?

Viradj asked 2 months ago

I read that the ReLU activation can be computed as max(0, x). Why is this true?

1 Answer
Best Answer
Chris Staff answered 2 months ago

ReLU can be written as follows:

\(R(x) = \begin{cases} x & \text{if } x \geq 0 \\ 0 & \text{otherwise} \end{cases}\)


As you can see, the output of ReLU is always zero when the input is \(< 0\). When the input is \(\geq 0\), the output is simply equal to \(x\).

If you look at what `max(0, x)` does, you’ll find the following:

  • For all \(x \geq 0\), the output equals the \(x\) part of the `max` function.
  • For all \(x < 0\), the output equals the \(0\) part of the `max` function.

In other words, `max(0, x)` produces exactly the same output as the piecewise formula above. It’s also very cheap to compute, which is one reason ReLU is among the most popular activation functions these days.
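Here is a minimal sketch (assuming NumPy; the helper names `relu_piecewise` and `relu_max` are just for illustration) showing that both forms give identical outputs:

```python
import numpy as np

# Piecewise definition: x when x >= 0, otherwise 0
def relu_piecewise(x):
    return np.where(x >= 0, x, 0.0)

# Same function written as max(0, x)
def relu_max(x):
    return np.maximum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu_piecewise(x))  # [0.  0.  0.  0.5 2. ]
print(relu_max(x))        # same values as above
```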

MachineCurve. (2019, September 4). ReLU, sigmoid and tanh: Today’s most used activation functions. MachineCurve. https://www.machinecurve.com/index.php/2019/09/04/relu-sigmoid-and-tanh-todays-most-used-activation-functions/
