The function is strictly convex if equality holds only for \(\lambda=0\) or \(\lambda=1\). Note that if \(-f\) is convex we call \(f\) concave. For a convex function, every chord lies on or above the graph of the function.
If the domain is the real numbers and \(f\) is twice differentiable, then \(d^{2}f/dx^{2}\ge 0\) everywhere implies \(f\) is convex, and \(d^{2}f/dx^{2}> 0\) everywhere implies \(f\) is strictly convex.
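The chord picture can be checked numerically. The following is a minimal sketch, assuming the usual chord form of the convexity condition, \(f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y)\); the helper names `chord_gap` and `looks_convex` are hypothetical and introduced only for illustration. Random sampling can refute convexity on an interval but can never prove it.

```python
import math
import random


def chord_gap(f, x, y, lam):
    """Height of the chord minus the function value at lam*x + (1-lam)*y."""
    return lam * f(x) + (1 - lam) * f(y) - f(lam * x + (1 - lam) * y)


def looks_convex(f, lo, hi, trials=10_000, tol=1e-12, seed=0):
    """Return False as soon as a sampled chord dips below the function.

    A True result only means no counterexample was found among the samples.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        lam = rng.random()
        if chord_gap(f, x, y, lam) < -tol:
            return False
    return True


if __name__ == "__main__":
    print(looks_convex(math.exp, -3.0, 3.0))            # second derivative e^x > 0
    print(looks_convex(lambda x: x * x, -5.0, 5.0))     # second derivative 2 > 0
    print(looks_convex(math.sin, 0.0, math.pi))         # concave on [0, pi]
```

Both \(e^{x}\) and \(x^{2}\) have a positive second derivative everywhere, so no sampled chord falls below them, whereas \(\sin x\) on \([0,\pi]\) is concave and fails the check.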
If equality holds, \(\langle f(x) \rangle = f(\langle x\rangle)\), and \(f(x)\) is strictly convex, then \(x_{j}=\langle x\rangle\) for every \(j\) with \(p_{j}>0\).
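As a small numerical illustration, the sketch below evaluates the gap \(\langle f(x) \rangle - f(\langle x\rangle)\) for a discrete distribution. The function names `mean` and `jensen_gap` and the sample values are assumptions chosen for this example, not part of the original text.

```python
import math


def mean(values, probs):
    """Weighted average sum_j p_j * v_j."""
    return sum(p * v for p, v in zip(probs, values))


def jensen_gap(f, xs, ps):
    """<f(x)> - f(<x>), which is non-negative when f is convex."""
    return mean([f(x) for x in xs], ps) - f(mean(xs, ps))


if __name__ == "__main__":
    ps = [0.5, 0.25, 0.25]  # probabilities, summing to 1

    # Distinct values: the gap is strictly positive for strictly convex f.
    print(jensen_gap(lambda x: x * x, [1.0, 2.0, 4.0], ps))
    print(jensen_gap(math.exp, [1.0, 2.0, 4.0], ps))

    # All values equal to the mean: the equality case.
    print(jensen_gap(lambda x: x * x, [3.0, 3.0, 3.0], ps))
```

For \(f(x)=x^{2}\) the gap is the variance of \(x\), which vanishes exactly when every \(x_{j}\) with non-zero probability equals the mean.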
Jensen’s inequality can be easily proved by induction on the number of terms in the average:
Proof
With \(m\) terms, the averages are \(\langle x\rangle = \sum_{j=1}^{m} p_{j} x_{j}\) and \(\langle f(x) \rangle= \sum_{j=1}^{m}p_{j} f(x_{j})\), where \(p_{j}\) is the probability of \(x_{j}\). The case \(m=1\) is trivially true. The case \(m=2\) is
\[ p_{1} f(x_{1}) + p_{2} f(x_{2}) \ge f(p_{1} x_{1} + p_{2} x_{2}), \qquad p_{1}+p_{2}=1, \]
which is just the definition of convexity for \(f(x)\). Now assume the result holds for \(m=n\) and look at \(m=n+1\). First note that if \(p_{1}=1\) then all the other probabilities are zero and we are back to the case \(m=1\), so we can assume \(p_{1}<1\).