The Power Rule in calculus is the following:
$$ \frac{d}{dx} x^n = nx^{n-1} $$
I first learned this rule in high school. The typical proof I saw for the power rule used the product rule and induction.
Using induction, we must first prove the base case \( \frac{d}{dx} x^0 = 0 \cdot x^{0-1} = 0 \). Since \( x^0 = 1 \) and the derivative of a constant is zero, this is rather trivial.
Next, we must prove the inductive case \( \frac{d}{dx} x^n = nx^{n-1} \). For this step, we can assume that the equation holds for values from \( 0 \) up to but not including \( n \). The main "trick" here, if it is one, is to break down \( x^n \) as \( x \cdot x^{n-1} \).
$$ \frac{d}{dx} x^n = \frac{d}{dx} [ x \cdot x^{n-1} ] $$
The product rule allows this derivative to be decomposed in terms of derivatives of lower powers of \( x \):
$$ \frac{d}{dx} [ x \cdot x^{n-1} ] = \frac{d}{dx}[ x ] \cdot x^{n-1} + x \cdot \frac{d}{dx} [ x^{n-1} ] $$
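The inductive hypothesis provides the values of these two derivatives: \( \frac{d}{dx} x = 1 \) and \( \frac{d}{dx} x^{n-1} = (n-1)x^{n-2} \). (When \( n = 1 \) the decomposition is circular, so \( \frac{d}{dx} x = 1 \) has to be taken as a known fact.)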
$$ \frac{d}{dx}[ x ] \cdot x^{n-1} + x \cdot \frac{d}{dx} [ x^{n-1} ] = 1 \cdot x^{n-1} + x \cdot (n-1)x^{n-2} $$
From this, a little algebraic rearranging simplifies the result.
$$ 1 \cdot x^{n-1} + x \cdot (n-1)x^{n-2} = x^{n-1} + (n-1)x^{n-1} = nx^{n-1} $$
Thus, the power rule is proved. \( \square \)
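To make the inductive step concrete, here is a minimal Python sketch that mirrors the proof. It stores a polynomial as a list of coefficients (index = power of \( x \)) and differentiates \( x^n \) using only the product rule on \( x \cdot x^{n-1} \), with \( \frac{d}{dx} 1 = 0 \) and \( \frac{d}{dx} x = 1 \) as starting facts. The coefficient-list representation and all function names are my own choices for illustration, not part of the proof.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (index = power of x)."""
    length = max(len(p), len(q))
    p = p + [0] * (length - len(p))
    q = q + [0] * (length - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def monomial(n):
    """Coefficient list of x^n."""
    return [0] * n + [1]


def derivative_of_power(n):
    """Derivative of x^n for a natural number n, built up as in the induction."""
    if n == 0:
        return [0]  # derivative of the constant 1
    if n == 1:
        return [1]  # d/dx x = 1, assumed as a known fact
    x = monomial(1)
    lower = monomial(n - 1)
    # Product rule: (x * x^(n-1))' = (x)' * x^(n-1) + x * (x^(n-1))'
    first_term = poly_mul(derivative_of_power(1), lower)
    second_term = poly_mul(x, derivative_of_power(n - 1))
    return poly_add(first_term, second_term)


def power_rule(n):
    """Coefficient list of n * x^(n-1), the power rule's prediction."""
    if n == 0:
        return [0]
    return [0] * (n - 1) + [n]


if __name__ == "__main__":
    for n in range(6):
        print(n, derivative_of_power(n), power_rule(n))
```

The recursion in `derivative_of_power` plays the role of the inductive hypothesis: each call relies on the result for \( n - 1 \). Comparing its output against `power_rule(n)` for small \( n \) is a sanity check, not a substitute for the proof.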
This is the first way I learned how the power rule worked. It only requires the derivatives of a constant and of \( x \), plus the product rule, all of which are typically learned early on in calculus courses.
The power rule is later applied to negative \( n \) and to values of \( n \) that are not natural numbers. Wait, what? The proof above only holds for natural number values of \( n \). Because it relies on induction, it cannot easily be extended to other integers, let alone to arbitrary real values of \( n \). I used to think that there was a gap in how I learned calculus because this detail was never provided. Luckily, there is another way to prove the power rule for any constant value \( n \) as long as other tools are provided.
Recall that \( x^n \) can be rewritten in terms of the exponential and natural log as \( e^{\ln x^n} = e^{n \ln x} \). This, along with the fact that \( \frac{d}{dx} e^x = e^x \), will be exploited in the next proof.
First, we rewrite \( \frac{d}{dx} x^n \) as \( \frac{d}{dx} e^{n \ln x} \). Next, we proceed to evaluate this derivative by applying the chain rule.
$$ \frac{d}{dx} e^{n \ln x} = e^{n \ln x} \frac{d}{dx}[n \ln x] $$
The \( e^{n \ln x} \) term outside of the derivative can be replaced by the original expression \( x^n \). Additionally, \( n \) is a constant and can be moved out.
$$ e^{n \ln x} \frac{d}{dx}[n \ln x] = n x^n \cdot \frac{d}{dx} \ln x $$
You may know from prior calculus knowledge that \( \frac{d}{dx} \ln x = \frac{1}{x} \), but this can actually be derived from the current equation. When \( n = 1 \), the previous set of equations simplifies to the following.
$$ \frac{d}{dx} x = x \frac{d}{dx} \ln x $$
\( \frac{d}{dx} x = 1 \) is taken from basic calculus knowledge. Dividing both sides of the equation by \( x \) shows that \( \frac{d}{dx} \ln x = \frac{1}{x} \).
Putting this all together in the general case where \( n \) may not be \( 1\), the power rule is proven.
$$ \frac{d}{dx} x^n = nx^n \cdot \frac{d}{dx} \ln x = nx^n \cdot \frac{1}{x} = nx^{n-1} $$
\( \square \)
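As a quick numerical sanity check of this second proof, the sketch below evaluates \( x^n \) as \( e^{n \ln x} \), approximates its derivative with a central difference, and compares the result to \( nx^{n-1} \) at a few sample points with non-integer \( n \). The function names, the sample points, and the step size `h` are arbitrary choices of mine, and a finite-difference check only illustrates the result rather than proving it.

```python
import math


def power_via_exp(x, n):
    """Evaluate x^n through the rewrite e^(n ln x) used in the proof."""
    return math.exp(n * math.log(x))


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


def power_rule_value(x, n):
    """The power rule's prediction n * x^(n-1)."""
    return n * x ** (n - 1)


if __name__ == "__main__":
    samples = [(2.0, 0.5), (3.0, -1.5), (1.5, math.pi)]
    for x, n in samples:
        approx = central_difference(lambda t: power_via_exp(t, n), x)
        print(x, n, approx, power_rule_value(x, n))
```

The two printed values in each row should agree to several decimal places, with the remaining difference coming from the finite step size and floating-point rounding.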
In this alternative proof, more advanced techniques such as the chain rule and the derivative of \( e^x \) form the foundation of the proof. These techniques are often learned later in calculus courses after the power rule is introduced. For this reason, I see why the power rule may only be proved or shown for positive integer cases although it is applied to the general case. But the gap of why the power rule works in the general case is often not made clear. This alternative proof is not constrained by any special values of \( n \) and proves a very general and powerful result.
I have seen this second proof used and for a while I took it as authoritative. However, there is a slight flaw that makes this proof incomplete. Did you spot it?
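Remember that \( e^x \) is always positive in value, and that \( \ln x \) is only defined for \( x > 0 \). So \( x^n = e^{n \ln x} \) only holds true when \( x > 0 \), in which case \( x^n > 0 \) as well. This proof is only valid under this assumption! What about when \( x \leq 0 \), where \( x^n \) may be zero, negative, or not defined at all? Can the proof be fixed to work for the general case?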
Let's first take a look at when \( x^n \leq 0 \), splitting this into cases when \( x^n < 0 \) and \( x^n = 0 \). The zero case is easier. \( x^n = 0 \) if and only if \( x = 0 \) (ignoring \( n = 0 \) for now).
It is also true that \( x^n < 0 \) only if \( x < 0 \). This relationship is not if and only if because, for \( x < 0 \), \( x^n \) is positive when \( n \) is an even integer and may not be defined at all otherwise. In fact, in the realm of real numbers, \( x^n \) is generally only defined for negative values of \( x \) when \( n \) is an integer (setting aside rational exponents with odd denominators, such as \( n = \frac{1}{3} \)). For negative \( x \) and odd integer \( n \), the relationship \( x^n = -e^{n \ln(-x)} \) does hold by adding in the necessary negative signs. For negative \( x \) and even integer \( n \), the relationship is \( x^n = e^{n \ln(-x)} \). Proof 2 does work to prove the power rule for these \( x < 0 \) cases, as long as you are careful with the negative signs.
The last step is to look at the case when \( x = 0 \). The analysis so far suggests that the power rule formula should extend continuously to \( x = 0 \), but it cannot be assumed that the derivative is continuous or even defined there. Let's first rule out the cases where \( x^n \) is not differentiable at \( x = 0 \). Certainly, \( n \) must be positive or else \( x^n \) is not defined at \( x = 0 \) (still ignoring \( n = 0 \)). If \( n \) is not an integer, then \( x^n \) is not defined for negative \( x \), so the left-hand limit at \( x = 0 \) does not exist and \( x^n \) is not continuous there. Continuity is a necessary condition for differentiability, so all this leaves are the positive integer values of \( n \).
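The sign bookkeeping for negative \( x \) is easy to get wrong, so here is a small Python sketch of it. It computes \( x^n \) for \( x < 0 \) and integer \( n \) as \( \pm e^{n \ln(-x)} \), and then applies the chain rule to that expression, using \( \frac{d}{dx} \ln(-x) = \frac{1}{x} \). The function names are hypothetical, and the sketch assumes \( n \) is passed as a Python `int`.

```python
import math


def power_negative_base(x, n):
    """x^n for x < 0 and integer n, computed as (+/-) e^(n ln(-x))."""
    if x >= 0 or not isinstance(n, int):
        raise ValueError("this sketch expects x < 0 and an integer n")
    # Odd n keeps the minus sign, even n drops it.
    sign = -1.0 if n % 2 else 1.0
    return sign * math.exp(n * math.log(-x))


def derivative_negative_base(x, n):
    """Chain rule applied to the expression above."""
    # d/dx ln(-x) = (1 / (-x)) * (-1) = 1 / x, so the derivative is n * x^n / x.
    return power_negative_base(x, n) * n / x


if __name__ == "__main__":
    for n in (-3, -2, 1, 2, 3, 4):
        x = -2.0
        print(n, power_negative_base(x, n), x ** n)
        print(n, derivative_negative_base(x, n), n * x ** (n - 1))
```

In each pair of printed values, the sign-adjusted exponential form should match Python's built-in `x ** n`, and the chain-rule derivative should match \( nx^{n-1} \), up to floating-point rounding.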
At this point, I would turn to the definition of the derivative to show that \( x^n \) is differentiable at \( x = 0 \) for positive integer \( n \).
$$ \lim_{h \to 0} \frac{(x+h)^n - x^n}{h} $$
Substituting \( x = 0 \),
$$ \lim_{h \to 0} \frac{(0+h)^n - 0^n}{h} = \lim_{h \to 0} \frac{h^n}{h} = \lim_{h \to 0} h^{n-1} $$
We are operating under the assumption that \( n \) is a positive integer. If \( n = 1 \), then \( \lim_{h \to 0} h^{n-1} = \lim_{h \to 0} h^0 = 1 \). If \( n > 1 \), then \( \lim_{h \to 0} h^{n-1} = 0 \). What matters is that in both cases, the limit exists and is consistent with the power rule formula.
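The limit above can also be watched numerically. This minimal Python sketch evaluates the difference quotient at \( x = 0 \) for a few positive integers \( n \) and shrinking values of \( h \), including a negative one to approach from the left. The function name and the sample values of \( h \) are my own choices for illustration.

```python
def difference_quotient_at_zero(n, h):
    """((0 + h)^n - 0^n) / h for a positive integer n and nonzero h."""
    return ((0 + h) ** n - 0 ** n) / h


if __name__ == "__main__":
    for n in (1, 2, 3):
        for h in (1e-1, 1e-3, -1e-3, 1e-6):
            print(n, h, difference_quotient_at_zero(n, h))
```

For \( n = 1 \) the quotient stays at 1 regardless of \( h \), and for \( n > 1 \) it behaves like \( h^{n-1} \) and shrinks toward 0 from either side, which matches the two cases worked out above.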