17:33
Derivative formulas through geometry | Chapter 3, Essence of calculus
3Blue1Brown
·
May 12, 2026
Transcript
0:12
Now that we've seen what a derivative means and what it has to do with rates of change,
0:16
our next step is to learn how to actually compute these guys.
0:19
As in, if I give you some kind of function with an explicit formula,
0:22
you'd want to be able to find what the formula for its derivative is.
0:26
Maybe it's obvious, but I think it's worth stating explicitly why this
0:30
is an important thing to be able to do, why much of a calculus student's
0:34
time ends up going towards grappling with derivatives of abstract
0:37
functions rather than thinking about concrete rate of change problems.
0:42
It's because a lot of real-world phenomena, the sort of things that
0:45
we want to use calculus to analyze, are modeled using polynomials,
0:49
trigonometric functions, exponentials, and other pure functions like that.
0:53
So if you build up some fluency with the ideas of rates of change for those kinds of
0:58
pure abstract functions, it gives you a language to more readily talk about the rates
1:02
at which things change in concrete situations that you might be using calculus to model.
1:07
But it is way too easy for this process to feel like just memorizing a list of rules,
1:12
and if that happens, if you get that feeling, it's also easy to lose sight of the
1:16
fact that derivatives are fundamentally about just looking at tiny changes to some
1:20
quantity and how that relates to a resulting tiny change in another quantity.
1:24
So in this video and in the next one, my aim is to show you how you can think
1:28
about a few of these rules intuitively and geometrically,
1:31
and I really want to encourage you to never forget that tiny nudges are at the
1:35
heart of derivatives.
1:37
Let's start with a simple function like f of x equals x squared.
1:41
What if I asked you its derivative?
1:43
That is, if you were to look at some value x, like x equals 2,
1:47
and compare it to a value slightly bigger, just dx bigger,
1:50
what's the corresponding change in the value of the function?
1:54
dF.
1:55
And in particular, what's dF divided by dx, the rate
1:58
at which this function is changing per unit change in x.
2:03
As a first step for intuition, we know that you can think of this ratio
2:07
dF/dx as the slope of a tangent line to the graph of x squared,
2:10
and from that you can see that the slope generally increases as x increases.
2:15
At zero, the tangent line is flat, and the slope is zero.
2:19
At x equals 1, it's something a bit steeper.
2:22
At x equals 2, it's steeper still.
2:25
But looking at graphs isn't generally the best way
2:27
to understand the precise formula for a derivative.
2:30
For that, it's best to take a more literal look at what x squared actually means,
2:34
and in this case let's go ahead and picture a square whose side length is x.
2:39
If you increase x by some tiny nudge, some little dx,
2:43
what's the resulting change in the area of that square?
2:47
That slight change in area is what dF means in this context.
2:52
It's the tiny increase to the value of f of x equals x squared,
2:55
caused by increasing x by that tiny nudge dx.
2:59
Now you can see that there's three new bits of area in this diagram,
3:03
two thin rectangles and a minuscule square.
3:06
The two thin rectangles each have side lengths of x and dx,
3:10
so they account for 2 times x times dx units of new area.
3:18
For example, let's say x was 3 and dx was 0.01,
3:21
then that new area from these two thin rectangles would be 2 times 3 times 0.01,
3:25
which is 0.06, about 6 times the size of dx.
3:29
That little square there has an area of dx squared,
3:32
but you should think of that as being really tiny, negligibly tiny.
3:37
For example, if dx was 0.01, that would be only 0.0001,
3:41
and keep in mind I'm drawing dx with a fair bit of width here just so we
3:46
can actually see it, but always remember in principle,
3:49
dx should be thought of as a truly tiny amount, and for those truly tiny amounts,
3:54
a good rule of thumb is that you can ignore anything that includes a dx
3:59
raised to a power greater than 1.
4:02
That is, a tiny change squared is a negligible change.
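To see that rule of thumb in numbers, here is a minimal sketch in Python (the language and the helper name `new_area_pieces` are illustrative choices, not something from the video). It splits the new area of the square into the two thin rectangles and the corner square, then prints the share of the new area that the corner accounts for as dx shrinks.

```python
def new_area_pieces(x, dx):
    """Split the area gained when a square's side grows from x to x + dx.

    Returns (rectangles, corner): the two thin x-by-dx rectangles
    and the small dx-by-dx corner square. Hypothetical helper for illustration.
    """
    rectangles = 2 * x * dx
    corner = dx * dx
    return rectangles, corner


x = 3.0
for dx in (0.1, 0.01, 0.001, 0.0001):
    rectangles, corner = new_area_pieces(x, dx)
    share = corner / (rectangles + corner)  # fraction of new area in the corner
    print(f"dx={dx:<7} rectangles={rectangles:.8f} corner={corner:.8f} corner share={share:.6f}")
```

Each time dx shrinks by a factor of 10, the corner's share of the new area shrinks by roughly the same factor, which is the sense in which it becomes negligible.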
4:07
What this leaves us with is that dF is just some multiple of dx, and that multiple, 2x,
4:13
which you could also write as dF divided by dx, is the derivative of x squared.
4:19
For example, if you were starting at x equals 3, then as you slightly increase x,
4:24
the rate of change in the area per unit change in length added, d of x squared over dx,
4:29
would be 2 times 3, or 6, and if instead you were starting at x equals 5,
4:34
then the rate of change would be 10 units of area per unit change in x.
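As a rough numerical check of those two values, the sketch below computes the ratio of the change in x squared to a small but finite nudge dx. The function names `f` and `rate_of_change` are hypothetical, and a finite dx only approximates the derivative, so the printed ratios approach 2x rather than equal it.

```python
def f(x):
    """The function under study: the area of a square with side x."""
    return x * x


def rate_of_change(func, x, dx):
    """Ratio (change in output) / (change in input) for a finite nudge dx."""
    return (func(x + dx) - func(x)) / dx


for x in (3.0, 5.0):
    for dx in (0.1, 0.001, 0.00001):
        ratio = rate_of_change(f, x, dx)
        print(f"x={x} dx={dx:<8} dF/dx={ratio:.6f}  (2x = {2 * x})")
```

The leftover difference between each ratio and 2x is the contribution of the dx squared corner, which here works out to dx itself after dividing by dx.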
4:41
Let's go ahead and try a different simple function, f of x equals x cubed.
4:45
This is going to be the geometric view of the stuff
4:48
that I went through algebraically in the last video.
4:51
What's nice here is that we can think of x cubed as the volume of an actual
4:55
cube whose side lengths are x, and when you increase x by a tiny nudge,
5:00
a tiny dx, the resulting increase in volume is what I have here in yellow.
5:04
That represents all the volume in a cube with side lengths x plus dx
5:08
that's not already in the original cube, the one with side length x.
5:13
It's nice to think of this new volume as broken up into multiple components,
5:18
but almost all of it comes from these three square faces,
5:22
or said a little more precisely, as dx approaches 0,
5:25
those three squares comprise a portion closer and closer to 100% of
5:30
that new yellow volume.
5:33
Each of those thin squares has a volume of x squared times dx,
5:38
the area of the face times that little thickness dx.
5:42
So in total this gives us 3x squared dx of volume change.
5:47
And to be sure there are other slivers of volume here along the edges
5:51
and that tiny one in the corner, but all of that volume is going to be
5:54
proportional to dx squared, or dx cubed, so we can safely ignore them.
5:59
Again this is ultimately because they're going to be divided by dx,
6:03
and if there's still any dx remaining then those terms aren't
6:07
going to survive the process of letting dx approach 0.
6:11
What this means is that the derivative of x cubed,
6:14
the rate at which x cubed changes per unit change of x, is 3 times x squared.
6:20
What that means in terms of graphical intuition is that the slope of
6:25
the graph of x cubed at every single point x is exactly 3x squared.
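The same bookkeeping can be sketched for the cube. The breakdown below assumes the standard expansion of (x + dx) cubed: three thin slabs on the faces, three slivers along the edges, and one tiny cube in the corner. The helper name `new_volume_pieces` is hypothetical.

```python
def new_volume_pieces(x, dx):
    """Split the volume gained when a cube's side grows from x to x + dx."""
    faces = 3 * x**2 * dx   # three thin x-by-x-by-dx slabs on the faces
    edges = 3 * x * dx**2   # three x-by-dx-by-dx slivers along the edges
    corner = dx**3          # one dx-by-dx-by-dx cube in the corner
    return faces, edges, corner


x = 2.0
for dx in (0.1, 0.01, 0.001):
    faces, edges, corner = new_volume_pieces(x, dx)
    total = (x + dx)**3 - x**3  # all of the new volume
    print(f"dx={dx:<6} total={total:.9f} faces share={faces / total:.6f} "
          f"total/dx={total / dx:.6f} (3x^2 = {3 * x**2})")
```

As dx shrinks, the faces' share of the new volume climbs toward 1 and the ratio total/dx settles toward 3x squared.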
6:34
And reasoning about that slope, it should make sense that this derivative is high on the
6:38
left and then 0 at the origin and then high again as you move to the right,
6:42
but just thinking in terms of the graph would never have landed us on the precise
6:47
quantity 3x squared.
6:48
For that we had to take a much more direct look at what x cubed actually means.
6:54
Now in practice you wouldn't necessarily think of the square every
6:57
time you're taking the derivative of x squared,
6:59
nor would you necessarily think of this cube whenever you're taking
7:03
the derivative of x cubed.
7:04
Both of them fall under a pretty recognizable pattern for polynomial terms.
7:09
The derivative of x to the fourth turns out to be 4x cubed,
7:13
the derivative of x to the fifth is 5x to the fourth, and so on.
7:18
Abstractly you'd write this as the derivative of x to
7:22
the n for any power n is n times x to the n minus 1.
7:27
This right here is what's known in the business as the power rule.
7:31
In practice we all quickly just get jaded and think about this symbolically as
7:35
the exponent hopping down in front, leaving behind one less than itself,
7:39
rarely pausing to think about the geometric delights that underlie these derivatives.
7:45
That's the kind of thing that happens when these tend
7:47
to fall in the middle of much longer computations.
7:50
But rather than chalking it all up to symbolic patterns,
7:53
let's just take a moment and think about why this works for powers beyond just 2 and 3.
7:58
When you nudge that input x, increasing it slightly to x plus dx,
8:02
working out the exact value of that nudged output would involve
8:06
multiplying together these n separate x plus dx terms.
8:11
The full expansion would be really complicated,
8:13
but part of the point of derivatives is that most of that complication can be ignored.
8:19
The first term in your expansion is x to the n.
8:22
This is analogous to the area of the original square,
8:25
or the volume of the original cube from our previous examples.
8:30
For the next terms in the expansion you can choose mostly x's with a single dx.
8:41
Since there are n different parentheticals from which you could have chosen
8:46
that single dx, this gives us n separate terms,
8:50
all of which include n minus 1 x's times a dx,
8:53
giving a value of x to the power n minus 1 times dx.
8:57
This is analogous to how the majority of the new area in the square came from those
9:02
two bars, each with area x times dx, or how the bulk of the new volume in the cube
9:07
came from those three thin squares, each of which had a volume of x squared times dx.
9:14
There will be many other terms of this expansion,
9:17
but all of them are just going to be some multiple of dx squared,
9:21
so we can safely ignore them, and what that means is that all but a
9:25
negligible portion of the increase in the output comes from n copies of
9:29
this x to the n minus 1 times dx.
9:31
That's what it means for the derivative of x to the n to be n times x to the n minus 1.
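The counting argument above can be sketched numerically. This example groups the terms of the expanded product by how many dx factors were chosen, using binomial coefficients from Python's `math.comb`. The helper name `expansion_terms` and the sample values of x, dx, and n are assumptions made for illustration.

```python
from math import comb


def expansion_terms(x, dx, n):
    """Terms of (x + dx)**n grouped by how many dx factors were chosen.

    Entry k is comb(n, k) * x**(n - k) * dx**k: there are comb(n, k) ways
    to pick dx from k of the n parentheticals and x from the rest.
    """
    return [comb(n, k) * x**(n - k) * dx**k for k in range(n + 1)]


x, dx, n = 1.5, 1e-4, 5
terms = expansion_terms(x, dx, n)
change = sum(terms[1:])  # everything except the original x**n
single_dx = terms[1]     # n copies of x**(n - 1) * dx

print("change in output:    ", change)
print("from single-dx terms:", single_dx)
print("ignored share:       ", (change - single_dx) / change)
print("change / dx:         ", change / dx, "vs n * x**(n - 1) =", n * x**(n - 1))
```

Entry 0 is the original x to the n, entry 1 is the part that survives, and every later entry carries at least a dx squared.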
9:38
And even though, like I said, in practice you'll find yourself performing this
9:43
derivative quickly and symbolically, imagining the exponent hopping down to the front,
9:47
every now and then it's nice to just step back and remember why these rules work.
9:52
Not just because it's pretty, and not just because it helps remind us that math
9:56
actually makes sense and isn't just a pile of formulas to memorize,
10:00
but because it flexes that very important muscle of thinking about derivatives in
10:04
terms of tiny nudges.
10:07
As another example, think of the function f of x equals 1 divided by x.
10:12
Now on the one hand you could just blindly try applying the power rule,
10:16
since 1 divided by x is the same as writing x to the negative 1.
10:21
That would involve letting the negative 1 hop down in front,
10:24
leaving behind 1 less than itself, which is negative 2.
10:28
But let's have some fun and see if we can reason about this geometrically,
10:31
rather than just plugging it through some formula.
10:34
The value 1 over x is asking what number multiplied by x equals 1.
10:40
So here's how I'd like to visualize it.
10:42
Imagine a little rectangular puddle of water sitting in two dimensions whose area is 1.
10:48
And let's say that its width is x, which means that the height has to be 1 over x,
10:53
since the total area of it is 1.
10:56
So if x was stretched out to 2, then that height is forced down to 1 half.
11:01
And if you increased x up to 3, then the other side has to be squished down to 1 third.
11:07
This is a nice way to think about the graph of 1 over x, by the way.
11:11
If you think of this width x of the puddle as being in the xy-plane,
11:15
then that corresponding output 1 divided by x, the height of the graph above that point,
11:20
is whatever the height of your puddle has to be to maintain an area of 1.
11:26
So with this visual in mind, for the derivative,
11:29
imagine nudging up that value of x by some tiny amount, some tiny dx.
11:34
How must the height of this rectangle change so
11:37
that the area of the puddle remains constant at 1?
11:41
That is, increasing the width by dx adds some new area to the right here.
11:46
So the puddle has to decrease in height by some d of 1 over x,
11:50
so that the area lost off of that top cancels out the area gained.
11:56
You should think of that d of 1 over x as being a negative amount,
11:59
by the way, since it's decreasing the height of the rectangle.
12:03
And you know what?
12:04
I'm going to leave the last few steps here for you,
12:07
for you to pause and ponder and work out an ultimate expression.
12:10
And once you reason out what d of 1 over x divided by dx should be,
12:14
I want you to compare it to what you would have gotten if you had just
12:17
blindly applied the power rule, purely symbolically, to x to the negative 1.
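If you would like to check your answer numerically after working it out, here is a small sketch of the puddle picture. It assumes the setup described above (width x, height 1 over x, area held at 1); the helper name `puddle_nudge` is hypothetical. It compares the area gained on the right with the area lost off the top, and prints the ratio of the height change to dx next to the value the power rule gives for x to the negative 1.

```python
def puddle_nudge(x, dx):
    """Widen a unit-area puddle from x to x + dx and report the bookkeeping.

    Returns (d_height, area_gained, area_lost).
    """
    old_height = 1 / x
    new_height = 1 / (x + dx)           # forced by the area staying at 1
    d_height = new_height - old_height  # negative: the puddle gets shorter
    area_gained = dx * new_height       # new strip on the right
    area_lost = x * -d_height           # strip shaved off the top
    return d_height, area_gained, area_lost


x = 2.0
power_rule = -1 * x**-2  # exponent hops down in front, leaving one less behind
for dx in (0.1, 0.001, 0.00001):
    d_height, gained, lost = puddle_nudge(x, dx)
    print(f"dx={dx:<8} gained={gained:.10f} lost={lost:.10f} "
          f"d(1/x)/dx={d_height / dx:.6f}  power rule: {power_rule:.6f}")
```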
12:23
And while I'm encouraging you to pause and ponder,
12:26
here's another fun challenge if you're feeling up to it.
12:29
See if you can reason through what the derivative of the square root of x should be.
12:36
To finish things off, I want to tackle one more type of function,
12:40
trigonometric functions, and in particular let's focus on the sine function.
12:45
So for this section I'm going to assume that you're already
12:48
familiar with how to think about trig functions using the unit circle,
12:51
the circle with a radius 1 centered at the origin.
12:55
For a given value of theta, like say 0.8, you imagine yourself
12:59
walking around the circle starting from the rightmost point
13:02
until you've traversed that distance of 0.8 in arc length.
13:06
This is the same thing as saying that the angle right here is exactly theta radians,
13:11
since the circle has a radius of 1.
13:14
Then what sine of theta means is the height of that point above the x-axis,
13:20
and as your theta value increases and you walk around the circle
13:24
your height bobs up and down between negative 1 and 1.
13:29
So when you graph sine of theta versus theta you get this wave pattern,
13:33
the quintessential wave pattern.
13:37
And just from looking at this graph we can start to
13:40
get a feel for the shape of the derivative of the sine.
13:44
The slope at 0 is something positive since sine of theta is increasing there,
13:48
and as we move to the right and sine of theta approaches its peak that slope goes down
13:54
to 0.
13:55
Then the slope is negative for a little while,
13:58
while the sine is decreasing before coming back up to 0 as the sine graph levels out.
14:04
And as you continue thinking this through and drawing it out,
14:07
if you're familiar with the graphs of trig functions you might guess that this
14:11
derivative graph should be exactly cosine of theta,
14:13
since all the peaks and valleys line up perfectly with where the peaks and
14:17
valleys for the cosine function should be.
14:20
And spoiler alert, the derivative is in fact the cosine of theta,
14:23
but aren't you a little curious about why it's precisely cosine of theta?
14:28
I mean you could have all sorts of functions with peaks and valleys at the same points
14:32
that have roughly the same shape, but who knows,
14:34
maybe the derivative of sine could have turned out to be some entirely new type of
14:38
function that just happens to have a similar shape.
14:41
Well just like the previous examples, a more exact understanding
14:44
of the derivative requires looking at what the function actually represents,
14:48
rather than looking at the graph of the function.
14:52
So think back to that walk around the unit circle,
14:54
having traversed an arc with length theta and thinking about sine of theta as
14:58
the height of that point.
15:01
Now zoom into that point on the circle and consider a slight nudge of d theta
15:06
along the circumference, a tiny step in your walk around the unit circle.
15:11
How much does that tiny step change the sine of theta?
15:15
How much does this increase, d theta, of arc length increase the height above the x-axis?
15:21
Well zoomed in close enough, the circle basically looks like a straight line in this
15:26
neighborhood, so let's go ahead and think of this right triangle where the hypotenuse
15:30
of that right triangle represents the nudge d theta along the circumference,
15:34
and that left side here represents the change in height, the resulting d sine of theta.
15:40
Now this tiny triangle is actually similar to this larger triangle here,
15:44
with the defining angle theta and whose hypotenuse is the radius of the circle with
15:48
length 1.
15:50
Specifically this little angle right here is precisely equal to theta radians.
15:57
Now think about what the derivative of sine is supposed to mean.
16:01
It's the ratio between that d sine of theta, the tiny change to the height,
16:05
divided by d theta, the tiny change to the input of the function.
16:10
And from the picture we can see that that's the ratio between the
16:14
length of the side adjacent to the angle theta divided by the hypotenuse.
16:18
Well let's see, adjacent divided by hypotenuse,
16:21
that's exactly what the cosine of theta means, that's the definition of the cosine.
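The similar-triangles argument can be spot-checked numerically. This sketch takes a small step d theta along the unit circle, measures the change in height, and compares the ratio with the cosine. The helper name `height_change_ratio` is hypothetical, and theta = 0.8 is borrowed from the earlier example in the transcript.

```python
import math


def height_change_ratio(theta, d_theta):
    """Ratio d(sin theta) / d theta for a small step d_theta along the unit circle."""
    height_before = math.sin(theta)
    height_after = math.sin(theta + d_theta)
    return (height_after - height_before) / d_theta


theta = 0.8
for d_theta in (0.1, 0.001, 0.00001):
    ratio = height_change_ratio(theta, d_theta)
    print(f"d_theta={d_theta:<8} ratio={ratio:.6f}  cos(theta)={math.cos(theta):.6f}")
```

With a finite step the arc is not quite a straight line, so the ratio only approaches the cosine as d theta shrinks.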
16:27
So this gives us two different really nice ways of
16:30
thinking about how the derivative of sine is cosine.
16:33
One of them is looking at the graph and getting a loose feel for the shape of
16:36
things based on thinking about the slope of the sine graph at every single point.
16:41
And the other is a more precise line of reasoning looking at the unit circle itself.
16:47
For those of you that like to pause and ponder,
16:49
see if you can try a similar line of reasoning to find what the derivative of
16:52
the cosine of theta should be.
16:56
In the next video I'll talk about how you can take derivatives
16:59
of functions that combine simple functions like these ones,
17:02
either as sums or products or function compositions, things like that.
17:06
And similar to this video the goal is going to be to understand each one
17:09
geometrically in a way that makes it intuitively reasonable and somewhat more memorable.
— end of transcript —