Derivative formulas through geometry | Chapter 3, Essence of calculus 17:33

3Blue1Brown · May 12, 2026
Transcript ~3108 words · 17:33
0:12
Now that we've seen what a derivative means and what it has to do with rates of change,
0:16
our next step is to learn how to actually compute these guys.
0:19
As in, if I give you some kind of function with an explicit formula,
0:22
you'd want to be able to find what the formula for its derivative is.
0:26
Maybe it's obvious, but I think it's worth stating explicitly why this
0:30
is an important thing to be able to do, why much of a calculus student's
0:34
time ends up going towards grappling with derivatives of abstract
0:37
functions rather than thinking about concrete rate of change problems.
0:42
It's because a lot of real-world phenomena, the sort of things that
0:45
we want to use calculus to analyze, are modeled using polynomials,
0:49
trigonometric functions, exponentials, and other pure functions like that.
0:53
So if you build up some fluency with the ideas of rates of change for those kinds of
0:58
pure abstract functions, it gives you a language to more readily talk about the rates
1:02
at which things change in concrete situations that you might be using calculus to model.
1:07
But it is way too easy for this process to feel like just memorizing a list of rules,
1:12
and if that happens, if you get that feeling, it's also easy to lose sight of the
1:16
fact that derivatives are fundamentally about just looking at tiny changes to some
1:20
quantity and how that relates to a resulting tiny change in another quantity.
1:24
So in this video and in the next one, my aim is to show you how you can think
1:28
about a few of these rules intuitively and geometrically,
1:31
and I really want to encourage you to never forget that tiny nudges are at the
1:35
heart of derivatives.
1:37
Let's start with a simple function like f of x equals x squared.
1:41
What if I asked you its derivative?
1:43
That is, if you were to look at some value x, like x equals 2,
1:47
and compare it to a value slightly bigger, just dx bigger,
1:50
what's the corresponding change in the value of the function?
1:54
df.
1:55
And in particular, what's df divided by dx, the rate
1:58
at which this function is changing per unit change in x.
2:03
As a first step for intuition, we know that you can think of this ratio
2:07
df/dx as the slope of a tangent line to the graph of x squared,
2:10
and from that you can see that the slope generally increases as x increases.
2:15
At zero, the tangent line is flat, and the slope is zero.
2:19
At x equals 1, it's something a bit steeper.
2:22
At x equals 2, it's steeper still.
2:25
But looking at graphs isn't generally the best way
2:27
to understand the precise formula for a derivative.
2:30
For that, it's best to take a more literal look at what x squared actually means,
2:34
and in this case let's go ahead and picture a square whose side length is x.
2:39
If you increase x by some tiny nudge, some little dx,
2:43
what's the resulting change in the area of that square?
2:47
That slight change in area is what df means in this context.
2:52
It's the tiny increase to the value of f of x equals x squared,
2:55
caused by increasing x by that tiny nudge dx.
2:59
Now you can see that there's three new bits of area in this diagram,
3:03
two thin rectangles and a minuscule square.
3:06
The two thin rectangles each have side lengths of x and dx,
3:10
so they account for 2 times x times dx units of new area.
3:18
For example, let's say x was 3 and dx was 0.01,
3:21
then that new area from these two thin rectangles would be 2 times 3 times 0.01,
3:25
which is 0.06, about 6 times the size of dx.
3:29
That little square there has an area of dx squared,
3:32
but you should think of that as being really tiny, negligibly tiny.
3:37
For example, if dx was 0.01, that would be only 0.0001,
3:41
and keep in mind I'm drawing dx with a fair bit of width here just so we
3:46
can actually see it, but always remember in principle,
3:49
dx should be thought of as a truly tiny amount, and for those truly tiny amounts,
3:54
a good rule of thumb is that you can ignore anything that includes a dx
3:59
raised to a power greater than 1.
4:02
That is, a tiny change squared is a negligible change.
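To make that rule of thumb concrete, here is a minimal Python sketch that splits the new area of the square into the two thin rectangles and the corner square, then shrinks dx. The function name new_area_parts and the sample values of dx are illustrative choices for this sketch, not something from the video.

```python
def new_area_parts(x, dx):
    """Return (area of the two thin rectangles, area of the corner square)
    gained when a square's side length grows from x to x + dx."""
    rectangles = 2 * x * dx
    corner = dx * dx
    return rectangles, corner


x = 3.0
for dx in (0.1, 0.01, 0.001):
    rectangles, corner = new_area_parts(x, dx)
    # Fraction of the new area that comes from the dx-squared corner.
    share = corner / (rectangles + corner)
    print(f"dx={dx}: rectangles={rectangles:.6f}, corner={corner:.6f}, corner share={share:.4%}")
```

As dx gets smaller, the corner's share of the new area keeps shrinking, which is the sense in which the dx squared term can be ignored.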
4:07
What this leaves us with is that df is just some multiple of dx, and that multiple, 2x,
4:13
which you could also write as df divided by dx, is the derivative of x squared.
4:19
For example, if you were starting at x equals 3, then as you slightly increase x,
4:24
the rate of change in the area per unit change in length added, d(x squared) over dx,
4:29
would be 2 times 3, or 6, and if instead you were starting at x equals 5,
4:34
then the rate of change would be 10 units of area per unit change in x.
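If you want to check those numbers yourself, the following Python sketch divides the actual change in area by the nudge dx for a few shrinking values of dx. The helper name rate_of_change and the chosen nudge sizes are assumptions made for this illustration.

```python
def f(x):
    """Area of a square with side length x."""
    return x ** 2


def rate_of_change(x, dx):
    """Change in area divided by the nudge dx.
    Algebraically this is 2x + dx, so it approaches 2x as dx shrinks."""
    return (f(x + dx) - f(x)) / dx


for x in (3.0, 5.0):
    for dx in (0.1, 0.001, 0.00001):
        print(f"x={x}, dx={dx}: df/dx is about {rate_of_change(x, dx):.6f}")
```

The printed ratios settle toward 6 at x equals 3 and toward 10 at x equals 5, up to floating-point rounding.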
4:41
Let's go ahead and try a different simple function, f of x equals x cubed.
4:45
This is going to be the geometric view of the stuff
4:48
that I went through algebraically in the last video.
4:51
What's nice here is that we can think of x cubed as the volume of an actual
4:55
cube whose side lengths are x, and when you increase x by a tiny nudge,
5:00
a tiny dx, the resulting increase in volume is what I have here in yellow.
5:04
That represents all the volume in a cube with side lengths x plus dx
5:08
that's not already in the original cube, the one with side length x.
5:13
It's nice to think of this new volume as broken up into multiple components,
5:18
but almost all of it comes from these three square faces,
5:22
or said a little more precisely, as dx approaches 0,
5:25
those three squares comprise a portion closer and closer to 100% of
5:30
that new yellow volume.
5:33
Each of those thin squares has a volume of x squared times dx,
5:38
the area of the face times that little thickness dx.
5:42
So in total this gives us 3x squared dx of volume change.
5:47
And to be sure there are other slivers of volume here along the edges
5:51
and that tiny one in the corner, but all of that volume is going to be
5:54
proportional to dx squared, or dx cubed, so we can safely ignore them.
5:59
Again this is ultimately because they're going to be divided by dx,
6:03
and if there's still any dx remaining then those terms aren't
6:07
going to survive the process of letting dx approach 0.
6:11
What this means is that the derivative of x cubed,
6:14
the rate at which x cubed changes per unit change of x, is 3 times x squared.
6:20
What that means in terms of graphical intuition is that the slope of
6:25
the graph of x cubed at every single point x is exactly 3x squared.
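The same bookkeeping can be done for the cube. This is a rough Python sketch, with the function name new_volume_parts chosen just for illustration, that separates the new volume into the three face slabs, the three edge slivers, and the corner piece, following the algebraic identity (x + dx) cubed minus x cubed equals 3 x squared dx plus 3 x dx squared plus dx cubed.

```python
def new_volume_parts(x, dx):
    """Split the volume gained when a cube's side grows from x to x + dx."""
    faces = 3 * x ** 2 * dx   # three thin square slabs, each x by x by dx
    edges = 3 * x * dx ** 2   # three thin bars along the edges, each x by dx by dx
    corner = dx ** 3          # one tiny cube in the corner
    return faces, edges, corner


x = 2.0
for dx in (0.1, 0.01, 0.001):
    faces, edges, corner = new_volume_parts(x, dx)
    total = faces + edges + corner
    # total / dx approaches 3 * x**2 as dx shrinks.
    print(f"dx={dx}: faces share={faces / total:.4%}, total/dx={total / dx:.6f}")
```

The faces' share of the new volume climbs toward 100% as dx shrinks, while total/dx heads toward 3 times x squared, which is 12 at x equals 2.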
6:34
And reasoning about that slope, it should make sense that this derivative is high on the
6:38
left and then 0 at the origin and then high again as you move to the right,
6:42
but just thinking in terms of the graph would never have landed us on the precise
6:47
quantity 3x squared.
6:48
For that we had to take a much more direct look at what x cubed actually means.
6:54
Now in practice you wouldn't necessarily think of the square every
6:57
time you're taking the derivative of x squared,
6:59
nor would you necessarily think of this cube whenever you're taking
7:03
the derivative of x cubed.
7:04
Both of them fall under a pretty recognizable pattern for polynomial terms.
7:09
The derivative of x to the fourth turns out to be 4x cubed,
7:13
the derivative of x to the fifth is 5x to the fourth, and so on.
7:18
Abstractly you'd write this as the derivative of x to
7:22
the n for any power n is n times x to the n minus 1.
7:27
This right here is what's known in the business as the power rule.
7:31
In practice we all quickly just get jaded and think about this symbolically as
7:35
the exponent hopping down in front, leaving behind one less than itself,
7:39
rarely pausing to think about the geometric delights that underlie these derivatives.
7:45
That's the kind of thing that happens when these tend
7:47
to fall in the middle of much longer computations.
7:50
But rather than chalking it all up to symbolic patterns,
7:53
let's just take a moment and think about why this works for powers beyond just 2 and 3.
7:58
When you nudge that input x, increasing it slightly to x plus dx,
8:02
working out the exact value of that nudged output would involve
8:06
multiplying together these n separate x plus dx terms.
8:11
The full expansion would be really complicated,
8:13
but part of the point of derivatives is that most of that complication can be ignored.
8:19
The first term in your expansion is x to the n.
8:22
This is analogous to the area of the original square,
8:25
or the volume of the original cube from our previous examples.
8:30
For the next terms in the expansion you can choose mostly x's with a single dx.
8:41
Since there are n different parentheticals from which you could have chosen
8:46
that single dx, this gives us n separate terms,
8:50
all of which include n minus 1 x's times a dx,
8:53
giving a value of x to the power n minus 1 times dx.
8:57
This is analogous to how the majority of the new area in the square came from those
9:02
two bars, each with area x times dx, or how the bulk of the new volume in the cube
9:07
came from those three thin squares, each of which had a volume of x squared times dx.
9:14
There will be many other terms of this expansion,
9:17
but all of them are just going to be some multiple of dx squared,
9:21
so we can safely ignore them, and what that means is that all but a
9:25
negligible portion of the increase in the output comes from n copies of
9:29
this x to the n minus 1 times dx.
9:31
That's what it means for the derivative of x to the n to be n times x to the n minus 1.
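Here is one way to see that counting argument numerically. It's a sketch that groups the terms of the expansion of (x + dx) to the n by how many dx factors were chosen, using Python's math.comb for the number of ways to choose them. The function name expansion_terms and the sample values x = 2, dx = 0.001, n = 5 are assumptions made for the example.

```python
from math import comb


def expansion_terms(x, dx, n):
    """Terms of (x + dx)**n, where entry k collects every product
    that picks dx from exactly k of the n parentheticals."""
    return [comb(n, k) * x ** (n - k) * dx ** k for k in range(n + 1)]


x, dx, n = 2.0, 0.001, 5
terms = expansion_terms(x, dx, n)

change = sum(terms[1:])   # everything except the original x**n
leading = terms[1]        # n copies of x**(n-1) * dx

print("change / dx      :", change / dx)
print("n * x**(n-1)     :", n * x ** (n - 1))
print("leading / change :", leading / change)
```

The terms with two or more dx factors make up only a sliver of the total change, so change / dx sits very close to n times x to the n minus 1.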
9:38
And even though, like I said, in practice you'll find yourself performing this
9:43
derivative quickly and symbolically, imagining the exponent hopping down to the front,
9:47
every now and then it's nice to just step back and remember why these rules work.
9:52
Not just because it's pretty, and not just because it helps remind us that math
9:56
actually makes sense and isn't just a pile of formulas to memorize,
10:00
but because it flexes that very important muscle of thinking about derivatives in
10:04
terms of tiny nudges.
10:07
As another example, think of the function f of x equals 1 divided by x.
10:12
Now on the one hand you could just blindly try applying the power rule,
10:16
since 1 divided by x is the same as writing x to the negative 1.
10:21
That would involve letting the negative 1 hop down in front,
10:24
leaving behind 1 less than itself, which is negative 2.
10:28
But let's have some fun and see if we can reason about this geometrically,
10:31
rather than just plugging it through some formula.
10:34
The value 1 over x is asking what number multiplied by x equals 1.
10:40
So here's how I'd like to visualize it.
10:42
Imagine a little rectangular puddle of water sitting in two dimensions whose area is 1.
10:48
And let's say that its width is x, which means that the height has to be 1 over x,
10:53
since the total area of it is 1.
10:56
So if x was stretched out to 2, then that height is forced down to 1 half.
11:01
And if you increased x up to 3, then the other side has to be squished down to 1 third.
11:07
This is a nice way to think about the graph of 1 over x, by the way.
11:11
If you think of this width x of the puddle as being in the xy-plane,
11:15
then that corresponding output 1 divided by x, the height of the graph above that point,
11:20
is whatever the height of your puddle has to be to maintain an area of 1.
11:26
So with this visual in mind, for the derivative,
11:29
imagine nudging up that value of x by some tiny amount, some tiny dx.
11:34
How must the height of this rectangle change so
11:37
that the area of the puddle remains constant at 1?
11:41
That is, increasing the width by dx adds some new area to the right here.
11:46
So the puddle has to decrease in height by some d of 1 over x,
11:50
so that the area lost off of that top cancels out the area gained.
11:56
You should think of that d of 1 over x as being a negative amount,
11:59
by the way, since it's decreasing the height of the rectangle.
12:03
And you know what?
12:04
I'm going to leave the last few steps here for you,
12:07
for you to pause and ponder and work out an ultimate expression.
12:10
And once you reason out what d of 1 over x divided by dx should be,
12:14
I want you to compare it to what you would have gotten if you had just
12:17
blindly applied the power rule, purely symbolically, to x to the negative 1.
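Once you've worked out your own expression, a numerical comparison like the one below is one way to test it. This sketch computes how much the puddle's height must change to keep the area at 1, divides by dx, and sets that next to the power rule applied symbolically to x to the negative 1. The names height_change and power_rule_guess and the nudge size 1e-6 are illustrative assumptions.

```python
def height_change(x, dx):
    """Change in puddle height needed so that width times height stays 1
    when the width grows from x to x + dx. This value is negative."""
    return 1 / (x + dx) - 1 / x


def power_rule_guess(x):
    """The power rule applied symbolically to x**(-1):
    the -1 hops down in front and leaves behind -2."""
    return -1 * x ** (-2)


dx = 1e-6
for x in (1.0, 2.0, 3.0):
    ratio = height_change(x, dx) / dx
    print(f"x={x}: d(1/x)/dx is about {ratio:.6f}, power rule gives {power_rule_guess(x):.6f}")
```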
12:23
And while I'm encouraging you to pause and ponder,
12:26
here's another fun challenge if you're feeling up to it.
12:29
See if you can reason through what the derivative of the square root of x should be.
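For the square root challenge, a similar numerical check can tell you whether your reasoning landed in the right place. This sketch assumes you treat the square root of x as x to the one half and compares the nudge ratio against the power rule with n equal to one half. The function names and sample points are made up for this illustration, and it doesn't replace the geometric argument.

```python
import math


def sqrt_rate(x, dx=1e-6):
    """Change in sqrt(x) divided by a small nudge dx."""
    return (math.sqrt(x + dx) - math.sqrt(x)) / dx


def power_rule_guess(x, n=0.5):
    """The power rule n * x**(n - 1), here with n = 1/2 (an assumption of this sketch)."""
    return n * x ** (n - 1)


for x in (1.0, 4.0, 9.0):
    print(f"x={x}: nudge ratio is about {sqrt_rate(x):.6f}, power rule gives {power_rule_guess(x):.6f}")
```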
12:36
To finish things off, I want to tackle one more type of function,
12:40
trigonometric functions, and in particular let's focus on the sine function.
12:45
So for this section I'm going to assume that you're already
12:48
familiar with how to think about trig functions using the unit circle,
12:51
the circle with a radius 1 centered at the origin.
12:55
For a given value of theta, like say 0.8, you imagine yourself
12:59
walking around the circle starting from the rightmost point
13:02
until you've traversed that distance of 0.8 in arc length.
13:06
This is the same thing as saying that the angle right here is exactly theta radians,
13:11
since the circle has a radius of 1.
13:14
Then what sine of theta means is the height of that point above the x-axis,
13:20
and as your theta value increases and you walk around the circle
13:24
your height bobs up and down between negative 1 and 1.
13:29
So when you graph sine of theta versus theta you get this wave pattern,
13:33
the quintessential wave pattern.
13:37
And just from looking at this graph we can start to
13:40
get a feel for the shape of the derivative of the sine.
13:44
The slope at 0 is something positive since sine of theta is increasing there,
13:48
and as we move to the right and sine of theta approaches its peak that slope goes down
13:54
to 0.
13:55
Then the slope is negative for a little while,
13:58
while the sine is decreasing before coming back up to 0 as the sine graph levels out.
14:04
And as you continue thinking this through and drawing it out,
14:07
if you're familiar with the graph of trig functions you might guess that this
14:11
derivative graph should be exactly cosine of theta,
14:13
since all the peaks and valleys line up perfectly with where the peaks and
14:17
valleys for the cosine function should be.
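You can test that guess at a few landmark points before doing any geometry. The sketch below estimates the slope of the sine graph with a small nudge and prints it next to the cosine. The helper name sine_slope and the nudge size are assumptions of this example.

```python
import math


def sine_slope(theta, dtheta=1e-6):
    """Approximate slope of the sine graph at theta using a small nudge."""
    return (math.sin(theta + dtheta) - math.sin(theta)) / dtheta


# Start, peak, downward crossing, and valley of the sine wave.
for theta in (0.0, math.pi / 2, math.pi, 3 * math.pi / 2):
    print(f"theta={theta:.4f}: slope={sine_slope(theta):+.4f}, cos={math.cos(theta):+.4f}")
```

Agreement at sample points like these supports the guess, but it doesn't explain why the match is exact.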
14:20
And spoiler alert, the derivative is in fact the cosine of theta,
14:23
but aren't you a little curious about why it's precisely cosine of theta?
14:28
I mean you could have all sorts of functions with peaks and valleys at the same points
14:32
that have roughly the same shape, but who knows,
14:34
maybe the derivative of sine could have turned out to be some entirely new type of
14:38
function that just happens to have a similar shape.
14:41
Well just like the previous examples, a more exact understanding
14:44
of the derivative requires looking at what the function actually represents,
14:48
rather than looking at the graph of the function.
14:52
So think back to that walk around the unit circle,
14:54
having traversed an arc with length theta and thinking about sine of theta as
14:58
the height of that point.
15:01
Now zoom into that point on the circle and consider a slight nudge of d theta
15:06
along the circumference, a tiny step in your walk around the unit circle.
15:11
How much does that tiny step change the sine of theta?
15:15
How much does this increase, d theta, of arc length increase the height above the x-axis?
15:21
Well zoomed in close enough, the circle basically looks like a straight line in this
15:26
neighborhood, so let's go ahead and think of this right triangle where the hypotenuse
15:30
of that right triangle represents the nudge d theta along the circumference,
15:34
and that left side here represents the change in height, the resulting d sine of theta.
15:40
Now this tiny triangle is actually similar to this larger triangle here,
15:44
with the defining angle theta and whose hypotenuse is the radius of the circle with
15:48
length 1.
15:50
Specifically this little angle right here is precisely equal to theta radians.
15:57
Now think about what the derivative of sine is supposed to mean.
16:01
It's the ratio between that d sine of theta, the tiny change to the height,
16:05
divided by d theta, the tiny change to the input of the function.
16:10
And from the picture we can see that that's the ratio between the
16:14
length of the side adjacent to the angle theta divided by the hypotenuse.
16:18
Well let's see, adjacent divided by hypotenuse,
16:21
that's exactly what the cosine of theta means, that's the definition of the cosine.
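The little triangle can also be built numerically. In this sketch, the two points on the unit circle before and after the step form the hypotenuse, and the code checks that the hypotenuse is nearly d theta, that the triangle's angle is nearly theta, and that the vertical side over the hypotenuse is nearly the cosine. The value theta = 0.8 comes from the earlier example; the step size 1e-5 and the variable names are assumptions.

```python
import math

theta, dtheta = 0.8, 1e-5

# Positions on the unit circle before and after the tiny step.
x0, y0 = math.cos(theta), math.sin(theta)
x1, y1 = math.cos(theta + dtheta), math.sin(theta + dtheta)

rise = y1 - y0                       # d(sin theta), the vertical side
run = x0 - x1                        # horizontal side (the walker drifts left here)
hypotenuse = math.hypot(run, rise)   # straight-line stand-in for the arc d theta

# Angle between the hypotenuse and the vertical side of the small triangle.
small_angle = math.atan2(run, rise)

print("hypotenuse / dtheta:", hypotenuse / dtheta)
print("small angle vs theta:", small_angle, theta)
print("rise / hypotenuse vs cos(theta):", rise / hypotenuse, math.cos(theta))
```

The straight-line hypotenuse is only an approximation of the arc, but the mismatch shrinks as d theta does.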
16:27
So this gives us two different really nice ways of
16:30
thinking about how the derivative of sine is cosine.
16:33
One of them is looking at the graph and getting a loose feel for the shape of
16:36
things based on thinking about the slope of the sine graph at every single point.
16:41
And the other is a more precise line of reasoning looking at the unit circle itself.
16:47
For those of you that like to pause and ponder,
16:49
see if you can try a similar line of reasoning to find what the derivative of
16:52
the cosine of theta should be.
16:56
In the next video I'll talk about how you can take derivatives
16:59
of functions that combine simple functions like these ones,
17:02
either as sums or products or function compositions, things like that.
17:06
And similar to this video the goal is going to be to understand each one
17:09
geometrically in a way that makes it intuitively reasonable and somewhat more memorable.
— end of transcript —