Derivative formulas through geometry | Chapter 3, Essence of calculus

3Blue1Brown · May 12, 2026

Transcript ~3108 words · 17:33

0:12

Now that we've seen what a derivative means and what it has to do with rates of change,

0:16

our next step is to learn how to actually compute these guys.

0:19

As in, if I give you some kind of function with an explicit formula,

0:22

you'd want to be able to find what the formula for its derivative is.

0:26

Maybe it's obvious, but I think it's worth stating explicitly why this

0:30

is an important thing to be able to do, why much of a calculus student's

0:34

time ends up going towards grappling with derivatives of abstract

0:37

functions rather than thinking about concrete rate of change problems.

0:42

It's because a lot of real-world phenomena, the sort of things that

0:45

we want to use calculus to analyze, are modeled using polynomials,

0:49

trigonometric functions, exponentials, and other pure functions like that.

0:53

So if you build up some fluency with the ideas of rates of change for those kinds of

0:58

pure abstract functions, it gives you a language to more readily talk about the rates

1:02

at which things change in concrete situations that you might be using calculus to model.

1:07

But it is way too easy for this process to feel like just memorizing a list of rules,

1:12

and if that happens, if you get that feeling, it's also easy to lose sight of the

1:16

fact that derivatives are fundamentally about just looking at tiny changes to some

1:20

quantity and how that relates to a resulting tiny change in another quantity.

1:24

So in this video and in the next one, my aim is to show you how you can think

1:28

about a few of these rules intuitively and geometrically,

1:31

and I really want to encourage you to never forget that tiny nudges are at the

1:35

heart of derivatives.

1:37

Let's start with a simple function like f of x equals x squared.

1:41

What if I asked you its derivative?

1:43

That is, if you were to look at some value x, like x equals 2,

1:47

and compare it to a value slightly bigger, just dx bigger,

1:50

what's the corresponding change in the value of the function?

1:54

dF.

1:55

And in particular, what's dF divided by dx, the rate

1:58

at which this function is changing per unit change in x.

2:03

As a first step for intuition, we know that you can think of this ratio

2:07

dF dx as the slope of a tangent line to the graph of x squared,

2:10

and from that you can see that the slope generally increases as x increases.

2:15

At zero, the tangent line is flat, and the slope is zero.

2:19

At x equals 1, it's something a bit steeper.

2:22

At x equals 2, it's steeper still.

2:25

But looking at graphs isn't generally the best way

2:27

to understand the precise formula for a derivative.

2:30

For that, it's best to take a more literal look at what x squared actually means,

2:34

and in this case let's go ahead and picture a square whose side length is x.

2:39

If you increase x by some tiny nudge, some little dx,

2:43

what's the resulting change in the area of that square?

2:47

That slight change in area is what dF means in this context.

2:52

It's the tiny increase to the value of f of x equals x squared,

2:55

caused by increasing x by that tiny nudge dx.

2:59

Now you can see that there's three new bits of area in this diagram,

3:03

two thin rectangles and a minuscule square.

3:06

The two thin rectangles each have side lengths of x and dx,

3:10

so they account for 2 times x times dx units of new area.

3:18

For example, let's say x was 3 and dx was 0.01,

3:21

then that new area from these two thin rectangles would be 2 times 3 times 0.01,

3:25

which is 0.06, about 6 times the size of dx.

3:29

That little square there has an area of dx squared,

3:32

but you should think of that as being really tiny, negligibly tiny.

3:37

For example, if dx was 0.01, that would be only 0.0001,

3:41

and keep in mind I'm drawing dx with a fair bit of width here just so we

3:46

can actually see it, but always remember in principle,

3:49

dx should be thought of as a truly tiny amount, and for those truly tiny amounts,

3:54

a good rule of thumb is that you can ignore anything that includes a dx

3:59

raised to a power greater than 1.

4:02

That is, a tiny change squared is a negligible change.

4:07

What this leaves us with is that dF is just some multiple of dx, and that multiple, 2x,

4:13

which you could also write as dF divided by dx, is the derivative of x squared.

4:19

For example, if you were starting at x equals 3, then as you slightly increase x,

4:24

the rate of change in the area per unit change in length added, d of x squared over dx,

4:29

would be 2 times 3, or 6, and if instead you were starting at x equals 5,

4:34

then the rate of change would be 10 units of area per unit change in x.
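
As a quick numerical check of the square picture, here is a minimal Python sketch. The function name `square_area_change` is made up for illustration; it simply splits the new area into the two thin rectangles and the corner square described above, then shows that dF divided by dx settles toward 2x as dx shrinks.

```python
def square_area_change(x, dx):
    """Split the exact change in x**2 into the pieces from the diagram."""
    rectangles = 2 * x * dx   # two thin rectangles, each x by dx
    corner = dx ** 2          # the minuscule corner square
    return rectangles, corner


for dx in (0.1, 0.01, 0.001):
    rectangles, corner = square_area_change(3.0, dx)
    ratio = (rectangles + corner) / dx   # dF / dx, which equals 2x + dx
    print(f"dx={dx}: rectangles={rectangles:.6f}, "
          f"corner={corner:.6f}, dF/dx={ratio:.4f}")
```

With x equal to 3, the corner's contribution to the ratio is just dx itself, which is why it fades away while the 6 from the rectangles stays put.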

4:41

Let's go ahead and try a different simple function, f of x equals x cubed.

4:45

This is going to be the geometric view of the stuff

4:48

that I went through algebraically in the last video.

4:51

What's nice here is that we can think of x cubed as the volume of an actual

4:55

cube whose side lengths are x, and when you increase x by a tiny nudge,

5:00

a tiny dx, the resulting increase in volume is what I have here in yellow.

5:04

That represents all the volume in a cube with side lengths x plus dx

5:08

that's not already in the original cube, the one with side length x.

5:13

It's nice to think of this new volume as broken up into multiple components,

5:18

but almost all of it comes from these three square faces,

5:22

or said a little more precisely, as dx approaches 0,

5:25

those three squares comprise a portion closer and closer to 100% of

5:30

that new yellow volume.

5:33

Each of those thin squares has a volume of x squared times dx,

5:38

the area of the face times that little thickness dx.

5:42

So in total this gives us 3x squared dx of volume change.

5:47

And to be sure there are other slivers of volume here along the edges

5:51

and that tiny one in the corner, but all of that volume is going to be

5:54

proportional to dx squared, or dx cubed, so we can safely ignore them.

5:59

Again this is ultimately because they're going to be divided by dx,

6:03

and if there's still any dx remaining then those terms aren't

6:07

going to survive the process of letting dx approach 0.

6:11

What this means is that the derivative of x cubed,

6:14

the rate at which x cubed changes per unit change of x, is 3 times x squared.

6:20

What that means in terms of graphical intuition is that the slope of

6:25

the graph of x cubed at every single point x is exactly 3x squared.
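
The same bookkeeping can be sketched for the cube. This is an illustrative Python snippet, not something from the video: the name `cube_volume_change` is hypothetical, and the exact sizes of the edge slivers (3 times x times dx squared) and the corner piece (dx cubed) are filled in from the standard expansion of (x + dx) cubed, since the narration only says they are proportional to dx squared or dx cubed.

```python
def cube_volume_change(x, dx):
    """Split the exact change in x**3 into faces, edge slivers, and corner."""
    faces = 3 * x**2 * dx    # three thin square slabs, each x by x by dx
    edges = 3 * x * dx**2    # three slivers along the edges (assumed from the expansion)
    corner = dx**3           # the tiny cube in the corner
    return faces, edges, corner


x = 2.0
for dx in (0.1, 0.01, 0.001):
    faces, edges, corner = cube_volume_change(x, dx)
    total = faces + edges + corner
    print(f"dx={dx}: faces are {faces / total:.4%} of the new volume, "
          f"dV/dx={total / dx:.4f}")
```

As dx shrinks, the share of new volume coming from the three faces climbs toward 100%, and the ratio approaches 3 times x squared, which is 12 at x equals 2.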

6:34

And reasoning about that slope, it should make sense that this derivative is high on the

6:38

left and then 0 at the origin and then high again as you move to the right,

6:42

but just thinking in terms of the graph would never have landed us on the precise

6:47

quantity 3x squared.

6:48

For that we had to take a much more direct look at what x cubed actually means.

6:54

Now in practice you wouldn't necessarily think of the square every

6:57

time you're taking the derivative of x squared,

6:59

nor would you necessarily think of this cube whenever you're taking

7:03

the derivative of x cubed.

7:04

Both of them fall under a pretty recognizable pattern for polynomial terms.

7:09

The derivative of x to the fourth turns out to be 4x cubed,

7:13

the derivative of x to the fifth is 5x to the fourth, and so on.

7:18

Abstractly you'd write this as the derivative of x to

7:22

the n for any power n is n times x to the n minus 1.

7:27

This right here is what's known in the business as the power rule.

7:31

In practice we all quickly just get jaded and think about this symbolically as

7:35

the exponent hopping down in front, leaving behind one less than itself,

7:39

rarely pausing to think about the geometric delights that underlie these derivatives.

7:45

That's the kind of thing that happens when these tend

7:47

to fall in the middle of much longer computations.

7:50

But rather than chalking it all up to symbolic patterns,

7:53

let's just take a moment and think about why this works for powers beyond just 2 and 3.

7:58

When you nudge that input x, increasing it slightly to x plus dx,

8:02

working out the exact value of that nudged output would involve

8:06

multiplying together these n separate x plus dx terms.

8:11

The full expansion would be really complicated,

8:13

but part of the point of derivatives is that most of that complication can be ignored.

8:19

The first term in your expansion is x to the n.

8:22

This is analogous to the area of the original square,

8:25

or the volume of the original cube from our previous examples.

8:30

For the next terms in the expansion you can choose mostly x's with a single dx.

8:41

Since there are n different parentheticals from which you could have chosen

8:46

that single dx, this gives us n separate terms,

8:50

all of which include n minus 1 x's times a dx,

8:53

giving a value of x to the power n minus 1 times dx.

8:57

This is analogous to how the majority of the new area in the square came from those

9:02

two bars, each with area x times dx, or how the bulk of the new volume in the cube

9:07

came from those three thin squares, each of which had a volume of x squared times dx.

9:14

There will be many other terms of this expansion,

9:17

but all of them are just going to be some multiple of dx squared,

9:21

so we can safely ignore them, and what that means is that all but a

9:25

negligible portion of the increase in the output comes from n copies of

9:29

this x to the n minus 1 times dx.

9:31

That's what it means for the derivative of x to the n to be n times x to the n minus 1.
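
To see that counting argument in numbers, here is a small Python sketch. It groups the terms of (x + dx) to the n by how many dx factors were chosen, using the binomial coefficient `math.comb`; the helper name `expansion_terms` and the sample values of x, dx, and n are assumptions made for illustration.

```python
from math import comb


def expansion_terms(x, dx, n):
    """Terms of (x + dx)**n, indexed by how many dx factors were chosen."""
    return [comb(n, k) * x ** (n - k) * dx**k for k in range(n + 1)]


x, dx, n = 2.0, 0.001, 5
terms = expansion_terms(x, dx, n)

d_output = sum(terms[1:])   # everything except the original x**n
leading = terms[1]          # the n copies of x**(n-1) * dx

print("n * x**(n-1)        :", n * x ** (n - 1))
print("single-dx terms / dx:", leading / dx)
print("full change / dx    :", d_output / dx)
```

The last two printed values differ only by the contribution of terms carrying dx squared or higher, and that gap shrinks as dx does.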

9:38

And even though, like I said, in practice you'll find yourself performing this

9:43

derivative quickly and symbolically, imagining the exponent hopping down to the front,

9:47

every now and then it's nice to just step back and remember why these rules work.

9:52

Not just because it's pretty, and not just because it helps remind us that math

9:56

actually makes sense and isn't just a pile of formulas to memorize,

10:00

but because it flexes that very important muscle of thinking about derivatives in

10:04

terms of tiny nudges.

10:07

As another example, think of the function f of x equals 1 divided by x.

10:12

Now on the one hand you could just blindly try applying the power rule,

10:16

since 1 divided by x is the same as writing x to the negative 1.

10:21

That would involve letting the negative 1 hop down in front,

10:24

leaving behind 1 less than itself, which is negative 2.

10:28

But let's have some fun and see if we can reason about this geometrically,

10:31

rather than just plugging it through some formula.

10:34

The value 1 over x is asking what number multiplied by x equals 1.

10:40

So here's how I'd like to visualize it.

10:42

Imagine a little rectangular puddle of water sitting in two dimensions whose area is 1.

10:48

And let's say that its width is x, which means that the height has to be 1 over x,

10:53

since the total area of it is 1.

10:56

So if x was stretched out to 2, then that height is forced down to 1 half.

11:01

And if you increased x up to 3, then the other side has to be squished down to 1 third.

11:07

This is a nice way to think about the graph of 1 over x, by the way.

11:11

If you think of this width x of the puddle as being in the xy-plane,

11:15

then that corresponding output 1 divided by x, the height of the graph above that point,

11:20

is whatever the height of your puddle has to be to maintain an area of 1.

11:26

So with this visual in mind, for the derivative,

11:29

imagine nudging up that value of x by some tiny amount, some tiny dx.

11:34

How must the height of this rectangle change so

11:37

that the area of the puddle remains constant at 1?

11:41

That is, increasing the width by dx adds some new area to the right here.

11:46

So the puddle has to decrease in height by some d 1 over x,

11:50

so that the area lost off of that top cancels out the area gained.

11:56

You should think of that d 1 over x as being a negative amount,

11:59

by the way, since it's decreasing the height of the rectangle.

12:03

And you know what?

12:04

I'm going to leave the last few steps here for you,

12:07

for you to pause and ponder and work out an ultimate expression.

12:10

And once you reason out what d of 1 over x divided by dx should be,

12:14

I want you to compare it to what you would have gotten if you had just

12:17

blindly applied the power rule, purely symbolically, to x to the negative 1.
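
If you'd like to check your answer numerically after working it out, one possible sketch in Python is below. It does not do the geometric reasoning for you: it only computes how much the puddle's height must change to keep the area at 1, divides by dx, and prints that next to the symbolic power-rule result. The function name `height_change` is hypothetical.

```python
def height_change(x, dx):
    """Change in height needed to keep a puddle of width x at area 1."""
    return 1.0 / (x + dx) - 1.0 / x


x = 2.0
symbolic = -1 * x ** (-2)   # power rule applied blindly to x**(-1)
for dx in (0.1, 0.01, 0.001):
    geometric = height_change(x, dx) / dx
    print(f"dx={dx}: d(1/x)/dx is about {geometric:.6f}, "
          f"power rule gives {symbolic:.6f}")
```

Note that `height_change` comes out negative, matching the remark that d of 1 over x should be thought of as a negative amount.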

12:23

And while I'm encouraging you to pause and ponder,

12:26

here's another fun challenge if you're feeling up to it.

12:29

See if you can reason through what the derivative of the square root of x should be.

12:36

To finish things off, I want to tackle one more type of function,

12:40

trigonometric functions, and in particular let's focus on the sine function.

12:45

So for this section I'm going to assume that you're already

12:48

familiar with how to think about trig functions using the unit circle,

12:51

the circle with a radius 1 centered at the origin.

12:55

For a given value of theta, like say 0.8, you imagine yourself

12:59

walking around the circle starting from the rightmost point

13:02

until you've traversed that distance of 0.8 in arc length.

13:06

This is the same thing as saying that the angle right here is exactly theta radians,

13:11

since the circle has a radius of 1.

13:14

Then what sine of theta means is the height of that point above the x-axis,

13:20

and as your theta value increases and you walk around the circle

13:24

your height bobs up and down between negative 1 and 1.

13:29

So when you graph sine of theta versus theta you get this wave pattern,

13:33

the quintessential wave pattern.

13:37

And just from looking at this graph we can start to

13:40

get a feel for the shape of the derivative of the sine.

13:44

The slope at 0 is something positive since sine of theta is increasing there,

13:48

and as we move to the right and sine of theta approaches its peak that slope goes down

13:54

to 0.

13:55

Then the slope is negative for a little while,

13:58

while the sine is decreasing before coming back up to 0 as the sine graph levels out.

14:04

And as you continue thinking this through and drawing it out,

14:07

if you're familiar with the graph of trig functions you might guess that this

14:11

derivative graph should be exactly cosine of theta,

14:13

since all the peaks and valleys line up perfectly with where the peaks and

14:17

valleys for the cosine function should be.

14:20

And spoiler alert, the derivative is in fact the cosine of theta,

14:23

but aren't you a little curious about why it's precisely cosine of theta?

14:28

I mean you could have all sorts of functions with peaks and valleys at the same points

14:32

that have roughly the same shape, but who knows,

14:34

maybe the derivative of sine could have turned out to be some entirely new type of

14:38

function that just happens to have a similar shape.

14:41

Well just like the previous examples, a more exact understanding

14:44

of the derivative requires looking at what the function actually represents,

14:48

rather than looking at the graph of the function.

14:52

So think back to that walk around the unit circle,

14:54

having traversed an arc with length theta and thinking about sine of theta as

14:58

the height of that point.

15:01

Now zoom into that point on the circle and consider a slight nudge of d theta

15:06

along the circumference, a tiny step in your walk around the unit circle.

15:11

How much does that tiny step change the sine of theta?

15:15

How much does this increase of d theta in arc length increase the height above the x-axis?

15:21

Well zoomed in close enough, the circle basically looks like a straight line in this

15:26

neighborhood, so let's go ahead and think of this right triangle where the hypotenuse

15:30

of that right triangle represents the nudge d theta along the circumference,

15:34

and that left side here represents the change in height, the resulting d sine of theta.

15:40

Now this tiny triangle is actually similar to this larger triangle here,

15:44

with the defining angle theta and whose hypotenuse is the radius of the circle with

15:48

length 1.

15:50

Specifically this little angle right here is precisely equal to theta radians.

15:57

Now think about what the derivative of sine is supposed to mean.

16:01

It's the ratio between that d sine of theta, the tiny change to the height,

16:05

divided by d theta, the tiny change to the input of the function.

16:10

And from the picture we can see that that's the ratio between the

16:14

length of the side adjacent to the angle theta divided by the hypotenuse.

16:18

Well let's see, adjacent divided by hypotenuse,

16:21

that's exactly what the cosine of theta means, that's the definition of the cosine.
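
The similar-triangles argument can be sanity-checked with a few lines of Python. This is only a numerical sketch under the assumption that a small but finite d theta stands in for the tiny nudge; the helper name `sine_nudge_ratio` is made up, and theta equals 0.8 is borrowed from the earlier unit-circle example.

```python
import math


def sine_nudge_ratio(theta, d_theta):
    """Change in height on the unit circle per unit of arc length walked."""
    return (math.sin(theta + d_theta) - math.sin(theta)) / d_theta


theta = 0.8
for d_theta in (0.1, 0.01, 0.001):
    ratio = sine_nudge_ratio(theta, d_theta)
    print(f"d_theta={d_theta}: ratio={ratio:.6f}, cos(theta)={math.cos(theta):.6f}")
```

The ratio creeps toward the cosine of theta as the nudge gets smaller, which is what the picture of the tiny triangle predicts.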

16:27

So this gives us two different really nice ways of

16:30

thinking about how the derivative of sine is cosine.

16:33

One of them is looking at the graph and getting a loose feel for the shape of

16:36

things based on thinking about the slope of the sine graph at every single point.

16:41

And the other is a more precise line of reasoning looking at the unit circle itself.

16:47

For those of you that like to pause and ponder,

16:49

see if you can try a similar line of reasoning to find what the derivative of

16:52

the cosine of theta should be.

16:56

In the next video I'll talk about how you can take derivatives

16:59

of functions that combine simple functions like these ones,

17:02

either as sums or products or function compositions, things like that.

17:06

And similar to this video the goal is going to be to understand each one

17:09

geometrically in a way that makes it intuitively reasonable and somewhat more memorable.

— end of transcript —
