Visualizing the chain rule and product rule | Chapter 4, Essence of calculus 15:56

3Blue1Brown · May 12, 2026
Transcript ~2554 words · 15:56
0:14
In the last videos I talked about the derivatives of simple functions,
0:18
and the goal was to have a clear picture or intuition to hold in
0:22
your mind that actually explains where these formulas come from.
0:26
But most of the functions you deal with in modeling the world involve mixing,
0:31
combining, or tweaking these simple functions in some other way,
0:35
so our next step is to understand how you take derivatives of more complicated
0:39
combinations.
0:41
Again, I don't want these to be something to memorize,
0:43
I want you to have a clear picture in mind for where each one comes from.
0:49
Now, this really boils down into three basic ways to combine functions.
0:54
You can add them together, you can multiply them,
0:56
and you can throw one inside the other, known as composing them.
1:00
Sure, you could say subtracting them, but really that's just
1:03
multiplying the second by negative one and adding them together.
1:08
Likewise, dividing functions doesn't really add anything,
1:11
because that's the same as plugging one of them inside the function one over x,
1:14
and then multiplying the two together.
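Written out symbolically, the two reductions just described look roughly like this. The helper name r for the one-over-x function is an assumption made for illustration, not something named in the video.

```latex
\[
f(x) - g(x) = f(x) + (-1)\cdot g(x),
\qquad
\frac{f(x)}{g(x)} = f(x)\cdot r\bigl(g(x)\bigr),
\quad \text{where } r(u) = \frac{1}{u}.
\]
```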
1:17
So really, most functions you come across just involve layering
1:20
together these three different types of combinations,
1:23
though there's not really a bound on how monstrous things can become.
1:27
But as long as you know how derivatives play with just those three combination types,
1:31
you'll always be able to take it step by step and peel through
1:34
the layers for any kind of monstrous expression.
1:38
So the question is, if you know the derivative of two functions,
1:42
what is the derivative of their sum, of their product,
1:45
and of the function composition between them?
1:50
The sum rule is easiest, if somewhat tongue-twisting to say out loud.
1:54
The derivative of a sum of two functions is the sum of their derivatives.
1:59
But it's worth warming up with this example by really thinking through
2:03
what it means to take a derivative of a sum of two functions,
2:07
since the derivative patterns for products and function composition won't
2:11
be so straightforward, and they're going to require this kind of deeper thinking.
2:16
For example, let's think about this function f of x equals sine of x plus x squared.
2:22
It's a function where, for every input, you add together
2:25
the values of sine of x and x squared at that point.
2:29
For example, let's say at x equals 0.5, the height of the sine
2:34
graph is given by this vertical bar, and the height of the x
2:38
squared parabola is given by this slightly smaller vertical bar.
2:44
And their sum is the length you get by just stacking them together.
2:48
For the derivative, you want to ask what happens as you nudge that input slightly,
2:53
maybe increasing it up to 0.5 plus dx.
2:57
The difference in the value of f between those two places is what we call df.
3:04
And when you picture it like this, I think you'll agree that the total
3:08
change in the height is whatever the change to the sine graph is,
3:13
what we might call d sine of x, plus whatever the change to x squared is, d(x squared).
3:22
We know that the derivative of sine is cosine, and remember what that means.
3:27
It means that this little change, d sine of x, is about cosine of x times dx.
3:33
It's proportional to the size of our initial nudge dx,
3:37
and the proportionality constant equals cosine of whatever input we started at.
3:43
Likewise, because the derivative of x squared is 2x,
3:48
the change in the height of the x squared graph is 2x times whatever dx was.
3:55
So rearranging, df divided by dx, the ratio of the tiny change to
4:00
the sum function to the tiny change in x that caused it,
4:04
is indeed cosine of x plus 2x, the sum of the derivatives of its parts.
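As a rough numerical sanity check of the sum rule at x = 0.5, here is a minimal Python sketch. The nudge size 1e-6 and the variable names are arbitrary choices for illustration, not anything from the video.

```python
import math

def f(x):
    # The sum function from the example: sin(x) + x^2
    return math.sin(x) + x**2

x = 0.5
dx = 1e-6  # a small nudge; the exact size is an arbitrary assumption

df = f(x + dx) - f(x)                    # change in the stacked height
d_sin = math.sin(x + dx) - math.sin(x)   # change in the sine bar
d_sq = (x + dx)**2 - x**2                # change in the x^2 bar

ratio = df / dx                          # the ratio df/dx for this nudge
predicted = math.cos(x) + 2 * x          # sum of the two derivatives

print("df            =", df)
print("d_sin + d_sq  =", d_sin + d_sq)
print("df/dx         =", ratio)
print("cos(x) + 2x   =", predicted)
```

The first two printed values should agree up to floating-point rounding, and the last two should agree more and more closely as dx shrinks.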
4:11
But like I said, things are a bit different for products,
4:15
and let's think through why in terms of tiny nudges again.
4:20
In this case, I don't think graphs are our best bet for visualizing things.
4:23
Pretty commonly in math, at a lot of levels of math really,
4:27
if you're dealing with a product of two things,
4:29
it helps to understand it as some kind of area.
4:33
In this case, maybe you try to configure some mental setup
4:36
of a box where the side lengths are sine of x and x squared.
4:39
But what would that mean?
4:42
Well, since these are functions, you might think of those sides as adjustable,
4:46
dependent on the value of x, which maybe you think of as this
4:49
number that you can just freely adjust up and down.
4:53
So getting a feel for what this means, focus on
4:56
that top side, whose length changes as the function sine of x.
5:01
As you change this value of x up from 0, it increases up to
5:05
a length of 1 as sine of x moves up towards its peak,
5:09
and after that it starts to decrease as sine of x comes down from 1.
5:15
And in the same way, that height there is always changing as x squared.
5:20
So f of x, defined as the product of these two functions, is the area of this box.
5:27
And for the derivative, let's think about how
5:30
a tiny change to x by dx influences that area.
5:33
What is that resulting change in area df?
5:39
Well, the nudge dx caused that width to change by some small d sine of x,
5:44
and it caused that height to change by some d(x squared).
5:50
And this gives us three little snippets of new area,
5:53
a thin rectangle on the bottom whose area is its width, sine of x,
5:58
times its thin height, d(x squared).
6:01
And there's this thin rectangle on the right, whose area is its height,
6:06
x squared, times its thin width, d sine of x.
6:10
And there's also this little bit in the corner, but we can ignore that.
6:14
Its area is ultimately proportional to the square of dx,
6:17
and as we've seen before, that becomes negligible as dx goes to zero.
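To see how quickly that corner piece stops mattering, here is a small Python sketch that splits the new area into its three pieces for a few nudge sizes. The starting input x = 1.0, the nudge sizes, and the function name area_pieces are all assumptions made for illustration.

```python
import math

def area_pieces(x, dx):
    """Split the change in area of the sin(x) by x^2 box into three pieces."""
    d_sin = math.sin(x + dx) - math.sin(x)   # change in the width
    d_sq = (x + dx)**2 - x**2                # change in the height
    bottom = math.sin(x) * d_sq              # thin rectangle on the bottom
    right = x**2 * d_sin                     # thin rectangle on the right
    corner = d_sin * d_sq                    # little bit in the corner
    return bottom, right, corner

x = 1.0  # arbitrary starting input
for dx in (0.1, 0.01, 0.001):
    bottom, right, corner = area_pieces(x, dx)
    share = corner / (bottom + right + corner)
    print(f"dx = {dx}: corner share of the total change = {share:.6f}")
```

Each time dx shrinks by a factor of 10, the corner's share of the total change should shrink by roughly a factor of 10 as well, which is what it means for that piece to be negligible as dx goes to zero.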
6:23
I mean, this whole setup is very similar to what I showed last video,
6:27
with the x squared diagram.
6:29
And just like then, keep in mind that I'm using somewhat beefy
6:32
changes here to draw things, just so we can actually see them.
6:36
But in principle, dx is something very very small,
6:39
and that means that d(x squared) and d sine of x are also very very small.
6:45
So, applying what we know about the derivative of sine and of x squared,
6:51
that tiny change, d(x squared), is going to be about 2x times dx.
6:56
And that tiny change, d sine of x, well that's going to be about cosine of x times dx.
7:02
As usual, we divide out by that dx to see that the ratio we want, df divided by dx,
7:09
is sine of x times the derivative of x squared,
7:12
plus x squared times the derivative of sine.
7:17
And nothing we've done here is specific to sine or to x squared.
7:21
This same line of reasoning would work for any two functions, g and h.
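A minimal Python sketch of that general pattern, checked against a direct nudge ratio. The function names, the test input 0.8, and the second pair of functions (exp and cos) are hypothetical choices, not taken from the video.

```python
import math

def product_derivative(g, g_prime, h, h_prime, x):
    """Product rule: g(x) * h'(x) + h(x) * g'(x)."""
    return g(x) * h_prime(x) + h(x) * g_prime(x)

def nudge_ratio(f, x, dx=1e-6):
    """Ratio of the change in f to a small symmetric nudge around x."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)

x = 0.8  # arbitrary test input

# The example from the transcript: sin(x) * x^2
rule_1 = product_derivative(math.sin, math.cos, lambda t: t**2, lambda t: 2 * t, x)
check_1 = nudge_ratio(lambda t: math.sin(t) * t**2, x)

# A second, hypothetical pair: exp(x) * cos(x)
rule_2 = product_derivative(math.exp, math.exp, math.cos, lambda t: -math.sin(t), x)
check_2 = nudge_ratio(lambda t: math.exp(t) * math.cos(t), x)

print(rule_1, check_1)
print(rule_2, check_2)
```

In each printed pair, the rule-based value and the nudge-based value should match to several decimal places.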
7:27
And sometimes people like to remember this pattern with
7:29
a certain mnemonic that you kind of sing in your head.
7:32
Left d right, right d left.
7:34
In this example, where we have sine of x times x squared, left d right,
7:38
means you take that left function, sine of x, times the derivative of the right,
7:43
in this case 2x.
7:45
Then you add on right d left, that right function,
7:48
x squared, times the derivative of the left one, cosine of x.
7:54
Now out of context, presented as a rule to remember,
7:57
I think this would feel pretty strange, don't you?
8:00
But when you actually think of this adjustable box,
8:03
you can see what each of those terms represents.
8:06
Left d right is the area of that little bottom rectangle,
8:10
and right d left is the area of that rectangle on the side.
8:20
By the way, I should mention that if you multiply by a constant,
8:23
say 2 times sine of x, things end up a lot simpler.
8:27
The derivative is just the same as the constant multiplied by
8:30
the derivative of the function, in this case 2 times cosine of x.
8:35
I'll leave it to you to pause and ponder and verify that makes sense.
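One possible way to think it through, sketched in the same tiny-nudge terms used so far (a hedged outline, not necessarily the argument the video has in mind):

```latex
\[
d\bigl(2\sin(x)\bigr) = 2\sin(x + dx) - 2\sin(x) = 2\,d\bigl(\sin(x)\bigr) \approx 2\cos(x)\,dx
\quad\Longrightarrow\quad
\frac{d}{dx}\bigl(2\sin(x)\bigr) = 2\cos(x).
\]
```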
8:41
Aside from addition and multiplication, the other common way to combine functions,
8:46
and believe me, this one comes up all the time,
8:49
is to shove one inside the other, function composition.
8:53
For example, maybe we take the function x squared and shove it
8:56
inside sine of x to get this new function, sine of x squared.
9:01
What do you think the derivative of that new function is?
9:05
To think this one through, I'll choose yet another way to visualize things,
9:09
just to emphasize that in creative math, we've got lots of options.
9:13
I'll put up three different number lines, the top one is going to hold the value of x,
9:18
the second one is going to hold the value of x squared,
9:21
and the third line is going to hold the value of sine of x squared.
9:26
That is, the function x squared gets you from line 1 to line 2,
9:30
and the function sine gets you from line 2 to line 3.
9:34
As I shift around this value of x, maybe moving it up to the value 3,
9:39
that second value stays pegged to whatever x squared is, in this case moving up to 9.
9:46
That bottom value, being sine of x squared, is
9:49
going to go to whatever sine of 9 happens to be.
9:54
So, for the derivative, let's again start by nudging that x value by some little dx.
10:01
I always think that it's helpful to think of x as starting
10:04
at some actual concrete number, maybe 1.5 in this case.
10:08
The resulting nudge to that second value, the change in x squared caused by such a dx,
10:14
is d(x squared).
10:16
We could expand this like we have before, as 2x times dx,
10:21
which for our specific input would be 2 times 1.5 times dx,
10:25
but it helps to keep things written as d(x squared), at least for now.
10:31
In fact, I'm going to go one step further, give a new name to this x squared,
10:36
maybe h, so instead of writing d(x squared) for this nudge, we write dh.
10:42
This makes it easier to think about that third value, which is now pegged at sine of h.
10:48
Its change is d sine of h, the tiny change caused by the nudge dh.
10:55
By the way, the fact that it's moving to the left while the dh bump is going to the right
11:00
just means that this change, d sine of h, is going to be some kind of negative number.
11:06
Once again, we can use our knowledge of the derivative of the sine.
11:10
This d sine of h is going to be about cosine of h times dh.
11:15
That's what it means for the derivative of sine to be cosine.
11:19
Unfolding things, we can replace that h with x squared again,
11:23
so we know that the bottom nudge will have a size of about cosine of x squared times d(x squared).
11:31
Let's unfold things even further.
11:32
That intermediate nudge d(x squared) is going to be about 2x times dx.
11:39
It's always a good habit to remind yourself of
11:41
what an expression like this actually means.
11:44
In this case, where we started at x equals 1.5 up top,
11:48
this whole expression is telling us that the size of the nudge on that third
11:54
line is going to be about cosine of 1.5 squared times 2 times 1.5 times whatever
12:00
the size of dx was.
12:02
It's proportional to the size of dx, and this
12:05
derivative is giving us that proportionality constant.
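Here is a small Python sketch of the three number lines at x = 1.5, following one nudge down the chain. The nudge size 1e-6 and the variable names are assumptions chosen for illustration.

```python
import math

x = 1.5
dx = 1e-6  # a small nudge on the top line; the size is an arbitrary choice

h = x**2                                  # value on the second line
dh = (x + dx)**2 - h                      # nudge on the second line
d_out = math.sin(h + dh) - math.sin(h)    # nudge on the third line

# What the unfolded expression predicts for the third-line nudge
predicted = math.cos(x**2) * 2 * x * dx

print("dh        =", dh, " vs 2x dx =", 2 * x * dx)
print("d sin(h)  =", d_out)
print("predicted =", predicted)
```

Both the actual and the predicted third-line nudges should come out negative here, since cosine of 2.25 is negative, and they should agree closely for a nudge this small.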
12:10
Notice what we came out with here.
12:12
We have the derivative of the outside function,
12:15
and it's still taking in the unaltered inside function,
12:19
and then multiplying it by the derivative of that inside function.
12:25
Again, there's nothing special about sine of x or x squared.
12:29
If you have any two functions, g of x and h of x,
12:33
the derivative of their composition, g of h of x,
12:37
is going to be the derivative of g evaluated on h, multiplied by the derivative of h.
12:47
This pattern right here is what we usually call the chain rule.
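A minimal Python sketch of this pattern for arbitrary g and h, compared against a direct nudge ratio. The function names, the test input, and the second example, exp of sine of x, are hypothetical choices and not from the video.

```python
import math

def chain_derivative(g_prime, h, h_prime, x):
    """Chain rule: derivative of g evaluated on h(x), times derivative of h."""
    return g_prime(h(x)) * h_prime(x)

def nudge_ratio(f, x, dx=1e-6):
    """Ratio of the change in f to a small symmetric nudge around x."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)

x = 1.5  # same starting input as the three-line example

# The example from the transcript: sin(x^2)
rule_sin_sq = chain_derivative(math.cos, lambda t: t**2, lambda t: 2 * t, x)
check_sin_sq = nudge_ratio(lambda t: math.sin(t**2), x)

# A second, hypothetical composition: exp(sin(x))
rule_exp_sin = chain_derivative(math.exp, math.sin, math.cos, x)
check_exp_sin = nudge_ratio(lambda t: math.exp(math.sin(t)), x)

print(rule_sin_sq, check_sin_sq)
print(rule_exp_sin, check_exp_sin)
```

Note that g_prime is called on h(x), not on x itself, which mirrors the point that the outer derivative still takes in the unaltered inside function.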
12:52
Notice for the derivative of g, I'm writing it as dg dh instead of dg dx.
12:58
On the symbolic level, this is a reminder that the thing you plug
13:02
into that derivative is still going to be that intermediary function h.
13:07
But more than that, it's an important reflection of what
13:09
this derivative of the outer function actually represents.
13:13
Remember, in our three line setup, when we took the derivative of the sine on
13:18
that bottom line, we expanded the size of that nudge, d sine, as cosine of h times dh.
13:24
This was because we didn't immediately know how
13:27
the size of that bottom nudge depended on x.
13:30
That's kind of the whole thing we were trying to figure out.
13:33
But we could take the derivative with respect to that intermediate variable, h.
13:38
That is, figure out how to express the size of that nudge on the third
13:41
line as some multiple of dh, the size of the nudge on the second line.
13:46
It was only after that that we unfolded further by figuring out what dh was.
13:53
In this chain rule expression, we're saying, look at the ratio between a tiny change in
13:58
g, the final output, to a tiny change in h that caused it,
14:02
h being the value we plug into g.
14:05
Then multiply that by the tiny change in h, divided
14:08
by the tiny change in x that caused it.
14:12
So notice, those dh's cancel out, and they give us a ratio
14:15
between the change in that final output and the change to the input that,
14:19
through a certain chain of events, brought it about.
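In symbols, the cancellation reads as follows, with the running example filled in on the right:

```latex
\[
\frac{dg}{dx} = \frac{dg}{dh}\cdot\frac{dh}{dx},
\qquad\text{e.g. with } h = x^2,\ g = \sin(h):\quad
\frac{d}{dx}\sin(x^2) = \cos(h)\cdot 2x = \cos(x^2)\cdot 2x.
\]
```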
14:23
And that cancellation of dh is not just a notational trick.
14:26
That is a genuine reflection of what's going on with the
14:30
tiny nudges that underpin everything we do with derivatives.
14:36
So those are the three basic tools to have in your belt to handle
14:39
derivatives of functions that combine a lot of smaller things.
14:43
You've got the sum rule, the product rule, and the chain rule.
14:48
And I'll be honest with you, there is a big difference between knowing
14:51
what the chain rule is and what the product rule is,
14:54
and actually being fluent with applying them in even the most hairy of situations.
14:59
Watching videos, any videos, about the mechanics of calculus is
15:03
never going to substitute for practicing those mechanics yourself,
15:06
and building up the muscles to do these computations yourself.
15:11
I really wish I could offer to do that for you,
15:13
but I'm afraid the ball is in your court, my friend, to seek out the practice.
15:18
What I can offer, and what I hope I have offered,
15:20
is to show you where these rules actually come from.
15:24
To show that they're not just something to be memorized and hammered away,
15:27
but they're natural patterns, things that you too could have discovered
15:31
just by patiently thinking through what a derivative actually means.
— end of transcript —
Disclaimer: This site is not affiliated with, endorsed by, or sponsored by YouTube or Google LLC. All trademarks belong to their respective owners. Transcripts are sourced from publicly available captions on YouTube and remain the property of their original creators.