[00:14] In the last videos I talked about the derivatives of simple functions, [00:18] and the goal was to have a clear picture or intuition to hold in [00:22] your mind that actually explains where these formulas come from. [00:26] But most of the functions you deal with in modeling the world involve mixing, [00:31] combining, or tweaking these simple functions in some other way, [00:35] so our next step is to understand how you take derivatives of more complicated [00:39] combinations. [00:41] Again, I don't want these to be something to memorize, [00:43] I want you to have a clear picture in mind for where each one comes from. [00:49] Now, this really boils down to three basic ways to combine functions. [00:54] You can add them together, you can multiply them, [00:56] and you can throw one inside the other, known as composing them. [01:00] Sure, you could say subtracting them, but really that's just [01:03] multiplying the second by negative one and adding them together. [01:08] Likewise, dividing functions doesn't really add anything, [01:11] because that's the same as plugging one inside the function, one over x, [01:14] and then multiplying the two together. [01:17] So really, most functions you come across just involve layering [01:20] together these three different types of combinations, [01:23] though there's not really a bound on how monstrous things can become. [01:27] But as long as you know how derivatives play with just those three combination types, [01:31] you'll always be able to take it step by step and peel through [01:34] the layers for any kind of monstrous expression. [01:38] So the question is, if you know the derivative of two functions, [01:42] what is the derivative of their sum, of their product, [01:45] and of the function composition between them? [01:50] The sum rule is easiest, if somewhat tongue-twisting to say out loud. [01:54] The derivative of a sum of two functions is the sum of their derivatives. 
[01:59] But it's worth warming up with this example by really thinking through [02:03] what it means to take a derivative of a sum of two functions, [02:07] since the derivative patterns for products and function composition won't [02:11] be so straightforward, and they're going to require this kind of deeper thinking. [02:16] For example, let's think about this function f of x equals sine of x plus x squared. [02:22] It's a function where, for every input, you add together [02:25] the values of sine of x and x squared at that point. [02:29] For example, let's say at x equals 0.5, the height of the sine [02:34] graph is given by this vertical bar, and the height of the x [02:38] squared parabola is given by this slightly smaller vertical bar. [02:44] And their sum is the length you get by just stacking them together. [02:48] For the derivative, you want to ask what happens as you nudge that input slightly, [02:53] maybe increasing it up to 0.5 plus dx. [02:57] The difference in the value of f between those two places is what we call df. [03:04] And when you picture it like this, I think you'll agree that the total [03:08] change in the height is whatever the change to the sine graph is, [03:13] what we might call d sine of x, plus whatever the change to x squared is, d(x squared). [03:22] We know that the derivative of sine is cosine, and remember what that means. [03:27] It means that this little change, d sine of x, is about cosine of x times dx. [03:33] It's proportional to the size of our initial nudge dx, [03:37] and the proportionality constant equals cosine of whatever input we started at. [03:43] Likewise, because the derivative of x squared is 2x, [03:48] the change in the height of the x squared graph is 2x times whatever dx was. [03:55] So rearranging, df divided by dx, the ratio of the tiny change to [04:00] the sum function to the tiny change in x that caused it, [04:04] is indeed cosine of x plus 2x, the sum of the derivatives of its parts. 
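If you want to check the sum example numerically, here is a minimal sketch. The helper name `nudge_ratio` and the nudge size `dx = 1e-6` are my own illustrative choices, not something from the video; the code compares the ratio df/dx for a small finite nudge at x = 0.5 against cosine of x plus 2x.

```python
import math

def f(x):
    # The sum function from the example: sin(x) + x^2
    return math.sin(x) + x ** 2

def nudge_ratio(func, x, dx=1e-6):
    # Hypothetical helper: the ratio df/dx for a small but finite nudge dx
    return (func(x + dx) - func(x)) / dx

x = 0.5
approx = nudge_ratio(f, x)        # ratio from an actual tiny nudge
exact = math.cos(x) + 2 * x       # sum of the derivatives of the parts

print(approx, exact)
```

The two printed numbers should agree to several decimal places, and the agreement should improve as dx shrinks (until floating-point rounding takes over).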
[04:11] But like I said, things are a bit different for products, [04:15] and let's think through why in terms of tiny nudges again. [04:20] In this case, I don't think graphs are our best bet for visualizing things. [04:23] Pretty commonly in math, at a lot of levels of math really, [04:27] if you're dealing with a product of two things, [04:29] it helps to understand it as some kind of area. [04:33] In this case, maybe you try to configure some mental setup [04:36] of a box where the side lengths are sine of x and x squared. [04:39] But what would that mean? [04:42] Well, since these are functions, you might think of those sides as adjustable, [04:46] dependent on the value of x, which maybe you think of as this [04:49] number that you can just freely adjust up and down. [04:53] So getting a feel for what this means, focus on [04:56] that top side, which changes as the function sine of x. [05:01] As you change this value of x up from 0, it increases up to [05:05] a length of 1 as sine of x moves up towards its peak, [05:09] and after that it starts to decrease as sine of x comes down from 1. [05:15] And in the same way, that height there is always changing as x squared. [05:20] So f of x, defined as the product of these two functions, is the area of this box. [05:27] And for the derivative, let's think about how [05:30] a tiny change to x by dx influences that area. [05:33] What is that resulting change in area df? [05:39] Well, the nudge dx caused that width to change by some small d sine of x, [05:44] and it caused that height to change by some d(x squared). [05:50] And this gives us three little snippets of new area, [05:53] a thin rectangle on the bottom whose area is its width, sine of x, [05:58] times its thin height, d(x squared). [06:01] And there's this thin rectangle on the right, whose area is its height, [06:06] x squared, times its thin width, d sine of x. [06:10] And there's also this little bit in the corner, but we can ignore that. 
[06:14] Its area is ultimately proportional to (dx) squared, the square of the nudge itself, [06:17] and as we've seen before, that becomes negligible as dx goes to zero. [06:23] I mean, this whole setup is very similar to what I showed last video, [06:27] with the x squared diagram. [06:29] And just like then, keep in mind that I'm using somewhat beefy [06:32] changes here to draw things, just so we can actually see them. [06:36] But in principle, dx is something very very small, [06:39] and that means that d(x squared) and d sine of x are also very very small. [06:45] So, applying what we know about the derivative of sine and of x squared, [06:51] that tiny change, d(x squared), is going to be about 2x times dx. [06:56] And that tiny change, d sine of x, well that's going to be about cosine of x times dx. [07:02] As usual, we divide out by that dx to see that the ratio we want, df divided by dx, [07:09] is sine of x times the derivative of x squared, [07:12] plus x squared times the derivative of sine. [07:17] And nothing we've done here is specific to sine or to x squared. [07:21] This same line of reasoning would work for any two functions, g and h. [07:27] And sometimes people like to remember this pattern with [07:29] a certain mnemonic that you kind of sing in your head. [07:32] Left d right, right d left. [07:34] In this example, where we have sine of x times x squared, left d right, [07:38] means you take that left function, sine of x, times the derivative of the right, [07:43] in this case 2x. [07:45] Then you add on right d left, that right function, [07:48] x squared, times the derivative of the left one, cosine of x. [07:54] Now out of context, presented as a rule to remember, [07:57] I think this would feel pretty strange, don't you? [08:00] But when you actually think of this adjustable box, [08:03] you can see what each of those terms represents. [08:06] Left d right is the area of that little bottom rectangle, [08:10] and right d left is the area of that rectangle on the side. 
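The adjustable box can be played with numerically, too. What follows is only a sketch under my own assumptions: the starting point x = 1.0, the deliberately "beefy" nudge dx = 1e-4, and all the variable names are illustrative. It splits the change in area into the bottom strip, the side strip, and the corner, then compares df/dx with left d right plus right d left.

```python
import math

x, dx = 1.0, 1e-4                 # assumed starting point and nudge size

width = math.sin(x)               # top side of the box, sin(x)
height = x ** 2                   # vertical side of the box, x^2

d_width = math.sin(x + dx) - width     # d(sin x), change in the width
d_height = (x + dx) ** 2 - height      # d(x^2), change in the height

bottom = width * d_height         # "left d right" strip along the bottom
side = height * d_width           # "right d left" strip along the side
corner = d_width * d_height       # tiny corner, proportional to (dx)^2

# Actual change in area when x is nudged by dx
df = math.sin(x + dx) * (x + dx) ** 2 - width * height

product_rule = math.sin(x) * 2 * x + x ** 2 * math.cos(x)

print(df / dx, product_rule)      # nearly equal
print(corner / dx)                # the corner's share, which shrinks with dx
```

The three pieces add up to df exactly (up to rounding), while the corner's contribution to the ratio df/dx is itself proportional to dx, which is why it drops out as dx goes to zero.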
[08:20] By the way, I should mention that if you multiply by a constant, [08:23] say 2 times sine of x, things end up a lot simpler. [08:27] The derivative is just the same as the constant multiplied by [08:30] the derivative of the function, in this case 2 times cosine of x. [08:35] I'll leave it to you to pause and ponder and verify that makes sense. [08:41] Aside from addition and multiplication, the other common way to combine functions, [08:46] and believe me, this one comes up all the time, [08:49] is to shove one inside the other, function composition. [08:53] For example, maybe we take the function x squared and shove it [08:56] inside sine of x to get this new function, sine of x squared. [09:01] What do you think the derivative of that new function is? [09:05] To think this one through, I'll choose yet another way to visualize things, [09:09] just to emphasize that in creative math, we've got lots of options. [09:13] I'll put up three different number lines, the top one is going to hold the value of x, [09:18] the second one is going to hold the x squared, [09:21] and the third line is going to hold the value of sine of x squared. [09:26] That is, the function x squared gets you from line 1 to line 2, [09:30] and the function sine gets you from line 2 to line 3. [09:34] As I shift around this value of x, maybe moving it up to the value 3, [09:39] that second value stays pegged to whatever x squared is, in this case moving up to 9. [09:46] That bottom value, being sine of x squared, is [09:49] going to go to whatever sine of 9 happens to be. [09:54] So, for the derivative, let's again start by nudging that x value by some little dx. [10:01] I always think that it's helpful to think of x as starting [10:04] at some actual concrete number, maybe 1.5 in this case. [10:08] The resulting nudge to that second value, the change in x squared caused by such a dx, [10:14] is d(x squared). 
[10:16] We could expand this like we have before, as 2x times dx, [10:21] which for our specific input would be 2 times 1.5 times dx, [10:25] but it helps to keep things written as d(x squared), at least for now. [10:31] In fact, I'm going to go one step further, give a new name to this x squared, [10:36] maybe h, so instead of writing d(x squared) for this nudge, we write dh. [10:42] This makes it easier to think about that third value, which is now pegged at sine of h. [10:48] Its change is d sine of h, the tiny change caused by the nudge dh. [10:55] By the way, the fact that it's moving to the left while the dh bump is going to the right [11:00] just means that this change, d sine of h, is going to be some kind of negative number. [11:06] Once again, we can use our knowledge of the derivative of the sine. [11:10] This d sine of h is going to be about cosine of h times dh. [11:15] That's what it means for the derivative of sine to be cosine. [11:19] Unfolding things, we can replace that h with x squared again, [11:23] so we know that the bottom nudge will have a size of cosine of x squared times d(x squared). [11:31] Let's unfold things even further. [11:32] That intermediate nudge d(x squared) is going to be about 2x times dx. [11:39] It's always a good habit to remind yourself of [11:41] what an expression like this actually means. [11:44] In this case, where we started at x equals 1.5 up top, [11:48] this whole expression is telling us that the size of the nudge on that third [11:54] line is going to be about cosine of 1.5 squared times 2 times 1.5 times whatever [12:00] the size of dx was. [12:02] It's proportional to the size of dx, and this [12:05] derivative is giving us that proportionality constant. [12:10] Notice what we came out with here. [12:12] We have the derivative of the outside function, [12:15] and it's still taking in the unaltered inside function, [12:19] and then multiplying it by the derivative of that inside function. 
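Here is the three number line picture as a small numeric sketch, starting at x = 1.5 as in the example. The nudge size dx = 1e-6 and the variable names are my own assumptions for illustration. It follows the nudge from line 1 to line 2 (dh) to line 3 (d sine of h), then compares the resulting ratio with cosine of x squared times 2x.

```python
import math

x, dx = 1.5, 1e-6                 # concrete starting value and an assumed tiny nudge

h = x ** 2                        # second line: h = x^2
dh = (x + dx) ** 2 - h            # nudge on the second line, about 2x * dx

d_sin = math.sin(h + dh) - math.sin(h)   # nudge on the third line, about cos(h) * dh

ratio = d_sin / dx                       # change in final output per unit change in x
chain = math.cos(x ** 2) * 2 * x         # outer derivative at the inner value, times inner derivative

print(dh / dx)          # close to 2 * 1.5 = 3
print(d_sin / dh)       # close to cos(2.25)
print(ratio, chain)     # nearly equal, and negative
```

The negative sign matches the observation that the third value moves to the left while the dh bump moves to the right. Multiplying the two intermediate ratios, d_sin / dh and dh / dx, gives back d_sin / dx, since the dh's cancel.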
[12:25] Again, there's nothing special about sine of x or x squared. [12:29] If you have any two functions, g of x and h of x, [12:33] the derivative of their composition, g of h of x, [12:37] is going to be the derivative of g evaluated on h, multiplied by the derivative of h. [12:47] This pattern right here is what we usually call the chain rule. [12:52] Notice for the derivative of g, I'm writing it as dg dh instead of dg dx. [12:58] On the symbolic level, this is a reminder that the thing you plug [13:02] into that derivative is still going to be that intermediary function h. [13:07] But more than that, it's an important reflection of what [13:09] this derivative of the outer function actually represents. [13:13] Remember, in our three line setup, when we took the derivative of the sine on [13:18] that bottom line, we expanded the size of that nudge, d sine, as cosine of h times dh. [13:24] This was because we didn't immediately know how [13:27] the size of that bottom nudge depended on x. [13:30] That's kind of the whole thing we were trying to figure out. [13:33] But we could take the derivative with respect to that intermediate variable, h. [13:38] That is, figure out how to express the size of that nudge on the third [13:41] line as some multiple of dh, the size of the nudge on the second line. [13:46] It was only after that that we unfolded further by figuring out what dh was. [13:53] In this chain rule expression, we're saying, look at the ratio between a tiny change in [13:58] g, the final output, to a tiny change in h that caused it, [14:02] h being the value we plug into g. [14:05] Then multiply that by the tiny change in h, divided [14:08] by the tiny change in x that caused it. [14:12] So notice, those dh's cancel out, and they give us a ratio [14:15] between the change in that final output and the change to the input that, [14:19] through a certain chain of events, brought it about. [14:23] And that cancellation of dh is not just a notational trick. 
[14:26] That is a genuine reflection of what's going on with the [14:30] tiny nudges that underpin everything we do with derivatives. [14:36] So those are the three basic tools to have in your belt to handle [14:39] derivatives of functions that combine a lot of smaller things. [14:43] You've got the sum rule, the product rule, and the chain rule. [14:48] And I'll be honest with you, there is a big difference between knowing [14:51] what the chain rule is and what the product rule is, [14:54] and actually being fluent with applying them in even the most hairy of situations. [14:59] Watching videos, any videos, about the mechanics of calculus is [15:03] never going to substitute for practicing those mechanics yourself, [15:06] and building up the muscles to do these computations yourself. [15:11] I really wish I could offer to do that for you, [15:13] but I'm afraid the ball is in your court, my friend, to seek out the practice. [15:18] What I can offer, and what I hope I have offered, [15:20] is to show you where these rules actually come from. [15:24] To show that they're not just something to be memorized and hammered away, [15:27] but they're natural patterns, things that you too could have discovered [15:31] just by patiently thinking through what a derivative actually means.