[00:12] Now that we've seen what a derivative means and what it has to do with rates of change, [00:16] our next step is to learn how to actually compute these guys. [00:19] As in, if I give you some kind of function with an explicit formula, [00:22] you'd want to be able to find what the formula for its derivative is. [00:26] Maybe it's obvious, but I think it's worth stating explicitly why this [00:30] is an important thing to be able to do, why much of a calculus student's [00:34] time ends up going towards grappling with derivatives of abstract [00:37] functions rather than thinking about concrete rate of change problems. [00:42] It's because a lot of real-world phenomena, the sort of things that [00:45] we want to use calculus to analyze, are modeled using polynomials, [00:49] trigonometric functions, exponentials, and other pure functions like that. [00:53] So if you build up some fluency with the ideas of rates of change for those kinds of [00:58] pure abstract functions, it gives you a language to more readily talk about the rates [01:02] at which things change in concrete situations that you might be using calculus to model. [01:07] But it is way too easy for this process to feel like just memorizing a list of rules, [01:12] and if that happens, if you get that feeling, it's also easy to lose sight of the [01:16] fact that derivatives are fundamentally about just looking at tiny changes to some [01:20] quantity and how that relates to a resulting tiny change in another quantity. [01:24] So in this video and in the next one, my aim is to show you how you can think [01:28] about a few of these rules intuitively and geometrically, [01:31] and I really want to encourage you to never forget that tiny nudges are at the [01:35] heart of derivatives. [01:37] Let's start with a simple function like f of x equals x squared. [01:41] What if I asked you its derivative? 
[01:43] That is, if you were to look at some value x, like x equals 2, [01:47] and compare it to a value slightly bigger, just dx bigger, [01:50] what's the corresponding change in the value of the function? [01:54] dF. [01:55] And in particular, what's dF divided by dx, the rate [01:58] at which this function is changing per unit change in x. [02:03] As a first step for intuition, we know that you can think of this ratio [02:07] dF dx as the slope of a tangent line to the graph of x squared, [02:10] and from that you can see that the slope generally increases as x increases. [02:15] At zero, the tangent line is flat, and the slope is zero. [02:19] At x equals 1, it's something a bit steeper. [02:22] At x equals 2, it's steeper still. [02:25] But looking at graphs isn't generally the best way [02:27] to understand the precise formula for a derivative. [02:30] For that, it's best to take a more literal look at what x squared actually means, [02:34] and in this case let's go ahead and picture a square whose side length is x. [02:39] If you increase x by some tiny nudge, some little dx, [02:43] what's the resulting change in the area of that square? [02:47] That slight change in area is what dF means in this context. [02:52] It's the tiny increase to the value of f of x equals x squared, [02:55] caused by increasing x by that tiny nudge dx. [02:59] Now you can see that there's three new bits of area in this diagram, [03:03] two thin rectangles and a minuscule square. [03:06] The two thin rectangles each have side lengths of x and dx, [03:10] so they account for 2 times x times dx units of new area. [03:18] For example, let's say x was 3 and dx was 0.01, [03:21] then that new area from these two thin rectangles would be 2 times 3 times 0.01, [03:25] which is 0.06, about 6 times the size of dx. [03:29] That little square there has an area of dx squared, [03:32] but you should think of that as being really tiny, negligibly tiny. 
[03:37] For example, if dx was 0.01, that would be only 0.0001, [03:41] and keep in mind I'm drawing dx with a fair bit of width here just so we [03:46] can actually see it, but always remember in principle, [03:49] dx should be thought of as a truly tiny amount, and for those truly tiny amounts, [03:54] a good rule of thumb is that you can ignore anything that includes a dx [03:59] raised to a power greater than 1. [04:02] That is, a tiny change squared is a negligible change. [04:07] What this leaves us with is that dF is just some multiple of dx, and that multiple, 2x, [04:13] which you could also write as dF divided by dx, is the derivative of x squared. [04:19] For example, if you were starting at x equals 3, then as you slightly increase x, [04:24] the rate of change in the area per unit change in length added, d(x squared) over dx, [04:29] would be 2 times 3, or 6, and if instead you were starting at x equals 5, [04:34] then the rate of change would be 10 units of area per unit change in x. [04:41] Let's go ahead and try a different simple function, f of x equals x cubed. [04:45] This is going to be the geometric view of the stuff [04:48] that I went through algebraically in the last video. [04:51] What's nice here is that we can think of x cubed as the volume of an actual [04:55] cube whose side lengths are x, and when you increase x by a tiny nudge, [05:00] a tiny dx, the resulting increase in volume is what I have here in yellow. [05:04] That represents all the volume in a cube with side lengths x plus dx [05:08] that's not already in the original cube, the one with side length x. [05:13] It's nice to think of this new volume as broken up into multiple components, [05:18] but almost all of it comes from these three square faces, [05:22] or said a little more precisely, as dx approaches 0, [05:25] those three squares comprise a portion closer and closer to 100% of [05:30] that new yellow volume.
[05:33] Each of those thin squares has a volume of x squared times dx, [05:38] the area of the face times that little thickness dx. [05:42] So in total this gives us 3x squared dx of volume change. [05:47] And to be sure there are other slivers of volume here along the edges [05:51] and that tiny one in the corner, but all of that volume is going to be [05:54] proportional to dx squared, or dx cubed, so we can safely ignore them. [05:59] Again this is ultimately because they're going to be divided by dx, [06:03] and if there's still any dx remaining then those terms aren't [06:07] going to survive the process of letting dx approach 0. [06:11] What this means is that the derivative of x cubed, [06:14] the rate at which x cubed changes per unit change of x, is 3 times x squared. [06:20] What that means in terms of graphical intuition is that the slope of [06:25] the graph of x cubed at every single point x is exactly 3x squared. [06:34] And reasoning about that slope, it should make sense that this derivative is high on the [06:38] left and then 0 at the origin and then high again as you move to the right, [06:42] but just thinking in terms of the graph would never have landed us on the precise [06:47] quantity 3x squared. [06:48] For that we had to take a much more direct look at what x cubed actually means. [06:54] Now in practice you wouldn't necessarily think of the square every [06:57] time you're taking the derivative of x squared, [06:59] nor would you necessarily think of this cube whenever you're taking [07:03] the derivative of x cubed. [07:04] Both of them fall under a pretty recognizable pattern for polynomial terms. [07:09] The derivative of x to the fourth turns out to be 4x cubed, [07:13] the derivative of x to the fifth is 5x to the fourth, and so on. [07:18] Abstractly you'd write this as the derivative of x to [07:22] the n for any power n is n times x to the n minus 1. [07:27] This right here is what's known in the business as the power rule. 
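If you'd like to check the power rule numerically, here is a minimal Python sketch. The helper names nudge_ratio and power_rule are hypothetical, chosen just for this illustration. It compares the actual change in x to the n per unit nudge dx against n times x to the n minus 1, for shrinking values of dx. The leftover gap is the contribution of the terms proportional to dx squared and higher, so it should shrink along with dx.

```python
def nudge_ratio(f, x, dx):
    """Change in f per unit change in x, for a small but finite nudge dx."""
    return (f(x + dx) - f(x)) / dx


def power_rule(n, x):
    """The power rule's prediction for the derivative of x**n."""
    return n * x ** (n - 1)


x = 3.0
for n in (2, 3, 4, 5):
    for dx in (0.1, 0.01, 0.001):
        # Bind n as a default argument so each lambda uses the current power.
        ratio = nudge_ratio(lambda t, n=n: t ** n, x, dx)
        predicted = power_rule(n, x)
        # The gap comes from the terms that still carry a factor of dx
        # after dividing by dx, so it shrinks as dx shrinks.
        print(f"n={n} dx={dx}: ratio={ratio:.6f} "
              f"predicted={predicted:.6f} gap={ratio - predicted:.6f}")
```

For x squared the gap is exactly dx (up to floating-point rounding), which is the minuscule square's area, dx squared, divided by dx.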
[07:31] In practice we all quickly just get jaded and think about this symbolically as [07:35] the exponent hopping down in front, leaving behind one less than itself, [07:39] rarely pausing to think about the geometric delights that underlie these derivatives. [07:45] That's the kind of thing that happens when these tend [07:47] to fall in the middle of much longer computations. [07:50] But rather than chalking it all up to symbolic patterns, [07:53] let's just take a moment and think about why this works for powers beyond just 2 and 3. [07:58] When you nudge that input x, increasing it slightly to x plus dx, [08:02] working out the exact value of that nudged output would involve [08:06] multiplying together these n separate x plus dx terms. [08:11] The full expansion would be really complicated, [08:13] but part of the point of derivatives is that most of that complication can be ignored. [08:19] The first term in your expansion is x to the n. [08:22] This is analogous to the area of the original square, [08:25] or the volume of the original cube from our previous examples. [08:30] For the next terms in the expansion you can choose mostly x's with a single dx. [08:41] Since there are n different parentheticals from which you could have chosen [08:46] that single dx, this gives us n separate terms, [08:50] all of which include n minus 1 x's times a dx, [08:53] giving a value of x to the power n minus 1 times dx. [08:57] This is analogous to how the majority of the new area in the square came from those [09:02] two bars, each with area x times dx, or how the bulk of the new volume in the cube [09:07] came from those three thin squares, each of which had a volume of x squared times dx.
[09:14] There will be many other terms of this expansion, [09:17] but all of them are just going to be some multiple of dx squared, [09:21] so we can safely ignore them, and what that means is that all but a [09:25] negligible portion of the increase in the output comes from n copies of [09:29] this x to the n minus 1 times dx. [09:31] That's what it means for the derivative of x to the n to be n times x to the n minus 1. [09:38] And even though, like I said, in practice you'll find yourself performing this [09:43] derivative quickly and symbolically, imagining the exponent hopping down to the front, [09:47] every now and then it's nice to just step back and remember why these rules work. [09:52] Not just because it's pretty, and not just because it helps remind us that math [09:56] actually makes sense and isn't just a pile of formulas to memorize, [10:00] but because it flexes that very important muscle of thinking about derivatives in [10:04] terms of tiny nudges. [10:07] As another example, think of the function f of x equals 1 divided by x. [10:12] Now on the one hand you could just blindly try applying the power rule, [10:16] since 1 divided by x is the same as writing x to the negative 1. [10:21] That would involve letting the negative 1 hop down in front, [10:24] leaving behind 1 less than itself, which is negative 2. [10:28] But let's have some fun and see if we can reason about this geometrically, [10:31] rather than just plugging it through some formula. [10:34] The value 1 over x is asking what number multiplied by x equals 1. [10:40] So here's how I'd like to visualize it. [10:42] Imagine a little rectangular puddle of water sitting in two dimensions whose area is 1. [10:48] And let's say that its width is x, which means that the height has to be 1 over x, [10:53] since the total area of it is 1. [10:56] So if x was stretched out to 2, then that height is forced down to 1 half.
[11:01] And if you increased x up to 3, then the other side has to be squished down to 1 third. [11:07] This is a nice way to think about the graph of 1 over x, by the way. [11:11] If you think of this width x of the puddle as being in the xy-plane, [11:15] then that corresponding output 1 divided by x, the height of the graph above that point, [11:20] is whatever the height of your puddle has to be to maintain an area of 1. [11:26] So with this visual in mind, for the derivative, [11:29] imagine nudging up that value of x by some tiny amount, some tiny dx. [11:34] How must the height of this rectangle change so [11:37] that the area of the puddle remains constant at 1? [11:41] That is, increasing the width by dx adds some new area to the right here. [11:46] So the puddle has to decrease in height by some d of 1 over x, [11:50] so that the area lost off of that top cancels out the area gained. [11:56] You should think of that d of 1 over x as being a negative amount, [11:59] by the way, since it's decreasing the height of the rectangle. [12:03] And you know what? [12:04] I'm going to leave the last few steps here for you, [12:07] for you to pause and ponder and work out an ultimate expression. [12:10] And once you reason out what d of 1 over x divided by dx should be, [12:14] I want you to compare it to what you would have gotten if you had just [12:17] blindly applied the power rule, purely symbolically, to x to the negative 1. [12:23] And while I'm encouraging you to pause and ponder, [12:26] here's another fun challenge if you're feeling up to it. [12:29] See if you can reason through what the derivative of the square root of x should be. [12:36] To finish things off, I want to tackle one more type of function, [12:40] trigonometric functions, and in particular let's focus on the sine function.
[12:45] So for this section I'm going to assume that you're already [12:48] familiar with how to think about trig functions using the unit circle, [12:51] the circle with a radius 1 centered at the origin. [12:55] For a given value of theta, like say 0.8, you imagine yourself [12:59] walking around the circle starting from the rightmost point [13:02] until you've traversed that distance of 0.8 in arc length. [13:06] This is the same thing as saying that the angle right here is exactly theta radians, [13:11] since the circle has a radius of 1. [13:14] Then what sine of theta means is the height of that point above the x-axis, [13:20] and as your theta value increases and you walk around the circle [13:24] your height bobs up and down between negative 1 and 1. [13:29] So when you graph sine of theta versus theta you get this wave pattern, [13:33] the quintessential wave pattern. [13:37] And just from looking at this graph we can start to [13:40] get a feel for the shape of the derivative of the sine. [13:44] The slope at 0 is something positive since sine of theta is increasing there, [13:48] and as we move to the right and sine of theta approaches its peak that slope goes down [13:54] to 0. [13:55] Then the slope is negative for a little while, [13:58] while the sine is decreasing before coming back up to 0 as the sine graph levels out. [14:04] And as you continue thinking this through and drawing it out, [14:07] if you're familiar with the graph of trig functions you might guess that this [14:11] derivative graph should be exactly cosine of theta, [14:13] since all the peaks and valleys line up perfectly with where the peaks and [14:17] valleys for the cosine function should be. [14:20] And spoiler alert, the derivative is in fact the cosine of theta, [14:23] but aren't you a little curious about why it's precisely cosine of theta? 
[14:28] I mean you could have all sorts of functions with peaks and valleys at the same points [14:32] that have roughly the same shape, but who knows, [14:34] maybe the derivative of sine could have turned out to be some entirely new type of [14:38] function that just happens to have a similar shape. [14:41] Well just like the previous examples, a more exact understanding [14:44] of the derivative requires looking at what the function actually represents, [14:48] rather than looking at the graph of the function. [14:52] So think back to that walk around the unit circle, [14:54] having traversed an arc with length theta and thinking about sine of theta as [14:58] the height of that point. [15:01] Now zoom into that point on the circle and consider a slight nudge of d theta [15:06] along the circumference, a tiny step in your walk around the unit circle. [15:11] How much does that tiny step change the sine of theta? [15:15] How much does this increase, d theta, of arc length increase the height above the x-axis? [15:21] Well zoomed in close enough, the circle basically looks like a straight line in this [15:26] neighborhood, so let's go ahead and think of this right triangle where the hypotenuse [15:30] of that right triangle represents the nudge d theta along the circumference, [15:34] and that left side here represents the change in height, the resulting d sine of theta. [15:40] Now this tiny triangle is actually similar to this larger triangle here, [15:44] with the defining angle theta and whose hypotenuse is the radius of the circle with [15:48] length 1. [15:50] Specifically this little angle right here is precisely equal to theta radians. [15:57] Now think about what the derivative of sine is supposed to mean. [16:01] It's the ratio between that d sine of theta, the tiny change to the height, [16:05] divided by d theta, the tiny change to the input of the function.
[16:10] And from the picture we can see that that's the ratio between the [16:14] length of the side adjacent to the angle theta divided by the hypotenuse. [16:18] Well let's see, adjacent divided by hypotenuse, [16:21] that's exactly what the cosine of theta means, that's the definition of the cosine. [16:27] So this gives us two different really nice ways of [16:30] thinking about how the derivative of sine is cosine. [16:33] One of them is looking at the graph and getting a loose feel for the shape of [16:36] things based on thinking about the slope of the sine graph at every single point. [16:41] And the other is a more precise line of reasoning looking at the unit circle itself. [16:47] For those of you that like to pause and ponder, [16:49] see if you can try a similar line of reasoning to find what the derivative of [16:52] the cosine of theta should be. [16:56] In the next video I'll talk about how you can take derivatives [16:59] of functions that combine simple functions like these ones, [17:02] either as sums or products or function compositions, things like that. [17:06] And similar to this video the goal is going to be to understand each one [17:09] geometrically in a way that makes it intuitively reasonable and somewhat more memorable.
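The unit circle argument for the derivative of sine can also be checked numerically. Below is a minimal Python sketch, where sine_nudge_ratio is a hypothetical helper name chosen for illustration. It takes a small step d theta along the circle, measures the resulting change in height, and compares the ratio to the cosine of theta. It starts from theta equals 0.8, the example value used earlier.

```python
import math


def sine_nudge_ratio(theta, dtheta):
    """Change in height sin(theta) per unit of arc length, for a finite step dtheta."""
    return (math.sin(theta + dtheta) - math.sin(theta)) / dtheta


theta = 0.8
for dtheta in (0.1, 0.01, 0.001, 0.0001):
    ratio = sine_nudge_ratio(theta, dtheta)
    # As dtheta shrinks, the arc looks more like a straight hypotenuse,
    # so the ratio should approach adjacent over hypotenuse, cos(theta).
    print(f"dtheta={dtheta}: ratio={ratio:.6f} cos(theta)={math.cos(theta):.6f}")
```

The ratio only matches the cosine approximately for any finite step. The agreement improves as d theta shrinks, which mirrors the "zoomed in close enough" step in the geometric argument.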