[00:12] Now that we've seen what a derivative means and what it has to do with rates of change,
[00:16] our next step is to learn how to actually compute these guys.
[00:19] As in, if I give you some kind of function with an explicit formula,
[00:22] you'd want to be able to find what the formula for its derivative is.
[00:26] Maybe it's obvious, but I think it's worth stating explicitly why this
[00:30] is an important thing to be able to do, why much of a calculus student's
[00:34] time ends up going towards grappling with derivatives of abstract
[00:37] functions rather than thinking about concrete rate of change problems.
[00:42] It's because a lot of real-world phenomena, the sort of things that
[00:45] we want to use calculus to analyze, are modeled using polynomials,
[00:49] trigonometric functions, exponentials, and other pure functions like that.
[00:53] So if you build up some fluency with the ideas of rates of change for those kinds of
[00:58] pure abstract functions, it gives you a language to more readily talk about the rates
[01:02] at which things change in concrete situations that you might be using calculus to model.
[01:07] But it is way too easy for this process to feel like just memorizing a list of rules,
[01:12] and if that happens, if you get that feeling, it's also easy to lose sight of the
[01:16] fact that derivatives are fundamentally about just looking at tiny changes to some
[01:20] quantity and how that relates to a resulting tiny change in another quantity.
[01:24] So in this video and in the next one, my aim is to show you how you can think
[01:28] about a few of these rules intuitively and geometrically,
[01:31] and I really want to encourage you to never forget that tiny nudges are at the
[01:35] heart of derivatives.
[01:37] Let's start with a simple function like f of x equals x squared.
[01:41] What if I asked you its derivative?
[01:43] That is, if you were to look at some value x, like x equals 2,
[01:47] and compare it to a value slightly bigger, just dx bigger,
[01:50] what's the corresponding change in the value of the function?
[01:54] dF.
[01:55] And in particular, what's dF divided by dx, the rate
[01:58] at which this function is changing per unit change in x.
[02:03] As a first step for intuition, we know that you can think of this ratio
[02:07] dF dx as the slope of a tangent line to the graph of x squared,
[02:10] and from that you can see that the slope generally increases as x increases.
[02:15] At zero, the tangent line is flat, and the slope is zero.
[02:19] At x equals 1, it's something a bit steeper.
[02:22] At x equals 2, it's steeper still.
[02:25] But looking at graphs isn't generally the best way
[02:27] to understand the precise formula for a derivative.
[02:30] For that, it's best to take a more literal look at what x squared actually means,
[02:34] and in this case let's go ahead and picture a square whose side length is x.
[02:39] If you increase x by some tiny nudge, some little dx,
[02:43] what's the resulting change in the area of that square?
[02:47] That slight change in area is what dF means in this context.
[02:52] It's the tiny increase to the value of f of x equals x squared,
[02:55] caused by increasing x by that tiny nudge dx.
[02:59] Now you can see that there's three new bits of area in this diagram,
[03:03] two thin rectangles and a minuscule square.
[03:06] The two thin rectangles each have side lengths of x and dx,
[03:10] so they account for 2 times x times dx units of new area.
[03:18] For example, let's say x was 3 and dx was 0.01,
[03:21] then that new area from these two thin rectangles would be 2 times 3 times 0.01,
[03:25] which is 0.06, exactly 6 times the size of dx.
[03:29] That little square there has an area of dx squared,
[03:32] but you should think of that as being really tiny, negligibly tiny.
[03:37] For example, if dx was 0.01, that would be only 0.0001,
[03:41] and keep in mind I'm drawing dx with a fair bit of width here just so we
[03:46] can actually see it, but always remember in principle,
[03:49] dx should be thought of as a truly tiny amount, and for those truly tiny amounts,
[03:54] a good rule of thumb is that you can ignore anything that includes a dx
[03:59] raised to a power greater than 1.
[04:02] That is, a tiny change squared is a negligible change.
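To make that bookkeeping concrete, here is a small Python sketch of the three new bits of area. The function name `new_area_pieces` is made up for this illustration. It just splits the change in area into the part from the two thin rectangles and the part from the corner square, then shows how small the corner's share gets as dx shrinks.

```python
from fractions import Fraction

def new_area_pieces(x, dx):
    """Split the area gained when a square's side grows from x to x + dx."""
    rectangles = 2 * x * dx          # two thin rectangles, each x by dx
    corner = dx * dx                 # the minuscule square, dx by dx
    total = (x + dx) ** 2 - x ** 2   # all of the new area
    return rectangles, corner, total

# Exact fractions avoid floating-point noise in such small numbers.
x = Fraction(3)
for dx in (Fraction(1, 100), Fraction(1, 1000), Fraction(1, 10000)):
    rectangles, corner, total = new_area_pieces(x, dx)
    share = corner / total           # fraction of the new area in the corner square
    print(f"dx={float(dx):<8} rectangles={float(rectangles):<8} "
          f"corner={float(corner):<10} corner share={float(share):.6f}")
```

The corner's share of the new area keeps dropping as dx gets smaller, which is the sense in which a tiny change squared is negligible.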
[04:07] What this leaves us with is that dF is just some multiple of dx, and that multiple 2x,
[04:13] which you could also write as dF divided by dx, is the derivative of x squared.
[04:19] For example, if you were starting at x equals 3, then as you slightly increase x,
[04:24] the rate of change in the area per unit change in length added, d of x squared over dx,
[04:29] would be 2 times 3, or 6, and if instead you were starting at x equals 5,
[04:34] then the rate of change would be 10 units of area per unit change in x.
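If you want to see those two numbers come out of an actual nudge, here is a minimal sketch. The helper names `area` and `rate_of_change` are just for illustration, and the nudge sizes are arbitrary small values chosen for this example.

```python
def area(x):
    """Area of a square with side length x, i.e. f(x) = x squared."""
    return x * x

def rate_of_change(f, x, dx):
    """Change in f per unit change in x, for a small but finite nudge dx."""
    return (f(x + dx) - f(x)) / dx

for x in (3.0, 5.0):
    for dx in (0.1, 0.001, 0.00001):
        print(f"x={x}, dx={dx}: df/dx is about {rate_of_change(area, x, dx):.5f}")
```

As dx shrinks, the printed ratio settles toward 2 times x, with the leftover coming from that negligible corner square.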
[04:41] Let's go ahead and try a different simple function, f of x equals x cubed.
[04:45] This is going to be the geometric view of the stuff
[04:48] that I went through algebraically in the last video.
[04:51] What's nice here is that we can think of x cubed as the volume of an actual
[04:55] cube whose side lengths are x, and when you increase x by a tiny nudge,
[05:00] a tiny dx, the resulting increase in volume is what I have here in yellow.
[05:04] That represents all the volume in a cube with side lengths x plus dx
[05:08] that's not already in the original cube, the one with side length x.
[05:13] It's nice to think of this new volume as broken up into multiple components,
[05:18] but almost all of it comes from these three square faces,
[05:22] or said a little more precisely, as dx approaches 0,
[05:25] those three squares comprise a portion closer and closer to 100% of
[05:30] that new yellow volume.
[05:33] Each of those thin squares has a volume of x squared times dx,
[05:38] the area of the face times that little thickness dx.
[05:42] So in total this gives us 3x squared dx of volume change.
[05:47] And to be sure there are other slivers of volume here along the edges
[05:51] and that tiny one in the corner, but all of that volume is going to be
[05:54] proportional to dx squared, or dx cubed, so we can safely ignore them.
[05:59] Again this is ultimately because they're going to be divided by dx,
[06:03] and if there's still any dx remaining then those terms aren't
[06:07] going to survive the process of letting dx approach 0.
[06:11] What this means is that the derivative of x cubed,
[06:14] the rate at which x cubed changes per unit change of x, is 3 times x squared.
[06:20] What that means in terms of graphical intuition is that the slope of
[06:25] the graph of x cubed at every single point x is exactly 3x squared.
[06:34] And reasoning about that slope, it should make sense that this derivative is high on the
[06:38] left and then 0 at the origin and then high again as you move to the right,
[06:42] but just thinking in terms of the graph would never have landed us on the precise
[06:47] quantity 3x squared.
[06:48] For that we had to take a much more direct look at what x cubed actually means.
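Here is a rough numerical version of that cube picture, assuming the new volume splits into three thin slabs on the faces, three slivers along the edges, and one tiny cube in the corner. The function name `new_volume_pieces` is invented for this sketch.

```python
from fractions import Fraction

def new_volume_pieces(x, dx):
    """Split the volume gained when a cube's side grows from x to x + dx."""
    faces = 3 * x**2 * dx     # three thin slabs, each x by x by dx
    edges = 3 * x * dx**2     # three slivers along the edges, each x by dx by dx
    corner = dx**3            # one tiny cube in the corner
    return faces, edges, corner

x = Fraction(2)
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    faces, edges, corner = new_volume_pieces(x, dx)
    total = (x + dx)**3 - x**3
    assert faces + edges + corner == total   # the pieces account for all new volume
    print(f"dx={float(dx)}: faces make up {float(faces / total):.4%} of the new volume")
```

The share coming from the three faces creeps toward 100% as dx shrinks, which is why only the 3x squared dx term survives.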
[06:54] Now in practice you wouldn't necessarily think of the square every
[06:57] time you're taking the derivative of x squared,
[06:59] nor would you necessarily think of this cube whenever you're taking
[07:03] the derivative of x cubed.
[07:04] Both of them fall under a pretty recognizable pattern for polynomial terms.
[07:09] The derivative of x to the fourth turns out to be 4x cubed,
[07:13] the derivative of x to the fifth is 5x to the fourth, and so on.
[07:18] Abstractly you'd write this as the derivative of x to
[07:22] the n for any power n is n times x to the n minus 1.
[07:27] This right here is what's known in the business as the power rule.
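As a quick sanity check on that pattern, here is a sketch comparing the power rule against an actual small nudge for a few exponents. The names `power_rule` and `nudge_ratio`, the sample point x = 1.5, and the nudge size are all choices made for this example.

```python
def power_rule(n, x):
    """Derivative of x**n predicted by the power rule: n * x**(n - 1)."""
    return n * x ** (n - 1)

def nudge_ratio(n, x, dx=1e-6):
    """Change in x**n divided by a small nudge dx."""
    return ((x + dx) ** n - x ** n) / dx

x = 1.5
for n in range(2, 7):
    print(f"n={n}: nudge ratio {nudge_ratio(n, x):.4f}, power rule {power_rule(n, x):.4f}")
```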
[07:31] In practice we all quickly just get jaded and think about this symbolically as
[07:35] the exponent hopping down in front, leaving behind one less than itself,
[07:39] rarely pausing to think about the geometric delights that underlie these derivatives.
[07:45] That's the kind of thing that happens when these tend
[07:47] to fall in the middle of much longer computations.
[07:50] But rather than chalking it all up to symbolic patterns,
[07:53] let's just take a moment and think about why this works for powers beyond just 2 and 3.
[07:58] When you nudge that input x, increasing it slightly to x plus dx,
[08:02] working out the exact value of that nudged output would involve
[08:06] multiplying together these n separate x plus dx terms.
[08:11] The full expansion would be really complicated,
[08:13] but part of the point of derivatives is that most of that complication can be ignored.
[08:19] The first term in your expansion is x to the n.
[08:22] This is analogous to the area of the original square,
[08:25] or the volume of the original cube from our previous examples.
[08:30] For the next terms in the expansion you can choose mostly x's with a single dx.
[08:41] Since there are n different parentheticals from which you could have chosen
[08:46] that single dx, this gives us n separate terms,
[08:50] all of which include n minus 1 x's times a dx,
[08:53] giving a value of x to the power n minus 1 times dx.
[08:57] This is analogous to how the majority of the new area in the square came from those
[09:02] two bars, each with area x times dx, or how the bulk of the new volume in the cube
[09:07] came from those three thin squares, each of which had a volume of x squared times dx.
[09:14] There will be many other terms of this expansion,
[09:17] but all of them are just going to be some multiple of dx squared,
[09:21] so we can safely ignore them, and what that means is that all but a
[09:25] negligible portion of the increase in the output comes from n copies of
[09:29] this x to the n minus 1 times dx.
[09:31] That's what it means for the derivative of x to the n to be n times x to the n minus 1.
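Written out symbolically, and assuming n is a whole number of at least 2, the argument above looks roughly like this. The binomial coefficients on the higher terms are not spelled out in the discussion, and they don't matter for the conclusion, since every one of those terms carries at least dx squared.

```latex
\begin{align*}
(x + dx)^n &= x^n + n\,x^{n-1}\,dx + \binom{n}{2}\,x^{n-2}\,dx^2 + \cdots + dx^n \\
\frac{(x + dx)^n - x^n}{dx} &= n\,x^{n-1} + \binom{n}{2}\,x^{n-2}\,dx + \cdots + dx^{n-1} \\
\frac{d(x^n)}{dx} &= n\,x^{n-1} \quad \text{once } dx \text{ approaches } 0
\end{align*}
```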
[09:38] And even though, like I said, in practice you'll find yourself performing this
[09:43] derivative quickly and symbolically, imagining the exponent hopping down to the front,
[09:47] every now and then it's nice to just step back and remember why these rules work.
[09:52] Not just because it's pretty, and not just because it helps remind us that math
[09:56] actually makes sense and isn't just a pile of formulas to memorize,
[10:00] but because it flexes that very important muscle of thinking about derivatives in
[10:04] terms of tiny nudges.
[10:07] As another example, think of the function f of x equals 1 divided by x.
[10:12] Now on the one hand you could just blindly try applying the power rule,
[10:16] since 1 divided by x is the same as writing x to the negative 1.
[10:21] That would involve letting the negative 1 hop down in front,
[10:24] leaving behind 1 less than itself, which is negative 2.
[10:28] But let's have some fun and see if we can reason about this geometrically,
[10:31] rather than just plugging it through some formula.
[10:34] The value 1 over x is asking what number multiplied by x equals 1.
[10:40] So here's how I'd like to visualize it.
[10:42] Imagine a little rectangular puddle of water sitting in two dimensions whose area is 1.
[10:48] And let's say that its width is x, which means that the height has to be 1 over x,
[10:53] since the total area of it is 1.
[10:56] So if x was stretched out to 2, then that height is forced down to 1 half.
[11:01] And if you increased x up to 3, then the other side has to be squished down to 1 third.
[11:07] This is a nice way to think about the graph of 1 over x, by the way.
[11:11] If you think of this width x of the puddle as being in the xy-plane,
[11:15] then that corresponding output 1 divided by x, the height of the graph above that point,
[11:20] is whatever the height of your puddle has to be to maintain an area of 1.
[11:26] So with this visual in mind, for the derivative,
[11:29] imagine nudging up that value of x by some tiny amount, some tiny dx.
[11:34] How must the height of this rectangle change so
[11:37] that the area of the puddle remains constant at 1?
[11:41] That is, increasing the width by dx adds some new area to the right here.
[11:46] So the puddle has to decrease in height by some d of 1 over x,
[11:50] so that the area lost off of that top cancels out the area gained.
[11:56] You should think of that d of 1 over x as being a negative amount,
[11:59] by the way, since it's decreasing the height of the rectangle.
[12:03] And you know what?
[12:04] I'm going to leave the last few steps here for you,
[12:07] for you to pause and ponder and work out an ultimate expression.
[12:10] And once you reason out what d of 1 over x divided by dx should be,
[12:14] I want you to compare it to what you would have gotten if you had just
[12:17] blindly applied the power rule, purely symbolically, to x to the negative 1.
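Once you have your own expression, one way to check it is numerically. The sketch below nudges the width of the puddle and compares the resulting height change per unit dx against the blind power rule result described earlier, with the negative 1 in front and negative 2 left behind as the exponent. The names `puddle_height` and `power_rule_guess` are made up for this example.

```python
def puddle_height(x):
    """Height of a puddle of area 1 whose width is x."""
    return 1 / x

def power_rule_guess(x):
    """Power rule applied to x**(-1): the -1 hops down, leaving exponent -2."""
    return -1 * x ** (-2)

dx = 1e-6
for x in (0.5, 1.0, 2.0, 3.0):
    d_height = puddle_height(x + dx) - puddle_height(x)   # negative: the puddle gets shorter
    print(f"x={x}: d(1/x)/dx is about {d_height / dx:.5f}, "
          f"power rule gives {power_rule_guess(x):.5f}")
```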
[12:23] And while I'm encouraging you to pause and ponder,
[12:26] here's another fun challenge if you're feeling up to it.
[12:29] See if you can reason through what the derivative of the square root of x should be.
[12:36] To finish things off, I want to tackle one more type of function,
[12:40] trigonometric functions, and in particular let's focus on the sine function.
[12:45] So for this section I'm going to assume that you're already
[12:48] familiar with how to think about trig functions using the unit circle,
[12:51] the circle with radius 1 centered at the origin.
[12:55] For a given value of theta, like say 0.8, you imagine yourself
[12:59] walking around the circle starting from the rightmost point
[13:02] until you've traversed that distance of 0.8 in arc length.
[13:06] This is the same thing as saying that the angle right here is exactly theta radians,
[13:11] since the circle has a radius of 1.
[13:14] Then what sine of theta means is the height of that point above the x-axis,
[13:20] and as your theta value increases and you walk around the circle
[13:24] your height bobs up and down between negative 1 and 1.
[13:29] So when you graph sine of theta versus theta you get this wave pattern,
[13:33] the quintessential wave pattern.
[13:37] And just from looking at this graph we can start to
[13:40] get a feel for the shape of the derivative of the sine.
[13:44] The slope at 0 is something positive since sine of theta is increasing there,
[13:48] and as we move to the right and sine of theta approaches its peak that slope goes down
[13:54] to 0.
[13:55] Then the slope is negative for a little while,
[13:58] while the sine is decreasing before coming back up to 0 as the sine graph levels out.
[14:04] And as you continue thinking this through and drawing it out,
[14:07] if you're familiar with the graphs of trig functions you might guess that this
[14:11] derivative graph should be exactly cosine of theta,
[14:13] since all the peaks and valleys line up perfectly with where the peaks and
[14:17] valleys for the cosine function should be.
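To put numbers on that guess, here is a small sketch that samples the slope of the sine graph at a few points and prints the cosine next to it. The helper name `slope_of_sine` and the sample points are choices made for this illustration. It only suggests the match, it doesn't explain it.

```python
import math

def slope_of_sine(theta, d_theta=1e-6):
    """Slope of the sine graph at theta, estimated with a small nudge d_theta."""
    return (math.sin(theta + d_theta) - math.sin(theta)) / d_theta

for theta in (0.0, math.pi / 2, math.pi, 3 * math.pi / 2, 2 * math.pi):
    print(f"theta={theta:.3f}: slope of sine {slope_of_sine(theta):+.4f}, "
          f"cosine {math.cos(theta):+.4f}")
```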
[14:20] And spoiler alert, the derivative is in fact the cosine of theta,
[14:23] but aren't you a little curious about why it's precisely cosine of theta?
[14:28] I mean you could have all sorts of functions with peaks and valleys at the same points
[14:32] that have roughly the same shape, but who knows,
[14:34] maybe the derivative of sine could have turned out to be some entirely new type of
[14:38] function that just happens to have a similar shape.
[14:41] Well just like the previous examples, a more exact understanding
[14:44] of the derivative requires looking at what the function actually represents,
[14:48] rather than looking at the graph of the function.
[14:52] So think back to that walk around the unit circle,
[14:54] having traversed an arc with length theta and thinking about sine of theta as
[14:58] the height of that point.
[15:01] Now zoom into that point on the circle and consider a slight nudge of d theta
[15:06] along the circumference, a tiny step in your walk around the unit circle.
[15:11] How much does that tiny step change the sine of theta?
[15:15] How much does this increase of d theta in arc length increase the height above the x-axis?
[15:21] Well zoomed in close enough, the circle basically looks like a straight line in this
[15:26] neighborhood, so let's go ahead and think of this right triangle where the hypotenuse
[15:30] of that right triangle represents the nudge d theta along the circumference,
[15:34] and that left side here represents the change in height, the resulting d sine of theta.
[15:40] Now this tiny triangle is actually similar to this larger triangle here,
[15:44] with the defining angle theta and whose hypotenuse is the radius of the circle with
[15:48] length 1.
[15:50] Specifically this little angle right here is precisely equal to theta radians.
[15:57] Now think about what the derivative of sine is supposed to mean.
[16:01] It's the ratio between that d sine of theta, the tiny change to the height,
[16:05] divided by d theta, the tiny change to the input of the function.
[16:10] And from the picture we can see that that's the ratio between the
[16:14] length of the side adjacent to the angle theta divided by the hypotenuse.
[16:18] Well let's see, adjacent divided by hypotenuse,
[16:21] that's exactly what the cosine of theta means, that's the definition of the cosine.
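Here is a rough sketch of that zoomed-in triangle in code, assuming we treat the tiny step between two nearby points on the circle as a straight line. The function names are invented for this example, and theta = 0.8 is just the sample value from earlier.

```python
import math

def point_on_circle(theta):
    """Point reached after walking arc length theta from the rightmost point."""
    return math.cos(theta), math.sin(theta)

def height_change_per_step(theta, d_theta=1e-6):
    """Ratio of the tiny triangle's vertical side to its hypotenuse."""
    x0, y0 = point_on_circle(theta)
    x1, y1 = point_on_circle(theta + d_theta)
    step_length = math.hypot(x1 - x0, y1 - y0)   # hypotenuse: the straightened nudge
    d_height = y1 - y0                           # vertical side: the change in sine
    return d_height / step_length, step_length

theta = 0.8
ratio, step_length = height_change_per_step(theta)
print(f"straight-line step length: {step_length:.9f}  (the nudge d_theta was 1e-6)")
print(f"height change / step:      {ratio:.6f}")
print(f"cos(theta):                {math.cos(theta):.6f}")
```

The straight-line step is nearly the same length as the arc d theta, which is the sense in which the circle looks like a line up close, and the ratio lands very close to the cosine.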
[16:27] So this gives us two different really nice ways of
[16:30] thinking about how the derivative of sine is cosine.
[16:33] One of them is looking at the graph and getting a loose feel for the shape of
[16:36] things based on thinking about the slope of the sine graph at every single point.
[16:41] And the other is a more precise line of reasoning looking at the unit circle itself.
[16:47] For those of you that like to pause and ponder,
[16:49] see if you can try a similar line of reasoning to find what the derivative of
[16:52] the cosine of theta should be.
[16:56] In the next video I'll talk about how you can take derivatives
[16:59] of functions that combine simple functions like these ones,
[17:02] either as sums or products or function compositions, things like that.
[17:06] And similar to this video the goal is going to be to understand each one
[17:09] geometrically in a way that makes it intuitively reasonable and somewhat more memorable.