[00:07] The months ahead of you hold within them a lot of hard work, some neat examples, [00:11] some not-so-neat examples, beautiful connections to physics, [00:14] not-so-beautiful piles of formulas to memorize, [00:17] plenty of moments of getting stuck and banging your head into a wall, [00:20] a few nice aha moments sprinkled in as well, and some genuinely lovely [00:24] graphical intuition to help guide you through it all. [00:27] But if the course ahead of you is anything like my first introduction to calculus, [00:31] or any of the first courses I've seen in the years since, [00:34] there's one topic you will not see, but which I believe stands to greatly accelerate [00:38] your learning. [00:40] You see, almost all of the visual intuitions from that first year are based on graphs. [00:45] The derivative is the slope of a graph, the integral is a certain area under that graph. [00:50] But as you generalize calculus beyond functions whose inputs and outputs are [00:54] simply numbers, it's not always possible to graph the function you're analyzing. [01:00] So if all your intuitions for the fundamental ideas, like derivatives, [01:04] are rooted too rigidly in graphs, it can make for a very tall and largely [01:08] unnecessary conceptual hurdle between you and the more quote-unquote advanced [01:13] topics like multivariable calculus, complex analysis, and differential geometry. [01:18] What I want to share with you is a way to think about derivatives, [01:22] which I'll refer to as the transformational view, [01:24] that generalizes more seamlessly into some of those more general contexts [01:28] where calculus comes up. [01:29] And then we'll use this alternate view to analyze a fun puzzle about repeated fractions. [01:35] But first off, I just want to make sure we're all [01:37] on the same page about what the standard visual is.
[01:40] If you were to graph a function, which simply takes real numbers as inputs and outputs, [01:44] one of the first things you learn in a calculus course is that the derivative gives [01:49] you the slope of this graph, where what we mean by that is that the derivative of [01:54] the function is a new function which for every input x returns that slope. [01:59] Now I'd encourage you not to think of this derivative [02:01] as slope idea as being the definition of a derivative. [02:05] Instead think of it as being more fundamentally about how [02:07] sensitive the function is to tiny little nudges around the input. [02:11] And the slope is just one way to think about that sensitivity [02:14] relevant only to this particular way of viewing functions. [02:17] I have not just another video, but a full series on this [02:19] topic if it's something you want to learn more about. [02:22] The basic idea behind the alternate visual for the derivative is to [02:26] think of this function as mapping all of the input points on the [02:29] number line to their corresponding outputs on a different number line. [02:33] In this context, what the derivative gives you is a measure of how [02:36] much the input space gets stretched or squished in various regions. [02:41] That is, if you were to zoom in around a specific input and take a look at some [02:46] evenly spaced points around it, the derivative of the function at that input is [02:51] going to tell you how spread out or contracted those points become after the mapping. [02:57] Here, a specific example helps. [02:59] Take the function x squared, it maps 1 to 1, 2 to 4, 3 to 9, and so on. [03:06] You can also see how it acts on all of the points in between. [03:12] If you were to zoom in on a little cluster of points around the input 1, [03:16] and see where they land around the relevant output, [03:19] which for this function also happens to be 1, you'd notice that they tend [03:23] to get stretched out.
[03:25] In fact, it roughly looks like stretching out by a factor of 2. [03:29] The closer you zoom in, the more this local behavior looks just like multiplying by a [03:35] factor of 2. This is what it means for the derivative of x squared at the input x equals 1 to be [03:41] 2. [03:42] It's what that fact looks like in the context of transformations. [03:46] If you looked at a neighborhood of points around the input 3, [03:49] they would get stretched out by a factor of 6. [03:52] This is what it means for the derivative of this function at the input 3 to equal 6. [03:58] Around the input 1 fourth, a small region tends to get contracted specifically by a [04:03] factor of 1 half, and that's what it looks like for a derivative to be smaller than 1. [04:10] The input 0 is interesting. [04:13] Zooming in by a factor of 10, it doesn't really [04:15] look like a constant stretching or squishing. [04:18] For one thing, all of the outputs end up on the right positive side of things. [04:23] As you zoom in closer and closer, by 100x, or by 1000x, [04:27] it looks more and more like a small neighborhood of points around 0 just [04:33] gets collapsed into 0 itself. This is what it looks like for the derivative to be 0. [04:40] The local behavior looks more and more like multiplying the whole number line by 0. [04:45] It doesn't have to completely collapse everything to a point at a particular zoom level, [04:49] instead it's a matter of what the limiting behavior is as you zoom in closer and closer. [04:55] It's also instructive to take a look at the negative inputs here. [05:00] Things start to feel a little cramped since they collide with where all the positive [05:04] input values go, and this is one of the downsides of thinking of functions as [05:08] transformations. [05:09] But for derivatives, we only really care about the local behavior anyway, [05:13] what happens in a small range around a given input.
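If you'd like to check these stretch factors numerically, here is a minimal Python sketch. It is not from the video: the helper name `local_stretch` and the interval size `half_width` are assumptions made for illustration. It takes a tiny interval centred on an input and compares how wide that interval is after the mapping with how wide it was before.

```python
def square(x):
    """The example function from the transcript: x squared."""
    return x * x


def local_stretch(f, x, half_width=1e-6):
    """Signed ratio of output spread to input spread for a tiny interval around x.

    half_width is an arbitrary illustrative choice; smaller values mimic
    zooming in further on the input.
    """
    left, right = x - half_width, x + half_width
    return (f(right) - f(left)) / (right - left)


# Inputs discussed above: 1, 3, one fourth, and 0.
for x in (1.0, 3.0, 0.25, 0.0):
    print(f"input {x}: local stretch factor is about {local_stretch(square, x):.6f}")
```

A factor above 1 means nearby points spread out, a factor between 0 and 1 means they get contracted, and a factor of 0 corresponds to the collapsing behavior seen around the input 0.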
[05:16] Here, notice that the inputs in a little neighborhood around, say, [05:20] negative 2, don't just get stretched out, they also get flipped around. [05:24] Specifically, the action on such a neighborhood looks more [05:28] and more like multiplying by negative 4 the closer you zoom in. [05:32] This is what it looks like for the derivative of a function to be negative. [05:38] And I think you get the point, this is all well and good, [05:40] but let's see how this is actually useful in solving a problem. [05:44] A friend of mine recently asked me a pretty fun question about the infinite [05:48] fraction 1 plus 1 divided by 1 plus 1 divided by 1 plus 1 divided by 1, [05:52] and clearly you watch math videos online, so maybe you've seen this before, [05:56] but my friend's question actually cuts to something you might not have [05:59] thought about before, relevant to the view of derivatives we're looking at here. [06:05] The typical way you might evaluate an expression like this is to set it equal to x, [06:09] and then notice that there is a copy of the full fraction inside itself. [06:14] So you can replace that copy with another x, and then just solve for x. [06:19] That is, what you want is to find a fixed point of the function 1 plus 1 divided by x. [06:27] But here's the thing, there are actually two solutions for x, [06:30] two special numbers where 1 plus 1 divided by that number gives you back the same thing. [06:36] One is the golden ratio, phi, around 1.618, and the other is negative 0.618, [06:42] which happens to be negative 1 divided by phi. [06:46] I like to call this other number phi's little brother, [06:49] since just about any property that phi has, this number also has. [06:53] And this raises the question, would it be valid to say that the infinite [06:58] fraction we saw is somehow also equal to phi's little brother, negative 0.618? 
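For reference, the two fixed points can be worked out explicitly. This short derivation is not spelled out in the video; it is just the algebra behind setting the expression equal to x and solving.

```latex
x = 1 + \frac{1}{x}
\;\Longrightarrow\; x^2 = x + 1
\;\Longrightarrow\; x^2 - x - 1 = 0
\;\Longrightarrow\; x = \frac{1 \pm \sqrt{5}}{2}

\varphi = \frac{1 + \sqrt{5}}{2} \approx 1.618,
\qquad
\frac{1 - \sqrt{5}}{2} = -\frac{1}{\varphi} \approx -0.618
```

The second identity holds because the product of the two roots of x^2 - x - 1 is -1.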
[07:04] Maybe you initially say, obviously not, everything on the left hand side is positive, [07:08] so how could it possibly equal a negative number? [07:12] Well, first we should be clear about what we actually mean by an expression like this. [07:17] One way you could think about it, and it's not the only way, [07:21] there's freedom for choice here, is to imagine starting with some constant, like 1, [07:26] and then repeatedly applying the function 1 plus 1 divided by x, and then asking, [07:30] what does this approach as you keep going? [07:36] I mean, certainly symbolically what you get looks more and more [07:38] like our infinite fraction, so maybe if you wanted it to equal a number, [07:41] you should ask what this series of numbers approaches. [07:45] And if that's your view of things, maybe you start off with a negative number, [07:48] so it's not so crazy for the whole expression to end up negative. [07:52] After all, if you start with negative 1 divided by phi, [07:55] then applying this function 1 plus 1 over x, you get back the same number, [07:59] negative 1 divided by phi, so no matter how many times you apply it, [08:03] you're staying fixed at this value. [08:07] But even then, there is one reason you should [08:10] view phi as the favorite brother in this pair. [08:14] Here, try this, pull up a calculator of some kind, then start with any random number, [08:19] and plug it into this function, 1 plus 1 divided by x, [08:22] and plug that number into 1 plus 1 over x, and again, and again, and again, and again. [08:28] No matter what constant you start with, you eventually end up at 1.618. [08:33] Even if you start with a negative number, even one that's really close to phi's [08:38] little brother, eventually it shies away from that value and jumps back over to phi. [08:50] So, what's going on here? [08:52] Why is one of these fixed points favored above the other one?
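The calculator experiment is easy to reproduce in a few lines of Python. This is only a sketch under assumptions: the helper name `iterate`, the step count of 100, and the particular seeds are illustrative choices, not something from the video. Seeds such as 0 or negative 1 are deliberately left out, since they run into a division by zero after a step or two.

```python
def f(x):
    """One step of the process: 1 plus 1 divided by x."""
    return 1 + 1 / x


def iterate(seed, steps=100):
    """Apply f to the seed `steps` times and return the final value."""
    x = seed
    for _ in range(steps):
        x = f(x)
    return x


# A few arbitrary seeds, including a negative one close to phi's little brother.
for seed in (1.0, 7.3, -5.0, -0.618):
    print(f"seed {seed}: after 100 steps we are at {iterate(seed)}")
```

Note that the seed negative 0.618 is only a rounded approximation of phi's little brother, which is exactly why it drifts away instead of staying put.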
[08:56] Maybe you can already see how the transformational understanding of derivatives [09:00] is helpful for understanding this setup, but for the sake of having a point of contrast, [09:03] I want to show you how a problem like this is often taught using graphs. [09:07] If you were to plug in some random input to this function, [09:11] the y value tells you the corresponding output, right? [09:14] So to think about plugging that output back into the function, [09:17] you might first move horizontally until you hit the line y equals x, [09:22] and that's going to give you a position where the x value corresponds to your [09:26] previous y value, right? [09:28] So then from there, you can move vertically to see what output this new x value has, [09:34] and then you repeat. [09:36] You move horizontally to the line y equals x to find a point whose x value is the same [09:40] as the output you just got, and then you move vertically to apply the function again. [09:45] Now personally, I think this is kind of an awkward way [09:48] to think about repeatedly applying a function, don't you? [09:51] I mean, it makes sense, but you kind of have to pause [09:53] and think about it to remember which way to draw the lines. [09:57] And you can, if you want, think through what conditions make this spiderweb [10:01] process narrow in on a fixed point, versus propagating away from it. [10:05] In fact, go ahead, pause right now, and try to think it through as an exercise. [10:09] It has to do with slopes. [10:12] Or if you want to skip the exercise for something that I think gives a much more [10:15] satisfying understanding, think about how this function acts as a transformation. [10:22] So I'm going to go ahead and start here by drawing a bunch of [10:24] arrows to indicate where the various sampled input points will go. [10:28] And side note, don't you think this gives a neat emergent pattern? [10:31] I wasn't expecting this, but it was cool to see it pop up when animating. 
[10:35] I guess the action of 1 divided by x gives this nice emergent circle, [10:38] and then we're just shifting things over by 1. [10:42] Anyway, I want you to think about what it means to repeatedly apply some function, [10:46] like 1 plus 1 over x, in this context. [10:50] Well after letting it map all of the inputs to the outputs, [10:53] you could consider those as the new inputs, and then just apply the same process again, [10:58] and then again, and do it however many times you want. [11:02] Notice, in animating this with a few dots representing the sample points, [11:06] it doesn't take many iterations at all before all of those dots kind of clump in around 1.618. [11:14] Now remember, we know that 1.618 and its little brother, [11:18] negative 0.618 on and on, stay fixed in place during each iteration of this process. [11:24] But zoom in on a neighborhood around phi. [11:27] During the map, points in that region get contracted around phi, [11:32] meaning that the function 1 plus 1 over x has a derivative with a magnitude less [11:39] than 1 at this input. [11:41] In fact, this derivative works out to be around negative 0.38. [11:46] So what that means is that each repeated application scrunches the neighborhood [11:50] around this number smaller and smaller, like a gravitational pull towards phi. [11:54] So now tell me what you think happens in the neighborhood of phi's little brother. [12:01] Over there, the derivative actually has a magnitude larger than 1, [12:05] so points near the fixed point are repelled away from it. [12:09] And when you work it out, you can see that they get [12:11] stretched by more than a factor of 2 in each iteration. [12:14] They also get flipped around, because the derivative is negative here, [12:17] but the salient fact for the sake of stability is just the magnitude. [12:23] Mathematicians would call this right value a stable fixed point, [12:26] and the left one is an unstable fixed point.
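To see where the numbers negative 0.38 and "more than a factor of 2" come from, here is a small Python sketch. It assumes the standard derivative of 1 plus 1 over x, which is negative 1 over x squared; the names `f_prime`, `little_brother`, and `classify` are introduced here purely for illustration.

```python
import math

phi = (1 + math.sqrt(5)) / 2             # about 1.618
little_brother = (1 - math.sqrt(5)) / 2  # about -0.618, equal to -1/phi


def f_prime(x):
    """Derivative of 1 + 1/x, namely -1/x^2."""
    return -1 / x**2


def classify(fixed_point):
    """Label a fixed point by the magnitude of the derivative there."""
    return "stable" if abs(f_prime(fixed_point)) < 1 else "unstable"


for name, point in (("phi", phi), ("little brother", little_brother)):
    print(f"{name}: derivative is about {f_prime(point):.3f}, so it is {classify(point)}")
```

The sign of each derivative only tells you that neighborhoods get flipped; the stable versus unstable label depends on the magnitude alone.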
[12:30] Something is considered stable if when you perturb it just a little bit, [12:33] it tends to come back towards where it started, rather than going away from it. [12:38] So what we're seeing is a very useful little fact, [12:40] that the stability of a fixed point is determined by whether or not the magnitude of its [12:45] derivative is bigger or smaller than 1. [12:47] This explains why phi always shows up in the numerical play, [12:50] where you're just hitting enter on your calculator over and over, [12:53] but phi's little brother never does. [12:56] As to whether or not you want to consider phi's little brother a [12:59] valid value of the infinite fraction, well that's really up to you. [13:03] Everything we just showed suggests that if you think of this expression [13:06] as representing a limiting process, then because every possible seed [13:10] value other than phi's little brother gives you a series converging to phi, [13:14] it does feel silly to put them on equal footing with each other. [13:18] But maybe you don't think of it as a limit, maybe the kind of math [13:21] you're doing lends itself to treating this as a purely algebraic object, [13:25] like the solutions of a polynomial, which simply has multiple values. [13:30] Anyway, that's beside the point, and my point here is not that viewing derivatives [13:34] as this change in density is somehow better than the graphical intuition on the whole. [13:39] In fact, picturing an entire function this way can be [13:42] kind of clunky and impractical as compared to graphs. [13:45] My point is that it deserves more of a mention in most of the [13:48] introductory calculus courses, because it can help make a [13:50] student's understanding of the derivative a little more flexible. 
[13:54] Like I mentioned, the real reason I'd recommend you carry this perspective [13:58] with you as you learn new topics is not so much for what it does with your [14:01] understanding of single variable calculus, it's for what comes after.