1 00:00:07,740 --> 00:00:11,853 The months ahead of you hold within them a lot of hard work, some neat examples, 2 00:00:11,852 --> 00:00:14,949 some not-so-neat examples, beautiful connections to physics, 3 00:00:14,949 --> 00:00:17,387 not-so-beautiful piles of formulas to memorize, 4 00:00:17,388 --> 00:00:20,943 plenty of moments of getting stuck and banging your head into a wall, 5 00:00:20,943 --> 00:00:24,548 a few nice aha moments sprinkled in as well, and some genuinely lovely 6 00:00:24,547 --> 00:00:27,239 graphical intuition to help guide you through it all. 7 00:00:27,629 --> 00:00:31,699 But if the course ahead of you is anything like my first introduction to calculus, 8 00:00:31,699 --> 00:00:34,543 or any of the first courses I've seen in the years since, 9 00:00:34,543 --> 00:00:38,712 there's one topic you will not see, but which I believe stands to greatly accelerate 10 00:00:38,713 --> 00:00:39,399 your learning. 11 00:00:40,280 --> 00:00:44,620 You see, almost all of the visual intuitions from that first year are based on graphs. 12 00:00:45,079 --> 00:00:49,659 The derivative is the slope of a graph, the integral is a certain area under that graph. 13 00:00:50,200 --> 00:00:54,035 But as you generalize calculus beyond functions whose inputs and outputs are 14 00:00:54,034 --> 00:00:58,019 simply numbers, it's not always possible to graph the function you're analyzing. 15 00:01:00,679 --> 00:01:04,653 So if all your intuitions for the fundamental ideas, like derivatives, 16 00:01:04,653 --> 00:01:08,794 are rooted too rigidly in graphs, it can make for a very tall and largely 17 00:01:08,793 --> 00:01:13,158 unnecessary conceptual hurdle between you and the more quote-unquote advanced 18 00:01:13,159 --> 00:01:17,580 topics like multivariable calculus and complex analysis, differential geometry. 
19 00:01:18,840 --> 00:01:22,204 What I want to share with you is a way to think about derivatives, 20 00:01:22,204 --> 00:01:24,716 which I'll refer to as the transformational view, 21 00:01:24,716 --> 00:01:28,433 that generalizes more seamlessly into some of those more general contexts 22 00:01:28,433 --> 00:01:29,640 where calculus comes up. 23 00:01:29,879 --> 00:01:34,859 And then we'll use this alternate view to analyze a fun puzzle about repeated fractions. 24 00:01:35,459 --> 00:01:37,538 But first off, I just want to make sure we're all 25 00:01:37,539 --> 00:01:39,659 on the same page about what the standard visual is. 26 00:01:40,060 --> 00:01:44,936 If you were to graph a function, which simply takes real numbers as inputs and outputs, 27 00:01:44,936 --> 00:01:49,593 one of the first things you learn in a calculus course is that the derivative gives 28 00:01:49,593 --> 00:01:54,138 you the slope of this graph, where what we mean by that is that the derivative of 29 00:01:54,138 --> 00:01:58,240 the function is a new function which for every input x returns that slope. 30 00:01:59,519 --> 00:02:01,978 Now I'd encourage you not to think of this derivative 31 00:02:01,978 --> 00:02:04,439 as slope idea as being the definition of a derivative. 32 00:02:05,000 --> 00:02:07,555 Instead think of it as being more fundamentally about how 33 00:02:07,555 --> 00:02:10,419 sensitive the function is to tiny little nudges around the input. 34 00:02:11,020 --> 00:02:14,058 And the slope is just one way to think about that sensitivity 35 00:02:14,057 --> 00:02:16,900 relevant only to this particular way of viewing functions. 36 00:02:17,340 --> 00:02:19,816 I have not just another video, but a full series on this 37 00:02:19,816 --> 00:02:22,120 topic if it's something you want to learn more about. 
38 00:02:22,599 --> 00:02:26,022 The basic idea behind the alternate visual for the derivative is to 39 00:02:26,022 --> 00:02:29,294 think of this function as mapping all of the input points on the 40 00:02:29,294 --> 00:02:32,819 number line to their corresponding outputs on a different number line. 41 00:02:33,400 --> 00:02:36,810 In this context, what the derivative gives you is a measure of how 42 00:02:36,810 --> 00:02:40,219 much the input space gets stretched or squished in various regions. 43 00:02:41,860 --> 00:02:46,673 That is, if you were to zoom in around a specific input and take a look at some 44 00:02:46,673 --> 00:02:51,485 evenly spaced points around it, the derivative of the function at that input is 45 00:02:51,485 --> 00:02:56,599 going to tell you how spread out or contracted those points become after the mapping. 46 00:02:57,939 --> 00:02:59,400 Here, a specific example helps. 47 00:02:59,740 --> 00:03:05,920 Take the function x squared, it maps 1 to 1, 2 to 4, 3 to 9, and so on. 48 00:03:06,479 --> 00:03:09,219 You can also see how it acts on all of the points in between. 49 00:03:12,719 --> 00:03:16,760 If you were to zoom in on a little cluster of points around the input 1, 50 00:03:16,760 --> 00:03:19,639 and see where they land around the relevant output, 51 00:03:19,639 --> 00:03:23,736 which for this function also happens to be 1, you'd notice that they tend 52 00:03:23,736 --> 00:03:24,900 to get stretched out. 53 00:03:25,759 --> 00:03:29,019 In fact, it roughly looks like stretching out by a factor of 2. 54 00:03:29,659 --> 00:03:35,534 The closer you zoom in, the more this local behavior looks just like multiplying by a 55 00:03:35,534 --> 00:03:41,682 factor of 2. This is what it means for the derivative of x squared at the input x equals 1 to be 56 00:03:41,682 --> 00:03:41,819 2. 57 00:03:42,340 --> 00:03:45,400 It's what that fact looks like in the context of transformations.
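If you'd like to try the zooming experiment yourself, here is a minimal Python sketch of it. The helper names `gap_ratios` and `square` are made up for this illustration and are not from the video. It places a few evenly spaced points around the input 1, applies x squared, and compares each gap between neighboring outputs to the gap between the inputs.

```python
def gap_ratios(f, x0, width, n=5):
    """Stretch factor of each gap between n evenly spaced points centered on x0.

    Hypothetical helper for illustration: each entry is
    (gap between neighboring outputs) / (gap between neighboring inputs).
    """
    step = width / (n - 1)
    xs = [x0 - width / 2 + i * step for i in range(n)]
    ys = [f(x) for x in xs]
    return [(ys[i + 1] - ys[i]) / step for i in range(n - 1)]


def square(x):
    return x * x


# Zoom in on the input 1 by shrinking the width of the cluster.
for width in (1.0, 0.1, 0.01, 0.001):
    ratios = gap_ratios(square, 1.0, width)
    print(f"width {width}: {[round(r, 4) for r in ratios]}")
```

At a wide zoom level the gaps stretch by visibly different amounts, and as the cluster narrows every ratio should settle near 2, which is the "looks like multiplying by 2" behavior described above.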
58 00:03:46,460 --> 00:03:49,731 If you looked at a neighborhood of points around the input 3, 59 00:03:49,731 --> 00:03:52,159 they would get stretched out by a factor of 6. 60 00:03:52,740 --> 00:03:57,439 This is what it means for the derivative of this function at the input 3 to equal 6. 61 00:03:58,979 --> 00:04:03,613 Around the input 1 fourth, a small region tends to get contracted specifically by a 62 00:04:03,614 --> 00:04:08,360 factor of 1 half, and that's what it looks like for a derivative to be smaller than 1. 63 00:04:10,719 --> 00:04:12,599 The input 0 is interesting. 64 00:04:13,120 --> 00:04:15,617 Zooming in by a factor of 10, it doesn't really 65 00:04:15,617 --> 00:04:17,959 look like a constant stretching or squishing. 66 00:04:18,379 --> 00:04:21,680 For one thing, all of the outputs end up on the right positive side of things. 67 00:04:23,319 --> 00:04:27,694 As you zoom in closer and closer, by 100x, or by 1000x, 68 00:04:27,694 --> 00:04:33,396 it looks more and more like a small neighborhood of points around 0 just 69 00:04:33,396 --> 00:04:39,959 gets collapsed into 0 itself. This is what it looks like for the derivative to be 0. 70 00:04:40,500 --> 00:04:45,019 The local behavior looks more and more like multiplying the whole number line by 0. 71 00:04:45,680 --> 00:04:49,783 It doesn't have to completely collapse everything to a point at a particular zoom level, 72 00:04:49,783 --> 00:04:53,840 instead it's a matter of what the limiting behavior is as you zoom in closer and closer. 73 00:04:55,279 --> 00:04:58,959 It's also instructive to take a look at the negative inputs here. 74 00:05:00,699 --> 00:05:04,536 Things start to feel a little cramped since they collide with where all the positive 75 00:05:04,536 --> 00:05:08,057 input values go, and this is one of the downsides of thinking of functions as 76 00:05:08,057 --> 00:05:08,780 transformations. 
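The stretch factors quoted so far for x squared all come out of one short calculation. As a sketch, writing x_0 for the input you zoom in on and h for a tiny nudge (both symbols are notation assumed here, not from the video):

```latex
\frac{(x_0 + h)^2 - x_0^2}{h} \;=\; \frac{2 x_0 h + h^2}{h} \;=\; 2 x_0 + h
\;\xrightarrow{\;h \to 0\;}\; 2 x_0 ,
\qquad\text{so}\qquad
2 \cdot 1 = 2, \quad 2 \cdot 3 = 6, \quad 2 \cdot \tfrac{1}{4} = \tfrac{1}{2}, \quad 2 \cdot 0 = 0 .
```

The leftover h is why the picture only looks like a clean stretch in the limit: at any fixed zoom level there is still a small correction, and at the input 0 that correction is all there is.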
77 00:05:09,399 --> 00:05:13,093 But for derivatives, we only really care about the local behavior anyway, 78 00:05:13,093 --> 00:05:15,639 what happens in a small range around a given input. 79 00:05:16,500 --> 00:05:20,189 Here, notice that the inputs in a little neighborhood around, say, 80 00:05:20,189 --> 00:05:24,100 negative 2, don't just get stretched out, they also get flipped around. 81 00:05:24,680 --> 00:05:28,132 Specifically, the action on such a neighborhood looks more 82 00:05:28,132 --> 00:05:31,819 and more like multiplying by negative 4 the closer you zoom in. 83 00:05:32,319 --> 00:05:35,599 This is what it looks like for the derivative of a function to be negative. 84 00:05:38,459 --> 00:05:40,951 And I think you get the point, this is all well and good, 85 00:05:40,951 --> 00:05:43,660 but let's see how this is actually useful in solving a problem. 86 00:05:44,259 --> 00:05:48,305 A friend of mine recently asked me a pretty fun question about the infinite 87 00:05:48,305 --> 00:05:52,137 fraction 1 plus 1 divided by 1 plus 1 divided by 1 plus 1 divided by 1, 88 00:05:52,137 --> 00:05:56,182 and clearly you watch math videos online, so maybe you've seen this before, 89 00:05:56,182 --> 00:05:59,961 but my friend's question actually cuts to something you might not have 90 00:05:59,961 --> 00:06:04,220 thought about before, relevant to the view of derivatives we're looking at here. 91 00:06:05,019 --> 00:06:09,661 The typical way you might evaluate an expression like this is to set it equal to x, 92 00:06:09,661 --> 00:06:13,639 and then notice that there is a copy of the full fraction inside itself. 93 00:06:14,699 --> 00:06:18,779 So you can replace that copy with another x, and then just solve for x. 94 00:06:19,439 --> 00:06:24,579 That is, what you want is to find a fixed point of the function 1 plus 1 divided by x. 
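Written out, the "replace the copy with x and solve" step is a short piece of algebra. This is a sketch that assumes x is not 0, so that multiplying through by x is allowed:

```latex
x = 1 + \frac{1}{x}
\;\Longrightarrow\;
x^2 = x + 1
\;\Longrightarrow\;
x^2 - x - 1 = 0
\;\Longrightarrow\;
x = \frac{1 \pm \sqrt{5}}{2} .
```

The quadratic formula hands back two candidate values, not one.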
95 00:06:27,160 --> 00:06:30,970 But here's the thing, there are actually two solutions for x, 96 00:06:30,970 --> 00:06:36,380 two special numbers where 1 plus 1 divided by that number gives you back the same thing. 97 00:06:36,939 --> 00:06:42,949 One is the golden ratio, phi, around 1.618, and the other is negative 0.618, 98 00:06:42,949 --> 00:06:46,540 which happens to be negative 1 divided by phi. 99 00:06:46,959 --> 00:06:49,681 I like to call this other number phi's little brother, 100 00:06:49,682 --> 00:06:52,900 since just about any property that phi has, this number also has. 101 00:06:53,560 --> 00:06:58,413 And this raises the question, would it be valid to say that the infinite 102 00:06:58,413 --> 00:07:03,600 fraction we saw is somehow also equal to phi's little brother, negative 0.618? 103 00:07:04,519 --> 00:07:08,812 Maybe you initially say, obviously not, everything on the left hand side is positive, 104 00:07:08,812 --> 00:07:11,259 so how could it possibly equal a negative number? 105 00:07:12,500 --> 00:07:17,100 Well, first we should be clear about what we actually mean by an expression like this. 106 00:07:17,779 --> 00:07:21,315 One way you could think about it, and it's not the only way, 107 00:07:21,315 --> 00:07:26,185 there's freedom for choice here, is to imagine starting with some constant, like 1, 108 00:07:26,185 --> 00:07:30,939 and then repeatedly applying the function 1 plus 1 divided by x, and then asking, 109 00:07:30,939 --> 00:07:33,259 what does this approach as you keep going? 110 00:07:36,040 --> 00:07:38,552 I mean, certainly symbolically what you get looks more and more 111 00:07:38,552 --> 00:07:41,300 like our infinite fraction, so maybe if you wanted it to equal a number, 112 00:07:41,300 --> 00:07:43,420 you should ask what this series of numbers approaches.
113 00:07:45,120 --> 00:07:48,509 And if that's your view of things, maybe you start off with a negative number, 114 00:07:48,509 --> 00:07:51,300 so it's not so crazy for the whole expression to end up negative. 115 00:07:52,740 --> 00:07:55,836 After all, if you start with negative 1 divided by phi, 116 00:07:55,836 --> 00:07:59,985 then applying this function 1 plus 1 over x, you get back the same number, 117 00:07:59,985 --> 00:08:03,802 negative 1 divided by phi, so no matter how many times you apply it, 118 00:08:03,802 --> 00:08:05,740 you're staying fixed at this value. 119 00:08:07,819 --> 00:08:10,620 But even then, there is one reason you should 120 00:08:10,620 --> 00:08:13,420 view phi as the favorite brother in this pair. 121 00:08:14,019 --> 00:08:19,330 Here, try this, pull up a calculator of some kind, then start with any random number, 122 00:08:19,331 --> 00:08:22,728 and plug it into this function, 1 plus 1 divided by x, 123 00:08:22,728 --> 00:08:28,040 and plug that number into 1 plus 1 over x, and again, and again, and again, and again. 124 00:08:28,480 --> 00:08:33,158 No matter what constant you start with, you eventually end up at 1.618. 125 00:08:33,798 --> 00:08:38,481 Even if you start with a negative number, even one that's really close to phi's 126 00:08:38,481 --> 00:08:43,399 little brother, eventually it shies away from that value and jumps back over to phi. 127 00:08:50,820 --> 00:08:52,460 So, what's going on here? 128 00:08:52,799 --> 00:08:55,919 Why is one of these fixed points favored above the other one? 129 00:08:56,720 --> 00:09:00,158 Maybe you can already see how the transformational understanding of derivatives 130 00:09:00,158 --> 00:09:03,984 is helpful for understanding this setup, but for the sake of having a point of contrast, 131 00:09:03,984 --> 00:09:07,080 I want to show you how a problem like this is often taught using graphs. 
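If you don't have a calculator handy, the same experiment can be sketched in a few lines of Python. The names `step`, `iterate`, `PHI` and `LITTLE_BROTHER` are made up for this illustration, and the choice of 80 repetitions is an arbitrary assumption that is comfortably more than these seeds need.

```python
PHI = (1 + 5 ** 0.5) / 2             # about 1.618
LITTLE_BROTHER = (1 - 5 ** 0.5) / 2  # about -0.618


def step(x):
    """One press of the calculator: plug x into 1 + 1/x."""
    return 1 + 1 / x


def iterate(seed, n=80):
    """Apply step() n times starting from seed.

    Caveat: a seed whose path lands exactly on 0 (for example 0 itself,
    or -1, since step(-1) is 0) raises ZeroDivisionError.
    """
    x = seed
    for _ in range(n):
        x = step(x)
    return x


# A few seeds, including one a hair away from phi's little brother.
for seed in (1.0, 10.0, -3.0, LITTLE_BROTHER + 1e-6):
    print(f"seed {seed}: ends near {iterate(seed)}")
```

Each of these seeds should end up printing a value very close to 1.618, including the one that starts right next to negative 0.618.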
132 00:09:07,919 --> 00:09:11,115 If you were to plug in some random input to this function, 133 00:09:11,115 --> 00:09:14,039 the y value tells you the corresponding output, right? 134 00:09:14,039 --> 00:09:17,862 So to think about plugging that output back into the function, 135 00:09:17,863 --> 00:09:22,050 you might first move horizontally until you hit the line y equals x, 136 00:09:22,049 --> 00:09:26,782 and that's going to give you a position where the x value corresponds to your 137 00:09:26,783 --> 00:09:28,240 previous y value, right? 138 00:09:28,919 --> 00:09:34,553 So then from there, you can move vertically to see what output this new x value has, 139 00:09:34,553 --> 00:09:35,879 and then you repeat. 140 00:09:36,340 --> 00:09:40,598 You move horizontally to the line y equals x to find a point whose x value is the same 141 00:09:40,597 --> 00:09:44,759 as the output you just got, and then you move vertically to apply the function again. 142 00:09:45,879 --> 00:09:48,285 Now personally, I think this is kind of an awkward way 143 00:09:48,285 --> 00:09:50,779 to think about repeatedly applying a function, don't you? 144 00:09:51,299 --> 00:09:53,803 I mean, it makes sense, but you kind of have to pause 145 00:09:53,803 --> 00:09:56,539 and think about it to remember which way to draw the lines. 146 00:09:57,120 --> 00:10:01,426 And you can, if you want, think through what conditions make this spiderweb 147 00:10:01,426 --> 00:10:05,280 process narrow in on a fixed point, versus propagating away from it. 148 00:10:05,860 --> 00:10:08,899 In fact, go ahead, pause right now, and try to think it through as an exercise. 149 00:10:09,240 --> 00:10:10,460 It has to do with slopes. 150 00:10:12,019 --> 00:10:15,818 Or if you want to skip the exercise for something that I think gives a much more 151 00:10:15,818 --> 00:10:19,620 satisfying understanding, think about how this function acts as a transformation. 
152 00:10:22,279 --> 00:10:24,923 So I'm going to go ahead and start here by drawing a bunch of 153 00:10:24,923 --> 00:10:27,740 arrows to indicate where the various sampled input points will go. 154 00:10:28,320 --> 00:10:31,440 And side note, don't you think this gives a neat emergent pattern? 155 00:10:31,820 --> 00:10:35,020 I wasn't expecting this, but it was cool to see it pop up when animating. 156 00:10:35,019 --> 00:10:38,797 I guess the action of 1 divided by x gives this nice emergent circle, 157 00:10:38,797 --> 00:10:41,279 and then we're just shifting things over by 1. 158 00:10:42,039 --> 00:10:46,621 Anyway, I want you to think about what it means to repeatedly apply some function, 159 00:10:46,621 --> 00:10:48,719 like 1 plus 1 over x, in this context. 160 00:10:50,240 --> 00:10:53,590 Well after letting it map all of the inputs to the outputs, 161 00:10:53,590 --> 00:10:58,504 you could consider those as the new inputs, and then just apply the same process again, 162 00:10:58,504 --> 00:11:01,519 and then again, and do it however many times you want. 163 00:11:02,580 --> 00:11:06,523 Notice, in animating this with a few dots representing the sample points, 164 00:11:06,523 --> 00:11:11,320 it doesn't take many iterations at all before all of those dots kind of clump in around 165 00:11:11,320 --> 00:11:12,000 1.618. 166 00:11:14,620 --> 00:11:18,355 Now remember, we know that 1.618 and its little brother, 167 00:11:18,355 --> 00:11:23,860 negative 0.618 on and on, stay fixed in place during each iteration of this process. 168 00:11:24,860 --> 00:11:27,480 But zoom in on a neighborhood around phi. 169 00:11:27,480 --> 00:11:32,788 During the map, points in that region get contracted around phi, 170 00:11:32,788 --> 00:11:39,403 meaning that the function 1 plus 1 over x has a derivative with a magnitude less 171 00:11:39,403 --> 00:11:41,120 than 1 at this input.
172 00:11:41,879 --> 00:11:45,200 In fact, this derivative works out to be around negative 0.38. 173 00:11:46,120 --> 00:11:50,312 So what that means is that each repeated application scrunches the neighborhood 174 00:11:50,312 --> 00:11:54,399 around this number smaller and smaller, like a gravitational pull towards phi. 175 00:11:54,960 --> 00:11:58,620 So now tell me what you think happens in the neighborhood of phi's little brother. 176 00:12:01,320 --> 00:12:05,426 Over there, the derivative actually has a magnitude larger than 1, 177 00:12:05,426 --> 00:12:08,920 so points near the fixed point are repelled away from it. 178 00:12:09,519 --> 00:12:11,598 And when you work it out, you can see that they get 179 00:12:11,599 --> 00:12:13,800 stretched by more than a factor of 2 in each iteration. 180 00:12:14,419 --> 00:12:17,674 They also get flipped around, because the derivative is negative here, 181 00:12:17,674 --> 00:12:20,839 but the salient fact for the sake of stability is just the magnitude. 182 00:12:23,440 --> 00:12:26,970 Mathematicians would call this right value a stable fixed point, 183 00:12:26,970 --> 00:12:29,360 and the left one is an unstable fixed point. 184 00:12:30,000 --> 00:12:33,408 Something is considered stable if when you perturb it just a little bit, 185 00:12:33,408 --> 00:12:37,100 it tends to come back towards where it started, rather than going away from it. 186 00:12:38,179 --> 00:12:40,777 So what we're seeing is a very useful little fact, 187 00:12:40,778 --> 00:12:45,312 that the stability of a fixed point is determined by whether or not the magnitude of its 188 00:12:45,312 --> 00:12:47,300 derivative is bigger or smaller than 1. 189 00:12:47,299 --> 00:12:50,479 This explains why phi always shows up in the numerical play, 190 00:12:50,480 --> 00:12:53,922 where you're just hitting enter on your calculator over and over, 191 00:12:53,922 --> 00:12:55,800 but phi's little brother never does. 
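To check those two derivative values numerically, here is a rough Python sketch. The function names `f`, `derivative` and `classify` are hypothetical, and the derivative is estimated with a central difference, which is a stand-in chosen for this illustration and not how the video computes it. By hand, the derivative of 1 plus 1 over x is negative 1 over x squared.

```python
PHI = (1 + 5 ** 0.5) / 2             # about 1.618
LITTLE_BROTHER = (1 - 5 ** 0.5) / 2  # about -0.618


def f(x):
    return 1 + 1 / x


def derivative(g, x, h=1e-6):
    """Central-difference estimate of g'(x); h is an assumed small nudge."""
    return (g(x + h) - g(x - h)) / (2 * h)


def classify(g, fixed_point):
    """Label a fixed point by the magnitude of the derivative there.

    A magnitude of exactly 1 is a borderline case this sketch does not handle.
    """
    slope = derivative(g, fixed_point)
    return "stable" if abs(slope) < 1 else "unstable"


for name, point in (("phi", PHI), ("little brother", LITTLE_BROTHER)):
    print(name, round(derivative(f, point), 3), classify(f, point))
```

The estimate at phi should agree with negative 1 over phi squared, around negative 0.38, and the estimate at phi's little brother with negative phi squared, around negative 2.62, matching the "more than a factor of 2" stretch mentioned above.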
192 00:12:56,460 --> 00:12:59,620 As to whether or not you want to consider phi's little brother a 193 00:12:59,620 --> 00:13:02,879 valid value of the infinite fraction, well that's really up to you. 194 00:13:03,259 --> 00:13:06,970 Everything we just showed suggests that if you think of this expression 195 00:13:06,970 --> 00:13:10,524 as representing a limiting process, then because every possible seed 196 00:13:10,524 --> 00:13:14,442 value other than phi's little brother gives you a series converging to phi, 197 00:13:14,442 --> 00:13:17,740 it does feel silly to put them on equal footing with each other. 198 00:13:18,259 --> 00:13:21,773 But maybe you don't think of it as a limit, maybe the kind of math 199 00:13:21,773 --> 00:13:25,600 you're doing lends itself to treating this as a purely algebraic object, 200 00:13:25,600 --> 00:13:29,220 like the solutions of a polynomial, which simply has multiple values. 201 00:13:30,340 --> 00:13:34,485 Anyway, that's beside the point, and my point here is not that viewing derivatives 202 00:13:34,485 --> 00:13:38,779 as this change in density is somehow better than the graphical intuition on the whole. 203 00:13:39,600 --> 00:13:42,204 In fact, picturing an entire function this way can be 204 00:13:42,203 --> 00:13:44,759 kind of clunky and impractical as compared to graphs. 205 00:13:45,340 --> 00:13:48,221 My point is that it deserves more of a mention in most of the 206 00:13:48,221 --> 00:13:50,918 introductory calculus courses, because it can help make a 207 00:13:50,918 --> 00:13:53,940 student's understanding of the derivative a little more flexible. 208 00:13:54,899 --> 00:13:58,357 Like I mentioned, the real reason I'd recommend you carry this perspective 209 00:13:58,357 --> 00:14:01,816 with you as you learn new topics is not so much for what it does with your 210 00:14:01,817 --> 00:14:05,000 understanding of single variable calculus, it's for what comes after.