[00:15] The goal here is simple: explain what a derivative is.
[00:19] The thing is though, there's some subtlety to this topic,
[00:21] and a lot of potential for paradoxes if you're not careful.
[00:24] So a secondary goal is that you have an appreciation
[00:27] for what those paradoxes are and how to avoid them.
[00:31] You see, it's common for people to say that the derivative measures an instantaneous
[00:35] rate of change, but when you think about it, that phrase is actually an oxymoron.
[00:40] Change is something that happens between separate points in time,
[00:43] and when you blind yourself to all but just a single instant,
[00:46] there's not really any room for change.
[00:49] You'll see what I mean more as we get into it,
[00:51] but when you appreciate that a phrase like instantaneous rate of change is actually
[00:56] nonsense, I think it makes you appreciate just how clever the fathers of calculus
[01:00] were in capturing the idea that phrase is meant to evoke,
[01:02] but with a perfectly sensible piece of math, the derivative.
[01:07] As our central example, I want you to imagine a car that starts at some point A,
[01:11] speeds up, and then slows down to a stop at some point B 100 meters away,
[01:15] and let's say it all happens over the course of 10 seconds.
[01:20] That's the setup to have in mind as we lay out what the derivative is.
[01:23] Well, we could graph this motion, letting the vertical axis represent the
[01:29] distance traveled, and the horizontal axis represent time, so at each time t,
[01:34] represented with a point somewhere on the horizontal axis,
[01:38] the height of the graph tells us how far the car has traveled in total after
[01:44] that amount of time.
[01:46] It's pretty common to name a distance function like this s of t.
[01:50] I would use the letter d for distance, but that
[01:52] guy already has another full time job in calculus.
[01:56] Initially, the curve is quite shallow, since the car is slow to start.
[02:00] During that first second, the distance it travels doesn't change that much.
[02:04] For the next few seconds, as the car speeds up,
[02:07] the distance traveled in a given second gets larger,
[02:10] which corresponds to a steeper slope in this graph.
[02:13] Then towards the end, when it slows down, that curve shallows out again.
[02:20] If we were to plot the car's velocity in meters per second as a function of time,
[02:25] it might look like this bump.
[02:27] At early times, the velocity is very small.
[02:30] Up to the middle of the journey, the car builds up to some maximum velocity,
[02:34] covering a relatively large distance each second.
[02:37] Then it slows back down towards a speed of zero.
[02:41] These two curves are definitely related to each other.
[02:44] If you change the specific distance vs. time function,
[02:47] you'll have some different velocity vs. time function.
[02:51] What we want to understand is the specifics of that relationship.
[02:55] Exactly how does velocity depend on a distance vs. time function?
[03:01] To do that, it's worth taking a moment to think
[03:04] critically about what exactly velocity means here.
[03:08] Intuitively, we all might know what velocity at a given moment means:
[03:11] it's just whatever the car's speedometer shows in that moment.
[03:17] Intuitively, it might make sense that the car's velocity should be higher at times when
[03:21] this distance function is steeper, when the car traverses more distance per unit time.
[03:26] But the funny thing is, velocity at a single moment makes no sense.
[03:31] If I show you a picture of a car, just a snapshot in an instant,
[03:34] and I ask you how fast it's going, you'd have no way of telling me.
[03:39] What you'd need are two separate points in time to compare.
[03:43] That way you can compute whatever the change in distance across those times is,
[03:47] divided by the change in time.
[03:49] Right?
[03:49] I mean, that's what velocity is, it's the distance traveled per unit time.
[03:55] So how is it that we're looking at a function for velocity that
[03:59] only takes in a single value of t, a single snapshot in time?
[04:02] It's weird, isn't it?
[04:04] We want to associate individual points in time with a velocity,
[04:07] but actually computing velocity requires comparing two separate points in time.
[04:14] If that feels strange and paradoxical, good!
[04:17] You're grappling with the same conflicts that the fathers of calculus did.
[04:21] And if you want a deep understanding of rates of change, not just for a moving car,
[04:25] but for all sorts of things in science, you're going to need to resolve this apparent
[04:29] paradox.
[04:32] First, I think it's best to talk about the real world,
[04:34] and then we'll go into a purely mathematical one.
[04:37] Let's think about what the car's speedometer is probably doing.
[04:41] At some point, say 3 seconds into the journey,
[04:43] the speedometer might measure how far the car goes in a very small amount of time,
[04:48] maybe the distance traveled between 3 seconds and 3.01 seconds.
[04:53] Then it could compute the speed in meters per second as that tiny
[04:57] distance traversed in meters divided by that tiny time, 0.01 seconds.
[05:02] That is, a physical car just side-steps the paradox and
[05:05] doesn't actually compute speed at a single point in time.
[05:08] It computes speed during a very small amount of time.
[05:13] So let's call that difference in time dt, which you might think of as 0.01 seconds,
[05:18] and let's call that resulting difference in distance ds.
[05:22] So the velocity at some point in time is ds divided by dt,
[05:26] the tiny change in distance over the tiny change in time.
[05:31] Graphically, you can imagine zooming in on some point of this distance vs. time
[05:35] graph above t equals 3.
[05:38] That dt is a small step to the right, since time is on the horizontal axis,
[05:43] and that ds is the resulting change in the height of the graph,
[05:47] since the vertical axis represents the distance traveled.
[05:51] So ds divided by dt is something you can think of as the rise
[05:55] over run slope between two very close points on this graph.
[06:00] Of course, there's nothing special about the value t equals 3.
[06:03] We could apply this to any other point in time,
[06:06] so we consider this expression ds over dt to be a function of t,
[06:10] something where I can give you a time t and you can give me back the value of this
[06:15] ratio at that time, the velocity as a function of time.
[06:19] For example, when I had the computer draw this bump curve here,
[06:22] the one representing the velocity function, here's what I had the computer actually do.
[06:27] First, I chose a small value for dt, I think in this case it was 0.01.
[06:33] Then I had the computer look at a whole bunch of times t between 0 and 10,
[06:38] and compute the distance function s at t plus dt,
[06:41] and then subtract off the value of that function at t.
[06:45] In other words, that's the difference in the distance traveled between the given time,
[06:51] t, and the time 0.01 seconds after that.
[06:54] Then you can just divide that difference by the change in time, dt,
[06:58] and that gives you velocity in meters per second around each point in time.
[07:04] So with a formula like this, you could give the computer any curve representing any
[07:08] distance function s of t, and it could figure out the curve representing velocity.
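The procedure just described can be sketched in a few lines of Python. This is only an illustration: the transcript never gives the formula for the car's distance function, so the `distance` function below is a hypothetical stand-in chosen to start at 0 meters, end at 100 meters after 10 seconds, and be slow at both ends. The name `approximate_velocity` is also made up for this sketch.

```python
def distance(t):
    """Hypothetical s(t), assumed for illustration only:
    0 m at t = 0, 100 m at t = 10, slow to start and slow to finish."""
    u = t / 10.0
    return 100.0 * (3 * u**2 - 2 * u**3)


def approximate_velocity(s, t, dt=0.01):
    """Distance covered between t and t + dt, divided by dt (meters per second)."""
    return (s(t + dt) - s(t)) / dt


# Sample a handful of times between 0 and 10 seconds.
# (A plotting program would use many more sample times; one per second keeps this short.)
for t in range(0, 10):
    v = approximate_velocity(distance, t)
    print(f"t = {t} s   velocity ~ {v:.2f} m/s")
```

Any other distance function could be passed in place of `distance`; the ratio is computed the same way.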
[07:13] Now would be a good time to pause, reflect, and make sure this idea
[07:17] of relating distance to velocity by looking at tiny changes makes sense,
[07:21] because we're going to tackle the paradox of the derivative head on.
[07:27] This idea of ds over dt, a tiny change in the value of the function s divided by
[07:32] the tiny change in the input that caused it, that's almost what a derivative is.
[07:38] And even though a car's speedometer will actually look at a concrete change in time,
[07:43] like 0.01 seconds, and even though the drawing program here is looking at an actual
[07:49] concrete change in time, in pure math the derivative is not this ratio ds over dt for a
[07:54] specific choice of dt. Instead, it's whatever that ratio approaches as your choice for dt
[07:59] approaches 0.
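Written out in standard limit notation (the symbols are the usual textbook ones, not something shown in the transcript), that statement reads:

```latex
\[
\frac{ds}{dt}(t) \;=\; \lim_{dt \to 0} \frac{s(t + dt) - s(t)}{dt}
\]
```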
[08:02] Luckily, there is a really nice visual understanding for what it means to ask what
[08:07] this ratio approaches. Remember, for any specific choice of dt,
[08:11] this ratio ds over dt is the slope of a line passing through two separate points
[08:15] on the graph, right?
[08:17] Well as dt approaches 0, and as those two points approach each other,
[08:22] the slope of the line approaches the slope of a line that's
[08:26] tangent to the graph at whatever point t we're looking at.
[08:30] So the true honest-to-goodness pure math derivative is not the
[08:33] rise over run slope between two nearby points on the graph,
[08:37] it's equal to the slope of a line tangent to the graph at a single point.
[08:42] Now notice what I'm not saying, I'm not saying that the derivative is
[08:45] whatever happens when dt is infinitely small, whatever that would mean.
[08:50] Nor am I saying that you plug in 0 for dt.
[08:53] This dt is always a finitely small, non-zero value; it's just that it approaches 0.
[09:03] I think that's really clever.
[09:05] Even though change in an instant makes no sense,
[09:08] this idea of letting dt approach 0 is a really sneaky backdoor
[09:12] way to talk reasonably about the rate of change at a single point in time.
[09:17] Isn't that neat?
[09:18] It's kind of flirting with the paradox of change in
[09:20] an instant without ever needing to actually touch it.
[09:23] And it comes with such a nice visual intuition too,
[09:25] as the slope of a tangent line to a single point on the graph.
[09:30] And because change in an instant still makes no sense,
[09:33] I think it's healthiest for you to think of this slope not as some instantaneous
[09:37] rate of change, but instead as the best constant approximation for a rate of
[09:41] change around a point.
[09:44] By the way, it's worth saying a couple words on notation here.
[09:47] Throughout this video I've been using dt to refer to a tiny change in t with
[09:51] some actual size, and ds to refer to the resulting change in s,
[09:55] which again has an actual size, and this is because that's how I want you to
[09:59] think about them.
[10:01] But the convention in calculus is that whenever you're using the letter d like this,
[10:05] you're kind of announcing your intention that eventually you're
[10:08] going to see what happens as dt approaches 0.
[10:11] For example, the honest-to-goodness pure math derivative is written as ds divided by dt,
[10:16] even though it's technically not a fraction per se,
[10:19] but whatever that fraction approaches for smaller and smaller nudges in t.
[10:25] I think a specific example should help here.
[10:28] You might think that asking about what this ratio approaches
[10:31] for smaller and smaller values would make it much more difficult to compute,
[10:35] but weirdly it kind of makes things easier.
[10:38] Let's say you have a given distance vs. time function that happens to be exactly t cubed.
[10:43] So after 1 second the car has traveled 1 cubed, or 1 meter,
[10:47] after 2 seconds it's traveled 2 cubed, or 8 meters, and so on.
[10:53] Now what I'm about to do might seem somewhat complicated,
[10:55] but once the dust settles it really is simpler,
[10:57] and more importantly it's the kind of thing you only ever have to do once in calculus.
[11:03] Let's say you wanted to compute the velocity, ds divided by dt,
[11:06] at some specific time, like t equals 2.
[11:09] For right now let's think of dt as having an actual size,
[11:13] some concrete nudge, we'll let it go to 0 in just a bit.
[11:17] The tiny change in distance between 2 seconds and 2 plus dt
[11:22] seconds is s of (2 plus dt) minus s of 2, and we divide that by dt.
[11:28] Since our function is t cubed, that numerator looks like (2 plus dt) cubed minus 2 cubed.
[11:35] And this is something we can work out algebraically.
[11:38] Again, bear with me, there's a reason I'm showing you the details here.
[11:42] When you expand that top, what you get is 2 cubed plus 3 times 2 squared times dt
[11:49] plus 3 times 2 times dt squared plus dt cubed, and all of that is minus 2 cubed.
[11:58] Now there's a lot of terms, and I want you to remember that it looks like a mess,
[12:01] but it does simplify.
[12:03] Those 2 cubed terms cancel out.
[12:06] Everything remaining here has a dt in it, and since there's a dt on the bottom there,
[12:11] many of those cancel out as well.
[12:14] What this means is that the ratio ds divided by dt has boiled down into
[12:19] 3 times 2 squared plus two other terms that each still have a dt in them.
[12:25] So if we ask what happens as dt approaches 0, representing the idea of looking at a
[12:30] smaller and smaller change in time, we can just completely ignore those other terms.
[12:36] By eliminating the need to think about a specific dt,
[12:39] we've eliminated a lot of the complication in the full expression.
[12:43] So what we're left with is this nice clean 3 times 2 squared.
[12:48] You can think of that as meaning that the slope of a line tangent to
[12:52] the point at t equals 2 of this graph is exactly 3 times 2 squared, or 12.
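For reference, the algebra spoken through above can be written out step by step, for s(t) = t cubed at t = 2:

```latex
\begin{align*}
\frac{ds}{dt}
  &= \frac{(2 + dt)^3 - 2^3}{dt} \\
  &= \frac{2^3 + 3 \cdot 2^2 \, dt + 3 \cdot 2 \, dt^2 + dt^3 - 2^3}{dt} \\
  &= 3 \cdot 2^2 + 3 \cdot 2 \, dt + dt^2 \\
  &\to 3 \cdot 2^2 = 12 \quad \text{as } dt \to 0
\end{align*}
```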
[12:57] And of course, there's nothing special about the time t equals 2.
[13:01] We could more generally say that the derivative
[13:04] of t cubed as a function of t is 3 times t squared.
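A quick numerical check of that claim, as a rough sketch: the function name `difference_quotient` and the particular values of dt are choices made for this illustration, not anything from the video. For s(t) = t cubed at t = 2, the algebra above says the ratio should equal 12 + 6 dt + dt squared, so it should close in on 12 as dt shrinks (up to floating-point rounding).

```python
def s(t):
    """The distance function from the example: s(t) = t cubed."""
    return t**3


def difference_quotient(f, t, dt):
    """The ratio (f(t + dt) - f(t)) / dt for one concrete, non-zero dt."""
    return (f(t + dt) - f(t)) / dt


t = 2.0
for dt in (0.1, 0.01, 0.001, 0.0001):
    ratio = difference_quotient(s, t, dt)
    # Compare the ratio for this dt against the claimed derivative 3 * t**2.
    print(f"dt = {dt:<7} ratio = {ratio:.6f}   3*t**2 = {3 * t**2}")
```

Changing `t` to another value gives the same pattern, with the ratio approaching 3 times t squared.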
[13:10] Now take a step back, because that's beautiful.
[13:13] The derivative is this crazy complicated idea.
[13:16] We've got tiny changes in distance over tiny changes in time,
[13:19] but instead of looking at any specific one of those,
[13:22] we're talking about what that thing approaches.
[13:24] I mean, that's a lot to think about.
[13:27] And yet what we've come out with is such a simple expression, 3 times t squared.
[13:32] And in practice, you wouldn't go through all this algebra each time.
[13:36] Knowing that the derivative of t cubed is 3t squared is one of those things that all
[13:40] calculus students learn to recall immediately without having to re-derive it each time.
[13:45] And in the next video, I'm going to show you how to think about
[13:48] this and a couple of other derivative formulas in really nice geometric ways.
[13:52] But the point I want to make by showing you all of the algebraic guts
[13:56] here is that when you consider the tiny change in distance caused by a
[14:00] tiny change in time for some specific value of dt, you'd have kind of a mess.
[14:05] But when you consider what that ratio approaches as dt approaches 0,
[14:08] it lets you ignore much of that mess, and it really does simplify the problem.
[14:13] That right there is kind of the heart of why calculus becomes useful.
[14:18] Another reason to show you a concrete derivative like this is that it
[14:21] sets the stage for an example of the kind of paradoxes that come about
[14:25] if you believe too much in the illusion of instantaneous rate of change.
[14:30] So think about the actual car traveling according to this t cubed distance function,
[14:34] and consider its motion at the moment t equals 0, right at the start.
[14:39] Now ask yourself whether or not the car is moving at that time.
[14:45] On the one hand, we can compute its speed at that point using the derivative,
[14:50] 3t squared, which for time t equals 0 works out to be 0.
[14:54] Visually, this means that the tangent line to the graph at that point is perfectly flat,
[14:59] so the car's quote-unquote instantaneous velocity is 0,
[15:03] and that suggests that obviously it's not moving.
[15:07] But on the other hand, if it doesn't start moving at time 0, when does it start moving?
[15:12] Really, pause and ponder that for a moment.
[15:15] Is the car moving at time t equals 0?
[15:22] Do you see the paradox?
[15:24] The issue is that the question makes no sense.
[15:26] It references the idea of change in a moment, but that doesn't actually exist.
[15:30] That's just not what the derivative measures.
[15:33] What it means for the derivative of a distance function to be 0 is that the best
[15:38] constant approximation for the car's velocity around that point is 0 meters per second.
[15:44] For example, if you look at an actual change in time,
[15:47] say between time 0 and 0.1 seconds, the car does move.
[15:51] It moves 0.001 meters.
[15:54] That's very small, and importantly, it's very small compared to the change in time,
[15:59] giving an average speed of only 0.01 meters per second.
[16:03] And remember, what it means for the derivative of this motion to be 0 is that
[16:08] for smaller and smaller nudges in time, this ratio in meters per second approaches 0.
[16:14] But that's not to say that the car is static.
[16:17] Approximating its movement with a constant velocity of 0 is,
[16:20] after all, just an approximation.
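The same point can be seen numerically. In this small sketch (the helper name `average_speed_from_start` and the list of nudges are assumptions made for illustration), the car following s(t) = t cubed always covers some distance over any real nudge in time starting at t = 0, yet its average speed over that nudge keeps shrinking toward 0.

```python
def s(t):
    """The distance function from the example: s(t) = t cubed."""
    return t**3


def average_speed_from_start(dt):
    """Average speed between time 0 and time dt, in meters per second."""
    return (s(0 + dt) - s(0)) / dt


for dt in (0.1, 0.01, 0.001):
    moved = s(dt) - s(0)
    # The car does move (moved > 0), but distance / time shrinks with dt.
    print(f"dt = {dt} s   moved = {moved:.3e} m   "
          f"average speed = {average_speed_from_start(dt):.3e} m/s")
```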
[16:24] So whenever you hear people refer to the derivative as an instantaneous rate of change,
[16:29] a phrase which is intrinsically oxymoronic, I want you to think of that as a
[16:33] conceptual shorthand for the best constant approximation for rate of change.
[16:39] In the next couple videos, I'll be talking more about the derivative,
[16:42] what it looks like in different contexts, how you actually compute it,
[16:45] why it's useful, things like that, focusing on visual intuition as always.