WEBVTT

00:00:14.320 --> 00:00:17.312
The last several videos have been about the idea of a derivative,

00:00:17.312 --> 00:00:20.940
and before moving on to integrals I want to take some time to talk about limits.

00:00:21.660 --> 00:00:24.820
To be honest, the idea of a limit is not really anything new.

00:00:25.160 --> 00:00:28.699
If you know what the word approach means you pretty much already know what a limit is.

00:00:29.039 --> 00:00:32.301
You could say it's a matter of assigning fancy notation to

00:00:32.301 --> 00:00:35.619
the intuitive idea of one value that gets closer to another.

00:00:36.439 --> 00:00:39.659
But there are a few reasons to devote a full video to this topic.

00:00:40.280 --> 00:00:43.683
For one thing, it's worth showing how the way I've been describing

00:00:43.683 --> 00:00:46.732
derivatives so far lines up with the formal definition of a

00:00:46.732 --> 00:00:50.239
derivative as it's typically presented in most courses and textbooks.

00:00:50.920 --> 00:00:55.103
I want to give you a little confidence that thinking in terms of dx and df

00:00:55.103 --> 00:00:59.286
as concrete non-zero nudges is not just some trick for building intuition,

00:00:59.286 --> 00:01:03.359
it's backed up by the formal definition of a derivative in all its rigor.

00:01:04.260 --> 00:01:08.033
I also want to shed light on what exactly mathematicians mean when

00:01:08.033 --> 00:01:11.920
they say approach in terms of the epsilon-delta definition of limits.

00:01:12.519 --> 00:01:16.579
Then we'll finish off with a clever trick for computing limits called L'Hopital's rule.

00:01:17.799 --> 00:01:21.700
So, first things first, let's take a look at the formal definition of the derivative.

00:01:22.319 --> 00:01:25.430
As a reminder, when you have some function f of x,

00:01:25.430 --> 00:01:29.762
to think about its derivative at a particular input, maybe x equals 2,

00:01:29.762 --> 00:01:33.605
you start by imagining nudging that input some little dx away,

00:01:33.605 --> 00:01:36.900
and looking at the resulting change to the output, df.

00:01:37.959 --> 00:01:41.466
The ratio df divided by dx, which can be nicely thought of

00:01:41.466 --> 00:01:46.757
as the rise over run slope between the starting point on the graph and the nudged point,

00:01:46.757 --> 00:01:48.719
is almost what the derivative is.

00:01:49.099 --> 00:01:53.959
The actual derivative is whatever this ratio approaches as dx approaches 0.

00:01:55.000 --> 00:01:59.123
Just to spell out what's meant there, that nudge to the output

00:01:59.123 --> 00:02:05.013
df is the difference between f at the starting input plus dx and f at the starting input,

00:02:05.013 --> 00:02:07.500
the change to the output caused by dx.

00:02:08.680 --> 00:02:14.341
To express that you want to find what this ratio approaches as dx approaches 0,

00:02:14.341 --> 00:02:17.879
you write lim for limit, with dx arrow 0 below it.

00:02:18.960 --> 00:02:21.891
You'll almost never see terms with a lowercase

00:02:21.890 --> 00:02:24.759
d like dx inside a limit expression like this.

00:02:25.319 --> 00:02:28.076
Instead, the standard is to use a different variable,

00:02:28.076 --> 00:02:31.039
something like delta x, or commonly h for whatever reason.

00:02:31.860 --> 00:02:35.505
The way I like to think of it is that terms with this lowercase

00:02:35.504 --> 00:02:40.174
d in the typical derivative expression have built into them this idea of a limit,

00:02:40.175 --> 00:02:43.080
the idea that dx is supposed to eventually go to 0.

00:02:44.659 --> 00:02:47.609
In a sense, this left hand side here, df over dx,

00:02:47.609 --> 00:02:51.207
the ratio we've been thinking about for the past few videos,

00:02:51.206 --> 00:02:55.866
is just shorthand for what the right hand side here spells out in more detail,

00:02:55.866 --> 00:03:00.939
writing out exactly what we mean by df, and writing out this limit process explicitly.

00:03:01.620 --> 00:03:05.265
This right hand side here is the formal definition of a derivative,

00:03:05.264 --> 00:03:08.159
as you would commonly see it in any calculus textbook.
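
NOTE
Editor's sketch, not shown in this form in the narration: the formal definition
described above, written in LaTeX. The symbol h stands for the finite nudge that
the narration calls dx, and the df/dx notation is kept to match the series.
```latex
\frac{df}{dx}(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}
```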

00:03:08.759 --> 00:03:11.228
And if you'll pardon me for a small rant here,

00:03:11.229 --> 00:03:15.800
I want to emphasize that nothing about this right hand side references the paradoxical

00:03:15.800 --> 00:03:17.640
idea of an infinitely small change.

00:03:18.259 --> 00:03:19.959
The point of limits is to avoid that.

00:03:20.620 --> 00:03:23.026
This value h is the exact same thing as the dx

00:03:23.026 --> 00:03:25.280
I've been referencing throughout the series.

00:03:25.900 --> 00:03:32.280
It's a nudge to the input of f with some non-zero, finitely small size, like 0.001.

00:03:33.099 --> 00:03:37.699
It's just that we're analyzing what happens for arbitrarily small choices of h.

00:03:38.580 --> 00:03:43.438
In fact, the only reason people introduce a new variable name into this formal

00:03:43.437 --> 00:03:48.295
definition, rather than just using dx, is to be extra clear that these changes

00:03:48.295 --> 00:03:53.400
to the input are just ordinary numbers that have nothing to do with infinitesimals.

00:03:54.379 --> 00:03:59.060
There are others who like to interpret this dx as an infinitely small change,

00:03:59.060 --> 00:04:02.719
whatever that would mean, or to just say that dx and df are nothing more than

00:04:02.719 --> 00:04:05.419
symbols that we shouldn't take too seriously.

00:04:06.219 --> 00:04:09.479
But by now in the series, you know I'm not really a fan of either of those views.

00:04:10.020 --> 00:04:14.231
I think you can and should interpret dx as a concrete, finitely small nudge,

00:04:14.231 --> 00:04:18.500
just so long as you remember to ask what happens when that thing approaches 0.

00:04:19.420 --> 00:04:23.071
For one thing, and I hope the past few videos have helped convince you of this,

00:04:23.071 --> 00:04:27.180
that helps to build stronger intuition for where the rules of calculus actually come from.

00:04:27.180 --> 00:04:29.900
But it's not just some trick for building intuitions.

00:04:30.459 --> 00:04:34.129
Everything I've been saying about derivatives with this concrete,

00:04:34.129 --> 00:04:38.911
finitely small nudge philosophy is just a translation of this formal definition we're

00:04:38.911 --> 00:04:40.079
staring at right now.

00:04:41.040 --> 00:04:44.677
Long story short, the big fuss about limits is that they let us

00:04:44.677 --> 00:04:48.540
avoid talking about infinitely small changes by instead asking what

00:04:48.540 --> 00:04:52.519
happens as the size of some small change to our variable approaches 0.

00:04:53.279 --> 00:04:56.129
And this brings us to goal number 2, understanding

00:04:56.129 --> 00:04:59.259
exactly what it means for one value to approach another.

00:05:00.439 --> 00:05:07.139
For example, consider the function 2 plus h, cubed, minus 2 cubed, all divided by h.

00:05:08.480 --> 00:05:12.273
This happens to be the expression that pops out when you unravel

00:05:12.273 --> 00:05:16.183
the definition of a derivative of x cubed evaluated at x equals 2,

00:05:16.182 --> 00:05:19.860
but let's just think of it as any old function with an input h.

00:05:20.439 --> 00:05:23.303
Its graph is this nice continuous looking parabola,

00:05:23.303 --> 00:05:27.379
which would make sense because it's a cubic term divided by a linear term.

00:05:28.199 --> 00:05:32.204
But actually, if you think about what's going on at h equals 0,

00:05:32.204 --> 00:05:36.460
plugging that in you would get 0 divided by 0, which is not defined.

00:05:37.420 --> 00:05:40.247
So really, this graph has a hole at that point,

00:05:40.247 --> 00:05:45.139
and you have to exaggerate to draw that hole, often with an empty circle like this.

00:05:45.139 --> 00:05:47.839
But keep in mind, the function is perfectly well

00:05:47.839 --> 00:05:50.319
defined for inputs as close to 0 as you want.

00:05:51.259 --> 00:05:55.702
Wouldn't you agree that as h approaches 0, the corresponding output,

00:05:55.702 --> 00:05:58.279
the height of this graph, approaches 12?

00:05:59.160 --> 00:06:01.580
It doesn't matter which side you come at it from.

00:06:03.740 --> 00:06:08.199
That limit of this ratio as h approaches 0 is equal to 12.
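
NOTE
Editor's sketch, assuming plain Python floats are accurate enough here: a quick
numeric check that the ratio above heads toward 12 from both sides of h = 0.
The function name difference_quotient is hypothetical, chosen for illustration.
```python
def difference_quotient(h):
    # Ratio from the narration: ((2 + h)^3 - 2^3) / h, undefined at h = 0.
    return ((2 + h) ** 3 - 2 ** 3) / h
# Approach h = 0 from the right and then from the left.
for h in [0.1, 0.01, 0.001, -0.001, -0.01, -0.1]:
    print(h, difference_quotient(h))
```
The printed values should crowd around 12 as h shrinks, whichever sign h has.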

00:06:09.360 --> 00:06:12.742
But imagine you're a mathematician inventing calculus,

00:06:12.742 --> 00:06:17.480
and someone skeptically asks you, well, what exactly do you mean by approach?

00:06:18.439 --> 00:06:21.218
That would be kind of an annoying question, I mean, come on,

00:06:21.218 --> 00:06:24.180
we all know what it means for one value to get closer to another.

00:06:24.939 --> 00:06:28.533
But let's start thinking about ways you might be able to answer that person,

00:06:28.533 --> 00:06:29.699
completely unambiguously.

00:06:30.939 --> 00:06:34.187
For a given range of inputs within some distance of 0,

00:06:34.187 --> 00:06:39.028
excluding the forbidden point 0 itself, look at all of the corresponding outputs,

00:06:39.028 --> 00:06:42.040
all possible heights of the graph above that range.

00:06:42.860 --> 00:06:47.281
As the range of input values closes in more and more tightly around 0,

00:06:47.281 --> 00:06:51.639
that range of output values closes in more and more closely around 12.

00:06:52.420 --> 00:06:57.280
And importantly, the size of that range of output values can be made as small as you want.

00:06:59.019 --> 00:07:02.526
As a counterexample, consider a function that looks like this,

00:07:02.526 --> 00:07:06.199
which is also not defined at 0, but kind of jumps up at that point.

00:07:06.959 --> 00:07:11.599
When you approach h equals 0 from the right, the function approaches the value 2,

00:07:11.600 --> 00:07:14.600
but as you come at it from the left, it approaches 1.

00:07:15.540 --> 00:07:20.043
Since there's not a single clear, unambiguous value that this function

00:07:20.043 --> 00:07:24.420
approaches as h approaches 0, the limit is not defined at that point.

00:07:25.160 --> 00:07:30.066
One way to think of this is that when you look at any range of inputs around 0,

00:07:30.065 --> 00:07:35.033
and consider the corresponding range of outputs, as you shrink that input range,

00:07:35.033 --> 00:07:38.959
the corresponding outputs don't narrow in on any specific value.

00:07:39.779 --> 00:07:43.909
Instead, those outputs straddle a range that never shrinks smaller than 1,

00:07:43.910 --> 00:07:47.380
even as you make that input range as tiny as you could imagine.

00:07:48.519 --> 00:07:52.298
This perspective of shrinking an input range around the limiting point,

00:07:52.298 --> 00:07:56.866
and seeing whether or not you're restricted in how much that shrinks the output range,

00:07:56.867 --> 00:08:00.280
leads to something called the epsilon-delta definition of limits.

00:08:01.220 --> 00:08:03.319
Now I should tell you, you could argue that this is

00:08:03.319 --> 00:08:05.500
needlessly heavy duty for an introduction to calculus.

00:08:06.060 --> 00:08:08.353
Like I said, if you know what the word approach means,

00:08:08.353 --> 00:08:11.939
you already know what a limit means, there's nothing new on the conceptual level here.

00:08:12.319 --> 00:08:16.283
But this is an interesting glimpse into the field of real analysis,

00:08:16.283 --> 00:08:21.356
and gives you a taste for how mathematicians make the intuitive ideas of calculus more

00:08:21.357 --> 00:08:22.640
airtight and rigorous.

00:08:23.699 --> 00:08:25.339
You've already seen the main idea here.

00:08:25.660 --> 00:08:29.622
When a limit exists, you can make this output range as small as you want,

00:08:29.622 --> 00:08:33.960
but when the limit doesn't exist, that output range cannot get smaller than some

00:08:33.960 --> 00:08:38.780
particular value, no matter how much you shrink the input range around the limiting input.

00:08:39.678 --> 00:08:42.372
Let's phrase that same idea a little more precisely,

00:08:42.373 --> 00:08:45.879
maybe in the context of this example where the limiting value was 12.

00:08:46.779 --> 00:08:50.033
Think about any distance away from 12, where for some reason it's

00:08:50.033 --> 00:08:53.139
common to use the Greek letter epsilon to denote that distance.

00:08:53.820 --> 00:08:58.040
The intent here is that this distance epsilon is as small as you want.

00:08:58.820 --> 00:09:04.701
What it means for the limit to exist is that you will always be able to find a

00:09:04.701 --> 00:09:10.135
range of inputs around our limiting point, some distance delta around 0,

00:09:10.135 --> 00:09:16.016
so that any input within delta of 0 corresponds to an output within a distance

00:09:16.017 --> 00:09:17.060
epsilon of 12.

00:09:18.419 --> 00:09:21.154
The key point here is that that's true for any epsilon,

00:09:21.154 --> 00:09:24.819
no matter how small, you'll always be able to find the corresponding delta.
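
NOTE
Editor's sketch of the epsilon-delta game for the example with limit 12. The
rule delta = min(1, epsilon / 7) is an assumption worked out for this note, not
something stated in the video: for |h| < 1 the output differs from 12 by
|6h + h^2| < 7|h|. All names are hypothetical, and sampling is only a spot check.
```python
def f(h):
    # Same ratio as in the narration; algebraically 12 + 6h + h^2 for h != 0.
    return ((2 + h) ** 3 - 2 ** 3) / h
def delta_for(epsilon):
    # Assumed choice: if |h| < 1 then |6h + h^2| < 7|h|, so this delta suffices.
    return min(1.0, epsilon / 7.0)
def outputs_stay_close(epsilon, samples=1000):
    # Sample inputs strictly within delta of 0 on both sides, skipping 0 itself.
    delta = delta_for(epsilon)
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)
        for signed_h in (h, -h):
            if abs(f(signed_h) - 12) >= epsilon:
                return False
    return True
for epsilon in [1.0, 0.1, 0.001]:
    print(epsilon, delta_for(epsilon), outputs_stay_close(epsilon))
```
Each epsilon gets its own delta, and every sampled output lands within epsilon of 12.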

00:09:25.580 --> 00:09:29.999
In contrast, when a limit does not exist, as in this example here,

00:09:29.999 --> 00:09:33.495
you can find a sufficiently small epsilon, like 0.4,

00:09:33.495 --> 00:09:39.234
so that no matter how small you make your range around 0, no matter how tiny delta is,

00:09:39.234 --> 00:09:43.060
the corresponding range of outputs is just always too big.

00:09:43.700 --> 00:09:48.640
There is no limiting output where everything is within a distance epsilon of that output.
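
NOTE
Editor's sketch of the failing case. The function jump is a hypothetical stand-in
for the graph in the video (approaching 1 from the left and 2 from the right);
the actual plotted function is not specified in the transcript. The search over
candidate limits uses a coarse grid, so it illustrates the idea, not a proof.
```python
def jump(h):
    # Hypothetical step function, left undefined at h = 0.
    if h == 0:
        raise ValueError("undefined at h = 0")
    return 1.0 if h < 0 else 2.0
def some_limit_works(epsilon, delta):
    # Test one input on each side of 0, both within delta of 0,
    # against candidate limits L on a grid from 0 to 3.
    inputs = (-delta / 2, delta / 2)
    candidates = [k / 100 for k in range(0, 301)]
    return any(all(abs(jump(h) - L) < epsilon for h in inputs) for L in candidates)
for delta in [1.0, 0.01, 1e-9]:
    print(delta, some_limit_works(0.4, delta))
```
With epsilon = 0.4, no candidate is within epsilon of both 1 and 2, however small delta gets.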

00:09:54.100 --> 00:09:57.159
So far, this is all pretty theory-heavy, don't you think?

00:09:57.679 --> 00:10:00.387
Limits being used to formally define the derivative,

00:10:00.388 --> 00:10:04.120
and epsilons and deltas being used to rigorously define the limit itself.

00:10:04.899 --> 00:10:08.259
So let's finish things off here with a trick for actually computing limits.

00:10:09.100 --> 00:10:12.363
For instance, let's say for some reason you were studying

00:10:12.363 --> 00:10:15.740
the function sin of pi times x divided by x squared minus 1.

00:10:16.220 --> 00:10:19.240
Maybe this was modeling some kind of damped oscillation.

00:10:20.240 --> 00:10:23.460
When you plot a bunch of points to graph this, it looks pretty continuous.

00:10:27.279 --> 00:10:29.480
But there's a problematic value at x equals 1.

00:10:30.000 --> 00:10:35.341
When you plug that in, sin of pi is 0, and the denominator also comes out to 0,

00:10:35.341 --> 00:10:39.014
so the function is actually not defined at that input,

00:10:39.014 --> 00:10:41.620
and the graph should have a hole there.

00:10:42.200 --> 00:10:45.511
This also happens at x equals negative 1, but let's just

00:10:45.510 --> 00:10:48.939
focus our attention on a single one of these holes for now.

00:10:50.019 --> 00:10:53.775
The graph certainly does seem to approach a distinct value at that point,

00:10:53.775 --> 00:10:54.639
wouldn't you say?

00:10:57.279 --> 00:11:03.004
So you might ask, how exactly do you find what output this approaches as x approaches 1,

00:11:03.004 --> 00:11:05.000
since you can't just plug in 1?

00:11:07.960 --> 00:11:11.624
Well, one way to approximate it would be to plug in

00:11:11.624 --> 00:11:15.360
a number that's just really close to 1, like 1.00001.

00:11:16.120 --> 00:11:20.080
Doing that, you'd find that this should be a number around negative 1.57.

00:11:21.159 --> 00:11:23.600
But is there a way to know precisely what it is?

00:11:23.960 --> 00:11:27.596
Some systematic process to take an expression like this one,

00:11:27.596 --> 00:11:32.187
that looks like 0 divided by 0 at some input, and ask what is its limit as x

00:11:32.187 --> 00:11:33.500
approaches that input?

00:11:36.440 --> 00:11:40.157
After limits so helpfully let us write the definition for derivatives,

00:11:40.157 --> 00:11:44.700
derivatives can actually come back here and return the favor to help us evaluate limits.

00:11:45.200 --> 00:11:46.020
Let me show you what I mean.

00:11:47.019 --> 00:11:50.367
Here's what the graph of sin of pi times x looks like,

00:11:50.368 --> 00:11:53.900
and here's what the graph of x squared minus 1 looks like.

00:11:53.899 --> 00:11:56.779
That's a lot to have up on the screen, but just

00:11:56.779 --> 00:11:59.419
focus on what's happening around x equals 1.

00:12:00.179 --> 00:12:06.301
The point here is that sin of pi times x and x squared minus 1 are both 0 at that point,

00:12:06.302 --> 00:12:08.159
they both cross the x axis.

00:12:09.000 --> 00:12:14.277
In the same spirit as plugging in a specific value near 1, like 1.00001,

00:12:14.277 --> 00:12:20.640
let's zoom in on that point and consider what happens just a tiny nudge dx away from it.

00:12:21.299 --> 00:12:26.380
The value sin of pi times x is bumped down, and the value of that nudge,

00:12:26.380 --> 00:12:32.159
which was caused by the nudge dx to the input, is what we might call d sin of pi x.

00:12:33.039 --> 00:12:37.259
And from our knowledge of derivatives, using the chain rule,

00:12:37.259 --> 00:12:41.480
that should be around cosine of pi times x times pi times dx.

00:12:42.700 --> 00:12:47.700
Since the starting value was x equals 1, we plug in x equals 1 to that expression.

00:12:51.259 --> 00:12:56.706
In other words, the amount that this sin of pi times x graph changes is roughly

00:12:56.706 --> 00:13:02.360
proportional to dx, with a proportionality constant equal to cosine of pi times pi.

00:13:03.360 --> 00:13:06.645
And cosine of pi, if we think back to our trig knowledge,

00:13:06.645 --> 00:13:11.179
is exactly negative 1, so we can write this whole thing as negative pi times dx.

00:13:12.220 --> 00:13:18.445
Similarly, the value of the x squared minus 1 graph changes by some d of x squared minus 1,

00:13:18.445 --> 00:13:23.540
and taking the derivative, the size of that nudge should be 2x times dx.

00:13:24.480 --> 00:13:29.407
Again, we were starting at x equals 1, so we plug in x equals 1 to that expression,

00:13:29.407 --> 00:13:33.280
meaning the size of that output nudge is about 2 times 1 times dx.

00:13:34.919 --> 00:13:41.277
What this means is that for values of x which are just a tiny nudge dx away from 1,

00:13:41.278 --> 00:13:46.425
the ratio sin of pi x divided by x squared minus 1 is approximately

00:13:46.424 --> 00:13:49.679
negative pi times dx divided by 2 times dx.

00:13:50.899 --> 00:13:54.740
The dx's cancel out, so what's left is negative pi over 2.

00:13:55.720 --> 00:13:58.591
And importantly, those approximations get more and more

00:13:58.591 --> 00:14:01.360
accurate for smaller and smaller choices of dx, right?

00:14:02.309 --> 00:14:05.618
This ratio, negative pi over 2, actually tells

00:14:05.619 --> 00:14:09.000
us the precise limiting value as x approaches 1.

00:14:09.539 --> 00:14:13.106
Remember, what that means is that the limiting height on

00:14:13.106 --> 00:14:16.799
our original graph is evidently exactly negative pi over 2.
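
NOTE
Editor's sketch comparing the two approaches from the narration: plugging in an
input near 1, and taking the ratio of derivatives at 1. The function names are
hypothetical; only Python's standard math module is used.
```python
import math
def ratio(x):
    # sin(pi x) / (x^2 - 1), undefined at x = 1 and x = -1.
    return math.sin(math.pi * x) / (x ** 2 - 1)
def derivative_ratio(x):
    # Derivative of the top over derivative of the bottom: pi cos(pi x) / (2x).
    return math.pi * math.cos(math.pi * x) / (2 * x)
print(ratio(1.00001))         # nearby input, as in the narration
print(derivative_ratio(1.0))  # ratio of derivatives at x = 1
print(-math.pi / 2)           # the claimed limiting value
```
The first number is only an approximation; the second should agree with negative pi over 2.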

00:14:18.220 --> 00:14:21.601
What happened there is a little subtle, so I want to go through it again,

00:14:21.601 --> 00:14:23.340
but this time a little more generally.

00:14:24.120 --> 00:14:29.388
Instead of these two specific functions, which are both equal to 0 at x equals 1,

00:14:29.388 --> 00:14:34.913
think of any two functions, f of x and g of x, which are both 0 at some common value,

00:14:34.913 --> 00:14:35.620
x equals a.

00:14:36.279 --> 00:14:39.562
The only constraint is that these have to be functions where you're

00:14:39.562 --> 00:14:41.928
able to take a derivative of them at x equals a,

00:14:41.928 --> 00:14:45.405
which means they each basically look like a line when you zoom in close

00:14:45.405 --> 00:14:46.419
enough to that value.

00:14:47.799 --> 00:14:52.392
Even though you can't compute f divided by g at this trouble point,

00:14:52.393 --> 00:14:56.514
since both of them equal 0, you can ask about this ratio for

00:14:56.514 --> 00:15:00.500
values of x really close to a, the limit as x approaches a.

00:15:01.220 --> 00:15:06.200
It's helpful to think of those nearby inputs as just a tiny nudge, dx, away from a.

00:15:06.759 --> 00:15:12.160
The value of f at that nudged point is approximately its derivative,

00:15:12.160 --> 00:15:14.979
df over dx evaluated at a, times dx.

00:15:15.980 --> 00:15:22.124
Likewise, the value of g at that nudged point is approximately the derivative of g,

00:15:22.124 --> 00:15:23.879
evaluated at a, times dx.

00:15:25.059 --> 00:15:31.059
Near that trouble point, the ratio between the outputs of f and g is actually about the

00:15:31.059 --> 00:15:37.059
same as the derivative of f at a, times dx, divided by the derivative of g at a, times dx.

00:15:37.879 --> 00:15:41.119
Those dx's cancel out, so the ratio of f and g near a

00:15:41.119 --> 00:15:44.540
is about the same as the ratio between their derivatives.

00:15:45.860 --> 00:15:50.307
Because each of those approximations gets more and more accurate for smaller and

00:15:50.307 --> 00:15:54.700
smaller nudges, this ratio of derivatives gives the precise value for the limit.
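
NOTE
Editor's sketch of the argument above in LaTeX, assuming f(a) = g(a) = 0, that
both functions are differentiable at a, and (an assumption the narration leaves
implicit) that the derivative of g at a is nonzero.
```latex
\lim_{x \to a} \frac{f(x)}{g(x)}
  = \lim_{dx \to 0} \frac{\frac{df}{dx}(a)\,dx}{\frac{dg}{dx}(a)\,dx}
  = \frac{\frac{df}{dx}(a)}{\frac{dg}{dx}(a)}
```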

00:15:55.539 --> 00:15:58.500
This is a really handy trick for computing a lot of limits.

00:15:58.919 --> 00:16:02.937
Whenever you come across some expression that seems to equal 0 divided by

00:16:02.937 --> 00:16:06.900
0 when you plug in some particular input, just try taking the derivative

00:16:06.900 --> 00:16:10.919
of the top and bottom expressions and plugging in that same trouble input.

00:16:13.980 --> 00:16:16.300
This clever trick is called L'Hopital's Rule.

00:16:17.240 --> 00:16:20.182
Interestingly, it was actually discovered by Johann Bernoulli,

00:16:20.182 --> 00:16:22.844
but L'Hopital was this wealthy dude who essentially paid

00:16:22.844 --> 00:16:25.879
Bernoulli for the rights to some of his mathematical discoveries.

00:16:26.740 --> 00:16:30.076
Academia was weird back then, but in a very literal way,

00:16:30.076 --> 00:16:32.460
it pays to understand these tiny nudges.

00:16:34.960 --> 00:16:38.716
Right now, you might be remembering that the definition of a derivative

00:16:38.716 --> 00:16:42.264
for a given function comes down to computing the limit of a certain

00:16:42.264 --> 00:16:45.657
fraction that looks like 0 divided by 0, so you might think that

00:16:45.657 --> 00:16:49.780
L'Hopital's Rule could give us a handy way to discover new derivative formulas.

00:16:50.679 --> 00:16:53.473
But that would actually be cheating, since presumably

00:16:53.474 --> 00:16:56.320
you don't know what the derivative of the numerator is.

00:16:57.019 --> 00:16:59.593
When it comes to discovering derivative formulas,

00:16:59.594 --> 00:17:02.374
something we've been doing a fair amount this series,

00:17:02.374 --> 00:17:04.640
there is no systematic plug-and-chug method.

00:17:05.118 --> 00:17:05.959
But that's a good thing!

00:17:06.400 --> 00:17:09.373
Whenever creativity is needed to solve problems like these,

00:17:09.373 --> 00:17:11.901
it's a good sign that you're doing something real,

00:17:11.901 --> 00:17:15.420
something that might give you a powerful tool to solve future problems.

00:17:18.259 --> 00:17:22.907
And speaking of powerful tools, up next I'm going to be talking about what an integral

00:17:22.907 --> 00:17:25.686
is, as well as the fundamental theorem of calculus,

00:17:25.686 --> 00:17:30.442
another example of where limits can be used to give a clear meaning to a pretty delicate

00:17:30.442 --> 00:17:32.099
idea that flirts with infinity.

00:17:33.579 --> 00:17:36.818
As you know, most support for this channel comes through Patreon,

00:17:36.818 --> 00:17:40.795
and the primary perk for patrons is early access to future series like this one,

00:17:40.795 --> 00:17:43.200
where the next one is going to be on probability.

00:17:44.259 --> 00:17:47.754
But for those of you who want a more tangible way to flag that

00:17:47.755 --> 00:17:51.640
you're part of the community, there is also a small 3blue1brown store.

00:17:52.299 --> 00:17:53.960
Links on the screen and in the description.

00:17:54.680 --> 00:18:05.081
I'm still debating whether or not to make a preliminary batch of plushie pi creatures,

00:18:05.080 --> 00:18:14.064
it depends on how many viewers seem interested in the store more generally,

00:18:14.065 --> 00:18:23.875
but let me know in the comments what other kinds of things you'd like to see in there.

00:18:23.875 --> 00:18:26.240
Thanks for watching!
