In real analysis, a branch of mathematics, the inverse function theorem is a theorem that asserts that, if a real function f has a continuous derivative near a point where its derivative is nonzero, then, near this point, f has an inverse function. The inverse function is also continuously differentiable, and the inverse function rule expresses its derivative as the multiplicative inverse of the derivative of f.
The theorem applies verbatim to complex-valued functions of a complex variable. It generalizes to functions from n-tuples (of real or complex numbers) to n-tuples, and to functions between vector spaces of the same finite dimension, by replacing "derivative" with "Jacobian matrix" and "nonzero derivative" with "nonzero Jacobian determinant".
If the function of the theorem belongs to a higher differentiability class, the same is true for the inverse function. There are also versions of the inverse function theorem for holomorphic functions, for differentiable maps between manifolds, for differentiable functions between Banach spaces, and so forth.
The theorem was first established by Picard and Goursat using an iterative scheme: the basic idea is to prove a fixed point theorem using the contraction mapping theorem.
Statements
For functions of a single variable, the theorem states that if $f$ is a continuously differentiable function with nonzero derivative at the point $a$, then $f$ is injective (or bijective onto the image) in a neighborhood of $a$, the inverse is continuously differentiable near $b = f(a)$, and the derivative of the inverse function at $b$ is the reciprocal of the derivative of $f$ at $a$:
$$\bigl(f^{-1}\bigr)'(b) = \frac{1}{f'(a)} = \frac{1}{f'(f^{-1}(b))}.$$
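The reciprocal rule $(f^{-1})'(b) = 1/f'(a)$ can be checked numerically. The following is a minimal sketch, assuming the illustrative function $f(x) = x^3 + x$ (chosen here, not taken from the article), whose derivative $3x^2 + 1$ never vanishes. The helper names `inverse_by_bisection` and `central_difference` are hypothetical.

```python
def f(x: float) -> float:
    """Illustrative strictly increasing function (an assumption of this sketch)."""
    return x**3 + x


def f_prime(x: float) -> float:
    """Exact derivative of f."""
    return 3 * x**2 + 1


def inverse_by_bisection(y: float, lo: float = -10.0, hi: float = 10.0) -> float:
    """Solve f(x) = y on [lo, hi] by bisection; valid because f is increasing."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def central_difference(g, y: float, h: float = 1e-5) -> float:
    """Symmetric finite-difference estimate of g'(y)."""
    return (g(y + h) - g(y - h)) / (2 * h)


a = 1.0
b = f(a)
numeric = central_difference(inverse_by_bisection, b)  # estimate of (f^{-1})'(b)
predicted = 1 / f_prime(a)                             # reciprocal rule
print(numeric, predicted)
```

The two printed values should agree to several decimal places; the finite-difference step `h` is a tunable assumption.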
It can happen that a function $f$ is injective near a point $a$ while $f'(a) = 0$. An example is $f(x) = (x - a)^3$. In fact, for such a function, the inverse cannot be differentiable at $b = f(a)$, since if $f^{-1}$ were differentiable at $b$, then, by the chain rule, $1 = (f^{-1} \circ f)'(a) = (f^{-1})'(b)\, f'(a)$, which implies $f'(a) \neq 0$. (The situation is different for holomorphic functions; see Holomorphic inverse function theorem below.)
For functions of more than one variable, the theorem states that if $f$ is a continuously differentiable function from an open subset $A$ of $\mathbb{R}^n$ into $\mathbb{R}^n$, and the derivative $f'(a)$ is invertible at a point a (that is, the determinant of the Jacobian matrix of f at a is non-zero), then there exist neighborhoods $U$ of $a$ in $A$ and $V$ of $b = f(a)$ such that $f(U) \subset V$ and $f : U \to V$ is bijective.[1] Writing $f = (f_1, \ldots, f_n)$, this means that the system of n equations $y_i = f_i(x_1, \dots, x_n)$ has a unique solution for $x_1, \dots, x_n$ in terms of $y_1, \dots, y_n$ when $x \in U$, $y \in V$. Note that the theorem does not say $f$ is bijective onto the image where $f'$ is invertible but that it is locally bijective where $f'$ is invertible.
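Moreover, the theorem says that the inverse function $f^{-1} : V \to U$ is continuously differentiable, and its derivative at $b = f(a)$ is the inverse map of $f'(a)$; i.e.,
$$(f^{-1})'(b) = f'(a)^{-1}.$$
In other words, if $Jf^{-1}(b)$ and $Jf(a)$ are the Jacobian matrices representing $(f^{-1})'(b)$ and $f'(a)$, this means:
$$Jf^{-1}(b) = Jf(a)^{-1}.$$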
The hard part of the theorem is the existence and differentiability of $f^{-1}$. Assuming this, the inverse derivative formula follows from the chain rule applied to $f^{-1} \circ f = I$. (Indeed, $I = I'(a) = (f^{-1} \circ f)'(a) = (f^{-1})'(b) \circ f'(a)$.) Since taking the inverse is infinitely differentiable, the formula for the derivative of the inverse shows that if $f$ is continuously $k$ times differentiable, with invertible derivative at the point a, then the inverse is also continuously $k$ times differentiable. Here $k$ is a positive integer or $\infty$.
There are two variants of the inverse function theorem.[1] Given a continuously differentiable map $f : U \to \mathbb{R}^m$, the first is
- The derivative $f'(a)$ is surjective (i.e., the Jacobian matrix representing it has rank $m$) if and only if there exists a continuously differentiable function $g$ on a neighborhood $V$ of $b = f(a)$ such that $f \circ g = I$ near $b$,
and the second is
- The derivative $f'(a)$ is injective if and only if there exists a continuously differentiable function $g$ on a neighborhood $V$ of $b = f(a)$ such that $g \circ f = I$ near $a$.
In the first case (when $f'(a)$ is surjective), the point $b = f(a)$ is called a regular value. Since
, the first case is equivalent to saying
is not in the image of critical points
(a critical point is a point
such that the kernel of
is nonzero). The statement in the first case is a special case of the submersion theorem.
These variants are restatements of the inverse function theorem. Indeed, in the first case when $f'(a)$ is surjective, we can find an (injective) linear map $T$ such that $f'(a) \circ T = I$. Define $h(x) = a + Tx$ so that we have:
$$(f \circ h)'(0) = f'(a) \circ T = I.$$
Thus, by the inverse function theorem, $f \circ h$ has an inverse near $0$; i.e., $f \circ h \circ (f \circ h)^{-1} = I$ near $b$. The second case ($f'(a)$ is injective) is seen in a similar way.
Example
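Consider the vector-valued function $F : \mathbb{R}^2 \to \mathbb{R}^2$ defined by:
$$F(x, y) = \begin{bmatrix} e^x \cos y \\ e^x \sin y \end{bmatrix}.$$
The Jacobian matrix of it at $(x, y)$ is:
$$JF(x, y) = \begin{bmatrix} e^x \cos y & -e^x \sin y \\ e^x \sin y & e^x \cos y \end{bmatrix}$$
with the determinant:
$$\det JF(x, y) = e^{2x} \cos^2 y + e^{2x} \sin^2 y = e^{2x}.$$
The determinant $e^{2x}$ is nonzero everywhere. Thus the theorem guarantees that, for every point p in $\mathbb{R}^2$, there exists a neighborhood about p over which F is invertible. This does not mean F is invertible over its entire domain: in this case F is not even injective since it is periodic: $F(x, y) = F(x, y + 2\pi)$.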
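Both claims of the example (a nonvanishing Jacobian determinant, yet no global injectivity) can be checked numerically. The sketch below assumes the example map is $F(x, y) = (e^x \cos y, e^x \sin y)$ with Jacobian determinant $e^{2x}$; the function names are illustrative.

```python
import math
from typing import Tuple


def F(x: float, y: float) -> Tuple[float, float]:
    """The assumed example map F(x, y) = (e^x cos y, e^x sin y)."""
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))


def jacobian_det(x: float, y: float) -> float:
    """Determinant of [[e^x cos y, -e^x sin y], [e^x sin y, e^x cos y]]."""
    ex = math.exp(x)
    a, b = ex * math.cos(y), -ex * math.sin(y)
    c, d = ex * math.sin(y), ex * math.cos(y)
    return a * d - b * c


p = (0.3, 1.2)
det_at_p = jacobian_det(*p)        # should match e^{2x} up to rounding
q = (p[0], p[1] + 2 * math.pi)     # a different point with the same image
print(det_at_p, math.exp(2 * p[0]))
print(F(*p), F(*q))
```

The points `p` and `q` are distinct but map to the same value, which is why the inverse exists only locally.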
Counter-example
If one drops the assumption that the derivative is continuous, the function is no longer necessarily locally injective. For example, $f(x) = x + 2x^2 \sin(1/x)$ for $x \neq 0$ and $f(0) = 0$ has discontinuous derivative $f'(x) = 1 - 2\cos(1/x) + 4x \sin(1/x)$ and $f'(0) = 1$, which vanishes arbitrarily close to $x = 0$. These critical points are local max/min points of $f$, so $f$ is not one-to-one (and not invertible) on any interval containing $x = 0$. Intuitively, the slope $f'(0) = 1$ does not propagate to nearby points, where the slopes are governed by a weak but rapid oscillation.
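The sign changes of the derivative near the origin can be seen numerically. This sketch assumes the counter-example is $f(x) = x + 2x^2 \sin(1/x)$ with $f(0) = 0$, so that $f'(x) = 1 - 2\cos(1/x) + 4x\sin(1/x)$ for $x \neq 0$. The sample points are chosen for illustration.

```python
import math


def f_prime(x: float) -> float:
    """Derivative of the assumed f(x) = x + 2 x^2 sin(1/x), with f'(0) = 1."""
    if x == 0.0:
        return 1.0
    return 1 - 2 * math.cos(1 / x) + 4 * x * math.sin(1 / x)


ks = (1, 10, 100, 1000)
# At x = 1/(2 pi k): cos(1/x) = 1 and sin(1/x) = 0, so the slope is about -1.
negative_slopes = [f_prime(1 / (2 * math.pi * k)) for k in ks]
# At x = 1/((2k + 1) pi): cos(1/x) = -1 and sin(1/x) = 0, so the slope is about 3.
positive_slopes = [f_prime(1 / ((2 * k + 1) * math.pi)) for k in ks]
print(negative_slopes)
print(positive_slopes)
```

Because slopes of both signs occur at points arbitrarily close to 0, the function cannot be monotonic, and hence cannot be injective, on any interval around 0.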
If the derivative is continuous but zero at a point, the function is no longer necessarily locally injective. A real function that is locally constant at a point $a$ in the interior of its domain is not locally injective at $a$ but is trivially continuously differentiable at $a$.
Methods of proof
As an important result, the inverse function theorem has been given numerous proofs. The proof most commonly seen in textbooks relies on the contraction mapping principle, also known as the Banach fixed-point theorem (which can also be used as the key step in the proof of existence and uniqueness of solutions to ordinary differential equations).[2][3]
Since the fixed point theorem applies in infinite-dimensional (Banach space) settings, this proof generalizes immediately to the infinite-dimensional version of the inverse function theorem[4] (see Generalizations below).
An alternate proof in finite dimensions hinges on the extreme value theorem for functions on a compact set.[5] This approach has an advantage that the proof generalizes to a situation where there is no Cauchy completeness (see § Over a real closed field).
Yet another proof uses Newton's method, which has the advantage of providing an effective version of the theorem: bounds on the derivative of the function imply an estimate of the size of the neighborhood on which the function is invertible.[6]
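As a rough illustration of how Newton's method produces a local inverse, the sketch below solves $F(x) = y$ for a map of the plane, starting from a nearby guess. It is a plain Newton iteration, not the effective version with explicit neighborhood bounds; the map $F(x_1, x_2) = (e^{x_1}\cos x_2, e^{x_1}\sin x_2)$, the starting point, and the name `newton_invert` are assumptions made for the example.

```python
import math
from typing import Callable, Tuple

Vec = Tuple[float, float]
Mat = Tuple[Vec, Vec]


def F(p: Vec) -> Vec:
    """Illustrative map with an everywhere-invertible Jacobian."""
    return (math.exp(p[0]) * math.cos(p[1]), math.exp(p[0]) * math.sin(p[1]))


def J(p: Vec) -> Mat:
    """Analytic Jacobian matrix of F at p."""
    ex = math.exp(p[0])
    return ((ex * math.cos(p[1]), -ex * math.sin(p[1])),
            (ex * math.sin(p[1]), ex * math.cos(p[1])))


def newton_invert(F: Callable[[Vec], Vec], J: Callable[[Vec], Mat],
                  y: Vec, x0: Vec, tol: float = 1e-12, max_iter: int = 50) -> Vec:
    """Iterate x <- x - J(x)^{-1} (F(x) - y) until the residual is below tol."""
    x = x0
    for _ in range(max_iter):
        fx = F(x)
        r = (fx[0] - y[0], fx[1] - y[1])
        if math.hypot(r[0], r[1]) < tol:
            return x
        (a, b), (c, d) = J(x)
        det = a * d - b * c
        # Solve J * step = r using the explicit inverse of a 2x2 matrix.
        step = ((d * r[0] - b * r[1]) / det, (-c * r[0] + a * r[1]) / det)
        x = (x[0] - step[0], x[1] - step[1])
    raise RuntimeError("Newton iteration did not converge")


target = F((0.2, 0.5))
x_found = newton_invert(F, J, target, x0=(0.0, 0.3))
print(x_found)
```

Convergence is only expected when the starting guess is close enough to the true preimage, which mirrors the local nature of the theorem.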
Proof for single-variable functions
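We want to prove the following: Let $D \subseteq \mathbb{R}$ be an open set with $x_0 \in D$, $f : D \to \mathbb{R}$ a continuously differentiable function defined on $D$, and suppose that $f'(x_0) \neq 0$. Then there exists an open interval $I$ with $x_0 \in I$ such that $f$ maps $I$ bijectively onto the open interval $J = f(I)$, and such that the inverse function $f^{-1} : J \to I$ is continuously differentiable, and for any $y \in J$, if $x \in I$ is such that $f(x) = y$, then $(f^{-1})'(y) = \dfrac{1}{f'(x)}$.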
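We may without loss of generality assume that $f'(x_0) > 0$. Given that $D$ is an open set and $f'$ is continuous at $x_0$, there exists $r > 0$ such that $(x_0 - r, x_0 + r) \subseteq D$ and
$$|f'(x) - f'(x_0)| < \frac{f'(x_0)}{2} \quad \text{for all } |x - x_0| < r.$$
In particular,
$$f'(x) > \frac{f'(x_0)}{2} > 0 \quad \text{for all } |x - x_0| < r.$$
This shows that $f$ is strictly increasing for all $|x - x_0| < r$. Let $\delta > 0$ be such that $\delta < r$. Then $[x_0 - \delta, x_0 + \delta] \subseteq (x_0 - r, x_0 + r)$. By the intermediate value theorem, we find that $f$ maps the interval $[x_0 - \delta, x_0 + \delta]$ bijectively onto $[f(x_0 - \delta), f(x_0 + \delta)]$. Denote by $I = (x_0 - \delta, x_0 + \delta)$ and $J = (f(x_0 - \delta), f(x_0 + \delta))$. Then $f : I \to J$ is a bijection and the inverse $f^{-1} : J \to I$ exists. The fact that $f^{-1}$ is differentiable follows from the differentiability of $f$. In particular, the result follows from the fact that if $f : I \to \mathbb{R}$ is a strictly monotonic and continuous function that is differentiable at $x_0 \in I$ with $f'(x_0) \neq 0$, then $f^{-1} : f(I) \to \mathbb{R}$ is differentiable with $(f^{-1})'(y_0) = \dfrac{1}{f'(x_0)}$, where $y_0 = f(x_0)$ (a standard result in analysis). This completes the proof.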
A proof using successive approximation
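To prove existence, it can be assumed after an affine transformation that $f(0) = 0$ and $f'(0) = I$, so that $a = b = 0$.
By the mean value theorem for vector-valued functions, for a differentiable function $u : [0, 1] \to \mathbb{R}^m$, $\|u(1) - u(0)\| \le \sup_{0 \le t \le 1} \|u'(t)\|$. Setting $u(t) = f(x + t(x' - x)) - x - t(x' - x)$, it follows that
$$\|f(x) - f(x') - x + x'\| \le \|x - x'\| \sup_{0 \le t \le 1} \|f'(x + t(x' - x)) - I\|.$$
Now choose $\delta > 0$ so that $\|f'(x) - I\| < \tfrac{1}{2}$ for $\|x\| < \delta$. Suppose that $\|y\| < \delta/2$ and define $x_n$ inductively by $x_0 = 0$ and $x_{n+1} = x_n + y - f(x_n)$. The assumptions show that if $\|x\|, \|x'\| < \delta$ then
$$\|f(x) - f(x') - x + x'\| \le \|x - x'\|/2.$$
In particular $f(x) = f(x')$ implies $x = x'$. In the inductive scheme $\|x_n\| < \delta$ and $\|x_{n+1} - x_n\| < \delta/2^n$. Thus $(x_n)$ is a Cauchy sequence tending to $x$. By construction $f(x) = y$ as required.
To check that $g = f^{-1}$ is C1, write $g(y + k) = x + h$ so that $f(x + h) = f(x) + k$. By the inequalities above, $\|h - k\| < \|h\|/2$ so that $\|h\|/2 < \|k\| < 2\|h\|$.
On the other hand, if $A = f'(x)$, then $\|A - I\| < 1/2$. Using the geometric series for $B = I - A$, it follows that $\|A^{-1}\| < 2$. But then
$$\frac{\|g(y + k) - g(y) - f'(g(y))^{-1} k\|}{\|k\|} = \frac{\|h - f'(x)^{-1}[f(x + h) - f(x)]\|}{\|k\|} \le 4\, \frac{\|f(x + h) - f(x) - f'(x) h\|}{\|h\|}$$
tends to 0 as $k$ and $h$ tend to 0, proving that $g$ is C1 with $g'(y) = f'(g(y))^{-1}$.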
The proof above is presented for a finite-dimensional space, but applies equally well for Banach spaces. If an invertible function $f$ is Ck with $k > 1$, then so too is its inverse. This follows by induction using the fact that the map $F(A) = A^{-1}$ on operators is Ck for any $k$ (in the finite-dimensional case this is an elementary fact because the inverse of a matrix is given as the adjugate matrix divided by its determinant).[1][7]
The method of proof here can be found in the books of Henri Cartan, Jean Dieudonné, Serge Lang, Roger Godement and Lars Hörmander.
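The successive approximation scheme used in this proof, $x_0 = 0$ and $x_{n+1} = x_n + y - f(x_n)$, can be run directly. The sketch below assumes an illustrative map $f(x_1, x_2) = (x_1 + x_2^2/4,\; x_2 + x_1^2/4)$, which satisfies $f(0) = 0$ and $f'(0) = I$ with $\|f'(x) - I\| < 1/2$ on the unit ball. The map and the function name `invert_by_iteration` are assumptions, not part of the proof.

```python
from typing import Tuple

Vec = Tuple[float, float]


def f(x: Vec) -> Vec:
    """Illustrative map with f(0) = 0 and f'(0) = I (an assumption of this sketch)."""
    return (x[0] + 0.25 * x[1] ** 2, x[1] + 0.25 * x[0] ** 2)


def invert_by_iteration(y: Vec, steps: int = 60) -> Vec:
    """Run x_{n+1} = x_n + y - f(x_n), starting from x_0 = 0."""
    x = (0.0, 0.0)
    for _ in range(steps):
        fx = f(x)
        x = (x[0] + y[0] - fx[0], x[1] + y[1] - fx[1])
    return x


y = (0.1, -0.2)               # small enough that the iterates stay in the unit ball
x = invert_by_iteration(y)
fx = f(x)
residual = max(abs(fx[0] - y[0]), abs(fx[1] - y[1]))
print(x, residual)
```

Each step at least halves the distance between consecutive iterates, so a fixed number of steps suffices here; a production solver would instead stop on a residual tolerance.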
A proof using the contraction mapping principle
Here is a proof based on the contraction mapping theorem. Specifically, following T. Tao,[8] it uses the following consequence of the contraction mapping theorem.
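Lemma: Let $B(0, r)$ denote an open ball of radius r in $\mathbb{R}^n$ with center 0 and $g : B(0, r) \to \mathbb{R}^n$ a map with a constant $0 < c < 1$ such that
$$|g(y) - g(x)| \le c\,|y - x|$$
for all $x, y$ in $B(0, r)$. Then for $f = I + g$ on $B(0, r)$, we have
$$(1 - c)\,|x - y| \le |f(x) - f(y)|,$$
in particular, f is injective. If, moreover, $g(0) = 0$, then
$$B(0, (1 - c) r) \subset f(B(0, r)) \subset B(0, (1 + c) r).$$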
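More generally, the statement remains true if $\mathbb{R}^n$ is replaced by a Banach space. Also, the first part of the lemma is true for any normed space.
Basically, the lemma says that a small perturbation of the identity map by a contraction map is injective and preserves a ball in some sense. Assuming the lemma for a moment, we prove the theorem first. As in the above proof, it is enough to prove the special case when $a = 0$, $b = f(a) = 0$ and $f'(0) = I$. Let $g = f - I$. The mean value inequality applied to $t \mapsto g(x + t(y - x))$ says:
$$|g(y) - g(x)| \le |y - x| \sup_{0 < t < 1} |g'(x + t(y - x))|.$$
Since $g'(0) = I - I = 0$ and $g'$ is continuous, we can find an $r > 0$ such that
$$|g(y) - g(x)| \le 2^{-1} |y - x|$$
for all $x, y$ in $B(0, r)$. Then the early lemma says that $f = g + I$ is injective on $B(0, r)$ and $B(0, r/2) \subset f(B(0, r))$. Then
$$f : U = B(0, r) \cap f^{-1}(B(0, r/2)) \to V = B(0, r/2)$$
is bijective and thus has an inverse. Next, we show the inverse is continuously differentiable (this part of the argument is the same as that in the previous proof). This time, let $g = f^{-1}$ denote the inverse of $f$ and $A = f'(x)$. For $x = g(y)$, we write $g(y + k) = x + h$ or $y + k = f(x + h)$. Now, by the early estimate, we have
$$|h - k| = |f(x + h) - f(x) - h| \le |h|/2$$
and so $|h|/2 \le |k|$. Writing $\|\cdot\|$ for the operator norm,
$$|g(y + k) - g(y) - A^{-1} k| = |h - A^{-1}(f(x + h) - f(x))| \le \|A^{-1}\|\, |Ah - f(x + h) + f(x)|.$$
As $k \to 0$, we have $h \to 0$ and $|h|/|k|$ is bounded. Hence, $g$ is differentiable at $y$ with the derivative $g'(y) = f'(g(y))^{-1}$. Also, $g'$ is the same as the composition $\iota \circ f' \circ g$ where $\iota : T \mapsto T^{-1}$; so $g'$ is continuous.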
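It remains to show the lemma. First, we have:
$$|x - y| - |f(x) - f(y)| \le |g(x) - g(y)| \le c\,|x - y|,$$
which is to say
$$(1 - c)\,|x - y| \le |f(x) - f(y)|.$$
This proves the first part. Next, we show $f(B(0, r)) \supset B(0, (1 - c) r)$. The idea is to note that this is equivalent to, given a point $y$ in $B(0, (1 - c) r)$, finding a fixed point of the map
$$F : \overline{B}(0, r') \to \overline{B}(0, r'), \quad x \mapsto y - g(x),$$
where $0 < r' < r$ is such that $|y| \le (1 - c) r'$ and the bar means a closed ball. To find a fixed point, we use the contraction mapping theorem; checking that $F$ is a well-defined strict-contraction mapping is straightforward. Finally, we have $f(B(0, r)) \subset B(0, (1 + c) r)$ since
$$|f(x)| = |x + g(x) - g(0)| \le (1 + c)\,|x|.$$
As might be clear, this proof is not substantially different from the previous one, as the proof of the contraction mapping theorem is by successive approximation.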
Applications
Implicit function theorem
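The inverse function theorem can be used to solve a system of equations
$$f_1(x) = y_1, \quad \dots, \quad f_n(x) = y_n,$$
i.e., expressing $x_1, \dots, x_n$ as functions of $y = (y_1, \dots, y_n)$, provided the Jacobian matrix is invertible. The implicit function theorem allows one to solve a more general system of equations:
$$f_1(x, y) = 0, \quad \dots, \quad f_m(x, y) = 0$$
for $y$ in terms of $x$. Though more general, the theorem is actually a consequence of the inverse function theorem. First, the precise statement of the implicit function theorem is as follows:[9]
- given a map $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$, if $f(a, b) = 0$, $f$ is continuously differentiable in a neighborhood of $(a, b)$ and the derivative of $y \mapsto f(a, y)$ at $b$ is invertible, then there exists a differentiable map $g : U \to V$ for some neighborhoods $U, V$ of $a, b$ such that $f(x, g(x)) = 0$. Moreover, if $f(x, y) = 0$, $x \in U$, $y \in V$, then $y = g(x)$; i.e., $g(x)$ is a unique solution.
To see this, consider the map $F(x, y) = (x, f(x, y))$. By the inverse function theorem, $F : U \times V \to W$ has the inverse $G$ for some neighborhoods $U, V, W$. We then have:
$$(x, y) = F(G_1(x, y), G_2(x, y)) = (G_1(x, y), f(G_1(x, y), G_2(x, y))),$$
implying $x = G_1(x, y)$ and $y = f(x, G_2(x, y))$. Thus $g(x) = G_2(x, 0)$ has the required property.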
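A concrete instance may help. The sketch below assumes the illustrative equation $f(x, y) = x^2 + y^2 - 1 = 0$ near the point $(0.6, 0.8)$, where $\partial f / \partial y = 1.6 \neq 0$. It computes the implicit function $g$ by Newton's method in $y$ and compares a finite-difference slope with the standard chain-rule formula $g'(x) = -f_x / f_y$. The function names and step counts are assumptions.

```python
import math


def f(x: float, y: float) -> float:
    """Illustrative constraint: the unit circle."""
    return x * x + y * y - 1.0


def f_y(x: float, y: float) -> float:
    """Partial derivative of f with respect to y."""
    return 2.0 * y


def g(x: float, y_start: float = 0.8, steps: int = 50) -> float:
    """Solve f(x, y) = 0 for y near y_start by Newton's method in y."""
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / f_y(x, y)
    return y


x = 0.61
y = g(x)
h = 1e-6
slope_numeric = (g(x + h) - g(x - h)) / (2 * h)
slope_formula = -(2 * x) / (2 * y)   # -f_x / f_y
print(y, math.sqrt(1 - x * x))
print(slope_numeric, slope_formula)
```

Starting the Newton iteration near $y = 0.8$ selects the upper branch of the circle; starting near $-0.8$ would select the other branch, which illustrates why uniqueness holds only within suitable neighborhoods.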
Giving a manifold structure
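In differential geometry, the inverse function theorem is used to show that the pre-image of a regular value under a smooth map is a manifold.[10] Indeed, let $f : U \to \mathbb{R}^r$ be such a smooth map from an open subset of $\mathbb{R}^n$ (since the result is local, there is no loss of generality with considering such a map), and let $b$ be a regular value. Fix a point $a$ in $f^{-1}(b)$ and then, by permuting the coordinates on $\mathbb{R}^n$, assume the matrix $\left[ \frac{\partial f_i}{\partial x_j}(a) \right]_{1 \le i, j \le r}$ has rank $r$. Then the map
$$F : U \to \mathbb{R}^r \times \mathbb{R}^{n-r} = \mathbb{R}^n, \quad x \mapsto (f(x), x_{r+1}, \dots, x_n)$$
is such that $F'(a)$ has rank $n$. Hence, by the inverse function theorem, we find the smooth inverse $G$ of $F$ defined in a neighborhood $V \times W$ of $(b, a_{r+1}, \dots, a_n)$. We then have
$$x = (F \circ G)(x) = (f(G(x)), G_{r+1}(x), \dots, G_n(x)),$$
which implies
$$(f \circ G)(x_1, \dots, x_n) = (x_1, \dots, x_r).$$
That is, after the change of coordinates by $G$, $f$ is a coordinate projection (this fact is known as the submersion theorem). Moreover, since $G : V \times W \to U' = G(V \times W)$ is bijective, the map
$$g = G(b, \cdot) : W \to f^{-1}(b) \cap U', \quad (x_{r+1}, \dots, x_n) \mapsto G(b, x_{r+1}, \dots, x_n)$$
is bijective with the smooth inverse. That is to say, $g$ gives a local parametrization of $f^{-1}(b)$ around $a$. Hence, $f^{-1}(b)$ is a manifold.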
(Note the proof is quite similar to the proof of the implicit function theorem and, in fact, the implicit function theorem can be also used instead.)
More generally, the theorem shows that if a smooth map $f : P \to E$ is transversal to a submanifold $M \subset E$, then the pre-image $f^{-1}(M) \hookrightarrow P$ is a submanifold.[11]
Global version
The inverse function theorem is a local result; it applies to each point. A priori, the theorem thus only shows the function is locally bijective (or locally diffeomorphic of some class). The next topological lemma can be used to upgrade local injectivity to injectivity that is global to some extent.
Lemma:[12][13] If $A$ is a closed subset of a (second-countable) topological manifold $X$ (or, more generally, a topological space admitting an exhaustion by compact subsets) and $f : X \to Z$, $Z$ some topological space, is a local homeomorphism that is injective on $A$, then $f$ is injective on some neighborhood of $A$.
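Proof:[14] First assume $X$ is compact. If the conclusion of the lemma is false, we can find two sequences $x_i \neq y_i$ such that $f(x_i) = f(y_i)$ and $x_i, y_i$ each converge to some points $x, y$ in $A$. Since $f$ is injective on $A$, $x = y$. Now, if $i$ is large enough, $x_i, y_i$ are in a neighborhood of $x = y$ where $f$ is injective; thus, $x_i = y_i$, a contradiction.
In general, consider the set $E = \{ (x, y) \in X^2 \mid x \neq y,\ f(x) = f(y) \}$. It is disjoint from $S \times S$ for any subset $S \subset X$ where $f$ is injective. Let $X_1 \subset X_2 \subset \cdots$ be an increasing sequence of compact subsets with union $X$ and with $X_i$ contained in the interior of $X_{i+1}$. Then, by the first part of the proof, for each $i$, we can find a neighborhood $U_i$ of $A \cap X_i$ such that $U_i^2 \subset X^2 - E$. Then $U = \bigcup_i U_i$ has the required property.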
(See also [15] for an alternative approach.)
The lemma implies the following (a sort of) global version of the inverse function theorem:
Inverse function theorem:[16] Let $f : U \to V$ be a map between open subsets of $\mathbb{R}^n$ or more generally of manifolds. Assume $f$ is continuously differentiable (or is $C^k$). If $f$ is injective on a closed subset $A \subset U$ and if the Jacobian matrix of $f$ is invertible at each point of $A$, then $f$ is injective on a neighborhood $A'$ of $A$ and $f^{-1} : f(A') \to A'$ is continuously differentiable (or is $C^k$).
Note that if $A$ is a point, then the above is the usual inverse function theorem.
Holomorphic inverse function theorem
There is a version of the inverse function theorem for holomorphic maps.
Theorem:[17][18] Let $U, V \subset \mathbb{C}^n$ be open subsets such that $0 \in U$ and $f : U \to V$ a holomorphic map whose Jacobian matrix in variables $z_i, \bar{z}_i$ is invertible (the determinant is nonzero) at $0$. Then $f$ is injective in some neighborhood $W$ of $0$ and the inverse $f^{-1} : f(W) \to W$ is holomorphic.
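The theorem follows from the usual inverse function theorem. Indeed, let $J_{\mathbb{R}}(f)$ denote the Jacobian matrix of $f$ in variables $x_i, y_i$ and $J(f)$ for that in $z_j, \bar{z}_j$. Then we have $\det J_{\mathbb{R}}(f) = |\det J(f)|^2$, which is nonzero by assumption. Hence, by the usual inverse function theorem, $f$ is injective near $0$ with continuously differentiable inverse. By chain rule, with $w = f(z)$,
$$\frac{\partial}{\partial \bar{z}_j} (f_j^{-1} \circ f)(z) = \sum_k \frac{\partial f_j^{-1}}{\partial w_k}(w) \frac{\partial f_k}{\partial \bar{z}_j}(z) + \sum_k \frac{\partial f_j^{-1}}{\partial \bar{w}_k}(w) \frac{\partial \bar{f}_k}{\partial \bar{z}_j}(z)$$
where the left-hand side and the first term on the right vanish since $f_j^{-1} \circ f$ and $f_k$ are holomorphic. Thus, $\frac{\partial f_j^{-1}}{\partial \bar{w}_k}(w) = 0$ for each $k$.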
Similarly, there is the implicit function theorem for holomorphic functions.[19]
As already noted earlier, it can happen that an injective smooth function has an inverse that is not smooth (e.g., $f(x) = x^3$ in a real variable). This is not the case for holomorphic functions because of:
Proposition:[19] If $f : U \to V$ is an injective holomorphic map between open subsets of $\mathbb{C}^n$, then $f^{-1} : f(U) \to U$ is holomorphic.
Formulations for manifolds
The inverse function theorem can be rephrased in terms of differentiable maps between differentiable manifolds. In this context the theorem states that for a differentiable map $F : M \to N$ (of class $C^1$), if the differential of $F$,
$$dF_p : T_p M \to T_{F(p)} N,$$
is a linear isomorphism at a point $p$ in $M$ then there exists an open neighborhood $U$ of $p$ such that
$$F|_U : U \to F(U)$$
is a diffeomorphism. Note that this implies that the connected components of M and N containing p and F(p) have the same dimension, as is already directly implied from the assumption that dFp is an isomorphism. If the derivative of F is an isomorphism at all points p in M then the map F is a local diffeomorphism.
Generalizations
Banach spaces
The inverse function theorem can also be generalized to differentiable maps between Banach spaces X and Y.[20] Let U be an open neighbourhood of the origin in X and $F : U \to Y$ a continuously differentiable function, and assume that the Fréchet derivative $dF_0 : X \to Y$ of F at 0 is a bounded linear isomorphism of X onto Y. Then there exists an open neighbourhood V of $F(0)$ in Y and a continuously differentiable map $G : V \to X$ such that $F(G(y)) = y$ for all y in V. Moreover, $G(y)$ is the only sufficiently small solution x of the equation $F(x) = y$.
There is also the inverse function theorem for Banach manifolds.[21]
Constant rank theorem
The inverse function theorem (and the implicit function theorem) can be seen as a special case of the constant rank theorem, which states that a smooth map with constant rank near a point can be put in a particular normal form near that point.[22] Specifically, if $F : M \to N$ has constant rank near a point $p \in M$, then there are open neighborhoods U of p and V of $F(p)$ and there are diffeomorphisms $u : T_p M \to U$ and $v : T_{F(p)} N \to V$ such that $F(U) \subseteq V$ and such that the derivative $dF_p : T_p M \to T_{F(p)} N$ is equal to $v^{-1} \circ F \circ u$. That is, F "looks like" its derivative near p. The set of points $p \in M$ such that the rank is constant in a neighborhood of $p$ is an open dense subset of M; this is a consequence of semicontinuity of the rank function. Thus the constant rank theorem applies to a generic point of the domain.
When the derivative of F is injective (resp. surjective) at a point p, it is also injective (resp. surjective) in a neighborhood of p, and hence the rank of F is constant on that neighborhood, and the constant rank theorem applies.
Polynomial functions
If it is true, the Jacobian conjecture would be a variant of the inverse function theorem for polynomials. It states that if a vector-valued polynomial function has a Jacobian determinant that is an invertible polynomial (that is a nonzero constant), then it has an inverse that is also a polynomial function. It is unknown whether this is true or false, even in the case of two variables. This is a major open problem in the theory of polynomials.
Selections
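When $f : \mathbb{R}^n \to \mathbb{R}^m$ with $m \le n$, $f$ is $k$ times continuously differentiable, and the Jacobian $A = \nabla f(\bar{x})$ at a point $\bar{x}$ is of rank $m$, the inverse of $f$ may not be unique. However, there exists a local selection function $s$ such that $f(s(y)) = y$ for all $y$ in a neighborhood of $\bar{y} = f(\bar{x})$, $s(\bar{y}) = \bar{x}$, $s$ is $k$ times continuously differentiable in this neighborhood, and $\nabla s(\bar{y}) = A^T (A A^T)^{-1}$ ($\nabla s(\bar{y})$ is the Moore–Penrose pseudoinverse of $A$).[23]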
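For a matrix $A$ of full row rank, the Moore–Penrose pseudoinverse is $A^T (A A^T)^{-1}$. The sketch below works through the simplest case, assuming an illustrative linear map $f(x_1, x_2) = x_1 + 2 x_2$ from $\mathbb{R}^2$ to $\mathbb{R}$, for which a selection can be written in closed form. The base point and the names `pseudoinverse_row` and `s` are assumptions for illustration; for a nonlinear map the selection would only exist locally.

```python
from typing import List

A = [1.0, 2.0]          # the 1 x 2 Jacobian of the assumed linear map
x_bar = [3.0, -1.0]     # an arbitrary base point


def f(x: List[float]) -> float:
    """Illustrative linear map f(x) = A x."""
    return sum(a * v for a, v in zip(A, x))


def pseudoinverse_row(a: List[float]) -> List[float]:
    """Pseudoinverse A^T (A A^T)^{-1} of a nonzero 1 x n matrix, as a column."""
    norm_sq = sum(v * v for v in a)
    return [v / norm_sq for v in a]


y_bar = f(x_bar)
A_plus = pseudoinverse_row(A)


def s(y: float) -> List[float]:
    """Selection through x_bar whose derivative is the pseudoinverse of A."""
    return [xb + p * (y - y_bar) for xb, p in zip(x_bar, A_plus)]


print(A_plus)
print(s(y_bar), f(s(2.5)))
```

Among all right inverses of $A$, the pseudoinverse is the one whose image is orthogonal to the kernel of $A$, which is what singles out this particular selection.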
Over a real closed field
The inverse function theorem also holds over a real closed field k (or an o-minimal structure).[24] Precisely, the theorem holds for a semialgebraic (or definable) map between open subsets of $k^n$ that is continuously differentiable.
The usual proof of the IFT uses Banach's fixed point theorem, which relies on Cauchy completeness. That part of the argument is replaced by the use of the extreme value theorem, which does not need completeness. Explicitly, in § A proof using the contraction mapping principle, Cauchy completeness is used only to establish the inclusion $B(0, r/2) \subset f(B(0, r))$. Here, we shall directly show $B(0, r/4) \subset f(B(0, r))$ instead (which is enough). Given a point $y$ in $B(0, r/4)$, consider the function $P(x) = |f(x) - y|^2$ defined on a neighborhood of $\overline{B}(0, r)$. If $P'(x) = 0$, then $0 = P'(x) = 2\,[f_1(x) - y_1 \ \cdots \ f_n(x) - y_n]\, f'(x)$ and so $f(x) = y$, since $f'(x)$ is invertible. Now, by the extreme value theorem, $P$ attains a minimum at some point $x_0$ on the closed ball $\overline{B}(0, r)$, which can be shown to lie in $B(0, r)$ using $2^{-1}|x| \le |f(x)|$. Since $P'(x_0) = 0$, $f(x_0) = y$, which proves the claimed inclusion.
Alternatively, one can deduce the theorem from the one over real numbers by Tarski's principle.
See also
- Nash–Moser theorem
Notes
- Theorem 1.1.7. in Hörmander, Lars (2015). The Analysis of Linear Partial Differential Operators I: Distribution Theory and Fourier Analysis. Classics in Mathematics (2nd ed.). Springer. ISBN 978-3-642-61497-2.
- McOwen, Robert C. (1996). "Calculus of Maps between Banach Spaces". Partial Differential Equations: Methods and Applications. Upper Saddle River, NJ: Prentice Hall. pp. 218–224. ISBN 0-13-121880-8.
- Tao, Terence (12 September 2011). "The inverse function theorem for everywhere differentiable maps".
- Jaffe, Ethan. "Inverse Function Theorem" (PDF).
- Spivak 1965, pages 31–35
- Hubbard, John H.; Hubbard, Barbara Burke (2001). Vector Analysis, Linear Algebra, and Differential Forms: A Unified Approach (Matrix ed.).
- Cartan, Henri (1971). Calcul Differentiel (in French). Hermann. pp. 55–61. ISBN 978-0-395-12033-0.
- Theorem 17.7.2 in Tao, Terence (2014). Analysis. II. Texts and Readings in Mathematics. Vol. 38 (Third edition of 2006 original ed.). New Delhi: Hindustan Book Agency. ISBN 978-93-80250-65-6. MR 3310023. Zbl 1300.26003.
- Spivak 1965, Theorem 2-12.
- Spivak 1965, Theorem 5-1. and Theorem 2-13.
- "Transversality" (PDF). northwestern.edu.
- One of Spivak's books (Editorial note: give the exact location).
- Hirsch 1976, Ch. 2, § 1., Exercise 7. NB: This one is for a $C^1$-immersion.
- Lemma 13.3.3. of Lectures on differential topology utoronto.ca
- Dan Ramras (https://mathoverflow.net/users/4042/dan-ramras), On a proof of the existence of tubular neighborhoods., URL (version: 2017-04-13): https://mathoverflow.net/q/58124
- Ch. I., § 3, Exercise 10. and § 8, Exercise 14. in V. Guillemin, A. Pollack. "Differential Topology". Prentice-Hall Inc., 1974. ISBN 0-13-212605-2.
- Griffiths & Harris 1978, p. 18.
- Fritzsche, K.; Grauert, H. (2002). From Holomorphic Functions to Complex Manifolds. Springer. pp. 33–36. ISBN 978-0-387-95395-3.
- Griffiths & Harris 1978, p. 19.
- Luenberger, David G. (1969). Optimization by Vector Space Methods. New York: John Wiley & Sons. pp. 240–242. ISBN 0-471-55359-X.
- Lang, Serge (1985). Differential Manifolds. New York: Springer. pp. 13–19. ISBN 0-387-96113-5.
- Boothby, William M. (1986). An Introduction to Differentiable Manifolds and Riemannian Geometry (Second ed.). Orlando: Academic Press. pp. 46–50. ISBN 0-12-116052-1.
- Dontchev, Asen L.; Rockafellar, R. Tyrrell (2014). Implicit Functions and Solution Mappings: A View from Variational Analysis (Second ed.). New York: Springer-Verlag. p. 54. ISBN 978-1-4939-1036-6.
- Chapter 7, Theorem 2.11. in Dries, L. P. D. van den (1998). Tame Topology and O-minimal Structures. London Mathematical Society lecture note series, no. 248. Cambridge, New York, and Oakleigh, Victoria: Cambridge University Press. doi:10.1017/CBO9780511525919. ISBN 9780521598385.
References
- Allendoerfer, Carl B. (1974). "Theorems about Differentiable Functions". Calculus of Several Variables and Differentiable Manifolds. New York: Macmillan. pp. 54–88. ISBN 0-02-301840-2.
- Baxandall, Peter; Liebeck, Hans (1986). "The Inverse Function Theorem". Vector Calculus. New York: Oxford University Press. pp. 214–225. ISBN 0-19-859652-9.
- Nijenhuis, Albert (1974). "Strong derivatives and inverse mappings". Amer. Math. Monthly. 81 (9): 969–980. doi:10.2307/2319298. hdl:10338.dmlcz/102482. JSTOR 2319298.
- Griffiths, Phillip; Harris, Joseph (1978), Principles of Algebraic Geometry, John Wiley & Sons, ISBN 978-0-471-05059-9.
- Hirsch, Morris W. (1976). Differential Topology. Springer-Verlag. ISBN 978-0-387-90148-0.
- Protter, Murray H.; Morrey, Charles B. Jr. (1985). "Transformations and Jacobians". Intermediate Calculus (Second ed.). New York: Springer. pp. 412–420. ISBN 0-387-96058-9.
- Renardy, Michael; Rogers, Robert C. (2004). An Introduction to Partial Differential Equations. Texts in Applied Mathematics 13 (Second ed.). New York: Springer-Verlag. pp. 337–338. ISBN 0-387-00444-0.
- Rudin, Walter (1976). Principles of mathematical analysis. International Series in Pure and Applied Mathematics (Third ed.). New York: McGraw-Hill Book. pp. 221–223. ISBN 978-0-07-085613-4.
- Spivak, Michael (1965). Calculus on Manifolds: A Modern Approach to Classical Theorems of Advanced Calculus. San Francisco: Benjamin Cummings. ISBN 0-8053-9021-9.