ラプラスの方法のソースを表示

[[数学]]において'''ラプラスの方法'''（らぷらすのほうほう、{{lang-en-short|Laplace's method}}）とは、[[ピエール＝シモン・ラプラス]]にちなんだ[[積分]]
:<math>\int_a^b e^{nf(x)} \, dx</math>
の近似に用いられる方法。ここで {{math|''f''(''x'')}} は二回[[連続微分可能]]な[[関数 (数学)|関数]]、{{mvar|n}} は大きな数で、端点 {{mvar|a}}, {{mvar|b}} は有限でなくともよい。この方法は {{harvtxt|Laplace|1774}} で初めて用いられた。

== ラプラスの方法のアイディア ==
[[File:Laplaces method.svg|right|200px|thumb|関数 {{math|''f''(''x'') {{=}} sin(''x'')/''x''}} は原点 {{math|0}} において最大値をとる。被積分関数 {{math|''e''{{sup|''nf''(''x'')}}}} を {{math|''n'' {{=}} 0.5}} のとき（上図）と {{math|''n'' {{=}} 3}} のとき（下図）に青色で示した。数 {{mvar|n}} が大きくなるにつれて、被積分関数の[[ガウス関数]]（赤色）による近似がよくなる。この観察がラプラスの方法の背後にある。]]
関数 {{math|''f''(''x'')}} が点 {{math|''x''{{sub|0}}}} においてのみ最大値をとると仮定する。数 {{mvar|n}} に対して、次の関数を考える。
:<math>\begin{align}
g(x) &= nf(x) \\
h(x) &= e^{nf(x)}
\end{align}</math>
点 {{math|''x''{{sub|0}}}} において関数 {{mvar|g}} と {{mvar|h}} も最大値をとることに注意する。また、このとき
:<math>\begin{align}
\frac{g(x_0)}{g(x)} &= \frac{nf(x_0)}{nf(x)} = \frac{f(x_0)}{f(x)} \\
\frac{h(x_0)}{h(x)} &= \frac{e^{nf(x_0)}}{e^{nf(x)}} = e^{n(f(x_0) - f(x))}
\end{align}</math>
である。

数 {{mvar|n}} が大きくなるにつれて {{mvar|h}} の比は指数的に大きくなる一方で {{mvar|g}} の比は変化しない。したがって、関数の積分における支配的な寄与は点 {{math|''x''{{sub|0}}}} の[[近傍]]における点 {{mvar|x}} のみから来るため近似ができる。

== 厳密な主張 ==
{{math|''f''(''x'')}} は区間 {{math|[''a'', ''b'']}} 上の二回[[連続微分可能]]な関数で、ある点 {{math|''x''{{sub|0}} &isin; (''a'', ''b'')}} でのみ
:<math> f(x_0) = \max_{a \le x \le b} f(x), \quad f''(x_0) < 0 </math>
を満たすと仮定する。このとき
:<math> \int_a^b e^{nf(x)} \, dx \sim e^{nf(x_0)} \sqrt{\frac{2\pi}{n|f''(x_0)|}} \qquad (n \to \infty)</math>
である<ref>
{{citation
    |last1              = Bell
    |first1             = Jordan
    |title              = Watson’s lemma and Laplace’s method
    |year               = 2014
    |url                = http://individual.utoronto.ca/jordanbell/notes/laplace.pdf
}}
</ref>。（ここで {{math|&sim;}} は両辺の比が {{math|''n'' &rarr; &infin;}} の極限で {{math|1}} に収束することを意味する。）

== 他の定式化 ==
ラプラスの方法は
:<math> \int_a^b g(x) e^{nf(x)} \, dx \sim g(x_0)e^{nf(x_0)}\sqrt{\frac{2\pi}{n|f''(x_0)|}} \qquad (n \to \infty) </math>
と書かれることもある。

== 例：スターリングの公式 ==
ラプラスの方法は[[スターリングの公式]]
:<math> n! \sim n^n e^{-n} \sqrt{2\pi n} \quad (n \to \infty) </math>
の導出に用いることができる。[[ガンマ関数]]の定義から
:<math> n! = \Gamma(n+1) = \int_0^\infty e^{-t} t^n \, dt </math>
が得られる。変数変換 {{math|''t'' {{=}} ''nx''}} を考えると {{math|''dt'' {{=}} ''ndx''}} ゆえ
:<math>\begin{align}
n! 
&= \int_0^\infty e^{-nx} (nx)^n n \, dx \\
&= n^{n+1} \int_0^\infty e^{-nx} x^n \, dx \\
&= n^{n+1} \int_0^\infty e^{-nx} e^{n\ln x} \, dx \\
&= n^{n+1} \int_0^\infty e^{n(\ln x - x)} \, dx.
\end{align}</math>
この積分はラプラスの方法が適用できる形である。いま {{math|''f''(''x'') {{=}} ln ''x'' &minus; ''x''}} とおけば、これは二階微分可能で、
:<math> f'(x) = \frac{1}{x} - 1, \quad f''(x) = -\frac{1}{x^2}. </math>
よって関数 {{math|''f''(''x'')}} は点 {{math|''x''{{sub|0}} {{=}} 1}} でのみ最大値 {{math|''f''(''x''{{sub|0}}) {{=}} &minus;1}} をとり、{{math|''f''&prime;&prime;(''x''{{sub|0}}) {{=}} &minus;1}} である。したがって
:<math> n! \sim n^{n+1} e^{-n} \sqrt{\frac{2\pi}{n}} = n^n e^{-n} \sqrt{2\pi n} \quad (n \to \infty) </math>
となる。

== 脚注 ==
{{reflist}}

== 参考文献 ==
* {{Citation |last=Laplace |first=P. S. |year=1774 |title=Mémoires de Mathématique et de Physique, Tome Sixième |journal=Statistical Science |volume=1 |issue=3 |pages=366–367 |trans-title=Memoir on the probability of causes of events |jstor=2245476 }}

== 関連項目 ==
* [[停留位相の方法]]
* [[大偏差理論]]
* [[ラプラス原理]]

{{PlanetMath attribution|id=4284|title=saddle point approximation}}

{{DEFAULTSORT:らふらすのほうほう}}
[[Category:漸近解析]]
[[Category:ピエール＝シモン・ラプラス]]
[[Category:数学に関する記事]]

<!--
{{Distinguish|Laplace smoothing}}

In [[mathematics]], '''Laplace's method''', named after [[Pierre-Simon Laplace]], is a technique used to approximate [[integral]]s of the form

:<math>\int_a^b e^{Mf(x)} \, dx,</math>

where <math>f(x)</math> is a twice-[[Derivative|differentiable]] [[function (mathematics)|function]], ''M'' is a large number, and the endpoints ''a'' and ''b'' could possibly be infinite. This technique was originally presented in {{harvtxt|Laplace|1774}}.

==The idea of Laplace's method==
[[File:Laplaces method.svg|right|150px|thumb| <math>f(x) = \tfrac{\sin(x)}{x}</math> has a global maximum at 0. <math>e^{Mf(x)}</math> is shown on top for ''M''&nbsp;=&nbsp;0.5, and at the bottom for ''M''&nbsp;=&nbsp;3 (both in blue). As ''M'' grows the approximation of this function by a [[Gaussian function]] (shown in red) improves. This observation underlies Laplace's method.]]

Suppose the function <math>f(x)</math> has a unique [[Maxima and minima|global maximum]] at ''x''<sub>0</sub>. Let ''M'' be a constant and consider the following two functions:

:<math>\begin{align}
g(x) &= Mf(x) \\
h(x) &= e^{Mf(x)}
\end{align}</math>

Note that ''x''<sub>0</sub> will be the global maximum of <math>g</math> and <math>h</math> as well. Now observe:

:<math>\begin{align}
\frac{g(x_0)}{g(x)} &= \frac{M f(x_0)}{M f(x)} = \frac{f(x_0)}{f(x)} \\[4pt]
\frac{h(x_0)}{h(x)} &= \frac{e^{M f(x_0)}}{e^{M f(x)}} = e^{M(f(x_0) - f(x))}
\end{align}</math>

As ''M'' increases, the ratio for <math>h</math> will grow exponentially, while the ratio for <math>g</math> does not change. Thus, significant contributions to the integral of this function will come only from points ''x'' in a [[neighbourhood (mathematics)|neighbourhood]] of ''x''<sub>0</sub>, which can then be estimated.

==General theory of Laplace's method==
To state and motivate the method, we need several assumptions. We will assume that ''x''<sub>0</sub> is not an endpoint of the interval of integration, that the values <math>f(x)</math> cannot be very close to <math>f(x_0)</math> unless ''x'' is close to ''x''<sub>0</sub>, and that <math>f''(x_0)<0.</math>

We can expand <math>f(x)</math> around ''x''<sub>0</sub> by [[Taylor's theorem]],

:<math>f(x) = f(x_0) + f'(x_0)(x-x_0) + \frac{1}{2} f''(x_0)(x-x_0)^2 + R</math>

where <math>R = O\left((x-x_0)^3\right)</math> (see: [[big O notation]]).

Since <math>f</math> has a global maximum at ''x''<sub>0</sub>, and since ''x''<sub>0</sub> is not an endpoint, it is a [[stationary point]], so the derivative of <math>f</math> vanishes at ''x''<sub>0</sub>. Therefore, the function <math>f(x)</math> may be approximated to quadratic order

:<math>f(x) \approx f(x_0) - \frac{1}{2} \left|f''(x_0)\right| (x-x_0)^2</math>

for ''x'' close to ''x''<sub>0</sub> (recall <math>f''(x_0)<0</math>). The assumptions ensure the accuracy of the approximation

:<math>\int_a^b e^{M f(x)}\, dx\approx e^{M f(x_0)}\int_a^b e^{-\frac{1}{2} M|f''(x_0)| (x-x_0)^2} \, dx</math>

(see the picture on the right). This latter integral is a [[Gaussian integral]] if the limits of integration go from −∞ to +∞ (which can be assumed because the exponential decays very fast away from ''x''<sub>0</sub>), and thus it can be calculated. We find

:<math>\int_a^b e^{M f(x)}\, dx\approx \sqrt{\frac{2\pi}{M|f''(x_0)|}}e^{M f(x_0)} \text { as } M\to\infty.</math>

A generalization of this method and extension to arbitrary precision is provided by {{harvtxt|Fog|2008}}.

===Formal statement and proof===

Suppose <math>f(x)</math> is a twice continuously differentiable function on <math>[a,b],</math> and there exists a unique point <math>x_0 \in (a,b)</math> such that: 

:<math>f(x_0) = \max_{x \in [a,b]} f(x) \quad \text{and} \quad f''(x_0)<0.</math> 

Then:

:<math>\lim_{n\to\infty} \frac{\int_a^b e^{nf(x)} \, dx}{e^{nf(x_0)} \sqrt{\frac{2\pi}{n\left(-f''(x_0)\right)}}}= 1. </math>

{{hidden begin|border=solid 1px #aaa|title={{center|Proof}}}}
'''Lower bound:''' Let <math>\varepsilon > 0</math>. Since <math>f''</math> is continuous there exists <math>\delta >0</math> such that if <math>|x_0-c|< \delta</math> then <math>f''(c) \ge f''(x_0) - \varepsilon.</math> By [[Taylor's Theorem]], for any <math>x \in (x_0 - \delta, x_0 + \delta),</math>

:<math>f(x) \ge f(x_0) + \frac{1}{2}(f''(x_0) - \varepsilon)(x-x_0)^2.</math>

Then we have the following lower bound:

:<math>\begin{align}
\int_a^b e^{nf(x)} \, dx &\ge \int_{x_0 - \delta}^{x_0 + \delta} e^{nf(x)} \, dx \\
                         &\ge e^{nf(x_0)} \int_{x_0 - \delta}^{x_0 + \delta} e^{\frac{n}{2}(f''(x_0) - \varepsilon)(x-x_0)^2} \, dx \\
                         &=   e^{nf(x_0)} \sqrt{\frac{1}{n(-f''(x_0) + \varepsilon)}} \int_{-\delta \sqrt{n(-f''(x_0) + \varepsilon)} }^{\delta \sqrt{n(-f''(x_0) + \varepsilon)} } e^{-\frac{1}{2}y^2} \, dy
\end{align}</math>

where the last equality was obtained by a change of variables 

:<math>y= \sqrt{n(-f''(x_0) + \varepsilon)} (x-x_0).</math>

Remember <math>f''(x_0)<0</math> so we can take the square root of its negation.

If we divide both sides of the above inequality by 

:<math>e^{nf(x_0)}\sqrt{\frac{2 \pi}{n(-f''(x_0))}}</math> 

and take the limit we get:

:<math>\lim_{n \to \infty} \frac{\int_a^b e^{nf(x)} \,dx}{e^{nf(x_0)}\sqrt{\frac{2\pi}{n(-f''(x_0))}}} \ge \lim_{n \to \infty} \frac{1}{\sqrt{2\pi}} \int_{-\delta\sqrt{n(-f''(x_0) + \varepsilon)} }^{\delta \sqrt{n(-f''(x_0) + \varepsilon)}} e^{-\frac{1}{2}y^2} \, dy \, \cdot \sqrt{\frac{-f''(x_0)}{-f''(x_0) + \varepsilon}} = \sqrt{\frac{-f''(x_0)}{-f''(x_0) + \varepsilon}}</math>

since this is true for arbitrary <math>\varepsilon</math> we get the lower bound:

:<math>\lim_{n \to \infty} \frac{\int_a^b e^{nf(x)} \, dx}{ e^{nf(x_0)}\sqrt{\frac{2\pi}{n(-f''(x_0))}}} \ge 1</math>

Note that this proof works also when <math>a = -\infty</math> or <math>b= \infty</math> (or both).

'''Upper bound:''' The proof is similar to that of the lower bound but there are a few inconveniences. Again we start by picking an <math>\varepsilon >0</math> but in order for the proof to work we need <math>\varepsilon</math> small enough so that <math>f''(x_0) + \varepsilon < 0.</math> Then, as above, by continuity of <math>f''</math> and [[Taylor's Theorem]] we can find <math>\delta>0</math> so that if <math>|x-x_0| < \delta</math>, then 

:<math>f(x) \le f(x_0) + \frac{1}{2} (f''(x_0) + \varepsilon)(x-x_0)^2.</math>

Lastly, by our assumptions (assuming <math>a,b</math> are finite) there exists an <math>\eta >0</math> such that if <math>|x-x_0|\ge \delta</math>, then <math>f(x) \le f(x_0) - \eta</math>.

Then we can calculate the following upper bound:

:<math>\begin{align}
\int_a^b e^{nf(x)} \, dx &\le \int_a^{x_0-\delta} e^{nf(x)} \, dx + \int_{x_0-\delta}^{x_0 + \delta} e^{nf(x)} \, dx + \int_{x_0 + \delta}^b e^{nf(x)} \, dx \\
&\le (b-a)e^{n(f(x_0)-\eta)} + \int_{x_0-\delta}^{x_0 + \delta} e^{n f(x) } \, dx \\
&\le (b-a)e^{n(f(x_0)-\eta)} + e^{nf(x_0)} \int_{x_0-\delta}^{x_0 + \delta} e^{\frac{n}{2}(f''(x_0)+\varepsilon)(x-x_0)^2} \, dx\\
&\le (b-a)e^{n(f(x_0)-\eta)} + e^{nf(x_0)} \int_{-\infty}^{+\infty} e^{\frac{n}{2}(f''(x_0)+\varepsilon)(x-x_0)^2} \, dx \\
&\le (b-a)e^{n(f(x_0)-\eta)} + e^{nf(x_0)} \sqrt{\frac{2 \pi}{n (-f''(x_0) - \varepsilon)}}
\end{align}</math>

If we divide both sides of the above inequality by 

:<math>e^{nf(x_0)}\sqrt{\frac{2 \pi}{n (-f''(x_0))}}</math> 

and take the limit we get:

:<math>\lim_{n \to \infty} \frac{\int_a^b e^{nf(x)} \, dx}{e^{nf(x_0)}\sqrt{\frac{2\pi}{n(-f''(x_0))}}} \le \lim_{n \to \infty} (b-a) e^{-\eta n} \sqrt{\frac{n(-f''(x_0))}{2\pi}} + \sqrt{\frac{-f''(x_0)}{-f''(x_0) - \varepsilon}} = \sqrt{\frac{-f''(x_0)}{-f''(x_0) - \varepsilon}}</math>

Since <math>\varepsilon</math> is arbitrary we get the upper bound:

:<math>\lim_{n \to \infty} \frac{\int_a^b e^{nf(x)} \, dx}{e^{nf(x_0)}\sqrt{\frac{2 \pi}{n (-f''(x_0))}}} \le 1</math>

And combining this with the lower bound gives the result.

Note that the above proof obviously fails when <math>a = -\infty</math> or <math>b = \infty</math> (or both). To deal with these cases, we need some extra assumptions. A sufficient (not necessary) assumption is that for <math>n = 1,</math> 

:<math>\int_a^b e^{nf(x)} \, dx < \infty,</math> 

and that the number <math>\eta</math> as above exists (note that this must be an assumption in the case when the interval <math>[a,b]</math> is infinite). The proof proceeds otherwise as above, but with a s lightly different approximation of integrals:

:<math>\int_a^{x_0-\delta} e^{nf(x)} \, dx + \int_{x_0 + \delta}^b e^{nf(x)} \, dx \le \int_a^b e^{f(x)}e^{(n-1)(f(x_0) - \eta)} \, dx = e^{(n-1)(f(x_0) - \eta)} \int_a^b e^{f(x)} \, dx.</math>

When we divide by 

:<math>e^{nf(x_0)}\sqrt{\frac{2\pi}{n(-f''(x_0))}},</math>

we get for this term

:<math>\frac{e^{(n-1)(f(x_0) - \eta)} \int_a^b e^{f(x)} \, dx}{e^{nf(x_0)}\sqrt{\frac{2\pi}{n(-f''(x_0))}}} = e^{-(n-1)\eta} \sqrt{n} e^{-f(x_0)} \int_a^b e^{f(x)} \, dx \sqrt{\frac{-f''(x_0)}{2\pi}}</math>

whose limit as <math>n \to \infty</math> is <math>0</math>. The rest of the proof (the analysis of the interesting term) proceeds as above.

The given condition in the infinite interval case is, as said above, sufficient but not necessary. However, the condition is fulfilled in many, if not in most, applications: the condition simply says that the integral we are studying must be well-defined (not infinite) and that the maximum of the function at <math>x_0</math> must be a "true" maximum (the number <math>\eta > 0</math> must exist). There is no need to demand that the integral is finite for <math>n=1</math> but it is enough to demand that the integral is finite for some <math>n=N.</math>
{{hidden end}}

This method relies on 4 basic concepts such as
{{hidden begin|border=solid 1px #aaa|title={{center|Concepts}}}}
:'''1. Relative error'''
The “approximation” in this method is related to the [[relative error]] and not the [[absolute error]]. Therefore, if we set

:<math>s = \sqrt{\frac{2\pi}{M\left|f''(x_0)\right|}}.</math>

the integral can be written as

:<math>\begin{align} 
\int_a^b e^{M f(x)} \, dx &= se^{Mf(x_0)} \frac{1}{s}\int_a^b e^{M(f(x)-f(x_0))}\, dx \\ 
& = se^{Mf(x_0)} \int_{\frac{a-x_0}{s}}^{\frac{b-x_0}{s}} e^{M(f(sy+x_0)-f(x_0))}\,dy 
\end{align}</math>

where <math>s</math> is a small number when <math>M</math> is a large number obviously and the relative error will be
 
:<math>\left| \int_{\frac{a-x_0}{s}}^{\frac{b-x_0}{s}} e^{M(f(sy+x_0)-f(x_0))} dy-1 \right|.</math>

Now, let us separate this integral into two parts: <math>y\in[-D_y,D_y]</math> region and the rest.

:'''2. <math>e^{M (f(sy+x_0)-f(x_0))} \to e^{-\pi y^2}</math> around the [[stationary point]] when <math>M</math> is large enough'''

Let’s look at the [[Taylor expansion]] of <math>M(f(x)-f(x_0))</math> around ''x''<sub>0</sub> and translate ''x'' to ''y'' because we do the comparison in y-space, we will get

:<math>M(f(x)-f(x_0)) = \frac{Mf''(x_0)}{2}s^2y^2 +\frac{Mf'''(x_0)}{6}s^3y^3+ \cdots = -\pi y^2 +O\left(\frac{1}{\sqrt{M}}\right).</math>

Note that <math>f'(x_0)=0</math> because <math>x_0</math> is a stationary point. From this equation you will find that the terms higher than second derivative in this Taylor expansion is suppressed as the order of <math>\tfrac{1}{\sqrt{M}}</math> so that <math>\exp(M(f(x)-f(x_0)))</math> will get closer to the [[Gaussian function]] as shown in figure. Besides,

:<math>\int_{-\infty}^{\infty}e^{-\pi y^2} dy =1.</math>

[[File:For laplace method --- with different M.png|thumb|The figure of <math>e^{M[f(sy+x_0)-f(x_0)]}</math> with <math>M</math> equals 1, 2 and 3, and the red line is the curve of function <math>e^{-\pi y^2}</math> .]]

:'''3. The larger <math>M</math> is, the smaller range of <math>x</math> is related'''

Because we do the comparison in y-space, <math>y</math> is fixed in <math>y\in[-D_y,D_y]</math> which will cause <math>x\in[-sD_y, sD_y]</math>; however, <math>s</math> is inversely proportional to <math>\sqrt{M}</math>, the chosen region of <math>x</math> will be smaller when <math>M</math> is increased.

:'''4. If the integral in Laplace’s method converges, the contribution of the region which is not around the stationary point of the integration of its relative error will tend to zero as <math>M</math> grows.'''

Relying on the 3rd concept, even if we choose a very large ''D<sub>y</sub>'', ''sD<sub>y</sub>'' will finally be a very small number when <math>M</math> is increased to a huge number. Then, how can we guarantee the integral of the rest will tend to 0 when <math>M</math> is large enough?

The basic idea is to find a function <math>m(x)</math> such that <math>m(x)\ge f(x)</math> and the integral of <math>e^{Mm(x)}</math> will tend to zero when <math>M</math> grows. Because the exponential function of <math>Mm(x)</math> will be always larger than zero as long as <math>m(x)</math> is a real number, and this exponential function is proportional to <math>m(x),</math> the integral of <math>e^{Mf(x)}</math> will tend to zero. For simplicity, choose <math>m(x)</math> as a [[tangent]] through the point <math>x=sD_y</math> as shown in the figure:

[[File:For laplace method --- upper limit function m(x).gif|thumb| <math>m(x)</math> is denoted by the two [[tangent]] lines passing through <math>x=\pm sD_y+x_0</math>. When <math>sD_y</math> gets smaller, the cover region will be larger. ]]

If the interval of the integration of this method is finite, we will find that no matter <math>f(x)</math> is continue in the rest region, it will be always smaller than <math>m(x)</math> shown above when <math>M</math> is large enough. By the way, it will be proved later that the integral of <math>e^{Mm(x)}</math> will tend to zero when <math>M</math> is large enough.

If the interval of the integration of this method is infinite, <math>m(x)</math> and <math>f(x)</math> might always cross to each other. If so, we cannot guarantee that the integral of <math>e^{Mf(x)}</math> will tend to zero finally. For example, in the case of <math>f(x)=\tfrac{\sin(x)}{x},</math> <math>\int^{\infty}_{0}e^{Mf(x)} dx</math> will always diverge. Therefore, we need to require that <math>\int^{\infty}_{d}e^{Mf(x)} dx</math> can converge for the infinite interval case. If so, this integral will tend to zero when <math>d</math> is large enough and we can choose this <math>d</math> as the cross of <math>m(x)</math> and <math>f(x).</math>

You might ask that why not choose <math>\int^{\infty}_{d}e^{f(x)} dx</math> as a convergent integral? Let me use an example to show you the reason. Suppose the rest part of <math>f(x)</math> is <math>-\ln x,</math> then <math>e^{f(x)}=\tfrac{1}{x}</math> and its integral will diverge; however, when <math>M=2,</math> the integral of <math>e^{Mf(x)}=\tfrac{1}{x^2}</math> converges. So, the integral of some functions will diverge when <math>M</math> is not a large number, but they will converge when <math>M</math> is large enough.
{{hidden end}}

Based on these four concepts, we can derive the relative error of this Laplace's method.

==Other formulations==
Laplace's approximation is sometimes written as

:<math>\int_a^b h(x) e^{M g(x)}\, dx \approx \sqrt{\frac{2\pi}{M|g''(x_0)|}} h(x_0) e^{M g(x_0)} \ \text { as } M\to\infty</math>

where <math>h</math> is positive.

Importantly, the accuracy of the approximation depends on the variable of integration, that is, on what stays in <math>g(x)</math> and what goes into <math>h(x)</math>.<ref>{{cite book |last=Butler |first=Ronald W |date=2007 |title=Saddlepoint approximations and applications |publisher=Cambridge University Press |isbn=978-0-521-87250-8}}</ref>

{{hidden begin|border=1px #aaa solid|title={{center|The derivation of its relative error}}}}
First of all, let me set the global maximum is located at <math>x_0=0</math> which can simplify the derivation and does not lost any important information; therefore, all the derivation inside this sub-section is under this assumption. Besides, what we want is the relative error <math>|R|</math> as shown below

:<math>\int_a^b h(x) e^{M g(x)}\, dx = h(0)e^{Mg(0)}s \underbrace{\int_{a/s}^{b/s}\frac{h(x)}{h(0)}e^{M\left[ g(sy)-g(0) \right]} dy}_{1+R},</math>

where 

:<math>s\equiv\sqrt{\frac{2\pi}{M\left| g'' (0) \right|}}.</math>

So, if we let 

:<math>A\equiv \frac{h(sy)}{h(0)}e^{M\left[g(sy)-g(0) \right]}</math> 

and <math>A_0\equiv e^{-\pi y^2}</math>, we can get

:<math>\left| R\right| = \left| \int_{a/s}^{b/s}A\,dy -\int_{-\infty}^{\infty}A_0\,dy \right|</math>

since <math>\int_{-\infty}^{\infty}A_0\,dy =1</math>. Now, let us find its upper bound.

Owing to <math>|A+B| \le |A|+|B|,</math> we can separate this integration into 5 parts with 3 different types (a), (b) and (c), respectively. Therefore,

:<math>|R| < \underbrace{\left| \int_{D_y}^{\infty}A_0 dy \right|}_{(a_1)} + \underbrace{\left| \int_{D_y}^{b/s}A dy \right|}_{(b_1)}+ \underbrace{\left| \int_{-D_y}^{D_y}\left(A-A_0\right) dy \right|}_{(c)} + \underbrace{\left| \int_{a/s}^{-D_y}A dy \right|}_{(b_2)} + \underbrace{\left| \int_{-\infty}^{-D_y}A_0 dy \right|}_{(a_2)}</math>

where <math>(a_1)</math> and <math>(a_2)</math> are similar, let us just calculate <math>(a_1)</math> and <math>(b_1)</math> and <math>(b_2)</math> are similar, too, I’ll just calculate <math>(b_1)</math>.

For <math>(a_1)</math>, after the translation of <math>z\equiv\pi y^2</math>, we can get

:<math>(a_1) = \left| \frac{1}{2\sqrt{\pi}}\int_{\pi D_y^2}^{\infty} e^{-z}z^{-1/2} dz\right| <\frac{e^{-\pi D_y^2}}{2\pi D_y}.</math>

This means that as long as <math>D_y</math> is large enough, it will tend to zero.

For <math>(b_1)</math>, we can get

:<math>(b_1)\le\left| \int_{D_y}^{b/s}\left[\frac{h(sy)}{h(0)}\right]_{\text{max}} e^{Mm(sy)}dy \right|</math>

where

:<math>m(x) \ge g(x)-g(0) \text{as} x\in [sD_y,b]</math>

and <math>h(x)</math> should have the same sign of <math>h(0)</math> during this region. Let us choose <math>m(x)</math> as the tangent across the point at <math>x=sD_y</math> , i.e. <math>m(sy)= g(sD_y)-g(0) +g'(sD_y)\left( sy-sD_y \right)</math> which is shown in the figure

[[File:For laplace method --- upper limit function m(x).gif|thumb|<math>m(x)</math> is the tangent lines across the point at <math>x=sD_y</math> .]]

From this figure you can find that when <math>s</math> or <math>D_y</math> gets smaller, the region satisfies the above inequality will get larger. Therefore, if we want to find a suitable <math>m(x)</math> to cover the whole <math>f(x)</math> during the interval of <math>(b_1)</math>, <math>D_y</math> will have an upper limit. Besides, because the integration of <math>e^{-\alpha x}</math> is simple, let me use it to estimate the relative error contributed by this <math>(b_1)</math>.

Based on Taylor expansion, we can get

:<math>\begin{align}
M\left[g(sD_y)-g(0)\right] &= M\left[ \frac{g''(0)}{2}s^2D_y^2 +\frac{g'''(\xi)}{6}s^3D_y^3 \right] && \text{as } \xi\in[0,sD_y] \\
& = -\pi D_y^2 +\frac{(2\pi)^{3/2}g'''(\xi)D_y^3}{6\sqrt{M}|g''(0)|^{\frac{3}{2}}},
\end{align}</math>

and

:<math>\begin{align}
Msg'(sD_y) &= Ms\left(g''(0)sD_y +\frac{g'''(\zeta)}{2}s^2D_y^2\right) && \text{as } \zeta\in[0,sD_y] \\
 &= -2\pi D_y +\sqrt{\frac{2}{M}}\left( \frac{\pi}{|g''(0)|} \right)^{\frac{3}{2}}g'''(\zeta)D_y^2,
\end{align}</math>

and then substitute them back into the calculation of <math>(b_1)</math>; however, you can find that the remainders of these two expansions are both inversely proportional to the square root of <math>M</math>, let me drop them out to beautify the calculation. Keeping them is better, but it will make the formula uglier.

:<math>\begin{align}
(b_1) &\le \left|\left[ \frac{h(sy)}{h(0)} \right]_{\max} e^{-\pi D_y^2}\int_0^{b/s-D_y}e^{-2\pi D_y y} dy \right| \\
 &\le \left|\left[ \frac{h(sy)}{h(0)} \right]_{\max} e^{-\pi D_y^2}\frac{1}{2\pi D_y} \right|.
\end{align}</math>

Therefore, it will tend to zero when <math>D_y</math> gets larger, but don't forget that the upper bound of <math>D_y</math> should be considered during this calculation.

About the integration near <math>x=0</math>, we can also use [[Taylor's Theorem]] to calculate it. When <math>h'(0) \ne 0</math>

:<math>\begin{align}
(c) &\le \int_{-D_y}^{D_y} e^{-\pi y^2} \left| \frac{sh'(\xi)}{h(0)}y \right|\, dy \\
 &< \sqrt{\frac{2}{\pi M |g''(0)|}} \left| \frac{h'(\xi)}{h(0)} \right|_\max \left( 1-e^{-\pi D_y^2} \right)
\end{align}</math>

and you can find that it is inversely proportional to the square root of <math>M</math>. In fact, <math>(c)</math> will have the same behave when <math>h(x)</math> is a constant.

Conclusively, the integral near the stationary point will get smaller as <math>\sqrt{M}</math> gets larger, and the rest parts will tend to zero as long as <math>D_y</math> is large enough; however, we need to remember that <math>D_y</math> has an upper limit which is decided by whether the function <math>m(x)</math> is always larger than <math>g(x)-g(0)</math> in the rest region. However, as long as we can find one <math>m(x)</math> satisfying this condition, the upper bound of <math>D_y</math> can be chosen as directly proportional to <math>\sqrt{M}</math> since <math>m(x)</math> is a tangent across the point of <math>g(x)-g(0)</math> at <math>x=sD_y</math>. So, the bigger <math>M</math> is, the bigger <math>D_y</math> can be.
{{hidden end}}

In the multivariate case where <math>\mathbf{x}</math> is a <math>d</math>-dimensional vector and <math>f(\mathbf{x})</math> is a scalar function of <math>\mathbf{x}</math>, Laplace's approximation is usually written as:

:<math>\int e^{M f(\mathbf{x})}\, d\mathbf{x} \approx \left(\frac{2\pi}{M}\right)^{d/2} \frac{e^{M f(\mathbf{x}_0)}}{\left|-H(f)(\mathbf{x}_0)\right|^{1/2}} \text { as } M\to\infty</math>

where <math>H(f)(\mathbf{x}_0)</math> is the [[Hessian matrix]] of <math>f</math> evaluated at <math>\mathbf{x}_0</math> and where <math>|\cdot|</math> denotes [[matrix determinant]]. Analogously to the univariate case, the Hessian is required to be [[Positive-definite matrix|negative definite]].<ref>{{cite book | last =MacKay | first =David J. C. | title =Information Theory, Inference and Learning Algorithms | publisher =Cambridge University Press |date=September 2003 | location =Cambridge | url =http://www.inference.phy.cam.ac.uk/mackay/itila/book.html | isbn = 9780521642989}}</ref>

By the way, although <math>\mathbf{x}</math> denotes a <math>d</math>-dimensional vector, the term <math>d\mathbf{x}</math> denotes an [[infinitesimal]] [[Volume element|volume]] here, i.e. <math>d\mathbf{x} := dx_1dx_2\cdots dx_d</math>.

==Laplace's method extension: Steepest descent==
{{main|Method of steepest descent}}

In extensions of Laplace's method, [[complex analysis]], and in particular [[Cauchy's integral formula]], is used to find a contour ''of steepest descent'' for an (asymptotically with large ''M'') equivalent integral, expressed as a [[line integral]]. In particular, if no point ''x''<sub>0</sub> where the derivative of <math>f</math> vanishes exists on the real line, it may be necessary to deform the integration contour to an optimal one, where the above analysis will be possible. Again the main idea is to reduce, at least asymptotically, the calculation of the given integral to that of a simpler integral that can be explicitly evaluated. See the book of Erdelyi (1956) for a simple discussion (where the method is termed ''steepest descents'').

The appropriate formulation for the complex ''z''-plane is

:<math>\int_a^b e^{M f(z)}\, dz \approx \sqrt{\frac{2\pi}{-Mf''(z_0)}}e^{M f(z_0)} \text{ as } M\to\infty.</math>

for a path passing through the saddle point at ''z''<sub>0</sub>. Note the explicit appearance of a minus sign to indicate the direction of the second derivative: one must ''not'' take the modulus. Also note that if the integrand is [[meromorphic]], one may have to add residues corresponding to poles traversed while deforming the contour (see for example section 3 of Okounkov's paper ''Symmetric functions and random partitions'').

==Further generalizations==
An extension of the steepest descent method is the so-called ''nonlinear stationary phase/steepest descent method''. Here, instead of integrals, one needs to evaluate asymptotically solutions of [[Riemann–Hilbert problem|Riemann–Hilbert factorization problem]]s.

Given a contour ''C'' in the [[complex sphere]], a function <math>f</math> defined on that contour and a special point, say infinity, one seeks a function ''M'' holomorphic away from the contour ''C'', with prescribed jump across ''C'', and with a given normalization at infinity. If <math>f</math> and hence ''M'' are matrices rather than scalars this is a problem that in general does not admit an explicit solution.

An asymptotic evaluation is then possible along the lines of the linear stationary phase/steepest descent method. The idea is to reduce asymptotically the solution of the given Riemann–Hilbert problem to that of a simpler, explicitly solvable, Riemann–Hilbert problem. Cauchy's theorem is used to justify deformations of the jump contour.

The nonlinear stationary phase was introduced by Deift and Zhou in 1993, based on earlier work of Its. A (properly speaking) nonlinear steepest descent method was introduced by Kamvissis, K. McLaughlin and P. Miller in 2003, based on previous work of Lax, Levermore, Deift, Venakides and Zhou. As in the linear case, "steepest descent contours" solve a min-max problem. In the nonlinear case they turn out to be "S-curves" (defined in a different context back in the 80s by Stahl, Gonchar and Rakhmanov).

The nonlinear stationary phase/steepest descent method has applications to the theory of soliton equations and [[integrable model]]s, [[random matrices]] and [[combinatorics]].

==Complex integrals==
For complex integrals in the form:

:<math>\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} g(s)e^{st} \,ds</math>

with <math>t \gg 1,</math> we make the substitution ''t'' = ''iu'' and the change of variable <math>s=c+ix</math> to get the bilateral Laplace transform:

:<math>\frac{1}{2 \pi}\int_{-\infty}^\infty g(c+ix)e^{-ux}e^{icu} \, dx.</math>

We then split ''g''(''c'' + ''ix'') in its real and complex part, after which we recover ''u'' = ''t''/''i''. This is useful for [[inverse Laplace transform]]s, the [[Perron formula]] and complex integration.

==Example 1: Stirling's approximation==
Laplace's method can be used to derive [[Stirling's approximation]]

:<math>N!\approx \sqrt{2\pi N} N^N e^{-N}\,</math>

for a large [[integer]]&nbsp;''N''.

From the definition of the [[Gamma function]], we have

:<math>N! = \Gamma(N+1)=\int_0^\infty e^{-x} x^N \, dx.</math>

Now we change variables, letting <math>x=Nz</math> so that <math>dx = Ndz.</math> Plug these values back in to obtain

:<math>\begin{align}
N! &= \int_0^\infty e^{-Nz} (Nz)^N N \, dz \\
   &= N^{N+1} \int_0^\infty e^{-Nz} z^N \, dz \\
   &= N^{N+1} \int_0^\infty e^{-Nz} e^{N\ln z} \, dz \\
   &= N^{N+1} \int_0^\infty e^{N(\ln z-z)} \, dz.
\end{align}</math>

This integral has the form necessary for Laplace's method with

:<math>f(z) = \ln{z}-z</math>

which is twice-differentiable:

:<math>f'(z) = \frac{1}{z}-1,</math>
:<math>f''(z) = -\frac{1}{z^2}.</math>

The maximum of <math>f(z)</math> lies at ''z''<sub>0</sub>&nbsp;=&nbsp;1, and the second derivative of <math>f(z)</math> has the value&nbsp;−1 at this point. Therefore, we obtain

:<math>N! \approx N^{N+1}\sqrt{\frac{2\pi}{N}} e^{-N}=\sqrt{2\pi N} N^N e^{-N}.</math>

==See also==
* [[Stationary phase approximation|Method of stationary phase]]
* [[Large deviations theory]]
* [[Laplace principle (large deviations theory)]]

==Notes==
{{reflist}}

==References==
{{refbegin}}
*{{Citation| last=Azevedo-Filho | first=A. | last2=Shachter | first2=R. | year=1994 | chapter=Laplace's Method Approximations for Probabilistic Inference in Belief Networks with Continuous Variables | editor-first=R. | editor-last=Mantaras | editor2-first=D. | editor2-last=Poole | title= Uncertainty in Artificial Intelligence | publisher=[[Morgan Kaufmann]] | place=San Francisco, CA | citeseerx = 10.1.1.91.2064 }}.
*{{Citation| last=Deift | first=P. | last2=Zhou | first2=X. | year=1993 | title=A steepest descent method for oscillatory Riemann–Hilbert problems. Asymptotics for the MKdV equation | periodical=Ann. of Math. | volume=137 | issue=2 | pages=295–368 | doi= 10.2307/2946540 | arxiv=math/9201261 | jstor=2946540 }}.
*{{Citation| last=Erdelyi | first=A. | year=1956 | title=Asymptotic Expansions | publisher=Dover}}.
*{{Citation| last=Fog | first=A. | year=2008 | title=Calculation Methods for Wallenius' Noncentral Hypergeometric Distribution | periodical=Communications in Statistics, Simulation and Computation | volume=37 | issue=2 | pages=258–273 | doi= 10.1080/03610910701790269 }}.
*{{Citation |last=Laplace |first=P S |year=1774  | title=Mémoires de Mathématique et de Physique, Tome Sixième |journal=Statistical Science |volume=1 |issue=3 |pages=366–367 |trans-title=Memoir on the probability of causes of events.|jstor=2245476 }}
*{{cite journal| first1= Xiang-Sheng | last1=Wang | first2=Roderick | last2=Wong |title= Discrete analogues of Laplace's approximation |journal = Asymptot. Anal. | year=2007 | volume=54 | number=3–4 |pages=165–180 }}

{{refend}}
{{PlanetMath attribution|id=4284|title=saddle point approximation}}

[[Category:Asymptotic analysis]]
[[Category:Perturbation theory]]
-->