COMMENT PAGE FOR:
Gaussian integration is cool
actinium226 wrote 6 hours 17 min ago:
While I love seeing mathy articles on HN, the lead image in this
article is terrible.
At first glance I see we're talking about numerical integration, so I
assume the red part is the method being discussed and that it's much
better than the other two. Then I look at the axis, which the caption
notes is log scale, and see that it goes from 1.995 to 2. Uh-oh, is
someone trying to inflate performance by cutting off the origin? Big
no-no. But wait a minute, this is ground truth compared to two
approximations, so actually the one on the right is better. But the
middle one is still accurate to within 0.25%. And why is it log scale?
Anyway, the point is, there's lots of room for improvement!
In particular, it's probably not worth putting ground truth as a
separate bar. Just plot the error of the two methods; then you don't
need to cut off the origin. And ditch the log scale.
drmikeando wrote 6 hours 58 min ago:
My mind was blown by this somewhat similar technique [1]. It uses a
similar transformation of the domain, but uses some properties of
optimal quadrature for infinite sequences in the complex plane to
produce quickly convergent integrals for many simple and pathological
cases.
[1]: https://en.wikipedia.org/wiki/Tanh-sinh_quadrature
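For a feel for how simple the core of it is, here is a minimal sketch
of tanh-sinh quadrature in Python (my own illustration, not from the
linked page): the substitution x = tanh((pi/2) sinh(t)) maps the real
line onto (-1, 1), and the trapezoid rule in t then converges very
quickly.

    import numpy as np

    def tanh_sinh(f, n=40, h=0.1):
        # x = tanh((pi/2) sinh(t)) maps (-inf, inf) onto (-1, 1); the
        # Jacobian decays doubly exponentially, taming the endpoints.
        t = h * np.arange(-n, n + 1)
        u = 0.5 * np.pi * np.sinh(t)
        x = np.tanh(u)
        w = h * 0.5 * np.pi * np.cosh(t) / np.cosh(u) ** 2
        return np.sum(w * f(x))

    # sqrt(1 - x^2) has unbounded derivatives at the endpoints, yet the
    # rule recovers the exact value pi/2 essentially to rounding error.
    print(tanh_sinh(lambda x: np.sqrt(1 - x * x)) - np.pi / 2)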
tim-kt wrote 17 hours 7 min ago:
No, the integral cannot be "expressed" as a sum over weights and
function evaluations (with an "="); it can only be approximated with
this idea. If you fix any n+1 nodes, interpolate your function, and
integrate the interpolating polynomial, you get such a sum, where the
weights are integrals over the (Lagrange) basis polynomials. By
construction, you can compute the integral of polynomials up to degree
n exactly. Now, if you choose the nodes in a particular way (namely, as
the zeros of certain orthogonal polynomials), you can increase this to
degree 2n+1. What I'm getting at is that Gaussian integration is not
estimating the integrals of polynomials of degree up to 2n+1; it is
evaluating them exactly.
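That exactness is easy to check numerically; a quick sketch with
numpy's Gauss-Legendre routine (note that leggauss(m) returns an
m-point rule, exact through degree 2m-1):

    import numpy as np

    n = 3
    x, w = np.polynomial.legendre.leggauss(n + 1)  # n+1 nodes: exact to degree 2n+1
    p = np.polynomial.Polynomial(np.arange(1.0, 2 * n + 3))  # a degree-(2n+1) polynomial
    quad = np.sum(w * p(x))
    exact = p.integ()(1.0) - p.integ()(-1.0)       # true integral over [-1, 1]
    print(quad - exact)                            # ~1e-15: exact up to rounding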
Onavo wrote 15 hours 28 min ago:
Convolution?
tim-kt wrote 13 hours 7 min ago:
What about it?
extrabajs wrote 19 hours 22 min ago:
What is Fig. 1 showing? Is it the value of the integral compared with
two approximations? Would it not be more interesting to show the error
of the approximations instead? Asking for a friend who isn’t
computing a lot of integrals.
mturmon wrote 15 hours 57 min ago:
Fig 1 could use a rethink. It uses log scale, but the dynamic range
of the y-axis is tiny, so the log transform isn't doing anything.
It would be better shown as a table with 3 numbers. Or, maybe two
columns, one for integral value and one for error, as you suggest.
rd11235 wrote 8 hours 53 min ago:
Yeah - my guess is this was just a very roundabout way of setting
axis limits.
(For some reason plt.bar was used instead of plt.plot, so the y-axis
would start at 0 by default, making all the results look the same.
But when the log scale is applied, the lower y-limit becomes the
data's minimum. And because the dynamic range is so small, the end
result is visually identical to just setting y-limits on the
original linear scale.)
Anyhow, for anyone interested, the values for those 3 bars are
2.0000 (exact), 1.9671 (trapezoid), and 1.9998 (Gaussian). The
relative errors are 1.6% vs. 0.01%.
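For anyone who wants to reproduce the effect, here is my guess at
roughly what the article's plotting code did (a sketch, not the actual
source):

    import matplotlib.pyplot as plt

    vals = [2.0000, 1.9671, 1.9998]  # exact, trapezoid, Gaussian
    labels = ["exact", "trapezoid", "gaussian"]
    fig, (ax1, ax2) = plt.subplots(1, 2)
    ax1.bar(labels, vals)            # linear: bars start at 0, look identical
    ax2.bar(labels, vals)
    ax2.set_yscale("log")            # log: lower limit snaps up near the data min
    plt.show()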
seanhunter wrote 22 hours 3 min ago:
I thought when I first saw the title that it was going to be about the
Gaussian integral [1], which has to be one of the coolest results in
all of maths.
That is, the integral from -infinity to +infinity of e^(-x^2) dx
equals sqrt(pi).
I remember being given this as an exercise and being totally shocked
by how beautiful a result it is (when I eventually managed to work out
how to evaluate it).
[1]: https://mathworld.wolfram.com/GaussianIntegral.html
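For anyone who hasn't seen it, the standard trick is to square the
integral and switch to polar coordinates. In LaTeX:

    I^2 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}
              e^{-(x^2+y^2)}\,dx\,dy
        = \int_0^{2\pi}\!\int_0^{\infty} e^{-r^2}\, r\,dr\,d\theta
        = 2\pi \cdot \tfrac{1}{2} = \pi,
    \qquad\text{so } I = \sqrt{\pi}.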
constantcrying wrote 19 hours 35 min ago:
There is a relationship here: in the case of Gauß-Hermite
integration, where the weight function is exactly e^(-x^2), the
weights have to add up to sqrt(pi), because the rule is exact for
the constant polynomial 1.
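A one-line check with numpy (my sketch):

    import numpy as np
    x, w = np.polynomial.hermite.hermgauss(10)
    print(w.sum() - np.sqrt(np.pi))  # ~1e-16: the weights sum to sqrt(pi)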
niklasbuschmann wrote 20 hours 9 min ago:
Gaussian integrals are also pretty much the basis of quantum field
theory in the path integral formalism, where Isserlis's theorem is
the analog to Wick's theorem in the operator formalism.
srean wrote 19 hours 58 min ago:
Indeed.
It's the gateway drug to Laplace's method (Laplace approximation),
mean field theory, perturbation theory, ... QFT.
[1]: https://en.m.wikipedia.org/wiki/Laplace%27s_method
chermi wrote 16 hours 14 min ago:
Maybe I should just read the wiki you linked, but I'm confused
about how this is different from steepest descent. I'm a physicist
by training, so maybe we just call it something different?
srean wrote 15 hours 4 min ago:
It is Laplace's method of steepest descent. When the same method
is used to approximate a probability density function, for example
a posterior density, it's called the Laplace approximation of the
density.
The wikipedia link would have made things quite clear :)
dawnofdusk wrote 15 hours 19 min ago:
They are the same for physicists who analytically continue
everything. Steepest descent is technically for integrals in
the complex plane, and Laplace's method is for integrating over
the real line.
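For reference, the standard statement of Laplace's method: if g has a
unique interior maximum at x_0 with g''(x_0) < 0, then as M grows,

    \int_a^b e^{M g(x)}\,dx \;\sim\;
        e^{M g(x_0)} \sqrt{\frac{2\pi}{M\,|g''(x_0)|}} .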
constantcrying wrote 23 hours 23 min ago:
A good introduction to the basics.
What is also worth pointing out, and was somewhat glossed over, is the
close connection between the weight function and the polynomials.
Different weight functions give different classes of orthogonal
polynomials, where orthogonality has to be understood with respect to
the scalar product given by integrating against the weight function.
Interestingly, Gauss-Hermite integrates over the entire real line,
from -infinity to infinity, so the choice of weight function also
influences the choice of integration domain.
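A small sketch of what that looks like with numpy's Gauss-Hermite rule
(the weight e^(-x^2) is baked into the nodes and weights, so you only
evaluate f itself):

    import numpy as np

    # 5-point Gauss-Hermite rule: exact for e^{-x^2} * (poly of degree <= 9)
    x, w = np.polynomial.hermite.hermgauss(5)
    approx = np.sum(w * x**2)      # integral of e^{-x^2} x^2 over the real line
    exact = np.sqrt(np.pi) / 2
    print(approx - exact)          # ~1e-16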
creata wrote 20 hours 56 min ago:
Sorry if this is a stupid question, but is there a direct link
between the choice of weight function and the applications of the
polynomial?
Like, is it possible to infer that Chebyshev polynomials would be
useful in approximation theory using only the fact that they're
orthogonal wrt the Wigner semicircle (U_n) or arcsine (T_n)
distribution?
sfpotter wrote 19 hours 11 min ago:
Chebyshev polynomials are useful in approximation theory because
they're the minimax polynomials. The remainder of polynomial
interpolation can be given in terms of the nodal polynomial, which
is the polynomial with the interpolation nodes as zeros. Minimizing
the maximum error then leads to the Chebyshev polynomials. This is
a basic fact in numerical analysis that has tons of derivations
online and in books.
The weight function shows the Chebyshev polynomials' relation to
the Fourier series. But they are not what you would usually think
of as a good candidate for L2 approximation on the interval:
normally you'd use Legendre polynomials, since they have w = 1, but
they are a much less convenient basis than Chebyshev for numerics.
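Spelled out in LaTeX, the interpolation remainder involves the nodal
polynomial, and the monic polynomial of degree n+1 with smallest max
norm on [-1, 1] is 2^{-n} T_{n+1}, so the optimal nodes are the
Chebyshev roots:

    f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}
                    \prod_{i=0}^{n} (x - x_i),
    \qquad
    \min_{x_0,\dots,x_n} \max_{x \in [-1,1]}
        \Bigl|\prod_{i=0}^{n}(x - x_i)\Bigr| = 2^{-n},

    \text{attained when the } x_i \text{ are the roots of } T_{n+1}.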
creata wrote 17 hours 51 min ago:
True, and there are plenty of other reasons Chebyshev polynomials
are convenient, too.
But I guess what I was asking was: is there some kind of abstract
argument why the semicircle distribution would be appropriate in
this context?
For example, you have abstract arguments like the central limit
theorem that explain (in some loose sense) why the normal
distribution is everywhere.
I guess the semicircle might more-or-less be the only way to get
something where interpolation uses the DFT (by projecting points
evenly spaced on the complex unit circle onto [-1, 1]), but I
dunno, that motivation feels too many steps removed.
sfpotter wrote 17 hours 35 min ago:
If there is, I'm not aware of it. Orthogonal polynomials come
up in random matrix theory. Maybe there's something there?
But your last paragraph is exactly it... it is a "basic" fact
but the consequences are profound.
efavdb wrote 15 hours 56 min ago:
Could you guys recommend an easy book on this topic?
sfpotter wrote 15 hours 48 min ago:
Which topic?
efavdb wrote 15 hours 15 min ago:
Some sort of numerical analysis book that covers these
topics - minimax approximation, quadrature, etc. I've read
about these separately but am curious what other sorts of
things would be covered in courses that include them.
sfpotter wrote 14 hours 49 min ago:
I would check out "An Introduction to Numerical
Analysis" by Suli and Mayers or "Approximation Theory
and Approximation Practice" by Trefethen. The former
covers all the major intro numerical analysis topics in
a format that is suitable for someone with ~undergrad
math or engineering backgrounds. The latter goes deep
into Chebyshev approximation (and some related topics).
It is also very accessible but is much more
specialized.
4ce0b361 wrote 14 hours 50 min ago:
I'd suggest: Trefethen, Lloyd N., Approximation theory
and approximation practice (Extended edition), SIAM,
Philadelphia, PA (2020), ISBN 978-1-611975-93-2.
constantcrying wrote 19 hours 29 min ago:
Yes. It is precisely their orthogonality that makes them suitable.
If you are familiar with the Fourier series, the same principle can
be applied to approximating with polynomials.
In both cases the crucial point is that you can form an orthogonal
subspace onto which you project the function to be approximated.
For polynomials it is this:
[1]: https://en.m.wikipedia.org/wiki/Polynomial_chaos
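In symbols, with the weighted inner product <f, g> = \int f g w dx and
orthogonal polynomials \phi_k, the degree-N projection is:

    f \;\approx\; \sum_{k=0}^{N}
        \frac{\langle f, \phi_k \rangle}{\langle \phi_k, \phi_k \rangle}
        \,\phi_k .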
sfpotter wrote 18 hours 17 min ago:
What you're saying isn't wrong, per se, but...
There are polynomials that aren't orthogonal that are suitable
for numerics: both the Bernstein basis and the monomial basis are
used very often and neither are orthogonal. (Well, you could pick
a weight function that makes them orthogonal, but...!)
The fact of their orthogonality is crucial, but when you work
with Chebyshev polynomials, it is very unlikely you are doing an
orthogonal (L2) projection! Instead, you would normally use
Chebyshev interpolation: 1) interpolate at either the Type-I or
Type-II Chebyshev nodes, 2) use the DCT to compute the Chebyshev
series coefficients. The fact that you can do this is related to
the weight function, but it isn't an L2 procedure. Like I
mentioned in my other post, the Chebyshev weight function is
maybe more of an artifact of the Chebyshev polynomials' intimate
relation to the Fourier series.
I am also not totally sure what polynomial chaos has to do with
any of this. PC is a term of art in uncertainty quantification,
and this is all just basic numerical analysis. If you have a
series in orthogonal polynomials and want to call it something
fancy, you might call it a Fourier series, but usually there is
no fancy term...
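A minimal sketch of that interpolation procedure (my own, using
scipy's DCT; the normalization below follows the standard Type-I-node
construction):

    import numpy as np
    from scipy.fft import dct

    def cheb_coeffs(f, N):
        # Sample f at the N Type-I Chebyshev nodes (roots of T_N)...
        theta = np.pi * (2 * np.arange(N) + 1) / (2 * N)
        x = np.cos(theta)
        # ...then a DCT-II recovers the Chebyshev coefficients of the
        # interpolant: a_k = (2/N) sum_j f(x_j) cos(k theta_j),
        # with the k = 0 term halved by convention.
        a = dct(f(x), type=2) / N
        a[0] /= 2
        return a

    f = lambda x: np.exp(x) * np.sin(5 * x)
    a = cheb_coeffs(f, 30)
    xs = np.linspace(-1, 1, 7)
    print(np.max(np.abs(np.polynomial.chebyshev.chebval(xs, a) - f(xs))))  # tiny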
constantcrying wrote 16 hours 22 min ago:
I think you are very confused. My post is about approximation.
Obviously other areas use other polynomials or the same
polynomials for other reasons.
In this case it is about the principle of approximation by
orthogonal projection, which is quite common in different
fields of mathematics. Here you create an approximation of a
target by projecting it onto an orthogonal subspace. This is
what the Fourier series is about, an orthogonal projection.
Choosing e.g. the Chebychev Polynomials instead of the complex
exponential gives you an Approximation onto the orthogonal
space of e.g. Chebychev polynomials.
The same principle applies e.g. when you are computing an SVD
for a low rank approximation. That is another case of
orthogonal projection.
>Instead, you would normally use Chebyshev interpolation
What you do not understand is that this is the same thing. The
distinction you describe does not exist, these are the same
things, just different perspectives. That they are the same
easily follows from the uniqueness of polynomials, which are
fully determined by their interpolation points. These aren't
distinct ideas, there is a greater principle behind them and
that you are using some other algorithm to compute the
Approximation does not matter at all.
>I am also not totally sure what polynomial chaos has to do
with any of this.
It is the exact same thing. Projection onto an orthogonal
subspace of polynomials. Just that you choose the polynomials
with regard to a random variable. So you get an approximation
with good statistical properties.
sfpotter wrote 15 hours 55 min ago:
No, I am not confused. :-) I am just trying to help you
understand some basics of numerical analysis.
> What you do not understand is that this is the same thing.
It is not the same thing.
You can express an analytic function f(x) as a Chebyshev series
convergent on [-1, 1]: f(x) = \sum_{n=0}^\infty a_n T_n(x). You
can then truncate it, keeping N+1 terms, giving a degree-N
polynomial. Call it f_N.
Alternatively, you can interpolate f at N+1 Chebyshev nodes and
use a DCT to compute the corresponding Chebyshev series
coefficients. Call the resulting polynomial p_N.
In general, f_N and p_N are not the same polynomial.
Furthermore, computing the coefficients of f_N is much more
expensive than computing the coefficients of p_N. For f_N, you
need to evaluate N+1 integrals, which may be quite expensive
indeed if you want to get digits. For p_N, you simply evaluate
f at N+1 nodes and compute a DCT in O(N log N) time; the result
is the coefficients of p_N up to rounding error.
In practice, people do not compute the coefficients of f_N,
they compute the coefficients of p_N. Nevertheless, f_N and
p_N are essentially as good as each other when it comes to
approximation.
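A quick numerical illustration (my sketch; it uses a very-high-degree
interpolant as a stand-in for the true series coefficients, which is
harmless here since those coefficients converge):

    import numpy as np
    from numpy.polynomial import chebyshev as C

    f = lambda x: 1 / (1 + 25 * x**2)        # Runge's function
    N = 20
    p_N = C.chebinterpolate(f, N)            # interpolant at Type-I Chebyshev nodes
    f_N = C.chebinterpolate(f, 2000)[:N + 1] # near-exact series coeffs, truncated
    xs = np.linspace(-1, 1, 1001)
    print(np.max(np.abs(p_N - f_N)))         # nonzero: different polynomials...
    print(np.max(np.abs(C.chebval(xs, p_N) - f(xs))),
          np.max(np.abs(C.chebval(xs, f_N) - f(xs))))  # ...comparable errors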
constantcrying wrote 14 hours 55 min ago:
At this point I really hate to ask. Do you know what
"orthogonal subspace" means and what a projection is?
sfpotter wrote 14 hours 37 min ago:
Ah, shucks. I can see you're getting upset and defensive.
Sorry... Yes, it should be clear from everything I've
written that I'm quite clear on the definition of these.
If you would like to read what I'm saying but from a more
authoritative reference that you feel you can trust, you
can just take a look at Trefethen's "Approximation Theory
and Approximation Practice". I'm just quoting contents of
Chapter 4 at you.
Again, like I said in my first response to you, what
you're saying isn't wrong, it just misses the mark a bit.
If you want to compute the L2 projection of a function
onto the orthogonal subspace of degree N Chebyshev
polynomials, you would need to evaluate a rather
expensive integral to compute the coefficients. It's
expensive because it requires the use of adaptive
integration... many function evaluations per coefficient!
Bad!
On the other hand, you could just do polynomial
interpolation using either set of degree-N Chebyshev
nodes (Type-I or Type-II). This requires only N+1
function evaluations. Only one function evaluation per
coefficient. Good!
And, again, since the polynomial so constructed is not the
same polynomial as the one obtained via the L2 projection
mentioned in paragraph 3 above, this interpolation
procedure cannot be regarded as a projection! I guess you
could call it an "approximate projection". It agrees quite
closely with the L2 projection and has essentially the
same approximation power. This is why Chebyshev
polynomials are so useful in practice for approximation,
and why e.g. Legendre polynomials are much less useful
(they do not have a convenient fast transform).
Anyway, I hope this helps! It's a beautiful subject and a
lot of fun to work on.