COMMENT PAGE FOR:
Gaussian integration is cool
actinium226 wrote 6 hours 17 min ago:
While I love seeing mathy articles on HN, the lead image in this
article is terrible.
At first glance I see we're talking about numerical integration, so I
assume the red part is the method being discussed and that it's much
better than the other two. Then I look at the axis, which the caption
notes is log scale, and see that it goes from 1.995 to 2. Uh-oh, is
someone trying to inflate performance by cutting off the origin? Big
no-no, but then wait a minute, this is ground truth compared to two
approximations. So actually the one on the right is better. But the
middle one is still accurate to within 0.25%. And why is it log scale?
Anyway, point is, there's lots of room for improvement!
In particular, it's probably not worth putting ground truth as a
separate bar; just plot the error of the two methods, then you don't
need to cut off the origin. And ditch the log scale.
drmikeando wrote 6 hours 58 min ago:
My mind was exploded by this somewhat similar technique [1] - it uses
a similar transformation of the domain, but exploits some properties
of optimal quadrature for infinite sequences in the complex plane to
produce quickly convergent integrals for many simple and pathological
cases.
[1]: https://en.wikipedia.org/wiki/Tanh-sinh_quadrature
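For the curious, here is a minimal sketch of the tanh-sinh rule (my own
illustration, not from the article): the substitution
x = tanh((pi/2) sinh t) turns the plain trapezoid rule on a uniform
t-grid into a rule whose nodes crowd doubly exponentially toward the
endpoints, which is what tames endpoint singularities.

    import numpy as np

    def tanh_sinh(f, h=0.05, K=80):
        """Integrate f over (-1, 1) with a truncated tanh-sinh rule."""
        t = np.arange(-K, K + 1) * h
        g = 0.5 * np.pi * np.sinh(t)
        x = np.tanh(g)  # nodes cluster at the endpoints
        w = h * 0.5 * np.pi * np.cosh(t) / np.cosh(g) ** 2  # trapezoid weights
        return np.sum(w * f(x))

    # sqrt(1 - x^2) has singular derivatives at +-1, which slows the
    # plain trapezoid rule badly; tanh-sinh still converges fast.
    print(tanh_sinh(lambda x: np.sqrt(1.0 - x * x)))  # ~pi/2 = 1.5707963...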
tim-kt wrote 17 hours 7 min ago:
No, the integral cannot be "expressed" as a sum over weights and
function evaluations (with an "="); it can be approximated with this
idea. If you fix any n+1 nodes, interpolate your function, and
integrate your polynomial, you will get this sum, where the weights
are integrals over (Lagrange) basis polynomials. By construction, you
can compute the integral of polynomials up to degree n exactly. Now,
if you choose the nodes in a particular way (namely, as the zeros of
certain orthogonal polynomials), you can increase this to degree 2n+1.
What I'm getting at is that Gaussian integration is not estimating the
integrals of polynomials of degree 2n+1, but evaluating them exactly.
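A quick numerical check of that exactness claim using numpy's
Gauss-Legendre rule (note the indexing difference: leggauss(n) uses n
nodes and is exact through degree 2n-1, i.e. n+1 nodes give 2n+1 as
above):

    import numpy as np

    # 5-point Gauss-Legendre rule on [-1, 1]: nodes at the zeros of
    # P_5, exact for all polynomials up to degree 2*5 - 1 = 9.
    x, w = np.polynomial.legendre.leggauss(5)

    print(np.sum(w * x**8) - 2.0 / 9.0)    # ~1e-16: degree 8 is exact
    print(np.sum(w * x**9) - 0.0)          # ~1e-17: degree 9 is exact
    print(np.sum(w * x**10) - 2.0 / 11.0)  # ~ -3e-3: degree 10 is not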
Onavo wrote 15 hours 28 min ago:
Convolution?
tim-kt wrote 13 hours 7 min ago:
What about it?
extrabajs wrote 19 hours 22 min ago:
What is Fig. 1 showing? Is it the value of the integral compared with
two approximations? Would it not be more interesting to show the error
of the approximations instead? Asking for a friend who isn't
computing a lot of integrals.
mturmon wrote 15 hours 57 min ago:
Fig 1 could use a rethink. It uses log scale, but the dynamic range
of the y-axis is tiny, so the log transform isn't doing anything.
It would be better shown as a table with 3 numbers. Or, maybe two
columns, one for integral value and one for error, as you suggest.
rd11235 wrote 8 hours 53 min ago:
Yeah - my guess is this was just a very roundabout solution for
setting axis limits.
(For some reason, plt.bar was used instead of plt.plot, so the y
axis would start at 0 by default, making all results look the same.
But when the log scale is applied, the lower y limit becomes the
data's minimum. So, because the dynamic range is so low, the end
result is visually identical to having just set y limits using the
original linear scale.)
Anyhow, for anyone interested, the values for those 3 points are
2.0000 (exact), 1.9671 (trapezoid), and 1.9998 (Gaussian). The
relative errors are 1.6% vs. 0.01%.
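A sketch of the fix suggested upthread (plot only the errors, on a
linear scale), using the numbers quoted just above; the labels are my
own choices:

    import matplotlib.pyplot as plt

    # Relative errors from the values above: exact 2.0000,
    # trapezoid 1.9671, Gaussian 1.9998.
    methods = ["trapezoid", "Gaussian"]
    rel_err = [abs(1.9671 - 2.0) / 2.0, abs(1.9998 - 2.0) / 2.0]

    plt.bar(methods, rel_err)
    plt.ylabel("relative error")
    plt.title("Error vs. exact value of the integral")
    plt.show()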
seanhunter wrote 22 hours 3 min ago:
I thought when I first saw the title that it was going to be about the
Gaussian integral [1], which has to be one of the coolest results in
all of maths.
That is, the integral from -infinity to +infinity of e^(-x^2) dx =
sqrt(pi). I remember being given this as an exercise and just being
totally shocked by how beautiful a result it was (when I eventually
managed to work out how to evaluate it).
[1]: https://mathworld.wolfram.com/GaussianIntegral.html
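For reference, the classic polar-coordinates evaluation: square the
integral, pass to polar coordinates, and the Jacobian factor r makes
the integrand elementary.

    I^2 = \int_{-\infty}^{\infty} e^{-x^2}\,dx
          \int_{-\infty}^{\infty} e^{-y^2}\,dy
        = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}
          e^{-(x^2+y^2)}\,dx\,dy
        = \int_0^{2\pi}\!\int_0^{\infty} e^{-r^2}\,r\,dr\,d\theta
        = 2\pi \cdot \tfrac{1}{2} = \pi,
    \qquad I = \sqrt{\pi}.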
constantcrying wrote 19 hours 35 min ago:
There is a relationship here: in the case of Gauß-Hermite
integration, where the weight function is exactly e^(-x^2), the
weights have to add up to sqrt(pi), because the rule is exact for
the constant polynomial 1.
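This is easy to check with numpy's Gauss-Hermite rule (my
illustration):

    import numpy as np

    # Gauss-Hermite quadrature approximates integrals of
    # f(x) * exp(-x^2) over the real line. With f = 1 the rule is
    # exact, so the weights must sum to the Gaussian integral sqrt(pi).
    x, w = np.polynomial.hermite.hermgauss(10)
    print(w.sum(), np.sqrt(np.pi))  # both ~1.7724538509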
niklasbuschmann wrote 20 hours 9 min ago:
Gaussian integrals are also pretty much the basis of quantum field
theory in the path integral formalism, where Isserlis's theorem is
the analog of Wick's theorem in the operator formalism.
srean wrote 19 hours 58 min ago:
Indeed.
It's the gateway drug to Laplace's method (Laplace approximation) [1],
mean field theory, perturbation theory, ... QFT.
[1]: https://en.m.wikipedia.org/wiki/Laplace%27s_method
chermi wrote 16 hours 14 min ago:
Maybe I should just read the wiki you linked, but I guess I'm
confused on how this is different than steepest descent? I'm a
physicist by training, so maybe we just call it something
different?
srean wrote 15 hours 4 min ago:
It is Laplace's method of steepest descent. When the same method
is used to approximate a probability density function, for
example a posterior probability density, it's called
Laplace's approximation of the density.
The wikipedia link would have made things quite clear :)
dawnofdusk wrote 15 hours 19 min ago:
They are the same for physicists, who analytically continue
everything. Steepest descent is technically for integrals in
the complex plane, and Laplace's method is for integrating over
the real line.
constantcrying wrote 23 hours 23 min ago:
A good introduction to the basics.
What is also worth pointing out, and which was somewhat glossed over,
is the close connection between the weight function and the
polynomials. For different weight functions you get different classes
of orthogonal polynomials, where "orthogonal" is understood with
respect to the scalar product given by integrating against that same
weight function.
Interestingly, Gauss-Hermite integrates over the entire real line,
from -infinity to infinity, so the choice of weight function also
influences the choice of integration domain.
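A small numerical illustration of that pairing, assuming numpy's
conventions for the physicists' Hermite polynomials:

    import numpy as np
    from numpy.polynomial.hermite import hermgauss, hermval

    # The Hermite polynomials are orthogonal w.r.t. the scalar product
    # <f, g> = integral of f(x) g(x) exp(-x^2) over the real line.
    # Gauss-Hermite quadrature has the weight built in, and the product
    # H_2 * H_3 has degree 5 < 2*20, so the rule computes it exactly.
    x, w = hermgauss(20)
    H2 = hermval(x, [0, 0, 1])
    H3 = hermval(x, [0, 0, 0, 1])
    print(np.sum(w * H2 * H3))  # ~0: orthogonal
    print(np.sum(w * H2 * H2))  # ~14.18 = 8*sqrt(pi): nonzero squared norm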
creata wrote 20 hours 56 min ago:
Sorry if this is a stupid question, but is there a direct link
between the choice of weight function and the applications of the
polynomial?
Like, is it possible to infer that Chebyshev polynomials would be
useful in approximation theory using only the fact that they're
orthogonal w.r.t. the Wigner semicircle (U_n) or arcsine (T_n)
distribution?
sfpotter wrote 19 hours 11 min ago:
Chebyshev polynomials are useful in approximation theory because
they're the minimax polynomials. The remainder of polynomial
interpolation can be given in terms of the nodal polynomial, which
is the polynomial with the interpolation nodes as its zeros.
Minimizing the maximum error then leads to the Chebyshev polynomials.
This is a basic fact in numerical analysis that has tons of
derivations online and in books.
The weight function shows the Chebyshev polynomials' relation to
the Fourier series. But they are not what you would usually think
of as a good candidate for L2 approximation on the interval.
Normally you'd use Legendre polynomials, since they have w = 1, but
they are a much less convenient basis than Chebyshev for numerics.
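The standard demonstration of why the nodal polynomial matters (a
sketch of mine, with Runge's function as the usual test case):
interpolating at equispaced nodes diverges as the degree grows, while
Chebyshev nodes, which minimize the max of the nodal polynomial,
converge.

    import numpy as np
    from numpy.polynomial import chebyshev as C

    f = lambda x: 1.0 / (1.0 + 25.0 * x**2)   # Runge's function
    n = 20
    xx = np.linspace(-1, 1, 1000)

    for name, nodes in [("equispaced", np.linspace(-1, 1, n + 1)),
                        ("Chebyshev", np.cos(np.pi * np.arange(n + 1) / n))]:
        c = C.chebfit(nodes, f(nodes), n)      # degree-n interpolant
        err = np.max(np.abs(C.chebval(xx, c) - f(xx)))
        print(name, err)   # equispaced: ~60 and growing; Chebyshev: ~1e-2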
creata wrote 17 hours 51 min ago:
True, and there are plenty of other reasons Chebyshev polynomials
are convenient, too.
But I guess what I was asking was: is there some kind of abstract
argument why the semicircle distribution would be appropriate in
this context?
For example, you have abstract arguments like the central limit
theorem that explain (in some loose sense) why the normal
distribution is everywhere.
I guess the semicircle might more or less be the only way to get
something where interpolation uses the DFT (by projecting points
evenly spaced on the complex unit circle onto [-1, 1]), but I
dunno, that motivation feels too many steps removed.
sfpotter wrote 17 hours 35 min ago:
If there is, I'm not aware of it. Orthogonal polynomials come
up in random matrix theory. Maybe there's something there?
But your last paragraph is exactly it... it is a "basic" fact
but the consequences are profound.
efavdb wrote 15 hours 56 min ago:
Could you guys recommend an easy book on this topic?
sfpotter wrote 15 hours 48 min ago:
Which topic?
efavdb wrote 15 hours 15 min ago:
Some sort of numerical analysis book that covers these
topics - minimax approximation, quadrature, etc. I've read on
these separately but am curious what other sorts of
things would be covered in courses including that.
sfpotter wrote 14 hours 49 min ago:
I would check out "An Introduction to Numerical
Analysis" by Süli and Mayers or "Approximation Theory
and Approximation Practice" by Trefethen. The former
covers all the major intro numerical analysis topics in
a format that is suitable for someone with an ~undergrad
math or engineering background. The latter goes deep
into Chebyshev approximation (and some related topics).
It is also very accessible but is much more
specialized.
4ce0b361 wrote 14 hours 50 min ago:
I'd suggest: Trefethen, Lloyd N., Approximation Theory
and Approximation Practice (Extended Edition), SIAM,
Philadelphia, PA (2020), ISBN 978-1-611975-93-2.
constantcrying wrote 19 hours 29 min ago:
Yes. Precisely the fact that they are orthogonal means that they are
suitable.
If you are familiar with the Fourier series, the same principle can
be applied to approximating with polynomials.
In both cases the crucial point is that you can form an orthogonal
subspace, onto which you can project the function to be
approximated.
For polynomials it is this:
[1]: https://en.m.wikipedia.org/wiki/Polynomial_chaos
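A minimal sketch of such an orthogonal (L2) projection, using Legendre
polynomials (weight w = 1 on [-1, 1]); the choice of e^x as target and
the variable names are mine:

    import numpy as np
    from scipy.integrate import quad
    from numpy.polynomial.legendre import Legendre, legval

    # Project f onto span{P_0, ..., P_n}: since <P_j, P_k> = 0 for
    # j != k and <P_k, P_k> = 2/(2k+1), the coefficients are
    # c_k = (2k+1)/2 * integral of f(x) P_k(x) over [-1, 1].
    f, n = np.exp, 8
    c = np.empty(n + 1)
    for k in range(n + 1):
        Pk = Legendre.basis(k)
        c[k] = (2 * k + 1) / 2.0 * quad(lambda x: f(x) * Pk(x), -1, 1)[0]

    x = np.linspace(-1, 1, 5)
    print(np.max(np.abs(legval(x, c) - f(x))))  # tiny: projection tracks f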
sfpotter wrote 18 hours 17 min ago:
What you're saying isn't wrong, per se, but...
There are polynomials that aren't orthogonal that are suitable
for numerics: both the Bernstein basis and the monomial basis are
used very often, and neither is orthogonal. (Well, you could pick
a weight function that makes them orthogonal, but...!)
The fact of their orthogonality is crucial, but when you work
with Chebyshev polynomials, it is very unlikely you are doing an
orthogonal (L2) projection! Instead, you would normally use
Chebyshev interpolation: 1) interpolate at either the Type-I or
Type-II Chebyshev nodes, 2) use the DCT to compute the Chebyshev
series coefficients. The fact that you can do this is related to
the weight function, but it isn't an L2 procedure. Like I
mentioned in my other post, the Chebyshev weight function is
maybe more of an artifact of the Chebyshev polynomials' intimate
relation to the Fourier series.
I am also not totally sure what polynomial chaos has to do with
any of this. PC is a term of art in uncertainty quantification,
and this is all just basic numerical analysis. If you have a
series in orthogonal polynomials and you want to call it something
fancy, you might call it a Fourier series, but usually there is
no fancy term...
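A sketch of the interpolation route just described, pairing the Type-I
(Chebyshev-Gauss) nodes with a DCT-II; the pairing and the helper name
are my own choices:

    import numpy as np
    from scipy.fft import dct

    def cheb_interp_coeffs(f, N):
        # Type-I (Chebyshev-Gauss) nodes: x_j = cos(pi (j + 1/2) / N).
        j = np.arange(N)
        x = np.cos(np.pi * (j + 0.5) / N)
        # DCT-II of the samples gives the Chebyshev sums in O(N log N);
        # rescale, then halve a_0 since T_0 enters with half weight.
        a = dct(f(x), type=2) / N
        a[0] /= 2
        return a  # degree N-1 interpolant: f(x) ~ sum_k a[k] T_k(x)

    a = cheb_interp_coeffs(np.exp, 16)
    xx = np.linspace(-1, 1, 7)
    print(np.max(np.abs(np.polynomial.chebyshev.chebval(xx, a) - np.exp(xx))))
    # ~1e-15: degree 15 already reproduces exp to machine precision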
constantcrying wrote 16 hours 22 min ago:
I think you are very confused. My post is about approximation.
Obviously other areas use other polynomials, or the same
polynomials for other reasons.
In this case it is about the principle of approximation by
orthogonal projection, which is quite common in different
fields of mathematics. Here you create an approximation of a
target by projecting it onto an orthogonal subspace. This is
what the Fourier series is about: an orthogonal projection.
Choosing e.g. the Chebyshev polynomials instead of the complex
exponential gives you an approximation onto the orthogonal
space of, e.g., Chebyshev polynomials.
The same principle applies e.g. when you are computing an SVD
for a low-rank approximation. That is another case of
orthogonal projection.
>Instead, you would normally use Chebyshev interpolation
What you do not understand is that this is the same thing. The
distinction you describe does not exist; these are the same
things, just different perspectives. That they are the same
easily follows from the uniqueness of polynomials, which are
fully determined by their interpolation points. These aren't
distinct ideas; there is a greater principle behind them, and
that you are using some other algorithm to compute the
approximation does not matter at all.
>I am also not totally sure what polynomial chaos has to do
with any of this.
It is the exact same thing: projection onto an orthogonal
subspace of polynomials, just that you choose the polynomials
with regard to a random variable, so you get an approximation
with good statistical properties.
sfpotter wrote 15 hours 55 min ago:
No, I am not confused. :-) I am just trying to help you
understand some basics of numerical analysis.
> What you do not understand is that this is the same thing.
It is not the same thing.
You can express an analytic function f(x) in a convergent (on
[-1, 1]) Chebyshev series: f(x) = \sum_{n=0}^\infty a_n
T_n(x). You can then truncate it, keeping N+1 terms, giving a
degree N polynomial. Call it f_N.
Alternatively, you can interpolate f at N+1 Chebyshev
nodes and use a DCT to compute the corresponding Chebyshev
series coefficients. Call the resulting polynomial p_N.
In general, f_N and p_N are not the same polynomial.
Furthermore, computing the coefficients of f_N is much more
expensive than computing the coefficients of p_N. For f_N,
you need to evaluate N+1 integrals, which may be quite
expensive indeed if you want to get digits. For p_N, you
simply evaluate f at N+1 nodes, compute a DCT in O(N log N)
time, and the result is the coefficients of p_N up to
rounding error.
In practice, people do not compute the coefficients of f_N;
they compute the coefficients of p_N. Nevertheless, f_N and
p_N are essentially as good as each other when it comes to
approximation.
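Both claims are easy to see numerically; a sketch, with f = 1/(2-x) as
an arbitrary analytic test function of my choosing:

    import numpy as np
    from scipy.integrate import quad
    from scipy.fft import dct

    f, N = (lambda x: 1.0 / (2.0 - x)), 10

    # f_N: true Chebyshev series coefficients via the substitution
    # x = cos(t): a_k = (2/pi) * integral over [0, pi] of
    # f(cos t) cos(k t) dt, with a_0 halved. One adaptive integral
    # (many function evaluations) per coefficient.
    fN = np.array([(2.0 / np.pi) *
                   quad(lambda t: f(np.cos(t)) * np.cos(k * t), 0.0, np.pi)[0]
                   for k in range(N + 1)])
    fN[0] /= 2

    # p_N: interpolate at N+1 Type-I Chebyshev nodes -- just N+1
    # function evaluations and one DCT.
    j = np.arange(N + 1)
    x = np.cos(np.pi * (j + 0.5) / (N + 1))
    pN = dct(f(x), type=2) / (N + 1)
    pN[0] /= 2

    print(np.max(np.abs(fN - pN)))  # ~1e-7: close, but not the same polynomial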
constantcrying wrote 14 hours 55 min ago:
At this point I really hate to ask. Do you know what
"orthogonal subspace" means and what a projection is?
sfpotter wrote 14 hours 37 min ago:
Ah, shucks. I can see you're getting upset and defensive.
Sorry... Yes, it should be clear from everything I've
written that I'm quite clear on the definitions of these.
If you would like to read what I'm saying but from a more
authoritative reference that you feel you can trust, you
can just take a look at Trefethen's "Approximation Theory
and Approximation Practice". I'm just quoting the contents of
Chapter 4 at you.
Again, like I said in my first response to you, what
you're saying isn't wrong, it just misses the mark a bit.
If you want to compute the L2 projection of a function
onto the orthogonal subspace of degree N Chebyshev
polynomials, you would need to evaluate a rather
expensive integral to compute the coefficients. It's
expensive because it requires the use of adaptive
integration... many function evaluations per coefficient!
Bad!
On the other hand, you could just do polynomial
interpolation using either set of degree N Chebyshev
nodes (Type-I or Type-II). This requires only N+1
function evaluations. Only one function evaluation per
coefficient. Good!
And, again, since the polynomial so constructed is
not the same polynomial as the one obtained via the L2
projection mentioned in paragraph 3 above, this
interpolation procedure cannot be regarded as a
projection! I guess you could call it an "approximate
projection". It agrees quite closely with the L2
projection, and has essentially the same approximation
power. This is why Chebyshev polynomials are so useful in
practice for approximation, and why e.g. Legendre
polynomials are much less useful (they do not have a
convenient fast transform).
Anyway, I hope this helps! It's a beautiful subject and a
lot of fun to work on.