Understanding The Fourier Transform (altdevblogaday.com)
272 points by mannjani on Dec 2, 2012 | 44 comments



The Fourier Transform can also be thought of as part of Linear Algebra, because it's actually finding a representation of a given function in the basis consisting of sin and cos functions (or complex exponentials).

See, the collection of non-pathological functions is a vector space. We add elements by adding the functions pointwise, we multiply by a constant in the obvious way, and the other requirements can be checked. Functions form a vector space.

And vector spaces have bases. One basis for the vector space of functions is the collection of sin and cos functions. Thus we can see that finding the Fourier Transform is just finding how much of each basis vector we need to make the function.

And as we know, the amount of basis vector u needed in the representation of a vector v is v.u, the dot product.
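To make that concrete, here is a rough numpy sketch (the test signal and the 1/pi normalisation are just illustrative choices): the amount of each sin/cos basis vector really is just a dot product with it.

    import numpy as np

    # Sample a made-up signal on [0, 2*pi) and recover "how much" of each
    # sin/cos basis vector it contains via dot products.
    N = 1000
    x = np.linspace(0, 2 * np.pi, N, endpoint=False)
    f = 3.0 * np.sin(2 * x) + 0.5 * np.cos(5 * x)

    # Discrete approximation of (1/pi) * integral f(x) u(x) dx, i.e. v.u
    def amount_of(basis, f, x):
        return np.dot(f, basis) * (x[1] - x[0]) / np.pi

    print(amount_of(np.sin(2 * x), f, x))   # ~3.0
    print(amount_of(np.cos(5 * x), f, x))   # ~0.5
    print(amount_of(np.sin(3 * x), f, x))   # ~0.0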

Thinking of it this way starts to make connections between all sorts of ideas.

Added in edit: I see the same sort of point made by dropdownmenu in http://news.ycombinator.com/item?id=4862228


Yup, that's how I usually picture it: decomposing the values into a collection of sin and cos functions.

When I started explaining it to some computer science students, it helped to give a particular example of its usefulness:

Sound is composed of waves, so when you want to send a piece of music to a friend it's all a bunch of values like [0, 1, 2, 1, 0, -1, -2, -1, 0, ...]. If you know they're going to look like waves (sinusoidal functions), why not just send your friend how much they look like sin or cos? The values back there were just 2sin(x), so why not just send them the value [2]?

You could save a lot of bandwidth. You just need to "correlate" sounds with a bunch of sin or cos functions everybody agrees on :)

Bonus: you can add the phase values, 2 sin(x + phase), to get the beats just right.
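A minimal numpy sketch of that idea, using actual samples of 2 sin(x) rather than the schematic list above:

    import numpy as np

    # Eight samples of 2*sin(x) over one period -- what you'd otherwise send in full.
    N = 8
    x = np.arange(N) * 2 * np.pi / N
    samples = 2.0 * np.sin(x)

    # "Correlate" with the agreed-upon sin(x); the 2/N factor normalises the dot product.
    amplitude = (2.0 / N) * np.dot(samples, np.sin(x))
    print(amplitude)    # ~2.0 -- send this one number instead of all the samples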


> Bonus: you can add the phase values, 2 sin(x + phase), to get the beats just right.

Fortunately, you don't need to do so; if the complex number z = A + iB has magnitude r and argument theta, then Acos(t) + Bsin(t) is the same as r*cos(t - theta). That is, combining cosines and sines of the same frequency already accounts for the phase shift.
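A quick numerical check of that identity, assuming numpy and some arbitrary A and B:

    import numpy as np

    # Check that A*cos(t) + B*sin(t) == r*cos(t - theta), with r and theta the
    # magnitude and argument of z = A + iB.
    A, B = 1.5, -0.7
    z = A + 1j * B
    r, theta = np.abs(z), np.angle(z)

    t = np.linspace(0, 2 * np.pi, 100)
    print(np.allclose(A * np.cos(t) + B * np.sin(t), r * np.cos(t - theta)))   # True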


> And vector spaces have bases. One basis for the vector space of functions is the collection of sin and cos functions. Thus we can see that finding the Fourier Transform is just finding how much of each basis vector we need to make the function.

You have indicated quite explicitly that you're being informal (by using non-technical terms like 'non-pathological functions'), but I think it's worth making the small observation here that sine and cosine functions do not form a basis (implicitly: "a Hamel basis") for, say, the space of continuous functions on R/Z in the usual sense of linear algebra, since not every function can be written as a finite linear combination of them.

Rather, they are a topological basis, in the sense that every function can be written as an infinite linear combination of them. Why is it worth making this point, which probably seemed too obvious to say? The reason is that, unlike finite sums, which are unambiguous algebraic constructs, infinite sums require topology to compute; and being a topological basis in, say, the L^2 topology is quite different to being a topological basis in, say, the topology of pointwise convergence. The study of the different kinds of summability of Fourier series is the subject of some very, very deep work.


All true - the explanation should be littered with "here be dragons ..." - but for the sake of grasping the concepts without drowning in detail, I thought it was worth leaving out such tricky (but important!) issues.

But you're right, there is interesting and deep material here.


This is a great explanation.

The only thing that I still can't get my head around or visualize is how it is that we know that the set of all cos and sin functions is sufficient to span this set of "non-pathological" functions.

So assuming some vector v is reachable by a linear combination of vectors u1,u2,u3... we know that each component should be v.u1, v.u2 etc. But how do you know that v is in fact "reachable" by some combination of vectors in the basis u?

That is, suppose I describe some (infinite) set of functions. How do I determine the set of functions that it spans? Is there an intuitive way to picture why the set of sin and cos functions forms a suitable basis?


This was Fourier's big theorem, that every periodic, non-pathological function is "simply" the sum of phase-shifted sine waves of all necessary frequencies. You can remove the "periodic" if you allow infinitely many frequencies, and so on. The phase-shifting is taken care of by having a sine and a cosine at the same frequency, but (possibly) different amplitudes.

So start with a periodic wave and look at how much sin(x) is in it. Do this by integrating:

    Integral from 0 to 2.pi of f(x)*sin(x) dx
That's a dot product of your function f(x) with the sine wave, and (after dividing by pi to normalise) it tells you how much of the first frequency you need. Subtract off the result, and then go again with sin(2x). You find the residuals get less and less. More, for something nice like a square wave or triangular wave the coefficients you get form a predictable sequence.

Interesting note:

    integral from 0 to 2.pi of sin(k.x)*sin(m.x) dx
is 0 if k != m, and pi otherwise, so the basis elements have dot-product 0, and hence are thought of as being at right angles. After dividing by pi (the same normalisation as above) they also have "length" 1, since the dot-product of each with itself is then 1. So they are an ortho-normal basis.
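Both claims are easy to check numerically; here is a small numpy sketch (the grid size is arbitrary, and the inner product carries the 1/pi normalisation), including the "predictable sequence" for a square wave, which comes out as 4/(pi.k) for odd k and 0 for even k:

    import numpy as np

    # Discretised inner product <f, g> = (1/pi) * integral from 0 to 2.pi of f(x) g(x) dx
    N = 100000
    x = np.linspace(0, 2 * np.pi, N, endpoint=False)
    dx = x[1] - x[0]
    def inner(f, g):
        return np.dot(f, g) * dx / np.pi

    print(inner(np.sin(3 * x), np.sin(5 * x)))   # ~0 : k != m
    print(inner(np.sin(3 * x), np.sin(3 * x)))   # ~1 : k == m, after the 1/pi

    # Square wave: sine coefficients follow the pattern 4/(pi*k) for odd k, 0 for even k.
    square = np.sign(np.sin(x))
    for k in range(1, 6):
        print(k, inner(square, np.sin(k * x)), 4 / (np.pi * k) if k % 2 else 0.0)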

So the question is: what functions can be reached by adding and subtracting multiples of sine (and cosine) functions of different frequencies? In Linear Algebra terms, what is the space of functions spanned by these basis functions?

That's harder, but it turns out to be "everything non-pathological".

http://en.wikipedia.org/wiki/Fourier_series#Hilbert_space_in...

http://en.wikipedia.org/wiki/Hilbert_space#Fourier_analysis

Edited for typos


Ah, thanks for answering the other questions I hadn't gotten around to asking (the reason why it's an orthonormal basis).

I think I need to go home and have a play with this now!


For an orthonormal set, one also gets another answer to your question: how can you see how much such a set spans?

Theorem: If S is an orthonormal (or even orthogonal) set in a Hilbert space V (here, L^2 functions on R/Z), then put S^\perp = {v in V : v is orthogonal to every s in S}. Then the closed span of S is (S^\perp)^\perp. (The containment of the span in the double-orthocomplement is formal; the other direction requires a supplemental theorem on the geometry of Hilbert spaces.)

With this in mind, we see that S is a topological basis (spans everything) when S^\perp = 0; so, to show that the sine and cosine functions span, it suffices to show that nothing (non-0) is orthogonal to all of them. This still requires computation, but at least it's easier to imagine doing this than somehow finding a Fourier series for any L^2 function.


"That is, suppose I describe some (infinite) set of a functions.... Is there an intuitive way to picture why the set of sin and cos functions forms a suitable basis?"

There are infinite expansions of trig functions to solve the finite vs infinite apparent mismatch, so don't worry about that.

An intuitive way to look at it is that on a 2-d graph there's no spot that can't be hit by a point of a triangle (aka a trig-function collection), so it doesn't particularly matter how you pick any one spot: a triangle can always hit that one spot because it can hit all spots.

Comp sci analogy would be something like x=x+1 starting at x=0 will hit all the positive integers eventually...


That doesn't really work, because you have to be able to hit them all simultaneously, and in fact, you can't. The theorem says that for any given epsilon you can choose enough terms to ensure that you are within epsilon everywhere except on a very small part of the real line, and you can make that exceptional part as small as you like.

The problem with your comment is that yes, for any given point you can get as close as you like, but the hard part is getting close nearly everywhere all at once. More, you then have to show that there's some sort of convergence, and that where you are close now is a superset of where you are close when you demand to be closer.

It's not straightforward.



I really liked where he color coded the equation and the sentence describing how it worked. I've seen tables in the past explaining what each variable represented, but this was much clearer.

http://altdevblogaday.com/wp-content/uploads/2011/05/Derived...


I nearly got the light blue and green confused, and I have normal color vision. I think there are reasons to prefer tables, or perhaps something animated or interactive would work as well. (e.g. mouseover a term to get its description.)


>I nearly got the light blue and green confused, and I have normal color vision. //

Define normal.

I have some difficulty differentiating blues and greens sometimes but I found those shades to be quite distinct even on a low quality screen.

Perhaps your colour differentiation is worse than you think? It could be down to monitor settings though, for example.

There's an Ishihara test here - http://colorvisiontesting.com/ishihara.htm FWIW but I'd imagine you'd need to see an optician to do it correctly.


This is an interesting way to visualize it, but I think the only way to really grok the Fourier transform is to use it in several different applications, and understand its role in each.

A particularly helpful tangent is to play with convolution and see the many different applications that is useful for (did you know that the shadow cast by an object is its cross section convolved with the cross section of the light source?). Once you really get convolution, and really get its relationship to the Fourier transform, you will be well along the path to enlightenment.


You can also think of the Fourier Transform as a projection (dot product) of a signal onto the space of all sinusoids. That's the explanation that made everything click for me.


I like this way of thinking about it, but I think it is not quite accurate for discrete Fourier Transforms. In this case, we're not projecting onto the space of all sinusoids, only the space of sinusoids whose frequency is a multiple of (1/N). We could probably prove (if we wanted to try) that those form a basis for the vector space of N-long complex vectors, so using any more sinusoids would be redundant.

However, I believe the continuous Fourier Transform works exactly like that.
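If you want to convince yourself of the basis claim numerically, here is a small numpy sketch (N = 16 is arbitrary): the N sampled complex sinusoids are mutually orthogonal, hence a basis of the space of N-long complex vectors.

    import numpy as np

    # Columns of F are the N complex sinusoids e^{2 pi i k n / N}, k = 0..N-1.
    # Their Gram matrix being N * I means they are orthogonal, so they span C^N.
    N = 16
    n = np.arange(N)
    F = np.exp(2j * np.pi * np.outer(n, n) / N)

    gram = F.conj().T @ F
    print(np.allclose(gram, N * np.eye(N)))   # True: orthogonal, hence a basis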


Very true, and if you want to be more specific, my definition only works for finite-duration continuous-time signals with finite second moment. I use this definition as it also works well for understanding other transforms such as the Laplace transform.


It is both the least squares approximation using periodic functions of this sort (i.e. the projection you mentioned), and an interpolant - a very nice combination of properties.


As explained, you don't even need to project. What you do is a change of basis, which is to say no information is lost.


That's a really great way of thinking about it, actually - particularly if you're used to vector mathematics.


If someone (like me) is having trouble visualizing, http://dave.whipp.name/tutorial/fourier.html


You might be interested in the discussion from one of the previous submissions of this:

http://news.ycombinator.com/item?id=2555562

Here are some other items about the Fourier transform:

http://www.hnsearch.com/search#request/all&q=title%3Afou...


You can also think about the Fourier Transform in terms of its physical properties.

For example, the Fourier transform is behind quantum uncertainty (dp.dx>h). Think of it this way: the inverse Fourier transform of a frequency impulse (zero extent) is a sine wave of infinite duration. Truncate the infinite sine wave and its spectrum ceases being an impulse, broadening into the shape of the windowing function used to truncate the sine wave. That is, an attempt to constrain/define time leads to a broadening in frequency, and vice versa. The uncertainty principle naturally arises from using the Fourier Transform in an environment where "you can't have infinities".

This is true of any two variables which are related by a Fourier Transform. Yes, position and momentum are related by a Fourier transform (as are energy and time).

The thinking also works for the other extreme: consider how a time impulse (zero extent) relates to its Fourier transform, a flat spectrum of infinite extent on the frequency axis.
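A small numpy illustration of the broadening (the lengths are arbitrary): the spectrum of a full-length sine is essentially a single bin, while a truncated copy of the same sine smears across many bins.

    import numpy as np

    N = 4096
    t = np.arange(N)
    tone = np.sin(2 * np.pi * 64 * t / N)      # exactly 64 cycles over the record

    full = np.abs(np.fft.rfft(tone))
    short = np.zeros(N)
    short[:256] = tone[:256]                   # truncate: "constrain time"
    trunc = np.abs(np.fft.rfft(short))

    # Count bins holding more than 1% of the peak: truncation broadens the spectrum.
    print(np.sum(full > 0.01 * full.max()))    # 1 bin (an impulse)
    print(np.sum(trunc > 0.01 * trunc.max()))  # many bins (broadened)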


http://www.youtube.com/watch?v=Znby3t3AS5s

♫♫♫

Uncertainty is not so odd as long as we're aware

position and momentum are a Fourier transform pair

So anything that tightens our precision on the one

means certainty about the other value gets undone

♫♫♫

Still, I'm not sure I approve of your suggestion to use this physical property of the universe as a basis for intuition. The reasons why quantum mechanics "works" are far, far more difficult to understand, internalize, and accept than the concepts behind the DFT which only really requires an understanding of first-semester linear algebra. In other words, I expect that the set of people who understand QM at this level but do not understand the DFT is nearly empty.

If you actually meant to go the other way (use the properties of the FT/DFT to gain an understanding of QM) then I completely agree with everything you said, of course.


"That is, an attempt to constrain/define time leads to a broadening in frequency, and vice versa."

That is also a really well phrased one-line point of commonality for talking to a telecom / RF / EE type person about communications bandwidth theory. If you just wedge in signal to noise ratio / bit error rate, and phrase "time definition" and "frequency broadening" in the right way, then you pretty much have Shannon's famous paper.

The FT shows up all over the place in science and anywhere you find it, you can analogize it into a totally different field of study. One "design pattern" hundreds of "implementations".


That would be a nice topic for a website: to encapsulate and present the duality present in any number of fields.

A major barrier to understanding any new field is being able to strip away the jargon and recast the ideas into a familiar form. One can envisage a website where you tick the field(s) "A" that you want to learn about, tick the field(s) "B" that you already know about, and it recasts field(s) "A" in terms of the terminology of field(s) "B".

Make it have a "plug-in" structure, so supporting a new field is a matter of writing a mapping of field specific concepts to a set of "design patterns". Make it open source, so experts can jump in and contribute mappings for a wide variety of fields. Getting fancy, mappings could be written in terms of any two fields (so the expert does not have to learn a set of web-site specific design patterns), and the software could construct a graph to allow conversions between arbitrary fields.


My preferred way to think about it is in linear algebra terms: your signal of length n is an n-dimensional vector. Now you just need to change the coordinate system into the fourier basis, which is made up of vectors of length n whose entries are sin/cos waves of different frequencies. The way you change basis in linear algebra is by doing a dot product with the basis vectors you want to transform to, and it's no different here... it's just a projection of your signal onto the fourier basis, and that's all the formula says.

The fancy e^i stuff is just for mathematical beauty and compactness and should be avoided when explaining the fourier transform, imo. There's absolutely no need for it as far as the idea goes; just do the sines and cosines separately.
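For the concrete-minded, a sketch of exactly that in numpy (random test signal; the sign flip below is only because numpy's FFT uses e^{-i...}):

    import numpy as np

    # DFT by plain dot products with cos and sin basis vectors -- no e^i needed.
    N = 64
    n = np.arange(N)
    signal = np.random.randn(N)
    X = np.fft.fft(signal)

    for k in range(N):
        cos_part = np.dot(signal, np.cos(2 * np.pi * k * n / N))
        sin_part = np.dot(signal, np.sin(2 * np.pi * k * n / N))
        assert np.isclose(cos_part, X[k].real)
        assert np.isclose(sin_part, -X[k].imag)
    print("cos/sin dot products reproduce the FFT")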


If, however, one wanted a similarly intuitive explanation of the e^i part of the story, I recommend these two articles in order. They explain the hand wavy part about complex numbers and rotation: http://betterexplained.com/articles/a-visual-intuitive-guide... http://betterexplained.com/articles/intuitive-understanding-...


If you want to solve his original problem, detecting treble and bass energy in an audio signal, you might want to read up on the Goertzel algorithm. http://en.wikipedia.org/wiki/Goertzel_algorithm

For limited numbers of bands it will use less battery power and be faster.


This is a nice way to see how the DFT is computed; however, I find the view of the FT as a change of basis even more important - it generalizes easily to other bases, and one can then easily understand wavelets and their advantages. Basically, the sinusoids form a basis of the vector space of functions (every 'non-pathological' function can be written as a possibly infinite sum of them), and the numbers computed by the FT are the coefficients for the respective basis vectors - the magnitude of these coefficients is interpreted as the strength of the corresponding wave in the original signal.

Another way to see the FT is as the basis where the convolution operators are diagonal - this is used in image processing, where computing the FFT of a filter + entry-wise multiplication can be much faster than running the convolution at each pixel of the input image.
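A minimal 1-D numpy sketch of that second point (sizes and signals made up): entry-wise multiplication of FFTs reproduces circular convolution.

    import numpy as np

    N = 256
    signal = np.random.randn(N)
    kernel = np.random.randn(N)

    # Direct circular convolution, O(N^2)
    direct = np.zeros(N)
    for m in range(N):
        for n in range(N):
            direct[m] += signal[n] * kernel[(m - n) % N]

    # Via the FFT: transform, multiply entry-wise, transform back, O(N log N)
    via_fft = np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)).real

    print(np.allclose(direct, via_fft))   # True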


This is a pretty good intro to DFTs. I was excited to see that others were also motivated by the connection between music and math! I looked into this quite a bit in linear algebra in college and found Benson[1] and Smith[2] to be really thorough and interesting resources on the topic. Hope they help others!

1. http://homepages.abdn.ac.uk/mth192/pages/html/maths-music.ht... (free pdf)

2. https://ccrma.stanford.edu/~jos/mdft/ (skip down to applications and the digital audio number systems for a preview)


Here is an attempt at an easy-going, matrix-oriented discussion of the FFT, together with quite a bit of motivation:

http://home.gregfjohnson.com/fft

Here is a really terse "just the essence" ruby implementation of the FFT:

http://home.gregfjohnson.com/fftruby


When I first came across Fourier transforms I just thought of it as a bunch of sine and cosine waves destructively/constructively interfering. Could be because we derived FT in class starting by matching sine/cosines to intersect various points.


I got an idea of how and why the DFT works from Prof. Wilf (RIP): http://www.math.upenn.edu/~wilf/AlgoComp.pdf - see from page 50 on. He approaches it from the algebraic side.


I found this amazing lecture on youtube.

http://www.youtube.com/watch?v=M0Sa8fLOajA

(Gilbert Strang - my first time watching him teach. Such a great paced lecturer)


That's so conceptually simple I hate myself for not getting it before.


I would like to see more of these articles cover the phase portion.


Perhaps this was left as an exercise for the reader? It seems to follow quite closely from the same intuition: after you have averaged the points on the complex plane you get a point that you can represent as (r, theta). Theta is the phase portion. It's just the direction of the constructively interfering peak.
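A tiny numpy example of reading both pieces off one coefficient (frequency and phase are made up; numpy's fft skips the averaging step, so the magnitude comes out as N/2 rather than 1/2):

    import numpy as np

    N = 256
    t = np.arange(N)
    signal = np.cos(2 * np.pi * 5 * t / N + 0.9)   # 5 cycles, known phase 0.9

    coeff = np.fft.fft(signal)[5]
    print(np.abs(coeff))     # r: ~N/2, the strength of that frequency
    print(np.angle(coeff))   # theta: ~0.9, the phase we put in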


What do you mean?


The imaginary part of the transform.


The article uses the magnitude of the coefficients, which is computed using both the real and the imaginary part.


The phase portion isn't actually the imaginary part; it's the piece of information lost when one goes from real+imag -> magnitude, i.e. it's the argument of the complex number.



