PCA: Basically, you compute the covariance matrix of the measurements in the data set you care about, then find the eigenvalues and eigenvectors of that matrix. The eigenvectors tell you the directions (and, taken together, the hyperplanes) that most accurately represent your data, and the eigenvalues tell you how much variance lies along each one. Unfortunately, I can't get into more detail about how this is used for bill detection -- go read the patents and papers yourself.
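To make the covariance/eigenvector step concrete, here's a minimal numpy sketch -- toy data and made-up names, nothing from the actual bill-detection papers:

    import numpy as np

    # Toy data: 200 samples, 5 measurements each (a stand-in for whatever
    # sensor readings a bill validator actually collects).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))

    # Center the data and form the covariance matrix.
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)            # 5x5 covariance matrix

    # Eigendecomposition: eigenvectors are the principal directions,
    # eigenvalues say how much variance lies along each one.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]         # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Project onto the top 2 directions -- the plane that best
    # represents the data in a least-squares sense.
    X_reduced = Xc @ eigvecs[:, :2]

Keeping only the top few eigenvectors is the dimensionality reduction people argue about further down the thread.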
SVM: Basically, you have a bunch of datasets, and you have an unknown data point, and you want to figure out which dataset your new data point belongs to. Well, you're not a clever person, and neither am I, so you just come up with the "cloud that surrounds" your N-dimensional shapes. This is your Support Vector.
A Support Vector Machine is just "hey, I've got a bunch of characteristic datasets, find the minimal structure for each dataset that surrounds the cloud, and then let me compare them." In practice, it gets really thorny to find that minimal boundary directly, so people use something called the Kernel Trick to turn it into something more manageable. (Basically, it's a transform that maps your dataset into a higher-dimensional space, which will likely simplify the data, since there's probably an underlying structure to it that you don't know about. You try a bunch of kernels, and take the one that works best for you.)
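If you want to see the "try a bunch of kernels" part in code, here's a rough scikit-learn sketch on synthetic data (my choice of dataset and kernels, purely illustrative):

    from sklearn.datasets import make_moons
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # Synthetic, non-linearly-separable data standing in for the
    # "characteristic datasets".
    X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Try a bunch of kernels and keep whichever scores best on held-out data.
    for kernel in ("linear", "poly", "rbf"):
        clf = SVC(kernel=kernel).fit(X_train, y_train)
        print(kernel, clf.score(X_test, y_test))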
Again, I can't tell you how it relates to bill detection. I'm embargoed. Go look at the patents and papers yourself.
I've adopted a similar attitude to yours when it comes to past machine learning jobs and discussing details. What ends up being the bright shiny line that you don't cross? I tend to just not talk about the specific feature engineering, while being relatively upfront about basic things like "I used a random forest".
"Principal components are linear combinations of original variables x1, x2, etc. So when you do SVM on PCA decomposition you work with these combinations instead of original variables."
"What do you do to the data? My answer: nothing. SVMs are designed to handle high-dimensional data. I'm working on a research problem right now that involves supervised classification using SVMs. Along with finding sources on the Internet, I did my own experiments on the impact of dimensionality reduction prior to classification. Preprocessing the features using PCA/LDA did not significantly increase classification accuracy of the SVM."
I'm an amateur in machine learning myself so I don't have a lot of knowledge of the details, but allow me to take a guess at what it's about.
A support vector machine is a machine learning algorithm that takes data points in some (usually) high-dimensional space and classifies them based on where they lie relative to a boundary that (mostly) divides the positive examples from the negative ones. So one way a bill-detection SVM might work is by transforming images of the bills into points in that high-dimensional space, treating individual pixels as different dimensions, and deciding whether something is a valid banknote (and which denomination) based on where in that space its point falls.
Since SVMs are designed to work well on high-dimensional data, you're correct that principal component analysis doesn't normally help them do better; oftentimes it makes them perform worse. More likely, the reason they're doing dimensionality reduction is to cut down on the size of the SVM's model. That could help in two ways: if you're using a really massive number of training examples, dimensionality reduction can cut down on the time it takes to train the SVM, or the space you need to store your training set. And if you're trying to fit the SVM into an embedded system, dimensionality reduction would allow you to produce an SVM that runs well on lower-cost hardware.
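Putting those two points together -- pixels as dimensions, PCA only to shrink what the SVM has to carry around -- here's a rough sketch using scikit-learn's digits set as a stand-in for banknote scans (purely illustrative, not anyone's actual pipeline):

    import time
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    # 8x8 images flattened into 64 pixel-dimensions.
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    for name, model in [("raw pixels", SVC()),
                        ("PCA(16) + SVM", make_pipeline(PCA(n_components=16), SVC()))]:
        t0 = time.perf_counter()
        model.fit(X_train, y_train)
        svc = model if name == "raw pixels" else model[-1]
        # Stored model size is roughly (support vectors) x (feature dimension).
        n_sv, dim = svc.support_vectors_.shape
        print(f"{name}: acc={model.score(X_test, y_test):.3f}, "
              f"fit={time.perf_counter() - t0:.2f}s, stored floats={n_sv * dim}")

Accuracy typically barely moves, while each stored support vector shrinks from 64 numbers to 16 (at the cost of keeping the projection matrix around).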
It's more than that. The SVM kernel can map your multi-dimensional data into an infinite-dimensional space, and because of the way the math works you can essentially learn in that infinite-dimensional space without overfitting. The support vectors are the data points that "support" the separating hyperplane, that is, the points where the margin constraints are tight. The other thing about SVMs is that they are computationally friendly.
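A quick sketch of the "you never actually touch the infinite-dimensional space" part: with the RBF kernel the implicit feature map is infinite-dimensional, but all the algorithm ever needs is the pairwise kernel values, which you can hand to scikit-learn precomputed (toy data, my parameter choices):

    import numpy as np
    from sklearn.datasets import make_moons
    from sklearn.svm import SVC

    X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

    # RBF kernel: exp(-gamma * ||x - z||^2). Its feature map is
    # infinite-dimensional, but the SVM only ever sees these numbers.
    gamma = 1.0
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-gamma * sq_dists)

    clf = SVC(kernel="precomputed").fit(K, y)

    # The support vectors are the training points that pin down the boundary.
    print("number of support vectors:", len(clf.support_))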
Just as a simple example (stolen from the Caltech course, which I highly recommend): if you look at points on a plane that form a circle and try to separate them with a line, you're going to fail. I.e. your points are (x, y), and those inside the circle are your fake dollar bills and those outside aren't. But you can take all these points and apply a non-linear transform, e.g. (x, y, xy, x^2 y, y^2 x, x + y, x^2 + y^2), you get the idea... It turns out that now you can separate the data into what's inside and outside the circle. The problem is that you've just increased the so-called VC dimension of the model, and you might overfit the data and not learn anything. SVMs let you use an infinite number of such combinations, without overfitting and with cheap calculations... Pretty neat.
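Here's that circle example in code, if anyone wants to play with it (scikit-learn's make_circles standing in for the fake/real bills, kernel parameters left at their defaults):

    from sklearn.datasets import make_circles
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # Points on a plane: the inner circle is the "fake bill" class.
    X, y = make_circles(n_samples=400, factor=0.4, noise=0.05, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # A straight line can't separate inside from outside...
    linear = SVC(kernel="linear").fit(X_train, y_train)
    # ...but a non-linear kernel (which implicitly includes terms like
    # x^2 + y^2, among infinitely many others) can.
    rbf = SVC(kernel="rbf").fit(X_train, y_train)

    print("linear:", linear.score(X_test, y_test))   # roughly chance
    print("rbf:   ", rbf.score(X_test, y_test))      # near-perfect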