I’m admittedly baffled by this article, and I’ve also skimmed the arxiv paper. I’ve formally studied music theory and Bach more than the average person.
First, I’m not sure why the author has chosen to represent music as a graph showing the relationship between individual notes. Usually we analyze Bach in terms of harmony and the relationship between each chord. In fact, we don’t even care about chord names, we just care about the relative differences, which is why we use Roman numerals to analyze instead of note names.
I think there is something interesting in using a graph to model voicings, but the example here seems to ignore the very methods Bach used to compose.
To that end, we very much already know what Bach does to create tension and release or surprise the listener. It’s in the name: “deceptive cadence”.
So I would have loved for this to have confirmed - or refuted! - something we already know. For example, to quantify for us why a V-vi cadence feels deceptive. But we never get there. We have a set of graph techniques applied to Bach’s music without much discussion of why we believe that way of modeling makes sense.
> I’m not sure why the author has chosen to represent music as a graph showing the relationship between individual notes.
I think this can easily be explained: you only get notes from the data. Chords etc. are "cultural" constructs: names for certain sets of relationships between individual notes.
So, as you said, we should expect their technique of analysis to reproduce our notions. For example, I would assume that the note transitions primarily happen inside the key of a piece, or that comparing the note transitions between different voices reveals certain patterns ("counterpoint"). And it would be interesting to connect their data space to our existing framework.
But that's a lot of additional work that might not be necessary for certain goals: If the goal is to see if their technique is able to answer certain questions (such as differentiating styles of composition), that's already good enough.
My general assumption would be: if their data space or technique of representation is rich enough to reconstruct the original data, it is just as good in terms of what we can answer. The question is then: how conveniently can we answer our questions using their representation?
I'd disagree with your first sentence, which is that the data only contains notes. The original score - the data - also includes note durations, which notes sound simultaneously, and the first and last notes of the composition. With the model in their paper, all of that information is lost (or ignored).
That is, I do not believe that your assumption is true, that the original data could be reconstructed from their representation.
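To make the reconstruction point concrete, here is a minimal sketch. It assumes a simplified model, a directed note-transition graph with edge counts and no durations, which may not match the paper's exact construction. The function name `transition_counts` and the two toy melodies are made up for illustration.

```python
from collections import Counter

def transition_counts(melody):
    """Count directed transitions between consecutive notes (durations ignored)."""
    return Counter(zip(melody, melody[1:]))

# Two different toy melodies (hypothetical, not taken from any Bach piece).
melody_a = ["C", "D", "C", "E", "C"]
melody_b = ["C", "E", "C", "D", "C"]

graph_a = transition_counts(melody_a)
graph_b = transition_counts(melody_b)

print(melody_a != melody_b)  # the melodies differ
print(graph_a == graph_b)    # yet their weighted transition graphs are identical
```

Under this assumed model, the graph alone cannot tell you which of the two melodies produced it, so the original note order is not recoverable.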
I am theoretically open to the idea that perhaps we don't need all of the original information from the score in order to answer certain questions about the music. But I am skeptical that this particular analysis will yield anything fruitful.
After reading Section A.1 (Data Collection and Network Construction), I wonder to what degree their analysis depends on MIDI encoding practices. For example, for a keyboard fugue, you could use one channel for the entire piece, one channel per voice, or one channel for each hand, with the voices moving between the two as a human player would play them. Apart from rare corner cases (especially in the WTCs), they would sound exactly the same, but I think their method generates very different networks for them.
And MIDI files might be a recording of an actual performance (with all ornaments played out, chords arpeggiated to varying degrees, etc.), or try to capture an urtext edition as accurately as possible (leaving much to interpretation).
You've - accidentally - hit on one of the biggest challenges I've been struggling with for pianojacq. Its input is MIDI files, but MIDI files are quite lossy with respect to the original text as written, especially when it comes to ornaments. I'd like to recover those automatically to show them 'properly', but so far I haven't found a really good and reliable way to do this. Any ideas to get me unstuck would be most appreciated.
Bach is exactly on the cusp between linear counterpoint, which didn't use functional harmony, and later classical music, which did.
Bach did not agree with Rameau's take on harmony. You can see this in the Essay on the True Art of Playing Keyboard Instruments by his son CPE Bach, which is a decent distillation of Bach Sr's approach.
There are vertical arrangements that can be recognised as chords and modulations, but there's also a lot of horizontal shaping in the lines between them.
This is how you get fugues, canons, linear inversions, retrogrades, and so on - none of which can be analysed using simple chord theory.
There is far, far more going on than the occasional use of what we'd now call a deceptive cadence.
> In fact, we don’t even care about chord names, we just care about the relative differences, which is why we use Roman numerals to analyze instead of note names.
Btw, that's a pretty modern thing and requires equal temperament to really work.
Roman numerals can describe scale degrees and harmonic function in any tuning system. They do not require equal temperament to "work"; they work just as well in just intonation.
The numerals describe the degree/chord's relationship to the tonic. The difference is that in just intonation, you cannot switch which degree/chord is tonic (Roman numeral I) without re-tuning, while in equal temperament you can switch freely.
The Roman numeral notation did come a little while after Bach, but I believe the concepts the notation represents are older (though some of the concepts may have been more implicit in the past).
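As a rough illustration of "relationship to the tonic", here is a small sketch that labels chord roots by their distance from the tonic. It assumes diatonic triads in a major key and sharp-only spellings. The names `roman_numeral` and `analyze` are invented for this example, and the semitone arithmetic is only a labelling device that says nothing about how the notes are tuned.

```python
PITCH_CLASSES = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
                 "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

# Semitones above the tonic -> diatonic triad numeral (major key assumed).
MAJOR_KEY_NUMERALS = {0: "I", 2: "ii", 4: "iii", 5: "IV",
                      7: "V", 9: "vi", 11: "vii°"}

def roman_numeral(chord_root, tonic):
    """Label a chord root by its distance above the tonic."""
    offset = (PITCH_CLASSES[chord_root] - PITCH_CLASSES[tonic]) % 12
    return MAJOR_KEY_NUMERALS[offset]

def analyze(roots, tonic):
    """Label a sequence of chord roots relative to one tonic."""
    return [roman_numeral(root, tonic) for root in roots]

print(analyze(["G", "A"], tonic="C"))  # G major to A minor in C major
print(analyze(["D", "E"], tonic="G"))  # D major to E minor in G major
```

Both progressions come out as V followed by vi: the chord names differ, but the relationship to the tonic is the same.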
> In fact, we don’t even care about chord names, we just care about the relative differences, which is why we use Roman numerals to analyze instead of note names.
I mostly reacted to this sentence.
Yes, I agree with the gist of what you are saying.
Agreed. In the very first example (fig 1), a G-B chord going to A-C has the network showing edges from G to both A and C, and from B to both A and C, whereas in music we look at G going to A and B going to C (just two moves). I.e., it makes more sense to treat this as two edges, not four.
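A quick sketch of the difference, using the G-B to A-C example. The all-pairs version is my reading of fig 1 as described above, not the paper's actual code. The voice-leading version assumes both chords list their voices in the same order, and both function names are made up.

```python
from itertools import product

def all_pairs_edges(chord_from, chord_to):
    """Connect every note of one chord to every note of the next."""
    return set(product(chord_from, chord_to))

def voice_leading_edges(chord_from, chord_to):
    """Connect each voice only to its own next note (same voice order assumed)."""
    return set(zip(chord_from, chord_to))

first, second = ("G", "B"), ("A", "C")

print(sorted(all_pairs_edges(first, second)))      # four edges
print(sorted(voice_leading_edges(first, second)))  # two edges: G to A, B to C
```

The gap widens quickly with more voices: two four-note chords give 16 all-pairs edges against 4 voice-leading edges.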