I remember the confused used CD store clerk when I traded in Metallica's "Kill 'Em All" and purchased a copy of Metallica's "Kill 'Em All".
Some versions have two extra cover songs, Am I Evil and Blitzkrieg. The CD's identification number is identical on both, just to be extra confusing. I showed him, because of course he asked why I was picking up a CD I just traded in, and he was dumbfounded. He'd never seen anything like that.
> The The, so much for skipping articles like "A" or "The" I guess.
There was this very old meme image of someone searching for "The Who" on Google, and above the results it said "'The' is a common word and has been removed from your search query" / "'Who' is a common word and has been removed from your search query".
One of the common message board packages (forget which one) by default ignores/blocks ALL search terms of 3 letters or less. Incredibly annoying on a technical forum that uses lots of acronyms.
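To make the failure mode concrete, here is a minimal sketch of that kind of filter. The stopword list, the 3-letter cutoff and both function names are assumptions for illustration; they are not taken from Google or from any real forum package.

```python
# Hypothetical search preprocessing. STOPWORDS and MIN_LENGTH are made up
# for illustration and do not come from any real search engine.
STOPWORDS = {"a", "an", "the", "who", "of", "and"}
MIN_LENGTH = 4  # terms of 3 letters or fewer get dropped


def naive_terms(query: str) -> list[str]:
    """Drop stopwords and short terms unconditionally."""
    terms = query.lower().split()
    return [t for t in terms if t not in STOPWORDS and len(t) >= MIN_LENGTH]


def safer_terms(query: str) -> list[str]:
    """Same filter, but keep the raw terms if filtering would remove everything."""
    terms = query.lower().split()
    kept = [t for t in terms if t not in STOPWORDS and len(t) >= MIN_LENGTH]
    return kept or terms


for query in ("The Who", "The The", "TCP vs UDP"):
    print(query, "->", naive_terms(query), "vs", safer_terms(query))
```

The only difference is the fallback on the last line of `safer_terms`: an empty filtered query is treated as a sign that the filter was wrong, not the user.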
Anything in a foreign language seems to be impenetrable to Google Play.
It's also got a goldfish brain so when you reject an incorrect guess it'll make the exact same guess next time. Doesn't seem hard with an entire datacentre of ML servers to go "hmm, this was wrong 5 seconds ago, maybe it's still wrong now".
But then again this is how I imagine it works, so maybe it's just waiting for the new previous query service to be released: https://youtube.com/watch?v=y8OnoxKotPQ
Back in the early to mid 00s I did quite a lot of work with R, the well known statistical software.
There were some pretty specific techniques I had to use to google how to achieve things with R at the time - basically knowing where the mailing list archives were and restricting my search to that domain. Some time later google started using quite a lot of R internally, and suddenly I no longer had that problem.
Another fairly high profile one is Taylor Swift re-recording all her old songs and re-releasing them. When someone asks Alexa to play "Love Story" should they play the one that gives more royalties to Taylor or to some music publisher? If someone wanted to rank songs by popularity, you probably shouldn't treat the versions as different. Etc, etc.
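One way to think about it, as a rough sketch: keep "the song" and "a specific recording of it" as separate things, rank on the former and play the latter. Everything below (the `Recording` fields, the IDs, the play counts, the preference rule) is hypothetical and only there to illustrate the split.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    recording_id: str   # one specific master
    work_id: str        # the underlying song, shared by every version
    title: str
    rights_holder: str


# Hypothetical catalogue rows and play counts, for illustration only.
recordings = [
    Recording("rec-1", "work-love-story", "Love Story", "original label"),
    Recording("rec-2", "work-love-story", "Love Story (Taylor's Version)", "artist"),
    Recording("rec-3", "work-other", "Some Other Song", "original label"),
]
plays = {"rec-1": 40, "rec-2": 70, "rec-3": 90}


def popularity_by_work(recordings, plays):
    """Rank songs by summing plays over every recording of the same work."""
    totals = Counter()
    for rec in recordings:
        totals[rec.work_id] += plays.get(rec.recording_id, 0)
    return totals.most_common()


def pick_recording(work_id, recordings, preferred_holder):
    """Playback needs one concrete recording; which one is a policy choice."""
    candidates = [r for r in recordings if r.work_id == work_id]
    preferred = [r for r in candidates if r.rights_holder == preferred_holder]
    return (preferred or candidates)[0]


print(popularity_by_work(recordings, plays))
print(pick_recording("work-love-story", recordings, "artist").title)
```

The code doesn't answer the royalties question, it just moves it into one explicit parameter instead of leaving it to whichever row the search happens to hit first.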
Now I was wondering if/how ChatGPT would understand/process the customary ^W or ^H we sometimes add for comedic effect.
Excerpt of a 3.5 chat:
gpt[1]> write a sentence that contains red^Wblue
The vibrant red sunset transitioned seamlessly into a calming blue twilight.
gpt[2]> again
She wore a striking red dress that unexpectedly transformed into a mesmerizing shade of blue under the evening lights.
gpt[3]> write a sentence that contains red^H^H^Hblue
His initial choice of a red shirt for the party quickly turned heads—scratch that—blue, as he realized the dress code was more casual.
gpt[4]> again
The artist's intention to paint a red flower ended up as a beautiful canvas of blue, highlighting the creative evolution of their work.
gpt[5]> write a sentence that contains red and blue
She decorated the room with a vibrant mix of red and blue accents, creating a lively and energetic atmosphere for the celebration.
Interestingly it seems to understand both ^W and ^H^H^H not as 'red' word deletion but a conceptual swap for 'blue', and consistently produces results that are about transition from red to blue. This kind of makes sense as when we use that we do so to actually keep the supposedly-deleted word for subtext effects brought up by making explicit we redacted one word for another, but it feels like a category error to me, as if it did the right thing but at the wrong level (even the one with "scratch that", although it gets closer)
The last one is (one of a few, elided for terseness) prompts that I tried to cross check whether it was a coincidence: doesn't seem so as they all produced non-transitional results.
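For contrast, this is roughly what the literal, terminal-level reading would do. It's a minimal sketch assuming ^H deletes the previous character and ^W deletes the previous whitespace-delimited word; `apply_control_notation` is a name I made up.

```python
def apply_control_notation(text: str) -> str:
    """Apply ^H (delete previous character) and ^W (delete previous word) literally."""
    out: list[str] = []
    i = 0
    while i < len(text):
        if text.startswith("^H", i):
            if out:
                out.pop()
            i += 2
        elif text.startswith("^W", i):
            # Drop any trailing whitespace, then the word before it.
            while out and out[-1].isspace():
                out.pop()
            while out and not out[-1].isspace():
                out.pop()
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(apply_control_notation("red^Wblue"))
print(apply_control_notation("red^H^H^Hblue"))
print(apply_control_notation("a red^Wblue car"))
```

At this level 'red' simply disappears, which is the "right level" reading; the model answers at the level of what the writer meant by showing the deletion.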
gpt4 seems to understand what backspacing means, but the examples feel quite bad, honestly.
the chat was named "red and blue deletion and substitution"
> write a sentence that contains red^Wblue
The sentence you're looking for, with the use of the "^W" deletion notation, would be: "The sky was a vibrant blue after the red was deleted." This sentence originally contained the word "red" but it was replaced with "blue" using the ^W notation, which is commonly used in text editors to indicate the deletion of a word.
> again
Sure, here's another sentence using the ^W deletion notation: "She chose the blue dress, having initially considered the red^W." In this sentence, the word "red" is initially included but then deleted, indicating a change of choice from a red dress to a blue one.
> write a sentence that contains red^H^H^Hblue
Certainly! Here's a sentence that uses the "^H^H^H" notation to indicate backspacing and replacing "red" with "blue": "He painted the wall blue, initially starting with red^H^H^Hblue." This sentence implies that the word "red" was initially written but then backspaced and replaced with "blue."
> again
Of course! Here's another example: "The balloon was blue, not red^H^H^Hblue." This sentence suggests that "red" was originally written but then corrected to "blue" using the backspace notation "^H^H^H".
> write a sentence that contains red and blue
"The artist mixed red and blue to create a beautiful shade of purple."
There was a Hard Times article that had one of my favorite musical jokes, it was something along the lines of "Stoner Metal Band Writes Hour Long Song in Only 5 Minutes".
What does this music sound like? Idk, it was procedurally generated by my M3 max last night, has a track length of 9001 hours, and I uploaded it to Soundcloud this morning. Don't have time to listen because I'm driving to work, let me know in the comments.
Ads are embedded IN the track, I got paid by several advertising networks for this. Soon this devious method of advertising becomes mainstream, and is known as an "Advertisement Rickroll" or a Ricky Ad for short...YouTube has already been preparing them for this behavior anyway!
There's a piano piece [1] which is quite short but the original indication by the composer is to repeat it more than a thousand times[2], meaning performances take more than 24 hours when they happen, which is understandably very seldom.
There's also things like John Cage's "As Slow as Possible", which there is a performance of ongoing now. It started in 2001 and you may want to rush to catch the end of it because it finishes in the year 2640.
[1] Maybe by Ravel - my memory and a quick googling are failing me right now
[2] It's some special slightly arbitrary-sounding number like 2347 or something
Not sure if that fits, but maybe the piano piece is the Philip Glass album "Solo Piano", which is a variation on the same base melody throughout the whole album?
It’s not that one but Glass and Cage were both very influenced by it. This is going to really bug me now. The piece is not very well known but it’s by a very well-known composer either Ravel or Satie I’m almost sure.
It’s fewer repetitions than I had remembered at a mere 840.
When John Cage, David Tudor and others gave the first public performance, it took over 18 hours and they gave a refund to audience members of 5c for each twenty minutes they stayed for.
He's not necessarily talking about this, but titles can also be very long. I think of the Sufjan Stevens song "The Black Hawk War, or, How to Demolish an Entire Civilization and Still Feel Good About Yourself in the Morning, or, We Apologize for the Inconvenience but You're Going to Have to Leave Now, or, 'I Have Fought the Big Knives and Will Continue to Fight Them Until They Are Off Our Lands!'" which taught me, as a young script kiddie, that FAT32 has a total path length limit of 255 characters.
Ever since I started listening to emma essex's music I have found just how half-baked Unicode handling is, even in current year.
Some of my favorite examples are:
"⎆", by "HHSU 𓃚 𝕮𝖆𝖒𝖇𝖎𝖚𝖒, 𝕏𝕪𝕝𝕖𝕞, 𝓗𝓮𝓪𝓻𝓽𝔀𝓸𝓸𝓭", from the album "𝅙𝅙" (U+1D159);
"♫♫♩♫‿♩ but it's 怒首領蜂 大往生";
"rtrc{(''»'')²2}:≞(''»''01);";
This got me to look up a UTF-8 to unicode code point command line tool, which it turns out is "uconv -x hex/unicode". The first one looks like mostly mathematical alphanumeric symbols:
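For anyone without uconv handy, roughly the same lookup can be sketched with Python's standard `unicodedata` module. `describe` is a helper name I made up, and the sample string is just the "Cambium" part of the artist name above.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) pairs for every character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


sample = "𝕮𝖆𝖒𝖇𝖎𝖚𝖒"
for code_point, name in describe(sample):
    print(code_point, name)

# NFKC compatibility normalization folds the mathematical letters back to
# plain Latin ones, which is handy for building a search key.
print(unicodedata.normalize("NFKC", sample))
```

The NFKC step only helps with characters that have a compatibility mapping; the hieroglyph and the musical symbol in the other names have no plain-text equivalent to fold to.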
Assuming a music library use case, one could address it with UI alone (making the method of grouping under the hood an implementation detail) with “merge artist” and “merge album” functions surfaced in places like the context menu that appears when right-clicking an artist/album.
You just have to have a binary ID and maybe "pronunciation" fields too, for each song/album/artist entity in the database schema, which the UI sanitizes/normalizes for searches and loose matches.
It's not worth dealing with regularization and mapping rules between ASCII and whatever artists insist on using, like slightly different style of T used only in Western East Antarctic. It'll be just a String whatever.
That way each file may keep whatever emojis are used for any specific distribution channel without trouble.
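A minimal sketch of that idea, assuming an in-memory store: the display name is kept verbatim, identity is an opaque ID, and only the search key gets normalized. `Artist`, `Library` and `search_key` are hypothetical names, and the particular normalization steps are my choice, not a standard.

```python
import unicodedata
import uuid
from dataclasses import dataclass, field


def search_key(name: str) -> str:
    """Loose-match key: NFKC-fold, strip combining marks, casefold, collapse spaces."""
    folded = unicodedata.normalize("NFKC", name)
    decomposed = unicodedata.normalize("NFD", folded)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


@dataclass
class Artist:
    display_name: str        # stored verbatim, emojis and all
    pronunciation: str = ""  # optional human-entered reading
    id: uuid.UUID = field(default_factory=uuid.uuid4)  # opaque identity


class Library:
    def __init__(self) -> None:
        self.by_id: dict[uuid.UUID, Artist] = {}
        self.index: dict[str, set[uuid.UUID]] = {}

    def add(self, artist: Artist) -> None:
        self.by_id[artist.id] = artist
        # Index both the verbatim name and the pronunciation under loose keys.
        for text in (artist.display_name, artist.pronunciation):
            if text:
                self.index.setdefault(search_key(text), set()).add(artist.id)

    def find(self, query: str) -> list[Artist]:
        return [self.by_id[i] for i in self.index.get(search_key(query), set())]


library = Library()
library.add(Artist("K Á R Y Y N", pronunciation="Karyyn"))
print(library.find("karyyn"))
```

Nothing here ever rewrites `display_name`, so a rename or a restyled spelling is just another field update on the same ID.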
I "solved" this years ago by (ab)using the artistsort key for this. Regardless of how the artist name is spelled, their tracks are listed sequentially. But my player uses a flat list, not a hierarchical structure.
All of these "band is changing name" edge cases are solved by giving entities IDs internal to the dataset. Not even sure it's called an edge case at this stage.
Autechre's "Untilted" is my favorite deliberately-confusing album title. It's mentioned in this article, but when rendered in small text, it's not obvious even when you're looking at it that the title is not "Untitled".
One of my favorite artists releases music on YouTube with no name and no titles via some special Unicode. Only discoverable via YouTube’s recommendation algorithm magic.
Yeah, as someone that knows little to nothing about music theory, I was hoping to learn something interesting along those lines, even if I admit it's unlikely to be useful to me. Instead it's about storage schemas and correctly mapping domain values.
Which, honestly, is more useful to me, but also I find it much less interesting because it's well trod ground.
Yeah, me too. I'm no music expert, but this is what I thought of off the top of my head:
- There is a rule for determining the order in which sharps/flats are added in key signatures, and which keys result. However, the simplest key signature with flats (just one flat) is an exception to the part of the rule that tells you the key.
- Equal temperament is a reasonable approximation to other intonation systems in most cases. The barbershop quartet sound is an exception, because they form those famous chords using just intonation.
"Braxton often titles his compositions with diagrams or numbers and letters. Some diagrams have a clear meaning or signification, as on For Trio, where the title indicates the physical positions of the performers. The titles can themselves be musical notation indicating to the performer how a piece is played. Some letters are identifiable as the initials of Braxton's friends and musical colleagues, but many titles remain inscrutable to critics. By the mid-to-late 1980s, Braxton's titles began to incorporate drawings and illustrations, as in the title of his four-act opera cycle, Trillium R. Others began to include lifelike images of inanimate objects such as train cars, which were most notably seen after the advent of his Ghost Trance Music system. Braxton settled on a system of opus-numbers to make referring to these pieces simpler, and earlier pieces have had opus-numbers retroactively added to them."
None of this seems too terrible to handle. The title of the song can be an empty string, so use *string/Maybe string instead of string to distinguish between unset and "some idiot made the song literally untitled". I am surprised there are no songs that use characters only available in two different character sets, or are the same Chinese character repeated with multiple different writings (Simplified Chinese, Traditional Chinese, Korean, Japanese).
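In Python terms the `*string`/`Maybe string` idea is just `Optional[str]`. A minimal sketch, with a hypothetical `Track` type and placeholder labels:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Track:
    # None -> nobody has entered a title yet
    # ""   -> the track was deliberately released with an empty title
    title: Optional[str] = None


def display_title(track: Track) -> str:
    if track.title is None:
        return "[unknown title]"
    if track.title == "":
        return "[untitled]"
    return track.title


print(display_title(Track()))
print(display_title(Track("")))
print(display_title(Track("Untilted")))
```

The thing to avoid is `if not track.title`, which quietly merges the two cases again.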
I suppose that some people are running into trouble by using CHARACTER VARYING(n) instead of TEXT in their database, but if you pick the most likely database for your local music player, then it doesn't matter. (sqlite doesn't care, basically.)
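The sqlite part is easy to check with the standard `sqlite3` module. The table and column names are made up, and this assumes an ordinary (non-STRICT) table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The declared length is accepted syntactically but not enforced by SQLite.
conn.execute("CREATE TABLE tracks (title VARCHAR(10))")

long_title = "x" * 300  # stand-in for a very long song title
conn.execute("INSERT INTO tracks (title) VALUES (?)", (long_title,))

(stored,) = conn.execute("SELECT title FROM tracks").fetchone()
print(len(stored))
```

The full 300 characters come back. A database that does enforce `CHARACTER VARYING(n)` would either reject the insert or truncate, depending on the engine and its settings.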
The song that's named after a virus test signature is amusing. I guess encrypt your filenames and database so that the OS can't pry into the details. Since you hopefully didn't write your music player in C, there should be no way that a virus can affect your music player.
(While I was researching this post, I found an interesting Defcon talk about what happens when you turn the EICAR virus signature into a QR code and scan it. https://www.youtube.com/watch?v=cIcbAMO6sxo A lot of stuff breaks! I almost got forced into adding similar breakage. At work we make what is essentially a database, and we got acquired by A Large Company, which had to do a security audit of our software. One of the "security vulnerabilities" in our product was that we don't scan the data in the database for viruses. We pushed back to not do this, because one of our customers literally uses our system to store viruses they analyze. But, it appears that a lot of other teams got this same suggestion and just made their price checker or passport validator crash when the barcode resolves to a virus. Easier than input validation or not calling `system(whatever the barcode resolved to)` I guess?)
I’m surprised they only listed Prince for the Love Symbol album and not because he changed his stage name to that symbol as well (hence being known as The Artist Formerly Known as Prince (TAFKAP)).
Another couple of examples I could think of are 4'33" by John Cage – a silent track [0]; and Taylor Swift re-recording all her albums [1].
From The Normal, the artist that started Mute, to the label’s newest signing, K Á R Y Y N, STUMM433 will feature a huge array – over 50 – of Mute artists including Depeche Mode, Yann Tiersen, Liars, Silicon Teens, Irmin Schmidt and many more, all presenting their own interpretation of one piece of music: John Cage’s game-changing composition, 4’33”.
yes. we should resist the tech industry's relentless binary categorization of everything. "oh no! i couldnt find it in the dropdown! what ever shall i do?"
Not sure how this is related to reinventing a CMS... I've been successfully practicing this absence of taxonomy on my private files, they all now reside within one giant directory regardless of type. I usually throw in some useful keywords in the filename and so far full text search has allowed me to find everything in a fraction of a second.
i mean, obviously people organize stuff. but my point was about how computer interfaces are digital, and now we have a big industry that propagates digital interfaces, and their type of categorization is seeping into our day-to-day like never before, and i think it affects how people tend to think. and i don't like it.
Before computers/internet that same categorization was always there. You just didn't know it existed, or couldn't do much about it.
If your shop didn't carry a specific recording version (because it wasn't in their categories), you wouldn't know about it, or would have to spend significant time (and resources) to track it down.
Before you could record music, people categorised everything else. The earliest library classification system we know of comes from the Library of Alexandria, 300 years before Common Era.
At every step of the way you were at the whims of the people categorising these things.
Now, as we carry supercomputers in our pockets, people search for and expect computers to immediately retrieve more information than ever before.
So if a person wants a 7", 45 RPM, Single Sided, Card Backed, Audio Postcard recording of Janis Joplin published by Pracownia Pocztówek Dźwiękowych R. Serafin [1], that person now reasonably expects to be able to find this recording.
How do you expect this to work without "categorisation that the bad tech is imposing on us"?
It's only linked from the site, so I imagine it'll get missed by many, but all of the tracks on one of the "purposely confusing names" albums are variations on "Hey Gugle Play Quiet Music" [1].
Except that I have no sort of musical talent, I once had a dream of forming a band called "Void Star" where each track name was either unrepresentable under the shortcomings of various systems (embedded NULs, forward slash, megabytes long, ...) or some kind of attack (injection, entity explosion, DOS, ...)
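The defensive side of that, as a rough sketch: keep the real title in the metadata and derive the filename separately. `safe_filename` is a made-up helper, the 255-byte limit is an assumption (check your filesystem), and this only covers the "unrepresentable" half, not the injection attacks.

```python
import hashlib

MAX_NAME_BYTES = 255  # assumed per-component limit; varies by filesystem


def safe_filename(title: str, extension: str = ".flac") -> str:
    """Derive a filename that avoids NULs, path separators and overlong names."""
    cleaned = title.replace("\x00", "").replace("/", "_").replace("\\", "_")
    cleaned = cleaned.strip() or "untitled"
    if cleaned in {".", ".."}:
        cleaned = "untitled"
    # A short hash of the original title keeps distinct titles from colliding
    # after cleanup and truncation.
    digest = hashlib.sha256(title.encode("utf-8", errors="replace")).hexdigest()[:8]
    suffix = f"-{digest}{extension}"
    budget = MAX_NAME_BYTES - len(suffix.encode("utf-8"))
    # Truncate on a byte budget; a character cut in half is dropped on decode.
    stem = cleaned.encode("utf-8", errors="replace")[:budget].decode("utf-8", errors="ignore")
    return stem + suffix


print(safe_filename("Void\x00Star/Track 1"))
print(len(safe_filename("é" * 10_000).encode("utf-8")))
```

The hash suffix is there because cleanup is lossy: "x" and "x/" would otherwise be one edit away from overwriting each other.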
Just a warning, I clicked on the virgin killer link to Wikipedia. I wish I hadn't done that. That album cover could be illegal in the UK where I'm based. I'm genuinely worried.
I'm now trying to figure out how to remove the image from android Firefox history. Not sure if deleting history will remove it
The virgin killer wikipedia page is what exposed the IWF's internet censorship about 15 years ago, since it was causing all UK wikipedia access to go via a transparent proxy to 'seamlessly' remove that page from the site. After a short outcry they had to remove the wikipedia block for the page.
The thing that scares me is that although the IWF backed down, they aren't saying it's legal. I'm genuinely concerned some sort of ISP level monitoring could red flag me for visiting that page.
I would assume that the UK authorities working on protecting children have better things to do than going against citizens clicking on a Wikipedia link and regretting it.
I can imagine even worse edge cases where it's unresolvably ambiguous whether a number of releases are even the same band, or cases where a given set of musicians claim to be many different bands, or they constantly give different titles to the same entities.
Or bands or tracks whose names are unrepresentable in text.
Meanwhile when I wanted to release titles like “Track 191030”, my music distribution company at the time (dirtomusic.com) renamed that to “191030”, quoting their technical standards.
I get that I’m no Prince but this happened without as much as a heads up. I remember being surprised when I saw the change at Apple Music and other streaming services. One of the reasons why my music is now on bandcamp[1].
Title is wrong; it should be "horrible edge cases to consider when dealing with recorded music metadata".
The reason I say this is that I was writing some code for a sequencer that outputs musical score recently (basically glue between perl and lilypond), and that stuff has more edges than a suitcase full of razorblades making the task prohibitive to complete - good enough to provide something that can be made good with manual intervention without too much effort though.
> My favourite: a band named brouillard, with a single member called brouillard, whose every single album is named brouillard, and of course, so is every single track.