American young person here: I do the glottal stop thing with "button" but use a flap [ɾ] in "butter." I think it has to do with the "n" - I would also do the glottal stop for "bitten" but not for "bitter." For me the second syllable is a pure syllabic /n/ - no preceding vowel or consonant.
Voiceless consonants tend to become voiced when they're between two vowels. Boise natives hypercorrect to say Boy-see, when most folks from outside the area say Boy-zee.
Also originally a Midwesterner. I've noticed that I and others with the accent also put a compensatory breathy h on the end of words like Dakota, and hence pronounce it "Dakodah". Often this results in a devoicing of that second "a". I've never seen any studies on that though.
The pronunciation "dee-troit" is just local slang meant to sound folksy. It's used by sports announcers and singers, and for comedic value. Most of the time people in the area pronounce it the same way as the rest of the country pronounces it.
I think you're right there. But going back, there was also a more 'neutral' pronunciation, "deh-troit". I believe that from there we started getting "duh-troit", with the first vowel reduced to a schwa[1]. Again, I agree that there's a newer "DEE-troit", which is kind of like drawing a line from "duh-troit" through "deh-troit" and continuing on until you get to "DEE-troit".
The typical American "nearly silent" one you are describing tends to be more of a flapped /ɾ/, by the way. <d> is often the same.