RSS’ death is real - 15 years ago, almost every news site had an RSS feed, some had several. Today? An RSS feed is rare.
So if you want to build a news feed from news sites, you have to parse their HTML, and ofc every site has its own structure. JS-powered sites are the painful ones.
15 years ago, almost every news site had an RSS feed, some had several. Today? An RSS feed is rare.
It may be a reflection of where you get your news.
New York Times, Washington Post, Wall Street Journal, Radio Free Europe, Mainichi, and lots of other legitimate primary source Big-J journalism news sites have RSS.
Rando McRepost's AI-Generated Rehash Blog? Not so much.
I don't know, I also only use RSS (with the exception of Reddit I think) so I would not even notice a website that a) provides content I want to get notified about and not actively visit for a reason and b) has no feed.
It is somehow less funny today but in the 90's we would say "is there something wrong with your hands?"
A truly funny story: I wrote an RSS aggregator and one day I discovered some feeds had died without me noticing. I looked at the feed, it was gone; I looked at my aggregate and the headlines were all there?!?!
Since I gather a lot of feeds I couldn't help but notice that a very large number of them aren't well-formed. For example, in XML attributes the & (in URLs) is supposed to be &amp;, but if you do that, many aggregators won't be able to parse it.
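To make the & point concrete, here is a minimal sketch using Python's standard-library XML parser. The `enclosure` element and the example.com URL are just illustrative placeholders, not taken from any real feed.

```python
import xml.etree.ElementTree as ET

# Raw "&" inside an attribute value: not well-formed XML.
bad = '<enclosure url="https://example.com/a?x=1&y=2" />'
# Escaped as "&amp;": well-formed, and the parser hands back a plain "&".
good = '<enclosure url="https://example.com/a?x=1&amp;y=2" />'

def is_well_formed(xml_text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def attribute_value(xml_text, name):
    """Parse the element and return one attribute, already unescaped."""
    return ET.fromstring(xml_text).get(name)
```

A strict parser rejects `bad` and accepts `good`, so an aggregator that chokes on the escaped form is most likely not unescaping attribute values itself.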
Every other month I wrote little bits of code to address the most annoying issues.
1) if I can't find a <link> or <guid> etc. I eventually just gather <a>'s and take the href.
2) if I really can't find a title for the item I have it fall back on whatever is in the <a>, since I was gathering those anyway.
3) if I can't even find an <item> I just look for the things that are supposed to go in the <item>.
4) if I can't find a proper time stamp I'll try to parse one out of the URL.
5) if the URLs are relative paths, complete them.
What was actually going on: the feed was gone and its URL redirected to the home page. In an attempt to parse the "xml", my code eventually resorted to gathering the URL and title from the <a>'s and building valid time stamps from the URLs.
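For anyone curious how a home page ends up looking like a feed, here is a rough sketch of that last-resort path (steps 1, 2, 4 and 5) in standard-library Python. This is not the original aggregator's code: the names `AnchorCollector`, `date_from_url` and `fallback_items`, the `/YYYY/MM/DD/` URL pattern, and the choice to drop links without a date are all assumptions made for illustration.

```python
import re
from datetime import date
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed pattern: dates embedded in paths like /2024/03/15/
DATE_IN_URL = re.compile(r"/(\d{4})/(\d{1,2})/(\d{1,2})(?:/|$)")

class AnchorCollector(HTMLParser):
    """Collect (href, text) pairs from every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def date_from_url(url):
    """Step 4: try to parse a time stamp out of the URL."""
    match = DATE_IN_URL.search(url)
    if not match:
        return None
    try:
        return date(*map(int, match.groups()))
    except ValueError:  # e.g. /2024/13/45/ is not a real date
        return None

def fallback_items(markup, base_url):
    """Build feed-like items from <a> tags when no <item> can be found."""
    collector = AnchorCollector()
    collector.feed(markup)
    collector.close()
    items = []
    for href, text in collector.links:
        stamp = date_from_url(href)
        if text and stamp:  # assumption: skip links with no title or date
            items.append({
                "title": text,                    # step 2: title from the <a>
                "link": urljoin(base_url, href),  # step 5: complete relative URLs
                "date": stamp,
            })
    return items
```

Point something like this at a news home page and every dated article link comes out as a plausible item, which is exactly why the dead feeds went unnoticed.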
Mistral actually used to serve a feed up until 6ish months ago, I guess? Their admin console used to be built with HTMX too, which I found kinda interesting.
Now the news site and admin console are all in Next.js, slow, and there's no feed.