In a previous post, I mentioned that I asked Claude to collect RSS feeds for US newspapers. Here was the prompt:
Create a list of RSS feeds and save them using the following steps:
* Review all of the newspapers linked from this page (https://en.wikipedia.org/wiki/List_of_newspapers_in_the_United_States ), identify if they have a website, and make a list of the RSS feeds available from the website.
* Identify which newspapers do not have RSS feeds
* Identify which newspapers do not have websites
* Create a OPML subscription list of all RSS feeds identified and save this file to C:\Users\sylve\Documents\Claude_Projects\News_Archive
* Create a text file that lists the newspapers that do not have RSS feeds and save this file to C:\Users\sylve\Documents\Claude_Projects\News_Archive
* Create a text file that lists the newspapers that do not have websites and save this file to C:\Users\sylve\Documents\Claude_Projects\News_Archive
* Add the OPML and text files created to a new repository in my Gitlab account called news-archive
The result of this prompt was an OPML file with 386 feed URLs, but only 22 of them had any items. I looked at the first five feeds and saw that Claude had taken the root URL of each site and appended “/rss/” to it. There was nothing at those URLs. Why did Claude do that? Why did it not search each site for a valid feed and, if there was none, add the newspaper to the list of newspapers that do not have RSS feeds? If anyone has any insight into this, I would love to hear it.
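
For comparison, here is a minimal sketch of what "searching the site for a valid feed" could look like, as opposed to guessing a "/rss/" path. It assumes the site advertises its feed with a `<link rel="alternate">` tag in the page's HTML, which many sites do but not all, so a miss here does not prove a newspaper has no feed. The names `FeedLinkParser`, `discover_feeds`, and `count_items` are hypothetical helpers I made up for illustration, and the example.com URL is a placeholder. Fetching the pages over the network is deliberately left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# MIME types commonly used to advertise RSS and Atom feeds in <link> tags.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects the href of every <link rel="alternate"> tag that advertises a feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. <link hidden>), so normalise to "".
        a = {name: (value or "") for name, value in attrs}
        rel = a.get("rel", "").lower().split()
        mime = a.get("type", "").lower().strip()
        if "alternate" in rel and mime in FEED_TYPES and a.get("href"):
            # Resolve relative hrefs such as "/news/feed.xml" against the page URL.
            self.feeds.append(urljoin(self.base_url, a["href"]))


def discover_feeds(html, base_url):
    """Return the feed URLs advertised in the given HTML (possibly an empty list)."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


def count_items(feed_xml):
    """Count RSS <item> and Atom <entry> elements; return 0 if the text is not valid XML."""
    try:
        root = ET.fromstring(feed_xml)
    except ET.ParseError:
        return 0
    # Drop any "{namespace}" prefix so Atom's namespaced <entry> is matched too.
    return sum(1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] in ("item", "entry"))


if __name__ == "__main__":
    # Placeholder page, not taken from any real newspaper site.
    page = '<html><head><link rel="alternate" type="application/rss+xml" href="/news/feed.xml"></head></html>'
    print(discover_feeds(page, "https://example.com/"))
    print(count_items("<rss><channel><item/><item/></channel></rss>"))
```

With something like this, a newspaper would go into the OPML file only if `discover_feeds` returns a URL and `count_items` on the fetched feed is greater than zero. Anything else would go into the "no RSS feeds" text file.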