Using ChatGPT to convert a list of feed URLs to OPML

After seeing Dave Winer’s interest in blogrolls, I decided to spend some time updating my RSS subscription list and, at the same time, learning about using ChatGPT as a programmer’s assistant. I use the River5 feed reader, along with a single-page web app to display my feeds. River5 supports a feed list stored as a text file with one feed URL per line. I wanted to see how well ChatGPT could create an OPML subscription list from this text file.

Here was my first prompt:

Create a Javascript program to run under Node.js. The program should take as input a text file that contains one URL on each line. The program should output an OPML file with the URLs added to the outline such that the file can be uploaded to a feed reader that uses OPML as an input source for subscriptions. In addition, create a package.json file for this program.

This produced an app that ran without errors and created a basic OPML file. Each outline element had the type attribute set to “rss”. However, the text and xmlUrl attributes were both set to the feed URL. I decided to experiment to see how much further ChatGPT could go.
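As a rough illustration of what this first version amounts to (this is a minimal sketch, not the code ChatGPT generated), the conversion can be written with a few small functions. The function names and the default title “Subscriptions” are hypothetical, and the OPML is built by hand rather than with an XML library.

```javascript
// Escape the characters that are special inside XML text and attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Split the feed list into trimmed, non-empty lines (one URL per line).
function parseFeedList(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Build an OPML 2.0 document in which text and xmlUrl are both the feed URL.
function buildOpml(urls, title = 'Subscriptions') {
  const outlines = urls
    .map((url) => {
      const u = escapeXml(url);
      return `    <outline type="rss" text="${u}" xmlUrl="${u}"/>`;
    })
    .join('\n');
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<opml version="2.0">',
    '  <head>',
    `    <title>${escapeXml(title)}</title>`,
    '  </head>',
    '  <body>',
    outlines,
    '  </body>',
    '</opml>',
    '',
  ].join('\n');
}
```

Assuming an input file named feeds.txt (a made-up name), the output could be written with `fs.writeFileSync('feeds.opml', buildOpml(parseFeedList(fs.readFileSync('feeds.txt', 'utf8'))))` after `const fs = require('node:fs')`.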

Here was my second prompt:

Create a Javascript program to run under Node.js. The program should take as input a text file that contains one URL on each line. The program should output an OPML file with the URLs added to the outline such that the file can be uploaded to a feed reader that uses OPML as an input source for subscriptions. The program should also fetch a copy of each URL, and if a feed is available, get the value of the <title> element within the <channel> element and set the text attribute of each entry in the OPML file to the <title> element for that feed/URL. If the feed is not available, set the text attribute of each entry in the OPML file to the URL value. In addition, create a package.json file for this program.

I was more specific here, asking that each feed URL be fetched to get the title of the feed and that the text attribute be set to that title. This version had a small programming error, but I easily fixed it, and the script then ran without a problem. The output OPML file filled in a lot of the titles, but some feeds still had problems (in particular, Blogspot blogs and other blogs using Atom instead of RSS). Also, this version created a separate “_attributes” element within the outline element, which did not match the example OPML file I reviewed.
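The title lookup with a URL fallback that the second prompt asks for could be sketched as below. This is an assumption-laden simplification, not the generated code: it uses a regular expression instead of a real XML parser, it assumes the channel title is the first <title> after <channel>, it leaves XML entities such as &amp; undecoded, and the function names are hypothetical. It also shows why Atom feeds came back without titles: they have no <channel> element.

```javascript
// Return the text of the first <title> that follows <channel>, or null.
function rssChannelTitle(xml) {
  if (typeof xml !== 'string') return null;
  const match = xml.match(/<channel[\s>][\s\S]*?<title[^>]*>([\s\S]*?)<\/title>/i);
  if (!match) return null;
  // Unwrap an optional CDATA section, then trim surrounding whitespace.
  const raw = match[1].replace(/^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/, '$1').trim();
  return raw.length > 0 ? raw : null;
}

// Use the feed title when one is found, otherwise fall back to the URL.
function outlineText(xml, url) {
  return rssChannelTitle(xml) ?? url;
}
```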

Based on my review, I created a third prompt:

Create a Javascript program to run under Node.js. The program should take as input a text file that contains one URL on each line. The program should output an OPML file with the URLs added to the outline such that the file can be uploaded to a feed reader that uses OPML as an input source for subscriptions. The program should also fetch a copy of each URL, and if a RSS feed is available, get the value of the <title> element within the <channel> element and set the text attribute of the entry in the OPML file to the <title> element for that feed/URL. If an Atom feed is available, get the value of the <title> element and set the text attribute for that entry in the OPML file to the <title> element for that feed/URL. If the feed is not available, set the text attribute of the entry in the OPML file to the URL value. Each <outline> element should contain the following attributes: type = rss, htmlUrl = URL of the feed, xmlUrl = URL of the feed, text = value of the <title> element within the <channel> element if the feed is in RSS format, or the value of the <title> element if the feed is in Atom format. In addition, create a package.json file for this program.

This version went back to writing each outline entry in the OPML file as a single XML element with all of its attributes contained within the element. I tried some further refinements of this prompt, but had some issues with the generated code, so I began manually iterating on the code generated by the third prompt. I had a WordPress feed which was redirecting to google.com, so I added error checking for the case where the data read from the feed was undefined. I also experimented with the Blogspot feeds to find the correct elements for the title of the feed and the site URL. Finally, I did an overall cleanup of the feed URLs in the input file while troubleshooting the script.

Eventually, I was able to get all the data for the Blogspot feeds read correctly. The remaining Atom feeds were all generated from custom code, so there was too much variation to be able to easily get the title and the site URL from those feeds. I then hand-edited the output OPML file to add the remaining data I wanted.
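The kind of handling described above (a guard for missing data, plus separate lookups for RSS and Atom) might look roughly like the following sketch. It is not the code from the repo. The function names are hypothetical, the regular expressions are a stand-in for a proper XML parser, and it assumes feed-level titles and links appear before the first <item> or <entry>. For Atom it takes the site URL from the first feed-level <link> with rel="alternate" (a <link> with no rel counts as "alternate" in Atom).

```javascript
// First <title> text in a chunk of XML, with an optional CDATA wrapper removed.
function firstTitle(chunk) {
  const m = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(chunk);
  if (!m) return null;
  const text = m[1].replace(/^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/, '$1').trim();
  return text || null;
}

// Site link for an Atom feed: first <link> that is rel="alternate" or has no rel.
function atomSiteUrl(head) {
  for (const tag of head.match(/<link\b[^>]*>/gi) || []) {
    const rel = /\brel=["']([^"']*)["']/i.exec(tag);
    const href = /\bhref=["']([^"']*)["']/i.exec(tag);
    if (href && (!rel || rel[1] === 'alternate')) return href[1];
  }
  return null;
}

// Build the attributes for one <outline>, falling back to the URL on any problem.
function describeFeed(body, url) {
  const fallback = { type: 'rss', text: url, htmlUrl: url, xmlUrl: url };
  if (typeof body !== 'string') return fallback; // e.g. the data was undefined
  if (/<feed[\s>]/i.test(body)) {
    const head = body.split(/<entry[\s>]/i)[0]; // skip per-entry titles and links
    return { ...fallback, text: firstTitle(head) || url, htmlUrl: atomSiteUrl(head) || url };
  }
  if (/<channel[\s>]/i.test(body)) {
    const head = body.split(/<item[\s>]/i)[0]; // skip per-item titles and links
    const link = /<link>\s*([^<\s]+)\s*<\/link>/i.exec(head);
    return { ...fallback, text: firstTitle(head) || url, htmlUrl: link ? link[1] : url };
  }
  return fallback; // an HTML page (for example, after a redirect) ends up here
}
```

On Node.js 18 or later, the body could come from the built-in fetch, for example `const body = await fetch(url).then((r) => r.text()).catch(() => undefined)` inside an async function, so that a failed request falls through to the URL-only entry.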

Lessons learned:

It was easy to experiment with different prompts
Adding more detail to the prompt refined the code generated by ChatGPT
For my app, there was a point of diminishing returns (it got harder to specify the desired behavior, though that could be due to my lack of experience)

I have uploaded the app to this GitHub repo. Give it a try!

Published

July 8, 2024

Andy in ChatGPT, Feed Readers, Feeds, Micro.Blog | July 8, 2024
