Using ChatGPT to convert a list of feed URLs to OPML

After seeing Dave Winer’s interest in blogrolls, I decided to spend some time updating my RSS subscription list and learning about using ChatGPT as a programmer’s assistant at the same time. I use the River5 feed reader, along with a single-page web app to display my feeds. River5 supports having a feed list as a text file with one feed URL per line. I wanted to see how well ChatGPT could create an OPML subscription list from this text file.

Here was my first prompt:

Create a Javascript program to run under Node.js. The program should take as input a text file that contains one URL on each line. The program should output an OPML file with the URLs added to the outline such that the file can be uploaded to a feed reader that uses OPML as an input source for subscriptions. In addition, create a package.json file for this program.

This created an app that ran without an error and produced a basic OPML file. Each outline element had the type attribute set to “rss”. However, the text and xmlUrl attributes were both set to the feed URL. I decided to experiment to see how much further ChatGPT could go.
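To illustrate the behavior of this first version, here is a minimal sketch. It is not the code ChatGPT generated; the helper names (escapeXml, buildOpml), the document title, and the example URLs are assumptions made for illustration. It shows the basic shape: one outline element per input line, with type set to “rss” and both text and xmlUrl set to the feed URL.

```javascript
// Minimal sketch (not the ChatGPT-generated code): build a basic OPML
// subscription list from text that holds one feed URL per line.
// escapeXml and buildOpml are hypothetical helper names.

// Escape characters that are not allowed inside an XML attribute value.
function escapeXml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Turn the contents of the feed list into an OPML 2.0 document.
// As in the first version, text and xmlUrl are both set to the feed URL.
function buildOpml(feedListText, title = 'Subscriptions') {
  const urls = feedListText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0); // skip blank lines

  const outlines = urls.map((url) => {
    const safeUrl = escapeXml(url);
    return `    <outline type="rss" text="${safeUrl}" xmlUrl="${safeUrl}"/>`;
  });

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<opml version="2.0">',
    '  <head>',
    `    <title>${escapeXml(title)}</title>`,
    '  </head>',
    '  <body>',
    ...outlines,
    '  </body>',
    '</opml>',
    '',
  ].join('\n');
}

// Example with two placeholder URLs.
console.log(buildOpml('https://example.com/feed.xml\nhttps://example.org/rss\n'));
```

To work with files instead of a string, the same function could be wrapped with Node's fs module, for example fs.writeFileSync('feeds.opml', buildOpml(fs.readFileSync('feeds.txt', 'utf8'))), where both file names are placeholders.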

Here was my second prompt:

Create a Javascript program to run under Node.js. The program should take as input a text file that contains one URL on each line. The program should output an OPML file with the URLs added to the outline such that the file can be uploaded to a feed reader that uses OPML as an input source for subscriptions. The program should also fetch a copy of each URL, and if a feed is available, get the value of the <title> element within the <channel> element and set the text attribute of each entry in the OPML file to the <title> element for that feed/URL. If the feed is not available, set the text attribute of each entry in the OPML file to the URL value. In addition, create a package.json file for this program.

I was more specific here, asking that each feed URL be read to try to get the title of the feed and set the text attribute to the feed title. This version had a small programming error, but I easily fixed it, and the script was able to run without a problem. The output OPML file was able to fill in a lot of the titles, but some feeds still had problems (in particular, Blogspot blogs and other blogs using Atom instead of RSS). Also, this version created a separate “_attributes” element within the outline element, which did not have the same look as an example OPML file I reviewed.
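The fetch-and-fall-back behavior requested in the second prompt can be sketched as follows. This is an illustration under stated assumptions, not the generated script: it assumes Node.js 18 or later for the built-in fetch, the names extractRssTitle and titleOrUrl are hypothetical, and the regular expressions stand in for a real XML parser. Because it only looks inside a channel element, an Atom feed yields no title and falls back to the URL, which matches the gap described above.

```javascript
// Minimal sketch (assumptions: Node.js 18+ for the built-in fetch;
// extractRssTitle and titleOrUrl are hypothetical names; the regular
// expressions are a simplification, not a full XML parser).

// Strip a CDATA wrapper, or decode the five predefined XML entities.
function cleanText(raw) {
  const cdata = /^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/.exec(raw);
  if (cdata) return cdata[1].trim();
  return raw
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&') // last, so "&amp;lt;" is not decoded twice
    .trim();
}

// Return the <title> that belongs to <channel> itself, or null.
function extractRssTitle(xml) {
  const channel = /<channel\b[^>]*>([\s\S]*?)<\/channel>/i.exec(xml);
  if (!channel) return null;
  // Drop <item> and <image> blocks so their <title> elements are ignored.
  const channelOnly = channel[1].replace(/<(item|image)\b[\s\S]*?<\/\1>/gi, '');
  const title = /<title\b[^>]*>([\s\S]*?)<\/title>/i.exec(channelOnly);
  if (!title) return null;
  const text = cleanText(title[1]);
  return text.length > 0 ? text : null;
}

// Fetch the feed and return its title; fall back to the URL when the
// request fails, the status is not OK, or no channel title is found.
// fetchImpl is injectable so the function can be exercised offline.
async function titleOrUrl(url, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(url);
    if (!response.ok) return url;
    const title = extractRssTitle(await response.text());
    return title || url;
  } catch (err) {
    return url;
  }
}

// Usage (placeholder URL):
// titleOrUrl('https://example.com/feed.xml').then((text) => console.log(text));
```

The returned string would then be used as the text attribute of the outline entry, after escaping it for XML.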

Based on my review, I created a third prompt:

Create a Javascript program to run under Node.js. The program should take as input a text file that contains one URL on each line. The program should output an OPML file with the URLs added to the outline such that the file can be uploaded to a feed reader that uses OPML as an input source for subscriptions. The program should also fetch a copy of each URL, and if a RSS feed is available, get the value of the <title> element within the <channel> element and set the text attribute of the entry in the OPML file to the <title> element for that feed/URL. If an Atom feed is available, get the value of the <title> element and set the text attribute for that entry in the OPML file to the <title> element for that feed/URL. If the feed is not available, set the text attribute of the entry in the OPML file to the URL value. Each <outline> element should contain the following attributes: type = rss, htmlUrl = URL of the feed, xmlUrl = URL of the feed, text = value of the <title> element within the <channel> element if the feed is in RSS format, or the value of the <title> element if the feed is in Atom format. In addition, create a package.json file for this program.

This version went back to creating the entries in the OPML file as single XML elements with all the attributes contained within the element. I tried some further refinements of this prompt, but had some issues with the generated code, so I then began manually iterating on the code generated by the third prompt. I had a WordPress feed which was redirecting to google.com, so I added error checking for the case where the data read from the feed was undefined. I also did some experimenting with the Blogspot feeds to find the correct element to get the title of the feed and the site URL. Finally, I did an overall cleanup of feed URLs in the input file during my script troubleshooting.
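The outline layout requested in the third prompt, together with the guard for undefined feed data, might look roughly like this. It is a sketch under assumptions rather than the code in the repo: the names feedTitle and outlineFor are hypothetical, regular expressions stand in for an XML parser, and htmlUrl is simply set to the feed URL as the prompt specifies (it does not attempt the site-URL lookup mentioned for Blogspot).

```javascript
// Minimal sketch (hypothetical names; regex matching is a simplification
// and this is not the code from the repo).

// Escape characters that are not allowed inside an XML attribute value.
function escapeXml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Strip a CDATA wrapper, or decode the five predefined XML entities.
function decodeText(raw) {
  const cdata = /^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/.exec(raw);
  if (cdata) return cdata[1].trim();
  return raw
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&')
    .trim();
}

// Return the feed-level title for RSS (<channel><title>) or Atom
// (<feed><title>), or null when the data is missing or unrecognized.
function feedTitle(xml) {
  // Guard: a failed or redirected request may leave the data undefined.
  if (typeof xml !== 'string') return null;
  const container = /<(channel|feed)\b[^>]*>([\s\S]*)/i.exec(xml);
  if (!container) return null; // e.g. an HTML page instead of a feed
  // Remove per-post and image blocks so only the feed-level <title> remains.
  const feedLevel = container[2].replace(/<(item|entry|image)\b[\s\S]*?<\/\1>/gi, '');
  const title = /<title\b[^>]*>([\s\S]*?)<\/title>/i.exec(feedLevel);
  if (!title) return null;
  const text = decodeText(title[1]);
  return text.length > 0 ? text : null;
}

// Build one <outline> with all values as attributes on a single element.
// The text attribute falls back to the feed URL when no title is found.
function outlineFor(feedUrl, xml) {
  const text = escapeXml(feedTitle(xml) || feedUrl);
  const url = escapeXml(feedUrl);
  return `<outline type="rss" text="${text}" htmlUrl="${url}" xmlUrl="${url}"/>`;
}

// Example: an Atom feed, then a response that is not a feed at all.
console.log(outlineFor('https://example.com/atom.xml',
  '<feed><title>Example Blog</title><entry><title>Post</title></entry></feed>'));
console.log(outlineFor('https://example.com/feed', undefined));
```

Keeping the title lookup separate from the outline builder makes it easy to add feed-specific handling later without touching the OPML output.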

Eventually, I was able to get all the data for the Blogspot feeds read correctly. The remaining Atom feeds were all generated from custom code, so there was too much variation to be able to easily get the title and the site URL from those feeds. I then hand-edited the output OPML file to add the remaining data I wanted.

Lessons learned:

  • It was easy to experiment with different prompts
  • Adding more detail would refine the code generated by ChatGPT
  • For my app, there was a point of diminishing returns: it got harder to specify the desired behavior, though that could be due to my lack of experience

I have uploaded the app to this Github repo – give it a try!

Call for Twitter-like systems based on feeds

Dave Winer again calls for “a twitter-like system built with feeds, with all their limits”. In May 2023, I used Node.js to create My Status Tool (Github repo), which provides the basic posting and reading functionality of Twitter, but uses RSS and rssCloud as the enabling technologies. Colin Walker also created a PHP implementation (Github repo), and our two versions were able to interoperate. Dave also called for this back in December 2023 (my response), but from what I heard, Dave had some other ideas besides working with MyStatusTool. I don’t think that FeedLand is the system he was talking about, and I don’t think that Blogroll Social is the system either. Anyone interested in working on this?

Creating the future of journalism (post and podcast)

I just finished listening to two podcasts by Dave Winer: one on what we need from Biden, and one a conversation with Jeff Jarvis on how to work around the brokenness of the mainstream media in the 2024 election. This was an excellent conversation. I have several comments on the Jeff Jarvis podcast, and will cover them in this post, and there is a separate podcast at the end of this post.

Jeff Jarvis brought up two points based on prior writing/conversations with Dave Winer. One was “the power of the link”, and the other was that people should use their own personal spaces to respond to someone else’s post or story. I agree that if you are going to talk about someone, or something that they wrote, you should link to it. However, recently Dave Winer wrote a post critical of the people and work of the Podcasting 2.0 effort (how they reimplemented rssCloud), but he did not link to the thing he was complaining about (I had to track it down). How does this square with what was discussed in this podcast? I think it is inconsistent at a minimum, and perhaps bordering on hypocrisy.

Another topic was people commenting on social media posts, and how a lot of these comments were “spam”, in that people were not responding to what was posted, but were posting to try to take advantage of the “flow” of the original poster (in this case, I am assuming it was Dave Winer; it could also happen to Jeff Jarvis, but it was Dave Winer who brought it up). This is a tricky topic. Both Dave Winer and Jeff Jarvis said they want to encourage conversation (well, maybe it was Jeff more than Dave). However, if you want to have conversation, you have to give people the chance to say something. If the response to comments is deleting comments, or blocking people because they disagree or are critical, this discourages people from commenting. If I write a post commenting on another post (either compliments or criticism), how should I inform the person or site I am writing about? Both Dave Winer and Jeff Jarvis said people should be “respectful”. That sounds good, until the conversation gets blocked. I do not have any solutions to offer here, but if someone wants to have a conversation, it has to be two-way/bidirectional. In the case of social media apps, part of the design of the apps, in my opinion, is to encourage conversation. Blocking people and deleting comments in a thread do not give the impression that someone wants to have a conversation.

The next topic I would like to address is providing an alternative to the mainstream media. Dave Winer talked at some point about individuals creating stories (covering events, like reporters, I suppose) and creating/editing a flow of stories (again I assume this is mainstream media stories, which is a lot of what gets commented on in blogs). I will address the “flow of stories” idea first. During the 2020 George Floyd protests, I started a site to curate the mainstream media and social media coverage of the protests in Portland, Oregon. The site was called Portland Protest News, and I updated it daily for a month and a half before I had to stop due to an illness. I set up news flows from mainstream media (primarily using RSS feeds), reviewed those feeds on a daily basis, selected stories to post, created a post with links to those stories, and also created a newsletter with the same content. At best, I was able to do this in an hour. Most of the time, it was 1.5 hours, and sometimes two hours. It was difficult to do this and work a regular 8-hour day. To me, the curated flow that Dave Winer talked about in the podcast with Jeff Jarvis would take at least this much time. Someone would have to put in that time to create a dynamic site with daily posts.

Next, I would like to discuss the topic of people covering events. I thought the idea of protesting the New York Times was interesting, and other news organizations might cover that protest. However, as I noted in a recent post commenting on an essay by Anne Applebaum on protests in Poland, Applebaum stated that protests, if not carefully targeted, achieve little. I do not think there would be a clear enough goal to make protesting the New York Times effective. I think that the idea of independent writers/bloggers attending events and publishing accounts of these events is worthwhile, but I think there are several issues as well. Finding out about events takes work, attending events takes time, and writing about the events takes time and effort. Who will do these things? Who will coordinate this work? How will the posts/stories be distributed so that others can find out about them? The story “The Little Red Hen” comes to my mind, where one animal does all the work to produce a loaf of bread. Where are the “little red hens” to do this work?

There are some independent news organizations covering state legislatures (States Newsroom) and voting issues (Votebeat). There are small news startups trying to cover local news (Salem Reporter in Salem, Oregon as an example). There is even an online newspaper in Washington state (the Sammamish Independent) that is produced by volunteers. These are all current examples of independent coverage. Some of them have some funding, but many are dependent on subscriptions or donations. Doc Searls, in his work at the Ostrom Workshop at Indiana University, has written a series of stories about “The News Commons”, and experiments in the Bloomington, Indiana area. So, I point to these examples of “little red hens”, each with a focus, but providing inspiration and food for thought to others.

I welcomed this podcast, as it shared many ideas and food for thought. I hope my analysis has done the same, and I welcome any and all feedback. No one will be blocked or deleted, I assure you!

I recently posted a quote from Hannah Arendt: “We are free to change the world and start something new in it”. I would like to point to a recent post by Ken Smith about how to solve the problem of Donald Trump. He organized his post as a series of problems to be addressed. I think the structure of this post could be implemented as a website in a fairly straightforward manner. I will try to create something in the next week that could serve as a model. Maybe I can even get Ken Smith or someone to collaborate with me on this project. Any assistance would be welcomed!

Why I am sticking with Joe Biden

Yes, I watched the CNN debate on Thursday with Joe Biden and Donald Trump. Yes, I thought Joe Biden’s “performance” at the debate was poor, compared to the confidence of Trump’s presentation. However, Biden answered the questions set by the moderators and generally answered those questions truthfully, while Trump repeatedly refused to answer the questions in the debate, even after being pressed several times on some of those questions. Trump told so many lies that it took CNN’s Daniel Dale several minutes just to list all the lies that Trump spewed out over the course of the debate (also see text of fact checking). As Joe Biden mentioned afterwards, “It’s hard to debate a liar.”

Several other perspectives:

Mary Trump (Trump’s niece) on Substack:

While Biden’s performance is rightly being criticized, it was the debate moderators who allowed Donald to steamroll the truth with an incessant stream of increasingly bizarre and dangerous lies — that he, not President Biden instituted a cap on insulin, that blue states allow women and their doctors to commit infanticide, and that Nancy Pelosi  was somehow responsible for January 6th — while refusing to answer the questions asked of him. Why Jake Tapper and Dana Bash chose to abdicate their journalistic responsibility in service to a man who is an enemy of American democracy and a free press only they know, but that abdication should be a much bigger story, I know who Joe Biden really is. And I know who my uncle really is.  And I’ll take the decent guy with the sore throat who believes in democracy over the rapist insurrectionist monster every single time.

https://marytrump.substack.com/p/why-im-still-with-president-biden

Seth Abramson on Substack:

Biden will not step away from the 2024 election cycle because it would hand the presidency, beyond any doubt, to a confirmed rapist, serial sexual assailant, active insurrectionist, convicted felon, pathological liar, malignant narcissistic sociopath, gleeful adulterer, career criminal, unrepentant con man, traitorous would-be U.S. dictator, misogynist, antisemite, racist, homophobe, transphobe, Islamophobe, and budding war criminal.

https://sethabramson.substack.com/p/the-extremely-simple-reason-maga

Heather Cox Richardson on Substack:

Tonight was the first debate between President Joe Biden and presumptive Republican presidential nominee Donald Trump, and by far the most striking thing about the debate was the overwhelming focus among pundits immediately afterward about Biden’s appearance and soft, hoarse voice as he rattled off statistics and events. Virtually unmentioned was the fact that Trump lied and rambled incoherently, ignored questions to say whatever he wanted; refused to acknowledge the events of January 6, 2021; and refused to commit to accepting the result of the 2024 presidential election, finally saying he would accept it only if it met his standards for fairness. 

https://heathercoxrichardson.substack.com/p/june-27-2024

The American people have a choice between a convicted felon and liar, and a man who has fought to preserve democracy and improve our way of life. I am going with the second one.