How to store a rolling archive of an RSS feed? - eviltoast

Is anyone aware of an existing project that can do something like this:

  • Access an RSS feed.
  • Parse the contents of the items in the feed, and fetch linked images.
  • Take the new feed elements and add them to previously fetched elements.
  • Store all of the content in a merged RSS/XML file, or something like a SQLite DB.

Context: I’d like to archive Mastodon posts of an account automatically. I’d prefer it to be a script/binary I could run on Linux as I’d likely throw it in a GitHub action and save the resulting output in the git repo.

I could probably whip something together but I’m lazy and I’d prefer to use something that already exists.

  • atheken@programming.dev
    1 year ago

    I use Miniflux, and you can configure it to modify feed items. As far as I know, it does not purge anything by default.

    Really, pulling an RSS feed, parsing it, and storing the items is probably 50 lines of bash, and fewer in a general-purpose scripting language.
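
To give a rough idea of what that could look like, here is a minimal sketch in Python using only the standard library. It is not an existing project, just an illustration: the function names (`parse_items`, `open_archive`, `merge_items`, `archive_feed`), the two-table schema, and the choice of deduplicating on each item's `guid` are all assumptions made for this example. It records the URLs of attached media (from `enclosure` and Media RSS `media:content` elements) but does not download the files; that would be one more step.

```python
import sqlite3
import urllib.request
import xml.etree.ElementTree as ET

# Media RSS namespace, used by some feeds for attached images.
MEDIA_NS = "{http://search.yahoo.com/mrss/}"


def parse_items(xml_bytes):
    """Return one dict per <item> found in an RSS 2.0 document."""
    root = ET.fromstring(xml_bytes)
    items = []
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        # Assumption: guid (or, failing that, link) is stable enough to dedupe on.
        guid = item.findtext("guid", default="") or link
        if not guid:
            continue
        media = [e.get("url") for e in item.iter("enclosure") if e.get("url")]
        media += [e.get("url") for e in item.iter(MEDIA_NS + "content") if e.get("url")]
        items.append({
            "guid": guid,
            "link": link,
            "published": item.findtext("pubDate", default=""),
            "content": item.findtext("description", default=""),
            "media": media,
        })
    return items


def open_archive(path):
    """Open (or create) the SQLite archive; the schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(guid TEXT PRIMARY KEY, link TEXT, published TEXT, content TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS media "
        "(guid TEXT, url TEXT, PRIMARY KEY (guid, url))"
    )
    return conn


def merge_items(conn, items):
    """Add items not already archived; return how many were new."""
    before = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    with conn:  # one transaction, committed on success
        for it in items:
            conn.execute(
                "INSERT OR IGNORE INTO items VALUES (?, ?, ?, ?)",
                (it["guid"], it["link"], it["published"], it["content"]),
            )
            conn.executemany(
                "INSERT OR IGNORE INTO media VALUES (?, ?)",
                [(it["guid"], url) for url in it["media"]],
            )
    after = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    return after - before


def archive_feed(feed_url, db_path):
    """Fetch the feed once and merge its items into the archive."""
    with urllib.request.urlopen(feed_url, timeout=30) as resp:
        xml_bytes = resp.read()
    conn = open_archive(db_path)
    added = merge_items(conn, parse_items(xml_bytes))
    conn.close()
    return added
```

Calling something like `archive_feed("https://example.social/@someone.rss", "archive.db")` on a schedule (the URL here is a placeholder) would keep appending new posts while leaving earlier rows untouched, so the database file could be committed to the repo after each run.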