Is directory monitoring just cursed? - eviltoast

So, I need to monitor a fairly large nested directory tree for changes on Linux. It seems like there are a few different watcher modules that I could use – fsnotify and notify being the main ones, both of which use the inotify interface and attempt to set watches on each individual subdirectory and maintain all their watchers as things change. I have way too many directories for that to be a workable approach. It looks like the underlying issue is just that this is a difficult problem on Linux; both inotify and fanotify have some issues which make them difficult for library authors to use to present a clean and useful API.

Long story short - I coded up an fanotify-based solution which seems like a good start on what I need, and I’m planning on sharing it back in the hopes that it’s useful. I guess my question is, did I miss something? Is there already an easy and straightforward way to monitor a big directory tree for changes?

  • orsetto@lemmy.dbzer0.com
    8 months ago

    Just the other day I wrote something just like this, but I used inotify.

    I have to monitor a fairly small directory tho, so I didn’t encounter any problems.

    both of which use the inotify interface and attempt to set watches on each individual subdirectory

    Is this really necessary? Inotify’s man page says it can monitor a whole directory with only one watch, so that’s what I did, and it looks like it’s working.

    I didn’t make a lot of tests tho, so I might be wrong

    Edit: from the inotify man page:

    Inotify monitoring of directories is not recursive: to monitor subdirectories under a directory, additional watches must be created. This can take a significant amount time for large directory trees.

    Yeah, it is necessary. That’s what I get for not reading the whole thing.
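
A rough way to see what "one watch per subdirectory" means for a given tree is to count its directories. This is only a sketch: `count_directories` is a hypothetical helper name, it does not touch inotify at all, and it assumes symlinked directories should not be followed.

```python
import os


def count_directories(root):
    """Count root plus every directory beneath it (symlinks are not followed)."""
    total = 0
    for _dirpath, _dirnames, _filenames in os.walk(root, followlinks=False):
        # os.walk yields once per directory, so each iteration stands for
        # one inotify watch that a recursive watcher would have to create.
        total += 1
    return total


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"{target}: {count_directories(target)} directories to watch")
```

The number it prints is a lower bound on the watches a per-directory inotify watcher would hold for that tree, since directories created later need watches as well.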

  • wyrmroot@programming.dev
    8 months ago

    Just chiming in to say I was unaware of how limiting inotify is because of the relatively trivial cases in which I’ve used it. I’d be interested to see the alternative you’ve come up with!

    • mo_ztt ✅@lemmy.worldOP
      8 months ago

      Yeah. I think it’s moderately likely that I’ll try to produce a little command-line tool that can do it effectively for deeply nested directories, with some attempt at making it cross-platform. To me it’s kind of weird that there’s no existing stock solution to this problem. I get that it’s actually a deceptively difficult problem to solve for a couple of different reasons, but that’s no reason to pass the difficulty on to the programmer instead of just presenting a clean and nice interface.

      Update: I looked around for something already-existing, and found watchman and fswatch… IDK, maybe I’ll try to talk one of them into letting me write an fanotify backend for those tools instead. It seems like it’s purely just a Linux issue, and everything is simple on BSD/Mac/Windows, so maybe I’m just lucky.

    • mo_ztt ✅@lemmy.worldOP
      8 months ago

      Just looking briefly it looks like it uses inotify (which definitely won’t work; I don’t have a super heavy write load but I have a total of 124,000 subdirectories to monitor) or can fall back to polling (which I could do myself without having to involve a library).

      Why this app is constructed to store its stuff in 124,000 subdirectories is a separate issue but one that I can’t immediately snap my fingers and make go away, unfortunately.
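
For reference, the "polling I could do myself" fallback mentioned above can be sketched in a few lines. This is not what any of the named libraries do internally; `snapshot` and `diff` are hypothetical names, and the comparison assumes that a change shows up as a different mtime or size.

```python
import os


def snapshot(root):
    """Map every path under root to (mtime_ns, size), without following symlinks."""
    state = {}
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        info = entry.stat(follow_symlinks=False)
                        is_dir = entry.is_dir(follow_symlinks=False)
                    except OSError:
                        continue  # entry vanished between listing and stat
                    state[entry.path] = (info.st_mtime_ns, info.st_size)
                    if is_dir:
                        stack.append(entry.path)
        except OSError:
            continue  # directory vanished or is unreadable
    return state


def diff(old, new):
    """Return (created, deleted, modified) path lists between two snapshots."""
    created = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    modified = sorted(p for p in new.keys() & old.keys() if new[p] != old[p])
    return created, deleted, modified
```

Calling `snapshot` on a timer and passing consecutive results to `diff` gives a crude change feed. Every pass walks the whole tree, so the cost grows with the number of entries rather than with the number of changes, which is the trade-off against a kernel notification interface.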

      • adONis@lemmy.world
        8 months ago

        124,000 subdirectories is a separate issue

        Yeah sure, but that’s also not for discussion here.

        Would it solve the issue to find the limit of how much inotify can handle and then run multiple instances of a watcher, each on a subset of directories?

        Say inotify can deal with 1000 subdirs: having 124 instances running should in theory use about the same resources as one instance monitoring 124k subdirs, give or take.

        Just a thought tho 🤷‍♂️

        • mo_ztt ✅@lemmy.worldOP
          8 months ago

          I think inotify’s limit is per system… and even if it wasn’t, why would I want to take on the artificial challenge of keeping up with making sure all the watchers are set on the right directories as things change, instead of just recursively monitoring the whole directory? The whole point of asking the question was “hey can something do this for me” as opposed to “hey I’d like the opportunity to code up for myself a solution to this problem.” 🙂
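
On the question of where the limit lives: the inotify(7) man page documents the tunables under `/proc/sys/fs/inotify`, and describes `max_user_watches` and `max_user_instances` as counted per real user ID, so splitting the work across many watcher processes run by the same user would not get around it. A small sketch for checking the current values follows; `read_inotify_limits` is a hypothetical helper, and it returns `None` for any file it cannot read (for example on a non-Linux system).

```python
from pathlib import Path

# Location documented in inotify(7); overridable so the helper can be tested.
INOTIFY_DIR = Path("/proc/sys/fs/inotify")


def read_inotify_limits(base=INOTIFY_DIR):
    """Return the inotify tunables as ints, or None where a value is unavailable."""
    limits = {}
    for name in ("max_user_watches", "max_user_instances", "max_queued_events"):
        try:
            limits[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits


if __name__ == "__main__":
    for name, value in read_inotify_limits().items():
        print(f"{name}: {value}")
```

Comparing `max_user_watches` against the number of directories in the tree shows quickly whether a per-directory inotify watcher could even be set up without raising the limit.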