
Go: Delayed file system events handling

Suppose you need to do something when some file system event occurs. For example, restart the web server when files change. Quite a common practice in development: recompile the project and restart it on the fly immediately after editing the source files.

This site is written in Go, and I recently decided to add a similar hot-reload for markdown files with articles: make the web server notice a new file in its data directories and repopulate the internal in-memory storage without restarting itself. And I wanted to "listen" to the file system, not scan it every few seconds.

Fortunately, there is already a good listener, fsnotify, which can watch the given directories. (It doesn't watch them recursively, but I don't have that many directories.)
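If you do need nested directories, the usual workaround is to walk the tree and register every subdirectory yourself. A minimal sketch of that (the addRecursive name is mine, not something fsnotify provides; it also needs the io/fs and path/filepath imports):

// A sketch, not the site's code: fsnotify's Add() watches a single
// directory, so walk the tree and register every subdirectory explicitly.
func addRecursive(watcher *fsnotify.Watcher, root string) error {
    return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
        if err != nil {
            return err
        }
        if d.IsDir() {
            return watcher.Add(path)
        }
        return nil
    })
}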

The README gives a pretty clear example. I wrapped it in the Watcher() function and added my own channel there, to which I send something every time an event happens. I don't care about the type of event, so I just always send 1. I also wrapped Watcher() into another function, Watch(), which executes the reload function passed to it on every FS change.

Something like this (error handling and some unimportant stuff intentionally left out):

func main() {
    dirs := []string{
        "/foo",
        "/bar",
    }
    go Watch(dirs, func() {
        reloadStuff()
    })
    // ... start the web server here, so main doesn't exit
}

func Watch(dirs []string, reload func()) {
    ch := make(chan int)
    go Watcher(dirs, ch)

    // Execute the reload function on each
    // file system event
    for range ch {
        reload()
    }
}

func Watcher(dirs []string, ch chan int) {
    watcher, _ := fsnotify.NewWatcher()
    defer watcher.Close()

    done := make(chan bool)
    go func() {
        for {
            select {
            case <-watcher.Events:
                // Send a notification to the channel
                // on any event from the watcher
                ch <- 1
            case err := <-watcher.Errors:
                log.Println("error:", err)
            }
        }
    }()
    for _, dir := range dirs {
        watcher.Add(dir)
    }
    // Block forever; nothing ever sends to done,
    // so the watcher stays alive
    <-done
}

I ran some tests and found that more than one event can be triggered when a file changes (e.g. CHMOD and WRITE in sequence). And if multiple files change at once (git checkout, rsync, touch *.*), there will be even more events, and my hot-reload is triggered on each of them.
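An easy way to see this for yourself (a throwaway sketch, assuming a watcher set up as above) is to print every incoming event instead of reacting to it:

// Print each event's operation and file name to see how many
// events a single change actually produces.
go func() {
    for event := range watcher.Events {
        log.Println("fs event:", event.Op, event.Name)
    }
}()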

In fact, I only need to trigger it once if a lot of events arrive in a short period of time. That is, accumulate them, wait half a second, and if nothing else comes, do the thing.

To my shame, I couldn't come up with a good solution on my own, but I noticed that CompileDaemon, which I use in development to recompile the source code, works exactly the way I want. The solution there is elegant (as elegant as it can be in Go), and it's built around time.After(): it starts a timer and sends the current time to the returned channel after the specified interval.
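In isolation, time.After() behaves like this (a toy example, nothing project-specific):

package main

import (
    "fmt"
    "time"
)

func main() {
    // time.After returns a receive-only channel; the current time
    // arrives on it once the interval has elapsed.
    timeout := time.After(500 * time.Millisecond)
    fmt.Println("waiting...")
    fmt.Println("fired at", <-timeout)
}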

As a result, the Watch() function has become the following:

func Watch(dirs []string, reload func()) {
    ch := make(chan int)
    go Watcher(dirs, ch)

    // A function that returns a channel which receives
    // the current time once the specified interval has elapsed.
    createThreshold := func() <-chan time.Time {
        return time.After(500 * time.Millisecond)
    }

    // `threshold := createThreshold()` is also acceptable,
    // if you want to trigger reload() on the first run.
    // I don't need that, so an empty channel is enough.
    threshold := make(<-chan time.Time)
    for {
        select {
        case <-ch:
            // At each event, we simply recreate the threshold
            // so that `case <-threshold` is delayed for another 500ms.
            threshold = createThreshold()
        case <-threshold:
            // If nothing else comes into the `ch` channel within 500ms,
            // trigger the reload() and wait for the next burst of events.
            reload()
        }
    }
}

It worked exactly as I wanted: now I can update all the data files through rsync, and within 500ms the web server picks them up without restarting.

I invented PHP.