Sherief, FYI

The Atomic Application

     There is nothing new to be discovered in physics now. All that remains is more and more precise measurement.

— A claim, commonly misattributed to Lord Kelvin, that we now know to be false.

There’s no longer any low-hanging fruit in classical physics. Software, on the other hand, is full of fields ripe for picking. The way we traditionally install applications locally on a PC is archaic: we download a stepping stone called an installer, then run through it, clicking where we’ve been trained to click, until we end up with an icon we can click to launch an app. That app lives in maybe one, but probably multiple, places on our computer as a bunch of opaque, fragile files that we cannot move around without breaking everything.

There is value in applications that exist as a self-contained unit. For years, macOS has represented application bundles (which are basically directories following a specific structure) as pseudo-atomic units in its file browser, Finder, and unless you peek under the hood you might think that macOS apps really are single files. But the abstraction breaks in some cases, and you might find yourself wondering why you can’t pass a .app to an application that expects a file.

Abstractions leak, and the only way to plug those leaks is to remove the abstractions. I write a lot of small apps to try out one graphical feature or another, and to get anything useful I need to bundle some data with the code: a 3D model, some shaders written in HLSL, and so on. Passing these around to friends, and watching some of them try to share the apps by copying just the exes and then wonder why they broke, made me realize something: to a lot of users, the app is the icon they click, and everything else is noise. And the user is right.

Building systems that expect the users to change their intuitive behavior isn’t “design”, it’s cobbling something together and expecting the world to bend over backwards to support it — it’s somewhere on the spectrum between being arrogant and being stupid. We, as software engineers, can and should do better than this. And I decided to see what it’d take.

I took one of my simple apps. Its directory structure is a single folder containing one executable, one 3D model, and one HLSL shader. My loading code can read data from a file or from memory, so a simple way to embed the assets into the executable is to convert them into byte arrays stored in code, then pass those arrays to my load-from-memory functions. This works just fine, but it doesn’t scale, and it makes development a huge pain: every time I want to change my shader I have to run a preprocessing step that regenerates the arrays from the on-disk files, and that’s no fun. But it’s a good starting point.

Then I decided to integrate PhysicsFS into my app. It’s a library that provides a virtualized in-process filesystem: you can use PhysicsFS to “mount” the current working directory at /, then use the PhysicsFS API to open /shader.hlsl and it will map to $CWD/shader.hlsl. This by itself isn’t very helpful, but PhysicsFS has another trick up its sleeve: it can mount compressed archives at arbitrary mount points in its virtual filesystem, and it can read those archives from disk or from memory.

Composing the two concepts above, my pre-build step became: combine all my assets into one archive using the 7z command-line utility, embed that archive as an array in my code, and use PhysicsFS to mount it at /. Opening /shader.hlsl through the PhysicsFS API now returns the bytes of the shader.hlsl file stored inside the 7z archive, which is itself stored in an array inside my executable.

Now we have an atomic application: to the end user, the executable they click really is the application. They can copy it around, share it, and it will work as they expect. The PhysicsFS API is simple enough that the whole thing took only a few hours to set up. But it still hasn’t addressed the development hassle: an edit to one of my shaders now requires a complete rebuild of the app, and that can become prohibitively slow once your app grows non-trivial.

That’s where another advantage of PhysicsFS kicks in: you can mount multiple directories or archives at the same mount point with a specified search order. I use this to mount both the in-app archive and the current working directory at /, so opening /shader.hlsl first looks in the current working directory and only falls back on the embedded archive’s copy if no shader is found on disk. During development I just edit shaders on disk, and once I have something ready to ship I rebuild the archive, embed it in the code, and recompile the executable. You can keep this behavior in builds you distribute or restrict it to development builds. If you’re concerned about integrity, mount only the built-in archive and sign your executable for distributed builds; now your code and data are both trusted. I personally like people to play around with my apps and modify shaders to their heart’s content, so I always mount the current working directory. I also add a command-line option, --extract [path], that extracts the built-in archive to the specified path, so someone who only has the app can inspect and modify its files.

With only about a day’s work you can turn your app into a standalone file (remember to statically link the runtime libraries you use!) that improves your users’ experience whether they’re tech-inclined or not, and you get some nice bonuses like code-signing integrity checks while you’re at it. Installing and uninstalling apps become file copy and delete operations, and migrating apps from one computer to another becomes trivial. The same approach works for things other than assets: for example, you can mount a default configuration file from the in-app archive and only write a configuration file to disk if the user changes the app’s configuration. Stored preferences override the defaults, and the defaults never touch the disk if left unchanged.

This is UX low-hanging fruit, and there’s a lot more to be picked. The solution outlined here is not some insanely clever, convoluted thing that took months of research to invent; it’s merely a day spent composing existing systems to scratch a long-standing itch. And it’s a damn embarrassment to our entire industry that similar solutions, and the trains of thought that lead to them, aren’t more widespread. Engineers tend to be very myopic and unempathetic toward users, and we need to raise awareness of this and spend more cycles on improving the status quo. Computers have amazing potential for learning and creation, and when we as engineers make them easier to use, we expand the pool of people able to learn and create. Thinking that users should spend hours (each!) learning how to use the systems we build, instead of building better systems, is, at best, laziness. We can do better.

And we should.