TIMO ZIMMERMANN

balancing software engineering & infosec

Beyond self hosting

posted on July 16, 2019, 2:24 p.m.

As some of you might know, I am a big fan of hosting services on a local server if adequate software for the problem I want to solve is available and if there are no requirements to share data with the outside world (except for some dedicated VPN users). With some basic understanding of how to run servers or a Docker image this is a relatively pain-free process, until something goes wrong. Then you had better have a very good understanding of how to troubleshoot a whole stack, or things might look grim for your data.

There are some very good resources to get started, like awesome-selfhosted and r/selfhosted. But many of them fall short when it comes to troubleshooting. You likely end up in a GitHub issue tracker, maybe some IRC channel, or, best case, screaming into the Twitter void makes someone emerge from the abyss to help you out.

Why self host?

Now let us take a step back for a moment and think about why people are self hosting services.

  1. it is cheaper (as in you do not have a fixed fee and your time does not have a lot of value to you)
  2. you do not trust corporations
  3. you want full control over your data, because you do not trust corporations
  4. you have sensitive data which should stay on premise
  5. learn how to run a certain service

Those five seem to be the most common reasons why people want to self host services, and three of them are about control over your data and data ownership. It seems pretty rare that people want to host something unique for which no SaaS exists; usually they want a simple replacement.

But self hosting comes at a price. You have to keep a server running or rent one, keep it secure if you are exposing it to the Internet, back up your data - and obviously test recovery - and a lot more.

I think most readers of my blog would be comfortable hosting a feed reader or calendar service, but not all are actually willing to do so considering how cheap the hosted alternatives are and considering the time and risk self hosting brings to the table.

For the non-technical crowd things look pretty bad - they might be interested in having more control over their data, but if you start a sentence with „you just need a server“ they likely shut down mentally and hope you just stop talking. There were some projects that tried to sell a server including some hosted software in a box, but from what I can tell those companies never gained significant market share.

But do we really need to self host applications to keep control over our data?

Apps, apps, apps

I think one of the widest scale efforts to self host and find alternatives to a hosted service happened when Google shut down Google Reader. There were quite a few decent alternatives, some even compatible with the same feed readers people loved. I migrated to Feedly before getting a bit upset with it and trying various self hosted alternatives.

But the best alternative did not require any service at all. News Explorer. You buy it. It synchronizes across all your devices, even your Apple TV, through iCloud. No subscription. No third party service. No server. It simply runs on your device. Our phones and tablets are becoming more and more powerful each year, but we let them sit idle most of the time until they have to execute 10 MB of JavaScript to display Hello World for a web application.

Consider this: you want to achieve something, like reading news, organizing your bookmarks, managing todo lists or writing some notes. In many cases - sometimes when stock apps do not get the job done in a satisfactory way - you would download an application and create an account with a third party service. Why the third party service?

From a user's perspective it often helps with synchronizing data and web access. From a business perspective you get your customers' data, which is quite valuable besides the few $ for the subscription.

Synchronizing data

I do not want to downplay web access or synchronization. There might be people who have to use a web interface. Even if there are apps on all platforms, there will always be scenarios where you need access to your data but cannot install or use an application. Synchronization is another topic. If you are fine with only using Apple hardware you can get away with iCloud for synchronization. The moment you also want Windows, Linux or Android support, iCloud is likely not the best choice.

But does this mean applications need a third party service just for synchronization? Likely not. Remember WebDAV? SFTP? SCP? There are tons of ways to transfer data via a server that only handles storage. And they are quite cheap - 1TB for 7.90€ for example. You own a storage box that does not require any maintenance, which is pretty comfortable. If you prefer to avoid third parties even for storage you could take a look at something like Resilio.
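To make the "storage only" idea a bit more concrete: WebDAV is built on HTTP, so storing a file is a PUT and fetching it is a GET. The following is a minimal sketch using only the Python standard library. The URL, the user name and the use of Basic authentication are assumptions for illustration; an actual storage provider may expect a different path layout or authentication scheme.

```python
import base64
import urllib.request


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_upload(url: str, payload: bytes, user: str, password: str) -> urllib.request.Request:
    """Prepare a PUT request that stores `payload` at `url` (hypothetical WebDAV path)."""
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Authorization": basic_auth_header(user, password)},
    )


def build_download(url: str, user: str, password: str) -> urllib.request.Request:
    """Prepare a GET request that fetches the file stored at `url`."""
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": basic_auth_header(user, password)},
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> bytes:
    """Send a prepared request and return the response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

An application would call something like `send(build_upload("https://storage.example.com/notes/todo.json", data, user, password))` after a local change. The server only ever sees opaque files; everything that gives the data meaning stays in the app.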

From an application developer's perspective the problems to solve are mostly the same, no matter if they use iCloud or a standardized protocol to transfer data from a client to a server and back, so this should not be a driving argument for one method over another.
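One way to picture why the transport hardly matters: the sync logic can be written against a tiny read/write interface, and iCloud, WebDAV or SFTP would just be different implementations of it. This is a rough sketch, not how any particular app does it. The `Storage` interface, the `Note` type and the "last write wins" merge rule are my own assumptions for illustration, and the sketch ignores deletions and clock differences between devices.

```python
import json
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class Note:
    text: str
    modified: float  # seconds since epoch, set by the device that made the edit


class Storage(Protocol):
    """Hypothetical interface: any backend that can read and write one blob."""

    def read(self) -> bytes: ...
    def write(self, data: bytes) -> None: ...


class MemoryStorage:
    """In-memory stand-in for a real backend such as a WebDAV share."""

    def __init__(self) -> None:
        self._data = b"{}"

    def read(self) -> bytes:
        return self._data

    def write(self, data: bytes) -> None:
        self._data = data


def decode(raw: bytes) -> Dict[str, Note]:
    return {key: Note(**value) for key, value in json.loads(raw.decode("utf-8")).items()}


def encode(notes: Dict[str, Note]) -> bytes:
    plain = {key: {"text": n.text, "modified": n.modified} for key, n in notes.items()}
    return json.dumps(plain, sort_keys=True).encode("utf-8")


def merge(local: Dict[str, Note], remote: Dict[str, Note]) -> Dict[str, Note]:
    """Last write wins: for each key keep the most recently modified note."""
    merged = dict(remote)
    for key, note in local.items():
        if key not in merged or note.modified > merged[key].modified:
            merged[key] = note
    return merged


def sync(local: Dict[str, Note], storage: Storage) -> Dict[str, Note]:
    """Merge local notes with the stored copy and write the result back."""
    merged = merge(local, decode(storage.read()))
    storage.write(encode(merged))
    return merged
```

Two devices that call `sync` against the same storage end up with the same set of notes, regardless of which backend sits behind `Storage`. Conflict handling is the hard part either way, which is the point of the paragraph above.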

Web access on the other hand will be a bit harder, but depending on the protocol for storage this should be doable.

Sharing

One of the things that will likely change a bit is how content will be shared between people. With a traditional web service and client application, sharing becomes pretty easy and guarantees that data can be consumed in the exact way the developers and designers imagine.

But let us assume all data is only on your devices and you want to share it with someone who does not have the applications. Asking them to install an app to receive a URL does not make a lot of sense.

I think for the majority of content being shared there is a good chance that sharing it via iMessage, WhatsApp or Email is actually sufficient and the receiving end has a system capable of displaying the data.

The moment we leave the realm of plain text, photos, videos and URLs things will likely become a lot harder. This could also be the point where data is so specialized that it is reasonable to ask all parties to have the application installed. Working with someone on an Excel spreadsheet usually means all parties need access to Excel, at least if there is some complexity in the tables and formulas.

Being able to collaborate on a file might actually require some storage and sharing solution with a slightly more fine-grained access model which is still reasonably easy to use. I think Resilio is actually doing a pretty good job with this part.

Show me the money

Moving to an app only model means revenue has to be generated a bit differently than having a free app and a paid online component. This might be one of the biggest hurdles.

There are already some very prominent applications with no server component or with a free one - Things, Ulysses and Bear are some that come to my mind, and they are usually pretty well received. I think this is something that works well for Apple users.

Windows, Linux and Android are usually the platforms for which I hear devs complain about the lack of willingness to spend money on software.

Somewhere in my head it makes sense to weigh the app price against the price you would be paying for a service with a client application, but I am fairly sure there is some research or statistic showing that people do not think this way.

As with many things the price for an application, the perceived value and how customers learn why and how pricing changed are factors which are hard to predict, but we should be able to get an idea from the apps mentioned above.

So... now what?

We have insanely powerful computers in our pockets and on our wrists, and even more powerful ones on our desks and in our backpacks. Still, a lot of processing and storage is done on third party servers.

Historically this makes a lot of sense. Clients were far less powerful than they are today. Storage and redundancy were a lot more expensive and harder to accomplish.

If you are building an application or planning one, just consider the option of not requiring a centralized service. Devices are powerful enough. People can choose where to store data and whom to trust. No one has to rely on you and trust you to keep the service online and secure, and your apps will not stop working if you ever (decide to) go out of business or Twitter buys your company.

It depends on your audience - for an Apple only ecosystem this is surely viable. If engineers are your target audience it will most likely be well received. Businesses having the resources to roll some form of storage for their employees are potential customers who might like „full data control“ as a selling point. For non-tech-savvy customers with multiple operating systems this will likely not be a feasible option right now.