A patch for the Github centralization dilemma


Github, with its 75,000,000 repositories, has become a central place for open source development and is well known for having popularized Git among programmers and other code-hungry fellas. The irony is not lost on anyone that we are again relying on a centralized service for our decentralized Git workflow. And with any centralization comes the risk of putting too much power in the hands of just a few.

Of course, a central service such as Github has its benefits. We all know where to search for code. We all more or less know how the service works, so we can jump quickly from one project to another. Third parties can even build upon this resource and push things in new directions, maybe attracting early adopters faster.

But… Centralized services can turn against you. They can censor and be censored. They can also disappear. Maybe Github will not disappear soon, but a user on Github could decide to delete all their repositories and there would not be much you could do about it. You don’t think that has happened? Check RGBDToolkit or Gravit, for example. (You’ll have to put those names in your preferred search engine to verify that I’m not bullshitting you and that these projects did exist on Github at some point.)

So, in order to restore balance in the force, I’ve decided to adopt a few habits that I want to share with you. They are not going to solve the centralization problem. But they can maybe provide some safeguards against the major risk described in the previous paragraph. These tricks apply to projects you have not created. For your own projects, it’s up to you to decide where you want to host them.

The solution I’m using is based on the mirror feature from Gitlab. Gitlab is an open source alternative to Github. It provides much of the same functionality, but you can install it on your own server. And many groups are running public instances across the web. Gitlab.com itself, as a company, develops the software and offers hosting of public and private repositories at the same address.

So now, every time I find a nice open source project on Github, and especially one with few stars, forks or developers, I create a mirror of it in a public Gitlab repository. The advantage over a plain git clone on my machine or elsewhere is that I’m not just creating a copy of the project at a certain point in time. The mirror feature keeps watching the original project and pulls all the changes that happen after I created the mirror. So I’m confident that whatever happens to the original repository, all the history and changes will be saved elsewhere.
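If you don’t have a Gitlab instance at hand, you can get a rough equivalent with plain Git. This is only a sketch of the same idea, not how Gitlab’s mirror feature works internally: it relies on `git clone --mirror` for the first copy and `git remote update` for later refreshes, and you would have to schedule it yourself (cron, for example). The function names, the example URL and the `mirrors/` folder are placeholders I made up for illustration.

```python
import subprocess
from pathlib import Path


def mirror_commands(source_url: str, dest: Path) -> list[list[str]]:
    """Return the git commands needed to create or refresh a local mirror."""
    if dest.exists():
        # Existing mirror: fetch new commits and refs from the original.
        # Without --prune, refs deleted upstream are kept in the backup.
        return [["git", "-C", str(dest), "remote", "update"]]
    # First run: a bare clone that copies all branches, tags and other refs.
    return [["git", "clone", "--mirror", source_url, str(dest)]]


def sync_mirror(source_url: str, dest: Path) -> None:
    """Run the commands; raises CalledProcessError if git fails."""
    for cmd in mirror_commands(source_url, dest):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical repository and destination, replace with your own.
    sync_mirror(
        "https://github.com/example-user/example-project.git",
        Path("mirrors/example-project.git"),
    )
```

If the original repository vanishes, the refresh step simply fails and the last mirrored state stays on disk. Unlike a public Gitlab mirror, though, this copy is only visible to you until you push it somewhere.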

Because those repositories are just backups, I also disable issue tracking, wikis and any other unnecessary feature that could mislead visitors. The point is not to divert development. Each one also clearly states that it is a mirror and links back to the canonical repository.

So next time, instead of starring a project you like, mirror it. You’ll do everyone a favor. The ones I keep are here. But feel free to choose any other hosting service. Let’s keep things distributed.


Also published on Medium.
