
I have a git server and am planning to back it up for a disaster situation. I have googled it and found the following ways to do so:

  1. git clone the repos and push them to a new server whenever required; the clones can also be used locally.
  2. Create a git bundle (see the sketch after this list).
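
Here is a minimal sketch of both options, assuming a hypothetical server URL and backup paths (adjust to your own setup):

```bash
# Option 1: keep a bare mirror that can be pushed to a new server later
git clone --mirror ssh://git@yourserver/path/to/repo.git repo-backup.git
cd repo-backup.git && git remote update       # refresh the mirror on later runs

# Option 2: pack every ref of the repository into a single bundle file
git bundle create /backups/repo.bundle --all
git bundle verify /backups/repo.bundle        # sanity-check the bundle
git clone /backups/repo.bundle restored-repo  # a bundle can be cloned like a remote
```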

I have a friend who argues that we should instead copy the whole file structure of the git server's storage location (the complete directory tree of the git server application) and, whenever required, configure it back on a server.

I would like your help: how do I proceed, and which way is best?

Sachin Chandil
  • You could also have a RAID1 filesystem, or a distributed filesystem on multiple computers with each node having a RAID1 sub-filesystem. – jibe Aug 08 '16 at 08:32
  • That looks too complicated; can you elaborate more? – Sachin Chandil Aug 08 '16 at 08:37
  • You need to decide first what kind of backups you care about. You mentioned, specifically, disaster recovery, which generally includes concerns about fires, earthquakes, etc., which may affect any and all materials on-site; and also concerns about man-made disasters such as accidental or deliberate data damage, i.e., the hardware is fine but an outside attacker or berserk employee is wrecking things. You can then proceed from here. Decide whether you only care about the clone-able repository data, or whether you are concerned about ancillary data as well: logs, hooks, etc. – torek Aug 08 '16 at 08:42
  • A RAID1 filesystem protects against a disk drive failure. A distributed filesystem protects against whole-computer failure, and if the nodes are in different locations it also protects against other risks like fire or earthquakes. I just want to show you that backups could be provided by the system, and that this can have some advantages over just pushing repos or copying bundles to another server. But it all depends on your needs. – jibe Aug 08 '16 at 08:49
  • @torek I am concerned about man-made disasters, and I care only about the clone-able data, not hook files etc. But I would love to know about fire and earthquake disasters as well. – Sachin Chandil Aug 08 '16 at 08:51
  • @jibe I think you are talking about a very low-level backup strategy. I am concerned about backing up repos in a way that helps me when my git server is somehow down (someone stole the git server, it burned in a fire, got wrecked, etc.). – Sachin Chandil Aug 08 '16 at 08:54
  • I disagree. It is not that low-level, and it is more human-proof, because with a system-level backup you do not have to worry about "did I forget to sync my last commits?" or "does my backup server cover that new repo?". After the installation, you have less to care about. And better, the system could continue to answer your users transparently under the same domain name even if one of the disasters you listed happens. – jibe Aug 08 '16 at 09:24
  • @jibe Sorry for that. I do not know much about disk-level redundancy. Would you like to share something with me that can help me start with the process? I read about RAID during my graduation but don't know exactly how to implement it. – Sachin Chandil Aug 08 '16 at 09:29
  • @jibe A live RAID1 cluster is not a guarantee of protection against corruption or deletion, though. A regular backup of the whole file structure to some offline storage is still advised. – JBert Aug 08 '16 at 09:32
  • @JBert: you are right. It is always wise to do regular "state" backup in addition. – jibe Aug 08 '16 at 09:58
  • The simplest way to make a RAID1 is to have 2 disks with the same capacity and to create the array within the BIOS. That is what is called "software RAID", which is less reliable but cheaper than "hardware RAID". To protect against your office becoming ashes, you have to do the same on at least one other computer hosted elsewhere and then combine them with a distributed filesystem, on which you finally install the operating system. There are plenty of options; choosing depends on your specific needs and on your cash (one possible software-RAID setup is sketched just after this comment thread). – jibe Aug 08 '16 at 09:58
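
(Aside, not from the comments above: on Linux, one common alternative to BIOS-level RAID is software RAID built with mdadm. A rough sketch, with made-up device names and mount point:)

```bash
# Build a two-disk software RAID1 array (device names are examples)
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
sudo mkfs.ext4 /dev/md0                 # put a filesystem on the mirror
sudo mount /dev/md0 /var/git_repos      # mount it where the repositories live
cat /proc/mdstat                        # check the array status
```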

2 Answers


If you are coming from a sysadmin point of view, what your friend suggests is best. Only a filesystem backup including all permissions etc. will make sure that recovering after an outage is 100% possible, quick, and can be done easily by your regular sysadmin team, whose job it is to worry about backup & recovery in the first place.

(Also make sure to disable incoming pushes while the backup is running, of course, as usual.)

Yes, you could also simply git clone, but that won't get the whole environment; i.e., if you are using ssh authentication (~/.ssh/authorized_keys), you need to get that too (even more so if you are using other git servers, like GitLab or whatever). Git hooks are not automatically fetched by a git clone either, and there may be more.

EDIT: the file system backup should of course contain anything that is remotely (sic) related to git, i.e., the complete /home/git (or whichever user git is running under), /var/git_repos (or wherever you are keeping your repositories) and so on. Enough so that if you set up a cleanly installed machine, you can get everything back up and running just by recovering those directories.
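
As a rough illustration (not part of the original answer; the paths are the examples from above and backupserver is a placeholder), such a filesystem-level backup could look like this:

```bash
# Mirror the relevant directories to another machine, preserving permissions,
# hard links, ACLs and extended attributes
rsync -aHAX --delete /home/git/      backupserver:/backups/git-home/
rsync -aHAX --delete /var/git_repos/ backupserver:/backups/git-repos/

# Or pack everything into a dated tarball for offline/offsite storage
tar -cpzf /backups/git-server-$(date +%F).tar.gz /home/git /var/git_repos
```

Remember to disable incoming pushes while either of these runs, as noted above.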

AnoE

git is a distributed version control system.

It means that each time someone clones the repository, they get the full history with it.

Clone and pull your git repositories often and you won't have any problem restoring in case of a crash.
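
As a minimal sketch of that approach (path and schedule are placeholders), assuming a bare mirror was created with git clone --mirror as in the question's first option:

```bash
# Example crontab entry: refresh the backup mirror every night at 02:00,
# fetching new commits, branches and tags and pruning deleted refs
0 2 * * * cd /backups/repo.git && git remote update --prune
```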

blue112
  • You are right, but in an organisation there must be a strategy to deal with any kind of situation. I raised the same argument with my lead, but he sticks to the same question, which is unfortunately a valid point. – Sachin Chandil Aug 08 '16 at 09:24
  • This is not 100% correct: a distributed version control system means that anyone has *his own* clone, but it does not guarantee that everyone has an *identical* state. It is important to backup the server's `refs` on a regular basis in case someone / something manages to corrupt it. – JBert Aug 08 '16 at 09:30
  • @JBert What would you suggest for the backup here: backing up individual repos, or the git server's directory structure and files (with hooks etc.)? My goal is to have a backup even when all computers in the organization are down (burned, wrecked, etc.), for example. (I know the example is sick, as I am still laughing about it :D, but it's a valid point I guess.) – Sachin Chandil Aug 08 '16 at 09:41
  • @chandil03: protecting against everything is impossible. If all your computers and backups are burned or wrecked, you are just doomed. The aim of live and offline backups is to reduce the probability: the more you add, the more it costs and the lower the risk. – jibe Aug 08 '16 at 09:50