versionControl

A version control system is useful when working on a collection of files that evolves over time, in particular if several people are modifying the files. For example, version control is often used in software projects to store the code in a way that several developers can access and contribute to the code without the need to send files by e-mail etc. Another example is when you are writing a paper with your colleagues and you need to efficiently and safely share your additions to the paper. A version control system also keeps track of the history of your contributions; most version control systems can be used to retrieve the state of your files as of a given date. This means that there is no need to create local back-up copies. To make it short and sweet: version control means you can relax.
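
As an illustration of retrieving an earlier state, here is a minimal sketch using Git. It builds a throwaway repository, so the file name paper.txt, the dates and the identity are made up for the example and are not part of any real setup.

```shell
# Build a throwaway repository with two commits at known dates.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "first draft" > paper.txt
git add paper.txt
GIT_AUTHOR_DATE="2020-01-10T12:00:00" GIT_COMMITTER_DATE="2020-01-10T12:00:00" \
    git commit -q -m "First draft"

echo "second draft" > paper.txt
GIT_AUTHOR_DATE="2020-02-10T12:00:00" GIT_COMMITTER_DATE="2020-02-10T12:00:00" \
    git commit -q -a -m "Second draft"

# Find the last commit made before 1 February 2020 ...
old=$(git rev-list -1 --before="2020-02-01" HEAD)

# ... and print paper.txt as it was in that commit.
old_text=$(git show "$old:paper.txt")
echo "$old_text"
```

The working copy is left untouched; git show only prints the old content, so it can be redirected to a new file if the old version is needed again.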

Git

Git is currently the preferred version control tool at the department. For starters, you might want to look at a Git tutorial I wrote some time ago (git-tutorial.pdf). There are also heaps of tutorials on the net. A quite comprehensive manual is Git Magic; another pretty good one is Git for Scientists. It is also useful to keep a cheat sheet beside the keyboard for those quick command-and-options lookups.

Talk to Anders Nilsson in order to have a repository for your files set up, or if you want to know more about version control.

Setting up GIT

git config --global alias.co 'commit -a --allow-empty-message -m ""'
git config --global alias.com 'commit -a --allow-empty-message -m'

With these aliases, git co commits all changes to tracked files with an empty message, and git com "message" commits them with the given message.

You may get warnings like the following

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LC_CTYPE = "UTF-8",
        LANG = (unset)
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").

when communicating with the server. An easy fix is to avoid sending your locale settings to the server over SSH, by commenting out the line SendEnv LANG LC_* in /etc/ssh_config (/etc/ssh/ssh_config on many systems) with a leading #.
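
The edit can be sketched as follows. This example works on a scratch copy with made-up content rather than the real configuration file, so it is safe to run as is. Applying the same change to the system file requires administrator rights.

```shell
# Work on a throwaway copy so nothing system-wide is touched.
tmpdir=$(mktemp -d)
cfg="$tmpdir/ssh_config"

# Made-up sample content resembling a default client configuration.
printf 'Host *\n    SendEnv LANG LC_*\n' > "$cfg"

# Prefix the SendEnv line with '# ', keeping its indentation.
sed 's/^\([[:space:]]*\)\(SendEnv LANG LC_\*\)/\1# \2/' "$cfg" > "$cfg.new"

cat "$cfg.new"
```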

Deep-remove files in Git

Warning: Do not use these instructions for ordinary file removal, and do not use them if you don't know what you are doing.

Sometimes large files or files with sensitive content (like passwords) accidentally get added to a Git repo. Removing the file with git rm is then not enough, since the content of the file is still saved in Git's history. There are several ways to remove the traces of the file from the history as well. The easiest one is probably to use the BFG Repo-Cleaner, https://rtyley.github.io/bfg-repo-cleaner/.
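
To see why git rm alone is not enough, consider this small sketch in a throwaway repository. The file name secret.txt and its content are made up for the example.

```shell
# Throwaway repository with a file added by mistake and then removed.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "hunter2" > secret.txt                # made-up sensitive content
git add secret.txt
git commit -q -m "Add secret by mistake"

git rm -q secret.txt
git commit -q -m "Remove secret"

# The file is gone from the working tree ...
ls

# ... but the previous commit still holds its content.
recovered=$(git show HEAD~1:secret.txt)
echo "$recovered"
```

Anyone with a clone can recover the file this way, which is why the history itself has to be rewritten.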

Start with mirroring the git-repo

git clone --mirror git@gitlab.control.lth.se:my-dirty-repo

Then use the BFG Repo-Cleaner to remove the file you want

java -jar bfg-1.12.13.jar --delete-files '*.iso' my-dirty-repo.git

The command above will remove all the .iso files from the whole repo (all branches included). Note that by default the BFG leaves the contents of the latest commit untouched, so files that are still present there should first be removed with an ordinary commit. Warning: Again, think twice before doing this. For instance, removing all .pdf files from a course repo is probably not a good idea, since some of them may not originate from a tex file. There is no easy way to get a file back afterwards (if you are lucky, Anders might be able to find it on a backup).
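
Before deleting, it can help to list which paths in the history would be affected. The helper below is a sketch; list_history_paths is a hypothetical name, and the pattern is an ordinary extended regular expression applied to the path, not the glob syntax the BFG uses.

```shell
# Print every path that was ever added on any branch and matches a regex.
# Usage: list_history_paths <repo-dir> <regex>   (hypothetical helper)
list_history_paths() {
    git -C "$1" log --all --diff-filter=A --name-only --pretty=format: \
        | grep -E -- "$2" | sort -u
}
```

For the mirrored repo above, list_history_paths my-dirty-repo.git '\.iso$' would print the .iso paths found in the history, including files that have since been deleted.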

Once this is done, run

cd my-dirty-repo.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
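
Before pushing, it may be worth checking that the cleanup did what was intended. The following is a sketch; history_touches is a hypothetical helper name, and the pathspec is whatever pattern was given to the BFG.

```shell
# Succeeds if any commit on any ref still touches a path matching the pathspec.
# Usage: history_touches <repo-dir> <pathspec>   (hypothetical helper)
history_touches() {
    [ -n "$(git -C "$1" log --all --oneline -- "$2")" ]
}
```

Inside my-dirty-repo.git, history_touches . '*.iso' && echo "still in history" should print nothing after a successful cleanup. Running git count-objects -vH before and after shows how much space the repository takes.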

Then (if you have the right permissions to the project in GitLab) it should be enough to run

git push

However, probably due to some issue with GitLab, I was not allowed to push even though I had master permissions to the project. The solution was to remove the protection on the master branch in GitLab, push, and then add the protection again.

After this is done, it can be a good idea to request housekeeping in GitLab. The housekeeping appears to run git gc without the --prune=now flag, so unreachable objects are only pruned once they are older than git gc's default two-week grace period. In practice, only housekeeping requested at least two weeks after the removal will actually delete the file from the server's storage. The file is removed from the repo and its history immediately, though, so a fresh clone made afterwards has less data to download.

Gitlab CI

In GitLab there are tools to run jobs every time you push to the repository. You can also put restrictions on when to run the jobs, for example only running them if files in a specific folder have changed. This can be used for many different things, for example running automated tests or automatically uploading any changed files to a homepage, as in canvas sync.
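
A job restricted to changes in one folder could look roughly like the following .gitlab-ci.yml fragment. This is a sketch only: the job name run-tests, the script run_tests.sh and the folder docs/ are hypothetical, and the syntax may need adjusting to the GitLab version in use.

```yaml
# .gitlab-ci.yml (sketch; job name, script and folder are hypothetical)
run-tests:
  script:
    - ./run_tests.sh
  rules:
    # Run this job only when a file under docs/ changed in the push.
    - changes:
        - docs/**/*
```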

General advice

versionControl (last edited 2020-02-13 13:03:00 by albheim)