I realised that I would benefit from using version control after a particularly messy period at work where I'd made some mistakes along the way. Git seemed the obvious tool: it is popular, it integrates with GitHub, GitLab and Bitbucket, which makes sharing work with others easy, and you can even host your own Git repositories using Gitea.
Little did I know the sort of hell I would be letting myself in for as I learnt its semantics and idiosyncrasies. This page records my learning adventure and serves as a reference for myself that just might be useful to others.
Gitting Started
The first thing to do is set up Git with your name and email address and a few other common default settings.
- snippet.bash
git config --global user.name 'My Username'
git config --global user.email 'my@email.org'
git config --global push.default upstream # Syncs pull/push to the same branch (https://stackoverflow.com/a/42642628/1444043)
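Note that git config writes to the current repository's .git/config unless --global is given. A minimal sketch of setting and reading back values, run in a throw-away repository so your real settings are not touched (the name and email are placeholders):

```bash
# Scratch repository; nothing outside this temporary directory is modified
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# No --global here, so these land in this repository's .git/config only
git config user.name 'My Username'
git config user.email 'my@email.org'

# Read the values back to confirm what Git will use for commits made here
name=$(git config --get user.name)
email=$(git config --get user.email)
echo "$name <$email>"
```

Running git config --list --show-origin in any repository shows which file each setting comes from, which is handy when a global and a local value disagree.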
Configuration
Git Bash/Zsh completion
Git (2.37.3 and later at least, perhaps earlier versions too) ships with a file that provides Git completion. Under Gentoo this is installed at /usr/share/bash-completion/completions/git
and it includes instructions on how to install and use it.
Git under Emacs
I have found that using Magit under Emacs has been really useful in helping to learn the concepts and principles around using Git. For more on this along with common reference commands see my Magit page.
GitHub/GitLab/BitBucket/Codeberg
You've been meticulous in saving changes to your files locally but now you want to share your code with the world (or perhaps just your colleague at the desk opposite). This is where the sites like GitHub, GitLab, Codeberg and Bitbucket step in as they provide free (and paid-for) services for hosting projects from where people can collaborate on them.
Starting a Repository
Obviously you will need an account on GitHub/GitLab/Codeberg/Bitbucket, but how do you get your local version-controlled files for a given project onto them? Go to your account on one of these and create a new repository; in this example it's called new-project
. You are then shown a page with three different methods of starting/adding files to the repository. Note that the URL for SSH needs tweaking if you are using multiple GitHub accounts and have configured SSH as described below. If you've created a second account called work
and configured SSH to recognise the URL work.github.com
then you can push your existing repository with…
- snippet.bash
git remote add origin git@work.github.com:work/new-project.git
git push -u origin master
Your files should now appear in your repository.
Committing
Commits are straightforward: you save your file, stage it with git add, and then within the Git repository run git commit -m "A meaningful message about the changes you are committing"
.
- snippet.bash
git commit -m "A meaningful message about the changes being committed"
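To see the whole save, stage, commit cycle in one place, here is a minimal sketch in a throw-away repository. The file name, identity and message are made up for illustration.

```bash
# Scratch repository so the example is self-contained
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.org'

# Save a file, stage it, then commit it with a meaningful message
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add first draft of notes"

# Count the commits reachable from HEAD
count=$(git rev-list --count HEAD)
echo "commits so far: $count"
```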
Log
You can view the history of your commits with git log
which prints out detailed information about each commit. Sometimes you want a shorter view, in which case the --decorate --oneline
flags can be used (under zsh with oh-my-zsh's git plugin there is an alias for this, glo
).
- snippet.bash
a8c4277 (HEAD -> master, origin/master) Added rsync to zshrc plugins
6acb4e0 Added submodule for solarized
076645e Adding dotbot conf
e5a6965 Adding rhapsody and podget
0ef688f Adding gimp
8191fc7 Adding oh-my-zsh as a submodule
4cd397b Adding zsh
460d42e Adding bash podget and tmux
lines 1-8/8 (END)
Ignoring Important Files
You can tell Git to automatically ignore certain files by adding file globs to the file .gitignore
in the root of your repository's directory (i.e. the highest level). This is useful as you can exclude temporary files that your text editor might create (e.g. Emacs leaves behind *~
files), and if you're working with patient data then it's quite likely that this shouldn't be shared in a public repository such as GitHub. The github/gitignore repository has a number of skeleton/example files for different languages including R. I modify the R one to include all *.RData
files, which excludes any and all R data objects, and also Emacs temporary files *~
. There is also gitignore.io which generates configs automatically.
- snippet.bash
# History files
.Rhistory
.Rapp.history
# All Data files
*.RData
# Example code in package build process
*-Ex.R
# RStudio files
.Rproj.user/
# Emacs tmp files
*~
# produced vignettes
vignettes/*.html
vignettes/*.pdf
# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
.httr-oauth
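If you are unsure whether a pattern does what you expect, git check-ignore will tell you. A small sketch in a scratch repository, using two of the patterns above and some made-up file names:

```bash
# Scratch repository with a two-line .gitignore
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
printf '%s\n' '*.RData' '*~' > .gitignore

# Hypothetical files: two that should be ignored, one that should not
touch results.RData 'notes.txt~' analysis.R

# Helper (illustrative only): report whether Git would ignore a path
status_of() {
    if git check-ignore -q "$1"; then echo "ignored"; else echo "kept"; fi
}

for f in results.RData 'notes.txt~' analysis.R; do
    echo "$f: $(status_of "$f")"
done
```

Adding -v to git check-ignore also reports which file and pattern caused a path to be ignored.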
Moving Files
You might want to rename a file and get rid of the original name, so how do you do this in Git? The solution is to git mv
your file just as you would mv a normal file.
- snippet.bash
git mv file1.txt file2.txt
git commit -m "Renaming file1.txt > file2.txt"
git push
Deleting Files
To remove a file completely from a repository and delete it locally use git rm
- snippet.bash
git rm file1.txt
git commit -m "remove file1.txt"
git push -u origin master
Removing Files
Sometimes you will want to remove a file from a Git repository but not delete it from disk; to do this use the rm --cached
option
- snippet.bash
# Remove a file
git rm --cached file.txt
# Remove a directory
git rm --cached -r directory
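The difference from a plain git rm is easiest to see by trying it. A minimal sketch in a scratch repository (the file name and identity are placeholders):

```bash
# Scratch repository with one committed file
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.org'
echo "local settings" > file.txt
git add file.txt
git commit -q -m "Add file.txt"

# Stop tracking the file but leave it on disk
git rm -q --cached file.txt
git commit -q -m "Stop tracking file.txt"

# Empty when Git no longer tracks the path
tracked=$(git ls-files file.txt)
if [ -f file.txt ]; then on_disk=yes; else on_disk=no; fi
echo "tracked: '${tracked}' on disk: ${on_disk}"
```

You would normally add the file to .gitignore afterwards so that it isn't accidentally re-added.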
Undoing Unstaged Changes
If you decide you don't want to keep any changes made to a working branch before you git add/commit
them then you can simply
- snippet.bash
git checkout -- file1.txt file2.txt
More destructively you can reset the HEAD
using the following which will remove staged and unstaged changes, use cautiously!
- snippet.bash
git reset --hard
These were from devtonight.
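A quick sketch of the first, gentler form in a scratch repository shows the unwanted edit disappearing (the file contents and identity are made up):

```bash
# Scratch repository with one committed file
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.org'
echo "original" > file1.txt
git add file1.txt
git commit -q -m "Add file1.txt"

# Make an edit we then decide we do not want
echo "unwanted edit" >> file1.txt

# Discard the unstaged change, restoring the committed version
git checkout -- file1.txt
content=$(cat file1.txt)
echo "$content"
```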
Cherry Picking
Pull across a single commit or a range of commits
- snippet.bash
git checkout -b new-branch
git cherry-pick 09ae98a
# Pull a range of commits inclusive of both stated
git cherry-pick asdf90a^..g90agu
# Pull a range of commits starting _after_ first stated
git cherry-pick asdf90a..g90agu
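To make this concrete, here is a sketch that picks one commit out of two from a feature branch. It runs in a scratch repository, the branch and file names are invented, and the hash is captured in a variable rather than typed by hand.

```bash
# Scratch repository with a base commit
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.org'
echo base > base.txt
git add base.txt
git commit -q -m "Base commit"
main_branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# Two commits on a feature branch; we only want the first
git checkout -q -b feature
echo one > one.txt
git add one.txt
git commit -q -m "Add one"
wanted=$(git rev-parse HEAD)
echo two > two.txt
git add two.txt
git commit -q -m "Add two"

# Back on the original branch, pull across just the wanted commit
git checkout -q "$main_branch"
git cherry-pick "$wanted" > /dev/null
ls
```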
Amending and Rebasing
Sometimes you make a mistake and need to amend your commit message (Git Book : Rewriting History). If it's the most recent commit then you can modify its message with git commit --amend
. If the changes are further back in the commit history then you need to perform what is known as a rebase. You can view the log of commits using git log
and in doing so you might decide that the last five commits need modifying, which you can do using an interactive rebase
- snippet.bash
git rebase -i HEAD~5
…which will start your default text editor with a list of the last five commits. Edit the keyword at the start of each line to indicate the action you wish to take on that particular commit. This will typically be reword
. Save and exit, and the first commit you wish to edit will then be opened; on saving and exiting that, the second one will be opened, and so forth.
A good article on rebasing is here. Rebasing can, and arguably should, also be used to clarify your intent when making changes. This process is described well in this article Write Better Commits, Build Better Projects | The GitHub Blog
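The simple amend case can be sketched as follows in a scratch repository. The deliberate typo and the identity are made up, and passing -m to --amend just avoids opening an editor.

```bash
# Scratch repository with a single commit whose message contains a typo
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.org'
echo "# Project" > README.md
git add README.md
git commit -q -m "Add READNE"

# Replace the most recent commit with one carrying the corrected message
git commit -q --amend -m "Add README"

subject=$(git log -1 --format=%s)
count=$(git rev-list --count HEAD)
echo "$subject ($count commit)"
```

Amending replaces the commit with a new one, so there is still only one commit in the history, but its hash has changed. That is why amending or rebasing commits you have already pushed needs care.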
Rebasing Branches
As work proceeds origin/master
(the master branch on GitHub/GitLab/Bitbucket) will get updated by others. When you want to start some work you should do so on the most recent version that incorporates others' changes. To ensure your branch is up-to-date you should rebase your branch onto a freshly fetched master.
- snippet.bash
git fetch origin # Update origin/master with the latest changes from the remote
git checkout my_branch # Switch to your branch
git rebase origin/master # Replay your commits on top of origin/master
To get an individual file from another branch you can…
- snippet.bash
git checkout <branch-where-the-file-is> -- path/to/file/in/the/other/branch
Restoring individual files
Sometimes you may want to revert changes in an individual file, if they haven't yet been committed then that is easy with either of the following which are equivalent…
- snippet.bash
git checkout -- <filename>
git restore <filename>
If you have already committed a change then you need to use git revert
to undo commits.
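A sketch of git revert in a scratch repository (the file and messages are made up). It adds a new commit that undoes the earlier one rather than deleting history.

```bash
# Scratch repository with a good commit followed by a bad one
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.org'
echo "good" > settings.txt
git add settings.txt
git commit -q -m "Add settings"
echo "bad" > settings.txt
git commit -q -a -m "Break settings"

# Create a new commit that reverses the most recent one (--no-edit keeps the default message)
git revert --no-edit HEAD > /dev/null

content=$(cat settings.txt)
count=$(git rev-list --count HEAD)
echo "content: $content, commits: $count"
```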
Forcing pulls
Sometimes you want to manually force a pull, and whilst it might be tempting to use git pull -f
this is not the best approach, rather you should use fetch and reset.
- snippet.bash
git fetch origin master
git reset --hard FETCH_HEAD
git clean -df
Branches
The beauty of Git is that it allows multiple people to work on the same software project without interfering with each other's work. This is done through branching and merging (https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging). On GitHub/GitLab/Bitbucket you will find the option to make a branch of a repository, but it's generally cleaner to make branches on your local machine and then have them pushed and updated to your remote (`origin`). Create a branch and switch to it in one step using...
- snippet.bash
git checkout -b new_branch
You can see what branches there are now locally and which you are currently on using…
- snippet.bash
git branch
* new_branch
  master
How does the remote repository know/become aware of this new branch? You can push and update all branches on the remote origin
with…
- snippet.bash
git push --all -u
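You can try this without touching a real hosting service by using a bare repository on disk as the remote. In this sketch the paths, identity and file are all invented, and the bare repository merely stands in for GitHub/GitLab.

```bash
# A bare repository plays the part of the remote, next to a normal working repository
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.org'
git remote add origin "$work/remote.git"

# One commit on the default branch, then a new branch
echo hello > README
git add README
git commit -q -m "Initial commit"
git checkout -q -b new_branch

# Push every local branch and record origin as its upstream
git push -q --all -u origin

# Branches the remote now knows about, and what new_branch tracks
git ls-remote --heads origin
upstream=$(git rev-parse --abbrev-ref 'new_branch@{upstream}')
echo "new_branch tracks $upstream"
```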
Branching from a specified branch
Sometimes you may be working on a problem with others simultaneously and wish to develop your work together before merging to master. In such an instance you could create a development
branch and push your work to this to ensure the changes you and your colleagues make are consistent and work before you merge that to master.
- snippet.bash
git branch --list
* development
  master
Here you want to branch from development
rather than master
and so you…
- snippet.bash
git checkout -b my_new_branch development
Move Most Recent Commit to a New Branch
From here (https://stackoverflow.com/a/1628584/1444043)
- snippet.bash
git branch a_new_branch # Creates a new branch from the current one, including the last commit
git reset --hard ad1290ai # Resets the current branch to the commit before the last one (substitute its hash)
git checkout a_new_branch # Switches to the new branch, which still includes the last commit
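The same move can be sketched in a scratch repository. Here HEAD~1 stands in for the hash of the commit before the last one, and the branch name, files and identity are placeholders.

```bash
# Scratch repository with two commits; the second was made on the wrong branch
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.org'
echo one > one.txt
git add one.txt
git commit -q -m "First commit"
echo two > two.txt
git add two.txt
git commit -q -m "Commit that belongs elsewhere"
original_branch=$(git symbolic-ref --short HEAD)

# Keep the commit on a new branch, then drop it from the current one
git branch a_new_branch
git reset -q --hard HEAD~1

on_original=$(git rev-list --count "$original_branch")
on_new=$(git rev-list --count a_new_branch)
echo "$original_branch: $on_original commit(s), a_new_branch: $on_new commit(s)"
```

Because git reset --hard also discards uncommitted changes, make sure the working tree is clean before trying this on real work.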
Tidying up Merged Branches
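There are lots of articles out there on Git housekeeping; one simple thing to do is use…
- snippet.bash
git fetch -p
…which, on fetching, prunes remote-tracking branches whose branch no longer exists on the remote (typically because it was deleted after being merged).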
Git Stash
Sometimes if you've work in progress (WIP) you may end up stashing your changes when you git pull
.
Listing stashes
- snippet.bash
git stash list
Clearing stashes
It's relatively straightforward to clear them.
- snippet.bash
git stash clear
To remove a specific stash you need to find its position in the list of stashes (see above for listing) and then drop it…
- snippet.bash
git stash drop stash@{3}
From How to delete a stash in Git. If you're using magit then you can access stashes and subsequent actions using z
(and follow the commands in the transient buffer).
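A short sketch of the stash life cycle in a scratch repository: stash some work in progress, list it, then drop it. The file, stash message and identity are made up.

```bash
# Scratch repository with one committed file
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.org'
echo "stable" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Work in progress that we want out of the way for now
echo "work in progress" >> notes.txt
git stash push -q -m "WIP on notes"

# One stash now exists and the working tree is back to the committed state
git stash list
stashed=$(git stash list | wc -l | tr -d ' ')

# Drop that specific stash by its position in the list
git stash drop -q 'stash@{0}'
remaining=$(git stash list | wc -l | tr -d ' ')
echo "stashes before drop: $stashed, after: $remaining"
```

If you want the changes back rather than discarded, git stash pop reapplies the most recent stash and removes it from the list.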
Git Hooks
Git hooks are scripts that reside in the .git/hooks
directory and are run automatically on various events. These may be, for example, a commit-msg hook to check the spelling of the commit message, or a post-commit hook to automatically push to upstream.
I have my org and `~/.config/emacs/` files under Git version control and mirrored across my home server, laptop, work laptop, VPS, and Raspberry Pis, and found I was spending a lot of time pushing and pulling and sometimes resolving conflicts. So I decided to enable pre-commit and post-commit hooks across all machines so that pulls are made before a commit, to avoid/highlight any conflicts and ensure things are up-to-date, and pushes are made after a commit so they are ready to be pulled everywhere else.
Pre-commit
- snippet.bash
#!/bin/bash
#
# A hook script to pull before commits
#
exec git pull
Post-commit
- snippet.bash
#!/bin/bash
#
# A hook script to push after commits
#
exec git push
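A hook only runs if it is saved under the right name in .git/hooks and is executable. The sketch below installs a harmless post-commit hook in a scratch repository. It writes the commit subject to a log file instead of pushing, purely so the effect is visible without a remote; the hook body and log file name are my own invention.

```bash
# Scratch repository
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity
git config user.email 'user@example.org'

# Install an illustrative post-commit hook (it logs rather than pushes)
mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
# Hypothetical hook: append the subject of the new commit to a log file
git log -1 --format=%s >> .git/hook.log
EOF
chmod +x .git/hooks/post-commit

# Any commit now triggers the hook
echo "content" > file.txt
git add file.txt
git commit -q -m "Trigger the hook"

hook_log=$(cat .git/hook.log)
echo "hook recorded: $hook_log"
```

Hooks live inside .git and so are not pushed or cloned, which is why they have to be set up on each machine separately.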
SSH Authentication
You might find putting your password in each time you want to push changes a bit of a pain; make your life easy by using SSH keys (which you likely use for SSH anyway). If you don't have an SSH key then create one with ssh-keygen
. Then upload your public key (by default ~/.ssh/id_rsa.pub
) to your GitHub account by copying and pasting it under Settings > SSH Keys > New SSH Key
and giving the key an appropriate name (e.g. one that reflects the computer it's for). Assuming you use Keychain to load your SSH keys when you initially log in (i.e. you're prompted for the password for your SSH key, it gets loaded into memory and is then used each time it's required) you can then push your commits without having to enter your password.
Test your SSH authentication with…
- snippet.bash
$ ssh -T git@github.com
Hi [username]! You've successfully authenticated, but GitHub does not provide shell access.
Multiple Accounts
After having made a start I realised that I wanted to keep my personal and professional work separate so I created a second GitHub account for my work email address. But how to handle that on my local computers? Unsurprisingly I'm not the first person to have wanted to do this.
Generate a second SSH Key
You can't use the same public SSH key with more than one GitHub account so you have to create a second to use with work. You don't want to overwrite your existing keys though so do the following (you can substitute work
for anything that is useful/memorable/relevant to you)…
- snippet.bash
mkdir ~/.ssh/work
ssh-keygen -f ~/.ssh/work/id_rsa
Make sure that all keys are loaded when you login by modifying ~/.bash_profile
to call keychain
to add them…
- snippet.bash
# Keychain
/usr/bin/keychain --agents ssh ~/.ssh/id_rsa
/usr/bin/keychain --agents ssh ~/.ssh/work/id_rsa
. ~/.keychain/$HOSTNAME-sh
Now when you next log in you should be prompted for your login password followed by each of the passwords for your keys in turn (assuming you're using something like ssh-askpass-fullscreen
).
Configure SSH to use different keys
You've two SSH keys, one for each GitHub account, but how does SSH know which to use when? This is done via ~/.ssh/config
by setting up a default configuration for github.com
and a second one for the second GitHub account you created (in this example work.github.com
).
- snippet.bash
Host github.com
    Hostname github.com
    PreferredAuthentications publickey
    IdentityFile ~/.ssh/id_rsa
Host work.github.com
    Hostname github.com
    PreferredAuthentications publickey
    IdentityFile ~/.ssh/work/id_rsa
Test that the new key works using the following (substituting work for whatever name you used in the Host entry above)…
- snippet.bash
ssh -T git@work.github.com
Per Repository SSH key
As of Git 2.10.0 you can configure each repository to use a specific key (source). At the command line…
- snippet.bash
cd a/git/repository
git config core.sshCommand "ssh -i ~/.ssh/different_ed25519 -F /dev/null"
git config --local user.name "Username"
git config --local user.email "repos@username.com"
This adds the following to the repository's .git/config
- snippet.bash
[core]
    sshCommand = ssh -i ~/.ssh/different_ed25519 -F /dev/null
[user]
    name = Username
    email = repos@username.com
What it is doing is instructing Git to run ssh
using the private key file (with the -i
flag) that is located at ~/.ssh/different_ed25519
. You should have already uploaded the public key associated with it to the GitHub account. If you've stored this key in a Keychain you shouldn't be prompted for a password.
Modify the project to use the alternative account
When you place a directory (/project folder) under version control Git creates the .git sub-directory and populates it with the necessary files. One key file is the configuration for the project .git/config
and within it are details of the username and email address. You should change these to use your secondary account…
- snippet.bash
[core]
    repositoryformatversion = 0
    filemode = true
    bare = false
    logallrefupdates = true
[user]
    name = My Name
    email = my.work@email.com
[remote "origin"]
    url = git@work.github.com:work/dipep.git
    fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
    remote = origin
    merge = refs/heads/master
…this means that when an attempt is made to push to GitHub the alternative work email address is used because the target url is specified as work.github.com
(rather than github.com
) and you have configured SSH to recognise this and use an alternative SSH key.
Tidying up Directories
On occasion Git repositories can grow unwieldy; some advice on how to shrink the size of repositories is at How to Shrink a Git Repository.
Backing up `/etc`
It's worth considering using version control on your configuration files, and on most GNU/Linux systems the vast majority of these are stored in the /etc
directory. You could do this all manually yourself, creating and setting up /etc/.git
and adding individual files. However, it's far easier to use the excellent etckeeper
which is available in most distributions' repositories.
Installation
Gentoo
- snippet.bash
emerge -av etckeeper
Arch Linux
- snippet.bash
pacman -Syu etckeeper
Configuration
You're most likely to use Git given you are reading a page about Git so edit /etc/etckeeper/etckeeper.conf
to use Git for version control…
- snippet.bash
# The VCS to use.
#VCS="hg"
VCS="git"
#VCS="bzr"
#VCS="darcs"
…then initialise on your system…
- snippet.bash
etckeeper init
Backup
Having the files version controlled is good, but what if your system suffers a catastrophic failure and you want to restore the settings and configurations you have worked with and tweaked over a long time? To this end it is wise to back up your configuration somewhere. This could be as simple as backing up to a second independent location, although it's worth considering backing up online. GitHub did not always offer private repositories with its entry-level free accounts (although if you have an academic affiliation then you can apply for an upgrade which does include private repositories). This is important because a lot of sensitive information about your system's configuration is stored within these files, and making them publicly available opens you up to being hacked/exploited. Thus you should use a private repository to back up your files. Step forward GitLab, a nice alternative to GitHub that includes the ability to have private repositories with its entry-level Free account.
NB I did stupidly run rm -rf *
whilst in /etc/
and thought I would have to completely reinstall my system; fortunately this was after I had started using etckeeper
and so I was able to restore everything relatively painlessly by cloning the repository.
Once you've signed up and got yourself an account simply create a repository, an appropriate name would be that which you use for your system and is stored in /etc/hostname
. For example you might choose to call it kimura.no-ip.info
. As you will already have placed /etc
under version control you can then add the repository using the following (substitute [username]
for your GitLab username or just copy and paste from the newly established project)…
- snippet.bash
git remote add origin git@gitlab.com:[username]/kimura.no-ip.info.git
git push -u origin --all
git push -u origin --tags
Submodules
Nesting Git repositories is a bad idea as it makes it hard to track files and keep things up to date/synced. Unsurprisingly the authors of Git have anticipated this and provided a solution whereby nested Git repositories are added as submodules (tutorial). It's pretty straightforward to add a submodule and requires two steps…
- snippet.bash
git submodule add https://github.com/<user>/rock rock
git submodule update --init [--recursive]
A tutorial I came across (but haven't read in full) is here.
Backing up ~/
You've got your system configuration files under version control, but there are a host of other files that your regular user account uses which would be useful to place under version control and back up/synchronise between systems. Simply placing your ~/
/ $HOME
directory under version control is not recommended (although I did initially do it myself) for the simple reason that should you run git clean
you'll wipe out everything that isn't under version control in that directory. Instead it is recommended that you use dotfiles to save your configuration. There are a number of different ways to do this and I've described doing this using GNU Stow, dotbot and GitLab. This process not only backs up your key configuration files but also makes it easy to set up new systems by cloning and installing your dotfiles.
Citing Software
It's a great shame that the developers of software do not receive more citations and credit for their work. GitHub supports the Citation File Format and by creating a CITATION.cff
file in your repository a button is added making it easy for others to cite your software.
Analysing a repository
There is a useful Python package repostat which analyses the Git history and summarises it nicely.
- snippet.bash
pip install repostat
cd ~/path/to/a/project/
repostat --copy-assets --with-index-page [--no-browser] . ~/tmp/project_summary
You then get a neat web-page summarising your project, showing who has made how many commits and how many lines of code each has contributed.
Links
Learning Git/Github
- Pro Git (online free)
- gitignore.io - Create Useful .gitignore Files For Your Project
Tips and Tricks
Branching
GitHub
GitHub Pages
Multiple Accounts
The following were useful in figuring out how to use multiple accounts…
Tools
Videos
Alternatives
I can't see many alternatives gaining traction these days but it's worth keeping an eye on things.
Misc