I realised that I would benefit from using version control after a particularly messy period at work during which I had made some mistakes. Git seemed a sensible choice as it is popular and integrates with GitHub, GitLab and Bitbucket, which makes sharing work with others easier, and you can even host your own Git repositories using Gitea.

Little did I know the sort of hell I would be letting myself in for as I learnt the semantics and idiosyncrasies. This page records my learning adventure and serves as a reference for myself that just might be useful for others.

Gitting Started

The first thing to do is set up Git with your name and email address, along with a few other common default settings.

snippet.bash
git config --global user.name 'My Username'    # Drop --global to set this for the current repository only
git config --global user.email 'my@email.org'
git config --global push.default upstream # Pushes the current branch to its upstream branch (https://stackoverflow.com/a/42642628/1444043)
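
To check which identity Git will actually use you can query the configuration. The following is a minimal sketch rather than part of my original notes: it points HOME at a throwaway directory (an assumption, made so that your real ~/.gitconfig is left alone) and shows a per-repository value taking precedence over the global one. The work email address is a made-up placeholder.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway HOME so the real ~/.gitconfig is not modified (assumption for this demo)
demo_home=$(mktemp -d)
export HOME="$demo_home"
export XDG_CONFIG_HOME="$demo_home/.config"
unset GIT_CONFIG_GLOBAL

git config --global user.name 'My Username'
git config --global user.email 'my@email.org'

git init -q "$demo_home/project"
cd "$demo_home/project"
git config user.email 'my.work@email.org'   # hypothetical per-repository override

global_email=$(git config --global --get user.email)
effective_email=$(git config --get user.email)   # what commits in this repository will use
echo "global:    $global_email"
echo "effective: $effective_email"
```

Inside the project the repository-level email wins; everywhere else the global value applies.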

Configuration

Git Bash/Zsh completion

Git (2.37.3 at least, and probably earlier versions too) ships with a script that provides command-line completion. Under Gentoo this is installed at /usr/share/bash-completion/completions/git, and the file includes instructions on how to install and use it.

Git under Emacs

I have found that using Magit under Emacs has been really useful in helping to learn the concepts and principles around using Git. For more on this along with common reference commands see my Magit page.

GitHub/GitLab/BitBucket/Codeberg

You've been meticulous in saving changes to your files locally but now you want to share your code with the world (or perhaps just your colleague at the desk opposite). This is where sites like GitHub, GitLab, Codeberg and Bitbucket step in, as they provide free (and paid-for) services for hosting projects so that people can collaborate on them.

Starting a Repository

You will already have an account on GitHub/GitLab/Codeberg/Bitbucket, but how do you get your local version-controlled files for a given project onto them? Go to your account on one of these and create a new repository; in this example it's called new-project. You are then shown a page with three different methods of starting/adding files to the repository. Note that the URL for SSH needs tweaking if you are using multiple GitHub accounts and have configured SSH as described below. If you've created a second account called work and configured SSH to recognise the host work.github.com then you can push your existing repository with…

snippet.bash
git remote add origin git@work.github.com:work/new-project.git
git push -u origin master

Your files should now appear in your repository.

Committing

Commits are straightforward: save your file, stage it with git add, and then within the Git repository run git commit -m "A meaningful message about the changes you are committing".

snippet.bash
git add file1.txt    # Stage the changes (file1.txt is an example name)
git commit -m "A meaningful message about the changes being committed"

Log

You can view the history of your commits with git log, which prints detailed information about each commit. For a shorter view use the --decorate --oneline flags (the oh-my-zsh git plugin provides the alias glo for this).

snippet.bash
$ git log --decorate --oneline
a8c4277 (HEAD -> master, origin/master) Added rsync to zshrc plugins
6acb4e0 Added submodule for solarized
076645e Adding dotbot conf
e5a6965 Adding rhapsody and podget
0ef688f Adding gimp
8191fc7 Adding oh-my-zsh as a submodule
4cd397b Adding zsh
460d42e Adding bash podget and tmux

Ignoring Important Files

You can tell Git to automatically ignore certain files by adding file globs (patterns) to the file .gitignore in the root of your repository's directory (i.e. the highest level). This is useful as you can exclude temporary files that your text editor might create (e.g. Emacs leaves behind *~ files), and if you're working with patient data then it's quite likely that this shouldn't be shared in a public repository such as on GitHub. The github/gitignore repository has a number of skeleton/example files for different languages including R. I modify the R one to include *.RData, which excludes any and all R data objects, and *~ for Emacs temporary files. There is also gitignore.io which generates configs automatically.

snippet.bash
# History files
.Rhistory
.Rapp.history
 
# All Data files
*.RData
 
# Example code in package build process
*-Ex.R
 
# RStudio files
.Rproj.user/
 
# Emacs tmp files
*~
 
# produced vignettes
vignettes/*.html
vignettes/*.pdf
 
# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
.httr-oauth
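
If you're not sure whether a pattern is doing what you expect, git check-ignore reports which paths match an ignore rule. A small sketch in a throwaway repository, using two of the patterns above; the file names are made up for illustration.

```shell
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Two of the patterns from the .gitignore above
printf '%s\n' '*.RData' '*~' > .gitignore

# Hypothetical files: a script, an R data object and an Emacs backup file
touch analysis.R results.RData 'notes.txt~'

# check-ignore prints only the paths that match an ignore rule
ignored=$(git check-ignore analysis.R results.RData 'notes.txt~')
echo "$ignored"
```

Adding -v to git check-ignore also shows which line of which .gitignore file matched.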

Moving Files

You might want to rename a file and remove the original name from the repository. How do you do this in Git? The solution is to use git mv, just as you would use mv on a normal file.

snippet.bash
git mv file1.txt file2.txt
git commit -m "Renaming file1.txt > file2.txt"
git push

Deleting Files

To remove a file completely from a repository and delete it locally, use git rm.

snippet.bash
git rm file1.txt
git commit -m "remove file1.txt"
git push -u origin master

Removing Files

Sometimes you will want to remove a file from a Git repository without deleting it locally. To do this use the git rm --cached option.

snippet.bash
# Remove a file
git rm --cached file.txt
# Remove a directory
git rm --cached -r directory
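
To convince yourself that the file really does stay on disk, here is a minimal sketch in a throwaway repository. The identity and file contents are placeholders for the demo.

```shell
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity for the demo
git config user.email 'user@example.org'

echo 'local settings' > file.txt
git add file.txt
git commit -q -m "Add file.txt"

git rm -q --cached file.txt                # untrack, but leave the file on disk
git commit -q -m "Stop tracking file.txt"

tracked=$(git ls-files)                    # empty: nothing is tracked any more
ls file.txt                                # still present in the working directory
```

The file now shows up as untracked, so you will probably want to add it to .gitignore to stop it being added again by accident.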

Undoing Unstaged Changes

If you decide you don't want to keep changes made to files on a working branch before you git add/commit them, you can simply check the files out again.

snippet.bash
git checkout -- file1.txt file2.txt

More destructively you can reset the HEAD using the following which will remove staged and unstaged changes, use cautiously!

snippet.bash
git reset --hard

These were from devtonight.
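
The difference between the two is easiest to see when a file has both staged and unstaged edits. A minimal sketch in a throwaway repository (identity and file contents are placeholders): git checkout -- restores the file from the staging area, so only the unstaged edit is lost, whereas git reset --hard goes back to the last commit.

```shell
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity for the demo
git config user.email 'user@example.org'

echo 'v1 committed' > file1.txt
git add file1.txt
git commit -q -m "Add file1.txt"

echo 'v2 staged' > file1.txt
git add file1.txt                          # v2 is now staged
echo 'v3 unstaged' > file1.txt             # a further edit, not staged

git checkout -- file1.txt                  # discards only the unstaged edit
after_checkout=$(cat file1.txt)

git reset -q --hard                        # discards the staged change as well
after_reset=$(cat file1.txt)

echo "after checkout: $after_checkout"
echo "after reset:    $after_reset"
```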

Cherry Picking

Cherry picking copies a single commit, or a range of commits, from another branch onto the current one.

snippet.bash
git checkout -b new-branch
git cherry-pick 09ae98a
# Pull a range of commits inclusive of both stated
git cherry-pick asdf90a^..g90agu
# Pull a range of commits starting _after_ first stated
git cherry-pick asdf90a..g90agu
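
The inclusive/exclusive distinction is easy to get wrong, so here is a sketch that builds four commits in a throwaway repository and cherry-picks the range from the second to the fourth both ways. The branch names exclusive and inclusive, the identity and the file names are all made up for the demo.

```shell
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity for the demo
git config user.email 'user@example.org'

# Four commits, each adding its own file so the picks cannot conflict
for n in 1 2 3 4; do
    echo "change $n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "Commit $n"
done
first=$(git rev-parse HEAD~3)
second=$(git rev-parse HEAD~2)
last=$(git rev-parse HEAD)

# A..B starts after A, so only commits 3 and 4 are picked
git checkout -q -b exclusive "$first"
git cherry-pick "$second..$last" > /dev/null
exclusive_count=$(git rev-list --count "$first..HEAD")

# A^..B includes A, so commits 2, 3 and 4 are picked
git checkout -q -b inclusive "$first"
git cherry-pick "$second^..$last" > /dev/null
inclusive_count=$(git rev-list --count "$first..HEAD")

echo "exclusive range picked $exclusive_count commits"
echo "inclusive range picked $inclusive_count commits"
```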

Amending and Rebasing

Sometimes you make a mistake and need to amend your commit message (Git Book : Rewriting History). If it's the most recent commit then you can modify its message with git commit --amend. If the changes are further back in the commit history then you need to perform what is known as a rebase. You can view the log of commits using git log, and in doing so you might decide that the last five commits need modifying, which you can do using an interactive rebase…

snippet.bash
git rebase -i HEAD~5

…which will start your default text editor with a list of the last five commits. Each line begins with a keyword (pick by default) indicating the action to take on that commit; change it to the action you want, which will typically be reword. Save and exit, and the first commit you wish to edit will then be opened; on saving and exiting, the second one will be opened, and so forth.

A good article on rebasing is here. Rebasing can, and arguably should, also be used to clarify your intent when making changes. This process is described well in this article Write Better Commits, Build Better Projects | The GitHub Blog

Rebasing Branches

As work proceeds origin/master (the master branch on GitHub/GitLab/Bitbucket) will get updated by others. When you want to start some work you should do so on the most recent version that incorporates others' changes. To ensure your branch is up-to-date you should rebase your branch onto a freshly fetched master.

snippet.bash
git fetch origin                # Update origin/master with others' changes
git checkout my_branch          # Switch to your branch
git rebase origin/master        # Replay your commits on top of origin/master
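
To see the whole cycle without touching a real remote, the sketch below uses a local directory called shared as a stand-in for the repository on GitHub/GitLab. The directory names, identity and file names are assumptions for the demo.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)

# A stand-in for the shared repository (normally hosted on GitHub/GitLab)
git init -q "$work/shared"
cd "$work/shared"
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name 'Example User'        # placeholder identity for the demo
git config user.email 'user@example.org'
echo 'base' > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Your clone, with a branch of your own
git clone -q "$work/shared" "$work/mine"
cd "$work/mine"
git config user.name 'Example User'
git config user.email 'user@example.org'
git checkout -q -b my_branch
echo 'mine' > mine.txt
git add mine.txt
git commit -q -m "My work"

# Meanwhile a colleague's change lands on master in the shared repository
cd "$work/shared"
echo 'theirs' > theirs.txt
git add theirs.txt
git commit -q -m "Colleague's work"

# Update origin/master, then replay my_branch on top of it
cd "$work/mine"
git fetch -q origin
git rebase -q origin/master
git --no-pager log --oneline
```

After the rebase the log shows your commit sitting on top of your colleague's, and their file is present in your working directory.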

To get an individual file from another branch you can…

snippet.bash
git checkout <branch-where-the-file-is> -- path/to/file/in/the/other/branch

Restoring individual files

Sometimes you may want to revert changes in an individual file, if they haven't yet been committed then that is easy with either of the following which are equivalent…

snippet.bash
git checkout -- <filename>
git restore <filename>

If you have already committed a change then use git revert, which creates a new commit that undoes an earlier one.

Forcing pulls

Sometimes you want to manually force a pull, and whilst it might be tempting to use git pull -f this is not the best approach, rather you should use fetch and reset.

snippet.bash
git fetch origin master
git reset --hard FETCH_HEAD
git clean -df

Branches

The beauty of Git is that it allows multiple people to work on the same software project without interfering with each other's work. This is done through branching and merging (https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging).

On GitHub/GitLab/Bitbucket you will find the option to make a branch of a repository, but it's generally cleaner to make branches on your local machine and then have them pushed and updated to your remote (origin). Create a branch and switch to it in one step using…

snippet.bash
git checkout -b new_branch

You can see what branches there are now locally and which you are currently on using…

snippet.bash
git branch
* new_branch
  master

How does the remote repository know/become aware of this new branch? You can push and update all branches on the remote origin with…

snippet.bash
git push --all -u

Branching from a specified branch

Sometimes you may be working on a problem with others simultaneously and wish to develop your work together before merging to master. In such an instance you could create a development branch and push your work to this, to ensure the changes you and your colleague make are consistent and work before you merge them to master.

snippet.bash
git branch --list
* development
  master

To branch from development rather than master you…

snippet.bash
git checkout -b my_new_branch development

Move Most Recent Commit to a New Branch

From https://stackoverflow.com/a/1628584/1444043

snippet.bash
git branch a_new_branch     # Creates a new branch at the current commit
git reset --hard ad1290ai   # Resets the current branch to the commit before the last one (placeholder hash)
git checkout a_new_branch   # Switches to the new branch, which still includes the last commit
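
A runnable sketch of the same idea in a throwaway repository. Rather than a commit hash it uses HEAD~1, which means "the commit before the current one"; the identity, file names and commit messages are placeholders.

```shell
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity for the demo
git config user.email 'user@example.org'

echo 'one' > one.txt
git add one.txt
git commit -q -m "First commit"
echo 'two' > two.txt
git add two.txt
git commit -q -m "Second commit"           # pretend this was made on the wrong branch

git branch a_new_branch                    # new branch pointing at "Second commit"
git reset -q --hard HEAD~1                 # step the current branch back one commit

current_tip=$(git log -1 --format=%s)
new_branch_tip=$(git log -1 --format=%s a_new_branch)
echo "current branch is at: $current_tip"
echo "a_new_branch is at:   $new_branch_tip"
```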

Tidying up Merged Branches

There are lots of articles out there on Git housekeeping. One simple thing to do is use…

snippet.bash
git fetch -p

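…which, on fetching, prunes remote-tracking references to branches that no longer exist on the remote (for example because they were merged and then deleted there). Local branches are left untouched.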

Git Stash

If you have work in progress (WIP) you may end up stashing your changes before you git pull.

Listing stashes

snippet.bash
git stash list

Clearing stashes

It's relatively straightforward to clear all of them.

snippet.bash
git stash clear

To remove a specific stash you need to find its position in the list of stashes (see above for listing) and then drop it…

snippet.bash
git stash drop stash@{3}

From How to delete a stash in Git. If you're using magit then you can access stashes and subsequent actions using z (and follow the commands in the transient buffer).
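
The numbering of stashes can be confusing because the most recent one is always stash@{0}. A small sketch in a throwaway repository, assuming a Git recent enough to have git stash push (2.13 or later); the identity, file name and stash messages are made up.

```shell
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity for the demo
git config user.email 'user@example.org'

echo 'base' > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"

echo 'first idea' >> notes.txt
git stash push -q -m "first"               # becomes stash@{0}
echo 'second idea' >> notes.txt
git stash push -q -m "second"              # now stash@{0}; "first" moves to stash@{1}

git stash list
git stash drop -q 'stash@{1}'              # removes the older stash, "first"
remaining=$(git stash list)
echo "$remaining"
```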

Git Hooks

Git hooks are scripts that reside in the .git/hooks directory and are run automatically on various events. Examples are commit-msg to check the spelling of the commit message, or post-commit to automatically push to upstream.

I have my org and ~/.config/emacs/ files under Git version control and mirrored across my home server, laptop, work laptop, VPS and Raspberry Pis, and found I was spending a lot of time pushing and pulling and sometimes resolving conflicts. I therefore decided to enable pre-commit and post-commit hooks across all machines, so that pulls are made before a commit to avoid/highlight any conflicts and ensure things are up-to-date, and pushes are made after a commit so the changes are ready to be pulled everywhere else.

Pre-commit

snippet.bash
#!/bin/bash
#
# A hook script to pull before commits
#
 
exec git pull

Post-commit

snippet.bash
#!/bin/bash
#
# A hook script to push after commits
#
 
exec git push
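
Hooks only run if they are saved under the right name in .git/hooks and are executable. The sketch below installs a post-commit hook in a throwaway repository; as there is no remote to push to, the hook here is a stand-in that writes to a log file (.git/hook.log, a name invented for the demo) instead of running git push.

```shell
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name 'Example User'        # placeholder identity for the demo
git config user.email 'user@example.org'

# Stand-in hook: logs a line instead of running `exec git push`
mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
echo "post-commit hook ran" >> .git/hook.log
EOF
chmod +x .git/hooks/post-commit            # Git skips hooks that are not executable

echo 'hello' > file.txt
git add file.txt
git commit -q -m "Trigger the hook"
cat .git/hook.log
```

Note that .git/hooks is not itself version controlled, so the hooks have to be installed on each machine separately.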

SSH Authentication

You might find putting your password in each time you want to push changes a bit of a pain. Make your life easy by using SSH keys (which you likely use for SSH anyway). If you don't have an SSH key then create one with ssh-keygen. Then upload your public key (by default ~/.ssh/id_rsa.pub) to your GitHub account by copying and pasting it under Settings > SSH Keys > New SSH Key and giving the key an appropriate name (e.g. one that reflects the computer it's for). Assuming you use Keychain to load your SSH keys when you initially log in (i.e. you're prompted for the password for your SSH key, which is then loaded into memory and used each time it's required) you can then push your commits without having to enter your password.

Test your SSH authentication with…

snippet.bash
$ ssh -T git@github.com
Hi [username]! You've successfully authenticated, but GitHub does not provide shell access.

Multiple Accounts

After having made a start I realised that I wanted to keep my personal and professional work separate so I created a second GitHub account for my work email address. But how to handle that on my local computers? Unsurprisingly I'm not the first person to have wanted to do this.

Generate a second SSH Key

You can't use the same public SSH key with more than one GitHub account so you have to create a second to use with work. You don't want to overwrite your existing keys though so do the following (you can substitute work for anything that is useful/memorable/relevant to you)…

snippet.bash
mkdir ~/.ssh/work
ssh-keygen -f ~/.ssh/work/id_rsa

Make sure that all keys are loaded when you log in by modifying ~/.bash_profile to call keychain to add them…

snippet.bash
# Keychain
/usr/bin/keychain --agents ssh ~/.ssh/id_rsa
/usr/bin/keychain --agents ssh ~/.ssh/work/id_rsa
. ~/.keychain/$HOSTNAME-sh

Now when you next log in you should be prompted for your login password followed by the password for each of your keys in turn (assuming you're using something like ssh-askpass-fullscreen).

Configure SSH to use different keys

You've two SSH keys, one for each GitHub account, but how does SSH know which to use when? This is done via ~/.ssh/config by setting up a default configuration for github.com and a second one, under a host alias, for the second GitHub account you created (in this example work.github.com).

snippet.bash
Host github.com
     Hostname github.com
     PreferredAuthentications publickey     
     IdentityFile ~/.ssh/id_rsa
Host work.github.com
     Hostname github.com
     PreferredAuthentications publickey
     IdentityFile ~/.ssh/work/id_rsa

Test that the new key works using the following (substituting work for whatever you have called your host alias)…

snippet.bash
ssh -T git@work.github.com

Per Repository SSH key

As of Git 2.10.0 you can configure each repository to use a specific key (source). At the command line…

snippet.bash
cd a/git/repository
git config core.sshCommand "ssh -i ~/.ssh/different_ed25519 -F /dev/null"
git config --local user.name "Username"
git config --local user.email "repos@username.com"

This adds the following to the repository's .git/config

snippet.bash
[core]
    sshCommand = ssh -i ~/.ssh/different_ed25519 -F /dev/null
[user]
    name = Username
    email = repos@username.com

What it is doing is instructing Git to run ssh using the private key file (with the -i flag) that is located at ~/.ssh/different_ed25519. You should have already uploaded the public key associated with it to the GitHub account. If you've stored this key in a Keychain you shouldn't be prompted for a password.

Modify the project to use the alternative account

When you place a directory (i.e. a project folder) under version control Git creates the .git sub-directory and populates it with the necessary files. One key file is the configuration for the project, .git/config, which can hold the username and email address used for commits. You should set these to use your secondary account…

snippet.bash
[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true
[user]
        name = My Name
        email = my.work@email.com
[remote "origin"]
	url = git@work.github.com:work/dipep.git
	fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
	remote = origin
	merge = refs/heads/master

…this means that commits in this repository are recorded with the work name and email address, and that pushes to GitHub use the alternative SSH key, because the remote url is specified as work.github.com (rather than github.com) and you have configured SSH to recognise this host.
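
Rather than editing .git/config by hand you can make the same changes from the command line. A minimal sketch in a throwaway repository, using the example remote and identity from above; no network access is needed as nothing is pushed.

```shell
#!/usr/bin/env bash
set -eu

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# The remote as it would look for the default account
git remote add origin git@github.com:work/dipep.git

# Switch the remote to the work host alias and set the work identity
git remote set-url origin git@work.github.com:work/dipep.git
git config user.name 'My Name'
git config user.email 'my.work@email.com'

remote_url=$(git config --get remote.origin.url)
work_email=$(git config --get user.email)

# Show the resulting repository-level settings
git config --local --list | grep -E '^(user|remote)\.'
```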

Tidying up Directories

On occasion Git repositories can grow unwieldy. Some advice on how to shrink the size of repositories is at How to Shrink a Git Repository.

Backing up /etc

It's worth considering using version control on your configuration files, and on most GNU/Linux systems the vast majority of these are stored in the /etc directory. You could do this all manually yourself, creating and setting up /etc/.git and adding individual files. However, it's far easier to use the excellent etckeeper, which is available in most distributions' repositories.

Installation

Gentoo

snippet.bash
emerge -av etckeeper

Arch Linux

snippet.bash
pacman -Syu etckeeper

Configuration

You're most likely to use Git given you are reading a page about Git so edit /etc/etckeeper/etckeeper.conf to use Git for version control…

snippet.bash
# The VCS to use.
#VCS="hg"
VCS="git"
#VCS="bzr"
#VCS="darcs"

…then initialise on your system…

snippet.bash
etckeeper init

Backup

Having the files version controlled is good, but what if your system suffers a catastrophic failure and you want to restore the settings and configurations you have worked with and tweaked over a long time? To this end it is wise to back up your configuration somewhere. This could be as simple as backing up to a second independent location, although it's worth considering backing up online. GitHub didn't use to offer private repositories with its entry-level free accounts (although if you have an academic affiliation then you can apply for an upgrade which does include private repositories). This is important because a lot of sensitive information about your system's configuration is stored within these files, and making them publicly available opens you up to being hacked/exploited. Thus you should use a private repository to back up your files. Step forward GitLab, a nice alternative to GitHub that includes the ability to have private repositories with its entry-level Free account.

NB I did stupidly run rm -rf * whilst in /etc/ and thought I would have to completely reinstall my system. Fortunately this was after I had started using etckeeper and so I was able to restore everything relatively painlessly by cloning the repository.

Once you've signed up and got yourself an account simply create a repository. An appropriate name would be the one you use for your system, which is stored in /etc/hostname. For example you might choose to call it kimura.no-ip.info. As you will already have placed /etc under version control you can then add the repository using the following (substitute [username] for your GitLab username or just copy and paste from the newly established project)…

snippet.bash
git remote add origin git@gitlab.com:[username]/kimura.no-ip.info.git
git push -u origin --all
git push -u origin --tags

Submodules

Nesting Git repositories is a bad idea as it makes it hard to track files and keep things up to date/synced. Unsurprisingly the authors of Git have anticipated this and provided a solution whereby nested Git repositories are added as submodules (tutorial). It's pretty straightforward to add a submodule and requires two steps…

snippet.bash
git submodule add https://github.com/<user>/rock rock
git submodule update --init [--recursive]

A tutorial I came across (but haven't read in full) is here.

Backing up ~/

You've got your system configuration files under version control, but there are a host of other files that your regular user account uses which would be useful to place under version control and back up/synchronise between systems. Simply placing your home directory (~/ or $HOME) under version control is not recommended (although I did initially do it myself) for the simple reason that should you run git clean you'll wipe out everything that isn't under version control in that directory. Instead it is recommended that you use dotfiles to save your configuration. There are a number of different ways to do this and I've described doing this using GNU Stow, dotbot and GitLab. This process not only backs up your key configuration files but also makes it easy to set up new systems by cloning and installing your dotfiles.

Citing Software

It's a great shame that the developers of software do not receive more citations and credit for their work. GitHub supports the Citation File Format, and by creating a CITATION.cff file in your repository a button is added making it easy for others to cite your software.

Analysing a repository

There is a useful Python package repostat which analyses the Git history and summarises it nicely.

snippet.bash
pip install repostat
cd ~/path/to/a/project/
repostat --copy-assets --with-index-page [--no-browser] . ~/tmp/project_summary

You then get a neat web-page summarising your project, who has made how many commits and how many lines of code contributed.

Alternatives

I can't see many alternatives gaining traction these days but it's worth keeping an eye on things.

git/git.txt · Last modified: 2023/02/03 19:22 by admin
CC Attribution-Share Alike 4.0 International