Advanced LALSuite Git Instructions
Table of contents
- Basic usage
- Other features
- Getting help
These are more advanced instructions for accessing and using the LALSuite Git repository. Please ensure you are familiar with the basic instructions before proceeding.
Note: The instructions that follow assume that you are using at least git version 1.5.0. Please check that your installation meets this requirement before proceeding.
The LALSuite Git repository
LALSuite (comprising glue, lal, lalapps, and pylal) uses Git as a collaborative code development system. Git is a tool that allows multiple developers to work on the same files, merging differences and identifying conflicts between changes when they arise.
The lalsuite Git repository is structured as follows:
Repository Name: lalsuite
|---->glue
|---->lal
|---->lalframe
|---->lalmetaio
|---->lalxml
|---->lalburst
|---->lalstochastic
|---->lalapps
+---->pylal
An excellent introduction to why version control is required, and to the strengths that Git has over other version control systems, can be found in The Git Parable. It is highly recommended to read this carefully, as it explains many of the concepts and much of the terminology used in the following documentation.
Differences between Git and CVS
The main differences between Git and CVS stem from the fact that Git is a distributed version control system, whereas CVS is a centralised version control system. This means that every checkout of a Git repository contains the entire history of the project, so the notion of a central repository exists by convention only.
By convention, the default branch of a Git repository is referred to as the master branch. In a centralised setup, as is used for the LALSuite repository, it is advised to keep the master branch as clean, and as near to the remote status, as possible. Therefore it is recommended to do all work on a separate branch and only merge changes to the master branch when they are ready to be pushed to the remote repository.
Another main difference is due to how CVS and Git record version information. CVS manages versions on a per-file basis, whereas Git versions the whole tree: the entire state of the repository is defined by a single ID. In the case of Git this is the SHA1 sum, a cryptographic hash that identifies the commit and, through it, the entire tree. This ID can be determined in several ways; for example, the following command will return the SHA1 sum of the HEAD of the current branch:
git rev-parse HEAD
The ID is also reported in the log messages in the commit field, and can be seen along with the author, committer, date, and log message by running the following command:
git log
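As a minimal sketch of the two ways of reading the commit ID described above, the following creates a throwaway repository and compares the two values. The scratch directory, the file contents, and the `--format=%H` option (available in newer Git releases than 1.5.0) are assumptions made for illustration only.

```shell
set -e
# Scratch repository; nothing here touches a real LALSuite checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Albert Einstein"
git config user.email "albert.einstein@example.org"

echo "first" > myfile.a
git add myfile.a
git commit -q -m "Add myfile.a"

# The ID of HEAD as reported by rev-parse ...
id_from_rev_parse=$(git rev-parse HEAD)
# ... and the ID shown in the commit field of the log.
id_from_log=$(git log -1 --format=%H)

echo "rev-parse: $id_from_rev_parse"
echo "log:       $id_from_log"
```

Both lines should print the same ID, since they refer to the same commit.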
The Git Workflow
The fact that Git is a distributed version control system leads to some important differences in the development workflow:
- Each clone of the repository contains the complete history of the project, so the concept of a central repository is by definition only
- Common operations, such as commits, viewing history, diffs, are fast as there is no need to communicate with a central server. Communication is only necessary when pushing or pulling changes
- In a centralised git repository setup, each working copy can essentially be considered to be a branch on the main repository
These differences have many consequences that will affect the development workflow. For example, as each working copy can be considered a branch of the central repository, it makes things easier if you keep the master branch of your repository as clean as possible. This is because when you push a branch to the remote repository you cannot choose which of its commits get pushed: either all the committed changes are pushed or none of them are. Therefore it is recommended to use local branches for your development work and then merge these to your local master branch when they are ready to be pushed to the remote master repository.
As the repository contains the entire history of the project, the working copy can be easily switched between various different branches and tags, therefore you do not need multiple checkouts for working with different branches and tags; the same repository can be used.
We operate the lalsuite Git repository as an open development environment in which we rely on developers not to do deliberate damage to other people's code. This honor system has worked well up to now and we hope it will continue.
Some basic rules about pushing code to the master repository:
- Don't push code that breaks the build
- Don't change another developer's code without checking with the developer
- Do help improve the code and documentation when you see problems
Everyone has access to read the LALSuite repository using the git protocol. If you're in the roster, you have the ability to push to the repository over ssh.
Cloning the LALSuite Git repository
First make sure that Git knows who you are, so that this can be recorded correctly in your commits, by running the following, replacing Albert Einstein with your name and firstname.lastname@example.org with your ligo.org email address (note: user.name is literally user.name and not your user name; likewise for user.email):
git config --global user.name "Albert Einstein"
git config --global user.email firstname.lastname@example.org
This needs to be done on each machine that you wish to access the repository from.
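A quick way to check what Git has recorded is to read the values back. The sketch below does this in a scratch repository and deliberately omits --global, so that it only writes to that repository's own configuration; the directory and the example.org address are placeholders, not real account details.

```shell
set -e
# Scratch repository so that the real global configuration is left alone.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global the values are stored in .git/config of this repository only.
git config user.name "Albert Einstein"
git config user.email "albert.einstein@example.org"

# Reading a key back prints the value Git will use for commits made here.
configured_name=$(git config user.name)
configured_email=$(git config user.email)
echo "$configured_name <$configured_email>"
```

On a real machine the same read-back (git config user.name) can be run after the --global commands above to confirm the settings took effect.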
Your LIGO.ORG account is used for authentication, and you must have a valid ECP cookie in order to clone the repository:
$ ecp-cookie-init LIGO.ORG https://versions.ligo.org/git albert.einstein
replacing albert.einstein with your LIGO.ORG Kerberos principal. You need to tell git the location of this cookie using:
$ git config --global http.cookiefile /tmp/ecpcookie.u`id -u`
This needs to be done on each machine that you wish to access the repository from.
To get a copy of the LALSuite Git repository, use the command:
git clone https://versions.ligo.org/git/lalsuite.git
This may take a few minutes. When it has completed, you will have a repository that contains the entire history of the LALSuite Git repository.
A read-only copy of the repository can be cloned anonymously using the following command:
git clone git://versions.ligo.org/lalsuite.git
Creating a local branch
As previously stated, it is recommended to do all development on a local branch and only merge changes to the master branch when you are ready to push these changes to the remote repository. As Git has powerful branching and merging support, it is advisable to use a new branch for each new feature or bug fix you are developing. Therefore the first step after cloning the repository should be to create a local branch to work on. For example, a local branch called local_branch_name that tracks the remote master branch can be created, and checked out, by running the following:
git checkout -b local_branch_name origin/master
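The tracking relationship set up by this command can be inspected in the branch configuration. The sketch below stands in for the real server with a local bare repository (an assumption made purely so that the example runs offline) and assumes Git's default behaviour of setting up tracking when a branch is started from a remote branch.

```shell
set -e
work=$(mktemp -d)

# A local bare repository plays the role of the remote "origin".
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/master

# Seed it with one commit on master.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Albert Einstein"
git config user.email "albert.einstein@example.org"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git push -q "$work/origin.git" master

# Clone it and create the local working branch as described above.
git clone -q "$work/origin.git" "$work/clone"
cd "$work/clone"
git checkout -q -b local_branch_name origin/master

# These two settings are what make the branch "track" origin/master.
tracked_remote=$(git config branch.local_branch_name.remote)
tracked_merge=$(git config branch.local_branch_name.merge)
echo "remote: $tracked_remote"
echo "merge:  $tracked_merge"
```

With these settings in place, a plain git pull on local_branch_name fetches from origin and merges the remote master branch.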
Making changes and committing them to your Git repository
Once you've made your changes, you need to commit them to your local Git repository. You can either commit all changes using:
git commit -a
Or you can specify which files to commit by specifying the names of the files by using:
git commit file_1 file_2
You will be asked to provide a comment for the ChangeLog. This does not need to be detailed, but should give some indication of the work you were doing. For example, if this addresses a problem report make a note of which in the commit message. Git will refuse to perform the commit if you leave this blank. As a guideline this message should be detailed enough so that another developer understands what the change did without having to look at what code was changed.
Note that these changes have only been committed to your own local repository. They are not yet in the remote repository.
Adding a new file
To add a file to your local repository, first place it in the appropriate directory. For convenience, let the filename be myfile.a. Type:
git add myfile.a
To move file myfile.a to myfile.b, type:
git mv myfile.a myfile.b
To remove file myfile.b, type:
git rm myfile.b
When you're ready, these changes can be committed to your local repository.
Determining current tree status
The current status of your working tree can be determined by running:
git status
The output will be something like:
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   myfile.a
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       myfile.b
no changes added to commit (use "git add" and/or "git commit -a")
In the above example we are on the
master branch and there
are uncommitted changes to myfile.a and the file
myfile.b is currently untracked by Git.
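The same situation can be reproduced in a scratch repository. This sketch uses the --porcelain option, which is an assumption about your Git version (it is not available in releases as old as 1.5.0), because its compact output is stable and easy to read in scripts.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Albert Einstein"
git config user.email "albert.einstein@example.org"

echo "one" > myfile.a
git add myfile.a
git commit -q -m "Add myfile.a"

echo "two" >> myfile.a    # uncommitted change to a tracked file
echo "new" > myfile.b     # file that Git does not track yet

# One line per path: " M" marks a modified tracked file, "??" an untracked one.
status=$(git status --porcelain)
echo "$status"
```

Running plain git status in the same directory shows the equivalent long-form report.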
Branching and merging
One of the main strengths of Git is its support for dealing with branches. There are two ways that branches can be created: i) you can create a new branch and switch your working copy to this branch with a single command:
git checkout -b new_branch
Or ii) you can create a new branch without switching your working copy to it:
git branch new_branch
this branch can then be checked out with:
git checkout new_branch
Note that the branches created this way do not track any other branch so you will manually need to update these branches by merging in upstream changes. If you want to be able to pull in changes from other branches then see below.
You can then switch the working copy back to the master branch, for example, with:
git checkout master
You can list all branches, both local and remote, in a given repository with the following:
git branch -a
Remote branches, prefixed with
origin/ in the output from
the above command, can be tracked in your local repository. Track means
that if you run the
git pull command on a branch that
tracks another branch it will pull, and merge if needed, changes from
this branch onto the current branch. For example if you want to create a
branch that tracks the branch
remote_branch in your local
repository run the following:
git branch --track new_branch origin/remote_branch
Note: NEVER checkout these remote branches directly, always checkout a local copy that tracks the remote branch. Checking out the remote branch directly will lead to nothing but problems!
Like tags, any new branch you create exists only within your local repository. To get a branch into the master repository it needs to be pushed. This is done with the following:
git push origin new_branch
This should only be done for branches that other people need to access, therefore please think before adding new branches to the remote repository.
If you now want your local branch to track this new remote branch you have added, you need to tell git this. This can be achieved with:
git config branch.new_branch.remote origin
git config branch.new_branch.merge refs/heads/new_branch
Merging between branches is generally straightforward. To merge changes from the branch new_branch to the current branch, run the following:
git merge new_branch
If there are conflicting changes in the merge then the merge process will halt and alert you to the conflict. For example, following a conflicting merge the output of the above command would be something like the following:
Auto-merging path/to/file
CONFLICT (content): Merge conflict in path/to/file
Automatic merge failed; fix conflicts and then commit the result.
You then need to fix the conflict. How to do this depends on the nature of the conflict; see the links in the more information section for details on how to resolve conflicts. Once the conflict has been resolved you can proceed with the merge by running:
git update-index path/to/file
git commit
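The whole cycle of a conflicting merge, a manual fix, and the concluding commit can be sketched in a scratch repository as follows. The file name and contents are invented for illustration, and the sketch marks the file as resolved with git add, a commonly used alternative to git update-index that is swapped in here rather than taken from the instructions above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Albert Einstein"
git config user.email "albert.einstein@example.org"

printf 'original line\n' > file.txt
git add file.txt
git commit -q -m "Add file.txt"

# Change the same line on two branches so that the merge must conflict.
git checkout -q -b new_branch
printf 'change from new_branch\n' > file.txt
git commit -q -a -m "Edit on new_branch"
git checkout -q master
printf 'change from master\n' > file.txt
git commit -q -a -m "Edit on master"

# The merge halts and reports the conflict; "|| true" stops "set -e" aborting here.
git merge new_branch || true

# Resolve by writing the content you want (normally done in an editor),
# mark the file as resolved, and conclude the merge with a commit.
printf 'resolved content\n' > file.txt
git add file.txt
git commit -q -m "Merge new_branch, resolving conflict in file.txt"

git log -1 --format='%h %s'
```

The final commit has two parents, one from each branch, which is what makes it a merge commit.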
If you only want to merge a single commit, or a handful of commits, from a branch then you can "cherry pick" these commits with the following:
git cherry-pick -x commit_id
where commit_id is the SHA1 sum of the commit.
Note: the -x option records the original commit id of
the commit you are cherry picking. If this is on a branch that you are not
intending on pushing to the remote repository then the -x option
should be dropped.
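The effect of the -x option is easiest to see in the resulting commit message. In this sketch, run in a scratch repository with invented file names, only one of two commits on a development branch is cherry-picked onto master.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Albert Einstein"
git config user.email "albert.einstein@example.org"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

# Two commits on a development branch; only the first is wanted on master.
git checkout -q -b local_branch_name
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Small fix"
fix_id=$(git rev-parse HEAD)
echo "wip" > wip.txt
git add wip.txt
git commit -q -m "Work in progress"

# Back on master, pick just the fix. -x appends the original ID to the message.
git checkout -q master
git cherry-pick -x "$fix_id" >/dev/null

message=$(git log -1 --format=%B)
echo "$message"
```

The message of the new commit on master ends with a "(cherry picked from commit ...)" line naming the original commit, while wip.txt stays on the development branch only.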
Once you have finished with a branch, i.e. all the commits on the branch have been pushed to the remote repository or that avenue of development is no longer being pursued, you can delete a branch with the following:
git branch -d branch_name
If there are any commits on the branch that have not been merged with the master branch then you will receive an error. If you really want to delete this branch you can force it with:
git branch -D branch_name
Git provides a very powerful feature called rebase. In the documentation that follows only the basic usage of rebase is discussed (see the git-rebase man page for more details). The following example illustrates how rebase can be used to modify the local branch point, thereby eliminating extraneous merge commits.
Consider the following: the master branch has three commits, A, B, and C:
A-B-C
You then create a branch, called local_branch for example, off the HEAD of master, i.e. commit C, and proceed with your own development. Let's say you make three extra commits, D, E, and F, so the tree now looks like:
      D-E-F
     /
A-B-C
but in the time you have been developing on local_branch there have been extra commits to the master branch, say G and H, so when you pull the tree looks like:
      D-E-F
     /
A-B-C-G-H
so if you checkout the master branch, pull the latest changes from upstream, checkout local_branch and merge with master by running the following commands:
git checkout master
git pull
git checkout local_branch
git merge master
your tree will now look like the following:
      D-E-F
     /     \
A-B-C---G---H
So if you push to master now there will be an extra "merge commit" that gets your local branch back in sync with master. This merge commit can be eliminated by rebasing. So if you now rebase against master this changes the branch point of your local branch to be the latest HEAD of the master branch, this is done by running the following command, whilst on local_branch:
git rebase master
following this the tree will now look like:
          D-E-F
         /
A-B-C-G-H
When this is merged to master there will be no merge commit; the history will be:
A-B-C-G-H-D-E-F
The resulting code is identical, just the history is a lot cleaner and easier to follow.
Note: if you need to resolve any conflicts when merging the latest changes from master into local_branch, these same conflicts will need to be resolved a second time during the rebase operation. At any point during a conflicted rebase you can cancel the rebase operation by running:
git rebase --abort
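The rebase scenario above can be replayed in a scratch repository. In this sketch each commit adds its own small file so that no conflicts arise; the make_commit helper is a hypothetical convenience function written for this example, not a Git command.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Albert Einstein"
git config user.email "albert.einstein@example.org"

# Hypothetical helper: one commit per letter, each touching its own file.
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

for c in A B C; do make_commit "$c"; done   # master: A-B-C
git checkout -q -b local_branch
for c in D E F; do make_commit "$c"; done   # local_branch: D-E-F on top of C
git checkout -q master
for c in G H; do make_commit "$c"; done     # master moves on: G-H

# Move the branch point of local_branch from C to the new HEAD of master.
git checkout -q local_branch
git rebase -q master

# Merging into master is now a fast-forward, so no merge commit is created.
git checkout -q master
git merge -q local_branch

history=$(git log --reverse --format=%s | xargs)
echo "$history"
```

The printed history lists the commit subjects oldest first, with D, E, and F replayed after H.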
Pushing your changes to the master Git repository
When you are ready to push your changes to the remote repository you need to ensure that the branch containing your changes is up-to-date with the remote repository. This is done by running the following with the local branch checked out:
git pull
You may have to resolve conflicts. If so, resolve them and double-check that your code works. When you are fully satisfied, these changes need to be merged to the local version of the master branch. First of all you need to check out the master branch of your local repository:
git checkout master
You then need to pull changes from the remote master branch:
git pull
If you want all the changes on the local branch to be pushed to the remote repository, simply merge the changes from your local development branch onto your local master branch. So if the local branch on which you were working was called local_branch_name, this merge can be done with:
git merge local_branch_name
Again, you may have to resolve conflicts, do this and check that the code works as expected.
If, however, you only need to push a handful of the changes, you need to determine the commit IDs of the individual changes and cherry-pick these into the local master branch. Do this with:
git cherry-pick -x commit_id
where commit_id is the ID of the commit you wish to push to
the remote repository. Again, you may need to resolve conflicts, do this
and check that the code works. Note: the -x
option records the original commit id of the commit you are cherry
picking. If this is on a branch that you are not intending on pushing to
the remote repository then the -x option should be dropped.
Now that everything is up to date, and all potential conflicts resolved, you are ready to push your changes to the remote repository. This is done with the following:
git push origin HEAD
which pushes the current branch to a branch of the same name in the origin repository. Depending on your git version, you can configure git so that it will do this with just git push by running the following:
git config --global remote.origin.push HEAD
git config --global push.default current
Note: if you do not configure git with the above, git push may push all local changes that are not present in the remote repository, so you should take care when using this command. More detail on the options that can be passed to git push can be found in the git-push man page.
Keeping your copy up-to-date with changes made to the master repository
To update your local Git repository simply type:
git pull
If you have been working in your local repository and your changes conflict with those made by the other person, you will receive a warning and will need to manually edit the file to remove the conflicts. It is usually good practice to update your local repository before working on the files; this avoids conflicts. It is also good practice to commit changes reasonably often.
Making and checking out tags
To create a tag of the current state of the repository run:
git tag -a tag_name
The working copy of the repository can be switched to a given tag with the following:
git checkout tag_name
When you create a new tag it exists only within your local repository. To get this tag into the master repository it needs to be pushed. This is done with the following:
git push origin refs/tags/tag_name
or all tags that exist in your local repository, that are not in the remote repository, can be pushed with:
git push --tags
Quite often, halfway through the development of a new fix or feature, you will find a small bug in the code that has a very simple fix, yet you are unable to address it because you are in the middle of another fix. Git provides a very useful feature that can help in situations such as this. You can stash changes in the repository, without committing them, thereby reverting the tree to the latest committed state. This can be done with the following command:
git stash
After running the above command all uncommitted changes present in the working copy will have been stashed and the tree reverted to the current HEAD. You can now fix the minor bug and commit. When you are ready to continue working on the previous fix/feature you can reapply the stashed changes to your working copy by running:
git stash apply
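The stash round trip described above can be sketched in a scratch repository; the file names and contents are invented for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Albert Einstein"
git config user.email "albert.einstein@example.org"

echo "stable code" > myfile.a
git add myfile.a
git commit -q -m "Add myfile.a"

# Start on a feature, then set the unfinished work aside.
echo "half-finished feature" >> myfile.a
git stash >/dev/null
after_stash=$(git status --porcelain)   # empty: the tree is back at HEAD

# Fix the unrelated bug and commit it.
echo "bug fix" > bugfix.txt
git add bugfix.txt
git commit -q -m "Fix minor bug"

# Bring the unfinished work back into the working copy.
git stash apply >/dev/null
restored=$(tail -n 1 myfile.a)
echo "$restored"
```

Note that git stash apply leaves the entry in the stash list, so it remains visible in git stash list until it is dropped.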
Resetting the staging area
Sometimes it may be necessary to reset the state of the tree to how it was at the last commit. Any uncommitted changes can be removed with the following command:
git checkout -f HEAD
This, however, may not always be enough. For example if there is a conflicting merge and you would like to reset the tree to the state it was in before the merge you need to reset the staging area as well. This can be done with the following:
git reset --hard HEAD
Cleaning the working copy
Sometimes it is necessary to completely remove all files from the working copy that are not part of the repository. This is done using the following:
git clean -dxf
NOTE: this command removes all files that are not tracked by Git, including ignored files, so please take care in using this - if in doubt, don't use it! Uncommitted changes to files that are already present in the repository will be untouched. Recommended: first do a dry run with the -n flag:
git clean -dxn
If this looks safe, replace the -n flag with -i for interactive mode or -f to force deletion without prompting. CAUTION: unlike most git operations, this cannot be undone. Files not in the repository will be permanently deleted, just as if they had been removed with an rm command.
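The difference between the dry run and the real deletion can be checked safely in a scratch repository. The file and directory names below are invented for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Albert Einstein"
git config user.email "albert.einstein@example.org"

echo "tracked" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "scratch" > untracked.tmp            # untracked file
mkdir build
echo "object" > build/output.o            # untracked directory
echo "local edit" >> tracked.txt          # uncommitted change to a tracked file

# Dry run: lists what would be removed but deletes nothing.
dry_run=$(git clean -dxn)
echo "$dry_run"
if [ -e untracked.tmp ]; then still_present=yes; else still_present=no; fi

# Real run: untracked files and directories are deleted for good.
git clean -dxf >/dev/null
```

After the real run the untracked file and the build directory are gone, while the uncommitted edit to tracked.txt is still in place.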
Getting help
Git has a very detailed set of man pages which are easily accessible. For example if you wish to see detailed instructions concerning commits run the following:
git commit --help
In fact all git commands have the --help option, which displays detailed documentation. Failing that, there is a lot of detailed documentation online; see the more information section below for pointers on where to start.
More information
Further documentation on the usage of git, including details of more advanced features such as branching, merging, patch generation, and conflict resolution, can be found on the following pages:
- Common CVS commands and their Git counterparts
- http://git-scm.com - Git Project Page
- http://git-scm.com/course/svn.html - Git for SVN users
- http://www-cs-students.stanford.edu/~blynn/gitmagic - Git Magic Guide
- http://book.git-scm.com - Git Community Book
- http://progit.org/book - Pro Git, professional version control book
- http://gitready.com - Git Usage Tips
- http://nathanj.github.com/gitguide/tour.html - Illustrated Guide to Git on Windows
- http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html - Git tutorial
- http://www.kernel.org/pub/software/scm/git/docs/gitglossary.html - Git glossary
- http://www.spheredev.org/wiki/Git_for_the_lazy - Git for the lazy