Git use cases

Table of contents

  1. Managing builds
    1. Building master
    2. Building remote branches
    3. Building tags
  2. Managing development
    1. Bugfix or solitary development
    2. Collaboration on major feature

Introduction

Git has proven difficult for many LSC developers, not for lack of information but because of a flood of it. Many malcontents posted to DASWG asking us to recommend a good workflow, that is, how to get things done. This document looks at some common use cases and at what some mildly experienced git users suggest for each.

Herein, we assume you have a clone of the lalsuite repository and are familiar with basic git commands as outlined in the instructions. All code blocks assume that you are in the base lalsuite directory of your clone.

Managing builds

Albert hated git. It seemed that he was always rebuilding LAL and LALApps from scratch every time he did anything. On top of that, he was asked to develop on the trunk and a remote branch and to build tags to run his part of his analysis. Here's how he regained sanity and prevented gratuitous reruns of configure and recompiles.

Building master

Albert had a fresh clone of the lalsuite git repository. He needed to build code at the HEAD of master (conceptually, "the trunk"). He minimized headaches by keeping everything well insulated: each build gets its own build directory and its own install prefix.

git checkout master
git pull
cd lal
./00boot
mkdir build_master
cd build_master
../configure --prefix=/home/albert/opt_master/lal
make -j3 install
source /home/albert/opt_master/lal/etc/lal-user-env.sh
cd ../../lalapps
./00boot
mkdir build_master
cd build_master
../configure --prefix=/home/albert/opt_master/lalapps
make -j3 install

Later, Albert wanted to get the latest changes and build again. As long as he hadn't rerun 00boot in the meantime, he could follow the normal build procedure without rebuilding everything:

git checkout master
git pull
cd lal/build_master
make -j3 install
source /home/albert/opt_master/lal/etc/lal-user-env.sh
cd ../../lalapps/build_master
make -j3 install

What's with the lone make install? Although the LAL/LALApps documentation lists make and make install as separate steps, both can be done at once with just make install.
What is the -j3 option? The -j3 option lets make run 3 compiles in parallel to speed up the build. It is a reasonable default on most desktops and laptops today. The general rule for the number to pass with -j is the number of free cores + 1. For ldas-pcdev1.ligo.caltech.edu, you can see that there are 32 total cores with grep processor /proc/cpuinfo and you can see the approximate number in use with uptime; the difference is effectively the number of free cores. The marginal return diminishes quickly, so don't be aggressive. Building on a local disk will make a bigger difference than -j33. Finally, the -j option was slightly broken before May 2009; the build would get to the end and then fail. The solution is to run make install again without -j, and it will complete very quickly.
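
As a rough sketch of the "free cores + 1" rule, the helper below turns a core count and a load average into a value for -j. The name suggest_jobs is hypothetical (it is not part of lalsuite), and the last three lines assume a Linux-style host where getconf and /proc/loadavg are available, falling back to conservative values otherwise.

```shell
# Hypothetical helper: print (free cores + 1) given total cores and load average.
suggest_jobs() {
    awk -v total="$1" -v load="$2" 'BEGIN {
        free = total - int(load + 0.5)   # round the load to whole busy cores
        if (free < 0) free = 0           # an overloaded host still gets -j1
        print free + 1
    }'
}

# Assumption: getconf and /proc/loadavg exist (typical Linux hosts);
# otherwise fall back to 1 core and zero load.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
load=$(cut -d' ' -f1 /proc/loadavg 2>/dev/null || echo 0)
echo "make -j$(suggest_jobs "$cores" "$load") install"
```

The script only prints a suggested command line; given the diminishing returns noted above, treat the result as an upper bound rather than a target.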

Building remote branches

The procedure differs depending on whether it is necessary to rerun 00boot to regenerate the build system.

When do I need to run 00boot? You need to rerun 00boot if the build system has changed between the version you wish to build and the last version you built. This is not easy to know a priori, but in general, if a new branch is close to what you last built, you don't need to rerun 00boot. Rerunning 00boot will force a recompile from scratch. If you needed to rerun 00boot but didn't, the build will fail, though probably not with a useful error message.
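
One rough way to guess whether the build system changed is to ask git which files differ between the commit you last built and the one you want to build. The sketch below is a heuristic, not an official check: the helper name build_system_changed and the list of file patterns (configure.ac, Makefile.am, 00boot, *.m4) are assumptions about what counts as "the build system".

```shell
# Hypothetical helper: succeed (exit 0) if build-system files differ
# between two commits. The pathspecs are an assumed, non-exhaustive list.
build_system_changed() {
    changed=$(git diff --name-only "$1" "$2" -- \
        '*configure.ac' '*Makefile.am' '*00boot' '*.m4')
    [ -n "$changed" ]
}

# Usage sketch, with a placeholder for the commit you last built:
#   if build_system_changed LAST_BUILT_COMMIT s5_ihypeihope; then
#       echo "build system changed; rerun 00boot"
#   fi
```

A negative answer does not prove that 00boot can be skipped; it only says that none of the listed files changed.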

00boot not required

Albert was a developer of a bugfix-only remote branch named s5_ihypeihope that branched from master in his recent history, so rerunning 00boot was unnecessary (see When do I need to run 00boot?). In this case, he could proceed very much as for master, skipping the 00boot steps:

git checkout s5_ihypeihope
git pull
cd lal
mkdir build_s5_ihypeihope
cd build_s5_ihypeihope
../configure --prefix=/home/albert/opt_s5_ihypeihope/lal
make -j3 install
source /home/albert/opt_s5_ihypeihope/lal/etc/lal-user-env.sh
cd ../../lalapps
mkdir build_s5_ihypeihope
cd build_s5_ihypeihope
../configure --prefix=/home/albert/opt_s5_ihypeihope/lalapps
make -j3 install

Later, Albert wanted to get the latest changes and build again. As long as he hadn't rerun 00boot in the meantime, he could follow the normal build procedure without rebuilding everything:

git checkout s5_ihypeihope
git pull
cd lal/build_s5_ihypeihope
make -j3 install
source /home/albert/opt_s5_ihypeihope/lal/etc/lal-user-env.sh
cd ../../lalapps/build_s5_ihypeihope
make -j3 install

00boot required

Albert needed to go back and run a remote branch, s5_skynet, that branched from master early during S5, so rerunning 00boot was necessary (see When do I need to run 00boot?). In this case, Albert knew that he would want to build master and s5_skynet repeatedly as new commits appeared, and he wouldn't want to recompile everything from scratch each time he switched. Unfortunately, creating a separate build directory as before is insufficient, as running 00boot forces a complete recompile. Instead, Albert had to make a separate clone of the repository. Although not strictly necessary, he cloned his pre-existing local repository to avoid re-downloading it from UWM.

cd ..
git clone file:///home/albert/lalsuite/.git lalsuite_s5_skynet
cd lalsuite_s5_skynet
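git config remote.origin.url https://versions.ligo.org/git/lalsuite.git
# A clone of a local repository only copies that repository's local branches,
# so fetch from the real origin before checking out its s5_skynet branch.
git fetch origin
git checkout -b s5_skynet origin/s5_skynet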
git pull

Albert then did his first build, which of course required a compile from scratch:

cd lal
./00boot
mkdir build_s5_skynet
cd build_s5_skynet
../configure --prefix=/home/albert/opt_s5_skynet/lal
make -j3 install
source /home/albert/opt_s5_skynet/lal/etc/lal-user-env.sh
cd ../../lalapps
./00boot
mkdir build_s5_skynet
cd build_s5_skynet
../configure --prefix=/home/albert/opt_s5_skynet/lalapps
make -j3 install

Henceforth, Albert can build this branch in the ~/lalsuite_s5_skynet area without gratuitous recompiles:

git checkout s5_skynet
git pull
cd lal/build_s5_skynet
make -j3 install
source /home/albert/opt_s5_skynet/lal/etc/lal-user-env.sh
cd ../../lalapps/build_s5_skynet
make -j3 install
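
The incremental rebuilds above all follow the same pattern, so they can be wrapped in a small function. This is only a sketch: rebuild_branch and the OPT_ROOT variable are hypothetical names, and the function assumes the build_<branch> and opt_<branch> naming used in this document and that configure has already been run in both build directories.

```shell
# Hypothetical helper: incremental rebuild of one branch in one clone.
# Usage: rebuild_branch CLONE_DIR BRANCH
# The body runs in a subshell, so the caller's working directory and
# environment are left untouched.
rebuild_branch() (
    clone=$1
    branch=$2
    prefix=${OPT_ROOT:-$HOME}/opt_$branch   # OPT_ROOT is an assumed override
    cd "$clone" &&
    git checkout "$branch" &&
    git pull &&
    cd "lal/build_$branch" &&
    make -j3 install &&
    . "$prefix/lal/etc/lal-user-env.sh" &&  # same effect as "source"
    cd "../../lalapps/build_$branch" &&
    make -j3 install
)

# Usage sketch:
#   rebuild_branch ~/lalsuite master
#   rebuild_branch ~/lalsuite_s5_skynet s5_skynet
```

Because of the subshell, the LAL environment is sourced only for the duration of the build; source lal-user-env.sh yourself afterwards if you want it in your interactive shell.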

Building tags

Albert was asked to run a search with a tag given to him by his analysis leader. The tag was near the HEAD of the master branch, so strictly speaking he didn't need to rerun 00boot (see When do I need to run 00boot?). However, his reviewers reminded him emphatically that LIGO analyses must be run from tagged code, and reviewers feel a lot more comfortable if a tag is built without reusing intermediate compilation products from previous builds. Albert therefore cleaned his tree, created a separate build directory, and patiently sat through a full rebuild. Since a tagged run for analysis needs to control all versions and not use released binary packages, a top-level build of everything is appropriate.

make distclean
git checkout s5_highmass_eobadger
./00boot
mkdir build_s5_highmass_eobadger
cd build_s5_highmass_eobadger
../configure --prefix=/home/albert/opt_s5_highmass_eobadger
make -j3 install
source /home/albert/opt_s5_highmass_eobadger/etc/lal-user-env.sh
source /home/albert/opt_s5_highmass_eobadger/etc/lalapps-user-env.sh

When I check out a tag, why does git branch say that I'm on (no branch)? Tags are fixed pointers to the state of the whole repository. Branches are movable pointers; the branch tip advances with each commit. It's not obvious what a movable pointer should mean for a tag: what should happen if you commit on top of a tag? The originating branch in general already has a future. You can spawn a new branch from a tag, but git doesn't assume that this is what you want, and indeed, you rarely want that. To find out what tag you have checked out, use git describe.
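
Since analyses must run from tagged code, it can be handy to check mechanically that HEAD sits exactly on a tag before starting a run. A minimal sketch, assuming nothing beyond stock git; the helper name current_tag is hypothetical.

```shell
# Hypothetical helper: print the tag that HEAD sits on exactly, or fail.
# --tags makes git describe consider lightweight (unannotated) tags too;
# --exact-match refuses to fall back to "nearest tag plus N commits".
current_tag() {
    git describe --tags --exact-match HEAD 2>/dev/null
}

# Usage sketch: refuse to start an analysis from untagged code.
#   tag=$(current_tag) || { echo "HEAD is not on a tag" >&2; exit 1; }
#   echo "running from tag $tag"
```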

Managing development

Bugfix or solitary development

Albert got an email in which he learned about a bug in his program, lalapps_gitn00b. After some quick thought, Albert determined that the fix required changes in one or two files. This was his git workflow:

git checkout master
git pull
git checkout -b bugfix_gitn00b

Albert is now on a local branch (right now identical to master) named bugfix_gitn00b. On this local branch, he can commit his edits as often as he wants, as the transactions are all local to his working copy. To build the code for testing, he can generally reuse master's build system:

cd lal/build_master
make -j3 install
source /home/albert/opt_master/lal/etc/lal-user-env.sh
cd ../../lalapps/build_master
make -j3 install

When everything was working, Albert needed to file a Gnats PR. Part of the patch submission protocol is to include a patch in the Gnats PR. Here's how he can prepare patches:

git checkout master
git pull
git checkout bugfix_gitn00b
git diff master

Albert used git diff to find all differences between the current branch and master. In Gnats, it's often most convenient for patch reviewers if you copy and paste the patches rather than attach them.
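
One caveat: if the git pull brought new commits into master after the bugfix branch was created, a plain git diff master also shows those commits, in reverse. The three-dot form of git diff compares the branch against the point where it left master, so only the branch's own changes appear. The helper name and the output file name below are illustrative only.

```shell
# Hypothetical helper: write only the changes made on TOPIC since it
# diverged from BASE (three-dot diff) to OUTFILE.
# Usage: branch_patch BASE TOPIC OUTFILE
branch_patch() {
    git diff "$1...$2" > "$3"
}

# Usage sketch (the file name is a placeholder):
#   branch_patch master bugfix_gitn00b gitn00b.patch
```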

Later, when Albert had fixed the bug, tested, committed, and gotten approval on his Gnats PR to proceed, he was ready to push. He finished with:

git checkout master
git pull
git checkout bugfix_gitn00b
git rebase master
git checkout master
git merge bugfix_gitn00b
git push
git branch -d bugfix_gitn00b

In other words, he goes back to the master branch and synchronizes with the remote server to make sure he is up to date. If other users have made no scary changes to his files of interest, he rebases bugfix_gitn00b onto master, which suppresses extra merge commits, and then merges his changes onto master. If there are no conflicts, he pushes his changes to the remote repository. bugfix_gitn00b then has no further purpose, so he deletes it. Finally, he should close the Gnats PR and leave a note of the new commit hash.
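
The reason the rebase suppresses merge commits is that, once bugfix_gitn00b has been rebased onto master, master is an ancestor of the branch and the merge reduces to a fast-forward. A minimal sketch of checking this before merging; the helper name is hypothetical, and git merge-base --is-ancestor requires a reasonably recent git.

```shell
# Hypothetical helper: succeed if merging TOPIC into BASE would be a
# fast-forward, i.e. BASE is an ancestor of TOPIC.
# Usage: is_fast_forward BASE TOPIC
is_fast_forward() {
    git merge-base --is-ancestor "$1" "$2"
}

# Usage sketch, run while master is checked out:
#   is_fast_forward master bugfix_gitn00b && git merge --ff-only bugfix_gitn00b
```

With --ff-only, git merge refuses to proceed rather than create a merge commit, which makes a forgotten rebase obvious.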

How often should I commit? It is good practice to generate a separate commit for every change that fixes, alters, or adds a single aspect of the code and which is tested for basic functionality. Multiple files can be lumped together (git add file1 [file2 [...]]; git commit) if the changes to them affect the same aspect of code function and do not make much sense separately.
Why bother with a local branch? On a local branch, your work won't be affected if you need to switch gears midway through, as is so often the case. You can just commit (or git stash) and switch back to master to take care of other tasks that need doing more urgently. Also, you maintain a clean master, which means that you can merge and push your bugfixes/features at any time in any order, even if your first PR is stalled and your second PR is approved immediately.
Rebase frightens me; do I have to do it? You are wise to fear git rebase; it is extremely powerful and, if you use it improperly, you can trash your local repository (origin/master is protected). The git checkout bugfix_gitn00b; git rebase master; git checkout master sequence is safe and highly recommended. It suppresses merge commits, which leads to a much cleaner development history. More information regarding rebase can be found in the git-rebase man page and in the LALSuite Git Documentation.
I got a conflict! What do I do? If you have a conflict, git merge should tell you so in no uncertain terms: CONFLICT (content): Merge conflict in test_file.txt. For each conflicted file, search for <<<<<<< (seven characters). You'll see two versions of the same block of code. Keep one and delete the other or otherwise put the file into a good state. Be sure to delete the marker lines (the <<<<<<<, =======, and >>>>>>> lines), too. When you've resolved all conflicts, continue the merge/pull with git add test_file.txt; git commit or, if the conflict occurred during a rebase, with git add test_file.txt; git rebase --continue.
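
As a sketch of how to confirm that no marker lines were left behind, the hypothetical helper below greps a file for lines that begin with exactly seven <, =, or > characters. It is a heuristic: a file that legitimately contains such a line (for example a seven-character ======= underline) will also match.

```shell
# Hypothetical helper: succeed if FILE still contains conflict marker lines.
# A marker is seven <, = or > characters at the start of a line, followed
# by a space or the end of the line.
has_conflict_markers() {
    grep -E -q '^(<<<<<<<|=======|>>>>>>>)( |$)' "$1"
}

# Usage sketch: list the files git still considers unmerged, then check
# one of them after editing it.
#   git diff --name-only --diff-filter=U
#   has_conflict_markers test_file.txt && echo "test_file.txt still has markers"
```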

Collaboration on major feature

Albert's adviser decided it would be a great idea to add a dodecahedral coincidence test (dthinca) to the CBC pipeline. This required weeks of coding with team-members Kip, Rai, and Ron. They decided that Albert would be the gate-keeper/patch-collector to hold the authoritative copy of their common codebase. Here's how they operated smoothly.

  1. Albert first created a branch on his local copy:
    git checkout master
    git pull
    git checkout -b dev_dthinca
    
  2. Kip, Rai, and Ron then created tracking branches to follow Albert's branch. Albert's repository was situated on the Nemo cluster in /home/albert/lalsuite. Kip, Rai, and Ron do:
    git remote add -f -t dev_dthinca -m dev_dthinca albert_repo pcdev1.phys.uwm.edu:/home/albert/lalsuite/.git
    git checkout -b dev_dthinca albert_repo/dev_dthinca
    

Kip, Rai, and Ron now have a tracking branch named dev_dthinca. With git pull on that branch, they can see everything that Albert has committed. However, they cannot push to it. Let's follow Ron as he attacks the problem at hand.

Ron is ready to start developing on his tracking branch. The tracking branch tracks dev_dthinca on Albert's repository albert_repo in the same way that the local master tracks the authoritative master on origin. Because it's rare to work on just one thing at a time, it can help maintain sanity to keep each new feature on a separate local branch, branched from the dev_dthinca branch, much like in the bugfix case above, but where dev_dthinca replaces master. Here, Ron wants to add a dodecahedral overlap function:

git checkout dev_dthinca
git checkout -b dthinca_dodec_overlap

Some time later, Ron made some progress on his dodecahedral overlap function on his local dthinca_dodec_overlap branch and the team wants to test it out. Ron has committed many changes to many files. Here's how he would prepare the patches to send to Albert, the gate-keeper:

git checkout dev_dthinca
git pull
git checkout dthinca_dodec_overlap
git format-patch dev_dthinca

The command produces one patch file for each commit in dthinca_dodec_overlap that is not in dev_dthinca. Ron then emails the patch files to Albert, who applies each of them, in order, to dev_dthinca:

git checkout dev_dthinca
git am PATCH_FILENAME

Kip, Rai, and Ron can now pull the changes from Albert. The commit messages and the fact that Ron authored those commits are preserved.
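
As a sketch of the whole round trip, git format-patch can write its per-commit files into a dedicated directory, and git am can then apply all of them in order. The helper names and directory paths here are illustrative and not part of any LSC tooling.

```shell
# Hypothetical helpers for the patch round trip.

# Contributor side: write one numbered patch file into OUTDIR for every
# commit that is on the current branch but not on BASE.
# Usage: export_patches BASE OUTDIR
export_patches() {
    git format-patch -q -o "$2" "$1"
}

# Gate-keeper side: apply every patch in PATCHDIR to the checked-out
# branch. The shell expands *.patch in sorted order, which matches the
# 0001-, 0002-, ... numbering that format-patch gives the files.
# Usage: apply_patches PATCHDIR
apply_patches() {
    git am -q "$1"/*.patch
}

# Usage sketch (paths are placeholders):
#   Ron:    git checkout dthinca_dodec_overlap && export_patches dev_dthinca /tmp/dodec
#   Albert: git checkout dev_dthinca && apply_patches /path/to/received/patches
```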

Albert himself can develop on local branches. To share with the others, he need only check out dev_dthinca and merge his local branch into it.

At the end of the process, once the feature is complete and the team is ready to unleash dthinca on the rest of the collaboration, Albert can merge it to master and push it:

git checkout master
git pull
git merge dev_dthinca
git push
git branch -d dev_dthinca

The code is now part of the trunk, and any future fixes and development should proceed as usual. Kip, Rai, and Ron can likewise delete their dev_dthinca branches and associated local branches.

Why not just push a branch to origin? Albert could have chosen to create a local branch and push it to origin. This would make it readable by the whole world and writeable by the LVC. Albert may want tighter control than that at the outset to keep the development focused. Furthermore, pushing to a branch on origin clutters the long list of remote branches available to everyone (git branch -a) and triggers push-announcement emails to be sent. Finally, if you end up abandoning this work, you don't leave skeletons behind on origin.