Git Basics

Git can be very complicated. Someone once told me that we mere humans have a very difficult time with it at first. I myself have had a tremendously difficult time learning how to use Git (many thanks to marktraceur for all the help). It is an incredibly robust solution, and so a very complicated one. What source code management system isn’t, though (especially one that is command line)? This document should serve as a very high-level view of how to use Git. It will not cover advanced functionality such as cherry-picking, merging, rebasing, etc. If something is not documented here, please see the Git docs or suggest it on the discussion page.

Working with Branches

Branches in Git are like tree branches. The Git repository itself is the trunk and the branches are the various projects in the repository. Typically (hopefully) these projects are related to each other. In the case of a development project with a frequently changing database schema that you wanted to back up, the repository would have two branches: the files branch where the code files are stored, and the database branch where the database dumps are stored.

Viewing Branches

Viewing branches is simple. Type git branch and you should see output similar to the following:

$ git branch

* database
  master

To use a different branch, the checkout command is required. In this case, we will switch from the database branch to the master branch.

Note: Some decompression happens here, so if the branch to be checked out is very large, this will likely take a few seconds.

$ git checkout master

Checking out files: 100% (6110/6110), done.
Switched to branch 'master'
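
To try this without touching a real project, the listing and switching above can be sketched in a throwaway repository. The temporary directory, the file notes.txt, the placeholder identity, and the commit message are all made up for illustration; only the branch names come from the example above.

```shell
# Build a scratch repository (hypothetical paths and names throughout).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"       # placeholder identity, local to this repo
git config user.email "user@example.org"

# A branch needs at least one commit before others can be created from it.
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"

git branch database        # create a second branch
git checkout -q database   # switch to it
git branch                 # lists both branches, * marks the current one

git checkout -q master     # and back again
current=$(git rev-parse --abbrev-ref HEAD)
echo "now on: $current"
```

The symbolic-ref line is only there because newer Git installations may name the first branch something other than master.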

Commits

Git does not have commitmentphobia. In fact, it loves commits as if it were its only purpose in life.

In most if not all source code management software, a commit is essentially a set of changes to be merged into the master repository.

To create a commit, there are several steps that need to take place.

Firstly, the changed files to be pushed to the repository need to be added. For this, we use the git add command.

$ git add ./ex1.blah
$ git add ./example2.blah

One handy bit for this is the -A switch. If used, git will recursively stage every change in the specified directory (new, modified, and deleted files) for the commit. This is very handy if many files were changed.

$ git add -A .

Once the changed files are set up for commit, we just need one more step. Run git commit and you will be taken to a text editor (likely vi; this can be changed in the Git configuration) to add comments on your commit so you and other developers know what was changed in case something is broken or someone wants to revert.

This piece is key if you are using the git repository as a code repository rather than a versioning repository for backups. Please write in meaningful comments.
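
The add and commit steps can be sketched end to end in a scratch repository. This is only an illustration: the temporary directory, the identity, the file ex3.blah, and the commit messages are invented, and the -m option is used to supply the message on the command line so that no editor opens.

```shell
# Scratch repository; nothing here touches an existing project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.org"

# First commit: add files one by one.
echo "first" > ex1.blah
echo "second" > example2.blah
git add ./ex1.blah
git add ./example2.blah
git commit -q -m "Add two example files"

# Second commit: modify, delete, and create files, then stage it all at once.
echo "changed" >> ex1.blah
rm example2.blah
echo "third" > ex3.blah
git add -A .
git status --short        # lists the staged changes
git commit -q -m "Update ex1, remove example2, add ex3"

commits=$(git rev-list --count HEAD)   # number of commits on this branch
pending=$(git status --porcelain)      # empty when nothing is left to commit
echo "commits: $commits"
```

Leaving off -m gives the editor-based flow described above, which is usually the better choice when the message needs more than one line.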

There is actually one more piece to committing a change if you have a remote repository on another box or a different location on the local box. So that other developers can pull the repository and get your changes, you need to push your changes to the remote repository. To do this, we use the git push command. Please see the Pushing Changes to the Remote Repository section for more information on this.

Logs

All of this commit and commit log business is a bit worthless if we can’t look at logs. To look at the logs we use the git log command. This will open up your system’s pager (typically less is the one used) to view the logs for the current branch. If you wish to view the logs on a different branch, you can either check out that branch, or you can type git log BranchName.

A handy option for the git log command is the --name-status switch. If you use this switch, git will list all of the commit logs along with all of the files affected and what was done (modified, deleted, created, renamed) in each individual commit.
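
A small sketch of the --name-status switch, again in a scratch repository with invented file names and messages. It uses git's --no-pager option so the log prints straight to the terminal instead of opening less, which makes it easier to capture.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.org"

# Two commits touching the same file: one creates it, one modifies it.
echo "one" > ex1.blah
git add ex1.blah
git commit -q -m "Create ex1.blah"
echo "two" >> ex1.blah
git add ex1.blah
git commit -q -m "Extend ex1.blah"

# Each commit is followed by a status letter and the affected path.
log_output=$(git --no-pager log --name-status)
echo "$log_output"
```

In the output, the status letter A marks a file that was added and M one that was modified.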

Remote Repositories

Git is a distributed code versioning system which means that every person that has pulled the repository has a complete copy of the original. This is really great for working remotely because you don’t have to be online and able to talk to the remote repository to see change history.

Adding a Remote Repository

Git needs several things to add a remote repository. Firstly, it needs a local alias for the remote repository. It also needs a username to log in to the repo with, the IP address or hostname of the server, and the path to the actual repo directory on the remote server. With that, the command to add a remote repository looks somewhat like this:

$ git remote add origin gitman@someserver.org:repos/CleverProjectName

Now, let’s break down what that all means since it seems a tad complicated.

git remote
This is the command to work with remote servers in git.

add
Tells git we are adding a remote.

origin
The local alias for the remote. Origin is typically used here.

gitman
The username to log in to the remote server with.

@someserver.org
This is the server where the repo is stored.

:repos/CleverProjectName
This is the path to the actual repository directory. Since it does not start with a / it starts in the home directory of gitman (~/).
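
Adding a remote only records the alias and address in the local repository configuration; it does not contact the server. That makes it safe to try in a scratch repository, as in this sketch (the temporary directory is invented, the address is the one from the example above).

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Register the remote under the alias "origin".
git remote add origin gitman@someserver.org:repos/CleverProjectName

# List configured remotes with their addresses.
git remote -v

# The address is stored in the repository configuration under this key.
url=$(git config --get remote.origin.url)
echo "origin points at: $url"
```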

Fetching a Remote Repository

Now that we have a remote repository added to our local git repository, we simply need to fetch the repo. To do this we use the git fetch command. Here is where that alias from the remote add command comes in handy.

$ git fetch origin

This command will fetch all branches of the origin repository.
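
Fetching can be tried without a real server by letting a bare repository on the local disk stand in for one. Everything in this sketch (the directory layout, server.git, seed, local, the identity, and the file) is hypothetical scaffolding; only git fetch origin is the command described above.

```shell
work=$(mktemp -d)

# A bare repository plays the part of the remote server.
git init -q --bare "$work/server.git"

# Seed it with one commit on master from a separate working repository.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.org"
echo "hello" > ex1.blah
git add ex1.blah
git commit -q -m "Initial commit"
git push -q "$work/server.git" master

# A fresh local repository adds the "server" as origin and fetches from it.
git init -q "$work/local"
cd "$work/local"
git remote add origin "$work/server.git"
git fetch -q origin

# Fetched branches show up as remote-tracking branches named origin/<branch>.
remote_branches=$(git branch -r)
echo "$remote_branches"
```

Note that fetch only downloads into the remote-tracking branches; it does not change any files in the working directory.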

Pushing Changes to the Remote Repository

Now that we have a local copy of a repository to work on and have made some changes, some amount of code synchronization needs to take place with the origin repository so each of the developers can have the latest-and-greatest. Keep in mind that a commit only records changes in your local copy of the repository. What needs to happen after a commit is to push the change to the origin repository so everyone else will also have access to your change set. To do this, we use the git push command.

There are two parameters for this though. The first is the local alias for the remote repository (typically referred to as origin since presumably the remote server is where your repository originated). The second parameter is the branch name. Since we often have more than one branch, this is a good piece to pay attention to so you don’t submit a database dump file to the code branch.

$ git push origin master
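
As a rough sketch, the push can be rehearsed against a bare repository on the local disk instead of a real server. The directory names, identity, file, and commit message are invented for the example.

```shell
work=$(mktemp -d)

# A bare repository stands in for the remote server.
git init -q --bare "$work/server.git"

# Local working repository with the "server" registered as origin.
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.org"
git remote add origin "$work/server.git"

# Commit locally, then push the master branch to origin.
echo "hello" > ex1.blah
git add ex1.blah
git commit -q -m "Add ex1.blah"
git push -q origin master

# After the push, both sides point at the same commit.
local_head=$(git rev-parse refs/heads/master)
server_head=$(git --git-dir="$work/server.git" rev-parse refs/heads/master)
echo "local:  $local_head"
echo "server: $server_head"
```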

Dealing with Size Issues

Since git is a code versioning system that contains as many versions of a file as the number of commits, its size can grow out of hand rather quickly, especially when dealing with binaries. Luckily, there is a handy command for this very situation: git gc.

This command cleans up the repository and compresses its stored objects in the context of each other, packing similar versions of files together. This can reduce the size of your local and/or remote repositories very effectively. I have a repository that would otherwise be several gigabytes with about 60 commits per branch (it’s a repo used for versioned backups), and git gc reduced it to about 370 megabytes.
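
To see what git gc does on a small scale, the sketch below makes a handful of commits in a scratch repository and compares git count-objects -v before and after. The repository, the file data.txt, and the loop are made up for the example, and no particular size saving is implied; a repository this small will not shrink noticeably.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.org"

# Five commits, each adding a line to the same file.
for i in 1 2 3 4 5; do
    echo "version $i" >> data.txt
    git add data.txt
    git commit -q -m "Version $i of data.txt"
done

# "count" is the number of loose (unpacked) objects.
loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')

git gc --quiet

# After gc the objects live in a pack file instead.
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packed_after=$(git count-objects -v | awk '/^in-pack:/ {print $2}')
echo "loose objects before: $loose_before, after: $loose_after, packed: $packed_after"
```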