GitPython Tutorial
GitPython provides object model access to your git repository. Once you have created a repository object, you can traverse it to find parent commit(s), trees, blobs, etc.
Initialize a Repo object
The first step is to create a Repo object to represent your repository.
>>> from git import Repo
>>> repo = Repo("/Users/mtrier/Development/git-python")
In the above example, the directory /Users/mtrier/Development/git-python is my working repository and contains the .git directory. You can also create a Repo object backed by a bare repository.
>>> repo = Repo.create("/var/git/git-python.git")
Getting a list of commits
From the Repo object, you can get a list of Commit objects.
>>> repo.commits()
[<git.Commit "207c0c4418115df0d30820ab1a9acd2ea4bf4431">,
<git.Commit "a91c45eee0b41bf3cdaad3418ca3850664c4a4b4">,
<git.Commit "e17c7e11aed9e94d2159e549a99b966912ce1091">,
<git.Commit "bd795df2d0e07d10e0298670005c0e9d9a5ed867">]
Called without arguments, Repo.commits returns a list of up to ten commits reachable from the master branch, starting at the latest commit. You can also ask for commits beginning at a different branch, commit, tag, etc.
>>> repo.commits('mybranch')
>>> repo.commits('40d3057d09a7a4d61059bca9dca5ae698de58cbe')
>>> repo.commits('v0.1')
You can specify the maximum number of commits to return.
>>> repo.commits('master', max_count=100)
If you need paging, you can specify a number of commits to skip.
>>> repo.commits('master', max_count=10, skip=20)
The above will return commits 21-30 from the commit list.
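The interaction between max_count and skip can be pictured as a list slice. The following is a minimal sketch rather than GitPython code: page is a hypothetical helper, and the list of integers stands in for a commit history ordered newest first.

```python
def page(history, max_count=10, skip=0):
    """Return at most max_count items after skipping the first skip items."""
    return history[skip:skip + max_count]


# Stand-in for a commit list: 1 is the newest commit, 50 the oldest.
history = list(range(1, 51))

third_page = page(history, max_count=10, skip=20)
print(third_page[0], third_page[-1])  # 21 30
```

Asking for a page past the end of the history simply yields a shorter (possibly empty) list in this sketch.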
The Commit object
Commit objects contain information about a specific commit.
>>> head = repo.commits()[0]
>>> head.id
'207c0c4418115df0d30820ab1a9acd2ea4bf4431'
>>> head.parents
[<git.Commit "a91c45eee0b41bf3cdaad3418ca3850664c4a4b4">]
>>> head.tree
<git.Tree "563413aedbeda425d8d9dcbb744247d0c3e8a0ac">
>>> head.author
<git.Actor "Michael Trier <mtrier@gmail.com>">
>>> head.authored_date
(2008, 5, 7, 5, 0, 56, 2, 128, 0)
>>> head.committer
<git.Actor "Michael Trier <mtrier@gmail.com>">
>>> head.committed_date
(2008, 5, 7, 5, 0, 56, 2, 128, 0)
>>> head.message
'cleaned up a lot of test information. Fixed escaping so it works with subprocess.'
Note: dates are represented in struct_time format. They can be converted to human-readable form with the functions in the time module.
>>> import time
>>> time.asctime(head.committed_date)
'Wed May  7 05:00:56 2008'
>>> time.strftime("%a, %d %b %Y %H:%M", head.committed_date)
'Wed, 07 May 2008 05:00'
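The same tuple can be formatted or converted without a repository at hand. This is a small sketch using only the standard library. The tuple is copied from the committed_date output above, and treating it as UTC is an assumption made purely for illustration.

```python
import calendar
import time

# The struct_time-style tuple shown above:
# (year, month, day, hour, minute, second, weekday, yearday, isdst)
committed_date = (2008, 5, 7, 5, 0, 56, 2, 128, 0)

# Numeric formatting that does not depend on locale-specific day names.
iso_stamp = time.strftime("%Y-%m-%d %H:%M:%S", committed_date)

# Seconds since the epoch, ASSUMING the tuple is expressed in UTC.
epoch_seconds = calendar.timegm(committed_date)

print(iso_stamp)  # 2008-05-07 05:00:56
```

Converting to epoch seconds is convenient when commits need to be sorted or compared by date.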
You can traverse a commit’s ancestry by chaining calls to parents.
>>> repo.commits()[0].parents[0].parents[0].parents[0]
The above corresponds to master^^^ or master~3 in git parlance.
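Chaining parents[0] by hand becomes awkward for deeper history. A possible helper is sketched below. FakeCommit is a hypothetical stand-in that only mimics the parents attribute, and nth_ancestor is not part of GitPython.

```python
class FakeCommit:
    """Hypothetical stand-in exposing only the attributes used here."""

    def __init__(self, id, parents=()):
        self.id = id
        self.parents = list(parents)


def nth_ancestor(commit, n):
    """Follow the first parent n times, like <rev>~n.

    Returns None if the history is shorter than n commits.
    """
    for _ in range(n):
        if not commit.parents:
            return None
        commit = commit.parents[0]
    return commit


# A linear history of four commits: c3 -> c2 -> c1 -> c0.
root = FakeCommit("c0")
c1 = FakeCommit("c1", [root])
c2 = FakeCommit("c2", [c1])
head = FakeCommit("c3", [c2])

print(nth_ancestor(head, 3).id)  # c0
```

Only the first parent is followed, so merge commits are traversed along their first-parent line, which is what the ~n notation does in git.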
The Tree object
A tree records pointers to the contents of a directory. Let’s say you want the root tree of the latest commit on the master branch.
>>> tree = repo.commits()[0].tree
>>> tree
<git.Tree "a006b5b1a8115185a228b7514cdcd46fed90dc92">
>>> tree.id
'a006b5b1a8115185a228b7514cdcd46fed90dc92'
Once you have a tree, you can get the contents.
>>> contents = tree.values()
>>> contents
[<git.Blob "6a91a439ea968bf2f5ce8bb1cd8ddf5bf2cad6c7">,
<git.Blob "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391">,
<git.Tree "eaa0090ec96b054e425603480519e7cf587adfc3">,
<git.Blob "980e72ae16b5378009ba5dfd6772b59fe7ccd2df">]
The tree implements the dictionary protocol, so it can be used just like a dictionary, with some additional properties.
>>> tree.items()
[('lib', <git.Tree "310ebc9a0904531438bdde831fd6a27c6b6be58e">),
('LICENSE', <git.Blob "6797c1421052efe2ded9efdbb498b37aeae16415">),
('doc', <git.Tree "a58386dd101f6eb7f33499317e5508726dfd5e4f">),
('MANIFEST.in', <git.Blob "7da4e346bb0a682e99312c48a1f452796d3fb988">),
('.gitignore', <git.Blob "6870991011cc8d9853a7a8a6f02061512c6a8190">),
('test', <git.Tree "c6f6ee37d328987bc6fb47a33fed16c7886df857">),
('VERSION', <git.Blob "9faa1b7a7339db85692f91ad4b922554624a3ef7">),
('AUTHORS', <git.Blob "9f649ef5448f9666d78356a2f66ba07c5fb27229">),
('README', <git.Blob "9643dcf549f34fbd09503d4c941a5d04157570fe">),
('ez_setup.py', <git.Blob "3031ad0d119bd5010648cf8c038e2bbe21969ecb">),
('setup.py', <git.Blob "271074302aee04eb0394a4706c74f0c2eb504746">),
('CHANGES', <git.Blob "0d236f3d9f20d5e5db86daefe1e3ba1ce68e3a97">)]
The items() listing above contains nine Blob objects and three Tree objects. The trees are subdirectories and the blobs are files. Trees below the root have additional attributes.
>>> contents = tree["lib"]
>>> contents
<git.Tree "c1c7214dde86f76bc3e18806ac1f47c38b2b7a30">
>>> contents.name
'lib'
>>> contents.mode
'040000'
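Because trees behave like dictionaries whose values are blobs or further trees, a recursive walk is enough to list every file path. The sketch below uses hypothetical stand-in classes (FakeTree, FakeBlob) and invented file names; it illustrates the traversal idea only and does not use GitPython itself.

```python
class FakeBlob:
    """Hypothetical stand-in for a file entry."""


class FakeTree(dict):
    """Hypothetical stand-in for a directory: maps names to blobs or trees."""


def list_files(tree, prefix=""):
    """Return the paths of all blobs below tree, depth first, sorted by name."""
    paths = []
    for name, entry in sorted(tree.items()):
        path = prefix + name
        if isinstance(entry, FakeTree):
            # Subdirectory: recurse with the directory name as prefix.
            paths.extend(list_files(entry, path + "/"))
        else:
            paths.append(path)
    return paths


# Invented layout, used only to exercise the function.
root = FakeTree({
    "README": FakeBlob(),
    "lib": FakeTree({"git": FakeTree({"repo.py": FakeBlob()})}),
    "setup.py": FakeBlob(),
})

print(list_files(root))  # ['README', 'lib/git/repo.py', 'setup.py']
```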
There is a convenience method that lets you get a named sub-object from a tree with a syntax similar to how paths are written on a Unix system.
>>> tree/"lib"
<git.Tree "c1c7214dde86f76bc3e18806ac1f47c38b2b7a30">
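One way such a path-like syntax can be provided in Python is by overloading the division operator. The class below is a hypothetical illustration of that technique, not GitPython's actual implementation. In Python 3 the / operator maps to __truediv__, while Python 2 used __div__ for the same purpose.

```python
class PathTree(dict):
    """Hypothetical dictionary-like tree that also supports tree / "name"."""

    def __truediv__(self, name):
        # Delegate to normal dictionary lookup, so a missing name
        # raises KeyError just as tree[name] would.
        return self[name]


root = PathTree({"lib": PathTree({"git": PathTree()})})

# Both spellings return the very same object.
assert (root / "lib") is root["lib"]

# The operator chains left to right, which reads like a path.
assert (root / "lib" / "git") is root["lib"]["git"]
```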
You can also get a tree directly from the repository if you know its name.
>>> repo.tree()
<git.Tree "master">
>>> repo.tree("c1c7214dde86f76bc3e18806ac1f47c38b2b7a30")
<git.Tree "c1c7214dde86f76bc3e18806ac1f47c38b2b7a30">
The Blob object
A blob represents a file. Trees often contain blobs.
>>> blob = tree['urls.py']
>>> blob
<git.Blob "b19574431a073333ea09346eafd64e7b1908ef49">
A blob has certain attributes.
>>> blob.name
'urls.py'
>>> blob.mode
'100644'
>>> blob.mime_type
'text/x-python'
>>> blob.size
415
You can get the data of a blob as a string.
>>> blob.data
"from django.conf.urls.defaults import *\nfrom django.conf..."
You can also get a blob directly from the repo if you know its name.
>>> repo.blob("b19574431a073333ea09346eafd64e7b1908ef49")
<git.Blob "b19574431a073333ea09346eafd64e7b1908ef49">
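The name used here is the blob's SHA-1 object id, which git derives from the content itself: it hashes a header of the form "blob <size>" followed by a NUL byte and then the raw bytes. The sketch below reproduces that calculation with the standard library only. git_blob_id is a hypothetical helper, not a GitPython function.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Compute the SHA-1 object id git assigns to a blob with this content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# The id of an empty file.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The same id appears in the tree.values() listing earlier in this tutorial, which suggests that entry is an empty file.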
What Else?
There is more in the library, such as the ability to tar or gzip repos, plus stats, log, blame, and probably a few other things. Additionally, calls to the git instance are handled through a __getattr__ construct, which makes any git command available directly and converts Python dicts to command-line parameters.
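The __getattr__ technique mentioned above can be sketched as follows. This is an illustration under stated assumptions, not GitPython's code: GitCommandBuilder is a hypothetical class, the option-conversion rules (single letters become -k value, longer names become --key=value, underscores become dashes) are assumptions, and the sketch only builds an argument list instead of running git.

```python
class GitCommandBuilder:
    """Hypothetical sketch: turns attribute access into git command lines."""

    def _option(self, key, value):
        # Assumed rule: one-letter keys are short options, others are long.
        flag = ("-" if len(key) == 1 else "--") + key.replace("_", "-")
        if value is True:
            return [flag]
        if len(key) == 1:
            return [flag, str(value)]
        return [flag + "=" + str(value)]

    def __getattr__(self, name):
        # Only called when normal lookup fails, so every unknown public
        # attribute is treated as a git subcommand.
        if name.startswith("_"):
            raise AttributeError(name)

        def build(*args, **kwargs):
            # Keyword arguments arrive as a dict and become options.
            argv = ["git", name.replace("_", "-")]
            for key, value in sorted(kwargs.items()):
                if value is None or value is False:
                    continue
                argv.extend(self._option(key, value))
            argv.extend(str(arg) for arg in args)
            return argv

        return build


git = GitCommandBuilder()
print(git.log("master", n=5, pretty="oneline"))
# ['git', 'log', '-n', '5', '--pretty=oneline', 'master']
```

A real wrapper would hand the resulting list to something like subprocess to execute the command and capture its output.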
Check the unit tests; they are pretty exhaustive.