Occasionally we need to clean up a repository after accidentally committing large files. We may delete the files and commit the project again, but the repository size remains huge because the files are still stored in the history. This is what I did to clean up the repository.
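To see why deleting the file is not enough, here is a small sketch that builds a throwaway repository, commits a 1 MB file, deletes it in a second commit, and then checks whether Git still stores the file's contents. The directory, file name, and identity settings are made up for the demo and are not part of the original cleanup.

```shell
# Throwaway demo repository; every name below is illustrative only.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit a 1 MB file "by mistake", then delete it in a follow-up commit.
head -c 1048576 /dev/urandom > big.bin
git add big.bin
git commit -q -m "Add big.bin by mistake"
blob_id=$(git rev-parse HEAD:big.bin)
git rm -q big.bin
git commit -q -m "Remove big.bin"

# The file is gone from the working tree, but its blob is still in the
# object database because the first commit still references it.
blob_type=$(git cat-file -t "$blob_id")
echo "object $blob_id is still stored as a $blob_type"

# Show how much space the repository's objects take up.
git count-objects -vH
```

The size reported by `git count-objects` stays at roughly the size of the deleted file, which is the situation the rest of this post deals with.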
Rewriting Git history changes the ids of all the affected commits, so everyone who’s working on the project will need to delete their old copies of the repo and do a fresh clone after you’ve cleaned the history. The more people it inconveniences, the more you need a good reason to do it. If a superfluous file isn’t really causing a problem, it may not be worth the disruption, but if only you are working on the project, you might as well clean up the Git history if you want to!
To make it as easy as possible, I’d recommend using the BFG Repo-Cleaner, a simpler, faster alternative to git-filter-branch specifically designed for removing files from Git history. One way in which it makes your life easier here is that it actually handles all refs by default (all tags, branches, etc) but it’s also 10 – 50x faster.
You should carefully follow the steps here: http://rtyley.github.com/bfg-repo-cleaner/#usage – but the core bit is just this: download the BFG jar (requires Java 6 or above) and run this command:
$ java -jar bfg.jar --delete-files filename.orig my-repo.git
Your entire repository history will be scanned, and any file named filename.orig (that’s not in your latest commit) will be removed. This is considerably easier than using git-filter-branch to do the same thing!
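For context, the BFG usage page wraps that command in a few extra steps: clone a bare mirror, run BFG, then expire the reflog and garbage-collect so the space is actually reclaimed. The sketch below bundles those steps into a helper. The function name `bfg_delete_file` and its arguments are hypothetical, and it assumes `bfg.jar` sits in the current directory with Java on the PATH.

```shell
# Hypothetical helper: the function name and arguments are illustrative.
# Assumes bfg.jar is in the current directory and java is on the PATH.
bfg_delete_file() {
    repo_url=$1     # URL of the repository to clean (placeholder)
    file_name=$2    # file to strip from history, e.g. filename.orig

    # Work on a fresh bare mirror so every branch and tag is included.
    git clone --mirror "$repo_url" my-repo.git || return 1

    # Remove the named file from history (the latest commit is left alone).
    java -jar bfg.jar --delete-files "$file_name" my-repo.git || return 1

    # BFG rewrites the commits but leaves the old objects behind;
    # expire the reflog and garbage-collect to reclaim the space.
    (
        cd my-repo.git &&
        git reflog expire --expire=now --all &&
        git gc --prune=now --aggressive
    )
}
```

Calling it as `bfg_delete_file <your-repo-url> filename.orig` leaves a cleaned mirror in `my-repo.git`. Once you are happy with the result, running `git push` from inside that directory publishes the rewritten history. Keeping the push as a separate manual step gives you a chance to inspect the repository first.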
Multiple Branches getting into the commit
There was also a problem of multiple branches getting into the commit and being merged later.
For that we did a rebase. Note, however, that a rebase rehashes all the nodes (commits) from the point of the rebase onward, so you have to tell all the users of your codebase to get a fresh copy of the project again.
The command for that is:
git rebase <commit where branching started>
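As a rough illustration of what that rebase does, the sketch below builds a throwaway repository with a branch that is merged back in, rebases onto the commit where the branching started, and compares the history before and after. The branch name, file names, and identity settings are invented for the demo. It is a simplified stand-in for the real repository, not a record of the exact commands we ran.

```shell
# Throwaway demo repository; every name below is illustrative only.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Base commit: this is "the commit where branching started".
echo base > file.txt
git add file.txt
git commit -q -m "Base"
fork_point=$(git rev-parse HEAD)
main_branch=$(git symbolic-ref --short HEAD)

# Work on a side branch.
git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# Meanwhile, more work on the main branch, then merge the side branch in.
git checkout -q "$main_branch"
echo other > other.txt
git add other.txt
git commit -q -m "Main work"
git merge -q --no-ff --no-edit feature

head_before=$(git rev-parse HEAD)
merges_before=$(git rev-list --count --merges HEAD)

# Rebase onto the fork point: the commits are replayed as a straight line
# and the merge commit is dropped, so the tip gets a new id.
git rebase -q "$fork_point"

head_after=$(git rev-parse HEAD)
merges_after=$(git rev-list --count --merges HEAD)

echo "tip before: $head_before (merge commits: $merges_before)"
echo "tip after:  $head_after (merge commits: $merges_after)"
```

The tip commit id differs before and after the rebase, which is why anyone holding the old history needs a fresh copy.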