Git: Use git-sizer to Identify Large Space Usage in Your History

https://github.com/github/git-sizer

Notice that the bracketed footnote numbers in the table are resolved at the bottom to the objects and revisions responsible:

$ git-sizer -v
Processing blobs: 2049
Processing trees: 1056
Processing commits: 432
Matching commits to trees: 432
Processing annotated tags: 0
Processing references: 12
| Name                         | Value     | Level of concern               |
| ---------------------------- | --------- | ------------------------------ |
| Overall repository size      |           |                                |
| * Commits                    |           |                                |
|   * Count                    |   432     |                                |
|   * Total size               |   143 KiB |                                |
| * Trees                      |           |                                |
|   * Count                    |  1.06 k   |                                |
|   * Total size               |   885 KiB |                                |
|   * Total tree entries       |  21.0 k   |                                |
| * Blobs                      |           |                                |
|   * Count                    |  2.05 k   |                                |
|   * Total size               |  33.8 MiB |                                |
| * Annotated tags             |           |                                |
|   * Count                    |     0     |                                |
| * References                 |           |                                |
|   * Count                    |    12     |                                |
|                              |           |                                |
| Biggest objects              |           |                                |
| * Commits                    |           |                                |
|   * Maximum size         [1] |  1.32 KiB |                                |
|   * Maximum parents      [2] |     2     |                                |
| * Trees                      |           |                                |
|   * Maximum entries      [3] |    66     |                                |
| * Blobs                      |           |                                |
|   * Maximum size         [4] |  4.05 MiB |                                |
|                              |           |                                |
| History structure            |           |                                |
| * Maximum history depth      |   413     |                                |
| * Maximum tag depth          |     0     |                                |
|                              |           |                                |
| Biggest checkouts            |           |                                |
| * Number of directories  [5] |    27     |                                |
| * Maximum path depth     [6] |     4     |                                |
| * Maximum path length    [7] |   105 B   | *                              |
| * Number of files        [6] |   137     |                                |
| * Total size of files    [8] |  4.64 MiB |                                |
| * Number of symlinks         |     0     |                                |
| * Number of submodules       |     0     |                                |

[1]  a52f2b2c86baa9617ed66bc0b3301d57bf763ed3
[2]  fa562d025143096dc8ae0c2294114cd0b4443945 (refs/stash)
[3]  4ca9fdb84fec68c38d6061441996fd15b9494e9d (d2a75bc4b9743a3decf9e6cd5cb4a8670d57f30d^{tree})
[4]  62918aac7359d32b4b342db8d65a5a2a5172215d (86c3344be1fc590ee3ebe9eb7117e91c4ad21450:command/gozipinfo/gozipinfo)
[5]  2b5b648f5e6488420c22ba9c318fc6bfc4ce2a47 (461bbd7555360a202d1ab16f02c992581791a7d2^{tree})
[6]  853678df5f5fe5ec58b169d0fe3878e376c53e3e (refs/heads/dustin/profiling/temp_path^{tree})
[7]  8b03f6bab9d3369fec1fe322f5d3f159a220d95a (1b385e1714b7904fca85943d0f27c7f772af6d0f^{tree})
[8]  a557ab9aff02167856530cf536163d349e44fdfe (86c3344be1fc590ee3ebe9eb7117e91c4ad21450^{tree})
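
git-sizer only names the single largest blob (footnote [4] above). If you want the runners-up as well, plain Git can list them. This is a minimal sketch, not part of git-sizer; the function name “largest_blobs” is my own, and it assumes you run it from inside the repository:

```shell
# List the N largest blobs reachable from any ref, largest first.
# Output columns: size in bytes, blob ID, path at which the blob was found.
# "largest_blobs" is a hypothetical helper name, not a Git or git-sizer command.
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        awk '$1 == "blob" { sub(/^blob /, ""); print }' |
        sort -k1,1 -n -r |
        head -n "${1:-10}"
}
```

Calling “largest_blobs 5” prints the five biggest blobs. To check the size of one specific object from the footnotes, “git cat-file -s <object ID>” prints it in bytes.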

Git: Annotate Recent Changes in Blame

Pretty awesome. Pass a duration of time and the blame output will mark the lines from older commits with a “^” prefix.

$ git blame --since=3.weeks -- work_deserving_a_promotion.py

Output:

^4412d8c5 (Dustin Oprea 2018-05-17 18:56:11 -0400 1285)                     remote_fil
^4412d8c5 (Dustin Oprea 2018-05-17 18:56:11 -0400 1286)                     attributes
3386b3595 (Dustin Oprea 2018-05-25 19:27:55 -0400 1287) 
^4412d8c5 (Dustin Oprea 2018-05-17 18:56:11 -0400 1288)             elif fnmatch.fnmat
aac11271e (Dustin Oprea 2018-05-27 02:52:29 -0400 1289)                 # If we're bui
aac11271e (Dustin Oprea 2018-05-27 02:52:29 -0400 1290)                 # and test-key

Thanks to this Stack Overflow post.
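
Since the older lines are all marked with “^”, it is easy to filter them out and keep only the recent changes. A minimal sketch follows; the function name “recent_blame_lines” is my own. I add “--root” because, by default, blame also treats a root commit as a boundary and marks it with “^” even when it is recent:

```shell
# Print only the blame lines that come from commits inside the time window.
# $1: a duration understood by --since (e.g. 3.weeks); $2: the file path.
# "recent_blame_lines" is a hypothetical helper name.
recent_blame_lines() {
    # --root: do not mark a root commit with "^" merely for being the root.
    git blame --root --since="$1" -- "$2" | grep -v '^\^'
}
```

For example, “recent_blame_lines 3.weeks work_deserving_a_promotion.py” would print only the lines last touched within the past three weeks.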

Git: Putting All Submodules on Their Branches

By default, submodules are initialized in a detached-HEAD state and do not track a specific branch, even if you specified a branch when you added the submodule. Any commits you make will not be on a branch, and no branch head will be updated to point to them (so you will not be able to push them, at least not in the way you expect). This is fine where there is no active development; otherwise, you will likely need to intervene and check out the right branch in each submodule individually.

Assuming you specified a branch when you added the submodule, you can use the “git submodule foreach” subcommand to automate this:

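git submodule foreach --recursive 'git checkout $(git config -f $toplevel/.gitmodules --get submodule.$name.branch)'

Run this from your superproject. “git submodule foreach” executes the command inside each submodule’s directory, so the “.gitmodules” filename is qualified with “$toplevel”, which Git sets to the absolute path of the immediate superproject.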

If you need something more complicated, you can obviously write a script and call it from this context.
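
As one example of “something more complicated”, here is a sketch of a lookup that falls back to a default when a submodule has no branch recorded. The function name “submodule_branch” and the “master” default are my own assumptions, not anything Git provides:

```shell
# Print the branch recorded for a submodule in a .gitmodules file, or a
# fallback when none is recorded.
# $1: path to .gitmodules; $2: submodule name; $3: fallback branch
# (hypothetical default: master).
submodule_branch() {
    git config -f "$1" --get "submodule.$2.branch" || echo "${3:-master}"
}

# The same fallback, inlined into foreach ($toplevel and $name are set by Git):
#   git submodule foreach --recursive \
#     'git checkout "$(git config -f "$toplevel/.gitmodules" --get "submodule.$name.branch" || echo master)"'
```

A shell function is not visible inside “git submodule foreach”, so to call it from there you would save it as a script on your path or inline the logic as in the comment.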

Git: Automatically Squashing at the Prompt

I do a huge amount of squashing, every day of the week. Ever the kind of engineer who wishes to optimize every single redundant operation, I wrote a simple script and then aliased it in my shell. When I do a commit that I know I will be squashing into the previous commit, I simply do a “git commit -m SQUASH -a” and then run “SQUASH_LAST” (my alias, which is autocompleted) to run the squash. The script verifies that the last commit message starts with “SQUASH” (for verification/sanity), executes the squash, and then prints the current commit, previous commit, and final commit revisions.

It is extremely convenient and saves a ton of time and annoyingly-repetitive steps.

The script (which I put in my home):

#!/bin/bash -e

HEAD_COMMIT_MESSAGE=$(git log --format=%B -1 HEAD)

# For safety. Our use-case is usually to always just squash into a commit
# that's associated with an active change. We really don't want to lose our head
# and accidentally squash something that wasn't intended to be squashed.
if [[ "${HEAD_COMMIT_MESSAGE}" != SQUASH* ]]; then
    echo "SQUASH: Commit to be squashed should have a commit-message that starts with 'SQUASH'."
    exit 1
fi

_FILEPATH=$(mktemp)
git log --format=%B -1 HEAD~1 >"${_FILEPATH}"

echo "Initial head: $(git rev-parse HEAD)"

git reset --soft HEAD~2 >/dev/null

echo "Head after reset: $(git rev-parse HEAD)"

git commit -F "${_FILEPATH}" >/dev/null
rm "${_FILEPATH}"

echo "Head after commit: $(git rev-parse HEAD)"

echo

The alias (for completeness):

alias SQUASH_LAST='<filepath>'

It really is about the little things.

I have also put the script into a gist.

Git: Producing a Revert Commit for a Previous Change

Create an inverse commit to flip a previous change. Child’s play:

$ git revert <COMMIT>

The new commit looks like:

commit 09cc98e3fa121774750728f5fa337befeb02d914
Author: Dustin Oprea <dustin@randomingenuity.com>
Date:   Tue Mar 27 16:02:25 2018 -0400

    Revert "What's the worst that could happen?"

    This reverts commit cf4fc9a50a20a633b82ee28ef9efa46df86db18d.

Much nicer and more professional than copying-and-pasting into a new commit or dropping an old commit with a rebase.

Many code-review systems provide a nearly identical feature of their own.

Git: Get the Latest Commits in a Repository

Sometimes you have a large repository (many commits, many branches), and either your clone is old or you are working on a secondary branch and do not know which branch has most of the activity. You can run a git-log that displays commits from all branches, regardless of the current one, in descending commit-timestamp (as opposed to author-timestamp) order, with the constraint that no parent is shown before all of its children.

This prints the top five:

$ git log -5 --all --date-order
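
If what you really want is the list of branches with the most recent activity, “git for-each-ref” can sort refs by the commit date of their tips. A minimal sketch; the function name “most_recent_branches” is my own:

```shell
# List the N most recently committed-to branches (local and remote-tracking),
# newest first. Output columns: committer date, short ref name.
# "most_recent_branches" is a hypothetical helper name.
most_recent_branches() {
    git for-each-ref --sort=-committerdate --count="${1:-5}" \
        --format='%(committerdate:short) %(refname:short)' \
        refs/heads refs/remotes
}
```

Run “git fetch” first if the clone is old, since only the refs you already have locally are considered.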

Git: Custom Subcommands

At some point, you might find yourself running the same sequence of Git operations on a regular basis. It would greatly improve your efficiency to wrap these commands in your own Git subcommand.

For example, I could create a script named “git-dustin”:

#!/bin/sh

echo "Dustin's subcommand: $1"

Then, I’d save it into /usr/local/bin (in order to be in the path), and mark it as executable. I can then access it as if it were a subcommand:

$ git dustin "test argument"

This is the output:

Dustin’s subcommand: test argument
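
If you would rather try this out without touching /usr/local/bin, the same steps work against any directory on the path. The sketch below uses a scratch directory from mktemp, which is my own choice for illustration:

```shell
# Install the subcommand into a scratch directory rather than /usr/local/bin.
bin_dir=$(mktemp -d)

cat >"$bin_dir/git-dustin" <<'EOF'
#!/bin/sh

echo "Dustin's subcommand: $1"
EOF
chmod +x "$bin_dir/git-dustin"

# Git runs any executable named git-<name> that it finds on the PATH.
PATH="$bin_dir:$PATH" git dustin "test argument"
```

Here the path is only extended for the one invocation. For everyday use, you would add the directory to the path in your shell profile instead.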