You can find the tool developed in this post
here.
In one project we use BDD-style tests with
Cucumber to test our software.
For regulatory reasons, we track all tests in Azure DevOps.
(Ab)using the example from
https://cucumber.io, our setup looks a bit like this:
Feature: Withdrawing cash
  @tc:1001
  Scenario: Successful withdrawal within balance
    Given Alice has 234.56 in their account
    When Alice tries to withdraw 200.00
    Then the withdrawal is successful
  @tc:1002
  Scenario: Declined withdrawal in excess of balance
    Given Hamza has 198.76 in their account
    When Hamza tries to withdraw 200.00
    Then the withdrawal is declined
Every test has a tag that references the work item on Azure DevOps.
To enable easier tracing, we mention all test scenarios we touch in the commit message:
That’s fine and dandy,
until you get to that PR that adds a new step to two dozen different tests…
We can automate this!
Look at your diff,
now back to the test,
now back to your diff,
now back to the test.
Collecting all those @tc:... annotations manually is annoying and error-prone.
So let's have the computer do it.
Conceptually the implementation is easy:
1. Grab the git diff and filter for gherkin files
2. Parse the gherkin and cross-reference with the diff
3. Put everything in the commit message
Let's start at the beginning:
libgit2
For interacting with the repository,
I’m using
git2,
a wrapper for the
libgit2 C library.
The main hurdle was to figure out what I actually wanted.
Since it’s about committing, we need everything green in the output of git status.
We need the index.
So, open up the repo and diff the index with the latest committed state (aka HEAD):
use git2::{DiffOptions, Repository};

let repo = Repository::open_from_env().unwrap();
let mut diff_opts = DiffOptions::default();
// We don't need no context
diff_opts.context_lines(0);
// Resolve HEAD to the tree of the latest commit
let head = repo
    .resolve_reference_from_short_name("HEAD")
    .unwrap()
    .peel_to_commit()
    .unwrap()
    .tree()
    .unwrap();
// Passing `None` as the index uses the repository's current index
let diff = repo
    .diff_tree_to_index(Some(&head), None, Some(&mut diff_opts))
    .unwrap();
Much simpler than shelling out,
calling git diff --cached -U0
and parsing its output with all the possible edge cases.
git2 tries to stick closely to the API of the libgit2 library
and here it shows the difference between Rust and C nicely.
Instead of an Iterator we get a foreach() function with a bunch of callbacks.
For our use case we’re only interested in the line callback.
For every gherkin file (*.feature) we store which line was changed,
and whether the change is in the old or new version of the file.
For deletions we check the old file, for insertions the new file.
With that, the first step is done.
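The bookkeeping inside the line callback could look roughly like the sketch below.
It is deliberately free of git2 types so it can be read on its own:
I'm assuming the callback hands over the file path from the delta
plus the values of DiffLine::origin(), old_lineno() and new_lineno().
The names Version and classify_line are made up for this illustration.

```rust
use std::path::{Path, PathBuf};

/// Which side of the diff a changed line belongs to (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Version {
    Old, // the file as committed in HEAD
    New, // the file as staged in the index
}

/// Decide whether one diff line is worth recording.
/// `origin` is '+' for insertions and '-' for deletions, as in a unified diff.
fn classify_line(
    path: &Path,
    origin: char,
    old_lineno: Option<u32>,
    new_lineno: Option<u32>,
) -> Option<(PathBuf, usize, Version)> {
    // Only gherkin files are interesting.
    if path.extension().and_then(|ext| ext.to_str()) != Some("feature") {
        return None;
    }
    match origin {
        // Deleted lines only exist in the old version of the file.
        '-' => old_lineno.map(|n| (path.to_path_buf(), n as usize, Version::Old)),
        // Inserted lines only exist in the new version of the file.
        '+' => new_lineno.map(|n| (path.to_path_buf(), n as usize, Version::New)),
        // Context lines, file headers and the like are ignored.
        _ => None,
    }
}
```

Every Some(...) that comes back would be pushed onto the list of changes.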
gherkin
With a set of paths and line numbers in hand,
we can tackle the next step.
Parsing gherkin!
cargo add gherkin
That should take care of the heavy lifting.
// `pop()` returns an Option; None means there are no changes left
let (path, line, version) = changes.pop().unwrap();
let text = load_file(path, version).unwrap();
let feature = Feature::parse(&text, Default::default()).unwrap();
Taking a look at the
feature struct,
we can see a list of scenarios
and every scenario comes with span information
and a list of tags.
Profit!
We can cross-reference the scenario spans with the changed line numbers.
If a line from the diff falls into a scenario span,
the tags of the scenario tell us the issue number we need.
The only issue:
Spans are byte offsets in the file
and we have line numbers.
Nothing that a bit of preprocessing can’t fix:
let mut ptr = 0;
let line_ranges = text
    .split_inclusive('\n')
    .map(|line| {
        // `split_inclusive` keeps the '\n', so `line.len()` already covers it
        let end = ptr + line.len();
        let range = ptr..end;
        ptr = end;
        range
    })
    .collect::<Vec<_>>();
Run through all lines and convert each line to a range,
indicating the byte positions in the file.
Then we can index this vector with the line number
to compare with the span information from the gherkin struct:
let scenario = feature
    .scenarios
    .iter()
    // `Range` is not `Copy`, so clone the line's range out of the vector
    .find(|s| does_intersect(s.span, line_ranges[line - 1].clone()));
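The does_intersect helper is not shown here, so this is only a guess at what it might look like.
The Span struct is a stand-in for the parser's span type,
assuming it exposes start and end as byte offsets with an exclusive end.

```rust
use std::ops::Range;

/// Stand-in for the parser's span type: byte offsets into the file (assumption).
#[derive(Debug, Clone, Copy)]
struct Span {
    start: usize,
    end: usize,
}

/// Two half-open byte ranges overlap if each one starts before the other ends.
fn does_intersect(span: Span, line: Range<usize>) -> bool {
    span.start < line.end && line.start < span.end
}
```

Because both ranges are treated as half-open,
a line that starts exactly where a scenario ends does not count as a match.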
If we do find a matching scenario,
then it’s as simple as parsing scenario.tags and grabbing the number from that.
And why do complicated parsing, when you can just do a simple prefix match:
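A minimal sketch of such a prefix match, using only the standard library.
I'm not certain whether the parser hands out tags with or without the leading @,
so this version accepts both; test_case_id is a made-up name.

```rust
/// Extract the test case number from a tag like `tc:1001` or `@tc:1001`.
/// Returns None for unrelated tags or non-numeric ids.
fn test_case_id(tag: &str) -> Option<u32> {
    tag.trim_start_matches('@') // tolerate an optional leading '@'
        .strip_prefix("tc:")? // bail out if this is not a test case tag
        .parse()
        .ok()
}
```

Running this over scenario.tags with filter_map would leave only the numbers.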
Now that we know the numbers we need to reference,
it’s time to bring them into the commit message.
prepare-commit-msg
Git offers hooks to automatically run code as part of the normal workflow.
One of those hooks is prepare-commit-msg,
which runs just before a commit and can adjust the message.
Just what we need!
The interface is simple:
If .git/hooks/prepare-commit-msg exists, it is run as the hook.
Git passes us one to three arguments:
prepare-commit-msg <message_file> [source] [hash]
The source and hash arguments can mostly be ignored,
message_file is the interesting one.
It contains the path to a file with the “initial” commit message.
This includes any templates (via git commit -t or commit.template),
and the default git instructions:
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# On branch main
# Changes to be committed:
Since issue numbers belong in the
trailers,
we want to insert them at the end of the message.
For readability ideally just before the instructions.
So the full integration looks like this:
1. Open the commit message file
2. Format the issue numbers as a trailer (Tests: #1001, #1002)
3. Insert the trailer just before the instructions (a big block starting with #)
4. Write the new contents back to the commit message file
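The string handling for the formatting and inserting steps can be sketched like this.
It is a simplification: it assumes the instructions are the trailing run of lines starting with #
(so a custom core.commentChar is not handled),
and format_trailer and insert_trailer are names invented for this example.
Reading and writing the file itself is left to std::fs::read_to_string and std::fs::write.

```rust
/// Format the issue numbers as a single trailer line, e.g. `Tests: #1001, #1002`.
fn format_trailer(ids: &[u32]) -> String {
    let refs: Vec<String> = ids.iter().map(|id| format!("#{id}")).collect();
    format!("Tests: {}", refs.join(", "))
}

/// Insert `trailer` just before the trailing block of `#` comment lines.
fn insert_trailer(message: &str, trailer: &str) -> String {
    let lines: Vec<&str> = message.lines().collect();
    // Index of the first line of the trailing comment block
    // (0 if the whole message consists of comments).
    let split = lines
        .iter()
        .rposition(|line| !line.starts_with('#'))
        .map_or(0, |idx| idx + 1);

    let mut out: Vec<&str> = Vec::new();
    out.extend_from_slice(&lines[..split]);
    // Keep a blank line between the message text and the trailer.
    if matches!(out.last(), Some(last) if !last.is_empty()) {
        out.push("");
    }
    out.push(trailer);
    out.push("");
    out.extend_from_slice(&lines[split..]);
    out.join("\n") + "\n"
}
```

One known rough edge: if the message already ends in other trailers,
the extra blank line would separate them from the new one.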
With all that done,
we can hit compile,
copy the binary to .git/hooks/prepare-commit-msg,
and try it out: