Feature: Add machine-readable remediation to the hasDangerousWorkflowScriptInjection probe #3950
**Labels:** check/Dangerous-workflow, kind/enhancement, needs discussion, question, Stale
**Is your feature request related to a problem? Please describe.**
The `finding.Finding.Remediation.Patch` field is meant to store machine-readable patches to fix the finding (see `scorecard/finding/probe/probe.go`, lines 45 to 47 at commit `6fc7d4c`).
Fixing script injection in workflows is pretty straightforward and can be solved procedurally, but the probe's findings don't include the patch.
**Describe the solution you'd like**
The findings generated by `hasDangerousWorkflowScriptInjection` should include a remediation patch that is machine-readable, and preferably human-recognizable. I'd propose the "unified diff" format used by `git diff` or `diff -u`.

The utopian solution would allow us to create a single "global" patch that maintainers could then apply to the project, fixing all the dangerous script injections at once. However, my understanding of the probe architecture leads me to believe we can only fix individual findings, one at a time.
I have already created some first-draft code that can perform this operation. Pretty-printing each of the patches it generates, we get a `git diff`-like* "unified" output that can be used to fix the relevant finding. I'll be happy to send this code as a PR once I've polished it up.
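To make the idea concrete, here's a hypothetical patch of the kind I have in mind, for a made-up workflow that interpolates an issue title directly into a `run:` script (the file name, line numbers, and workflow contents are invented for illustration):

```diff
--- a/.github/workflows/triage.yml
+++ b/.github/workflows/triage.yml
@@ -12,3 +12,5 @@
       - name: Greet
+        env:
+          TITLE: ${{ github.event.issue.title }}
         run: |
-          echo "New issue: ${{ github.event.issue.title }}"
+          echo "New issue: $TITLE"
```

The fix follows the usual recommendation for this class of injection: move the untrusted expression into an environment variable and reference it as a quoted shell variable instead of interpolating it into the script.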
However, there are several issues with my current implementation that I wish to discuss before moving forward:
### `git diff`-like output

The biggest problem is generating the `git diff` output. It'd be great if we could use a dependency to create this output given two versions of a file, but a reasonable dependency doesn't really exist, as far as I can tell:

- The closest I found is `git apply`-style functionality that is basically just a wrapper for sergi/go-diff, which isn't currently used by Scorecard, so it would need to be added to Scorecard's `go.mod`.
- Other packages do generate `git diff` output, but they simply export an internal module used by the Go tools (`x/tools/internal/diff`); given their "one-and-done" concept, they've been archived by their maintainers (and, unsurprisingly, have poor Scorecard scores: 2.9).

I've explored two alternatives:
One of them mimics `go-git`'s output. It works for the handful of tests I've run, but it'd probably have to be more thoroughly tested to make sure it covers edge cases, etc.

### How to store expected unit test outputs
This code would naturally need to be tested. Each test case will be composed of a workflow that needs to be fixed plus all the information we'd receive from the finding (location, unsafe variable, etc.). However, the unit tests also need to know the expected output of each test, and we need to decide how to store this information. As I see it, our options are:
1. Store only the fixed version of each workflow (e.g. `workflow-fixed.yml`), generate the `git diff`-like output during the test, and compare it to ours. Conceptually, this would be the best solution, but the only way I see to generate that output would be to run `git diff` via `os/exec` or something similar, which has all the downsides of using `exec`. In this particular case, I don't think it'd be toooooo problematic, though? Any machine running Scorecard's tests likely has `git` installed on the `$PATH` (and we can simply skip the test if it's not found), and the "unified format" has been stable since its release in 1990, so there isn't much risk of different results on machines with different versions of `git`.
2. Store a `.diff` file (generated by running `git diff` on the broken workflow and a fixed version, which is left untracked) and compare our output directly to it. This keeps the test code simple, but means we're storing a human-legible-but-unfriendly format. It's also somewhat brittle: if we modify the original workflow (e.g. by adding comments), the `git diff` output changes as well (shifted line positions, maybe even a different diff), so the `.diff` file would also need to be updated, which may not be immediately obvious.
3. Store both the `.diff` file and the fixed version of the workflow (e.g. `workflow-fixed.yml`). The fixed workflow wouldn't actually be used by the tests, but would serve as useful context, showing what we expect the output to be. This is also vulnerable to drift over time: we might change the broken workflow and the diff but forget to update the fixed workflow. That could be mitigated by a step in `make all` that runs `git diff` (again, assuming it exists) on the broken and fixed versions of the workflows and updates the respective `.diff` files (this would need to be verified in the CI/CD tests).

Let me know which options you prefer for these two questions and I'll send a PR taking them into account.
\* One perhaps-notable difference between this output and an actual `git diff` is that `git diff` adds the "scope" after the `@@` anchors (e.g. `@@ -1,2 +3,4 @@ func foo() {`), while my code does not. However, this data isn't actually used by `git apply` or `patch`, so the difference is harmless.
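As an aside on those `@@` hunk headers: producing syntactically valid ones by hand takes very little code. The naive single-hunk generator below (a hypothetical helper, not the draft implementation described above) emits the whole file as one hunk with no context lines, which `patch` and `git apply` accept, though real diff tools emit minimal hunks:

```go
package main

import (
	"fmt"
	"strings"
)

// unifiedDiff builds a naive single-hunk unified diff between two versions
// of a file: a `---`/`+++` header, one `@@` hunk header covering every line,
// all old lines as deletions, then all new lines as additions. It does not
// handle empty files or compute minimal hunks.
func unifiedDiff(path, before, after string) string {
	oldLines := strings.Split(strings.TrimSuffix(before, "\n"), "\n")
	newLines := strings.Split(strings.TrimSuffix(after, "\n"), "\n")

	var b strings.Builder
	fmt.Fprintf(&b, "--- a/%s\n+++ b/%s\n", path, path)
	// Hunk header format: @@ -<old start>,<old count> +<new start>,<new count> @@
	fmt.Fprintf(&b, "@@ -1,%d +1,%d @@\n", len(oldLines), len(newLines))
	for _, l := range oldLines {
		b.WriteString("-" + l + "\n")
	}
	for _, l := range newLines {
		b.WriteString("+" + l + "\n")
	}
	return b.String()
}

func main() {
	fmt.Print(unifiedDiff("wf.yml", "a\nb\n", "a\nc\n"))
}
```

The output for the two-line example above is a valid patch that replaces both lines, at the cost of a larger diff than `git diff` would produce.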