RFC: Support Dockerfiles #173

Merged Aug 10, 2022 (45 commits)
Showing changes from 5 commits

Commits
4f73fe4
Add RFC: Support Dockerfiles
sclevine Jun 30, 2021
ef2b230
RFC: Support Dockerfiles: remove stack references
sclevine Jul 1, 2021
e21b2af
RFC: Support Dockerfiles: add builder-specified Dockerfiles
sclevine Jul 15, 2021
a1810ca
RFC: Support Dockerfiles: Fix reference to app-specified Dockerfiles
sclevine Jul 15, 2021
82cbe67
RFC: Support Dockerfiles: Remove run.Dockerfile reference
sclevine Jul 15, 2021
a9daed9
RFC: Support Dockerfiles: address more feedback
sclevine Jul 18, 2021
3534dc2
RFC: Support Dockerfiles: fix typo
sclevine Jul 20, 2021
05dc2d3
RFC: Support Dockerfiles: Buildpack-associated hooks
sclevine Jul 22, 2021
f3e84c6
RFC: Support Dockerfiles: clarify builder vs. build-image
sclevine Jul 23, 2021
1e813b9
RFC: Support Dockerfiles: integrate with buildpack API
sclevine Jul 29, 2021
ba9ebd7
RFC: Support Dockerfiles: Add build args + runtime SBoM
sclevine Jul 29, 2021
58ed531
Fix buildpack dir env var
sclevine Sep 9, 2021
ebf6b2a
Clarify detect logic
sclevine Sep 9, 2021
22f0ef4
Fix typo
sclevine Sep 9, 2021
084822e
Use UUID for build_id
sclevine Sep 10, 2021
b3b38ea
Fix wording re: rebasable label
sclevine Sep 10, 2021
6eed4b2
Clarify phases
sclevine Sep 29, 2021
fde7b84
RFC: Support Dockerfiles: clarify UID/GID of executables
sclevine Sep 29, 2021
f643f8b
RFC: Support Dockerfiles: rename hook to extension
sclevine Sep 29, 2021
3a37d54
RFC: Support Dockerfiles: fix typo
sclevine Sep 29, 2021
f513c24
RFC: Support Dockerfiles: remove static provides
sclevine Sep 29, 2021
884eaef
Clarify Dockerfile restrictions
sclevine Feb 16, 2022
cd9471f
Move extensions out of order table
sclevine Mar 1, 2022
a2c06a5
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
2935c8a
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
01786f8
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
10a02f2
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
2c7bb58
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
2c6ca72
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
aa74ace
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
10b6c9f
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
1cc2789
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
a2282f1
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
9ba0bc2
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
9533642
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
bfc5781
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
68773ac
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
fefd338
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
74767a0
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
3b73177
Update text/0000-dockerfiles.md
sclevine Jun 22, 2022
4abb253
Add a note about resolving registry credentials up front
natalieparellano Jul 5, 2022
f8e7bfa
Updates per Emily's feedback
natalieparellano Jul 6, 2022
3a26570
Updates per 7/7/22 Working Group
natalieparellano Jul 7, 2022
73de183
Apply suggestions from code review
natalieparellano Jul 13, 2022
9325a13
Updates per 7/14 Working Group
natalieparellano Jul 14, 2022
153 changes: 153 additions & 0 deletions text/0000-dockerfiles.md
@@ -0,0 +1,153 @@
# Meta
[meta]: #meta
- Name: Support Dockerfiles
- Start Date: 2021-06-30
- Author(s): sclevine
- RFC Pull Request: (leave blank)
- CNB Pull Request: (leave blank)
- CNB Issue: (leave blank)
- Supersedes: [RFC0069](https://github.com/buildpacks/rfcs/blob/main/text/0069-stack-buildpacks.md), [RFC#167](https://github.com/buildpacks/rfcs/pull/167)
- Depends on: [RFC#172](https://github.com/buildpacks/rfcs/pull/172)

# Summary
[summary]: #summary

This RFC introduces functionality for customizing base images, as an alternative to stackpacks.

# Motivation
[motivation]: #motivation

Relying on Dockerfiles for base image generation and manipulation allows us to apply buildpacks only to the problem that they solve best: managing application runtimes and dependencies.

# What it is
[what-it-is]: #what-it-is

This RFC proposes that we replace stackpacks with multi-purpose build-time and runtime Dockerfiles.

# How it Works
[how-it-works]: #how-it-works

Note: kaniko, buildah, BuildKit, or the original Docker daemon may be used to apply Dockerfiles at the platform's discretion.

### Builder-specified Dockerfiles

I really like these additions you have made!


A builder may specify any number of executable "hooks" in `/cnb/hooks.d/`.
Hook files in this directory are executed in the context of the app directory and must output Dockerfiles that are applied to the build-time and runtime base images before an image is built.

If a hook exits with a non-zero status value, the build fails. If a hook exits with a zero status value and no output, the hook is ignored. Directories and non-executable files are ignored.
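
The exit-status rules above could be sketched as a small runner. This is only an illustration, not the lifecycle's implementation: the function name `run_hooks`, its arguments, and the idea of collecting output into a separate directory are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical runner (not part of the RFC): execute each hook from the app
# directory and collect any Dockerfile it prints.
# Usage: run_hooks <hooks_dir> <app_dir> <out_dir>   (absolute paths assumed)
run_hooks() {
  hooks_dir=$1
  app_dir=$2
  out_dir=$3
  for hook in "$hooks_dir"/*; do
    # Directories and non-executable files are ignored.
    [ -f "$hook" ] && [ -x "$hook" ] || continue
    out="$out_dir/$(basename "$hook")"
    # A non-zero exit status fails the whole build.
    if ! (cd "$app_dir" && "$hook") > "$out"; then
      echo "hook failed: $hook" >&2
      return 1
    fi
    # A zero exit status with no output means the hook is ignored.
    [ -s "$out" ] || rm -f "$out"
  done
}
```

A platform could then apply whatever files remain in the output directory, in name order, to the matching base images.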

Each executable must be named in the format `/cnb/hooks.d/<name>.(build.|run.|)<format>`, where `build`, `run`, or empty targets the build-time base image, the runtime base image, or both base images, respectively. The only valid format is `Dockerfile`, although support for other formats, e.g. LLB JSON, could be added in the future.
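
As a rough illustration of the naming scheme, the helper below maps a hook file name to the image(s) it targets. The function name `hook_target` and its output words (`build`, `run`, `both`) are made up for this sketch; only the file-name pattern comes from the text above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the RFC): classify a hook by its file name.
# Prints "build", "run", or "both"; returns non-zero for unsupported names.
hook_target() {
  case "$(basename "$1")" in
    ?*.build.Dockerfile) echo build ;;  # build-time base image only
    ?*.run.Dockerfile)   echo run ;;    # runtime base image only
    ?*.Dockerfile)       echo both ;;   # both base images
    *)                   return 1 ;;    # not a recognized <format>
  esac
}
```

For example, `hook_target /cnb/hooks.d/app.run.Dockerfile` prints `run`, while a file such as `notes.txt` is rejected.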

A runtime Dockerfile is applied to the selected runtime base image after the detection phase.
A build-time Dockerfile is applied to the build-time base image before the detection phase.

Both Dockerfiles must accept `base_image` and `build_id` args.
The `base_image` arg allows the lifecycle to specify the original base image.
The `build_id` arg allows the app developer to bust the cache after a certain layer and must default to `0`. When the `$build_id` arg is referenced in a `RUN` instruction, all subsequent layers will be rebuilt on the next build (as the value will change).
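
To make the two args concrete, here is a sketch of the command a platform might assemble if it chose the Docker CLI to apply a Dockerfile. The helper name `extend_cmd`, the image names, and the choice of `docker build` are assumptions for illustration; the RFC leaves the tool to the platform.

```shell
#!/bin/sh
# Hypothetical helper (not part of the RFC): print the docker build command
# that passes both required args to a generated Dockerfile.
# Usage: extend_cmd <dockerfile> <base_image> <build_id> <output_tag>
extend_cmd() {
  printf 'docker build -f %s --build-arg base_image=%s --build-arg build_id=%s -t %s .\n' \
    "$1" "$2" "$3" "$4"
}

# Example: extend a (made-up) runtime base image with the default build_id.
extend_cmd run.Dockerfile example/run-base 0 example/run-extended
```

Passing a fresh value for `build_id` on each build (for instance a UUID) would invalidate the cache from the first `RUN` instruction that references it.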

A runtime base image may indicate that it preserves ABI compatibility by adding the label `io.buildpacks.rebasable=true`. In the case of builder-specified Dockerfiles, `io.buildpacks.rebasable=false` is set automatically before a runtime Dockerfile is applied and must be explicitly set to `true` if desired. Rebasing an app without this label set to `true` requires passing a new `--force` flag to `pack rebase`.
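
The rebase rule can be read as a simple predicate. The sketch below is an assumption about how a platform might express it; `may_rebase` is a hypothetical name, and the label value would in practice be read from the image's `io.buildpacks.rebasable` label.

```shell
#!/bin/sh
# Hypothetical check (not part of the RFC): decide whether a rebase may proceed.
# $1: value of the io.buildpacks.rebasable label (may be empty if unset)
# $2: "true" if the user passed --force (defaults to "false")
may_rebase() {
  label=$1
  force=${2:-false}
  # Rebase is allowed when the label is explicitly true, or when forced.
  [ "$label" = "true" ] || [ "$force" = "true" ]
}
```

With this reading, `may_rebase true` and `may_rebase false true` succeed, while `may_rebase false` fails.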


#### Example: App-specified Dockerfile Hook

This example hook would allow an app to provide runtime and build-time base image extensions as "run.Dockerfile" and "build.Dockerfile."
The app developer can decide whether the extensions are rebasable.

##### `/cnb/hooks.d/app.build.Dockerfile`
```
#!/bin/sh
cat build.Dockerfile
```
##### `/cnb/hooks.d/app.run.Dockerfile`
```
#!/bin/sh
cat run.Dockerfile
```

#### Example: RPM Dockerfile Hook

This example hook would allow a builder to install RPMs for each language runtime.

Note: The Dockerfiles referenced must disable rebasing, and build times will be slower compared to buildpack-provided runtimes.

##### `/cnb/hooks.d/app.Dockerfile`
```
#!/bin/sh
# Emit a Dockerfile only for detected runtimes; exit 0 even when nothing matches.
if [ -f Gemfile.lock ]; then cat /cnb/hooks.d/app.Dockerfile.d/Dockerfile-ruby; fi
if [ -f package.json ]; then cat /cnb/hooks.d/app.Dockerfile.d/Dockerfile-node; fi
```

### Platform-specified Dockerfiles

The same Dockerfile format may be used to create new base images or modify existing base images outside of the app build process. Any specified labels override existing values.

Dockerfiles that are used to create a base image must create a `/cnb/image/genpkgs` executable that outputs a [CycloneDX](https://cyclonedx.org)-formatted list of packages in the image with PURL IDs when invoked. This executable is executed after all Dockerfiles are applied, and the output replaces the label `io.buildpacks.sbom`. This label doubles as a Software Bill-of-Materials for the base image. In the future, this label will serve as a starting point for the application SBoM.
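
A `genpkgs` executable could be quite small. The sketch below is one possible shape, assuming a Debian-based image: it reads `name version` pairs and prints a minimal CycloneDX JSON document with one PURL per package. The function name `to_cyclonedx`, the `specVersion` value, and the `pkg:deb/ubuntu/` namespace are assumptions for the example, and no JSON or PURL escaping is attempted.

```shell
#!/bin/sh
# Hypothetical genpkgs core (not part of the RFC): turn "name version" lines
# on stdin into a minimal CycloneDX JSON document on stdout.
to_cyclonedx() {
  printf '{"bomFormat":"CycloneDX","specVersion":"1.4","version":1,"components":['
  sep=''
  while read -r name version; do
    [ -n "$name" ] || continue   # skip blank lines
    printf '%s{"type":"library","name":"%s","version":"%s","purl":"pkg:deb/ubuntu/%s@%s"}' \
      "$sep" "$name" "$version" "$name" "$version"
    sep=','
  done
  printf ']}\n'
}
```

On an Ubuntu-based image the function might be fed with `dpkg-query -W -f='${Package} ${Version}\n' | to_cyclonedx`; an RPM-based image would need a different package query and PURL type.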

### Example Dockerfiles

Dockerfile used to create a runtime base image:

```
ARG base_image
FROM ${base_image}
ARG build_id=0

LABEL io.buildpacks.image.distro=ubuntu
LABEL io.buildpacks.image.version=18.04
LABEL io.buildpacks.rebasable=true

ENV CNB_USER_ID=1234
ENV CNB_GROUP_ID=1235

RUN groupadd cnb --gid ${CNB_GROUP_ID} && \
useradd --uid ${CNB_USER_ID} --gid ${CNB_GROUP_ID} -m -s /bin/bash cnb

USER ${CNB_USER_ID}:${CNB_GROUP_ID}

COPY genpkgs /cnb/image/genpkgs
```

A `run.Dockerfile`, for use with the example `app.Dockerfile` hook, that always installs the latest version of curl:
```
ARG base_image
FROM ${base_image}
ARG build_id=0

RUN echo ${build_id}

RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
```
(note: rebasing remains disabled for this Dockerfile, since `io.buildpacks.rebasable=false` is set automatically and OS package installation is not rebasable)

A `run.Dockerfile`, for use with the example `app.Dockerfile` hook, that installs a special package to /opt:
```
ARG base_image
FROM ${base_image}
ARG build_id=0

LABEL io.buildpacks.rebasable=true

RUN curl -L https://example.com/mypkg-install | sh # installs to /opt
```
(note: rebasing is explicitly allowed because only a single directory in /opt is created)


# Drawbacks
[drawbacks]: #drawbacks
Member

@jkutner, Jul 24, 2021

It's unfortunate that this proposal introduces a new concept (hooks) that a builder-image maintainer needs to learn, especially when this and #172 begin to simplify and remove some concepts. I also worry that the hook concept is not very intuitive.

Contributor

> I also worry that the hook concept is not very intuitive.

This is subjective, can you qualify this statement with some details or an example?

I would argue that you find the concept of a "hook" (or pre/post script, before/after, wrapper, etc.) in tons of places in the computer world. The general concept is something many sys-admins and devs have probably already seen.

I agree it adds some complexity because to use a hook, you need to understand the expected output format & what information you have available to generate your output (as well as basic scripting). At the same time, it's optional. You don't need any hooks to generate a builder, so a builder-builder doesn't really need to know this unless they are trying to do more customizations, which you could say is a more advanced task.

My $0.02 only, but it seems like the ratio of complexity added versus additional functionality for users is quite reasonable.

Member

Here is what I believe makes it not intuitive:

  • It isn't clear when the "hook" runs (before all buildpacks? before each buildpack? after buildpacks?)
  • It isn't obvious that the filename defines the format and thus what formats are supported
  • It isn't clear that the "name" determines the order
  • It isn't clear what happens when there's more than one hook
  • Generally, I don't think it's obvious what the scope is (what are the things a hook can do? Can it mutate a buildpack?)

These are not very difficult things to learn, but it's yet another thing to learn (which is a shame given that we're looking to remove things like the stack and mixin to make buildpacks easier to grok).

By comparison to your examples, a pre/post script usually stems from the same construct as the thing it's wrapping around (another executable). Similarly, the pre/post proposal for project.toml adds buildpacks before/after buildpacks. These things are all homogeneous. But here, we have a concept that introduces a new mechanism into the existing "order". So now we have heterogeneous things running in some order. I think that's what makes it difficult to reason about.

Contributor

OK, so there seem to be some specifics in the RFC that you think need to be clarified (I agree with some of these btw). IMHO, that seems like something that can be addressed and not a fundamental flaw in the way this is working. Is that your thought as well?


I will have to respectfully disagree on the "hook" as a concept though. There is a fairly basic lifecycle that buildpacks flow through, and you're talking about being able to inject some actions at a couple of points in that lifecycle. This happens all the time in software; they even created AOP to formalize the concept. It happens in software systems as well: the venerable HTTP Servlet Filter in the Java world is an example. As long as we document and have a picture of the lifecycle & indicate where the hooks execute (which is why I agree we need to clarify that point), then we should be fine.

I also strongly believe the complexity we are adding is more than worth it with the functionality this is adding & the number of problems this can solve.

Member

> document and have a picture of the lifecycle & indicate where the hooks execute

I don't consider this a substitute for a tight set of constructs where people don't need to read and learn about them for each thing they need to do.

Contributor

Sure. No disagreement there. Learning nothing is simpler than having to learn something, even if it's simple.

What I'm left trying to understand is this: if your bullet list of concerns were addressed, would the complexity that this RFC adds be OK? Or is it your opinion that this RFC is fundamentally too complex and not something we should add?

Member

No, I'm generally in favor of this RFC. I'm only trying to push us to find an abstraction over the hook mechanism that makes it easier for users to understand (especially buildpack authors and builder-image owners).

I'm also in favour of this RFC. Nevertheless, as Joe also reported, several important points are still very hard to understand:

  • When and how will hooks be executed?
  • How will a hook become part of, for example, build execution A and not B?
  • What parameters will a hook consume (IN), and what will it generate (OUT)?
  • How will the buildpacks react if a Docker build fails (return code, error message, ...)?
  • Where will the new layers created for a base image or run image be stored?
  • What file permissions will be set on new files added to a layer (as the root user will normally be used to execute them)?
  • Does the RFC only generate Dockerfiles dynamically, or could it also consume Dockerfiles provided by a developer, and under which path of the application being developed?

Remark: It would be great to have a simple text workflow explaining how hooks will be integrated into the existing buildpack phases (e.g. DETECTION --> ANALYSIS --> BUILD --> EXPORT ==> DETECTION --> ANALYSIS --> HOOK --> BUILD --> EXPORT).

Member Author

> When and how will hooks be executed?
> How will a hook become part of, for example, build execution A and not B?
> What parameters will a hook consume (IN), and what will it generate (OUT)?
> How will the buildpacks react if a Docker build fails (return code, error message, ...)?

These are covered in the RFC. Can you be more specific about what details you feel are missing?

> Where will the new layers created for a base image or run image be stored?

Do you mean stored as in cached? Caching strategy would be up to the platform to decide.

> What file permissions will be set on new files added to a layer (as the root user will normally be used to execute them)?

The Dockerfile would run as root on the base image and produce a new image. No other changes to file permissions would occur.

> Does the RFC only generate Dockerfiles dynamically, or could it also consume Dockerfiles provided by a developer, and under which path of the application being developed?

As specified, buildpacks could create Dockerfiles in their ARGV[1] directory. They could copy them from the app directory, as described in the example, but that's not built into the API.

Member

> buildpacks could create Dockerfiles

I think you mean hooks?


- Involves breaking changes.

# Alternatives
[alternatives]: #alternatives

- Stackpacks

# Unresolved Questions
[unresolved-questions]: #unresolved-questions

- Should `genpkgs` be part of this proposal? Opinion: Yes, otherwise it's difficult to maintain a valid SBoM.

Contributor

Does this weaken the security stance of buildpacks? (My opinion: yes) You can literally do anything in a Dockerfile.

I feel like stack packs, while maybe not the answer, did put more controls around what you could do. The stack packs were buildpacks, so presumably (IDK exactly as it wasn't implemented), an Operations team could control which stack packs are available to their users, thereby limiting what their users could install or modify to some degree.

There's also a stronger guarantee that a stack pack would produce a legitimate BOM, since Operations teams can control what stack packs are available (and audit them). With this proposal, someone could make a Dockerfile that installs something malicious and set genpkgs to be a copy of the `true` binary or some other no-op binary. The proposal essentially trusts the creator of the Dockerfile to be honest and report what they've installed, which I don't think is merited given that this proposal would allow any app dev to customize what's installed with a Dockerfile.

Member

> Operations teams can control what stack packs are available (and audit them)

I don't have any answers here, but this seems like a fairly important point to linger on.

Member

I think this concern is pretty well addressed by including the hooks on the builder. If the app-specified Dockerfile hook (above example) is not present, then the developer cannot provide a Dockerfile. And I suppose that trusting hook authors to provide a valid BOM is the same as trusting buildpack authors to provide a valid BOM.

In the case of the app-specified Dockerfile, the hook author could apply some label to the image so that consumers would know to approach the BOM with some suspicion. I wonder if there's value in having that spec'd ('io.buildpacks.extended'?). It's just a thought.

Contributor

I like the idea of a label to easily identify images that have had any hooks executed.

Member

Also /cnb/image/genpkgs (because it is executed after Dockerfiles have been applied) is helping to trust the BOM.

> Also /cnb/image/genpkgs (because it is executed after Dockerfiles have been applied) is helping to trust the BOM.

@natalieparellano I don't really understand how this helps. By the time it is invoked, the hook author could have replaced /cnb/image/genpkgs with a shell script that writes anything to the BOM.

Member

@samj1912 and I have recently been discussing having the platform inject genpkgs. That may help here...

Contributor

> @samj1912 and I have recently been discussing having the platform inject genpkgs. That may help here...

How would that work? I don't think a mount would work, since the hook could discover and overwrite it. Since the hook could do anything, I'm not even sure running a program like genpkgs is safe even if the binary was untouched. I would think genpkgs would almost have to be executed in a new container against the extended image to truly be accurate.

That said, these hooks are likely authored, and for sure approved/distributed, by the stack author. Just because Dockerfiles are used doesn't mean the stack distributor has to extend that control to buildpack authors or the app author. They can create a much simpler constraint (like defining an Aptfile to read package names).

Member

> I would think genpkgs would almost have to be executed in a new container against the extended image to truly be accurate.

That's basically what we're proposing - please see https://github.com/buildpacks/rfcs/pull/173/files#r794608075 for the download I got from @samj1912 and @sclevine .

# Spec. Changes (OPTIONAL)
[spec-changes]: #spec-changes

This RFC requires extensive changes to all specifications.