What is the role of cache in image builds?
Answer
When you build an image for the first time, each layer the build produces is cached. So, while the first build might take time, any later build of the same image will be much faster, because the cached layers are reused instead of being rebuilt. This holds as long as neither the Containerfile/Dockerfile nor the content used by its instructions has changed.
In a bit more detail, it works this way:
- The first instruction (FROM) checks whether the base image already exists on the host before pulling it
- For each following instruction, the builder checks the build cache for an existing layer that was built on top of the same parent layer with the same instruction
- If it finds such a layer, it skips the instruction, reuses the existing layer, and keeps using the cache for the next instruction.
- If it doesn't find a matching layer, it builds the layer, and the cache is invalidated for every instruction that follows.
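The lookup described above can be sketched as a toy model in Python. This is only an illustration, not how any real builder is implemented: the `build` function, the use of a `set` as the cache, and the way the key is derived (a hash of the parent layer ID plus the instruction text) are all assumptions made for the example.

```python
import hashlib


def build(instructions, cache):
    """Toy layer cache: returns (instruction, layer_id, cache_hit) per step."""
    parent = ""  # hypothetical: there is no parent before the first instruction
    results = []
    for instruction in instructions:
        # Assumption: a layer is identified by its parent layer + the instruction.
        key = hashlib.sha256(f"{parent}|{instruction}".encode()).hexdigest()
        hit = key in cache
        if not hit:
            cache.add(key)  # "build" the layer and remember it
        results.append((instruction, key[:12], hit))
        parent = key  # the next instruction is looked up relative to this layer
    return results


cache = set()
first = build(["FROM alpine", "RUN apk add curl", "RUN echo done"], cache)
second = build(["FROM alpine", "RUN apk add git", "RUN echo done"], cache)

print([hit for _, _, hit in first])   # [False, False, False]
print([hit for _, _, hit in second])  # [True, False, False]
```

In the second build, the changed `RUN` instruction misses the cache, and the unchanged `RUN echo done` that follows it misses as well, because its parent layer is now different.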
Note: in some cases (such as the COPY and ADD instructions), the instruction itself might stay the same, but if the content being copied has changed, the cache is still invalidated. This is checked by comparing the checksum of each file being copied.
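A minimal Python sketch of that content check follows. The `content_checksum` helper and the choice to hash file names and contents with SHA-256 are assumptions for illustration; the exact inputs and algorithm a real builder uses may differ.

```python
import hashlib
import os
import tempfile


def content_checksum(paths):
    """Hypothetical helper: hash the names and contents of the given files."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        digest.update(os.path.basename(path).encode())
        with open(path, "rb") as handle:
            digest.update(handle.read())
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    app = os.path.join(workdir, "app.py")
    with open(app, "w") as handle:
        handle.write("print('v1')\n")
    before = content_checksum([app])
    unchanged = content_checksum([app])  # same content, same checksum

    with open(app, "w") as handle:
        handle.write("print('v2')\n")  # the COPY line is unchanged, the file is not
    after = content_checksum([app])

print(before == unchanged)  # True
print(before == after)      # False
```

If the checksum differs from the one recorded for the cached layer, the cached layer cannot be reused, even though the text of the COPY or ADD instruction is identical.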