A Hash Mismatch Made in BitBake

Dear Brother Yocto,

These ‘vardeps’ variables in Yocto are entirely magic and undocumented from my best reading of the Mega Manual. There is something like one paragraph that says “you might need to change / add these to make your recipe work in some cases”.  It would be nice to have an explanation of when to use vardeps and when to vardepsexclude.  I get that they are something to do with the sstate caching hash.

–Tired of ‘Taskhash Mismatch’

Dear Tired of ‘Taskhash Mismatch’,

Task hashes and their interactions with various caching mechanisms are indeed a confusing part of Yocto.  I’ve always found “ERROR: os-release-1.0-r0 do_compile: Taskhash mismatch 2399d53cb73a44279bfdeece180c4689 verses 9298efe49a8bf60fc1862ebc05568d32 for openbmc/meta/recipes-core/os-release/os-release.bb.do_compile” to be surprisingly unhelpful in resolving the problem.  Worse is when the problem manifests as a failure to run a task when something has changed instead of an error.

As you noted, the Yocto Developer Manual and Yocto Reference Manual are sparse on this topic.  A very careful reader will catch a reference in Section 2.3.5 of the Reference Manual to the Variable Flags section of the BitBake User Manual.  While that section does describe what these magic [vardeps] and [vardepsexclude] incantations do, I suspect you, the reader, are left wondering why they are in an entirely different manual.  Even though the signs are there for the astute observer (what command do you run to build an image?), many Yocto users don’t realize Yocto is an infrastructure built upon an independent tool called BitBake.

Yocto is just BitBake metadata

Per the BitBake User Manual, “BitBake is a generic task execution engine that allows shell and Python tasks to be run efficiently and in parallel while working within complex inter-task dependency constraints.”  Yocto is a set of tasks executed by BitBake to assemble Linux distributions.  This leads to a few non-trivial insights that I’ll spare you from discovering on your own:

  • Yocto recipes are really BitBake recipes (hence why they use a .bb extension) and follow the syntax described in Chapter 3: Syntax and Operators.
  • Many of the “magic” variables used in recipes (PN, bindir, etc) are defined in meta/conf/bitbake.conf.
  • Recipe lookup is actually controlled by BBPATH, not the order of layers in bblayers.conf.  BBPATH just happens to be modified in each layer’s layer.conf (see Locating and Parsing Recipes).

Tasks and Caching

As a Yocto developer, I tend to think in terms of recipes (because they usually correspond to a package) but BitBake’s fundamental unit of work is a task.  A task is a shell or Python script that gets executed to accomplish something, usually generating some form of output.  Often a task depends on the output from some other task and BitBake offers mechanisms for specifying these dependencies (see Dependencies for details).  BitBake will execute the tasks with as much parallelism as possible while maintaining dependency order.

For smaller projects, a straightforward execution of the tasks, where each task is always run on every invocation, might be OK.  When building a full Linux distribution like Yocto does, every invocation could take hours.  Caching to the rescue!

BitBake has multiple caching mechanisms but for this discussion only the setscene (or sstate) cache is relevant.  Setscene is a BitBake mechanism for caching build artifacts and reusing them in subsequent invocations.  Conceptually, recipes can provide a setscene version of a task by creating a new task of the same name with a _setscene suffix.  This version of the task attempts to provide the build artifacts and, if successful, the main version of the task is skipped.  The above link contains a more in-depth discussion including a lot implementation details necessary for anyone writing a setscene task.

Cache Invalidation and the Task Hash

As with any caching system, the setscene cache needs a way to decide when to invalid an entry.  In the case of setscene, it needs to know when the task would generate meaningfully different output.  That’s exactly what a task hash is supposed to answer.

Note: The Yocto and BitBake manuals tend to use the terms task hash, checksum, and signature interchangeably.  I find this especially confusing when thinking about do_fetch because each source file also has a checksum associated with it.  I’ll be using task hash throughout here but keep in mind that you might see the other terms when reading the documentation.

Since a task is just a shell or Python script, taking the hash of the script should tell us when it changes, right?  Mostly, but not quite.  The task’s dependencies can easily affect the output of this task.  OK, so what if we combine the hash of the script with the hashes of all the dependencies and use that to identify the task?  Excellent! That’s more or less what a task hash is: a combined hash of the task basehash (represents the task itself) and the task hash of each dependency.

Why did I use the term “basehash” above instead of script hash?  Any variable used in the script has the potential for changing the script.  Consider a case where the current date is used as part of a build stamp.  Should the task be considered meaningfully changed whenever the date changes?   The basehash is a more flexible hash that allows specific variables to be excluded (or included) in the hash calculation.  By excluding variables such as the current date and time, the task becomes more stable and the artifacts can be reused more often.

Is that what [vardeps] and [vardepsexclude] do then?

Yup.  [vardeps] and [vardepsexclude] are Variable Flags that add or remove variables from the task’s hash.  When you get a “Taskhash mismatch” error like the one from os-release at the start of this post, BitBake is telling you that the task hash changed unexpectedly; the script didn’t change but the task hash did.  In the os-release case, the original os-release.bb from Yocto declared that the do_compile task was dependent on the VERSION, VERSION_ID, and BUILD_ID variables so those were included in its task hash.  OpenBMC’s  os-release.bbappend uses an anonymous Python function to construct the VERSION, VERSION_ID, and BUILD_ID variables by querying the metadata’s git repo.  Subsequently, the task hash will change even though the script didn’t actually change.  The fix was simple: remove those variable as task hash dependencies.

–Brother Yocto