Understanding the YourBase Dependency Graph in depth
At the center of our technology is the YourBase dependency graph, which records and stores information that is used to accelerate subsequent test runs. It ensures a correct build with a minimal number of test runs required.
Not all of the tests in your test suite relate to the changes you want to test. In order to run just the tests that relate to your code changes, you would need to understand the code paths that your tests take when they run. If the code paths overlap with the code changes you want to test, then you will want to run those tests; otherwise, you won’t want to run them. This way, we can always try to run the minimal set of tests that the code changes need.
This approach makes developers' lives much easier, letting them avoid running long test suites and, at times, run only 1% or 10% of their tests.
The YourBase dependency graph is at the core of automatically handling this kind of acceleration safely.
How test runs are safely skipped
Consider the following example: a file methods.py containing various methods,

    def capital_case(s):
        # Upper-case the first character and keep the rest unchanged.
        return s[:1].upper() + s[1:]

    def fib(n):
        if n <= 1:
            return n
        else:
            return (fib(n-1) + fib(n-2))

and a test file test.py that defines tests using some of the methods from that file:

    from methods import capital_case

    def test_capital_case():
        assert capital_case('semaphore') == 'Semaphore'
Upon executing the test.py test file, we can observe that the test test_capital_case relies on a single method from the file, capital_case, but not on the method fib. If we were to make a code change to fib and re-run the test file, we should get exactly the same outcome as when we ran the test file the first time.
test_capital_case invokes the method capital_case, but does not invoke fib during its execution. Changes to the fib function simply do not affect the outcome of test_capital_case. As a result, we say that fib is not a dependency of the test test_capital_case, whereas capital_case is a dependency of test_capital_case.
During a test run, YourBase traces the execution path of each test. As each test executes, it learns which files and methods get invoked, and stores them as dependencies of that test in what is known as a dependency graph (explained more later). YourBase then uses this graph to skip tests whose dependencies were not touched by the code changes, delivering the same test results as running the entire test suite.
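To make the tracing idea concrete, here is a minimal sketch that records which Python functions a test invokes, using the standard library's sys.setprofile hook. It is only an illustration of the principle, not YourBase's actual tracer: the helper name trace_dependencies is hypothetical, and it records bare function names rather than the files and methods a real graph would store.

```python
import sys


def capital_case(s):
    return s[:1].upper() + s[1:]


def fib(n):
    return n if n <= 1 else fib(n - 1) + fib(n - 2)


def test_capital_case():
    assert capital_case("semaphore") == "Semaphore"


def trace_dependencies(test_func):
    """Run test_func and return the names of the Python functions it called.

    Hypothetical helper for illustration only.
    """
    called = set()

    def profiler(frame, event, arg):
        # "call" is reported for Python-level function calls only;
        # built-ins are reported as "c_call" and ignored here.
        if event == "call":
            called.add(frame.f_code.co_name)

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        test_func()
    finally:
        # Always restore whatever profiler was installed before.
        sys.setprofile(previous)

    # The test itself is not one of its own dependencies.
    called.discard(test_func.__name__)
    return called


deps = trace_dependencies(test_capital_case)
print(deps)
```

In this sketch capital_case is recorded as a dependency of test_capital_case and fib is not, which mirrors the reasoning in the example.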
The power of this logic is that YourBase Test Selection holds itself to a high degree of correctness, ensuring that, given alignment to its design principles, a developer can have 100% confidence in the correctness of each test run with YourBase Test Selection.
YourBase applies this dependency principle over and over again to determine which tests can be safely skipped in each build. The “skippability” of a test is based on whether its outcome would be the same after the code change as it was before. In addition, we only skip tests that formerly passed.
If a code change affects methods or files that a given test does not depend upon, then running our test suite on that code change should not change the outcome of that test. And so, that test — if it previously passed — can be safely skipped.
Auto-inferred and robust
Despite our simple example of skipping the test test_capital_case when only fib is changed, most tests and test suites are not so trivial. YourBase Test Selection is robust enough to handle complex dependency analyses for you, automatically determining how code changes affect test executions in a variety of simple to complex dependency cases. We regularly ensure the correctness of our test selection technology across huge production codebase test suites for our customers.
Using the previous example, the dependency graph for the test file would look something like this:
The dependency graph makes it clear that the fib method is not a dependency of the test test_capital_case, whereas the capital_case method is.
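One way to picture this graph in code is as a mapping from each test to the set of methods it was observed to invoke. The sketch below is purely illustrative: the dictionary layout and the helper name depends_on are assumptions, not YourBase's actual storage format.

```python
# Hypothetical in-memory form of the dependency graph for the example.
# Keys are tests; values are the (file, method) pairs each test invoked.
dependency_graph = {
    ("test.py", "test_capital_case"): {("methods.py", "capital_case")},
}


def depends_on(graph, test, method):
    """Return True if `method` was recorded as a dependency of `test`."""
    return method in graph.get(test, set())


test = ("test.py", "test_capital_case")
uses_capital_case = depends_on(dependency_graph, test, ("methods.py", "capital_case"))
uses_fib = depends_on(dependency_graph, test, ("methods.py", "fib"))
print(uses_capital_case, uses_fib)
```

Note that fib never appears in the mapping at all: a method that no test invoked simply has no edge in the graph.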
File-level and method-level granularity
Depending on the language, YourBase offers file-level analysis and/or method-level analysis for test selection.
Where method-level analysis may not be available, file-level analysis is used. If changes were made to the methods.py file, and method-level analysis was not available, then changes made to any of the methods in the file could trigger a test run of test_capital_case when tests are run.
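The difference between the two granularities can be sketched with the same example. The function names and the (file, method) tuple representation below are assumptions made for illustration; they are not part of YourBase's interface.

```python
# Hypothetical dependency record for test_capital_case: (file, method) pairs.
deps = {("methods.py", "capital_case")}


def affected_method_level(deps, changed_methods):
    """The test is affected only if a method it invoked was changed."""
    return bool(deps & changed_methods)


def affected_file_level(deps, changed_methods):
    """The test is affected if any file it depends on contains a change."""
    dep_files = {file for file, _ in deps}
    changed_files = {file for file, _ in changed_methods}
    return bool(dep_files & changed_files)


# A change that touches only fib in methods.py.
change = {("methods.py", "fib")}
method_level = affected_method_level(deps, change)
file_level = affected_file_level(deps, change)
print(method_level, file_level)
```

With method-level analysis the change to fib leaves test_capital_case unaffected, while file-level analysis conservatively treats it as affected because both methods live in methods.py.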
For Ruby, YourBase supports file-level analysis only. We found that, in Ruby, file-level analysis alone achieves the same effective outcomes, so method-level analysis is not required.
For Python (both versions 2 and 3), YourBase supports both file-level and method-level analysis to achieve effective accelerated outcomes.
Zero graph management required
The YourBase dependency graph gracefully scales to test suites with tens of thousands of tests or more, enabling test selection on very large projects. With YourBase, you do not have to keep track yourself of which tests use which parts of your codebase.
Test selection steps
The YourBase dependency graph is used during subsequent test runs to determine if a test can be safely skipped or not. The YourBase engine needs two things to do test selection:
1. The optimal dependency graph to use, if any exists. YourBase chooses the one that most closely matches the current state of the source tree in order to maximize the number of tests that can be skipped. If a dependency graph does not already exist, then the engine uses that test run to build one. No test selection occurs for that run, but it does occur for subsequent runs.
2. The set of code changes in the current source tree, including changes in one’s working directory, relative to the chosen dependency graph.
Source tree change detection
Using metadata stored alongside the graph, the engine is able to determine the set of changes to the source tree that have occurred between the point in time of the graph and the current state of the source tree.
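As a rough sketch of how such change detection could work, suppose the metadata stored with the graph were a content hash per file. That is an assumption made for illustration only; this article does not specify what the metadata contains, and the helper names snapshot and changed_paths are hypothetical. The source tree is modeled as an in-memory dictionary so the example runs anywhere.

```python
import hashlib


def digest(content: bytes) -> str:
    """Hash file content so two versions can be compared cheaply."""
    return hashlib.sha256(content).hexdigest()


def snapshot(tree):
    """Map each path to a hash of its content (hypothetical graph metadata)."""
    return {path: digest(content) for path, content in tree.items()}


def changed_paths(recorded, current):
    """Paths added, removed, or modified since the snapshot was recorded."""
    return {
        path
        for path in recorded.keys() | current.keys()
        if recorded.get(path) != current.get(path)
    }


# Source tree when the graph was built, and the tree as it is now.
tree_at_graph_time = {"methods.py": b"def fib(n): ...", "test.py": b"..."}
tree_now = {"methods.py": b"def fib(n): return n", "test.py": b"..."}

changes = changed_paths(snapshot(tree_at_graph_time), snapshot(tree_now))
print(changes)
```

Only methods.py is reported as changed here; a method-level engine would then need to narrow that file-level difference down to the individual methods that were edited.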
The flow of the YourBase engine is outlined as follows.
Back to our simple example: consider making a change in the fib method after an initial graph was built.
On a subsequent test run, our input to the engine would be the changes to fib along with the graph generated by the initial run. Using this information and following the flow above, when asked if we should execute test_capital_case, the engine would determine that we had a non-empty graph and a non-empty set of changes.
When it proceeds to query the graph, it would determine that test_capital_case does not depend on the set of changes (to the fib method) and can therefore be safely bypassed.
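The decision described in this walkthrough can be sketched as a small function. This is a simplified reading of the steps in the text, not the engine's actual implementation: the name should_run, the use of bare method names, and the choice to skip a previously passing test when the change set is empty are all assumptions.

```python
def should_run(test, graph, changes, previously_passed):
    """Decide whether `test` must be executed (hypothetical helper)."""
    if not graph or test not in graph:
        # No recorded dependencies: run the test so they can be recorded.
        return True
    if test not in previously_passed:
        # Only tests that formerly passed are eligible to be skipped.
        return True
    if not changes:
        # Assumption: nothing changed since the graph was built.
        return False
    # Query the graph: run only if a dependency overlaps the change set.
    return bool(graph[test] & changes)


graph = {"test_capital_case": {"capital_case"}}
passed = {"test_capital_case"}

run_after_fib_change = should_run("test_capital_case", graph, {"fib"}, passed)
run_after_capital_case_change = should_run(
    "test_capital_case", graph, {"capital_case"}, passed
)
run_without_graph = should_run("test_capital_case", {}, {"fib"}, passed)
print(run_after_fib_change, run_after_capital_case_change, run_without_graph)
```

With a change limited to fib the test is bypassed; a change to capital_case, or the absence of any graph, causes it to run.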
The rich dependency model and granularity of our build graph enable the YourBase engine to apply this same process to very large test suites with tens of thousands of tests in them.
In many cases, the YourBase engine reduces the majority of build and test times from an hour or more, down to a few minutes.
Results: 50-90% tests skipped in over 10,000 tests
To demonstrate, let’s analyze the results of a large Python Django project for a major logistics company. It has over 10,000 tests. In a traditional CI system, the tests take over an hour to run completely.
We performed 6 experiments of 4-10 distinct commits each to test for the % of tests skipped and the average runtime difference before and after YourBase Test Selection.
Each experiment starts with an initial unaccelerated build, or cold build, that resets acceleration by running every test in order to build a new dependency graph. We then committed various code changes to the codebase to trigger accelerated builds, each build running on the same set of 10,000+ tests.
The chart below demonstrates the results of the % tests skipped for all 6 experiments. Each experiment begins with a cold build (skipping 0% tests), and then accelerates each subsequent build over 4-10 commits.
On average, YourBase skipped 50-90% of the tests during subsequent builds, with the exact share varying per commit according to the size and scope of its code changes.
From 75 minutes to 20 minutes
As a result of YourBase’s test selection technology, tests went from taking 75 minutes to complete to taking 20 minutes on average, a reduction of 55 minutes per build, or roughly 73% time savings.
Depending on the code change, an accelerated build took anywhere from 2-5 minutes to 30-45 minutes.
75% of builds completed in under 9 minutes
While the average is still 20 minutes because of some larger changes that were introduced, the graph above shows that 75% of all builds completed in under 9 minutes.
This experiment serves to show how YourBase’s test selection technology delivers real, significant results that can help teams cut down on unnecessary infrastructure costs and time spent on a daily basis.
In normal use, developer teams create the initial cold build only once, allowing each subsequent build, no matter how many, to be accelerated.
Our small experiment performed a cold build at the beginning of each batch of 4-10 commits, so our results admittedly underestimate the true impact that YourBase Test Selection could have on developer productivity in normal use.
Our team has worked on developing build and test acceleration technology for the past 4 years, and our work has been featured in a December 2020 IEEE publication on Acceleration in Continuous Integration. We are committed to creating a future where large code changes move fast, engineering toil is eliminated, and the world’s technological progress moves 10x faster.