Software projects with numerous long-running tests can take so long to test that developers offload test runs to their CI system. Long feedback loops and growing infrastructure costs are common in projects with heavy testing needs.
By selecting only the tests relevant to each code change, developers can run tests locally, get faster feedback on changes, improve cycle times, and enjoy a better quality of life. Instead of waiting around for tests, developers reclaim their time to do what they do best.
YourBase's intelligent test selection technology integrates directly with popular test execution frameworks across multiple languages, so it installs on-premise and embeds with the test runner in your existing development and CI pipelines. After installation and an initial test suite run, YourBase gets to work selecting the minimal set of tests to run, maximizing the time you save.
Consider the following example:
methods.py, containing two methods:

```python
def capital_case(s):
    return s[0].upper() + s[1:]

def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)
```
and a test file, test.py, that defines a test using one of those methods:

```python
from methods import capital_case

def test_capital_case():
    assert capital_case('semaphore') == 'Semaphore'
```
Upon executing the test.py test file, we can observe that the test test_capital_case relies on a single method from methods.py, capital_case, but not on the method fib. If we made a code change to fib and re-ran the test file, we would see the exact same outcome as the first run.
test_capital_case invokes the method capital_case, but does not invoke fib during its execution. Changes to the fib function simply do not affect the outcome of test_capital_case. As a result, we say that fib is not a dependency of the test, whereas capital_case is a dependency of the test.
During a test run, YourBase traces the execution path of each test. As a test executes, YourBase learns which files and methods get invoked, and stores them as dependencies of that test in what is known as a dependency graph (explained in more detail later). YourBase then uses this graph to skip tests whose dependencies were untouched by a code change, delivering the same results as running the entire test suite.
The power of this logic is that YourBase Test Selection holds itself to a high standard of correctness: given alignment with its design principles, a developer can have 100% confidence in the results of each test run with YourBase Test Selection.
YourBase applies this dependency principle over and over to determine which tests can be safely skipped per build. The "skippability" of a test is based on whether its outcome would be the same before and after the code change. In addition, YourBase only skips tests that previously passed.
If a code change affects only methods or files that a given test does not depend upon, then running the test suite on that change will not alter that test's outcome. That test, if it previously passed, can therefore be safely skipped.
Our example of skipping test_capital_case when only fib changes is simple, but most tests and test suites are not so trivial. YourBase Test Selection handles complex dependency analysis for you, automatically determining how code changes affect test executions in cases ranging from simple to complex. We regularly verify the correctness of our test selection technology against our customers' large production test suites.
At the center of our technology is the YourBase dependency graph. After the library is installed, YourBase observes each test the runner executes in order to build this graph.
The dependency graph records information about each test's dependencies and uses it to accelerate subsequent test runs, ensuring a correct build with the smallest possible number of tests executed.
Using the previous example, the dependency graph for the test file would look something like this:
The dependency graph makes it clear that the fib method is not a dependency of the test test_capital_case, whereas the capital_case method is.
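Conceptually, that graph can be pictured as a mapping from each test to the dependencies recorded during its execution. The "file::method" identifiers and the is_dependency helper below are a simplified, hypothetical representation, not YourBase's internal format:

```python
# Simplified, hypothetical picture of the dependency graph for the
# example above: each test maps to the methods it invoked.
dependency_graph = {
    "test.py::test_capital_case": {"methods.py::capital_case"},
}

def is_dependency(graph, test, item):
    """Return True if `item` was recorded as a dependency of `test`."""
    return item in graph.get(test, set())
```

Querying this structure answers the question the graph exists to answer: capital_case is a dependency of test_capital_case, and fib is not.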
Depending on the language, YourBase offers file-level analysis and/or method-level analysis for test selection.
Where method-level analysis is not available, file-level analysis is used instead. Under file-level analysis, a change to any method in methods.py, including fib, would trigger a run of test_capital_case when tests are run.
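The difference between the two granularities can be sketched as follows, again assuming hypothetical "file::method" dependency identifiers. With file-level analysis only the file portion is compared, so any change in a file implicates every test that depends on that file:

```python
def affected_tests(graph, changed, level="method"):
    """Return tests whose dependencies intersect the changed items.

    Hypothetical sketch, not the YourBase API. `graph` maps test
    names to dependency identifiers like "methods.py::fib". With
    level="file", identifiers are truncated to their file portion
    before comparison, which coarsens the selection.
    """
    def key(dep):
        return dep.split("::")[0] if level == "file" else dep

    changed_keys = {key(c) for c in changed}
    return {test for test, deps in graph.items()
            if any(key(d) in changed_keys for d in deps)}
```

With method-level analysis, a change to methods.py::fib selects no tests in our example; with file-level analysis, the same change selects test_capital_case because both methods live in methods.py.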
For Ruby, YourBase supports file-level analysis only; we found that file-level analysis alone achieves effective outcomes in Ruby, so method-level analysis is not required.
For Python (both versions 2 and 3), YourBase supports both file-level and method-level analysis to achieve effective accelerated outcomes.
The YourBase dependency graph gracefully scales to test suites with tens of thousands of tests or more, enabling test selection on very large projects. With YourBase, you no longer need to keep track yourself of which tests use which parts of your codebase.
The YourBase dependency graph is used during subsequent test runs to determine whether a test can be safely skipped. The YourBase engine needs two things to do test selection:

1. The optimal dependency graph, if any exists. YourBase chooses the graph that most closely matches the current state of the source tree in order to maximize the number of tests that can be skipped. If no dependency graph exists yet, the engine uses that test run to build one; no test selection occurs for that run, but it does for subsequent runs.

2. The set of code changes in the current source tree, including changes in one's working directory, relative to the chosen dependency graph. Using metadata stored alongside the graph, the engine determines the set of source-tree changes between the point in time when the graph was built and the current state of the source tree.
The flow of the YourBase engine is outlined as follows.
Back to our simple example: consider making a change to the fib method after an initial graph was built. On a subsequent test run, the inputs to the engine would be the change to fib along with the previously generated graph. Following the flow above, when asked whether to execute test_capital_case, the engine finds a non-empty graph and a non-empty set of changes. When it queries the graph, it determines that test_capital_case does not depend on anything in the set of changes (only the fib method changed) and can therefore be safely skipped.
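Under the selection rules stated above (cold builds run everything, only formerly passing tests may be skipped, and a test must run if one of its dependencies changed), the decision flow can be sketched as a small hypothetical predicate:

```python
def should_run(test, graph, changes, previously_passed):
    """Decide whether the engine must execute `test`.

    Hypothetical sketch of the selection flow described above,
    not YourBase's actual engine code.
    """
    if graph is None:
        return True  # no graph yet: cold build, run everything
    if not previously_passed.get(test, False):
        return True  # only formerly passing tests may be skipped
    deps = graph.get(test, set())
    return bool(deps & changes)  # run only if a dependency changed
```

For our example, a change set of {fib} against a graph recording that test_capital_case depends only on capital_case yields a skip, while a change to capital_case itself forces the test to run.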
The rich dependency model and granularity of our dependency graph enable the YourBase engine to apply this same process to very large test suites with tens of thousands of tests.
In many cases, the YourBase engine reduces the majority of build and test times from an hour or more, down to a few minutes.
To demonstrate, let's analyze the results of a large Python Django project for a major logistics company. It has over 10,000 tests. In a traditional CI system, the tests take over an hour to run completely.
We performed 6 experiments of 4-10 distinct commits each, measuring the percentage of tests skipped and the average runtime before and after YourBase Test Selection.
Each experiment started with an initial unaccelerated build, or cold build, which resets acceleration by running every test in order to build a new dependency graph. We then committed various code changes to the codebase to trigger accelerated builds, each running the same set of 10,000+ tests.
The chart below shows the percentage of tests skipped across all 6 experiments. Each experiment begins with a cold build (0% of tests skipped), then accelerates each subsequent build over 4-10 commits.
On average, YourBase skipped 50-90% of the tests during subsequent builds, varying per commit with the size and scope of its code changes.
As a result of YourBase's test selection technology, build times dropped by an average of 55 minutes per build: tests went from taking 75 minutes to complete down to 20 minutes, a roughly 73% time savings.
Depending on the code change, an accelerated build took anywhere from 2-5 minutes to 30-45 minutes.
While the average is still 20 minutes because of some larger changes that were introduced, the graph above shows that 75% of all builds completed in under 9 minutes.
This experiment shows how YourBase's test selection technology delivers real, significant results, helping teams cut unnecessary infrastructure costs and time spent on a daily basis.
In normal use, developer teams create the initial cold build only once, so every subsequent build, no matter how many, is accelerated.
Because our small experiment performed a cold build at the start of each batch of 4-10 commits, our results understate the true impact YourBase Test Selection can have on developer productivity in normal use.
Our team has spent the past 4 years developing build and test acceleration technology, and our work was featured in a December 2020 IEEE publication on acceleration in continuous integration. We are committed to creating a future where large code changes move fast, engineering toil is eliminated, and the world's technological progress accelerates.