TLA+ and PHP 01

What is TLA+?

In my personal understanding of it: a specification language that helps me explore fallacies in my PHP software architecture and test that architecture under the pressure of concurring threads. To better explain what TLA+ is, one should refer to the site of Leslie Lamport, the mind behind TLA+.
I’ve found Hillel Wayne’s site and book to be the best way for me to learn the basics.

The pitch for this latest exploration of mine was Wayne’s video “Tackling Concurrency Bugs with TLA+": it struck a chord. The concurrency part especially. Then I found out there was testing involved and could not let go of the thought.

What am I using TLA+ for?

Given enough hot metal under it, TLA+ could really be used to model any concurring process my mind could think of, from high-level distributed systems to low-level algorithms.
Yet I hate answers that sound like a variation of the “it depends on your situation and context”, and I will avoid that by providing the particular context I’m applying TLA+ to.

I’m using TLA+ to model parts of a process-based PHP testing framework I’m writing.
I’m still pretty foggy about how this testing framework will fit in the grand scheme of things, but I’ve spent some time working with Jest and found it to be really delightful.
I’m writing something from scratch and, without the pressure of having to ship a product, I can take my time to do it right and in a way that satisfies me.

Let’s start from an example file, a “spec files” in Jest jargon, this imaginary test framework would consume:

<?php
/**
 * @env WordPress
 */

describe('get_foos_function', function () {
    // Create the test users.
    $admin      = wp_insert_user(['user_login' => 'a', 'user_pass' => 'a', 'role' => 'administrator']);
    $subscriber = wp_insert_user(['user_login' => 's', 'user_pass' => 's', 'role' => 'editor']);
    // Create the test foos: one public, one draft.
    foo_create(['status' => 'publish']);
    foo_create(['status' => 'draft']);

    describe('Admin context', function () use ($admin, $subscriber) {
        // Simulate Admin context.
        define('WP_ADMIN', true);

        it('should return all foos to admin user', function () use ($admin) {
            wp_set_current_user($admin);
            $foos = get_foos();
            expect(count($foos))->toBe(2);
        });

        it('should return only public foos to non-admin', function () use ($subscriber) {
            wp_set_current_user($subscriber);
            $foos = get_foos();
            expect(count($foos))->toBe(1);
        });
    });

    describe('Front-end context', function () use ($admin) {
        it('should return public foos to admin user', function () use ($admin) {
            wp_set_current_user($admin);
            $foos = get_foos();
            expect(count($foos))->toBe(1);
        });

        it('should return public foos to non-admin', function () {
            wp_set_current_user(0); // Visitor.
            $foos = get_foos();
            expect(count($foos))->toBe(1);
        });
    });

    describe('REST context', function () use ($admin) {
        // Simulate a REST request.
        define('REST_REQUEST', true);

        it('should return all foos to REST admin user', function () use ($admin) {
            wp_set_current_user($admin);
            $foos = get_foos();
            expect(count($foos))->toBe(2);
        });

        it('should show no foos in REST to visitors', function () {
            wp_set_current_user(0); // Visitor.
            $foos = get_foos();
            expect(count($foos))->toBe(0);
        });
    });
});

To note in the above code example:

  • The @env WordPress annotation will tell the (again: still hypothetical) testing framework that WordPress should be loaded before the tests. This will allow the following code to call WordPress functions.
  • The block at the top creating the users and the “foos” would be the equivalent of Jest beforeAll function, or PHPUnit setUpBeforeClass method call: it will do something once, before any test runs.
  • In the inner describe blocks, I’m defining constants like WP_ADMIN and REST_REQUEST to simulate, respectively, the admin and the REST request contexts.
  • Was this code to run sequentially, the second describe block, the Front-end context one, would inherit a defined WP_ADMIN constant that would make it fail.
  • Being “smart” about the order of the describe blocks here, then, would not change much as at least one describe block would end up inheriting the constant defined by one of the describe blocks that ran before.

The “simple” solution is to run each describe block in isolation in a dedicated, separate PHP process.

Let’s say I’m willing to do it: there will be a main tread in charge of starting, and managing the state of, multiple dedicated PHP processes running the describe blocks.

Loop concerns

A loop waiting on processes to be done is a pretty well-explored pattern.
Yet there are challenges deriving from the specific application and PHP limits.

A quick list:

  1. Processes might succeed or fail and will need to communicate that back to the main PHP process for it to be able to orderly print the output: this means Inter-Process Communication (IPC) is required.
  2. On Windows, one of the possible environments where this testing framework might run, IPC is not that easy given the many IPC pipes and streams inconsistencies.
  3. A loop should not be implemented with a logic that sounds like, “Every now and then check on the started processes, sleep a bit, repeat”. Checking more often would make the Loop more responsive but hoard CPU cycles. Checking less often would reduce resource consumption, making the Loop less responsive.

Some prototypes later, I concluded I should use the stream_select function to “watch” the pool of running processes from the main PHP thread, get prompt updates, and avoid a CPU lock.
When used on blocking streams, the function will wait for some streams part of the pool (a stream could be opened over a file or an IPC pipe) to update to move on.

All of this long and winded introduction to get to the first TLA+ model.

CLI and TLA Toolbox

To verify my specification, I’m using the TLA+ toolbox; I’m not fond of the IDE part, though, and prefer using vim and Hillel Wayne’s vim plugin to write my specifications.

With that out of the way, let’s start with my model.

I would like to reiterate I’m not a TLA+ expert. I am sharing my discovery process and findings. Go back to the first section of this article to find the references and literature to provide expert insight into it.

I will be dealing, to model the Loop and ancillary PHP processes, with two types of process:

  1. The Loop, the PHP process the user starts that will, in turn, start and manage one PHP process per describe block.
  2. The Worker, a PHP process started by the Loop to handle a describe block.

I’ve created a new spec_loop specification in the TLA+ Toolbox application and wrote the first version of my specification, just to check if things work.

----------------------------- MODULE spec_loop -----------------------------
EXTENDS TLC
CONSTANTS Parallelism, Jobs, Loop, NULL

(*--algorithm loop

process loop = Loop
	begin
		StartInitialBatch:
			goto Done;

end process

end algorithm;*)
=============================================================================

The TLA+ Toolbox supports two languages:

  • TLC - the somewhat difficult to read and write specification language initially used for TLA+.
  • PlusCal - the higher-level language that will transpile to TLC.

PlusCal is what I put between (*--algorithm loop and end algorithm;*).
When I save the file and click “Translate PlusCal Algorithm”, I get the following output:

----------------------------- MODULE spec_loop -----------------------------
EXTENDS TLC
CONSTANTS Loop, NULL

(*--algorithm loop

process loop = Loop \* There is process called Loop
	begin
		StartInitialBatch: \* That process starts...
			goto Done; \* ...and directly goes to Done, a state that means it's terminated.

end process

end algorithm;*)
\* BEGIN TRANSLATION (chksum(pcal) = "f8a984b0" /\ chksum(tla) = "83520898")
VARIABLE pc

vars == << pc >>

ProcSet == {Loop}

Init == /\ pc = [self \in ProcSet |-> "StartInitialBatch"]

StartInitialBatch == /\ pc[Loop] = "StartInitialBatch"
                     /\ pc' = [pc EXCEPT ![Loop] = "Done"]

loop == StartInitialBatch

(* Allow infinite stuttering to prevent deadlock on termination. *)
Terminating == /\ \A self \in ProcSet: pc[self] = "Done"
               /\ UNCHANGED vars

Next == loop
           \/ Terminating

Spec == Init /\ [][Next]_vars

Termination == <>(\A self \in ProcSet: pc[self] = "Done")

\* END TRANSLATION 

=============================================================================
\* Modification History
\* Last modified Wed Mar 16 10:46:22 CET 2022 by lucatume
\* Created Mon Mar 14 08:40:43 CET 2022 by lucatume

The \* BEGIN TRANSLATION and \* END TRANSLATION comments fence the translation operated by the IDE.
In the following code examples, I will not report the translation as that is a direct output of the PlusCal algorithm and not that interesting to this post.
After \*, you see comments in PlusCal and TLC syntax.

Notes:

  • EXTENDS is like import in node: import one or more modules I will need.
  • CONSTANTS defines the inputs my specification will have. They are the parametric part of my specification, the values Models will use to customize a check.

The specification is easy enough, I create a Model called “MC” in TLA+ Toolbox and set it up like this:

  • What is the behavior spec?Temporal formula.
  • What to check?
    • Deadlock
    • Properties
      • Termination
  • What is the model?
    • NULL <- [ model value ]
    • Loop <- [ model value ]

Here’s a screenshot of the settings in the TLA+ toolbox: The settings in TLA+ Toolbox

What does all this mean?

Temporal formula means I want to check if my specification works over time. More on that later.
Deadlock means I want to check if there’s any state, a combination of execution steps, that would put the model in a deadlock.
Termination means I want to check that, in any state, all the processes will get to the Done phase, the last one. NULL and Loop are custom types I will use in my specification.

I run the model and… it fails: Temporal properties were violated., Stuttering.
That means the one temporal property I’ve set, the one where all processes should eventually terminate, is violated.
Here comes the first great power of TLA+: it will model states where some, or all, processes will not run.
If the Loop process never runs, it will not terminate. As such, the Termination temporal property is violated.
I’m not interested in modeling the case where the Loop does not run. That means the PHP code never got there, which implies some error happened before, the user is shown some exception or message: I do not need a specification for that.

Please, run the Loop process

How do I tell the model checker I’m not interested in a state where the Loop process does not run?
I tell it the process is fair. It means any state where the Loop process does not run should not be part of the model check.

I update the specification to make the Loop process a fair one:

----------------------------- MODULE spec_loop -----------------------------
EXTENDS TLC
CONSTANTS Loop, NULL

(*--algorithm loop

fair process loop = Loop
	begin
		StartInitialBatch:
			goto Done;

end process

end algorithm;*)
=============================================================================

After re-compiling the code and rerunning it, the model check passes: the Loop process has only one step: the StartInitialBatch one, which will immediately go to the Done one. A “step” in TLA+:

  • Is indicated by a Label:
  • Is atomic.

This is enough for one post.

Check out the second post.