操作和数值
Operators - 操作
Operators are what you’d think of as a function in a programming language. They take arguments and evaluate to expressions. 操作很像编程语言里的一个函数:接受传参,计算得到表达式。
EXTENDS Integers
MinutesToSeconds(m) == m * 60If you see an error like
Was expecting “=====” or more Module Body, encountered
"$Operator"at line $n…
It’s likely because you used a single = instead of a double == when defining the operator.
碰到上述报错,多半是因为你定义操作的时候用了单个等号,而不是两个等号。
Operators can take any number of arguments. There are no default values, overloaded definitions, or optional arguments. If an operator has two parameters, it must always take two values. If an operator takes no values, you can write it without the parens. In this case, it’s effectively a constant. 操作可以接受任意多个数字作为参数。参数不能有默认值,操作不存在重载,也没有可选变量。如果一个操作有两个形参,那每次调用的时候必须传入两个实参。如果操作没有定义参数,你就可以不写括号。事实上,这就是一个常量。
SecondsPerMinute == 60The right-hand side of an operator is called an expression. 操作右边的东西,称之为表达式。
IF-THEN-ELSE
We have three keywords for structuring expressions. These are LET statements, case statements, and conditionals. We’ll be introducing the first two later: LET statements are much more useful after we can write some basic operators, while case statements aren’t used very often. That leaves conditionals: 结构化的表达式有三类关键词:LET语句,分支语句,条件语句。我们随后再介绍前两种:LET语句更实用一些,有了它就可以实现一些基本的操作了,分支语句就没这么常用。先说一下条件语句:
Abs(x) == IF x < 0 THEN -x ELSE xExpressions always equal something, so there must always be an ELSE branch. Otherwise, this works as you’d expect it to.
表达式必须是有最终值的,因此有IF就一定要有ELSE。
Values - 值
TLA+ is an “untyped” language, due to its roots in mathematics. In practice, the model checker recognizes four primitive types and four advanced ones. We are not going to cover all of them now, just the ones that need little explanation. TLA+是一门无类型语言,因为它本质上是处理数学的。实际上,模型检测器能认出来四种原始类型和四种进阶类型。姑且不一一介绍,这里先浅浅介绍几个。
Every value type in TLA+ has its own operators, with no overlap or overloading. The two exceptions to this are = and #, which “equals” and “not equals”, respectively. Values of different types cannot be tested for equality, and that will throw an error.
TLA+中的每个值类型都有自己的操作,没有重叠,也不支持重载。不过有两个例外,分别是等于(=)和不等于(#)。不同类型的值不能够判断是否相等,否则会报错。
If you see an error like
Encountered “Beginning of Definition” at line $n…
It’s likely because you used a double == instead of a single = when testing for equality.
碰到上述报错,多半是因为在判断两个值是否相等的时候用了两个等号。
The obvious ones - 简单类型
Integers and strings. To get the basic addition operators, you need EXTENDS Integers. Strings must use “double quotes” and cannot use single quotes. There are no operators for strings except = and #. In practice, they are used as opaque identifiers, similar to how some languages have a :symbol type. If your system needs to manipulate strings, we instead store them in a sequence.
整型和字符串。如果要使用基本的加减操作,需要在代码开头写EXTENDS Integers。字符串类型必须添加双引号,不可以用单引号。字符串类型也只适用于= 和# 两个操作。事实上,在实践中,字符串仅用来作为标识,就想ruby里的symbol类型一样。如果你是想操作字符串,那就应该用序列类型来保存字符串。
Note there is not a float type. Floats have complex semantics that are extremely hard to represent. Usually you can abstract them out, but if you absolutely need floats then TLA+ is the wrong tool for the job. 需要注意的是,TLA+里面没有浮点型。浮点型的语法非常复杂,极难表示。通常你可以把它们抽象一下,但是通常,如果你发现没有浮点型就无法工作,那TLA+并不适合你。
Booleans - 布尔
The booleans are TRUE and FALSE.
布尔型就是TRUE和FALSE。
So why do they get their own section? There’s two things you need to know about booleans. First of all, the boolean operators look like mathematical symbols, not programming symbols. They are: 之所以把布尔值拆出来一小节讲,是因为有两件事不得不专门提及。首先,布尔操作看起来非常像数学符号,而非编程符号:
and
/\
or
\/
not
~
A quick mnemonic: ~ is a little flippy thing, so it’s “not”. /\ looks like an “A”, so it’s “and”. \/ is the other one. We can use these to build the other operators, like Xor:
快速助记小妙招:~ 看着就轻浮,所以代表非。/\看起来像一个字母A,所以代表与。最后剩下来的\/就是或了。然后我们可以基于这三个操作,来构建其他操作,比如异或(Xor):
There is one more boolean operator of note: =>, or “implication”. A => B means that B is true or A is false (or both). You don’t see this very often in programming, as it’s pretty useless for control flow. But it’s extremely important for any kind of specification work. We’ll go into more detail about it later.
还有一个值得注意的布尔操作:=>,或曰推断。A => B 的意思是,要么B是true,要么A是false,或者两个都成立。在编程中这个操作不怎么用,因为它在控制流中确实没啥用。不过在写规约的时候,它就很重要了。后面我们会详细展开讲讲。
Exercise: Contrapositives 做两个对换练习。
Rewrite
A => Busing the “regular three” programming operators.
For what values of
AandBis~B => ~Atrue?
The other thing is that TLA+ has a “bullet point notation” for boolean logic. Let’s say you need an expression like A /\ (B \/ C) /\ (D \/ (E /\ F)). That’s really hard to read! So in TLA+ you can instead write it as:
当布尔逻辑非常复杂的时候,TLA+提供了一种新的书写方式,能够使得写出来的布尔逻辑更加易读。
A /\ (B \/ C) /\ (D \/ (E /\ F))可以写为:
That makes it much clearer. Notice that we have an extra /\ before A. That’s not necessary, but it makes the shape more pleasing, so we do it. This is also the only place in the language where whitespace matters. Lets say I instead wrote
这就看起来清晰多了。注意,我们在A之前多写了一个/\。这是非必要的,但是可以使得整体更优雅。这也是TLA+语言中唯一的空格有重要意义的地方。如果我写
That means something different! It’s now A /\ (B \/ C) /\ (D \/ E) /\ F.
意思就变了,现在是A /\ (B \/ C) /\ (D \/ E) /\ F。
Sequences - 序列
A sequence is like a list in any other language. You write it like <<a, b, c>>, and the elements can be any other values (including other sequences). As with most other languages, you look up a value of the sequence with seq[n], except that instead of the indices being 0..Len(seq)-1, it’s 1..Len(seq). So yeah, they’re 1-indexed.
序列就像其他语言中的列表。写法是<<a, b, c>>,其中的元素可以是其他类型的值(包括序列)。
Did I mention they’re 1-indexed? Because they’re 1-indexed.
There’s also a Sequences module. If you EXTENDS Sequences, you also get (letting S == <<"a">>):
Expression
Gives
Append(S, "b")
<<"a", "b">>
S \o <<"b", "c">>
<<"a", "b", "c">>
Head(S)
"a"
Tail(<<1, 2, 3>>)
<<2, 3>>
Len(S)
1
SubSeq(<<1, 3, 5>>, 1, 2)
<<1, 3>>
If you see an error like
Encountered “EXTENDS” at line 3, column 1 and token “Sequences”
It’s because you wrote two separate EXTENDS lines. TLA+ can only have one EXTENDS line per spec, but you can have multiple modules (separated by commas) on it. So instead write EXTENDS Integers, Sequences and you should be fine.
With sequences, we can represent a 24-hour clock as <<hour, minute, second>>.
Sets - 集合
A set is a collection of unordered, unique values. You write them with braces, like {1, 2, 3} or {<<"a">>, <<"b", "c">>}. You can even have sets inside other sets, like {{1}, {2}, {3}}.
Sets cannot contain elements of different types; {1, "a"} is invalid.
Set Operators
To check if x is an element of set, we write x \in set. \in is used in a few other places as syntax, not just as an operator. There’s also the inverse, \notin. set1 \subseteq set2 tests if every element of set1 is also an element of set2.
Note
That’s “subset or equals”. It’s a way to sidestep the question “Is a set a subset of itself?”
We also have ways of slicing and dicing sets:
set1 \union set2is the set of all elements inset1orset2(or both).set1 \intersect set2is the set of all elements in both sets.set1 \ set2, or “set difference” is the set of all elements inset1but notset2.
Note
You might see \cup and \cap instead of \union and \intersect. This comes from the mathematical symbols for set union and intersection, which are and .
If you EXTEND FiniteSets, you also get Cardinality(set), which is the number of elements in the set.
Tip
The easiest way to test if a set is empty is by writing set = {}. Similarly, you can test if a sequence is empty by writing seq = <<>>.
Sets of Values
Now imagine we’re writing a spec which uses clock values, and we want a quick operator to add times. I might write this as
Then AddTimes(<<2, 0, 1>>, <<1, 2, 3>>) = <<3, 2, 4>>, and AddTimes(<<2, 0, 1>>, <<1, 2, 80>>) = <<3, 2, 81>>.
Wait, 81 seconds? Our clock can’t show 81 seconds, the answer should be <<3, 3, 21>>. You can think of there being a set of valid clock values, all the way from <<0, 0, 0>> to <<23, 59, 59>>, and AddTimes should always return some value in that set, almost like it has a type signature. We can enforce this in TLA+, but first we need a way of generating sets of values from values. Fortunately, for every type of value in TLA+, there’s a method to generate sets of those values.1
Let’s start with the easiest: to get the set of all booleans, just write BOOLEAN. That’s the set {TRUE, FALSE}. For integers, a..b is the set {a, a+1, a+2, ... , b}. You need EXTENDS Integers for this to work.
Tip
If a > b, then a..b is empty. This makes a lot of things a lot simpler. For example, 1..Len(seq) is the set of the indices of seq. If seq = <<>>, you get 1..0 = {}, which is what you’d expect.
Now for sequences. The Cartesian product of two sets S and T is the set of all sequences where the first element is in S and the second is in T. It’s written with \X. For example, consider LoginAttempt containing who’s logging in, the time they attempted the login, and if it was successful or not. I can represent the set of all possible such values as LoginAttempt == Person \X Time \X BOOLEAN {{explain better}}.
Speaking of Time, we can combine \X and .. to finally get our clock type:
As a quick sanity check, run Cardinality(ClockType) in your Making a Scratchfile (remember, you’ll need EXTENDS FiniteSets). You should see it has 86400 elements. We’re now one step closer to having a property for AddTimes: we want the result of it to always return a value in ClockType.
Finally, we can get all subsets of a set with SUBSET S. SUBSET ClockType will be all the sets containing a bunch of clock values… all 7,464,960,000 of them.2
Tip
I often see beginners try to test if “S is a subset of T” by writing S \in SUBSET T. This works but is very inefficient. Write S \subseteq T instead.
Map and Filter
Sets can be mapped and filtered.
I’ve found that the best way to remember which is which is by reading the colon as a “where”. So the map is “x squared where x in 1..4”, while the filter is “x in 1..4 where x is even”.
To get all the times half-past the hour, we could write:
Map and filter are great for utility, too. The range of a sequence is the set of all elements in the sequence. We can get that with a set map:
CHOOSE
Getting the number of seconds past midnight from a clock value is straightforward. But what about going the other way? If we have a time in seconds, we can get the clock time by
Floor divide by 3600 to get the total hours.
Floor divide again the remainder by 60 to get the total minutes.
Take the remainder of the second division as seconds.
This constructs a clock value from the total seconds. This is how we’d do it in a programming language, where we are implementing algorithms to do things. But it’s also error-prone. What happens if I pass in 90,000? Then this would give me <<25, 0, 0>> — a value outside of our ClockType.
Here’s another thing we could do:
Take the set of all possible clock values.
Pick the element in the set that, when converted to seconds, gives us the value.
We don’t do it this way because “the set of all possible clock values” is over 80,000 elements long and doing a find on an 80,000 element list is a waste of resources. But it more closely matches the definition of the conversion, making it more useful for specification. In TLA+ we can write the selection like this:
CHOOSE x \in set: P(x) is the generic “selection” syntax. Try it in Making a Scratchfile.
CHOOSE is useful whenever we need to pull a value from a set.
Now what happens if we write ToClock(86401)? There are no clock times that have 86,401 seconds. If you try this, TLC will raise an error. This is in contrast to the implementation solution, which will instead give us a nonsense value. 99% of the time if it can’t find a corresponding element of the set, that’s a bug in the specification, an edge case you didn’t consider. Better to harden up the operator:
Troubleshooting
If you see an error like
Attempted to compute the value of an expression of formCHOOSE x in S: P, but no element of S satisfied P.
It’s because you wrote a CHOOSE that couldn’t find any values. Sometimes this just means you got the expression wrong. But other times, it points to an actual flaw in your system: you expected a value to exist, and it did not. Better write some error-handling logic or you’ll get a nasty surprise in production.
Warning
What if multiple values satisfy CHOOSE? In this case the only requirement is that the result is deterministic: the engine must always return the same value, no matter what. In practice this means that TLC will always choose the lowest value that matches the set.
LET
As you can imagine, TLA+ operators can get quite complex! To make them easier to follow, we can break them into suboperators, using LET:
The LET gives us a new definition, locally scoped to ToClock. seconds_per_day is an operator that only exists in the definition of this one.
Wait, operator? Yes, we can add parameterized operators in LET, too!
And you can define multiple operators in the same LET:
Each operator in the LET can refer to previously defined operators in that scope. With this we can construct solutions step-by-step.
Let’s calculate ToClock the “programming way”:
If you have to write a complex operator, breaking it into steps with LET is a great way to make it more understandable.
Summary - 总结
Operators are top-level “functions”, and evaluate to expressions. They are written
Op(a, b) == expr, with two equal-signs.Operators can have conditions with
IF-THEN-ELSE, and suboperators withLET-IN.
Sequences are collections of ordered values, and are 1-indexed.
Logic is
/\for “and”,\/for “or”, and ~ for “not”.Logical statements can be written “bullet-points” style.
Sets are collections of unordered, unique values.
We can test if an element is
\ina set or if a set is a\subseteqof another set.We can
\union,\intersect, and (set difference) two sets.We can CHOOSE elements of sets.
All types have “sets of” that type. For integers it’s
a..b, for booleans it’sBOOLEAN, for sets it’sSUBSET, and for sequences it’sS1 \X S2.We can map and filter sets.
Except strings. Well actually there is a keyword, STRING, but it represents all possible strings, which is an infinitely large set, so…
If you actual try this TLC will error out, because it assumes sets with more than 1,000,000 elements are unintentional. You can raise the limit in the TLC options.
Last updated