Skip to main content

Submodels vs Elementary Submodels

01 Oct 2020

I’m giving a GSS talk next week in the hopes of introducing myself to my new department. More importantly, I am giving this talk to try and showcase the utility of model theory in combinatorics and algebra. While writing this talk, I’ve been thinking about the best way to discuss elementary submodels as opposed to regular old submodels. When it was taught to me, a focus was placed on “quantifying over extra stuff”, which is true, but I was never shown a simple example. While I was thinking about how to teach it, I realized we can already find elementary and nonelementary submodels in graph theory, and that these might serve as good examples for people new to the area.

So, to begin, we have to talk about Models. I’m planning to write a longer form post on model theory at some point, but here are the basics.

We typically have functions, relations, and constants associated to any given “structure” in mathematics. For groups, we obviously have the constant e (representing the identity), a binary function (for the group operation), and a unary function 1 (for inversion). We can define other operations too, such as the commutator [x,y]=x1y1xy, but these can all be defined in terms of the operations given. For the natural numbers N we often restrict our attention to the constants 0 and 1, as well as binary functions + and ×, and a relation ≤⊆N2. An example that we will refer to for the rest of this post will be graphs. A graph has a single binary relation E – the edge relation.

When studying any fixed branch of math, the discourse is usually confied to formulas involving only some collection of symbols. Often we prove things by manipulating these symbols according to known rules, which are set out in axioms for how these symbols should behave. Model Theory studies what we can prove using only this information, and by formulating certain problems model theoretically, we can bring a lot of powerful machinery to bear.


To start, we have to fix the symbols of interest. A Signature is a tuple σ=(C,R,F), where C is a set of Constant Symbols, Rn is a set of (n-ary) Relation Symbols, and Fn is a set of (n-ary) Function Symbols.

Then we can define the Language L(σ) as (informally) the set of mathematical formulas using

From your mathematical experience, you doubtless know intuitively how these symbols are allowed to fit together. However, for those wanting to see the precise definition, I’ve included one at this footnote 1.

Now that we know what formulas we are allowed to discuss (in a fixed language L(σ)), it is time to discuss models for this language. These models are sets equipped with functions and relations for each symbol in σ, and will give us a way to assign a truth value to formulas.

Formally, a σ-model M=(M,CM,RM,FM) is a set M plus an element cMM for each cC, a function fM:MnM for each (n-ary) fFn, and a relation rMMn for each (n-ary) rRn. In practice, the superscript M is often omitted when the model is clear from context.

Now that we know how to interpret the symbols of σ, it is intuitively clear how to bootstrap our way to an interpretation of the formulas of L(σ). One uses the traditional meantings of the extra symbols, and asks whether the resulting formula is true in M. This is perhaps best given as a “definition by example” (which I’ll provide in the next paragraph), but I have again left a formal definition as a footnote 2.

Let’s look at the signature σ with one constant 0, a binary function +, and a binary relation . Then we can take the set N with the obvious interpretation as a σ-model. Let’s take φ=m.n.nm as a case study. N obviously thinks this formula is true, because for all mN there does exist a nN so that nNm. In this case, we write Nφ, which we read as “N models φ” or, “N satisfies φ”, or “N thinks that φ is true”, etc. If instead we look at ψ=n.x.x+n=0, we see that ψ is false in N. This is because, no matter which nN you pick, there is always an xN so that n+Nx0N. In this instance, we say Nψ.

Notice, however, that there is no restriction (yet) on our interpretation. We could look at a nonstandard interpretation, say N~, where 0N~=3, x+N~y=3 for every x and y, and N~={(4,5)}. Then we see N~φ, and N~ψ (why?). We could even work with a totally different set, and look at M={a,b,c} with 0M=a, M={(a,b),(c,b)}, and x+My=b. Obviously this example was arbitrary, so you can imagine the diversity of possible models.

In practice, though, we almost never work with bare models. We always ask that our models satisfy some axioms AL(σ). These axioms greatly restrict which models we view as worthy of study, but we still view them abstractly. A very common technique in model theory is to add symbols and axioms to a theory of interest. Then we use these extra symbols and axioms to force our model to behave in a certain way (after arguing that such a model must exist). Then we “forget” about the extra symbols, and see that this new model is still a model of our original theory, but it has new properties that are desirable.


Today, though, we’re not talking about these techniques. We are instead focusing on two notions of “submodel”, and talking about their differences. We will focus on graphs, since they are simple combinatorial objects that one can easily understand and play with. For us, a graph will be a model for the signature G with exactly one binary relation E. Moreover, we will restrict attention to those models satisfying the symmetry axiom x.y.xEyyEx. If you would rather your graphs be loopless, we could also add the irreflexivity axiom x.¬xEx, but this will not be relevant for our discussion.

A submodel X of a σ-model M is almost certainly what you’re expecting. It is a subset XM which contains all the constants in C and is closed under all the functions in F. Then we can define cX=cM, fX=fMXn, and rX=rMXn. The only important (and sometimes counterintuitive) part of this definition is the notion of relation. This is best shown by a graph theoretic example.

Consider the following graph Γ, which we view as a G-model:

A triangle with an extra edge

Then submodels are the induced or full subgraphs. Notably this is a submodel:

the induced triangle subgraph

While this (the tripod T) isn’t:

A subgraph that removes an edge

This fails to be a submodel beacsue its edge relation ET is not equal to EΓ{a,b,c,d}2. Model theory has a way to talk about these kinds of “almost submodels”, though. The inclusion i:TΓ is a homomorphism of models, and homomorhpisms have to preserve “positive formulas” (roughly, the formulas with no ¬ and no quantifiers). The converse fails, however, and we can see T¬bEc while Γ¬i(b)Ei(c).

As an algebraic example, consider the natural homomorphism p from a group to its abelianization. If some element g is of order n, then its image p(g) will still be of order n. We can think of this is reflecting that whenever Ggn=1, we also have Gabp(g)n=1. Any relations among the elements of g are preserved by the homomorphism. These are the “positive formulas”. Statements about what things aren’t related, though, might fail to be preserved. For instance, if g and h don’t commute in G, we will have Gghhg, while Gabp(g)p(h)=p(h)p(g). Existing relations must be preserved, but homomorphisms are allowed to add new relations.

Submodels, though (and more generally embeddings) preserve positive and negative formulas. Phrased differently, submodels both preserve and reflect positive formulas. Phrased differently yet again, if X is a submodel of M then for every quantifier free formula φ and every a1,,anX we have

Xφ(a1,,an)Mφ(a1,,an)

But what, I hear you asking, about quantifiers? Notice it is not true that submodels have to prserve the truth of formulas with quantifiers. Intuitively, think about a formula x.φ. It is entirely possible that such an x exists in M but that element was excluded from the submodel. Similarly, it might be the case that X thinks that x.φ, simply because we excluded all of the counterexamples which might exist in M!

As a concrete example, consider the center Z(G) of a group G. This is a submodel, but Zx.y.xy=yx, while G (typically) does not. Perhaps even more agressively, we have Gx.xe, while the trivial subgroup 1G doesn’t!

We can see similar behavior in graphs. For instance, the following graph has Γx.y.xyxEy.

a graph which is held together by t

However, the submodel which excludes t does not satisfy that formula.


This brings us to a very special kind of submodel. A submodel X of M is caled elementary if and only if, given any elements a1,,anX and any formula φ at all,

Xφ(a1,,an)Mφ(a1,,an).

To show that a submodel is elementary, we have the Tarski-Vaught Test, which says that X is an elementary substructure of M if and only if for every a1,,anX and for every φ: whenever some mM satisfies φ(m,a1,,an), there is already a mX which also satisfies φ(m,a1,,an). This says that anytime we can find an element in M that makes a formula true, we can actually do it without leaving X.

It turns out there’s a reason there aren’t many simple examples of elementary submodels. Can you show that the only elementary subgraph of a finite graph is itself? (Hint: try to write down a formula which characterizes the graph up to isomorphism.)

This is true for more than just graphs, for the same reason. So any model with an elementary submodel must be infinite.

One easy example of an elementary submodel is KN, the complete graph with one vertex for each natural number, as a submodel of KZ, the complete graph with one vertex for each integer. These are both isomorphic to K0, and thus they are isomorphic to each other. It is clear, then, that they have the same first order theory.

Since an elementary submodel X must look almost exactly like the full structure M, you might think that this is the only possibility. You might think there must be an isomorphismm XM. Thankfully (and surprisingly!) this is false! This is the source of lots of interesting mathematics.

To get an example of this phenomenon, we can generalize the above example slightly. Look at KN again, but now viewed as a subgraph of KR (the complete graph with one vertex for each real number). Clearly these aren’t isomorphic, since they have different cardinalities. On the other hand, it is intuitively clear that they look extremely similar, especially since any fixed formula can only refer to finitely many vertices (or, with a quantifier, all vertices at once).

As a second, less trivial example, consider the following graph Γ of integers, where x and y are adjacent whenever their difference is ±1. Obviously the graphic shows only a finite part of the infinite graph.

A fragment of the integers

We can also consider the disjoint union Γ+Γ of two such graphs. Again, we only show a finite part of this graph.

A fragment of two copies of the integers

It turns out Γ and Γ+Γ are elementary equivalent! (can you show this with the Tarski-Vaught Test?) Then no first order logic formula can detect disconnectedness. If Gφ were true if and only if G is connected, then Γφ while Γ+Γφ. But since Γ and Γ+Γ have the same first order theory, no such φ can exist! Intuitively, this is because any first order formula can only say “there is no path of length n”, which doesn’t exclude the possibility of some longer path. If you choose to prove the first graph is an elementary submodel of the second using vaught’s test, you’ll make this intuition formal. Any existential statement which is witnessed by something in the other copy of Z can actually be witnessed by something in the same copy of Z that’s far enough away.

There are lots of reasons elementary submodels are useful (for a famous example, see nonstandard analysis ), but it’s important to keep straight the difference between homomorphic images, submodels, and elementary submodels. I know I was tripped up by this for a long time, and a younger me would have been helped by these toy examples from graph theory.

As a last exercise, there is a strengthening of these results. Can you show that, whenever A is a submodel of B and φ is quantifier-free, we have the following:

  • if Bx.φ, so does A
  • if Ax.φ, so does B

In words, universal formulas are “preserved downwards”, and existential formulas are “preserved upwards”.


  1. We recursively define terms, which pick out particular elements of a model. Then we use terms to recursively define formulas, which are the actual objects of interest.

    A term is defined recursively as follows:

    • each variable xi is a term
    • each constant symbol c is a term
    • given terms t1,,tn and an n-ary function symbol f, f(t1,,tn) is a term

    A formula is defined recursively as follows:

    • given terms t1 and t2, t1=t2 is a formula.
    • given terms t1,,tn and a n-ary relation symbol r, r(t1,,tn) is a formula
    • given formulas φ1 and φ2, the following are formulas too
      • ¬φ1
      • φ1φ2
    • given a formula φ and a variable xi, the following are formulas too
      • xi.φ

    formulas of the first two types are called atomic.

    We can then define various other symbols as abbreviations: φ1φ2=¬(¬φ1¬φ2), xi.φ=¬x.¬φ, etc. 

  2. We first discuss how to interpret terms. Each term t will define a function tM:MnM, where n is the number of variables in t. Notice a term with no variables refers to a function M0M, which can be identified with a fixed element of M.

    • Variables xi are interpreted as the identity function.
    • Constants c are interpreted as the cM specified by the model.
    • A term f(t1,,tn) is interpreted as the composite fM(t1M,,tnM).

    Notice that we can assume t1,,tn all have the same input variables. There must be a biggest xm referred to by any of the terms, and we can view each tj as a function of all the xi with im. We simply ignore any unnecessary inputs, just like f(x,y)=x2 is still a function of y. This observation makes the composition well defined, and will be useful again in the definition of formulas.

    Next, we interpret formulas as subsets of Mn, where n is tne number of free variables in the formula. Here a free variable is a variable that is not “quantified away”. Notice a formula with no free variables correpsonds to a subset of M0={}. We think of M0 as “false” and {}M0 as “true”.

    • (t1=t2)M={(a1,,an)Mnt1M(a1,,an)=t2M(a1,,an)}
    • r(t1,,tn)M={(a1,,an) | (t1(a1,,an),,tn(a1,,an))rM}
    • (¬φ)M=MnφM
    • (φ1φ2)M=φ1Mφ2M
    • (xi.φ)M={(a1,,ai1,ai+1,an) | For some ai,φM(a1,,an)}

    Again, we can always choose the free variables as xi with im for some m, by allowing some φ to ignore some of their inputs. This is necessary, for instance, to make sure φ1Mφ2M is well defined.

    Since the other symbols , , etc. are defined in terms of these, this suffices to interpret all formulas.

    Finally, we write Mφ(a1,,an) if and only if (a1,,an)φM. In particular, if φ has no free variables, then Mφ if and only if φM={}.

    As a (possibly fun) exercise, fix a signature σ and write some code which takes in a formula (with no free variables) and a model as input, and tells you if the formula is true or false in that model. If you’re feeling particularly ambitious, try to make it take a signature as input too. As some advice, I think this exercise is much easier in a functional language than an imperative one. This is what algebraic data types are made for.