Umbral Calculus Derivation of the Bernoulli numbers

\\(“(B-1)^n = B^n”\\)

\\(\begin{aligned}
(B-1)^2 &= B^2\\
B^2 - 2B^1 + 1 &= B^2 \\
-2B^1+1 & = 0\\
B^1 &= \frac{1}{2} \\
B_1 &= \frac12
\end{aligned}\\)

\\(\begin{aligned}
(B-1)^3 &= B^3\\
B^3 - 3B^2 + 3B^1 - 1 &= B^3 \\
B^3 - 3B^2 + 3B_1 - 1 &= B^3 \\
-3B^2 + 3(\frac{1}{2})-1& = 0\\
-3B_2 + \frac{1}{2} &= 0 \\
B_2 &= \frac{1}{6}
\end{aligned}\\)
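The two hand calculations above follow one pattern, so here is a minimal Python sketch of it. It assumes \(B_0 = 1\) (not stated above, but needed to get started) and uses the function name `bernoulli_umbral`, which is just my label. For each \(n \geq 2\) it expands \((B-1)^n\) binomially, lowers the indices, and solves for \(B_{n-1}\).

```python
from fractions import Fraction
from math import comb


def bernoulli_umbral(n_max):
    """Return [B_0, ..., B_n_max] using the umbral rule (B-1)^n = B^n for n >= 2."""
    B = [Fraction(1)]  # assumption: B_0 = 1
    for n in range(2, n_max + 2):
        # Expand (B-1)^n and lower each B^k to B_k.
        # The B^n terms cancel and B^(n-1) appears with coefficient -n,
        # so -n * B_(n-1) + (lower-order terms) = 0.
        lower = sum(comb(n, k) * (-1) ** (n - k) * B[k] for k in range(n - 1))
        B.append(lower / n)
    return B


print(bernoulli_umbral(4))
```

With this sign convention the sketch reproduces \(B_1 = \frac12\) and \(B_2 = \frac16\) from the calculations above.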

Thanks to Laurens Gunnarsen for showing me this strange trick.

I’ve finally understood the principle which allows us to lower the index. The step where we move \(B^i\) to be \(B_i\) is quite simple. As Rota and Roman say, one method of expressing an infinite sequence of numbers is by a transform method. That is, to define a linear transform \(B\) such that \\(B x^n = B_n\\)

So, the above “lowering of the index” actually takes the relation \((X-1)^n = X^n\) and applies \(B\) to both sides of it, to get \(B((X-1)^n) = B(X^n)\). Let’s look for example at the “lowering step” of the first calculation:
\\(\begin{aligned}
X^1 &= \frac{1}{2} \\
B (X^1) &= B(\frac12) \\
B_1 &= \frac12 B(1) = \frac12
\end{aligned}\\)

How do I construct the Tits-Freudenthal magic square?

Thanks to Mia Hughes and John Huerta for the helpful discussions on this topic.

I am here taking another quick jab at trying to understand the construction of the Tits-Freudenthal Magic square. Let’s see if we can get into Vinberg’s mindset when he wrote down Vinberg’s construction.

Let’s say we knew the following theorem: \\(\text{ the derivations of } \mathcal{J}_3(\mathbb{O}) = f_4\\) We want to write down derivations of other algebras, \(\mathbb{O} \otimes_{\mathbb{R}} \mathbb{D}\), where \(\mathbb{D}\) is a division algebra.

Let’s see how we might derive the fact that \\(\text{der}(\mathcal{J}_3(\mathbb{A})) \simeq a_3(\mathbb{A}) \oplus \text{der}(\mathbb{A})\\)

where \(a_3\) denotes the 3×3 trace-free anti-Hermitian matrices, and \\(\text{der}(A) = \text{Lie}(\text{Aut}(A))\\)

Look upon the set of 3×3 Hermitian (aka self-adjoint) matrices over an associative division algebra \(\mathbb{A}\).

Equip this set with the symmetrized (Jordan) product, \(\frac{XY+YX}{2}\). Note that this multiplication is commutative but not associative; more importantly, it preserves self-adjointness. We call this algebra \(\mathcal{J}_3(\mathbb{A})\).

The automorphisms of this algebra are given by conjugation by unitary matrices,

\\(X \mapsto UXU^{-1}\\)

which preserves the product. Note that we may write any unitary element \(U\) close to the identity as the exponential of an anti-Hermitian matrix, that is,

\\(U = e^{T}\\)

for some \(T\). So, when \(T\) has components really close to \(0\),

\\(U = e^T \approx Id + T\\)

\\(U^{-1} = e^{-T} \approx Id - T\\)

So, when \(T\) is very small, that is, small enough that we may treat \(T^2\) as \(0\), we see that:

\\(UXU^{-1} \approx (Id + T) X (Id - T) = X + TX - XT - TXT \approx X + [T, X]\\)

In other words, when \(T\) is very small our automorphism \\(X \mapsto UXU^{-1}\\)

becomes: \\(X \mapsto X + [T,X]\\)
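To make the infinitesimal statement a bit more tangible, here is a small Python sketch over \(\mathbb{A} = \mathbb{C}\) (my choice of example; the matrices `A`, `B`, `C` are arbitrary). It takes \(T\) anti-Hermitian, so that \(e^T\) is unitary, and checks two things: \([T, X]\) is again Hermitian, and \(X \mapsto [T, X]\) obeys the Leibniz rule for the product \(\frac{XY+YX}{2}\). That is, it is a derivation of the algebra.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(A, B, sign=1):
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


def jordan(X, Y):
    """The symmetrized product (XY + YX)/2."""
    return [[s / 2 for s in row] for row in add(matmul(X, Y), matmul(Y, X))]


def comm(T, X):
    """The commutator [T, X] = TX - XT."""
    return add(matmul(T, X), matmul(X, T), sign=-1)


# Arbitrary matrices with Gaussian-integer entries (so all arithmetic is exact).
A = [[1, 2j, 0], [3, 1 + 1j, 2], [0, 1j, -1]]
B = [[2, 1, 1j], [0, 1, 3], [1 - 1j, 0, 1]]
C = [[0, 1j, 2], [1, 1, 0], [2j, 3, 1]]

T = add(A, dagger(A), sign=-1)  # anti-Hermitian: dagger(T) == -T
X = add(B, dagger(B))           # Hermitian
Y = add(C, dagger(C))           # Hermitian

# Leibniz rule: [T, X o Y] = [T, X] o Y + X o [T, Y]
lhs = comm(T, jordan(X, Y))
rhs = add(jordan(comm(T, X), Y), jordan(X, comm(T, Y)))

print("[T, X] is Hermitian:", dagger(comm(T, X)) == comm(T, X))
print("Leibniz rule holds:", lhs == rhs)
```

This only tests the derivation property for one sample; it says nothing about whether every derivation arises this way.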

We may break \(T\) up into its trace-free (0 on the diagonal), and trace parts: say \(T = T_0 + T_1\).

\\(X \mapsto X + [T_0, X] + [T_1, X] \\)

We look first at \([T_0, X]\): here \(T_0\) ranges over the algebra of trace-free anti-Hermitian matrices, \(a_3(\mathbb{A})\), acting on \(X\) by commutator.

Now, we look at \([T_1, X]\): When \(\mathbb{A} = \mathbb{R}\), the trace part is 0. But, otherwise, the trace is purely imaginary.

So, \([T_1, X] = [Id \text{Tr }T , X] = [i, X] = iX - Xi\), where \(iX - Xi = \frac{d}{d\epsilon}\big(e^{\epsilon i} X e^{-\epsilon i}\big)\big|_{\epsilon = 0}\). That is, \([T_1, X]\) is the derivative of a one-parameter family of automorphisms of \(X\), which means that it is a derivation! So, it is believable that \(T_1\) (acting by commutator) is the algebra of derivations of \(\mathbb{A}\).

\(\implies\) We have shown (intuitively, at least), by examining our automorphisms when \(T\) is taken to be very small, \(X \mapsto X + [T_0, X] + [T_1, X] \), that:

\\(\text{Lie}(\text{Aut}(\mathcal{J}_3(\mathbb{A})))\simeq a_3(\mathbb{A}) \oplus \text{der}(\mathbb{A})\\)

where \(a_3\) denotes the 3×3 trace-free anti-Hermitian matrices, and \\(\text{der}(A) = \text{Lie}(\text{Aut}(A))\\)

It is reasonable then to plug in other algebras \(\mathbb{K}\) in place of \(\mathbb{A}\), \\(a_3(\mathbb{K}) \oplus \text{der}(\mathbb{K})\\)

Indeed, we may look at the case \(\mathbb{K} = \mathbb{O} \otimes_{\mathbb{R}} \mathbb{D}\) where \(\mathbb{D}\) is a division algebra.

\\(a_3(\mathbb{O} \otimes_{\mathbb{R}} \mathbb{D}) \oplus \text{der}(\mathbb{O} \otimes_{\mathbb{R}} \mathbb{D})\\)

The typical Vinberg construction looks like

\\(a_3(\mathbb{O} \otimes_{\mathbb{R}} \mathbb{D}) \oplus \text{der}(\mathbb{O}) \oplus \text{der}(\mathbb{D})\\)

I am not sure if \(\text{der}(\mathbb{O}) \oplus \text{der}(\mathbb{D}) \simeq \text{der}(\mathbb{O} \otimes_{\mathbb{R}} \mathbb{D})\), but it is believable. At least one includes the other.

As to why this gives us the exceptional Lie algebras, we still do not know, but at least we see how Vinberg’s construction may have come about!

What does an algebraic integer have to be?

What does an integer have to be?

  • No matter how you extend \(\mathbb{Q}\), the integers which lie in \(\mathbb{Q}\) must lie in \(\mathbb{Z}\).
  • If \(\alpha\) is an integer, then so are its conjugates.
  • The sums and products of integers are also integers.

From this we may describe what an algebraic integer must be.

Start with a root \(\alpha\).

Look at all of its conjugates, \\(\alpha, \alpha', \alpha'', \dots\\) By conjugates, I mean the elements that have the same minimal polynomial as \(\alpha\) (that is, the elements that cannot be distinguished from \(\alpha\) algebraically).

Look at all products and sums of \(\alpha, \alpha', \alpha'', \dots\).

Look at symmetric polynomials in \(\alpha, \alpha', \alpha'', \dots\). Things that are symmetric in the roots must be rational (by the fundamental theorem of Galois theory, applied to symmetric polynomials), and they must be integral because sums and products of integral things are integral. A rational integer lies in \(\mathbb{Z}\). So, by Vieta, the minimal polynomial of \(\alpha\) must be monic with coefficients in \(\mathbb{Z}\).
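Here is a tiny worked instance in Python, using an example of my own choosing: \(\alpha = \frac{1+\sqrt5}{2}\) with conjugate \(\alpha' = \frac{1-\sqrt5}{2}\). Elements of \(\mathbb{Q}(\sqrt5)\) are stored as pairs \((a, b)\) meaning \(a + b\sqrt5\). The symmetric functions of the conjugates land in \(\mathbb{Z}\), and Vieta hands us a monic integer polynomial.

```python
from fractions import Fraction


# Elements of Q(sqrt 5) as pairs (a, b) meaning a + b*sqrt(5).
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])


def mul(u, v):
    return (u[0] * v[0] + 5 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


half = Fraction(1, 2)
alpha = (half, half)         # (1 + sqrt 5) / 2
alpha_conj = (half, -half)   # (1 - sqrt 5) / 2

e1 = add(alpha, alpha_conj)  # sum of the conjugates
e2 = mul(alpha, alpha_conj)  # product of the conjugates

# Vieta: the minimal polynomial is x^2 - e1*x + e2.
coeffs = [Fraction(1), -e1[0], e2[0]]
print(e1, e2, coeffs)

# Sanity check that alpha really is a root: alpha^2 - e1*alpha + e2 = 0.
value = add(add(mul(alpha, alpha), mul((-e1[0], 0), alpha)), (e2[0], 0))
print(value)
```

The \(\sqrt5\) parts of the sum and product cancel, which is the "symmetric things are rational" step in miniature.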

Thanks to Aaron Slipper and Hecke.

Newspaper Ad: Looking for a Variety

Hello, my name is Catherine. I don’t want much, just looking for a nice Variety to spend my days with. If you apply, I’d like you to have a well understood group law that comes from some 3-fold symmetry, but I’m a simple girl, easy to please, and I don’t need your group law to be all fancy and closed: a group chunk (a group law which is closed at least locally around the origin) is fine by me. I’ll have to put you through an interview process to see if your group chunk gives me a formal group law which is height 3, but don’t worry, it’ll be painless. Please let me know if you have a friend that matches this description!

This post is mostly a set-up to an (ill-formed) question. It’s motivated by this question:

How do I construct a variety which gives me
a formal group law of height 3?

That is, I want it to have a nice kind of 3 fold symmetry which is reflected in its structure around a marked point.

I want this variety not to be a bunch of copies of the additive or multiplicative group on \(R^n\). I am trying to define an at least 3-dimensional variety with a group chunk (that is, a group law which is closed at least locally around the origin). I want this group chunk to not be isomorphic to an additive, multiplicative, or elliptic group, or products of such groups.

The previous variety I was looking at ended up being isomorphic to the additive group, though it was very pretty. Aaron and a few others derived a variety from the relationship between the lengths of the vertices to a point interior to an equilateral triangle, which I rederived with the help of Laurens. Unfortunately the group law Alex Mennen and I defined on the variety ended up being the additive formal group law. I didn’t recognize it at first because it had 2 layers of square roots as a disguise, but Jack Shotton pointed out that if we did a variable change to get rid of the square roots (a variable change I had been doing formally to make calculations easier) it became quite obviously isomorphic to the additive group.

It was also pointed out to me by Doug Ravenel that height 3 formal group laws cannot be dimension 2, for some reason to do with the symmetry of the Jacobian of a dimension 2 variety which I don’t understand. So, I look now to dimension 3. More specifically, I look at tetrahedrons, the analogue of the square lattice, in some sense.

We begin with the vague desire of deriving a variety from some relationship on a tetrahedron (hoping that this variety has both a group law, and that the group law is height 3). Inna suggested that I look at a right angled tetrahedron to make my life easier, so we will look there.


I had an idea for a group law on right-angled tetrahedrons: a group in which each element is indexed by an angle, that is, the angle \(\theta\) of the plane that intersects it symmetrically (to produce the tetrahedron).


Then, we might add their volumes. We now get a third volume. What is an angle which gives us a tetrahedron with this volume? Is it unique?

Inna and I talked about this and she referred me to a group law on angles of a tetrahedron, which looks multiplicative but involves \(\sin\), so perhaps is a bit more complicated. It looks something like this: \(\alpha + \beta = \sin^2(\alpha)\sin^2(\beta)\). Where does this group law live? What variety has angles as points? Does the free abelian group generated by tetrahedra have a geometric structure we could use?

Another issue: We now have a group law, but no variety! The whole point was to define a group law close to the origin, but what is closeness in this group?

I stop writing with a question still quite ill-formed and fuzzy:

How might we derive a variety based on a tetrahedron which allows us to put an angle-y/volume-y group law on its points?

How do I explicitly write an ODE as a linear ODE + a nonlinear ODE?

I was learning about linearized stability and was confused by where the magical linearized version of the equation was coming from. I finally understand it, and it is so stupidly simple that I want to tell you about it. First, I’ll motivate the question. If you don’t care about the motivation, just scroll ahead a bit.

How does a small perturbation of the initial input of my ODE \(y' = \phi(y)\) affect the long-term behavior?

Let’s say there’s a weight at the end of a stiff thin rod. If we stand this pendulum straight up and manage to have it stay put, this is an equilibrium point. However, if we knock it ever so slightly, it will swing and settle back down at the bottom — so this is unstable. If we flick the pendulum when it is settled at the bottom, it will oscillate and eventually resettle at the bottom.

Let’s be a bit more explicit about this. Let’s start with an equilibrium point, that is, a solution \(y\) such that \(y(t) = q\) for all \(t\). If we instead set \(y(0) = q + \epsilon_0\), where \(\epsilon_0\) is a pretty small perturbation, will \(y(t) = q + \epsilon(t)\) spiral away or stay within some small delta neighborhood of \(q\)?

This question is too hard for a general ODE, but we sure do know the stability of systems of the form \(y' = Ay\) by just looking at the eigenvalues of \(A\). But, what good does that do? Will the stability of the solutions of \(y'= Ay\) tell me anything about the stability of the solutions of \(y' = \phi(y)\)? And, really, in the first place, how can I explicitly re-write my nonlinear ODE as a linear ODE + a nonlinear part? That is, how do I find the form: \\(y' = \phi(y) = Ay + g(y)\\) where \(g\) is some smooth function, and \(A\) is a matrix.

Here’s how we find the matrix \(A\):

If we have an ODE of the form \(y' = \phi(y)\), we can take the Taylor series of \(\phi(y)\) with respect to a point \(y_c\). So, locally around \(y_c\):
\\(\phi(y) = \phi(y_c) + \frac{\partial \phi}{\partial y}(y_c) \cdot (y - y_c) + \frac{1}{2}\frac{\partial^2 \phi}{\partial y^2}(y_c) \cdot (y - y_c)^2 + \dots\\)

If we want to get rid of the pesky first term, we can pick \(y_c\) to be an equilibrium point (a solution \(q\) where \(q(t)= c\) for some fixed \(c\), for all time, so \(q' = 0\)), that is, a point \(q\) such that \(\phi(q) = 0\). So, locally around \(q\):

\\(\phi(y) = \frac{\partial \phi}{\partial y}(q) \cdot (y - q) + \frac{1}{2}\frac{\partial^2 \phi}{\partial y^2}(q) \cdot (y - q)^2 + \dots\\)

If \(y\) is an initial value close to \(q\), that is, \(|| y – q|| = \epsilon_0\) is small, then \(|| y – q ||^2\) and all higher powers are going to be hella small, so we can sometimes ignore them. Locally around \(q\):

\\(\phi(y) = D\phi(q) \cdot (y-q) + \text{ small nonlinear stuff }\\)

Where \(D\phi(q)\) is the Jacobian of \(\phi\), evaluated at the point \(q\). We call the approximation \(y' = D\phi(q)\cdot(y-q)\) the “linearized” version of our original equation, \(\phi(y) = y'\).

And, \(g(y)\), the nonlinear stuff, is then the difference of our original equation and the linear portion: \(g(y) = \phi(y) - D\phi(q)\cdot(y-q)\).
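As a rough numerical sketch of this recipe, here it is in Python for the pendulum from the motivation. The specific equation is my assumption: \(y = (\theta, \omega)\), \(\phi(y) = (\omega, -\sin\theta - c\,\omega)\), with a made-up damping constant `DAMPING`. The Jacobian is approximated by central differences rather than computed symbolically, and the 2×2 eigenvalues come from the trace and determinant.

```python
import cmath
import math

DAMPING = 0.5  # hypothetical friction coefficient, chosen only for illustration


def phi(y):
    """Damped pendulum; y = (angle from the bottom, angular velocity)."""
    theta, omega = y
    return (omega, -math.sin(theta) - DAMPING * omega)


def jacobian(f, q, h=1e-6):
    """Central-difference approximation of Df(q) for f: R^2 -> R^2."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        up, dn = list(q), list(q)
        up[j] += h
        dn[j] -= h
        f_up, f_dn = f(up), f(dn)
        for i in range(2):
            J[i][j] = (f_up[i] - f_dn[i]) / (2 * h)
    return J


def eigenvalues_2x2(J):
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)


def g(y, q, A):
    """Nonlinear remainder g(y) = phi(y) - A (y - q)."""
    d = (y[0] - q[0], y[1] - q[1])
    lin = (A[0][0] * d[0] + A[0][1] * d[1], A[1][0] * d[0] + A[1][1] * d[1])
    p = phi(y)
    return (p[0] - lin[0], p[1] - lin[1])


bottom, top = (0.0, 0.0), (math.pi, 0.0)  # the two equilibria: phi(q) = 0
for name, q in (("bottom", bottom), ("top", top)):
    A = jacobian(phi, q)
    lam = eigenvalues_2x2(A)
    verdict = "stable" if max(l.real for l in lam) < 0 else "unstable"
    print(name, lam, verdict)
```

In this sketch the linearization at the bottom has eigenvalues with negative real part, and the one at the top has a positive eigenvalue, which matches the pendulum story above. Near the equilibrium, `g` comes out tiny compared to the perturbation.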

Afternote: If this was an equation with terms explicitly dependent on \(t\), we could take the two dimensional Taylor series. That is, if \(y' = \phi(y, t)\), then, locally around \((y_c, t_c)\):

\(\phi(y,t) = \phi(y_c, t_c)\)

\\(+ \frac{\partial \phi}{\partial y}(y_c, t_c) \cdot (y - y_c) + \frac{\partial \phi}{\partial t}(y_c, t_c) \cdot (t - t_c) \\)

\\(+ \frac{1}{2}\frac{\partial^2 \phi}{\partial y^2}(y_c,t_c) \cdot (y - y_c)^2 + \frac{\partial^2 \phi}{\partial y \partial t}(y_c,t_c) \cdot (y - y_c)(t - t_c) + \frac{1}{2}\frac{\partial^2 \phi}{\partial t^2}(y_c, t_c) \cdot (t - t_c)^2 + \dots \\)

Spectrum of a Ring and Spectrum of a Linear Operator

A quick post before bed, an impressionist stroke on some nice things lurking in linear algebra. I love polynomials. They are the ultimate tools that make me feel like I’m touching something; calculating at the level of a polynomial is a good clean feeling. I want to show you that it is nice to think of a vector space over \(F\) as an \(F[x]\)-module (thanks Emmy Noether). Thanks to Semon Rezchikov for helping me get over a few bumps in grasping some of the following.

Let \(A\) be a finite type \(\mathbb{C}\)-algebra, so that \(A = \mathbb{C}[x_1, \ldots, x_n]/(f_1, \ldots , f_k)\). Then Spec \(A\) is the variety cut out of \(\mathbb{C}^n\) by \(f_1, \ldots, f_k\).

We can specify a representation of \(\mathbb{C}[x]\) on a vector space \(V\) by specifying where to send \(x\). That is, by specifying a linear operator \(A\).

We can alternatively phrase this. Given \(A \in \text{End}_F(V)\), we can treat \(V\) as an \(F[x]\)-module by \\(f(x)\cdot v := f(A)v\\)

Alright, now, take the image of \(\mathbb{C}[x]\) in \(\text{End}(V)\), let’s denote this \(\mathbb{C}[A]\). This is a subring of \(\text{End}(V)\), and it’s commutative.

Here’s the cool part:

\\(\text{Spec }\mathbb{C}[A] = \text{Spec }A \\)

But where is this identification coming from?

Well, \\(\mathbb{C}[A] = \mathbb{C}[x]/(\text{ stuff that acts by 0 on V })\\)

Let’s give “stuff that acts by 0 on V” a closer look as the subset of \(F[x]\) it is, let’s frak it up:

\\(\mathfrak{P}_A = \{ f \in F[x] \mid f(A)v = 0, \forall v \in V \}\\) This is an ideal of \(F[x]\), and since \(F[x]\) is a PID, any ideal is generated by one element. The generator of \(\mathfrak{P}_A\) is the minimal polynomial of \(A\) (it has the same roots as the characteristic polynomial; when \(A\) is diagonalizable, each root appears with multiplicity 1).

So! Spec \(\mathbb{C}[x]/\mathfrak{P}_A\) is the variety cut out of \(\mathbb{C}\) by all \(g \in \mathfrak{P}_A \subset F[x]\).

Nice, now, the operator spectrum \(\text{Spec }A\) is defined as \(\{x \in \mathbb{C} \mid A-xI \text{ is not invertible }\}\), which is the set of eigenvalues, since \(A-xI\) is not invertible \(\Leftrightarrow\) \(\det(A-xI) = 0\). It does not remember multiplicity: if the eigenvalues of \(A\) are \(a, a, b\), then \(\text{Spec }A = \{a,b\}\). Also, we’re assuming the eigenvalues have finite multiplicity (otherwise nilpotent matrices might run us amok).

Start with \(\text{Spec }A\). This is a set, it’s the eigenvalues* of \(A\). Look at the polynomial \(p_A\) whose roots are \(\text{Spec } A\), each taken once. Look at the \(F[x]\)-submodule generated by this \(p_A\): when \(A\) is diagonalizable, this is exactly \(\mathfrak{P}_A\) (in general it is the radical of \(\mathfrak{P}_A\), which cuts out the same points).
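A small Python check of the "eigenvalues \(a, a, b\)" example, with the made-up values \(a = 2\), \(b = 3\). For the diagonal matrix `D`, the polynomial \((x-a)(x-b)\) acts by 0 while neither factor alone does, so it generates the ideal of stuff acting by 0. The Jordan block `J` is included as a caution: there \((x-a)\) has the right roots but does not act by 0, and one needs \((x-a)^2\).

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def shift(A, r):
    """Return A - r*I."""
    n = len(A)
    return [[A[i][j] - (r if i == j else 0) for j in range(n)] for i in range(n)]


def product_of_factors(A, roots):
    """Evaluate the polynomial prod over r in roots of (x - r) at x = A."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for r in roots:
        P = matmul(P, shift(A, r))
    return P


def is_zero(M):
    return all(entry == 0 for row in M for entry in row)


a, b = 2, 3
D = [[a, 0, 0], [0, a, 0], [0, 0, b]]  # diagonalizable, eigenvalues a, a, b
J = [[a, 1], [0, a]]                   # Jordan block, eigenvalues a, a

print(is_zero(product_of_factors(D, [a, b])))  # (x-a)(x-b) acts by 0 on D
print(is_zero(product_of_factors(D, [a])))     # (x-a) alone does not
print(is_zero(product_of_factors(J, [a])))     # (x-a) does not kill the Jordan block
print(is_zero(product_of_factors(J, [a, a])))  # (x-a)^2 does
```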

*finite multiplicity.

That’s pretty nice!

It reminds me of something. The original reason for studying group determinants (Dedekind’s idea) probably came from Galois theory, where matrices were what Galois used to depict his “groups” as permutation groups of the roots. Dedekind looked at the determinant of such a “group matrix.”

Frobenius factored this “group determinant” over the complex numbers, and found two things:

1) the number of irreducible factors equals the number of conjugacy classes of G

2) each irreducible factor is homogeneous of some degree, and then each such irreducible factor appears to the power of its own degree.

What is the “universal enveloping algebra” of a formal group law?

I posted earlier a query toward exploring the analogy between

smooth algebraic groups over \(\mathbb{R}\) or \(\mathbb{C}\) :: Lie algebras

smooth algebraic groups over \(R\) (any commutative ring) :: Formal group law

in which I tried to answer this question, and ended up with “the Lazard ring doesn’t quite work,” which makes sense in retrospect, as the Lazard ring is not associated with any particular formal group law. When I say “formal group law” I mean “1-d formal group law.”

What is a formal group law? It’s an expression of the group structure of \(G\) in an infinitesimal neighborhood of the origin. After talking with Paul V. and Dylan Wilson at the Midwest Topology Seminar, I have a somewhat more satisfying answer.

What is a Lie algebra? It’s an expression of the group structure of \(G\) at the FIRST infinitesimal neighborhood of the origin. In characteristic 0, this extends to a definition of the group structure in an infinitesimal neighborhood of the origin by the Baker–Campbell–Hausdorff formula.

Specifically, what is the universal enveloping algebra of a formal group law? It’s the formal group law itself! Well, more specifically:

Universal enveloping algebra :: Lie algebra

Functions on Formal group law  :: Formal group law

According to wikipedia, “The universal enveloping algebra of the free Lie algebra generated by X and Y is isomorphic to the algebra of all non-commuting polynomials in X and Y. In common with all universal enveloping algebras, it has a natural structure of a Hopf algebra, with a coproduct \(\Delta\). The ring S used above is just a completion of this Hopf algebra.”

A formal group law already has a Hopf algebra structure. This is just the cogroup on formal power series \(R[[x]]\) induced by the formal group law, \(f\), that is,

\\(R[[x]] \to R[[x]] \widehat{\otimes} R[[x]]\\)

\\(x \mapsto f(1 \otimes x, x \otimes 1)\\)

This is already complete! We’re a formal group law so we’re already completed at the origin! And, if we are a 1-d formal group law, we’re always commutative (unless our ring contains a nonzero element that is both torsion and nilpotent), so this is promising.
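For a concrete feel, here is a Python sketch with the multiplicative formal group law \(f(x, y) = x + y + xy\), which is my choice of example since it is an honest polynomial. Coassociativity of the coproduct \(x \mapsto f(1 \otimes x, x \otimes 1)\) amounts to associativity of \(f\), and the counit condition amounts to \(f(x, 0) = x\). Polynomials in three variables are stored as dictionaries from exponent tuples to coefficients.

```python
from collections import defaultdict


def padd(p, q):
    r = defaultdict(int)
    for poly in (p, q):
        for mono, c in poly.items():
            r[mono] += c
    return {m: c for m, c in r.items() if c != 0}


def pmul(p, q):
    r = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(e1 + e2 for e1, e2 in zip(m1, m2))
            r[mono] += c1 * c2
    return {m: c for m, c in r.items() if c != 0}


def f(u, v):
    """Multiplicative formal group law f(u, v) = u + v + uv."""
    return padd(padd(u, v), pmul(u, v))


# The three variables x, y, z as polynomials {exponent tuple: coefficient}.
x = {(1, 0, 0): 1}
y = {(0, 1, 0): 1}
z = {(0, 0, 1): 1}
zero = {}

print("associative:", f(f(x, y), z) == f(x, f(y, z)))
print("commutative:", f(x, y) == f(y, x))
print("unit:", f(x, zero) == x)
```

For a formal group law that is a genuine power series one would have to truncate at some degree; that is skipped here.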

Classification of Conic Sections

Apollonius of Perga (262 BC) wrote an exhaustive treatise exploring conics. He presented a classification of conic sections by angle. I’ll show you a summary of what he did, and then a conceptually more pleasing and suggestive way to think about it.

Note that Apollonius of Tyana, the Greek philosopher, is a different dude, a contemporary of Christ.

Let’s get the definitions out of the way so we’re all on the same page:

  • A cone is the shape generated by considering all lines in space that pass through a fixed circle \(C\) and a fixed point \(O\) (the vertex) not in the plane of the circle.
  • A conic section is the intersection of a plane with a cone.

Here’s the “angles” way to think about the different sorts of conic sections:
We see that the plane can either intersect the point \(O\), or not intersect \(O\). If the plane intersects \(O\), then it can either intersect no other part of the cone, intersect one lobe of the cone, or intersect both lobes of the cone.

The above picture also makes it clear that there are three categories of thing. Apollonius named these hyperbola, parabola, and ellipse (angle greater than alpha, equal to alpha, or less than alpha). Then we have the “degenerate” hyperbola, parabola, and ellipse. What do I mean by degenerate? I mean the intersecting lines, the double line, and the degenerate ellipse (a single point): these guys touch the origin \(O\).

This is the classic method of classification. I find it a bit unsatisfying. It leaves so many of the suggestive questions in the dark!

The cleanest way to classify conic sections is the following observation.

A line can intersect a circle 0 times, 1 time, or 2 times.


Picture the real plane compactified by adding points at infinity. We’re in projective space, so it’s safe to think of these points as forming a line at infinity.

Picture the ellipse: it is a circle which doesn’t intersect infinity.

Picture the parabola: it is a circle which intersects infinity once (with a double root).

Picture the hyperbola: it is a circle which intersects infinity twice.

What I’m trying to show you is that the best way to understand the conics is projectively!

This was first appreciated by Girard Desargues, in the late 1630s. Desargues was motivated by very practical considerations. He was trying to develop a complete geometry of the visual world. He wanted a comprehensive framework into which the tricks and techniques of perspective drawing, which had been around since the early fifteenth century, would fit naturally and perfectly.

Unfortunately, he was a terrible expositor. Instead of “straight line,” and “point of intersection,” he spoke of “palms,” and of “fronds,” and all sorts of other things. He used an arboreal vocabulary to speak of geometric ideas.

He wanted to address himself to craftsmen and artisans, who lacked a classical education. People who couldn’t read Euclid. But he was like other great mathematicians who have had profoundly new ideas: he was almost incapable of imagining what it was like not to know what he himself knew. It took a long time for other mathematicians really to assimilate Desargues’ ideas. The process didn’t really get far until about the middle of the 19th century — 200 years after Desargues’ death. The biggest single step was taken by Plücker and Möbius.

Rather than tell you how Desargues thought about it, I’ll give you my take. Let’s start with: conic sections are projective transformations of the circle.

We begin with a circle

\\(x^2 + y^2 - z^2 = 0\\)

Which we may rewrite, for ease of manipulation, in the following form:

\\(\begin{pmatrix} x & y & z \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0\\)

We may then projectively transform the circle:

\\((\begin{pmatrix} x & y & z \end{pmatrix} \cdot M^T) \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \Bigg(M \cdot \begin{pmatrix} x \\ y \\ z \end{pmatrix}\Bigg) = 0\\)

Isolating the center matrix, let’s rewrite \(M^T \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} M \text{ as } \begin{pmatrix} a & b_1 & d_1 \\ b_2 & c & e_1 \\ d_2 & e_2 & f \end{pmatrix}\).

\\(\begin{pmatrix} x & y & z \end{pmatrix} \begin{pmatrix} a & b_1 & d_1 \\ b_2 & c & e_1 \\ d_2 & e_2 & f \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0\\)

Multiplying out the left-hand side, this reads \\(ax^2 + (b_1 + b_2) xy + cy^2 + (d_1 + d_2)xz + (e_1 + e_2)yz + fz^2 = 0\\)

Tada! We see that any projective transformation of a circle is a quadratic form in 3 variables!
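Here is a quick Python sketch of that computation for one hypothetical transformation \(M\) (a shear, picked only because its inverse is easy). It builds \(M^T \,\mathrm{diag}(1,1,-1)\, M\), reads off the coefficients \(a, b, c, d, e, f\), and checks that a point which \(M\) sends onto the circle satisfies the resulting equation.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def evaluate(Q, p):
    """Evaluate the quadratic form p^T Q p."""
    return sum(p[i] * Q[i][j] * p[j] for i in range(3) for j in range(3))


CIRCLE = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]  # x^2 + y^2 - z^2

M = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]        # hypothetical invertible M (a shear)
Q = matmul(matmul(transpose(M), CIRCLE), M)

# Coefficients of ax^2 + bxy + cy^2 + dxz + eyz + fz^2
a, c, f = Q[0][0], Q[1][1], Q[2][2]
b, d, e = Q[0][1] + Q[1][0], Q[0][2] + Q[2][0], Q[1][2] + Q[2][1]


def conic(x, y, z):
    return a * x * x + b * x * y + c * y * y + d * x * z + e * y * z + f * z * z


# (3, 4, 5) lies on the circle, and M sends (-1, 4, 5) to (3, 4, 5).
print(Q)
print(evaluate(CIRCLE, (3, 4, 5)), conic(-1, 4, 5))
```

For this particular \(M\) the transformed conic works out to \(x^2 + 2xy + 2y^2 - z^2 = 0\).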

How is this related to the intersection of a conic with a plane? Well, we can view the 3-variable quadratic form

\\(ax^2+bxy + cy^2 + dxz + eyz - fz^2\\)

as the intersection of the 2-variable quadratic form \\(ax^2 + bxy + cy^2 - z^2\\) with the plane \(z = -(dx + ey + fz)\)

Alternatively, we may examine the intersection of the 3-variable quadratic form, \\(ax^2 + bxy + cy^2 + dxz + eyz - fz^2\\) with the plane \(z=0\). We see that this intersection is \(ax^2 + bxy + cy^2\).

A conic section is a projectively transformed circle. We can classify it by counting the number of times it intersects the line at infinity.


Observe that the number of roots of the polynomial \(Q(x,y) = 0\) is equal to the number of times the quadratic form \(Q(x,y)\) intersects the origin.

So, we’ve reduced the problem of classifying conic sections to the problem of counting the solutions of \(ax^2 + bxy + cy^2 = 0\). Let’s do it!

Since we are in projective space, and currently in \([x: y: 0]\), we may rescale \(ax^2 + bxy + cy^2 = 0\) to the coordinates \([x: 1: 0]\) to get a univariate polynomial:

\\(ax^2 + bx + c = 0\\)

How do we solve for the number of roots the polynomial \(ax^2 + bx + c = 0\) has over \(\mathbb{R}\)?

We assume that the polynomial factors — then, we solve for the roots in terms of the coefficients of the polynomial.

\\(\begin{aligned}
ax^2 + bx + c &= a(x - \beta)(x - \gamma) \\
x^2 + \frac{b}{a}x + \frac{c}{a} &= (x - \beta)(x - \gamma) \\
&= x^2 - (\beta + \gamma)x + \beta \gamma \\
\frac{b}{a}x + \frac{c}{a} &= -(\beta + \gamma)x + \beta \gamma
\end{aligned}\\)

\\(\begin{aligned}
-\frac{b}{a} &= \beta + \gamma \\
\frac{c}{a} &= \beta\gamma
\end{aligned}\\)

Note that we have both the sum \(\beta + \gamma\) and the product \(\beta\gamma\) of two numbers. How do we find the individual numbers?

We find the square of their difference! As can be seen pictorially:

[Figure: screenshot, source: Laurens Gunnarsen]

We proceed undaunted to solve for \(\gamma\) and \(\beta\) in terms of the coefficients \(a, b, c\). Apologies, it’s a bit messy.

\\(\begin{aligned}
(\gamma - \beta)^2 &= (\gamma + \beta)^2 - 4\beta \gamma \\
&= \Big(\frac{b}{a}\Big)^2 - 4\frac{c}{a} \\
\gamma - \beta &= \Big(\Big(\frac{b}{a}\Big)^2 - 4 \frac{c}{a}\Big)^{1/2}
\end{aligned}\\)

\\((\gamma + \beta) + (\gamma - \beta) = 2\gamma\\)

\\(-\frac{b}{a} +\Big(\Big(\frac{b}{a}\Big)^2 - 4 \frac{c}{a}\Big)^{1/2} = 2\gamma\\)

Thus, \(\gamma = \frac{-\frac{b}{a} + ((\frac{b}{a})^2 - 4 \frac{c}{a})^{1/2}}{2}\), and \(\beta = \frac{-\frac{b}{a} - ((\frac{b}{a})^2 - 4 \frac{c}{a})^{1/2}}{2}\).
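The same steps can be sketched in Python. The function names are mine, and `cmath.sqrt` is used so that the sketch also returns something when the roots are not real. It recovers \(\gamma\) and \(\beta\) from their sum and product by way of the difference.

```python
import cmath


def roots_from_sum_and_product(s, p):
    """Recover (gamma, beta) from s = gamma + beta and p = gamma * beta."""
    diff = cmath.sqrt(s * s - 4 * p)  # gamma - beta, taking the principal square root
    return ((s + diff) / 2, (s - diff) / 2)


def quadratic_roots(a, b, c):
    """Roots of ax^2 + bx + c, using sum = -b/a and product = c/a."""
    return roots_from_sum_and_product(-b / a, c / a)


gamma, beta = quadratic_roots(1, -5, 6)  # x^2 - 5x + 6 = (x - 3)(x - 2)
print(gamma, beta)
```

Picking the principal square root is exactly the arbitrary choice of which of \(\gamma - \beta\) and \(\beta - \gamma\) gets the positive root.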

Note that we made the arbitrary choice for \(\beta - \gamma\) to be the negative root, and \(\gamma - \beta\) to be the positive root. Until then, everything is symmetric.

Usually, the square \((\gamma - \beta)^2 = (\frac{b}{a})^2 - 4 \frac{c}{a}\) is referred to as the discriminant. It’s either negative, zero, or positive, corresponding to zero, one, or two real solutions of the quadratic \(ax^2 + bxy + cy^2\).
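Putting the projective picture and the discriminant together, here is a minimal Python sketch of the classification. It assumes the conic is nondegenerate and that \(a, b, c\) are not all zero. It uses \(b^2 - 4ac\), which has the same sign as \((\frac{b}{a})^2 - 4\frac{c}{a}\) whenever \(a \neq 0\), and reports how many times the conic meets the line at infinity. The function name is hypothetical.

```python
def classify_conic(a, b, c):
    """Classify ax^2 + bxy + cy^2 + (terms involving z) = 0 by its points at infinity.

    Assumes a nondegenerate conic with (a, b, c) != (0, 0, 0).
    """
    disc = b * b - 4 * a * c
    if disc < 0:
        return ("ellipse", 0)
    if disc == 0:
        return ("parabola", 1)
    return ("hyperbola", 2)


print(classify_conic(1, 0, 1))  # circle x^2 + y^2 - z^2 = 0
print(classify_conic(1, 0, 0))  # parabola x^2 - yz = 0
print(classify_conic(0, 1, 0))  # hyperbola xy - z^2 = 0
```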

That this agrees, up to the factor \(-\frac{4}{a^2}\), with the determinant of the matrix associated to the quadratic form is no accident!

\\(\begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}\\)

It is a theorem that if \(a > 0\) and \(\det \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} = ac - \frac{b^2}{4} >0\), then the evaluation of \(ax^2 + bxy + cy^2\) at any point \((x,y) \neq (0,0)\) is always positive.

In other words:

The determinants of the upper-left \(k \times k\) minors are all positive \(\Rightarrow\) the quadratic form is positive definite.

In this case, we pick the standard flag, so the upper minors are \((a)\) and \(\begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}\). (We see that the degenerate conics correspond to the projective transformations which have determinant 0.)

Thanks to Laurens Gunnarsen for telling me about Desargues, Ronno Das and Nir Gadesh for helping me to understand the discriminant, and Inna Zakharevich for telling me to write this post.

A quick comparison of Lie algebras and formal group laws

This post assumes that you are familiar with the definition of Lie group/algebra, and that you are comfortable with the Lazard ring. Note: This is less of an expository post and more of an unfinished question.

Why care about formal group laws? Well, we want to study smooth algebraic groups, but Lie algebras fail us in characteristic p (for example, \(\frac{d}{dx}(x^p) = 0\)), so, rather than a tangent bundle, we take something closer to a jet bundle.

Lie algebras are to smooth groups over \(\mathbb{R}\) or \(\mathbb{C}\) as formal group laws are to smooth algebraic groups over any ring \(R\).

I want to apply this analogy! I want this deeply. I’m trying to puzzle out how to see if this analogy is deep or superficial. How deep does the rabbit hole go? Let’s look at an example.

Given the universal enveloping algebra of a Lie algebra \(\mathfrak{g}\), we might think of this as a deformation of the Symmetric algebra:

\\(U_{\epsilon}(\mathfrak{g}) := T(\mathfrak{g})/(x \otimes y - y \otimes x - \epsilon[x, y])\\)

\\(\mathbb{C}[\mathfrak{g}^*] \simeq Symm[\mathfrak{g}] := T(\mathfrak{g})/(x \otimes y - y \otimes x)\\)

Acting on all of this is the adjoint action, that is, the action in which an element \(g\) of a Lie group \(G\) sends \(X \mapsto gXg^{-1}\). Orbits of the induced (coadjoint) action stratify the dual Lie algebra, and there is a symplectic form that lives on each orbit.

I want to think of the adjoint action as directly analogous to the compositional conjugation action on the spectrum of the Lazard ring (over a ring \(R\)).

This action takes an invertible power series \(u\) and applies it in the following manner to a formal group law \(F\) over \(R\).

\\(F(x, y) \mapsto u^{-1}(F(u(x), u(y)))\\)
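To see this action do something, here is a Python sketch over \(\mathbb{Q}\), with everything truncated at total degree `N`. The choices are mine: \(F\) is the multiplicative formal group law \(x + y + xy\) and \(u(t) = e^t - 1\), so \(u^{-1}(t) = \log(1+t)\). Conjugating by \(u\) should give back the additive law \(x + y\). Note that this \(u\) has denominators, so this particular identification is a characteristic 0 phenomenon; it is not available in characteristic \(p\), which is where height becomes interesting.

```python
from fractions import Fraction
from math import factorial

N = 5  # truncation: only terms of total degree <= N are kept


def add(p, q):
    r = dict(p)
    for k, c in q.items():
        r[k] = r.get(k, 0) + c
    return {k: c for k, c in r.items() if c != 0}


def mul(p, q):
    """Multiply bivariate series {(i, j): coeff}, dropping total degree > N."""
    r = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            if i1 + j1 + i2 + j2 <= N:
                key = (i1 + i2, j1 + j2)
                r[key] = r.get(key, 0) + c1 * c2
    return {k: c for k, c in r.items() if c != 0}


def compose(coeffs, p):
    """Substitute p (no constant term) into the series sum_k coeffs[k] * t^k."""
    result, power = {}, {(0, 0): Fraction(1)}
    for c in coeffs:
        result = add(result, {k: c * v for k, v in power.items()})
        power = mul(power, p)
    return result


# u(t) = e^t - 1 and its compositional inverse log(1 + t), truncated at degree N.
u = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, N + 1)]
u_inv = [Fraction(0)] + [Fraction((-1) ** (k + 1), k) for k in range(1, N + 1)]

x = {(1, 0): Fraction(1)}
y = {(0, 1): Fraction(1)}


def F_mult(s, t):
    """Multiplicative formal group law F(s, t) = s + t + st."""
    return add(add(s, t), mul(s, t))


conjugated = compose(u_inv, F_mult(compose(u, x), compose(u, y)))
print(conjugated)
```

Up to the truncation degree, every mixed term cancels and only \(x + y\) survives.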

This is an isomorphism of formal group laws, and over a separably closed field of characteristic \(p\) any two formal group laws of the same height are isomorphic in this way. Thus, the action of the invertible power series on the Lazard ring stratifies by height of the formal group law. But as far as I know, each of these geometric points carries no analogue of a symplectic form on each orbit. Do they? How would I find them? What could possibly impose a symplectic structure on the collection of formal group laws of a given height?

What is our universal enveloping algebra? Perhaps, it is the contravariant bialgebra of the formal group law? Perhaps, this analogy is not as deep as I thought.