Hello. I can type now!
Talk of AlphaGeometry seems to have picked up recently, mostly because of the AlphaProof + AlphaGeometry 2 announcement. I'd like to keep this article short and sweet and simply share my thoughts on the architecture.
I would've written on AlphaProof, but it seems it's closed source. Interesting and most peculiar.
As for AlphaGeometry 2, its language model seems to be a base-setting Transformer with ReLU activation and sequence packing, trained with a cosine decay schedule. On Kautz's taxonomy the approach is Neuro: Symbolic -> Neuro: symbolic knowledge compiled into the network by simply serializing each synthetic theorem into a string containing its premises and its proof.

The theorems themselves come from random premises sampled into directed acyclic graphs of deductions, topologically sorted, and then inserted into a deductive database as definite Horn clauses. There is a real benefit to keeping the clauses definite in an environment like this one: forward chaining over them is tractable and runs to a fixed point. This is the DD part of the DD+AR solver, and the approach is taken from Chou et al., "A Deductive Database Approach to Automated Geometry Theorem Proving and Discovering". I would have gone into more detail about its origins, but the source sits behind a Springer Link paywall.

Apart from the deductive database, AR seems to simply be a tool for algebraic deductions over distances, angles, and ratios, carried out through a Gaussian elimination process vaguely reminiscent of Gram-Schmidt.

The language model only enters during proof search, where its proposals are ranked by beam search; that is more than enough, and plain best-first search would have been adequate. A truck dump of random ansätze lazily sorted alongside a Transformer is not a practical design, even on a Google TPU, and to be honest I was expecting it to be at least slightly variational (it ended up being latent, just in another way). A proper topological-space-informed deterministic model, or a derived Theory-Trained Network, could have done the job without 50 million premade randomized ansätze; the state of PINNs, segmentation PointNet, and mesh-free topological analysis shows that metric space analysis is possible without shoving NLP where it doesn't need to be shoved. Everyone wish good luck to Demis, who doesn't seem to understand hybrid modeling very well.

I'd also like to apologize for the lack of diagrams and of a deeper dissection (which I do plan to write... someday... once I'm done with IdentityV RLHF... and school), as the only driving force behind this post was vitriol against the sensationalist half of Google DeepMind.
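Since I just apologized for the lack of diagrams, here is the next best thing: a toy sketch of what forward chaining over definite Horn clauses looks like. The predicates and the transitivity rule below are mine, made up for illustration, not AlphaGeometry's actual rule set (that lives in the Chou et al. database), but the saturation loop is the general shape of the DD half.

```python
from itertools import product

# Toy forward chaining over definite Horn clauses.
# Facts are tuples like ("cong", "O", "A", "O", "B"), read as OA = OB.
facts = {
    ("cong", "O", "A", "O", "B"),   # OA = OB
    ("cong", "O", "B", "O", "C"),   # OB = OC
}

# A definite Horn clause: a conjunctive body implying a single head.
# Hypothetical toy rule: congruence is transitive.
def cong_trans(db):
    new = set()
    for f1, f2 in product(db, db):
        if f1[0] == "cong" and f2[0] == "cong" and f1[3:5] == f2[1:3]:
            head = ("cong", f1[1], f1[2], f2[3], f2[4])
            if head not in db:
                new.add(head)
    return new

rules = [cong_trans]

# Saturate: apply every rule until no new facts appear (a fixed point).
changed = True
while changed:
    changed = False
    for rule in rules:
        derived = rule(facts)
        if derived:
            facts |= derived
            changed = True

print(("cong", "O", "A", "O", "C") in facts)  # True: OA = OC was derived
```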
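The AR half, as far as I can tell, is the same idea applied to algebraic facts: angle, ratio, and distance relations become rows of a linear system, and Gaussian elimination decides whether a target relation is already implied. Here's a toy version with made-up variable names and angle values; the known rows are assumed to already be in echelon form, which a real implementation would establish first.

```python
from fractions import Fraction

# Toy AR step: known linear relations as rows over [x_AB, x_CD, x_EF | constant].
rows = [
    [Fraction(1), Fraction(-1), Fraction(0), Fraction(30)],  # x_AB - x_CD = 30
    [Fraction(0), Fraction(1), Fraction(-1), Fraction(60)],  # x_CD - x_EF = 60
]

# Candidate deduction to check: x_AB - x_EF = 90 ?
target = [Fraction(1), Fraction(0), Fraction(-1), Fraction(90)]

def reduce_against(rows, cand):
    """Eliminate the candidate's coefficients using the known rows."""
    cand = cand[:]
    for row in rows:
        pivot = next((j for j, c in enumerate(row[:-1]) if c != 0), None)
        if pivot is not None and cand[pivot] != 0:
            factor = cand[pivot] / row[pivot]
            cand = [c - factor * r for c, r in zip(cand, row)]
    return cand

residual = reduce_against(rows, target)
# If every coefficient and the constant cancel, the relation is implied.
print(all(c == 0 for c in residual))  # True
```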
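And the part I was complaining about, sketched: the outer loop alternates symbolic saturation with language-model proposals for auxiliary constructions, pruned by beam search. Everything below is a stand-in of my own invention (the stubs, the scores, the construction names), just to show where the Transformer actually sits in the pipeline, not DeepMind's implementation.

```python
import heapq

def symbolic_closure(state):
    """Stub: pretend DD+AR closes the goal once the right auxiliary point exists."""
    return "construct_midpoint" in state

def propose_constructions(state):
    """Stub: the language model would emit candidate constructions with log-probs."""
    return [("construct_midpoint", -0.1), ("construct_foot", -0.7), ("construct_circle", -1.5)]

def prove(initial_state, beam_width=2, max_depth=3):
    beam = [(0.0, initial_state)]             # (cumulative log-prob, premises so far)
    for _ in range(max_depth):
        candidates = []
        for score, state in beam:
            if symbolic_closure(state):
                return state                   # proof found; return the premise set
            for construction, logp in propose_constructions(state):
                candidates.append((score + logp, state | {construction}))
        # Beam search: keep only the best `beam_width` extended states.
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return None

print(prove(frozenset({"premise_1", "premise_2"})))
```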