Theorem Loader -- Overview



Purpose:


The two main problems which the Theorem Loader solves are 1) providing a way for external systems using the new "mmj2 Service" feature to load theorems into Metamath formatted files via mmj2; and 2) providing a way for users of the mmj2 Proof Assistant GUI to store new theorems and new proofs of existing theorems into the Logical System loaded in memory -- and into the set of .mm data which is loaded by mmj2 at start-up (the main input .mm files, such as set.mm are, however, not updated directly by mmj2.)




Theorem Loader Overview Diagram:

The Theorem Loader Overview diagram below shows the main high-level interfaces of the Theorem Loader. The diagram is generally accurate in showing the mmj2 classes interacting with Theorem loader, but only at a high level. For example, mmj.util.BatchFramework.java actually calls another object -- an instance of  mmj.util.LogicalSystemBoss.java -- which then calls the Theorem Loader. The rectangles on the diagram represent input/output data stores (files) and the arrows to/from these rectangles show the I/O accesses. The arrows between the objects which do not have borders, such as "BatchMMJ2" and "BatchFramework" represent high-level objects using function calls, where the arrow points to the object being called.

Theorem Loader Overview Diagram






What It Does (Overview):


The Theorem Loader feature stops several steps short of being a full Metamath data exchange facility for external systems and users wishing to load data into or get data out of Metamath via mmj2. Here are the main functions of the new Theorem Loader feature:

Component Descriptions:

Note: the names of these things match the overview diagram above.

I. Data Stores

A. Command Line Parms

The Command Line Parms tell mmj2 the filename of the RunParms.txt file, and optionionally, the RunParm file's "charset" name, quoting character and delimiter character. They are documented in mmj2\src\mmj\util\BatchMMJ2.java and an example is provided in mmj2\mmj2jar\mmj2.bat. When the mmj2 Service feature is used in "Caller mode", the calling program calls BatchMMJ2 passing the Command Line Parms as if they had been passed on the "java" command line as arguments (e.g. "String[] arg").

B. .mm file(s)

This is the primary input Metamath .mm format file -- or files -- input to mmj2 during a single run. Most often a single .mm file such as set.mm is input to mmj2, and this is specified with the RunParms.txt command "LoadFile,c:\metamath\set.mm" line (see mmj2\mmj2jar\RunParms.txt and metamath\set.mm). It is possible to use two or more LoadFile run parms sequentially to load more than one .mm file consecutively and as if they were strung together into one large file.

C. .mmt folder

The .mmt folder  (or folders, since the directory in use can be changed during the course of processing) contains theorems in Metamath .mm format but written using file-type ".mmt" (to avoid accidents involving real Metamath files such as set.mm.). These theorems can be loaded, upon command, into the Logical System currently loaded into memory, and theorems from the Logical System and/or the mmj2 Proof Assistant can be written to the .mmt folder. Thus, the .mmt folder can be used as a repository for sets of theorems which in development, prior to their ultimate -- manual -- entry into the main .mm file.

In normal use the theorems in the .mmt folder are written to the folder upon user command by mmj2's Theorem Loader, which is called either by Proof Assistant, or via the new RunParm "ExtractTheoremToMMTFolder". Therefore, the following processing rules pertaining to loading of theorems from the .mmt folder will be of minor concern to the users -- except that periodically when the user manually updates the main input .mm file with new theorems and new proofs, the contents of the .mmt folder must be manually cleaned out (and archived or backed up) -- or if .mmt files are created manually:

Theorem Loader Processing Rules for loading theorems from the .mmt folder:

Note: Most of the rules below mirror what the mmj2 Proof Assistant GUI can do, so don't be overly concerned by the number and complexity of these rules! In most cases, the mmj2 Proof Assistant GUI user won't need to worry about the batch procesing rules because the GUI handles writing theorems to the .mmt folder.


D. RunParms.txt

The RunParms.txt file contains a set of executable commands and option values for mmj2. These are read by BatchFramework in order to customize mmj2 for a particular .mm file's characteristics and to control processing flow in mmj2 for a single run. Most RunParms are optional and have default settings which are hardcoded into mmj2. Documentation for the mmj2 RunParm "language" may be found here: mmj2\mmj2jar\AnnotatedRunParms.txt and the main version of the RunParms.txt file which is part of every mmj2 release is here: mmj2\mmj2jar\RunParms.txt.

Here are the new RunParms for the Theorem Loader feature:

1. "TheoremLoaderMMTFolder,DirectoryPathname" where:



2. "TheoremLoaderDjVarsOption,DjVarsOption" where:
NOTE: In Proof Assistant this option's correct setting will depend on what the user is actually doing and by the current setting of the "ProofAsstDjVarsSoftErrors" option (see mmj2\mmj2jar\AnnotatedRunParms.txt ). The simplest choice is "NoUpdate" because a subsequent VerifyProof will report any missing $d statements. However, the recommended setting is "Replace" if your ProofAsstDjVarsSoftError option is "GenerateNew" (which always create a completely new set of $d statements even if there are no $d errors.) Here is a table showing the recommended choice combinations:

ProofAsstDjVarsSoftErrors
TheoremLoaderDjVarsOption
Notes                            
GenerateNew
Replace
recommended settings
GenerateReplacements
NoUpdate
GenerateReplacements generates $d statements in the Proof Worksheet only if soft $d errors are found, so using "Replace" would wipe out existing $d statements when the .mm file is subsequently reloaded.
GenerateDifferences
Merge
these are "ok" too
Report
NoUpdate

Ignore
NoUpdate





3
. "TheoremLoaderAuditMessages,GenerateAuditMessagesOption" where:



4. "LoadTheoremsFromMMTFolder,LoadSpecification" where:
NOTE: This RunParm is provided for low-volume use, something on the order of 100  .mmt files.

Example: "LoadTheoremsFromMMTFolder,*"

Processing/Rules:

Note: the equivalent of the LoadTheoremsFromMMTFolder command will be added to the mmj2 Proof Assistant GUI as a menu item, and to the ProofAssistant.java public call interface (for use by the mmj2 Service feature.) For mmj2 Proof Assistant GUI users this means that new .mmt format files can be created -- manually, if desired -- in the .mmt folder, and these theorems can be dynamically loaded into the Logical System without exiting the mmj2 Proof Assistant GUI. This could be a very useful feature when working with lemmas (note that if the Proof Worksheet "LOC_AFTER" is used with a new theorem and a lemma is dynamically loaded from the .mmt folder, and the lemma has a "?" in its proof (meaning that the lemma is loaded at the logical end of the Logical System), then the user will need to erase the LOC_AFTER label to avoid a "forward reference" in the proof.


5. "ExtractTheoremToMMTFolder,TheoremLabel" where:
The ExtractTheoremToMMTFolder RunParm is provided for testing purposes, but could be used by anyone for other reasons.



6. "UnifyAndStoreInLogSysAndMMTFolder,ProofWorksheetFileName



7. "UnifyAndStoreInMMTFolder,ProofWorksheetFileName



II. Processes

A. mmj2 Service "Caller"

The new mmj2 Service feature can be used in "Caller" or "Callee" mode.

Here is how "Caller" mode works:
"Callee" mode is similar except that the RunParms.txt file contains a RunParm naming the user's class which implements the SvcCallback interface. Then when BatchFramework encounters the "SvcCall" RunParm, it creates an instance of that class using a default constructor and calls the "go()" method in that object.

An example of Caller and Callee mode programs is provided at the end of this document.

B. BatchMMJ2

mmj.util.BatchMMJ2.java is the main entrypoint for mmj2. It checks to see that the Java version is adequately modern and passes the input Command Line Parms to BatchFramework. If the BatchFramework return code is non-zero, BatchMMJ2 executes System.exit() passing the non-zero return code back to the command line environment.

C. BatchFramework

mmj.util.BatchFramework.java is responsible for reading the RunParms.txt file and passing the input RunParms to various "Boss" objects which actually perform the requested work. BatchFramework also catches un-handled exceptions and performs any cleanup or reporting prior to exiting.

D. SvcCallback

mmj.svc.SvcCallback is an interface which is used by mmj2 to pass control to a user program operating either in "Caller" or "Callee" mode. The SvcCallback.go() method's arguments are passed by mmj2 and provide access to the major mmj2 objects, such as the ProofAsst and LogicalSystem. Using these objects a user program can, effectively, use mmj2 as a (single-threaded) subroutine. (Note that the mmj2 Proof Assistant can be used without initiating the mmj2 Proof Assistant GUI.)

E. LogicalSystem

mmj.lang.LogicalSystem.java contains the input .mm file's symbol table (symTbl) and statement table (stmtTbl), as well as a few other necessaries :-) Other objects are needed by the mmj2 Service feature to fully utilize mmj2 as a subroutine, but LogicalSystem is the main repository of data from the input Metamath .mm file.

1) A new object, "SeqAssigner"  is present in LogicalSystem. It replaces the existing rudimentary function, LogicalSystem.nextSeq(), which simply increments the MObj counter and multiplies by LangConstants.MOBJ_SEQ_NBR_INCREMENT to determine the next MObj.seq. The SeqAssigner handles sequence number assignment for objects appended to the end of the LogicalSystem as well as inserts located in the middle of LogicalSystem. A key feature of SeqAssigner is that it has updateInit(), commit() and rollback() functions so that a failed partial mass-update by the Theorem Loader can be "undone".

The SeqAssigner takes advantage of the fact that MObj.seq numbers are assigned from 100 by 100 using a Java "int" field -- meaning that approximately 2 million Metamath objects can be input before overflowing the system's capacity. In between two objects, say, "100" and "200", there are 99 unused sequence numbers. What SeqAssigner does is keep track of the MObj's which have been inserted in the middle of the LogicalSystem in the "gaps" -- and it keeps track of the highest sequence number assigned so far to appended MObj's. MObj's added at the end of LogicalSystem are not considered "inserted" but "appended", as they do not require the special "gap" insertion processing and nearly unlimited numbers of them can be added to the LogicalSystem.

The SeqAssigner's implementation goals are to a) use CPU and memory resources efficiently; b) provide relatively rapid service for individual sequence number assignments, and c) to consume resources on inserts proportionately to the number of inserts (meaning that if the user doesn't use the Theorem Loader then minimal resources are utilized. Here is how it works.

The SeqAssigner maintains a HashMap whose Key is existing MObj.seq number modulo 100 (the increment interval), and whose Value is a reference to a BitMap of length 100 (one hundred bits) wherein a "1" (one) bit signifies that the associated MObj.seq number object has been inserted into a "gap". Thus, the HashMap is empty until at least one insert has been performed successfully. Insertion into a gap requires obtaining the Value object for the gap's Key (or creating a new Value object), and then scanning through the BitMap looking for the first available open spot (after bit number 0 which is always open since it represents a MObj that was loaded during the initial load.) In the event that the gap BitMap is full, SeqAssigner assigns the next seq number at the very end of the Logical System. Note that the calling function needs to ascertain this "overflow"situation because it affects the MObj.seq of every other theorem which uses the inserted theorem in a proof, and may even result in a severe "forward" reference error in an existing theorem's proof if the proof has been modified to use the new theorem being appended because of a gap "overflow" situation.

2) Another new object in LogicalSystem is "TheoremLoadListeners", which is a list of objects which must be informed when a set of theorems is commit()'ed by TheoremLoader. There are several of these, besides the obvious candidates, LogicalSystem.stmtTbl and LogicalSystem.seqAssigner. During previous enhancements to mmj2 certain lists of logical assertions (which were sorted in a specific sequence) were added to mmj2 -- these are always created during initialization of mmj.lang.ProofAsst. They are mmj.pa.ProofUnifier.unifySearchList and StepSelectorSearch.assrtArray. In addition, the new Book Manager feature, added in this release, keeps track of MObj's in the LogicalSystem and assigns them its own section and sequence numbers. So, the following changes will be made for the TheoremLoader enhancement to accomodate objects outside of LogicalSystem which keep track of theorems:
3) Note: mmj.pa.ProofAsst.getSortedTheoremIterator() contains redundant code -- probably a leftover from the initial coding, and exercised only via the batch testing RunParm commands -- which builds its own sorted list of assertions. Since this redundant list is identical to unifySearchList, getSortedTheoremIterator() will be modified to contain just the following statement:

    
"return proofUnifier.getUnifySearchListByMObjSeq().iterator();".

4)
Another "major" change to LogicalSystem for the TheoremLoader enhancement is a new LogicalSystem method (aka "function") called "TheoremLoaderMassUpdate" which is responsible for performing all inserts of new theorems, and updates of existing proofs and $d statements for existing theorems, as specified in its input parameter, "theoremStmtGroupList", which designates a rigorously pre-validated pre-validated list of  TheoremStmtGroup objects sorted by MObj.seq.

Each TheoremStmtGroup object contains the data corresponding to exactly one theorem's input .mmt file from the TheoremLoader .mmt Folder (see above).  Not only will TheoremStmtGroup objects be rigorously validated prior to any attempt to update the LogicalSystem, but the MObj.seq numbers for each new Theorem and LogHyp are "reserved"
during validation so that the final "mass update" process can proceed with near certainty of completion.

F. ProofAsst

mmj.pa.ProofAsst.java and friends handle the actual "proof assistanting" work in mmj2. It is a little-known fact that the mmj2 Proof Assistant GUI doesn't actually know anything about proofs -- it just passes user requests for work to ProofAsst.java and is itself only responsible for blindly moving things around on the screen. The ProofAsst "public" interface is available for use by the mmj2 Service, including functions which involve the new Theorem Loader.

Here are the new features in the mmj2 Proof Assistant GUI, to be added into a new main menu item: Theorem Loader.
Note: The "LOC_AFTER" field in the Proof Worksheet Header line specifies the location of a new theorem being proven using the Proof Assistant GUI. However, the LOC_AFTER label specification applies only to the Proof Assistant GUI's validation of proof step Ref labels and formulas; it does not control the location of new theorems loaded by the Theorem Loader. When a new theorem is stored after unification in the Proof Assistant via Unify + Store in MMT Folder or Unify + Store in LogSys and MM Folder the LOC_AFTER specification has no effect upon where the theorem is stored when the Logical System is updated -- that location is controlled by the contents of its proof.

G. TheoremLoader

Theorem Loader is new, and hasn't been coded yet :-) Basically, it does everything that the prior sections of this document say it must be able to do (it will have helpers of its own...the coding implementation objective is to hide as much complicated new code in TheoremLoader.java and its helpers as possible.



Sample Program: mmj2 Service in "Callee" mode:

import java.io.*;
import java.util.*;
import java.io.File;
import java.util.Map;
import mmj.lang.*;
import mmj.pa.*;
import mmj.util.*;
import mmj.verify.*;
import mmj.svc.*;

/**
 *  Test mmj2 SvcCallback as "callee"
 */
public class TSvcCallbackCallee implements SvcCallback {
    public TSvcCallbackCallee() {
    }

    public  void go(Messages                messages,
                    OutputBoss              outputBoss,
                    LogicalSystem           logicalSystem,
                    VerifyProofs            verifyProofs,
                    Grammar                 grammar,
                    WorkVarManager          workVarManager,
                    ProofAsstPreferences    proofAsstPreferences,
                    ProofAsst               proofAsst,
                    File                    svcFolder,
                    Map                     svcArgs) {

        System.out.println(
            "Hello world, I am TSvcCallbackCallee.java");
    }
}
 



Sample Program: mmj2 Service in "Caller" mode:

import java.io.*;
import java.util.*;
import java.io.File;
import java.util.Map;
import mmj.lang.*;
import mmj.pa.*;
import mmj.util.*;
import mmj.verify.*;
import mmj.svc.*;

/**
 *  Test mmj2 SvcCallback as "callee"
 */
public class TSvcCallbackCaller implements SvcCallback {
    public TSvcCallbackCaller() {
    }

    public  void go(Messages                messages,
                    OutputBoss              outputBoss,
                    LogicalSystem           logicalSystem,
                    VerifyProofs            verifyProofs,
                    Grammar                 grammar,
                    WorkVarManager          workVarManager,
                    ProofAsstPreferences    proofAsstPreferences,
                    ProofAsst               proofAsst,
                    File                    svcFolder,
                    Map                     svcArgs) {

        System.out.println(
            "Hello world, I am TSvcCallbackCaller.java");
    }

    /**
     *  Main function interfacing to Java environment,
     *  running the BatchMMJ2 functions.
     *
     * @param args see class description.
     */
    public static void main(String[] args) {

        TSvcCallbackCaller tSvcCallbackCaller
                                  =
            new TSvcCallbackCaller();

        BatchMMJ2 batchMMJ2       = new BatchMMJ2();

        int returnCode            =
            batchMMJ2.generateSvcCallback(args,
                                          tSvcCallbackCaller);
        if (returnCode != 0) {
            System.exit(returnCode);
        }
    }
}
 



New mmj2 Service interface "mmj.svc.SvcCallback.java":

//********************************************************************/
//* Copyright (C) 2008                                               */
//* MEL O'CAT  mmj2 (via) planetmath (dot) org                       */
//* License terms: GNU General Public License Version 2              */
//*                or any later version                              */
//********************************************************************/
//*4567890123456 (71-character line to adjust editor window) 23456789*/

/*
 *  SvcCallback.java  0.01 08/01/2008
 *
 *  Version 0.01:
 *      --> new.
 */

package mmj.svc;
import  java.io.File;
import  java.util.Map;
import  mmj.lang.*;
import  mmj.pa.*;
import  mmj.util.*;
import  mmj.verify.*;


/**
 *  Interface for using mmj2 as a service.
 *  <p>
 */
public interface SvcCallback {

    /**
     *  Method go is called by mmj2 after initialization
     *  and a RunParm command is encountered triggering
     *  a SvcCallback.
     *  <p>
     *  Note: if a "fatal" or severe error is encountered
     *        just trigger an IllegalArgumentException
     *        to instantly terminate the mmj2 process.
     *  <p>
     * @param messages              mmj.lang.Messages object.
     * @param outputBoss            mmj.util.OutputBoss
     * @param logicalSystem         mmj.lang.LogicalSystem
     * @param verifyProofs          mmj.verify.VerifyProofs
     * @param grammar               mmj.verify.Grammar
     * @param workVarManager        mmj.lang.WorkVarManager
     * @param proofAsstPreferences  mmj.pa.ProofAsstPreferences
     * @param proofAsst             mmj.pa.ProofAsst
     * @param svcFolder             home Folder for use by Svc
     *                              (specified via RunParm).
     * @param svcArgs               Map of Svc key/value pair arguments
     *                              (specified via RunParm).
     */
    void go(Messages                messages,
            OutputBoss              outputBoss,
            LogicalSystem           logicalSystem,
            VerifyProofs            verifyProofs,
            Grammar                 grammar,
            WorkVarManager          workVarManager,
            ProofAsstPreferences    proofAsstPreferences,
            ProofAsst               proofAsst,
            File                    svcFolder,
            Map                     svcArgs
            );
}