minimunger(1)           FreeBSD General Commands Manual          minimunger(1)

NAME
     MiniMunger – Language for writing text-processing filters

SYNOPSIS
     minimunger ⟨source-file⟩

DESCRIPTION
     MiniMunger (MM) is a compiler-to-C for a small variant of Munger, written
     in Munger, limited to writing filters. An interface to SQLite is
     provided.

     This manual page describes only the differences between Munger and MM.
     For more information, see the Munger manual page.

     Example programs and helper modules are installed in
     /usr/local/share/minimunger.

     grep.mm          is an egrep-like filter.
     fmt.mm           is a fmt-like filter.
     tables.mm        is a hash table demo and performance test.
     sqlite.mm        is a SQLite demo.
     options.mm       aids the processing of command-line arguments.
     stacks.mm        aids working with stacks.
     stringstack.mm   aids in concatenating stacks of strings.
     tsml2sqlite.mm   converts a TSML document into a SQLite database.
     tsmlquery.mm     provides functions to query a converted TSML document.
     tsmltest.mm      illustrates the use of tsml2sqlite.mm and tsmlquery.mm.
     y.mm             uses the y-combinator to calculate the factorial of 19.

     The example programs can be compiled in the source directory with:

           make tests

IMPLEMENTATION NOTES
     The MM compiler is a whole-program compiler, reading one input file and
     producing two output files of C code, which may then be compiled by the
     C compiler with the MM runtime to produce an executable. Instructions
     for using the compiler follow this section.

     •   MM does not support lists or any list-related functions. Programs
         are written as lists, but programs may not create lists.

     •   The first-class data types are: stacks, tables, closures,
         continuations, compiled regular expressions, strings, and fixnums.

     •   Global variables must be declared before they are recognized in
         subsequent code. If necessary, "declare" an initial dummy value as
         a forward reference and then subsequently "bind" the correct value.

     •   Side-effects are only permissible on globals.

     •   MM has no looping constructs. Iteration is performed by recursion.
         The compiler performs CPS conversion, which converts all calls to
         tail calls.

     •   First-class continuations are captured with the "call_cc" intrinsic,
         which behaves like "call/cc" does in Scheme.

     •   User-defined functions have fixed-size argument lists.

     •   All equality comparisons use "eq".

     •   User-defined macros are not supported in minimunger code, but you
         can hack the compiler to define your own compiler macros. See the
         function "make_initial_cte".

     •   The compiler does not detect parameter/argument count mismatches in
         the application of user-defined functions. Such mismatches crash
         the runtime.

     •   For maximum performance, the MM runtime DOES NOT PERFORM type-
         checking. If your code calls an intrinsic function with arguments
         of the wrong type, the runtime will crash.

     •   Although lambda-expressions can be bound to variables in "let" and
         "letn" forms, there is neither "labels" nor "letf" to allow the
         functions to see their own bindings. Any function that calls itself
         must have a toplevel binding.

COMPILING MM PROGRAMS
     MM depends upon the SQLite database library, which must be installed
     before you can compile MM programs. Invoke "pkg install sqlite3".

     The MM compiler is written in Munger, and compiles MM code to an
     intermediate language, defined as a set of macros in the source of the C
     runtime. Most of the macros expand to in-line code for speed, resulting
     in larger executables than might be expected from the size of the
     original MM programs.

     To compile an MM program, the compiler must be invoked on the main source
     file:

           % minimunger grep.mm

     The compiler performs source-to-source conversions before it begins to
     emit code, printing status messages as it does so. When it has finished,
     two files are created, one named "functions.c" and one named
     "functions.h". To create an executable from these files, invoke the C
     compiler on the MM runtime source, which includes the other two files.
     The command line below will be the same when building any program
     compiled by MM except for the argument to the -o option.

           % cc -o grep /usr/local/share/minimunger/runtime.c \
               -I./ -I/usr/local/share/minimunger -I/usr/local/include \
               -L/usr/local/lib -lsqlite3 -lcurses

     The main source file of a program may include other source files with the
     "include" directive. The "include" directive resembles its similarly-
     named C preprocessor counterpart, and consists of the word "include"
     preceded by an octothorpe (#) and followed by a double-quote delimited
     filename. For example:

           #include "options.mm"

     If the filename itself contains double quotes, they do not need to be
     escaped. Include directives must start in column zero to be recognized.
     Otherwise, they will be treated as comments. Included files themselves
     may also "include" other source files.

THE INTRINSICS
     The MM intrinsics bear strong resemblance to their similarly-named Munger
     counterparts. Some behave differently. This summary does not completely
     document the operation of the intrinsic functions, but merely lists which
     are available and how they differ from their Munger counterparts. For
     complete documentation of an intrinsic, see the Munger(1) manual page.

   Control Flow / Side-Effects
     The empty string and 0 are boolean false values. All other objects are
     considered boolean true values.

     The forms below function identically to their Munger counterparts, with
     the exception of the conditionals. Note that "bind" is the only means of
     accomplishing side-effects on variables, and that side-effects are only
     permissible upon globals.

     When "if" is invoked with only a "true" subsequent clause, and the test
     condition evaluates to a false value, 0 is returned, and not the value of
     the failed test condition. Similarly, if all test clauses of an
     invocation of "cond" fail, then 0 is returned, rather than the value of
     the last failed test condition. Both "when" and "unless" also return 0
     if their test conditions fail.
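
     The return-value rules above can be modelled outside MM. The sketch
     below is a hypothetical Python model, not MM code and not part of the
     MM runtime; the names mm_is_true, mm_if, and mm_cond are invented for
     illustration, and each "cond" clause is simplified to an already
     evaluated (test, value) pair.

```python
def mm_is_true(value) -> bool:
    """Model of MM truthiness: only the empty string and 0 are false."""
    if isinstance(value, str):
        return value != ""
    if isinstance(value, int):
        return value != 0
    return True  # stacks, tables, closures, etc. always count as true


def mm_if(test, then_value):
    """Model of (if test expr) with no else clause: a failed test yields 0."""
    return then_value if mm_is_true(test) else 0


def mm_cond(*clauses):
    """Model of cond over (test, value) pairs: 0 when every test fails."""
    for test, value in clauses:
        if mm_is_true(test):
            return value
    return 0


# The failed test value "" is not passed through; 0 is returned instead.
empty_test = mm_if("", "taken")
# Every test fails, so the result is 0 rather than the last test value.
no_match = mm_cond(("", "a"), (0, "b"))
# The first true test selects its clause.
first_match = mm_cond((0, "a"), ("x", "b"))
```

     In this model a caller cannot distinguish "the test was the empty
     string" from "the test was 0", which mirrors the rule that the value of
     the failed test condition is discarded.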

     If you want an expression to return a value rather than the result of
     evaluating an expression, you must use "eval". "eval" effectively does
     nothing.

     Form       Use
     declare    (declare symbol expr)
     bind       (bind symbol expr)
     if         (if test expr1 expr2 ...)
     cond       (cond (test_expr subsequent ...)+)
     when       (when test expr ...)
     unless     (unless test expr ...)
     progn      (progn expr ...)
     eq         (eq expr1 expr2)
     or         (or expr ...)
     and        (and expr ...)
     not        (not expr)
     let        (let ((symbol expr)+) expr+)
     letn       (letn ((symbol expr)+) expr+)
     exit       (exit expr)
     quit       (quit)
     die        (die ...)
     eval       (eval expr)

     call_cc is used to capture the current continuation. It functions
     exactly as call/cc does in Scheme:

     call_cc    (call_cc monadic_function)

   Regular Expressions
     Intrinsic     Use                             Return Value
     regcomp       (regcomp str)                   compiled rx
     match         (match rx str)                  0 or stack of 2 fixnums
     matches       (matches rx str)                stack of 20 strings
     substitute    (substitute rx rep str cnt)     string
     regexpp       (regexpp expr)                  0 or 1

   Tables
     Intrinsic     Use                             Return Value
     table         (table)                         new table
     tablep        (tablep expr)                   0 or 1
     hash          (hash table expr1 expr2)        table
     unhash        (unhash table expr1)            table
     items         (items table)                   number of items
     lookup        (lookup table expr)             associated expr
     keys          (keys table)                    stack of keys
     values        (values table)                  stack of values

   Stacks
     Note that the "unshift", "push", and "store" intrinsics all return the
     affected stack instead of their second arguments.

     Intrinsic       Use                           Return Value
     stack           (stack)                       new stack
     shift           (shift stack)                 item at index 0
     unshift         (unshift stack expr)          stack
     push            (push stack expr)             stack
     pop             (pop stack)                   item at top of stack
     index           (index stack expr)            item at index expr
     store           (store stack fixnum expr)     stack
     used            (used stack)                  stored item count
     sort_numbers    (sort_numbers stack)          stack (sorted in situ)
     sort_strings    (sort_strings stack)          stack (sorted in situ)
     stackp          (stackp expr)                 0 or 1

   Fixnums
     Each of these functions accepts only TWO arguments, unlike their Munger
     counterparts.

     Intrinsic    Use                  Return Value
     eq           (eq expr1 expr2)     0 or 1
     <            (< expr1 expr2)      0 or 1
     <=           (<= expr1 expr2)     0 or 1
     >            (> expr1 expr2)      0 or 1
     >=           (>= expr1 expr2)     0 or 1
     +            (+ expr1 expr2)      sum
     -            (- expr1 expr2)      difference
     *            (* expr1 expr2)      product
     %            (% expr1 expr2)      remainder
     /            (/ expr1 expr2)      quotient
     abs          (abs expr)           absolute value
     minnum       (minnum)             Lowest fixnum value
     maxnum       (maxnum)             Highest fixnum value

     Note that "stringify" accepts only one argument, which must evaluate to
     a fixnum.

     Intrinsic    Use                  Return Value
     stringify    (stringify expr)     string representation of expr
     numberp      (numberp expr)       0 or 1
     char         (char expr)          one-character string

   I/O
     These are the general I/O functions. Note that both "getline" and
     "readchars" return 0 upon encountering EOF, and the empty string on
     error. "flush" does what "flush_stdout" does in Munger.

     Intrinsic      Use                    Return Value
     print          (print expr ...)       1
     println        (println expr ...)     1
     flush          (flush)                fixnum
     die            (die ...)              does not return
     warn           (warn expr ...)        1
     getline        (getline)              string or 0
     readchars      (readchars expr)       string or 0
     file2string    (file2string expr)     string or 0

     These intrinsics redirect the standard descriptors onto files and
     processes. These functions return 1 upon success, or a string
     describing an error condition.

     Intrinsic                  Use
     pipe                       (pipe desc program)
     with_input_process         (with_input_process program expr ...)
     with_output_process        (with_output_process program expr ...)
     redirect                   (redirect desc file append)
     with_input_file            (with_input_file file expr ...)
     with_output_file           (with_output_file file expr ...)
     with_output_file_append    (with_output_file_append file expr ...)
     resume                     (resume desc)

   System-Related
     "random" returns a fixnum in the range of 0 to one less than its
     argument.

     The "time" intrinsic returns a string representation of the UNIX time
     value, padded with leading zeros to sixteen characters, so that time
     values may be compared with each other using "strcmp".
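
     The reason for the padding can be shown with a small model. The sketch
     below is hypothetical Python, not MM code; pad_time and strcmp are
     stand-ins invented for illustration, and the -1/0/1 results are an
     assumption of the model rather than the documented return values of the
     MM "strcmp" intrinsic.

```python
def pad_time(seconds: int, width: int = 16) -> str:
    """Return the decimal form of `seconds`, left-padded with zeros."""
    return str(seconds).zfill(width)


def strcmp(a: str, b: str) -> int:
    """Stand-in comparison: -1, 0, or 1 by character-wise string order."""
    return (a > b) - (a < b)


earlier = 999_999_999      # nine digits
later = 1_000_000_000      # ten digits

# Without padding, "9..." sorts after "1...", so the order is wrong.
unpadded = strcmp(str(earlier), str(later))

# With fixed-width padding, string order agrees with numeric order.
padded = strcmp(pad_time(earlier), pad_time(later))
```

     Because every padded value has the same width, comparing the strings
     character by character gives the same ordering as comparing the numbers
     they represent.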

     The "stat" intrinsic returns a five-element stack of strings: owner name
     or uid, group name or uid, time of last access, time of last
     modification, and size, with the time values formatted similarly to
     those returned by "time".

     The "date" intrinsic returns a textual representation of the current
     date and time.

     Intrinsic    Use                 Return Value
     basename     (basename path)     string
     dirname      (dirname path)      string
     rootname     (rootname path)     string
     suffix       (suffix path)       string
     directory    (directory expr)    stack of filenames
     rename       (rename from to)    0 or error string
     remove       (remove expr)       0 or error string
     rmdir        (rmdir expr)        0 or error string
     stat         (stat expr)         stack or error string
     getenv       (getenv string)     string or 0
     random       (random expr)       fixnum
     time         (time)              fixnum
     date         (date)              string

   Command-Line Args
     These function identically to their Munger counterparts.

     Intrinsic    Use           Return Value
     next         (next)        0 or string
     previous     (previous)    0 or string
     current      (current)     string
     rewind       (rewind)      string

   Strings
     The ability of the "split" intrinsic in Munger to explode a string into
     a list of one-character strings is not present in the MM "split". The
     "explode" intrinsic does this. A version of "join" that works on stacks
     of strings is in "stringstack.mm".

     Intrinsic      Use                                Return Value
     chop           (chop expr)                        string
     chomp          (chomp expr)                       string
     length         (length expr)                      fixnum
     digitize       (digitize expr)                    fixnum
     code           (code expr)                        fixnum
     explode        (explode expr)                     stack of strings
     stringp        (stringp expr)                     0 or 1
     join           (join delim expr ...)              string
     split          (split delims string [limit])      stack of strings
     concat         (concat expr1 expr2 ...)           string
     substring      (substring string expr1 expr2)     string
     strcmp         (strcmp expr1 expr2)               fixnum
     expand_tabs    (expand_tabs expr1 string)         string

   SQLite
     These functions provide the interface to the SQLite library. Only one
     database file may be open at any one time. The database handle is
     managed internally by the runtime engine. Column data is returned as a
     stack of strings.

     Intrinsic          Use                            Return Value
     sqlite_open        (sqlite_open expr)             error string or 1
     sqlite_close       (sqlite_close)                 0 or 1
     sqlite_exec        (sqlite_exec expr)             stack or string
     sqlite_prepare     (sqlite_prepare expr)          sql object or string
     sqlp               (sqlp expr)                    0 or 1
     sqlite_bind        (sqlite_bind expr expr expr)   1 or string
     sqlite_step        (sqlite_step expr)             0, 1 or string
     sqlite_row         (sqlite_row expr)              stack or string
     sqlite_reset       (sqlite_reset expr)            1 or string
     sqlite_finalize    (sqlite_finalize expr)         1 or string

   Terminal Colors
     The foreground and background colors can be modified with the following
     functions.

     Intrinsic     Use             Return Value
     black         (black)         1
     white         (white)         1
     red           (red)           1
     green         (green)         1
     yellow        (yellow)        1
     blue          (blue)          1
     magenta       (magenta)       1
     cyan          (cyan)          1
     bg_black      (bg_black)      1
     bg_white      (bg_white)      1
     bg_red        (bg_red)        1
     bg_green      (bg_green)      1
     bg_yellow     (bg_yellow)     1
     bg_blue       (bg_blue)       1
     bg_magenta    (bg_magenta)    1
     bg_cyan       (bg_cyan)       1

AUTHORS
     James Bailie ⟨jimmy@mammothcheese.ca⟩
     http://www.mammothcheese.ca

Wed, Feb 25 2026                                                 minimunger(1)