Toka Documentation
Home
Hello, and welcome to Toka!
Toka is a concatenative language with roots in Forth. Don't expect your existing knowledge to apply directly to Toka. It allows a good deal of low-level control, while supporting some useful abstractions that make development easier. Some influences come from RetroForth, HelFORTH, 4p, Factor, and Smalltalk.
If you can't find something, try the index. A more structured set of topics can be found in First Steps.
General Topics
- License
- Building and Installation
- First Steps, a short guide to the language
- The Words and Their Uses
Internals
Topics
- Information about the Ports
- How the garbage collector works
- How the parser works
- Notes on the threading model
- Using TokaDoc to comment C code
- ErrorCodes that can arise
- Limitations
Source Files
- bits.c
- class.c
- cmdline.c
- conditionals.c
- console.c
- data.c
- debug.c
- decompile.c
- dictionary.c
- errors.c
- ffi.c
- files.c
- gc.c
- initial.c
- interpret.c
- math.c
- parser.c
- quotes.c
- stack.c
- toka.c
- vm.c
Arrays
The standard bootstrap adds support for arrays. These are a superset of variables, and hold either character or numeric data.
The words provided are:
is-array       n"-   Create an array of size n. Parses for the name.
array.put      nia-  Put value (n) into array (a) at index (i)
array.get      ia-n  Get the value (n) from array (a) at index (i)
array.putChar  nia-  Put character value (n) into array (a) at index (i)
array.getChar  ia-n  Get the character value (n) from array (a) at index (i)
Example
10 cells is-array foo
0 foo array.get .
100 0 foo array.put
10 1 foo array.put
0 foo array.get .
1 foo array.get .
Tips
- The command line arguments are stored in an array (arglist).
  - Element 0 holds the name of the script.
  - Actual arguments start at index 1.
- Be careful not to exceed the length of your array when giving an index
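The last tip can be illustrated with a small C model. This is only a sketch, not Toka's implementation: the names cell_array, array_init, array_put and array_get are hypothetical, zero-filling new arrays is an assumption, and the explicit bounds check is an addition that shows what the caller has to guarantee when using the Toka words.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of an array of cells (not Toka's actual code). */
typedef struct {
    long  *cells;   /* storage, one long per cell */
    size_t length;  /* number of cells allocated  */
} cell_array;

/* Allocate n zero-filled cells (zero-fill is an assumption of this sketch).
   Returns 0 on success, -1 on failure. */
int array_init(cell_array *a, size_t n) {
    a->cells = calloc(n, sizeof(long));
    a->length = a->cells ? n : 0;
    return a->cells ? 0 : -1;
}

/* Store value at index; refuses to write outside the array. */
int array_put(cell_array *a, size_t index, long value) {
    if (index >= a->length) return -1;
    a->cells[index] = value;
    return 0;
}

/* Fetch the value at index into *out; refuses to read outside the array. */
int array_get(const cell_array *a, size_t index, long *out) {
    if (index >= a->length) return -1;
    *out = a->cells[index];
    return 0;
}
```

With a 10-cell array, indexes 0 through 9 succeed and index 10 is rejected, which is the case the tip warns about.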
bits.c
Use
This file provides a handful of primitives for manipulating bits.
Functions Provided
lshift()
  Shift NOS left by TOS bits
rshift()
  Shift NOS right by TOS bits
and()
  Perform a bitwise AND
or()
  Perform a bitwise OR
xor()
  Perform a bitwise XOR
Primitives Provided
<<   ( ab-c )  Shift 'a' left by 'b' bits
>>   ( ab-c )  Shift 'a' right by 'b' bits
and  ( ab-c )  Perform a bitwise AND
or   ( ab-c )  Perform a bitwise OR
xor  ( ab-c )  Perform a bitwise XOR
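The TOS/NOS wording can be made concrete with a tiny stack model in C. This is a sketch under assumptions rather than the contents of bits.c: the stack array, push(), pop() and the bit_* names are hypothetical, and overflow checks are left out for brevity. Only non-negative values are shifted here, since shifting negative signed values is not portable in C.

```c
#define STACK_DEPTH 100          /* default data stack depth listed under Limitations */

static long data_stack[STACK_DEPTH];
static int  depth = 0;           /* number of cells currently on the stack */

/* No overflow or underflow checks, to keep the sketch short. */
void push(long v) { data_stack[depth++] = v; }
long pop(void)    { return data_stack[--depth]; }

/* << ( ab-c ): shift NOS ('a') left by TOS ('b') bits */
void lshift(void)  { long b = pop(); long a = pop(); push(a << b); }

/* >> ( ab-c ): shift NOS ('a') right by TOS ('b') bits */
void rshift(void)  { long b = pop(); long a = pop(); push(a >> b); }

/* and, or, xor ( ab-c ): combine NOS and TOS bitwise */
void bit_and(void) { long b = pop(); long a = pop(); push(a & b); }
void bit_or(void)  { long b = pop(); long a = pop(); push(a | b); }
void bit_xor(void) { long b = pop(); long a = pop(); push(a ^ b); }
```

In each case TOS is popped first, so the value pushed earlier ('a') is the one being shifted or combined.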
Building
To build Toka, you will need the following:
- GCC 2.9x, 3.x, or 4.x
- Make (Either GNU Make or a BSD Make should work)
For most people, the following should work. If it fails, try checking the Ports page to see if there are any platform-specific instructions. The process using Make:
make
If you experience problems, check the PLATFORM line in the Makefile and add any OS-specific flags your system may need.
As a final note, some pieces of documentation can be updated automatically as well:
make docs
class.c
Use
This is where the WordClasses are implemented. It does not export any functions directly to Toka.
Functions Provided
word_class()
  If compiling, compile the xt into the current quote. If interpreting, call the word.
macro_class()
  Always call the word.
data_class()
  If compiling, compile a call to lit() and then inline TOS into the following location. Otherwise leave TOS alone.
quote_macro_class()
  Always invoke the quote.
quote_class()
  Handler for quotes; this takes two cells, one which is a call to this function, the other is the pointer to the quote to invoke.
quote_word_class()
  Perform data_class() semantics, then, if compiling, compile a call to invoke(). Otherwise, invoke() is called with the xt on TOS.
Primitives Provided
None
Related Words
.PRIM_WORD   ( -n )  Class # for primitive words
.PRIM_MACRO  ( -n )  Class # for primitive macros
.DATA        ( -n )  Class # for data elements
.WORD        ( -n )  Class # for quote words
.MACRO       ( -n )  Class # for quote macros
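The difference between word_class() and macro_class() can be sketched in C as follows. This is an illustration, not the code in class.c: the word_fn type, the current_quote buffer and the hello() example word are hypothetical, and passing the xt as a C argument is a simplification. The compiler variable uses the values documented for interpret.c (0 to interpret, -1 to compile).

```c
typedef void (*word_fn)(void);      /* hypothetical stand-in for an xt */

long compiler = 0;                  /* 0 = interpret, -1 = compile */

#define QUOTE_MAX 64                /* default quote length listed under Limitations */
word_fn current_quote[QUOTE_MAX];   /* the quote being compiled */
int     quote_len = 0;

/* word_class: compile the xt when compiling, call it when interpreting. */
void word_class(word_fn xt) {
    if (compiler != 0) {
        /* This sketch silently ignores overflow; Toka reports E7 instead. */
        if (quote_len < QUOTE_MAX) current_quote[quote_len++] = xt;
    } else {
        xt();
    }
}

/* macro_class: always call the word, whatever the compiler state. */
void macro_class(word_fn xt) { xt(); }

/* Example word used to observe the two behaviours. */
int hello_calls = 0;
void hello(void) { hello_calls++; }
```

While compiler is -1, word_class(hello) only records hello in the quote, whereas macro_class(hello) still runs it immediately.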
cmdline.c
Use
Build the list of command line arguments that Toka programs can access. This is a subset of the full arguments passed to the toka executable.
Functions Provided
Variables:
long arg_count
  Holds the number of command line arguments
char *arg_list[128]
  Holds the list of command line arguments.
num_args()
  Return the number of arguments, not including the file names used to launch this program.
get_arg_list()
  Return a pointer to the list of arguments.
build_arg_list(char *args[], long count)
  Copy pointers to the command line arguments to arg_list[]. Also sets arg_count.
Primitives Provided
#args    ( -n )  Return the number of arguments
arglist  ( -a )  Return a pointer to the argument list.
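A minimal C sketch of what build_arg_list() is described as doing is shown below. It follows the variable names and the signature listed above, but the body is a guess at the behaviour rather than the code in cmdline.c; the clamp to 128 entries is an addition made for safety in this sketch.

```c
#define MAX_ARGS 128

long  arg_count;              /* number of stored arguments */
char *arg_list[MAX_ARGS];     /* pointers to the arguments  */

/* Copy the argument pointers (not the strings) and record how many there are. */
void build_arg_list(char *args[], long count) {
    long i;
    if (count > MAX_ARGS) count = MAX_ARGS;   /* clamp: an assumption of this sketch */
    for (i = 0; i < count; i++)
        arg_list[i] = args[i];
    arg_count = count;
}
```

Because only pointers are copied, the strings themselves must outlive the list, which holds for the argv array passed to main().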
Commenting Code
It is generally a good idea to comment your code. Comments will make it much easier to remember what's going on, can help you locate bugs, and also be of significant benefit to others who may work on your code at a later time.
Toka provides two ways to comment your code.
In The Listener
At the listener (the top level interpreter), you can use #! comments. These are comments that start with #! and end at the end of the current line. For example, I often start my programs with a short information block such as the following:
#! ---------------------------------------------------------------
#! A small HTTP server for Toka
#!
#! Developed by:
#!  Charles R. Childers
#!  erider
#!
#! ---------------------------------------------------------------
This style comment works well for blocks, and thus can be very useful at the start of a function:
#! ---------------------------------------------------------------
#! get-request
#! This reads a client request of up to 1k into the buffer. The
#! number of bytes read is returned.
#! ---------------------------------------------------------------
#! comments can not be used inside quotes.
Inside Quotes
Inside a quote, you can use ( ... ) comments. These start with a ( and end when ) is encountered. Unlike #! comments, these can span multiple lines and be used inside of quotes. An example:
[
  connection ( -a )
  @          ( a-n )
  pClose     ( n- )
] is end-connection
In this example, each stack action is mapped out using a stack comment. This is helpful when learning to use Toka, as it makes it easier to visualize the stack at each step.
Conditionals
Toka provides a few basic comparison primitives and one primitive for handling conditional execution of code. The standard bootstrap also adds two quotes that extend the conditional functions.
Basic examples:
1 100 = [ 1 . ] ifTrue
1 100 = [ 2 . ] ifFalse
1 100 = [ 3 . ] [ 4 . ] ifTrueFalse
The first case will invoke the quote if the flag returned by = is TRUE. The second invokes the quote if the flag is FALSE, and the third form invokes the [ 3 . ] quote if TRUE, or the [ 4 . ] quote if FALSE.
conditionals.c
Use
Contains the implementation of the core conditionals. These are words which do comparisons, and return a flag on the stack.
Functions Provided
less_than()
  Compare TOS and NOS, return a flag.
greater_than()
  Compare TOS and NOS, return a flag.
equals()
  Compare TOS and NOS, return a flag.
not_equals()
  Compare TOS and NOS, return a flag.
Primitives Provided
<   ( ab-f )  Compare 'a' and 'b', return a flag
>   ( ab-f )  Compare 'a' and 'b', return a flag
=   ( ab-f )  Compare 'a' and 'b', return a flag
<>  ( ab-f )  Compare 'a' and 'b', return a flag
Related Words
FALSE        ( -f )    Value returned for FALSE
TRUE         ( -f )    Value returned for TRUE
ifTrue       ( fq- )   Execute quote ('q') if flag ('f') is TRUE
ifFalse      ( fq- )   Execute quote ('q') if flag ('f') is FALSE
ifTrueFalse  ( fab- )  Invoke 'a' if 'f' flag is true, 'b' if false.
whileTrue    ( a- )    Execute quote. If the quote returns TRUE, execute again. Otherwise end the cycle.
whileFalse   ( a- )    Execute quote. If the quote returns FALSE, execute again. Otherwise end the cycle.
console.c
Use
Provide a very basic console interface. The provided interface is intentionally kept to a minimum; a better console interface can be loaded later.
Functions Provided
dot()
  Display the number on TOS using the current base if possible.
emit()
  Display the character TOS corresponds to. Consumes TOS.
type()
  Display the string TOS points to. Consumes TOS.
bye()
  Quit Toka
Primitives Provided
.     ( n- )  Display the TOS
emit  ( c- )  Display the ASCII character for TOS
type  ( a- )  Display a string
bye   ( - )   Quit Toka
Related Words
cr     ( - )   Display a CR character
space  ( - )   Display a space
tab    ( - )   Display a tab
."     ( "- )  Parse to the next " and display the string.
clear  ( - )   VT100: Clear the screen
Constants
For data that does not change, a constant can be created as follows:
100 is-data OneHundred
" /home/crc/htdocs/" is-data docroot
The first example takes the number 100, and assigns a name (OneHundred) to it. The name can now be used as a symbolic constant at the listener or inside a quote. The second example creates a named pointer to a string, which can also be used at the listener or inside quotes.
The use of constants is encouraged as it makes code easier to read and maintain. They have minimal impact on performance, and are significantly faster to use than variables.
data.c
Use
These are words useful for accessing and modifying data.
Functions Provided
make_literal()
  Compile a call to lit() and then place TOS into the next memory location.
make_string_literal()
  Compile a call to string_lit() and then place TOS into the next memory location.
fetch()
  Fetch the value in the memory location pointed to by TOS.
store()
  Store NOS into the memory location specified by TOS.
fetch_char()
  Fetch the value in the memory location pointed to by TOS. This version reads a single byte.
store_char()
  Store NOS into the memory location specified by TOS. This version stores a single byte.
copy()
  Copies 'count' bytes from 'source' to 'dest'. The stack form for this is: source dest count. The memory locations can overlap.
cell_size()
  Push the size of a cell to the stack.
char_size()
  Push the size of a char to the stack.
Primitives Provided
#          ( n- )    Push the following cell to the stack.
$#         ( n- )    Push the following cell to the stack.
@          ( a-n )   Fetch the value in memory location 'a'
!          ( na- )   Store 'n' to memory location 'a'
c@         ( a-n )   Fetch a byte from memory location 'a'
c!         ( na- )   Store byte 'n' to memory location 'a'
copy       ( sdc- )  Copy 'c' bytes from 's' to 'd'
cell-size  ( -n )    Return the size of a cell
char-size  ( -n )    Return the size of a char
Related Words
>char          ( n-c )   Convert the value on TOS to a single character
char:          ( "-c )   Parse ahead and return one character
"              ( "-$ )   Parse until " is encountered and return a string
chars          ( x-y )   Multiply TOS by char-size. Useful w/arrays
char+          ( x-y )   Increase TOS by char-size
char-          ( x-y )   Decrease TOS by char-size
cells          ( x-y )   Multiply TOS by cell-size. Useful w/arrays
cell+          ( x-y )   Increase TOS by cell-size
cell-          ( x-y )   Decrease TOS by cell-size
+!             ( xa- )   Add 'x' to the value in address 'a'
-!             ( xa- )   Subtract 'x' from the value in address 'a'
on             ( a- )    Set a variable to TRUE
off            ( a- )    Set a variable to FALSE
toggle         ( a- )    Toggle a variable between TRUE and FALSE
variable       ( "- )    Create a variable
variable|      ( "- )    Create multiple variables
is-array       ( n"- )   Create an array of size 'n'
array.get      ( ia-n )  Get element 'i' from array 'a'
array.put      ( nia- )  Put value 'n' into element 'i' of array 'a'
array.getChar  ( ia-n )  Get char-size element 'i' from array 'a'
array.putChar  ( nia- )  Put char-size value 'n' into element 'i' of array 'a'
value          ( "- )    Create a new value
to             ( - )     Set the value of a value
value|         ( "- )    Create multiple values
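The note that copy() accepts overlapping memory locations matters in practice. In C, that guarantee is what distinguishes memmove() from memcpy(). The sketch below assumes a memmove()-style copy; whether data.c actually calls memmove() is not stated here, and the function name copy_bytes is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copy 'count' bytes from 'source' to 'dest'; the regions may overlap.
   Argument order follows the documented stack form: source dest count. */
void copy_bytes(const void *source, void *dest, size_t count) {
    memmove(dest, source, count);
}
```

For example, copying the first four bytes of the buffer "abcdef" to offset 2 of the same buffer yields "ababcd", even though source and destination overlap.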
debug.c
Use
This provides a very small collection of simple tools that can be helpful when trying to track down bugs. It is not intended to be a full-scale debugging tool.
Functions Provided
display_stack()
  Display all items on the stack.
vm_info()
  Display information about Toka's memory use, stacks, and other aspects of the virtual machine.
Primitives Provided
:stack  ( - )  Display all values on the data stack
:stat   ( - )  Display information about the virtual machine status
Related Words
words-within  ( n- )  Display all words with a specified class #
:prims        ( - )   Display all primitives
:quotes       ( - )   Display all named quotes
:datas        ( - )   Display all named data items
words         ( - )   Display all names in the dictionary
decompile.c
Use
This allows for decompiling a quote and displaying the source needed to recreate it.
Functions Provided
long resolve_name(Inst xt)
  Search for a name in the dictionary that corresponds to 'xt'. Display it if found, and return a flag.
decompile(Inst *xt)
  Decompile a quote and its children and display the result on the screen.
see()
  Decompile the quote on the stack.
Primitives Provided
:see ( "- ) Decompile the specified quote
dictionary.c
Use
This file provides the functionality to create new dictionary entries and search for a specific entry.
Functions Provided
Variables:
ENTRY dictionary[MAX_DICTIONARY_ENTRIES];
  Holds the dictionary entries, up to MAX_DICTIONARY_ENTRIES
long last
  A pointer to the most recent dictionary entry
add_entry(char *name, Inst xt, Inst class)
  Add an entry to the dictionary.
name_attach(void *class)
  Attach a name (from the input stream) to the specified quote address. This word is given the semantics of the specified class.
name_quote()
  Attach a name (from the input stream) to the specified quote address. This word is given the semantics of quote_forth_class().
name_quote_macro()
  Attach a name (from the input stream) to the specified quote address. This word is given the semantics of quote_macro_class().
name_data()
  Attach a name (from the input stream) to the data at the specified address. Semantics are the same as the data_class().
find_word()
  Search for a word (name taken from the string passed on TOS) in the dictionary. Returns the xt, class, and a flag of -1 if found. If not found, returns only a flag of 0.
return_quote()
  Find a name (from the input stream) and return a quote that corresponds to the word.
return_name()
  Return a pointer to the name of a dictionary entry
return_xt()
  Return the starting address of a word
return_class()
  Return the class number for a dictionary entry
Primitives Provided
is        ( a"- )  Attach a name to a quote
          ( a$- )  Non-parsing form
is-macro  ( a"- )  Attach a name to a quote
          ( a$- )  Non-parsing form
is-data   ( a"- )  Attach a name to data memory
          ( a$- )  Non-parsing form
`         ( "-a )  Return a quote corresponding to the specified word.
          ( $-a )  Non-parsing form
:name     ( n-$ )  Return the name for a dictionary entry
:xt       ( n-$ )  Return the address of a dictionary entry
:class    ( n-$ )  Return the class # for a dictionary entry
last      ( - a )  Variable holding the number of the most recent dictionary entry
Related Words
<list>  ( -a )  Stores a list of pointers used by { and }
{       ( - )   Start a scoped area
}       ( - )   End a scoped area
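The lookup described for find_word() can be sketched in C as follows. This is a simplified model, not the code in dictionary.c: the entry struct, the long-typed xt and class fields, and the newest-first search order (so that a later definition shadows an earlier one) are assumptions. The return flags follow the documented convention of -1 for found and 0 for not found.

```c
#include <string.h>

#define MAX_ENTRIES 4096             /* default limit on named items */

typedef struct {
    const char *name;
    long xt;                         /* stand-in for the word's address */
    long cls;                        /* stand-in for the class handler  */
} entry;

static entry dictionary[MAX_ENTRIES];
static long  entry_count = 0;

/* Append an entry; returns its index, or -1 if the dictionary is full. */
long add_entry(const char *name, long xt, long cls) {
    if (entry_count >= MAX_ENTRIES) return -1;
    dictionary[entry_count].name = name;
    dictionary[entry_count].xt   = xt;
    dictionary[entry_count].cls  = cls;
    return entry_count++;
}

/* Search newest-first. On success store the xt and class and return -1;
   otherwise leave the outputs untouched and return 0. */
long find_word(const char *name, long *xt, long *cls) {
    long i;
    for (i = entry_count - 1; i >= 0; i--) {
        if (strcmp(dictionary[i].name, name) == 0) {
            *xt  = dictionary[i].xt;
            *cls = dictionary[i].cls;
            return -1;
        }
    }
    return 0;
}
```

If the same name is added twice, the search returns the most recent entry, which is how a redefinition would take effect under the newest-first assumption.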
ErrorCodes
When using Toka, you may encounter various error messages. These are described below. Each error has a unique code; the messages below are sorted by this code.
E0
Nonfatal. This error occurs when a token cannot be found in the dictionary or converted to a number.
E1
Fatal. This error occurs when the garbage collector fails to allocate enough memory to fill a gc_alloc() request.
E2
Nonfatal. This error is given when an alien function (handled by the FFI) is invoked with too many arguments for Toka to handle. You shouldn't see it often.
E3
Nonfatal. This error arises when a library can not be located for opening by the FFI.
E4
Nonfatal. This error arises when a symbol can not be found in the currently open library.
E5
This error arises when Toka detects a problem with either the data or return stack. It can be either fatal (when the problem is with the return stack) or nonfatal (when the problem is with the data stack). When nonfatal, Toka resets the data stack to an empty state.
E6
Fatal. This error arises when the stdin device is closed unexpectedly.
E7
Fatal. This error arises when a quote exceeds the maximum size permitted. See Limitations.
E8
Fatal. This error arises when a malloc fails. Normally you shouldn't see this unless you are dealing with a bad malloc/free implementation or very large sets of data.
errors.c
Use
This is where errors are handled. It does not export any functions directly to Toka.
Functions Provided
error(long code) Display a given error by code
Primitives Provided
None
See Also
Extending the Language
Toka provides a few tools that allow you to extend the language in new directions. The core of this is compiler macros, special words which are (under normal circumstances) always invoked.
Consider a fairly simple task, such as displaying a string. We know that this can be done by using type, and that parse can be used to create a string. A simple first try would be something like:
[ char: " parse type ] is ."
Trying this at the interpreter works just fine:
." Hello, World"
But inside a definition, we run into a problem:
[ ." Hello, World" ] is foo
E0: 'Hello,' is not a word or a number.
E0: 'World"' is not a word or a number.
To make this work inside a definition, we need to do a few things:
- Parse the string at compile time
- Compile in a reference to the string
- Compile a call to type into the quote
To begin, we should be using is-macro to make this into a compiler macro. As a macro, it will also run in the interpreter, so we need to make it aware of the compiler state. This is provided by the compiler variable. So an initial expansion:
[ char: " parse compiler @ [ ( for compile time ) ] [ type ] ifTrueFalse ] is-macro ."
A simplification is now in order. The first two challenges (parsing the string and compiling a reference to it) are already handled by the " word. We can reuse this and save some trouble:
[ ` " invoke compiler @ [ ( for compile time ) ] [ type ] ifTrueFalse ] is-macro ."
` returns a quote containing the requested function, in this case ". We can then invoke the quote. This is a fundamental aspect of extending the language.
To compile a call to a quote, Toka provides another word, compile. We can use this as follows:
[ ` " invoke compiler @ [ ` type compile ] [ type ] ifTrueFalse ] is-macro ."
And now we are done. Use of `, invoke, and compile allows for a fair amount of flexibility in extending the language with new features. Learn to use them, and your ability to adapt Toka to your program's needs will multiply drastically.
These functions can also be used with normal quotes to create defining words. As a nominal example, this is a defining word that creates a function that always returns a specific value. Basically a form of constant:
[ >r ` [ invoke r> # ` ] invoke is ] is const
One final note: many defining words have a form similar to the example above. Toka has an additional word, +action, that creates a new quote combining a specific action and data element. The word const above could be rewritten using it:
[ [ ] +action ] is const
Or, with a slightly more complex example, try this:
[ [ . ] +action ] is foo
100 foo bar
bar
FFI
To allow use of external libraries, Toka provides a simple Foreign Function Interface (FFI). This is built around the following primitives:
from LIBRARY
Set the import source to LIBRARY. This should be a fully-qualified filename; it may require a path as well as the .so extension.
N import FUNCTION
Import FUNCTION from the previously specified library. A new quote named FUNCTION will be created, and will take N arguments off the stack. This function will always have a return value, even for void functions.
You can also make use of as to rename the imported function.
Example:
#! Linux libc = libc.so.6, BSD libc = libc.so
from libc.so.6
2 import printf as printf.2
" %i\n" 100 printf.2
The FFI is optional, and can be disabled at build time. Doing so reduces the overall functionality of Toka, so doing this is only recommended if you are using a system without dlopen()/dlsym(), or if you need more direct control over the functionality provided. To build a version of Toka without the FFI, do:
rm source/ffi.c
make CFLAGS=-DNOFFI
Again, this is only recommended if your system does not support dlopen()/dlsym().
ffi.c
Use
Implements the Foreign Function Interface, which allows use of external libraries.
Functions Provided
Variables:
void *library
  Pointer to the most recently opened library
ffi_invoke()
  Call a foreign function. This translates between Toka and CDECL calling conventions.
ffi_from()
  Select a library to load from.
ffi_import()
  Import and name an external function. This wraps the imported function in a quote.
ffi_rename()
  Rename the most recently defined word in the dictionary.
Primitives Provided
from    ( "- )   Set the library to import from
        ( $- )   Non-parsing form
import  ( n"- )  Import a function taking 'n' arguments.
        ( n$- )  Non-parsing form
as      ( "- )   Rename the last defined word
        ( $- )   Non-parsing form
Files
Toka provides functionality roughly identical to the standard C file I/O functionality (fopen, fread, etc).
file.open   ( $m-n )   Open a specified file with the specified mode.
file.close  ( n- )     Close the specified file handle
file.read   ( nbl-r )  Read 'l' bytes into buffer 'b' from file handle 'n'. Returns the number of bytes read.
file.write  ( nbl-w )  Write 'l' bytes from buffer 'b' to file handle 'n'. Returns the number of bytes written.
file.size   ( n-s )    Return the size (in bytes) of the specified file.
file.seek   ( nom-a )  Seek a new position in the file. Valid modes are START, CURRENT, and END. These have values of 1, 2, and 3.
file.pos    ( n-a )    Return a pointer to the current offset into the file.
file.slurp  ( $-a )    Read file '$' into a new buffer.
"R"         ( -x )     Mode for file.open
"R+"        ( -x )     Mode for file.open
"W"         ( -x )     Mode for file.open
"W+"        ( -x )     Mode for file.open
"A"         ( -x )     Mode for file.open
"A+"        ( -x )     Mode for file.open
START       ( -x )     Mode for file.seek
CURRENT     ( -x )     Mode for file.seek
END         ( -x )     Mode for file.seek
Examples
variable fid
" /etc/motd" "R" file.open fid !
fid @ file.size .
fid @ file.close
" /etc/motd" file.slurp [ type cr ] ifTrue
files.c
Use
Allows for reading and writing data to files.
Functions Provided
file_open()
  Open a file using the specified mode. Modes are a direct map to the fopen() modes: "r", "r+", "w", "w+", "a", and "a+". Numeric values for these are 1 - 6, in that order.
file_close()
  This is just a simple wrapper over fclose().
file_read()
  This is just a simple wrapper over fread().
file_write()
  This is just a simple wrapper over fwrite().
file_size()
  This is just a simple wrapper over fstat() which returns the size of the file.
file_seek()
  This is just a simple wrapper over fseek().
file_pos()
  This is just a simple wrapper over ftell().
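For readers who want to see how small such a wrapper can be, here is a size query in C. Note that it deliberately swaps in a different technique from the one named above: it uses fseek() and ftell(), which are part of standard C, rather than fstat(). The function name stream_size is hypothetical, and the approach assumes a host where seeking to the end of a binary stream is supported, as it is on POSIX systems.

```c
#include <stdio.h>

/* Return the size of an open stream in bytes, or -1 on error.
   The current position is restored before returning. */
long stream_size(FILE *fp) {
    long here, size;

    here = ftell(fp);                       /* remember where we are */
    if (here < 0) return -1;
    if (fseek(fp, 0L, SEEK_END) != 0) return -1;
    size = ftell(fp);                       /* offset of the end = size */
    if (fseek(fp, here, SEEK_SET) != 0) return -1;
    return size;
}
```

Unlike an fstat() call, this approach moves the file position temporarily, which is why it saves and restores the offset.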
Primitives Provided
file.open   ( $m-n )   Open a specified file with the specified mode.
file.close  ( n- )     Close the specified file handle
file.read   ( nbl-r )  Read 'l' bytes into buffer 'b' from file handle 'n'. Returns the number of bytes read.
file.write  ( nbl-w )  Write 'l' bytes from buffer 'b' to file handle 'n'. Returns the number of bytes written.
file.size   ( n-s )    Return the size (in bytes) of the specified file.
file.seek   ( nom-a )  Seek a new position in the file. Valid modes are START, CURRENT, and END. These have values of 1, 2, and 3.
file.pos    ( n-a )    Return a pointer to the current offset into the file.
Related Words
"R"         ( -x )   Mode for file.open
"R+"        ( -x )   Mode for file.open
"W"         ( -x )   Mode for file.open
"W+"        ( -x )   Mode for file.open
"A"         ( -x )   Mode for file.open
"A+"        ( -x )   Mode for file.open
START       ( -x )   Mode for file.seek
CURRENT     ( -x )   Mode for file.seek
END         ( -x )   Mode for file.seek
file.slurp  ( $-a )  Read a file into a dynamically allocated buffer
First Steps
The following is a set of topical guides to various aspects of the Toka language. It was written as the initial tutorial material, but as should be evident, I am not a great writer of tutorials. The material should be fairly easy for a moderately experienced Forth or Joy programmer to pick up.
Introductory Bits
Writing Functions
Data Structures
Interfacing with the World
Practical Matters
Advanced Topics
garbage collector
The Toka implementation makes heavy use of dynamically allocated memory. To help avoid memory leaks, the memory allocation provides a very simple form of garbage collection.
The model is very simple. Two lists of pointers and their corresponding sizes are kept. One list is used primarily for user-level allocations, and the other is only used by the primitives when they need to allocate memory.
When either list is filled, gc() is called. This frees up to 64 allocations from the internal list, and up to 32 allocations from the user list. Allocations are freed from oldest to newest, and the newest allocations in both lists (up to 32 in the user list and 64 in the internal list) will be kept. This prevents recent creations from falling out of scope until they can be used.
There is also a way to mark entries as permanent. This is done by keep (the gc_keep() function). This routine scans the list of allocations (from most recent to oldest) looking for a specific pointer. When found, that entry and any subsequent allocations (in Toka these will almost always be subquotes, strings, etc.) are removed from the list.
Information regarding the current status of the garbage collection subsystem can be obtained via :stat (the vm_info() function).
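The collection and keep rules described above can be modelled in a few lines of C. This is a simplified sketch covering only the user list, not the code in gc.c: the sketch_* names are hypothetical, sizes are not tracked, and the limits (a 128-entry list, 32 entries freed per collection, the newest 32 left alone) are taken from the description on these pages.

```c
#include <stdlib.h>
#include <string.h>

#define LIST_SIZE  128   /* size of the user list, as in gc_list[128] */
#define FREE_LIMIT 32    /* user allocations freed per collection     */
#define KEEP_LIMIT 32    /* newest user allocations always left alone */

static void *sketch_list[LIST_SIZE];
static long  sketch_depth = 0;

/* Free the oldest tracked allocations, sparing the newest KEEP_LIMIT. */
void sketch_gc(void) {
    long n = sketch_depth - KEEP_LIMIT;
    long i;
    if (n > FREE_LIMIT) n = FREE_LIMIT;
    if (n <= 0) return;
    for (i = 0; i < n; i++) free(sketch_list[i]);
    /* Slide the survivors down so index 0 is again the oldest entry. */
    memmove(sketch_list, sketch_list + n,
            (size_t)(sketch_depth - n) * sizeof(void *));
    sketch_depth -= n;
}

/* Allocate and track a block; collect first if the list is full. */
void *sketch_alloc(size_t size) {
    void *p;
    if (sketch_depth == LIST_SIZE) sketch_gc();
    p = malloc(size);
    if (p != NULL) sketch_list[sketch_depth++] = p;
    return p;
}

/* Make 'p' and everything allocated after it permanent by dropping
   them from the list. Searches newest-first; unknown pointers are ignored. */
void sketch_keep(void *p) {
    long i;
    for (i = sketch_depth - 1; i >= 0; i--) {
        if (sketch_list[i] == p) { sketch_depth = i; return; }
    }
}
```

Truncating the list at the kept entry is what makes later allocations (such as subquotes and strings created while building a quote) permanent along with it.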
gc.c
Use
Implements the memory allocator and basic garbage collector.
Functions Provided
Variables:
GCITEM gc_list[128]
  Holds the list of items marked as garbage
long gc_depth
  A pointer to the top of the garbage collection list
GCITEM gc_trash[128]
  Holds the short list of items marked as garbage
long gc_tdepth
  A pointer to the top of the short garbage collection list
long gc_used
  Contains the total size of all currently used memory, including permanent quotes.
long gc_objects
  Contains the total number of objects that currently exist, including permanent ones.
gc_alloc(long items, long size, long type)
  Allocate the requested memory and add it to the garbage collection list. If type is set to 0, add to the normal garbage collection list. If set to 1, add to the short list of known garbage items which can be safely freed at the next gc(). If the allocation fails, gc() is called, and the allocation is retried. If it still fails, an error is reported and Toka is terminated.
gc_keep()
  Remove the specified address (and any children it has registered) from the garbage collection list. If the TOS is not an allocated address, this will silently ignore it.
gc()
  Free the oldest allocations on the garbage list. Will free up to 64 trash items and 32 items from the user allocation list per call. This leaves the newest allocations on both lists alone, even with repeated calls.
toka_malloc()
  Allocate TOS bytes of memory. Returns a pointer to the allocated memory.
Primitives Provided
keep    ( a-a )  Mark quotes/allocated memory as permanent.
gc      ( - )    Clean the garbage
malloc  ( n-a )  Allocate 'n' bytes of memory
Index
- Arrays (07/27/2007 21:01:34)
- bits.c (05/21/2007 21:44:03)
- Building (07/21/2007 14:17:16)
- class.c (07/21/2007 14:29:22)
- cmdline.c (07/21/2007 14:29:45)
- Commenting Code (04/27/2007 19:12:55)
- Conditionals (04/28/2007 22:11:10)
- conditionals.c (07/21/2007 15:02:56)
- console.c (09/13/2007 16:47:33)
- Constants (04/29/2007 19:19:14)
- data.c (07/27/2007 21:02:47)
- debug.c (07/21/2007 14:41:41)
- decompile.c (07/21/2007 14:32:11)
- dictionary.c (07/21/2007 15:00:59)
- ErrorCodes (09/02/2007 12:58:48)
- errors.c (04/21/2007 17:02:50)
- Extending the Language (07/21/2007 15:18:28)
- FFI (04/29/2007 22:43:53)
- ffi.c (07/21/2007 14:34:11)
- Files (04/28/2007 22:11:30)
- files.c (07/21/2007 14:35:45)
- First Steps (09/06/2007 19:01:17)
- garbage collector (07/14/2007 20:51:43)
- gc.c (07/21/2007 14:36:11)
- Home (09/06/2007 19:00:43)
- Index (07/13/2002 14:58:54)
- initial.c (03/30/2007 18:43:04)
- Installation (04/23/2007 18:18:57)
- interpret.c (08/28/2007 18:52:25)
- License (09/03/2007 02:34:27)
- Limitations (07/14/2007 20:41:28)
- Loops (09/06/2007 18:55:02)
- math.c (07/21/2007 14:37:18)
- MathOperations (04/19/2007 18:01:47)
- parser (07/21/2007 15:05:10)
- parser.c (09/13/2007 16:47:19)
- Ports (09/06/2007 19:00:08)
- Quotes (04/24/2007 21:47:51)
- quotes.c (09/06/2007 18:55:27)
- Recent Changes (07/13/2002 14:41:15)
- Scripts (09/06/2007 19:10:35)
- Search (03/29/2007 21:18:59)
- Stack Comments (04/27/2007 19:24:28)
- stack.c (07/21/2007 14:40:13)
- strings (07/27/2007 21:04:14)
- threading model (04/10/2007 22:10:25)
- toka.c (07/21/2007 15:06:11)
- TokaDoc (03/30/2007 23:52:20)
- Types (04/24/2007 21:02:42)
- User Code (09/02/2007 12:52:26)
- UsingTheStack (07/21/2007 15:17:53)
- Values (05/28/2007 15:20:13)
- Variables (04/19/2007 18:05:19)
- vm.c (07/21/2007 14:40:50)
- WordClasses (04/28/2007 01:03:03)
- Words and Their Uses (09/13/2007 16:51:10)
56 Pages
initial.c
Use
Build the initial dictionary
Functions Provided
build_dictionary() Attach names and classes to the various initial words in the Toka language.
Primitives Provided
None
Installation
To get the most out of Toka, it needs to be installed. This can be done via the build system:
make install
When done this way, the following files are installed:
/usr/bin/toka
/usr/share/toka/bootstrap.toka
After installation, run the test suite:
make tests
Look for any failures in the test.log. If you encounter a problem, please forward the test.log to charles.childers@gmail.com along with some basic information about your system (OS, CPU type, GCC version, Toka revision #)
interpret.c
Use
The interpreter itself.
Functions Provided
Variables:
long compiler
  When set to 0, interpret; when set to -1, compile. This is checked by the various word classes defined in class.c
char *scratch
  Temporary holding area used by the parser and other routines.
char *tib
  Pointer to the text input buffer.
interpret()
  Accept and process input.
Primitives Provided
compiler ( -a ) Variable holding the compiler state
License
Copyright (c) 2006, 2007, Charles R. Childers
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Limitations
These limits are dependent on the host system and can not be altered:
- Cell size is machine dependent, generally 32 bits (4 bytes) or 64 bits (8 bytes) in size. cell-size returns the actual size in bytes.
- Character size is machine and OS dependent, normally 8 bits (1 byte). char-size returns the actual size in bytes.
The following limits can be overridden in config.h:
- Quotes can be up to 64 elements (values, primitives, embedded quotes, calls to other quotes, string pointers, etc) in length in most cases.
- The data stack can hold up to 100 cells.
- The return stack can hold up to 1024 cells.
- You can have up to 4096 named items
Loops
Toka provides a limited collection of primitives for building various types of simple loops.
The most basic form is countedLoop, which takes the following form:
upper-limit lower-limit quote countedLoop
countedLoop will invoke quote the specified number of times. Upper and lower indexes can be set as desired; Toka will count up or down as necessary.
Some examples:
10 0 [ i . ] countedLoop
0 10 [ i . ] countedLoop
Note the use of i, the loop index. When looping via countedLoop, the loop index is set to the current cycle number. Other types of loops do not set the loop index.
The other type of loop is a whileTrue or whileFalse loop. The normal form:
quote whileTrue
quote whileFalse
Each time quote is invoked, a value of TRUE or FALSE should be left on the stack. The loop primitive consumes this flag and either repeats the loop or ends it. whileTrue continues execution if the returned value is TRUE; whileFalse continues if the returned value is FALSE.
Some examples:
1 [ dup . 1 + dup 101 < ] whileTrue
101 [ dup . 1 - dup 1 < ] whileFalse
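To make the flag handling concrete, here is a small Python model of the two loop forms and of the two example quotes. It is only a sketch of the behaviour described above, not Toka's implementation: flags are modelled as Python booleans, and the helper names (while_true, while_false, count_up, count_down) are invented for this illustration.

```python
# A model of whileTrue / whileFalse, not Toka's implementation.
# Flags are modelled as Python booleans; Toka's own TRUE and FALSE
# values are not assumed here.

def while_true(stack, quote):
    """Invoke quote, consume the flag it leaves, repeat while TRUE."""
    while True:
        quote(stack)
        if not stack.pop():
            break

def while_false(stack, quote):
    """Invoke quote, consume the flag it leaves, repeat while FALSE."""
    while True:
        quote(stack)
        if stack.pop():
            break

def count_up(stack, shown):
    """Model of the quote: dup . 1 + dup 101 <"""
    shown.append(stack[-1])         # dup .
    stack[-1] += 1                  # 1 +
    stack.append(stack[-1] < 101)   # dup 101 <

def count_down(stack, shown):
    """Model of the quote: dup . 1 - dup 1 <"""
    shown.append(stack[-1])         # dup .
    stack[-1] -= 1                  # 1 -
    stack.append(stack[-1] < 1)     # dup 1 <

up_shown, up_stack = [], [1]
while_true(up_stack, lambda s: count_up(s, up_shown))

down_shown, down_stack = [], [101]
while_false(down_stack, lambda s: count_down(s, down_shown))

print(up_shown[0], up_shown[-1], up_stack)
print(down_shown[0], down_shown[-1], down_stack)
```

In this model the first loop displays 1 through 100 and leaves 101 on the stack; the second displays 101 down to 1 and leaves 0.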
math.c
Use
Basic math operations on the stack.
Functions Provided
add()       Add TOS to NOS
subtract()  Subtract TOS from NOS
multiply()  Multiply TOS by NOS
divmod()    Divide and return the result, including remainder
Primitives Provided
+     ( ab-c )   Add TOS and NOS
-     ( ab-c )   Subtract TOS from NOS
*     ( ab-c )   Multiply TOS by NOS
/mod  ( ab-cd )  Divide and get remainder
Related Words
negate  ( x-y )    Invert the sign of TOS
/       ( xy-z )   Divide two numbers
mod     ( xy-z )   Divide two numbers and get remainder
not     ( x-y )    Invert the value 'x'
*/      ( abc-d )  (a*b)/c
MathOperations
Toka provides a very basic set of math functionality, sufficient for many purposes. The functions are:
+     ( ab-c )   Add TOS and NOS
-     ( ab-c )   Subtract TOS from NOS
*     ( ab-c )   Multiply TOS by NOS
/mod  ( ab-cd )  Divide and get remainder
In addition, the standard bootstrap adds some additional operations:
1+      ( x-y )    Increase value on stack by 1
1-      ( x-y )    Decrease value on stack by 1
negate  ( x-y )    Invert the sign of TOS
/       ( xy-z )   Divide two numbers
mod     ( xy-z )   Divide two numbers and get remainder
not     ( x-y )    Invert the value 'x'
*/      ( abc-d )  (a*b)/c
parser
Toka's parser is fairly simple. It only handles input coming from a file stream (which includes stdin). Input sources are implemented as a stack.
At the bottom is the stdin file. On startup, bootstrap.toka is generally placed above it. A script would be at the third position, and any files that it includes get added above it.
The parser will read from the top file until either end. or an EOF (end of file) is detected. At this point, it closes the file and drops to the next one on the stack.
Internally, the fundamental function for parsing is get_token(). This accepts two arguments, a pointer to a buffer to place the resulting string, and a delimiter character. Parsing ends when the delimiter character is encountered. get_token() also leaves a pointer to the resulting token on the stack. A Toka-level wrapper, parse(), makes use of get_token().
When parsing, a number of escape sequences can be recognized. These are listed below:
\n   Embed a line feed (ASCII 10) into the token
\r   Embed a carriage return (ASCII 13) into the token
\^   Embed ASCII 27 into the token. Useful with VT100/ANSI terminal escape sequences
\\   Embed a \ character into the token
\"   Embed a quote into the token
Processing of escape sequences can be enabled or disabled by turning the escape-sequences variable on or off. For example:
escape-sequences off
" \\ hello \\" type cr
escape-sequences on
" \\ hello \\" type cr
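The effect of the escape-sequences switch can be modelled in a few lines of Python. This is a sketch under stated assumptions: it works on a token that has already been read (the real parser handles escapes while parsing), it leaves unrecognised sequences untouched, and the names ESCAPES and expand_escapes are hypothetical.

```python
# Mapping taken from the escape sequence table above.
ESCAPES = {
    "n": "\n",      # line feed, ASCII 10
    "r": "\r",      # carriage return, ASCII 13
    "^": "\x1b",    # ASCII 27
    "\\": "\\",     # a single backslash
    '"': '"',       # a quote character
}

def expand_escapes(token, enabled=True):
    """Return token with escape sequences replaced when enabled is True."""
    if not enabled:
        return token
    out = []
    i = 0
    while i < len(token):
        ch = token[i]
        if ch == "\\" and i + 1 < len(token) and token[i + 1] in ESCAPES:
            out.append(ESCAPES[token[i + 1]])
            i += 2
        else:
            # Assumption: anything that is not a known escape is copied as-is.
            out.append(ch)
            i += 1
    return "".join(out)

print(expand_escapes(r"\\ hello \\", enabled=False))
print(expand_escapes(r"\\ hello \\", enabled=True))
```

With processing off, the doubled backslashes are kept; with processing on, each pair collapses to a single backslash.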
Tips
- Tokens have a maximum size of 4096 characters
- If the delimiter = 10, get_token() will also break on encountering ASCII 13.
- If the delimiter = 32, get_token() will also break on encountering ASCII 10 or ASCII 13.
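The delimiter rules in these tips can be sketched as follows. This is a Python model, not the C get_token(): it scans a string rather than a file stream, returns the token and the next position instead of using a buffer and the stack, and ignores the maximum token size. The signature shown is an assumption made for the example.

```python
def get_token(text, start, delim):
    """Return (token, next_position), scanning text from start up to delim.

    delim is an ASCII code. As in the tips above, a delimiter of 10 also
    stops at 13, and a delimiter of 32 also stops at 10 or 13.
    """
    stops = {delim}
    if delim == 10:
        stops.add(13)
    elif delim == 32:
        stops.update((10, 13))
    end = start
    while end < len(text) and ord(text[end]) not in stops:
        end += 1
    # Skip the delimiter itself unless the end of the text was reached.
    return text[start:end], min(end + 1, len(text))

token, pos = get_token("dup swap\nover", 0, 32)
print(token, pos)
token, pos = get_token("dup swap\nover", pos, 32)
print(token, pos)
```

Here the second token ends at the line feed even though the requested delimiter was a space.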
parser.c
Use
Implement the parser.
Functions Provided
Variables:
  FILE *input[]
    Current file stream to parse from. Setup as an array of 8 inputs.
  long isp
    Pointer to the most recent input source in the array
  long base
    Holds the current numeric base
  long parser
    When ON (TRUE), system parsing words will parse. When OFF (FALSE), they will take a string from the stack.
  long escapes
    When ON (TRUE), escape sequences will be handled by the parser. When OFF (FALSE), they will be ignored.
to_number()
  Attempt to convert a string (on TOS) to a number. This accepts a format of:
    [-]number
  If successful, it leaves the number and a flag of TRUE on the stack. Otherwise, it leaves the original string, and a flag of FALSE.
to_string()
  Convert a number to a string.
parse()
  Parse the input buffer until the character passed on TOS is found, or until the end of the line is encountered. Return a pointer to the resulting string on the stack.
get_token(char *s, long delim)
  Return a string (in "s") up to the specified delimiter. This also puts the resulting string on the stack.
long include_file(char *s)
  Attempt to open a file ("s") and add it to the top of the input stack.
include()
  Take a filename off the stack, attempt to open it and add it to the input stream if successful.
needs()
  Take a filename off the stack. Attempt to open it from the library, and add it to the input stream if successful.
force_eof()
  Remove the current file from the input stack. This can be used to abort an include.
Primitives Provided
base             ( -a )    Variable containing the current numeric base
parser           ( -a )    Variable holding current parser mode.
escape-sequences ( -a )    Variable determining if escape sequences are used.
>number          ( a-nf )  Attempt to convert a string to a number
>string          ( n-a )   Convert a number to a string
parse            ( d-a )   Parse until the character represented by 'd' is found. Return a pointer to the string
include          ( "- )    Attempt to open a file and add it to the input stack.
                 ( $- )    Non-parsing form
needs            ( "- )    Attempt to include a file from the library (normally /usr/share/toka/library)
                 ( $- )    Non-parsing form
end.             ( - )     Remove the current file from the input stack
Related Words
wsparse  ( -a )  Parse until a SPACE is encountered
lnparse  ( -a )  Parse to the end of the line, leave the resulting string on the stack.
hex      ( - )   Set the base to hexadecimal (16)
decimal  ( - )   Set the base to decimal (10)
binary   ( - )   Set the base to binary (2)
octal    ( - )   Set the base to octal (8)
Ports
This page lists the officially supported platforms and includes notes on building for specific targets.
Quick Summary
CPUs: x86, x86-64, ARM, MIPS, Itanium
OSes: Linux, NetBSD, FreeBSD, OpenBSD, DragonFly BSD, BeOS, Cygwin, OpenSolaris
Linux
Supported CPUs: x86, x86-64, MIPS, Itanium
Toka has been built and tested on Debian and SuSE, using GCC versions 2.95, 3.0, 3.1, and 4.1.
BSD
OpenBSD
Supported CPUs: x86, ARM
FreeBSD 4.x - 6.x
Supported CPUs: x86
NetBSD 3
Supported CPUs: x86, MIPS, ARM
DragonFly BSD
Supported CPUs: x86
BeOS
Supported CPUs: x86
Requires libdl.so (from http://bebits.com/app/2917)
You may also need (or want) to use the newer GCC from http://bebits.com/app/4011
Build with OTHER=-I/beos/develop/headers
Cygwin
Supported CPUs: x86
Toka has been built and run under Cygwin on both Windows XP and Windows Vista. Note that some of the libraries and examples may not function under Cygwin at this time.
OpenSolaris
Supported CPUs: x86
This has only been tested under the Nexenta GNU/Solaris distribution, where it builds and works fine.
Quotes
With the functions described above, many small programs can be written. Toka has much more functionality, but before proceeding further, we need to take a look at how to create new functions (called quotes).
In Toka, a quote is the basic building block. At the simplest level, quotes are anonymous blocks of code and data. They are created by enclosing the code and/or data in brackets:
[ ]
The above will create an empty quote and leave a pointer to it on the stack. You can attach a name to this pointer with is:
[ ] is empty-quote
Names can include any characters, other than space, tab, cr, or lf (enter key). (There are other forms of is, but this is the most common one; the others are discussed elsewhere).
quotes.c
Use
Build, operate on quotes.
Functions Provided
Variables:
  QUOTE quotes[8]
    Holds details about the compiler state, heap, etc for quotes during compilation.
  long qdepth
    Tracks how deeply the quotes are nested
  long quote_counter
    Tracks the current loop index
  Inst top
    Holds a pointer to the root quote
begin_quote()
  Create a new quote. This allocates space for it, and sets the compiler flag. A pointer to the quote's start is pushed to the stack.
end_quote()
  Terminate the previously opened quote and perform data_class() semantics.
invoke()
  Call a quote (passed on TOS)
compile()
  Compile the code needed to call a quote (passed on TOS)
countedLoop()
  Execute a quote a given number of times. You pass a quote, and upper/lower limits. The loop counter, 'i', is updated with each cycle.
truefalse()
  Takes three items (true-xt, false-xt, and a flag) from the stack. Stack should be passed in as:
    flag true false
  It will execute true if the flag is true, false otherwise.
recurse()
  Compiles a call to the top-level quote. As a trivial example:
    [ dup 1 > [ dup 1 - recurse swap 2 - recurse + ] ifTrue ] is fib
qlit()
  Push the value in the following memory location to the stack. This is used instead of lit() so that the decompiler (and eventually debugger) can reliably identify nested quotes as opposed to regular literals.
quote_index()
  Return the current loop index (counter)
quote_while_true()
  Repeat execution of a quote until the quote returns FALSE.
quote_while_false()
  Repeat execution of a quote until the quote returns TRUE.
Primitives Provided
[            ( -a )    Create a new quote
]            ( - )     Close an open quote
invoke       ( a- )    Execute a quote
compile      ( a- )    Compile a call to the quote
countedLoop  ( ulq- )  Execute a quote a set number of times, updating the 'i' counter each time.
ifTrueFalse  ( fab- )  Invoke 'a' if 'f' flag is true, 'b' if false.
recurse      ( - )     Compile a call to the top quote.
i            ( -n )    Return the current loop index
whileTrue    ( a- )    Execute quote. If the quote returns TRUE, execute again. Otherwise end the cycle.
whileFalse   ( a- )    Execute quote. If the quote returns FALSE, execute again. Otherwise end the cycle.
Recent Changes
- Words and Their Uses (09/13/2007 16:51:10)
- console.c (09/13/2007 16:47:33)
- parser.c (09/13/2007 16:47:19)
- Scripts (09/06/2007 19:10:35)
- First Steps (09/06/2007 19:01:17)
- Home (09/06/2007 19:00:43)
- Ports (09/06/2007 19:00:08)
- quotes.c (09/06/2007 18:55:27)
- Loops (09/06/2007 18:55:02)
- License (09/03/2007 02:34:27)
- ErrorCodes (09/02/2007 12:58:48)
- User Code (09/02/2007 12:52:26)
- interpret.c (08/28/2007 18:52:25)
- strings (07/27/2007 21:04:14)
- data.c (07/27/2007 21:02:47)
- Arrays (07/27/2007 21:01:34)
- Extending the Language (07/21/2007 15:18:28)
- UsingTheStack (07/21/2007 15:17:53)
- toka.c (07/21/2007 15:06:11)
- parser (07/21/2007 15:05:10)
- conditionals.c (07/21/2007 15:02:56)
- dictionary.c (07/21/2007 15:00:59)
- debug.c (07/21/2007 14:41:41)
- vm.c (07/21/2007 14:40:50)
- stack.c (07/21/2007 14:40:13)
- math.c (07/21/2007 14:37:18)
- gc.c (07/21/2007 14:36:11)
- files.c (07/21/2007 14:35:45)
- ffi.c (07/21/2007 14:34:11)
- decompile.c (07/21/2007 14:32:11)
- cmdline.c (07/21/2007 14:29:45)
- class.c (07/21/2007 14:29:22)
- Building (07/21/2007 14:17:16)
- garbage collector (07/14/2007 20:51:43)
- Limitations (07/14/2007 20:41:28)
- Values (05/28/2007 15:20:13)
- bits.c (05/21/2007 21:44:03)
- FFI (04/29/2007 22:43:53)
- Constants (04/29/2007 19:19:14)
- Files (04/28/2007 22:11:30)
- Conditionals (04/28/2007 22:11:10)
- WordClasses (04/28/2007 01:03:03)
- Stack Comments (04/27/2007 19:24:28)
- Commenting Code (04/27/2007 19:12:55)
- Quotes (04/24/2007 21:47:51)
- Types (04/24/2007 21:02:42)
- Installation (04/23/2007 18:18:57)
- errors.c (04/21/2007 17:02:50)
- Variables (04/19/2007 18:05:19)
- MathOperations (04/19/2007 18:01:47)
- threading model (04/10/2007 22:10:25)
- TokaDoc (03/30/2007 23:52:20)
- initial.c (03/30/2007 18:43:04)
- Search (03/29/2007 21:18:59)
- Index (07/13/2002 14:58:54)
- Recent Changes (07/13/2002 14:41:15)
56 Pages
Scripts
Toka can be used to write scripts. Start your code with the following:
#! /usr/bin/toka
Then set the file's permissions to executable (+x), and you can run your code directly at the command line (assuming that Toka is installed).
To make the most of your scripts, you will probably want to handle command line arguments. Toka allows for this using the following words:
#args      ( -n )    Returns the number of command line arguments
arglist    ( -a )    An array containing the arguments
array.get  ( ia-n )  Return a pointer to an element in the array
Toka accepts a command line like the following:
toka <script> <arg0> <arg1> ... <argN>
If no arguments are given to Toka, other than the optional script filename, #args will be set to 0. In any other case, #args will contain the number of arguments following the script name. In the following example, #args will return 3:
toka foo.toka apple banana carrot
The breakdown in the arglist will then be as follows:
- Element 0 is the name of the script, in this case foo.toka
- Element 1 is apple
- Element 2 is banana
- Element 3 is carrot
You can loop over the arguments to process them. For instance, if we wanted to display each of the script arguments, we could do:
1 #args [ i arglist array.get type cr ] countedLoop
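The relationship between the command line, #args, and the arglist can be summarised with a short Python model. It only restates the rules given above; the function name toka_args and the use of a Python list for the arglist are assumptions made for the illustration.

```python
def toka_args(command_line):
    """Return (nargs, arglist) for a command line given as a list of words.

    Element 0 of the arglist is the script name; the real arguments
    start at element 1, and nargs counts only those.
    """
    words = command_line[1:]          # drop the toka executable itself
    arglist = list(words)
    nargs = max(len(words) - 1, 0)    # the script name is not counted
    return nargs, arglist

nargs, arglist = toka_args(["toka", "foo.toka", "apple", "banana", "carrot"])
print(nargs)
for index, value in enumerate(arglist):
    print(index, value)
```

For the example command line this gives a count of 3, with foo.toka at element 0 and carrot at element 3.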
Stack Comments
Stack comments provide a way to specify the action of a quote as it relates to the stack. It's normally a good idea to keep a list of words with their description and stack comments on hand if they are not part of the source.
A typical stack comment will resemble the following:
( abc-d )
The dash shows the before/after split. In the example above, the quote takes three elements from the stack (a, b, and c) and leaves a new one (d).
Words that parse use " as a symbol to denote this. For example:
( a"-b )
This is a stack comment for a word that takes a string (a), parses the input until a specified symbol is encountered, and returns a new string (b).
The exact meaning of the symbols is up to you. Generally in a description, you can explain them better, but the stack comment form is just a general overview.
When you are learning Toka, it may be helpful to write out stack comments for each step in a quote. This can help you keep track of the stack and become more comfortable using it.
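As an aid when writing stack comments out by hand, the sketch below reads a comment and reports how many items it takes and leaves. It assumes the convention used on this page, where each item is a single symbol and " marks a parsing word rather than a stack item; the function name stack_effect is hypothetical.

```python
def stack_effect(comment):
    """Return (takes, leaves, parses) for a comment such as '( abc-d )'."""
    body = comment.strip().strip("()").strip()
    before, _, after = body.partition("-")
    parses = '"' in before                  # the word reads ahead in the input
    takes = len(before.replace('"', ""))    # one symbol per stack item
    leaves = len(after)
    return takes, leaves, parses

print(stack_effect("( abc-d )"))
print(stack_effect('( a"-b )'))
print(stack_effect("( x- )"))
```

Comments that use longer names for items would need a different splitting rule.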
stack.c
Use
Implements the basic stack operations. This is intentionally kept minimal, though additional primitives can be added to improve overall performance.
Functions Provided
stack_dup()     Duplicate the TOS
stack_drop()    Drop the TOS
stack_swap()    Exchange TOS and NOS
stack_to_r()    Push TOS to return stack, DROP TOS
stack_from_r()  Pop TORS to the data stack
stack_depth()   Return the number of items on the stack
Primitives Provided
dup    ( n-nn )   Duplicate the TOS
drop   ( n- )     Drop the TOS
swap   ( ab-ba )  Exchange the TOS and NOS
>r     ( n- )     Push TOS to return stack, DROP
r>     ( -n )     Pop TORS to the data stack
depth  ( -n )     Return the number of items on the stack
Related Words
nip    ( xy-y )     Remove the second item on the stack
rot    ( abc-bca )  Rotate top three values on stack
-rot   ( abc-cab )  Rotate top three values on stack twice
over   ( xy-xyx )   Put a copy of NOS above the TOS
tuck   ( xy-yxy )   Put a copy of TOS under NOS
2dup   ( xy-xyxy )  Duplicate the top two items on the stack
2drop  ( xy- )      Drop TOS and NOS
reset  ( *- )       Drop all items on the stack
r@     ( -x )       Get a copy of the top item on the return stack
strings
Strings are sequences of characters. Specifically, a string is any sequence of ASCII characters (each occupying char-size bytes) ending with the literal value 0.
Strings are created by parsing, or can be manually constructed as an array. For example the following two are functionally identical. Note that the array form needs six elements: five characters plus the terminating 0.
" hello" is-data hello
hello type cr

6 chars is-array hello
char: h 0 hello array.putChar
char: e 1 hello array.putChar
char: l 2 hello array.putChar
char: l 3 hello array.putChar
char: o 4 hello array.putChar
0 5 hello array.putChar
hello type cr
Since all strings are arrays, you can also manipulate individual elements in a string:
" hello" is-data hello
hello type cr
#! Now change the lowercase 'h' to uppercase.
char: H 0 hello array.putChar
hello type cr
Tips
- char-size is normally equal to 1, corresponding to one byte.
- It is possible (though not likely) to have a char-size larger than one byte.
- Use c@ and c! when manipulating char-size elements.
- Do not use @ and ! (which are for cell-size elements).
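The layout of a zero-terminated string can be modelled with a Python bytearray. This is a sketch, assuming a char-size of one byte; the bounds check is an addition of the model (Toka leaves it to you to stay inside the array), and the helper names are invented for the example.

```python
def array_put_char(buf, index, value):
    """Store a character code at index, refusing to write outside buf."""
    if not 0 <= index < len(buf):
        raise IndexError("index %d is outside an array of %d chars" % (index, len(buf)))
    buf[index] = value

def type_string(buf):
    """Return the characters up to, but not including, the terminating 0."""
    end = buf.index(0)
    return buf[:end].decode("ascii")

# Five characters plus the terminating 0 need six elements.
hello = bytearray(6)
for i, ch in enumerate("hello"):
    array_put_char(hello, i, ord(ch))
array_put_char(hello, 5, 0)
print(type_string(hello))

# Change the lowercase 'h' to uppercase in place.
array_put_char(hello, 0, ord("H"))
print(type_string(hello))
```

An array of only five characters would have no room for the terminator at index 5.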
threading model
Toka relies on call threading. This is an implementation technique in which a list of addresses is compiled. Each of these is then called in sequence. The model is similar to direct threading in the Forth world.
There is one special address which marks the end of a threaded sequence. This address is 0.
As an example of this, consider the following quote:
[ a b c 1 2 + . ]
This is compiled into a list of addresses:
a, b, c, lit, 1, lit, 2, +, ., 0
Note the special form for numbers. The lit() function pushes the value in the following cell to the stack.
When this quote is invoked, a will be called. After it finishes executing, b will be called, and so on until 0 is encountered. At that point, the vm_run() function exits.
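The following Python sketch models this scheme for the example quote. It is not vm_run(): Python functions stand in for addresses, a list index stands in for the instruction pointer, and the placeholder words a, b, and c simply record that they were called. All helper names are assumptions made for the illustration.

```python
def run(thread, stack, out):
    """Call each entry in turn until the 0 end marker is reached."""
    ip = 0
    while thread[ip] != 0:
        op = thread[ip]
        ip += 1
        ip = op(thread, ip, stack, out)   # each word returns the next ip

def lit(thread, ip, stack, out):
    """Push the value in the following cell, then skip over it."""
    stack.append(thread[ip])
    return ip + 1

def add(thread, ip, stack, out):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)
    return ip

def dot(thread, ip, stack, out):
    out.append(stack.pop())
    return ip

def named(name):
    """Build a placeholder word that only records its own name."""
    def word(thread, ip, stack, out):
        out.append(name)
        return ip
    return word

a, b, c = named("a"), named("b"), named("c")

# [ a b c 1 2 + . ]
thread = [a, b, c, lit, 1, lit, 2, add, dot, 0]
stack, out = [], []
run(thread, stack, out)
print(out)
```

Because lit() steps past its operand, a literal value of 0 is never mistaken for the end marker in this model.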
toka.c
Use
Setup and call the interpreter.
Functions Provided
main() The main entry point into Toka. Calls the functions needed to setup the initial dictionary and calls interpret().
Primitives Provided
None
TokaDoc
Introduction
To help keep an up-to-date list of C functions and their Toka equivalents, the TokaDoc scripts were developed. These search C sources for specially formatted comments and extract them into various files. These files can be updated when building.
Supporting TokaDoc
At the start of each C file, have a comment header like this:
/*
 *|F|
 *|F| FILE: filename
 *|F|
 */
Before any global variables, have a comment block like this:
/*
 *|F| Variables:
 *|F|   GCITEM gc_list[128]
 *|F|   Holds the list of items marked as garbage.
 *|F|
 */
Before each function have a comment block like this:
/*
 *|F| <return type> functionname(<arguments>)
 *|F|   description of the function
 *|F|
 */
And for any functions having a Toka equivalent, have a block like:
/*
 *|G| wordname   stack-effect   description
 */
Please try to keep columns lined up as much as possible for Toka wordlist comments.
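To illustrate the idea of extracting these comments, here is a minimal Python sketch. It is not the TokaDoc scripts themselves, whose details are not shown on this page; it only assumes that every documentation line begins (after leading whitespace) with the *|F| or *|G| marker shown above. The function name extract_tokadoc is hypothetical.

```python
def extract_tokadoc(source):
    """Collect the text following *|F| and *|G| markers in C source text."""
    found = {"F": [], "G": []}
    for line in source.splitlines():
        stripped = line.strip()
        for tag in found:
            marker = "*|" + tag + "|"
            if stripped.startswith(marker):
                found[tag].append(stripped[len(marker):].strip())
    return found

sample = """/*
 *|F| void add()
 *|F|   Add TOS to NOS
 *|G| +   ( ab-c )   Add TOS and NOS
 */
void add() { }
"""

docs = extract_tokadoc(sample)
print(docs["F"])
print(docs["G"])
```

Function descriptions end up under F and Toka wordlist entries under G, ready to be written to separate files.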
Types
Toka has one data type, called the cell. This is a machine dependent sized memory area that can hold a single number or pointer. On 32-bit systems, cells are 4 bytes in length, and on 64-bit systems, they take 8 bytes. A constant, cell-size, returns the exact length provided by Toka on your system.
Cells can hold numbers, pointers to allocated memory, and pointers to quotes. When Toka encounters a number or a pointer, it is placed on the stack. For more permanent storage, you can store cell values into memory locations for later use.
This is where a number of abstractions arise. Memory allocated for storage of a single value is called a variable. Memory allocated for a sequence of values is called an array. A special class of array, used for sequences of characters, is called a string.
In all of these, when you reference them, a pointer is left on the stack. Pointers are simply numbers corresponding to an actual memory address. It is up to you to know what abstract data type a pointer represents.
When dealing with variables, you can use @ (fetch) and ! (store) to set and obtain the values stored in them. If you need to fetch or store single characters (as may arise when manipulating strings), you can use c@ and c! instead.
Arrays have an entire abstracted set of words for dealing with them. It's well worth learning to use these, as they are portable, and make accessing and setting individual elements in an array trivial.
Tips
You are strongly encouraged to use cell-size and char-size instead of hard coding the sizes for data. This helps ensure readability and portability.
UsingTheStack
The Toka language makes use of a stack to pass data between functions (called quotes in toka). Imagine a stack of blocks with numbers on them. You can add or remove numbers from the top of the stack. You can also rearrange the order of the numbers.
The stack is initially empty. Let's start by putting some numbers on the stack. Type in:
23 7 9182
Excellent! Now print the number on top of the stack using ., which is pronounced "dot". This is a hard word to write about in a manual because it is just a single period.
Enter:
.
You should see the last number you entered, 9182, printed. Each time . is used, the top element on the stack is lost. If you want to see what is on the stack, you can use :stack. Try this:
:stack
You should see:
<2> 23 7
The number on the left, enclosed in brackets, is the number of items on the stack. The number to the far right is the top of the stack, or TOS. It should be mentioned that :stack leaves the stack unchanged.
Since Toka uses the stack to hold data being operated on, and it uses the stack to pass data between quotes, it is very important to practice using it. Quotes generally take what they need off of the stack, and put their results back on it. To help understand exactly what each quote consumes and leaves, we use stack diagrams. As an example:
. ( x- )
That is to say, . takes one word off the stack (the 'x') and puts nothing on the stack. In other words, it consumes the TOS.
In the examples that follow, you do not need to type in the comments. When you are programming, of course, liberal use of comments and stack diagrams may make your code more readable and maintainable.
Between examples, you may wish to clear the stack. If you enter reset, the stack will be cleared. Since the stack is central to Toka, it is important to be able to alter it easily. Let's look at some more functions that manipulate the stack. Enter:
reset 777 dup :stack
You will notice that there are two copies of 777 on the stack. The quote dup duplicates TOS. This is useful when you want to use the TOS and still have a copy. The stack diagram for dup would be:
dup ( x-xx )
Another useful quote is swap. Enter:
reset 23 7 :stack swap :stack
The stack should look like:
<2> 7 23
The stack diagram for swap would be:
swap ( xy-yx )
Now enter:
over :stack
You should see:
<3> 7 23 7
over causes a copy of the second item on the stack to leapfrog over the first. Its stack diagram would be:
over ( xy-xyx )
Here is another commonly used function:
drop ( x- )
Can you guess what we will see if we enter:
drop :stack
Another handy function for manipulating the stack is rot (short for rotate). Enter:
11 22 33 44 :stack rot :stack
The stack diagram for rot is, therefore:
rot ( xyz-yzx )
You have now learned the more important stack manipulation words. These will be present in almost every non-trivial Toka program. I will say that if you see heavy use of these words in your code, you may want to examine and reorganize (refactor) your code. Use of variables and arrays (which we will discuss later) can also help clean things up.
Here are stack diagrams for some other useful stack manipulation functions. Try experimenting with them by putting numbers on the stack and calling them to get a feel for what they do. Again, the text in parentheses is just a comment and need not be entered.
2drop  ( xyz-x )
2dup   ( xy-xyxy )
nip    ( xyz-xz )
tuck   ( xy-yxy )
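If you would like to check your understanding of these words away from Toka, the stack diagrams can be modelled with a Python list whose last element is the TOS. This is only a sketch of the stack effects described in this section, with function names chosen to mirror the Toka words.

```python
# Each function edits the list in place; the last element is the TOS.

def dup(s):  s.append(s[-1])               # ( x-xx )
def drop(s): s.pop()                       # ( x- )
def swap(s): s[-1], s[-2] = s[-2], s[-1]   # ( xy-yx )
def over(s): s.append(s[-2])               # ( xy-xyx )
def rot(s):  s.append(s.pop(-3))           # ( xyz-yzx )
def nip(s):  del s[-2]                     # ( xy-y )
def tuck(s): s.insert(-2, s[-1])           # ( xy-yxy )

stack = [23, 7]
swap(stack)
print(stack)     # the stack after swap
over(stack)
print(stack)     # the stack after over
drop(stack)
stack += [11, 22, 33, 44]
rot(stack)
print(stack)     # the stack after rot
```

The three printed stacks correspond to the swap, over, and rot steps of the walkthrough above.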
Values
Values are a special form of variable that may be more readable in many situations.
Example:
value foo
foo .
100 to foo
foo .
Variables
To hold data for longer periods of time than is practical with the stack, variables can be used. Variables are pointers to memory locations large enough to hold a number. There are two primary ways to create variables:
variable foo
variable bar
variable baz
The above would create three new variables, named foo, bar, and baz. When creating multiple variables, it is more readable to use variable| though:
variable| foo bar baz |
You can use @ (fetch) and ! (store) to read and alter the contents of a variable:
variable foo
100 foo !
foo @ .
For reading/writing character-sized values, c@ and c! are also provided. A full list of functions for working with variables follows:
variable   ( "- )   Parse ahead and create a named entry corresponding to a memory location
variable|  ( |- )   Parse and create variables until | is encountered.
@          ( a-n )  Fetch the value from variable 'a'
!          ( na- )  Store 'n' to variable 'a'
c@         ( a-n )  Fetch a byte from variable 'a'
c!         ( na- )  Store byte 'n' to variable 'a'
vm.c
Use
Implements the heart of the virtual machine.
Functions Provided
Variables:
  Inst *heap
    Pointer into the current heap
  Inst *ip
    The instruction pointer
  long stack[MAX_DATA_SIZE], rstack[MAX_RETURN_STACK]
    The data and return stacks
  long sp, rsp
    The stack pointers
vm_run(Inst)
  Run through a list of instructions
  Side effects: modifies *ip
vm_stack_check()
  Check for over/underflow and reset if detected
  If the return stack over/underflows, exit Toka
push(long a)
  Push a number to the stack.
lit()
  Push the value in the following memory location to the stack
string_lit()
  Push the pointer in the following memory location to the stack. This is a helper function for strings.
Primitives Provided
heap ( -a ) Variable pointing to the top of the local heap
WordClasses
Word classes are an implementation strategy adopted from HelFORTH. Each quote, primitive, and data structure has a class assigned to it. These classes are aware of the current compiler state (on or off), and may be aware of other aspects of the Toka system as well.
When an item is found, the corresponding class handler is invoked. The primary classes you will encounter in Toka are:
is
This is the most common class.
- Compiling: compile a call to the quote.
- Interpreting: invoke the quote.
is-macro
A special case, these are used for quotes that need to be invoked whenever they are encountered. Macro class is used for creating strings (via ") and symbolic creation of characters (via char:).
- Compiling: invoke the quote.
- Interpreting: invoke the quote.
is-data
This is the second most common class. It is used for all data structures, including variables, arrays, and strings.
- Compiling: compile the value into the quote.
- Interpreting: leave the value on the stack.
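The three classes can be pictured as handlers that receive an item and the current compiler state. The Python sketch below models only the behaviour listed above; the handler names, the tuple format used for compiled code, and the way the compiler state is passed in are all assumptions made for the illustration.

```python
def class_is(xt, compiling, stack, code):
    """is: compile a call while compiling, otherwise invoke the quote."""
    if compiling:
        code.append(("call", xt))
    else:
        xt(stack)

def class_macro(xt, compiling, stack, code):
    """is-macro: invoke the quote in either state."""
    xt(stack)

def class_data(value, compiling, stack, code):
    """is-data: compile the value while compiling, otherwise push it."""
    if compiling:
        code.append(("lit", value))
    else:
        stack.append(value)

def push_seven(stack):
    stack.append(7)

stack, code = [], []
class_is(push_seven, False, stack, code)     # interpreting: runs now
class_is(push_seven, True, stack, code)      # compiling: call is recorded
class_macro(push_seven, True, stack, code)   # macro: runs even when compiling
class_data(100, True, stack, code)           # data: value is compiled
class_data(100, False, stack, code)          # data: value is pushed
print(stack)
print(len(code))
```

Keeping the state check inside each class handler means the interpreter loop only has to look an item up and hand it to its class.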
Words and Their Uses
Primitives
These are words that are built into the Toka executable. If the bootstrap.toka can not be found, these are the only words that will be provided.
<< ( ab-c ) Shift 'a' left by 'b' bits
>> ( ab-c ) Shift 'a' right by 'b' bits
and ( ab-c ) Perform a bitwise AND
or ( ab-c ) Perform a bitwise OR
xor ( ab-c ) Perform a bitwise XOR
#args ( -n ) Return the number of arguments
arglist ( -a ) Return a pointer to the argument list.
< ( ab-f ) Compare 'a' and 'b', return a flag
> ( ab-f ) Compare 'a' and 'b', return a flag
= ( ab-f ) Compare 'a' and 'b', return a flag
<> ( ab-f ) Compare 'a' and 'b', return a flag
. ( n- ) Display the TOS
emit ( c- ) Display the ASCII character for TOS
type ( a- ) Display a string
bye ( - ) Quit Toka
# ( n- ) Push the following cell to the stack.
$# ( n- ) Push the following cell to the stack.
@ ( a-n ) Fetch the value in memory location 'a'
! ( na- ) Store 'n' to memory location 'a'
c@ ( a-n ) Fetch a byte from memory location 'a'
c! ( na- ) Store byte 'n' to memory location 'a'
copy ( sdc- ) Copy 'c' bytes from 's' to 'd'
cell-size ( -n ) Return the size of a cell
char-size ( -n ) Return the size of a char
:stack ( - ) Display all values on the data stack
:stat ( - ) Display information about the virtual machine status
:see ( "- ) Decompile the specified quote
last ( -a ) Variable holding a pointer to the most recent dictionary entry number
is ( a"- ) Attach a name to a quote
   ( a$- ) Non-parsing form
is-macro ( a"- ) Attach a name to a quote
         ( a$- ) Non-parsing form
is-data ( a"- ) Attach a name to data memory
        ( a$- ) Non-parsing form
` ( "-a ) Return a quote corresponding to the specified word.
  ( $-a ) Non-parsing form
:name ( n-$ ) Return the name for a dictionary entry
:xt ( n-a ) Return the address of a dictionary entry
:class ( n-n ) Return the class # for a dictionary entry
from ( "- ) Set the library to import from
     ( $- ) Non-parsing form
import ( n"- ) Import a function taking 'n' arguments.
       ( n$- ) Non-parsing form
as ( "- ) Rename the last defined word
   ( $- ) Non-parsing form
file.open ( $m-n ) Open a specified file with the specified mode.
file.close ( n- ) Close the specified file handle
file.read ( nbl-r ) Read 'l' bytes into buffer 'b' from file handle 'n'. Returns the number of bytes read.
file.write ( nbl-w ) Write 'l' bytes from buffer 'b' to file handle 'n'. Returns the number of bytes written.
file.size ( n-s ) Return the size (in bytes) of the specified file.
file.seek ( nom-a ) Seek a new position in the file. Valid modes are START, CURRENT, and END. These have values of 1, 2, and 3.
file.pos ( n-a ) Return a pointer to the current offset into the file.
keep ( a-a ) Mark quotes/allocated memory as permanent.
gc ( - ) Clean the garbage
malloc ( n-a ) Allocate 'n' bytes of memory
heap ( -a ) Variable pointing to the top of the local heap
compiler ( -a ) Variable holding the compiler state
+ ( ab-c ) Add TOS and NOS
- ( ab-c ) Subtract TOS from NOS
* ( ab-c ) Multiply TOS by NOS
/mod ( ab-cd ) Divide and get remainder
base ( -a ) Variable containing the current numeric base
parser ( -a ) Variable holding current parser mode.
escape-sequences ( -a ) Variable determining if escape sequences are used.
>number ( a-nf ) Attempt to convert a string to a number
>string ( n-a ) Convert a number to a string
parse ( d-a ) Parse until the character represented by 'd' is found. Return a pointer to the string
include ( "- ) Attempt to open a file and add it to the input stack.
        ( $- ) Non-parsing form
needs ( "- ) Attempt to include a file from the library (normally /usr/share/toka/library)
      ( $- ) Non-parsing form
end. ( - ) Remove the current file from the input stack
[ ( -a ) Create a new quote
] ( - ) Close an open quote
invoke ( a- ) Execute a quote
compile ( a- ) Compile a call to the quote
countedLoop ( ulq- ) Execute a quote a set number of times, updating the 'i' counter each time.
ifTrueFalse ( fab- ) Invoke 'a' if 'f' flag is true, 'b' if false.
recurse ( - ) Compile a call to the top quote.
i ( -n ) Return the current loop index
whileTrue ( a- ) Execute quote. If the quote returns TRUE, execute again. Otherwise, end the cycle.
whileFalse ( a- ) Execute quote. If the quote returns FALSE, execute again. Otherwise, end the cycle.
dup ( n-nn ) Duplicate the TOS
drop ( n- ) Drop the TOS
swap ( ab-ba ) Exchange the TOS and NOS
>r ( n- ) Push TOS to return stack, DROP
r> ( -n ) Pop TORS to the data stack
depth ( -n ) Return the number of items on the stack
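Example
The stack comments above read as "before-after". As a rough illustration, here is a short sketch that uses a few of the primitives together. It assumes the standard bootstrap is loaded (the #! comments come from it). The name square and the literal values are made up for this example, and the sketch has not been checked against a particular Toka build.

```toka
#! Name a quote with 'is', then call it like any other word
[ dup * ] is square
7 square .

#! A quote left on the stack can be run directly with 'invoke'
5 [ 1 + ] invoke .

#! ifTrueFalse ( fab- ) picks one of two quotes based on a flag
3 4 < [ 1 . ] [ 0 . ] ifTrueFalse

#! countedLoop ( ulq- ) takes an upper limit, a lower limit, and a quote;
#! 'i' returns the current loop index inside the quote
10 0 [ i . ] countedLoop
```

Each line leaves the data stack empty when it finishes, which can be confirmed by running :stack afterwards.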
Bootstrap
These are additional words, provided in bootstrap.toka. They significantly expand the core language.
VERSION ( -n ) Return a number representing the current version of Toka.
#! ( "- ) Parse to the end of the line and discard the results.
( ( "- ) Parse until ) is found and discard the results
.PRIM_WORD ( -n ) Class # for primitive words
.PRIM_MACRO ( -n ) Class # for primitive macros
.DATA ( -n ) Class # for data elements
.WORD ( -n ) Class # for quote words
.MACRO ( -n ) Class # for quote macros
SPACE ( -n ) ASCII value for SPACE character
CR ( -n ) ASCII value for CR character
LF ( -n ) ASCII value for LF character
ESC ( -n ) ASCII value for ESC character
TAB ( -n ) ASCII value for TAB character
wsparse ( -a ) Parse until a SPACE is encountered
lnparse ( -a ) Parse to the end of the line, leave the resulting string on the stack.
FALSE ( -f ) Value returned for FALSE
TRUE ( -f ) Value returned for TRUE
ifTrue ( fq- ) Execute quote ('q') if flag ('f') is TRUE
ifFalse ( fq- ) Execute quote ('q') if flag ('f') is FALSE
>char ( n-c ) Convert the value on TOS to a single character
char: ( "-c ) Parse ahead and return one character
" ( "-$ ) Parse until " is encountered and return a string
cr ( - ) Display a CR character
space ( - ) Display a space
tab ( - ) Display a tab
." ( "- ) Parse to the next " and display the string.
clear ( - ) VT100: Clear the screen
normal ( - ) VT100: Set the colors back to the default
nip ( xy-y ) Remove the second item on the stack
rot ( abc-bca ) Rotate top three values on stack
-rot ( abc-cab ) Rotate top three values on stack twice
over ( xy-xyx ) Put a copy of NOS above the TOS
tuck ( xy-yxy ) Put a copy of TOS under NOS
2dup ( xy-xyxy ) Duplicate the top two items on the stack
2drop ( xy- ) Drop TOS and NOS
reset ( *- ) Drop all items on the stack
r@ ( -x ) Get a copy of the top item on the return stack
negate ( x-y ) Invert the sign of TOS
/ ( xy-z ) Divide two numbers
mod ( xy-z ) Divide two numbers and get remainder
not ( x-y ) Invert the value 'x'
*/ ( abc-d ) (a*b)/c
chars ( x-y ) Multiply TOS by char-size. Useful with arrays
char+ ( x-y ) Increase TOS by char-size
char- ( x-y ) Decrease TOS by char-size
cells ( x-y ) Multiply TOS by cell-size. Useful with arrays
cell+ ( x-y ) Increase TOS by cell-size
cell- ( x-y ) Decrease TOS by cell-size
+! ( xa- ) Add 'x' to the value in address 'a'
-! ( xa- ) Subtract 'x' from the value in address 'a'
on ( a- ) Set a variable to TRUE
off ( a- ) Set a variable to FALSE
toggle ( a- ) Toggle a variable between TRUE and FALSE
variable ( "- ) Create a variable
variable| ( "- ) Create multiple variables
hex ( - ) Set the base to hexadecimal (16)
decimal ( - ) Set the base to decimal (10)
binary ( - ) Set the base to binary (2)
octal ( - ) Set the base to octal (8)
"R" ( -x ) Mode for file.open
"R+" ( -x ) Mode for file.open
"W" ( -x ) Mode for file.open
"W+" ( -x ) Mode for file.open
"A" ( -x ) Mode for file.open
"A+" ( -x ) Mode for file.open
START ( -x ) Mode for file.seek
CURRENT ( -x ) Mode for file.seek
END ( -x ) Mode for file.seek
file.slurp ( $-a ) Read a file into a dynamically allocated buffer
is-array ( n"- ) Create an array of size 'n'
array.get ( ia-n ) Get element 'i' from array 'a'
array.put ( nia- ) Put value 'n' into element 'i' of array 'a'
array.getChar ( ia-n ) Get char-size element 'i' from array 'a'
array.putChar ( nia- ) Put char-size value 'n' into element 'i' of array 'a'
<list> ( -a ) Stores a list of pointers used by { and }
{ ( - ) Start a scoped area
} ( - ) End a scoped area
+action ( aq"- ) Create a new word (") with the action of the specified quote (q) and data element (a)
value ( "- ) Create a new value
to ( - ) Set the value of a value
value| ( "- ) Create multiple values
string.grow ( $n-$ ) Increase the physical size of a string by 'n' bytes.
string.append ( ab-c ) Append string 'b' to string 'a'
string.compare ( ab-f ) Compare two strings for equality
string.appendChar ( $c-$ ) Append character 'c' to string '$'
string.getLength ( $-$n ) Get the length of a string (up to ASCII 0)
string.clone ( $-$ ) Make an additional copy of a string.
words-within ( n- ) Display all words with a specified class #
:prims ( - ) Display all primitives
:quotes ( - ) Display all named quotes
:datas ( - ) Display all named data items
words ( - ) Display all names in the dictionary
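Example
To show how a few of the bootstrap words fit together, here is a minimal sketch built only from words in the list above. The names counter and limit and the values used are hypothetical, chosen for illustration, and the sketch has not been checked against a particular Toka build.

```toka
#! A variable leaves its address on the stack; use ! and @ to store and fetch
variable counter
10 counter !
5 counter +!
counter @ .

#! A value returns its contents directly; 'to' sets it
value limit
100 to limit
limit .

#! " parses up to the closing quote and returns a string; type displays it
" Hello from Toka" type cr

#! ifTrue ( fq- ) runs the quote only when the flag is TRUE
counter @ limit < [ ." still below the limit" cr ] ifTrue
```

Note the space after " and .": both are ordinary words, so they must be separated from the text they parse.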
Notebook exported on Thursday, 13 September 2007, 16:51:20 PM EDT