Monday, August 29, 2011

Smallworld Technical Paper No. 5 - An Overview of Smallworld Magik

by Arthur Chance, Richard G. Newell & David G. Theriault

Introduction

Smallworld Magik is an extremely powerful language for the implementation of large interactive systems. The language is a hybrid of the procedural and object oriented approaches and program development is carried out in an interactive environment.

It would be pertinent for the reader to ask why a small company like Smallworld would devote considerable resources to the creation of such a development environment. Are not the common languages available on the market today adequate for the job? The simple answer to the second question is no, and it is the purpose of this paper to try to give an answer to the first.

Background

Traditionally, large interactive systems were developed in a non-interactive procedural language, such as Fortran or C. In order that an end user could drive such a system, an interactive command language was provided so that the user could type in his commands. Many command languages evolved into programming languages usually by borrowing the programming concepts of Basic. In more modern user interfaces, this command language may well be hidden behind a system of screen menus, tablet menus or other input devices. Large modular systems were glued together using an operating system command language or script. Within the system, developers and customisers had a number of other languages available to them to define such things as syntax, menus, data descriptions, graphic codes, etc, all running as different processes communicating by files.

The structure of such systems was commonly organized at the highest level around the command syntax, and complex commands were structured in a top down approach from there.

If one examines systems that have been put together in this manner over the last five to ten years they all suffer a number of difficulties:

  • Development is slow. Users' requests for enhancement have to wait for the next release, which is usually over a year away.
  • They are difficult and expensive to maintain. During the life cycle of the system, probably 90% of development goes into maintenance.
  • Major restructuring in the light of five or ten years of hindsight is unthinkable.
  • Customisation is arbitrarily done in one or more of the many languages used to put the system together, typically Fortran or the command language.
  • Integration with other systems is nearly impossible.

Smallworld has implemented Magik in order to avoid all these difficulties. The way this is achieved is by embodying the following features in the language and its development environment:

  • There is but one language for system, application and customisation development.
  • Both object orientation and procedural methodologies are supported.
  • Development is in an interactive environment.
  • The language is expressive and very readable.
  • There is an extensive library of standard object classes, methods and procedures.
  • The language is built as a platform suitable for delivering commercial systems.
  • Applications can be transferred with a minimum of effort between hardware platforms.

It is Smallworld's belief that the presence of all these features is essential if commercial systems are to be developed, maintained and customised with a minimum of programmer effort. It is the lack of a viable language with a sufficient subset of these facilities that has stimulated Smallworld to produce its own which embodies all of them.

Magik allows programs to be developed in one seamless environment, meaning that systems programming, applications development, system integration, and customisation are all written in one environment in the same language. Thus, end users who wish to customise the system can be confident in the quality of the tools provided because they are identical to the development tools used by the core and application system developers. Further, existing systems, such as most database management systems, can be fully integrated so that to the user they appear as part of one homogeneous system.

We have already started to use some of the jargon of object orientation, and in order to understand Magik, it will be necessary for us to try and explain object orientation and a number of the other terms used to explain it.

Object Orientation

Object orientation is not an easy concept to explain, and we are not likely to succeed fully here. However, suffice it to say that experienced computer industry observers are in no doubt as to the power of the technique, and indeed are convinced that within the next few years object orientation will be the dominant approach to structuring and building large complex systems. The argument is subtle, but its benefits are profound.

Object orientation refers to a way of structuring software systems; thus people who refer to object oriented databases may be misusing the term, as it is not clear (at least to this writer) what they mean. Put at its most terse, object orientation structures software around the things being processed, not around the functions being performed. To convey some of the flavour of object oriented programming, let us describe some of the common terminology surrounding it.

Object

An object comprises two things, its own state (manifested as a set of instance variables) which no other part of the system can access directly and a set of procedures (called methods) which describe its behaviour. Everything about an object is encapsulated within it and the only way of getting data out of it, or changing it, or getting it to do something is by sending messages to it. An object is a rather sophisticated extension of the concept of variable in other languages. The fact that data is hidden and that the only way of communicating with an object is via a rigorously defined system of message passing (see below) means that extremely robust systems can be created.

Class

Every object belongs to a unique class. Some classes are primitive such as real numbers, integers and text strings. A class is very similar to a type in normal procedural languages. All objects of a given class will exhibit the same behaviour, i.e., have the same set of methods. In an object oriented language, the programmer himself defines new classes in order that he can later make instances of them. In order to be more precise here, we should say that Magik is based on "exemplars" in that a new class is defined by having a special way of defining the first instance of it, i.e. an example of it. From then on all further instances are cloned from the exemplar.
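As a rough illustration of the exemplar idea, here is a sketch in Python rather than Magik. The class name Point and its method new are hypothetical, chosen only to mirror the "point.new(10,20)" expression used later in this paper; the sketch makes no claim about how Magik implements cloning internally.

```python
import copy


class Point:
    """Hypothetical class used only to illustrate exemplar-style cloning."""

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def new(self, x, y):
        # Clone the receiver (the exemplar), then initialise the copy.
        clone = copy.copy(self)
        clone.x = x
        clone.y = y
        return clone


# The exemplar: the first instance, from which further instances are cloned.
point = Point()
p3 = point.new(10, 20)
```

The exemplar itself is left unchanged by the call; only the clone receives the new coordinates.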

Method

A method is a procedure, no more, no less. The reason a different term is used is because of the way in which it is called. In a normal procedural language if one sees a reference to a procedure, then it refers unambiguously to a particular piece of code. However a reference to a method in a message (see below) may well refer to any one of many pieces of code depending on the class of the object receiving the message. Further, it will not be possible to ascertain which method will be executed until run time. What this means is that routines can be written which do not depend on the type of the data thrown at them and therefore the same code can be reused over and over again in different contexts. Such code written without a knowledge of what it is to be used for is sometimes referred to as a software IC.
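The run-time selection of a method can be sketched as follows, again in Python rather than Magik. The classes Circle and Rectangle and the routine total_area are hypothetical names introduced for illustration. The point is that total_area is written without knowing which classes it will later be given.

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Rectangle:
    def __init__(self, width, length):
        self.width = width
        self.length = length

    def area(self):
        return self.width * self.length


def total_area(shapes):
    # "area" is resolved at run time against the class of each receiver,
    # so this routine works for any object that answers the message.
    return sum(shape.area() for shape in shapes)
```

A call such as total_area([Circle(1), Rectangle(2, 3)]) executes two different pieces of code for the same message name.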

Inheritance

Quite frequently, different classes might be similar, in that one might exhibit all the behaviour of another plus some additional behaviour. In other words, one class may share a number of another class's methods, but also have a different version of some methods and some new methods of its own. In these cases, one class can be defined to be a subclass of another. This is another way of exploiting code reusability. Note that the behaviour of a subclass is not a subset of the behaviour of its superclass, because typically it does things in addition to the superclass. A subclass is a specialization of its superclass in the same sense that a duck-billed platypus is a specialization of a monotreme, which is a specialization of a mammal, which is a specialization of a vertebrate, which belongs to the animal kingdom. A duck-bill is unique to the platypus, whereas egg laying behaviour is common to all monotremes, warm bloodedness is common to all mammals and all vertebrates have a backbone.

Usually, classes are organized in a strict hierarchy, each subclass in the hierarchy inheriting its behaviour (methods) from its superclass. In Magik, the concept of multiple inheritance is also supported (equivalent to a hybrid in our example from the animal kingdom) so that a class may inherit behaviour from any number of superclasses (thus forming a heterarchy). Some objects are special in that their only function is to supply behaviour to other classes, and it is not possible to create an object instance from them; these are called "mixins".

The point of the class heterarchy is to try and maximize code reusability.
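The animal kingdom example can be sketched in Python (not Magik) as follows. All class and method names are hypothetical, and SwimmerMixin is an invented mixin added only to show behaviour being supplied from a second parent. Note one difference from the text above: Python treats a mixin as a convention and does not itself forbid creating an instance of one.

```python
class Vertebrate:
    def has_backbone(self):
        return True


class Mammal(Vertebrate):
    def is_warm_blooded(self):
        return True


class Monotreme(Mammal):
    def lays_eggs(self):
        return True


class SwimmerMixin:
    """Supplies behaviour only; by convention it is never instantiated alone."""

    def swim(self):
        return f"{type(self).__name__} swims"


class Platypus(Monotreme, SwimmerMixin):
    # Behaviour unique to this subclass, in addition to everything inherited.
    def has_duck_bill(self):
        return True


platypus = Platypus()
```

The platypus object answers every message defined by its ancestors as well as its own, which is why a subclass has more behaviour than its superclass, not less.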

Message Passing

Message passing in an object oriented language means exactly the same thing as a procedure call in a procedural language. "Message expression" is the term used to describe how a message is sent. A message expression comprises three parts: an object, a message name, and a list of zero or more parameters, which are themselves objects.

In Magik a message expression would look like:

p3 << point.new(10,20)

Here a new object "p3" is manufactured by taking the object "point", sending it the message "new" with the parameter objects 10 and 20 (note: "<<" is the Magik assignment operator).

x3 << p3.x      # this makes the object "x3" by sending the message "x" to "p3" (thus x3 becomes 10)
p3.y << 50      # this could be regarded as changing p3 by sending the message "y<<" to "p3" with parameter 50
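The three parts of a message expression (receiver, message name, parameters) can be made explicit with a small Python sketch. The helper send and the method moved_by are hypothetical names introduced here for illustration and are not part of Magik.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def moved_by(self, dx, dy):
        # Answer a new point; the receiver itself is not changed.
        return Point(self.x + dx, self.y + dy)


def send(receiver, message_name, *parameters):
    # Look the message name up on the receiver at run time,
    # then invoke the method found with the parameter objects.
    method = getattr(receiver, message_name)
    return method(*parameters)


p3 = Point(10, 20)
p4 = send(p3, "moved_by", 5, -5)
```

Written this way, send(p3, "moved_by", 5, -5) does the same work as the ordinary call p3.moved_by(5, -5), which is the sense in which a message expression resembles a procedure call.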

Magik - a Hybrid Language

We have said in the introduction to this paper that Magik is a hybrid language. This is because whereas object orientation is the preferred method of organizing large systems (programming in the large), it can be cumbersome and contorted for writing certain kinds of routines, especially pure functions. Such routines are usually not very large and are reasonably self contained. It is therefore preferable to write such parts of the system using a conventional procedural approach, as is often the case for small programs (programming in the small).

In Magik, object oriented code can be freely mixed with procedural code, and indeed the message expression has been made to look rather like a procedure call where the procedure name can be thought of as a concatenation of the object class with the message.

Interactive Development Environment

At least as important as the power of a language is the nature of the development environment provided for programmers. Progress with conventional system programming languages is considerably slowed by having to re-link every time a new piece of code needs to be tested. If one provides programmers with an interactive language with powerful tools to explore and discover existing code in the extensive library of standard objects, then programmer productivity is improved by a large factor, regardless of the quality of the language itself. In Magik, the development environment and code browsing facilities are inspired by Smalltalk.

The inheritance relationships for the objects in the Magik environment can be complex but inheritance diagrams can be generated automatically. Also, the implementing class of any particular method can be easily identified. The annotated source code for most of the environment is available online. There are code browsers and object inspectors; the system can locate methods and variables given only part of the name. Debugging tools include tracing of calls and access to global variables. It is possible to fix and continue after an error.

Magik - a Readable Language

Some popular languages, although very powerful, are extremely difficult to read. Occasionally language constructions are so cryptic that even the original author has difficulty deciphering the code that he has written, let alone passing it to anyone else. The same criticism might also be levelled at very low level languages such as most assemblers. In order that large systems can be maintained, it is essential that the language syntax is designed so that the majority of the programming population does not get immediate dyslexia. As far as procedural programming is concerned, Magik has been designed to fit in with the Algol style of syntax, so that conditionals, procedure calls, expressions and loops will all appear familiar and readable. The object oriented facilities such as method definitions and message expressions have been designed to fit in with the Algol style.

Library of Object Classes, Methods and Procedures

At the heart of the programming environment is a variety of useful classes that can be used directly or subclassed. These classes include:

  • Different kinds of collections for grouping objects (sets, association tables, ordered lists, stacks and queues and so on). Some of these collections support relational algebra.
  • Multi-dimensional arrays.
  • Streams of objects or text, either to or from an external channel such as a file, or to and from buffers.
  • A text editor.
  • Application framework and menus for putting together interactive applications. This is in turn built on a library of objects that do graphics.

There are also packages of common mathematical and statistical functions.

Magik - a Delivery Platform

Magik is designed with three strategies specifically aimed at delivery platforms. The first strategy is simply a defensive programming style. The environment and application framework are designed to encourage this and error-handling mechanisms in the programming environment that are intended for debugging can easily be replaced in a delivery version with appropriate error recovery procedures.

A second strategy is to protect application and system code. Online source code can be removed from the delivery version if preferred and method tables can be locked so that methods cannot be changed or removed from the sensitive classes. Global variables can be made constant. If necessary, the compiler can be removed from the platform and users restricted to menus instead.

A third strategy is to separate text and pictures intended for human interpretation from other data wherever possible. This is vital for internationalization so that words and symbols can be translated into different languages.

Appendix - Snippets of Smallworld Magik code

Simple assignments and expressions:

y << x + 1/x
x << (-b + sqrt(b*b-4*a*c))/(2*a)

Multiple assignments. Given two objects a and b, both the following examples swap a and b, regardless of their classes:

(b,a) << (a,b)    # Parallel assignment
a << b ^<< a      # ^<< Boot and becomes

Calling procedures and sending messages:

largest << max(p1, p2, p3)
(red,green,blue) << chair.colour.rgb()

Conditionals:

if chair.height < min then
  report('This chair is too small')
elif chair.height > max then
  report('This chair is too big')
else
  report('This chair is just right')
endif

Loops:

member << for mem over membershiplist.elements() loop
  # statements to be looped over
  if mem.age < 18 then continue abc endif
  # more statements
  if mem.available then leave with mem endif
endloop

Procedures:

quadratic << proc(a,b,c)
  if a = 0
  then
    # linear case: b*x + c = 0
    if b <> 0
    then
      return -c/b
    endif
  else
    discriminant << b*b-4*a*c
    if discriminant < 0
    then
      # no real roots: return nothing
      return
    else
      s << sqrt(discriminant)
      ta << a+a
      return (-b+s)/ta, (-b-s)/ta
    endif
  endif
endproc

# leaves x1 with 3 and x2 with 2
(x1,x2) << quadratic(1,-5,6)

# leaves x1 with -2 and x2 unset
(x1,x2) << quadratic(0,2,4)

# leaves both x1 and x2 unset
(x1,x2) << quadratic(4,1,3)

Returning objects. The normal way to return objects from a procedure or method is to use the return statement.

# returns three objects
return a,b,c

# returns a single list of three objects
return vec(a,b,c)

# Any statement, simple or complex can also return
# objects using the form '>>'
# "a" ends up being 42 or 99
a <<
  if p > q
  then >> 42
  else >> 99
  endif

Iterators. Iterators generate values used in loops. This example uses the fibonacci_numbers iterator to supply values for f.

# The fibonacci series is the series 0, 1, 1, 2, 3, 5, 8, 13 . . .
for f over fibonacci_numbers(10) loop
  # statements
endloop

# The iterator "fibonacci_numbers" could be defined as follows:
fibonacci_numbers << iter proc(n)
  if n < 1 then return endif
  loopbody(0)
  if n < 2 then return endif
  loopbody(1)
  f0 << 0
  f1 << 1
  for i over range(1,n)
  loop
    (f0,f1) << f1,f0+f1
    loopbody(f1)
  endloop
endproc
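For readers more familiar with other languages, the closest everyday analogue of an iterator procedure is a generator. The Python sketch below is an analogy only, not a translation of the Magik above: it assumes that the parameter n is the number of values to produce, which this paper does not state, and each yield plays the role of a loopbody call.

```python
def fibonacci_numbers(n):
    """Yield the first n values of the series 0, 1, 1, 2, 3, 5, 8, 13, ...

    Assumption: n is the count of values wanted.
    """
    f0, f1 = 0, 1
    for _ in range(n):
        # Hand one value to the body of the calling loop, then resume here.
        yield f0
        f0, f1 = f1, f0 + f1


first_eight = list(fibonacci_numbers(8))
```

As with the Magik iterator, the values are produced one at a time on demand, so the calling loop can stop early without the rest of the series being computed.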

Method definition:

method rectangle.new(width,length)
  clone.init(width,length)
endmethod

method rectangle.area
  >> width*length
endmethod

Iterator methods. Iterator methods are very similar to iterators, except that they are defined as part of the behaviour of a particular class.

The call "membershiplist.elements()" under Loops above is an example of the use of an iterator method.

Protected code. The protection code in a protect statement is guaranteed to be executed, even if an error occurs:

input << text_fileinput.new(filename)
protect
  # code to process the file
protection
  input.close()
endprotect
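The same guarantee is offered in many languages by a try/finally construct. The Python sketch below is an analogy rather than Magik; the names process and FailingStream are hypothetical, and FailingStream exists only to force an error so that the clean-up path can be observed.

```python
import io


class FailingStream(io.StringIO):
    """Hypothetical stream whose read always fails, to exercise the error path."""

    def read(self, *args):
        raise ValueError("simulated read error")


def process(stream):
    try:
        # Code to process the file; it may raise an error.
        return stream.read().upper()
    finally:
        # Runs whether or not an error occurred, like the "protection" part.
        stream.close()


good = io.StringIO("magik")
result = process(good)

bad = FailingStream("magik")
try:
    process(bad)
except ValueError:
    error_seen = True
else:
    error_seen = False
```

In both cases the stream ends up closed: after the normal return for the first stream, and after the error for the second.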
