Oberon/Text

From Wikibooks, open books for an open world

Confusingly for a novice, three forms of line ending can be found in text files in contemporary computer systems.

  • In Oberon and some other systems (most prominently classic Mac OS), a line of text ends with the carriage return character, denoted CR.
  • In Unix-like and some other systems, a line of text ends with the line feed character, denoted LF.
  • In DOS, Microsoft Windows and some other systems, a line of text ends with two characters, CR followed by LF.

This multiplicity poses no difficulty in Oberon; most Oberon systems allow convenient editing with any of these line endings.[1]
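
To find out which convention a given file follows, the three cases can be told apart by scanning the bytes of the file. The following is a minimal sketch for ETH Oberon, assuming the Files and Out modules and a plain text file. The module name LineEnds and the command Count are hypothetical, and Example.Text is only a placeholder file name.

 MODULE LineEnds;  (* hypothetical module name *)
   IMPORT Files, Out;

   CONST CR = 0DX; LF = 0AX;

   (* Count CR, LF and CR LF line endings in the file Example.Text. *)
   PROCEDURE Count*;
     VAR f: Files.File; R: Files.Rider; ch: CHAR;
       cr, lf, crlf: LONGINT;
   BEGIN
     cr := 0; lf := 0; crlf := 0;
     f := Files.Old("Example.Text");
     IF f # NIL THEN
       Files.Set(R, f, 0); Files.Read(R, ch);
       WHILE ~R.eof DO
         IF ch = CR THEN
           Files.Read(R, ch);   (* look at the character after CR *)
           IF ~R.eof & (ch = LF) THEN INC(crlf); Files.Read(R, ch)
           ELSE INC(cr)         (* ch is kept for the next iteration *)
           END
         ELSIF ch = LF THEN INC(lf); Files.Read(R, ch)
         ELSE Files.Read(R, ch)
         END
       END
     END;
     Out.String("CR: "); Out.Int(cr, 0);
     Out.String("  LF: "); Out.Int(lf, 0);
     Out.String("  CR LF: "); Out.Int(crlf, 0); Out.Ln
   END Count;

 END LineEnds.

A file in which only one of the three counters is nonzero uses a single convention throughout. A file stored in the Oberon Text format also carries a binary header, so the counts are meaningful only for plain text files.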

Also confusing for a novice, an Oberon Text[2] can appear at first sight to be no more than plain text comprising a sequence of ASCII characters. Nevertheless, Text in Oberon is a type defined in the Texts module. A Text of this type is a sequence of characters, including non-printing characters. A character in a Text can have attributes including typeface, size and color. Furthermore, a Text can include non-character objects, for example an image or a hyperlink. Consequently two displayed documents, one an Oberon Text and the other an HTML text, can have the same appearance and the same behavior of links.

Editing a Text

Each Oberon system has an Edit module. If Edit.Open Example.Text appears almost anywhere on the screen, a middle mouse click (MM) on Edit.Open will open a viewer. If a file named Example.Text exists, its content will appear in the viewer. If no file with that name exists, the viewer will be empty. The caret can be set with a left mouse click (ML), and characters can then be inserted using the keyboard. In addition to Edit, ETH Oberon has ET and the Gadgets subsystem, which provide the commands ET.Open <aFile> and Desktops.OpenDoc <aFile>. ETH Oberon also has the Hex module, a hexadecimal editor opened with the command Hex.Open <aFile>.

Programmatic Access to a Text

A Text can be manipulated using procedures in the Texts module, which is available both in ETH Oberon and in V5. For ETH Oberon, the interface is also summarized in DEFINITION Texts.

To read a Text programmatically, a RECORD of type Reader is opened on the Text at a specified offset. (A Reader is a RECORD, not a procedure.) In this code fragment from ETH Oberon, the reader is named R.

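 (* Requires IMPORT Texts, Out in the enclosing module. *)
 VAR
   T: Texts.Text;
   R: Texts.Reader;
   ch: CHAR;
 BEGIN
   NEW(T);
   Texts.Open(T, "Texts.Mod");    (* load the file Texts.Mod into T *)
   Texts.OpenReader(R, T, 0);     (* position the reader at offset 0 *)
   Texts.Read(R, ch);
   WHILE ~R.eot DO                (* loop until the end of the Text *)
     Out.Char(ch);
     Texts.Read(R, ch)
   END
 END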

After an execution of Read(R, ch), and assuming the object read was a character, the character is available in ch and its attributes are in the fields of R: the font is referenced by R.lib (of type Objects.Library), the color is in R.col and the vertical offset is in R.voff. Each execution of Read(R, ch) advances the offset by one; once the end of the Text is reached, R.eot becomes TRUE.
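
The attributes can be inspected while reading. The following is a minimal sketch for ETH Oberon, assuming IMPORT Texts, Out in the enclosing module and a Text T that has already been opened. The procedure name ShowLooks is hypothetical.

 (* Print the character code and the attributes of each element of T. *)
 PROCEDURE ShowLooks(T: Texts.Text);
   VAR R: Texts.Reader; ch: CHAR;
 BEGIN
   Texts.OpenReader(R, T, 0);
   Texts.Read(R, ch);
   WHILE ~R.eot DO
     Out.Int(ORD(ch), 4);                       (* character code *)
     Out.String("  col = "); Out.Int(R.col, 0);
     Out.String("  voff = "); Out.Int(R.voff, 0);
     IF R.lib # NIL THEN                        (* library (font) name *)
       Out.String("  lib = "); Out.String(R.lib.name)
     END;
     Out.Ln;
     Texts.Read(R, ch)
   END
 END ShowLooks;

The character code is printed rather than the character itself so that line endings and other non-printing characters do not disturb the listing.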

In ETH Oberon a text can contain a non-character object, an image for example. For a non-character object, (R.lib IS Fonts.Font) will be FALSE.
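
The type test above can be used to separate characters from embedded objects. The following is a minimal sketch for ETH Oberon, assuming IMPORT Texts, Fonts in the enclosing module. The procedure name CountObjects is hypothetical.

 (* Return the number of non-character objects in T. *)
 PROCEDURE CountObjects(T: Texts.Text): LONGINT;
   VAR R: Texts.Reader; ch: CHAR; n: LONGINT;
 BEGIN
   n := 0;
   Texts.OpenReader(R, T, 0);
   Texts.Read(R, ch);
   WHILE ~R.eot DO
     (* a library that is not a font holds a non-character object *)
     IF (R.lib # NIL) & ~(R.lib IS Fonts.Font) THEN INC(n) END;
     Texts.Read(R, ch)
   END;
   RETURN n
 END CountObjects;

The guard R.lib # NIL is a precaution added in this sketch, since a type test on a NIL pointer would trap.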

How Text Works ...

These tables show the structure of the records in memory that represent a Text. When Texts.Store writes a Text to a file, the information in the record structure is serialized. In the inverse process, Texts.Load deserializes the information from the file into the record structure of the Text in memory.
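
A round trip through a file might look as follows. This is only a sketch for ETH Oberon, assuming IMPORT Texts, Files in the enclosing module and assuming that Texts.Store takes the parameters (T, f, pos, len) as in the ETH Oberon definition of Texts; other Oberon versions may use a different parameter list. The procedure name SaveAndReload is hypothetical.

 (* Store T under the given name, then read it back into a new Text. *)
 PROCEDURE SaveAndReload(T: Texts.Text; name: ARRAY OF CHAR): Texts.Text;
   VAR f: Files.File; len: LONGINT; T2: Texts.Text;
 BEGIN
   f := Files.New(name);
   Texts.Store(T, f, 0, len);   (* serialize T; len returns the bytes written *)
   Files.Register(f);           (* enter the file in the directory *)
   NEW(T2);
   Texts.Open(T2, name);        (* rebuild the record structure from the file *)
   RETURN T2
 END SaveAndReload;

Texts.Open is used for reading here because it accepts a file name directly; it is assumed to perform the deserialization on behalf of the caller.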

In V2

Texts.Text, a pointer to a Texts.TextDesc

Fields of TextDesc   Types of Fields   Notes
len                  LONGINT           Length of text, in bytes.
changed              BOOLEAN           Flag indicating a revision.
notify               Texts.Notifier    Pointer to a method to notify interested clients of state changes.
trailer              Texts.Piece       Pointer to the Sentinel node in the list of pieces.
pce                  Texts.Piece       Last found piece.
org                  LONGINT           Offset in [0,len) of first character in last found piece.

Texts.Piece, a pointer to a Texts.PieceDesc

Fields of PieceDesc  Types of Fields   Notes
f                    Files.File        Pointer to file.
off                  LONGINT           Integer offset in Text of first character in Piece.
len                  LONGINT           Number of bytes in Piece.
fnt                  Fonts.Font        Pointer to a font.
col                  INTEGER
voff                 INTEGER           Vertical offset of characters in pixels.
prev                 Texts.Piece       Pointer to previous piece of Text.
next                 Texts.Piece       Pointer to next piece of Text.

In ETH Oberon

TextDesc is an extension of Objects.ObjDesc. The fields stamp, dlink, slink, lib, ref and handle are inherited from ObjDesc; the remaining fields are added by TextDesc.

Texts.Text, a pointer to a Texts.TextDesc

Fields of TextDesc   Types of Fields      Notes
stamp                LONGINT              Integer
dlink                Objects.Object       Pointer
slink                Objects.Object       Pointer
lib                  Objects.Library      Pointer
ref                  INTEGER
handle               Objects.Handler      Pointer
len                  LONGINT              Length of text.
obs                  Objects.Library      Pointer
trailer              Texts.Piece          Pointer to Sentinel node in list of pieces.
org                  LONGINT              Offset in [0,len) of first character in last found piece.
pce                  Texts.Piece          Last found piece.

Objects.Library, a pointer to a LibDesc

Fields of LibDesc    Types of Fields      Notes
next                 Objects.Library      Pointer
ind                  Objects.Index        Pointer
f                    Files.File           Pointer
R                    Files.Rider          Record
name                 Objects.Name
dict                 Objects.Dictionary   Pointer
maxref               INTEGER
GName                POINTER

Texts.Piece, a pointer to a Texts.PieceDesc

Fields of PieceDesc  Types of Fields      Notes
f                    Files.File           Pointer
off                  LONGINT              Integer
len                  LONGINT
obj                  Objects.Object       Pointer
lib                  Objects.Library      Pointer
ref                  INTEGER
col                  SHORTINT
voff                 SHORTINT
prev                 Piece                Pointer
next                 Piece                Pointer

In the Oberon Subsystem of A2

TextDesc is an extension of Objects.ObjDesc. The fields stamp, dlink, slink, lib, ref and handle are inherited from ObjDesc; the remaining fields are added by TextDesc.

The type Texts.Text, a pointer to a Texts.TextDesc

Fields of TextDesc   Types of Fields      Notes
stamp                LONGINT              Integer
dlink                Objects.Object       Pointer
slink                Objects.Object       Pointer
lib                  Objects.Library      Pointer
ref                  INTEGER
handle               Objects.Handler      Pointer
len                  LONGINT              Length of text.
obs                  Objects.Library      Pointer
trailer              Texts.Piece          Pointer to Sentinel node in list of pieces.
org                  LONGINT              Offset in [0,len) of first character in last found piece.
pce                  Texts.Piece          Last found piece.

The type Objects.Library, a pointer to a LibDesc

Fields of LibDesc    Types of Fields      Notes
next                 Objects.Library      Pointer
ind                  Objects.Index        Pointer
f                    Files.File           Pointer
R                    Files.Rider          Record
name                 Objects.Name
dict                 Objects.Dictionary   Pointer
maxref               INTEGER
GName                POINTER

The type Texts.Piece, a pointer to a Texts.PieceDesc

Fields of PieceDesc  Types of Fields      Notes
f                    Files.File           Pointer
off                  LONGINT              Integer
len                  LONGINT
obj                  Objects.Object       Pointer
lib                  Objects.Library      Pointer
ref                  INTEGER
col                  SHORTINT
voff                 SHORTINT
prev                 Piece                Pointer
next                 Piece                Pointer

In V5

Texts.Text, a pointer to a Texts.TextDesc

Fields of TextDesc   Types of Fields   Notes
len                  INTEGER[3]        Length of text, in bytes.
changed              BOOLEAN           Flag indicating a revision.
notify               Texts.Notifier    Pointer to a method to notify interested clients of state changes.
trailer              Texts.Piece       Pointer to the Sentinel node in the list of pieces.
pce                  Texts.Piece       Last found piece.
org                  INTEGER           Offset in [0,len) of first character in last found piece.

Texts.Piece, a pointer to a Texts.PieceDesc

Fields of PieceDesc  Types of Fields   Notes
f                    Files.File        Pointer to file.
off                  INTEGER           Integer offset in Text of first character in Piece.
len                  INTEGER           Number of bytes in Piece.
fnt                  Fonts.Font        Pointer to a font.
col                  INTEGER
voff                 INTEGER           Vertical offset of characters in pixels.
prev                 Texts.Piece       Pointer to previous piece of Text.
next                 Texts.Piece       Pointer to next piece of Text.

Texts.FindPiece and the cache

For a given Text, T, and an offset pos in [0, T.len), procedure Texts.FindPiece has the task of locating the piece containing pos. At each execution, FindPiece could begin at offset 0 and add the lengths of pieces until the piece containing pos is located, but a cache based upon T.pce and T.org is more efficient. When FindPiece completes a search, the pointer to the found piece is recorded in T.pce and the offset of the first character of that piece is recorded in T.org. The next execution of FindPiece begins at that cached location. Because the result of a search is often near the preceding result, this strategy avoids repeatedly summing lengths from the beginning of the first piece.
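
The caching strategy can be sketched as follows. This is not the source of Texts; it is a simplified illustration in which the module name PieceCache is hypothetical and the types Piece and Text carry only the fields needed for the search. It assumes that the pieces form a circular doubly linked list closed by the trailer, and that T.pce and T.org have been initialized to the first piece and offset 0. Searching backwards from the cache when pos lies before T.org is an assumption of this sketch.

 MODULE PieceCache;  (* hypothetical module; simplified types *)

   TYPE
     Piece = POINTER TO PieceDesc;
     PieceDesc = RECORD
       len: LONGINT;       (* number of characters in the piece *)
       prev, next: Piece
     END;
     Text = POINTER TO TextDesc;
     TextDesc = RECORD
       len: LONGINT;       (* total length of the text *)
       trailer: Piece;     (* sentinel node in the list of pieces *)
       pce: Piece;         (* cache: last found piece *)
       org: LONGINT        (* cache: offset of first character of pce *)
     END;

   (* Locate the piece containing pos, where 0 <= pos < T.len. *)
   PROCEDURE FindPiece(T: Text; pos: LONGINT; VAR org: LONGINT; VAR pce: Piece);
     VAR p: Piece; porg: LONGINT;
   BEGIN
     p := T.pce; porg := T.org;          (* start at the cached piece *)
     IF pos >= porg THEN                 (* search forwards *)
       WHILE pos >= porg + p.len DO INC(porg, p.len); p := p.next END
     ELSE                                (* search backwards *)
       REPEAT p := p.prev; DEC(porg, p.len) UNTIL pos >= porg
     END;
     T.pce := p; T.org := porg;          (* update the cache *)
     pce := p; org := porg
   END FindPiece;

 END PieceCache.

With this arrangement a sequence of nearby requests, such as those made by a Reader advancing through the Text, costs only a step or two each instead of a walk from the first piece.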

References

  1. Each case is easily displayed explicitly. In an Oberon system with the hexadecimal editor, Hex, MM on Hex.Open <aFile>. In ETH Oberon a file with lines ending CR LF is edited by ET.OpenAscii <aFile>.
  2. With Text being a fundamental type in an Oberon system, "Oberon Text" is considered a proper name. Hence the capitalization "Text".
  3. In V5 the only integer type is INTEGER. Cf. LONGINT in V2.