Computability and Complexity/Formal Languages/Chomsky Hierarchy/Unrestricted Languages
Unrestricted Languages
[edit | edit source]As the name implies, the class of unrestricted languages is the least restrictive class in the Chomsky Hierarchy, and is the set of languages generated by unrestricted grammars. Unrestricted grammars are grammars composed of a finite number of rules of the form A -> B, where A and B are strings of terminal and non-terminal symbols, and A is not the empty string. The languages produced by these grammars are also referred to as recursively enumerable, as a recursive function could theoretically generate all the strings in them, although not necessarily in a finite time.
Turing Machines
[edit | edit source]Equivalent to the class of unrestricted languages is the class of languages recognized by a Turing machine. A Turing machine (TM) is identical to an LBA (see Context Sensitive Languages), with one exception: a Turing machine's tape is infinite. In its standard form, the Turing machine's tape has a left endpoint but continues indefinitely to the right. Other infinite models for the tape, such as a tape infinite in both directions, or multiple infinite tapes, are equivalent to the standard form.
The Turing Machine is the most powerful machine in the hierarchy, with the power to emulate any of the other machines. It is equivalent in power to most programming languages, although computers only contain finite memory, while a true Turing Machine has infinite memory.
Universal Turing Machine
[edit | edit source]Turing Machines can also be programmed as something called a Universal Turing Machine (UTM), a single TM (meaning a single set of states, rules, and alphabet) that can accept as input another TM encoded as a string, together with an input string. The Universal Turing Machine can then emulate the other TM running the input string. This universal emulation of its own class is a property not shared by any other machine in the hierarchy. Some of them can be programmed to emulate a specific subset of their class, but none of them can emulate any given member of their class.
Looping
[edit | edit source]However powerful they may be, they do have limitations. One of the most obvious limitations is that unlike LBAs, a TM has an infinite number of conditions, due to the infinite tape. This means that the TM can not only loop, it can be in an infinite running, non-halting pattern that never loops. As an example, consider a naively programmed TM designed to take as input a positive integer a and to determine if the square root of a is rational by calculating it and printing it out on the tape. If that machine is given the number 2 as input, it will never complete printing the irrational number , and so will run forever.
Examples
[edit | edit source]The code below is a sample TM emulator in Perl. Given the description of a machine and an input string, it simulates the machine processing the input string, and shows whether the machine accepts or not.
The syntax is: progname.pl TMFile inputFile, where TMFile is a text file containing the TM instructions, and inputFile is a text file containing the input string. Some sample inputs, including a set of TM instructions for a machine to multiply two numbers, can be found under sample TM inputs
A sample TM emulator in Perl.
#!usr/bin/perl
use Text::ParseWords;
use strict;
use warnings;
# Grabs the filenames for the machine and the word to be run on it.
my $tmFile = $ARGV[0];
my $input = $ARGV[1];
# We use subroutines to parse and verify the data in the input files.
# The machine data is stored in the $machine structure as the keys rules, accepts, alphabet, and startState.
my $machine = readTM($tmFile);
# Rules and accepts are extracted from the $machine structure for ease of access.
my @rules = @{$machine->{rules}};
my %accepts = %{$machine->{accepts}};
# This reads the input file and parses it into an array of strings, with each element being one input symbol.
# It checks to make sure the elements are all in the machine's alphabet.
my @tape = readInput($input, $machine->{alphabet});
# $changed records whether or not a rule has been used when running through the rules list to make transitions.
my $changed = 1;
# $state is the state the Turing Machine is in, and is initialized to the start state from the machine file.
my $state = $machine->{startState};
# $tapeIndex is the position of the machine's head on the tape.
my $tapeIndex = 0;
# Now that the machine is initialized, we can begin making transitions
# As long as things keep changing, keep cycling through the rules.
while($changed)
{
# Unless it changes while going through the rules, the machine will terminate.
$changed = 0;
# The current tape is printed, with the symbol under the head highlighted.
print "@tape[0..$tapeIndex-1]<".$tape[$tapeIndex].">@tape[$tapeIndex+1..$#tape]\n";
# The current state of the machine is printed.
print "$state\n";
# A new state is calculated by checking conditions against the list of rules
for my $ruleRef (@rules)
{
# print "::$ruleRef->[0]??$branches[$i][0]";
# print "::$ruleRef->[1]??$string[$branches[$i][1]]";
# print "::$ruleRef->[2]??".$branches[$i][2][-1]."::\n";
# Checks the current state and tape symbol against the rule being examined
if ($ruleRef->[0] eq $state &&
$ruleRef->[1] eq $tape[$tapeIndex])
{
# The state transition is printed.
# print "State: ".$state." -> ".$ruleRef->[2]."\n\n";
# Set the new state,
$state = $ruleRef->[2];
# Write the new symbol to the tape,
$tape[$tapeIndex] = $ruleRef->[3];
# Shift the tape to the new index,
$tapeIndex += $ruleRef->[4];
# and make sure it hasn't run past the left edge of the tape.
if ($tapeIndex < 0) { $tapeIndex = 0; }
# If the machine nears the end of the allocated tape, expand the tape.
if ($tapeIndex >= $#tape-1) { push(@tape, "_"); }
$changed = 1;
# Once we've made a transition, we can stop and begin looking for the next one.
last;
}
}
}
# When there are no more possible transitions, if the machine is in an accepting state,
if (exists($accepts{$state}))
{
# Print that it accepts and quit.
print "The machine accepts the string.\n";
exit;
}
# Otherwise, print that it does not accept, and quit.
print "The machine does not accept the string.\n";
exit;
###################################################
sub readTM
# This subroutine reads the machine data from the specified file into variables (mostly hashes).
{
my (%states, %accepts, %alphabet, @rules);
open(INFILE, shift) or die "Can't open machine file: $!";
# This block reads the list of states from the machine file.
# Discards the section header,
<INFILE>;
my $line = <INFILE>;
chomp($line);
my @words = parse_line('\s+', 0, $line);
for (@words)
{
# records the state names for checking the rules,
$states{$_} = 0;
}
# This block reads the start state from the machine file.
# Discards the header,
<INFILE>;
my $startState = <INFILE>;
# takes the whole line as the start state,
chomp($startState);
# and makes sure that the start state is defined in the list of states.
exists($states{$startState}) or die "The start state $startState isn't a state!";
# This block reads the list of accepting states from the machine file.
# Discards the header,
<INFILE>;
$line = <INFILE>;
chomp($line);
# breaks up the line into state names,
@words = parse_line('\s+', 0, $line);
for (@words)
{
# checks to make sure that the accept states are defined states,
exists($states{$_}) or die "$_ isn't a state!";
# and defines those names in a new hash. The use of a hash makes it easier to determine later if a specific state name accepts or not.
$accepts{$_} = 1;
}
# This block reads the list of symbols in the alphabet from the machine file.
# Discards the header,
<INFILE>;
$line = <INFILE>;
chomp($line);
# breaks up the line into alphabet symbols (note that the symbols can be of arbitrary length),
@words = parse_line('\s+', 0, $line);
# e is used as the empty symbol in the rules.
$alphabet{e} = 1;
for (@words)
{
# This records which symbols are in the alphabet for checking the rules.
$alphabet{$_} = 0;
}
# This block reads the state transition rules from the machine file.
# Discards the header,
<INFILE>;
# This variable synchronizes the position of each rule in the rules array.
my $rulesCounter=0;
while(<INFILE>)
{
# breaks each rule into start state, input symbol, stack symbol, end state, and new stack symbol.
chomp;
@words = parse_line('\s+', 0, $_);
# checks that the first four pieces are defined in the state and alphabet hashes,
exists($states{$words[0]}) or die "$words[0] isn't a defined state!";
exists($alphabet{$words[1]}) or die "$words[1] isn't defined in the alphabet!";
exists($states{$words[2]}) or die "$words[2] isn't a defined state!";
exists($alphabet{$words[3]}) or die "$words[3] isn't defined in the alphabet!";
# and converts the left/right instruction into a number to be added to the position counter.
if ($words[4] eq "left" || $words[4] eq "-1")
{
$words[4]=-1;
}
elsif ($words[4] eq "right" || $words[4] eq "+1")
{
$words[4]=1;
}
else
{
die "$words[4] isn't left, right, -1, or +1!";
}
# then creates an array of each rule.
for (0..4)
{
$rules[$rulesCounter][$_] = $words[$_];
}
# The synchronization variable has to be updated.
$rulesCounter++;
}
# Reading complete, the subroutine closes the file and returns the name of the start state.
close INFILE;
# The relevant data is stored in the $machine structure and returned to the main routine.
my $machine =
{
rules => \@rules,
accepts => \%accepts,
alphabet => \%alphabet,
startState => $startState
};
return $machine;
}
sub readInput
# This subroutine reads the starting tape from the specified file into an array of symbols.
{
open(INFILE, shift) or die "Can't open ".$input.": $!";
my $alphaRef = shift;
# The first line of the file is read as the initial state of the tape, with symbols delimited by spaces.
my $line = <INFILE>."";
chomp($line);
my @tape = parse_line('\s+', 0, $line);
# This makes sure every symbol in the input string was defined in the machine's alphabet.
for (@tape)
{ exists($alphabet->{$_}) or die "$_ in $input isn't in this machine's alphabet!"; }
close INFILE;
return @tape;
}