Robotics Engineering

In this work, an approach to robotics engineering called layered evolution, which merges features of the subsumption architecture into evolutionary robotics, is presented. This approach is used to construct a layered controller for a small simulated robot called Street Sweeper (SS), which learns to cross a busy street and satisfy its goal of collecting garbage without getting crushed by passing cars.


1. Evolutionary Robotics
Evolutionary robotics is a relatively novel approach to the relatively mature field of robotics, which itself is quite recent compared to the age-old dream of building intelligent machines. A plethora of methods have been used, but the most common way of doing ER (evolutionary robotics) is this: a set of artificial genomes, called the population, is created. Each genome is a data structure specifying the configuration of a robot controller, and sometimes the robot morphology as well. Usually the robot controller is a neural network connected to the sensors and motors of the robot, and the information in the genome is used to specify the synaptic (inter-neuron connection) weights of the network. Then, evolution takes place. All genomes are evaluated, which means that robots with genetically specified controllers are tested on a task and genomes are scored according to how well the controllers they specify perform on the task. Good genomes are kept, bad genomes are replaced with modifications or combinations of the good genomes, and the process is repeated a fixed number of times or until good enough performance has been reached [10].
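The loop described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation used in this work: the genome is a flat list of weights, `evaluate` is a hypothetical stand-in for testing a robot controller on a task, and the population size, mutation scale, and kept fraction are arbitrary choices.

```python
import random

def evaluate(genome):
    # Hypothetical stand-in for testing a controller on a task:
    # fitness is higher the closer every weight is to 0.5.
    return -sum((w - 0.5) ** 2 for w in genome)

def mutate(genome, rng, scale=0.1):
    # Modification of a good genome: add small Gaussian noise to each weight.
    return [w + rng.gauss(0.0, scale) for w in genome]

def crossover(a, b, rng):
    # Combination of two good genomes: one-point crossover.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def evolve(pop_size=20, genome_len=8, generations=50, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=evaluate, reverse=True)
        history.append(evaluate(population[0]))
        elite = population[:pop_size // 2]      # good genomes are kept
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = elite + children           # bad genomes are replaced
    population.sort(key=evaluate, reverse=True)
    history.append(evaluate(population[0]))
    return population[0], history
```

Because the kept genomes are carried over unchanged, the best fitness in `history` never decreases from one generation to the next.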

2. Behavior-Based Robotics and the Subsumption Architecture
Evolutionary robotics is certainly not the only modern approach to robotics. In the mid-eighties, [4] invented the subsumption architecture, and thereby gave birth to the active research field of behavior-based robotics [9], which more or less dominates modern academic robotics research. In this field, as in good old-fashioned AI robotics, robots are hard-wired and behaviors are pre-programmed. Here we briefly describe the principles of the classic subsumption architecture (sometimes abbreviated SA).
A subsumption architecture is organized into layers, where each layer is responsible for producing one (or a few) behaviors. The layers have a strict vertical ordering in which higher layers always have precedence over lower layers; often, the behaviors controlled by the higher layers are more complex and "cognitive" than those controlled by the lower layers. Importantly, all layers except the bottom layer presuppose the existence of lower layers, but no layer presupposes the existence of higher layers; in other words, if the robot is built from the bottom up, the robot is able to operate at each stage of its development [10].

3. Behavior-Based Control
The control system of the agent is based on behaviors. A behavior is a subsystem that is responsible for one specific coupling between sensors and actuators Fig.

4. Reactive Control of Robot
Reactive control techniques decompose the operation of the robot into task-achieving behaviors. As can be seen in Fig. 2, these behaviors are arranged in order of increasing competence. The lowest behavior, avoiding objects, has the highest priority, and the most advanced behavior, wandering around in this case, has the lowest priority. These behaviors run in parallel with each other, but if required the higher behaviors can take control of the robot, "subsuming" the behaviors beneath them. The lower behaviors, which have a higher priority, are still functioning and can, if required, take control of the robot even when subsumed.
Reactive control allows a robot to react to unexpected events in real time, even if a higher behavior is "thinking" about its task. For example, if a robot with the reactive architecture shown in Fig. 2 is chasing prey and sees a predator, it switches its behavior to avoiding the predator, since there is no point in chasing food if you are going to get eaten in the process. Furthermore, if the robot now senses an obstacle in its path, it again switches behavior, this time to avoiding the obstacle, since the robot cannot run away from the predator if it is trying to push a tree out of its way. Once the obstacle has been successfully avoided, the robot can switch back to running away from the predator [7].
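The predator and obstacle example can be sketched as a fixed-priority arbitration between behaviors. The behavior names, percept keys, and action strings below are illustrative assumptions, and the ordering is inferred from the example in the text rather than taken from Fig. 2.

```python
def avoid_obstacle(percept):
    return "steer around obstacle" if percept.get("obstacle") else None

def flee_predator(percept):
    return "run from predator" if percept.get("predator") else None

def chase_prey(percept):
    return "chase prey" if percept.get("prey") else None

def wander(percept):
    # Lowest-priority behavior: always has something to propose.
    return "wander"

# Assumed ordering, from highest to lowest priority.
BEHAVIORS = [avoid_obstacle, flee_predator, chase_prey, wander]

def select_action(percept):
    # Every behavior is consulted on each cycle; the first one (in priority
    # order) that proposes an action takes control of the robot.
    for behavior in BEHAVIORS:
        action = behavior(percept)
        if action is not None:
            return action
```

For instance, `select_action({"prey": True, "predator": True})` selects fleeing, and adding `"obstacle": True` to the percept switches control to obstacle avoidance until the obstacle is no longer sensed.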

5. Control System Using Classifier System
A control system is an interconnection of components forming a system configuration that will provide a desired system response. A network of different classifier systems can implement the control system of an agent. The issue of architecture is therefore the problem of designing the network that best fits the various classes of behaviors [8].

6. What Is a Classifier System?
Learning classifier systems are a form of adaptive system introduced by John Holland in the mid-1970s as a form of domain-independent learning system. A classifier system is a machine learning system that learns syntactically simple string rules (called classifiers) to guide its performance in an arbitrary environment. A classifier system consists of three main components, as illustrated in Fig. 3:

• Performance System (Rule and Message System).
• Apportionment of Credit System (Bucket Brigade Algorithm).
• Rule Discovery System (the Genetic Algorithm).

• The Performance System
The performance system (also called the rule and message system) is a special kind of production system. A production system is a computational scheme that uses rules as its only algorithmic device. The rules generally have the form: if <condition> then <action>. The meaning of a production rule is that the action may be taken (the rule is fired) when the condition is satisfied. The performance system is composed of a classifier store, a message list, a detector, and an effecter.

Classifier Store
The classifier store is the system's long-term memory. It is made up of a population of classifiers. A classifier is made up of one or more conditions (known as the condition part) and one action (called the action part). The condition part specifies the set of messages to which a classifier is sensitive, and the action part indicates the message it will broadcast when its condition part is satisfied. Thus a classifier list consists of one or more classifiers of the form $C_1, C_2, \ldots, C_n / a$, where $C_1, \ldots, C_n$ are the conditions making up the condition part and $a$ is the action part. Conditions are connected by the AND operator, and the '/' indicates the separation between conditions and action. Each $C_i$ is a string of fixed length K over a fixed alphabet. In most practical systems, the string is defined over the three-symbol alphabet {1, 0, #}. The '#' is a don't-care (wild card) symbol that can match either of the symbols 0 or 1. For example, a classifier with the condition 1#01 will recognize (match) the messages 1101 and 1001.
A classifier posts one or more messages onto the message list when it is activated. The action part of a classifier is used to form the message it sends out when it is activated. It is also a string of fixed length K, defined over the alphabet {1, 0}.
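The matching rule can be illustrated with a short sketch. The function names are hypothetical, and `classifier_fires` assumes the usual reading of AND-connected conditions, namely that each condition must be matched by at least one message on the message list.

```python
from itertools import product

def matches(condition, message):
    # '#' matches either bit; any other symbol must match exactly.
    return len(condition) == len(message) and all(
        c == "#" or c == m for c, m in zip(condition, message))

def matching_messages(condition):
    # Enumerate every binary message recognized by the condition.
    options = [("0", "1") if c == "#" else (c,) for c in condition]
    return ["".join(bits) for bits in product(*options)]

def classifier_fires(conditions, message_list):
    # Conditions are connected by AND (assumed reading: each condition
    # must be matched by at least one message on the list).
    return all(any(matches(c, m) for m in message_list) for c in conditions)

print(matching_messages("1#01"))  # ['1001', '1101']
```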

Message List
The message list acts as the system's short-term memory, as the medium for communication between classifiers, and as the output interface. It is made up of external messages (input observations) and internal messages (messages from classifiers). A message is represented by a string of fixed length K (the same length as a condition) over the same alphabet {1, 0} as the action.

3. Input Interface (Detector)
This receives the input messages from the environment and transforms them into fixed-length strings to be placed on the message list.

4. Output Interface (Effecter)
Messages placed on the message list by classifiers are processed through the output interface in order to communicate with the system's environment [3,5].

• The Apportionment of Credit System
The main task of the apportionment of credit algorithm is to classify rules in accordance with their usefulness. The algorithm works as follows: a time-varying real value called strength is associated with every classifier C. At time zero, each classifier has the same strength. When an external classifier causes an action on the environment, a payoff is generated whose value depends on how good the action was with respect to the system goal. This reward is then transmitted backward to the internal classifiers that caused the external classifier to fire. The backward transmission mechanism causes the strength of the classifiers to change over time and to reflect their relevance to the system performance (with respect to the system goal). It is not possible to keep track of all the paths of activation actually followed by the rule chains, because the number of these paths grows exponentially. (A rule chain is a set of rules activated in sequence, starting with a rule activated by environmental messages and ending with a rule performing an action on the environment.) It is therefore necessary to have an algorithm that solves the problem using only local (in time and space) information. Local in time means that the information used at every computational step comes only from a fixed recent temporal interval. Spatial locality means that changes in a classifier's strength are caused only by classifiers directly linked to it.

• The Rule Discovery System (RD)
A complete classifier system needs some means of generating new rules for use in the performance and learning systems. Well-known genetic algorithm (GA) techniques have been used as the main source of rule discovery in classifier systems. Genetic algorithms were inspired by natural selection and operate by evolving generations of individuals that are successively more fit according to some fitness evaluation function. In traditional classifier systems, a classifier's strength is taken as a measure of its fitness or utility in rule discovery, in addition to its use in the performance system. The basic operation of a genetic algorithm is summarized as follows:

1. Select classifiers for reproduction: the probability of a classifier being selected as a parent is based on its strength.
2. Apply genetic operators: these are effected probabilistically on some part of the bit string.
3. Select classifiers for deletion: in order for the population of classifiers to remain within some reasonable size limit, existing classifiers must be deleted as new ones are introduced. There are various means of selecting classifiers for deletion, for example probabilistically, based on an inverse function of the classifier's strength, i.e., weaker classifiers are more likely to be deleted [2,6,11].
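The three steps can be sketched as one rule-discovery step over a small classifier population. This is a simplified illustration, not the procedure used in this work: classifiers are plain dictionaries, mutation is the only genetic operator shown, the child inherits its parent's strength, and the mutation rate is an arbitrary assumption.

```python
import random

def roulette(weights, rng):
    """Pick an index with probability proportional to its weight."""
    pick = rng.uniform(0.0, sum(weights))
    running = 0.0
    for index, weight in enumerate(weights):
        running += weight
        if pick <= running:
            return index
    return len(weights) - 1  # guard against floating-point round-off

def mutate(bits, alphabet, rng, rate):
    """Replace each symbol by a random symbol from `alphabet` with probability `rate`."""
    return "".join(rng.choice(alphabet) if rng.random() < rate else b
                   for b in bits)

def ga_step(population, rng, rate=0.1):
    # 1. Reproduction: parent chosen with probability proportional to strength.
    parent = population[roulette([c["strength"] for c in population], rng)]
    # 2. Genetic operator (mutation only in this sketch), applied
    #    probabilistically to parts of the bit strings.
    child = {
        "condition": mutate(parent["condition"], "01#", rng, rate),
        "action": mutate(parent["action"], "01", rng, rate),
        "strength": parent["strength"],  # assumption: inherit parent strength
    }
    # 3. Deletion: victim chosen by an inverse function of strength,
    #    so weaker classifiers are more likely to be replaced.
    victim = roulette([1.0 / (c["strength"] + 1e-9) for c in population], rng)
    population[victim] = child
    return victim
```

Replacing the victim in place keeps the population size constant, which is one simple way to satisfy the size limit mentioned in step 3.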

7. The Street Sweeper: A Proposed System
In this study, the system called SS (Street Sweeper) uses three classifier systems arranged in a two-level distributed architecture to perform complex behavior. The first classifier system teaches the simulated robot to cross the street, i.e., to move one step toward the garbage when there is no predator (no cars are passing along the street, or the traffic light is red). The second classifier system teaches the simulated robot to stop when cars are passing along the street or the traffic light is green. The third, the controller classifier system, should learn the switching policy, i.e., which classifier system to give control to when more than one of them is active. The objects in the environment are as follows: a moving robot, moving cars, a traffic light at a fixed position, and garbage at a fixed position. The environment is illustrated in Fig. 4. A small street-sweeper robot that can cross a busy street and satisfy its goal of collecting garbage without getting crushed by passing cars is behaving very adaptively and intelligently in its specified environment.

8. Layered Evolution Structure
One possible approach to layered evolution is to organize the controller as a subsumption architecture. Each layer consists of a learning classifier system connected to a subset of the robot's sensors and actuators. The layers are connected in a simple structure in which higher layers can influence or subsume lower layers. The robot Street Sweeper is controlled by several layers, where each layer consists of a learning classifier system. Communication between layers is restricted: higher layers can influence lower layers only through a hard-coded, task-specific link. The Street Sweeper (SS) system is built from three learning classifier systems, (LCS-CONTROL), (LCS-APPROACH) and (LCS-STOP), organized in a two-level hierarchical architecture and interacting together to perform complex behavior. The SS structure is shown in Fig. 5.

10.2 Coding (LCS-Approach) Actions
The desired action should be the same as the system input message. Therefore we have eight actions. Actions have the form and meaning given in Table 4.

10.3 Representation of (LCS-Approach)
LCS-Approach consists of a condition part of 3 bits, representing the position of the garbage in the environment, and an action part of 3 bits, representing the action to be performed in the environment. The size of the classifier store for LCS-Approach is 8 rules.

Example:
The representation of the rule "if the relative position of the garbage as sensed by the simulated robot is to the north, then the action to be taken by the simulated robot is to move north", and so on:

Position of garbage from the robot / Direction of robot movement
0 0 0 / 0 0 0

10.4 Executing the Street Sweeper Code for (LCS-Approach)
The whole project was programmed in the Pascal language. When the Street Sweeper code is executed, the system responds by presenting the initial report for LCS-Approach. The classifier system runs for 200 iterations, terminating with the snapshot report displayed in Appendix B. The correct rules achieve high strength values; by contrast, the bad rules have strength and bid values near zero. The classifier system eliminates the bad rules quickly, thereby achieving near-perfect performance.

The use of SS starts with the insertion of environment messages into the SS system. Each environment message consists of 2 bits. These messages are received by the detectors of the controller LCS-Control, which transfers them to its performance system and processes them sequentially, one message per cycle. In the performance system of LCS-Control, each environment message is matched against the condition part of all classifiers in the classifier store. All classifiers that match the environment message are sent to the apportionment of credit (AOC) system, which uses reinforcement learning to reward the winner. In the AOC system three procedures are called (auction, clearinghouse, and tax collector); the system then calls the GA to inject new rules that may increase the performance of the system. If the environment message matches at least one of the classifiers in the classifier store, the winning classifier's action is transferred to the effecter of LCS-Control. If a predator exists (cars are crossing the street or the traffic light is green), the winning classifier's action is '0', i.e., LCS-Control switches toward LCS-Stop. If there is no predator (cars are not crossing the street or the traffic light is red), the winning classifier's action is '1', i.e., LCS-Control switches toward LCS-Approach. LCS-Control thus checks whether there is a predator in order to determine which of the two classifier systems, LCS-Approach or LCS-Stop, will work.
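One cycle of the auction, clearinghouse, and tax collector procedures named above can be sketched as follows. This Python sketch is not the Pascal code of this work: the bid coefficient, tax rate, payoff, initial strengths, and the two competing rules are all illustrative assumptions, and the auction is deterministic (no bid noise).

```python
def matches(condition, message):
    return all(c == "#" or c == m for c, m in zip(condition, message))

def auction(classifiers, message, bid_coeff=0.1):
    """Matching classifiers bid a fraction of their strength; highest bid wins."""
    winner, best_bid = None, -1.0
    for index, c in enumerate(classifiers):
        if matches(c["condition"], message):
            c["bid"] = bid_coeff * c["strength"]
            if c["bid"] > best_bid:
                winner, best_bid = index, c["bid"]
    return winner

def clearinghouse(classifiers, winner, old_winner=None):
    """The winner pays its bid; the payment goes to the previous winner, if any."""
    bid = classifiers[winner]["bid"]
    classifiers[winner]["strength"] -= bid
    if old_winner is not None:
        classifiers[old_winner]["strength"] += bid

def tax_collector(classifiers, tax_rate=0.01):
    """Every classifier pays a small tax so that unused rules decay."""
    for c in classifiers:
        c["strength"] -= tax_rate * c["strength"]

def run(iterations=200):
    # Two hypothetical rules competing for message "11" (no predator);
    # action "1" (switch to LCS-Approach) is the one that earns a payoff.
    classifiers = [
        {"condition": "11", "action": "0", "strength": 10.0, "bid": 0.0},
        {"condition": "11", "action": "1", "strength": 10.0, "bid": 0.0},
    ]
    for _ in range(iterations):
        winner = auction(classifiers, "11")
        # Single-step task: no earlier classifier to pay, so old_winner is None.
        clearinghouse(classifiers, winner)
        if classifiers[winner]["action"] == "1":
            classifiers[winner]["strength"] += 1.0  # payoff from the environment
        tax_collector(classifiers)
    return classifiers
```

Under these assumptions the rewarded rule settles at a high strength while the unrewarded rule decays through taxation, which mirrors the qualitative behavior reported above.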

11. The Street Sweeper System Cycle
In the case of no predator, the controller LCS-Control switches toward LCS-Approach. The controller LCS-Approach receives messages consisting of 3 bits. The detector of LCS-Approach transfers each message to the performance system, where it is matched against the condition part of all classifiers in the classifier store. If the message matches at least one of the classifiers in the classifier store, the winning classifier's action is transferred to the effecter of LCS-Approach. The action of the winning classifier is sent to the environment by the effecter to determine the direction of movement of the robot. The SS cycle is illustrated in Fig. 7.

In the case that a predator exists, LCS-Control switches toward LCS-Stop, and the action of the robot is to stop.

... crossing the street or stop otherwise. The controller (LCS-Control) should learn to suppress the controller LCS-Stop whenever the Approach behavior proposes an action, which represents complex behavior. LCS-Control receives a 2-bit message from the environment, mapping it to four states, from 0 to 3. The two bits are represented as follows:
• The first bit represents whether cars are crossing the street (1: cars are not crossing the street; 0: cars are crossing the street).
• The second bit represents the traffic light color (1: the traffic light is red; 0: the traffic light is green).
LCS-Control conditions have the form and meaning given in Table 1.

9.2 Coding (LCS-Control) Actions
LCS-Control has one action consisting of only one bit. LCS-Control actions have the form and meaning given in Table 2.

9.3 Representation of (LCS-Control)
The performance system of the controller LCS-Control consists of a message list and a classifier store. The classifier store of LCS-Control contains a set of rules called classifiers, which represent the knowledge and the controller of the system at execution time. The condition part of a classifier consists of 2 bits, and the action part consists of 1 bit. The size of the classifier store for LCS-Control is 4 rules, and all classifiers have the same strength value at the beginning.

Example:
The representation of the rule "If cars are not passing along the street and the traffic light is red, then the robot approaches the garbage":

Cars are not crossing / Traffic light is red

When the Street Sweeper code is executed, the system responds by presenting the initial report for LCS-Control. The classifier system runs for 200 iterations, terminating with the snapshot report displayed in Appendix A. The correct rules achieve high strength values; by contrast, the bad rules have strength and bid values near zero. The classifier system eliminates the bad rules quickly, thereby achieving near-perfect performance.
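The switching behavior that LCS-Control is meant to learn can be written down directly as a lookup, which helps make the bit coding concrete. This sketch hard-codes the target policy rather than learning it. Only the rule for message 11 is stated explicitly in the text; the other three entries follow the "predator" description (cars crossing or green light) and are an assumption, as are the function names.

```python
# Condition (2 bits) -> action (1 bit) for LCS-Control.
# First bit: 1 = cars not crossing, 0 = cars crossing.
# Second bit: 1 = traffic light red, 0 = traffic light green.
# Action: 1 = switch to LCS-Approach, 0 = switch to LCS-Stop.
CONTROL_RULES = {"11": "1", "10": "0", "01": "0", "00": "0"}

def detect(cars_crossing, light_is_green):
    # Detector: encode the environment state as a 2-bit message.
    first = "0" if cars_crossing else "1"
    second = "0" if light_is_green else "1"
    return first + second

def ss_cycle(cars_crossing, light_is_green, garbage_position):
    message = detect(cars_crossing, light_is_green)
    if CONTROL_RULES[message] == "1":
        # LCS-Approach: the desired action equals its 3-bit input message.
        return ("LCS-Approach", garbage_position)
    return ("LCS-Stop", "stop")
```

With no cars crossing, a red light, and garbage to the north (coded 000), `ss_cycle(False, False, "000")` hands control to LCS-Approach with action 000; any other car or light state hands control to LCS-Stop.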

Eng. & Tech. Journal, Vol. 28, No. 7, 2010. Robotics Engineering
1. This contrasts sharply with the view of traditional AI, where control is typically based on a set of goals, a model of the world and a search

New winner [3]: old winner [3]
Last report for Street Sweeper System for (LCS-Control)
10. The (LCS-Approach) Development

Sweeper System Cycle