Systems, Interactions, and Macrotheory
PHILIP BARNARD
British Medical Research Council
JON MAY
University of Sheffield
DAVID DUKE
University of Bath
and
DAVID DUCE
Oxford Brookes University
A significant proportion of early HCI research was guided by one very clear vision: that the existing theory base in psychology and cognitive science could be developed to yield engineering tools for use in the interdisciplinary context of HCI design. While interface technologies and heuristic methods for behavioral evaluation have rapidly advanced in both capability and breadth of application, progress toward deeper theory has been modest, and some now believe it to be unnecessary. A case is presented for developing new forms of theory, based around generic “systems of interactors.” An overlapping, layered structure of macro- and microtheories could then serve an explanatory role, and could also bind together contributions from the different disciplines. Novel routes to formalizing and applying such theories provide a host of interesting and tractable problems for future basic research in HCI.
Categories and Subject Descriptors: H.1.2 [Models and Principles]: User/Machine Systems—Human factors; H.1.1 [Models and Principles]: Systems and Information Theory—General systems theory
General Terms: Human factors, Theory, Design
Additional Key Words and Phrases: Cognitive models, computing system models, models of interaction
Authors’ addresses: P. Barnard, Cognition and Brain Sciences Unit, British Medical Research Council, 14 Chaucer Road, Cambridge, CB2 2EF, UK; email: philip.barnard@mrc-cbu.cam.ac.uk; J. May, Department of Psychology, University of Sheffield, Sheffield S10 2TP, UK; email: jon.may@shef.ac.uk; D. Duke, Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, UK; email: d.duke@bath.ac.uk; D. Duce, School of Computing and Mathematical Sciences, Oxford Brookes University, Oxford OX3 0BP, UK; email: daduce@brookes.ac.uk.
1. THEORY DEVELOPMENT IN A BOUNDLESS DOMAIN
In less than a quarter of a century information technologies and their users have diversified into an extraordinary range of sociotechnical ecosystems. Scientific studies of HCI ran alongside each advance in interface design and each new application domain. A virtual relay baton of research-and-development interest passed from line editors and programming applications, through command and WIMP interfaces for word processors and information retrieval systems, all the way to the current range of favorites. These now include: intelligent agents, awareness servers in CSCW, virtual environments, synthetic battlefields, mobile and wearable computers, and, of course, games, the world wide web, and embodied conversational characters capable of exhibiting emotion at the interface. Few would disagree with the proposition that the study of HCI is now effectively a boundless domain.
At the outset, many shared the vision of Card et al. [1983] that step-by-step task analysis could be combined with theories of the human information processing mechanism and human knowledge representation. The product would be engineering methods to support design decision-making, based upon sound theory, and subjected to empirical validation. As applications and interfaces diversified, the limitations of simple theoretical assumptions about a prototypical user’s cognitive mechanism, and of the engineering methods they gave rise to, became all-too-readily apparent. Richer ways of thinking about users, tasks, systems, contexts of use, and design processes were needed.
On the user side, our tooling diversified into situated action, ethnomethodology, distributed cognition, activity theory and coordination science (e.g., see Suchman [1986] and Malone and Crowston [1994]). Theory-based evaluations of interface usability were often considered to be of limited value (e.g., Landauer [1987]) and most human factors work continued to rely on, or improve, heuristic methods of empirical evaluation (e.g., Nielsen [1993]). Design was also recognized as a complex process involving many trade-offs. These could beneficially be addressed and resolved by developing new methodologies such as scenario-based design [Carroll and Rosson 1992], or design rationale [MacLean et al. 1991] rather than extending theory-based evaluations.
As the scope of interaction design widened to include multimodal communication, multiple users, and worlds of interaction having no natural counterpart, there was little in the way of a well-developed body of theory ready for direct application. It was, of course, recognized from the very outset that the development of theory lagged developments in interface design [Newell and Card 1985]. It was also widely acknowledged that our theories suffered from numerous other deficiencies. They were typically, for example, of restricted scope, applying at best to relatively local features of interface design. It often proved hard to reuse them in novel contexts or to scale up theoretical exercises from simple laboratory examples to real design settings. In spite of their many limitations, there is little doubt that our theories have evolved to address a broader range of issues. Some have even been better tailored to meet the everyday demands of practical application (e.g., Rudisill et al. [1996]). Nonetheless, as we move into the new millennium, theory has yet to mature to a point where it could be truly regarded as on a course that is likely to yield large-scale benefits on a tangible time-scale.
At this juncture, we could simply abandon serious attempts to maintain theory development as a key element within the wider HCI enterprise. Indeed, many researchers and practitioners now believe that the science of HCI will not benefit significantly from the further development of deep theories. There are too many users, too many computers, and their interactions are too complex for any theoretical approach to capture adequately. Deep theory is, in this view, a passive ivory tower occupation that has little to offer day-to-day design decision-making. Even in those areas where our scientific understanding is well-developed, it is quite likely to have little impact in real-world contexts. Despite the evidence concerning the superiority of alternatives, and any theoretical understanding of that evidence, we still use QWERTY keyboards and we still enter numbers into calculators that have one layout for the numbers 0–9 and switch to telephone keypads with a quite different layout.
The very diversity of theoretical representations presents another barrier to their development and adoption. Although there is universal agreement that HCI is interdisciplinary, the vast bulk of theory has been developed along resolutely X-centric lines (System-centered; Application-centered; User-centered; Task-centered; Team-centered; Worksystem-centered; “Virtual Reality”-centered, or whatever). Continuing with this strategy into the next millennium will inevitably lead to an increasing number of theories of different form and content. We would land up with a whole range of theories dealing with different facets of individual user performance, and yet more theories of the behavior of groups and larger organizations. The obvious problem for this strategy is that our theories would be unlikely to “fit together” in a sufficiently coherent way to enable us to resolve the conceptual jigsaw puzzles that exist in real design spaces [Bellotti et al. 1996; Blandford and Duke 1997]. The vital connective tissue that binds together different topics within a level of analysis, and that which binds one level of analysis to another, has at best been underexplored [Malone and Crowston 1994]. At worst, it has been fundamentally obscured by the coexistence of multiple systems of scientific semantics all being used within the same general problem space.
There is one very obvious problem with the strategy of abandoning the development of deep theory altogether. In the absence of a good body of formal theory, practitioners will undoubtedly continue to invent their own informal, or folk theories, to help them represent and think about the problems and issues that are important to them in their context. The practice of HCI could become like that of psychoanalysis, with one school of thought communicating among themselves about a given set of issues in very different terms from those adhering to another school of thought.
In this article, we discuss how theories about different problems and different topics of relevance to the use of computers might come to be framed and expressed in a common language. Our discipline will not be best served in the new millennium either by abandoning theory or by the unconstrained development of more and more unconnected “local theories” at different levels and in different domains. Our theories should not simply deal with users, computer systems and their applications, or teams and organizations. They should deal with interactions between all these entities. They should also directly address the problem of how to link up all the different ways of talking about and modeling the properties and behaviors of all these different entities. Like Furnas [2000], we consider the delivery of such a body of theory to be a major long-term undertaking. We argue here that greater integration within a boundless domain such as HCI is still a tractable proposition for research in the new millennium. As a vehicle for stimulating further debate on alternative routes to theory development, we advocate the development of generic representations of “systems of interactors.”
2. SYSTEMS OF INTERACTORS, MACROTHEORY, MICROTHEORY, AND LAYERED EXPLANATION
Many of the general arguments for layered systems analysis are well known and have been widely discussed. In systems engineering, for example, Chapanis [1996] uses hierarchical diagrams to illustrate how humans, hardware and software are grouped together in sub-subsystems that are embedded in subsystems that make up a complete system, be it a team or wider organization. Likewise, from a cognitive science perspective, Newell describes a system level as “a collection of components that are linked together in some arrangement and that interact, thus producing behavior at that system level. In a system with multiple levels, the components at one level are realized by systems at the next level down and so on for each successive level” [Newell 1990, p. 117]. Each different class of subsystem can be understood in terms of a specialized body of theory. These can be thought of as a collection of “microtheories,” while the overall system they compose requires a wider “macrotheory” of how the various subsystems themselves interact.
Newell’s points about systems were part of a wider and very important argument that emphasized the need to move away from specific microtheories of different phenomena and toward unified theories of cognition that could furnish those phenomena with a common basis of explanation. Unification requires the development of abstractions that capture regularities at an appropriate system level. Marr [1982] recognized the importance of abstraction when he distinguished a computational level of theory, which specifies the essential logic of what needs to be computed for a task to be carried out, from algorithmic and hardware levels of theory.
Fig. 1. Macrotheory and Microtheory for an assembled system [A] composed of basic interactors [B]s, each of which is composed of constituent interactors [C]s.

Here, we draw upon important components of all these ideas, but in a generalized form. Rather than taking hardware, software, users, computers, or teams as specific points of departure for theory development, we start by defining all of these entities as “interactors.” The use of this particular term originates in computer science [Duke and Harrison 1993; 1994; 1995a; 1995b]. Very much like the concept of an attractor in mathematics, the concept captures the idea of something that interacts with something else. This term has a number of advantages. First, by being generically, as opposed to specifically, X-centric, it enables us to refer to things that interact without carrying the implicit semantic overheads that come with terms such as computers, users or teams. Second, an interactor is something that is composed of other interactors, and as such is a relative rather than absolute construct. Third, an interactor is something whose behavior can, in principle, be mathematically described in terms of the properties of the lower-order interactors of which it is itself composed. Finally, any interactor is an entity that behaves over time.
To model the behavior of an interactor over time, we need to understand how its behavior is constrained. The behavior of any interactor will be determined in part by constraints originating in its own constituents (its own microtheory) and in part by the constraints imposed by the other interactors with which it engages as a part of some assembled system (the system’s macrotheory). Figure 1 provides a simple illustration of an hypothetical system of interactors. It is focused around a number of basic interactors—[B]s. These are the basic units that interact and their behavior is regarded as constrained by their constituents—[C]s, and by the overall organization of the assembled system [A]. We require macrotheory to understand how the particular assembly [A] functions, and we require microtheories of all the basic interactors to understand how their behavior is determined by their decomposition. There is unlikely to be a simple theory that explains how the [A] node behaves in terms of all of the possible [C] nodes.
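As an illustration of this compositional reading of Figure 1, the sketch below renders interactors as a recursive data structure: each interactor may be decomposed into constituents, and an assembled system is itself just another interactor. This is a minimal sketch in Python under our own assumptions; the class and the constituent names are invented for illustration and anticipate the user, computer, and document example discussed next.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Interactor:
    """A relative construct: any interactor may be composed of other interactors."""
    name: str
    constituents: List["Interactor"] = field(default_factory=list)

    def decompose(self) -> List["Interactor"]:
        """Return the lower-order interactors whose microtheory constrains this one."""
        return self.constituents


# An assembled system [A] built from basic interactors [B],
# each composed of constituent interactors [C] (names invented).
user = Interactor("user", [Interactor("perception"),
                           Interactor("decision"),
                           Interactor("action control")])
computer = Interactor("computer", [Interactor("I/O devices"),
                                   Interactor("processor")])
document = Interactor("document", [Interactor("pages"),
                                   Interactor("binding")])
system = Interactor("assembled system", [user, computer, document])
```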
In the case of human computer interaction, a system of basic interactors might minimally be composed of a user, a computer and other things used in the task context, such as a printed document. The behavior of the user (a [B]) will be constrained by the properties of components ([C]s) that interact within the user’s mental architecture. The appropriate body of microtheory for this interactor will be in the realms of psychology and biomechanics
(e.g., perceptual mechanisms, decision mechanisms and mechanisms for the control of action, etc.). The body of theory might exist only in “verbal” form, or it might include some AI simulation. Likewise, the behavior of the computer system (a different [B]) will be constrained by its components (I/O devices, processor properties, etc.). The microtheory for the computer interactor will be in the realm of computer science, and may well be expressed in formal logic or mathematics. The “behavior” of the document would equally be constrained by factors like the flexibility of its physical components and how they were bound together. This microtheory will be in the realm of book manufacture, and may be less abstract than the other two microtheories. For the assembled system of interactors, at least three types of microtheory are required: a model of the psychological system, a model of the computational system and a model of a physical system. The relevant macrotheory for the assembled system of interactors would specify how their conjoint behavior is constrained, and must abstract over microtheories from different scientific realms. Only the combination of micro- and macrotheory would provide a complete theory of this assembled system of interactors. A coherent theory made up of an interrelated body of micro- and macrotheory will, for the purposes of a later contrast, be referred to as a “Type 1” theory.
The importance attached to the potential contribution of a body of theory quite obviously depends on the type of system we are interested in analyzing. One analyst’s macrotheory will be another’s irrelevant details. Those interested in modeling the operation of a psychological system or a computing system, and in applying their models to HCI, are likely to be concerned with the layers shown in Figure 2. At the apex of this figure is an overall system composed of a user, a computer and some other interactor, which might again be documentation. The topmost level is labeled a behavioral or syndetic system. The term syndetic [Duke et al. 1998] is derived from the Greek term syndesis meaning to bind together. We use the term syndetic to refer to the specific case of behavioral systems that are composed of interactors of fundamentally different types. To the sides of the apex are shown a psychological system of interactors and a computing system of interactors, and to the side of these are shown their respective refinements into neurological and electronic systems. The psychologist with a cognitive model may well have designed it so that the kinds of mental modules that contribute to their models are at least consistent with what is known about underlying neurological architecture. However, those with an interest in the functioning of organizations are unlikely to be interested in such minutiae. They might wish to see an extension of this form of diagram “upward” rather than “downward,” with higher-order systems in which the basic interactors are teams, whose constituents are the type of system at the apex of Figure 2.
Fig. 2. Systems of Interactors at different levels of explanation organized by Type 1 and Type 2 theories.

Systems and levels are distinguished in terms of the focus of scientific attention. Each system is organized around entities that behave. In the neurological system the things that behave are neurons or glands that release hormones. In the psychological system the things that behave can be thought of as processes that construct or change mental representations. In a behavioral or syndetic system, the things that behave are humans and technological artifacts. Unlike a strict hierarchical decomposition of successive systems, Figure 2 overlaps hierarchies at different levels. This enables us to highlight two characteristics. First, when we focus our attention on the behavior of a system, we adopt a frame of reference appropriate to the entities that make it up. This requires us to consider both the organization of the assembled system and the subordinate constituents of the entities. A complete Type 1 theory is composed of macrotheory and microtheory. Newell’s arguments were applied primarily to the unification of theories within a cognitive layer. We take this form of argument and generalize it to any system level. Second, when we adopt a frame of reference for model building or theory development, we do so in terms of the scientific semantics of a particular class of theory. When we move from one level of system to another we use theories with different form and content, as identified by Marr [1982] and others. This is shown in Figure 2 by introducing the notion of a Type 2 theory, which is the mapping from the macrotheory of one level of explanation into the microtheory of another and vice versa.
A Type 2 theory interrelates adjacent layers of systems analysis. This kind of theory maps from the superordinate composition of one system into the basic units of the higher system, and from the basic units of the lower system into the constituents of the higher system. When we move up a level we discard the microtheory of the lower level, and when moving down we add microtheory to support more detailed explanation or implementation. This is marked in Figure 2 by the horizontal arrows to the left and right of the syndetic system. In moving to either a human or computer system, a [B] unit of the syndetic system becomes the [A] entity of our new theoretical domain. Its [C] interactors become the [B] interactors of our new theory, and we need a new body of microtheory to add in the new [C] structures, which were not specifically represented in the theoretical analysis of the syndetic system. Type 2 theories are needed to translate the syndetic concepts into psychological or computational concepts. Similarly, Type 2 theories are needed to specify how psychological concepts are realized in neurological systems and how computing systems are realized in electronic ones.
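The level shift that a Type 2 theory performs can be pictured schematically. The sketch below is a minimal illustration with invented names rather than anything prescribed by the paper; it shows how refocusing on one [B] unit discards the old macrotheory, promotes that unit to [A], promotes its [C]s to [B]s, and adds fresh microtheory for the new [C] structures.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Level:
    a: str                      # the assembled system [A]
    bs: List[str]               # the basic interactors [B]
    cs: Dict[str, List[str]]    # microtheory: constituents [C] of each [B]


def shift_down(level: Level, unit: str, new_micro: Dict[str, List[str]]) -> Level:
    """Refocus on one [B]: it becomes [A], its [C]s become the new [B]s,
    and new_micro supplies the [C] structures the higher level left implicit."""
    return Level(a=unit, bs=level.cs[unit], cs=new_micro)


syndetic = Level(
    a="user + computer + document",
    bs=["user", "computer", "document"],
    cs={"user": ["perception", "decision", "action control"],
        "computer": ["I/O devices", "processor"],
        "document": ["pages", "binding"]},
)

# Moving down to the psychological system discards the syndetic macrotheory.
psychological = shift_down(
    syndetic, "user",
    new_micro={"perception": ["feature extraction", "pattern matching"]},
)
```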
Macrotheory at one system level is not the same as microtheory in an adjacent and overlapped level. As Marr [1982] noted, there may be concepts in one level of explanation that have no direct realization in an adjacent level. A key contrast at one level may, for example, be an emergent property of a lower-order system. Alternatively, a given level of system may be based upon the same lower-order foundations but use the higher-order constituents in different ways. In Cantonese, pitch contour may completely change the meaning of a word. Since pitch contour and stress interrelate, the communicative functions of these resources are restricted and another resource must be used to fulfil some of the functions of pitch contour in English and French (e.g., see Barnard and Marcel [1984]). This makes little difference to the macrotheoretic architecture of eastern and western brains. However, it would need to be accommodated in alternative microtheories of language comprehension at a psychological level of explanation. Similarly, in HCI, there can be many alternative ways of combining user and computer capabilities to realize interactions that have particular behavioral properties. These need to be accommodated in modeling higher-order systems.
In the years since Newell and Card’s [1985] discussion of the problems faced by theory development in HCI, there can be little doubt that significant advances have been made in the development of unified cognitive architectures and in placing modeling on firm computational foundations. SOAR [Newell 1990], ACT-R [Anderson 1993], and EPIC (e.g., Meyer and Kieras [1997]) have all achieved significant application successes in this domain. However, the progress has been rather uneven. The advances tend to have been concentrated on particular attributes of tasks. These reflect the purchase that AI architectures, inspired by Newell and Simon [1972], provide on the acquisition and use of knowledge in task execution. Progress has been rather more modest on other topics such as modeling human understanding and use of dynamic graphical interfaces, multimodal perception, and emotion. Our ability to model these aspects of a user’s psychological system continues to lag the leading edge of interface design by a substantial margin. In these areas, our existing microtheories require considerable development, and forms of macrotheory need to be specified into which they too can be more readily integrated. Only then might we expect a mature Type 1 theory of our complete psychological system to emerge. Existing attempts at unified theories of cognition remain only partial macrotheories of the complete psychological system.
The gulfs between different levels of analysis remain wide. We are not very good at establishing coherent theoretical connections of Type 2, and it is this weakness that may lead to the fractionation and demise of HCI as a coherent science. Some find the very concept of exchange between theoretical frameworks hard to grasp, but it is vital if theoreticians from different domains are to communicate with each other. Someone whose primary interest lies in the overall design of a human-computer work system is unlikely to be all that interested in the fine grain details of the limitations on human spatial working memory or in the limitations of graphics algorithms used to render a particular image. They will want to know about how such limitations are likely to impact the performance of the work system itself and how they are likely to trade off with other constraints in their design space [Norman 1983]. When they seek simple answers from other disciplines they are quite likely to be frustrated. They may, for example, find a range of competing psychological theories. Each of these may be formulated in a quite different terminology from each other and address slightly different ranges of issues and outcomes. Few will come with ready-made Type 2 links to the designer’s work system interests, and the designer will struggle to establish such links for themselves. For the purpose of theoretical advance the theorists may even have drawn attention to areas where their model makes different predictions to competing models, rather than highlighting their areas of agreement. All of this can make it extremely difficult to deliver an interdisciplinary synthesis at one level that is based upon principled reasoning grounded in other levels [Bellotti et al. 1996; Blandford and Duke 1997].
The overlapped hierarchy of Figure 2 draws our attention to two distinct and key roles for macrotheory. It provides the connective tissue that binds together those microtheories of the entities that make up any assembled system of interactors. It also provides the key level of abstraction that should enable us to carry over relevant knowledge, in a systematic way, from the science base of one level of system analysis to the next level up. Our current theory base may well be getting better at modeling and predicting the behavior of humans and computers in specific task contexts. However, it will remain of limited utility until and unless we develop true macrotheories that can meet the challenge of providing the connective tissue for both Type 1 and Type 2 theory.
As with principal components analysis in statistics, vector analysis of physical forces, or Fourier transforms of complex waveforms, the problem of developing better theory needs to be broken down into clearly defined parts. As a boundless domain, HCI needs Type 1 microtheories of interactors and macrotheories of their interaction. It needs such theories at different levels of abstraction that extend from the fundamentals of the behavior of users and computers all the way up to the coordination of people and technologies in large-scale organizations [Malone and Crowston 1994]. Historically, HCI has made good progress in building specialized Type 1 microtheories, and some progress at macrotheoretic integration. If HCI as a whole is to maintain some overall unity and coherence, it will also have to nurture the development of Type 2 theories. They are needed to support effective communication between those whose focus of attention is at different levels. Type 2 theory development does not require the wholesale rejection of existing theory. It requires us to consider how the scientific semantics preferentially adopted in one layer of systems analysis can be more systematically mapped to the scientific semantics adopted in adjacent layers. The development of Type 2 theory is vital to enable knowledge in all the relevant disciplines to be brought to bear systematically on the solution of design problems involving the use of computers by individual users, by groups, and in organizations.
3. MACROTHEORY AND INTERACTION
Our fundamental conjecture is that macrotheories at all levels of system decomposition can be represented within a general modeling framework, and that this framework can provide the support needed for theorists to build the Type 2 theories that are lacking in HCI. The objective is to capture the interdependencies between interactors at any level of system decomposition. The framework is intended to provide the scaffolding both for the specification and elaboration of microtheory at any level and for moves from one level of analysis to another. Such a framework should enable us to express theoretical ideas originally developed in the specialized languages of different disciplines in a form that captures properties of interactions in more generic terms. The key claim is that macrotheory for the behavior of any system of interactors can be represented as a complex function of four distinct classes of constraint:
System behavior = Fn(Configuration of Interactors;
    the interactors’ individual Capabilities;
    the Requirements that must be met to use those capabilities; and
    the regime of Dynamic Control and Coordination of the interactors)
This four-component framework was originally introduced as a basis for developing an explicitly formulated body of macrotheory concerning the behavior of the human mental architecture [Barnard 1987]. Here we represent that framework in an abstract form that can be generalized to all systems of interactors, and we shall develop some concrete examples of how it might be applied at different levels of analysis.
The Configuration defines the identity of the basic interactors that make up a system and specifies their potential for engagement with each other. Their engagement might have physical or informational properties. For example, a system of three interactors might be configured so that they can all communicate with each other directly, or the channels might be more constrained with Interactor 1 being able to communicate with Interactor 3 only indirectly via Interactor 2.
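Read as a relation over interactors, that constrained configuration can be written down directly. The snippet below is an encoding of ours, not the paper’s notation; the interactor names I1, I2, and I3 are placeholders.

```python
# Direct channels of engagement in the constrained three-interactor system:
# Interactor 1 can reach Interactor 3 only indirectly, via Interactor 2.
channels = {("I1", "I2"), ("I2", "I3")}

direct_1_to_3 = ("I1", "I3") in channels                                 # False
indirect_1_to_3 = ("I1", "I2") in channels and ("I2", "I3") in channels  # True
```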
The Capability of an interactor is defined as the transformations in information states or of physical state that it can accomplish. As basic interactors within a cognitive architecture, mental processes can be defined as having the generic capability to change the form or content of a mental representation. The generic capability of an interactor composed of a human and a technological device might be that of document preparation,
with a repertoire of more specific capabilities for the human and the software.
The Requirements that must be met for an interactor to realize a specific capability are essentially the states that it needs to function—be they physical or information states. The mental process for language understanding may require a clear incoming phonological representation in a language it has learned.
Systems behave over time and the fourth component of the framework, the regime of Dynamic Control and Coordination, is intended to summarize properties of system activity on a temporal dimension. If we take a time slice of activity within a system, there will be some dynamic properties that characterize the overall state of the interaction. So, for example, a system may be in a state where the pattern of information or physical exchange among interactors is stable over time. Alternatively, a system may be engaging in two or more patterns of information exchange repeatedly over some period, perhaps with a dominant and a subsidiary activity oscillating rapidly, or perhaps with more prolonged phases of each pattern being interleaved.
The ways in which activities are synchronized and controlled are also included in this fourth class of constraint on system operation. In some systems, wider control may be an emergent property of synchronous exchanges between interactors. In other systems, particularly military or managerial ones, some interactors have the explicit capability to direct or control the activities of others. At a macrotheoretic level, we still need to capture any states of activity where the effective locus of control lies within a set of interactors, and how the pattern of control changes over time.
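One way to picture the four classes of constraint together is as a record whose fields hold the Configuration, Capabilities, Requirements, and regime of Dynamic Control and Coordination. The sketch below is a minimal Python rendering under our own assumptions; the field types, the can_exercise check, and the document-preparation names are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass
class Macrotheory:
    # Configuration: which interactors exist and who can engage with whom.
    configuration: Set[Tuple[str, str]]
    # Capability: the state transformations each interactor can accomplish.
    capabilities: Dict[str, Set[str]]
    # Requirements: the states an interactor needs in order to function.
    requirements: Dict[str, Set[str]]
    # Dynamic control and coordination: the locus of control in each phase.
    control: Dict[str, str]


def can_exercise(m: Macrotheory, interactor: str, capability: str,
                 available: Set[str]) -> bool:
    """A capability is usable only if it is held and its requirements are met."""
    return (capability in m.capabilities.get(interactor, set())
            and m.requirements.get(interactor, set()) <= available)


docprep = Macrotheory(
    configuration={("user", "computer"), ("user", "document")},
    capabilities={"user": {"compose text"}, "computer": {"format text"}},
    requirements={"computer": {"text input"}},
    control={"drafting": "user", "reformatting": "computer"},
)
```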
For example, it is well known that most people can drive a car and hold a conversation at the same time. The moment-to-moment dynamic control of the mental processes required to drive the car may reside mostly within a peripheral perceptuo-motor configuration, while central thought processes are primarily engaged in controlling an independent auditory-verbal configuration. All that might be required under these circumstances is an occasional oscillation in which central processes are momentarily redeployed to monitor some aspect of the driving task. If a small child were to move from the sidewalk onto the road, then the complete mental system would reconfigure. The peripheral configuration would be brought directly under the control of central processes. The central control of driving would be interleaved and the conversation would consequently pause [Barnard 1999].
The same description can be applied to the workings of a team. In the context of an open-plan call center a team might be taking product orders over the phone while entering them into a computer. At this level individuals, computers and phones are the basic interactors. The supervisor might be working on a primary task, periodically glancing at activity in the center. Were they to notice a problem they might interrupt their current task to go and interleave an activity of helping someone. In this interaction, the supervisor is the locus of control for the team. On a fine grained time scale the activity of the supervisor’s psychological architecture, a constituent interactor at this level of system analysis, would be described as oscillating between a primary and secondary task during the first segment of activity. In the second segment the supervisor interleaves an activity. When viewed from the coarse grain of workflow over the course of a week, an interleaved activity at the lower level may be re-represented as an oscillation in the configuration of the workforce.
In the driving example, the analysis focuses on what specific mental processes are doing. In the second example the focus is on what individuals in a team are doing. The four-component framework enables us to explore the possibility of macrotheoretical principles that relate configurations, capability and requirements to the behavior of a system, acting over time. For example, the pattern of dynamic control and coordination of a system of interactors may alter in a principled manner when particular generic attributes of configurations, capability or requirements apply. The extent or complexity of dynamic control and coordination may rise when capabilities are suboptimal, when requirements are not met, or when a configuration is depleted. At the level of a psychological system, a novice driver may have few of the skills required to coordinate perception and motor aspects of driving without thinking about the driving task. As they become more experienced, the proportion of fully automated skills would rise and with it the proportion of time that central processes could be configured to sustain uninterrupted conversations with a passenger. If the call center team were composed of a high proportion of individuals with less than optimal capability, then workflow might involve a high rate of oscillations in the configuration of the team. Other examples of possible commonalities in generic properties of interactions at different levels of systems analysis are provided by Furnas [2000].
The idea that the behavior of systems can be captured in terms of systematic relationships among four generic classes of constraint does not mean that all systems are governed in exactly the same way. Systems that are configured in different ways and whose activities are subject to different regimes of dynamic control and coordination would be expected to exhibit different behaviors. The four-component framework is intended to provide a basis for developing macrotheories that capture both the similarities and the differences in behavior of different systems of interactors. The content of the macrotheory is not absolute. It must be bound to relevant microtheories of the individual interactors of which it is composed.
Fig. 3. An outline characterization of a behavior trajectory subject to four “generic” classes of constraint (modified for the current perspective from Barnard et al. [1987]).

The behavior of any system of interactors evolves over time, and that behavior can usefully be thought of as a trajectory through a set of possible states of an interaction. The behavior trajectory of a system of interactors, be they cognitive, computational, syndetic, or organizational, can itself be decomposed. Figure 3 depicts a trajectory of continuous interaction divided into segments. These each approximate a state of activity among the interactors. There is a transition from one segment to another when there is a consequential change in configuration, capabilities, requirements or the pattern of dynamic control and coordination of the basic interactors within the system. One segment captures the properties of system behavior in the very short term (VST). A phase of activity is a sequence of related short term (ST) transitions among related segments. A transition from one phase to another would typically be associated with longer term (LT) changes in the properties of systems. A number of distinct phases may contribute to a trajectory. Drawing upon Newell’s [1990] proposals, in the modeling of a psychological system of interactors “segments” can be represented by very short term activities lasting a few hundred milliseconds. A “phase” might be represented by periods of up to 10 seconds [Barnard and May 1999]. Changes in psychological capability brought about by knowledge acquisition and learning would naturally encompass far longer time scales. Exactly what falls within the scope of a segment, phase and trajectory, as well as the time scales for very short term, short term and long term transitions, would be bound to the definition of basic interactors, and sensitive to the activities modeled.
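A minimal sketch of this decomposition, under our own assumptions about representation, might treat a trajectory as phases of segments, with a transition wherever a snapshot of the four constraint classes changes consequentially. The time bands in the comments follow the VST/ST/LT distinctions above; everything else is invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    # Snapshot of the four constraint classes; VST, ~hundreds of milliseconds.
    state: Dict[str, str]
    duration_s: float


@dataclass
class Phase:
    # ST: a sequence of related segments, on the order of seconds.
    segments: List[Segment]


@dataclass
class Trajectory:
    # LT changes (e.g., learning) typically separate one phase from the next.
    phases: List[Phase]


def transitions(phase: Phase) -> List[int]:
    """Indices at which a consequential change in some constraint class occurs."""
    return [i for i in range(1, len(phase.segments))
            if phase.segments[i].state != phase.segments[i - 1].state]
```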
Thinking about a system’s behavior as a trajectory governed by system- atically structured sets of constraints is quite different from the more usual forms of step-by-step task analysis conducted in HCI and traditional human factors research. As with simulations of artificial life, the way in which system behavior evolves over time is an interaction of constraints. Each segment or phase of interaction has a point of departure and an
outcome. The outcome of one phase is the point of departure for the next, and each will be defined in terms of the attributes of configurations, capabilities, requirements and dynamic control and coordination that apply at that point. All of these are variable. So, for example, the capabilities of interactors can change—as they do when an individual or an organization learns. Similarly, new interactors may be introduced, requirements may change and the regime of dynamic control and coordination for a given system may change—for example when a business undergoes reorganization or when politicians change the rules of engagement for their military forces.
The whole trajectory can be thought of as analogous to a sentence in natural language, with the segments being analogous to words and the phases analogous to clauses. Just as rules apply to sentence formation in a language, so a class of interaction is governed by a collection of high-level rules. Just as language enables an infinite set of sentences to arise from its vocabulary and rules of combination, so there can be an infinite set of behavior trajectories for some systems. Just as different grammars and vocabularies apply to different languages, so different rules and segmental analyses might be applied to different classes of interaction.
Up to this point, the arguments for a generic form of macrotheory have necessarily been rather abstract. The next section provides a concrete illustration of an analysis of an interaction trajectory and subsequent sections discuss how the wider schema for macrotheory could actually be specified and delivered in practice for psychological, computational and syndetic systems as well as its possible extension to higher-order systems of interactors.
4. CAPTURING SIGNIFICANT VARIATION IN INTERACTION TRAJECTORIES
When interactions are inefficient or “go wrong,” traditional forms of analysis try to eliminate specific causes of error by redesigning the system, redesigning the task, by retraining users, or by changing the allocation of function between users and technologies. An analysis based around systems of interactors and the trajectories of their conjoint behavior can potentially help us to think about what is going on in new ways. Couched in these terms, errors represent detours in an interaction trajectory [Blandford et al. 1995]. Once a computer user makes an error, they typically have to take a number of additional steps to recover.
Fig. 4. Two trajectories for interactions with a “selected” and “unselected window” (modified from Barnard and Harrison [1992]).

A well-known example of this is the unselected window scenario. A computer user, who is interleaving activities conducted in different windows, may start typing only to find that the products appear in a window they are not looking at, or that one window changes from being inactive to active. Trajectories for this situation are shown in Figure 4. This is a syndetic system composed of a user and two computers. It is assumed that the microtheory for the trajectory of a user’s mental activity would be captured in a cognitive model. This model constrains the sequence of mental events that cause outcomes in segments of mental activity. These are represented by the linear sequence CE1…CEn linked to the cognitive model. Exactly the same description applies to the model of the devices, in this case computers. These are represented in the lower part of the figure by the sequence DE1…DEn etc. Alternative trajectories for the interaction are shown in the center as a series of phases (Px, Py, Pz), one of which is decomposed further into segments (PzSa, PzSb). Each of these represents not a user state or a device state but a state of interaction, or engagement between the interactors within the system.
The trajectory shows a case where a user is conducting a main task on one computer, in this case share dealing, and is periodically interrupted by requirements to deal with a separate task on a different computer (in this case a numbers task). The share dealing task has two windows and requires frequent movement from one to another. The figure shows a phase of interaction (Px) on the main task followed by an interruption (Py). When returning to the dealing task, interaction may be reengaged with the active window on trajectory 2, or it may be reengaged with an inactive window (PzSa) requiring a segment of trajectory for repair (PzSb). The trajectory description is not a combination of “user does this, system does that.” It is a representation of a state of engagement between the two. An attribute of dynamic control and coordination would mark the fact that in PzSa the activity of the two interactors within their engagement was not coordinated.

Fig. 5. A trajectory designed to include overt marking of transitional phases.
In many other types of interaction it is clear that people do not just stop one activity and start another. There tend to be transitions that are explicitly marked as distinct phases of interaction. The most obvious cases occur in human conversation. We do not simply start up a conversation with someone; we go through an orientation phase of saying hello, exchanging pleasantries, and only then get on with the real task once common ground is established. A similar transitional activity occurs as conversations are closed down, and the various phases appear to be clearly governed by conventions or principles [Clark 1996].
From this generic principle, we reasoned that the situation with unselected windows could be improved, not by redesigning the way in which an active window was marked, but by redesigning the interaction trajectories to introduce two kinds of transitional phases (TPs). These are illustrated in Figure 5. One transitional phase is labeled “Possible Disengagement” and the other “Transitional Resumption” [Barnard and Harrison 1992].
In most forms of interaction of this type, the user generally holds the initiative [Blandford et al. 1995]. They are the locus of control within the system of interactors. When the user stops doing anything, the conjoint state of the interaction is unclear—the user may be looking at something on the VDU, they may be interleaving an alternative activity, or they might have gone to lunch. In this case, we designed a system that responded to potential disengagement by taking the initiative. After a period of zero input, the computer changed a property of the currently active window—the window border gradually started to “fizz,” pixels in the border going off and on. If, at any point in the gentle increase in this attribute, the user did anything with the mouse or keyboard, the window border returned immediately to its passive state. If the user continued to do nothing the border reached the maximum extent of its “fizzing” capability (steady-state disengagement). At any point when the user reengages with this system (transitional resumption), the properties of the active window attract the user’s attention. As soon as the user carries out an action, the border returns to its more passive indication that it is in an active state. Work with an experimental system demonstrated that it led to substantially fewer unselected window errors than occurred with systems that did not mark the transitions in this way [Lee 1992].
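The behavior of the border can be read as a small state machine: sustained inactivity ramps a dynamic attribute up toward steady-state disengagement, and any input snaps it back to the passive marking of the active window. The sketch below is an illustrative reconstruction, not the code of Lee’s experimental system; the threshold and maximum values are invented.

```python
IDLE_TICKS_BEFORE_FIZZ = 50   # zero-input period before fizzing starts (invented)
MAX_FIZZ = 10                 # steady-state disengagement level (invented)


class FizzingBorder:
    """Border of the active window; fizz == 0 is the passive marking."""

    def __init__(self) -> None:
        self.idle_ticks = 0
        self.fizz = 0

    def tick(self, user_input: bool) -> int:
        """Advance one clock tick and return the current fizz intensity."""
        if user_input:
            # Transitional resumption: the first action returns the border
            # immediately to its passive active-window state.
            self.idle_ticks = 0
            self.fizz = 0
        else:
            self.idle_ticks += 1
            if self.idle_ticks > IDLE_TICKS_BEFORE_FIZZ and self.fizz < MAX_FIZZ:
                # Possible disengagement: the system takes the initiative and
                # gently increases the dynamic attribute of the border.
                self.fizz += 1
        return self.fizz
```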
What is interesting about Lee’s experimental results is not the reduction in errors, but that the design changed the overall pattern of behavior across the whole interaction trajectory. Although there was a small reduction in the occurrence of unselected window errors when the user returned to the main task following an interruption, the greatest reduction occurred during continuous segments of engagement on the main task. When working on the main task such errors occurred in the natural course of the user moving from one window to another. During such phases conditions virtually never arose where the user was inactive for long enough for the border to become “fizzy”—this was designed to happen only when the subsidiary task was interleaved.
The ultimate explanation of this result lies in a more detailed microtheory of user cognition. Users presumably developed a generic schema for interaction which led to an evaluation of border state on each move among windows, whatever the exact context. However, the general pattern is best understood by reference to a more abstract macrotheoretic principle concerning interaction in the higher-order system. The principle requires transitional phases to be explicitly marked in the interaction in a way that is sensitive to context. The dynamic control and coordination of the interaction depends on a number of interrelated attributes that can be represented in the four-component framework. These link the general principle of marking transitional phases in a trajectory to more specific system and user capabilities as well as their coordination:
—A property of system capability—dynamic and passive attributes of the window border.
—A property of user capability—the schema for monitoring window state.
—A property of dynamic control and coordination—where the system acts as the locus of control in one phase and the user in others.
A change to any of these properties may change the likelihood of detours occurring. Indeed, the same reduction in errors does not occur when the active window “fizzes” all the time and is not sensitive to the transitional phases [Lee 1992]. What is new here is that thinking about conjoint states of interaction, and the abstract attributes that underlie them, quite naturally led to the idea of a design solution. That solution would not have been obvious by considering a task, a user model and a system model in isolation. It required a model of the interaction and its trajectory.
We have discussed a number of general properties of interaction trajectories elsewhere in the literature [Barnard and Harrison 1989; 1992; Harrison and Barnard 1993; Blandford et al. 1995]. An important aspect of the approach is to see if properties of trajectories can help us to understand how different phenomena are related. Detours such as that in the unselected window scenario occur in many contexts. A classic series of studies by Carroll and his colleagues (e.g., Carroll and Carrithers [1984]) showed that the availability of large sets of functionality provided an environment in which novice users could go down all sorts of confusing and inessential paths. A “training wheels” system was designed that explicitly blocked access to a significant range of system capability. This was shown to help users learn more effectively. The concept of potential captures the idea that some systems, like automatic teller machines, have a very small state space for interaction, while others, such as UNIX, have an enormous state space [Barnard and Harrison 1989]. As with the training wheels system, the acquisition of user capability may be best served by designing interaction trajectories of limited potential. At the other extreme, advanced users may benefit from the most direct and immediate route to that same functionality, however infrequently used.
This form of analysis does not represent or distinguish concrete properties of good or bad trajectories. It is necessary to understand the range of trajectories that may be possible, and their relationship to the design context. Complex trajectories may be bad where the requirements include a concern for efficiency and speed. However, both theorists and designers may have other concerns. In the context of safety-critical systems, more complex and involved trajectories may be warranted. In the context of computer games, trajectories might be designed to achieve a positively motivating balance between success and failure, as well as challenging skill development. It may be appropriate to guide design by including as a requirement something like: “for a range of users with varying capabilities (skill levels) the normative segments of interaction trajectory for play should mostly lie within a range of 5–12 exchanges” [Barnard and Harrison 1992]. Below this range, interactions may be frustrating. Above it, the interactions may be boring.
Just as there can be no absolute gold standard for the complexity of trajectory properties, so it is unlikely that there will be any simple recipes
for deciding when it is appropriate to add functionality to a system. Successive periods of technology development have often led researchers to ask direct empirical questions about the consequences of adding functionality. Adding video-channels to communication links may deliver benefits only in specific circumstances [Veinott et al. 1997]. The addition to an interface of faces that show emotion, or agents with a human-like persona, can have subtle influences on properties of an interaction, perhaps affecting user satisfaction more than traditional measures of the efficiency of task performance [Walker et al. 1994; van Mulken et al. 1998]. As with the unselected window problem, empirical variation in these kinds of setting is unlikely to yield up all of its secrets to step-by-step forms of task analysis. Taking each of these problems individually, it may be that sufficient experimentation and microtheorizing can identify all of the sources of variation in behavior, to the point that the examples can be said to be “well understood” in isolation. This does not lessen the benefit of macrotheory. Adding the macrotheoretical level of explanation allows them to be understood together, as different facets of a common problem, in this case that of interaction trajectory. We suggest that modeling the abstract properties of trajectories will be a productive basis for understanding the significant variation evident in the results from user testing in HCI. Such progress can only be made on the basis of macrotheory.
Our discussion of the unselected window scenario focused on the properties of behavior trajectories for a syndetic system involving one user and two computers. For the purposes of illustration, it assumed an unspecified cognitive model of the user and an equally unspecified model of the computer system. If our arguments for macrotheory are to be realized, we need to demonstrate how a macrotheory of the user’s psychological system and a macrotheory of the behavior of a computer system can both be mapped into a well-specified model of their interaction.
5. TOWARD MACROTHEORY FOR PSYCHOLOGICAL SYSTEMS OF INTERACTORS
In his argument for integration, Newell [1990] called for the development of unified theories of cognition. Much of the work that followed was based in an AI tradition of simulating a range of phenomena within a single mental architecture. Our approach to integration has followed a different course. Rather than simulating user cognition, we have sought to specify a mental architecture and its principles of operation, and then use rules or mathematics to infer properties of its behavior across a range of conditions. This approach is like the modeling of economic systems, where a set of equations is used to infer what is likely to happen across an economy were taxes to be reduced or increased.
The justification of the particular architecture we use, and its approach to modeling, is beyond the scope of the present article. The architecture was originally developed to integrate accounts of multimodal aspects of human short-term memory with those of language processing, attention
and central executive functioning (e.g., Barnard [1985; 1999]). The architecture has been applied to the decomposition of HCI tasks [Barnard 1987], to multimodal aspects of HCI performance [Barnard and May 1995], and to the understanding of dynamic graphical displays [May and Barnard 1995b]. It has also been applied to the effects of emotion on human cognition and to clinical disorders such as depression [Teasdale and Barnard 1993], and to the investigation of the structure of central executive functions [Scott et al. 2001].
The basic theory starts from a noncontroversial position. It assumes that our mental architecture is a distributed system in which processes, specialized to handle different facets of mental life, operate concurrently. It specifically assumes that our mental architecture is composed of nine subsystems, whose general organization is shown in Figure 6. More controversially, the theory assumes that the operation of all these subsystems is governed by a common set of underlying principles. Equally controversial is the assumption that all types of mental representation have common principles of construction—differing only in the form in which information is actually encoded.
All of these subsystems are composed of processes. These are the basic interactors within the wider psychological system. The basic interactors are of three types. The first type includes all those processes that transform one kind of mental representation into another, such as the transformation of a visual representation into an object-based representation (e.g., VIS → OBJ). The second type of interactor includes those processes that construct representations over time (e.g., COPY VIS). The third type of interactor includes record processes, or image records, which have the capability to regenerate representations of past inputs—a memory system in more conventional terms. In the figure the vertical depiction of black and white nodes on the left-hand side of each subsystem represents the “array” of data that contains current input. Subsystems interact when a process in one subsystem generates an output for use by another subsystem, be it based upon information flowing into a subsystem in real time, or be it based on information regenerated by the record process. Within this schema, mental activity (the behavior of this system of interactors) is composed of interactions between subsystems, hence this theory is called Interacting Cognitive Subsystems, or ICS [Barnard 1985].
In ICS, three sensory subsystems deal with information derived either from distal sources (ACoustic and VISual, shown on the left) or from Bodily States (BS, shown center right), including feedback from SOMatic and VISCeral response systems. Two subsystems handle the coordination of actions in the world through skeletal movement (the LIMb subsystem) and verbal communication (the ARTiculatory subsystem). Four subsystems handle higher-order abstractions of information. They are (a) the Morphonolexical subsystem, which represents auditory verbal abstractions; (b) the OBJect subsystem (visuospatial representation); (c) the PROPositional subsystem, which encodes semantic representations in a form that is
referential and relationally specific; and (d) the IMPLICational subsystem, which represents more generic semantic relationships. This subsystem encodes schematic models of the broader existential state of the complete system, situated in a body, in a wider sensory environment. The content of these models encompasses higher-order regularities derived over the products of processing within the propositional, acoustic, visual, and body state subsystems. It is at this level that emotion is experienced, and this subsystem
handles the wider existential regularities that constrain the content of thought as realized in Propositions.
Fig. 6. Interacting Cognitive Subsystems (after Teasdale and Barnard [1993]).
The use of upper case marks the abbreviated forms of reference to the different forms of mental representation. These abbreviations index the processes shown in Figure 6. For example, the Acoustic subsystem (upper left) contains a number of processes. One (COPY) transfers an incoming auditory waveform to the memory process (the Acoustic Image Record). A second process (AC → MPL) transforms the incoming speech waveform into a higher-order representation of its content in a different code. A third process (AC → IMPLIC) maps other attributes of the acoustic waveform, such as “tone of voice,” directly into more abstract semantic code. The three processes operate concurrently, as indicated by their parallel arrangement within the diagram.
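To make the three types of basic interactor concrete, the following minimal sketch models them in Python. The class names, string contents, and recoding functions are our own illustrative inventions, not part of the ICS formalism; the example simply mirrors the three concurrent Acoustic processes just described.

```python
from dataclasses import dataclass, field

@dataclass
class Representation:
    code: str        # which mental code, e.g., "AC", "MPL", or "IMPLIC"
    content: object  # the information carried in that code

@dataclass
class Subsystem:
    name: str
    image_record: list = field(default_factory=list)  # store for the record process

    def copy(self, rep):
        """COPY process: preserve incoming input in the image record."""
        self.image_record.append(rep)

    def regenerate(self, index):
        """Record process: regenerate a representation of a past input."""
        return self.image_record[index]

def transform(rep, target_code, recode):
    """Transformation process (e.g., AC -> MPL): map one mental code to another."""
    return Representation(target_code, recode(rep.content))

# The Acoustic subsystem's three concurrent processes described in the text:
acoustic = Subsystem("AC")
waveform = Representation("AC", "incoming speech waveform")
acoustic.copy(waveform)                                            # COPY AC
surface = transform(waveform, "MPL", lambda w: f"surface structure of {w}")
tone = transform(waveform, "IMPLIC", lambda w: f"tone-of-voice schema from {w}")
```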
The processes shown in Figure 6 specify the interconnectivity of the overall mental architecture. They capture information flow between the interactors (Figure 7), and this can be used to define the configurations that play a significant role in different tasks. Some processes can talk to each other directly; others can only do so via an intermediary process. Sensory processes can only send to the central ones, effector subsystems typically receive flows from the central ones, and the central subsystems can exchange information with one another in well-defined patterns. The capability of the individual interactors is constrained by a small set of general information processing principles. For example, it is assumed that a given process can only transform a single coherent stream of data at any one point in time.
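One way of making this notion of configuration concrete is to represent the interconnectivity as a directed flow graph and check whether a proposed configuration respects it. The sketch below does this; the edge list is an illustrative abridgement read off Figure 7, not a complete or authoritative statement of the architecture.

```python
# Illustrative, abridged edge list based on Figure 7.
FLOWS = {
    ("AC", "MPL"), ("AC", "IMPLIC"),
    ("VIS", "OBJ"), ("VIS", "IMPLIC"),
    ("MPL", "PROP"), ("PROP", "MPL"),
    ("PROP", "IMPLIC"), ("IMPLIC", "PROP"),
    ("OBJ", "MPL"), ("MPL", "ART"), ("OBJ", "LIM"),
}

def configuration_valid(path):
    """A configuration chains processes; every link must be an afforded flow."""
    return all((a, b) in FLOWS for a, b in zip(path, path[1:]))

# Reading a word aloud might use the configuration VIS -> OBJ -> MPL -> ART:
print(configuration_valid(["VIS", "OBJ", "MPL", "ART"]))  # True
# Sensory subsystems cannot exchange information directly:
print(configuration_valid(["VIS", "AC"]))                 # False
```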
The patterns of information flow are intimately connected to the nature of the different types of mental representation. The transformations from one mental code to another exactly mirror the Type 2 theories outlined for scientific abstraction and refinement (Figure 2). By transforming information, mental processes are mapping from one system of representation to another. In the transformation from a sensory subsystem to central ones, detailed information about sensory properties is discarded, higher-order basic units of representation are formed, and the added value of the higher-order mental representation is the new superordinate organization it delivers (Figure 8). In recoding information received from a central subsystem, an effector subsystem would accomplish the opposite of the transformation indicated in the figure. For example, the ARTiculatory representation would be used to compute and coordinate the motor instructions for the various musculatures controlling lips, tongue, mouth, and breath during speech output.
The general argument implies that systems of mental representations abstract over the variance in the information received by a subsystem and that, through learning, the processes that transform mental representations must embody regularities in input that are directly mapped onto consequent regularities in output. The set of input-output relationships is the computational function of that process, and represents its capability.
[Figure 7 shows the data network linking the subsystems: the sensory ACoustic (AC) and VISual (VIS) subsystems, and the Body State subsystem fed by bodily receptors and the visceral and somatic response systems, send flows to the central Morphonolexical (MPL: surface form), OBJect (OBJ), PROPositional (PROP: specific meaning), and IMPLICational (IMPLIC: schematic model) subsystems, which in turn feed the effector ARTiculatory (ART) and LIMb (LIM) subsystems.]
Fig. 7. Some of the information flows afforded by the processes shown in Figure 6.
Where information is received from different types of sources (e.g., sensory and semantic), the associated system of mental representation will, as with our syndetic theory at the apex of Figure 2, model the interdependencies
over these inputs. This is the basis for the theoretical treatment of multimodal integration [Barnard and May 1995].
Fig. 8. The representational shift accomplished by a process transforming incoming sensory representations (from May et al. [1995]).
A relatively small set of processing principles [Barnard 1985] is assumed to constrain the transformation of mental representations (their capability). All nine forms of mental representation are hierarchically organized with a superordinate structure, basic units, and constituents. They are of similar form but different content. These representations are the input to subsystems over time. They are also the structures that are preserved in the image records of the subsystems. This enables us to formulate general principles about the requirements at input that must be met if the capability of the subsystems is to be fulfilled. This renders the problem of developing macrotheory for psychological systems tractable, not only in terms of its verbal formulation and heuristic rules, but also in terms of its formal expression in mathematics, as we show later.
In a series of theoretical exercises, first reported at CHI ’87 [Barnard et al. 1987], we have sought to model the behavior of the ICS architecture for a range of problems in HCI. These exercises make use of the four-component framework (Figure 9), which has been generalized in this article to all systems of interactors (Figure 3). In this approach, we used the theory to define detailed attribute spaces for configurations, attributes of procedural knowledge (Capability), record content (Requirements), and attributes of the dynamic control and coordination of wider mental processing activity. We approximated over generic phases of mental activity (goal formation, action specification, and execution) and over different parts of a learning trajectory (novice, intermediate, and expert). We also made use of the generic structure of mental representations (Figure 8) to develop a compact set of rules around which to organize the details of representations constructed and preserved in the various subsystems. The core theoretical ideas and techniques are reported elsewhere (e.g., Barnard [1987; 1991], Barnard and May [1993; 1995], and May and Barnard [1995a]). Their potential for practical application to interface design has also been detailed
in prototype handbooks and tutorials directed at practitioners, some of which are now available on the World Wide Web [May and Barnard 1997; May et al. 1995]. The details of how the four components can be realized in attribute spaces, and some of the principles and rules, are documented in a recent issue of the journal HCI [Barnard and May 1999].
Fig. 9. A family of cognitive task models (modified from Barnard and May [1993]). For each phase and segment, process configurations are represented explicitly, together with attributes of their capability, requirements that are met, and the properties of dynamic control and coordination that apply.
In order to provide the rules with some grounding in evidence, the results of a variety of laboratory studies were summarized. These were drawn from the general psychological literature and from specific experiments with command languages [Barnard et al. 1984; 1989], menu systems (e.g., Hammond and Barnard [1982]), and graphical interfaces (e.g., Green and Barnard [1990] and May et al. [1993]). On the basis of this empirical evidence we formulated a set of rules tying specific properties of tasks and interfaces to the attribute spaces, to generic principles, and from these to properties of user behavior. The resulting rules were then embodied in an expert system [May et al. 1993]. To support generalization and extension, the core modeling rules were organized into one set of knowledge
bases. These were separated from the two other major components of the system. One was a self-contained set of rules for collecting information about tasks, users and interfaces. The other set of rules was used to generate predictions or design advice.
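A schematic sketch of that three-way separation follows. The attribute names, rule conditions, and advice strings are hypothetical stand-ins of our own; the actual knowledge bases are documented in the sources cited above.

```python
def collect_description():
    """Elicitation rules: gather facts about the task, user, and interface."""
    return {"item_uncertainty": "high", "spatial_uncertainty": "low",
            "user_phase": "novice"}

# Core modeling rules: (condition over attributes, derived attribute).
CORE_RULES = [
    (lambda d: d["item_uncertainty"] == "high" and d["user_phase"] == "novice",
     ("executive_load", "high")),
    (lambda d: d["spatial_uncertainty"] == "low",
     ("visual_search_cost", "low")),
]

def apply_core_rules(description):
    """Derive higher-level attributes of mental activity from the description."""
    derived = dict(description)
    for condition, (attr, value) in CORE_RULES:
        if condition(derived):
            derived[attr] = value
    return derived

def generate_advice(model):
    """Output rules: map the derived model onto predictions or design advice."""
    if model.get("executive_load") == "high":
        yield "Expect slow, error-prone command generation; reduce ordering choices."

model = apply_core_rules(collect_description())
for advice in generate_advice(model):
    print(advice)
```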
The approach is currently neither as well developed nor as extensively tested as AI architectures such as SOAR, the ACT family, or EPIC. The capabilities of our expert system were limited, and subject to all the normal caveats about the validity and generality of the empirical evidence on which its rules were based. The rules of the expert system nonetheless embodied macrotheory and, where we had it for specific task environments, macrotheoretic inference was supported by some microtheoretic detail. It therefore conforms to our requirement for the composition of Type 1 theories of the behavior of an assembled system. The inferences do not rely upon a runnable simulation, like those of the AI tradition within cognitive science, but on logic. The reasoning is explicit, and it is based upon theoretical principles whose effects can be traced.
The extent to which principles generalize can also be tested and validated empirically. For example, during the earliest phases of learning, we assume that the “how-to-do-it knowledge” for performing tasks is held in the image record of the Propositional subsystem. The knowledge can be assigned generic attributes such as the level of item and ordering uncertainty that must be resolved to generate the surface structure of a command sequence from its underlying semantic representation [Barnard et al. 1987]. Likewise, current states of a graphical interface are modeled as the set of representations recently copied into the image record of the Object subsystem. In this case a generic attribute will be set that captures the level of spatial uncertainty associated with the location of an item in a graphical array. Higher levels of uncertainty associated with any one of these attributes are assumed to increase the complexity of processing exchanges between the Propositional and Implicational subsystems that fulfil executive functions in the ICS model, and may also change the locus of control within the mental configuration. These assumptions are testable with laboratory methods (e.g., May et al. [1993]). Similar evidence, concerning the effects of slight changes in the propositional content of task instructions upon the implicational task models that people generated, and consequently upon their behavior (including high rates of error), has been reported by Scott et al. [2001].
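One way to make such uncertainty attributes concrete, under our own simplifying assumption that they can be scored information-theoretically, is sketched below; the slot counts and the two-argument command are invented for illustration.

```python
from math import log2

def item_uncertainty(alternatives_per_slot):
    """Bits needed to resolve which item fills each slot of a command."""
    return sum(log2(n) for n in alternatives_per_slot)

def ordering_uncertainty(permissible_orders):
    """Bits needed to resolve the order of the arguments."""
    return log2(permissible_orders)

# A two-argument command in which each argument name has four plausible
# synonyms and both argument orders are legal:
bits = item_uncertainty([4, 4]) + ordering_uncertainty(2)
print(bits)  # 5.0 bits to resolve before a surface command can be generated
```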
The expert system was developed to use a macrotheory of the human cognitive architecture to draw direct inferences about patterns of interaction. In terms of the schema of overlapped system layers represented in Figure 2, the expert system may well have formed the basis for a Type 1 model of a psychological system. Via a Type 2 mapping it could also form one basis for a microtheory of the human component of a higher-order system. However, taken in isolation it was clearly flawed as a serious model of interaction. As with other forms of cognitive modeling, the approach was undeniably unbalanced as a means of representing interactions in a higher-order system composed of user and computers. While the
representation of the cognitive architecture was reasonably formal and relatively rich, the representation of the computer was both informal and highly impoverished. The development of a proper Type 1 theory of a syndetic system requires a more balanced approach. It requires the specification of microtheories of all the relevant interactors at an equivalent level of richness, and it requires a macrotheory of their interaction.
6. REALIZING COHERENT TYPE 1 THEORIES OF INTERACTION GROUNDED IN THEORIES OF COGNITIVE AND COMPUTATIONAL SYSTEMS
The development of more adequate Type 1 theories for the behavior of a system of interactors, involving both people and computers, would be facilitated if we could represent theories of user behavior and theories of computer behavior in a common language. The earlier discussion of the unselected window scenario introduced the general idea of an interaction trajectory, but did so in informal terms. Although it mentioned generic principles and properties that might facilitate the coordination of user and system behaviors, these were by no means a formally specified model or theory of interaction. Within computer science, it has been acknowledged that the behavior of computational systems needs to be modeled at an abstract level. To model computational systems, the formal methods community have sought to evolve a body of mathematics that enables them to represent abstract properties of systems. Their fundamental concern is to use those models to reason about the behavior of a computer system before it is refined and built. When designing a system it is, for example, important to establish that it cannot enter deadlock. The use of mathematics to describe software and hardware systems has been widely explored. Current work includes its application to human-computer interaction (e.g., Dix [1991], Duke and Harrison [1993], Palanque and Paternó [1997], and Harrison and Torres [1997]).
In software development there is a need to carry out rigorous checks on models. The development of such models relies on software tools, and this requires the mathematics for software specification to be defined at a level of rigor that is not commonly applied in other areas. Specific “formal methods,” such as VDM, Z, and CSP, have been developed to do this. They define collections of mathematical structures with a specific syntax and semantics for use in systems modeling. Several key advances in understanding complex problems in computing have already come about through the development of new mathematical abstractions for the operation of concurrent systems (e.g., Hoare [1996] and Milner [1989; 1993]).
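The deadlock-freedom concern mentioned above can be made concrete with a toy reachability check over a labeled transition system. This is a drastic simplification of what tools built around CSP and similar formalisms do, and the example system is invented.

```python
# An invented three-state protocol plus one unreachable "stuck" state.
TRANSITIONS = {
    "idle":    {"request": "waiting"},
    "waiting": {"grant": "busy"},
    "busy":    {"release": "idle"},
    "stuck":   {},  # no outgoing actions: a deadlock if ever reached
}

def deadlocked_states(initial):
    """Return reachable states from which no further action is possible."""
    seen, frontier, dead = set(), [initial], []
    while frontier:
        state = frontier.pop()
        if state in seen:
            continue
        seen.add(state)
        successors = TRANSITIONS[state]
        if not successors:
            dead.append(state)
        frontier.extend(successors.values())
    return dead

print(deadlocked_states("idle"))  # [] -- "stuck" is unreachable from "idle"
```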
When a computer scientist does produce a formal model of the behavior of a system before it is refined and built, it fulfils a role directly equivalent to that of macrotheory for a psychological architecture. The conceptual parallels run deeper. We can characterize a system of computing interactors in the four-component schema used to organize psychological macrotheory. Indeed, the four-component framework maps directly onto the schema for
applying the mathematics of control theory. A generalized construct is posited of a data array (requirements) on which functions (capability) operate to update the array according to selection restrictions (coordination and control). Configurations are then an emergent property of what functions can operate on what data types within the array at what time.
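The sketch below renders this generalized construct in code under our own naming; the speech and mouse streams and the selection rule are invented purely for illustration, loosely anticipating the MATIS example discussed later.

```python
array = {"speech": [], "mouse": []}              # the data array: requirements

def append_speech(a, item):                      # capability: functions that
    a["speech"].append(item)                     # update the array

def append_mouse(a, item):
    a["mouse"].append(item)

def may_fire(fn, a):
    """Selection restriction (coordination and control): a mouse selection is
    only permitted once some speech input has opened a query. This is an
    invented rule, purely for illustration."""
    return fn is not append_mouse or len(a["speech"]) > 0

# The configuration is emergent: the trace of which functions operated on
# which parts of the array, and when.
trace = []
for fn, arg in [(append_speech, "flights from"), (append_mouse, "city-1")]:
    if may_fire(fn, array):
        fn(array, arg)
        trace.append(fn.__name__)
print(trace)  # ['append_speech', 'append_mouse']
```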
The psychological theory (ICS) discussed in the previous section is itself a model of a concurrent computing architecture, albeit one grounded in biological wetware rather than silicon. Recognizing the parallel, we should be able to use the mathematics being developed within computer science to model abstract properties of the behavior of the mental architecture and bind it to an equally abstract model of the behavior of a computer system. To develop a coherent Type 1 model of the resulting system, some macrotheory of the interaction needs to be added—in this case, a syndetic model. In the Type 2 transition (Figure 2), macrotheory from psychology and macrotheory from computer science are both rerepresented to form microtheory at the level above in the overlapped hierarchy of abstraction.
Our first attempt at this form of theoretical integration is fully reported in a recent issue of the journal HCI [Duke et al. 1998]. Although this contribution relies heavily on an understanding of advanced mathematics, the specific form of mathematics, as we discuss later, is less important than the overall methodology. Working within a large-scale European project on the integration of theory within design processes [Barnard et al. 1995], we focused on a number of concrete design scenarios for advanced systems. One of these was a Multimodal Air Travel Information System (MATIS), capable of integrating typed, voice, and gestural inputs. The user could say “…Flights from here to there” and designate the relevant cities by pointing at the appropriate referents with a mouse [Nigay and Coutaz 1995]. The second scenario was a specific form of gestural language. This was designed to be used with the sort of data glove interface where hand movements can be used to issue commands to the computer and where an image of the glove is concurrently rendered on the user’s display [Bordegoni and Hemmje 1993].
With well-defined configurations for information flow (Figure 7), ICS provides some generic rules for how the cognitive mechanism handles multimodal integration (see Barnard and May [1995]). Since the properties of the two computing systems were known and specifiable, the constraints on their key components were modeled using modal action logic. The equivalent properties of the ICS architecture were also modeled using modal action logic, but adding deontic extensions to capture aspects of the indeterminacy of mental processing.
Although very different in character and process from the simulation methods of AI, the requirements of formalization forced us to be more precise in our specification of human information processing theory than we had been in our earlier attempts to model cognitive activity. The result of this modeling was two sets of axioms. One set of axioms represents assumptions about the constraints governing computer system behavior and the other set represents assumptions about how the user’s mental
mechanism would function in the same setting. The axioms illustrate two aspects of the Type 2 mapping from macrotheory at one level to microtheory at a more abstract layer of system analysis. The axioms discarded the detail of the full psychological or system models. They were also selective, in that they represented only those parts of the lower-level models that were necessary to deal with specific issues. These axioms were microtheories of user and system components. The added value of a Type 2 theory is the addition of a new level of organization that captures the core abstract properties of user-system interaction. This was represented by a third set of axioms that specified assumptions about conditions that must apply in the interaction itself.
This combination of micro- and macrotheory represents a Type 1 syndetic model grounded in theories of cognitive and computing mechanisms. Like any other form of representation, mathematics can be hard to follow unless people are familiar with the notations. Figure 10 reproduces a sample of axioms. In each case the notation is explained in a sentence. These sentences illustrate the generic nature of what is abstracted into the mathematical notation. The first two axioms are drawn from the 14 used by Duke et al. [1998] to represent an ICS-based formulation of the requirements that need to be met in the mental mechanism for information arriving at a subsystem from different sources to be combined, or blended, within process operation. Two axioms are included from the specification of the Multimodal Air Travel Information System (MATIS), and just one of the syndetic axioms needed to bind together the user and system models. The syndetic axiom illustrated deals with the coordination of user and system capabilities.
The specification can be very economical, since the syndetic model inherits the model of the user and the model of the computer. In the two examples reported by Duke et al. [1998], the syndetic models of the salient aspects of user-system interaction with the MATIS system and with the gesture language required only three and two syndetic axioms respectively. The specifications are really rather short by comparison with those normally encountered in the formal methods literature, and they are certainly far shorter than the kinds of coding required in full AI simulations. Most importantly, the specifications are not committed to any unnecessary detail, either concerning user mentation or system implementation.
Once expressed in mathematical form, the model can do work. Unlike a simulation, this class of specification does not run and make a prediction. As Newell [1990] noted, “theories don’t make predictions, theorists do.” In this case, conjectures can be postulated and tested. The abstract model of a syndetic system is used to answer questions formulated by designers and to evaluate claims about the behavior of systems of interactors. In the case of the airline system, we might want to know how the user will cope with a design, and develop a conjecture about it. The model is then used to derive a formal proof of that conjecture. Duke [1995], Duke et al. [1995; 1998], and Duke and Duce [2000] provide a number of examples of this process.
Sample ICS axioms

coherent(t1, t2) ⇔ dest(t1) = dest(t2) ∧ ∀ p, q : repr • p on t1 ∧ q on t2 ⇒ p ≈ q

Streams t1 and t2 are coherent if and only if they have the same destination, and for any representation p available on t1 and q on stream t2, p and q are coherent.

t ∈ stable ⇔ (∀ s1, s2 : sources(t) • coherent(s1, s2)) ∧ (t = buffered ∨ sources(t) ⊆ stable)

A transformation ‘t’ is stable if and only if every pair of streams on which it operates are coherent, and either the transformation is buffered, or the input streams are stable.

Sample axioms from the MATIS model

speech = X ⇒ [speak(nm, d)] speech = X^〈(nm, d)〉

If the speech stream holds X, then speaking a name/data pair results in a speech stream with that pair appended to X.

mouse = M ⇒ [select(d)] mouse = M^〈d〉

If the mouse stream holds M, then selecting a data item d results in a stream in which d is appended to M.

An axiom from the syndetic model

per(read(d)) ⇒ d in MATIS ∧ 〈:vis-obj:, :obj-mpl:, :mpl-prop:〉 ∈ flows

It is possible to read some data item ‘d’ if d is part of a field of a query in the display and the cognitive configuration enables reading.

Fig. 10. Sample axioms from ICS, MATIS and their syndesis (from Duke et al. [1998]).
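For readers more comfortable with code than with modal action logic, the sketch below paraphrases the first two ICS axioms of Figure 10 executably. The data structures and the stand-in for the ≈ relation are our own inventions; the authoritative formulation is the logic itself.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stream:
    dest: str      # destination subsystem
    reprs: tuple   # representations available on the stream

def equivalent(p, q):
    """Stand-in for the ≈ relation between representations."""
    return p == q

def coherent(t1, t2):
    """First axiom: same destination and pairwise-equivalent representations."""
    return t1.dest == t2.dest and all(
        equivalent(p, q) for p in t1.reprs for q in t2.reprs)

def stable(sources, buffered, source_stability):
    """Second axiom: every pair of input streams is coherent, and the
    transformation is buffered or all of its sources are themselves stable."""
    pairwise = all(coherent(s1, s2) for s1 in sources for s2 in sources)
    return pairwise and (buffered or all(source_stability[s] for s in sources))

s1 = Stream("PROP", ("flight-query",))
s2 = Stream("PROP", ("flight-query",))
print(coherent(s1, s2))                                # True
print(stable([s1, s2], True, {s1: False, s2: False}))  # True: buffered
```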
In the case of the multimodal airline information system, the proof indicated that mental activity would not be well coordinated with the system behavior, and that deictic reference using speech and mouse with this particular system would be problematic. Although the functionality was available, the analysis suggested that the likelihood that it would be used was low. As with the detours of the unselected window scenario, the analysis provided an insight into the properties of a behavior trajectory for a system composed of user and computer. It also provided a prototype for the kind of model that could in principle predict when the addition of a video channel to a communication link might beneficially influence interaction trajectories and when such additions might have little beneficial effect (again see Veinott et al. [1997]).
As the underlying cognitive and system models provide the microtheory supporting the syndetic macrotheory, the reason why a particular difficulty occurs can be traced to somewhere in the Type 1 theory for that system of
interactors. Alternative design solutions can then be driven by theoretical insight rather than by a generate-and-test cycle of ad hoc change. Using mathematics, the consequences of theory can be explored in much the same way as they are in other physical sciences. The reasoning depends upon the theoretical assumptions, and these are not intermingled with the kind of detail that can lead to problems in linking underlying theory to consequence in simulation methodologies [Cooper et al. 1996].
Similarly, it is expensive to generate new models afresh for each new system and application. What Duke et al. [1998] show is that axioms developed in one context can be reused to model generically related circumstances. Once in place, a body of psychological theory, such as ICS, can be combined with models of different computer systems. In the Duke et al. [1998] case, the reasoning about pose-formation with a data glove has significant fragments in common with the reasoning about the MATIS system. Where similar abstract conditions apply in the models of computer systems, the relevant bodies of theory can be reused to support the development and exploration of alternative syndetic systems. Although the specifications and proofs presented by Duke et al. [1998] cover several pages, they rely on inferences that could, in principle, be carried out using the current generation of theorem provers, such as MURAL [Jones et al. 1991] or PVS [Owre et al. 1995]. It should not ultimately be necessary to have a tame professor of formal methods in every design team’s cupboard, or in every psychology laboratory.
Although this initial work has relied upon a particular psychological theory (ICS) and a particular formal notation (Modal Action Logic), the approach does not depend upon either. Other psychological models could be represented in the same formalism. Likewise, we are fully aware of the limitations of Modal Action Logic—this form of mathematical representation has limited expressive power and can only capture part of what we would wish the mathematics to accomplish within the framework of syndetic modeling. The overall approach is not reliant on a specific mathematical calculus. Indeed, ICS has itself been modeled using an altogether different formal method, LOTOS [Bowman and Faconti 1999]. The body of mathematics that is evolving in computer science is not static, but moving forward to encompass problems that were not previously tractable. To deal adequately with the human interactor, we might need to marry up several different forms of mathematics, including, for example, the mathematics of signal processing, the mathematics of control theory, and yet other forms of logic for computational systems, such as interval temporal logics.
7. EXTENSION TO HIGHER-ORDER SYSTEMS OF INTERACTORS
An important part of our argument is that four generic classes of constraint need to be modeled at all levels of systems analysis. So far we have illustrated how macrotheory can be explicitly specified for users and for computing systems. We have also illustrated how it can be combined in a formal mathematical model of a syndetic system. However, to support
wider integration across the field of HCI, it needs to be shown that very similar forms of analysis might also hold for a characterization of behavior trajectories for groups and yet larger organizations. The authors of this article are cognitive psychologists and computer scientists; we are not specialists in CSCW or in the analysis of the behavior of organizations. To go beyond our own theoretical territories and worked examples is obviously a risky endeavor, since we must necessarily rely on far more speculative conjectures, but such a move is needed to illustrate our envisaged trajectory for theory development in the next millennium. We shall not attempt to compete head-to-head with the leading edge of HCI. Rather, for our concrete examples of how our four-component framework might be applied at the level of groups and organizations, we turn to some classic experiments on communication nets, and to a case study of the reuse of generic principles of interaction across a truly historic time span.
Starting with Leavitt [1951], social psychologists carried out research on tasks performed by collaborative groups working in configurations where they could only communicate along the kinds of constrained paths shown in Figure 11. These are analogous to our representation of the potential configurations for a cognitive architecture (Figure 7). Predating computer technologies, this paradigm can be regarded as paper-and-pencil supported cooperative work. The only interactors in this setting that have the capability to change representations or physical states are the people. They communicate by passing written messages through slots in reconfigurable walls. The paper, writing implements, and walls are, of course, subordinate interactors in this setting, and they too have definable capabilities (paper affords writing etc., slots afford message passing). However, our concern is not with the micro-properties of how this medium might differ from a more modern email counterpart, but with characterizing how properties of the configurations relate to interactor capabilities, requirements, and their dynamic control and coordination. In this setting the basic interactor is “human agent + paper + pencil,” and it has the capability to construct task-relevant messages and pass them through one or more slots.
Fig. 11. Centralized and decentralized communication networks.
Leavitt [1951] made use of a very simple task. Each person had a set of different symbols, and their collective task was to identify the particular symbol that was common to all of them. Leavitt found that, of the circle, star, and “Y” configurations, the circle configuration gave rise to the most errors, took the most time, and involved the greatest number of messages exchanged. While least efficient according to conventional criteria, people actually liked being in this configuration more than the other ones. It is not hard to generate explanations of why the more centralized star configuration was more efficient, nor of why it was disliked more. An essential requirement for the group-level interactor is to bring information together, and the centralized networks have, by their very structure, a locus of control through which all information is channeled. It is easy to see how the interaction trajectories for the centralized and decentralized nets might differ in terms of the segments and phases of interaction needed to describe them. Although any given interactor in the circle network might
have the capability to determine the required answer, the messages have to pass through more links, and the coordination of exchanges, and of decision making capability, would be more involved.
Most people in the network may have disliked their restricted status and the fact that they could only communicate with those who were adjacent to them. The point to take from this example is not the specification of a particular trajectory or its properties. What is more important are the findings demonstrating that the relationships are not constant. With more complex tasks, Shaw [1954] found that decentralized networks gave rise to better performance. The requirements of bringing information together, and the issues that must be resolved in doing so, exceed the capabilities of the single individual acting as the locus of control. That individual has to interleave information and control transactions and keep track of everything that is going on. In contrast, activity in decentralized networks supports concurrent activity on different parts of the problem, ultimately resulting in fewer exchanges and less time.
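A toy simulation conveys the structural point. The flooding strategy and message counting below are our own idealization of the symbol-identification task, not Leavitt’s procedure, and the numbers should be read only as illustrating how configuration alone can alter a trajectory.

```python
# Toy version of the common-symbol task: each of five people starts knowing
# only their own holdings and repeatedly tells their neighbours everything
# they have heard so far. Once someone has heard about all five holdings,
# they can intersect them to find the common symbol.

def flood(neighbours, n=5):
    seen = [{i} for i in range(n)]      # whose holdings each person has heard of
    messages = 0
    while not all(len(s) == n for s in seen):
        snapshot = [s.copy() for s in seen]
        for i in range(n):
            for j in neighbours[i]:     # one message per link per round
                seen[j] |= snapshot[i]
                messages += 1
    return messages

star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
circle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}

# The star's hub becomes a natural locus through which information is pooled;
# in the circle, information must diffuse link by link over more rounds.
print(flood(star), flood(circle))       # 16 vs. 20 under this idealization
```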
Clearly, in order to model this example, including the alternative outcomes of participant satisfaction, the attributes and principles of a four-component analysis would need to be captured with more precision and completeness than we can express here. Our reasoning would nonetheless be framed by the interrelationships that hold between properties of the capabilities of the interactors, the requirements of the interaction, and the dynamic control and coordination of any system assembled to have particular configural properties. No amount of task description, knowledge analysis, or consideration of the limits on human cognition would provide such an analysis without reference to other factors not usually considered to fall within the scope of any one of these analytic approaches taken in isolation.
Fig. 12. The Battle of Cannae.
In the case of yet higher-order systems of interactors, the very terminology of the four-component framework should have a strong sense of
familiarity. Military strategists think about how their forces are configured, what their capabilities are, what requirements must be met for them to use that capability and, of course, how command and control across the whole system is to be exercised. Our final concrete scenario is historic rather than current. It has been deliberately chosen to illustrate how generic principles of interaction can be reapplied to interactions between humans and technologies across a millennial time scale.
On August 2, 216 B.C., at the battle of Cannae, a Carthaginian army commanded by Hannibal slaughtered over 60,000 soldiers of a Roman army commanded by Varro, while losing only 6,000 of its own, very much smaller, force. The three main phases representing the behavior trajectory for this highly concurrent system of interactors are shown in Figure 12. The progress of the battle can be succinctly summarized.1 In the opening phase of the battle, Hannibal advanced a thin salient of his infantry toward the Romans, and the Romans advanced to meet them. At the end of this phase Hannibal executed a preconceived tactical withdrawal. In the second phase the Roman army, believing that it was already in sight of victory, was drawn into Hannibal’s trap as he maneuvered his infantry around the flanks of the advancing Romans. Hannibal’s cavalry, which under the command of Hasdrubal had engaged the Roman cavalry to the rear during the first phase, now disengaged and completed the encirclement of the Roman army. In the third phase of the battle, the Romans became a herd of panic-stricken individuals, their force structure losing all coherence and unity at the point when they realized that they had been enveloped. Hannibal’s numerically inferior force quite literally cut them to pieces.
The block diagrams specify the different configurations for the various interactors within this system, the different units of infantry and cavalry. Given that their infantry outnumbered the Carthaginians by more than two to one, the Romans should have won. Hannibal had negated the Romans’ numerical superiority in two ways. First, once encircled, a significant proportion of the Roman units was trapped behind the line of engagement. They could not be configured to interact physically with the Carthaginians, and so were unable to fulfill their capability. Second, being only human, the Romans panicked, and the individual units lost the ability to fight as constituents within a higher-level force structure. The loss of coordination and control within basic units of organization at one level of analysis has the effect of degrading the capability of the basic interactors at a higher level of systems analysis.
Some of the connections to our earlier discussions are at least intriguing. For example, the training wheels system developed by Carroll and Carrithers [1984] denied novice users access to software capability. As at Cannae, the “potential” of subsequent behavior trajectories was reduced by denying use of capability. In both instances the effect was to facilitate the trajectory toward an intended outcome, albeit faster learning in one instance and more effective slaughter in the other. The informally presented analysis of
1Details derived from Dupuy and Dupuy [1993].
the unselected window scenario and the more formal modeling of interactions with the multimodal air travel information system (MATIS) both involved reasoning about dynamic control and coordination of simple user and system exchanges. When mapped upward to the concerns of designers of the higher-order system, the conclusions may need to be captured, not in the detailed model, but in a Type 2 mapping. As in the case of Cannae, if coordination and control is degraded in a lower-order system, then it impacts the capability of the higher-order system. One sort of window improves the capability of the combined system relative to the existing variant: it simplifies behavior trajectories by reducing the likelihood of detours. An envisaged design for multimodal integration of speech and mouse action in airline enquiries is unlikely to be used, and hence unlikely to have an important impact on the capability of the higher-order system. Only when it could be argued that an envisaged design would facilitate behavior trajectories for the higher-order system might further investment in that aspect of software development be justified.
If those operating at different levels of systems analysis and on different topics were able to relate their own contributions to some common vehicle of expression, such as talking about generic constraints on interaction, then communication should be enhanced. Individual researchers might even gain substantial benefit from advances made by those working on quite different systems of interactors. Those working on modeling computational systems, cognitive systems, social systems or military systems might improve their own tools and models on the basis of greater interdisciplinary coordination and reciprocated insight.
8. CONCLUSION
We have argued that the future course of theory in HCI might not be best served by the unconstrained development of more and more local theories and models specifically tailored only to meet the needs of different levels of analysis or different software applications. Because of the different systems of scientific semantics adopted, such an approach makes it hard to realize connections—either within or between levels of systems analysis.
We have offered some arguments for developing macrotheories. These provide connective tissue to bind together different concerns within a level of explanation. We also argued that macrotheories were needed to support a different kind of theory. The second form of theory provides the connective tissue between the systems of scientific semantics adopted at adjacent levels of analysis. A specific framework was proposed for the consistent organization of macrotheory across levels of analysis. This assumed that the behavior of any system was determined by four classes of constraint. At each level of analysis we would expect the content expressed in a theory or model to be quite distinct and testable in its own right. We used a particular cognitive theory and a particular mathematical route to building models of syndetic systems to support the wider argument that such developments are a tractable research proposition. The four classes of
constraint are dependent upon neither our psychological model nor a particular form of mathematics. We used these tools because they are the ones that we are most familiar with. The general argument could equally well have been based upon other cognitive models or mathematical notations. We also sought to illustrate how the framework might be applied across a broad range of issues. Concrete scenarios were used from both past and current technologies, from window design to the design of multimodal or data glove interfaces. In addition to interactions at the human-computer interface, we also briefly considered the behavior of groups and organizations. We concluded by examining how the framework might help express some simple parallels between interactions at different levels of system analysis.
Hannibal’s plan for the Battle of Cannae provided military theorists with an often-cited model of tactical perfection. Some two thousand years later, General Schwarzkopf remarked that he essentially reused Hannibal’s model for the battle of Cannae when he directed Operation Desert Storm. An abstract model that is reusable over a couple of millennia, and from the technologies of swords and shields to those of tanks and missiles guided by advanced information technologies, is a significant achievement. As HCI moves into the next millennium, the development of new forms of theory, or of a viable mathematics for modeling systems at different levels of analysis, still poses a whole range of exciting and important challenges. Before reliable and enduring models are achieved, research may well go up numerous blind alleys. Developing the kind of models and mathematics we may need to realize such an ambitious agenda is not only difficult, it would also require a substantial investment of effort over a long time period. Most readers will probably remain unconvinced that such an effort is really worthwhile. At this point we may be part of a minority who still advocate an agenda for developing deep theories of interaction. We consider the fundamental case for developing such theories to be just as strong now as it was when Card et al. [1983] published their vision for the development and application of theory in the practice of HCI, if not stronger. At the point of departure for the Battle of Cannae, the Romans had the considerable advantage of massive numerical superiority in military capability. We take heart from history. What Hannibal lacked in capability was more than offset by an abstract model of a novel interaction trajectory. Its course was neither foreseen, nor significantly altered, by the tacticians of the dominant and most effective fighting force of the era.
ACKNOWLEDGMENTS
This article is based on the plenary address given at HCI ‘98, Sheffield by the first author. Some of the ideas were also developed in collaborations with others on the ESPRIT AMODEUS-2 project, with the UK Defence Evaluation and Research Agency, and in the EU T&MR network TACIT. All are gratefully acknowledged.
REFERENCES
ANDERSON, J. R. 1993. Rules of the Mind. Lawrence Erlbaum Associates, Inc., Mahwah, NJ.
BARNARD, P. J. 1985. Interacting cognitive subsystems: A psycholinguistic approach to short term memory. In Progress in the Psychology of Language, A. Ellis, Ed. Lawrence Erlbaum Associates Inc., Hillsdale, NJ, 197–258.
BARNARD, P. J. 1987. Cognitive resources and the learning of human-computer dialogs. In
Interfacing Thought: Cognitive Aspects of Human-Computer Interaction, J. M. Carroll, Ed.
MIT Press, Cambridge, MA, 112–128.
BARNARD, P. 1991. Bridging between basic theories and the artifacts of human-computer
interaction. In Designing Interaction: Psychology at the Human-Computer Interface, J. M. Carroll, Ed. Cambridge Series on Human-Computer Interaction. Cambridge University Press, New York, NY, 103–127.
BARNARD, P. 1999. Interacting Cognitive Subsystems: Modeling working memory phenomena within a multi-processor architecture. In Models of Working Memory: Mechanisms of active maintenance and executive control, A. Miyake and P. Shah, Eds. Cambridge University Press, New York, NY, 298–339.
BARNARD, P. AND HARRISON, M. 1989. Integrating cognitive and system models in human computer interaction. In Proceedings of the 5th Conference of the British Computer Society, Human-Computer Interaction Specialist Group on People and Computers V (Sept. 5–8), A. Sutcliffe and L. Macaulay, Eds. British Computer Society Workshop Series. Cambridge University Press, New York, NY, 87–103.
BARNARD, P. J. AND HARRISON, M. D. 1992. Towards a framework for modeling human- computer interactions. In Proceedings on East-West International Conference on Human Computer Interaction (EWCHI’92, Moscow), J. Gornostaev, Ed. 189–196.
BARNARD, P. AND MARCEL, A. J. 1984. Representation and understanding in the use of symbols and pictograms. In Information Design: The Design and Evaluation of Signs and Printed Material, R. Easterby and H. Zwaga, Eds. John Wiley and Sons Ltd., Chichester, UK, 37–75.
BARNARD, P. AND MAY, J. 1993. Cognitive modeling for user requirements. In Computers, Communication and Usability: Design Issues, Research and Methods for Integrated Services, P. F. Byerley, P. J. Barnard, and J. May, Eds. Elsevier, Amsterdam, The Netherlands, 101–145.
BARNARD, P. AND MAY, J. 1995. Interactions with advanced graphical interfaces and the deployment of latent human knowledge. In Eurographics Workshop on Design, Specification and Verification of Interactive Systems, F. Paternó, Ed. Springer-Verlag, Berlin, Germany, 15–49.
BARNARD, P. AND MAY, J. 1999. Representing cognitive activity in complex tasks. Hum. Comput. Interact. 14, 1-2, 93–158.
BARNARD, P. J., BERNSEN, N. O., COUTAZ, J., DARZENTAS, J., FACONTI, G., HAMMOND, N., HARRISON, M. D., JORGENSEN, A. H., LOWGREN, J., MAY, J., AND YOUNG, R. M. 1995. Assaying Means of Design Expression for Users and Systems: AMODEUS-2 Project Final report. available at http://www.mrc-cbu.cam.ac.uk/amodeus/abstracts/d/d13.html
BARNARD, P., GRUDIN, J., AND MACLEAN, A. 1989. Developing a science base for the naming of computer commands. In Cognitive Ergonomics and Human Computer Interaction, J. B. Long and A. Whitefield, Eds. Cambridge University Press, New York, NY, 95–133.
BARNARD, P. J., MACLEAN, A., AND HAMMOND, N. V. 1984. User representations of ordered sequences of command operations. In Proceedings of Interact ’84: First IFIP Conference on Human-Computer Interaction (London), 434–438.
BARNARD, P. J., WILSON, M., AND MACLEAN, A. 1987. Approximate modeling of cognitive activity with an expert system: a strategy for the development of an interactive design tool. In Proceedings of CHI+GI ’87, 21–26.
BELLOTTI, V., BLANDFORD, A., DUKE, D., MACLEAN, A., MAY, J., AND NIGAY, L. 1996. Interpersonal access control in computer-mediated communications: A systematic analysis of the design space. Hum. Comput. Interact. 11, 4, 357–432.
BLANDFORD, A. E. AND DUKE, D. J. 1997. Integrating user and computer system concerns in the design of interactive systems. Int. J. Hum.-Comput. Stud. 46, 5, 653–679.
BLANDFORD, A. E., BARNARD, P. J., AND HARRISON, M. D. 1995. Using Interaction Framework to guide the design of interactive systems. Int. J. Hum.-Comput. Stud. 43, 1 (July 1995), 101–130.
BORDEGONI, M. AND HEMMJE, M. 1993. A dynamic gesture language and graphical feedback for interaction in a 3d user interface. Comput. Graph. Forum 12, 1, 1–11.
BOWMAN, H. AND FACONTI, G. 1999. Analysing cognitive behaviour using LOTOS and Mexitl. Form. Asp. Comput. 11.
CARD, S. K., MORAN, T. P., AND NEWELL, A. 1983. The Psychology of Human Computer Interaction. Lawrence Erlbaum Associates Inc., Hillsdale, NJ.
CARROLL, J. M. AND CARRITHERS, C. 1984. Blocking learner errors in a training wheels system. Hum. Factors 26, 377–389.
CARROLL, J. M. AND ROSSON, M. B. 1992. Getting around the task-artefact framework: How to make claims and design by scenario. ACM Trans. Off. Inf. Syst. 10, 181–212.
CHAPANIS, A. 1975. Interactive human communication. Sci. Am. 232, 3 (Mar.), 36–42.
CHAPANIS, A. 1996. Human Factors in Systems Engineering. Wiley Series in Systems Engineering. John Wiley and Sons, Inc., New York, NY.
CLARK, H. 1996. Using Language. Cambridge University Press, New York, NY.
COOPER, R., FOX, J., FARRINGDON, J., AND SHALLICE, T. 1996. A systematic methodology for
cognitive modelling. Artif. Intell. 85, 1/2, 3–44.
DIX, A. 1991. Formal Methods for Interactive Systems. Academic Press, Inc., New York, NY.
DUKE, D. 1995. Reasoning about gestural interaction. Comput. Graph. Forum 14, 55–66.
DUKE, D. J. AND DUCE, D. A. 2000. The formalisation of a cognitive architecture and its application to reasoning about human computer interaction. Form. Asp. Comput.
DUKE, D. AND HARRISON, M. 1993. Abstract interaction objects. Comput. Graph. Forum 12, 1, C25–C36.
DUKE, D. AND HARRISON, M. 1994. A theory of presentations. In Lecture Notes in Computer Science. Springer-Verlag, New York, NY, 271–290.
DUKE, D. J. AND HARRISON, M. D. 1995a. From formal models to formal methods. In Software Engineering and Human Computer Interaction: Proceedings of the ICSE Workshop on SE-HCI: Joint Research Issues, N. Taylor and J. Coutaz, Eds. Springer-Verlag, Vienna, Austria, 159–173.
DUKE, D. J. AND HARRISON, M. 1995b. Interaction and task requirements. In Proceedings of the Eurographics Workshop on Design, Specification and Verification of Interactive Systems (DSV-IS’95), P. Palanque and R. Bastide, Eds. Springer-Verlag, Vienna, Austria, 54–75.
DUKE, D. J. AND BARNARD, P. J. 1998. Syndetic modelling. Hum. Comput. Interact. 13, 4, 337–393.
DUKE, D. J., BARNARD, P. J., DUCE, D. A., AND MAY, J. 1995. Systematic development of the human interface. In Proceedings of the Second Asia-Pacific Software Engineering Conference (APSEC’95), IEEE Computer Society Press, Los Alamitos, CA, 313–321.
DUPUY, R. E. AND DUPUY, T. N., Eds. 1993. The Collins Encyclopedia of Military History. HarperCollins Publishers, New York, NY.
FURNAS, G. W. 2000. Future design mindful of the MoRAS. In Human-Computer Interaction in the New Millennium, J. M. Carroll, Ed. Addison-Wesley, Reading, MA.
GREEN, A. J. K. AND BARNARD, P. J. 1990. Iconic interfacing: The role of icon distinctiveness and fixed or variable screen location. In Human-Computer Interaction—INTERACT ’90. Elsevier Sci. Pub. B. V., Amsterdam, The Netherlands, 457–462.
HAMMOND, N. V. AND BARNARD, P. 1982. Usability and its multiple determination for the
occasional user of interactive systems. In Pathways to the Information Society, M. B.
Williams, Ed. North Oxford Academic Publ. Co. Ltd., Oxford, UK, 543–548.
HARRISON, M. D. AND BARNARD, P. 1993. On defining the requirements for interaction. In Proceedings of the 1st International Symposium on Requirements Engineering (RE ’93, San Diego, CA), S. Fickas and A. C. W. Finklestein, Eds. IEEE Computer Society Press, Los Alamitos, CA, 50–55.
HARRISON, M. AND TORRES, J., Eds. 1997. Design, Specification and Verification of Interactive Systems’97. Springer-Verlag, Vienna, Austria.
HOARE, C. 1996. How did software get so reliable without proof? In Lecture Notes in Computer Science, M.-C. Gaudel and J. Woodcock, Eds. Springer-Verlag, Berlin, Germany, 1–17.
JONES, C., JONES, K., LINDSAY, P., AND MOORE, R. 1991. MURAL: A Formal Development Support System. Springer-Verlag, Berlin, Germany.
LANDAUER, T. K. 1987. Relations between cognitive psychology and computer system design. In Interfacing Thought: Cognitive Aspects of Human-Computer Interaction, J. M. Carroll, Ed. MIT Press, Cambridge, MA, 1–25.
LEAVITT, H. J. 1951. Some effects of certain communication patterns on group performance. J. Abnorm. Soc. Psychol. 46, 38–50.
LEE, W.-O. 1993. The effects of skill development and feedback on action slips. In People and Computers VIII: Proceedings of the HCI’93 Conference (HCI ’93, Loughborough, England), A. Alty and S. Guest, Eds. Cambridge University Press, New York, NY, 73–86.
MARR, D. 1982. Vision. W. H. Freeman and Co., New York, NY.
MACLEAN, A., YOUNG, R. M., BELLOTTI, V. M. E., AND MORAN, T. P. 1991. Questions, options, and criteria: Elements of design space analysis. Hum. Comput. Interact. 6, 3-4, 201–250.
MALONE, T. W. AND CROWSTON, K. 1994. The interdisciplinary study of coordination. ACM Comput. Surv. 26, 1 (Mar. 1994), 87–119.
MAY, J. AND BARNARD, P. 1995a. The case for supportive evaluation during design. Interact.
Comput. 7, 115–144.
MAY, J. AND BARNARD, P. J. 1995b. Cinematography and interface design. In Proceedings of
the 3rd IFIP Conference on Human-Computer Interaction (INTERACT ’95, Lillehammer, Norway, June 27–29), K. Nordby, P. H. Helmersen, D. J. Gilmore, and S. A. Arnesen, Eds. Chapman and Hall, Ltd., London, UK, 26–31.
MAY, J. AND BARNARD, P. J. 1997. Modelling multimodal interaction: A theory-based technique for design analysis and support. In Proceedings of the IFIP TC13 International Conference on Human-Computer Interaction (INTERACT’97), S. Howard, J. Hammond, and G. Lindegaard, Eds. Chapman and Hall, Ltd., London, UK, 667–668.
MAY, J., BARNARD, P., AND BLANDFORD, A. 1993a. Using structural descriptions of interfaces to automate the modeling of user cognition. In Proceedings on User Modeling and User Adaptive Interfaces, 27–64.
MAY, J., SCOTT, S., AND BARNARD, P. 1995. Structuring Displays: A Psychological Guide. Eurographics Tutorial Notes PS95 TN4.
MAY, J., TWEEDIE, L., AND BARNARD, P. 1993b. Modeling user performance in visually based interactions. In People and Computers VIII: Proceedings of the HCI’93 Conference (HCI ’93, Loughborough, England), A. Alty and S. Guest, Eds. Cambridge University Press, New York, NY, 95–110.
MEYER, D. AND KIERAS, D. 1997. A computational theory of executive cognitive processes and multiple-task performance: Part 1. Psychol. Rev. 104, 3–65.
MILNER, R. 1989. Communication and Concurrency. Prentice-Hall International Computer Science Series. Prentice-Hall, Inc., Upper Saddle River, NJ.
MILNER, R. 1993. Elements of interaction: Turing award lecture. Commun. ACM 36, 1 (Jan. 1993), 78–89.
NEWELL, A. 1990. Unified Theories of Cognition. Harvard University Press, Cambridge, MA.
NEWELL, A. AND CARD, S. K. 1985. The prospects for psychological science in human-computer interaction. Hum. Comput. Interact. 1, 209–242.
NEWELL, A. AND SIMON, H. A. 1972. Human Problem Solving. Prentice-Hall, Englewood
Cliffs, NJ.
NIELSEN, J. 1993. Usability Engineering. Academic Press Prof., Inc., San Diego, CA.
NIGAY, L. AND COUTAZ, J. 1995. A generic platform for addressing the multimodal challenge. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI ’95, Denver, CO, May 7–11), I. R. Katz, R. Mack, L. Marks, M. B. Rosson, and J. Nielsen, Eds. ACM Press/Addison-Wesley Publ. Co., New York, NY, 98–105.
NORMAN, D. 1983. Design principles for human-computer interaction. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI’83), ACM Press, New York, NY, 1–10.
OWRE, S., RUSHBY, J., SHANKAR, N., AND VON HENKE, F. 1995. Formal verification for fault-tolerant architectures: prolegomena to the design of PVS. IEEE Trans. Softw. Eng. 21, 2 (Feb. 1995), 107–125.
PALANQUE, P. AND PATERNÓ, F., Eds. 1997. Formal Methods in Human Computer Interaction. Springer-Verlag, Berlin, Germany.
RUDISILL, M., LEWIS, C., POLSON, P., AND MCKAY, T. 1996. Human-Computer Interface Design: Success Stories, Emerging Methods and Real World Context. Morgan Kaufmann Publishers Inc., San Francisco, CA.
SCOTT, S., BARNARD, P. J., AND MAY, J. 2001. Specifying executive representations and processes in number generation tasks. Q. J. Exp. Psychol.
SHAW, M. E. 1954. Some effects of problem complexity upon problem solving efficiency in different communication nets. J. Exp. Psychol. 48, 211–217.
SUCHMAN, L. A. 1987. Plans and Situated Actions: The Problem of Human-Machine Communication. Cambridge University Press, New York, NY.
TEASDALE, J. D. AND BARNARD, P. 1993. Affect, Cognition and Change: Re-modeling Depressive Thought. Lawrence Erlbaum Associates, Inc., Mahwah, NJ.
VAN MULKEN, S., ANDRÉ, E., AND MÜLLER, J. 1998. The persona effect: How substantial is it? In People and Computers XIII: Proceedings of the BCS-HCI Conference (Sheffield, Sept. 1–4), H. Johnson, L. Nigay, and C. Roast, Eds. Springer-Verlag, Vienna, Austria, 53–66.
VEINOTT, E. S., OLSON, J. S., OLSON, G. M., AND FU, X. 1997. Video Matters!: When Communication Ability is Stressed, Video Helps. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI ’97, Atlanta, GA, Mar. 22–27), S. Pemberton, Ed. ACM Press, New York, NY, 315–316.
WALKER, J. H., SPROULL, L., AND SUBRAMANI, R. 1994. Using a human face in an interface. In
Proceedings of the ACM Conference on Human Factors in Computing Systems: “Celebrating Interdependence” (CHI ’94, Boston, MA, Apr. 24–28), ACM Press, New York, NY, 85–91.
Received: February 1999; revised: February 2000; accepted: September 2000