Human-Computer Interaction: Principles of Interface Design

 

Patrizia Nanni

 

5 November, 2004

 

 

Student No. 12397901

Project Supervisor: Andrew Marriott

 


Table of Contents

 

1.0       Introduction

1.1       Objectives

1.2       What is Human-Computer Interaction?

1.3       Communicating with Users: The importance of interaction design

1.3.1    Mental and Conceptual Models

1.3.2    Interaction paradigms, idioms and metaphors

1.3.3    Personas

1.4       Types of User Interfaces

1.4.1    Graphical User Interfaces

1.4.2    Voice User Interfaces

1.4.3    Multi-modal User Interfaces

1.4.4    Other User Interfaces

2.0       Issues in Human-Computer Interaction

2.1       Accessibility

2.1.1    Understanding Accessibility Barriers

2.1.2    Legal Requirements

2.1.3    Accessibility Design Guidelines

2.1.4    Personal Assistive Technologies

2.1.5    Summary

2.2       The Cultural Context

2.2.1    Cultural Markers

2.2.2    Internationalisation and Localisation

2.3       Usability

2.3.1    Usability Engineering

2.3.2    Usability Inspection Methods

2.3.3    Summary

3.0       Graphical User Interface Design

3.1       Graphic Design Principles for Computers

3.1.1    Elegance and Simplicity

3.1.2    Visual Variables: Scale, Contrast & Proportion

3.1.3    Perceptual Organisation and Visual Structure

3.1.4    Module and Program: Grid-based Design

3.1.5    Semiotics: Image and Representation

3.1.6    Colour

3.1.7    Text

3.2       Widgets: the Building Blocks of Graphical User Interfaces

3.2.1    Windows

3.2.2    The Menu System

3.2.3    Dialog Windows

3.2.4    Smartening Up Applications

4.0       Conclusion

5.0       Website Design Tutorial

6.0       References

7.0       Recommended Reading

 

 


Table of Figures

 

Figure 1. Some of the disciplines involved in the field of Human-Computer Interaction.

Figure 2. Software designers design products for humans to use, yet knowledge of users and their mental processes is often elusive.

Figure 3. Mental model vs Implementation model.

Figure 4. Conceptual model and mental models.

Figure 5. The Mentor System’s client interface uses the anthropomorphic metaphor of a virtual lecturer.

Figure 6. A simple example of a Persona for Betty the Warehouse Manager.

Figure 7. An example of an in-depth primary Persona for Sara Locke, user of the Daisy Bead Company website.

Figure 8. 1973 Xerox Alto (left).

Figure 9. 1984 Apple Macintosh (right).

Figure 10. Early Apple Macintosh advertisement.

Figure 11. MetaFace.

Figure 12. Avatar-Conference, a multi-modal interface using avatars to represent humans.

Figure 13. The Ishihara plate is commonly used to test for red-green colour blindness.

Figure 14. The differences between Deuteranopia, Protanopia and Tritanopia.

Figure 15. The effects of tunnel vision.

Figure 16. Pictures showing the effects of eye conditions commonly found in older people.

Figure 17. Chromostereopsis. The red should appear closer to the eye than the blue.

Figure 18. Lightness differences between foreground and background colours.

Figure 19. Contrast differences between hues in hemispheres of the colour wheel.

Figure 20. Contrast differences between adjacent colours on the colour wheel.

Figure 21. A sample of some international keyboard layouts from Microsoft.

Figure 22. Simplicity and Elegance.

Figure 23. Microsoft's Bob interface.

Figure 24. Bertin's retinal variables.

Figure 25. Optical Illusions.

Figure 26. Grid-based Design.

Figure 27. Canonical Grid.

Figure 28. Visual Imagery.

Figure 29. Colour Wheel.

Figure 30. The difference between Hue, Saturation and Value, as demonstrated by Apple’s colour selector widget.

Figure 31. Two Complementary colour schemes based on red and green.

Figure 32. Effects of desaturation and greyscale conversion of colour images.

Figure 33. VisiBone’s web-safe palette of 216 web-safe colours.

Figure 34. Categories of type (after Williams 1994).

Figure 35. A Multiple Document Interface application in Microsoft Word (left).

Figure 36. A multipaned window and Tabbed Document Interface in Microsoft Excel (right).

Figure 37. A tabbed document interface employed in Mozilla.

Figure 38. A cascading menu in Microsoft Word.

Figure 39. Microsoft Word’s expanding menu system.

Figure 40. Icons and accelerator keys in Paint Shop Pro v7.

Figure 41. Accelerator keys in Microsoft Word.

Figure 42. Some of GTK’s predefined dialog boxes.

Figure 43. A sample of button layouts from the TestGTK suite.

Figure 44. Apple’s Mac OS X v10.3 operating system includes clearly identifiable radio buttons (left) and check boxes (right).

Figure 45. Toggle buttons in the accessibility settings dialog box of Apple’s Mac OS X v10.3 operating system.

Figure 46. The drop-down list used in Microsoft Word.

Figure 47. Bounded entry controls.

Figure 48. A collapsible pane structure in Adobe’s Acrobat Reader.

Figure 49. Microsoft Word’s word search function reports its status via an unnecessary additional dialog box.

Figure 50. ZoneAlarm’s notification that it has finished installing requires the user to restart their computer.

Figure 51. Mozilla Firefox’s “About” box.

Figure 52. Splash screen from Lavasoft’s Ad-Aware.

Figure 53. The KDE desktop environment provides an excellent example of intelligent programming.

Figure 54. A selection of icons commonly found on toolbars.

Figure 55. A customised toolbar in Microsoft Word.

 

1.0         Introduction

Humans interact with computers in many ways, and the interface between humans and the computers they use is crucial to facilitating this interaction. Desktop applications, internet browsers, handheld computers, and computer kiosks make use of today’s prevalent Graphical User Interfaces (GUIs). Voice User Interfaces (VUIs) are used for speech recognition and speech synthesis systems, and the emerging multi-modal and gestalt User Interfaces (gUIs) allow humans to engage with embodied character agents in a way that cannot be achieved with other interface paradigms. This project broadly investigates these paradigms, together with the importance of interaction design and the issues associated with it, and then focuses on the GUI design of desktop applications and websites.

 

Human-Computer Interaction (HCI) is both an art and a science. The interdependence of a software system’s functionality and its interface means that software designers cannot afford to favour one over the other. If the interface is well designed, it will allow the system’s functionality to support the user’s task. However, if the interface is poor, the functionality is obscured and users will have trouble accomplishing their task (Dix et al. 2004:p.110). The IEEE/ACM curriculum council also includes HCI as one of the core knowledge focus groups in its computing curriculum (IEEE/ACM 2001).

1.1           Objectives

This project is a broad investigation into the issues surrounding Human-Computer Interaction. It will:

·         Identify the ways in which humans interact with computers, and the roles of different types of user interfaces within these contexts.

·         Examine some current issues in HCI and their impact on interface and interaction design.

·         Investigate fundamental principles for effective interface design.

·         Investigate in further detail the principles of good graphical user interface design.

 

The theoretical overview will be accompanied by a tutorial outline (Section 5.0) that investigates one of today’s prominent interaction paradigms – the World Wide Web. The tutorial investigates graphical user interface design in websites, and looks at some issues associated with accessibility and usability. It also explores the use and importance of Cascading Style Sheets in web design.

1.2           What is Human-Computer Interaction?

The desire to build more effective weapons during World War II inspired a heightened interest in the study of interaction between humans and machines, a challenge that was readily taken up by researchers of the day. The Ergonomics Research Society, founded in 1949, was primarily concerned with the physical characteristics of machines and systems, and their effect on user performance. The field of Ergonomics (or Human Factors) is concerned with user performance in the context of any mechanical, computer, or manual system. As use of the computer became more widespread, more researchers began to specialise in studying the interaction between people and computers, concentrating on the physical, psychological and theoretical aspects of this interaction (Dix et al. 2004). Thus, the field of human-computer interaction was born.

 

The field of information science has also influenced the development of HCI. Information science is an old discipline that predates computer technology, and is concerned with the management and manipulation of information within an organisation. The introduction of technology has greatly influenced the way in which information can be stored, accessed and utilised, and has led to the rise of systems analysis, which matches the technology in the workplace to the requirements and constraints of the task (Dix et al. 2004).

 

Although many other disciplines are concerned with HCI, it is especially important in computer science and systems design, because it involves the design, implementation and evaluation of interactive computer systems in the context of the user’s task and work (Dix et al. 2004). The ideal designer of an interactive system would have expertise in such diverse fields as (Dix et al. 2004):

 

·         Psychology and cognitive science for knowledge of the user’s perceptual, cognitive and problem-solving skills.

·         Ergonomics for the user’s physical capabilities.

·         Sociology to understand the wider context of the interaction.

·         Computer science and engineering to be able to build the necessary technology.

·         Graphic design to produce a pleasing visual interface.

·         Technical writing to produce the manuals.

·         Business to be able to market the product.

 

HCI is clearly a multi-disciplinary subject (see Figure 1), and designing an effective interactive system from a single discipline in isolation is almost impossible. Computer scientists, however, are particularly interested in the practicalities of how they can use the principles and methods from each HCI discipline to assist them in designing better systems. Acquiring an understanding of the theory is important, but knowing how to apply the theory to the problem at hand is equally valuable (Dix et al. 2004).

Figure 1. Some of the disciplines involved in the field of Human-Computer Interaction[1]

 

The first section of this document covers the more theoretical aspects of human-computer interaction. The document then proceeds to illustrate the application of these theories to interface design principles, in order to enhance the usability of a software product.

1.3           Communicating with Users: The importance of interaction design

One of the greatest challenges facing a software designer is understanding what a user requires from a product. To do this, the designer must have at least a basic understanding of mental models and other psychological theories and their application to software design. Since the user is interacting with the computer in order to accomplish something, the software interface is crucial to facilitating the user’s goals and tasks. The interface, which typically comprises nearly half of the lines of code of a software product (Milewski 2004), is where the designer must consider the implications of how the software influences and anticipates the user’s thought processes during their interaction.

Figure 2. Software designers design products for humans to use, yet knowledge of users and their mental processes is often elusive[2].

1.3.1      Mental and Conceptual Models

Mental models are psychological representations of real or imaginary situations. The mind constructs “small-scale models” of reality in order to reason, to anticipate events, and to underlie explanation (Craik, cited in Hudson 2004). The structure of the mental model corresponds to what it represents, and users acquire their mental models through interaction and explanation. In particular, a user’s mental model of a software product, and their interaction with it, is defined by the way in which they perceive the jobs they want to do and how the program helps them to do those jobs (Cooper & Reimann 2003).

 

Mental models have the following characteristics (Dix et al. 2004):

 

·         They are often partial

·         They are unstable and subject to change

·         They can be internally inconsistent

·         They are often unscientific and may be based on superstition rather than evidence

·         They are often based on incorrect interpretation of the evidence

 

Mental models often take into consideration existing conventions that humans commonly use to interpret the world. Ideally, these conventions should be followed in design. However, if they are to be contravened, explicit support should be provided to enable people to form the correct mental models for the product (Dix et al. 2004).

 

The example of dining out at a restaurant is often used to describe how someone might form a mental model. For instance, if a person mentions that they went to a restaurant for lunch, the listener may assume a certain sequence of events occurred during that interaction. The mental model may go something like this:

 

1.       The person walked into the restaurant through a door, and was greeted by the Maitre d’.

2.       The Maitre d’ showed the person to a table.

3.       Another waiter presented the person with a menu.

4.       The person ordered their food and drink.

This example illustrates the subjectivity of mental models, and their basis in personal experience rather than scientific method. If someone is accustomed to a formal, sit-down restaurant, then a visit to a buffet restaurant may be confusing at best, or unpleasant at worst. This indicates a mismatched mental model, which the person must then alter to incorporate their new experience.

 

Most software reflects the implementation model of its design (i.e. the logical structure of the program), instead of the user’s goals and the tasks required to accomplish them. However, the user’s mental model of a software product is often distinctly different from the software’s implementation model, because the complexity of the software’s implementation can obscure its functionality from the user’s perspective. This gives rise to a third model in the digital world: the conceptual model (Hudson 2004), which is also known as the represented model (Cooper & Reimann 2003). Both terms refer to the mental model a designer intends the user to follow when using the product. That is, they reflect the way designers choose to represent the workings of the program to the user. Conceptual design techniques aim specifically to assist a user in understanding a system (Newman & Lamming 1995). If the conceptual model of the system is substantially different from the user’s mental model, the user may find the system difficult to use. User interfaces that are based on the software’s conceptual model, and that help the user to form a matching mental model, make a software product easier to use (Cooper & Reimann 2003).

 

Figure 3 illustrates the difference between the implementation model and a user’s mental model. The user forms a mental model of how they expect a software product to work. In an ideal situation, the user will be able to approach a new program and complete their desired task in exactly the manner that they expect the program to work. The conceptual (or represented) model reflects the way software designers choose to represent to the user how the program works. This is an aspect of design that developers have great control over, and the closer it matches the mental model of the user, the easier the user will find the program to use (Cooper & Reimann 2003).

Figure 3. Mental model vs Implementation model.

The closer the represented (or conceptual) model of a software product is to the user’s mental model, the easier the program will be to use (Cooper & Reimann 2003:p.23).

 

 

Figure 4 illustrates how Jasc’s Paint Shop Pro allows the user to see, both in preview and in real time, how changing the values of certain parameters actually affects the picture they are manipulating. The user is more likely to be thinking in terms of how the final picture will look than in terms of the numerical values that need to be manipulated to achieve the desired look.

Figure 4. Conceptual model and mental models.

Jasc’s Paint Shop Pro illustrates how the product’s conceptual model can be made to more closely match a user’s mental model.

 

Training, documentation and interaction are all ways in which a user can acquire an appropriate conceptual model of a product; however, for general-purpose systems, interaction is by far the most realistic and effective (Hudson 2004). To create an easy-to-use system, the conceptual model must be:

 

·         Deliberately designed.

·         Simple enough to be understood through interaction.

·         Appropriate to the users’ tasks.

 

It should also use familiar terms, provide adequate feedback, and be consistent with users’ expectations (Hudson 2004).

 

Conceptual models provide many benefits to software design (Hudson 2004). They:

 

·         Provide an opportunity for simplification and innovation.

·         Define concepts and terms for the user interface.

·         Provide a framework for implementation – the ‘core’ model is elaborated and views and other interface components are added.

·         Provide a basis for object-oriented development.

·         Provide control over “feature bloat”.

 

Reverse-engineering of conceptual models can also be used as a basis for user testing of a system, to evaluate whether or not the users’ mental model matches the designer’s conceptual model of a product. Techniques for developing a conceptual model are beyond the scope of this project, but provide an interesting possibility for future study.

1.3.2      Interaction paradigms, idioms and metaphors

Interaction paradigms serve as illustrations of the ways in which humans interact with computers, and successful paradigms are ones commonly believed to enhance the usability of computer systems. New paradigms often arise through exploring current idioms, and pushing those boundaries to create innovative products (Dix et al. 2004).

 

Metaphors make use of existing conceptual models (Hudson 2004), and are used to teach new concepts in terms of those that are already understood. They have been used successfully to describe the functionality of many interaction widgets, and have contributed greatly to commercial successes in computing. The GUI desktop metaphor, for example, links computer file manipulation tasks with filing tasks in a typical office environment, which initially makes the computerised tasks easier to understand. The spreadsheet metaphor for accounting has also been a resounding success (Dix et al. 2004).

 

There has recently been an upsurge in the use of anthropomorphic metaphors, fuelled by increasing interest in multi-modal interfaces (see section 1.4.3). For example, MetaFace (Marriott & Beard 2004) uses the anthropomorphic metaphor of the virtual “friend” to assist a user with web searching. Anthropomorphic metaphors can reduce a user’s cognitive load[3] in using a product (Marriott & Beard 2004), but they must be carefully implemented because a personality clash with the user, as reported by Marriott (2003) in his evaluation of the Mentor system (Figure 5), can also be detrimental to their effectiveness.

Figure 5. The Mentor System’s client interface[4] uses the anthropomorphic metaphor of a virtual lecturer

(Marriott & Beard 2004).

 

However, metaphors can give rise to complications when they are taken too literally (Dix et al. 2004). For example, the desktop metaphor, common to today’s personal computer systems, allows a user to delete a file by dragging it to the wastebasket or recycle bin. However, if the user wants to “shred” a file for security reasons, the metaphor breaks down, since it doesn’t make sense to drag the file to a bin for recycling if one wants to permanently delete it. It may then appear to the user that the program does not provide the facility for secure disposal of sensitive documents. Similarly, if a user wants to eject a DVD from one of the Apple operating systems, they are required to drag the icon to the wastebasket. Again, this does not make sense in terms of the desktop metaphor, and the user may infer that they are unable to eject their disk. Perhaps even more distressing for the user would be dragging a rewritable medium, such as a floppy disk, to the wastebasket. The implication here is that the action would result in the contents of the disk being deleted, rather than the disk being ejected. Even a more experienced user may want to think twice before dragging such an item to the wastebasket.

 

On the other hand, the inaccuracy of the desktop metaphor does not detract from its usefulness. The importance of the metaphor is to empower the user to work with the abstraction of a computer by providing an interface that improves on the implementation model of the system (Marriott & Beard 2004).

 

Another problem with the metaphor is that it relies heavily on cultural bias. It should not be assumed that a metaphor can be successfully used in all cultures, particularly with the increasing internationalisation of software (see section 2.2). Metaphors need to be chosen with care, as a metaphor that has no meaning, or the wrong meaning, to a certain group of people, adds an unnecessary layer of complexity to the software (Dix et al. 2004).

1.3.3      Personas

A model of user behaviour is necessary to understand the principles of designing for cognitive control (Norman 1991:p.77). The use of personas as a practical interaction design tool was introduced in 1998, and they swiftly gained popularity in the software industry due to their power and effectiveness in modelling users (Cooper 2003).

 

A persona is a precise descriptive model of the users of a product, what they wish to accomplish, and why. Personas are based on the behaviours and motivations of real people and represent them throughout the design process, thereby providing a way to match the behavioural patterns, mental models, and goals of users (Cooper & Reimann 2003).

 

The use of personas moves the design process away from discussions that may be personal in nature or vague, to a series of questions and answers based on a concrete example from which the team can work (Hourihan 2002). For example, vague and personal comments such as, “I’d want it to work this way” or “users like to see all the options on the home page” would become more factual data representative of the end users, e.g. “Mary, the primary persona, works from home via dialup four days a week, therefore downloading an Access database isn’t an option” (Hourihan 2002).

 

Personas assist designers to (Cooper & Reimann 2003):

 

1.       Determine what a product should do and how it should behave, by basing the design effort on persona goals and tasks.

2.       Communicate with stakeholders, developers and other designers by providing a common language for discussing design decisions, and to keep the design centered on users throughout product development.

3.       Build consensus and commitment to the design by reducing the need for elaborate diagrammatic models, due to the ability to understand the nuances of user behaviour through the personas.

4.       Measure the design’s effectiveness by providing a tool that designers can use to address problems and allowing iteration to occur rapidly and inexpensively.

 

Personas are developed through research and real-world observation, by means of interviews, contextual enquiries and other dialogues with, and observations of, actual and potential end-users. They are represented as individuals and are designed to engage the empathy of the development team towards the human target of the design. However, a persona also encapsulates a distinct set of usage patterns of the final product (Cooper & Reimann 2003).

 

Personas should not be reused, because they are context-specific and the focus of behaviours may differ between products. Stereotyping a persona is a danger that should be avoided, as stereotypes are based on bias rather than on factual data. Personas explore ranges of behaviour rather than seeking to establish an “average user” (Cooper & Reimann 2003).

 

Olsen (2004) constructed a toolkit for practical development of personas. His toolkit suggests that a persona profile should contain the following:

 

1.       Persona type – primary, secondary, unimportant.

2.       Biographic background of persona – humanises the persona by providing a backstory and matches it to market segments.

3.       Persona’s relationship to the business – how valuable the persona is to the business.

4.       Product/business’ relationship to the persona – may suggest emotional aspects that the product needs to address.

5.       Specific goals, needs and attitudes of the persona – addressing the behavioural aspects of the product by focusing on user goals and restructuring tasks to meet those goals.

6.       Specific knowledge and proficiency of the persona – persona’s overall knowledge and skills in the context of how the product is to be used.

7.       Context of usage of the product – context in which a persona uses the product, focusing on the wider context surrounding the task.

8.       Interaction characteristics of product usage – specific details about the task.

9.       Information characteristics of product usage – how to present the appropriate content.

10.   Sensory/immersive characteristics of usage – aesthetics are increasingly becoming critical to a product’s success.

11.   Accessibility issues – it may be useful to build these into a persona as long as it doesn’t compromise the main purpose of that persona.

12.   Design issues – think through the issues the persona is addressing.

13.   Relationships between personas – similarities, generalisations and other relational issues.
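The profile items above lend themselves to a simple record structure. The following is a minimal sketch only, assuming Python and covering just a handful of the thirteen items; the class and field names are illustrative choices, not part of Olsen’s toolkit. The “Mary” entry reuses the example from Hourihan (2002) quoted earlier, and the second entry is a purely hypothetical placeholder.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class PersonaType(Enum):
    """Item 1 of the profile: how important the persona is to the design."""
    PRIMARY = "primary"
    SECONDARY = "secondary"
    UNIMPORTANT = "unimportant"


@dataclass
class Persona:
    """A partial persona profile (illustrative subset of the listed items)."""
    name: str
    persona_type: PersonaType
    background: str = ""                                    # item 2
    goals: list[str] = field(default_factory=list)          # item 5
    proficiency: str = ""                                   # item 6
    usage_context: str = ""                                 # item 7
    accessibility_issues: list[str] = field(default_factory=list)  # item 11


def primary_personas(personas: list[Persona]) -> list[Persona]:
    """Return the personas the design effort should be centred on."""
    return [p for p in personas if p.persona_type is PersonaType.PRIMARY]


team = [
    Persona(
        name="Mary",
        persona_type=PersonaType.PRIMARY,
        usage_context="Works from home via dialup four days a week",
    ),
    # Hypothetical placeholder, included only to show the filtering.
    Persona(name="Example secondary persona",
            persona_type=PersonaType.SECONDARY),
]

for persona in primary_personas(team):
    print(persona.name, "-", persona.usage_context)
```

Keeping the profile in a structured form of this kind makes it easy to check design decisions against the primary persona first, which is the practice the toolkit encourages.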

 

 

Personas are still a relatively new tool for interaction design. Whilst Olsen’s toolkit provides a fairly comprehensive method for modelling a persona, often a simple paragraph, as illustrated in Figure 6, is enough to provide an effective tool for system design. Figure 7 provides an example of a more in-depth persona analysis.

Figure 6. A simple example of a Persona for Betty the Warehouse Manager

(Dix et al. 2004:p.201).

Figure 7. An example of an in-depth primary Persona for Sara Locke, user of the Daisy Bead Company website

(sic) (Robinson 2003).

1.4           Types of User Interfaces

Paradigms, metaphors, mental models and personas are driving forces behind the user interface and design employed in a particular system. Three types of user interface are commonly recognised today: the Graphical User Interface, which is probably the most familiar to most users; the Voice User Interface, which is rapidly being deployed in many aspects of business; and the Multi-Modal Interface, a relatively new area of research that combines several methods of user input into a system.

1.4.1      Graphical User Interfaces

Graphical user interfaces make computing easier by separating the logical threads of computing from the presentation of those threads to the user, through visual content on the display device. This is commonly done through a window system that is controlled by an operating system’s window manager. The WIMP (Windows, Icons, Menus, and Pointers) interface is the most common implementation of graphical user interfaces today, and will be examined in detail later in this document. The appeal of graphical user interfaces lies in the rapid feedback provided by the direct manipulation[5] that a GUI offers (Dix et al. 2004). Direct manipulation interfaces provide the following features (Dix et al. 2004:p.171):

 

·         Visibility of the objects of interest.

·         Replacement of complex command languages with actions to directly manipulate the visible objects (hence the name direct manipulation).

·         Incremental action at the interface, with rapid feedback on all actions.

·         Syntactic correctness of all actions, so that every user action is a legal operation.

·         Reversibility of all actions, so that users are encouraged to explore the product without severe penalty.

 

The robustness of the direct manipulation interface for the desktop metaphor is demonstrated by the documents and folders being visible to the user as icons that represent the underlying files and directories (Dix et al. 2004:p.171). With a drag-and-drop style command, it is impossible to make a syntactically incorrect operation. For example, if a user wants to move a file to a different folder, the move command itself is guaranteed to be syntactically correct; and even though the user may make a mistake in placing the file in the wrong place, it is relatively easy to detect and recover from those errors. While the document is being dragged, continual visual feedback is provided to the user, creating the illusion that the user is actually working in the desktop world (Dix et al. 2004).
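The reversibility property described above can be sketched in a few lines of code. The example below is a minimal illustration, assuming Python; the `Desktop` class and its method names are hypothetical and do not describe how any real window system is implemented. Every move is a legal operation by construction, and each one records enough information to be undone.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Desktop:
    """Hypothetical desktop model: maps each document to its folder."""
    location: dict[str, str] = field(default_factory=dict)
    undo_stack: list[tuple[str, str]] = field(default_factory=list)

    def move(self, document: str, target_folder: str) -> None:
        """Move a document, remembering where it came from."""
        previous = self.location[document]
        self.undo_stack.append((document, previous))
        self.location[document] = target_folder

    def undo(self) -> bool:
        """Reverse the most recent move; return False if there is none."""
        if not self.undo_stack:
            return False
        document, previous = self.undo_stack.pop()
        self.location[document] = previous
        return True


desktop = Desktop(location={"report.doc": "Drafts"})
desktop.move("report.doc", "Archive")  # the user drops the file in the wrong folder
desktop.undo()                         # and recovers with a single action
print(desktop.location["report.doc"])
```

Because a mistaken move costs the user only one further action, exploration carries little penalty, which is the point of the reversibility feature listed above.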

 

In February 2004, the National Academy of Engineering awarded the Draper Prize to the Xerox PARC research team who, in 1973, unveiled the Xerox Alto, the first computer to use a direct manipulation graphical user interface (Figure 8). However, it wasn’t until 1984 that the GUI was popularised by Apple with its Macintosh personal computer (Figure 9), which became the first to commercially demonstrate the intrinsic usability of direct manipulation interfaces (Dix et al. 2004:p.171). Its relative ease of use was a major selling point in Apple’s advertising campaign with the slogan, “The computer for the bemused, confused and intimidated” (Figure 10). The immense success of these early advertising campaigns highlights an issue that is just as relevant to the computer industry today.

Figure 8. 1973 Xerox Alto[6] (left).

Figure 9. 1984 Apple Macintosh[7] (right).

Figure 10. Early Apple Macintosh advertisement[8]

1.4.2      Voice User Interfaces

Voice User Interfaces (VUIs) use speech technology to provide people with access to information and to allow them to perform transactions. VUI development was driven by customer dissatisfaction with touchtone telephony interactions, the need for cheaper and more effective systems to meet customer needs, and the advancement of speech technology to the stage where it was robust and reliable enough to deliver effective interaction. With the technology finally at the stage where it can be effectively and reliably used, the greatest challenge remains in the design of the user interface (Cohen, Giangola & Balogh 2004).

 

A Voice User Interface is what a person interacts with when using a spoken language application. Auditory interfaces interact with the user purely through sound. Speech is input by the user, and speech or nonverbal audio is output by the system (Cohen, Giangola & Balogh 2004).

 

People learn spoken language implicitly at a very young age, rather than explicitly through an education system. Speakers therefore do not generally think about the technical means of getting a message across; they simply focus on the meaning of the message they wish to convey. Consequently, VUI designers must focus on understanding the conventions, assumptions and expectations of conversation instead of its underlying constructs (Cohen, Giangola & Balogh 2004).

 

VUIs comprise three main elements (Cohen, Giangola & Balogh 2004):

 

1.       Prompts, also known as system messages, are the recorded or synthesised speech played to the user during the interaction.

2.       Grammars are the possible responses users can make in relation to each prompt. The system cannot understand anything outside of this range of possibilities.

3.       Dialog logic determines the actions the system can take following a user’s response to a prompt.
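
The three elements above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any real speech platform: the state names, the banking-style prompts and the `step` function are all hypothetical, and speech recognition is assumed to have already turned the caller's utterance into text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DialogState:
    prompt: str               # system message played to the user
    grammar: Dict[str, str]   # accepted utterance -> name of the next state


# Hypothetical dialog; every name and prompt here is invented for illustration.
STATES = {
    "main": DialogState(
        "Would you like your balance or a transfer?",
        {"balance": "balance", "transfer": "transfer"},
    ),
    "balance": DialogState("Your balance is one hundred dollars. Goodbye.", {}),
    "transfer": DialogState("Transfers are not available in this sketch. Goodbye.", {}),
}


def step(state_name: str, utterance: str) -> str:
    """Dialog logic: pick the next state from the user's response.

    Anything outside the current grammar is not understood, so the
    dialog stays in the same state and the prompt would be replayed.
    """
    state = STATES[state_name]
    key = utterance.strip().lower()
    return state.grammar.get(key, state_name)
```

Calling `step("main", "balance")` moves the dialog to the balance state, whereas an out-of-grammar utterance such as "weather" leaves it in the main state, which reflects the point that the system cannot understand anything outside the grammar.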

 

Additionally, auditory interfaces provide the opportunity to use nonverbal audio, such as background music and earcons (sounds that represent a specific event), to create an auditory environment for the user and thus a unique “sound and feel” for a business or application (Cohen, Giangola & Balogh 2004).

 

Auditory interfaces provide new ways of supplying information, enabling transactions, and facilitating communication. They have benefits for both the company and the end user. Speech systems normally save companies significant amounts of money by lowering call abandonment rates and increasing automation rates, as well as reducing toll call charges through shorter call durations. They can enable companies to extend their reach to customers who don’t have web access and those who desire mobility whilst retaining the same level of service. Additionally, auditory interfaces can solve problems or offer services that were not available or possible in the past, such as automated personal agents that use speech technology to act as a personal assistant (Cohen, Giangola & Balogh 2004).

 

For the end user, voice systems can be more efficient and intuitive by drawing on the user’s innate language skills and simplifying input sequences. Because telephones are ubiquitous, speech systems are mobile: users can employ them anywhere, without occupying their hands and eyes, which also makes speech systems easily accessible. A well-designed speech system can thus free up the user’s time by meeting their needs more efficiently (Cohen, Giangola & Balogh 2004).

 

Aside from speech recognition systems, other speech technologies include Text-to-Speech (TTS) Synthesis and Speaker Verification. Speaker Verification involves collecting a small sample of a person’s speech to create a voice template, which is used to enrol the person into a system and is then compared against their future speech. The system can be used, for example, to replace personal identification numbers (PINs) and to monitor home incarceration (Cohen, Giangola & Balogh 2004).

 

Text-to-Speech technology, on the other hand, synthesises speech from text. The technology has improved significantly in recent times, and although it does not yet duplicate the quality of recorded human speech, it is still a good option for creating messages from text that cannot be predicted, such as reading web pages aloud for blind users (Cohen, Giangola & Balogh 2004). Some examples of text-to-speech implementations can be found at www.vhml.org/examples.

 

Understanding the nature of speech systems, as well as their benefits and limitations, is necessary in order to design effective Voice User Interfaces. The unique design challenges presented by VUIs are due to the transitory and non-persistent nature of their messages (Cohen, Giangola & Balogh 2004). Messages are invisible: once the user hears the message, it is gone, with no screen to display the information for further analysis. The user must usually decide then and there how they will respond to a prompt. The cognitive limitations of the user necessarily restrict the quantity of information that can be presented by the auditory interface at any one time. Thus voice interface designs should not unnecessarily challenge the user’s short-term memory, and they should provide a mechanism for the user to adjust the pacing of the interaction to better suit their own needs (Cohen, Giangola & Balogh 2004).
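
One way to respect these limits is to break a long menu into several short prompts and let the user ask for the next group at their own pace. The following Python sketch illustrates the idea only; the function name `chunk_options`, the wording of the prompts and the default of three options per prompt are assumptions made for this example, not figures taken from Cohen, Giangola & Balogh.

```python
from typing import List


def chunk_options(options: List[str], per_prompt: int = 3) -> List[str]:
    """Split a list of menu options into short spoken prompts.

    per_prompt is an assumed limit on how many options are read out
    at once; a real design would tune it through user testing.
    """
    prompts = []
    for start in range(0, len(options), per_prompt):
        group = options[start:start + per_prompt]
        text = "You can say: " + ", ".join(group) + "."
        # Offer 'more' only when further options remain, so the user
        # controls when the next group is read out.
        if start + per_prompt < len(options):
            text += " Or say 'more' to hear further options."
        prompts.append(text)
    return prompts
```

For a menu of four options, this produces two prompts: the first reads three options and offers "more", and the second reads the remaining option.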

1.4.3      Multi-modal User Interfaces

Multi-modal interfaces attempt to address the problems associated with purely auditory and purely visual interfaces by providing a more immersive environment for human-computer interaction. A multi-modal interactive system is one that relies on the use of multiple human communication channels to manipulate the computer. These communication channels translate to a computer’s input and output devices. A genuine multi-modal system relies on simultaneous use of multiple communication channels for both input and output, which more closely resembles the way in which humans process information (Dix et al. 2004).

 

In the field of psychology, Gestalt Theory is used to describe a relationship where the whole is something other than the sum of its parts[9]. This theory has recently been used to describe a new paradigm for human-computer interaction, where the interface perceives and reacts to the desires of the user via the user’s emotions and gestures (Marriott & Beard 2004). This paradigm is called the gestalt User Interface (gUI) and paves the way for a truly personalised user experience.

Figure 11. MetaFace.

MetaFace is an embodied character agent for a multi-modal interface that uses the “virtual friend” as a metaphor to help with web searches (Marriott & Beard 2004).

 

Marriott & Beard (2004) write that “a gUI is an interface where input to the application can be a combined stimulus of text, button clicks, and analysed facial, vocal, and emotional gestures”. The user’s emotions, captured through input devices such as video and audio, are translated into input for a gUI, and the program’s response is rendered using a markup language.

 

An interface that can interpret a user’s emotions has applications in fields ranging from business management, safety, and productivity, to entertainment and education. For example, if a program could recognise that the user was getting frustrated, it could modify its behaviour to compensate. When evaluating the text-and-GUI-based Mentor System application used by students from Curtin University to assist with their assignments, Marriott (2003) found that “personality conflict” occurred between the system and some users. For example, one user became intensely annoyed at the beeping sound the program made, while another user found the program to be discourteous. Marriott suggests that incorporating a dynamically adjusting module into the user interface could eliminate or reduce some of these problems.
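
The idea of a dynamically adjusting module can be illustrated with a small Python sketch. It is a hypothetical example only and does not describe how the Mentor System or MetaFace is implemented: the `Stimulus` class, the emotion labels and the behaviour settings are all invented here, and the emotion is assumed to have been estimated already by some separate facial or vocal analysis step.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Stimulus:
    """A combined input: text, a button click and an analysed emotion."""
    text: str = ""
    button: Optional[str] = None
    emotion: str = "neutral"   # hypothetical label from face/voice analysis


def choose_behaviour(stimulus: Stimulus) -> Dict[str, object]:
    """Adjust the interface's behaviour to the user's apparent mood."""
    behaviour = {"beep": True, "tone": "standard"}
    if stimulus.emotion == "frustrated":
        # Compensate: silence the alert sound and soften the wording.
        behaviour["beep"] = False
        behaviour["tone"] = "apologetic"
    return behaviour
```

With a neutral user the sketch keeps its default behaviour, whereas `choose_behaviour(Stimulus(text="help", emotion="frustrated"))` turns the beep off and switches to a more apologetic tone.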

Marriott and Beard (2004) suggest that an anthropomorphic metaphor should be used to specify complete interaction with an Embodied Character Agent[10] (ECA) that incorporates emotion and gesture analysis. The Mentor System, even though it does not incorporate an ECA, also uses an anthropomorphic metaphor of a virtual lecturer (Marriott & Beard 2004).

 

One of the better-known commercial examples of the ECA is the virtual newscaster at www.ananova.com. Whilst the implementation technology for Ananova has yet to meet the empathy requirements of a gestalt user interface, such as realistic facial animation, its speech synthesis software is quite good, and it nevertheless provides a good example of what could be achieved when the technology is realised.

 

Another product that makes use of multi-modal interfaces is Avatar-Conference, an alternative to video-conferencing that uses avatars to represent the conference participants in a virtual environment. Although it does not incorporate dynamically adjusting modules, it nevertheless provides more than one mode of user interaction.

Figure 12. Avatar-Conference, a multi-modal interface using avatars to represent humans