Human-Computer Interaction: Evaluation Techniques (2)
Styles of evaluation through user participation
Laboratory studies: users are taken out of their normal work environment to take part in controlled tests, often in a specialist usability laboratory. This approach is appropriate when:
- the system is to be located in a dangerous or remote location, such as a space station;
- the task is a very constrained single-user task that can be adequately performed in a laboratory.
Field studies: the designer or evaluator goes out into the user's work environment in order to observe the system in action.
Empirical methods: experimental evaluation
One of the most powerful methods of evaluating a design or an aspect of a design is to use a controlled experiment. The evaluator chooses a hypothesis to test, which can be determined by measuring some attribute of participant behavior. A number of experimental conditions are considered which differ only in the values of certain controlled variables. There are a number of factors that are important to the overall reliability of the experiment, which must be considered carefully in experimental design. These include the participants chosen, the variables tested and manipulated, and the hypothesis tested.
Experimental evaluation: Participants
Participants should be chosen to match the expected user population as closely as possible. If participants are not actual users, they should be chosen to be of a similar age and level of education as the intended user group.
A second issue relating to the participant set is the sample size. The sample must be large enough to be representative of the population, taking into account the design of the experiment and the statistical methods chosen. Nielsen and Landauer suggest that usability testing with a single participant will find about a third of the usability problems, and that there is little to be gained from testing with more than five.
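Nielsen and Landauer model problem discovery as a diminishing-returns curve: the proportion of problems found by n participants is roughly 1 - (1 - L)^n, where L is the proportion found by a single participant (about 0.31 in their data). A minimal sketch of that model in Python, assuming their value of L:

    # Nielsen-Landauer diminishing-returns model of usability testing.
    L = 0.31  # proportion of problems one participant finds (their estimate)
    for n in range(1, 9):
        found = 1 - (1 - L) ** n
        print(f"{n} participant(s): ~{found:.0%} of problems found")
    # One participant finds ~31%; five find ~84%; the curve flattens after that.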
Experimental evaluation: Variables
There are two main types of variable: those that are ‘manipulated’ or changed (known as the independent variables) and those that are measured (the dependent variables). Examples of independent variables in evaluation experiments are interface style, level of help, number of menu items and icon design. Each of these variables can be given a number of different values; each value that is used in an experiment is known as a level of the variable. For example, an experiment that wants to test whether search speed improves as the number of menu items decreases may consider menus with five, seven, and ten items. Here the independent variable, number of menu items, has three levels.
Dependent variables, on the other hand, are the variables that can be measured in the experiment; their value is 'dependent' on the changes made to the independent variable. In the example given above, this would be the speed of menu selection. The dependent variable must be measurable in some way, it must be affected by the independent variable and, as far as possible, unaffected by other factors.
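To make the terms concrete, the menu experiment above could be recorded as follows. This is a minimal sketch in Python; the Trial structure and the timing values are invented for illustration:

    # Each trial pairs one level of the independent variable (number of
    # menu items) with one measurement of the dependent variable (time).
    from dataclasses import dataclass

    @dataclass
    class Trial:
        menu_items: int        # independent variable: 5, 7 or 10 (its levels)
        selection_time: float  # dependent variable, in seconds

    trials = [Trial(5, 1.8), Trial(5, 2.0), Trial(7, 2.3),
              Trial(7, 2.1), Trial(10, 2.9), Trial(10, 3.1)]

    # Group the measured values by level of the independent variable.
    by_level: dict[int, list[float]] = {}
    for t in trials:
        by_level.setdefault(t.menu_items, []).append(t.selection_time)

    for level, times in sorted(by_level.items()):
        print(f"{level} menu items: mean {sum(times) / len(times):.2f} s")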
Experimental evaluation: Hypotheses
A hypothesis is a prediction of the outcome of an experiment. It is framed in terms of the independent and dependent variables, stating that a variation in the independent variable will cause a difference in the dependent variable. The aim of the experiment is to show that this prediction is correct. This is done by disproving the null hypothesis, which states that there is no difference in the dependent variable between the levels of the independent variable.
Experimental evaluation: Experimental design
The next step is to decide on the experimental method to use. There are two main methods: between-subjects and within-subjects. In a between-subjects (or randomized) design, each participant is assigned to only one condition; this avoids any transfer of learning between conditions, but requires more participants and is more vulnerable to individual differences. In a within-subjects (or repeated measures) design, each participant performs under every condition; fewer participants are needed and individual differences are controlled, but transfer of learning can bias the results unless the order of conditions is varied across participants.
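To make the distinction concrete, here is a minimal sketch of how participants could be assigned under each design; the participant labels and condition names are invented:

    import random

    participants = [f"P{i}" for i in range(1, 9)]
    conditions = ["no_color", "color"]

    # Between-subjects: each participant is assigned to exactly one condition.
    random.shuffle(participants)
    between = {p: conditions[i % len(conditions)]
               for i, p in enumerate(participants)}

    # Within-subjects: every participant performs under every condition;
    # the order is randomized to reduce transfer-of-learning effects.
    within = {p: random.sample(conditions, k=len(conditions))
              for p in participants}

    print(between)
    print(within)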
Experimental evaluation: Statistical Measures
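The choice of statistical measure depends on the type of data collected and the questions to be answered. The worked example below uses a t test to compare error counts between two conditions; here is a minimal sketch using SciPy, with invented error counts:

    # Independent-samples t test on error counts from two conditions.
    from scipy import stats

    errors_plain = [4, 6, 5, 7, 3, 6, 5, 4]  # condition 1: no color coding
    errors_color = [2, 3, 1, 4, 2, 3, 2, 1]  # condition 2: color coding added

    t_stat, p_value = stats.ttest_ind(errors_plain, errors_color)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

    # If p falls below the chosen significance level (commonly 0.05), the
    # null hypothesis of "no difference between conditions" is rejected.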
Experimental evaluation: Example
Test whether adding color coding to an interface will improve accuracy.
- Participants: taken from the user population.
- Hypothesis: color coding will make selection more accurate.
- IV (Independent Variable): color coding.
- DV (Dependent Variable): accuracy, measured as the number of errors.
- Design: between-subjects, to ensure no transfer of learning (or within-subjects with appropriate safeguards if participants are scarce).
- Task: the interfaces are identical in each condition, except that, in the second, color is added to indicate related menu items. Participants are presented with a screen of menu choices (ordered randomly) and are told verbally what they have to select. Selection must be made within a strict time limit, after which the screen clears; failure to select the correct item within the limit is deemed an error. Each presentation places the items in new positions. Each participant performs under one of the two conditions.
- Analysis: t test.
Observational Techniques
A popular way to gather information about actual use of a system is to observe users interacting with it.
Usually they are asked to complete a set of predetermined tasks, although, if observation is being carried out in their place of work, they may be observed going about their normal duties. The evaluator watches and records the users’ actions (using a variety of techniques – see below).
Observational Techniques: Think Aloud and cooperative evaluation
Think aloud is a form of observation in which the user is asked to talk through what he is doing as he is being observed: for example, describing what he believes is happening, why he takes an action, and what he is trying to do. A variation on think aloud is known as cooperative evaluation, in which the user is encouraged to see himself as a collaborator in the evaluation rather than simply as an experimental participant. As well as asking the user to think aloud at the beginning of the session, the evaluator can ask the user questions (typically of the 'why?' or 'what if?' type) if his behavior is unclear, and the user can ask the evaluator for clarification if a problem arises.
Observational Techniques: Protocol Analysis
Methods for recording user actions include the following:
- Paper & Pencil
- Audio Recording
- Video Recording
- Computer Logging (see the sketch after this list)
- User Notebooks
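As an illustration of computer logging, here is a minimal sketch that records time-stamped user actions for later protocol analysis; the action names and the log format are invented:

    import json
    import time

    log: list[dict] = []

    def record(action: str, target: str) -> None:
        """Append one time-stamped user action to the session log."""
        log.append({"t": time.time(), "action": action, "target": target})

    # Hypothetical actions captured during a session.
    record("click", "File menu")
    record("select", "Open...")
    record("keypress", "Escape")

    # Persist the session for later analysis.
    with open("session.log", "w") as f:
        json.dump(log, f, indent=2)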
Observational Techniques: Automatic Protocol Analysis Tools
Automatic protocol analysis tools synchronize and time-stamp the different streams of recorded data (such as video, audio and system logs) so that they can be searched, annotated and analyzed together, reducing the effort of manual transcription.
Query Techniques
Another set of evaluation techniques relies on asking the user about the interface directly. Query techniques can be useful in eliciting detail of the user's view of a system. They embody the philosophy that the best way to find out how a system meets user requirements is to 'ask the user'. There are two main types of query technique: interviews and questionnaires.
Query Techniques: Interview
Interviewing users about their experience with an interactive system provides a direct and structured way of gathering information. Interviews have the advantages that the level of questioning can be varied to suit the context and that the evaluator can probe the user more deeply on interesting issues as they arise. An interview will usually follow a top-down approach, starting with a general question about a task and progressing to more leading questions (often of the form 'why?' or 'what if?') to elaborate aspects of the user's response.
Query Techniques: Questionnaires
An alternative method of querying the user is to administer a questionnaire. This is clearly less flexible than the interview technique, since questions are fixed in advance, and it is likely that the questions will be less probing. However, it can be used to reach a wider participant group, it takes less time to administer, and it can be analyzed more rigorously.
It can also be administered at various points in the design process, including during requirements capture, task analysis and evaluation, in order to get information on the user’s needs, preferences and experience.
There are a number of styles of question that can be included in the questionnaire.
- General: questions that establish the background of the user, such as age, occupation and previous experience with computers.
- Open-ended: questions that ask the user for an unprompted opinion.
- Scalar: questions that ask the user to judge a specific statement on a numeric scale.
- Multi-choice: questions where the respondent picks from a set of explicit responses.
- Ranked: questions that ask the user to place items in order, for example of usefulness or preference.
Questionnaire Examples
SUMI
The Software Usability Measurement Inventory (SUMI) consists of a 50-item questionnaire to which the user replies 'Agree', 'Disagree' or 'Don't Know'. The questionnaire comprises five subscales (Efficiency, Affect, Helpfulness, Controllability, Learnability) and scores them against a continuously maintained database of industry benchmarks. The entire test can be completed in just a few minutes per respondent.
SUS
The System Usability Scale (SUS) provides a 'quick and dirty' but reliable tool for measuring usability. It consists of a 10-item questionnaire with five response options per item, from 'Strongly agree' to 'Strongly disagree'. Originally created by John Brooke in 1986, it can be used to evaluate a wide variety of products and services, including hardware, software, mobile devices, websites and applications.
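SUS scoring follows a fixed rule: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a score from 0 to 100. A minimal sketch; the example answers are invented:

    def sus_score(responses: list[int]) -> float:
        """responses: ten answers, each 1 (strongly disagree) .. 5 (strongly agree)."""
        assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
        total = 0
        for i, r in enumerate(responses, start=1):
            total += (r - 1) if i % 2 == 1 else (5 - r)  # odd vs even items
        return total * 2.5

    # One invented respondent:
    print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))  # -> 85.0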
Source
Slide HCI Evaluation Technique 2