How Do People Really Use Text Editors?

John Whiteside, Norman Archer, Dennis Wixon, and Michael Good
Digital Equipment Corporation

Originally published in Proceedings of the SIGOA Conference on Office Information Systems (Philadelphia, June 21-23, 1982), SIGOA Newsletter, 3 (1 & 2), June 1982, pp. 29-40. Included here with permission. Copyright © 1982 by ACM, Inc.

Abstract

Keystroke statistics were collected on editing systems while people performed their normal work. Knowledge workers used an experimental editor, and secretaries used a word processor. Results show a consistent picture of free use patterns in both settings. Of the total number of keystrokes, text entry accounted for approximately 1/2, cursor movement for about 1/4, deletion for about 1/8, and all other functions for the remaining 1/8. Analysis of keystroke transitions and editing states is also presented. Implications for past research, editor design, keyboard layout, and benchmark tests are discussed.

1. Introduction

Different approaches have been used to study usage of text editors. Several researchers have proposed models of user performance and then tested user behavior with respect to these models [1, 2, 3, 4, 5]. A second approach has been to compare editors by having subjects perform specific benchmark tasks [9]. A third approach has been to compare command transition matrices for various editors [6].

1.1. Ideal Models

Models of text editor performance have focused either on predicting the time that users would take to perform a particular task, or on the strategy that users would employ to accomplish a particular goal. Models that focus on time [1, 2, 3] treat the overall task time as the sum of the times taken by various subtasks. These subtasks include operations such as typing, pointing, thinking, and waiting for system response. Models that focus on commands include the GOMS model [1], Hammer's constrained optimal model [5], and Embley and Nagy's model based on file comparison [3].
All of these models share a common attribute of concentrating on idealized performance. For example, Card et al. ignore errors, give subjects practice problems, and instruct them with respect to how particular activities are to be performed. Alternatively, the performance of users is considered as a deviation from the model. Such deviation is interpreted as suggesting the need for user training [5] or has been tabulated and classified according to user experience [3]. Hammer found that an optimal model predicted user performance rather poorly. Constraining the model by taking into account only those commands which were actually used improved the fit considerably.

1.2. Benchmark Tasks

One of the most detailed analyses of editors was conducted by Roberts [9]. She developed methods for comparing editors on the basis of speed of use, ease of learning, range of functionality, and error-proneness. Speed analyses of editors involved the performance of benchmark tasks by experts on each system. She used these methods to compare the TECO, Wylbur, NLS, and Wang text editors. TECO and Wylbur were found to be significantly slower to use than the other two editors. TECO was also significantly harder to learn than the other three editors.

1.3. Transition Matrices

Hammer and Rouse [6] analyzed free keystroke records for both programmers and documenters using either SOS or TECO. Behavior within an editor was categorized into uniform editor primitives, such as type one line, alter one character, type many lines, and alter many characters. The matrices of transition probabilities between editor primitives were then compared to an overall transition matrix which represented the average of various editors. The results led Hammer and Rouse to conclude that the differences between editors could have been due to the fact that different people used them.
In another analysis, Hammer and Rouse compared the same users editing programs and documents and found that about 75% of the subjects used similar techniques for each. All of these analyses were applied to editors which use hard copy terminals.

1.4. Present Approach

Instead of postulating an ideal editing model, we attempted to record and describe actual usage patterns. To this end, keystroke-level data were collected while people performed their normal work on either an experimental editor or a word processing editor. Unlike previous research, this free-use logging was collected solely from screen editors and analyzed at several levels. Recommendations for editor design and evaluation procedures are based on actual editor usage.

2. Method

In the present experiment, users' keystrokes were logged and time-stamped for two editors. The first editor was an experimental editor called the Editor Prototyping Tool (EPT). The second editor was Digital Equipment Corporation's 200 Series commercial word processor (WPS). Both of these are character-oriented screen editors, but they differ in important ways.

2.1. EPT Description

EPT was designed specifically for research purposes. Thus, it contained a built-in keystroke-logging facility, designed not to influence software performance. EPT provided a simple set of editing functions including cursor movement anywhere on the screen, search, text relocation, and text deletion. Typing could be done in either insert or overstrike mode. These functions were activated by dedicated function keys, in most cases by a single keystroke. A short help facility was provided. EPT contained few of the formatting functions present in a normal word processor; however, it did perform automatic word wrapping.

2.2. WPS Description

The WPS word processor is a character-oriented screen editor in which editing is done on the bottom line of the screen. The cursor may be moved within the boundaries of existing text.
The keyboard has a variety of special keys for editing and cursor movement. A special "gold" key on the keypad, when used in combination with other keypad keys and with keys on the regular keyboard, greatly expands the number of possible functions which may be selected directly. WPS has logical equivalents for all EPT functions, and has many more functions in addition, including sophisticated text formatting functions.

2.3. Subjects

Six subjects contributed to the EPT work sample. These subjects were knowledge workers creating and revising their own reports and memos. The logging of data was not begun until all subjects had at least three weeks' experience with EPT, which was sufficient time to allow these subjects to learn the features of this simple editor.

The subjects using WPS were eight secretaries who were carrying out regular secretarial and administrative duties for a research and development department of Digital Equipment Corporation. The experience of the subjects varied widely, with three having had almost two years of experience with the system and the full set of three training courses. One had a year of experience and one training course. The remaining four subjects had no training course but were self-taught, with experience of one to six months on the system.

2.4. Keystroke Logging

As mentioned above, EPT had a built-in logging facility. A permanent record of every keystroke was made, together with time of occurrence. EPT ran under the VMS timesharing system on a VAX-11/780 computer, supporting DEC VT100 terminals. For the WPS word processing system, character streams and occurrence times from six VT100W terminals were collected by a PDP 11/34 computer, in such a way that WPS system performance was not affected. The recording was done over a period of 11 days.

3. Results

3.1. General Characteristics of the Work Samples

Table 1 shows some summary characteristics of the work samples.
Table 1: Comparison of EPT and WPS Work Samples

The samples contain roughly equal numbers of keystrokes, about half a million each. Total person hours in the WPS sample were over twice as many as in the EPT sample; this discrepancy was due to the secretaries' habit of turning on their machines in the morning and leaving them running for the entire day. EPT ran on a general timesharing system, so users were less likely to leave the editor running constantly. This difference can be seen by looking at the number of hours of inactivity (defined as the sum of all periods of greater than 150 seconds without a keystroke), which is much larger for the WPS sample. The remaining time, hours of active work, is comparable for both samples at somewhat under 100 hours.

3.2. Individual Keystroke-Level Analysis

Table 2 shows summary statistics for keystroke usage for the EPT editor. The left-hand column identifies the function. Each entry refers to an individual key, except for the case of "printing characters," which includes all alphanumerics and symbols under a single heading. The keys are ordered by frequency of usage, in terms of percentage of total keystrokes recorded. This number is shown in the second column; the percentages are based on the total number of keystrokes in the work sample (510,503).

The remaining two columns in Table 2 show data with respect to time. The percentage of use by time is shown in the third column, and the mean time per keystroke is shown in the fourth column. In computing these statistics, we assigned the time preceding each keystroke to the keystroke in question. Analogous computations based on the time following each keystroke give very similar results.
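The time accounting described above can be illustrated with a minimal sketch. It assumes a log of `(timestamp_seconds, key_name)` pairs, which is a hypothetical format and not the actual EPT or WPS log layout. Gaps longer than 150 seconds are counted as inactivity, and every other gap is charged to the keystroke that follows it. Excluding the long gaps from per-key time is an assumption of this sketch; the text does not say how they were treated in Table 2.

```python
from collections import defaultdict

PAUSE_THRESHOLD_S = 150.0  # inactivity cutoff stated in the text


def summarize(log):
    """Summarize a chronological list of (timestamp_seconds, key_name) pairs.

    Returns (inactive_seconds, time_by_key, count_by_key).
    """
    inactive = 0.0
    time_by_key = defaultdict(float)
    count_by_key = defaultdict(int)
    if not log:
        return inactive, {}, {}
    # The first keystroke has no preceding interval, so it only adds a count.
    count_by_key[log[0][1]] += 1
    for (t_prev, _), (t_cur, key) in zip(log, log[1:]):
        gap = t_cur - t_prev
        count_by_key[key] += 1
        if gap > PAUSE_THRESHOLD_S:
            inactive += gap          # assumption: long gaps are not key time
        else:
            time_by_key[key] += gap  # time preceding the keystroke
    return inactive, dict(time_by_key), dict(count_by_key)


def mean_time_per_key(time_by_key, count_by_key):
    """Mean seconds per keystroke for each key that accumulated time."""
    return {k: time_by_key[k] / count_by_key[k] for k in time_by_key}
```

Charging the interval after each keystroke instead would only require pairing each gap with the earlier key of the pair.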
Table 2: EPT Keystroke Frequency Table

Table 2 shows that typing accounted for about one half of editor usage, both in terms of keystroke frequency and time. The keystroke-level editing functions showed a non-uniform distribution of usage frequencies; the most frequently used functions are over 1000 times more frequent than the least frequently used functions. The top five editing functions account for more editing time than all the other functions combined. Four of these functions are the up, down, left, and right cursor movement functions -- collective use of these functions accounted for about one quarter of user editing time. Single character deletion was the next most frequently used function, accounting for about 7% of users' time.

Table 3 shows summary statistics for the WPS work sample. The format of Table 3 is the same as that for Table 2. The table treats "gold" key editing functions as a single keystroke. Typing accounted for about one half of user time, as was the case with the EPT work sample. The distribution of times across functions is again highly non-uniform. The three most often used functions (Advance, Line, Backup) are all cursor movement functions, similar to the arrow functions in EPT. These three functions account for just over 25% of total editing time. Single character deletion (Rub Character) was the next most frequent function.
Table 3: WPS Keystroke Frequency Table

3.3. Keystroke Transition Analysis

In addition to computing results at an individual keystroke level, we also examined transitions between keystrokes. Table 4 shows results of such analysis in the form of a keystroke transition matrix. The table shows keystroke transitions for the 10 most frequently used EPT functions. The function preceding the transition is shown along the left-hand margin. The function following the transition is shown along the top of the table. The entries in the matrix show the observed probability of transition from the row keystroke to the column keystroke. For example, the upper-left-hand cell pertains to transitions between alphanumeric keys (printing characters). The transition probability of .92 indicates that if the user has just entered a printing character, the probability is .92 that the next keystroke will also be a printing character. In the full matrix, the probabilities in each row sum to 1.0 (accounting for all transitions). The rows of Table 4 do not sum to 1.0, because only the upper-left-hand corner of the full 29 by 29 matrix is presented.
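As an illustration of how such row probabilities can be derived from a raw keystroke sequence, here is a minimal sketch. The key labels in the usage note are hypothetical, and the functions are not taken from the authors' analysis programs.

```python
from collections import Counter, defaultdict


def transition_counts(keys):
    """Count transitions from each keystroke to the one that follows it."""
    return Counter(zip(keys, keys[1:]))


def row_probabilities(counts):
    """Convert transition counts to probabilities conditioned on the row key."""
    row_totals = defaultdict(int)
    for (src, _dst), n in counts.items():
        row_totals[src] += n
    return {
        (src, dst): n / row_totals[src]
        for (src, dst), n in counts.items()
    }
```

For example, `row_probabilities(transition_counts(["CHAR", "CHAR", "CHAR", "LEFT", "CHAR"]))` gives the probability of each following key given the preceding one, and the entries for any one preceding key sum to 1.0.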
Table 4: EPT Keystroke Transition Probabilities (n = 1)

Table 5 shows a transition matrix similar to Table 4, except that the upper entry in each cell shows the actual number of transitions of each type that took place. In addition to computing actual transitions, we also estimated how many transitions ought to occur, based on the assumption that the entries in the matrix are determined only by the marginal totals. The empirical matrices differ substantially from expectation, given such a hypothesis.
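A minimal sketch of the expected-count calculation follows, under the usual reading that "determined only by the marginal totals" means row total times column total divided by the grand total. The function name and the dictionary-of-pairs representation are assumptions made for illustration.

```python
from collections import Counter


def expected_counts(counts):
    """Expected transition counts if rows and columns were independent.

    counts maps (row_key, column_key) to an observed number of transitions.
    Every row/column combination gets an expected value, including cells
    that were never observed.
    """
    row, col = Counter(), Counter()
    for (src, dst), n in counts.items():
        row[src] += n
        col[dst] += n
    total = sum(counts.values())
    return {(s, d): row[s] * col[d] / total for s in row for d in col}
```

Comparing these values cell by cell with the observed counts shows where behavior departs from the independence hypothesis.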
Table 5: EPT Keystroke Transition Matrix, n = 1 (Totals from full matrix)

We were interested in describing the degree of organization, or degree of non-randomness, inherent in the users' keystroke patterns. Suppose, as an example, that keystroke sequences are entirely random. We would then expect the empirical transition frequencies to be very close to those that we would predict on the basis of the marginal frequencies. Clearly, this is not the case for the empirical data shown in Table 5. A comparison of obtained with expected frequencies shows the extent to which the actual keystroke behavior deviates from random expectation. We can compute a chi-square statistic using the expected and obtained frequencies. This shows the two matrices to be significantly different at the .001 level (chi square = 2,249,886; df = 273).

Another way of expressing the same idea is with an index of predictive association [7, p. 747]. This is an index varying from 0.0 to 1.0, which shows the extent to which knowing the initial state improves the accuracy of prediction of a subsequent state. The index of predictive association for the data in Table 5 is 0.73 (with a 95% confidence limit of less than .01). For a transition matrix based on random key sequences, the index would approach 0.0.

We were also interested in attempting to describe the degree to which keystroke behavior is organized over longer sequences. Table 5 shows transitions between keystroke n and keystroke n + 1, for all n in our sample. We also computed transition matrices for transitions between keystrokes n and n + m for m between 1 and 25. Table 6 shows such a matrix, in this case for m = 12. That is, the column keystrokes are those that follow the row keystrokes with a delay of 11 intervening strokes. The intent of this analysis is to see how far ahead in a sequence of strokes we can make good predictions, based on knowledge of the stroke at the start of the sequence.
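The statistics discussed above can be sketched as follows. This sketch assumes that the index of predictive association is the Goodman and Kruskal lambda for predicting the column category from the row category, and that the chi-square statistic is the ordinary Pearson statistic; neither assumption is confirmed by the text beyond its citation of [7]. The function names and data layout are hypothetical.

```python
from collections import Counter


def lagged_counts(seq, m=1):
    """Count transitions from item n to item n + m in a sequence."""
    return Counter(zip(seq, seq[m:]))


def chi_square(counts):
    """Pearson chi-square of observed counts against marginal expectation."""
    row, col = Counter(), Counter()
    for (src, dst), n in counts.items():
        row[src] += n
        col[dst] += n
    total = sum(counts.values())
    stat = 0.0
    for s in row:
        for d in col:
            expected = row[s] * col[d] / total
            stat += (counts.get((s, d), 0) - expected) ** 2 / expected
    return stat


def predictive_association(counts):
    """Goodman-Kruskal lambda: gain in predicting the column from the row."""
    row_max = {}
    col = Counter()
    for (src, dst), n in counts.items():
        row_max[src] = max(row_max.get(src, 0), n)
        col[dst] += n
    total = sum(counts.values())
    best_col = max(col.values())
    if total == best_col:
        return 0.0  # only one column category: nothing to improve on
    return (sum(row_max.values()) - best_col) / (total - best_col)
```

Evaluating `predictive_association(lagged_counts(keys, m))` for m from 1 to 25 would produce the kind of lookahead curve the text describes, under the assumptions stated above.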
Table 6: EPT Keystroke Transition Matrix, n = 12 (Totals from full matrix)

We have generated expected numbers of keystrokes for the n to n + 12 transitions; these are also shown in Table 6. Notice that the actual frequencies still deviate substantially from the random predictions, but that the deviations are smaller than those for the n to n + 1 transitions in Table 5. This result is also expressed in the value of the index of predictive association, which is 0.30 for the data in Table 6.

Figure 1 plots the index of predictive association as a function of the number of transitions between keystroke n and n + m for m = 1 to 25. For EPT this function is plotted by circles. Notice the smooth decay of the value of the function, indicating that prediction becomes progressively harder as the lookahead lengthens.
Figure 1: Predictability of Future Keystrokes as a Function of Lookahead Length

The transition analysis described above was also performed for the WPS work sample. Table 7 shows the transition probabilities for the 10 most frequently used functions. The overall picture of editing that emerges from these data is very similar to that for EPT; the most frequent transitions are generally along the diagonal. In Figure 1 the predictive association for the WPS sample is depicted as triangles. The general shape of the curve is very close to that for the EPT work sample; it starts at a slightly lower level, crosses over, and appears to level off at a somewhat higher level, but the overall shape of the curve is almost identical.
Table 7: WPS Keystroke Transition Probabilities (n = 1)

3.4. State Analysis

In addition to analyzing our data at the level of individual keystrokes, we can define sequences of keystrokes as editing states and analyze the data at that level. This gives a more global picture of user behavior. A state consists of a sequence of keystrokes relating to only one editing function. For example, Type state is defined as an uninterrupted string of text. Cursor state is defined as any uninterrupted string of cursor positioning operations. Erase state involves any string of keystrokes concerned with deleting text. Edit state is eclectic but includes any continuous string of keystrokes concerned with altering or moving text. Start state is automatically entered when the editor is turned on; Finish is always the last state in an editing session. Pause state is entered after 150 seconds of user inactivity and ends with the next keystroke. Finally, Help state involves asking for on-line documentation. By definition no state may follow itself and no state may be nested within another state.

Table 8 shows the state analysis for EPT, including the number of occurrences of each state, the percent of keystrokes in each state, and the mean number of keystrokes per state. The 739 Start states mean that there were 739 editing sessions in the work sample. On average, users in Type state tended to generate strings of 16.5 characters (this average is somewhat misleading: the frequency distribution is highly skewed, with a mode of 1 character and a very long tail -- the longest recorded string was 512 characters). Erase state consisted overwhelmingly of single-character-delete strokes, so the 3.6 keystroke average is very close to the mean number of characters deleted per state.
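The state definitions above can be turned into a simple segmentation pass over a keystroke log. The sketch below is illustrative only: the key names and the `KEY_CLASS` mapping are hypothetical, unlisted keys are assumed to be printing characters, and the keystroke that ends a Pause is assumed to open a new state rather than belong to the Pause itself. A real editor with a richer command syntax would need proper parsing.

```python
PAUSE_THRESHOLD_S = 150.0  # inactivity cutoff stated in the text

# Hypothetical mapping from logged key names to editing states.
KEY_CLASS = {
    "LEFT": "Cursor", "RIGHT": "Cursor", "UP": "Cursor", "DOWN": "Cursor",
    "DEL_CHAR": "Erase", "DEL_WORD": "Erase",
    "CUT": "Edit", "PASTE": "Edit",
    "HELP": "Help",
}


def classify(key):
    """Map a key name to a state; unlisted keys are assumed to be typing."""
    return KEY_CLASS.get(key, "Type")


def segment(log):
    """Split one session's (timestamp_seconds, key) log into states.

    Returns a list of (state_name, keystroke_count) in order of occurrence.
    """
    states = [("Start", 0)]
    prev_t = None
    for t, key in log:
        if prev_t is not None and t - prev_t > PAUSE_THRESHOLD_S:
            states.append(("Pause", 0))
        state = classify(key)
        if states[-1][0] == state:
            # Same function as the previous keystroke: extend the run.
            states[-1] = (state, states[-1][1] + 1)
        else:
            states.append((state, 1))
        prev_t = t
    states.append(("Finish", 0))
    return states


def mean_keystrokes(states, name):
    """Mean keystrokes per occurrence of the named state."""
    runs = [n for s, n in states if s == name]
    return sum(runs) / len(runs) if runs else 0.0
```

Because adjacent keystrokes of the same class are merged into one run, no state follows itself in the output, which matches the constraint stated above.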
Table 8: EPT State Summary Table

We are not able to present an identical analysis for WPS as of this writing because it has a much more complex command syntax. However, the mean length of uninterrupted character strings was 17.2, very close to the 16.5 figure for the EPT sample. Table 9 shows the percentage of keystrokes for the WPS sample falling into some of the states defined for EPT. Even this incomplete comparison reveals a striking similarity of usage patterns.
Table 9: WPS State Summary Table

3.5. Transitions Between States

We earlier discussed transitions between keystrokes. A similar analysis is possible at the state level. Table 10 shows a transition matrix for the EPT states. This matrix shows transitions between state n and state n + 1 for all states in the work sample. Notice that all the diagonal entries are zero; this follows from the definitions of the states themselves. For each cell in the matrix there are three entries. The first shows the row transition probability. The second entry shows the observed number of transitions and the third entry shows the number of transitions to be expected on the basis of the marginal totals (random expectation). Thus, from Type state there is a .57 probability of entering Erase state next and a .41 probability of entering Cursor state. For the most part, there is fairly close agreement between the obtained and expected transition frequencies.
Table 10: EPT Editing State Transition Matrix (n = 1)

From the transition frequencies we can compute the index of predictive association, as we did for the keystroke transitions. Further, we can compute this index for transitions n to n + m, as before. Figure 2 shows the index for up to 25 states ahead. The index falls off slowly as a function of number of states ahead, and the values appear to form an alternating series with even-numbered lookaheads consistently higher than adjacent odd-numbered lookaheads.

Table 11 helps explain this pattern. It shows a transition matrix for transitions between states n and n + 2. Notice the high transition probabilities along the diagonal and the corresponding high discrepancy between the observed and expected values along the diagonal. This same pattern is evident in the other even-numbered transition matrices. It means that when users leave a state, they tend to return directly to it after they have entered a single intervening state.
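The alternation effect can be checked directly on a sequence of state names by measuring how often the state at position n + m equals the state at position n. This is a simplified stand-in for the diagonal of the lag-m transition matrix, not the authors' procedure, and the function name is hypothetical.

```python
def return_rate(states, lag=2):
    """Fraction of positions n where states[n + lag] equals states[n]."""
    pairs = list(zip(states, states[lag:]))
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)
```

Since no state may follow itself, the rate at lag 1 is zero by construction, so a high rate at lag 2 reflects users returning to the state they just left.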
Figure 2: Predictability of Future States as a Function of Lookahead Length
Table 11: EPT Editing State Transition Matrix (n = 2)

4. Discussion

4.1. General Conclusions

A number of general conclusions about screen-oriented text editing are suggested by the results. The distribution of use across editing functions is grossly non-uniform. Certain functions are used constantly, others almost never. A rough rule of thumb that emerges from our data is that free usage text editing consists of about 1/2 typing, about 1/4 cursor movement, about 1/8 deletion, and 1/8 all other functions put together.

This rule has been shown to apply in two different situations: knowledge workers creating documents with an experimental text editor, and secretaries transcribing and updating documents on a commercial word processing system. Considering the differences in the editors, the workers, and the tasks, the correspondence between the work samples is striking. It implies that it may be possible to optimize a single editor for a variety of users.

Another finding to emerge from the state analysis is that users tend to alternate editing subtasks; they tend to return to the state they left one state before. For example, if a user is typing, he tends to leave that state to do one other thing (perhaps deleting) and then immediately returns to typing.

4.2. Extending the State Analysis

The set of states presented here was sufficient for analyzing EPT, but not for analyzing more powerful editors. Two new states, Format and Manage, can handle these more sophisticated functions. Format includes formatting commands that alter the outward appearance of a document. Manage includes commands for managing multiple files and windows while within an editor.

Since EPT was a simple editor, it was easy to build the necessary state analysis programs. State analysis is much more difficult than keystroke analysis for any large production editor due to the extensive parsing that must be done.
This problem is alleviated in situations where the analysis can be done using the code from the editor's parser.

4.3. Relation to Past Research

The analysis gives us a picture of editing based on actual usage data, rather than on some abstract model. As such, the data presented here provide a good test of the relevance of these abstract models and can be seen as a complement to them.

4.4. Implications for Editor Design

Data such as these can be used to drive the software architecture of text editors. Raw frequency-of-use data indicate which aspects of editor design should be optimized. For instance, cursor placement is the single most time-consuming editing function next to typing itself. Thus, the mechanics of cursor positioning deserve considerable engineering attention and optimization. In addition, the transition data might be used to tune editor software to the behavior patterns of users. For example, faster performance could be obtained by reordering the alternatives within a parser, and by improving the locality of reference within the editor.

4.5. Implications for Keyboard Design

Beyond their implications for editor design, these data provide a sound empirical basis for the application of McCormick's [8] principles of keyboard design. In order to place keys in terms of either their overall usage frequency or the frequency of transitions between keystrokes, one must have data of the type provided here.

4.6. Implications for Benchmark Tests

Benchmarking is an appropriate testing method for text editors when there is an interest in making comparisons among existing editors, or improving the performance of an editor which may be under development. The best way one can develop a realistic set of benchmark tasks for a text editor is on the basis of actual usage statistics. Roberts has made the first attempt at developing such benchmark tasks. These tasks include a test of speed of use for experts, developed on the basis of an informal study.
In order to evaluate the representativeness of Roberts' benchmarks, two EPT experts performed Roberts' expert tasks. Their data are compared with free-use EPT behavior in Table 12. Table 12 shows that users spent much more time in the Edit and Cursor states when performing Roberts' tasks than they did in free use.
Table 12: EPT State Usage as Percentage of Working Time

An increased knowledge of actual user behavior should enable researchers to develop benchmarks which more closely reflect actual behavior. In addition, given the similarities in user behavior, newly developed benchmarks should be widely applicable.

Footnote

VAX, VMS, VT100, DEC, and PDP are trademarks of Digital Equipment Corporation.

References
Copyright © 1982 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org.

This is a digitized copy derived from an ACM copyrighted work. ACM did not prepare this copy and does not guarantee that it is an accurate copy of the author's original work.