Publication number: WO1997023816 A1
Publication type: Application
Application number: PCT/IB1996/001253
Publication date: Jul 3, 1997
Filing date: Nov 19, 1996
Priority date: Dec 21, 1995
Inventors: Michael P. Whelan, Hsiang-Lung Wu
Applicant: Philips Electronics N.V., Philips Norden Ab
User identification system for data processing equipment with keyboard
Abstract
User keystrokes and keystroke timings are continually captured during computer use. The captured keystrokes and timings are compared with a user profile of keystrokes and timings. A statistical analysis determines what should be in the profile and how the captured keystrokes and timings match up with the stored ones. An error action is initiated if the statistical analysis indicates that the user is not the one who is supposed to be using the computer.
CLAIMS:
1. A computer method for identifying a computer user, the method comprising the following steps:
- receiving user data at a keyboard;
- continually seeking to match entered sequences of keystrokes with stored predetermined sequences of keystrokes;
- continually capturing entered keystroke timings;
- comparing
  - particular entered keystroke timings, in a sequence of entered keystrokes that matches one stored predetermined sequence of keystrokes; and
  - stored keystroke timings that correspond to the one stored predetermined sequence of keystrokes; and
- executing an error action when the particular entered keystroke timings fail to compare properly with the stored keystroke timings.
2. The method of claim 1 wherein comparing comprises using a statistical formula to analyze a statistical likelihood that the particular entered keystroke timings were entered by a same user as the stored keystroke timings corresponding to the one stored predetermined sequence of keystrokes.
3. The method of claim 1 wherein the statistical formula is
[formula not reproduced in the source text]
where
- i is an index running from one to three;
- T1(i) is the ith particular entered keystroke timing;
- T2(i) is an average value of timings of the ith keystroke taken during an initialization phase and corresponding to the one stored predetermined sequence of keystrokes;
- E(T1) is the mean of the T1(i) values;
- E(T2) is the mean of the T2(i) values;
- σ(T1) is the variance of the T1(i) values;
- σ(T2) is the variance of the T2(i) values.
4. The method of claim 3 wherein the particular entered keystroke timings are considered to match when correlation(T1, T2) exceeds a predetermined threshold.
5. The method of claim 4 wherein the threshold is E-0.5σ, where E is the mean of the keystroke timings for the one stored predetermined sequence of keystrokes and σ is the variance of the keystroke timings for the one stored predetermined sequence of keystrokes.
6. The method as claimed in any of claims 1 to 5, wherein the error action is triggered after several failures of entered keystroke timings to compare properly with stored keystroke timings.
7. The method as claimed in any of claims 1 to 6 further comprising the step of maintaining a list of recently used sequences of keystrokes; and wherein executing the error action comprises only executing the error action when a number of sequences of keystrokes in the list, which compared properly with the stored keystroke timings, fails to meet a predetermined threshold.
8. A computer system comprising
- a processor;
- a keyboard;
- means for, in response to data entered by a user on the keyboard,
  - continually seeking to match entered sequences of keystrokes with stored predetermined sequences of keystrokes; and
  - continually capturing entered keystroke timings;
- means for comparing
  - particular entered keystroke timings, in a sequence of entered keystrokes that matches one stored predetermined sequence of keystrokes; and
  - stored keystroke timings that correspond to the one stored predetermined sequence of keystrokes; and
- code means for executing an error action when the particular entered keystroke timings fail to compare properly with the stored keystroke timings.
9. The system of claim 8 wherein the comparing means comprises statistical formula means for analyzing a statistical likelihood that the particular entered keystroke timings were entered by a same user as the stored keystroke timings corresponding to the one stored predetermined sequence of keystrokes.
10. The system of claim 9 wherein the statistical likelihood is analyzed according to the following formula
[formula not reproduced in the source text]
where
- i is an index running from one to three;
- T1(i) is the ith particular entered keystroke timing;
- T2(i) is an average value of timings of the ith keystroke taken during an initialization phase and corresponding to the one stored predetermined sequence of keystrokes;
- E(T1) is the mean of the T1(i) values;
- E(T2) is the mean of the T2(i) values;
- σ(T1) is the variance of the T1(i) values;
- σ(T2) is the variance of the T2(i) values.
11. The system of claim 10 wherein the particular entered keystroke timings are considered to match when correlation(T1, T2) exceeds a predetermined threshold.
12. The system of claim 10 wherein the threshold is E-0.5σ, where E is the mean of the keystroke timings for the one stored predetermined sequence of keystrokes and σ is the variance of the keystroke timings for the one stored predetermined sequence of keystrokes.
13. The system as claimed in any of claims 8 to 12 wherein the error action is triggered after several failures of entered keystroke timings to compare properly to stored keystroke timings.
14. The system as claimed in any of claims 8 to 13 further comprising means for maintaining a list of recently used sequences of keystrokes; and wherein the means for executing the error action only executes the error action when a number of sequences of keystrokes in the list, which compared properly with the stored keystroke timings, fails to meet a predetermined threshold.
Description

User identification system for data processing equipment with keyboard.

BACKGROUND OF THE INVENTION

The invention relates to the field of computer security and in particular to user identification.

In the past, a number of techniques have been proposed to improve computer security. For instance, passwords have been used to identify users as they sign on to computer systems.

Password techniques have the limitation that once a user signs on, there is no further checking as to the identity of the user. If the user leaves the computer without signing off, another user can come by and use the computer for nefarious purposes. To obviate this problem, some systems have screen saver software which puts a system into screen saver mode when a user has not been active on the computer and then requires the user to enter a password to get out of the screen saver. However, since the screen saver takes some time to kick in, a crafty miscreant can still dart into a departed user's place and do his dastardly deeds.

Another problem with password-based systems is that would-be intruders into the computer system may learn a password and enter that password in the genuine user's stead. Proposed solutions to this problem appear in R. J. Spillane, "Keyboard Apparatus for Personal Identification", IBM Tech. Discl. Bull., vol. 17, No. 11, April 1975 and GB 2,247,964A. Both of these systems propose looking at time intervals between key depressions during entry of a standard code or a password. These systems give some additional assurance that the password is being entered by the correct person; however, they still have the problem that, once a user is signed on, there is no additional verification.

SUMMARY OF THE INVENTION

The object of the invention is improved user identification for computer security in keyboard based systems. The object is achieved in that keystrokes are continually analyzed to determine if they match stored sequences and to determine whether their timings match stored timings for the stored sequences.

BRIEF DESCRIPTION OF THE DRAWING

The invention will now be described by way of non-limitative example with reference to the following drawings.

Fig. 1 is a computer system equipped with the invention.

Fig. 2 is a flowchart of an initialization routine for the invention.

Fig. 3 is a flowchart of a first embodiment of the invention.

Fig. 4 is a flowchart of a second embodiment of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Fig. 1 shows a computer system on which the invention can run. The system includes equipment 101 to be guarded, which takes data from a keyboard 104. A user identification unit 105 processes data received from the equipment 101. The user identification unit 105 has a database 102 of security information. When data received from the equipment to be guarded 101 does not correlate with data in the database 102, the user identification unit 105 executes an error action 106. That action might be to lock the equipment 101 or to send an error message to another machine, for instance where a system administrator is sitting. The method can be run on a system where the user identification unit is incorporated within the equipment to be guarded. However, sometimes better security is achieved if the user identification unit is separate from the equipment to be guarded.

Fig. 2 shows a flow chart of an initialization phase of the invention. During this phase, a database is created for a particular user. The user is asked to type a set of words or keystrokes. The words in the set should contain three-letter sequences which are expected to be entered frequently by the user under actual operating conditions. Three-letter sequences, or triplets, have been found to work well experimentally. However, sequences of other lengths could also be used.

The set could contain commonly used English words, or the list could contain commonly used computer commands or codes. For users of a UNIX based system, the set should preferably include commonly used UNIX operating system commands. Preferably the set should contain repetitions of the three letter sequences which are thought to be most useful, so that the user is given several opportunities to enter relevant triplets. In general the set should contain triplets which are "statistically repetitive". The term "statistically repetitive" will be defined below.

Block 200 starts the operation. So long as there are more words to type (201Y), the system analyzes input at 202. At 206, the computer orders the user to enter a word. The loop 207/208 comprises the key-wise analyzing of the word, and the detecting whether the word has terminated effectively. The analyzed data include keystroke identifications and timings for each keystroke. Words are delimited by finding a sequence of keystrokes ending with a carriage return, a space, or a long time interval without keystrokes. The system then breaks the entered keystrokes for each word into triplets at 203.

Each word can be represented by a set, {C(i) | 1 ≤ i ≤ the length of the word}, where C(i) is the ith character of this word. The keystroke timings for this word are represented by a set, {T(i) | 1 ≤ i ≤ the length of the word}, where T(i) is the time elapsed between either entry of a previous keystroke or issuance of a system prompt, and entry of the current keystroke.
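The word-delimiting rule described above can be sketched as follows. This is only an illustration, not the patented implementation: the `(character, gap)` event format, the function name `split_words`, and the 2-second `max_gap` default are all assumptions, since the text does not say how long a "long time interval" is.

```python
from typing import List, Tuple

# Hypothetical event format: (character, seconds elapsed since the previous
# keystroke or system prompt), i.e. the pair (C(i), T(i)) from the text.
Keystroke = Tuple[str, float]

def split_words(events: List[Keystroke], max_gap: float = 2.0) -> List[List[Keystroke]]:
    """Group keystrokes into words.

    A word ends at a carriage return, a space, or a pause longer than
    max_gap seconds (the pause length is an assumed value).
    """
    words: List[List[Keystroke]] = []
    current: List[Keystroke] = []
    for char, gap in events:
        # A long pause closes the word that was being typed.
        if current and gap > max_gap:
            words.append(current)
            current = []
        if char in ("\r", "\n", " "):
            # Delimiters close the current word and are not stored.
            if current:
                words.append(current)
            current = []
        else:
            current.append((char, gap))
    if current:
        words.append(current)
    return words
```

For example, typing `ls`, a space, `cat`, and a carriage return yields two words, each retaining its per-key timings.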

A six-letter word, for instance, will contain four triplets, {C(1),C(2),C(3)}, {C(2),C(3),C(4)}, {C(3),C(4),C(5)}, and {C(4),C(5),C(6)}. Each triplet is represented in the same form as the word, but the keystroke timings are converted into values relative to the time when the first character of this triplet is typed. Any word shorter than three characters is ignored by the system.
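A minimal sketch of the triplet extraction follows. It assumes each word is a list of `(character, gap)` pairs, where the gap is T(i), and it interprets "relative to the time when the first character is typed" as cumulative offsets from the first key of the triplet, so the first offset is always zero. That interpretation, and the name `extract_triplets`, are assumptions rather than details given in the text.

```python
from typing import List, Tuple

def extract_triplets(word: List[Tuple[str, float]]) -> List[Tuple[str, List[float]]]:
    """Return every run of three consecutive keys with relative timings.

    word is a list of (character, gap) pairs. Words shorter than three
    characters produce no triplets.
    """
    result = []
    for k in range(len(word) - 2):
        chars = "".join(c for c, _ in word[k:k + 3])
        second = word[k + 1][1]            # offset of 2nd key from the 1st
        third = second + word[k + 2][1]    # offset of 3rd key from the 1st
        result.append((chars, [0.0, second, third]))
    return result
```

A six-letter word such as `secure` gives the four triplets `sec`, `ecu`, `cur`, and `ure`, matching the count stated above.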

At 204, relevant timings are entered into the user's profile. The user profile should contain records of statistically repetitive triplets for that user.

When the list of words is complete, the system goes to block 209. There, for each statistically repetitive triplet (see 209), three mean values are stored at 210; these are the means of the timings of the keystrokes for each character of that triplet. Also, for each statistically repetitive triplet, a variance value is stored at 210, representing the variance of the timings of all of the recorded keystrokes for that triplet. When all processing has finished, at 205, initialization ends.
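The stored profile record for one triplet might be computed as sketched below. The dictionary layout and the name `build_profile_entry` are hypothetical, and the use of the population variance (rather than the sample variance) is an assumption, since the text does not say which is meant.

```python
from statistics import mean, pvariance
from typing import Dict, List

def build_profile_entry(samples: List[List[float]]) -> Dict[str, object]:
    """Summarize repeated recordings of one triplet.

    samples holds one [t1, t2, t3] timing list per time the user typed the
    triplet during initialization.
    """
    # One mean per character position of the triplet.
    means = [mean([s[i] for s in samples]) for i in range(3)]
    # A single variance over every recorded timing of this triplet.
    all_timings = [t for s in samples for t in s]
    return {"means": means, "variance": pvariance(all_timings)}
```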

Fig. 3 shows a flow chart of the operation of the invention after initialization. After start block 300, this flow chart is triggered by entry of a keystroke at 301. If a keystroke is detected, its identity and timing are captured at 302. Keystrokes are captured in sequence until a word is identified (see 303). The word is then broken into triplets, cf. the analogous operation at box 203 of Fig. 2. If, at 304, any of the triplets matches 310 a stored triplet developed in Fig. 2, the timings of the matched triplets are correlated with the timings of the stored triplets at 305. If no stored triplets are matched at 310, control returns to 301.

Given two sets of keystroke timings {T1(i) | 1 ≤ i ≤ 3} and {T2(i) | 1 ≤ i ≤ 3} of the same triplet, the function E(x) denoting the mean of a variable x, and the function σ(x) denoting the variance of a variable x, the correlation of the two sets of timings is given by

[formula not reproduced in the source text]

and calculated in block 305.
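The formula itself did not survive in this text. Assuming it is the standard Pearson correlation coefficient written with the symbols defined above (E for the mean, σ for the variance, i running from 1 to 3), it would read as follows. This is a reconstruction under that assumption, not the wording of the original filing.

```latex
\[
\operatorname{correlation}(T_1, T_2)
  = \frac{\tfrac{1}{3}\sum_{i=1}^{3}\bigl(T_1(i) - E(T_1)\bigr)\bigl(T_2(i) - E(T_2)\bigr)}
         {\sqrt{\sigma(T_1)\,\sigma(T_2)}},
\qquad
\sigma(T) = \tfrac{1}{3}\sum_{i=1}^{3}\bigl(T(i) - E(T)\bigr)^{2}.
\]
```

With this form the value lies between -1 and 1, which is consistent with the similarity thresholds close to 1 used in the text.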

Two sets of keystroke timings are similar if the correlation between them is larger than a value close to 1 (e.g. 0.99). A triplet is "statistically repetitive" if the mean of the correlations of various versions of it is close to 1 (one) and the variance of correlations is small.
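The similarity test and the "statistically repetitive" criterion can be sketched as below, assuming the correlation is the ordinary Pearson coefficient computed with population variances. The function names, the pairwise comparison of recorded versions, and the `min_mean` and `max_variance` cut-offs are illustrative assumptions; only the 0.99 example value comes from the text.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pvariance
from typing import List, Sequence

def correlation(t1: Sequence[float], t2: Sequence[float]) -> float:
    """Pearson correlation of two equally long timing sequences."""
    e1, e2 = mean(t1), mean(t2)
    cov = mean([(a - e1) * (b - e2) for a, b in zip(t1, t2)])
    denom = sqrt(pvariance(t1) * pvariance(t2))
    # Constant timings carry no shape information; treat as uncorrelated.
    return cov / denom if denom > 0 else 0.0

def is_similar(t1: Sequence[float], t2: Sequence[float], threshold: float = 0.99) -> bool:
    """Two timing sets are similar if their correlation exceeds the threshold."""
    return correlation(t1, t2) > threshold

def is_statistically_repetitive(samples: List[Sequence[float]],
                                min_mean: float = 0.99,
                                max_variance: float = 1e-4) -> bool:
    """Check that recorded versions of a triplet correlate consistently.

    Compares every pair of versions (an assumed reading of "various versions").
    """
    corrs = [correlation(a, b) for a, b in combinations(samples, 2)]
    if not corrs:
        return False
    return mean(corrs) >= min_mean and pvariance(corrs) <= max_variance
```

Because the correlation is insensitive to uniform scaling, a user who types the same rhythm slightly faster or slower still scores close to 1.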

If the correlation exceeds a threshold at 306, control returns to 301. If the correlation does not exceed the threshold an error action can be commenced at 307 (cf box 106 of Fig. 1). Alternatively, the system could wait until the threshold has not been met several times before triggering an error action.
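The variant that waits for several misses before acting could look like the sketch below. The class name `FailureMonitor`, the default of three failures, and the choice to count failures cumulatively rather than consecutively are all assumptions; the text only says "several times".

```python
class FailureMonitor:
    """Trigger the error action only after several below-threshold correlations."""

    def __init__(self, threshold: float = 0.99, max_failures: int = 3) -> None:
        self.threshold = threshold
        self.max_failures = max_failures   # assumed value for "several"
        self.failures = 0

    def observe(self, corr: float) -> bool:
        """Record one correlation; return True when the error action is due."""
        if corr > self.threshold:
            return False
        self.failures += 1
        return self.failures >= self.max_failures
```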

A statistically more reliable alternative is to trigger the error action after a series of correlation values fail to meet the threshold. In such an alternative, a number of yes or no votes is to be recorded with a list of most recently used words. If the ratio of yes to no votes is too small after a given number of matches, the error action would be triggered.

Fig. 4 illustrates this alternative. If this is the first run since initialization per Fig. 2 after start block 400, at 401 a most recently used list must be initialized. The most recently used list should be initialized to some or all of the triplets in the personal profile, along with yes votes. During the running of the identity checker of the invention, the most recently used list will be updated. The most recently used list will be retained from session to session. After 401, the steps of boxes 301, 302, and 303 from Fig. 3 are executed. At 403 a retrieved word is broken into triplets and stored in the list. At 404, the system checks whether there are more triplets to check in the list. If no, control returns to 402. If yes, at 405, the system tests whether the current triplet is in the user's personal profile. If so, the most recently used list is updated at 406. The method of updating the most recently used list will be explained below. Then at 407, the correlation of the current triplet is calculated as per the above equation. Then, as a result of the correlation calculation, "yes" and "no" votes are updated at 408. Updating of "yes" and "no" votes will be explained below together with the explanation for the updating of the most recently used list.

At 409, the system calculates whether a ratio of "yes" to "no" votes exceeds a threshold. If not, the error action 307 is triggered. If so, the current triplet is deleted from 410 and control is returned to 404. The threshold can be any ratio which works in practice. A ratio of 80% was found to work in one experiment.
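The vote check at 409 can be sketched as follows. It assumes the 80% figure means the fraction of "yes" votes among all recorded votes, which is one possible reading of "ratio of yes to no votes"; the function name `votes_pass` is hypothetical.

```python
from typing import Sequence

def votes_pass(votes: Sequence[bool], min_yes_fraction: float = 0.8) -> bool:
    """Return True if the share of "yes" votes exceeds the threshold.

    votes holds one boolean per entry in the most recently used list
    (True for "yes"). An empty list passes, so no error is raised before
    any evidence has been collected (an assumed choice).
    """
    if not votes:
        return True
    return sum(votes) / len(votes) > min_yes_fraction
```

With a list of 20 entries, for instance, 17 "yes" votes pass while 15 do not, at which point the error action would be triggered.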

Updating of the most recently used list and the "yes/no" votes will now be explained. The idea is to keep a running count of whether the user is typing as expected. Therefore several values must be compared. Below, the variable T represents a current time and the variable N represents the length of the list of most recently used triplets. The variable N can take on any value which works in practice. In one experiment, a value of 20 for N was found to give reasonable results. At time T, the N most recently used triplets list might look like the following.

Triplet      Time of most recent appearance   "yes" or "no"
triplet1     T-100 sec                        yes
triplet2     T-95 sec                         yes
triplet3     T-92 sec                         no
...
tripletN     T-10 sec                         yes

When tripletN+1 is entered, the system has to check whether the new triplet is in the most recently used list. If not, the update will take the following form.

Triplet      Time of most recent appearance   "yes" or "no"
triplet2     T-95 sec                         yes
triplet3     T-92 sec                         no
...
tripletN     T-10 sec                         yes
tripletN+1   T sec                            no

In other words, the oldest triplet is removed from the list, and the most recent one is added. The second column of this table is updated at 406, while the third column is updated at 408. If, on the other hand, the newest triplet already appears in the list, a different type of updating is undertaken. If, for instance, tripletN+1 = triplet3, the list will look as follows:

Triplet      Time                             "yes" or "no"
triplet1     T-100 sec                        yes
triplet2     T-95 sec                         yes
triplet4     T-85 sec                         yes
...
tripletN     T-10 sec                         yes
tripletN+1   T sec                            no

In this way, there are always N triplets in the most recently used list, and there are no duplicate triplets in the list.

Yet another alternative would be to establish multiple yes/no ratio thresholds. In such a case, the error action would be initiated if either a small number of correlations gave highly aberrant values or a larger number of correlations gave slightly aberrant values.

Sometimes user patterns of typing vary over time. For instance, as a user becomes more experienced, the timings will decrease. If a user acquires new software, the sequences of keystrokes that are statistically repetitive may change. Therefore, it may be desirable to allow the system to update user profiles over time.
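The update rule for the most recently used list, as described with the three tables above, can be sketched with an ordered dictionary. The class and method names are hypothetical; the default capacity of 20 is the experimental value of N mentioned in the text.

```python
from collections import OrderedDict
from typing import List, Tuple

class RecentTriplets:
    """Fixed-size, duplicate-free list of the most recently typed triplets."""

    def __init__(self, capacity: int = 20) -> None:
        self.capacity = capacity
        # Maps triplet -> (time of most recent appearance, vote); oldest first.
        self.entries: "OrderedDict[str, Tuple[float, bool]]" = OrderedDict()

    def update(self, triplet: str, timestamp: float, vote: bool) -> None:
        """Record a new appearance of a triplet with its yes/no vote."""
        if triplet in self.entries:
            # Already listed: drop the old row so the triplet moves to the end.
            del self.entries[triplet]
        elif len(self.entries) >= self.capacity:
            # Not listed and the list is full: remove the oldest triplet.
            self.entries.popitem(last=False)
        self.entries[triplet] = (timestamp, vote)

    def votes(self) -> List[bool]:
        """Return the current votes, oldest entry first."""
        return [vote for _, vote in self.entries.values()]
```

Either branch leaves the list without duplicates and never longer than its capacity, which is the invariant stated above.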

Patent Citations
Cited Patent | Filing date | Publication date | Applicant | Title
GB2247964A * |  |  |  | Title not available
US4621334 * | Aug 26, 1983 | Nov 4, 1986 | Electronic Signature Lock Corporation | Personal identification apparatus
US4805222 * | Dec 23, 1985 | Feb 14, 1989 | International Bioaccess Systems Corporation | Method and apparatus for verifying an individual's identity
Non-Patent Citations
Reference
1 * R. J. Spillane, "Keyboard Apparatus for Personal Identification", IBM Technical Disclosure Bulletin, vol. 17, No. 11, April 1975.
Referenced by
Citing Patent | Filing date | Publication date | Applicant | Title
WO2006053991A1 * | Nov 18, 2005 | May 26, 2006 | Laurent Guyot-Sionnest | Method and device for controlling and inputting data
DE102004049428A1 * | Oct 8, 2004 | Apr 20, 2006 | Claudia Von Heesen | Automatic identification and verification of user data to provide access by a user to electronic equipment
EP1470549A1 * | Dec 12, 2001 | Oct 27, 2004 | International Business Machines Corporation | Method and system for non-intrusive speaker verification using behavior models
EP1512113A2 * | May 14, 2003 | Mar 9, 2005 | Biocom, LLC | Identity verification system
EP1512113A4 * | May 14, 2003 | Oct 29, 2008 | Biocom Llc | Identity verification system
EP2069993A2 * | Oct 4, 2007 | Jun 17, 2009 | Behaviometrics AB | Security system and method for detecting intrusion in a computerized system
EP2069993A4 * | Oct 4, 2007 | May 18, 2011 | Behaviometrics Ab | Security system and method for detecting intrusion in a computerized system
US7689418 | Sep 12, 2002 | Mar 30, 2010 | Nuance Communications, Inc. | Method and system for non-intrusive speaker verification using behavior models
US8125440 | Nov 18, 2005 | Feb 28, 2012 | Tiki'labs | Method and device for controlling and inputting data
US8443443 | Oct 4, 2007 | May 14, 2013 | Behaviometrics Ab | Security system and method for detecting intrusion in a computerized system
Classifications
International Classification: G06F21/55, G06F21/31, G07C9/00, G06F3/023
Cooperative Classification: G06F3/023, G07C9/00142, G06F21/316, G06F21/554
European Classification: G06F21/31B, G06F21/55B, G06F3/023, G07C9/00C2B
Legal Events
Date | Code | Event | Description
Jul 3, 1997 | AK | Designated states | Kind code of ref document: A1; Designated state(s): JP
Jul 3, 1997 | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE
Sep 24, 1997 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Jul 15, 1998 | 122 | Ep: pct application non-entry in european phase |
Sep 4, 1998 | NENP | Non-entry into the national phase in: | Ref country code: JP; Ref document number: 97523449; Format of ref document f/p: F