US 20070011003 A1
A microphone apparatus includes a code generator that produces a code output and communicates with a user computer over a channel via which the microphone apparatus provides an electrical signal to the user computer. The microphone apparatus is used by the user such that the user provides authentication information comprising a user code that is generated by the code generator of the microphone apparatus.
1. A microphone apparatus for use with a user computer, the microphone apparatus comprising:
a microphone transducer that converts speech input to an electrical signal and provides the electrical signal to the user computer over a channel;
a code generator that provides a code output to the user computer over the channel.
2. The microphone apparatus as defined in
3. The microphone apparatus as defined in
4. The microphone apparatus as defined in
5. The microphone apparatus as defined in
6. The microphone apparatus as defined in
7. The microphone apparatus as defined in
8. The microphone apparatus as defined in
9. The microphone apparatus as defined in
10. The microphone apparatus as defined in
11. The microphone apparatus as defined in
12. The microphone apparatus as defined in
13. The microphone apparatus as defined in
14. The microphone apparatus as defined in
15. A microphone apparatus for use with a user computer, the microphone apparatus comprising:
a microphone transducer that converts speech input to an electrical signal and provides the electrical signal to the user computer over a channel;
code means for providing a code output to the user computer over the channel.
16. The microphone apparatus as defined in
17. The microphone apparatus as defined in
18. The microphone apparatus as defined in
19. The microphone apparatus as defined in
20. The microphone apparatus as defined in
21. The microphone apparatus as defined in
22. The microphone apparatus as defined in
This application is a divisional application of co-pending application Ser. No. 10/023,923 filed Dec. 18, 2001 to Z. Shpiro, et al. which claims priority from U.S. Provisional Patent Application Ser. No. 60/256,558 entitled “Access Control for Interactive Learning System” by Z. Shpiro et al., filed Dec. 18, 2000. Priority of the filing dates of these applications is hereby claimed, and the disclosures of these applications are hereby incorporated by reference.
1. Field of the Invention
This invention relates generally to access control for computer network resources and, more particularly, to controlling access to a network location that provides interactive learning processing.
2. Description of the Related Art
As commerce becomes more global, the need to understand second languages and to communicate in them is growing. The Foreign Language/Second Language training industry is therefore expanding rapidly, and is now investigating how to apply new technologies, such as the Internet, to such training. Current language training product elements include printed materials, audio cassettes, software applications, video cassettes, and Internet sites through which information and distance learning lessons are provided. Several attempts have been made to apply various Foreign Language/Second Language training processes to the Internet world, but most of them are simple conversions of printed, audio, and video material into a computer client-server application; that is, the Internet applications typically offer no new features beyond those offered by conventional media.
The publishing industry involved with Foreign Language/Second Language training is vulnerable to lost revenue due to forgeries, lending, photocopying, and second-hand purchases of their printed training materials. A forgery occurs when someone makes an unauthorized copy of the original training materials, such as by illicit photocopying. The forgeries may be passed off as genuine, authorized materials. When a forgery is sold, the publisher receives no compensation and all revenue from the forgery is collected by the seller of the forged copy. Lending losses occur because copies of original materials are loaned to third parties, who then need not purchase the source materials. Similarly, unauthorized photocopying of original materials results in reduced demand for the materials. Second-hand purchases deprive publishers of revenue because the second-hand seller receives the revenue from such sales. All of these uses of original materials are either unauthorized or currently beyond the control of the publishers, and all reduce the publisher's revenue. It would be advantageous if producers of language training materials could capture some of the lost income from such uses of their printed materials.
Modern computer technology can provide a network implementation of software applications to make on-line versions of the training materials available, thereby reaching larger numbers of users. Computer technology can also be used to supplement and enhance the presentation of training materials. Network access to such training materials is conveniently implemented via the Internet. Because on-line access is so easily obtained, the opportunity for unauthorized usage of the applications and materials in the new medium is also greater. Speaker recognition technology is a potentially powerful means of increasing the efficiency, quality, and enjoyment of language instruction through on-line access. There are many applications, in different areas (such as credit card transaction authorizations, security access, password protection for access to computerized systems, etc.), where speaker recognition technology is being applied as a security measure to ensure proper identification of a user.
A variety of speaker recognition products are currently offered by companies such as SpeechWorks International, Inc. of Boston, Mass., USA and Dialogic Corporation of Parsippany, N.J., USA, and the like. Speaker recognition technology also is currently being offered by companies such as ITT SpeakerKey, NetKey and WEBKey, Lucent Speaker Verification, and “SpeakEZ” from T-Netix, Inc. of Englewood, Colo., USA. An example of a commercial application is the integration of speaker verification into the “Mac OS 9” operating system by Apple Computer, Inc. of Cupertino, Calif., USA for voice verification of user access to the computer operating system. In such systems, access is denied until a speaker recognition process is completed.
The phenomena of photocopying, second-hand purchase, lending, and forgery are significant problems for the publishing industry. The publishing industry suffers significant losses of potential income due to the significant rise in second-hand sales of previously used materials, and due to second-hand purchases of books and the lending and photocopying of books, primarily in the educational sector. There are jurisdictions which advocate the enactment of lending and photocopying laws. Many of these laws might benefit the consumer, but will be highly detrimental to the publisher, because they will result in a decrease of purchases of original materials from the publisher. In addition, the forgery phenomenon prevalent in the designer clothing industry has infiltrated the publishing industry as well, resulting in serious profit losses to the publisher.
Internet-based distance learning techniques are being used, where an instructional provider maintains an Internet location such as a Web site and users visit the teaching Web site to receive both instruction and assessment of skills. As noted above, however, the opportunity for fraudulent use of such learning sites is great. For example, an authorized user may gain access to the Web site for a computer learning session, but then may leave the computer and a different student may continue, taking the place of the authorized student. This is undesirable for at least two reasons: first, the performance that is viewed by the service provider is not the performance of the actual student to whom it is attributed; and secondly, at least two persons are utilizing the learning site, although only one is providing payment or being charged, resulting in revenue loss for the Web site provider.
From the discussion above, it should be apparent that there is a need for a publishing product that incorporates both printed and Internet materials and that can be used only by authorized persons. Such access control would permit the publisher to benefit from licensing fees and thus earn income from users who acquired their products from sources other than the publisher. The present invention fulfills this need.
The present invention provides a technique to control access to computer network resources at a computer facility by permitting a user to interact with the computer facility through a computer node of a network, wherein the user interaction comprises language learning responses submitted to the computer facility through the computer node, and by performing a user authentication process to determine if the permitted user interaction is authorized and determining whether the permitted user interaction should be continued, if the user is determined not to be authorized, wherein the user authentication process is performed with user authentication information that is obtained by the computer facility during the permitted user interaction and also with user authentication information extracted from the user's language learning responses. In this way, user authentication occurs without intruding into the utilization of the computer facility.
The user authentication can occur as a result of speaker recognition processes that utilize speech information collected from the user who currently has access to the computer facility. The access control is especially suited to language training systems that collect speech information from users as part of their normal operation. In this way, the invention permits publishers of materials to incorporate both printed and Internet materials at a computer facility with confidence that the computer facility will be usable only by authorized persons.
In another aspect of the invention, a user who is determined by the system to be an unauthorized user will be invited to become an authorized user, such as by paying an additional registration fee. In this way, users are unaware of any explicit user identification checking operations being carried out, and once unauthorized users are discovered, they are invited to become authorized users and continue with their learning process in exchange for paying a fee. Thus, unauthorized persons are not immediately halted from using the system, but instead are treated as an opportunity for additional selling.
In yet another aspect of the invention, an input device such as a microphone apparatus can be offered for purchase wherein the microphone apparatus includes a code generator that produces a code output and communicates with the user computer over a channel via which the microphone apparatus provides an electrical signal to the user computer. In another aspect of the invention, the microphone apparatus is used by the user such that the user authentication information comprises a user code that is generated by the code generator of the microphone apparatus.
Other features and advantages of the present invention should be apparent from the following description of the preferred embodiment, which illustrates, by way of example, the principles of the invention.
If the user speech information data has not been previously entered, then the server computer 110 will receive voice data from the identified user 102 during the current communication session as a result of the user's speaking at the user client node 104. The server computer determines whether to permit continued access to the computer network facility 108 by the identified user in response to determining whether or not the user is an authorized user by using the speaker recognition techniques. The present invention thereby controls on-line access to a computer facility by granting access to a user and then unobtrusively performing user authentication with speaker recognition technology while the user is utilizing the computer facility.
In the embodiment illustrated in
The user 102 will respond to the received learning modules by producing speech 126 that will be received by a microphone apparatus 128 of the Personal Computer 104. In addition, the user may provide input to the computer facility 108 with keyboard and display mouse devices of the Personal Computer. During such computer interaction with the user, the computer 104 will convert the user's speech 126 into speech information, in a manner known to those skilled in the art, and will provide the learning facility 108 with that data. The speech information will then be compared by the Speaker Recognition processor 112 against speech information stored in the Authorization database 114. Such comparison techniques are known to those skilled in the art. The Speaker Recognition processor 112 and Learning Server 110 may comprise separate computers of the computer facility 108, or their functions may be combined into a single computer. The user speech information may also be referred to as speaker verification information or “voiceprint” information. Based on the speech information comparison, the learning server 110 will decide whether to permit continued access by the user. This processing is described in greater detail in
In either case, the user identification 202 results in confirmation that a person who has provided identification parameters, such as name and password, has matching entries in the Authorization database for the provided name and password. The system then permits access to the computer facility by the user. If no match in the Authorization database is located, then the system prevents further access or provides the user with an opportunity to become an authorized user, such as by paying a fee. Thus, in the preferred embodiment, first-time users will be diverted to a registration process as part of the user identification 202.
When the user's speaker verification information is received, the system will check to determine if the user's voiceprint information already exists in the system. This is represented by the decision box numbered 204. If the voiceprint has already been received, an affirmative outcome at the decision box 204, then at box 206 a lesson or study module will be identified for delivery to the user. The system may, for example, provide the next sequential lesson in a lesson plan. If the voiceprint being checked has not previously been received, a negative outcome at the decision box 204, this indicates that a new user is attempting to gain access to the computer facility. The user voiceprint information is actually the means by which the system authorizes or verifies a user. Therefore, if the user is a new user, then at box 208 a voiceprint for the new user will be built and stored in the database. This process is described in greater detail below.
Once the system has confirmed that user voiceprint information is available, a lesson may be identified for delivery to the user at the flow diagram box numbered 206. Once the user has cycled through all lessons, the lesson sequence will end at box 210. Lessons will be retrieved from the Lessons database, as indicated at the flow diagram box numbered 212. During the normal course of interacting with the system to finish individual lessons, the user will be presented with one or more questions on a display of the user's computer. These questions also will be extracted from the Lesson database at the learning facility for presentation to the user, as indicated at box 212. The questions will require the user to answer verbally to record a phrase 214. The user's vocal response will also be recorded in the voiceprint Authorization database, creating a real-time voiceprint with corresponding voice parameters. This voiceprint information, collected during the normal course of interacting with the system to complete lessons, will be used by the system to decide whether or not to proceed with the lesson.
More particularly, the system will preferably permit normal lesson operation to occur and will periodically perform a check to determine if the user who is studying the lesson is the same individual person who was previously identified with the password and name obtained above (box 202). This prevents a situation such as where a person obtains the name and password of an authorized user and attempts to proceed with lesson studying posing as that other user.
To perform the voiceprint check 218, the system uses voice recognition technology to compare the authorized user's recorded voiceprint information with that of the user who is studying the lesson. This is described in greater detail below. The comparison takes place in the background, without interfering with either the user or the lesson. To perform the trace of user progress 220, the system will follow the user's progress in the lesson plan to check for anomalies. An unexpected or unusual change in the current lesson's level (either up or down) by the studying user might be an indication that an authorized user has allowed someone else to enter the system. Once alerted to a potential problem in this way, the system will preferably determine whether the studying user is, in fact, the authorized user by re-checking the studying user's voiceprint information against the stored user voiceprint information. To perform the evaluation of the studying user's performance 222, the system will follow the user's performance in the lesson plan. Unexpected lower (or higher) performance results can be an indication that an authorized user has allowed access to an unauthorized user. After the system is alerted in this way to a potential problem, the system will preferably determine whether the user is, in fact, the authorized user by re-checking the user's voiceprint information.
After the requisite user authorization checks are performed, the system will come to a conclusion about whether the studying user is the same person as the previously authorized user associated with the user name and password first obtained at box 202. At box 224, the system will then make a decision about the user identification. That is, the system will decide whether or not the user is the properly licensed or authorized user. The system will then make a decision on continued access and continuation of the lesson, as indicated at the continuation box numbered 226. If the system has any doubts about the user's identity, a message will appear on the studying user's computer screen and preferably the current lesson will stop immediately at box 228. If the system decides that continuation is appropriate, then processing returns to the lesson presentation at box 206.
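The continuation decision at boxes 224 through 228 can be sketched as follows. This is a minimal illustration only: the description leaves the comparison technique to those skilled in the art, so the cosine-similarity measure, the `threshold` and `max_errors` values, and the function names are all assumptions introduced here, not part of the described system.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length voiceprint parameter vectors."""
    if len(a) != len(b):
        raise ValueError("voiceprint vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # An all-zero vector carries no information, so treat it as a non-match.
    return dot / norm if norm else 0.0


def decide_continuation(stored, observed, error_count,
                        max_errors=3, threshold=0.9):
    """Return True to continue the lesson (box 206), False to stop (box 228).

    `max_errors` and `threshold` are hypothetical tuning values.
    """
    if error_count >= max_errors:
        # Too many accumulated identification failures: stop the lesson.
        return False
    # Otherwise continue only if the live voiceprint resembles the stored one.
    return cosine_similarity(stored, observed) >= threshold
```

A deployed system would more likely use a statistical speaker model than a single vector comparison; the sketch only shows where the voiceprint result and the error count feed into the decision.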
User Password Identification
As noted above at box 202, user identification is performed when the user first attempts to use the system.
If the system determines that the password entered by the user does exist, the user will be asked to fill in his or her name. For example, the following message may appear on the user's computer screen: “Enter your name, please.” The user will enter his or her name at box 312. The system will check the user name and determine whether the user is a new user, as indicated by the decision box 314. If the user's name is not found in the database, then the user is a new user, an affirmative outcome at the decision box 314, and at box 316 the user name will be added to the password Authorization database and at box 318 will be indicated as a new user. The Authorization database preferably includes information on the authorized users, such as Name, Password (for example from the accompanying lesson book), User skills parameters (Lesson level, Performance evaluation), and Voiceprint sample parameters. By using this information later, the system will prevent any other user from using the same password. At this stage (box 318), when the new user's name is first added to the Authorization database, all the user's skills parameters will be set to level zero.
Thus, the system has verified that the supplied password is a legitimate password (box 308) and has verified (or entered) the user name in the database (box 314). The system will next check to determine if the user name matches the password by searching for the user in the password database, as indicated by the decision box numbered 320. If the user is not the authorized user, a negative outcome at the decision box 320, then the system will prevent the user from continuing and will stop at box 322. For example, the following message will appear on the screen: “We are unable to identify you. Please contact us and we will be happy to assist you shortly.” The system will then stop processing the lesson plan immediately.
If the user name matches the user password, an affirmative outcome at the decision box numbered 320, then the system initializes an error count at the flow diagram box numbered 324. The error count is an indication of a non-authorized user. After a predetermined number of identification failures, as represented by the error count, the system will identify the user as a non-authorized user and the entire process will stop. The system then retrieves the lesson level from the password Authorization database at 326 and sets the lesson level for the current user to this retrieved level, at box 328. This step ends the user identification processing.
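The identification steps described above (password check, name lookup, name and password match, error count initialization, and lesson level retrieval) might be sketched as below. The class and field names are hypothetical, and rejecting a new name that presents an already-claimed password is an assumed reading of the statement that the system prevents any other user from using the same password.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class UserRecord:
    name: str
    password: str
    lesson_level: int = 0      # skills parameters start at level zero (box 318)
    performance: float = 0.0


@dataclass
class Session:
    user: UserRecord
    error_count: int           # initialized at box 324
    lesson_level: int          # set at box 328


class AuthorizationDB:
    def __init__(self, valid_passwords: Set[str]):
        self.valid_passwords = set(valid_passwords)
        self.users: Dict[str, UserRecord] = {}

    def identify(self, password: str, name: str) -> Optional[Session]:
        """Return a Session on success, or None when identification fails."""
        if password not in self.valid_passwords:
            return None                              # password does not exist
        record = self.users.get(name)
        if record is None:                           # new user (box 314)
            # Assumption: a password already bound to another name is refused.
            if any(u.password == password for u in self.users.values()):
                return None
            record = UserRecord(name=name, password=password)
            self.users[name] = record                # boxes 316 and 318
        if record.password != password:              # box 320, negative outcome
            return None                              # box 322
        return Session(user=record, error_count=0,
                       lesson_level=record.lesson_level)
```

For example, `AuthorizationDB({"BOOK-1"}).identify("BOOK-1", "alice")` registers a new user at lesson level zero, and a later call with the same password but a different name returns `None`.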
Building the User Voiceprint
The process of building a voiceprint for a new user as a means of identifying the user is illustrated in
In the next step, represented by the flow diagram box numbered 404, the system collects the voiceprint information. If the voiceprint information has been successfully stored, then the process ends at box 406. If the voiceprint information has yet to be successfully collected, meaning that it is not yet in the password database, then at box 404 the system collects the voiceprint information by having the user speak a phrase into the user's computer microphone. For example, the user may be asked to answer a question that appears on the display screen. The question is preferably chosen randomly from a Lessons Database, as indicated by the flow diagram box numbered 410. The user's spoken response, as represented by the microphone output signal, is digitized and recorded in the user's computer at the flow diagram box numbered 412. The recorded spoken response information is processed at box 414. This processing includes well-known processing techniques to represent the digitized information in a particular data format, such as what are referred to as Cepstral coefficients, and to provide an estimate of the spoken pitch. Such processing is described, for example, in the document “Nonlinear Discriminant Feature Extraction for Robust Text Independent Speaker Recognition” by Y. Konig, L. Heck, M. Weintraub and K. Sonmez (1998), Proceedings RLA2C-ESCA, Speaker Recognition and its Commercial and Forensic Applications, pp. 72-75, Avignon, France.
Next, as indicated by the flow diagram box numbered 416, the system extracts voiceprint parameters, thereby defining the speech information that will be used by the system for user identification. The extracted parameters permit the voiceprint information to be represented more compactly. This step is preferably performed by the user's computer, to minimize the amount of data that must be sent over the computer network to the learning facility. Finally, the extracted voiceprint information is provided to the learning facility, indicated at box 418, and the learning server stores the voiceprint information into the Authorization database, indicated at box 420.
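As a rough illustration of the kind of compact representation mentioned above, the sketch below computes a plain real cepstrum of one digitized frame (the inverse transform of the log magnitude spectrum) and keeps only the first few coefficients. It is not the processing of the cited reference: the naive DFT, the frame handling, the `num_coeffs` value, and the `floor` guard are assumptions chosen to keep the example self-contained, and pitch estimation is omitted.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform; adequate for short illustrative frames."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, num_coeffs=12, floor=1e-10):
    """Return the first `num_coeffs` real cepstral coefficients of `frame`."""
    n = len(frame)
    # Log magnitude spectrum; `floor` avoids log(0) for empty spectral bins.
    log_mag = [math.log(max(abs(c), floor)) for c in dft(frame)]
    count = min(num_coeffs, n)
    # Inverse DFT of the real, symmetric log spectrum; keep the real part.
    return [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(count)
    ]
```

Keeping a dozen coefficients per frame instead of the raw samples is what makes it practical to extract the parameters on the user's computer and send only the reduced data to the learning facility.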
Checking the Speech Information
As noted above, speaker recognition technology is used to compare the authorized user's voiceprint with a new user's recording during the course of the lesson and is one of the three ways in which an access authorization outcome is generated. The voiceprint comparison will take place in the background, without the studying user becoming aware of the process, and without interference to either the user or the lesson progress.
In the first voiceprint checking step, indicated by the
Tracing User Progress
Another way of checking user authorization and generating an access authorization outcome (
In the first user progress tracing step, the system retrieves the identified user's previous lesson level in the lesson plan from the learning facility Authorization database, indicated at the
At the decision box 606, if the present lesson level is not outside the acceptable range of difference compared to the level of the previous communication session, a negative outcome at the decision box, then the user progress tracing check is completed. If the present lesson level is too low or too high, compared to the previous lesson level, then at box 608 the system adds one error to the user authorization error count. The error count then preferably initiates a user voiceprint check at box 610, a process that is described above in connection with
User Performance Evaluation
Another way of checking user authorization and generating an access authorization outcome (
In the first user performance evaluation step, the system retrieves user performance data for the identified user from the Authorization database, as indicated by the flow diagram box numbered 702, and checks it against the present user's performance, as indicated by the flow diagram box numbered 704. The system will check for performance that is too low and too high. At the decision box numbered 706, the system checks for a low performance by the user. If the present studying user's performance is too low compared to the previous user's performance, an affirmative outcome at the decision box 706, then at box 708 one error will be added to the error count and at box 710 the user's voiceprint will be checked again.
After the error count adjustment, and following any system determination that the studying user's performance is not too low (a negative outcome at the decision box 706), the system checks for any performance that is too high at the decision box numbered 712. If the present studying user's performance level seems too high compared to the previous user level, it might indicate possible use by a non-authorized user. It might also indicate that the same authorized user has improved his or her skills. Therefore, if the user's performance is too high, an affirmative outcome at the decision box 712, then the user's voiceprint will be checked again, as indicated by the flow diagram box numbered 714.
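The lesson-level and performance checks described above can be combined in a short sketch. The tolerances `max_jump` and `score_tol`, the numeric score scale, and the function name are assumptions; the description says only that a level or performance that is "too low" or "too high" triggers the corresponding action.

```python
def evaluate_session(prev_level, cur_level, prev_score, cur_score,
                     error_count, max_jump=2, score_tol=0.25):
    """Return (updated_error_count, recheck_voiceprint).

    `max_jump` and `score_tol` are hypothetical thresholds.
    """
    recheck = False

    # Progress tracing (boxes 606 to 610): an unusual change of lesson level,
    # up or down, adds one error and triggers a voiceprint check.
    if abs(cur_level - prev_level) > max_jump:
        error_count += 1
        recheck = True

    # Performance evaluation (boxes 706 to 714).
    if cur_score < prev_score - score_tol:
        # Performance too low: add one error and re-check the voiceprint.
        error_count += 1
        recheck = True
    elif cur_score > prev_score + score_tol:
        # Performance too high: re-check only, since the authorized user
        # may simply have improved.
        recheck = True

    return error_count, recheck
```

Note the asymmetry: unexpectedly high performance prompts a voiceprint re-check without adding to the error count, matching the observation that improvement by the authorized user is a legitimate explanation.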
It should also be noted that user performance may comprise user proper pronunciation performance. For example, a Japanese (non-native American) user may have been trained to properly pronounce the American English letter “R” sound. Initially, the user's pronunciation may have been recognized as an American English “L” sound, and after intensive training it may sound like a proper American English “R” sound. Such a skill (the proper pronunciation of the American English “R”) is an example of the user performance described above.
After the system performs the random check of user authorization and receives an access authorization parameter (
As described above in connection with
The computer device 900 may comprise a personal computer or, in the case of a client machine, the computer device may comprise a Web appliance or other suitable network communications, voice-enabled device. In the case of a personal computer, the device 900 preferably includes a direct access storage device (DASD) 908, such as a fixed hard disk drive (HDD). The memory 910 typically comprises volatile semiconductor random access memory (RAM). If the computer device 900 is a personal computer, it preferably includes a program product reader 912 that accepts a program product storage device 914, from which the program product reader can read data (and to which it can optionally write data). The program product reader can comprise, for example, a disk drive, and the program product storage device can comprise removable storage media such as a floppy disk, an optical CD-ROM disc, a CD-R disc, a CD-RW disc, a DVD disk, or the like. Semiconductor memory devices for data storage and corresponding readers may also be used. The computer device 900 can communicate with the other connected computers over a network 916 (such as the Internet) through a network interface 918 that enables communication over a connection 920 between the network and the computer device.
The CPU 902 operates under control of programming steps that are temporarily stored in the memory 910 of the computer 900. When the programming steps are executed, the pertinent system component performs its functions. Thus, the programming steps implement the functionality of the system illustrated in
Alternatively, the program steps can be received into the operating memory 910 over the network 916. In the network method, the computer receives data including program steps into the memory 910 through the network interface 918 after network communication has been established over the network connection 920 by well-known methods that will be understood by those skilled in the art without further explanation. The program steps are then executed by the CPU 902 to implement the processing of the system.
As noted above, the user's Personal Computer 900 may communicate with other computing devices 922, which may provide the functionality of the Computer Facility 108 (
Additional Access Authorization with Code Generator
In addition to the analysis of user interaction input described above, a preferred embodiment of a language instruction system constructed in accordance with the present invention utilizes an input device that supplements the authorization operation and is marketed and sold in conjunction with the lesson modules obtained from the Computer Facility 108 (
The user computer 1002 has a construction similar to that illustrated in
A switch 1010 is provided to trigger the operation of a code generator 1012 that produces a code output signal to the analog input port 1004. If the switch 1010 is not closed, then electrical power is not provided to the code generator, and the microphone transducer output is provided to the analog input port 1004. A battery 1014 provides a source of electrical energy to power the code generator 1012, which produces a predetermined sequence of tones that are provided to the analog input port 1004 of the PC 1002. The code generator may comprise a single tone generator wherein code symbols 0, 1, 2, . . . , 9 are represented by a set of corresponding frequency tones such as 300 Hz, 400 Hz, 500 Hz, . . . , 1200 Hz, for example, or the code generator may comprise a modem transmitter, or other device that generates multiple tones. The PC 1002 can be provided with processing that recognizes the tones being received at the analog port 1004 and determines the proper code (equivalent code symbols) being generated. Such processing will be apparent to those skilled in the art.
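The single-tone coding scheme (symbols 0 through 9 mapped to 300 Hz through 1200 Hz in 100 Hz steps) and its recognition at the PC can be sketched as follows. The sample rate, the tone duration, and the use of the Goertzel algorithm for detection are assumptions introduced for illustration; the description states only that suitable recognition processing will be apparent to those skilled in the art.

```python
import math

SAMPLE_RATE = 8000          # samples per second (assumed)
TONE_SAMPLES = 400          # 50 ms per code symbol (assumed)
FREQS = [300 + 100 * d for d in range(10)]   # symbol d -> 300, 400, ..., 1200 Hz


def encode(code):
    """Generate the tone sequence for a string of decimal digits."""
    samples = []
    for ch in code:
        freq = FREQS[int(ch)]
        samples.extend(
            math.sin(2 * math.pi * freq * t / SAMPLE_RATE)
            for t in range(TONE_SAMPLES)
        )
    return samples


def goertzel_power(block, freq):
    """Signal power of `block` at `freq`, computed with the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in block:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def decode(samples):
    """Recover the digit string by picking the strongest tone in each block."""
    digits = []
    for start in range(0, len(samples) - TONE_SAMPLES + 1, TONE_SAMPLES):
        block = samples[start:start + TONE_SAMPLES]
        powers = [goertzel_power(block, f) for f in FREQS]
        digits.append(str(powers.index(max(powers))))
    return "".join(digits)
```

With these values each tone completes a whole number of cycles per block, so the ten candidate frequencies are cleanly separated; a real analog input would also need symbol synchronization and noise handling, which the sketch leaves out.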
The switch 1010 is preferably a switch that is activated by the user upon request by a message received at the PC 1002 and displayed for observation by the user at the PC display. By activating the switch 1010, the user causes the predetermined sequence of output tones to be generated by the code generator 1012. These tones are received by the PC 1002 and are analyzed and converted to a digital code by the user computer before they are communicated to the computer facility. If the transmitted code matches a known code or is otherwise validated, then the user is determined to be an authorized user. If the generated tones do not match a predetermined code known to the computer facility, then the user is not authorized. Access to the program of language instruction can then be halted. The microphone apparatus 1006 can be marketed and sold independently of the lesson modules, subject to the access control described above, or the microphone apparatus can be marketed and sold in conjunction with controlled access to the lesson modules, as described next.
Limited Access Selling
In the preferred embodiment of the system, an input device such as the microphone apparatus illustrated in
As an alternative to selling the generator-equipped microphone apparatus of
In the preferred embodiment, the program of language instruction is available over a network such as the Internet.
Before granting the user access to the initially selected modules, the Web site would instruct the user to activate the microphone apparatus switch as described above. The generated code would be sent from the microphone apparatus code generator to the user's remote computer over the communication channel, and the remote computer would then send the information to the language instruction Web site. As described above, if the generated code is proper, the user is granted access to the selected modules.
Thereafter, if the user wants to utilize additional language instruction modules, the user must request the modules and must authorize payment. In the Web site embodiment, the user may view information about additional modules at the language provider Web site, may request access to additional modules, and may transmit a payment authorization, such as a credit card charge authorization. This information would be received and processed by the language instruction provider, who would grant access to the requested modules. As noted above, the modules may be provided in an interactive, on-line manner, or the modules may be received by network download to the user's computer. The user may be identified by the code that is generated by the microphone apparatus, so that the log-in procedure during a subsequent session will enable a user who has paid for an additional module to continue with the authorized module.
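One way to picture the per-user module access described above is a table keyed by the microphone-generated code. The sketch below is hypothetical: the class name, method names, and module labels are illustrative, and payment processing is reduced to a boolean flag.

```python
class ModuleAccess:
    """Tracks which lesson modules each microphone code may use (illustrative only)."""

    def __init__(self, initial_modules):
        self._initial = set(initial_modules)
        self._granted = {}  # user code -> set of authorized module names

    def register(self, user_code):
        """Record a validated code and grant the initial module selection."""
        self._granted.setdefault(user_code, set(self._initial))

    def purchase(self, user_code, module, payment_authorized):
        """Add a module for a registered user once payment is authorized."""
        if user_code not in self._granted:
            raise KeyError("unknown user code")
        if not payment_authorized:
            return False
        self._granted[user_code].add(module)
        return True

    def can_use(self, user_code, module):
        """Check at log-in whether the identified user may continue with a module."""
        return module in self._granted.get(user_code, set())
```

In a later session, the code produced by the microphone apparatus identifies the user, and `can_use` reports whether a previously purchased module is still available to that user.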
Thus, the present invention provides a technique to control access to network resources in which an identified user at a client node is verified as being entered into a network user password database prior to having access to the computer network facility. The system then permits the user to enjoy access and then waits a predetermined time after access to verify that voiceprint data of the identified user has been entered into a network voiceprint database or, if it has not been entered, receives voiceprint data from the identified user as a result of the user speaking a predetermined phrase. The system then determines whether to permit continued access to the computer network facility by the identified user in response to at least one access parameter authorization outcome. Any unauthorized user may advantageously be given an opportunity to become an authorized user by payment of a fee. In this way, access to the network facility is controlled, and unauthorized users are potentially converted into authorized users.
The present invention has been described above in terms of a presently preferred embodiment so that an understanding of the present invention can be conveyed. There are, however, many configurations for network access control systems not specifically described herein to which the present invention is applicable. The present invention should therefore not be seen as limited to the particular embodiments described herein; rather, it should be understood that the present invention has wide applicability with respect to network access control generally. All modifications, variations, or equivalent arrangements and implementations that are within the scope of the attached claims should therefore be considered within the scope of the invention.