US20040163035A1 - Method for automatic and semi-automatic classification and clustering of non-deterministic texts - Google Patents


Info

Publication number
US20040163035A1
US20040163035A1 (U.S. application Ser. No. 10/771,315)
Authority
US
United States
Prior art keywords
recited
documents
word sequences
data mining
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/771,315
Inventor
Assaf Ariel
Michael Brand
Itsik Horowitz
Ofer Shochet
Itzik Stauber
Dror Ziv
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Credit Suisse AG
Original Assignee
Verint Systems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Verint Systems Inc filed Critical Verint Systems Inc
Priority to US10/771,315 priority Critical patent/US20040163035A1/en
Assigned to VERINT SYSTEMS INC. reassignment VERINT SYSTEMS INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ARIEL, ASSAF, BRAND, MICHAEL, HOROWITZ, ITSIK, SHOCHET, OFER, STAUBER, ITZIK, ZIV, DROR DANIEL
Publication of US20040163035A1 publication Critical patent/US20040163035A1/en
Assigned to LEHMAN COMMERCIAL PAPER INC., AS ADMINISTRATIVE AGENT reassignment LEHMAN COMMERCIAL PAPER INC., AS ADMINISTRATIVE AGENT SECURITY AGREEMENT Assignors: VERINT SYSTEMS INC.
Assigned to CREDIT SUISSE AS ADMINISTRATIVE AGENT reassignment CREDIT SUISSE AS ADMINISTRATIVE AGENT ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: LEHMAN COMMERCIAL PAPER INC., VERINT SYSTEMS INC.
Assigned to VERINT SYSTEMS INC., VERINT AMERICAS INC., VERINT VIDEO SOLUTIONS INC. reassignment VERINT SYSTEMS INC. RELEASE OF SECURITY INTEREST Assignors: CREDIT SUISSE AG
Abandoned legal-status Critical Current


Classifications

    • G10L 15/197 Probabilistic grammars, e.g. word n-grams
    • G10L 15/26 Speech to text systems
    • G06F 16/35 Clustering; Classification
    • G06F 16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/383 Retrieval characterised by using metadata automatically derived from the content
    • G06F 16/9535 Search customisation based on user profiles and personalisation
    • G06F 16/9538 Presentation of query results
    • G06V 10/98 Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • G06V 30/248 Character recognition characterised by the processing or recognition method involving plural approaches, e.g. verification by template match; Resolving confusion among similar patterns, e.g. "O" versus "Q"

Definitions

  • The present invention is directed to processing of information in non-deterministic texts to increase the usefulness of the texts and, more particularly, to using non-textual information to indicate the importance or recognition accuracy of individual words or sequences of words.
  • Spoken document retrieval is composed of two stages: transcription of speech and information retrieval (IR). Transcription of the speech is often referred to as speech-to-text (STT) or automatic speech recognition (ASR), and is often performed using a large vocabulary continuous speech recognizer (LVCSR).
  • Information retrieval is a general term referring to all forms of data mining.
  • One common form of data mining, for example, is query-based retrieval, where, based on a user's query, documents are retrieved and presented to the user, ordered by an estimated measure of their relevance to the query. Traditionally, this stage is performed on the text output of the first stage.
  • the above aspects can be attained by a method for processing documents derived from at least one of spontaneous and conversational expression and containing non-deterministic text with average word recognition precision below 50 percent, the processing utilizing non-textual differences between n-word sequences in the documents to resolve more than two decision options, where n is a positive integer.
  • Such text may be obtained by automatic character recognition or automatic speech recognition of audio signals received via a telephone system.
  • In the preferred embodiment, the non-textual differences between the n-word sequences relate to recognition confidence of the n-word sequences.
  • If the processing requires fast access to the information stored in a large corpus of documents, e.g. for the purpose of data mining, the data is preferably pre-processed to index the n-word sequences in a way that utilizes the non-textual differences between them.
  • Such a procedure can speed up many forms of data access, and in particular many forms of data mining, including query based retrieval, as would be apparent to a person skilled in the art.
  • the data mining may include extracting parameters from the documents utilizing the non-textual differences between the n-word sequences and establishing relations between the parameters extracted from the documents.
  • the parameters extracted from the documents may be fully known, such as parameters available in document metadata or may be hidden variables that cannot be fully determined from information existing in the document. Examples of extracted parameters include an assessment of relevance to a query based on the non-textual differences between the n-word sequences and an assessment of the document's relevance to a category.
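As an illustration of such an extracted parameter, a relevance assessment that weights query matches by recognition confidence can be sketched as follows. This is a minimal sketch; the additive scoring formula and all names are assumptions for illustration, not taken from the patent.

```python
# Hypothetical sketch: score a recognized document against a query by
# weighting each matching word by its recognition confidence, instead of
# treating every recognized word as equally reliable.

def relevance_score(document, query_terms):
    """document: list of (word, confidence) pairs from a recognizer."""
    score = 0.0
    for word, confidence in document:
        if word in query_terms:
            score += confidence  # a confident match counts more than a doubtful one
    return score

doc = [("please", 0.9), ("call", 0.4), ("me", 0.8), ("call", 0.95)]
print(relevance_score(doc, {"call"}))  # 0.4 + 0.95
```

Under this assumed formula, a document whose query terms were recognized with high confidence outranks one containing the same terms at low confidence, which a plain text index cannot express.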
  • algorithm(s) can be used to convert text containing non-textual differences between the n-word sequences into different standard text documents.
  • Many different algorithms may be used to transform non-deterministic text into standard text documents usable in text mining.
  • the algorithm to extract standard text documents from text with non-textual differences may apply a thresholding algorithm with varying thresholds.
  • one or more data mining techniques, each of which does not utilize non-textual differences can be applied to these standard text documents and the outputs of the different data mining techniques can be merged to obtain information that is equivalent to that obtained by data mining that utilizes the non-textual differences.
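The scheme described above, thresholded canonization into several standard text documents, conventional text mining on each, and a merge of the outputs, can be sketched as follows. The threshold values, the merge rule, and the function names are assumptions for illustration only.

```python
# Sketch (assumed details): produce several standard text documents from one
# recognizer output by dropping words below varying confidence thresholds,
# search each with an ordinary confidence-blind matcher, then merge the
# results so the highest threshold at which a document still matches serves
# as a confidence indication.

THRESHOLDS = [0.2, 0.5, 0.8]  # assumed values

def canonize(asr_output, threshold):
    """Keep only words recognized with confidence >= threshold."""
    return " ".join(w for w, c in asr_output if c >= threshold)

def merged_search(corpus, query_word):
    """corpus: {doc_id: [(word, confidence), ...]}. Returns doc_id -> best threshold."""
    results = {}
    for t in THRESHOLDS:
        for doc_id, asr_output in corpus.items():
            # the inner search knows nothing about confidence
            if query_word in canonize(asr_output, t).split():
                results[doc_id] = max(results.get(doc_id, 0.0), t)
    return results

corpus = {
    "call-1": [("billing", 0.9), ("problem", 0.85)],
    "call-2": [("billing", 0.3), ("hours", 0.7)],
}
print(merged_search(corpus, "billing"))  # {'call-1': 0.8, 'call-2': 0.2}
```

The merged result differentiates the two hits by confidence even though each individual text mining pass could not.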
  • the documents may be categorized, clustered, classified, filtered or labeled, e.g., by using an algorithm to detect salient terms in the documents based on non-linguistic differences between the n-word sequences.
  • information related to at least one of the documents may be displayed, including at least some non-textual differences between n-word sequences.
  • Portions of the document(s) may be selectively displayed based on confidence of the accuracy of the displayed words. For example, salient terms in the document(s) may be displayed based on processing of confidence levels of recognition of the salient terms that resolves more than two decision options.
  • parameters extracted from the documents and indications of the relations between these parameters may be displayed graphically.
  • a user may indicate errors in recognition.
  • at least one word in the document is preferably replaced with a corrected word supplied by the user and the confidence level(s) of the corrected word(s) are reset to indicate high recognition accuracy.
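Such a correction step might look like the following sketch, assuming a document is represented as word/confidence pairs; the reset value of 1.0 standing for "high recognition accuracy" is an assumption.

```python
# Sketch (assumed representation): a document is a list of [word, confidence]
# pairs; a user correction replaces the recognized word and resets its
# confidence to indicate high recognition accuracy.

def apply_correction(document, position, corrected_word, high_confidence=1.0):
    document[position] = [corrected_word, high_confidence]
    return document

doc = [["the", 0.9], ["whether", 0.3], ["is", 0.8], ["nice", 0.7]]
apply_correction(doc, 1, "weather")
print(doc[1])  # ['weather', 1.0]
```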
  • FIG. 1 is a flowchart of a conventional spoken document retrieval system.
  • FIG. 2 is a flowchart of one method of spoken document retrieval according to the invention.
  • FIG. 3 is a flowchart of another method of spoken document retrieval according to the invention.
  • FIGS. 4 and 5 are block diagrams of spoken document retrieval systems according to the invention.
  • FIG. 6 is a block diagram of one confidence sensitive inverted index and one regular index containing confidence information.
  • FIG. 7 is a flowchart of text processing according to the invention.
  • FIGS. 8 and 9 are examples of displays generated by telephone call processing applications according to the invention.
  • labeling: a form of data processing where documents are analyzed, and the analysis results (referred to as "labels") are made available for later processing stages. For example, a topical analysis of documents is a labeling of the documents by subject.
  • retrieving: a form of data mining where a subset of a document corpus is returned in response to a query. The documents are each given a rank pertaining to their relevance to the query, and are sorted by decreasing relevance.
  • categorizing: a form of data mining where several "categories" are defined, and the documents of a corpus are labeled according to the category to which they fit best. A common variation is multilabel categorizing, where each document may fit zero or more categories. Preferably, information is given regarding the quality of the fit.
  • clustering: a form of data mining similar to categorization, with the difference that the "categories" are not predefined, and the data mining must reveal them automatically.
  • classifying: a process performed on a stream of incoming documents, where each is labeled and then forwarded for relevant additional processing (manual or automatic) based on the labels that have been discovered.
  • filtering: a process performed on a stream of incoming documents, where each is labeled and then forwarded or discarded based on the labels that have been discovered.
  • salient terms: terms whose appearance in a document provides information relevant to its correct labeling, and consequently to all forms of data mining subsequent to labeling.
  • High quality audio 20 is input into an ASR system 22 using LVCSR.
  • ASR system 22 converts spoken words into textually represented words, but often has other outputs as well. These outputs may include timing information, an indication of the confidence of recognition of particular words and phrases, alternative likely transcriptions, and more.
  • The LVCSR output cannot be piped directly into a traditional text mining system; it has to be converted into searchable text. For this reason, canonization 24 is performed to produce canonized text, also referred to below as standard text documents, used by conventional text mining software. Most commonly, canonization simply involves taking the textual words out of the LVCSR output and concatenating them. More sophisticated canonization schemes involve usage of both textual and non-textual information to convert the LVCSR output into a format more easily handled by text mining system 26 . Usage of textual information may include capitalization and punctuation based on grammatical rules. Usage of non-textual information may include capitalization and punctuation using timing information, and omission of words based on low confidence levels.
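The two kinds of canonization described above, plain concatenation of the recognized words and a variant that omits low-confidence words, can be sketched as follows. The tuple format of the LVCSR output and the cutoff value are assumptions for illustration.

```python
# Sketch: two canonization schemes over assumed LVCSR output, here a list of
# (word, confidence, start_time) tuples.

def canonize_simple(lvcsr_output):
    # most common scheme: just concatenate the textual words
    return " ".join(word for word, _conf, _time in lvcsr_output)

def canonize_filtered(lvcsr_output, min_confidence=0.5):
    # use non-textual information: omit words recognized with low confidence
    return " ".join(word for word, conf, _time in lvcsr_output if conf >= min_confidence)

output = [("thank", 0.95, 0.0), ("you", 0.9, 0.4), ("fur", 0.2, 0.7), ("calling", 0.85, 0.9)]
print(canonize_simple(output))    # thank you fur calling
print(canonize_filtered(output))  # thank you calling
```

Note what the filtered variant illustrates about the trade-off: the misrecognized "fur" is dropped, but so would any correctly recognized word that happened to score low.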
  • Text mining system 26 receives input from many different audio segments and stores the information in some format that will be convenient for later processing, a process known as "indexing". When asked to produce output, typically, though not exclusively, by user query 28 , text mining system 26 searches its index and produces output. For example, the output may be the identities of the audio segments that were requested for retrieval, ranked and scored by some relevance metric. The output may also include other information, such as the phrases in the retrieved segments that have proved to be salient terms, because of which the document was given its score.
  • document display system 30 can use all of it, and add to it the original audio segment(s) 20 , to give the user feedback to user query 28 that is as informative and audio-visually appealing as possible.
  • A simplified embodiment of a method according to the present invention for spoken document retrieval of low grade audio 32 is illustrated in FIG. 2.
  • a similar system could be used for text generated by a handwritten text recognition system.
  • the traditional text mining system has been replaced with data mining system 34 that is designed especially for speech in low grade audio 32 .
  • Data mining system 34 doesn't require that the output of ASR 36 be canonized into text before it is handled, and can therefore utilize all information available in the output of ASR 36 . No information is lost in a canonization process, and words that are indexed can receive different and appropriate handling based on non-textual information, such as their confidence scores.
  • Canonization is therefore entirely omitted, and the output of ASR 36 is available with more of its information in subsequent processing stages, including indexing, retrieval and display. Since non-textual information is available, document display system 38 in FIG. 2 displays different information than document display system 30 in FIG. 1.
  • Data mining system 34 uses data mining targeted specifically at low-grade audio speech data, extracting useful information by ranking and retrieval utilizing non-textual data, which makes data mining of telephony speech feasible.
  • data mining 34 includes indexing which differentiates words using non-textual differences between the words.
  • data mining system 34 uses the non-textual differences between the words for retrieval.
  • An alternative way to determine non-textual differences between words without using a specially created data mining system is illustrated in FIG. 3.
  • the same segments of audio 32 are canonized several times using different canonization methods 24 a , . . . 24 n .
  • These methods can differ, for example, by a choice of minimal allowed confidence, any word of confidence lower than that being removed from the transcription in one way or another.
  • the obtained text is indexed in N different indexes by text mining systems 26 a , . . . 26 n which may use the same indexing methods or different methods.
  • low grade audio 32 may additionally be processed by ASR 36 to produce output for data mining system(s) 34 that utilize(s) non-textual differences between words.
  • text mining systems 26 a , . . . 26 n search indexes of standard text documents, result(s) are retrieved from each index, and finally all N results are algorithmically merged 42 , providing a single output, which is then forwarded to document display system 38 .
  • the merged output unlike the single outputs of the text mining systems, can differentiate results by confidence levels, and display system 38 can use this information.
  • Either of the spoken document retrieval methods illustrated in FIGS. 2 and 3 can be implemented using systems like those illustrated in FIGS. 4 and 5. Such systems can be configured in many different ways, depending on the tasks to be performed and the volume of data processed; FIGS. 4 and 5 are just two examples. The invention is not limited to the configurations illustrated in FIGS. 4 and 5, and other configurations are possible and will be apparent to a person of ordinary skill in the art. For example, the functions performed by separate servers in FIGS. 4 and 5 may be performed by separate modules in a single computing system.
  • Voice data are supplied from voice acquisition module(s) 50 by network 52 and stored in data storage 54 .
  • Network 52 may be any known type of network, such as a local area network (LAN), wide area network (WAN), the Internet, etc.
  • the voice data may be in the form of WAV files or any other audio file format.
  • the system is accessed by one or more user terminals 56 , such as personal computers or other devices that include a user interface which may include a display.
  • users log into the system at various times to submit queries to voice oriented information retrieval (VOIR) indexing server 58 .
  • Voice data from voice acquisition module(s) 50 are supplied to speech categorization server 60 which, if necessary, converts the data before supplying the voice data to LVCSR(s) 22 and performs load balancing when more than one LVCSR 22 is used.
  • LVCSR(s) 22 output words and additional data, such as speaker-change, timing information, confidence scores, etc.
  • Call metadata, such as the time that a call was made and the number dialed, is obtained from voice acquisition module(s) 50 together with the voice data. All these types of data are combined, e.g., by speech categorization server 60 , and forwarded in online mode to speech analysis server 62 and in offline mode to VOIR indexing server 58 . Regardless of whether the method illustrated in FIG. 2 or FIG. 3 is implemented, results of a query in offline mode can be displayed on user terminals 56 with at least some of the non-textual differences between n-word sequences indicated. Examples of how the non-textual differences are conveyed to the user will be described below with reference to FIGS. 8 and 9.
  • the online configuration illustrated in FIG. 5, may be used when the volume of voice data is too large to allow effective offline processing, or it is desired to use push-technology alerts to people who may want the data. For example, a police inspector may want to be paged when the system detects a phone conversation relevant to her case.
  • the output of LVCSR(s) 22 is supplied via network 52 to speech analysis server 62 which labels the voice data.
  • the voice data may be labeled according to importance, subject matter, person or group that needs to respond, etc.
  • the labeling of the transcribed voice data is combined with the output of LVCSRs 22 and call metadata, and forwarded to categorization queue and workflow manager 64 .
  • The users at user terminals 56 are provided this information by categorization queue and workflow manager 64 .
  • Categorization queue and workflow manager 64 uses the labeling provided by speech analysis server 62 to supply each user with text, voice data and call metadata appropriate for that user, depending on importance, topic, identity of the user, etc.
  • Training of speech analysis server 62 may be accomplished by offline processing using VOIR indexing server 58 in an implementation that includes both servers 58 and 62 .
  • One or more users label calls by importance, subject matter, relevant person or group, etc.
  • the labels assigned by users can be provided to speech analysis server 62 as training data to recognize similar calls during online processing of calls in a call center, for example.
  • training may continue during online processing as users correct the labeling provided by speech analysis server 62 .
  • VOIR indexing server 58 is trained in a similar manner.
  • LVCSR 22 supplies metadata, including confidence scores, associated with recognized words to VOIR indexing server 58 which generates an index that indicates at least some of the non-textual differences between n-word sequences. If the method described above with reference to FIG. 3 is implemented, VOIR indexing server 58 maintains an index for each canonization system 24 . In either case, the index(es) and the voice data (preferably compressed to minimize space requirements) or other data from which indexed text is obtained (such as handwritten documents) are preferably stored in data storage 54 .
  • data in data storage 54 is indexed by use of at least one confidence sensitive inverted index.
  • a confidence sensitive inverted index maps from terms to a sorted linked list identifying all documents where each term occurs and from each appearance of a document in this list to a sorted linked list identifying all positions in which the term appears and the confidence level of its recognition.
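The two-level structure just described can be sketched with nested dictionaries standing in for the sorted linked lists. This is an illustrative sketch of the index shape, not the patent's implementation.

```python
# Sketch: map term -> {doc_id: [(position, confidence), ...]}, mirroring the
# two levels of the confidence sensitive inverted index described above
# (documents per term, then positions and confidence levels per document).
from collections import defaultdict

def build_index(corpus):
    """corpus: {doc_id: [(word, confidence), ...]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, words in corpus.items():
        for position, (word, confidence) in enumerate(words):
            index[word][doc_id].append((position, confidence))
    return index

corpus = {
    "doc-a": [("open", 0.9), ("account", 0.6), ("account", 0.4)],
    "doc-b": [("close", 0.8), ("account", 0.95)],
}
index = build_index(corpus)
print(index["account"]["doc-a"])  # [(1, 0.6), (2, 0.4)]
```

Because positions are appended in document order, each per-document list is already sorted by position, matching the sorted lists the index calls for.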
  • indexed data may include aggregated information relating to confidence.
  • each appearance 69 of a term in the document can carry additional data, such as its position in the document, its timing information, recognition score of that appearance of the term, etc.
  • The aggregated information may include, for example, an expected number of real occurrences of a term (e.g., term i, which points to mapping 67 i) in the indexed document (e.g., 68 a).
  • Another example of aggregated information relating to confidence that can be saved is the strength of association between every document and each category.
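One plausible way to compute the "expected number of real occurrences" aggregate is to treat each appearance's confidence as the probability that the appearance is real and sum those probabilities. This formula is an assumption; the patent does not fix one.

```python
# Sketch (assumed formula): if each appearance of a term carries a recognition
# confidence interpreted as the probability that the appearance is real, then
# by linearity of expectation the expected number of real occurrences is the
# sum of those confidences.

def expected_occurrences(appearances):
    """appearances: list of (position, confidence) pairs for one term in one document."""
    return sum(confidence for _position, confidence in appearances)

print(expected_occurrences([(1, 0.6), (2, 0.4)]))  # 1.0
```

Storing this single number per term and document lets later stages rank documents without re-reading every appearance record.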
  • This information can be saved either in a regular (forward) index, like index 66 , another inverted index (not shown), or both.
  • Information not relating to confidence, such as call metadata, can also be indexed, either in another inverted index, in a forward index like index 66 , or both. In either case, if an inverted index is used, confidence sensitive inverted index 65 or a separate index can be used.
  • additional mapping technologies in addition to or instead of a mapping into a sorted linked list, can also be used.
  • Data storage 54 can also store information other than the indexes, such as the data that is being indexed. This data may include, among others, call audio, voice data and call metadata, and may include additional indexes used to refer to the same data.
  • A more detailed flow of processing through the systems illustrated in FIGS. 4 and 5 is provided in FIG. 7. Online processing flow corresponding to the configuration illustrated in FIG. 5 is illustrated in FIG. 7 by solid lines, while offline processing flow corresponding to the configuration illustrated in FIG. 4 is illustrated by dash-dot lines.
  • Low quality source data 32 such as recorded telephone conversations, supplied by voice acquisition module(s) 50 , undergo text extraction 74 in LVCSR(s) 22 controlled by speech categorization server 60 .
  • the results 76 which may include text, confidence scores, timing information and text alternative lattice information (potentially, other information, as well), undergo indexing 78 in VOIR indexing server 58 and are stored in data storage 54 .
  • results 76 are supplied to speech analysis server 62 which may perform labeling 80 of the calls, as described above.
  • Data from VOIR indexing server 58 are used for category training 82 , so that the categorization 84 can later be used in either online or offline mode.
  • One embodiment of the system illustrated in FIG. 4 is used to process recorded telephone conversations at a call center by automatically generating transcriptions of the conversations.
  • offline ad-hoc querying 86 utilizes categorization 84 or rule-based keyword spotting 88 to obtain information 90 related to at least one of the documents, including at least some of the non-textual differences between n-word sequences that may be displayed on user terminal(s) 56 in the format illustrated in FIG. 8 or 9 .
  • the display illustrated in FIG. 8 provides an example of user input keywords 102 “call OR meeting” that have been found in 170 documents, eight of which are displayed on screen in FIG. 8.
  • the documents may be listed in a table 104 in an order based in part on the confidence of accuracy of the keywords displayed in the list.
  • table 104 includes call metadata, such as start time.
  • a waveform 106 of a portion of the seventh document (indicated as selected by shading in table 104 ) is displayed in the lower portion of the screen with indications of when the keywords were detected.
  • The text 108 recognized by LVCSR(s) 22 is also displayed.
  • Text 108 indicates the recognition confidence of the words and the salient terms listed in the query using one or more of highlighting, underlining, color or shade, and size and style of fonts.
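One possible realization of such a display maps confidence bands to font styling when rendering the transcript as HTML. The band boundaries and the markup below are assumptions for illustration, not taken from the patent.

```python
# Sketch: turn (word, confidence) pairs into HTML where styling conveys
# recognition confidence, one possible realization of the display described
# above (bands and markup are assumed).

def render(words):
    spans = []
    for word, confidence in words:
        if confidence >= 0.8:
            spans.append(f"<b>{word}</b>")  # high confidence: bold
        elif confidence >= 0.5:
            spans.append(word)              # medium confidence: plain text
        else:
            spans.append(f'<span style="color:gray">{word}</span>')  # low: grayed out
    return " ".join(spans)

print(render([("call", 0.9), ("me", 0.6), ("tomorrow", 0.3)]))
```

A reader scanning such output can weigh grayed-out words less heavily, which is the point of carrying confidence through to the display stage.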
  • Labels of the conversation, such as "Technical" and "Incomplete", follow the "Categories" 116 and appear in the column under "Contact Related To" in table 104 , along with similar category information. Confidence of these labels is also indicated.
  • a user may listen to the entire recording by using a pointing device, such as a computer mouse, to select a row in table 104 corresponding to the recording or can hear just the segments of audio corresponding to transcribed salient terms by selecting the speaker icon under the word “Play” on the row.
  • a user may select one of the words, such as “call” in a user-selectable speech bubble 110 associated with the waveform, or in the adjoining text, to skip directly to the point in a conversation where the word was said.
  • a pointer 112 below the waveform 106 indicates what sound is being played back to the user and a vertical cursor 114 indicates what word was recognized for the associated sound.
  • user terminal(s) 56 can also be used to graphically display results, e.g., content information 90 , indicating parameters extracted from the documents.
  • Examples illustrated in FIG. 9 are bar graphs 118 , 120 .
  • left bar graph 118 shows the number of calls matching a query based on call date
  • right graph 120 shows the relations of several categories to the user query.
  • the present invention has been described with respect to embodiments using text documents generated from telephone calls.
  • the invention is not limited to texts generated in this manner and can also be applied to text obtained in other ways, such as from fact extraction systems.
  • The present invention can be used with any system for processing documents that derive from at least one of spontaneous and conversational expression which outputs non-deterministic text with average word recognition precision below 50 percent.

Abstract

Non-deterministic text with average word recognition precision below 50% is processed utilizing non-textual differences between words or sequences of words in the text to provide more useful information to users by resolving more than two decision options. One or more indexes that indicate non-textual differences between n-word sequences, where n is a positive integer, may be generated for use in data mining that considers the non-textual differences. Alternatively, multiple indexes may be generated using different data mining techniques that may or may not utilize non-textual differences and then the results produced by the different data mining techniques may be merged to identify non-textual differences. These techniques may be used in classifying, labeling, categorizing, filtering, clustering, or retrieving documents, or in discovering salient terms in a set of documents.

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • This application is related to and claims priority to U.S. provisional application entitled METHOD FOR AUTOMATIC AND SEMI-AUTOMATIC CLASSIFICATION AND CLUSTERING OF NON-DETERMINISTIC TEXTS having serial No. 60/444,982, by Assaf ARIEL, Itsik HOROWITZ, Itzik STAUBER, Michael BRAND, Ofer SHOCHET and Dror ZIV, filed Feb. 5, 2003 and incorporated by reference herein. This application is also related to the application entitled AUGMENTATION AND CALIBRATION OF OUTPUT FROM NON-DETERMINISTIC TEXT GENERATORS BY MODELING ITS CHARACTERISTICS IN SPECIFIC ENVIRONMENTS by Michael BRAND, filed concurrently and incorporated by reference herein.[0001]
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention [0002]
  • The present invention is directed to processing of information in non-deterministic texts to increase the usefulness of the texts and, more particularly, to using non-textual information to indicate the importance or recognition accuracy of individual words or sequences of words. [0003]
  • 2. Description of the Related Art [0004]
  • In general, spoken document retrieval (SDR) is composed of two stages: transcription of speech and information retrieval (IR). Transcription of the speech is often referred to as speech-to-text (STT) or automatic speech recognition (ASR), and is often performed using a large vocabulary continuous speech recognizer (LVCSR). Information retrieval (IR) is a general term referring to all forms of data mining. One common form of data mining, for example, is query-based retrieval, where, based on a user's query, documents are retrieved and presented to the user, ordered by an estimated measure of their relevance to the query. Traditionally, this stage is performed on the text output of the first stage. [0005]
  • There are many known techniques for extracting useful information from texts, commonly referred to as text mining or text data mining, which is a sub-discipline of data mining. Many of these techniques have been used on text output by speech-to-text algorithms or automatic character recognition systems. However, in systems that use text that has been converted from digitized speech or is based on character recognition, there has been little success when the original source is of low quality, such as telephone conversations or handwritten text, due to the low recognition accuracy of the resulting texts. As a result, most commentators in the field have discouraged application of techniques developed for easily recognized source material to source material that is difficult to recognize. Examples of such techniques can be found in U.S. Pat. Nos. 5,625,748; 6,397,181 and 6,598,054, all incorporated by reference herein. [0006]
  • Therefore, there are no known systems that provide easy access to poor quality audio, except when it is in a predictable format, such as conversations between air traffic controllers and aircraft cockpit crews, which follow well-defined rules. [0007]
  • SUMMARY OF THE INVENTION
  • It is an aspect of the present invention to improve access to text by using non-textual information. [0008]
  • It is another aspect of the present invention to use conventional text mining techniques, as embodied in previously developed text mining software, in a way that utilizes non-textual information in data mining. [0009]
  • It is a further aspect of the present invention to improve access to documents produced by speech recognizers using recognition confidence measurement. [0010]
  • The above aspects can be attained by a method for processing documents derived from at least one of spontaneous and conversational expression and containing non-deterministic text with average word recognition precision below 50 percent, the processing utilizing non-textual differences between n-word sequences in the documents to resolve more than two decision options, where n is a positive integer. Such text may be obtained by automatic character recognition or automatic speech recognition of audio signals received via a telephone system. In the preferred embodiment, the non-textual differences between the n-word sequences relate to recognition confidence of the n-word sequences. [0011]
  • When the processing requires fast access to the information stored in a large corpus of documents, e.g. for the purpose of data mining, the data is preferably pre-processed to index the n-word sequences in a manner that utilizes the non-textual differences between them. Such a procedure can speed up many forms of data access, and in particular many forms of data mining, including query-based retrieval, as would be apparent to a person skilled in the art. [0012]
  • The data mining may include extracting parameters from the documents utilizing the non-textual differences between the n-word sequences and establishing relations between the parameters extracted from the documents. The parameters extracted from the documents may be fully known, such as parameters available in document metadata or may be hidden variables that cannot be fully determined from information existing in the document. Examples of extracted parameters include an assessment of relevance to a query based on the non-textual differences between the n-word sequences and an assessment of the document's relevance to a category. [0013]
  • As an alternative to creating index(es) indicating non-textual differences between n-word sequences, algorithm(s) can be used to convert text containing non-textual differences between the n-word sequences into different standard text documents. Many different algorithms may be used to transform non-deterministic text into standard text documents usable in text mining. For example, the algorithm to extract standard text documents from text with non-textual differences may apply a thresholding algorithm with varying thresholds. Then, one or more data mining techniques, each of which does not utilize non-textual differences, can be applied to these standard text documents and the outputs of the different data mining techniques can be merged to obtain information that is equivalent to that obtained by data mining that utilizes the non-textual differences. [0014]
  • Whether or not the index(es) include an indication of non-textual differences, the documents may be categorized, clustered, classified, filtered or labeled, e.g., by using an algorithm to detect salient terms in the documents based on non-linguistic differences between the n-word sequences. [0015]
  • In response to a query using any type(s) of index(es), information related to at least one of the documents may be displayed, including at least some non-textual differences between n-word sequences. Portions of the document(s) may be selectively displayed based on confidence of the accuracy of the displayed words. For example, salient terms in the document(s) may be displayed based on processing of confidence levels of recognition of the salient terms that resolves more than two decision options. In addition, parameters extracted from the documents and indications of the relations between these parameters may be displayed graphically. [0016]
  • In response to the display of such information, a user may indicate errors in recognition. In this case at least one word in the document is preferably replaced with a corrected word supplied by the user and the confidence level(s) of the corrected word(s) are reset to indicate high recognition accuracy. [0017]
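A minimal sketch of the correction step just described, assuming the document is stored as (word, confidence) pairs; this representation, and the use of 1.0 to denote human-verified accuracy, are illustrative assumptions rather than details specified by the invention:

```python
# Sketch: apply a user-supplied correction to a transcript and reset
# the confidence of the corrected word to indicate high recognition
# accuracy.  The (word, confidence) pair representation is a
# hypothetical simplification of the document model.

def apply_correction(transcript, position, corrected_word):
    """Replace the word at the given position and mark it as verified."""
    word, _old_conf = transcript[position]
    transcript[position] = (corrected_word, 1.0)  # human-verified
    return transcript

doc = [("please", 0.9), ("cold", 0.2), ("me", 0.6)]
apply_correction(doc, 1, "call")
print(doc[1])  # the low-confidence "cold" becomes ("call", 1.0)
```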
  • These, together with other aspects and advantages which will be subsequently apparent, reside in the details of construction and operation as more fully hereinafter described and claimed, reference being had to the accompanying drawings forming a part hereof, wherein like numerals refer to like parts throughout.[0018]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a flowchart of a conventional spoken document retrieval system. [0019]
  • FIG. 2 is a flowchart of one method of spoken document retrieval according to the invention. [0020]
  • FIG. 3 is a flowchart of another method of spoken document retrieval according to the invention. [0021]
  • FIGS. 4 and 5 are block diagrams of spoken document retrieval systems according to the invention. [0022]
  • FIG. 6 is a block diagram of one confidence sensitive inverted index and one regular index containing confidence information. [0023]
  • FIG. 7 is a flowchart of text processing according to the invention. [0024]
  • FIGS. 8 and 9 are examples of displays generated by telephone call processing applications according to the invention.[0025]
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • Following are several terms used herein that are in common use in automatic speech recognition or data mining. [0026]
    labeling: a form of data processing where documents are analyzed, and the analysis results (referred to as “labels”) are made available for later processing stages. For example, a topical analysis of documents is a labeling of the documents by subject.
    retrieving: a form of data mining where a subset of a document corpus is returned in response to a query. Preferably, the documents are each given a rank pertaining to their relevance to the query, and are sorted by decreasing relevance.
    categorizing: a form of data mining where several “categories” are defined, and the documents of a corpus are labeled according to the category to which they fit best. A common variation is multilabel categorizing, where each document may fit zero or more categories. Preferably, information is given regarding the quality of the fit.
    clustering: a form of data mining similar to categorization, with the difference that the “categories” are not predefined, and the data mining must reveal them automatically.
    classifying: a process performed on a stream of incoming documents, where each is labeled and then forwarded for relevant additional processing (manual or automatic) based on the labels that have been discovered.
    filtering: a process performed on a stream of incoming documents, where each is labeled and then forwarded or discarded based on the labels that have been discovered.
    salient terms: terms whose appearance in a document provides information relevant to its correct labeling, and consequently to all forms of data mining subsequent to labeling.
  • First, processing performed by a typical spoken document retrieval system will be described with reference to FIG. 1. [0027] High quality audio 20 is input into an ASR system 22 using LVCSR. ASR system 22 converts spoken words into textually represented words, but often has other outputs as well. These outputs may include timing information, an indication of the confidence of recognition of particular words and phrases, alternative likely transcriptions, and more.
  • The LVCSR output cannot be piped directly into a traditional text mining system. It has to be converted into searchable text. For this reason, canonization [0028] 24 is performed to produce canonized text, also referred to below as standard text documents, used by conventional text mining software. Most commonly, canonization simply involves taking the textual words out of the LVCSR output and concatenating them. More sophisticated canonization schemes involve usage of both textual and non-textual information to convert the LVCSR output into a format more easily handled by text mining system 26. Usage of textual information may include capitalization and punctuation based on grammatical rules. Usage of non-textual information may include capitalization and punctuation using timing information, and omission of words based on low confidence levels.
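The simple concatenation scheme and a confidence-filtering variant might be sketched as follows; the word-list input format is a hypothetical stand-in for real LVCSR output:

```python
# Sketch of two canonization schemes for LVCSR output.  The input
# format, a list of (word, confidence) pairs, is a hypothetical
# simplification of real recognizer output.

def canonize_simple(lvcsr_output):
    """Concatenate the textual words, discarding all other information."""
    return " ".join(word for word, _conf in lvcsr_output)

def canonize_with_threshold(lvcsr_output, min_confidence):
    """Omit words whose recognition confidence falls below a threshold."""
    return " ".join(word for word, conf in lvcsr_output
                    if conf >= min_confidence)

output = [("please", 0.9), ("call", 0.8), ("me", 0.3), ("tomorrow", 0.7)]
print(canonize_simple(output))               # keeps all four words
print(canonize_with_threshold(output, 0.5))  # drops the low-confidence "me"
```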
  • [0029] Text mining system 26 receives input from many different audio segments and stores the information in some format that will be convenient for later processing, a process known as “indexing”. When asked to produce output, typically, though not exclusively, by user query 28, text mining system 26 searches its index and produces output. For example the output may be the identities of the audio segments that were requested for retrieval, ranked and scored by some relevance metric. The output may also include other information, such as the phrases in the retrieved segments that have proved to be salient terms, because of which the document was given the score that it was given.
  • All this information is finally piped into [0030] document display system 30 which can use all of it, and add to it the original audio segment(s) 20, to give the user feedback to user query 28 that is as informative and audio-visually appealing as possible.
  • A simplified embodiment of a method according to the present invention for spoken document retrieval of [0031] low grade audio 32 is illustrated in FIG. 2. A similar system could be used for text generated by a handwritten text recognition system. In the embodiment illustrated in FIG. 2, the traditional text mining system has been replaced with data mining system 34 that is designed especially for speech in low grade audio 32. Data mining system 34 does not require that the output of ASR 36 be canonized into text before it is handled, and can therefore utilize all information available in the output of ASR 36. No information is lost in a canonization process, and words that are indexed can receive different and appropriate handling based on non-textual information, such as their confidence scores. The canonization stage 24 of the process illustrated in FIG. 1 is therefore entirely omitted, and the output of ASR 36 is available with more of its information in subsequent processing stages, including indexing, retrieval and display. Since non-textual information is available, document display system 38 in FIG. 2 displays different information than document display system 30 in FIG. 1.
  • Traditionally, speech data mining was confined to [0032] high quality audio 20 such as broadcast quality audio. Broadcast audio is of high quality, typically achieving 60-80% word recognition precision, and certainly better than 50%. By contrast, telephony grade data typically has 20-30% word recognition precision, and certainly worse than 50%. Traditional SDR systems do not utilize non-linguistic information during data mining (in all stages where documents are handled together and not separately). Attempts to use traditional text mining methodologies on low-grade audio data have proven inadequate, with the results not good enough for commercial use.
  • [0033] Data mining system 34 uses data mining targeted specifically at low-grade audio speech data, extracting useful information by ranking and retrieval utilizing non-textual data, which makes data mining of telephony speech quite feasible. In the embodiment illustrated in FIG. 2, data mining system 34 includes indexing which differentiates words using non-textual differences between the words. Upon receipt of user query 28, data mining system 34 uses the non-textual differences between the words for retrieval.
  • An alternative way to determine non-textual differences between words without using a specially created data mining system is illustrated in FIG. 3. In this scheme, the same segments of [0034] audio 32 are canonized several times using different canonization methods 24 a, . . . 24 n. These methods can differ, for example, in the choice of a minimal allowed confidence, with any word whose confidence is lower than that threshold being removed from the transcription in one way or another.
  • After applying the N [0035] different canonization schemes 24 a, . . . 24 n, the obtained text is indexed in N different indexes by text mining systems 26 a, . . . 26 n which may use the same indexing methods or different methods. Although not illustrated in FIG. 3, low grade audio 32 may additionally be processed by ASR 36 to produce output for data mining system(s) 34 that utilize(s) non-textual differences between words.
  • When user query [0036] 28 is received, text mining systems 26 a, . . . 26 n search indexes of standard text documents, result(s) are retrieved from each index, and finally all N results are algorithmically merged 42, providing a single output, which is then forwarded to document display system 38. The merged output, unlike the single outputs of the text mining systems, can differentiate results by confidence levels, and display system 38 can use this information.
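One way the merge might be sketched: retrieve result lists from indexes built at increasing confidence thresholds, and rank each document by the highest threshold at which it was still retrieved. This scoring rule is an illustrative assumption, not a scheme prescribed by the description:

```python
# Sketch: merge result lists returned by N indexes built from
# canonizations at different confidence thresholds.  A document still
# retrieved at a high threshold was matched by high-confidence words,
# so it is ranked first.  The scoring rule is an illustrative
# assumption.

def merge_results(results_per_threshold):
    """results_per_threshold: list of (threshold, [doc_id, ...]) pairs."""
    best = {}
    for threshold, doc_ids in results_per_threshold:
        for doc in doc_ids:
            best[doc] = max(best.get(doc, 0.0), threshold)
    # Sort by the highest threshold at which each document appeared.
    return sorted(best.items(), key=lambda item: -item[1])

merged = merge_results([
    (0.2, ["call_7", "call_3", "call_9"]),  # permissive index
    (0.5, ["call_7", "call_3"]),
    (0.8, ["call_7"]),                      # strict index
])
print(merged)  # call_7 ranks first, then call_3, then call_9
```

Unlike any single result list, the merged list carries a per-document confidence level that the display system can use.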
  • Either of the spoken document retrieval methods illustrated in FIGS. 2 and 3 can be implemented using systems like those illustrated in FIGS. 4 and 5. Such systems can be configured in many different ways, depending on the tasks to be performed and the volume of data processed; FIGS. 4 and 5 are just two examples. The invention is not limited to the configurations illustrated in FIGS. 4 and 5 and other configurations are possible and will be apparent to a person of ordinary skill in the art. For example, the functions performed by separate servers in FIGS. 4 and 5 may be performed by separate modules in a single computing system. [0037]
  • In the configuration illustrated in FIG. 4, a system according to the invention is used interactively offline. Voice data are supplied from voice acquisition module(s) [0038] 50 by network 52 and stored in data storage 54. Network 52 may be any known type of network, such as a local area network (LAN), wide area network (WAN), the Internet, etc. The voice data may be in the form of WAV files or any other audio file format.
  • In either of the configurations illustrated in FIGS. 4 and 5, the system is accessed by one or [0039] more user terminals 56, such as personal computers or other devices that include a user interface which may include a display. In the interactive offline system illustrated in FIG. 4, users log into the system at various times to submit queries to voice oriented information retrieval (VOIR) indexing server 58. Voice data from voice acquisition module(s) 50 are supplied to speech categorization server 60 which, if necessary, converts the data before supplying the voice data to LVCSR(s) 22 and performs load balancing when more than one LVCSR 22 is used.
  • LVCSR(s) [0040] 22 output words and additional data, such as speaker-change, timing information, confidence scores, etc. In addition, call metadata, such as the time that a call was made and the number dialed, is obtained from voice acquisition module(s) 50 together with the voice data. All these types of data are combined, e.g., by speech categorization server 60 and forwarded, in online mode to speech analysis server 62 and in offline mode to VOIR indexing server 58. Regardless of whether the method illustrated in FIG. 2 or FIG. 3 is implemented, results of a query in offline mode can be displayed on user terminals 56 with at least some of the non-textual differences between n-word sequences indicated. Examples of how the non-textual differences are conveyed to the user will be described below with reference to FIGS. 7 and 8.
  • The online configuration illustrated in FIG. 5 may be used when the volume of voice data is too large to allow effective offline processing, or it is desired to use push-technology alerts to people who may want the data. For example, a police inspector may want to be paged when the system detects a phone conversation relevant to her case. In the online configuration illustrated in FIG. 5, the output of LVCSR(s) [0041] 22 is supplied via network 52 to speech analysis server 62 which labels the voice data. For example, the voice data may be labeled according to importance, subject matter, person or group that needs to respond, etc. The labeling of the transcribed voice data is combined with the output of LVCSRs 22 and call metadata, and forwarded to categorization queue and workflow manager 64. The users at user terminals 56 are provided this information by categorization queue and workflow manager 64. Using the labeling provided by speech analysis server 62, categorization queue and workflow manager 64 supplies text, voice data and call metadata appropriate for that user, depending on importance, topic, identity of the user, etc.
  • Training of [0042] speech analysis server 62 may be accomplished by offline processing using VOIR indexing server 58 in an implementation that includes both servers 58 and 62. One or more users label calls by importance, subject matter, relevant person or group, etc. The labels assigned by users can be provided to speech analysis server 62 as training data to recognize similar calls during online processing of calls in a call center, for example. In addition, training may continue during online processing as users correct the labeling provided by speech analysis server 62. When all processing is offline, VOIR indexing server 58 is trained in a similar manner.
  • In a typical implementation of the invention, a [0043] single LVCSR 22 pass is sufficient for each call. If the method described above with reference to FIG. 2 is implemented, LVCSR 22 supplies metadata, including confidence scores, associated with recognized words to VOIR indexing server 58 which generates an index that indicates at least some of the non-textual differences between n-word sequences. If the method described above with reference to FIG. 3 is implemented, VOIR indexing server 58 maintains an index for each canonization system 24. In either case, the index(es) and the voice data (preferably compressed to minimize space requirements) or other data from which indexed text is obtained (such as handwritten documents) are preferably stored in data storage 54.
  • In the preferred embodiment, if the method illustrated in FIG. 2 is implemented, data in [0044] data storage 54 is indexed by use of at least one confidence sensitive inverted index. A confidence sensitive inverted index maps from terms to a sorted linked list identifying all documents where each term occurs and from each appearance of a document in this list to a sorted linked list identifying all positions in which the term appears and the confidence level of its recognition. In addition (or alternatively), indexed data may include aggregated information relating to confidence.
  • An example is illustrated in FIG. 6 of a confidence sensitive [0045] inverted index 65 and a regular (forward) index 66 containing confidence information with the two indexes 65, 66 referencing each other. In the mapping 67 from terms in the documents to a sorted linked list of documents 68, each appearance 69 of a term in the document can carry additional data, such as its position in the document, its timing information, recognition score of that appearance of the term, etc. Also, an expected number of real occurrences of the term (e.g., term i which points to mapping 67 i) in the indexed document (e.g., 68 a) can be calculated based on the individual recognition scores of the occurrences.
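The inverted-index structure just described might be sketched as follows; the class and method names are illustrative assumptions, and a plain list stands in for the sorted linked lists of the figure:

```python
from collections import defaultdict

# Sketch of a confidence-sensitive inverted index: each term maps to
# the documents it occurs in, and each occurrence carries its position
# and recognition confidence.  Class and method names are illustrative
# assumptions, and a plain list stands in for a sorted linked list.

class ConfidenceSensitiveIndex:
    def __init__(self):
        # term -> doc_id -> list of (position, confidence)
        self.postings = defaultdict(lambda: defaultdict(list))

    def add(self, term, doc_id, position, confidence):
        self.postings[term][doc_id].append((position, confidence))

    def expected_occurrences(self, term, doc_id):
        """Expected number of real occurrences of the term in the
        document, computed from the individual recognition scores."""
        return sum(conf for _pos, conf in self.postings[term][doc_id])

index = ConfidenceSensitiveIndex()
index.add("meeting", "call_1", position=4, confidence=0.9)
index.add("meeting", "call_1", position=17, confidence=0.4)
print(index.expected_occurrences("meeting", "call_1"))  # roughly 1.3
```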
  • Another example of aggregated information relating to confidence information that can be saved is the strength of association between every document and each category. This information can be saved either in a regular (forward) index, like [0046] index 66, another inverted index (not shown), or both. Information not relating to confidence, such as call metadata, can also be indexed, either in another inverted index, in a forward index like index 66, or both. In both cases, if an inverted index is used, confidence sensitive inverted index 65 or a separate index can be used. Furthermore, additional mapping technologies, in addition to or instead of a mapping into a sorted linked list, can also be used. Data storage 54 can also store information other than the indexes, such as the data that is being indexed. This data may include, among others, call audio, voice data and call metadata, and may include additional indexes used to refer to the same data.
  • A more detailed flow of processing through the system illustrated in FIGS. 4 and 5 is provided in FIG. 7. Online processing flow corresponding to the configuration illustrated in FIG. 5 is illustrated in FIG. 7 by solid lines, while offline processing flow corresponding to the configuration illustrated in FIG. 4 is illustrated by dash-dot lines. Low [0047] quality source data 32, such as recorded telephone conversations, supplied by voice acquisition module(s) 50, undergo text extraction 74 in LVCSR(s) 22 controlled by speech categorization server 60. In offline mode, the results 76, which may include text, confidence scores, timing information and text alternative lattice information (potentially, other information, as well), undergo indexing 78 in VOIR indexing server 58 and are stored in data storage 54. In the online mode, results 76 are supplied to speech analysis server 62 which may perform labeling 80 of the calls, as described above. Data from VOIR indexing server 58 are used for category training 82, so that the categorization 84 can later be used in either online or offline mode.
  • One embodiment of the system illustrated in FIG. 4 is used to process recorded telephone conversations at a call center by automatically generating transcriptions of the conversations. In this embodiment, offline ad-hoc querying [0048] 86 (FIG. 7) utilizes categorization 84 or rule-based keyword spotting 88 to obtain information 90 related to at least one of the documents, including at least some of the non-textual differences between n-word sequences that may be displayed on user terminal(s) 56 in the format illustrated in FIG. 8 or 9. The display illustrated in FIG. 8 provides an example of user input keywords 102 “call OR meeting” that have been found in 170 documents, eight of which are displayed on screen in FIG. 8. Preferably, the documents may be listed in a table 104 in an order based in part on the confidence of accuracy of the keywords displayed in the list. In the example illustrated in FIGS. 8 and 9, table 104 includes call metadata, such as start time.
  • In the example illustrated in FIG. 8, a [0049] waveform 106 of a portion of the seventh document (indicated as selected by shading in table 104) is displayed in the lower portion of the screen with indications of when the keywords were detected. Below the waveform is the text 108 recognized by LVCSR(s) 22. Preferably, text 108 indicates the recognition confidence of the words and the salient terms listed in the query using one or more of highlighting, underlining, color or shade, size and style of fonts. Also shown in the example illustrated in FIGS. 8 and 9 are labels of the conversation, such as “Technical” and “Incomplete” which follow the “Categories” 116 and appear in the column under “Contact Related To” in table 104, along with similar category information. Confidence of these labels is also indicated.
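One way such confidence-dependent rendering might be sketched is to map each word's recognition confidence to a font weight and gray level when generating the display markup. The thresholds and styles below are illustrative assumptions, not values taken from the description:

```python
# Sketch: render transcribed words as HTML, indicating recognition
# confidence through font weight and gray level.  The confidence
# thresholds and CSS styles are illustrative assumptions.

def style_for(confidence):
    if confidence >= 0.7:
        return "font-weight:bold;color:#000"   # high confidence
    if confidence >= 0.4:
        return "color:#555"                    # medium confidence
    return "color:#aaa;font-style:italic"      # low confidence

def render(words):
    """words: list of (word, confidence) pairs -> styled HTML string."""
    return " ".join(
        '<span style="%s">%s</span>' % (style_for(conf), word)
        for word, conf in words)

html = render([("call", 0.9), ("me", 0.5), ("later", 0.2)])
print(html)  # three spans, each styled by its word's confidence
```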
  • In one embodiment of the invention, a user may listen to the entire recording by using a pointing device, such as a computer mouse, to select a row in table [0050] 104 corresponding to the recording, or may listen to just the segments of audio corresponding to transcribed salient terms by selecting the speaker icon under the word “Play” on the row. Once a row has been selected, a user may select one of the words, such as “call” in a user-selectable speech bubble 110 associated with the waveform, or in the adjoining text, to skip directly to the point in a conversation where the word was said. A pointer 112 below the waveform 106 indicates what sound is being played back to the user and a vertical cursor 114 indicates what word was recognized for the associated sound.
  • Preferably, user terminal(s) [0051] 56 can also be used to graphically display results, e.g., content information 90, indicating parameters extracted from the documents. Examples illustrated in FIG. 9 are bar graphs 118, 120. In FIG. 9, left bar graph 118 shows the number of calls matching a query based on call date, while right graph 120 shows the relations of several categories to the user query.
  • The present invention has been described with respect to embodiments using text documents generated from telephone calls. However, as noted above, the invention is not limited to texts generated in this manner and can also be applied to text obtained in other ways, such as from fact extraction systems. Furthermore, the present invention can be used with any system for processing documents that derive from at least one of spontaneous and conversational expression which outputs non-deterministic text with average word recognition precision below 50 percent. [0052]
  • The many features and advantages of the invention are apparent from the detailed specification and, thus, it is intended by the appended claims to cover all such features and advantages of the invention that fall within the true spirit and scope of the invention. Further, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the invention to the exact construction and operation illustrated and described, and accordingly all suitable modifications and equivalents may be resorted to, falling within the scope of the invention. [0053]

Claims (105)

What is claimed is:
1. A document processing method, comprising:
processing documents derived from at least one of spontaneous and conversational expression and containing non-deterministic text with average word recognition precision below 50 percent, said processing utilizing non-textual differences between n-word sequences in the documents to resolve more than two decision options, where n is a positive integer.
2. A method as recited in claim 1, wherein said processing includes data mining of the documents.
3. A method as recited in claim 2, wherein said data mining includes retrieving at least one of the documents utilizing the non-textual differences between the n-word sequences in the documents.
4. A method as recited in claim 2, wherein said data mining includes extracting parameters from the documents, utilizing the non-textual differences between said n-word sequences.
5. A method as recited in claim 4, wherein said data mining further includes producing graphic results indicating the relations between the parameters extracted from the documents.
6. A method as recited in claim 4, wherein at least one of the parameters extracted from the documents is an assessment of relevance to a query based on the non-textual differences between the n-word sequences.
7. A method as recited in claim 4, wherein at least one of the extracted parameters is an assessment of a hidden variable that cannot be fully determined from information existing in the document.
8. A method as recited in claim 4, wherein at least one of the extracted parameters is an assessment of the document's relevance to a category.
9. A method as recited in claim 2, wherein said processing includes categorizing the documents.
10. A method as recited in claim 9, wherein said categorizing includes use of at least one algorithm to detect salient terms in the documents based on non-linguistic differences between the n-word sequences.
11. A method as recited in claim 2, further comprising clustering the documents.
12. A method as recited in claim 11, wherein said clustering includes discovering salient terms in the documents based on non-linguistic differences between the n-word sequences.
13. A method as recited in claim 11, wherein said clustering includes assessing a relation between the n-word sequences based on non-textual differences.
14. A method as recited in claim 4, wherein said data mining includes establishing relations between the parameters extracted from the documents.
15. A method as recited in claim 1, wherein the non-textual differences between the n-word sequences relate to recognition confidence of the n-word sequences.
16. A method as recited in claim 1, further comprising at least one of classifying and filtering the documents as the documents are received.
17. A method as recited in claim 1, further comprising labeling the documents as the documents are received.
18. A method as recited in claim 1, further comprising displaying information related to at least one of the documents, including at least some of the non-textual differences between the n-word sequences.
19. A method as recited in claim 18, wherein said displaying uses at least one of gray scaling, color, font-size and font style to indicate at least some of the non-textual differences between the n-word sequences.
20. A method as recited in claim 18, wherein said displaying selectively displays portions of the at least one of the documents based on confidence of accuracy of words displayed.
21. A method as recited in claim 18, wherein said displaying further displays salient terms in the at least one of the documents based on said processing of confidence levels of the salient terms that resolves more than two decision options.
22. A method as recited in claim 21, wherein a number of the salient terms are available for display and said displaying is further based on the number of the salient terms available for display and available space for display of the salient terms.
23. A method as recited in claim 1, further comprising:
receiving user input indicating errors in recognition; and
replacing at least one word in the document with a corrected word based on the user input and setting the confidence levels of the corrected word to indicate high recognition accuracy.
24. A method as recited in claim 1, further comprising generating the documents by automatic speech recognition of audio signals received via a telephone system.
25. A method as recited in claim 1, further comprising generating the documents by automatic character recognition.
26. A method as recited in claim 1, further comprising generating the documents by a fact extraction system.
27. A method as recited in claim 1, wherein said processing includes
applying different data mining techniques, each of which does not indicate non-textual differences; and
merging results of the different data mining techniques to obtain results that are dependent on the non-textual differences between the n-word sequences.
28. A method as recited in claim 27, wherein the different data mining techniques include at least one of retrieving, categorizing, filtering, classifying, labeling and clustering documents without utilization of any non-textual differences between the n-word sequences.
29. A method as recited in claim 27,
wherein said applying uses a plurality of different algorithms to transform non-deterministic text into standard text documents usable in text mining, and
wherein the data mining techniques operate on the standard text documents.
30. A method as recited in claim 29,
wherein said processing further includes generating a plurality of indexes of the standard text documents, and
wherein the data mining techniques operate on the indexes to obtain the results.
31. A method as recited in claim 30, wherein the data mining techniques include
receiving a query; and
retrieving the results relevant to the query.
32. A method as recited in claim 30, wherein the data mining of at least some of the different indexes is performed by data mining software that does not output non-textual differences.
33. A method as recited in claim 29, wherein the different algorithms are thresholding algorithms using different confidence thresholds to determine omitted words that fall below the confidence thresholds.
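Claims 29 and 33 describe transforming non-deterministic text into several standard text documents by omitting words that fall below different confidence thresholds. A minimal sketch of such a thresholding transform follows; the threshold values, function names, and the (word, confidence) pair representation are assumed for illustration only:

```python
def threshold_transform(words, threshold):
    """Produce one standard text document, keeping only words whose
    confidence is at or above the given threshold (omitted words are
    those that fall below it)."""
    return " ".join(word for word, conf in words if conf >= threshold)

def build_variants(words, thresholds=(0.2, 0.5, 0.8)):
    """Apply several thresholding algorithms with different confidence
    thresholds, yielding a family of standard text documents on which
    ordinary (confidence-unaware) text mining can then operate."""
    return {t: threshold_transform(words, t) for t in thresholds}
```

Each resulting variant is ordinary text, so stock data mining software that does not understand confidence can index it; merging results across the variants then recovers confidence-dependent behavior, in the spirit of claims 27-30.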
34. A method as recited in claim 1, further comprising:
receiving user input indicating a change in labeling of at least one document; and
replacing at least part of information provided by at least one label for the at least one document based on the user input.
35. A document processing method, comprising:
producing at least one index of n-word sequences in documents derived from at least one of spontaneous and conversational expression and containing non-deterministic text with average word recognition precision below 50 percent, utilizing non-textual differences between the n-word sequences, where n is a positive integer; and
processing the documents based on the non-textual differences between the n-word sequences in the at least one index, where said processing resolves more than two decision options.
36. A method as recited in claim 35, wherein the non-textual differences between the n-word sequences relate to recognition confidence of the n-word sequences.
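Claims 35 and 36 recite producing an index of n-word sequences that retains recognition-confidence information. One illustrative way to build such an index is sketched below; taking a sequence's confidence as that of its weakest word, and the posting-list layout, are assumptions for the sketch, not claim elements:

```python
from collections import defaultdict

def index_ngrams(doc_id, words, n=2):
    """Index n-word sequences together with a per-sequence confidence.

    words: list of (word, confidence) pairs from a non-deterministic text.
    Returns a mapping from each n-gram to postings of
    (document id, position, sequence confidence).
    """
    index = defaultdict(list)
    for i in range(len(words) - n + 1):
        window = words[i:i + n]
        gram = tuple(word for word, _ in window)
        conf = min(c for _, c in window)  # weakest word bounds the sequence
        index[gram].append((doc_id, i, conf))
    return index
```

Because each posting carries a confidence, later processing of the documents can be based on the non-textual differences between the indexed sequences, as the claims require.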
37. At least one computer readable medium storing instructions for controlling at least one computer system to perform a document processing method comprising:
processing documents derived from at least one of spontaneous and conversational expression and containing non-deterministic text with average word recognition precision below 50 percent, said processing utilizing non-textual differences between n-word sequences in the documents, where n is a positive integer and said processing resolves more than two decision options.
38. At least one computer readable medium as recited in claim 37, wherein said processing includes data mining of the documents.
39. At least one computer readable medium as recited in claim 38, wherein said data mining includes retrieving at least one of the documents utilizing the non-textual differences between the n-word sequences in the documents.
40. At least one computer readable medium as recited in claim 38, wherein said data mining includes
extracting parameters from the documents, utilizing the non-textual differences between said n-word sequences; and
establishing relations between the parameters extracted from the documents.
41. At least one computer readable medium as recited in claim 40, wherein said data mining further includes producing graphic results indicating the relations between the parameters extracted from the documents.
42. At least one computer readable medium as recited in claim 40, wherein at least one of the parameters extracted from the documents is an assessment of relevance to a query based on the non-textual differences between the n-word sequences.
43. At least one computer readable medium as recited in claim 40, wherein at least one of the extracted parameters is an assessment of a hidden variable that cannot be fully determined from information existing in the document.
44. At least one computer readable medium as recited in claim 40, wherein at least one of the extracted parameters is the assessment of the document's relevance to a category.
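Claims 40-44 recite extracting parameters from the documents, such as an assessment of relevance to a query based on the non-textual differences. A hypothetical confidence-weighted relevance score might look like the following; the scoring formula and names are illustrative assumptions:

```python
def extract_parameters(doc_words, query_terms):
    """Confidence-weighted assessment of a document's relevance to a query.

    doc_words: list of (word, confidence) pairs for one document.
    query_terms: set of query words.
    Returns the fraction of the document's total confidence mass that
    falls on query terms, so uncertain matches count for less.
    """
    hit = sum(conf for word, conf in doc_words if word in query_terms)
    total = sum(conf for _, conf in doc_words) or 1.0
    return hit / total
```

Unlike a plain term count, this parameter changes when only the confidences change, making it an assessment that genuinely depends on the non-textual differences between word sequences.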
45. At least one computer readable medium as recited in claim 38, wherein said processing includes categorizing the documents.
46. At least one computer readable medium as recited in claim 45, wherein said categorizing includes use of at least one algorithm to detect salient terms in the documents based on non-linguistic differences between the n-word sequences.
47. At least one computer readable medium as recited in claim 38, further comprising clustering the documents.
48. At least one computer readable medium as recited in claim 47, wherein said clustering includes discovering salient terms in the documents based on non-linguistic differences between the n-word sequences.
49. At least one computer readable medium as recited in claim 47, wherein said clustering includes assessing a relation between the n-word sequences based on non-textual differences.
50. At least one computer readable medium as recited in claim 37, wherein the non-textual differences between the n-word sequences relate to recognition confidence of the n-word sequences.
51. At least one computer readable medium as recited in claim 37, further comprising at least one of classifying and filtering the documents as the documents are received.
52. At least one computer readable medium as recited in claim 37, further comprising labeling the documents as the documents are received.
53. At least one computer readable medium as recited in claim 37, further comprising displaying information related to at least one of the documents, including at least some of the non-textual differences between the n-word sequences.
54. At least one computer readable medium as recited in claim 53, wherein said displaying uses at least one of gray scaling, color, font-size and font style to indicate at least some of the non-textual differences between the n-word sequences.
55. At least one computer readable medium as recited in claim 53, wherein said displaying selectively displays portions of the at least one of the documents based on confidence of accuracy of words displayed.
56. At least one computer readable medium as recited in claim 53, wherein said displaying further displays salient terms in the at least one of the documents based on said processing of confidence levels of the salient terms that resolves more than two decision options.
57. At least one computer readable medium as recited in claim 56, wherein a number of the salient terms are available for display and said displaying is further based on the number of the salient terms available for display and available space for display of the salient terms.
58. At least one computer readable medium as recited in claim 37, further comprising:
receiving user input indicating errors in recognition; and
replacing at least one word in the document with a corrected word based on the user input and setting the confidence levels of the corrected word to indicate high recognition accuracy.
59. At least one computer readable medium as recited in claim 37, further comprising generating the documents by automatic speech recognition of audio signals received via a telephone system.
60. At least one computer readable medium as recited in claim 37, further comprising generating the documents by automatic character recognition.
61. At least one computer readable medium as recited in claim 37, further comprising generating the documents by a fact extraction system.
62. At least one computer readable medium as recited in claim 37, wherein said processing includes
applying different data mining techniques, each of which does not indicate non-textual differences; and
merging results of the different data mining techniques to obtain the non-textual differences between the n-word sequences.
63. At least one computer readable medium as recited in claim 62, wherein the different data mining techniques include at least one of retrieving, categorizing, filtering, classifying, labeling and clustering documents without utilization of any non-textual differences between the n-word sequences.
64. At least one computer readable medium as recited in claim 62,
wherein said applying uses a plurality of different algorithms to transform non-deterministic text into standard text documents usable in text mining, and
wherein the data mining techniques operate on the standard text documents.
65. At least one computer readable medium as recited in claim 64,
wherein said processing further includes generating a plurality of indexes of the standard text documents, and
wherein the data mining techniques operate on the indexes to obtain the results.
66. At least one computer readable medium as recited in claim 65, wherein the data mining techniques include
receiving a query; and
retrieving the results relevant to the query.
67. At least one computer readable medium as recited in claim 66, wherein the data mining of at least some of the different indexes is performed by data mining software that does not output non-textual differences.
68. At least one computer readable medium as recited in claim 64, wherein the different algorithms are thresholding algorithms using different confidence thresholds to determine omitted words that fall below the confidence thresholds.
69. At least one computer readable medium as recited in claim 37, further comprising:
receiving user input indicating a change in labeling of at least one document; and
replacing at least part of information provided by at least one label for the at least one document based on the user input.
70. At least one computer readable medium for controlling at least one computer system to perform a document processing method, comprising:
producing at least one index of n-word sequences in documents derived from at least one of spontaneous and conversational expression and containing non-deterministic text with average word recognition precision below 50 percent, utilizing non-textual differences between the n-word sequences, where n is a positive integer; and
processing the documents based on the non-textual differences between the n-word sequences in the at least one index, where said processing resolves more than two decision options.
71. At least one computer readable medium as recited in claim 70, wherein the non-textual differences between the n-word sequences relate to recognition confidence of the n-word sequences.
72. An apparatus for processing documents, comprising:
processing means for processing documents derived from at least one of spontaneous and conversational expression and containing non-deterministic text with average word recognition precision below 50 percent, said processing utilizing non-textual differences between n-word sequences in the documents, where n is a positive integer and said processing resolves more than two decision options.
73. An apparatus as recited in claim 72, wherein said processing means comprises index means for producing at least one index of the n-word sequences utilizing the non-textual differences between the n-word sequences.
74. An apparatus as recited in claim 73, wherein said processing means comprises data mining means for retrieving at least one of the documents utilizing the at least one index.
75. An apparatus as recited in claim 74,
wherein said data mining means comprises:
parameter extraction means for extracting parameters from the documents, utilizing the non-textual differences between said n-word sequences; and
relations establishment means for establishing relations between the parameters extracted from the documents, and
wherein said apparatus further comprises display means for producing graphic results indicating the relations between the parameters extracted from the documents.
76. An apparatus as recited in claim 75, wherein at least one of the extracted parameters is an assessment of a hidden variable that cannot be fully determined from information existing in the at least one of the documents.
77. An apparatus as recited in claim 72, wherein the non-textual differences between the n-word sequences relate to recognition confidence of the n-word sequences.
78. An apparatus as recited in claim 72, wherein said processing means comprises categorizing means for categorizing the documents utilizing at least one algorithm based on non-linguistic differences between the n-word sequences.
79. An apparatus as recited in claim 72, wherein said processing means comprises clustering means for clustering the documents by assessing a relation between the n-word sequences based on non-textual differences.
80. An apparatus as recited in claim 72, wherein said processing means comprises means for at least one of classifying and filtering the documents as the documents are received.
81. An apparatus as recited in claim 72, further comprising display means for displaying information related to at least one of the documents, including at least some of the non-textual differences between the n-word sequences.
82. An apparatus as recited in claim 81, wherein said display means selectively displays portions of the at least one of the documents based on confidence of accuracy of words displayed.
83. An apparatus as recited in claim 72,
further comprising input means for receiving user input indicating errors in recognition, and
wherein said processing means comprises means for replacing at least one word in the at least one of the documents with a corrected word based on the user input and setting the confidence levels of the corrected word to indicate high recognition accuracy.
84. An apparatus as recited in claim 72, coupled to a telephone system and further comprising automatic speech recognition means for generating the documents by automatic speech recognition of audio signals received via the telephone system.
85. An apparatus as recited in claim 72, further comprising automatic character recognition means for generating the documents by automatic character recognition.
86. An apparatus as recited in claim 72, wherein said processing means comprises:
data mining means for applying different data mining techniques, each of which does not indicate non-textual differences; and
merge means for merging results of the different data mining techniques to obtain the non-textual differences between the n-word sequences.
87. An apparatus as recited in claim 86, wherein said data mining means includes means for at least one of retrieving, categorizing, filtering, classifying, labeling and clustering documents without utilization of any non-textual differences between the n-word sequences.
88. An apparatus as recited in claim 87, wherein said data mining means uses a plurality of different algorithms to transform non-deterministic text into standard text documents usable in text mining and the data mining techniques operate on the standard text documents.
89. An apparatus as recited in claim 87,
further comprising indexing means for generating a plurality of indexes of the standard text documents, and
wherein said data mining means uses the different indexes in applying the different data mining techniques.
90. An apparatus as recited in claim 89,
further comprising input means for receiving a query; and
wherein said data mining means further includes retrieving means for retrieving the results relevant to the query.
91. A data processing system, comprising:
at least one server to process documents, derived from at least one of spontaneous and conversational expression and containing non-deterministic text with word recognition precision of less than 50 percent, utilizing non-textual differences between n-word sequences, where n is a positive integer.
92. A data processing system as recited in claim 91, wherein said at least one server includes an indexing server producing at least one index of the n-word sequences utilizing the non-textual differences between the n-word sequences.
93. A data processing system as recited in claim 92, wherein said indexing server retrieves at least one of the documents utilizing data mining of the at least one index.
94. A data processing system as recited in claim 91,
wherein said at least one server extracts parameters from the documents, utilizing the non-textual differences between said n-word sequences, and establishes relations between the parameters extracted from the documents, and
wherein said data processing system further comprises at least one display device producing graphic results indicating the relations between the parameters extracted from the documents.
95. A data processing system as recited in claim 94, wherein at least one of the extracted parameters is an assessment of a hidden variable that cannot be fully determined from information existing in the at least one of the documents.
96. A data processing system as recited in claim 91, wherein the non-textual differences between the n-word sequences relate to recognition confidence of the n-word sequences.
97. A data processing system as recited in claim 91, further comprising at least one display device displaying information related to at least one of the documents, including at least some of the non-textual differences between the n-word sequences.
98. A data processing system as recited in claim 97, wherein said at least one display device selectively displays portions of at least one of the documents based on confidence of accuracy of words displayed.
99. A data processing system as recited in claim 91, wherein said at least one server applies different data mining techniques, each of which does not indicate non-textual differences and merges results of the different data mining techniques to obtain the non-textual differences between the n-word sequences.
100. A data processing system as recited in claim 99, wherein said at least one server uses a plurality of different algorithms to transform non-deterministic text into standard text documents usable in text mining and the data mining techniques operate on the standard text documents.
101. A data processing system as recited in claim 100, wherein said at least one server generates a plurality of indexes of the standard text documents and uses the different indexes in applying the different data mining techniques.
102. A data processing system as recited in claim 99, wherein the different data mining techniques include at least one of retrieving, categorizing, filtering, classifying, labeling and clustering documents without utilization of any non-textual differences between the n-word sequences.
103. A data processing system as recited in claim 102, wherein said at least one server uses a plurality of different algorithms to transform non-deterministic text into standard text documents usable in text mining and the data mining techniques operate on the standard text documents.
104. A data processing system as recited in claim 91,
further comprising at least one user terminal providing user input indicating errors in recognition in a document, and
wherein said at least one server replaces at least one word in the document with a corrected word based on the user input and sets confidence levels of the corrected word to indicate high recognition accuracy.
105. A data processing system as recited in claim 91, further comprising at least one of an automatic speech recognition unit, an automatic character recognition unit and a fact extraction unit to generate the documents from data that on average produces word recognition precision of less than 50 percent.
US10/771,315 2003-02-05 2004-02-05 Method for automatic and semi-automatic classification and clustering of non-deterministic texts Abandoned US20040163035A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/771,315 US20040163035A1 (en) 2003-02-05 2004-02-05 Method for automatic and semi-automatic classification and clustering of non-deterministic texts

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US44498203P 2003-02-05 2003-02-05
US10/771,315 US20040163035A1 (en) 2003-02-05 2004-02-05 Method for automatic and semi-automatic classification and clustering of non-deterministic texts

Publications (1)

Publication Number Publication Date
US20040163035A1 true US20040163035A1 (en) 2004-08-19

Family

ID=32869303

Family Applications (4)

Application Number Title Priority Date Filing Date
US10/771,315 Abandoned US20040163035A1 (en) 2003-02-05 2004-02-05 Method for automatic and semi-automatic classification and clustering of non-deterministic texts
US10/771,409 Active 2027-03-11 US7792671B2 (en) 2003-02-05 2004-02-05 Augmentation and calibration of output from non-deterministic text generators by modeling its characteristics in specific environments
US12/059,660 Abandoned US20080183468A1 (en) 2003-02-05 2008-03-31 Augmentation and calibration of output from non-deterministic text generators by modeling its characteristics in specific environments
US12/876,207 Expired - Lifetime US8195459B1 (en) 2003-02-05 2010-09-06 Augmentation and calibration of output from non-deterministic text generators by modeling its characteristics in specific environments

Family Applications After (3)

Application Number Title Priority Date Filing Date
US10/771,409 Active 2027-03-11 US7792671B2 (en) 2003-02-05 2004-02-05 Augmentation and calibration of output from non-deterministic text generators by modeling its characteristics in specific environments
US12/059,660 Abandoned US20080183468A1 (en) 2003-02-05 2008-03-31 Augmentation and calibration of output from non-deterministic text generators by modeling its characteristics in specific environments
US12/876,207 Expired - Lifetime US8195459B1 (en) 2003-02-05 2010-09-06 Augmentation and calibration of output from non-deterministic text generators by modeling its characteristics in specific environments

Country Status (4)

Country Link
US (4) US20040163035A1 (en)
EP (2) EP1590798A2 (en)
IL (1) IL170065A (en)
WO (2) WO2004072780A2 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060067578A1 (en) * 2004-09-30 2006-03-30 Fuji Xerox Co., Ltd. Slide contents processor, slide contents processing method, and storage medium storing program
US20070050445A1 (en) * 2005-08-31 2007-03-01 Hugh Hyndman Internet content analysis
US20080027888A1 (en) * 2006-07-31 2008-01-31 Microsoft Corporation Optimization of fact extraction using a multi-stage approach
US20090012970A1 (en) * 2007-07-02 2009-01-08 Dror Daniel Ziv Root cause analysis using interactive data categorization
US20090248647A1 (en) * 2008-03-25 2009-10-01 Omer Ziv System and method for the quality assessment of queries
US20090249253A1 (en) * 2008-03-31 2009-10-01 Palm, Inc. Displaying mnemonic abbreviations for commands
US20110072052A1 (en) * 2008-05-28 2011-03-24 Aptima Inc. Systems and methods for analyzing entity profiles
US8725732B1 (en) * 2009-03-13 2014-05-13 Google Inc. Classifying text into hierarchical categories
US20210312123A1 (en) * 2020-04-03 2021-10-07 Jon Ward Systems and Methods For Cloud-Based Productivity Tools
US11244011B2 (en) * 2015-10-23 2022-02-08 International Business Machines Corporation Ingestion planning for complex tables
US20230028717A1 (en) * 2020-08-27 2023-01-26 Capital One Services, Llc Representing Confidence in Natural Language Processing

Families Citing this family (15)

Publication number Priority date Publication date Assignee Title
EP1590798A2 (en) * 2003-02-05 2005-11-02 Verint Systems Inc. Method for automatic and semi-automatic classification and clustering of non-deterministic texts
US7856355B2 (en) * 2005-07-05 2010-12-21 Alcatel-Lucent Usa Inc. Speech quality assessment method and system
US20070078806A1 (en) * 2005-10-05 2007-04-05 Hinickle Judith A Method and apparatus for evaluating the accuracy of transcribed documents and other documents
US20100027768A1 (en) * 2006-11-03 2010-02-04 Foskett James J Aviation text and voice communication system
US8126891B2 (en) * 2008-10-21 2012-02-28 Microsoft Corporation Future data event prediction using a generative model
US8379801B2 (en) * 2009-11-24 2013-02-19 Sorenson Communications, Inc. Methods and systems related to text caption error correction
US9070360B2 (en) * 2009-12-10 2015-06-30 Microsoft Technology Licensing, Llc Confidence calibration in automatic speech recognition systems
US8930189B2 (en) 2011-10-28 2015-01-06 Microsoft Corporation Distributed user input to text generated by a speech to text transcription service
US9870520B1 (en) * 2013-08-02 2018-01-16 Intuit Inc. Iterative process for optimizing optical character recognition
FR3010809B1 (en) * 2013-09-18 2017-05-19 Airbus Operations Sas Method and device for automatic management, on board an aircraft, of aircraft audio messages.
DE112015001468T5 (en) * 2014-03-27 2016-12-15 Sony Corporation Electronic device and method for identifying input commands of a user
US9858923B2 (en) * 2015-09-24 2018-01-02 Intel Corporation Dynamic adaptation of language models and semantic tracking for automatic speech recognition
CN108777141B (en) * 2018-05-31 2022-01-25 康键信息技术(深圳)有限公司 Test apparatus, test method, and storage medium
CN110110303A (en) * 2019-03-28 2019-08-09 苏州八叉树智能科技有限公司 Newsletter archive generation method, device, electronic equipment and computer-readable medium
CN111581455B (en) * 2020-04-28 2023-03-21 北京字节跳动网络技术有限公司 Text generation model generation method and device and electronic equipment

Citations (5)

Publication number Priority date Publication date Assignee Title
US5625748A (en) * 1994-04-18 1997-04-29 Bbn Corporation Topic discriminator using posterior probability or confidence scores
US6397181B1 (en) * 1999-01-27 2002-05-28 Kent Ridge Digital Labs Method and apparatus for voice annotation and retrieval of multimedia data
US20020178002A1 (en) * 2001-05-24 2002-11-28 International Business Machines Corporation System and method for searching, analyzing and displaying text transcripts of speech after imperfect speech recognition
US6598054B2 (en) * 1999-01-26 2003-07-22 Xerox Corporation System and method for clustering data objects in a collection
US20040083101A1 (en) * 2002-10-23 2004-04-29 International Business Machines Corporation System and method for data mining of contextual conversations

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US5550930A (en) * 1991-06-17 1996-08-27 Microsoft Corporation Method and system for training a handwriting recognizer at the time of misrecognition
GB9709341D0 (en) * 1997-05-08 1997-06-25 British Broadcasting Corp Method of and apparatus for editing audio or audio-visual recordings
AU2001245927A1 (en) 2000-03-24 2001-10-08 Dragon Systems, Inc. Lexical analysis of telephone conversations with call center agents
US6839667B2 (en) * 2001-05-16 2005-01-04 International Business Machines Corporation Method of speech recognition by presenting N-best word candidates
US6963834B2 (en) 2001-05-29 2005-11-08 International Business Machines Corporation Method of speech recognition using empirically determined word candidates
EP1590798A2 (en) * 2003-02-05 2005-11-02 Verint Systems Inc. Method for automatic and semi-automatic classification and clustering of non-deterministic texts

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
US5625748A (en) * 1994-04-18 1997-04-29 Bbn Corporation Topic discriminator using posterior probability or confidence scores
US6598054B2 (en) * 1999-01-26 2003-07-22 Xerox Corporation System and method for clustering data objects in a collection
US6397181B1 (en) * 1999-01-27 2002-05-28 Kent Ridge Digital Labs Method and apparatus for voice annotation and retrieval of multimedia data
US20020178002A1 (en) * 2001-05-24 2002-11-28 International Business Machines Corporation System and method for searching, analyzing and displaying text transcripts of speech after imperfect speech recognition
US20040083101A1 (en) * 2002-10-23 2004-04-29 International Business Machines Corporation System and method for data mining of contextual conversations

Cited By (20)

Publication number Priority date Publication date Assignee Title
US20060067578A1 (en) * 2004-09-30 2006-03-30 Fuji Xerox Co., Ltd. Slide contents processor, slide contents processing method, and storage medium storing program
US7698645B2 (en) * 2004-09-30 2010-04-13 Fuji Xerox Co., Ltd. Presentation slide contents processor for categorizing presentation slides and method for processing and categorizing slide contents
US20070050445A1 (en) * 2005-08-31 2007-03-01 Hugh Hyndman Internet content analysis
US20080027888A1 (en) * 2006-07-31 2008-01-31 Microsoft Corporation Optimization of fact extraction using a multi-stage approach
US7668791B2 (en) 2006-07-31 2010-02-23 Microsoft Corporation Distinguishing facts from opinions using a multi-stage approach
US9015194B2 (en) * 2007-07-02 2015-04-21 Verint Systems Inc. Root cause analysis using interactive data categorization
US20090012970A1 (en) * 2007-07-02 2009-01-08 Dror Daniel Ziv Root cause analysis using interactive data categorization
US20090248647A1 (en) * 2008-03-25 2009-10-01 Omer Ziv System and method for the quality assessment of queries
US20090249253A1 (en) * 2008-03-31 2009-10-01 Palm, Inc. Displaying mnemonic abbreviations for commands
US9053088B2 (en) * 2008-03-31 2015-06-09 Qualcomm Incorporated Displaying mnemonic abbreviations for commands
US20110072052A1 (en) * 2008-05-28 2011-03-24 Aptima Inc. Systems and methods for analyzing entity profiles
US9123022B2 (en) 2008-05-28 2015-09-01 Aptima, Inc. Systems and methods for analyzing entity profiles
US9594825B2 (en) 2008-05-28 2017-03-14 Aptima, Inc. Systems and methods for analyzing entity profiles
US11461373B2 (en) 2008-05-28 2022-10-04 Aptima, Inc. Systems and methods for analyzing entity profiles
US8725732B1 (en) * 2009-03-13 2014-05-13 Google Inc. Classifying text into hierarchical categories
US11244011B2 (en) * 2015-10-23 2022-02-08 International Business Machines Corporation Ingestion planning for complex tables
US20210312123A1 (en) * 2020-04-03 2021-10-07 Jon Ward Systems and Methods For Cloud-Based Productivity Tools
US11687710B2 (en) * 2020-04-03 2023-06-27 Braincat, Inc. Systems and methods for cloud-based productivity tools
US20230028717A1 (en) * 2020-08-27 2023-01-26 Capital One Services, Llc Representing Confidence in Natural Language Processing
US11720753B2 (en) * 2020-08-27 2023-08-08 Capital One Services, Llc Representing confidence in natural language processing

Also Published As

Publication number Publication date
EP1590796A1 (en) 2005-11-02
EP1590798A2 (en) 2005-11-02
US7792671B2 (en) 2010-09-07
WO2004072780A2 (en) 2004-08-26
US20040158469A1 (en) 2004-08-12
WO2004072955A1 (en) 2004-08-26
IL170065A (en) 2013-02-28
US20080183468A1 (en) 2008-07-31
WO2004072780A3 (en) 2004-11-11
US8195459B1 (en) 2012-06-05

Similar Documents

Publication Publication Date Title
US20040163035A1 (en) Method for automatic and semi-automatic classification and clustering of non-deterministic texts
US10431214B2 (en) System and method of determining a domain and/or an action related to a natural language input
CN108197282B (en) File data classification method and device, terminal, server and storage medium
US11182435B2 (en) Model generation device, text search device, model generation method, text search method, data structure, and program
US7272558B1 (en) Speech recognition training method for audio and video file indexing on a search engine
US9229974B1 (en) Classifying queries
US8239189B2 (en) Method and system for estimating a sentiment for an entity
US7415409B2 (en) Method to train the language model of a speech recognition system to convert and index voicemails on a search engine
CA2638558C (en) Topic word generation method and system
US20100070263A1 (en) Speech data retrieving web site system
CN111046133A (en) Question-answering method, question-answering equipment, storage medium and device based on atlas knowledge base
US20040249808A1 (en) Query expansion using query logs
CN105956053B (en) A kind of searching method and device based on the network information
JP2013521567A (en) System including client computing device, method of tagging media objects, and method of searching a digital database including audio tagged media objects
CN112800170A (en) Question matching method and device and question reply method and device
CN107748784B (en) Method for realizing structured data search through natural language
CN109446376B (en) Method and system for classifying voice through word segmentation
US20220058213A1 (en) Systems and methods for identifying dynamic types in voice queries
WO2006118360A1 (en) Issue trend analysis system
CN111881283A (en) Business keyword library creating method, intelligent chat guiding method and device
CN111125457A (en) Deep cross-modal Hash retrieval method and device
CN108038099B (en) Low-frequency keyword identification method based on word clustering
US20230214579A1 (en) Intelligent character correction and search in documents
CN111488453B (en) Resource grading method, device, equipment and storage medium
JP6576847B2 (en) Analysis system, analysis method, and analysis program

Legal Events

Date Code Title Description
AS Assignment

Owner name: VERINT SYSTEMS INC., NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ARIEL, ASSAF;BRAND, MICHAEL;HOROWITZ, ITSIK;AND OTHERS;REEL/FRAME:014967/0368;SIGNING DATES FROM 20040129 TO 20040202

AS Assignment

Owner name: LEHMAN COMMERCIAL PAPER INC., AS ADMINISTRATIVE AGENT

Free format text: SECURITY AGREEMENT;ASSIGNOR:VERINT SYSTEMS INC.;REEL/FRAME:019588/0613

Effective date: 20070525

AS Assignment

Owner name: CREDIT SUISSE AS ADMINISTRATIVE AGENT, NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:VERINT SYSTEMS INC.;LEHMAN COMMERCIAL PAPER INC.;REEL/FRAME:022793/0888

Effective date: 20090604

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: VERINT AMERICAS INC., NEW YORK

Free format text: RELEASE OF SECURITY INTEREST;ASSIGNOR:CREDIT SUISSE AG;REEL/FRAME:026206/0340

Effective date: 20110429

Owner name: VERINT SYSTEMS INC., NEW YORK

Free format text: RELEASE OF SECURITY INTEREST;ASSIGNOR:CREDIT SUISSE AG;REEL/FRAME:026206/0340

Effective date: 20110429

Owner name: VERINT VIDEO SOLUTIONS INC., NEW YORK

Free format text: RELEASE OF SECURITY INTEREST;ASSIGNOR:CREDIT SUISSE AG;REEL/FRAME:026206/0340

Effective date: 20110429