WO2003067497A1 - A method and apparatus to visually present discussions for data mining purposes - Google Patents

A method and apparatus to visually present discussions for data mining purposes

Info

Publication number
WO2003067497A1
WO2003067497A1 PCT/US2003/003504 US0303504W WO03067497A1 WO 2003067497 A1 WO2003067497 A1 WO 2003067497A1 US 0303504 W US0303504 W US 0303504W WO 03067497 A1 WO03067497 A1 WO 03067497A1
Authority
WO
WIPO (PCT)
Prior art keywords
actor
user
query
discussion
communication
Prior art date
Application number
PCT/US2003/003504
Other languages
French (fr)
Inventor
Elizabeth B. Charnock
Curtis Thompson
Steven L. Roberts
Original Assignee
Cataphora, Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cataphora, Inc
Priority to AU2003207856A (AU2003207856A1)
Priority to CA002475319A (CA2475319A1)
Priority to EP03706095A (EP1481346B1)
Publication of WO2003067497A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/10: Office automation; Time management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34: Browsing; Visualisation therefor
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99931: Database or file accessing
    • Y10S707/99933: Query processing, i.e. searching
    • Y10S707/99935: Query augmenting and refining, e.g. inexact access

Definitions

  • the present invention relates to electronic documents, and more particularly to a method for visualizing the relationships among, and retrieving, one or more groups of documents satisfying a user-defined criterion or set of criteria.
  • a keyword search involves entering a list of words, which are likely to be contained within the body of the document for which the user is searching.
  • a fielded search involves locating documents using lexical strings that have been deliberately placed within the document (usually at the top) with the purpose of facilitating document retrieval.
  • a method of organizing information comprises providing a visualization of actor communications in the context of one or more discussions, a discussion including at least one actor and at least one documented communication.
  • Figure 1 is a block diagram of one embodiment of a network, which may be used with the present invention.
  • Figure 2 is a block diagram of one embodiment of a computer system.
  • Figure 3 is a block diagram of navigation flow in one embodiment of the present invention.
  • Figure 4 is a block diagram of user-interface flow in one embodiment of the present invention.
  • Figure 5 is a screen shot of one embodiment of the participant graph.
  • Figure 6 is a screen shot of another embodiment of the participant graph, in which the time of day is represented.
  • Figure 7 is a screen shot of a form panel for adding items that were not originally part of the discussion being visualized.
  • Figure 8 is a screen shot of one embodiment of a participant graph, in which a pop-up showing basic information about the item is displayed.
  • Figure 9 is a screen shot of one embodiment of a document trail graph.
  • Figure 10 is a screen shot of one embodiment of a money trail graph.
  • Figure 11 is a screen shot of one embodiment of a view that uses a color, pattern, or similar distinguishing mechanism which uses the color spectrum to help users to discern small shifts in the communication activity of a very large population of actors.
  • Figure 12 is a screen shot of one embodiment of an activity graph, which illustrates the amount of communication among actors over a user-specified period of time.
  • Figure 13 is a screen shot of one embodiment of a discussion timeline, in which each discussion appears as a rectangle of the length appropriate relative to its duration in the timeline.
  • Figure 14 is a screen shot of one embodiment of a discussion timeline, with a spider-eye panning widget to temporarily change the resolution of the discussion visualization.
  • Figure 15 is a screen shot of one embodiment of a discussion timeline, showing the individual events in the discussion.
  • Figure 16 is a screen shot of one embodiment of a discussion cluster view.
  • Figure 17 is a screen shot of one embodiment of a graphical representation of a discussion timeline.
  • Figure 18 is a screen shot of one embodiment of a transcript view, showing actor color-coding.
  • Figure 19 is a screen shot of one embodiment of a transcript view, showing actor activity.
  • Figure 20 is a screen shot of one embodiment of a transcript view, showing discussion partitions.
  • Figure 21 is a screen shot of one embodiment of a transcript view, showing actor and document-type color-coding.
  • Figure 22 is a screen shot of one embodiment of a transcript view, showing document attachments.
  • Figure 23 is a screen shot of one embodiment of a transcript view, showing color-coding of quoted text.
  • Figure 24 is a screen shot of one embodiment of a transcript view, showing that a deletion has occurred.
  • Figure 25 is a screen shot of one embodiment of a transcript view, showing Instant Messages (IMs).
  • IMs: Instant Messages
  • Figure 26 is a screen shot of one embodiment of a query results view, showing discussion titles, discussion start and end dates, and actor images.
  • Figure 27 is a screen shot of one embodiment of a matrix query results view.
  • Figure 28 is a screen shot of one embodiment of the saved queries view.
  • Figure 29 is a screen shot of one embodiment of a tool for submitting user queries.
  • Figure 30 is a screen shot of one embodiment of a tool for submitting user queries, in which said tool allows the user to select types of actor involvement, and to use a saved query.
  • Figure 31 is a screen shot of one embodiment of a tool for submitting user queries, in which said tool allows the user to exclude certain actors from the query.
  • Figure 32 is a diagram of a query template (Template 1).
  • Figure 33 is a diagram of a query template (Template 2).
  • Figure 34 is a diagram of query templates (Templates 3 & 4).
  • Figure 35 is a diagram of query components.
  • Figure 36 is a screen shot of one embodiment of a Venn diagram view of document categories.
  • Figures 37a - 37c are screen shots of one embodiment of Query by Example (QBE).
  • Figure 38 is a screen shot of one embodiment of the document lifecycle view.
  • Figure 39 is a screen shot of one embodiment of a user interface for viewing discussions on a PalmOS-based mobile device.
  • Figure 40 is a screen shot of one embodiment of the master window view of the case management user interface.
  • a method and apparatus for visualizing both the electronic paper trails referred to as “discussions” and the statistical anomalies and patterns that are directly computable from these discussions is disclosed.
  • a discussion in this context is a heterogeneous set of causally related communications and events for which either electronic evidence exists, or can be created to reflect. Thus, a discussion provides a means of reviewing a series of related events that occurred over time.
  • One example of generating such discussions from raw communications data is discussed in more detail in copending Application Serial No XXX, entitled "A Method and Apparatus for Retrieving Interrelated Sets of Documents", filed concurrently herewith (hereinafter referred to as 'An Apparatus for Sociological Data Mining').
  • the visualizations and user interface tools described in this application greatly facilitate the efficient and effective review and understanding of such chains of events.
  • MVC: Model View Controller
  • FIG. 1 depicts a typical networked environment in which the present invention operates.
  • the network 105 allows access to email data stores on an email server 120, log files stored on a voicemail server 125, documents stored on a data server 130, and data stored in databases 140 and 145.
  • Data is processed by an indexing system 135 and sociological engine 150, and is presented to the user by a visualization mechanism 140.
  • the visualization mechanism 140 is described in more detail in the present application.
  • FIG. 2 depicts a typical digital computer 200 on which the present system will run.
  • a data bus 205 allows communication between a central processing unit 210, random access volatile memory 215, a data storage device 220, and a network interface card 225.
  • Input from the user is permitted through an alphanumeric input device 235 and cursor control system 240, and data is made visible to the user via a display 230.
  • Communication between the computer and other networked devices is made possible via a communications device 245.
  • control logic or software implementing the present invention can be stored in main memory 250, mass storage device 225, or other storage medium locally or remotely accessible to processor 210.
  • the present invention may also be embodied in a handheld or portable device containing a subset of the computer hardware components described above.
  • the handheld device may be configured to contain only the bus 215, the processor 210, and memory 250 and/or 225.
  • the present invention may also be embodied in a special purpose appliance including a subset of the computer hardware components described above.
  • the appliance may include a processor 210, a data storage device 225, a bus 215, and memory 250, and only rudimentary communications mechanisms, such as a small touch-screen that permits the user to communicate in a basic manner with the device.
  • the more special-purpose the device is, the fewer of the elements need be present for the device to function.
  • communications with the user may be through a touch-based screen, or similar mechanism.
  • a machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g. a computer).
  • a machine readable medium includes read-only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, electrical, optical, acoustical or other forms of propagated signals (e.g. carrier waves, infrared signals, digital signals, etc.).
  • Navigation among views is facilitated by the fact that all of the viewable entities have very close relationships to one another, as depicted in Figure 3.
  • the user can submit queries 320, which return discussions 305.
  • Each discussion must contain at least two actors 310.
  • Each of the actors 310 about whom the user can submit queries 320 must appear in zero (0) or more discussions 305 (an actor can appear in 0 discussions by being connected in some way with a singleton document which, by definition, is not part of a discussion).
  • An actor 310 can be associated with multiple topics 315, and vice versa.
  • Each discussion 305 can be associated with multiple topics 315, and vice versa.
  • the user can generally click on an image representing an actor to see additional information about this actor, and vice versa.
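  • The relationships among the viewable entities described above can be sketched as a small data model. This is only an illustrative sketch, not the implementation disclosed in the application; the names `Actor`, `Topic`, `Discussion`, `discussions_for` and `topics_for` are hypothetical, and deriving an actor's topics from the discussions the actor appears in is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Actor:
    name: str


@dataclass(frozen=True)
class Topic:
    label: str


@dataclass
class Discussion:
    title: str
    actors: set = field(default_factory=set)
    topics: set = field(default_factory=set)

    def __post_init__(self):
        # The text states that each discussion must contain at least two actors.
        if len(self.actors) < 2:
            raise ValueError("a discussion needs at least two actors")


def discussions_for(actor, discussions):
    """Return the discussions an actor appears in (zero or more)."""
    return [d for d in discussions if actor in d.actors]


def topics_for(actor, discussions):
    """Assumption: an actor's topics are those of the actor's discussions."""
    found = set()
    for d in discussions_for(actor, discussions):
        found |= d.topics
    return found
```

  • An actor connected only to a singleton document would simply yield an empty list from `discussions_for`, which matches the "zero or more discussions" wording above.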
  • a user submits a query 320 using either Query by Example 405, Multi-evidence Query User Interface 410, Query Language 415, Canned Query Templates 420, Visual Query Interface 425, or Query Building Wizard 430.
  • the resulting query specifies at least one of a number of parameters, including but not limited to actors, time, topic, related events, communication type, specific documents and work-flow processes. Additionally, the system allows the user to submit queries in natural language format.
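  • The requirement that a query specify at least one parameter can be illustrated with a simple record type. This is a hedged sketch: the class `Query` and its field names are hypothetical stand-ins for the parameters listed above, and the types chosen for each field are assumptions.

```python
from dataclasses import dataclass, fields
from typing import Optional, Sequence, Tuple


@dataclass
class Query:
    # Every field is optional; the names mirror the parameters listed in the text.
    actors: Optional[Sequence[str]] = None
    time_range: Optional[Tuple[str, str]] = None
    topic: Optional[str] = None
    related_events: Optional[Sequence[str]] = None
    communication_type: Optional[str] = None
    documents: Optional[Sequence[str]] = None
    workflow_process: Optional[str] = None

    def __post_init__(self):
        # A query must specify at least one of the parameters.
        if all(getattr(self, f.name) is None for f in fields(self)):
            raise ValueError("a query must specify at least one parameter")
```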
  • the results may comprise singleton documents 425, discussions 305, actors 310, statistics 440 and topics 315.
  • Results are displayed in one or more of the formats appropriate to the results content and shown in Figure 4.
  • singleton documents are displayed in tabular list view.
  • Discussions are displayed as a participant graph, overview graph, transcript view, question and answer list, matrix view, cluster view, or tabular list view.
  • Actors are displayed in an activity graph, participant graph, actor profile, matrix view, tabular list view or cluster view.
  • Statistics are displayed as an activity graph, as a profile view (for example, actor profile view or data set profile view), or as a Venn diagram.
  • Topics are displayed as an activity graph, Venn diagram, overview graph, matrix view or tabular list view.
  • Participant Graphs: graphs that connect the actions of a certain set of participants as related to one or more discussions.
  • Activity Graphs: comparative or individual graphs that indicate the historical communication or collaboration activity over time among various actors.
  • Document Trail Graphs: diagrams that display data tracing the lifecycle of a document or group of documents, including but not limited to such events as document revisions, check-ins and transmissions.
  • Transcript View Variations: any primarily text-oriented view that lays out a sequence of events and/or communications.
  • Animation: description of different ways that interactive or animated aids or trial art could be generated from any of the above.
  • Participant graphs shown in Figures 5-8 represent the set of communication items which belong to a particular discussion, or in some embodiments, multiple discussions.
  • Figure 5 is a screen shot showing one embodiment of the participant graph for a fragment of a discussion, showing the actors involved and the various communications that took place between the actors during the discussion fragment.
  • Each actor is denoted by a unique icon 545, which in this example is a photograph or some other graphical representation of the actor. In other embodiments, a textual representation of the actor (for example, the actor's name) could be used.
  • Communications are denoted by connections 540 between actors. In this example, three communication types are shown: documents, email and instant messages, each of which is denoted by a unique color code, pattern, icon, or other distinguishing mechanism.
  • a legend 550 at the top-right of the screen shot indicates the meaning of each color, and of each of four icons that are used to label the connections.
  • a timeline 505 allows the user to see the date and time at which each transaction in the discussion took place.
  • By interacting with a content icon 510, the user can see the content of any document and the time when the transaction took place.
  • a type icon 515 allows the user to see information about the transaction type and/or document type.
  • a 'more info' icon 520 allows the user to see basic information about the transaction.
  • a clock icon 525 allows the user to see the precise time at which the transaction took place.
  • the system may further display a popup 530, which shows a chronological list of the transactions in which the current actor participated within the current discussion. For one embodiment, the popup 530 is displayed when the user clicks on an actor's icon 545.
  • the personality (or personalities) of a given actor that participated in a discussion can be displayed.
  • Figure 6 shows a screen shot of a participant graph similar to that shown in Figure 5. Additionally, it uses background color 610 and a series of time-of-day icons 615 at the top of the screenshot to denote the time of day at which the communication was created.
  • the user has positioned the mouse cursor close to the 'more info' icon 520, thereby causing a popup window 605 to be displayed containing basic information about the transaction.
  • a panning widget 630 allows the user to navigate forwards and backwards within the discussion using the time-of-day bar 620.
  • a drop-down list box 625 allows the user to switch between different time zones, thereby adjusting the alignment of the discussion with the time-of-day icons 615.
  • Figure 8 shows a screen shot of a participant graph similar to those shown in Figures 5 and 6. Additionally, it shows a toolbar 810 at the top of the screen that allows the user to select between different discussion views: activity, participant (shown here), and transcript.
  • a second toolbar 815 provides buttons to allow the user to carry out the following actions: to zoom in on a particular part of the discussion, thereby showing the elements of said discussion in greater detail; to pan between different sections of the discussion; to filter the discussion on criteria that may include (but are not limited to): actor, communication type and time; to adjust the view of the discussion based on time span; to print the discussion or the contents of the graphical view; or to define new events to add to the view.
  • a user interface navigation mechanism 830 at the bottom of the page allows the user to control which section of the discussion is displayed on screen.
  • a pair of dropdown list boxes 825 allows the user to control the discussion display through the use of filters.
  • An icon 820 and vertical dotted line 835 indicate the occurrence of a significant event (in this example, a board meeting) during the period displayed.
  • Participant graphs show the images 545 of the actors participating in a discussion, and the links 540 between the transactions in which they participated. Participant graphs may display a timeline 505 to show when user activity occurred, and may also display a background 610 in varying shades in order to represent daytime and nighttime activity. Participant graphs can optionally be limited to a partition of a discussion, or to only include certain specific actors of interest, either individually or by some property such as organization. They may also be limited to displaying only those actors who played an active rather than passive role in the items in question, where "active" is defined as initiating one or more of the items.
  • the user may set a threshold value for how active the actor had to have been in the discussion in order to be displayed, based on measures including, but not limited to, the number of items in the discussion initiated by that actor, the importance of these items, and whether any were "pivotal" as described in 'An Apparatus for Sociological Data Mining'.
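  • The activity threshold can be illustrated with a short filter. This sketch uses only one of the measures named above, the number of items initiated by each actor; the function name `active_actors` and the `(initiator, item_id)` input shape are assumptions made for the example, and item importance or "pivotal" status is not modeled.

```python
from collections import Counter


def active_actors(items, threshold=1):
    """Return the actors who initiated at least `threshold` items.

    `items` is an iterable of (initiator, item_id) pairs. With the default
    threshold of 1, this matches the definition of "active" as initiating
    one or more of the items.
    """
    counts = Counter(initiator for initiator, _ in items)
    return {actor for actor, n in counts.items() if n >= threshold}
```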
  • a small icon containing "" is displayed in lieu of the regular actor icon. Clicking on this icon expands that instance to the form of the regular icon for that actor.
  • the actor may be identified in other ways including, but not limited to, a smaller icon, or a browned out icon.
  • each item is depicted as a line connecting two or more actors.
  • the color of the line indicates the type of item. Choices include, but are not limited to:
  • Actors 545 may be individuals, or they may be aggregate actors.
  • Examples include an organization, the executive team, or de facto actor group such as a "circle of trust" as defined in 'An Apparatus for Sociological Data Mining'.
  • a group mail alias would also be considered an aggregate actor.
  • an actor might be a system or automated process, for example, a daemon that sends out a particular status message. Actors may be represented by actual photographs 3810 of themselves when available. Alternately, the user may choose a graphic representation for each actor by choosing from a canned library of images, or adding any electronic image that they wish. Once selected, the same image is used to represent this actor visually throughout the entire system.
  • an actor has more than one distinct personality (as defined in 'An Apparatus for Sociological Data Mining' patent)
  • the user has the option to use a different image or graphic for each such personality. If the user opts not to do this, where multiple personalities exist the system will use the one graphic provided to represent all personalities, but will tint the image with different colors in order to allow the various distinct personalities to be readily identified. The user may globally specify the color scheme to be used in such cases; for example, the primary personality will always be tinted blue.
  • the graph is represented as a timeline of events; the resolution can be increased or decreased using zoom in and out controls.
  • daytime and nighttime hours are indicated by a change in background color, as shown in Figure 6.
  • icon markers 615 indicating time of day may also be used, as shown in Figure 6.
  • Icons may optionally be displayed that indicate the document type of the transaction in those cases where it is appropriate, for example, to indicate that the document being sent was an Excel spreadsheet rather than a Microsoft Word document.
  • each appropriate document type icon is displayed.
  • a multiple document type icon is displayed, which depicts a stack of overlapping rectangles.
  • the system provides a different visualization for documents which were attached as opposed to incorporated by reference with a URL or something similar. Rolling the mouse over or near any of the transaction lines will bring up a pop-up 605 with basic information about the transaction (Figure 6).
  • the exact types of information vary by transaction type, but include, as appropriate, the following:
  • the user may click on the small icon to get only the timestamp details.
  • right-clicking on this icon provides an immediate chronology of events just before and after the item in question with timestamp information. This is to help clarify which event preceded which, in those cases where the events were almost contemporaneous.
  • the "content” icon can be used to pull up the content of the document involved in the transaction.
  • actors are shown partially grayed out if their presence in the transaction was likely, but could not be verified solely on the basis of the electronic evidence.
  • One example of this is the case of a meeting entry pulled from an online group calendar which asserts that Joe, Jack, and John will be present at the meeting. Without other supporting evidence, such as meeting minutes indicating that all parties were present, it cannot be definitively asserted that all three men attended the meeting.
  • the user interface allows the user to add items that were not originally part of the discussion being visualized. This is done through filling out a form panel, shown in Figure 7, in the user interface, specifying all of the information that would have been associated with an actual item.
  • Figure 7 is a screen shot of a form panel for adding items that were not originally part of the discussion being visualized.
  • the panel displays the discussion title 705, start date 710 and end date 715, and actors 720 involved.
  • a text box 725 allows the user to enter a label for the item to be added. In one embodiment, this text box 725 is replaced with a dropdown listbox, combo box, or other user interface tool for adding an item from a preconfigured or dynamically generated list.
  • a series of option buttons 730 allow the user to specify the type of item to be added. After an item is added, it would be shown on the participant graph. For one embodiment, items added by a user are flagged in the participant graph, to indicate their nature. For another embodiment, the information that an item has been added can be obtained using the 'info' icon 520.
  • the view is implemented as a canvas, so the user may drag and drop shapes, lines, and text on it as they please.
  • such additions are checked for semantic correctness.
  • added events are indicated by color, patterns, icon, or some other indicator.
  • a canned library of icons to represent common concepts like "meeting" may be provided in the UI; the user may elect to add and use their own images as well. The user may also add descriptive text about the event. This text would appear when the user clicks on the icon representing that event.
  • Animation can help accentuate the involvement of certain actors of special interest; it can also highlight the accelerating or decelerating pace of the transactions.
  • Types of animation provided in one embodiment of the invention are as follows:
  • the layout algorithm for the view can be implemented with a number of commonly available graphing libraries. In one embodiment of the invention, a limit of 8 line connections per actor icon is imposed for readability purposes. For one embodiment, should additional connections be necessary in order to represent the underlying data, a second actor icon will be drawn to accommodate the additional lines.
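  • The connection limit can be illustrated by grouping an actor's connections into batches, one batch per drawn icon. This is a minimal sketch: the text describes drawing a second icon when more than 8 lines are needed, and extending the same rule to a third or later icon is an assumption, as are the names `MAX_LINES_PER_ICON` and `split_into_icons`.

```python
MAX_LINES_PER_ICON = 8  # readability limit stated for one embodiment


def split_into_icons(connections, limit=MAX_LINES_PER_ICON):
    """Group an actor's connections so that no icon carries more than `limit` lines.

    Returns a list of groups; each group corresponds to one drawn icon.
    An actor with no connections still gets a single (empty) icon.
    """
    connections = list(connections)
    groups = [connections[i:i + limit] for i in range(0, len(connections), limit)]
    return groups or [[]]
```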
  • the participant graph view can be used modally as a visual querying interface.
  • the user may select a transaction by selecting its objects with a marquee tool, and generate a Query by Example (QBE) query.
  • One example of QBE queries that may be used with this system is defined in 'An Apparatus for Sociological Data Mining'.
  • the user may also select individual actor icons in order to see all transactions involving all of these actors.
  • Other accompanying UI widgets and tools include, but are not limited to, the following.
  • a panning widget 620, shown in Figure 6. This widget 620 utilizes a thumbnail image of the full discussion transcript view, shrunk to whatever length necessary to fit in the visible view.
  • the participant graph automatically scrolls to the position indicated by the panning widget 620, making it especially useful for viewing discussions of long duration.
  • Daytime and night-time hours are indicated in the thumbnail, allowing the user to easily detect, for example, anomalously high amounts of communications after standard or usual working hours.
  • nighttime starts at 5:00PM in the primary time zone, or some other pre-configured time, and any communications or events after that time are distinguished, for example by being colored darkish gray.
  • a gradient fill is used to indicate rough time of day, as shown in Figure 6.
  • communication and events occurring during weekends or holidays are coded, for example by being colored pink.
  • the time zone defaults to the one in which the greatest amount of transactions occurred; times from other time zones will be normalized to this time zone.
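  • The time-of-day coding and the default time zone rule can be sketched as follows. The 5:00 PM nighttime start comes from the text; the hour at which nighttime ends (`DAY_START_HOUR`) is not given there and is an assumption, as are the function names and the string labels returned. Timestamps are assumed to be already normalized to the primary time zone.

```python
from collections import Counter
from datetime import datetime

NIGHT_START_HOUR = 17  # 5:00 PM in the primary time zone, per the text
DAY_START_HOUR = 8     # assumption: the text does not say when nighttime ends


def time_code(ts, holidays=frozenset()):
    """Classify a timestamp for background coding in the thumbnail."""
    if ts.weekday() >= 5 or ts.date() in holidays:
        return "weekend_or_holiday"  # e.g. colored pink
    if ts.hour >= NIGHT_START_HOUR or ts.hour < DAY_START_HOUR:
        return "night"               # e.g. colored dark gray
    return "day"


def default_time_zone(transaction_zones):
    """Return the time zone in which the greatest number of transactions occurred."""
    return Counter(transaction_zones).most_common(1)[0][0]
```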
  • there is a control 625 above the panning widget allowing the user to change the default time zone used by the panning widget 620.
  • parallel instances of the thumbnail will be drawn for each time zone from which transactions originated.
  • One panning widget extends across all of the thumbnails.
  • the transcript view elements being thumbnailed are color-coded according to initiating actor rather than time of day.
  • these items are color coded by topic.
  • Figure 11 is a screen shot showing one embodiment of the activity graph for a discussion.
  • the user has selected this view of the discussion using the tool bar 810.
  • This view shows the level of activity over time in two ways: as a line graph 1120, and as a diagram 1125 in which levels of communication activity are denoted by colors of the rainbow.
  • a legend 1130 explains the meaning of the colors.
  • An icon 820 and vertical dotted line 835 indicate the occurrence of a significant event (in this case, a board meeting).
  • a toolbar 815 and navigation mechanism 830 as shown in Figure 8, are also shown.
  • a slider 1115 allows the user to create a different viewable span on the canvas.
  • the rainbow view uses a color, pattern, or similar distinguishing mechanism which uses the color spectrum to help users to discern small shifts in the communication activity of a very large population of actors. Specifically, this view is used to pinpoint the amount of communication on specific topics. It is accompanied by a graph below which allows the assignation of numerical values to the colors used in the spectrum view. Maximum density is determined by historical values over the same time span and same or comparable population.
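  • One way to picture the spectrum mapping is to scale each communication count against the historical maximum and convert the ratio to a hue. This is an illustrative sketch only: the direction of the scale (violet for low activity, red for high) and the function name `density_to_rgb` are assumptions, since the text does not specify which end of the spectrum denotes high density.

```python
import colorsys


def density_to_rgb(count, historical_max):
    """Map a communication count to an RGB color along the spectrum.

    `historical_max` is the maximum density taken from historical values
    over a comparable time span and population.
    """
    if historical_max <= 0:
        raise ValueError("historical_max must be positive")
    # Clamp so that counts above the historical maximum saturate at the top color.
    ratio = min(max(count / historical_max, 0.0), 1.0)
    # Assumption: hue 0.75 (violet) is the lowest density, hue 0.0 (red) the highest.
    hue = 0.75 * (1.0 - ratio)
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return tuple(round(c * 255) for c in (r, g, b))
```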
  • Activity graphs are used to illustrate the amount of communication among a small set of actors over a user-specified period of time. They may optionally be additionally constricted by topic. Actor sets may be specified in any of the following ways:
  • Figure 12 is a screen shot showing one embodiment of the activity graph for a discussion.
  • Lines 1220 linking actor images 545 are terminated with boxes 1215 showing the number of communications that took place between the actors.
  • each actor is represented by both an image or other icon 545 and a text item 1205 containing the name of the actor.
  • a legend 1225 shows the mapping between colors and levels of communication activity.
  • the connecting line 1220 has two colors, and the portion of the line adjoining each actor represents the number of communications sent by that actor to the other actor of the pair.
  • the line 1220 connecting the two actors has a single color throughout its length.
  • a number 1215 at each end of each connecting line shows the exact number of communications that the actor at that end of the line 1220 has sent to the actor at the other end of the line 1220.
  • the user can invoke a communication profile popup window 1210.
  • the popup window 1210 is invoked by double-clicking on the line 1220 connecting actor images 545.
  • the popup window 1210 provides additional data about the communications, including average communications length and depth, and document types exchanged. For one embodiment, any anomalies noted by the system are also flagged.
  • Each individual or aggregate actor is represented by an image provided or selected by the user.
  • a single line is used to indicate all communication between actors, in both directions.
  • the direction of the arrows at the ends of the line indicates which way the communication is flowing.
  • the number in the box 1215 embedded in the arrow indicates the number of communications to the other actor.
  • these are the communications to that actor specifically, as opposed to communications sent to various distribution lists.
  • If an aggregate actor is included in the display, all such aggregate communications are included, since such aggregate actors often correspond to distribution lists. Note that for purposes of readability, only communication between pairs of actors is shown. In order to show communication between tuples of actors, aggregate actors may be created.
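The directed counts shown in the boxes 1215 can be illustrated with a minimal sketch. The `Communication` record, its `via_list` field, and the `pair_counts` function are hypothetical names introduced here for illustration, and the rule for distribution lists is one possible reading of the text: a communication sent through a list counts toward a pair only when that list is displayed as an aggregate actor.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Communication:
    sender: str
    recipients: Tuple[str, ...]      # actors addressed directly
    via_list: Optional[str] = None   # hypothetical: distribution list used, if any


def pair_counts(comms, aggregates=frozenset()):
    """Count communications per directed (sender, receiver) pair.

    Assumption: list traffic is ignored unless the list itself is shown
    as an aggregate actor, in which case it counts toward (sender, list).
    """
    counts = Counter()
    for c in comms:
        if c.via_list is not None:
            if c.via_list in aggregates:
                counts[(c.sender, c.via_list)] += 1
            continue
        for receiver in c.recipients:
            counts[(c.sender, receiver)] += 1
    return counts


comms = [
    Communication("ann", ("bob",)),
    Communication("ann", ("bob",)),
    Communication("bob", ("ann",)),
    Communication("ann", (), via_list="all-staff"),
]
direct_only = pair_counts(comms)
with_aggregate = pair_counts(comms, aggregates={"all-staff"})
```

Each end of a connecting line would then display the count for its own direction, for example `direct_only[("ann", "bob")]` at the end adjoining "ann".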
  • the coloring of the lines is used to indicate one of the following, depending on how the user has configured the user interface:
  • the color or pattern of the line indicates the frequency of communication, while the thickness of the line indicates the volume of communication. In another embodiment, the thickness of the line indicates the frequency of communication, while the color or pattern of the line indicates the volume of communication.
  • the number of communications can be based on any or all of the following, depending on how the user has configured the user interface:
  • the activity graph can be superimposed on an org chart in order to highlight communication flows that appear to differ from the org chart.
  • actor titles are listed, and additional lines to indicate reporting relationships may be rendered.
  • It can also be used as a visual querying tool; the user may select two or more actors in order to see all of the discussions, or individual communications, between them. The user may also click on the line connecting any two actors in order to bring up a panel 1210, shown in Figure 12, containing the communication profile of these actors.
  • Which information to display is user-configurable, but would typically include the following:
  • Figure 13 is a screen shot showing one embodiment of one view of a discussion timeline.
  • Sets of adjoining rectangles 1305, linked by lines 1310 and color-coded by actor (as shown in legend 1315), are used to represent the communications within a discussion (so that each discussion appears as a set of adjoining rectangles 1305).
  • the x-axis of the screen represents the timeline, and the sets of rectangles are arranged one above the other on the y-axis as in a Gantt chart.
  • Above each discussion 1305 appears that discussion's title 1320.
  • the lines 1310 show related discussions, which are generally either precursors to, or offspring of, the current discussion.
  • each discussion appears as a rectangle 1305 of the length appropriate relative to its duration in the timeline.
  • the title 1320 of the discussion appears directly above the rectangle; in some embodiments, this is followed by the number of items in the discussion.
  • the rectangles are thumbnails of the content part of the transcript view of the discussion, scaled down to the necessary size and rotated 90 degrees to the left.
  • each item within the discussion is coded according to one of the following, depending on the user's preference:
  • the user may configure the user interface to color code all communications originated or received by a particular actor of interest.
  • numerous parallel thumbnails may be created in a dedicated view in order to help the user observe the overlap between different actors of interest.
  • Figure 14 is a screen shot showing one embodiment of another view of the discussion timeline. In this view, four discussions 1405 are displayed, and the level of activity within each discussion is represented by vertical lines 1415 of various thicknesses, where a thicker line denotes a greater level of communication activity.
  • a panning widget 1410 over one portion of a discussion magnifies the vertical lines in the portion of the display under the widget 1410.
  • the user can move the panning widget 1410 by mouse manipulation.
  • a hand icon 1420 appears on the panning widget 1410.
  • An icon 820 and vertical dotted line 835 indicate when a significant event occurred.
  • the discussion names appear to the left of the view, and one discussion occupies all of the real estate in that range of the Y axis.
  • Figure 15 depicts a timeline of the individual events in a discussion.
  • Figure 15 is a screen shot showing one embodiment of another view of the discussion timeline. In this view, detailed information about each individual communication event 1505 is arranged along a discussion timeline 505.
  • Communication events 1505 are depicted as blocks on the chart (in one embodiment, different types 1530 of events are depicted using distinctively colored or patterned backgrounds).
  • Each block depicting an event 1505 contains header information 1520 related to the corresponding communication, including but not limited to: the sender or creator of the communication; the person who last modified the communication; the date of the communication; the subject of the communication; and any associated attachments or linked documents.
  • the user can click on an area 1510 of each block in order to display the content of the communication.
  • Color-coded lines 1525 linking each event denote the primary type of evidence used by the system to incorporate that particular item into the discussion.
  • a zooming tool 1535 at the top right of the screen allows the user to zoom in (to show fewer communications in more detail) or out (to show more communications in less detail).
  • the background area 1515 of the chart is color-coded or coded with a distinctive pattern to represent daytime and nighttime.
  • Figure 15 provides an overview of the constituent parts of a discussion and the connections between them. Communication events are depicted as sets of interconnected blocks 1505.
  • the blocks 1505 may be color-coded as elsewhere described; actor icons may be optionally included in the block.
  • the different colored lines 1525 reflect the primary type of evidence used by the system to incorporate that particular item into the discussion. Evidence types include but are not limited to, the following: similarity of participants, "reply to”, lexical similarity, pragmatic tag, same attachment, and workflow process. These terms are explained in 'An Apparatus for Sociological Data Mining'.
  • FIG. 16 is a screen shot showing one embodiment of a discussion cluster view.
  • the total number of discussions meeting certain user-specified criteria is reflected in the size of the shape (in this embodiment, a circle) representing the cluster.
  • the shape that currently has the focus is displayed in a distinctive color 1635, with a distinctive pattern, or is shown enlarged, thereby distinguishing it from circles 1610 that do not have the focus.
  • Links 1615 between clusters are color-coded according to whether the clusters share: commonality of actors, commonality of topics, or commonality of another type.
  • Commonality of actors occurs when two clusters, distinctive from each other by virtue of meeting different clustering criteria, nevertheless share the same set of actors. Where this is the case, a distinctive color is used to trace the link between the two clusters in question. Icons allow the user to see more information 515, the date and time 520 of the communication, and to view 525 the underlying document discussion. A separate, smaller, window 1630 allows the user to navigate within discussion space by moving a panning tool 1620. In one embodiment, when the user activates the panning tool 1620, a hand icon 1625 is displayed.
  • shapes 1610 are used to represent groups of discussions.
  • the shapes 1610 are labeled with the number of discussions contained in that group, and a description of the group.
  • a smaller window 1630 shows a map of the entire discussion space, or a relatively large part thereof, and contains a smaller frame 1620 to represent the area of discussion space under analysis. Since this view is independent of the information content, it is suitable for use even when the information has been strongly encrypted, and thus is not accessible for analysis.
  • Document trail graphs are independent of the information content; they are suitable for use even when the information has been strongly encrypted, and thus is not accessible for analysis.
  • Document trail graphs depict the life cycle of one particular document.
  • Figure 9 is a screen shot showing one embodiment of the document trail graph for a discussion.
  • Each cluster of items on the graph consists of one actor icon 905 and at least one document icon 935.
  • the actor's actions with regard to the document (such as creation, modification, check-in, etc) are represented by displaying a document icon 935 in an appropriate color or pattern, according to a legend 930.
  • the x-axis of the graph represents the time line, with dates shown along a timeline display 505 at the bottom of the graph, and lifecycle increments 910 displayed along the top.
  • the length of the document in pages is indicated by a number 925 inside the document icon.
  • Links 915 between versions of the document are color-coded according to function. In one embodiment, hovering the mouse over the 'more info' icon 520 invokes a popup 920 summarizing data related to the document in question.
  • a timeline 505 allows the user to see the date and time at which a particular event 935 in the document's life occurred.
  • An actor icon 905 denotes the actor responsible for said event 935.
  • Events 935 are depicted as clusters of activity comprising document icons 925 and an actor icon 905.
  • Links 915 between the various versions of the document that comprise a single event are color-coded according to function.
  • Document revision numbers 910 (for example, but not limited to, source control system revision numbers, or revision numbers assigned by the present invention) are displayed along the x-axis of the graph.
  • Document icons 925 are color-coded according to the type of user activity that triggered the event. Examples of said user activity include, but are not limited to, document creation, modification, revision, deletion, check-in, check-out, distribution, viewing, third-party transfer and content transfer.
  • a legend 930 explaining the color-coding is superimposed on the graph.
  • Document trail graphs further show icons that allow the user to see more information 515, to see the date and time 520 of the communication, and to view 525 the underlying document. Hovering the mouse over (in one embodiment, clicking) the 'more info' button 515 displays a popup 920 containing a summary of information related to the event in question. In one embodiment, document icons 925 contain a count of the number of pages (or other size metric) contained within the document at the time of the event 935 in question.
  • the purpose of the money trail graph is to chart the movement of money using data available within a discussion.
  • This visualization displays information related to money transfers that have been extracted from a discussion.
  • the data is displayed along a timeline 505.
  • Each extracted data point in the money trail includes a financial institution 1010 or money manager, at least one actor 545 party to the transaction, and a sum of money 1005, if that data is available.
  • Links 540 connecting the elements of a financial transaction are color-coded according to transaction type following a color code specified in a legend 1025. Hovering the mouse over the 'more info' icon 520 beside a link 540 invokes a popup 1015 summarizing data related to the financial transaction.
  • An account icon 1020 allows the user to see which financial accounts are involved in the transaction.
  • the graph displays actors 545 (whether individuals, groups, or organizations) and the financial institutions 1010 who are involved with the transfer.
  • Color-coded links 540 between actors denote the type of transaction, and are explained in one embodiment in a legend 1025.
  • the basic transcript view, shown in Figures 18 to 25, is a linear presentation of the causally related communication events that make up a discussion.
  • Communications 1830 are displayed in chronological order, and relevant metadata is displayed at the top of each communication.
  • the metadata includes, but is not limited to: date and time created, saved or sent; subject; recipient list; and time (in one embodiment, time is denoted by a clock icon 1815.)
  • Actor names 1820 are color-coded.
  • a header area 1805 provides information related to the discussion, including (but not limited to) discussion title, message count, list of participants, date range and total number of attached documents (in one embodiment, the total number including duplicates; in another embodiment, the total number of distinct attached documents).
  • an actor image 545 is associated with each communication, to denote the actor who created or changed the document.
  • Clickable links 1810 contain the names of any attachments, and open the corresponding attachment when clicked.
  • a display tool 1825 at the top-right of the screen allows the user to show or hide message headers, quoted text within each message, or message content. Communications may further provide document-type coding: for example, by pattern or color coding.
  • a sequence of documents 1830 (or other communication events, such as instant messages 2525) is displayed beneath a discussion header 1805.
  • the discussion might be augmented by external events, either manually by the user through the user interface, or via an automated process defined for a specific case.
  • this view consists of a user-configurable summary portion at the top, followed by a list of the various items in the discussion.
  • Each item has an author or creator, and optionally a set of other participants, such as those actors appearing in the cc: line of an email.
  • each actor 1820 is automatically color-coded by the system.
  • color coding of actors is done relative to the individual discussion.
  • actors of particular interest can be assigned colors that are to be used globally.
  • colors are recycled by the system within non-intersecting sets of actors.
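The interplay between per-discussion coloring, globally assigned colors, and color recycling can be sketched as follows. This is an illustrative sketch only: the palette, the function name `assign_actor_colors`, and the wrap-around behavior when a discussion has more actors than free colors are assumptions, not details given in the text.

```python
DEFAULT_PALETTE = ("red", "blue", "green", "orange", "purple")


def assign_actor_colors(actors, global_colors=None, palette=DEFAULT_PALETTE):
    """Assign a color to each actor of one discussion.

    Actors of particular interest keep their globally assigned color.
    All other actors draw from the remaining palette, starting afresh
    for every discussion, so discussions with non-intersecting actor
    sets end up reusing (recycling) the same colors.
    """
    global_colors = global_colors or {}
    reserved = set(global_colors.values())
    free = [c for c in palette if c not in reserved]
    if not free:
        raise ValueError("palette exhausted by globally reserved colors")
    colors = {}
    next_free = 0
    for actor in actors:
        if actor in colors:
            continue  # actor already seen in this discussion
        if actor in global_colors:
            colors[actor] = global_colors[actor]
        else:
            # Assumption: wrap around if there are more actors than colors.
            colors[actor] = free[next_free % len(free)]
            next_free += 1
    return colors


first = assign_actor_colors(["ann", "bob"])
second = assign_actor_colors(["cy", "dee"])            # disjoint set: same colors
pinned = assign_actor_colors(["ann", "bob"], {"bob": "red"})
```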
  • Each item also has a title, a date, and an item type, such as: email, meeting, document modification, etc.
  • activity associated with each actor is represented in a horizontal bar 1905 containing colored areas 1910, where the areas are color-coded by actor and spaced to represent time intervals.
  • discussion partitions 2005 are displayed.
  • the partitions 2005 represent the threads that make up the discussion.
  • the partitions 2005 include the number of communications in each thread of the discussion.
  • discussions that have been partitioned can be accessed by clicking on the title of the partition 2005.
  • items of different types are displayed with different background colors or patterns 2110, as shown in Figure 21.
  • document type is shown via the use of an icon.
  • the time of day that a message was sent is shown by an icon 2105.
  • any attachments associated with communications in the present discussion are flagged via distinctive icons 2205 in the header or in the communication body.
  • documents linked by reference to communications in the present discussion are flagged via distinctive icons 2210 in the header or in the communication body. Examples of documents linked by reference include, but are not limited to: a document whose URL is referred to in a communication; and a data file whose file name and path is referred to in a communication. In one embodiment, clicking on the icon displays the attachment.
  • quoted text 2320 is distinguished.
  • the background 2315 is color coded.
  • the text 2320 itself is color-coded. In one embodiment, within each communication that contains quoted text, each distinct quote is assigned a timestamp 2310.
  • the communication header area contains explanatory text 2305 stating how many pieces of quoted text are associated with the current communication. In one embodiment, the explanatory text 2305 is replaced by an icon.
  • a clock icon 1815 as shown in Figure 18 appears that is set to the time that the event occurred.
  • an icon indicating general time of day appears. For example, a document modification that occurred at night would have an icon with a partial moon against a dark backdrop with stars, while an email sent at dawn would have a rising sun.
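A general time-of-day icon of this kind can be chosen from the hour of the event. The sketch below is illustrative: the icon names and the hour boundaries are assumptions, since the text gives only the night and dawn examples.

```python
from datetime import datetime


def time_of_day_icon(timestamp: datetime) -> str:
    """Map an event time to a coarse time-of-day icon name.

    The hour boundaries are illustrative assumptions.
    """
    hour = timestamp.hour
    if 5 <= hour < 8:
        return "rising_sun"       # dawn
    if 8 <= hour < 18:
        return "sun"              # daytime
    if 18 <= hour < 21:
        return "setting_sun"      # dusk
    return "moon_and_stars"       # night


dawn_icon = time_of_day_icon(datetime(2001, 3, 15, 6, 30))
night_icon = time_of_day_icon(datetime(2001, 3, 15, 23, 0))
```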
  • their picture 545 appears at the top of each item that they created, as shown in Figure 18. In cases where no actor image is available or desired, a user-selected graphic can be used in its place.
  • the summary portion 1805 contains the discussion timeline, participating actors, number of items, and controls which allow certain information to be viewed or hidden.
  • the discussion timeline is represented graphically (Figure 17) as a series of headers 1705 connected by color-coded lines 1710.
  • One embodiment of generating the summary or resolution is described in 'An Apparatus for Sociological Data Mining'.
  • Optional UI tools include controls to "fast forward" to the next item created or otherwise involving particular actors. This, like the panning widget, which is also used with this view, is especially useful for long discussions which have many participants associated with them.
  • a deleted item 2415 can be flagged in any or all of several ways: the background 2420 has a distinctive color or pattern, or is otherwise displayed in a distinctive way; a red flag icon 2425 is displayed on the item; a text box 2405 displays additional information including but not limited to the computed level of certainty that an item was deleted, and the computed level of suspicion associated with the deletion; a "torn document" effect 2410 graphically conveys to the user that this discussion is incomplete. For one embodiment, only suspicious deletions are flagged.
  • the question of whether the deletion (or suspected deletion) of the data was either legal in the context of a given matter, or was in compliance with some defined standard of behavior is of interest.
  • One embodiment of a system for making this determination is described in copending application Serial No. XXXXX, filed concurrently herewith, and entitled "A METHOD AND APPARATUS TO PROCESS DATA FOR DATA MINING PURPOSES.”
  • the system will flag the item. For one embodiment, a red flag icon 2425 is used. Missing information is noted in bold red text.
  • the background color of the item will be set to whatever the user's preference is for displaying this kind of item, for example a background containing a tiling of question marks 2420, as shown in Figure 24.
  • Figure 25 is a screen shot showing one embodiment of the transcript view of a discussion, focusing on instant messages 2525 within the discussion. Actors 2515 are color-coded, and time-stamps 2520 are shown at regular intervals.
  • a slider 2505 at the left of the screen allows the user to navigate through the set of instant messages, as does a vertical scroll bar 2535 to the right.
  • the slider 2505 at the left of the screen additionally shows a panning tool 2510 representing the position of the visible portion of instant message text within the larger body of text. Note that for instant messages (IMs) 2525, a simpler item form is used, where IMs 2525 are displayed in chronological order and timestamped.
  • a panning tool 2505 with a slider 2510 allows the user to navigate through the IMs 2525.
  • the user can also navigate using a conventional scrollbar 2535.
  • the same form may also be used to represent emails in a condensed format in which data about additional participants is not deemed of interest. In such cases, the view is constructed by decomposing the emails into the separate text blocks attributable to each actor, and then linearizing them by time (accounting for differences in time zone). In another embodiment, all contiguous communication from the same actor is presented in the same item, separated by line breaks, much like the traditional form of a play dialog.
Querying Tools
  • an extensive querying language is provided.
  • this language reflects the actor orientation of the document analysis engine that is described in the 'An Apparatus for Sociological Data Mining' patent. Since it is well known that the vast majority of searches contain one or two keywords and no operators, it is important for the query language for "discussions" to break away from this standard but ineffective paradigm. This is accomplished by using a sequential structuring of the query information. It is assumed that the majority, but not all, of queries performed with the query language will take one of the forms described below, or a subset of them.
  • the query is of the format: who 3205 (actor/actor group) knew/probably knew/saw/believed/asserted 3210 (verb relationship) what 3215 (topical or specific document instance or version) when 3220 (time, timeframe, or timeframe relative to a particular event).
  • the query may specify how 3225 (for example, via pager, mobile device, desktop machine) or where 3230 (if it is possible on the basis of the electronic evidence to place the person geographically at the time of the communication) for the communications as well.
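The sequential who/verb/what/when structure, with its optional how and where slots, can be modeled as a simple record. The class name `WhoKnewWhatWhen`, the field names, and the `slots` helper are hypothetical; only the slot order and the example verbs come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Verb relationships named in the text (slot 3210).
VERBS = {"knew", "probably knew", "saw", "believed", "asserted"}


@dataclass
class WhoKnewWhatWhen:
    who: str                     # actor or actor group (3205)
    verb: str                    # verb relationship (3210)
    what: str                    # topic or document instance/version (3215)
    when: str                    # time, timeframe, or relative timeframe (3220)
    how: Optional[str] = None    # device or channel (3225), optional
    where: Optional[str] = None  # geographic location (3230), optional

    def __post_init__(self):
        if self.verb not in VERBS:
            raise ValueError(f"unsupported verb relationship: {self.verb!r}")

    def slots(self) -> List[Tuple[str, str]]:
        """Return the filled slots in their sequential order."""
        ordered = [
            ("who", self.who), ("verb", self.verb), ("what", self.what),
            ("when", self.when), ("how", self.how), ("where", self.where),
        ]
        return [(name, value) for name, value in ordered if value is not None]


query = WhoKnewWhatWhen("Joe Rudd", "knew", "Q3 forecast",
                        "before 2001-03-15", how="mobile device")
slot_names = [name for name, _ in query.slots()]
```

Keeping the slots in a fixed order mirrors the sequential structuring of the query described above; the actual query syntax is not reproduced here.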
  • the who 3205 is narrowed by adding additional features.
  • the query may specify with what frequency 3305 (for example, once or repeatedly) an actor did what 3310 (for example, edit or check in a document, delete a document, or commit a single action or pattern of actions 3305, such as excluding particular other persons from meetings or discussions), to what object 3315 (actor 3205 and/or content 3215), and when 3220.
  • the user can ask how 3310 patterns of behavior (the relationship between an object 3215 and an actor 3205 or content 3215) changed over a specified period of time 3220, or with respect to some other specific context 3405.
  • the user can query how the patterns of communication between two litigants changed after a particular material event.
  • the user may further query whether there is any relationship of statistical significance between the occurrences of events of particular tuples of event types, and if so, what kind.
  • the language generally requires that an actor be specified prior to any other terms.
  • an actor of "anyone" may be specified, or may be automatically inserted by the system.
  • Individual actors can be specified by first name and last name; if only one or the other is provided, the system will look in the recent command history for that user in an attempt to disambiguate it. If nothing suitable is found, the system will try to match the string to both actor first and last names present in the corpus. It will then present a list of appropriate choices, or, if there is only one choice, echo it back to the user for confirmation.
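The disambiguation order described here (full name, then recent command history, then the corpus) can be sketched as below. The function name `resolve_actor` and the representation of actors as `(first, last)` tuples are assumptions made for illustration.

```python
def resolve_actor(name, history, corpus_actors):
    """Return candidate (first, last) actors for a typed name.

    history:       actors from the user's recent commands, oldest first
    corpus_actors: all actors known in the corpus
    One candidate would be echoed back for confirmation; several would
    be presented as a list of choices.
    """
    parts = name.split()
    if len(parts) >= 2:
        # First and last name were both given: nothing to disambiguate.
        return [(parts[0], " ".join(parts[1:]))]
    token = name.strip().lower()
    # Step 1: the most recent matching actor in the command history wins.
    for first, last in reversed(history):
        if token in (first.lower(), last.lower()):
            return [(first, last)]
    # Step 2: match against both first and last names in the corpus.
    return [(first, last) for first, last in corpus_actors
            if token in (first.lower(), last.lower())]


corpus = [("Bob", "Jones"), ("Bob", "Smith"), ("Joe", "Rudd")]
from_history = resolve_actor("bob", [("Bob", "Jones")], corpus)
from_corpus = resolve_actor("bob", [], corpus)
```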
  • An actor's circle of trust can be specified by adding a plus sign "+" after the actor's name.
  • In the case of an aggregate actor, the union of the actors in the different circles of trust is taken. Similarly, an actor group, such as the set of all employees of ACME Corp., could be specified. Similarly, in one embodiment, certain personalities of a given actor (or actors) can be specified.
  • Operators may be active or passive in nature relative to the actor. For example, modifying a document is active, while getting promoted to a higher position is passive.
  • Content modification operators include, but are not limited to, the following:
  • Knew The actor actively engaged in discussion about the topic(s) in question.
  • Non-content operators include employee lifecycle events such as Hire, Departure, Transfer, Promotion, and Role Change.
  • Other non-content events include, but are not limited to: Vacation or leave of absence or sick day, Travel event, Wire transfer send or receive, or Phone call, presuming no transcript of the phone call exists.
  • the "how” may optionally be specified as either a specific device type, such as a Blackberry, or as a category of device, for example a mobile device.
  • the "how” could also be a fax or a voicemail, or a paper letter.
  • the "how” is identified by its immediately following an unquoted “by” or "via.”
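Recognizing the "how" term by an unquoted "by" or "via" can be sketched with a small tokenizer that keeps quoted phrases intact. The function name `extract_how` and the choice to return only the single token following the marker are assumptions; a multi-word device would need to be quoted under this sketch.

```python
import re

# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def extract_how(query):
    """Return the term that immediately follows an unquoted "by" or "via".

    Occurrences of "by" or "via" inside a quoted phrase are ignored,
    because the whole phrase is a single token.
    """
    tokens = TOKEN.findall(query)
    for i, token in enumerate(tokens[:-1]):
        if token.lower() in ("by", "via"):
            return tokens[i + 1].strip('"')
    return None


how_plain = extract_how('joe rudd knew "stand by memo" via pager')
how_quoted = extract_how('joe saw budget by "mobile device"')
how_none = extract_how('anyone saw "paid by wire"')
```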
  • the "where" may be optionally specified by entering the geographic location of the actor at the time of their participation in the particular transaction. This can be done hierarchically, if a tree of locations is provided. If there is more than one actor specified in the query, the where is modified by actor. In one embodiment, this is specified as <actor name> in <location> or <actor name> at <location>.
  • the core engine calculates the primary limiting factors in a query.
  • the information is used to indicate to the user which terms are responsible for very substantially reducing or expanding the result set.
  • the system can optionally inform the user on which terms could be generalized or specialized one level further for best effect on the results set.
  • these alternate queries are run automatically on separate threads at the same time as the base query, in order to facilitate an immediate response to a user question, such as a request for "more” or "less.”
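The text does not say how the primary limiting factors are calculated. One simple possibility, shown here purely as an assumption, is a leave-one-out comparison: each term is dropped in turn and the growth of the result set is measured. The names `limiting_factors` and `run_query`, and the toy corpus, are hypothetical.

```python
def limiting_factors(terms, run_query):
    """Rank query terms by how much each one shrinks the result set.

    run_query(terms) must return the collection of results matching all
    of the given terms. A leave-one-out comparison is used here; this is
    an assumed technique, not one stated in the text.
    """
    base = len(run_query(terms))
    impact = {}
    for term in terms:
        relaxed = [t for t in terms if t != term]
        impact[term] = len(run_query(relaxed)) - base
    # The term whose removal adds the most results is the most limiting.
    return sorted(impact.items(), key=lambda item: item[1], reverse=True)


# Toy corpus: each document is the set of terms it matches.
corpus = [{"a", "b"}, {"a"}, {"a", "c"}, {"b"}]


def run_query(terms):
    return [doc for doc in corpus if set(terms) <= doc]


ranking = limiting_factors(["a", "b"], run_query)
```

In this toy corpus, dropping "b" adds two results and dropping "a" adds one, so "b" is reported as the more limiting term. The relaxed queries are independent of one another, which is what would allow them to run on separate threads alongside the base query.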
  • Each of the operators below can be used in the context of retrieving discussions or individual communications, or both. These may be used to override the system defaults described previously. For one embodiment, the actual retrieval behavior of these operators is determined by the current relevance scoring mechanism in place. One example of such relevance scoring is described in 'An Apparatus for Sociological Data Mining'.
  • Keyword (an operator 3510): Result set contains all discussions or communications with at least one occurrence of a specified term, depending on the context in which it is used. This operator can specify sets of terms through techniques including but not limited to use of wildcard characters and matching using the Levenshtein edit distance.
  • Phrase (an operator 3510): Result set contains all discussions or communications with at least one occurrence of the sequence of terms. This operator can specify sets of related phrases using techniques including but not limited to the use of wildcard characters in individual terms, matching by Levenshtein edit distance between terms and matching by Levenshtein edit distance between sequences of terms.
  • Classifier (an operator 3510): Result set specified by the set of subqueries obtained from expanding a given class from an ontology loaded into the document analysis engine.
  • NamedEntity (an operator 3510): Result set specified by the query obtained from expanding a given named entity from all ontologies loaded into the document analysis engine.
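The two matching techniques named for the Keyword operator, wildcard characters and Levenshtein edit distance, can be sketched as follows. The function `keyword_match`, its `max_distance` parameter, and the whitespace tokenization are illustrative assumptions; the actual retrieval behavior depends on the relevance scoring mechanism in place.

```python
from fnmatch import fnmatchcase


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def keyword_match(term, text, max_distance=0):
    """True if any word of text matches term by wildcard or edit distance."""
    pattern = term.lower()
    for word in text.lower().split():
        if fnmatchcase(word, pattern):
            return True
        if levenshtein(word, pattern) <= max_distance:
            return True
    return False


wildcard_hit = keyword_match("invoic*", "Please send the invoices")
fuzzy_hit = keyword_match("recieve", "did you receive it", max_distance=2)
```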
  • the second group of operators searches over metadata collected from each individual communication as well as relationships between documents created during the evidence accrual process while building discussions. These operators return discussions when applied.
  • CommunicationType Returns all discussions containing certain types of communication items, for example email.
  • EventType Returns all discussions that contain an event of a particular kind, such as a board meeting.
  • Event Returns all discussions that contain a particular instance of an event, for example, the board meeting that occurred on March 15, 2001.
  • PragmaticTag Returns any discussions containing one or more items with the given pragmatic tag.
  • ActorRelations Returns discussions with the indicated relationship between a set of actors, cliques ("circles of trust") or groups. Relationships include but are not limited to: "between", "among", "drop", "add", "exclude." Some of these operators optionally use a ternary syntax: <joe rudd> excludes <bob jones> (see 'An Apparatus for Sociological Data Mining' for an explanation of these items).
  • ActorStatistics Returns discussions with a statistical relationship between an indicated actor and others. For example, "most frequent correspondents with ActorX".
  • the fifth group of operators are combinatorial operators used to combine result sets of subqueries.
  • the conventional logical operators have a different effect when applied over discussions.
  • DiscussionMember Takes a set of individual documents and returns the set which are members of one or more discussions. The negation may be used in order to retrieve the complement set. Used with -statistics, it will calculate various statistics on the differences between the member and non-member documents.
  • DiscussionProperties Used on one or more discussions, queries against the total number of communications/events, types, the maximum depth, overall duration, frequency of communications, topics, actors, etc.
  • ExpandToDiscussions Returns the set of unique discussions containing at least one document from the document set.
  • the document set is obtained from the result set of a subquery.
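The DiscussionMember and ExpandToDiscussions operators reduce to set operations once discussions are modeled as sets of document identifiers. That model, and the function names below, are assumptions made for this sketch.

```python
def expand_to_discussions(doc_ids, discussions):
    """Return the discussions containing at least one of the documents.

    discussions maps a discussion id to the set of its document ids.
    """
    docs = set(doc_ids)
    return {disc for disc, members in discussions.items() if members & docs}


def discussion_member(doc_ids, discussions, negate=False):
    """Return the documents that belong to one or more discussions.

    With negate=True the complement set (non-member documents) is returned.
    """
    in_any_discussion = set().union(*discussions.values())
    docs = set(doc_ids)
    return docs - in_any_discussion if negate else docs & in_any_discussion


discussions = {"d1": {1, 2}, "d2": {2, 3}}
expanded = expand_to_discussions([2], discussions)
members = discussion_member([1, 4], discussions)
non_members = discussion_member([1, 4], discussions, negate=True)
```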
  • a specific graphical querying tool is also provided, in addition to the views that serve double-duty as visual query builders.
  • the query tool includes a text field that users may use to enter words, phrases, or ontology names.
  • a separate pane to specify ontologies (similar to the ontology selection dropdown list 3715 shown in Figure 37a) using a tree to select the desired items may be displayed, as well as a view indicating which ontology hits correlate with which others - for example content discussing tax evasion and travel frequently co-occurring - also allowing the desired ontologies to be selected and added to the query.
  • Figure 36 depicts another visual query means using a Venn diagram representation to indicate how many documents were "hit” by a particular ontology, or by a combination of particular ontologies.
  • a series of interlocking circles 3620 represent the extent to which communications "hit” only one, or more than one, ontology.
  • the interlocking circles 3620 are used to indicate how many documents have been found to reside within each of three categories, as shown in the single-category total 3605. It also shows the number of documents that reside in more than one of the three categories, as shown in the multiple-category total 3610.
  • an explanatory text 3615 prompts the user to click in the relevant portion of the Venn diagram in order to see the corresponding documents.
  • users may click on any bounded area of the diagram. Doing so will bring up a panel containing a relevance ranked list of either individual documents or discussions, depending upon the user's preference.
  • the relevance ranking scheme will be altered to favor documents that have a substantial score for each ontology in question.
  • This view is also used in thumbnail form in order to show how the relative proportions of topics changed due to the addition of new documents to the corpus. This is done both by showing "before" and "after" thumbnails, and by displaying side-by-side thumbnails of each segment of the data set (however the segments are determined by the user) so that their topic content may be easily compared.
  • a similar representation can be constructed on the basis of actors rather than ontologies; further both actor and ontology information could be combined in one Venn diagram view.
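The region totals of the Venn diagram, and a ranking that favors documents scoring well on every ontology of a region, can be sketched as follows. The data layout and function names are hypothetical, and ranking by the weakest per-ontology score is only one assumed way to favor documents with a substantial score for each ontology.

```python
from collections import Counter


def venn_counts(hits):
    """Count documents per Venn region.

    hits maps a document id to the set of ontologies it "hit". Each
    region is identified by the exact set of ontologies it covers.
    """
    return Counter(frozenset(onts) for onts in hits.values() if onts)


def rank_for_region(region, scores):
    """Rank documents in a region by their weakest ontology score.

    scores maps a document id to {ontology: score}. Using the minimum
    favors documents with a substantial score for every ontology.
    """
    docs = [d for d, s in scores.items() if all(o in s for o in region)]
    return sorted(docs, key=lambda d: min(scores[d][o] for o in region),
                  reverse=True)


hits = {1: {"tax"}, 2: {"tax", "travel"}, 3: {"travel"},
        4: {"tax", "travel"}, 5: set()}
regions = venn_counts(hits)
single_total = sum(n for r, n in regions.items() if len(r) == 1)
multiple_total = sum(n for r, n in regions.items() if len(r) > 1)

scores = {2: {"tax": 0.9, "travel": 0.1}, 4: {"tax": 0.5, "travel": 0.6}}
ranked = rank_for_region({"tax", "travel"}, scores)
```

Clicking a bounded area of the diagram would correspond to looking up one key of `regions` and listing its documents in the ranked order.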
  • individual and aggregate actor icons 2910 are provided in the search panel, though actor names may also be typed in the text field 2905. Users may specify which icons should appear; initially, by default, the system will select the ones with the greatest communication frequency. Subsequently, by default, it will display the actors who appear most frequently in queries. Additional options allow the exclusion of specific actors; if an actor has been excluded, the icon representing him will have an "X" or diagonal bar superimposed on it, similar to the symbol used in prohibition signs, as shown in Figure 31.
  • events of global interest 2915 are added to a catalog so that they are displayed in the query tool for easy access. Additionally, a date range may be specified using standard calendar selection controls 2920. For one embodiment, events of interest will also appear in the calendar 2925 by coloring the square for the particular date(s) in question. Double-clicking on a colored square will bring up a pop-up with a description of the event. If an event is selected, the user will be asked whether they want the query to be:
  • the querying tool allows the user to specify, through the use of check boxes 3010, in what way an actor must have been involved with each document in order for the document to be considered responsive to the query. Examples of the involvement include, but are not limited to: creating, changing, reading, seeing, and/or receiving a document.
  • the querying tool allows the user to select pre-created, saved queries 3005. Possible mechanisms for selecting the saved queries include, but are not limited to, drop-down list or combo boxes (as shown in Figure 30) and list boxes.
  • the user can specify that only discussions involving certain personalities of a given actor should be returned.
  • the query will be echoed back to the user.
  • all queries are echoed back to the user in front of the result set.
  • This is done using query templates, such as those specified in Figures 32-34.
  • the echo is constructed by concatenating the following pieces of data: "Query on:" <actors> <actions performed> <content descriptors> <time>
  • each query template has a corresponding natural language phrase that is used to generate the echo.
  • the above would be expressed as:
  • the user may enter natural language queries, and the system will interpret these queries by matching them to the appropriate query template and then performing any necessary word mapping via the use of ontologies.
  • Additional query options include, but are not limited to, the following:
  • the above-mentioned discussion length query options include (but are not limited to) the longest or shortest discussions (both by number of items and calendar duration) among a given set of actors, or on a given topic.
  • the ability to target the longest or shortest discussions by actor provides a targeted tool for probing the activities of specific actors of interest, without being restricted to particular topics or content. This is important because such restrictions limit the user to finding only what he already thinks may be there, leaving potentially important or interesting information unrevealed.
  • GUI tool will provide the user feedback on which terms caused the query (on a relative basis) to over-generate or under-generate.
  • the user may also avail herself of a number of canned query templates. These include, but are not limited to, the following:
  • Did <this> actor receive <this> version of <this> particular document?
  • the user may configure the interface to display one or more of a number of different kinds of views in response to a query.
  • the default view is a tabular listing of the discussions that are responsive to the query, relevance ranked accordingly. This table may include all of the following information, plus any additional information that has been programmatically added:
  • Each line of the results view shows the discussion title 2605, discussion start date 2610 and end date 2615, and a button 2625 depicting the image and name of each actor involved with the discussion.
  • clicking on the button displays information related to the actor.
  • only the actor image is displayed on the button.
  • only the actor name is displayed on the button.
  • a non-clickable image or text box is used, rather than a button.
  • only primary actors are shown.
  • only certain personalities of an actor are shown.
  • the discussion is displayed by clicking on the relevant line in the results view, or by highlighting the results view line and clicking the 'Display Discussion' button 2620.
  • a text summarization of the discussion is displayed on the relevant line in the results view.
  • the user may also opt to have the discussions returned from a query visualized in a matrix view, shown in Figure 27, in which the columns represent a variety of discussion properties extracted from the user's query. For example, if there were 20 actors participating in all of the discussions returned by a particular query, each one would be represented by its own column, as would be other properties, such as communication type, which relevant ontologies "hit" it, and so on.
  • Each discussion 2710 is displayed in its own row, and each property 2705 that it has, such as the participation of a particular actor, causes the relevant square to be colored in. Different fill colors may be used in order to indicate whether the actor was a primary actor in the discussion, just an actor, or merely a passive participant.
  • This is depicted in Figure 27 in compact form (without use of the actor images).
  • the user may choose to save a number of queries and their results in a particular location, so that this data may be displayed together, as pictured in Figure 28.
  • saved queries are displayed in a list, where each item is identified by a folder icon 2850, to convey to the user the fact that it may be expanded.
  • a results list 2835 containing relevant discussions and their associated actors 2840 and date range becomes visible.
  • a folder icon 2850 is used to represent each query, and the textual content 2855 of the query is displayed to the right of the folder icon.
  • the first query is shown expanded, revealing the results list 2835.
  • Descriptive icons 2815, 2820, 2825 and 2830 appear to the left of each saved query. Clicking on the icon representing a pencil 2820 allows the user to annotate the query; a green rectangle next to the pencil icon indicates that the query has already been annotated. Clicking on the icon representing a hard drive 2830 saves the query to the local machine.
  • the document icon 2815 at the left becomes replaced with the initials of the last user to modify the data (shown as TU' in this figure).
  • the folder icon 2825 is used to add a discussion to a bin or folder of the user's choosing. For each saved query, a list of any relevant discussions 2805 and communications 2810 is shown. In one embodiment, such items show the list of actors 2840 involved, and the date range 2845 of the relevant discussion.
  • saved data may be annotated (by clicking on the pencil icon,) saved to a local hard drive (by clicking on the hard drive icon,) or placed in one or more particular bins (by clicking on the folder icon to see a list of options that may be selected,) and that the initials of the user who last manipulated the document are included.
  • QBE (Query By Example)
  • QBE refers to a set of techniques whereby a user provides an exemplar of what she is looking for in lieu of constructing an explicit query.
  • Figures 37a-37c are screen shots of a series of Query by Example (QBE) windows. This refers to the type of query in which an exemplar of the desired returned object is specified by the user.
  • QBE becomes a more complicated issue than it is with regular documents.
  • 'An Apparatus for Sociological Data Mining' application discussions have large numbers of properties, the importance of which may shift according to use case. In other words, there is no simple, one size fits all similarity metric for discussions.
  • the first QBE window shown in Figure 37a, therefore allows the user to choose from among a plurality of properties.
  • the properties include (but are not limited to): actors 2910, content terms or phrases 2905, topics 3705, content type 3710, ontology 3715, and time range 3720.
  • the second window, shown in Figure 37b contains a set of discussion properties that can be considered as evidence in determining similarity.
  • the set shown can be selected by the user from the full set of discussion properties (except for unique ID).
  • one embodiment of the invention provides the default set 3725 of discussion properties, pictured in Figure 37b.
  • the colored rectangles 3735 represent the relative importance of each of the discussion properties.
  • the user may modify the sizes of the different colored rectangles 3735 in the box at the bottom of Figure 37b. Since the size of the box is fixed, enlarging one box proportionally reduces the sizes of the others. By repeated resizings of these rectangles, the user can achieve whatever relative scoring amongst these different factors they wish.
  • this relative scoring information is saved by the system, and will be the default setting until the user changes it again.
  • a pie chart may be used, in a similar manner.
  • the user may select relative importance numerically by percentage, or using some other tool.
  • the user may name and save different settings, as different settings may be useful for different use cases.
  • the system provides the following functionality in this regard:
  • the user may enter a combination containing all or some of the following query items: topic, document type, ontology, time range, and actor.
  • the system will return a results list containing all discussions that meet this combination of criteria.
  • the combination of parameters entered by the user can include certain personalities of a given actor.
  • a user may right-click on any graphical representation of a discussion in any of the previously described views in order to bring up the menu item "Find Similar". This will bring up a window according to the user's configured preferences displaying the discussions returned by the query.
  • a user may right-click on any graphical representation of an individual textual communication, for example, the rows in a table representing singleton documents returned in response to a query, in order to locate other documents that are similar both contextually and by themselves. This will bring up a two-tabbed view, one with discussions, and one with singleton documents.
  • the user may enter a document containing text into the system in order to use its contents as input to the query engine.
  • all named entities, including actors will be extracted from the document.
  • a topic analysis will be done via the use of ontologies and pragmatic tagging, known text blocks will be sought, and finally any mention of dates will be extracted.
  • This usage is depositions in a litigation context.
  • the desired behavior of the QBE mechanism may vary by application.
  • the default behavior is to consider that actor and content are the two key items in the weighting; all other properties merely impact the ranking of the discussion in the result set.
  • actor is expanded first to any actor with the same role or title in the same organization as the actor(s) provided in the exemplar, and then to any actor in the same organization.
  • Content may be determined by ontology or pragmatic tag, with the former being given more weight. Discussions that contain the desired actors or content under this definition are returned.
  • results are relevance-ranked according to the scheme laid out in 'An Apparatus for Sociological Data Mining'.
  • Advanced Options panel as shown in Figure 37b, and specify the relative weight that he wishes to assign to each property, and whether or not the value of the property is to be treated strictly as specified in the exemplar. For example, must the exact actors in the exemplar be present in order for a discussion to be retrieved, or does it suffice if their colleagues in the same department are present?
  • the colored rectangles 3735 represent the relative importance of each of the discussion properties.
  • the user may modify the sizes of the different colored rectangles 3735 in the box at the bottom of Figure 37b. Since the size of the box is fixed, enlarging one box proportionally reduces the sizes of the others. By repeated resizings of these rectangles, the user can achieve whatever relative scoring amongst these different factors they wish. In one embodiment, this relative scoring information is saved by the system, and will be the default setting until the user changes it again. Alternatively, a pie chart may be used, in a similar manner. Alternatively, the user may select relative importance numerically by percentage, or using some other tool. In one embodiment, the user may name and save different settings, as different settings may be useful for different use cases.
  • the system performs the query.
  • the property or properties primarily responsible for the rank are shown 3750 (in one embodiment, properties are color-coded, and the coding is explained in a legend 3745 below the results).
  • the initial item was scored highly primarily on the basis of shared terms. If the high score were also attributable to shared actors, a blue chit would also appear.
  • the degree of saturation of the color chit is used to express the relative level of similarity in this dimension.
  • the user sees a warning message 3755 if the result has been broken down into clusters.
  • the user may configure the view to show any of the available discussion properties. Similarly, in one embodiment, he may resize and reorder the various columns via direct manipulation.
  • views differ from the previously described ones in that they are less actor-focused and more object-focused. These views are intended to depict the history of a particular document (or other electronic data object) as it moves from creation, to distribution, various modifications, changes in form, extractions or copy/pastes to other documents, and possibly deletion. Such views can be extremely important in investigative contexts, when a particular document becomes the focus of attention.
  • Figure 38 depicts the lifecycle view for a document. If versioning information is available from a document management system or repository, or if the creating application provides it, the versions are shown by number 915 above the view, with vertical lines extending beneath them to help make it clear which actors modified or received a document before, or after a particular version change.
  • Major versions and minor versions can be represented differently as per user preference; minor versions may be omitted from the display entirely, represented by thinner lines and smaller number boxes, or drawn the same as major versions.
  • Other designations may be added by the user manually, or extracted automatically from systems that contain such information. These designations include, but are not limited to, published, shipped, and produced.
  • the legend panel 3825 indicates the color coding of some of the different kinds of possible lifecycle events.
  • the lifecycle view is drawn according to a left to right timeline.
  • the actor icons only need be drawn in approximately the correct location with respect to the timeline. This is for purposes of readability; drawing a separate actor icon for related actions that may have taken place only moments apart from one another would only serve to decrease the readability of the visualization.
  • an additional actor icon will be drawn if it is necessary to do so in order to not combine events which occurred on opposite sides of a version line. Therefore to capture such information, each actor icon is framed by a frame that can be partitioned up to 8 times in order to indicate the occurrence of different events performed by the actor on the document within a fairly short period of time.
  • an actor might check out a document, modify it in some way, email it around to various people, and then check it back into the repository - all within a matter of a short period of time.
  • the actor frame would have 4 colors, one side each, in whatever colors designated by the legend. With the color scheme pictured below, this would be: orange, red, blue, and yellow.
  • the user may click on an actor icon in order to view a detailed log of events represented by that instance of the actor icon. Clicking on any part of the frame will bring up a pop-up with a detailed description of that action. For example, in the case of a check-in, the detailed description would include all of the following information (if available)
  • the user may click on the clock icon above the actor icon in order to see a simple chronological list with exact timestamps of the events represented by that actor icon instance.
  • the "?" icon may be used to access other kinds of information as specified in user preferences.
  • Speech recognition is already widely used by the legal and medical profession for recording of briefs, reports, and the like.
  • the system includes a means of extracting data that is input by speech recognition, and making such data searchable and retrievable like any other artifact.
  • Input to speech recognition can take the form either of speaker-dependent recognition (the type employed by dictation software) or speaker-independent recognition (the type employed by telephony applications); the system includes adapters to incorporate data from both types of systems.
  • the system may utilize speech recognition as an interface allowing users to query data already in the system. To this end, an interactive voice interface to the system could display discussions and other data to the user, either on a device or through an audio-only interface.
  • an auditory interface is commonly used to play back data to the user, be it for playback over a telephone or through speakers attached to another device such as a desktop computer.
  • the system includes auditory interfaces, including but not limited to: playback of indexed documents by text-to-speech, or spoken synthesis that accompanies or parallels any of the visual diagrams generated by the system.
  • Further remote interfaces for the system may include wireless and handheld display and input to the system, for example through WAP or similar protocols that transmit data over wireless networks (including telephony networks), input of data via Short Messaging System (SMS) or similar protocols, the use of downloadable/syncable views and data input for handheld/palmtop/tablet PCs, and interfaces for wearable computing devices.
  • the system allows both input and retrieval of data into the system through any combination of devices; for example, a user's spoken query will be displayable on the screen of a handheld device.
  • a natural language interface is a highly desirable mode of interaction with the system.
  • Users who are limited to an auditory interface can respond better to systems that are designed around the vagaries of human speech (which include disfluencies, variable noise conditions, and the strictly linear exchange of information).
  • the nature of auditory interfaces is such that spontaneity and a tolerance for garbled input is incorporated into the interface; rather than scripted, fixed input that can be manipulated visually, the voice interface must attempt to parse ambiguous user input and return a "system appropriate" result.
  • speech recognition interfaces rely on a grammar that restricts potential user utterances in order to provide accurate recognition.
  • any possible user utterance can generate a fixed-length set of possible parses. From this set of potential parses, an algorithm is applied to account for phonetic similarities in homophones, to remove content that occurs in only a few parses, and so forth, leaving a "core" hypothesis that can be used as the basis for a search.
  • the user utterance "Find me anything about fraud" might generate the following hypothesis set from a speech recognition engine:
  • a preliminary result set can be relayed to the speaker by voice interface, allowing of course for additional refinement or correction of the query, as well as for more detailed display/playback of user-selected elements of the result set.
  • the system may repeat the query as understood to the user, permitting the user to either confirm the query or to repeat the query to modify it.
  • Figure 39 is a screen shot showing one embodiment of the discussion view, as used on a mobile device.
  • a list 3920 of returned discussions is shown, each of which is associated with a checkbox 3915 allowing the user to select the discussions in order to view further detail.
  • the query 3910 that caused the list 3920 of discussions to be returned is displayed.
  • a group of buttons 3905 allows the query to be launched or interrupted.
  • FIG. 40 is a screen shot of one embodiment of the case management master window.
  • the user can select from among various types of communications 4045 (and, in one embodiment, the actors who sent communications), or can select discussions 4050.
  • Documents are displayed in the top right pane 4010.
  • the top right pane 4020 shows a privileged document, which is flagged 4015 as such.
  • the user can enter text in order to find specific discussions, documents, or actors.
  • the bottom-left pane 4030 is used to bookmark searches to which the user wishes to return.
  • a group of option buttons 4040 allows the user to select between management of discussions, documents, or actors, and a set of command buttons 4025 allows the user to select different views of the data.
  • This window contains the following functionality of interest:
  • Documents including discussions may be marked as "privileged" causing the red privileged stamp to always appear over the document in electronic form, and to be printed when the document is printed.
  • the user may search for a word or topic in discussions, according to the actors to whom the words or topic are attributable, or in individual documents.
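The fixed-size box of colored rectangles 3735 described above for the QBE Advanced Options panel can be illustrated with a short sketch. The following Python is a minimal illustration only, assuming the relative importance of each discussion property is stored as a fraction of the box; the function name `resize_weight` and the example property names are hypothetical and do not come from the specification.

```python
def resize_weight(weights, name, new_value):
    """Return new weights with `name` set to `new_value` and the other
    weights scaled proportionally so that the total stays 1.0.

    This mirrors a fixed-size box: enlarging one rectangle shrinks the
    others in proportion to their current sizes. (Assumed model, not the
    specification's stated implementation.)
    """
    if name not in weights:
        raise KeyError(name)
    if not 0.0 <= new_value <= 1.0:
        raise ValueError("new_value must be between 0 and 1")

    others_total = sum(v for k, v in weights.items() if k != name)
    remaining = 1.0 - new_value
    result = {}
    for key, value in weights.items():
        if key == name:
            result[key] = new_value
        elif others_total > 0:
            # Keep the other properties in the same ratio to one another.
            result[key] = value * remaining / others_total
        else:
            # All other weights were zero: share the remainder evenly.
            result[key] = remaining / (len(weights) - 1)
    return result


# Hypothetical starting weights for three discussion properties.
weights = {"actors": 0.4, "content": 0.4, "time range": 0.2}

# Enlarging "actors" shrinks "content" to about 0.2 and "time range" to about 0.1.
updated = resize_weight(weights, "actors", 0.7)
```

Repeated calls correspond to the repeated resizings described above; a saved, named setting would simply be one such dictionary stored under a user-chosen name.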

Abstract

A method of organizing information is disclosed. The method comprises providing a visualization of actor (310) communications in the context of one or more discussions (305), a discussion (305) including at least one actor (310) and at least one documented (310) communication.

Description

A METHOD AND APPARATUS TO VISUALLY PRESENT DISCUSSIONS FOR
DATA MINING PURPOSES
FIELD OF THE INVENTION
[0001] The present invention relates to electronic documents, and more particularly to a method for visualizing the relationships among, and retrieving, one or more groups of documents satisfying a user-defined criterion or set of criteria.
BACKGROUND
[0002] The volume of electronic information in both personal and corporate data stores is increasing rapidly. Examples of such stores include e-mail messages, word-processed and text documents, contact management tools, and calendars. But the precision and usability of knowledge management and search technology has not kept pace. The vast majority of searches performed today are still keyword searches or fielded searches. A keyword search involves entering a list of words, which are likely to be contained within the body of the document for which the user is searching. A fielded search involves locating documents using lexical strings that have been deliberately placed within the document (usually at the top) with the purpose of facilitating document retrieval.
[0003] These data retrieval techniques suffer from two fundamental flaws. Firstly, they often result in either vast numbers of documents being returned, or, if too many keywords or attribute-value pairs are specified and the user specifies that they must all appear in the document, no documents being returned. Secondly, these techniques are able only to retrieve documents that individually meet the search criteria. If two or more related (but distinct) documents meet the search criteria only when considered as a combined unit, these documents will not be retrieved. Examples of this would include the case where the earlier draft of a document contains a keyword, but where this keyword is absent from the later document; or an e-mail message and an entry in an electronic calendar, where the calendar entry might clarify the context of a reference in the e-mail message. There is a clear need for a search technique that returns sets of related documents that are not merely grouped by textual similarity, but also grouped and sequenced according to the social context in which they were created, modified, or quoted.
[0004] This would make it possible to retrieve a very precise set of documents from a large corpus of data. Hitherto, with conventional search tools, this has only been possible by the use of complex search queries, and the results have been restricted to documents that individually meet the search criteria. It is desirable to be able to retrieve a precise set of documents from a large corpus of texts using relatively simple search queries. It would be of further benefit to present said documents in the context of causally related links (for example, a document containing the minutes of a board meeting has a causal link to an email announcing that meeting), even when those other documents do not, individually, satisfy the search criteria. This would relieve the user of the need for prior knowledge (before running the search) of such details as the exact date on which a message was sent, and who sent it. Existing search tools require such prior knowledge, because they do not establish causal links between documents.
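The second flaw described above can be made concrete with a small sketch. The following Python is illustrative only: it assumes the grouping of related documents is already known (how such groups are derived is outside this example), and the function names and sample texts are hypothetical.

```python
def matches_all(text, keywords):
    """True if every keyword occurs as a whole word in the text."""
    words = set(text.lower().split())
    return all(keyword.lower() in words for keyword in keywords)


def search_documents(docs, keywords):
    """Conventional search: each document must match on its own."""
    return [doc_id for doc_id, text in docs.items()
            if matches_all(text, keywords)]


def search_groups(docs, groups, keywords):
    """Group search: related documents are considered as a combined unit."""
    hits = []
    for group_id, doc_ids in groups.items():
        combined = " ".join(docs[doc_id] for doc_id in doc_ids)
        if matches_all(combined, keywords):
            hits.append(group_id)
    return hits


# Hypothetical e-mail and calendar entry that only match together.
docs = {
    "email": "see you at the offsite on friday",
    "calendar": "offsite budget review",
}
groups = {"g1": ["email", "calendar"]}  # assumed to be given
keywords = ["friday", "budget"]

per_document = search_documents(docs, keywords)  # no single document matches
per_group = search_groups(docs, groups, keywords)  # the related pair matches
```

Neither document contains both keywords, so the per-document search returns nothing, while the search over the combined unit returns the related pair.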
SUMMARY
[0005] A method of organizing information is disclosed. The method comprises providing a visualization of actor communications in the context of one or more discussions, a discussion including at least one actor and at least one documented communication.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
[0007] Figure 1 is a block diagram of one embodiment of a network, which may be used with the present invention.
[0008] Figure 2 is a block diagram of one embodiment of a computer system.
[0009] Figure 3 is a block diagram of navigation flow in one embodiment of the present invention.
[0010] Figure 4 is a block diagram of user-interface flow in one embodiment of the present invention.
[0011] Figure 5 is a screen shot of one embodiment of the participant graph.
[0012] Figure 6 is a screen shot of another embodiment of the participant graph, in which the time of day is represented.
[0013] Figure 7 is a screen shot of a form panel for adding items that were not originally part of the discussion being visualized.
[0014] Figure 8 is a screen shot of one embodiment of a participant graph, in which a pop-up showing basic information about the item is displayed.
[0015] Figure 9 is a screen shot of one embodiment of a document trail graph.
[0016] Figure 10 is a screen shot of one embodiment of a money trail graph.
[0017] Figure 11 is a screen shot of one embodiment of a view that uses a color, pattern, or similar distinguishing mechanism which uses the color spectrum to help users to discern small shifts in the communication activity of a very large population of actors.
[0018] Figure 12 is a screen shot of one embodiment of an activity graph, which illustrates the amount of communication among actors over a user-specified period of time.
[0019] Figure 13 is a screen shot of one embodiment of a discussion timeline, in which each discussion appears as a rectangle of the length appropriate relative to its duration in the timeline.
[0020] Figure 14 is a screen shot of one embodiment of a discussion timeline, with a spider-eye panning widget to temporarily change the resolution of the discussion visualization.
[0021] Figure 15 is a screen shot of one embodiment of a discussion timeline, showing the individual events in the discussion.
[0022] Figure 17 is a screen shot of one embodiment of a graphical representation of a discussion timeline.
[0023] Figure 16 is a screen shot of one embodiment of a discussion cluster view.
[0024] Figure 18 is a screen shot of one embodiment of a transcript view, showing actor color-coding.
[0025] Figure 19 is a screen shot of one embodiment of a transcript view, showing actor activity.
[0026] Figure 20 is a screen shot of one embodiment of a transcript view, showing discussion partitions.
[0027] Figure 21 is a screen shot of one embodiment of a transcript view, showing actor and document-type color-coding.
[0028] Figure 22 is a screen shot of one embodiment of a transcript view, showing document attachments.
[0029] Figure 23 is a screen shot of one embodiment of a transcript view, showing color-coding of quoted text.
[0030] Figure 24 is a screen shot of one embodiment of a transcript view, showing that a deletion has occurred.
[0031] Figure 25 is a screen shot of one embodiment of a transcript view, showing Instant Messages (IMs).
[0032] Figure 26 is a screen shot of one embodiment of a query results view, showing discussion titles, discussion start and end dates, and actor images.
[0033] Figure 27 is a screen shot of one embodiment of a matrix query results view.
[0034] Figure 28 is a screen shot of one embodiment of the saved queries view.
[0035] Figure 29 is a screen shot of one embodiment of a tool for submitting user queries.
[0036] Figure 30 is a screen shot of one embodiment of a tool for submitting user queries, in which said tool allows the user to select types of actor involvement, and to use a saved query.
[0037] Figure 31 is a screen shot of one embodiment of a tool for submitting user queries, in which said tool allows the user to exclude certain actors from the query.
[0038] Figure 32 is a diagram of a query template (Template 1).
[0039] Figure 33 is a diagram of a query template (Template 2).
[0040] Figure 34 is a diagram of query templates (Templates 3 & 4).
[0041] Figure 35 is a diagram of query components.
[0042] Figure 36 is a screen shot of one embodiment of a Venn diagram view of document categories.
[0043] Figures 37a - 37c are screen shots of one embodiment of Query by Example (QBE).
[0044] Figure 38 is a screen shot of one embodiment of the document lifecycle view.
[0045] Figure 39 is a screen shot of one embodiment of a user interface for viewing discussions on a PalmOS-based mobile device.
[0046] Figure 40 is a screen shot of one embodiment of the master window view of the case management user interface.
DETAILED DESCRIPTION OF THE INVENTION
[0047] A method and apparatus for visualizing both the electronic paper trails referred to as "discussions" and the statistical anomalies and patterns that are directly computable from these discussions is disclosed. A discussion in this context is a heterogeneous set of causally related communications and events for which either electronic evidence exists, or can be created to reflect. Thus, a discussion provides a means of reviewing a series of related events that occurred over time. One example of generating such discussions from raw communications data is discussed in more detail in copending Application Serial No XXX, entitled "A Method and Apparatus for Retrieving Interrelated Sets of Documents", filed concurrently herewith (hereinafter referred to as 'An Apparatus for Sociological Data Mining'). The visualizations and user interface tools described in this application greatly facilitate the efficient and effective review and understanding of such chains of events.
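By way of illustration only, a discussion as characterized above (a heterogeneous set of causally related communications and events reviewed over time) might be held in a structure such as the following. This Python sketch is an assumption made for clarity; the class names, field names, and sample values are hypothetical and are not taken from this application or from the copending application.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Item:
    """One communication or event in a discussion (hypothetical fields)."""
    kind: str            # e.g. "email", "calendar_event", "document_edit"
    actor: str           # the actor responsible for the item
    timestamp: datetime
    summary: str


@dataclass
class Discussion:
    """A heterogeneous collection of related items, reviewable over time."""
    title: str
    items: list = field(default_factory=list)

    def add(self, item):
        self.items.append(item)

    def timeline(self):
        """Items in chronological order."""
        return sorted(self.items, key=lambda item: item.timestamp)

    def actors(self):
        """Distinct actors involved, in alphabetical order."""
        return sorted({item.actor for item in self.items})

    def date_range(self):
        """Start and end timestamps of the discussion."""
        if not self.items:
            raise ValueError("discussion has no items")
        ordered = self.timeline()
        return ordered[0].timestamp, ordered[-1].timestamp


# Hypothetical example: a meeting announcement followed by its minutes.
discussion = Discussion("Board meeting")
discussion.add(Item("document_edit", "bob", datetime(2002, 3, 8, 16, 0), "minutes"))
discussion.add(Item("email", "alice", datetime(2002, 3, 1, 9, 0), "announcement"))
```

The title, date range, and actor list computed here correspond to the kinds of properties that the query results views display for each discussion.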
[0048] The views described in the following sections provide both graphic visualizations, as well as a means of navigating through the complex chains of communications and events that comprise the data being visualized. These views may be offered to the user in a Model View Controller (MVC) graphical user interface, or via a web-based application.
[0049] The present invention will typically be used in conjunction with a computer network. Figure 1 depicts a typical networked environment in which the present invention operates. The network 105 allows access to email data stores on an email server 120, log files stored on a voicemail server 125, documents stored on a data server 130, and data stored in databases 140 and 145. Data is processed by an indexing system 135 and sociological engine 150, and is presented to the user by a visualization mechanism 140. The visualization mechanism 140 is described in more detail in the present application.
[0050] The present invention is for use with digital computers. Figure 2 depicts a typical digital computer 200 on which the present system will run. A data bus 205 allows communication between a central processing unit 210, random access volatile memory 215, a data storage device 220, and a network interface card 225. Input from the user is permitted through an alphanumeric input device 235 and cursor control system 240, and data is made visible to the user via a display 230. Communication between the computer and other networked devices is made possible via a communications device 245.
[0051] It will be appreciated by those of ordinary skill in the art that any configuration of the system may be used for various purposes according to the particular implementation. The control logic or software implementing the present invention can be stored in main memory 250, mass storage device 225, or other storage medium locally or remotely accessible to processor 210.
[0052] It will be apparent to those of ordinary skill in the art that the system, method, and process described herein can be implemented as software stored in main memory 250 or read only memory 220 and executed by processor 210. This control logic or software may also be resident on an article of manufacture comprising a computer readable medium having computer readable program code embodied therein and being readable by the mass storage device 225 and for causing the processor 210 to operate in accordance with the methods and teachings herein.
[0053] The present invention may also be embodied in a handheld or portable device containing a subset of the computer hardware components described above. For example, the handheld device may be configured to contain only the bus 215, the processor 210, and memory 250 and/or 225. The present invention may also be embodied in a special purpose appliance including a subset of the computer hardware components described above. For example, the appliance may include a processor 210, a data storage device 225, a bus 215, and memory 250, and only rudimentary communications mechanisms, such as a small touch-screen that permits the user to communicate in a basic manner with the device. In general, the more special-purpose the device is, the fewer of the elements need be present for the device to function. In some devices, communications with the user may be through a touch-based screen, or similar mechanism.
[0054] It will be appreciated by those of ordinary skill in the art that any configuration of the system may be used for various purposes according to the particular implementation. The control logic or software implementing the present invention can be stored on any machine-readable medium locally or remotely accessible to processor 210. A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g. a computer). For example, a machine readable medium includes read-only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, electrical, optical, acoustical or other forms of propagated signals (e.g. carrier waves, infrared signals, digital signals, etc.).
[0055] Navigation among views is facilitated by the fact that all of the viewable entities have very close relationships to one another, as depicted in Figure 3. The user can submit queries 320, which return discussions 305. Each discussion must contain at least two actors 310. Each of the actors 310 about whom the user can submit queries 320 must appear in zero (0) or more discussions 305 (an actor can appear in 0 discussions by being connected in some way with a singleton document which, by definition, is not part of a discussion). An actor 310 can be associated with multiple topics 315, and vice versa. Each discussion 305 can be associated with multiple topics 315, and vice versa.
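The relationships in Figure 3 can be illustrated with a minimal data-model sketch. The class and function names below are hypothetical and are not taken from the specification; deriving an actor's topics through the discussions in which the actor appears is likewise an assumption made only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Discussion:
    """A discussion 305; per the text it must contain at least two actors 310."""
    title: str
    actors: set[str]
    topics: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        if len(self.actors) < 2:
            raise ValueError("a discussion must contain at least two actors")


def discussions_for_actor(actor: str, discussions: list[Discussion]) -> list[Discussion]:
    """Return the zero or more discussions in which the actor appears."""
    return [d for d in discussions if actor in d.actors]


def topics_for_actor(actor: str, discussions: list[Discussion]) -> set[str]:
    """Collect topics linked to an actor via discussions (an illustrative assumption)."""
    found: set[str] = set()
    for d in discussions_for_actor(actor, discussions):
        found |= d.topics
    return found
```

An actor connected only to a singleton document would simply yield an empty list from `discussions_for_actor`, matching the "zero or more" wording above.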
[0056] Hence, for example, in a view depicting discussions, the user can generally click on an image representing an actor to see additional information about this actor, and vice versa.
[0057] More generally, the usage of the user interface flows as shown in Figure 4. A user submits a query 320 using one of Query by Example 405, Multi-evidence Query User Interface 410, Query Language 415, Canned Query Templates 420, Visual Query Interface 425, or Query Building Wizard 430. The resulting query specifies at least one of a number of parameters, including but not limited to actors, time, topic, related events, communication type, specific documents and work-flow processes. Additionally, the system allows the user to submit queries in natural language format.
[0058] The results may comprise singleton documents 425, discussions 305, actors 310, statistics 440 and topics 315. Results are displayed in one or more of the formats appropriate to the results content and shown in Figure 4. Thus, singleton documents are displayed in tabular list view. Discussions are displayed as a participant graph, overview graph, transcript view, question and answer list, matrix view, cluster view, or tabular list view. Actors are displayed in an activity graph, participant graph, actor profile, matrix view, tabular list view or cluster view. Statistics are displayed as an activity graph, as a profile view (for example, actor profile view or data set profile view), or as a Venn diagram. Topics are displayed as an activity graph, Venn diagram, overview graph, matrix view or tabular list view. These views are discussed in more detail below with respect to other Figures.
[0059] Views:
• Participant Graphs: graphs that connect the actions of a certain set of participants as related to one or more discussions
• Activity Graphs: comparative or individual graphs that indicate the historical communication or collaboration activity over time among various actors.
• Overview Graphs: diagrams that contain data on one or more discussions, documents, topic discussion, or other aggregate behavior.
• Document Trail Graphs: diagrams that display data tracing the lifecycle of a document or group of documents, including but not limited to such events as document revisions, check-ins and transmissions.
• Money Trail Graphs: diagrams that chart the flow of money, based on information gleaned from a discussion.
• Transcript View Variations: any primarily text-oriented view that lays out a sequence of events and/or communications
• Object Lifecycle Views: views that are focused on the electronic data objects, rather than on the actors.
• Animation: description of different ways that interactive or animated aids or trial art could be generated from any of the above.
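The correspondence between result types and display formats stated in paragraph [0058] can be captured as a simple lookup table. The following is a sketch only; the dictionary name, the key strings, and the helper function are hypothetical and chosen for illustration.

```python
# Result type -> views in which it may be displayed, transcribed from paragraph [0058].
VIEWS_BY_RESULT_TYPE = {
    "singleton document": ["tabular list view"],
    "discussion": ["participant graph", "overview graph", "transcript view",
                   "question and answer list", "matrix view", "cluster view",
                   "tabular list view"],
    "actor": ["activity graph", "participant graph", "actor profile",
              "matrix view", "tabular list view", "cluster view"],
    "statistic": ["activity graph", "profile view", "Venn diagram"],
    "topic": ["activity graph", "Venn diagram", "overview graph",
              "matrix view", "tabular list view"],
}


def result_types_shown_in(view):
    """Return the result types that can be displayed in the given view, sorted by name."""
    return sorted(t for t, views in VIEWS_BY_RESULT_TYPE.items() if view in views)
```

A reverse lookup of this kind could support navigation between views, since it identifies which entities a given view is able to present.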
[0060] Related Materials include:
• Querying Tools: any view that can serve the purpose of generating a query, including some of the above
• Case Management Application
• Query Language
• Mobile, Voice & Related Applications
Participant Graphs
[0061] Participant graphs shown in Figures 5-8 represent the set of communication items which belong to a particular discussion, or in some embodiments, multiple discussions.
[0062] Figure 5 is a screen shot showing one embodiment of the participant graph for a fragment of a discussion, showing the actors involved and the various communications that took place between the actors during the discussion fragment. Each actor is denoted by a unique icon 545, which in this example is a photograph or some other graphical representation of the actor. In other embodiments, a textual representation of the actor (for example, the actor's name) could be used. Communications are denoted by connections 540 between actors. In this example, three communication types are shown: documents, email and instant messages, each of which is denoted by a unique color code, pattern, icon, or other distinguishing mechanism. A legend 550 at the top-right of the screen shot indicates the meaning of each color, and of each of four icons that are used to label the connections. These icons, when clicked on, allow the user to view communication content, view the communication type, receive more information about the communication (for example, the exact time at which it was created), and obtain help. A timeline 505 allows the user to see the date and time at which each transaction in the discussion took place. By interacting with a content icon 510, the user can see the content of any document and the time when the transaction took place. A type icon 515 allows the user to see information about the transaction type and/or document type. A 'more info' icon 520 allows the user to see basic information about the transaction. A clock icon 525 allows the user to see the precise time at which the transaction took place. The system may further display a popup 530, which shows a chronological list of the transactions in which the current actor participated within the current discussion. For one embodiment, the popup 530 is displayed when the user clicks on an actor's icon 545.
In one embodiment, the personality (or personalities) of a given actor that participated in a discussion can be displayed.
[0063] Figure 6 shows a screen shot of a participant graph similar to that shown in Figure 5. Additionally, it uses background color 610 and a series of time-of-day icons 615 at the top of the screenshot to denote the time of day at which the communication was created. In Figure 6, the user has positioned the mouse cursor close to the 'more info' icon 520, thereby causing a popup window 605 to be displayed containing basic information about the transaction. In one implementation, a panning widget 630 allows the user to navigate forwards and backwards within the discussion using the time-of-day bar 620. In one implementation, a drop-down list box 625 allows the user to switch between different time zones, thereby adjusting the alignment of the discussion with the time-of-day icons 615.
[0064] Figure 8 shows a screen shot of a participant graph similar to those shown in Figures 5 and 6. Additionally, it shows a toolbar 810 at the top of the screen that allows the user to select between different discussion views: activity, participant (shown here), and transcript. A second toolbar 815 provides buttons to allow the user to carry out the following actions: to zoom in on a particular part of the discussion, thereby showing the elements of said discussion in greater detail; to pan between different sections of the discussion; to filter the discussion on criteria that may include (but are not limited to): actor, communication type and time; to adjust the view of the discussion based on time span; to print the discussion or the contents of the graphical view; or to define new events to add to the view. In this screen shot, the user has hovered the mouse over the link 540 between two actors, thereby causing a popup 605 to be displayed. The popup 605 contains further details about the communication over whose link the user is hovering the mouse. A user interface navigation mechanism 830 at the bottom of the page allows the user to control which section of the discussion is displayed on screen. A pair of dropdown list boxes 825 allows the user to control the discussion display through the use of filters. An icon 820 and vertical dotted line 835 indicate the occurrence of a significant event (in this example, a board meeting) during the period displayed.
[0065] Participant graphs show the images 545 of the actors participating in a discussion, and the links 540 between the transactions in which they participated. Participant graphs may display a timeline 505 to show when user activity occurred, and may also display a background 610 in varying shades in order to represent daytime and nighttime activity. Participant graphs can optionally be limited to a partition of a discussion, or to only include certain specific actors of interest, either individually or by some property such as organization. They may also be limited to displaying only those actors who played an active rather than passive role in the items in question, where "active" is defined as initiating one or more of the items. In one embodiment of the invention, the user may set a threshold value for how active the actor had to have been in the discussion in order to be displayed, based on measures including, but not limited to, the number of items in the discussion initiated by that actor, the importance of these items, and whether any were "pivotal" as described in 'An Apparatus for Sociological Data Mining'. For one embodiment, if an actor has been filtered out, but was responsible for initiating a transaction, a small icon containing "..." is displayed in lieu of the regular actor icon. Clicking on this icon expands that instance to the form of the regular icon for that actor. Alternatively, the actor may be identified in other ways including, but not limited to, a smaller icon, or a browned out icon.
[0066] In this view, each item is depicted as a line connecting two or more actors. The color of the line indicates the type of item. Choices include, but are not limited to:
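The activity threshold described above can be sketched as follows. This is a minimal illustration that uses only one of the named measures (the number of items initiated by each actor); item importance and "pivotal" status are omitted. The function name, the input shape, and the three returned labels are hypothetical.

```python
from collections import Counter


def visible_actor_icons(items, threshold):
    """Map each actor to "icon", "placeholder" or "hidden".

    `items` is a list of (initiator, recipients) tuples, one per discussion item.
    An actor below the threshold who still initiated something gets the "..."
    placeholder icon; an actor who initiated nothing is hidden.
    """
    initiated = Counter(initiator for initiator, _ in items)
    everyone = set(initiated)
    for _, recipients in items:
        everyone.update(recipients)
    result = {}
    for actor in everyone:
        count = initiated[actor]
        if count >= threshold:
            result[actor] = "icon"
        elif count > 0:
            result[actor] = "placeholder"  # rendered as the small "..." icon
        else:
            result[actor] = "hidden"
    return result
```

For example, with a threshold of 2, an actor who initiated a single item would be shown as a placeholder, while a purely passive recipient would not be drawn.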
• Email
• Instant Message
• Sending a document (as an attachment in email)
• Phone call (one version with transcript, one without)
• Voicemail (presuming that it had been processed by a speech to text indexer)
• Wire or other funds transfer
• Fax
• Sending/Receipt of FedEx or other electronically trackable package
[0067] Actors 545 may be individuals, or they may be aggregate actors. Examples include an organization, the executive team, or de facto actor group such as a "circle of trust" as defined in 'An Apparatus for Sociological Data Mining'. A group mail alias would also be considered an aggregate actor. In some cases, an actor might be a system or automated process, for example, a daemon that sends out a particular status message. Actors may be represented by actual photographs 3810 of themselves when available. Alternately, the user may choose a graphic representation for each actor by choosing from a canned library of images, or adding any electronic image that they wish. Once selected, the same image is used to represent this actor visually throughout the entire system.
[0068] If an actor has more than one distinct personality (as defined in 'An Apparatus for Sociological Data Mining' patent), in some embodiments of the invention, the user has the option to use a different image or graphic for each such personality. If the user opts not to do this, where multiple personalities exist the system will use the one graphic provided to represent all personalities, but will tint the image with different colors in order to allow the various distinct personalities to be readily identified. The user may globally specify the color scheme to be used in such cases; for example, the primary personality will always be tinted blue.
[0069] The graph is represented as a timeline of events; the resolution can be increased or decreased using zoom in and out controls. In one embodiment, daytime and nighttime hours are indicated by a change in background color; as shown in Figure 6. In some embodiments, icon markers 615 indicating time of day may also be used; as shown in Figure 6. Icons may optionally be displayed that indicate the document type of the transaction in those cases where it is appropriate, for example, to indicate that the document being sent was an Excel spreadsheet rather than a Microsoft Word document. In the event that there are multiple documents attached, each appropriate document type icon is displayed. In another embodiment, a multiple document type icon is displayed, which depicts a stack of overlapping rectangles. In one embodiment, the system provides a different visualization for documents which were attached as opposed to incorporated by reference with a URL or something similar. Rolling the mouse over or near any of the transaction lines will bring up a pop-up 605 with basic information about the transaction (Figure 6). The exact types of information vary by transaction type, but include, as appropriate, the following:
• Originating timestamp and timezone
• Originating geographic location
• Wire transfer amount
• Length of phone call or voicemail message
• Subject or title
• Sensitivity level
• Urgency or priority
• Ending timestamp and timezone
• Return of read receipt timestamp
[0070] Alternately, the user may click on the small icon to get only the timestamp details. In one embodiment, right-clicking on this icon provides an immediate chronology of events just before and after the item in question with timestamp information. This is to help clarify which event preceded which, in those cases where the events were almost contemporaneous. The "content" icon can be used to pull up the content of the document involved in the transaction. In one embodiment, there is also optionally a "More info" icon that can be configured to display other types of data that are appropriate. Examples of such data include, but are not limited to: prior user annotations about the transaction, its retrieval history, or the relation of that transaction to a known workflow pattern.
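The "immediate chronology" of events just before and after an item can be sketched as a window over the time-ordered event list. The function name, the tuple layout, and the default window size are assumptions made for illustration; the specification does not state how many neighboring events are shown.

```python
from datetime import datetime


def nearby_chronology(events: list[tuple[datetime, str]], index: int, window: int = 2):
    """Return up to `window` events before and after events[index], ordered by time.

    `events` is a list of (timestamp, label) tuples in any order.
    """
    target = events[index]
    ordered = sorted(events, key=lambda e: e[0])
    pos = ordered.index(target)  # position of the selected item in time order
    return ordered[max(0, pos - window): pos + window + 1]
```

Sorting by the full timestamp, including seconds, is what allows nearly contemporaneous events to be placed in the correct order.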
[0071] In one embodiment, actors are shown partially grayed out if their presence in the transaction was likely, but could not be verified solely on the basis of the electronic evidence. One example of this is the case of a meeting entry pulled from an online group calendar which asserts that Joe, Jack, and John will be present at the meeting. Without other supporting evidence, such as meeting minutes indicating that all parties were present, it cannot be definitively asserted that all three men attended the meeting.
[0072] Mousing over an actor icon will bring up a pop-up with the basic information available on that actor. This includes, but is not limited to, the following:
• Full name
• Title
• Organization
• Primary electronic identity
• Electronic identity conducting the transactions whose lines connect to this icon (if different than the primary)
[0073] Clicking on an actor icon brings up a panel with a chronological list 530 (shown in Figure 5) of the transactions this actor participated in within the discussion(s) being visualized.
[0074] In one embodiment, the user interface allows the user to add items that were not originally part of the discussion being visualized. This is done through filling out a form panel, shown in Figure 7, in the user interface, specifying all of the information that would have been associated with an actual item.
[0075] Figure 7 is a screen shot of a form panel for adding items that were not originally part of the discussion being visualized. The panel displays the discussion title 705, start date 710 and end date 715, and actors 720 involved. A text box 725 allows the user to enter a label for the item to be added. In one embodiment, this text box 725 is replaced with a dropdown listbox, combo box, or other user interface tool for adding an item from a preconfigured or dynamically generated list. A series of option buttons 730 allow the user to specify the type of item to be added. After an item is added, it would be shown on the participant graph. For one embodiment, items added by a user are flagged in the participant graph, to indicate their nature. For another embodiment, the information that an item has been added can be obtained using the 'info' icon 520.
[0076] In one embodiment, the view is implemented as a canvas, so the user may drag and drop shapes, lines, and text on it as they please. In one embodiment, such additions are checked for semantic correctness. For one embodiment, added events are indicated by color, patterns, icon, or some other indicator.
[0077] Events of interest are depicted as icons above or below the canvas from which vertical lines extend, cutting across the canvas at the appropriate point in the X axis. These events fall into one or more of the following categories:
• An event belonging to the discussion, but which is not directly a transaction among its actors. For example, a milestone in a workflow process.
• An event extracted from one of the online calendars of the primary actors in the discussion.
• An event entered manually in the UI by the user
[0078] A canned library of icons to represent common concepts like "meeting" may be provided in the UI; the user may elect to add and use their own images as well. The user may also add descriptive text about the event. This text would appear when the user clicks on the icon representing that event.
[0079] In one embodiment of the invention, numerous animation utilities are provided in order to make the visualizations more vivid. Animation can help accentuate the involvement of certain actors of special interest; it can also highlight the accelerating or decelerating pace of the transactions. Types of animation provided in one embodiment of the invention are as follows:
• Rendering the transaction lines and actor icons individually, in the order and timing in which they occurred, according to a condensed timeline appropriate for viewing in generally less than one minute. This emphasizes the lag time (or lack thereof) between contiguous transactions.
• Similarly, but partially graying out, via compositing or other techniques, all transaction lines rather than not rendering them until their appropriate place in the timeline.
[0080] The layout algorithm for the view can be implemented with a number of commonly available graphing libraries. In one embodiment of the invention, a limit of 8 line connections per actor icon is imposed for readability purposes. For one embodiment, should additional connections be necessary in order to represent the underlying data, a second actor icon will be drawn to accommodate the additional lines. Note that while the graph generally follows a left to right timeline, a reply to an email message or IM will show a line going backwards to indicate that the transaction is a reply to a previous transaction, and that these two transactions should be considered part of a single nested transaction.
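The limit of 8 line connections per actor icon can be sketched as a simple partition of an actor's connections. The text describes drawing a second icon; extending the same rule to a third or further icon, as done here, is an assumption. All names are hypothetical.

```python
MAX_LINES_PER_ICON = 8  # limit stated in the text for one embodiment


def icons_needed(connection_count, limit=MAX_LINES_PER_ICON):
    """Number of icons to draw for one actor so that no icon exceeds `limit` lines."""
    if connection_count <= 0:
        return 1  # an unconnected actor still gets one icon
    return -(-connection_count // limit)  # ceiling division


def assign_lines_to_icons(connections, limit=MAX_LINES_PER_ICON):
    """Split an actor's connections into per-icon groups of at most `limit` each."""
    groups = [connections[i:i + limit] for i in range(0, len(connections), limit)]
    return groups or [[]]
```

An actor with nine connections would therefore be drawn twice, with eight lines on the first icon and one on the second.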
[0081] However, from an adherence to the timeline perspective, the placement of the two (or more) actor icons involved will be approximately at the start and end time of the nested transaction. If needed, additional actor icons will be rendered to ensure this placement. Since the purpose of the visualization is to provide an overview of the related transactions in a discussion, exact centering of the actor icons around the relevant line in the X axis is not considered essential. Exact event chronology information can be had from the ancillary panels that are only a single click away. In one embodiment of the invention, transaction lines are represented with directional arrows. In one of these embodiments, a "reply to" can be indicated with a line that has arrows on both ends; if there were N replies, the number N would be rendered near the appropriate arrow.
[0082] Finally, in one embodiment of the invention, the participant graph view can be used modally as a visual querying interface. The user may select a transaction by selecting its objects with a marquee tool, and generate a Query by Example (QBE) query. One example of QBE queries that may be used with this system is defined in 'An Apparatus for Sociological Data Mining'. The user may also select individual actor icons in order to see all transactions involving all of these actors.
[0083] Other accompanying UI widgets and tools include, but are not limited to, the following. A panning widget 620, shown in Figure 6. This widget 620 utilizes a thumbnail image of the full discussion transcript view, shrunk to whatever length necessary to fit in the visible view. The participant graph automatically scrolls to the position indicated by the panning widget 620, making it especially useful for viewing discussions of long duration. Daytime and night-time hours are indicated in the thumbnail, allowing the user to easily detect, for example, anomalously high amounts of communications after standard or usual working hours. In one embodiment, nighttime starts at 5:00PM in the primary time zone, or some other pre-configured time, and any communications or events after that time are distinguished, for example by being colored darkish gray. In another embodiment, a gradient fill is used to indicate rough time of day, as shown in Figure 6. In one embodiment, communication and events occurring during weekends or holidays are coded, for example by being colored pink. For one embodiment, the time zone defaults to the one in which the greatest amount of transactions occurred; times from other time zones will be normalized to this time zone. In one embodiment, there is a control 625 above the panning widget allowing the user to change the default time zone used by the panning widget 620.
In another embodiment, parallel instances of the thumbnail will be drawn for each time zone from which transactions originated. One panning widget extends across all of the thumbnails. In a different embodiment, the transcript view elements being thumbnailed are color-coded according to initiating actor rather than time of day. In yet another embodiment, these items are color coded by topic.
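The time-of-day coding and default time zone selection described for the panning widget can be sketched as follows. The 5:00 PM night start is the example value from the text; the hour at which night ends is an assumption, since the text does not give one, and holidays are omitted. Function and constant names are hypothetical.

```python
from collections import Counter
from datetime import datetime

NIGHT_START_HOUR = 17  # 5:00 PM, the example value given in the text
DAY_START_HOUR = 8     # assumption: the text does not say when nighttime ends


def default_time_zone(zones):
    """Pick the zone in which the greatest number of transactions occurred."""
    return Counter(zones).most_common(1)[0][0]


def shade_for(local_time: datetime) -> str:
    """Return the thumbnail coding for a timestamp already normalized to the default zone."""
    if local_time.weekday() >= 5:  # Saturday or Sunday
        return "pink"
    if local_time.hour >= NIGHT_START_HOUR or local_time.hour < DAY_START_HOUR:
        return "dark gray"
    return "plain"
```

Timestamps from other zones would be converted to the default zone before `shade_for` is called, so that all items share one day/night scale.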
"Rainbow" View
[0084] To visualize really large volumes of discussions, or individual messages, a different approach to the visualization is necessary. Figure 11 is a screen shot showing one embodiment of the activity graph for a discussion. The user has selected this view of the discussion using the tool bar 810. This view shows the level of activity over time in two ways: as a line graph 1120, and as a diagram 1125 in which levels of communication activity are denoted by colors of the rainbow. In this embodiment, a legend 1130 explains the meaning of the colors. An icon 820 and vertical dotted line 835 indicate the occurrence of a significant event (in this case, a board meeting). A toolbar 815 and navigation mechanism 830 as shown in Figure 8, are also shown. A slider 1115 allows the user to create a different viewable span on the canvas.
[0085] The rainbow view uses the color spectrum (or, alternatively, a pattern or similar distinguishing mechanism) to help users discern small shifts in the communication activity of a very large population of actors. Specifically, this view is used to pinpoint the amount of communication on specific topics. It is accompanied by a graph below it, which allows numerical values to be assigned to the colors used in the spectrum view. Maximum density is determined by historical values over the same time span and same or comparable population.
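The mapping from communication density to a spectrum color can be sketched as a normalization against the historical maximum followed by a hue lookup. Which end of the spectrum denotes high activity is not stated in the text; the choice of violet for low and red for high is an assumption, as are the function and parameter names.

```python
import colorsys


def rainbow_color(activity, max_density):
    """Map an activity level to an (r, g, b) tuple, violet (low) to red (high).

    `max_density` stands for the historical maximum for a comparable
    time span and population.
    """
    if max_density <= 0:
        raise ValueError("max_density must be positive")
    level = min(max(activity / max_density, 0.0), 1.0)  # clamp to [0, 1]
    hue = 0.75 * (1.0 - level)  # 0.75 is violet, 0.0 is red on the HSV wheel
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return round(r * 255), round(g * 255), round(b * 255)
```

Because the scale is continuous, small changes in activity produce small but visible changes in hue, which is the property the rainbow view relies on.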
Activity Graphs
[0086] Activity graphs are used to illustrate the amount of communication among a small set of actors over a user-specified period of time. They may optionally be further restricted by topic. Actor sets may be specified in any of the following ways:
• Manual specification of particular actors through the user interface.
• Manual specification of one or more actors, with the checkbox enabled to include the "circle of trust."
• Manual specification of one or more aggregate actors which may then be expanded in the view.
[0087] Figure 12 is a screen shot showing one embodiment of the activity graph for a discussion. Lines 1220 linking actor images 545 are terminated with boxes 1215 showing the number of communications that took place between the actors. In one embodiment, each actor is represented by both an image or other icon 545 and a text item 1205 containing the name of the actor. A legend 1225 shows the mapping between colors and levels of communication activity. For each pair of actors, where actor A has sent more communications to B than B has sent to A, the connecting line 1220 has two colors, and the portion of the line adjoining each actor represents the number of communications sent by that actor to the other actor of the pair. Where each of the two actors has sent a comparable number of communications to the other, the line 1220 connecting the two actors has a single color throughout its length. A number 1215 at each end of each connecting line shows the exact number of communications that the actor at that end of the line 1220 has sent to the actor at the other end of the line 1220. The user can invoke a communication profile popup window 1210. In one embodiment the popup window 1210 is invoked by double-clicking on the line 1220 connecting actor images 545. The popup window 1210 provides additional data about the communications, including average communications length and depth, and document types exchanged. For one embodiment, any anomalies noted by the system are also flagged.
[0088] Referring to Figure 12, each individual or aggregate actor is represented by an image provided or selected by the user. There is at most one line 1220 connecting any two actors in the activity graph. For one embodiment, a single line is used to indicate all communication between actors, in both directions. The direction of the arrows at the ends of the line indicates which way the communication is flowing. The number in the box 1215 embedded in the arrow indicates the number of communications to the other actor. For individual actors, these are the communications to that actor specifically, as opposed to communications sent to various distribution lists. If an aggregate actor is included in the display, all such aggregate communications are included, since such aggregate actors often correspond to distribution lists. Note that for purposes of readability, only communication between pairs of actors is shown. In order to show communication between tuples of actors, aggregate actors may be created.
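The single line per actor pair, with a count at each end, can be sketched as an aggregation of directed messages. The text does not define when two counts are "comparable"; the relative `tolerance` used here is an assumption, and all names are hypothetical.

```python
from collections import Counter


def pair_edges(messages, tolerance=0.2):
    """Build one edge per unordered actor pair from (sender, recipient) tuples.

    Each edge records the count sent in each direction and whether the line
    would be drawn in one color (comparable counts) or two.
    """
    sent = Counter(messages)
    pairs = {tuple(sorted(p)) for p in sent if p[0] != p[1]}  # ignore self-messages
    edges = {}
    for a, b in pairs:
        a_to_b, b_to_a = sent[(a, b)], sent[(b, a)]
        larger = max(a_to_b, b_to_a)
        comparable = (larger - min(a_to_b, b_to_a)) <= tolerance * larger
        edges[(a, b)] = {
            "counts": {a: a_to_b, b: b_to_a},  # number shown in box 1215 at each end
            "colors": 1 if comparable else 2,
        }
    return edges
```

Keying the result by the sorted pair guarantees at most one line between any two actors, regardless of the direction in which messages flowed.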
[0089] The coloring of the lines is used to indicate one of the following, depending on how the user has configured the user interface:
• Whether the amount of communication to this actor relative to other individual actors during the same period of time is unusually high or low.
• Whether the amount of communication to this actor is high or low relative to what has historically been the case (presuming that comparison data exists.)
• Whether the amount of communication to this actor as a fraction of total communication to other individual actors is high or low compared to what has been true historically.
• Whether the amount of communication is high or low relative to a particular workflow process, or informally, among teams of similar size working on similar projects, either contemporaneously, historically, or both.
[0090] In one embodiment, the color or pattern of the line indicates the frequency of communication, while the thickness of the line indicates the volume of communication. In another embodiment, the thickness of the line indicates the frequency of communication, while the color or pattern of the line indicates the volume of communication.
[0091] The number of communications can be based on any or all of the following, depending on how the user has configured the user interface:
• Email
• Instant Messages (IM)
• Phone calls
[0092] If for some reason, the user has specified an actor who is totally unconnected to the other actors in the display, the icon for that actor will have no lines attached to it.
[0093] The activity graph can be superimposed on an org chart in order to highlight communication flows that appear to differ from the org chart. In this event, actor titles are listed, and additional lines to indicate reporting relationships may be rendered. It can also be used as a visual querying tool; the user may select two or more actors in order to see all of the discussions, or individual communications between them. The user may also click on the line connecting any 2 actors in order to bring up a panel 1210, shown in Figure 12, containing the communication profile of these actors. Which information to display is user-configurable, but would typically include the following:
• Average depth of communication
• Average interval between successive communications, optionally calculated bi-directionally
• Breakdown of communications by time (for example if the graph spans the period of one year, the communications would be broken down by the month)
• Document types exchanged
• Average length of communication
• Change from immediately previous observation period of same length
• Anomalies
• Ontologies which trap it
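Two of the profile entries listed above, the average interval between successive communications and the breakdown by month, can be computed directly from message timestamps. The sketch below assumes the timestamps for one pair of actors are already available as `datetime` values; the function name and the returned dictionary layout are illustrative only.

```python
from collections import Counter
from datetime import datetime, timedelta


def communication_profile(timestamps: list[datetime]) -> dict:
    """Compute two illustrative profile entries for a pair of actors."""
    ordered = sorted(timestamps)
    # Gaps between each communication and the one that follows it.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_interval = sum(gaps, timedelta()) / len(gaps) if gaps else None
    # Monthly breakdown, keyed by "YYYY-MM".
    by_month = Counter(t.strftime("%Y-%m") for t in ordered)
    return {"average_interval": average_interval, "by_month": dict(by_month)}
```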
Overview Graphs
[0094] Figure 13 is a screen shot showing one embodiment of one view of a discussion timeline. Sets of adjoining rectangles 1305, linked by lines 1310 and color-coded by actor (as shown in legend 1315), are used to represent the communications within a discussion (so that each discussion appears as a set of adjoining rectangles 1305). The x-axis of the screen represents the timeline, and the sets of rectangles are arranged one above the other on the y-axis as in a Gantt chart. Above each discussion 1305 appears that discussion's title 1320. The lines 1310 show related discussions, which are generally either precursors to, or offspring of, the current discussion.
[0095] The purpose of the overview graph, shown in Figure 13, is to show a set of discussions which occurred approximately contemporaneously. These graphs are one of the possible types of output from a query. In one embodiment, each discussion appears as a rectangle 1305 of the length appropriate relative to its duration in the timeline. The title 1320 of the discussion appears directly above the rectangle; in some embodiments, this is followed by the number of items in the discussion. The rectangles are thumbnails of the content part of the transcript view of the discussion, scaled down to the necessary size and rotated 90 degrees to the left. Specifically, each item within the discussion is coded according to one of the following, depending on the user's preference:
• Initiating actor
• Topic
• Document or communication or event type
[0096] The graphic resulting from this is then scaled to the appropriate dimensions and then placed on the chart. Note that an arbitrary number of discussions may be so rendered on this graph; the view simply becomes longer along the Y axis.
[0097] In addition, the user may configure the user interface to color code all communications originated or received by a particular actor of interest. In one embodiment, numerous parallel thumbnails may be created in a dedicated view in order to help the user observe the overlap between different actors of interest.
[0098] As there may be significant time lags between events in some discussions, in some embodiments, a bounding box is used to help indicate that all of the items in question are members of the same discussion. Connecting lines between discussions are used to depict forks in discussions. Similarly to the participant graph, events and other objects may be added to the graph. Zoom controls allow the resolution to be changed; the different visual representations of days, nights, and weekends/holidays may also be used here.
[0099] Figure 14 is a screen shot showing one embodiment of another view of the discussion timeline. In this view, four discussions 1405 are displayed, and the level of activity within each discussion is represented by vertical lines 1415 of various thicknesses, where a thicker line denotes a greater level of communication activity. A panning widget 1410 over one portion of a discussion magnifies the vertical lines in the portion of the display under the widget 1410. The user can move the panning widget 1410 by mouse manipulation. In one embodiment, when the user does so, a hand icon 1420 appears on the panning widget 1410. An icon 820 and vertical dotted line 835 indicate when a significant event occurred.
[00100] In another embodiment, the discussion names appear to the left of the view, and one discussion occupies all of the real estate in that range of the Y axis.
[00101] For viewing smaller numbers of discussions, Figure 15 depicts a timeline of the individual events in a discussion. Figure 15 is a screen shot showing one embodiment of another view of the discussion timeline. In this view, detailed information about each individual communication event 1505 is arranged along a discussion timeline 505. Communication events 1505 are depicted as blocks on the chart (in one embodiment, different types 1530 of events are depicted using distinctively colored or patterned backgrounds). Each block depicting an event 1505 contains header information 1520 related to the corresponding communication, including but not limited to: the sender or creator of the communication; the person who last modified the communication; the date of the communication; the subject of the communication; and any associated attachments or linked documents. In one embodiment, the user can click on an area 1510 of each block in order to display the content of the communication. Color-coded lines 1525 linking each event denote the primary type of evidence used by the system to incorporate that particular item into the discussion. A zooming tool 1535 at the top right of the screen allows the user to zoom in (to show fewer communications in more detail) or out (to show more communications in less detail). In one embodiment, the background area 1515 of the chart is color-coded or coded with a distinctive pattern to represent daytime and nighttime.
[00102] Figure 15 provides an overview of the constituent parts of a discussion and the connections between them. Communication events are depicted as sets of interconnected blocks 1505. The blocks 1505 may be color-coded as elsewhere described; actor icons may be optionally included in the block. The different colored lines 1525 reflect the primary type of evidence used by the system to incorporate that particular item into the discussion. Evidence types include but are not limited to, the following: similarity of participants, "reply to", lexical similarity, pragmatic tag, same attachment, and workflow process. These terms are explained in 'An Apparatus for Sociological Data Mining'.
[00103] Another variation of this view uses clustering to group whole discussions together, connecting different clusters by the appropriately colored lines, as shown in Figure 16. Figure 16 is a screen shot showing one embodiment of a discussion cluster view. In this view, the total number of discussions meeting certain user-specified criteria is reflected in the size of the shape (in this embodiment, a circle) representing the cluster. In one embodiment, the shape that currently has the focus (is selected by the user) is displayed in a distinctive color 1635, with a distinctive pattern, or is shown enlarged, thereby distinguishing it from circles 1610 that do not have the focus. Links 1615 between clusters are color-coded according to whether the clusters share: commonality of actors, commonality of topics, or commonality of another type. Commonality of actors occurs when two clusters, distinctive from each other by virtue of meeting different clustering criteria, nevertheless share the same set of actors. Where this is the case, a distinctive color is used to trace the link between the two clusters in question. Icons allow the user to see more information 515, the date and time 520 of the communication, and to view 525 the underlying document discussion. A separate, smaller, window 1630 allows the user to navigate within discussion space by moving a panning tool 1620. In one embodiment, when the user activates the panning tool 1620, a hand icon 1625 is displayed.
[00104] In this view, shapes 1610 are used to represent groups of discussions. The shapes 1610 are labeled with the number of discussions contained in that group, and a description of the group. In one embodiment, a smaller window 1630 shows a map of the entire discussion space, or a relatively large part thereof, and contains a smaller frame 1620 to represent the area of discussion space under analysis. Since this view is independent of the information content, it is suitable for use even when the information has been strongly encrypted, and thus is not accessible for analysis.
Document Trail Graphs
[00105] Document trail graphs depict the life cycle of one particular document. Figure 9 is a screen shot showing one embodiment of the document trail graph for a discussion. Each cluster of items on the graph consists of one actor icon 905 and at least one document icon 935. The actor's actions with regard to the document (such as creation, modification, check-in, etc) are represented by displaying a document icon 935 in an appropriate color or pattern, according to a legend 930. The x-axis of the graph represents the time line, with dates shown along a timeline display 505 at the bottom of the graph, and lifecycle increments 910 displayed along the top. In one embodiment, at each stage of the document trail, the length of the document in pages is indicated by a number 925 inside the document icon. Links 915 between versions of the document are color-coded according to function. In one embodiment, hovering the mouse over the 'more info' icon 520 invokes a popup 920 summarizing data related to the document in question.
[00106] A timeline 505 allows the user to see the date and time at which a particular event 935 in the document's life occurred. An actor icon 905 denotes the actor responsible for said event 935. Events 935 are depicted as clusters of activity comprising document icons 925 and an actor icon 905. Links 915 between the various versions of the document that comprise a single event are color-coded according to function. Document revision numbers 910 (for example, but not limited to, source control system revision numbers, or revision numbers assigned by the present invention) are displayed along the x-axis of the graph. Document icons 925 are color-coded according to the type of user activity that triggered the event. Examples of said user activity include, but are not limited to, document creation, modification, revision, deletion, check-in, check-out, distribution, viewing, third-party transfer and content transfer. In one embodiment, a legend 930 explaining the color-coding is superimposed on the graph.
[00107] Document trail graphs further show icons that allow the user to see more information 515, the date and time 520 of the communication, and to view 525 the underlying document. Hovering the mouse over (in one embodiment, clicking) the 'more info' button 515 displays a popup 920 containing a summary of information related to the event in question. In one embodiment, document icons 925 contain a count of the number of pages (or other size metric) contained within the document at the time of the event 935 in question.
Money Trail Graphs
[00108] The purpose of the money trail graph, shown in Figure 10, is to chart the movement of money using data available within a discussion. This visualization displays information related to money transfers that have been extracted from a discussion. The data is displayed along a timeline 505. Each extracted data point in the money trail includes a financial institution 1010 or money manager, at least one actor 545 party to the transaction, and a sum of money 1005, if that data is available. Links 540 connecting the elements of a financial transaction are color-coded according to transaction type following a color code specified in a legend 1025. Hovering the mouse over the 'more info' icon 520 beside a link 540 invokes a popup 1015 summarizing data related to the financial transaction. An account icon 1020 allows the user to see which financial accounts are involved in the transaction.
[00109] Any transactions within a discussion that relate to money transfers, whether they are merely documents discussing the transfer, or documents that in themselves constitute the instruments of transfer, are used to build a money trail graph. The graph displays actors 545 (whether individuals, groups, or organizations) and the financial institutions 1010 who are involved with the transfer. Color-coded links 540 between actors denote the type of transaction, and are explained in one embodiment in a legend 1025.
Transcript View Variations
[00110] The basic transcript view, shown in Figures 18 to 25, is a linear presentation of the causally related communication events that make up a discussion. Communications 1830 are displayed in chronological order, and relevant metadata is displayed at the top of each communication. The metadata includes, but is not limited to: date and time created, saved or sent; subject; recipient list; and time (in one embodiment, time is denoted by a clock icon 1815.) Actor names 1820 are color-coded. A header area 1805 provides information related to the discussion, including (but not limited to) discussion title, message count, list of participants, date range and total number of attached documents (in one embodiment, the total number including duplicates; in another embodiment, the total number of distinct attached documents). In one embodiment, an actor image 545 is associated with each communication, to denote the actor who created or changed the document. Clickable links 1810 contain the names of any attachments, and open the corresponding attachment when clicked. A display tool 1825 at the top-right of the screen allows the user to show or hide message headers, quoted text within each message, or message content. Communications may further provide document-type coding: for example, by pattern or color coding.
[00111] A sequence of documents 1830 (or other communication events, such as instant messages 2525) is displayed beneath a discussion header 1805. In one embodiment of the invention, the discussion might be augmented by external events, either manually by the user through the user interface, or via an automated process defined for a specific case. In one embodiment of the invention, this view consists of a user-configurable summary portion at the top, followed by a list of the various items in the discussion. Each item has an author or creator, and optionally a set of other participants, such as those actors appearing in the cc: line of an email. As shown in Figure 18, for one embodiment, each actor 1820 is automatically color-coded by the system. Since the number of actors in any given corpus can be arbitrarily large, and there are a finite number of variations in color that the eye can readily distinguish, color coding of actors is done relative to the individual discussion. However, actors of particular interest can be assigned colors that are to be used globally. In other embodiments of the invention, colors are recycled by the system within non-intersecting sets of actors. Each item also has a title, a date, and an item type, such as: email, meeting, document modification, etc.
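The per-discussion color assignment, with globally fixed colors for actors of particular interest, can be sketched as follows. The palette values and the function name `assign_colors` are assumptions made for illustration; when a discussion has more actors than available colors, this sketch simply wraps around, which is one possible policy and not one stated in the description.

```python
from itertools import cycle

# Hypothetical palette of colors that are easy to tell apart.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]


def assign_colors(actors, global_colors=None, palette=PALETTE):
    """Map each actor in one discussion to a color.

    `global_colors` holds actors of particular interest whose color is fixed
    across all discussions; everyone else draws from what remains.
    """
    global_colors = global_colors or {}
    reserved = set(global_colors.values())
    available = [c for c in palette if c not in reserved] or list(palette)
    pool = cycle(available)  # wraps around if actors outnumber colors
    colors = {}
    for actor in actors:
        if actor in colors:
            continue  # actor already seen in this discussion
        colors[actor] = global_colors.get(actor) or next(pool)
    return colors
```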
[00112] In one embodiment of the invention, shown in Figure 19, activity associated with each actor is represented in a horizontal bar 1905 containing colored areas 1910, where the areas are color-coded by actor and spaced to represent time intervals.
[00113] In one embodiment of the invention, shown in Figure 20, discussion partitions 2005 are displayed. The partitions 2005 represent the threads that make up the discussion. In one embodiment, the partitions 2005 include the number of communications in each thread of the discussion. In this embodiment, discussions that have been partitioned (for example, because they are so large or complex) can be accessed by clicking on the title of the partition 2005.
[00114] In one embodiment of the invention, items of different types are displayed with different background colors or patterns 2110, as shown in Figure 21. In one embodiment, document type is shown via the use of an icon. In one embodiment, the time of day that a message was sent is shown by an icon 2105.
[00115] In one embodiment of the invention, as shown in Figure 22, any attachments associated with communications in the present discussion are flagged via distinctive icons 2205 in the header or in the communication body. In one embodiment of the invention, documents linked by reference to communications in the present discussion are flagged via distinctive icons 2210 in the header or in the communication body. Examples of documents linked by reference include, but are not limited to: a document whose URL is referred to in a communication; and a data file whose file name and path is referred to in a communication. In one embodiment, clicking on the icon displays the attachment.
[00116] In one embodiment, shown in Figure 23, quoted text 2320 is distinguished. In one embodiment, the background 2315 is color-coded. In another embodiment, the text 2320 itself is color-coded. In one embodiment, within each communication that contains quoted text, each distinct quote is assigned a timestamp 2310. The communication header area contains explanatory text 2305 stating how many pieces of quoted text are associated with the current communication. In one embodiment, the explanatory text 2305 is replaced by an icon.
[00117] To make it easier for the user to immediately discern the time of day that an event occurred, in one embodiment, a clock icon 1815 as shown in Figure 18 appears that is set to the time that the event occurred. In another embodiment, an icon indicating general time of day appears. For example, a document modification that occurred at night would have an icon with a partial moon against a dark backdrop with stars, while an email sent at dawn would have a rising sun. In one embodiment, in addition to color coding the actors, their picture 545 appears at the top of each item that they created, as shown in Figure 18. In cases where no actor image is available or desired, a user-selected graphic can be used in its place.
[00118] The summary portion 1805 contains the discussion timeline, participating actors, number of items, and controls which allow certain information to be viewed or hidden. In one embodiment of the invention, the discussion timeline is represented graphically (Figure 17) as a series of headers 1705 connected by color-coded lines 1710. In order to view message content, the user clicks on a command button, hyperlink or active area of the header. This includes, but is not limited to, transport and other header information in emails, quoted text from a prior email, routing information for a wire transfer, and check-in messages to document repositories. One embodiment of generating the summary or resolution is described in 'An Apparatus for Sociological Data Mining'.
[00119] Optional UI tools include controls to "fast forward" to the next item created or otherwise involving particular actors. This, like the panning widget, which is also used with this view, is especially useful for long discussions which have many participants associated with them.
[00120] In one embodiment, shown in Figure 24, items that are or are suspected to be missing from a discussion are flagged visually. A deleted item 2415 can be flagged in any or all of several ways: the background 2420 has a distinctive color or pattern, or is otherwise displayed in a distinctive way; a red flag icon 2425 is displayed on the item; a text box 2405 displays additional information including but not limited to the computed level of certainty that an item was deleted, and the computed level of suspicion associated with the deletion; a "torn document" effect 2410 graphically conveys to the user that this discussion is incomplete. For one embodiment, only suspicious deletions are flagged.
[00121] An item may have been deleted, yet leave traces behind of its prior existence. A simple example of this is the case in which message B was a reply to message A, but message A itself no longer exists other than what is to be found in the header and content information of message B. There are two subcases of interest related to this:
• The case in which a great deal of information about A - possibly all - can be reconstructed from other sources.
• The case in which only the suspected existence of A can be posited by the system, but virtually no other information is available.
[00122] These two cases differ considerably in their treatment in the user interface, since in the former case, the main consideration of interest is to inform the user that he is seeing reconstructed and/or partial information. For example, in the above example of message A and message B, the header information of A would be lost, so there would be no way of knowing who had been cc'ed on A. Thus, in a reconstructed version of A in a transcript view, the "cc:" line content would contain a colored block containing question marks, or another representation of the user's choosing. For one embodiment, the item itself has a grayed out background color, and in one embodiment, a broken zig-zag line across it.
[00123] The latter case by definition presumes that there is no content available to display. An example of this would be references in other documents to a document that there is no independent evidence of; for example, a link that no longer resolves. In that instance, the available information is displayed in the appropriate location in the template. In one embodiment, a certainty factor, as shown in box 2405, of the system's belief that the document ever actually existed may also appear.
[00124] In some situations, the question of whether the deletion (or suspected deletion) of the data was either legal in the context of a given matter, or was in compliance with some defined standard of behavior is of interest. One embodiment of a system for making this determination is described in copending application Serial No. XXXXX, filed concurrently herewith, and entitled "A METHOD AND APPARATUS TO PROCESS DATA FOR DATA MINING PURPOSES." Once the determination has been made that the deletion of an item is possibly suspect in a given instance, the system will flag the item. For one embodiment, a red flag icon 2425 is used. Missing information is noted in bold red text. The background color of the item will be set to whatever the user's preference is for displaying this kind of item, for example a background containing a tiling of question marks 2420, as shown in Figure 24.
[00125] In the case of the various graph views, suspected deletions are handled similarly:
• Items which were suspiciously deleted will have an icon.
• Items which were partially or largely reconstructed from other forensically available sources are shown with a zig-zag line across them or have a zig-zag line icon above or to the side of them.
• Items whose content could not be reconstructed at all would bear a red question mark icon.
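The three rules above amount to a small lookup from an item's state to the markers drawn on it in a graph view. In the sketch below, the state labels (`"complete"`, `"partial"`, `"none"`) and marker names are hypothetical; using a red flag for suspicious deletions in graph views is an assumption carried over from the transcript view, since the list above only says that such items "will have an icon".

```python
def graph_markers(suspicious_deletion: bool, reconstruction: str) -> list[str]:
    """Return the markers to draw on an item in a graph view.

    reconstruction is one of:
      "complete" - the item itself is present, nothing was reconstructed
      "partial"  - partially or largely rebuilt from other sources
      "none"     - content could not be reconstructed at all
    """
    markers = []
    if suspicious_deletion:
        markers.append("red_flag")  # assumed icon, see lead-in
    if reconstruction == "partial":
        markers.append("zigzag")
    elif reconstruction == "none":
        markers.append("red_question_mark")
    elif reconstruction != "complete":
        raise ValueError(f"unknown reconstruction state: {reconstruction!r}")
    return markers
```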
[00126] Figure 25 is a screen shot showing one embodiment of the transcript view of a discussion, focusing on instant messages 2525 within the discussion. Actors 2515 are color-coded, and time-stamps 2520 are shown at regular intervals. A slider 2505 at the left of the screen allows the user to navigate through the set of instant messages, as does a vertical scroll bar 2535 to the right. The slider 2505 at the left of the screen additionally shows a panning tool 2510 representing the position of the visible portion of instant message text within the larger body of text. Note that for instant messages (IMs) 2525, a simpler item form is used, where IMs 2525 are displayed in chronological order and timestamped 2520 at regular intervals. A panning tool 2505 with a slider 2510 allows the user to navigate through the IMs 2525. In one embodiment, the user can also navigate using a conventional scrollbar 2535. The same form may also be used to represent emails in a condensed format in which data about additional participants is not deemed of interest. In such cases, the view is constructed by decomposing the emails into the separate text blocks attributable to each actor, and then linearizing them by time (accounting for differences in time zone.) In another embodiment, all contiguous communication from the same actor is presented in the same item, separated by line breaks, much like the traditional form of a play dialog.
Querying Tools
[00127] In order to help facilitate the iterative querying that is so essential when the user is confronted with an arbitrarily large and unfamiliar corpus of documents, an extensive querying language is provided. For one embodiment, this language reflects the actor orientation of the document analysis engine that is described in 'An Apparatus for Sociological Data Mining'. Since it is well known that the vast majority of searches contain one or two keywords, and no operators, it is important for the query language for "discussions" to break away from this standard but ineffective paradigm. This is accomplished by using a sequential structuring of the query information. It is assumed that the majority, but not all, of queries performed with the query language will take one of the following forms, or subsets of the forms described below.
[00128] In Figure 32, the query is of the format: who 3205 (actor/actor group) knew/probably knew/saw/believed/asserted 3210 (verb relationship) what 3215 (topical or specific document instance or version) when 3220 (time, timeframe, or timeframe relative to a particular event). Optionally, the query may specify how 3225 (for example, via pager, mobile device, desktop machine) or where 3230 (if it is possible on the basis of the electronic evidence to place the person geographically at the time of the communication) for the communications as well.
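The sequential who / verb / what / when structure, with optional how and where, can be pictured as a simple record. The class below is only an illustrative sketch: the class name, field names, and the `describe` rendering are hypothetical, and the default values for `who` and `verb` are placeholders chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiscussionQuery:
    """One query in the who / verb / what / when (/ how / where) form."""
    who: str = "anyone"          # actor or actor group
    verb: str = "knew"           # knew, probably knew, saw, believed, asserted
    what: Optional[str] = None   # topic, or a specific document or version
    when: Optional[str] = None   # time, timeframe, or event-relative timeframe
    how: Optional[str] = None    # e.g. pager, mobile device, desktop machine
    where: Optional[str] = None  # geographic location, if it can be inferred

    def describe(self) -> str:
        """Render the query in its sequential order, skipping empty slots."""
        parts = [self.who, self.verb]
        if self.what:
            parts.append(self.what)
        if self.when:
            parts.append(f"when {self.when}")
        if self.how:
            parts.append(f"via {self.how}")
        if self.where:
            parts.append(f"in {self.where}")
        return " ".join(parts)
```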
[00129] In Figure 33, the who 3205 is narrowed by adding additional features. Thus, the query may include with what frequency 3305 (for example, once or repeatedly) an actor did what 3310 (for example, edit or check in a document, delete a document, or commit a single action or pattern of actions 3305, such as excluding particular other persons from meetings or discussions), to what object 3315 (actor 3205 and/or content 3215) they did this, and when 3220.
[00130] In Figure 34, the user can specify how 3310 patterns of behavior (the relationship between an object 3215 and an actor 3205 or content 3215) changed over a specified period of time 3220, or with respect to some other specific context 3405. For example, the user can query how the patterns of communication between two litigants changed after a particular material event. The user may further query whether there is any relationship of statistical significance between the occurrences of events of particular tuples of event types, and if so, what kind.
[00131] For one embodiment, the language generally requires that an actor be specified prior to any other terms. In the event that the actor is immaterial to the query, an actor of "anyone" may be specified, or may be automatically inserted by the system. Individual actors can be specified by first name and last name; if only one or the other is provided, the system will look in the recent command history for that user in an attempt to disambiguate it. If nothing suitable is found, the system will try to match the string to both actor first and last names present in the corpus. It will then present a list of appropriate choices or, if there is only one choice, echo it back to the user for confirmation. An actor's circle of trust can be specified by adding a plus sign "+" after the actor's name. In the case of an aggregate actor, the union of the actors in the different circles of trust is taken. Similarly, an actor group, such as the set of all employees of ACME Corp., could be specified. Similarly, in one embodiment, certain personalities of a given actor (or actors) can be specified.
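The actor lookup order described above (recent command history first, then first and last names in the corpus) can be sketched as follows. The data shapes are assumptions: actors are modeled as `(first, last)` tuples, history is ordered most recent first, and the function name `resolve_actor` and its return format are hypothetical.

```python
def resolve_actor(text, corpus_actors, recent_history):
    """Resolve an actor term from a query into candidate (first, last) pairs.

    A trailing "+" requests the actor's circle of trust. A single name is
    first looked up in the user's recent history, then in the corpus.
    """
    stripped = text.strip()
    circle = stripped.endswith("+")
    tokens = stripped.rstrip("+").lower().split()
    if not tokens:
        raise ValueError("an actor must be specified")

    def matches(actor):
        first, last = (part.lower() for part in actor)
        if len(tokens) >= 2:  # first and last name both given
            return (first, last) == (tokens[0], tokens[-1])
        return tokens[0] in (first, last)  # only one name given

    if len(tokens) == 1:
        from_history = [a for a in recent_history if matches(a)]
        if from_history:
            return {"source": "history", "candidates": from_history[:1],
                    "circle_of_trust": circle}
    from_corpus = [a for a in corpus_actors if matches(a)]
    return {"source": "corpus", "candidates": from_corpus,
            "circle_of_trust": circle}
```

A caller would show the `candidates` list to the user when it has several entries, and ask for confirmation when it has exactly one.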
[00132] Next, the language uses an operator. For one embodiment, if the operator is omitted, it will be interpreted to mean "knew" or "asserted". There are two main classes of operators, those involving content creation or observation, and those that do not. Operators may be active or passive in nature relative to the actor. For example, modifying a document is active, while getting promoted to a higher position is passive. Content modification operators include, but are not limited to, the following:
• Asserted: There is text attributable to a particular actor that contains the assertion in question.
• Had reason to believe: This has to do with what knowledge the actor had, on the basis of the electronic record, in the face of omissions. For example, if there were 5 versions of a document prior to it being finalized, but a particular actor was only privy to the initial 4, he might not be aware of the existence of the 5th version. So, he might reasonably believe that the 4th revision was the final one.
• Knew: The actor actively engaged in discussion about the topic(s) in question.
• Probably Knew: The actor's membership in a particular circle of trust suggests that, even absent specific electronic evidence, the actor probably was aware of a particular thing.
• Saw: The actor in question saw an instance of the content in question. That the actor saw it is established by either their responding to, or commenting on the material. Other evidence of "saw" includes, but is not limited to, any logged access of a document containing this content.
• May Have Seen: There is relevant content that the actor may have seen, but there is no direct evidence that he saw it. For example, the fact that person A sends person B an email cannot reasonably by itself be construed as person B reading this email, at all or in its entirety.
[00133] All of the above also have negations, which may be specified by the use of either "not" or a minus sign. Non-content operators include employee lifecycle events such as Hire, Departure, Transfer, Promotion, and Role Change. Other non-content events include, but are not limited to: Vacation or leave of absence or sick day, Travel event, Wire transfer send or receive, or Phone call, presuming no transcript of the phone call exists.
[00134] "When" may be specified as any of the following:
• Absolute time, using any of the standard date/time formats.
• Time of day (day, night/evening, morning, afternoon, after hours)
• Day of week (or weekday, weekend)
• Holiday or work day or vacation day or one or more specific actors "out of town" as gauged from online calendars and HR system information.
[00135] Note that all time information is implicitly actor-relative. Differences in time zones, national holidays, and even what is considered "after hours" are addressed. Therefore a "when" phrase is interpreted according to what is true for the greatest number of actors specified in the query. If a different behavior than this is desired by the user, she may explicitly bind the "when" term to either an actor or a specific location. For example:
• 1:00 PM in London
• Holiday in France
• Evening for Linda Holmes
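The default rule (use whatever holds for the greatest number of actors in the query, unless the term is explicitly bound) can be sketched with a simple majority count. Here each actor is assumed to carry a single context label such as a time zone name; the function name `when_context` and the shape of `actor_zones` are hypothetical, and ties are broken by first occurrence, which the description does not address.

```python
from collections import Counter
from typing import Optional


def when_context(actor_zones: dict[str, str], bound_to: Optional[str] = None) -> str:
    """Pick the context in which a "when" phrase should be interpreted.

    actor_zones maps each actor named in the query to a context label.
    bound_to is an explicit binding to an actor name or to a location.
    """
    if bound_to is not None:
        # An actor binding resolves to that actor's context; anything else
        # is treated as a location named directly by the user.
        return actor_zones.get(bound_to, bound_to)
    if not actor_zones:
        raise ValueError("no actors to derive a context from")
    return Counter(actor_zones.values()).most_common(1)[0][0]
```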
[00136] If "when" is not specified, it is presumed to mean:
• The lifespan of the actor specified in the query, if only one actor is specified.
• The interval of time beginning with the earliest lifespan in the actor group specified in the query, and ending with the latest lifespan (or current date/time), if an actor group is specified.
• The intersection of actor or personality lifespans as specified in the query, if communication among different actors is required by the query
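The three defaults above reduce to either a spanning interval or an intersection of actor lifespans. In this sketch a lifespan is assumed to be a `(start, end)` pair of dates, with `None` as the end for an actor who is still active; the function name and the `require_communication` flag are hypothetical.

```python
from datetime import date


def default_when(lifespans, require_communication=False, today=None):
    """Compute the implied time window when a query gives no "when".

    lifespans: list of (start, end) dates, one per actor; end may be None
    for an actor who is still active, in which case `today` is used.
    Returns (start, end), or None when an intersection is empty.
    """
    if not lifespans:
        raise ValueError("at least one actor lifespan is required")
    today = today or date.today()
    spans = [(start, end or today) for start, end in lifespans]
    if require_communication:
        # All actors must have been active: intersect the lifespans.
        start = max(s for s, _ in spans)
        end = min(e for _, e in spans)
        return (start, end) if start <= end else None
    # One actor or an actor group: earliest start to latest end.
    return (min(s for s, _ in spans), max(e for _, e in spans))
```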
[00137] The "how" may optionally be specified as either a specific device type, such as a Blackberry, or as a category of device, for example a mobile device. The "how" could also be a fax or a voicemail, or a paper letter. In one embodiment, the "how" is identified by its immediately following an unquoted "by" or "via."
[00138] The "where" may be optionally specified by entering the geographic location of the actor at the time of their participation in the particular transaction. This can be done hierarchically, if a tree of locations is provided. If there is more than one actor specified in the query, the where is modified by actor. In one embodiment, this is specified as <actor name> in <location> or <actor name> at <location>.
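Detecting the "how" term by an unquoted "by" or "via" requires that quoted phrases be skipped. The sketch below tokenizes the query so that each double-quoted phrase stays a single token, then looks for the marker. It covers only the "how" term, and it assumes the device phrase runs to the end of the query, which is a simplification not stated in the description.

```python
import re

HOW_MARKERS = {"by", "via"}


def extract_how(query: str):
    """Return the text following an unquoted 'by' or 'via', or None."""
    # Each double-quoted phrase becomes one token, so a "by" inside quotes
    # is never compared against the markers on its own.
    tokens = re.findall(r'"[^"]*"|\S+', query)
    for index, token in enumerate(tokens):
        if token.lower() in HOW_MARKERS and index + 1 < len(tokens):
            return " ".join(tokens[index + 1:])
    return None
```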
[00139] Because of the highly iterative nature of large corpus querying, any of these operators can be iterated on by either reducing or expanding their scope. As described in 'An Apparatus for Sociological Data Mining', for one embodiment, the core engine calculates the primary limiting factors in a query. The information is used to indicate to the user which terms are responsible for very substantially reducing or expanding the result set. To facilitate the appropriate use of such iteration, the system can optionally inform the user on which terms could be generalized or specialized one level further for best effect on the results set. In one embodiment, these alternate queries are run automatically on separate threads at the same time as the base query, in order to facilitate an immediate response to a user question, such as a request for "more" or "less."
Content or "What" Operators
[00140] Each of the operators below can be used in the context of retrieving discussions or individual communications, or both. These may be used to override the system defaults described previously. For one embodiment, the actual retrieval behavior of these operators is determined by the current relevance scoring mechanism in place. One example of such relevance scoring is described in 'An Apparatus for Sociological Data Mining'.
• Keyword (an operator 3510): Result set contains all discussions or communications with at least one occurrence of a specified term, depending on the context in which it is used. This operator can specify sets of terms through techniques including but not limited to use of wildcard characters and matching using the Levenshtein edit distance.
• Phrase (an operator 3510): Result set contains all discussions or communications with at least one occurrence of the sequence of terms. This operator can specify sets of related phrases using techniques including but not limited to the use of wildcard characters in individual terms, matching by Levenshtein edit distance between terms and matching by Levenshtein edit distance between sequences of terms.
• Classifier (an operator 3510): Result set specified by the set of sub-queries obtained from expanding a given class from an ontology loaded into the document analysis engine.
• NamedEntity (an operator 3510): Result set specified by the query obtained from expanding a given named entity from all ontologies loaded into the document analysis engine.
• InDiscussionOnly (a document type 3505): Return only results from discussions
• InSingleDocOnly (a document type 3505): Return only singleton documents that are not members of any discussion.
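As an illustration of how the Keyword operator might combine wildcard matching with Levenshtein edit distance, the following Python sketch may help. It is not the retrieval behavior of the described system, which depends on the relevance scoring mechanism in place; the function names, the whitespace tokenization, and the `max_edits` parameter are assumptions made for the example.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def keyword_match(term: str, token: str, max_edits: int = 0) -> bool:
    """True if token matches term by wildcard or within max_edits edits."""
    term, token = term.lower(), token.lower()
    if "*" in term or "?" in term:
        return fnmatchcase(token, term)  # shell-style wildcard match
    return levenshtein(term, token) <= max_edits


def keyword(term: str, docs: Dict[str, str], max_edits: int = 0) -> List[str]:
    """Return ids of documents with at least one matching token."""
    return [doc_id for doc_id, text in docs.items()
            if any(keyword_match(term, tok, max_edits) for tok in text.split())]
```

With `max_edits=1`, a search for "revenue" would also retrieve a document containing the misspelling "revnue", which is the kind of term set the operator description has in mind.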
Evidence Operators
[00141] The second group of operators search over metadata collected from each individual communication as well as relationships between documents created during the evidence accrual process while building discussions. These operators return discussions when applied.
• CommunicationType: Returns all discussions containing certain types of communication items, for example email.
• EventType: Returns all discussions that contain an event of a particular kind, such as a board meeting.
• Event: Returns all discussions that contain a particular instance of an event, for example, the board meeting that occurred on March 15, 2001.
• WithItemRelatedToQuery: Will return all discussions containing communications that are a match for a query, regardless of other parameters.
• WithSimilarEvidenceLinks: Will return all discussions with a certain frequency or statistical distribution of evidence links of specific kinds.
• HaveRevisions: Returns those discussions that have more than one version (i.e., have at least one revision due to the subsequent addition of further evidence.)
• PragmaticTag: Returns any discussions containing one or more items with the given pragmatic tag.
Multi-Discussion Operators
[00142] The third group of operators search over metadata collected from each discussion as well as relationships between discussions. These operators return discussions when applied.
• WithSimilarProperties: return discussions containing a distribution of properties of contained documents. For instance "discussions where most communications sent after hours".
• WithSimilarActors: discussions containing specified set of actors, actors can be marked as primary, regular, observer or passive participant. For example: primary:<joe rudd>.
• WithSameWorkflow: return all discussions that are an instance of the given template. Templates include formal and informal workflows, etc.
• RelatedDiscussions: return discussions related to the given discussions, for example, offspring.
[00143] The fourth group of operators search over inferred sociological relationships between communications in a discussion. In general the discussions which contain communications with the indicated relationship are returned.
• ActorRelations: return discussions with the indicated relationship between a set of actors, cliques ("circles of trust") or groups. Relationships include but are not limited to: "between", "among", "drop", "add", "exclude." Some of these operators optionally use a ternary syntax: <joe rudd> excludes <bob jones> (see 'An Apparatus for Sociological Data Mining' for an explanation of these items)
• ActorStatistics: return discussions with a statistical relationship between an indicated actor and others. For example "most frequent correspondents with ActorX"
• Topology: return discussions with a given topology, for example: "split" "merge"
• Resolution: return discussion with a given resolution
• Damaging: return discussions with damaging actors. Primarily useful in combination with other queries.
[00144] The fifth group of operators are combinatorial operators used to combine result sets of subqueries. The conventional logical operators have a different effect when applied over discussions.
• REQUIRED
• PROHIBITED
• () - nesting
• [] - suppress ontology expansion
Other Operators
• DiscussionMember: Takes a set of individual documents and returns the set which are members of one or more discussions. The negation may be used in order to retrieve the complement set. Used with -statistics, it will calculate various statistics on the differences between the member and non-member documents.
• DiscussionProperties: Used on one or more discussions, queries against the total number of communications/events, types, the maximum depth, overall duration, frequency of communications, topics, actors, etc.
• ExpandToDiscussions: return the set of unique discussions containing at least one document from the document set. The document set is obtained from the result set of a subquery.
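The DiscussionMember and ExpandToDiscussions operators are essentially set operations between documents and discussions. A minimal sketch, assuming discussions are held as a mapping from a discussion name to the set of document ids it contains (this data layout and the function names are hypothetical):

```python
from typing import Dict, Iterable, Set


def expand_to_discussions(doc_ids: Iterable[str],
                          discussions: Dict[str, Set[str]]) -> Set[str]:
    """Unique discussions containing at least one document from doc_ids."""
    wanted = set(doc_ids)
    return {name for name, members in discussions.items() if members & wanted}


def discussion_member(doc_ids: Iterable[str],
                      discussions: Dict[str, Set[str]],
                      negate: bool = False) -> Set[str]:
    """Documents that belong to one or more discussions.

    With negate=True the complement is returned: the documents from doc_ids
    that are not members of any discussion.
    """
    in_some = set().union(*discussions.values())
    docs = set(doc_ids)
    return docs - in_some if negate else docs & in_some
```

In this sketch the document set passed to `expand_to_discussions` would typically be the result set of a subquery, as the operator description states.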
[00145] A specific graphical querying tool is also provided, in addition to the views that serve double-duty as visual query builders. As depicted in Figures 29-31 and 37a-c, the query tool includes a text field that users may use to enter words, phrases, or ontology names. Optionally, a separate pane to specify ontologies (similar to the ontology selection dropdown list 3715 shown in Figure 37a) using a tree to select the desired items may be displayed, as well as a view indicating which ontology hits correlate with which others - for example content discussing tax evasion and travel frequently co-occurring - also allowing the desired ontologies to be selected and added to the query.
[00146] Figure 36 depicts another visual query means using a Venn diagram representation to indicate how many documents were "hit" by a particular ontology, or by a combination of particular ontologies. A series of interlocking circles 3620 represent the extent to which communications "hit" only one, or more than one, ontology. The interlocking circles 3620 are used to indicate how many documents have been found to reside within each of three categories, as shown in the single-category total 3605. It also shows the number of documents that reside in more than one of the three categories, as shown in the multiple-category total 3610. In this embodiment, an explanatory text 3615 prompts the user to click in the relevant portion of the Venn diagram in order to see the corresponding documents. Using this view, users may click on any bounded area of the diagram. Doing so will bring up a panel containing a relevance ranked list of either individual documents or discussions, depending upon the user's preference. In the event that the user clicks on an area that is the intersection between two or more ontologies, in one embodiment, the relevance ranking scheme will be altered to favor documents that have a substantial score for each ontology in question.
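The counts shown in each bounded area of such a Venn diagram can be derived by grouping documents on the exact set of ontologies they hit. The Python sketch below illustrates this, together with one possible way of favoring documents that score substantially on every ontology of an intersection. Ranking by the minimum per-ontology score is an assumption made for the example, not a scheme stated in this description, and all names are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, List

# Hypothetical input: document id -> {ontology name: relevance score}
Scores = Dict[str, Dict[str, float]]


def venn_regions(scores: Scores) -> Counter:
    """Count documents per bounded area, keyed by the exact set of ontologies hit."""
    regions: Counter = Counter()
    for per_ontology in scores.values():
        key = frozenset(per_ontology)  # the ontologies this document hit
        if key:
            regions[key] += 1
    return regions


def rank_region(scores: Scores, region: FrozenSet[str]) -> List[str]:
    """Documents lying in exactly this area, best first.

    A document ranks high only if its weakest ontology score is high, so a
    strong score on one ontology cannot mask a negligible score on another.
    """
    inside = [d for d, s in scores.items() if frozenset(s) == region]
    return sorted(inside,
                  key=lambda d: min(scores[d][o] for o in region),
                  reverse=True)
```

Clicking an area of the diagram would then correspond to calling `rank_region` with the set of ontologies that bound that area.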
[00147] This view is also used in thumbnail form in order to show how the topic relative proportions changed due to the addition of new documents to the corpus. This is done both by showing "before" and "after" thumbnails, as well as displaying thumbnails side by side of each segment of the data set (however the segments are determined by the user) so that their topic content may be easily compared. A similar representation can be constructed on the basis of actors rather than ontologies; further both actor and ontology information could be combined in one Venn diagram view.
[00148] Returning to Figures 29-31, in the query tool, individual and aggregate actor icons 2910 are provided in the search panel, though actor names may also be typed in the text field 2905. Users may specify which icons should appear; initially by default the system will select the ones with the greatest communication frequency. Subsequently, by default, it will display the actors who appear most frequently in queries. Additional options allow the exclusion of specific actors; if an actor has been excluded, the icon representing him will have an "X" or diagonal bar superimposed on it, similar to the symbol used in prohibition signs, as shown in Figure 31.
[00149] For one embodiment, events of global interest 2915 are added to a catalog so that they are displayed in the query tool for easy access. Additionally, a date range may be specified using standard calendar selection controls 2920. For one embodiment, events of interest will also appear in the calendar 2925 by coloring the square for the particular date(s) in question. Double-clicking on a colored square will bring up a pop-up with a description of the event. If an event is selected, the user will be asked whether they want the query to be:
• Prior to the event
• Subsequent to the event
• Within a specified period of time before or after the event
• During the event
[00150] If the calendar controls have been used and one or more events have been selected, the system will treat this as a request to include the union of these times. However, in this case, those discussions corresponding to the time specified by events will be given a higher relevancy ranking on the dimension of time.
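One way to picture the event options and the union of times is the following Python sketch. The mode names, the inclusive interval boundaries, and the use of a corpus-wide interval to bound "prior" and "subsequent" are assumptions made for illustration; the higher relevancy ranking given to event times is not modeled here.

```python
from datetime import date, timedelta
from typing import List, Tuple

Interval = Tuple[date, date]  # (start, end), both inclusive


def event_window(event: Interval, mode: str, corpus: Interval,
                 margin: timedelta = timedelta(0)) -> Interval:
    """Translate the user's choice for a selected event into a time interval."""
    start, end = event
    if mode == "prior":        # prior to the event
        return (corpus[0], start)
    if mode == "subsequent":   # subsequent to the event
        return (end, corpus[1])
    if mode == "within":       # within a period of time before or after
        return (start - margin, end + margin)
    if mode == "during":       # during the event
        return (start, end)
    raise ValueError(f"unknown mode: {mode}")


def union(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping intervals into a sorted list of disjoint ones."""
    merged: List[Interval] = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged
```

A calendar selection and one or more event windows would be passed together to `union` to obtain the times covered by the query.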
[00151] In one embodiment, shown in Figure 30, the querying tool allows the user to specify, through the use of check boxes 3010, in what way an actor must have been involved with each document in order for the document to be considered responsive to the query. Examples of the involvement include, but are not limited to: creating, changing, reading, seeing, and/or receiving a document. In one embodiment, also shown in Figure 30, the querying tool allows the user to select pre-created, saved queries 3005. Possible mechanisms for selecting the saved queries include, but are not limited to, drop-down list or combo boxes (as shown in Figure 30) and list boxes. In one embodiment, the user can specify that only discussions involving certain personalities of a given actor should be returned.
[00152] After the user hits the "go" button, the query will be echoed back to the user. In some embodiments of the invention all queries, however specified, are echoed back to the user in front of the result set. This is done using query templates, such as those specified in Figures 32-34. Specifically, using the example of Template 1 (Figure 32), in one embodiment of the invention, the echo is constructed by concatenating the following pieces of data: "Query on:" <actors> <actions performed> <content descriptors> <time>
For example:
"Query on Joe Smith or Bob Jones modifying spreadsheets last quarter"
[00153] In some embodiments, each query template has a corresponding natural language phrase that is used to generate the echo. In such embodiments, the above would be expressed as:
"Did Joe Smith or Bob Jones modify any spreadsheets last quarter?"
[00154] Since numerous query options may be specified, use of an echo helps compactly confirm what the user has asked for. This may help users to understand the result set returned, especially if the user erred in some way. Further, the text of the echo may optionally be saved with the results sets, making it easy for other users to immediately interpret the results set.
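The echo construction described above amounts to filling a template with the pieces of the query. A minimal Python sketch follows; the function names are hypothetical, the wording follows the two example echoes given above, and the caller is assumed to supply the verb in the form each template needs ("modifying" for the plain echo, "modify" for the natural language phrase).

```python
from typing import Sequence


def echo(actors: Sequence[str], action: str, content: str, when: str) -> str:
    """Plain echo: the query pieces concatenated in template order."""
    return " ".join(["Query on", " or ".join(actors), action, content, when])


def echo_question(actors: Sequence[str], verb: str, content: str,
                  when: str) -> str:
    """Natural language echo for the same template."""
    return f"Did {' or '.join(actors)} {verb} any {content} {when}?"


actors = ["Joe Smith", "Bob Jones"]
print(echo(actors, "modifying", "spreadsheets", "last quarter"))
print(echo_question(actors, "modify", "spreadsheets", "last quarter"))
```

Saving the returned string alongside the result set would give other users the same compact description of what was asked.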
[00155] The converse also holds true; in some embodiments of the invention, the user may enter natural language queries, and the system will interpret these queries by matching them to the appropriate query template and then performing any necessary word mapping via the use of ontologies.
[00156] Additional query options include, but are not limited to, the following:
• Discussion length (number of items)
• Discussion length (calendar duration)
• Discussion depth (number of items on same topic)
• Containing events/communication of specific types
[00157] The above-mentioned discussion length query options include (but are not limited to) the longest or shortest discussions (both by number of items and calendar duration) among a given set of actors, or on a given topic. The ability to target the longest or shortest discussions by actor provides a targeted tool for probing the activities of specific actors of interest, without being restricted to particular topics or content. This is important because such restrictions limit the user to finding only what he already thinks may be there, leaving potentially important or interesting information unrevealed.
[00158] As is the case with the query language, the GUI tool will provide the user feedback on which terms caused the query (on a relative basis) to over-generate or under-generate.
[00159] The user may also avail herself of a number of canned query templates. These include, but are not limited to, the following:
• Did <this> actor receive <this> version of <this> particular document?
• Were there any unusual peaks or troughs in communication activity between <these> actors?
• Find the longest discussions among <these> actors during <this> period of time
• <Who> discussed <this> topic the most?
• <Who> discussed <this> topic at all?
• <Who> was in <this> actor's circle of trust, when?
• Show any instances where communication circumvented the org chart.
• Show any instances where an unexpected person modified a document.
[00160] All such questions are accompanied by a UI template which allows the user to select the instances of actor, document, topic (ontology) or time interval as appropriate to fill in or extend the template.
[00161] The user may configure the interface to display one or more of a number of different kinds of views in response to a query. In one embodiment, the default view is a tabular listing of the discussions that are responsive to the query, relevance ranked accordingly. This table may include all of the following information, plus any additional information that has been programmatically added:
• Discussion Name (as determined by the core engine)
• Discussion Profile (includes such information as the number of items, kind of items, number of attachments.)
• Lifespan (interval of time from the beginning of the first transaction in the discussion to the last)
• Summary, as described in 'An Apparatus for Sociological Data Mining'
• Resolution, as described in 'An Apparatus for Sociological Data Mining'
• Primary Participants
• Specific participants (indicate which actors of special interest were in any way involved in the discussion, even very peripherally.)
• Ontologies (which ontologies trapped content in the discussion)
• Missing Items (whether the system has detected evidence that some of the items that were once part of the discussion are now absent - and if so, how many such items there are.)
• Revision history (As noted in patent 'An Apparatus for Sociological Data Mining', a discussion may be revised due to the incorporation of additional data from new data sources that had previously been unavailable. In some embodiments of the invention, it may also be modified manually by an administrator with the appropriate level of privilege.)
• Retrieval & viewing history (How many times this discussion has been retrieved in a query, how many times it was actually viewed or annotated.)
[00162] As elsewhere in the system, by default the images used to represent the actors are used in order to facilitate rapid visual scanning of the results, as shown in Figure 26. Figure 26 is a query results view showing actor images. Each line of the results view shows the discussion title 2605, discussion start date 2610 and end date 2615, and a button 2625 depicting the image and name of each actor involved with the discussion. In one embodiment, clicking on the button displays information related to the actor. In one embodiment, only the actor image is displayed on the button. In another embodiment, only the actor name is displayed on the button. In one embodiment, a non-clickable image or text box is used, rather than a button. In one embodiment, only primary actors are shown. In one embodiment, only certain personalities of an actor are shown. The discussion is displayed by clicking on the relevant line in the results view, or by highlighting the results view line and clicking the 'Display Discussion' button 2620. In one embodiment, a text summarization of the discussion is displayed on the relevant line in the results view.
[00163] The user may also opt to have the discussions returned from a query visualized in a matrix view, shown in Figure 27, in which the columns represent a variety of discussion properties extracted from the user's query. For example, if there were 20 actors participating in all of the discussions returned by a particular query, each one would be represented by its own column, as would be other properties, such as communication type, which relevant ontologies "hit" it, and so on. Each discussion 2710 is displayed in its own row, and each property 2705 that it has, such as the participation of a particular actor, causes the relevant square to be colored in. Different fill colors may be used in order to indicate whether the actor was a primary actor in the discussion, just an actor, or merely a passive participant. This is depicted in Figure 27 in compact form (without use of the actor images.)
[00164] In addition the user may choose to save a number of queries and their results in a particular location, so that this data may be displayed together, as pictured in Figure 28. In one embodiment, saved queries are displayed in a list, where each item is identified by a folder icon 2850, to convey to the user the fact that it may be expanded. When expanded, a results list 2835 containing relevant discussions and their associated actors 2840 and date range becomes visible.
[00165] A folder icon 2850 is used to represent each query, and the textual content 2855 of the query is displayed to the right of the folder icon. The first query is shown expanded, revealing the results list 2835. Descriptive icons 2815, 2820, 2825 and 2830 appear to the left of each saved query. Clicking on the icon representing a pencil 2820 allows the user to annotate the query; a green rectangle next to the pencil icon indicates that the query has already been annotated. Clicking on the icon representing a hard drive 2830 saves the query to the local machine. The document icon 2815 at the left becomes replaced with the initials of the last user to modify the data (shown as 'TU' in this figure). The folder icon 2825 is used to add a discussion to a bin or folder of the user's choosing. For each saved query, a list of any relevant discussions 2805 and communications 2810 is shown. In one embodiment, such items show the list of actors 2840 involved, and the date range 2845 of the relevant discussion.
[00166] For one embodiment, individual or "singleton" documents are displayed separately from discussions. Furthermore, for one embodiment, saved data may be annotated (by clicking on the pencil icon), saved to a local hard drive (by clicking on the hard drive icon), or placed in one or more particular bins (by clicking on the folder icon to see a list of options that may be selected); the initials of the user who last manipulated the document are also included.
[00167] Finally, for users for whom even this simplified process might seem onerous, in one embodiment, a discussion finding "wizard" is provided. This wizard follows the sequence of operators indicated in the section on the querying language. Effectively it decomposes the controls in the illustration above into several individual, simpler panels while providing the user inline help information. The first panel asks about actors; the second asks about events of interest, the third about important words or phrases, and so on.
QBE (Query By Example)
[00168] QBE refers to a set of techniques whereby a user provides an exemplar of what she is looking for in lieu of constructing an explicit query. Figures 37a-37c are screen shots of a series of Query by Example (QBE) windows. This refers to the type of query in which an exemplar of the desired returned object is specified by the user. In the case of discussion objects, QBE becomes a more complicated issue than it is with regular documents. As can be seen in 'An Apparatus for Sociological Data Mining' application, discussions have large numbers of properties, the importance of which may shift according to use case. In other words, there is no simple, one size fits all similarity metric for discussions. For example, if discussion A contains the same 3 topics as discussion B, but shares only one actor with it, and shares the same group of actors with discussion C with which it has one topic in common, it is unclear which of B or C would be considered most similar to A. The first QBE window, shown in Figure 37a, therefore allows the user to choose from among a plurality of properties. The properties include (but are not limited to): actors 2910, content terms or phrases 2905, topics 3705, content type 3710, ontology 3715, and time range 3720.
[00169] The second window, shown in Figure 37b, contains a set of discussion properties that can be considered as evidence in determining similarity. The set shown can be selected by the user from the full set of discussion properties (except for unique ID). In addition, one embodiment of the invention provides the default set 3725 of discussion properties, pictured in Figure 37b. The colored rectangles 3735 represent the relative importance of each of the discussion properties. In one embodiment, using the modified cursor 3740, the user may modify the sizes of the different colored rectangles 3735 in the box at the bottom of figure 37b. Since the size of the box is fixed, enlarging one box proportionally reduces the sizes of the others. By repeated resizings of these rectangles, the user can achieve whatever relative scoring amongst these different factors they wish. In one embodiment, this relative scoring information is saved by the system, and will be the default setting until the user changes it again. Alternatively, a pie chart may be used, in a similar manner. Alternatively, the user may select relative importance numerically by percentage, or using some other tool. In one embodiment, the user may name and save different settings, as different settings may be useful for different use cases. The system provides the following functionality in this regard:
• As depicted in Figure 37a, the user may enter a combination containing all or some of the following query items: topic, document type, ontology, time range, and actor. The system will return a results list containing all discussions that meet this combination of criteria. In one embodiment, the combination of parameters entered by the user can include certain personalities of a given actor.
• A user may right-click on any graphical representation of a discussion in any of the previously described views in order to bring up the menu item "Find Similar". This will bring up a window according to the user's configured preferences displaying the discussions returned by the query.
• A user may right-click on any graphical representation of an individual textual communication, for example, the rows in a table representing singleton documents returned in response to a query, in order to locate other documents that are similar both contextually and by themselves. This will bring up a two-tabbed view, one with discussions, and one with singleton documents.
• As pictured in Figure 37a, the user may enter a document containing text into the system in order to use its contents as input to the query engine. As described further in 'An Apparatus for Sociological Data Mining', all named entities, including actors, will be extracted from the document. In one embodiment, a topic analysis will be done via the use of ontologies and pragmatic tagging, known text blocks will be sought, and finally any mention of dates will be extracted. One example of this usage is depositions in a litigation context.
[00170] Discussions have large numbers of properties including, but not limited to, the following:
• Actors
• Primary Actors
• (Regular) Actors
• Observers
• Number of organizations
• Number of Items
• Number of Item Types
• Item Types
• Lifespan Length
• Number of Partitions
• Topics
• Pragmatic Tagged Items
• Revisions
[00171] As a result, there is potentially considerable ambiguity as to what exactly it means to say that one discussion is "similar" to another, and therefore should be returned in a QBE query. Further, the desired behavior of the QBE mechanism may vary by application. However, in one embodiment, the default behavior is to consider that actor and content are the two key items in the weighting; all other properties merely impact the ranking of the discussion in the result set. Specifically, actor is expanded first to any actor with the same role or title in the same organization as the actor(s) provided in the exemplar, and then to any actor in the same organization. Content may be determined by ontology or pragmatic tag, with the former being given more weight. Discussions that contain the desired actors or content under this definition are returned. For one embodiment, results are relevance-ranked according to the scheme laid out in 'An Apparatus for Sociological Data Mining'.
[00172] If the user wishes a different behavior, he may pull up the Advanced Options panel as shown in Figure 37b, and specify the relative weight that he wishes to assign to each property, and whether or not the value of the property is to be treated strictly as specified in the exemplar. For example, must the exact actors in the exemplar be present in order for a discussion to be retrieved, or does it suffice if their colleagues in the same department are present? In one embodiment, the relative weights are assigned with a weighted scale (i.e., a scale that has both numbers and words, for example 5 = must be the case; 1 = desirable to be somewhat similar.) In another embodiment, shown in Figure 37b, the colored rectangles 3735 represent the relative importance of each of the discussion properties. In one embodiment, using the modified cursor 3740, the user may modify the sizes of the different colored rectangles 3735 in the box at the bottom of Figure 37b. Since the size of the box is fixed, enlarging one box proportionally reduces the sizes of the others. By repeated resizings of these rectangles, the user can achieve whatever relative scoring amongst these different factors they wish. In one embodiment, this relative scoring information is saved by the system, and will be the default setting until the user changes it again. Alternatively, a pie chart may be used, in a similar manner. Alternatively, the user may select relative importance numerically by percentage, or using some other tool. In one embodiment, the user may name and save different settings, as different settings may be useful for different use cases.
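The behavior of the resizable rectangles, where enlarging one proportionally reduces the others inside a box of fixed size, can be expressed as a renormalization of weights that always sum to one. The Python sketch below is illustrative only; the function name, the property names, and the representation of the box as shares of 1.0 are assumptions.

```python
from typing import Dict


def resize_weight(weights: Dict[str, float], name: str,
                  new_share: float) -> Dict[str, float]:
    """Set one property's share of the box and rescale the others.

    The remaining properties keep their proportions relative to one
    another, and the shares continue to sum to 1.0 (the fixed box size).
    """
    if name not in weights:
        raise KeyError(name)
    if not 0.0 <= new_share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    rest = sum(w for k, w in weights.items() if k != name)
    remaining = 1.0 - new_share
    resized = {}
    for key, weight in weights.items():
        if key == name:
            resized[key] = new_share
        elif rest > 0:
            resized[key] = weight / rest * remaining
        else:
            # All other rectangles had zero size: split what is left evenly.
            resized[key] = remaining / (len(weights) - 1)
    return resized
```

For instance, starting from shares of 0.4 for actors, 0.4 for content and 0.2 for time, enlarging the actors rectangle to 0.7 leaves 0.3 to be divided between content and time in their original 2:1 proportion.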
[00173] With this information, the system performs the query. In order to help the user make sense of the ranking of results in figure 37c, the property or properties primarily responsible for the rank are shown 3750 (in one embodiment, properties are color-coded, and the coding is explained in a legend 3745 below the results). For example, as pictured in figure 37c, the initial item was scored highly primarily on the basis of shared terms. If the high score were also attributable to shared actors, a blue chit would also appear. In some embodiments of the invention, the degree of saturation of the color chit is used to express the relative level of similarity in this dimension. In one embodiment, the user sees a warning message 3755 if the result has been broken down into clusters.
[00174] The user may configure the view to show any of the available discussion properties. Similarly, in one embodiment, he may resize and reorder the various columns via direct manipulation.
Filtered Viewing of Discussions
[00175] Using standard information retrieval techniques, those items within the discussion that are relevant to the user's query may be identified and visually highlighted. The user may opt to have all portions of a discussion that are not responsive to their query be minimized. In the case of a transcript view, non-responsive items would be condensed to a single header line, with a button that can be clicked on in order to expand the entry in order to make its contents visible.
[00176] Certain actors who may generate a considerable volume of data may nevertheless generate very little content of interest. If desired, the user may specify that all communications originating from such actors are by default minimized in any views of the discussion.
Object Lifecycle Views
[00177] These views differ from the previously described ones in that they are less actor-focused and more object-focused. These views are intended to depict the history of a particular document (or other electronic data object) as it moves from creation, to distribution, various modifications, changes in form, extractions or copy/pastes to other documents, and possibly deletion. Such views can be extremely important in investigative contexts, when a particular document becomes the focus of attention.
[00178] Figure 38 depicts the lifecycle view for a document. If versioning information is available from a document management system or repository, or if the creating application provides it, the versions are shown by number 915 above the view, with vertical lines extending beneath them to help make it clear which actors modified or received a document before, or after a particular version change. Major versions and minor versions can be represented differently as per user preference; minor versions may be omitted from the display entirely, represented by thinner lines and smaller number boxes, or drawn the same as major versions. Other designations may be added by the user manually, or extracted automatically from systems that contain such information. These designations include, but are not limited to, published, shipped, and produced. The legend panel 3825 indicates the color coding of some of the different kinds of possible lifecycle events. The lifecycle view is drawn according to a left to right timeline. However, as is also the case with the participant graph, the actor icons only need be drawn in approximately the correct location with respect to the timeline. This is for purposes of readability; drawing a separate actor icon for related actions that may have taken place only moments apart from one another would only serve to decrease the readability of the visualization. However, an additional actor icon will be drawn if it is necessary to do so in order to not combine events which occurred on opposite sides of a version line. Therefore to capture such information, each actor icon is framed by a frame that can be partitioned up to 8 times in order to indicate the occurrence of different events performed by the actor on the document within a fairly short period of time. 
For example, an actor might check out a document, modify it in some way, email it around to various people, and then check it back into the repository, all within a short period of time. In this event, the actor frame would have 4 colors, one per side, in whatever colors the legend designates. With the color scheme pictured below, this would be: orange, red, blue, and yellow.
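The grouping rule described in paragraph [00178] can be illustrated with a minimal Python sketch. It is not taken from the specification: the `Event` and `ActorIcon` classes, the 30-minute merge window, and the mapping of legend colors to event types are assumptions made for illustration (the colors simply follow the order of the example above). Events by the same actor are merged into one icon when they are close in time, fall on the same side of a version line, and the frame still has fewer than 8 partitions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical legend; colors follow the order of the example in the text.
LEGEND = {"check-out": "orange", "modify": "red", "email": "blue", "check-in": "yellow"}
MAX_SEGMENTS = 8  # a frame can be partitioned up to 8 times


@dataclass
class Event:
    actor: str
    action: str
    time: datetime
    version: int  # document version in effect when the event occurred


@dataclass
class ActorIcon:
    actor: str
    version: int
    events: list = field(default_factory=list)

    def frame_colors(self):
        """One frame segment per event, colored according to the legend."""
        return [LEGEND.get(e.action, "gray") for e in self.events]


def group_into_icons(events, window=timedelta(minutes=30)):
    """Merge nearby events by one actor into a single icon instance.

    A new icon is started when the actor changes, when the event lies on the
    other side of a version line, when the frame is full, or when the gap to
    the previous event exceeds the (assumed) time window.
    """
    icons = []
    for ev in sorted(events, key=lambda e: e.time):
        last = icons[-1] if icons else None
        if (last is not None
                and last.actor == ev.actor
                and last.version == ev.version
                and len(last.events) < MAX_SEGMENTS
                and ev.time - last.events[-1].time <= window):
            last.events.append(ev)
        else:
            icons.append(ActorIcon(ev.actor, ev.version, [ev]))
    return icons
```

With four events by one actor inside the window and on the same side of a version line, the sketch yields a single icon with four frame segments; a later event recorded against the next version starts a new icon even if it occurs only minutes afterwards.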
[00179] In order to "drill down" for further information, the user may click on an actor icon in order to view a detailed log of events represented by that instance of the actor icon. Clicking on any part of the frame will bring up a pop-up with a detailed description of that action. For example, in the case of a check-in, the detailed description would include all of the following information (if available):
• Timestamp of check-in
• Check-in message
• Other files modified as part of same check-in (if any)
• List of those actors receiving check-in notification
• Resulting version number
• Check-in verification ID
[00180] In addition, the user may click on the clock icon above the actor icon in order to see a simple chronological list with exact timestamps of the events represented by that actor icon instance. As in other views, the "?" icon may be used to access other kinds of information as specified in user preferences.
[00181] As depicted in Figure 38 below, individual actors may be filtered out of the view, either entirely removed from the display, or else grayed out significantly as shown below. Individual action types may be similarly treated. For example, a user may not care who checked-out or received a document, but rather may be interested in only those persons who modified the document or sent it outside of the organization.
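The two filtering behaviors described in paragraph [00181] (removing entries entirely, or graying them out) could be sketched as follows. This is an illustrative sketch only: the function name `apply_filters`, the dictionary-based event records, and the `mode` parameter are assumptions, not part of the described system.

```python
def apply_filters(events, filtered_actors=(), filtered_actions=(), mode="remove"):
    """Filter lifecycle events by actor and/or action type.

    mode="remove" drops matching events from the view;
    mode="gray" keeps them but marks them to be drawn grayed out.
    """
    if mode not in ("remove", "gray"):
        raise ValueError("mode must be 'remove' or 'gray'")
    result = []
    for ev in events:
        matched = ev["actor"] in filtered_actors or ev["action"] in filtered_actions
        if matched and mode == "remove":
            continue
        # Copy the event so the caller's records are not modified.
        result.append({**ev, "grayed": matched})
    return result


# Example: hide check-outs entirely, or gray out one actor instead.
events = [
    {"actor": "alice", "action": "check-out"},
    {"actor": "bob", "action": "modify"},
    {"actor": "carol", "action": "email"},
]
only_changes = apply_filters(events, filtered_actions={"check-out"})
carol_grayed = apply_filters(events, filtered_actors={"carol"}, mode="gray")
```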
Mobile, Voice & Related Applications
[00182] As usage of new types of user interfaces becomes more widespread, the system will need to not only absorb data that is captured through such interfaces, but also provide its output to users who rely on these modalities. Examples of the types of interfaces to be considered in this regard are: speech recognition and text-to-speech (either as stand-alone applications or in conjunction with telephony technologies), handheld devices such as those using the PalmOS (Figure 39) or WindowsCE operating systems, mobile telephones equipped with browser interfaces such as iMode or WAP, and potentially other devices using specialized data transmission protocols and/or specialized embedded operating systems.
[00183] Speech recognition is already widely used by the legal and medical profession for recording of briefs, reports, and the like. The system includes a means of extracting data that is input by speech recognition, and making such data searchable and retrievable like any other artifact. Input to speech recognition can take the form either of speaker-dependent recognition (the type employed by dictation software) or speaker-independent recognition (the type employed by telephony applications); the system includes adapters to incorporate data from both types of systems.
[00184] Furthermore, the system may utilize speech recognition as an interface allowing users to query data already in the system. To this end, an interactive voice interface to the system could display discussions and other data to the user, either on a device or through an audio-only interface. For applications using speech recognition as input mechanism, an auditory interface is commonly used to play back data to the user, be it for playback over a telephone or through speakers attached to another device such as a desktop computer. To this end, in one embodiment, the system includes auditory interfaces, including but not limited to: playback of indexed documents by text-to-speech, or spoken synthesis that accompanies or parallels any of the visual diagrams generated by the system.
[00185] Further remote interfaces for the system may include wireless and handheld display and input to the system, for example through WAP or similar protocols that transmit data over wireless networks (including telephony networks), input of data via Short Messaging System (SMS) or similar protocols, the use of downloadable/syncable views and data input for handheld/palmtop/tablet PCs, and interfaces for wearable computing devices. The system allows both input and retrieval of data into the system through any combination of devices; for example, a user's spoken query will be displayable on the screen of a handheld device.
[00186] Mobile and voice applications are most useful as query interfaces to the system for users who find themselves away from office systems but nonetheless require system access. However, the provision for data input by mobile or voice interfaces also means that "live" updates to a system can be made remotely, and that secondary sources of information (on-the-spot interviews, court proceedings, live news feeds) can be incorporated into the system in the absence of other indexing and content extraction processes. This topic is dealt with in further depth in 'An Apparatus for Sociological Data Mining'.
[00187] For voice applications in particular, a natural language interface is a highly desirable mode of interaction with the system. Users who are limited to an auditory interface (where the input to the system is spoken rather than textual) can respond better to systems that are designed around the vagaries of human speech (which include disfluencies, variable noise conditions, and the strictly linear exchange of information). The nature of auditory interfaces is such that spontaneity and a tolerance for garbled input is incorporated into the interface; rather than scripted, fixed input that can be manipulated visually, the voice interface must attempt to parse ambiguous user input and return a "system appropriate" result.
[00188] Typically, speech recognition interfaces rely on a grammar that restricts potential user utterances in order to provide accurate recognition. In a spoken query interface to the system described in this patent, highly accurate utterance recognition is unlikely, but need not be a hindrance to proper function. By allowing the system to accept unstructured utterances and subsequently to construct a range of hypotheses about their content, a much more usable type of interface results. With an unstructured grammar, any possible user utterance can generate a fixed-length set of possible parses. From this set of potential parses, an algorithm is applied to account for phonetic similarities in homophones, to remove content that occurs in only a few parses, and so forth, leaving a "core" hypothesis that can be used as the basis for a search.
[00189] As an example, the user utterance, "Find me anything about fraud" might generate the following hypothesis set from a speech recognition engine:
• "find me a thing about fraud"
• "find my anything about frog"
• "find me knee thing up out fraud"
• ... and so forth.
[00190] While none of the generated parses is entirely correct, the phonetic similarity of many items in the resulting set can be used to generate a normalized "core" hypothesis that finds the commonly occurring substrings such as "find/fine", "me/my", "anything/a thing/knee thing", "about/up out", and "fraud/frog". Normalization of this set of results can proceed according to relatively simple natural language heuristics: those words that are essentially contentless, such as "find me anything", can be omitted, leaving the core terms "about fraud", which can be encoded, for example, as a set of Boolean search queries like contents: fraud OR contents: "about fraud". Once the queries are generated, a preliminary result set can be relayed to the speaker by voice interface, allowing of course for additional refinement or correction of the query, as well as for more detailed display/playback of user-selected elements of the result set. For one embodiment, the system may repeat the query, as understood, back to the user, permitting the user to either confirm the query or to repeat the query to modify it.
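The normalization described in paragraphs [00188] to [00190] can be sketched in Python under simplifying assumptions. The sketch replaces phonetic matching with plain majority voting across parses, so it only covers the "remove content that occurs in only a few parses" step; the word lists `CONTENTLESS` and `FUNCTION_WORDS`, the `min_share` threshold, and both function names are hypothetical and chosen only to reproduce the example from the text.

```python
from collections import Counter

# Assumed word lists for illustration; a real system would need larger ones.
CONTENTLESS = {"find", "me", "my", "a", "anything", "thing"}
FUNCTION_WORDS = {"about", "on", "of", "for"}


def core_hypothesis(parses, min_share=0.5):
    """Keep tokens that occur in more than min_share of the parses,
    drop contentless command words, and preserve first-seen order."""
    token_lists = [p.lower().split() for p in parses]
    # Count each token once per parse so repeats inside one parse do not vote twice.
    votes = Counter(tok for toks in token_lists for tok in set(toks))
    needed = min_share * len(token_lists)
    core = []
    for toks in token_lists:
        for tok in toks:
            if votes[tok] > needed and tok not in CONTENTLESS and tok not in core:
                core.append(tok)
    return core


def to_query(core):
    """Encode the core terms as a Boolean query: one clause per content word,
    plus the whole phrase when it has more than one word."""
    clauses = [f"contents: {t}" for t in core if t not in FUNCTION_WORDS]
    if len(core) > 1:
        clauses.append('contents: "' + " ".join(core) + '"')
    return " OR ".join(clauses)


parses = [
    "find me a thing about fraud",
    "find my anything about frog",
    "find me knee thing up out fraud",
]
query = to_query(core_hypothesis(parses))
# query == 'contents: fraud OR contents: "about fraud"'
```

Because "frog", "knee", "up", and "out" each appear in only one of the three parses, they are voted out, while "about" and "fraud" survive. A fuller implementation would first map phonetically similar tokens (for example "fraud/frog") to a shared form before voting.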
[00191] Figure 39 is a screen shot showing one embodiment of the discussion view, as used on a mobile device. A list 3920 of returned discussions is shown, each of which is associated with a checkbox 3915 allowing the user to select the discussions in order to view further detail. In one embodiment, the query 3910 that caused the list 3920 of discussions to be returned is displayed. In one embodiment, a group of buttons 3905 allows the query to be launched or interrupted.
Case Management Application
[00192] One of the applications of the system is case management in a litigation context. The functionality previously described can be delivered inside a case management application. As pictured in Figure 40, the master window in this application allows the viewing of both individual documents and discussions in their various visual manifestations. Figure 40 is a screen shot of one embodiment of the case management master window. In the top-left pane 4005, the user can select from among various types of communications 4045 (and, in one embodiment, the actors who sent communications), or can select discussions 4050. Documents are displayed in the top right pane 4010. In this example, the top right pane 4020 shows a privileged document, which is flagged 4015 as such. At the bottom right pane 4035, the user can enter text in order to find specific discussions, documents, or actors. The bottom-left pane 4030 is used to bookmark searches to which the user wishes to return. A group of option buttons 4040 allows the user to select between management of discussions, documents, or actors, and a set of command buttons 4025 allows the user to select different views of the data. This window contains the following functionality of interest:
• Allowing users to browse by document type, which is calculated either by file extension or by pragmatic tagging, and to drill down first by actor and then by topic, or vice versa, as well as by discussion membership.
• Documents, including discussions, may be marked as "privileged", causing the red privileged stamp to always appear over the document in electronic form, and to be printed when the document is printed.
• The user may search for a word or topic in discussions, according to the actors to whom the words or topic are attributable, or in individual documents.
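The browsing behavior in the first bullet above (document type from file extension or pragmatic tag, then drill-down by actor and topic, or vice versa) could be sketched as below. The record layout, the `EXTENSION_TYPES` table, and the rule that a pragmatic tag takes precedence over the file extension are assumptions for illustration; the text does not say which source wins when both are present.

```python
from collections import defaultdict
from pathlib import PurePath

# Assumed extension-to-type table; purely illustrative.
EXTENSION_TYPES = {".doc": "word processing", ".xls": "spreadsheet", ".eml": "email"}


def document_type(doc):
    """Use the pragmatic tag when one exists (assumed precedence),
    otherwise fall back to the file extension."""
    if doc.get("tag"):
        return doc["tag"]
    ext = PurePath(doc["name"]).suffix.lower()
    return EXTENSION_TYPES.get(ext, "other")


def drill_down(docs, order=("actor", "topic")):
    """Build a type -> first key -> second key -> [document names] tree.
    Pass order=("topic", "actor") to drill down the other way round."""
    first, second = order
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for doc in docs:
        tree[document_type(doc)][doc[first]][doc[second]].append(doc["name"])
    return tree


docs = [
    {"name": "memo.DOC", "actor": "alice", "topic": "budget"},
    {"name": "q3.xls", "actor": "bob", "topic": "budget"},
    {"name": "note.txt", "actor": "alice", "topic": "fraud", "tag": "meeting minutes"},
]
by_actor = drill_down(docs)
by_topic = drill_down(docs, order=("topic", "actor"))
```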
[00193] In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims

CLAIMS

We claim:
1. A method of organizing information comprising: providing a visualization of actor communications in the context of one or more discussions, a discussion including at least one actor and at least one documented communication.
2. The method of claim 1, wherein the documented communication may be one or more of the following: a document, an email, an instant message (IM), a facsimile, a voicemail, a phone call, a wire transfer, a fund transfer, or an electronically traceable package.
3. The method of claim 1, further comprising: receiving a query; and generating the visualization in response to the query.
4. The method of claim 3, wherein the query includes one or more of: actors, time frame, topic, related events, communications type, specific document, or workflow process.
5. The method of claim 3, wherein the visualization comprises a tabular list of documents that satisfy the query.
6. The method of claim 3, wherein the visualization comprises a discussion oriented display.
7. The method of claim 6, wherein the discussion oriented display comprises one of the following: a participant graph, an overview graph, a transcript view, a question and answer list, a matrix view, a cluster view, and a tabular list view.
8. The method of claim 3, wherein the visualization comprises an actor-oriented display.
9. The method of claim 8, wherein the actor-oriented display comprises one of the following: an activity graph, a participant graph, an actor profile, a matrix view, a tabular list view, and a cluster view.
10. The method of claim 3, wherein the visualization comprises a statistical display of data.
11. The method of claim 10, wherein the statistical display comprises one of the following: a Venn diagram, and a profile view.
12. The method of claim 3, wherein the visualization comprises a topic-based display.
13. The method of claim 1, wherein the actor is an aggregate actor, comprising one of the following: a circle of trust, a group, a section, or another grouping of two or more actors.
14. The method of claim 1, wherein the discussion includes an exchange between at least two actors, the exchange including one or more documented communications.
15. The method of claim 14, wherein a plurality of communications are indicated between the at least two actors, and a visual representation of a depth of the communications is shown.
16. The method of claim 15, wherein the visual representation is a line between two actors.
17. The method of claim 16, wherein a thickness of the line indicates a number of communications between the actors.
18. The method of claim 1, further comprising: displaying a time-based participant graph showing communications between various actors over time.
19. The method of claim 18, wherein each communication is coded to indicate a communication type.
20. The method of claim 18, wherein each communication may be selected to display additional information about the communication.
21. The method of claim 20, wherein the additional information comprises one or more of the following: communication type, date and time of communication, communication content.
22. The method of claim 18, wherein each actor is represented visually by a unique icon.
23. The method of claim 22, wherein the icon is one of the following: a photograph of the actor, a consistent graphical representation of the actor.
24. The method of claim 22, further comprising: displaying actor information, in response to a user selecting the unique icon.
25. The method of claim 18, wherein the time of day is visually indicated in the time-based participant graph.
26. The method of claim 25, wherein the time of day indication is color based.
27. The method of claim 25, wherein the time of day indication further visually indicates holidays and after-hours communications.
28. The method of claim 18, further comprising: displaying tags indicating events of interest, to show communications in relationship to such events.
29. The method of claim 1, further comprising: enabling a user to add additional communications to the visualization.
30. The method of claim 1, wherein the visualization comprises a document trail graph, providing information on each document.
31. The method of claim 30, wherein the information comprises one or more of the following: creation date, creating actor, modification date(s), modification actor(s), revision date(s), revision actor(s), deletion date, deletion actor, check-in date(s), check-out date(s), distribution(s), recipients of distribution(s), and document content.
32. The method of claim 1, wherein the visualization comprises a money trail graph, illustrating times and actors involved in various money transfers.
33. The method of claim 1, wherein the visualization comprises an activity graph that illustrates a level of activity over time.
34. The method of claim 33, further comprising displaying an icon illustrating events of relevance, to show a relationship of activity levels to the events of relevance.
35. The method of claim 33, further comprising: displaying two actor icons, representing actors that communicated with each other, and a communication line between the two actor icons showing a communication depth.
36. The method of claim 35, wherein a number at a first end of a line represents a number of communications sent by a first actor to a second actor, and a number at a second end of the line represents the number of communications sent by the second actor to the first actor.
37. The method of claim 35, wherein a color of the communication line shows the communication density.
38. The method of claim 1, wherein the visualization is a discussion timeline in which sets of adjoining rectangles, linked by lines and coded by actor, represent the communications within a discussion.
39. The method of claim 38, further comprising displaying a legend identifying each actor code.
40. The method of claim , wherein the visualization is a discussion cluster, illustrating a number of discussions that meet a query criteria of the user.
41. The method of claim 40, further comprising: visually identifying a particular discussion focus.
42. The method of claim 1, wherein the visualization comprises a transcript view, displaying communications coded by actor.
43. The method of claim 42, wherein communications are color coded by document type.
44. The method of claim 42, wherein quoted text within a document is color coded for an originating actor.
45. The method of claim 42, further comprising: indicating deleted documents in the transcript, including available information about the deleted document.
46. The method of claim 45, further comprising: determining if a deleted document is suspicious, and if so, flagging the deleted document indication in the transcript.
47. The method of claim 1, wherein the visualization is a matrix query result view, indicating participation of certain actors in certain discussions.
48. The method of claim 1, further comprising: providing a query tool to construct queries for related documents.
49. The method of claim 48, further comprising: displaying actor icons for selection with the query tool, to enable a user to identify an actor.
50. The method of claim 49, further comprising: permitting specification of actor involvement for each selected actor, the actor involvement being one of the following: created, changed, received, read, or saw a document.
51. The method of claim 49, further comprising: permitting an actor to be excluded from the query.
52. The method of claim 48, wherein constructing a query comprises one or more of the following: specifying an actor, specifying an action by the actor, specifying content, specifying timeframe, specifying communication method, specifying actor location, specifying causality for the communication, specifying action frequency, specifying action type, specifying target of the communication, document types for retrieval, and keywords.
53. The method of claim 48, comprising: providing a query by example, permitting a user to select from multiple pulldown menus.
54. The method of claim 53, further comprising: prompting the user to assign priority to related parameters using a parameter weighting.
55. The method of claim 48, further comprising: saving queries and query results; and making the saved queries and the saved query results available to the user.
56. An apparatus to present data comprising: a query tool to receive a request; and a display tool to present a visualization of actor communications in the context of one or more discussions, a discussion including at least one actor and at least one documented communication.
57. The apparatus of claim 56, wherein constructing a query comprises one or more of the following: specifying an actor, specifying an action by the actor, specifying content, specifying timeframe, specifying communication method, specifying actor location, specifying causality for the communication, specifying action frequency, specifying action type, specifying target of the communication, document types for retrieval, and keywords.
58. The apparatus of claim 56, comprising: a query by example tool including multiple pull-down menus to select various parameters of a query.
59. The apparatus of claim 58, further comprising: a parameter weighting tool to assign priority to related parameters.
60. The apparatus of claim 56, further comprising: a memory to save queries and query results, the saved queries and the saved query results available to the user.
61. The apparatus of claim 56, further comprising: a plurality of actor icons for selection with the query tool, to enable a user to identify an actor.
62. The apparatus of claim 61, further comprising: a selector to specify actor involvement for each selected actor, the actor involvement being one of the following: created, changed, received, read, or saw a document.
63. The apparatus of claim 62, wherein the selector permits an actor to be excluded from the query.
64. The apparatus of claim 56, wherein the visualization comprises a participant graph including actor icons and connectors indicating communications between the actors.
65. The apparatus of claim 64, wherein the actor icons are a unique icon for each actor, the unique icon comprising: a photograph of the actor or a consistent graphical representation of the actor.
66. The apparatus of claim 64, further comprising: icons attached to each connector, the icons designed to provide additional information about the communication represented by the connector.
PCT/US2003/003504 2002-02-04 2003-02-04 A method and apparatus to visually present discussions for data mining purposes WO2003067497A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU2003207856A AU2003207856A1 (en) 2002-02-04 2003-02-04 A method and apparatus to visually present discussions for data mining purposes
CA002475319A CA2475319A1 (en) 2002-02-04 2003-02-04 A method and apparatus to visually present discussions for data mining purposes
EP03706095A EP1481346B1 (en) 2002-02-04 2003-02-04 A method and apparatus to visually present discussions for data mining purposes

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US35440302P 2002-02-04 2002-02-04
US60/354,403 2002-02-04

Publications (1)

Publication Number Publication Date
WO2003067497A1 true WO2003067497A1 (en) 2003-08-14

Family

ID=27734370

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/US2003/003309 WO2003067473A1 (en) 2002-02-04 2003-02-04 A method and apparatus for sociological data mining
PCT/US2003/003504 WO2003067497A1 (en) 2002-02-04 2003-02-04 A method and apparatus to visually present discussions for data mining purposes

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/US2003/003309 WO2003067473A1 (en) 2002-02-04 2003-02-04 A method and apparatus for sociological data mining

Country Status (5)

Country Link
US (2) US7143091B2 (en)
EP (2) EP1485825A4 (en)
AU (2) AU2003207856A1 (en)
CA (2) CA2475319A1 (en)
WO (2) WO2003067473A1 (en)


US7703140B2 (en) 2003-09-30 2010-04-20 Guardian Data Storage, Llc Method and system for securing digital assets using process-driven security policies
US8127366B2 (en) 2003-09-30 2012-02-28 Guardian Data Storage, Llc Method and apparatus for transitioning between states of security policies used to secure electronic documents
US7158980B2 (en) * 2003-10-02 2007-01-02 Acer Incorporated Method and apparatus for computerized extracting of scheduling information from a natural language e-mail
US7346629B2 (en) * 2003-10-09 2008-03-18 Yahoo! Inc. Systems and methods for search processing using superunits
EP1530139A1 (en) * 2003-11-05 2005-05-11 Sap Ag Method and computer system for workflow management
US7552433B2 (en) * 2003-11-12 2009-06-23 Hewlett-Packard Development Company, L.P. Non-platform-specific unique indentifier generation
US8239687B2 (en) 2003-11-12 2012-08-07 The Trustees Of Columbia University In The City Of New York Apparatus method and medium for tracing the origin of network transmissions using n-gram distribution of data
US20050125461A1 (en) * 2003-12-08 2005-06-09 International Business Machines Corporation Version control of metadata
US7333985B2 (en) * 2003-12-15 2008-02-19 Microsoft Corporation Dynamic content clustering
US7702909B2 (en) * 2003-12-22 2010-04-20 Klimenty Vainstein Method and system for validating timestamps
US7523109B2 (en) * 2003-12-24 2009-04-21 Microsoft Corporation Dynamic grouping of content including captive data
CN100495392C (en) * 2003-12-29 2009-06-03 西安迪戈科技有限责任公司 Intelligent search method
JP4297345B2 (en) * 2004-01-14 2009-07-15 Kddi株式会社 Mass mail detection method and mail server
US7254593B2 (en) * 2004-01-16 2007-08-07 International Business Machines Corporation System and method for tracking annotations of data sources
US7269590B2 (en) * 2004-01-29 2007-09-11 Yahoo! Inc. Method and system for customizing views of information associated with a social network user
US20050177600A1 (en) * 2004-02-11 2005-08-11 International Business Machines Corporation Provisioning of services based on declarative descriptions of a resource structure of a service
JP2005242904A (en) * 2004-02-27 2005-09-08 Ricoh Co Ltd Document group analysis device, document group analysis method, document group analysis system, program and storage medium
US7636710B2 (en) * 2004-03-04 2009-12-22 Symantec Operating Corporation System and method for efficient file content searching within a file system
US20050198305A1 (en) * 2004-03-04 2005-09-08 Peter Pezaris Method and system for associating a thread with content in a social networking environment
US20050203931A1 (en) * 2004-03-13 2005-09-15 Robert Pingree Metadata management convergence platforms, systems and methods
JP2007529822A (en) * 2004-03-15 2007-10-25 ヤフー! インコーポレイテッド Search system and method integrating user annotations from a trust network
US8788492B2 (en) 2004-03-15 2014-07-22 Yahoo!, Inc. Search system and methods with integration of user annotations from a trust network
US7953859B1 (en) 2004-03-31 2011-05-31 Avaya Inc. Data model of participation in multi-channel and multi-party contacts
US7908663B2 (en) 2004-04-20 2011-03-15 Microsoft Corporation Abstractions and automation for enhanced sharing and collaboration
KR20060134175A (en) * 2004-04-21 2006-12-27 코닌클리케 필립스 일렉트로닉스 엔.브이. System and method for managing threads in a network chat environment
US20060218111A1 (en) * 2004-05-13 2006-09-28 Cohen Hunter C Filtered search results
US7437382B2 (en) * 2004-05-14 2008-10-14 Microsoft Corporation Method and system for ranking messages of discussion threads
US20050278139A1 (en) * 2004-05-28 2005-12-15 Glaenzer Helmut K Automatic match tuning
US20060047816A1 (en) * 2004-06-17 2006-03-02 International Business Machines Corporation Method and apparatus for generating and distributing meeting minutes from an instant messaging session
US7949666B2 (en) 2004-07-09 2011-05-24 Ricoh, Ltd. Synchronizing distributed work through document logs
US8738412B2 (en) 2004-07-13 2014-05-27 Avaya Inc. Method and apparatus for supporting individualized selection rules for resource allocation
US7720845B2 (en) * 2004-08-13 2010-05-18 Yahoo! Inc. Systems and methods for updating query results based on query deltas
US7970639B2 (en) * 2004-08-20 2011-06-28 Mark A Vucina Project management systems and methods
US20060069734A1 (en) * 2004-09-01 2006-03-30 Michael Gersh Method and system for organizing and displaying message threads
US20060074833A1 (en) * 2004-09-03 2006-04-06 Biowisdom Limited System and method for notifying users of changes in multi-relational ontologies
US20060053099A1 (en) * 2004-09-03 2006-03-09 Biowisdom Limited System and method for capturing knowledge for integration into one or more multi-relational ontologies
US7505989B2 (en) * 2004-09-03 2009-03-17 Biowisdom Limited System and method for creating customized ontologies
US20060053382A1 (en) * 2004-09-03 2006-03-09 Biowisdom Limited System and method for facilitating user interaction with multi-relational ontologies
US20060053174A1 (en) * 2004-09-03 2006-03-09 Bio Wisdom Limited System and method for data extraction and management in multi-relational ontology creation
US7496593B2 (en) * 2004-09-03 2009-02-24 Biowisdom Limited Creating a multi-relational ontology having a predetermined structure
US7493333B2 (en) * 2004-09-03 2009-02-17 Biowisdom Limited System and method for parsing and/or exporting data from one or more multi-relational ontologies
US20060053172A1 (en) * 2004-09-03 2006-03-09 Biowisdom Limited System and method for creating, editing, and using multi-relational ontologies
US20060053175A1 (en) * 2004-09-03 2006-03-09 Biowisdom Limited System and method for creating, editing, and utilizing one or more rules for multi-relational ontology creation and maintenance
US7707167B2 (en) * 2004-09-20 2010-04-27 Microsoft Corporation Method, system, and apparatus for creating a knowledge interchange profile
US7730010B2 (en) * 2004-09-20 2010-06-01 Microsoft Corporation Method, system, and apparatus for maintaining user privacy in a knowledge interchange system
US7593924B2 (en) * 2004-09-20 2009-09-22 Microsoft Corporation Method, system, and apparatus for receiving and responding to knowledge interchange queries
US7949121B1 (en) 2004-09-27 2011-05-24 Avaya Inc. Method and apparatus for the simultaneous delivery of multiple contacts to an agent
US8234141B1 (en) 2004-09-27 2012-07-31 Avaya Inc. Dynamic work assignment strategies based on multiple aspects of agent proficiency
JP2006099236A (en) * 2004-09-28 2006-04-13 Toshiba Corp Classification support device, classification support method, and classification support program
US7587396B2 (en) * 2004-11-24 2009-09-08 Oracle International Corporation Encoding data to be sorted
US20060122974A1 (en) * 2004-12-03 2006-06-08 Igor Perisic System and method for a dynamic content driven rendering of social networks
US7734670B2 (en) * 2004-12-15 2010-06-08 Microsoft Corporation Actionable email documents
US20060149710A1 (en) 2004-12-30 2006-07-06 Ross Koningstein Associating features with entities, such as categories of web page documents, and/or weighting such features
US8090736B1 (en) * 2004-12-30 2012-01-03 Google Inc. Enhancing search results using conceptual document relationships
US7769579B2 (en) 2005-05-31 2010-08-03 Google Inc. Learning facts from semi-structured text
US8244689B2 (en) 2006-02-17 2012-08-14 Google Inc. Attribute entropy as a signal in object normalization
US7680791B2 (en) * 2005-01-18 2010-03-16 Oracle International Corporation Method for sorting data using common prefix bytes
US7904411B2 (en) * 2005-02-04 2011-03-08 Accenture Global Services Limited Knowledge discovery tool relationship generation
US8660977B2 (en) * 2005-02-04 2014-02-25 Accenture Global Services Limited Knowledge discovery tool relationship generation
US20060179026A1 (en) * 2005-02-04 2006-08-10 Bechtel Michael E Knowledge discovery tool extraction and integration
US7788293B2 (en) * 2005-03-02 2010-08-31 Google Inc. Generating structured information
US7882447B2 (en) 2005-03-30 2011-02-01 Ebay Inc. Method and system to determine area on a user interface
US7587387B2 (en) 2005-03-31 2009-09-08 Google Inc. User interface for facts query engine with snippets from information sources that include query terms and answer terms
US9208229B2 (en) 2005-03-31 2015-12-08 Google Inc. Anchor text summarization for corroboration
US8682913B1 (en) 2005-03-31 2014-03-25 Google Inc. Corroborating facts extracted from multiple sources
US7523137B2 (en) * 2005-04-08 2009-04-21 Accenture Global Services Gmbh Model-driven event detection, implication, and reporting system
US20060242125A1 (en) * 2005-04-25 2006-10-26 Storage Technology Corporation Method, apparatus, and computer program product for assessing a user's current information management system
US7765098B2 (en) * 2005-04-26 2010-07-27 Content Analyst Company, Llc Machine translation using vector space representations
US7636730B2 (en) * 2005-04-29 2009-12-22 Battelle Memorial Research Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture
US7672956B2 (en) * 2005-04-29 2010-03-02 International Business Machines Corporation Method and system for providing a search index for an electronic messaging system based on message threads
US20060262115A1 (en) * 2005-05-02 2006-11-23 Shapiro Graham H Statistical machine learning system and methods
US20060265383A1 (en) * 2005-05-18 2006-11-23 Pezaris Design, Inc. Method and system for performing and sorting a content search
US7567976B1 (en) * 2005-05-31 2009-07-28 Google Inc. Merging objects in a facts database
US7831545B1 (en) * 2005-05-31 2010-11-09 Google Inc. Identifying the unifying subject of a set of facts
US8996470B1 (en) 2005-05-31 2015-03-31 Google Inc. System for ensuring the internal consistency of a fact repository
US7680830B1 (en) 2005-05-31 2010-03-16 Symantec Operating Corporation System and method for policy-based data lifecycle management
US7493347B2 (en) * 2005-06-02 2009-02-17 International Business Machines Corporation Method for condensing reported checkpoint log data
US20060282303A1 (en) * 2005-06-08 2006-12-14 Microsoft Corporation Distributed organizational analyzer
WO2006134682A1 (en) * 2005-06-15 2006-12-21 Matsushita Electric Industrial Co., Ltd. Characteristic expression extracting device, method, and program
US7636734B2 (en) * 2005-06-23 2009-12-22 Microsoft Corporation Method for probabilistic analysis of most frequently occurring electronic message addresses within personal store (.PST) files to determine owner with confidence factor based on relative weight and set of user-specified factors
US20080005064A1 (en) * 2005-06-28 2008-01-03 Yahoo! Inc. Apparatus and method for content annotation and conditional annotation retrieval in a search context
US20070005698A1 (en) * 2005-06-29 2007-01-04 Manish Kumar Method and apparatuses for locating an expert during a collaboration session
US7765257B2 (en) * 2005-06-29 2010-07-27 Cisco Technology, Inc. Methods and apparatuses for selectively providing privacy through a dynamic social network system
JP2007042069A (en) * 2005-06-30 2007-02-15 Sony Corp Information processor, information processing method and information processing program
US7937344B2 (en) 2005-07-25 2011-05-03 Splunk Inc. Machine data web
EP1910949A4 (en) * 2005-07-29 2012-05-30 Cataphora Inc An improved method and apparatus for sociological data analysis
US20070067291A1 (en) * 2005-09-19 2007-03-22 Kolo Brian A System and method for negative entity extraction technique
US7502788B2 (en) * 2005-11-08 2009-03-10 International Business Machines Corporation Method for retrieving constant values using regular expressions
US20070112854A1 (en) * 2005-11-12 2007-05-17 Franca Paulo B Apparatus and method for automatic generation and distribution of documents
US8316292B1 (en) * 2005-11-18 2012-11-20 Google Inc. Identifying multiple versions of documents
US20070118441A1 (en) * 2005-11-22 2007-05-24 Robert Chatwani Editable electronic catalogs
US8977603B2 (en) * 2005-11-22 2015-03-10 Ebay Inc. System and method for managing shared collections
WO2007060780A1 (en) * 2005-11-22 2007-05-31 Nec Corporation Inspiration support device, inspiration support method, and inspiration support program
GB0524350D0 (en) * 2005-11-30 2006-01-04 Ibm Method and apparatus for propagating address change in an email
US7853465B2 (en) * 2005-12-02 2010-12-14 Oracle International Corp. Methods and apparatus to present event information with respect to a timeline
US20070150802A1 (en) * 2005-12-12 2007-06-28 Canon Information Systems Research Australia Pty. Ltd. Document annotation and interface
US20070143660A1 (en) * 2005-12-19 2007-06-21 Huey John M System and method for indexing image-based information
US20070143300A1 (en) * 2005-12-20 2007-06-21 Ask Jeeves, Inc. System and method for monitoring evolution over time of temporal content
US7792872B1 (en) * 2005-12-29 2010-09-07 United Services Automobile Association Workflow administration tools and user interfaces
US7792871B1 (en) 2005-12-29 2010-09-07 United Services Automobile Association Workflow administration tools and user interfaces
US7822706B1 (en) 2005-12-29 2010-10-26 United Services Automobile Association (Usaa) Workflow administration tools and user interfaces
US7788296B2 (en) * 2005-12-29 2010-08-31 Guidewire Software, Inc. Method and apparatus for managing a computer-based address book for incident-related work
US7840526B1 (en) 2005-12-29 2010-11-23 United Services Automobile Association (Usaa) Workflow administration tools and user interfaces
WO2007081958A2 (en) 2006-01-10 2007-07-19 Christopher Armstrong Indicating and searching recent content publication activity by a user
US8150857B2 (en) * 2006-01-20 2012-04-03 Glenbrook Associates, Inc. System and method for context-rich database optimized for processing of concepts
JP2007193685A (en) * 2006-01-20 2007-08-02 Fujitsu Ltd Program for displaying personal connection information, recording medium with the program recorded thereon, device for displaying personal connection information, and method for displaying personal connection information
US7991797B2 (en) 2006-02-17 2011-08-02 Google Inc. ID persistence through normalization
US8260785B2 (en) 2006-02-17 2012-09-04 Google Inc. Automatic object reference identification and linking in a browseable fact repository
US8700568B2 (en) 2006-02-17 2014-04-15 Google Inc. Entity normalization via name normalization
US8001121B2 (en) 2006-02-27 2011-08-16 Microsoft Corporation Training a ranking function using propagated document relevance
US8019763B2 (en) * 2006-02-27 2011-09-13 Microsoft Corporation Propagating relevance from labeled documents to unlabeled documents
US8448242B2 (en) 2006-02-28 2013-05-21 The Trustees Of Columbia University In The City Of New York Systems, methods, and media for outputting data based upon anomaly detection
US20090119173A1 (en) * 2006-02-28 2009-05-07 Buzzlogic, Inc. System and Method For Advertisement Targeting of Conversations in Social Media
JP2009528639A (en) * 2006-02-28 2009-08-06 バズロジック, インコーポレイテッド Social analysis system and method for analyzing conversations in social media
US8112324B2 (en) 2006-03-03 2012-02-07 Amazon Technologies, Inc. Collaborative structured tagging for item encyclopedias
US8402022B2 (en) * 2006-03-03 2013-03-19 Martin R. Frank Convergence of terms within a collaborative tagging environment
US8744885B2 (en) * 2006-03-28 2014-06-03 Snowflake Itm, Inc. Task based organizational management system and method
US8190625B1 (en) 2006-03-29 2012-05-29 A9.Com, Inc. Method and system for robust hyperlinking
US20090030754A1 (en) * 2006-04-25 2009-01-29 Mcnamar Richard Timothy Methods, systems and computer software utilizing xbrl to identify, capture, array, manage, transmit and display documents and data in litigation preparation, trial and regulatory filings and regulatory compliance
WO2007127695A2 (en) * 2006-04-25 2007-11-08 Elmo Weber Frank Prefernce based automatic media summarization
US8676703B2 (en) 2006-04-27 2014-03-18 Guidewire Software, Inc. Insurance policy revisioning method and apparatus
WO2007130864A2 (en) * 2006-05-02 2007-11-15 Lit Group, Inc. Method and system for retrieving network documents
US20070282770A1 (en) * 2006-05-15 2007-12-06 Nortel Networks Limited System and methods for filtering electronic communications
US7958557B2 (en) * 2006-05-17 2011-06-07 Computer Associates Think, Inc. Determining a source of malicious computer element in a computer network
US9507778B2 (en) 2006-05-19 2016-11-29 Yahoo! Inc. Summarization of media object collections
US7526486B2 (en) 2006-05-22 2009-04-28 Initiate Systems, Inc. Method and system for indexing information about entities with respect to hierarchies
US7971179B2 (en) * 2006-05-23 2011-06-28 Microsoft Corporation Providing artifact lifespan and relationship representation
US7493293B2 (en) * 2006-05-31 2009-02-17 International Business Machines Corporation System and method for extracting entities of interest from text using n-gram models
EP2030134A4 (en) 2006-06-02 2010-06-23 Initiate Systems Inc A system and method for automatic weight generation for probabilistic matching
US8869037B2 (en) * 2006-06-22 2014-10-21 Linkedin Corporation Event visualization
US7831928B1 (en) 2006-06-22 2010-11-09 Digg, Inc. Content visualization
US8140267B2 (en) * 2006-06-30 2012-03-20 International Business Machines Corporation System and method for identifying similar molecules
WO2008004023A2 (en) * 2006-06-30 2008-01-10 Nokia Corporation A listing for received messages
US20080005249A1 (en) * 2006-07-03 2008-01-03 Hart Matt E Method and apparatus for determining the importance of email messages
US20080040331A1 (en) * 2006-07-21 2008-02-14 Iior, Llc Computer-implemented social searching
US7971135B2 (en) * 2006-07-28 2011-06-28 Adobe Systems Incorporated Method and system for automatic data aggregation
US7962937B2 (en) * 2006-08-01 2011-06-14 Microsoft Corporation Media content catalog service
US7529740B2 (en) * 2006-08-14 2009-05-05 International Business Machines Corporation Method and apparatus for organizing data sources
US7933843B1 (en) * 2006-08-26 2011-04-26 CommEq Ltd. Media-based computational influencer network analysis
US7627432B2 (en) 2006-09-01 2009-12-01 Spss Inc. System and method for computing analytics on structured data
JP2010503072A (en) * 2006-09-02 2010-01-28 ティーティービー テクノロジーズ,エルエルシー Computer-based meeting preparation method and execution system
US9202184B2 (en) 2006-09-07 2015-12-01 International Business Machines Corporation Optimizing the selection, verification, and deployment of expert resources in a time of chaos
US7698268B1 (en) 2006-09-15 2010-04-13 Initiate Systems, Inc. Method and system for filtering false positives
US8356009B2 (en) 2006-09-15 2013-01-15 International Business Machines Corporation Implementation defined segments for relational database systems
US7685093B1 (en) * 2006-09-15 2010-03-23 Initiate Systems, Inc. Method and system for comparing attributes such as business names
US7627550B1 (en) * 2006-09-15 2009-12-01 Initiate Systems, Inc. Method and system for comparing attributes such as personal names
US8789172B2 (en) 2006-09-18 2014-07-22 The Trustees Of Columbia University In The City Of New York Methods, media, and systems for detecting attack on a digital processing device
US7945122B2 (en) * 2006-09-27 2011-05-17 International Business Machines Corporation Method, system, and program product for processing an electronic document
US20080077418A1 (en) * 2006-09-27 2008-03-27 Andrew Coleman Method, system, and program product for analyzing how a procedure will be applied to an electronic document
US7930197B2 (en) * 2006-09-28 2011-04-19 Microsoft Corporation Personal data mining
US20080082416A1 (en) * 2006-09-29 2008-04-03 Kotas Paul A Community-Based Selection of Advertisements for a Concept-Centric Electronic Marketplace
US7623129B2 (en) * 2006-09-29 2009-11-24 Business Objects Software Ltd. Apparatus and method for visualizing the relationship between a plurality of sets
US20090287503A1 (en) * 2008-05-16 2009-11-19 International Business Machines Corporation Analysis of individual and group healthcare data in order to provide real time healthcare recommendations
WO2008043082A2 (en) 2006-10-05 2008-04-10 Splunk Inc. Time series search engine
US7802305B1 (en) * 2006-10-10 2010-09-21 Adobe Systems Inc. Methods and apparatus for automated redaction of content in a document
US7571145B2 (en) * 2006-10-18 2009-08-04 Yahoo! Inc. Social knowledge system content quality
US8122026B1 (en) 2006-10-20 2012-02-21 Google Inc. Finding and disambiguating references to entities on web pages
US7707130B2 (en) * 2006-10-23 2010-04-27 Health Care Information Services Llc Real-time predictive computer program, model, and method
US7805406B2 (en) 2006-10-27 2010-09-28 Xystar Technologies, Inc. Cross-population of virtual communities
US8594702B2 (en) 2006-11-06 2013-11-26 Yahoo! Inc. Context server for associating information based on context
US7953758B2 (en) * 2006-11-10 2011-05-31 Ricoh Company, Ltd. Workflow management method and workflow management apparatus
US7765176B2 (en) * 2006-11-13 2010-07-27 Accenture Global Services Gmbh Knowledge discovery system with user interactive analysis view for analyzing and generating relationships
US20080114750A1 (en) * 2006-11-14 2008-05-15 Microsoft Corporation Retrieval and ranking of items utilizing similarity
US8484108B2 (en) * 2006-11-17 2013-07-09 International Business Machines Corporation Tracking entities during identity resolution
US9208174B1 (en) * 2006-11-20 2015-12-08 Disney Enterprises, Inc. Non-language-based object search
US9110903B2 (en) 2006-11-22 2015-08-18 Yahoo! Inc. Method, system and apparatus for using user profile electronic device data in media delivery
US8402356B2 (en) 2006-11-22 2013-03-19 Yahoo! Inc. Methods, systems and apparatus for delivery of media
US7814405B2 (en) * 2006-11-28 2010-10-12 International Business Machines Corporation Method and system for automatic generation and updating of tags based on type of communication and content state in an activities oriented collaboration tool
EP2102750B1 (en) 2006-12-04 2014-11-05 Commvault Systems, Inc. System and method for creating copies of data, such as archive copies
KR100839880B1 (en) * 2006-12-13 2008-06-19 (주)이엑스쓰리디 Method for indicating the amount of communication for each user using the icon and communication terminal using the same
US20080148366A1 (en) * 2006-12-16 2008-06-19 Mark Frederick Wahl System and method for authentication in a social network service
US7840537B2 (en) 2006-12-22 2010-11-23 Commvault Systems, Inc. System and method for storing redundant information
US8769099B2 (en) 2006-12-28 2014-07-01 Yahoo! Inc. Methods and systems for pre-caching information on a mobile computing device
US7844604B2 (en) * 2006-12-28 2010-11-30 Yahoo! Inc. Automatically generating user-customized notifications of changes in a social network system
US7788247B2 (en) * 2007-01-12 2010-08-31 Microsoft Corporation Characteristic tagging
US8131536B2 (en) * 2007-01-12 2012-03-06 Raytheon Bbn Technologies Corp. Extraction-empowered machine translation
US8190661B2 (en) * 2007-01-24 2012-05-29 Microsoft Corporation Using virtual repository items for customized display
US8819021B1 (en) * 2007-01-26 2014-08-26 Ernst & Young U.S. Llp Efficient and phased method of processing large collections of electronic data known as “best match first”™ for electronic discovery and other related applications
US8484083B2 (en) * 2007-02-01 2013-07-09 Sri International Method and apparatus for targeting messages to users in a social network
US8359339B2 (en) 2007-02-05 2013-01-22 International Business Machines Corporation Graphical user interface for configuration of an algorithm for the matching of data records
US8166389B2 (en) * 2007-02-09 2012-04-24 General Electric Company Methods and apparatus for including customized CDA attributes for searching and retrieval
US8145673B2 (en) * 2007-02-16 2012-03-27 Microsoft Corporation Easily queriable software repositories
US20080201330A1 (en) * 2007-02-16 2008-08-21 Microsoft Corporation Software repositories
US8006094B2 (en) * 2007-02-21 2011-08-23 Ricoh Co., Ltd. Trustworthy timestamps and certifiable clocks using logs linked by cryptographic hashes
WO2008102373A2 (en) * 2007-02-23 2008-08-28 Ravikiran Sureshbabu Pasupulet A method and system for close range communication using concetric arcs model
US7917478B2 (en) * 2007-02-26 2011-03-29 International Business Machines Corporation System and method for quality control in healthcare settings to continuously monitor outcomes and undesirable outcomes such as infections, re-operations, excess mortality, and readmissions
US7853611B2 (en) 2007-02-26 2010-12-14 International Business Machines Corporation System and method for deriving a hierarchical event based database having action triggers based on inferred probabilities
US7970759B2 (en) 2007-02-26 2011-06-28 International Business Machines Corporation System and method for deriving a hierarchical event based database optimized for pharmaceutical analysis
US7895515B1 (en) * 2007-02-28 2011-02-22 Trend Micro Inc Detecting indicators of misleading content in markup language coded documents using the formatting of the document
US8347202B1 (en) 2007-03-14 2013-01-01 Google Inc. Determining geographic locations for place names in a fact repository
US8515926B2 (en) 2007-03-22 2013-08-20 International Business Machines Corporation Processing related data from information sources
US8996483B2 (en) 2007-03-28 2015-03-31 Ricoh Co., Ltd. Method and apparatus for recording associations with logs
US8370355B2 (en) 2007-03-29 2013-02-05 International Business Machines Corporation Managing entities within a database
WO2008121170A1 (en) * 2007-03-29 2008-10-09 Initiate Systems, Inc. Method and system for parsing languages
WO2008121824A1 (en) 2007-03-29 2008-10-09 Initiate Systems, Inc. Method and system for data exchange among data sources
US8423514B2 (en) 2007-03-29 2013-04-16 International Business Machines Corporation Service provisioning
US8768895B2 (en) * 2007-04-11 2014-07-01 Emc Corporation Subsegmenting for efficient storage, resemblance determination, and transmission
US8166012B2 (en) * 2007-04-11 2012-04-24 Emc Corporation Cluster storage using subsegmenting
US8290967B2 (en) 2007-04-19 2012-10-16 Barnesandnoble.Com Llc Indexing and search query processing
US7917493B2 (en) 2007-04-19 2011-03-29 Retrevo Inc. Indexing and searching product identifiers
US8504553B2 (en) * 2007-04-19 2013-08-06 Barnesandnoble.Com Llc Unstructured and semistructured document processing and searching
US9183290B2 (en) 2007-05-02 2015-11-10 Thomson Reuters Global Resources Method and system for disambiguating informational objects
US7953724B2 (en) * 2007-05-02 2011-05-31 Thomson Reuters (Scientific) Inc. Method and system for disambiguating informational objects
US8239350B1 (en) 2007-05-08 2012-08-07 Google Inc. Date ambiguity resolution
US8838478B2 (en) * 2007-05-11 2014-09-16 Sony Corporation Targeted advertising in mobile devices
US7831625B2 (en) 2007-05-16 2010-11-09 Microsoft Corporation Data model for a common language
US8204866B2 (en) * 2007-05-18 2012-06-19 Microsoft Corporation Leveraging constraints for deduplication
US8150868B2 (en) * 2007-06-11 2012-04-03 Microsoft Corporation Using joint communication and search data
US9542394B2 (en) * 2007-06-14 2017-01-10 Excalibur Ip, Llc Method and system for media-based event generation
US8140493B2 (en) * 2007-06-15 2012-03-20 Oracle International Corporation Changing metadata without invalidating cursors
US8356014B2 (en) * 2007-06-15 2013-01-15 Oracle International Corporation Referring to partitions with for (values) clause
US8209294B2 (en) * 2007-06-15 2012-06-26 Oracle International Corporation Dynamic creation of database partitions
US8135688B2 (en) * 2007-06-15 2012-03-13 Oracle International Corporation Partition/table allocation on demand
US20080313002A1 (en) * 2007-06-18 2008-12-18 Ppg Industries Ohio, Inc. Method, system, and apparatus for operating a registry
US7966291B1 (en) 2007-06-26 2011-06-21 Google Inc. Fact-based object merging
US20090005085A1 (en) * 2007-06-28 2009-01-01 Motorola, Inc. Selective retry system for a code division multiple access stack for call setup failures
CA2601154C (en) * 2007-07-07 2016-09-13 Mathieu Audet Method and system for distinguising elements of information along a plurality of axes on a basis of a commonality
US7970766B1 (en) 2007-07-23 2011-06-28 Google Inc. Entity type assignment
US8738643B1 (en) 2007-08-02 2014-05-27 Google Inc. Learning synonymous object names from anchor texts
US20090043867A1 (en) * 2007-08-06 2009-02-12 Apple Inc. Synching data
US8719287B2 (en) * 2007-08-31 2014-05-06 Business Objects Software Limited Apparatus and method for dynamically selecting componentized executable instructions at run time
US20090070346A1 (en) * 2007-09-06 2009-03-12 Antonio Savona Systems and methods for clustering information
US8782203B2 (en) * 2007-09-14 2014-07-15 International Business Machines Corporation Propagating accelerated events in a network management system
US7877385B2 (en) 2007-09-21 2011-01-25 Microsoft Corporation Information retrieval using query-document pair information
US8713434B2 (en) 2007-09-28 2014-04-29 International Business Machines Corporation Indexing, relating and managing information about entities
US8417702B2 (en) 2007-09-28 2013-04-09 International Business Machines Corporation Associating data records in multiple languages
EP2193415A4 (en) 2007-09-28 2013-08-28 Ibm Method and system for analysis of a system for matching data records
US7865516B2 (en) * 2007-10-04 2011-01-04 International Business Machines Corporation Associative temporal search of electronic files
US8239342B2 (en) * 2007-10-05 2012-08-07 International Business Machines Corporation Method and apparatus for providing on-demand ontology creation and extension
US7890539B2 (en) * 2007-10-10 2011-02-15 Raytheon Bbn Technologies Corp. Semantic matching using predicate-argument structure
US7831571B2 (en) * 2007-10-25 2010-11-09 International Business Machines Corporation Anonymizing selected content in a document
US9244929B2 (en) * 2007-10-31 2016-01-26 Echostar Technologies L.L.C. Automated indexing of electronic files and file folders
US8812435B1 (en) 2007-11-16 2014-08-19 Google Inc. Learning objects and facts from documents
US8180807B2 (en) 2007-11-27 2012-05-15 At&T Intellectual Property I, L.P. System and method of determining relationship information
US8412516B2 (en) 2007-11-27 2013-04-02 Accenture Global Services Limited Document analysis, commenting, and reporting system
US8266519B2 (en) * 2007-11-27 2012-09-11 Accenture Global Services Limited Document analysis, commenting, and reporting system
US8069142B2 (en) 2007-12-06 2011-11-29 Yahoo! Inc. System and method for synchronizing data on a network
US8671154B2 (en) 2007-12-10 2014-03-11 Yahoo! Inc. System and method for contextual addressing of communications on a network
US8307029B2 (en) 2007-12-10 2012-11-06 Yahoo! Inc. System and method for conditional delivery of messages
US20110225158A1 (en) * 2007-12-12 2011-09-15 21Ct, Inc. Method and System for Abstracting Information for Use in Link Analysis
US8166168B2 (en) 2007-12-17 2012-04-24 Yahoo! Inc. System and method for disambiguating non-unique identifiers using information obtained from disparate communication channels
US8260832B2 (en) * 2007-12-18 2012-09-04 Oracle International Corporation Managing large collection of interlinked XML documents
US20090164443A1 (en) * 2007-12-19 2009-06-25 Ramos Jo A Database performance mining
US20090183104A1 (en) * 2008-01-03 2009-07-16 Dotson Gerald A Multi-mode viewer control for viewing and managing groups of statistics
US9706345B2 (en) 2008-01-04 2017-07-11 Excalibur Ip, Llc Interest mapping system
US9626685B2 (en) 2008-01-04 2017-04-18 Excalibur Ip, Llc Systems and methods of mapping attention
US8762285B2 (en) * 2008-01-06 2014-06-24 Yahoo! Inc. System and method for message clustering
US8719121B1 (en) * 2008-01-15 2014-05-06 David Leason System and method for automated construction of time records based on electronic messages
US8775441B2 (en) 2008-01-16 2014-07-08 Ab Initio Technology Llc Managing an archive for approximate string matching
US20090182618A1 (en) 2008-01-16 2009-07-16 Yahoo! Inc. System and Method for Word-of-Mouth Advertising
US8103660B2 (en) * 2008-01-22 2012-01-24 International Business Machines Corporation Computer method and system for contextual management and awareness of persistent queries and results
US7877367B2 (en) * 2008-01-22 2011-01-25 International Business Machines Corporation Computer method and apparatus for graphical inquiry specification with progressive summary
WO2009094672A2 (en) * 2008-01-25 2009-07-30 Trustees Of Columbia University In The City Of New York Belief propagation for generalized matching
US20090204636A1 (en) * 2008-02-11 2009-08-13 Microsoft Corporation Multimodal object de-duplication
WO2009102501A2 (en) 2008-02-15 2009-08-20 Your Net Works, Inc. System, method, and computer program product for providing an association between a first participant and a second participant in a social network
US7885973B2 (en) * 2008-02-22 2011-02-08 International Business Machines Corporation Computer method and apparatus for parameterized semantic inquiry templates with type annotations
US20090222717A1 (en) * 2008-02-28 2009-09-03 Theodor Holm Nelson System for exploring connections between data pages
US20120137202A1 (en) * 2008-02-28 2012-05-31 Theodor Holm Nelson System for exploring connections between data pages
US8538811B2 (en) 2008-03-03 2013-09-17 Yahoo! Inc. Method and apparatus for social network marketing with advocate referral
US8554623B2 (en) 2008-03-03 2013-10-08 Yahoo! Inc. Method and apparatus for social network marketing with consumer referral
US8560390B2 (en) 2008-03-03 2013-10-15 Yahoo! Inc. Method and apparatus for social network marketing with brand referral
US8812853B1 (en) * 2008-03-18 2014-08-19 Avaya Inc. Traceability for threaded communications
US8745133B2 (en) 2008-03-28 2014-06-03 Yahoo! Inc. System and method for optimizing the storage of data
US8589486B2 (en) 2008-03-28 2013-11-19 Yahoo! Inc. System and method for addressing communications
US8271506B2 (en) 2008-03-31 2012-09-18 Yahoo! Inc. System and method for modeling relationships between entities
KR101475339B1 (en) * 2008-04-14 2014-12-23 삼성전자주식회사 Communication terminal and method for unified natural language interface thereof
US7506333B1 (en) 2008-04-23 2009-03-17 International Business Machines Corporation Method, system, and computer program product for managing foreign holidays for a computer application based on email
US8266168B2 (en) 2008-04-24 2012-09-11 Lexisnexis Risk & Information Analytics Group Inc. Database systems and methods for linking records and entity representations with sufficiently high confidence
US8121962B2 (en) * 2008-04-25 2012-02-21 Fair Isaac Corporation Automated entity identification for efficient profiling in an event probability prediction system
US20100049665A1 (en) * 2008-04-25 2010-02-25 Christopher Allan Ralph Basel adaptive segmentation heuristics
US8095963B2 (en) 2008-04-30 2012-01-10 Microsoft Corporation Securing resource stores with claims-based security
US8266114B2 (en) * 2008-09-22 2012-09-11 Riverbed Technology, Inc. Log structured content addressable deduplicating storage
US7996371B1 (en) * 2008-06-10 2011-08-09 Netapp, Inc. Combining context-aware and context-independent data deduplication for optimal space savings
US8484162B2 (en) 2008-06-24 2013-07-09 Commvault Systems, Inc. De-duplication systems and methods for application-specific data
US8219524B2 (en) 2008-06-24 2012-07-10 Commvault Systems, Inc. Application-aware and remote single instance data management
US9098495B2 (en) * 2008-06-24 2015-08-04 Commvault Systems, Inc. Application-aware and remote single instance data management
US8813107B2 (en) 2008-06-27 2014-08-19 Yahoo! Inc. System and method for location based media delivery
US8452855B2 (en) 2008-06-27 2013-05-28 Yahoo! Inc. System and method for presentation of media related to a context
US8706406B2 (en) 2008-06-27 2014-04-22 Yahoo! Inc. System and method for determination and display of personalized distance
US8166263B2 (en) 2008-07-03 2012-04-24 Commvault Systems, Inc. Continuous data protection over intermittent connections, such as continuous data backup for laptops or wireless devices
US8583668B2 (en) 2008-07-30 2013-11-12 Yahoo! Inc. System and method for context enhanced mapping
US10230803B2 (en) 2008-07-30 2019-03-12 Excalibur Ip, Llc System and method for improved mapping and routing
US7913114B2 (en) * 2008-07-31 2011-03-22 Quantum Corporation Repair of a corrupt data segment used by a de-duplication engine
US20110087670A1 (en) * 2008-08-05 2011-04-14 Gregory Jorstad Systems and methods for concept mapping
US8788466B2 (en) * 2008-08-05 2014-07-22 International Business Machines Corporation Efficient transfer of deduplicated data
CN101605141A (en) * 2008-08-05 2009-12-16 天津大学 Web service relational network system based on semantics
US8245282B1 (en) 2008-08-19 2012-08-14 Eharmony, Inc. Creating tests to identify fraudulent users
US8589333B2 (en) * 2008-08-19 2013-11-19 Northrop Grumman Systems Corporation System and method for information sharing across security boundaries
US8386506B2 (en) 2008-08-21 2013-02-26 Yahoo! Inc. System and method for context enhanced messaging
US8290915B2 (en) * 2008-09-15 2012-10-16 International Business Machines Corporation Retrieval and recovery of data chunks from alternate data stores in a deduplicating system
US8751559B2 (en) * 2008-09-16 2014-06-10 Microsoft Corporation Balanced routing of questions to experts
US8281027B2 (en) 2008-09-19 2012-10-02 Yahoo! Inc. System and method for distributing media related to a location
US9015181B2 (en) 2008-09-26 2015-04-21 Commvault Systems, Inc. Systems and methods for managing single instancing data
WO2010036754A1 (en) * 2008-09-26 2010-04-01 Commvault Systems, Inc. Systems and methods for managing single instancing data
US8108778B2 (en) 2008-09-30 2012-01-31 Yahoo! Inc. System and method for context enhanced mapping within a user interface
US9600484B2 (en) 2008-09-30 2017-03-21 Excalibur Ip, Llc System and method for reporting and analysis of media consumption data
US8495032B2 (en) 2008-10-01 2013-07-23 International Business Machines Corporation Policy based sharing of redundant data across storage pools in a deduplicating system
AU2009298151B2 (en) * 2008-10-03 2015-07-16 Benefitfocus.Com, Inc. Systems and methods for automatic creation of agent-based systems
US10481878B2 (en) * 2008-10-09 2019-11-19 Objectstore, Inc. User interface apparatus and methods
US8560298B2 (en) * 2008-10-21 2013-10-15 Microsoft Corporation Named entity transliteration using comparable CORPRA
JP5535230B2 (en) 2008-10-23 2014-07-02 アビニシオ テクノロジー エルエルシー Fuzzy data manipulation
US20100106551A1 (en) * 2008-10-24 2010-04-29 Oskari Koskimies Method, system, and apparatus for process management
US9805123B2 (en) 2008-11-18 2017-10-31 Excalibur Ip, Llc System and method for data privacy in URL based context queries
US8032508B2 (en) 2008-11-18 2011-10-04 Yahoo! Inc. System and method for URL based query for retrieving data related to a context
US8024317B2 (en) 2008-11-18 2011-09-20 Yahoo! Inc. System and method for deriving income from URL based context queries
US8060492B2 (en) 2008-11-18 2011-11-15 Yahoo! Inc. System and method for generation of URL based context queries
US8412677B2 (en) 2008-11-26 2013-04-02 Commvault Systems, Inc. Systems and methods for byte-level or quasi byte-level single instancing
US9224172B2 (en) 2008-12-02 2015-12-29 Yahoo! Inc. Customizable content for distribution in social networks
US8055675B2 (en) 2008-12-05 2011-11-08 Yahoo! Inc. System and method for context based query augmentation
EP2377080A4 (en) 2008-12-12 2014-01-08 Univ Columbia Machine optimization devices, methods, and systems
US8166016B2 (en) 2008-12-19 2012-04-24 Yahoo! Inc. System and method for automated service recommendations
US8543580B2 (en) * 2008-12-23 2013-09-24 Microsoft Corporation Mining translations of web queries from web click-through data
US7937386B2 (en) * 2008-12-30 2011-05-03 Complyon Inc. System, method, and apparatus for information extraction of textual documents
US20100169359A1 (en) * 2008-12-30 2010-07-01 Barrett Leslie A System, Method, and Apparatus for Information Extraction of Textual Documents
US8266135B2 (en) * 2009-01-05 2012-09-11 International Business Machines Corporation Indexing for regular expressions in text-centric applications
US8161255B2 (en) * 2009-01-06 2012-04-17 International Business Machines Corporation Optimized simultaneous storing of data into deduplicated and non-deduplicated storage pools
US8068012B2 (en) * 2009-01-08 2011-11-29 Intelleflex Corporation RFID device and system for setting a level on an electronic device
US8332205B2 (en) * 2009-01-09 2012-12-11 Microsoft Corporation Mining transliterations for out-of-vocabulary query terms
US8462161B1 (en) 2009-01-20 2013-06-11 Kount Inc. System and method for fast component enumeration in graphs with implicit edges
US8266150B1 (en) * 2009-02-02 2012-09-11 Trend Micro Incorporated Scalable document signature search engine
US9195739B2 (en) * 2009-02-20 2015-11-24 Microsoft Technology Licensing, Llc Identifying a discussion topic based on user interest information
US20100223341A1 (en) * 2009-02-27 2010-09-02 Microsoft Corporation Electronic messaging tailored to user interest
US20100241990A1 (en) * 2009-03-23 2010-09-23 Microsoft Corporation Re-usable declarative workflow templates
US8150967B2 (en) 2009-03-24 2012-04-03 Yahoo! Inc. System and method for verified presence tracking
US8689172B2 (en) * 2009-03-24 2014-04-01 International Business Machines Corporation Mining sequential patterns in weighted directed graphs
US8401996B2 (en) 2009-03-30 2013-03-19 Commvault Systems, Inc. Storing a variable number of instances of data objects
US20100250399A1 (en) * 2009-03-31 2010-09-30 Ebay, Inc. Methods and systems for online collections
WO2010135586A1 (en) 2009-05-20 2010-11-25 The Trustees Of Columbia University In The City Of New York Systems devices and methods for estimating
US8578120B2 (en) 2009-05-22 2013-11-05 Commvault Systems, Inc. Block-level single instancing
US8972424B2 (en) 2009-05-29 2015-03-03 Peter Snell Subjective linguistic analysis
CN102460374A (en) * 2009-05-29 2012-05-16 Peter S. Snell System and related method for digital attitude mapping
US8095571B2 (en) 2009-06-22 2012-01-10 Microsoft Corporation Partitioning modeling platform data
US8930306B1 (en) 2009-07-08 2015-01-06 Commvault Systems, Inc. Synchronized data deduplication
CN101963965B (en) 2009-07-23 2013-03-20 阿里巴巴集团控股有限公司 Document indexing method, data query method and server based on search engine
TWI396990B (en) * 2009-08-03 2013-05-21 Univ Nat Taiwan Science Tech Citation record extraction system and method, and program product
US10223701B2 (en) 2009-08-06 2019-03-05 Excalibur Ip, Llc System and method for verified monetization of commercial campaigns
US8914342B2 (en) 2009-08-12 2014-12-16 Yahoo! Inc. Personal data platform
US8364611B2 (en) 2009-08-13 2013-01-29 Yahoo! Inc. System and method for precaching information on a mobile device
US20110055295A1 (en) * 2009-09-01 2011-03-03 International Business Machines Corporation Systems and methods for context aware file searching
US8694505B2 (en) 2009-09-04 2014-04-08 Microsoft Corporation Table of contents for search query refinement
US11080790B2 (en) 2009-09-24 2021-08-03 Guidewire Software, Inc. Method and apparatus for managing revisions and tracking of insurance policy elements
WO2011041345A1 (en) * 2009-10-02 2011-04-07 Georgia Tech Research Corporation Identification disambiguation in databases
US8674993B1 (en) 2009-10-14 2014-03-18 John Fleming Graph database system and method for facilitating financial and corporate relationship analysis
US8301512B2 (en) 2009-10-23 2012-10-30 Ebay Inc. Product identification using multiple services
US8224847B2 (en) 2009-10-29 2012-07-17 Microsoft Corporation Relevant individual searching using managed property and ranking features
US11023675B1 (en) 2009-11-03 2021-06-01 Alphasense OY User interface for use with a search engine for searching financial related documents
US20110106589A1 (en) * 2009-11-03 2011-05-05 James Blomberg Data visualization platform for social and traditional media metrics analysis
US20120137367A1 (en) 2009-11-06 2012-05-31 Cataphora, Inc. Continuous anomaly detection based on behavior modeling and heterogeneous information analysis
US11113299B2 (en) 2009-12-01 2021-09-07 Apple Inc. System and method for metadata transfer among search entities
US11122009B2 (en) * 2009-12-01 2021-09-14 Apple Inc. Systems and methods for identifying geographic locations of social media content collected over social networks
US9411859B2 (en) 2009-12-14 2016-08-09 Lexisnexis Risk Solutions Fl Inc External linking based on hierarchical level weightings
CN102117436A (en) * 2009-12-30 2011-07-06 鸿富锦精密工业(深圳)有限公司 System and method for analyzing patient electronic receipt file
US20120254333A1 (en) * 2010-01-07 2012-10-04 Rajarathnam Chandramouli Automated detection of deception in short and multilingual electronic messages
US8645377B2 (en) * 2010-01-15 2014-02-04 Microsoft Corporation Aggregating data from a work queue
US20110184983A1 (en) * 2010-01-28 2011-07-28 Her Majesty The Queen In Right Of Canada As Represented By The Minister Method and system for extracting and characterizing relationships between entities mentioned in documents
CN102893278A (en) * 2010-02-03 2013-01-23 阿科德有限公司 Electronic message systems and methods
US8306212B2 (en) 2010-02-19 2012-11-06 Avaya Inc. Time-based work assignments in automated contact distribution
US9805101B2 (en) 2010-02-26 2017-10-31 Ebay Inc. Parallel data stream processing system
US8620849B2 (en) 2010-03-10 2013-12-31 Lockheed Martin Corporation Systems and methods for facilitating open source intelligence gathering
JP4898934B2 (en) 2010-03-29 2012-03-21 株式会社Ubic Forensic system, forensic method, and forensic program
JP4868191B2 (en) 2010-03-29 2012-02-01 株式会社Ubic Forensic system, forensic method, and forensic program
NO20100464A1 (en) * 2010-03-29 2011-09-30 Companybook Method and arrangement for business matching and detection of changes for a business using mathematical models
TWI396983B (en) * 2010-04-14 2013-05-21 Inst Information Industry Named entity marking apparatus, named entity marking method, and computer program product thereof
US8762375B2 (en) * 2010-04-15 2014-06-24 Palo Alto Research Center Incorporated Method for calculating entity similarities
US8732208B2 (en) * 2010-04-19 2014-05-20 Facebook, Inc. Structured search queries based on social-graph information
US8255399B2 (en) 2010-04-28 2012-08-28 Microsoft Corporation Data classifier
US20110276744A1 (en) 2010-05-05 2011-11-10 Microsoft Corporation Flash memory cache including for use with persistent key-value store
US8935487B2 (en) 2010-05-05 2015-01-13 Microsoft Corporation Fast and low-RAM-footprint indexing for data deduplication
US9053032B2 (en) 2010-05-05 2015-06-09 Microsoft Technology Licensing, Llc Fast and low-RAM-footprint indexing for data deduplication
US8429113B2 (en) 2010-06-16 2013-04-23 Infernotions Technologies Ltd. Framework and system for identifying partners in nefarious activities
JP5581864B2 (en) * 2010-07-14 2014-09-03 ソニー株式会社 Information processing apparatus, information processing method, and program
JP4995950B2 (en) * 2010-07-28 2012-08-08 株式会社Ubic Forensic system, forensic method, and forensic program
US20120030572A1 (en) * 2010-08-02 2012-02-02 International Business Machines Corporation Network visualization system
US8589536B2 (en) 2010-08-02 2013-11-19 International Business Machines Corporation Network monitoring system
US8572760B2 (en) 2010-08-10 2013-10-29 Benefitfocus.Com, Inc. Systems and methods for secure agent information
JP2012043047A (en) * 2010-08-16 2012-03-01 Fuji Xerox Co Ltd Information processor and information processing program
US8670618B2 (en) 2010-08-18 2014-03-11 Youwho, Inc. Systems and methods for extracting pedigree and family relationship information from documents
US20120046992A1 (en) * 2010-08-23 2012-02-23 International Business Machines Corporation Enterprise-to-market network analysis for sales enablement and relationship building
JP5127895B2 (en) * 2010-08-26 2013-01-23 キヤノン株式会社 Recording apparatus and recording method
WO2012040350A1 (en) 2010-09-24 2012-03-29 International Business Machines Corporation Lexical answer type confidence estimation and application
US8577851B2 (en) 2010-09-30 2013-11-05 Commvault Systems, Inc. Content aligned block-based deduplication
US8578109B2 (en) 2010-09-30 2013-11-05 Commvault Systems, Inc. Systems and methods for retaining and using data block signatures in data protection operations
WO2012045023A2 (en) 2010-09-30 2012-04-05 Commvault Systems, Inc. Archiving data objects using secondary copies
US8577718B2 (en) 2010-11-04 2013-11-05 Dw Associates, Llc Methods and systems for identifying, quantifying, analyzing, and optimizing the level of engagement of components within a defined ecosystem or context
US9165285B2 (en) 2010-12-08 2015-10-20 Microsoft Technology Licensing, Llc Shared attachments
US8954446B2 (en) 2010-12-14 2015-02-10 Commvault Systems, Inc. Client-side repository in a networked deduplicated storage system
US9020900B2 (en) 2010-12-14 2015-04-28 Commvault Systems, Inc. Distributed deduplicated storage system
US20120158674A1 (en) * 2010-12-20 2012-06-21 Mark David Lillibridge Indexing for deduplication
US9110936B2 (en) 2010-12-28 2015-08-18 Microsoft Technology Licensing, Llc Using index partitioning and reconciliation for data deduplication
US9639543B2 (en) * 2010-12-28 2017-05-02 Microsoft Technology Licensing, Llc Adaptive index for data deduplication
US9418385B1 (en) * 2011-01-24 2016-08-16 Intuit Inc. Assembling a tax-information data structure
US8694490B2 (en) 2011-01-28 2014-04-08 Bitvore Corporation Method and apparatus for collection, display and analysis of disparate data
US8626750B2 (en) 2011-01-28 2014-01-07 Bitvore Corp. Method and apparatus for 3D display and analysis of disparate data
US8375400B2 (en) 2011-02-11 2013-02-12 Research In Motion Limited Communication device and method for coherent updating of collated message listings
US8630860B1 (en) * 2011-03-03 2014-01-14 Nuance Communications, Inc. Speaker and call characteristic sensitive open voice search
US8849768B1 (en) * 2011-03-08 2014-09-30 Symantec Corporation Systems and methods for classifying files as candidates for deduplication
US9294308B2 (en) 2011-03-10 2016-03-22 Mimecast North America Inc. Enhancing communication
US9031961B1 (en) * 2011-03-17 2015-05-12 Amazon Technologies, Inc. User device with access behavior tracking and favorite passage identifying functionality
US11308449B2 (en) 2011-04-28 2022-04-19 Microsoft Technology Licensing, Llc Storing metadata inside file to reference shared version of file
US8682989B2 (en) 2011-04-28 2014-03-25 Microsoft Corporation Making document changes by replying to electronic messages
US9137185B2 (en) 2011-04-28 2015-09-15 Microsoft Technology Licensing, Llc Uploading attachment to shared location and replacing with a link
US20120278179A1 (en) * 2011-04-28 2012-11-01 Ray Campbell Systems and methods for deducing user information from input device behavior
US8681866B1 (en) 2011-04-28 2014-03-25 Google Inc. Method and apparatus for encoding video by downsampling frame resolution
US10552799B2 (en) 2011-04-28 2020-02-04 Microsoft Technology Licensing, Llc Upload of attachment and insertion of link into electronic messages
US10185932B2 (en) 2011-05-06 2019-01-22 Microsoft Technology Licensing, Llc Setting permissions for links forwarded in electronic messages
US20120282950A1 (en) * 2011-05-06 2012-11-08 Gopogo, Llc Mobile Geolocation String Building System And Methods Thereof
US8965983B2 (en) 2011-05-06 2015-02-24 Microsoft Technology Licensing, Llc Changes to documents are automatically summarized in electronic messages
US8996359B2 (en) 2011-05-18 2015-03-31 Dw Associates, Llc Taxonomy and application of language analysis and processing
US20120304072A1 (en) * 2011-05-23 2012-11-29 Microsoft Corporation Sentiment-based content aggregation and presentation
US8840013B2 (en) * 2011-12-06 2014-09-23 autoGraph, Inc. Consumer self-profiling GUI, analysis and rapid information presentation tools
US9953273B2 (en) 2011-06-28 2018-04-24 Salesforce.Com, Inc. Systems and methods for creating a rich social media profile
US20130002676A1 (en) * 2011-06-28 2013-01-03 Salesforce.Com, Inc. Computer implemented systems and methods for visualizing organizational connections
US8952796B1 (en) 2011-06-28 2015-02-10 Dw Associates, Llc Enactive perception device
US20130014266A1 (en) * 2011-07-07 2013-01-10 Mitel Networks Corporation Collaboration privacy
US8650198B2 (en) 2011-08-15 2014-02-11 Lockheed Martin Corporation Systems and methods for facilitating the gathering of open source intelligence
US9256908B2 (en) * 2011-08-19 2016-02-09 International Business Machines Corporation Utility consumption disaggregation using low sample rate smart meters
US9117225B2 (en) 2011-09-16 2015-08-25 Visa International Service Association Apparatuses, methods and systems for transforming user infrastructure requests inputs to infrastructure design product and infrastructure allocation outputs
US8909581B2 (en) 2011-10-28 2014-12-09 Blackberry Limited Factor-graph based matching systems and methods
US8688793B2 (en) 2011-11-08 2014-04-01 Blackberry Limited System and method for insertion of addresses in electronic messages
CN104054074B (en) 2011-11-15 2019-03-08 起元科技有限公司 Data clustering based on candidate queries
US9601117B1 (en) * 2011-11-30 2017-03-21 West Corporation Method and apparatus of processing user data of a multi-speaker conference call
US9082082B2 (en) 2011-12-06 2015-07-14 The Trustees Of Columbia University In The City Of New York Network information methods devices and systems
US9269353B1 (en) 2011-12-07 2016-02-23 Manu Rehani Methods and systems for measuring semantics in communications
US20130159254A1 (en) * 2011-12-14 2013-06-20 Yahoo! Inc. System and methods for providing content via the internet
US9020807B2 (en) 2012-01-18 2015-04-28 Dw Associates, Llc Format for displaying text analytics results
US9667513B1 (en) 2012-01-24 2017-05-30 Dw Associates, Llc Real-time autonomous organization
WO2013115953A2 (en) * 2012-02-02 2013-08-08 Bitvore Corp. Method and apparatus for 3d display and analysis of disparate data
US9311623B2 (en) 2012-02-09 2016-04-12 International Business Machines Corporation System to view and manipulate artifacts at a temporal reference point
US9020890B2 (en) 2012-03-30 2015-04-28 Commvault Systems, Inc. Smart archiving and data previewing for mobile devices
US9135369B2 (en) * 2012-05-02 2015-09-15 Nvidia Corporation System, method, and computer program product for performing graph aggregation
US9009179B2 (en) 2012-05-02 2015-04-14 Nvidia Corporation System, method, and computer program product for performing graph matching
US9098598B1 (en) 2012-05-04 2015-08-04 Google Inc. Non-default location support for expandable content item publisher side files
US9177007B2 (en) * 2012-05-14 2015-11-03 Salesforce.Com, Inc. Computer implemented methods and apparatus to interact with records using a publisher of an information feed of an online social network
US9189473B2 (en) * 2012-05-18 2015-11-17 Xerox Corporation System and method for resolving entity coreference
US8856249B2 (en) 2012-05-24 2014-10-07 Yahoo! Inc. Method and system for email sequence identification
US8738628B2 (en) * 2012-05-31 2014-05-27 International Business Machines Corporation Community profiling for social media
US8533182B1 (en) * 2012-05-31 2013-09-10 David P. Charboneau Apparatuses, systems, and methods for efficient graph pattern matching and querying
US9251186B2 (en) 2012-06-13 2016-02-02 Commvault Systems, Inc. Backup using a client-side signature repository in a networked storage system
US9880771B2 (en) 2012-06-19 2018-01-30 International Business Machines Corporation Packing deduplicated data into finite-sized containers
EP2680173A3 (en) * 2012-06-29 2014-01-15 Orange Determining implicit social networking relationships and organization
US9047254B1 (en) * 2012-07-05 2015-06-02 Google Inc. Detection and validation of expansion types of expandable content items
US8751304B1 (en) 2012-07-05 2014-06-10 Google Inc. Monitoring content item expansion events across multiple content item providers
US9043699B1 (en) * 2012-07-05 2015-05-26 Google Inc. Determining expansion directions for expandable content item environments
US9146911B1 (en) 2012-07-17 2015-09-29 Google Inc. Predicting expansion directions for expandable content item environments
US8694632B1 (en) 2012-07-17 2014-04-08 Google Inc. Determining content item expansion prediction accuracy
US9405821B1 (en) 2012-08-03 2016-08-02 tinyclues SAS Systems and methods for data mining automation
WO2014028871A1 (en) 2012-08-17 2014-02-20 Twitter, Inc. Search infrastructure
WO2014031618A2 (en) 2012-08-22 2014-02-27 Bitvore Corp. Data relationships storage platform
US9288166B2 (en) * 2012-09-18 2016-03-15 International Business Machines Corporation Preserving collaboration history with relevant contextual information
US9865008B2 (en) 2012-09-20 2018-01-09 Google Llc Determining a configuration of a content item display environment
US9870554B1 (en) * 2012-10-23 2018-01-16 Google Inc. Managing documents based on a user's calendar
US10108526B2 (en) * 2012-11-27 2018-10-23 Purdue Research Foundation Bug localization using version history
US10650063B1 (en) * 2012-11-27 2020-05-12 Robert D. Fish Systems and methods for making correlations
US9529795B2 (en) * 2012-11-29 2016-12-27 Thomson Reuters Global Resources Systems and methods for natural language generation
US9633022B2 (en) 2012-12-28 2017-04-25 Commvault Systems, Inc. Backup and restoration for a deduplicated file system
EP2770446A4 (en) * 2012-12-28 2015-01-14 Huawei Tech Co Ltd Data processing method and device
US10013481B2 (en) 2013-01-02 2018-07-03 Research Now Group, Inc. Using a graph database to match entities by evaluating boolean expressions
US9390195B2 (en) * 2013-01-02 2016-07-12 Research Now Group, Inc. Using a graph database to match entities by evaluating boolean expressions
US9665591B2 (en) 2013-01-11 2017-05-30 Commvault Systems, Inc. High availability distributed deduplicated storage system
US9904721B1 (en) * 2013-01-25 2018-02-27 Gravic, Inc. Source-side merging of distributed transactions prior to replication
US9141723B2 (en) * 2013-03-14 2015-09-22 Facebook, Inc. Caching sliding window data
WO2014144745A1 (en) * 2013-03-15 2014-09-18 The Dun & Bradstreet Corporation Non-deterministic disambiguation and matching of business locale data
US9460310B2 (en) * 2013-03-15 2016-10-04 Pathar, Inc. Method and apparatus for substitution scheme for anonymizing personally identifiable information
US9195673B2 (en) * 2013-03-15 2015-11-24 International Business Machines Corporation Scalable graph modeling of metadata for deduplicated storage systems
US9285948B2 (en) * 2013-03-15 2016-03-15 Assima Switzerland Sa System and method for interface display screen manipulation
US10282378B1 (en) * 2013-04-10 2019-05-07 Christopher A. Eusebi System and method for detecting and forecasting the emergence of technologies
US9043908B1 (en) 2013-04-18 2015-05-26 Trend Micro Incorporated Detection of encryption and compression applications
US10614132B2 (en) 2013-04-30 2020-04-07 Splunk Inc. GUI-triggered processing of performance data and log data from an information technology environment
US10019496B2 (en) 2013-04-30 2018-07-10 Splunk Inc. Processing of performance data and log data from an information technology environment by using diverse data stores
US10346357B2 (en) 2013-04-30 2019-07-09 Splunk Inc. Processing of performance data and structure data from an information technology environment
US10353957B2 (en) 2013-04-30 2019-07-16 Splunk Inc. Processing of performance data and raw log data from an information technology environment
US10225136B2 (en) 2013-04-30 2019-03-05 Splunk Inc. Processing of log data and performance data obtained via an application programming interface (API)
US10318541B2 (en) 2013-04-30 2019-06-11 Splunk Inc. Correlating log data with performance measurements having a specified relationship to a threshold value
US10997191B2 (en) 2013-04-30 2021-05-04 Splunk Inc. Query-triggered processing of performance data and log data from an information technology environment
US9734195B1 (en) * 2013-05-16 2017-08-15 Veritas Technologies Llc Automated data flow tracking
US9400825B2 (en) * 2013-05-23 2016-07-26 Strategy Companion Corporation Pivot analysis method using condition group
US10642928B2 (en) * 2013-06-03 2020-05-05 International Business Machines Corporation Annotation collision detection in a question and answer system
US9710789B2 (en) 2013-06-04 2017-07-18 SuccessFactors Multi-dimension analyzer for organizational personnel
US20150012448A1 (en) * 2013-07-03 2015-01-08 Icebox, Inc. Collaborative matter management and analysis
US9411804B1 (en) * 2013-07-17 2016-08-09 Yseop Sa Techniques for automatic generation of natural language text
US10037317B1 (en) 2013-07-17 2018-07-31 Yseop Sa Techniques for automatic generation of natural language text
US9589043B2 (en) 2013-08-01 2017-03-07 Actiance, Inc. Unified context-aware content archive system
US10817613B2 (en) 2013-08-07 2020-10-27 Microsoft Technology Licensing, Llc Access and management of entity-augmented content
SE537697C2 (en) * 2013-08-08 2015-09-29 Enigio Time Ab Procedure for generating signals for time stamping of documents and procedure for time stamping of documents
US10216849B2 (en) * 2013-08-26 2019-02-26 Knewton, Inc. Personalized content recommendations
US20150062660A1 (en) * 2013-08-30 2015-03-05 Toshiba Tec Kabushiki Kaisha File management apparatus and file management method
US9213820B2 (en) * 2013-09-10 2015-12-15 Ebay Inc. Mobile authentication using a wearable device
US9836517B2 (en) * 2013-10-07 2017-12-05 Facebook, Inc. Systems and methods for mapping and routing based on clustering
US9973462B1 (en) 2013-10-21 2018-05-15 Google Llc Methods for generating message notifications
US11238056B2 (en) 2013-10-28 2022-02-01 Microsoft Technology Licensing, Llc Enhancing search results with social labels
US9276939B2 (en) 2013-12-17 2016-03-01 International Business Machines Corporation Managing user access to query results
US10324897B2 (en) 2014-01-27 2019-06-18 Commvault Systems, Inc. Techniques for serving archived electronic mail
US11645289B2 (en) * 2014-02-04 2023-05-09 Microsoft Technology Licensing, Llc Ranking enterprise graph queries
US9870432B2 (en) 2014-02-24 2018-01-16 Microsoft Technology Licensing, Llc Persisted enterprise graph queries
US9852208B2 (en) * 2014-02-25 2017-12-26 International Business Machines Corporation Discovering communities and expertise of users using semantic analysis of resource access logs
US11657060B2 (en) 2014-02-27 2023-05-23 Microsoft Technology Licensing, Llc Utilizing interactivity signals to generate relationships and promote content
US10255563B2 (en) 2014-03-03 2019-04-09 Microsoft Technology Licensing, Llc Aggregating enterprise graph content around user-generated topics
US10380072B2 (en) 2014-03-17 2019-08-13 Commvault Systems, Inc. Managing deletions from a deduplication database
US9633056B2 (en) 2014-03-17 2017-04-25 Commvault Systems, Inc. Maintaining a deduplication database
US10127075B2 (en) * 2014-04-14 2018-11-13 International Business Machines Corporation Model driven optimization of annotator execution in question answering system
EP3143519A1 (en) 2014-05-12 2017-03-22 Google, Inc. Automated reading comprehension
US9503467B2 (en) 2014-05-22 2016-11-22 Accenture Global Services Limited Network anomaly detection
US9729583B1 (en) 2016-06-10 2017-08-08 OneTrust, LLC Data processing systems and methods for performing privacy assessments and monitoring of new versions of computer code for privacy compliance
US11249858B2 (en) 2014-08-06 2022-02-15 Commvault Systems, Inc. Point-in-time backups of a production application made accessible over fibre channel and/or ISCSI as data sources to a remote application by representing the backups as pseudo-disks operating apart from the production application and its host
US9852026B2 (en) 2014-08-06 2017-12-26 Commvault Systems, Inc. Efficient application recovery in an information management system based on a pseudo-storage-device driver
US9716721B2 (en) 2014-08-29 2017-07-25 Accenture Global Services Limited Unstructured security threat information analysis
US9407645B2 (en) 2014-08-29 2016-08-02 Accenture Global Services Limited Security threat information analysis
US10061826B2 (en) 2014-09-05 2018-08-28 Microsoft Technology Licensing, Llc. Distant content discovery
US9146962B1 (en) 2014-10-09 2015-09-29 Splunk, Inc. Identifying events using informational fields
US10235638B2 (en) 2014-10-09 2019-03-19 Splunk Inc. Adaptive key performance indicator thresholds
US10474680B2 (en) 2014-10-09 2019-11-12 Splunk Inc. Automatic entity definitions
US10209956B2 (en) 2014-10-09 2019-02-19 Splunk Inc. Automatic event group actions
US9760240B2 (en) 2014-10-09 2017-09-12 Splunk Inc. Graphical user interface for static and adaptive thresholds
US9146954B1 (en) 2014-10-09 2015-09-29 Splunk, Inc. Creating entity definition from a search result set
US10536353B2 (en) 2014-10-09 2020-01-14 Splunk Inc. Control interface for dynamic substitution of service monitoring dashboard source data
US11275775B2 (en) 2014-10-09 2022-03-15 Splunk Inc. Performing search queries for key performance indicators using an optimized common information model
US11296955B1 (en) 2014-10-09 2022-04-05 Splunk Inc. Aggregate key performance indicator spanning multiple services and based on a priority value
US9864797B2 (en) 2014-10-09 2018-01-09 Splunk Inc. Defining a new search based on displayed graph lanes
US11755559B1 (en) 2014-10-09 2023-09-12 Splunk Inc. Automatic entity control in a machine data driven service monitoring system
US11087263B2 (en) 2014-10-09 2021-08-10 Splunk Inc. System monitoring with key performance indicators from shared base search of machine data
US10592093B2 (en) 2014-10-09 2020-03-17 Splunk Inc. Anomaly detection
US11671312B2 (en) 2014-10-09 2023-06-06 Splunk Inc. Service detail monitoring console
US10193775B2 (en) 2014-10-09 2019-01-29 Splunk Inc. Automatic event group action interface
US10417108B2 (en) 2015-09-18 2019-09-17 Splunk Inc. Portable control modules in a machine data driven service monitoring system
US11200130B2 (en) 2015-09-18 2021-12-14 Splunk Inc. Automatic entity control in a machine data driven service monitoring system
US11455590B2 (en) 2014-10-09 2022-09-27 Splunk Inc. Service monitoring adaptation for maintenance downtime
US10505825B1 (en) 2014-10-09 2019-12-10 Splunk Inc. Automatic creation of related event groups for IT service monitoring
US11501238B2 (en) 2014-10-09 2022-11-15 Splunk Inc. Per-entity breakdown of key performance indicators
US10417225B2 (en) 2015-09-18 2019-09-17 Splunk Inc. Entity detail monitoring console
US10305758B1 (en) 2014-10-09 2019-05-28 Splunk Inc. Service monitoring interface reflecting by-service mode
US10447555B2 (en) 2014-10-09 2019-10-15 Splunk Inc. Aggregate key performance indicator spanning multiple services
US9158811B1 (en) 2014-10-09 2015-10-13 Splunk, Inc. Incident review interface
US9130832B1 (en) 2014-10-09 2015-09-08 Splunk, Inc. Creating entity definition from a file
US9210056B1 (en) 2014-10-09 2015-12-08 Splunk Inc. Service monitoring interface
US9491059B2 (en) 2014-10-09 2016-11-08 Splunk Inc. Topology navigator for IT services
US20160105329A1 (en) 2014-10-09 2016-04-14 Splunk Inc. Defining a service-monitoring dashboard using key performance indicators derived from machine data
US9575673B2 (en) 2014-10-29 2017-02-21 Commvault Systems, Inc. Accessing a file system using tiered deduplication
US9842593B2 (en) 2014-11-14 2017-12-12 At&T Intellectual Property I, L.P. Multi-level content analysis and response
US11140115B1 (en) * 2014-12-09 2021-10-05 Google Llc Systems and methods of applying semantic features for machine learning of message categories
US10409909B2 (en) 2014-12-12 2019-09-10 Omni Ai, Inc. Lexical analyzer for a neuro-linguistic behavior recognition system
US10409910B2 (en) * 2014-12-12 2019-09-10 Omni Ai, Inc. Perceptual associative memory for a neuro-linguistic behavior recognition system
US10198155B2 (en) 2015-01-31 2019-02-05 Splunk Inc. Interface for automated service discovery in I.T. environments
US9967351B2 (en) 2015-01-31 2018-05-08 Splunk Inc. Automated service discovery in I.T. environments
US9483474B2 (en) * 2015-02-05 2016-11-01 Microsoft Technology Licensing, Llc Document retrieval/identification using topics
US9753921B1 (en) 2015-03-05 2017-09-05 Dropbox, Inc. Comment management in shared documents
US10102290B2 (en) 2015-03-12 2018-10-16 Oracle International Corporation Methods for identifying, ranking, and displaying subject matter experts on social networks
US10339106B2 (en) 2015-04-09 2019-07-02 Commvault Systems, Inc. Highly reusable deduplication database after disaster recovery
US11061946B2 (en) 2015-05-08 2021-07-13 Refinitiv Us Organization Llc Systems and methods for cross-media event detection and coreferencing
SG10201503755QA (en) * 2015-05-13 2016-12-29 Dataesp Private Ltd Searching large data space for statistically significant patterns
US10324914B2 (en) 2015-05-20 2019-06-18 Commvault Systems, Inc. Handling user queries against production and archive storage systems, such as for enterprise customers having large and/or numerous files
US20160350391A1 (en) 2015-05-26 2016-12-01 Commvault Systems, Inc. Replication using deduplicated secondary copy data
US10505884B2 (en) * 2015-06-05 2019-12-10 Microsoft Technology Licensing, Llc Entity classification and/or relationship identification
US10740349B2 (en) 2015-06-22 2020-08-11 Microsoft Technology Licensing, Llc Document storage for reuse of content within documents
US10394949B2 (en) 2015-06-22 2019-08-27 Microsoft Technology Licensing, Llc Deconstructing documents into component blocks for reuse in productivity applications
US10339183B2 (en) 2015-06-22 2019-07-02 Microsoft Technology Licensing, Llc Document storage for reuse of content within documents
US10050919B2 (en) * 2015-06-26 2018-08-14 Veritas Technologies Llc Highly parallel scalable distributed email threading algorithm
US10055498B2 (en) * 2015-07-07 2018-08-21 Oracle International Corporation Methods for assessing and scoring user proficiency in topics determined by data from social networks and other sources
US9766825B2 (en) 2015-07-22 2017-09-19 Commvault Systems, Inc. Browse and restore for block-level backups
CN111343241B (en) * 2015-07-24 2022-12-09 Advanced New Technologies Co., Ltd. Graph data updating method, device and system
US9363149B1 (en) 2015-08-01 2016-06-07 Splunk Inc. Management console for network security investigations
US9516052B1 (en) 2015-08-01 2016-12-06 Splunk Inc. Timeline displays of network security investigation events
US10254934B2 (en) 2015-08-01 2019-04-09 Splunk Inc. Network security investigation workflow logging
US10476908B2 (en) * 2015-08-10 2019-11-12 Allure Security Technology Inc. Generating highly realistic decoy email and documents
US10467528B2 (en) * 2015-08-11 2019-11-05 Oracle International Corporation Accelerated TR-L-BFGS algorithm for neural network
US9979743B2 (en) 2015-08-13 2018-05-22 Accenture Global Services Limited Computer asset vulnerabilities
US9659007B2 (en) * 2015-08-26 2017-05-23 International Business Machines Corporation Linguistic based determination of text location origin
US10275446B2 (en) * 2015-08-26 2019-04-30 International Business Machines Corporation Linguistic based determination of text location origin
US9639524B2 (en) 2015-08-26 2017-05-02 International Business Machines Corporation Linguistic based determination of text creation date
US9886582B2 (en) 2015-08-31 2018-02-06 Accenture Global Services Limited Contextualization of threat data
US10719561B2 (en) * 2015-09-16 2020-07-21 John L. Haller, Jr. System and method for analyzing popularity of one or more user defined topics among the big data
US10067964B2 (en) * 2015-09-16 2018-09-04 John L. Haller, Jr. System and method for analyzing popularity of one or more user defined topics among the big data
US20170116194A1 (en) 2015-10-23 2017-04-27 International Business Machines Corporation Ingestion planning for complex tables
US9965519B2 (en) * 2015-11-25 2018-05-08 Passport Health Communications, Inc. Document linkage and forwarding
US9959504B2 (en) * 2015-12-02 2018-05-01 International Business Machines Corporation Significance of relationships discovered in a corpus
US10061663B2 (en) 2015-12-30 2018-08-28 Commvault Systems, Inc. Rebuilding deduplication data in a distributed deduplication data storage system
US10198455B2 (en) * 2016-01-13 2019-02-05 International Business Machines Corporation Sampling-based deduplication estimation
US10042842B2 (en) 2016-02-24 2018-08-07 Utopus Insights, Inc. Theft detection via adaptive lexical similarity analysis of social media data streams
US10296368B2 (en) 2016-03-09 2019-05-21 Commvault Systems, Inc. Hypervisor-independent block-level live browse for access to backed up virtual machine (VM) data and hypervisor-free file-level recovery (block-level pseudo-mount)
US10706447B2 (en) 2016-04-01 2020-07-07 OneTrust, LLC Data processing systems and communication systems and methods for the efficient generation of privacy risk assessments
US11244367B2 (en) 2016-04-01 2022-02-08 OneTrust, LLC Data processing systems and methods for integrating privacy information management systems with data loss prevention tools or other tools for privacy design
US11004125B2 (en) 2016-04-01 2021-05-11 OneTrust, LLC Data processing systems and methods for integrating privacy information management systems with data loss prevention tools or other tools for privacy design
US20220164840A1 (en) 2016-04-01 2022-05-26 OneTrust, LLC Data processing systems and methods for integrating privacy information management systems with data loss prevention tools or other tools for privacy design
WO2017182062A1 (en) 2016-04-19 2017-10-26 Huawei Technologies Co., Ltd. Concurrent segmentation using vector processing
JP6420489B2 (en) 2016-04-19 2018-11-07 Huawei Technologies Co., Ltd. Vector processing for segmented hash calculation
US20170337293A1 (en) * 2016-05-18 2017-11-23 Sisense Ltd. System and method of rendering multi-variant graphs
US9710544B1 (en) * 2016-05-19 2017-07-18 Quid, Inc. Pivoting from a graph of semantic similarity of documents to a derivative graph of relationships between entities mentioned in the documents
WO2017197526A1 (en) 2016-05-20 2017-11-23 Roman Czeslaw Kordasiewicz Systems and methods for graphical exploration of forensic data
US10740409B2 (en) 2016-05-20 2020-08-11 Magnet Forensics Inc. Systems and methods for graphical exploration of forensic data
US10454973B2 (en) 2016-06-10 2019-10-22 OneTrust, LLC Data processing systems for data-transfer risk identification, cross-border visualization generation, and related methods
US10762236B2 (en) 2016-06-10 2020-09-01 OneTrust, LLC Data processing user interface monitoring systems and related methods
US11146566B2 (en) 2016-06-10 2021-10-12 OneTrust, LLC Data processing systems for fulfilling data subject access requests and related methods
US11416589B2 (en) 2016-06-10 2022-08-16 OneTrust, LLC Data processing and scanning systems for assessing vendor risk
US10949170B2 (en) 2016-06-10 2021-03-16 OneTrust, LLC Data processing systems for integration of consumer feedback with data subject access requests and related methods
US11341447B2 (en) 2016-06-10 2022-05-24 OneTrust, LLC Privacy management systems and methods
US10848523B2 (en) 2016-06-10 2020-11-24 OneTrust, LLC Data processing systems for data-transfer risk identification, cross-border visualization generation, and related methods
US11562097B2 (en) 2016-06-10 2023-01-24 OneTrust, LLC Data processing systems for central consent repository and related methods
US11200341B2 (en) 2016-06-10 2021-12-14 OneTrust, LLC Consent receipt management systems and related methods
US10796260B2 (en) 2016-06-10 2020-10-06 OneTrust, LLC Privacy management systems and methods
US10565236B1 (en) 2016-06-10 2020-02-18 OneTrust, LLC Data processing systems for generating and populating a data inventory
US10783256B2 (en) 2016-06-10 2020-09-22 OneTrust, LLC Data processing systems for data transfer risk identification and related methods
US10467432B2 (en) 2016-06-10 2019-11-05 OneTrust, LLC Data processing systems for use in automatically generating, populating, and submitting data subject access requests
US11481710B2 (en) 2016-06-10 2022-10-25 OneTrust, LLC Privacy management systems and methods
US10685140B2 (en) 2016-06-10 2020-06-16 OneTrust, LLC Consent receipt management systems and related methods
US11328092B2 (en) 2016-06-10 2022-05-10 OneTrust, LLC Data processing systems for processing and managing data subject access in a distributed environment
US10776518B2 (en) 2016-06-10 2020-09-15 OneTrust, LLC Consent receipt management systems and related methods
US10944725B2 (en) 2016-06-10 2021-03-09 OneTrust, LLC Data processing systems and methods for using a data model to select a target data asset in a data migration
US11416798B2 (en) 2016-06-10 2022-08-16 OneTrust, LLC Data processing systems and methods for providing training in a vendor procurement process
US10572686B2 (en) 2016-06-10 2020-02-25 OneTrust, LLC Consent receipt management systems and related methods
US11294939B2 (en) 2016-06-10 2022-04-05 OneTrust, LLC Data processing systems and methods for automatically detecting and documenting privacy-related aspects of computer software
US11228620B2 (en) 2016-06-10 2022-01-18 OneTrust, LLC Data processing systems for data-transfer risk identification, cross-border visualization generation, and related methods
US11336697B2 (en) 2016-06-10 2022-05-17 OneTrust, LLC Data processing systems for data-transfer risk identification, cross-border visualization generation, and related methods
US10776514B2 (en) 2016-06-10 2020-09-15 OneTrust, LLC Data processing systems for the identification and deletion of personal data in computer systems
US10565397B1 (en) 2016-06-10 2020-02-18 OneTrust, LLC Data processing systems for fulfilling data subject access requests and related methods
US10565161B2 (en) 2016-06-10 2020-02-18 OneTrust, LLC Data processing systems for processing data subject access requests
US10706176B2 (en) 2016-06-10 2020-07-07 OneTrust, LLC Data-processing consent refresh, re-prompt, and recapture systems and related methods
US10726158B2 (en) 2016-06-10 2020-07-28 OneTrust, LLC Consent receipt management and automated process blocking systems and related methods
US10740487B2 (en) 2016-06-10 2020-08-11 OneTrust, LLC Data processing systems and methods for populating and maintaining a centralized database of personal data
US11625502B2 (en) 2016-06-10 2023-04-11 OneTrust, LLC Data processing systems for identifying and modifying processes that are subject to data subject access requests
US11057356B2 (en) 2016-06-10 2021-07-06 OneTrust, LLC Automated data processing systems and methods for automatically processing data subject access requests using a chatbot
US11087260B2 (en) 2016-06-10 2021-08-10 OneTrust, LLC Data processing systems and methods for customizing privacy training
US11392720B2 (en) 2016-06-10 2022-07-19 OneTrust, LLC Data processing systems for verification of consent and notice processing and related methods
US10706379B2 (en) 2016-06-10 2020-07-07 OneTrust, LLC Data processing systems for automatic preparation for remediation and related methods
US10318761B2 (en) 2016-06-10 2019-06-11 OneTrust, LLC Data processing systems and methods for auditing data request compliance
US10642870B2 (en) 2016-06-10 2020-05-05 OneTrust, LLC Data processing systems and methods for automatically detecting and documenting privacy-related aspects of computer software
US10592648B2 (en) 2016-06-10 2020-03-17 OneTrust, LLC Consent receipt management systems and related methods
US10585968B2 (en) 2016-06-10 2020-03-10 OneTrust, LLC Data processing systems for fulfilling data subject access requests and related methods
US10509894B2 (en) 2016-06-10 2019-12-17 OneTrust, LLC Data processing and scanning systems for assessing vendor risk
US10496803B2 (en) 2016-06-10 2019-12-03 OneTrust, LLC Data processing systems and methods for efficiently assessing the risk of privacy campaigns
US11416109B2 (en) 2016-06-10 2022-08-16 OneTrust, LLC Automated data processing systems and methods for automatically processing data subject access requests using a chatbot
US11727141B2 (en) 2016-06-10 2023-08-15 OneTrust, LLC Data processing systems and methods for synching privacy-related user consent across multiple computing devices
US11403377B2 (en) 2016-06-10 2022-08-02 OneTrust, LLC Privacy management systems and methods
US11023842B2 (en) 2016-06-10 2021-06-01 OneTrust, LLC Data processing systems and methods for bundled privacy policies
US10510031B2 (en) 2016-06-10 2019-12-17 OneTrust, LLC Data processing systems for identifying, assessing, and remediating data processing risks using data modeling techniques
US10853501B2 (en) 2016-06-10 2020-12-01 OneTrust, LLC Data processing and scanning systems for assessing vendor risk
US10909265B2 (en) 2016-06-10 2021-02-02 OneTrust, LLC Application privacy scanning systems and related methods
US10606916B2 (en) 2016-06-10 2020-03-31 OneTrust, LLC Data processing user interface monitoring systems and related methods
US10282700B2 (en) 2016-06-10 2019-05-07 OneTrust, LLC Data processing systems for generating and populating a data inventory
US11520928B2 (en) 2016-06-10 2022-12-06 OneTrust, LLC Data processing systems for generating personal data receipts and related methods
US10846433B2 (en) 2016-06-10 2020-11-24 OneTrust, LLC Data processing consent management systems and related methods
US10997318B2 (en) 2016-06-10 2021-05-04 OneTrust, LLC Data processing systems for generating and populating a data inventory for processing data access requests
US11354434B2 (en) 2016-06-10 2022-06-07 OneTrust, LLC Data processing systems for verification of consent and notice processing and related methods
US11277448B2 (en) 2016-06-10 2022-03-15 OneTrust, LLC Data processing systems for data-transfer risk identification, cross-border visualization generation, and related methods
US11025675B2 (en) 2016-06-10 2021-06-01 OneTrust, LLC Data processing systems and methods for performing privacy assessments and monitoring of new versions of computer code for privacy compliance
US10678945B2 (en) 2016-06-10 2020-06-09 OneTrust, LLC Consent receipt management systems and related methods
US11138242B2 (en) 2016-06-10 2021-10-05 OneTrust, LLC Data processing systems and methods for automatically detecting and documenting privacy-related aspects of computer software
US11144622B2 (en) 2016-06-10 2021-10-12 OneTrust, LLC Privacy management systems and methods
US10706131B2 (en) 2016-06-10 2020-07-07 OneTrust, LLC Data processing systems and methods for efficiently assessing the risk of privacy campaigns
US11636171B2 (en) 2016-06-10 2023-04-25 OneTrust, LLC Data processing user interface monitoring systems and related methods
US10878127B2 (en) 2016-06-10 2020-12-29 OneTrust, LLC Data subject access request processing systems and related methods
US10284604B2 (en) 2016-06-10 2019-05-07 OneTrust, LLC Data processing and scanning systems for generating and populating a data inventory
US11100444B2 (en) 2016-06-10 2021-08-24 OneTrust, LLC Data processing systems and methods for providing training in a vendor procurement process
US11418492B2 (en) 2016-06-10 2022-08-16 OneTrust, LLC Data processing systems and methods for using a data model to select a target data asset in a data migration
US10509920B2 (en) * 2016-06-10 2019-12-17 OneTrust, LLC Data processing systems for processing data subject access requests
US11222139B2 (en) 2016-06-10 2022-01-11 OneTrust, LLC Data processing systems and methods for automatic discovery and assessment of mobile software development kits
US10353673B2 (en) 2016-06-10 2019-07-16 OneTrust, LLC Data processing systems for integration of consumer feedback with data subject access requests and related methods
US11188862B2 (en) 2016-06-10 2021-11-30 OneTrust, LLC Privacy management systems and methods
US11222142B2 (en) 2016-06-10 2022-01-11 OneTrust, LLC Data processing systems for validating authorization for personal data collection, storage, and processing
US10496846B1 (en) 2016-06-10 2019-12-03 OneTrust, LLC Data processing and communications systems and methods for the efficient implementation of privacy by design
US10713387B2 (en) 2016-06-10 2020-07-14 OneTrust, LLC Consent conversion optimization systems and related methods
US10169609B1 (en) 2016-06-10 2019-01-01 OneTrust, LLC Data processing systems for fulfilling data subject access requests and related methods
US11295316B2 (en) 2016-06-10 2022-04-05 OneTrust, LLC Data processing systems for identity validation for consumer rights requests and related methods
US11354435B2 (en) 2016-06-10 2022-06-07 OneTrust, LLC Data processing systems for data testing to confirm data deletion and related methods
US10949565B2 (en) 2016-06-10 2021-03-16 OneTrust, LLC Data processing systems for generating and populating a data inventory
US11416590B2 (en) 2016-06-10 2022-08-16 OneTrust, LLC Data processing and scanning systems for assessing vendor risk
US11222309B2 (en) 2016-06-10 2022-01-11 OneTrust, LLC Data processing systems for generating and populating a data inventory
US10706174B2 (en) 2016-06-10 2020-07-07 OneTrust, LLC Data processing systems for prioritizing data subject access requests for fulfillment and related methods
US10242228B2 (en) 2016-06-10 2019-03-26 OneTrust, LLC Data processing systems for measuring privacy maturity within an organization
US10769301B2 (en) 2016-06-10 2020-09-08 OneTrust, LLC Data processing systems for webform crawling to map processing activities and related methods
US11461500B2 (en) 2016-06-10 2022-10-04 OneTrust, LLC Data processing systems for cookie compliance testing with website scanning and related methods
US10282559B2 (en) 2016-06-10 2019-05-07 OneTrust, LLC Data processing systems for identifying, assessing, and remediating data processing risks using data modeling techniques
US11366909B2 (en) 2016-06-10 2022-06-21 OneTrust, LLC Data processing and scanning systems for assessing vendor risk
US10586075B2 (en) 2016-06-10 2020-03-10 OneTrust, LLC Data processing systems for orphaned data identification and deletion and related methods
US11651104B2 (en) 2016-06-10 2023-05-16 OneTrust, LLC Consent receipt management systems and related methods
US11188615B2 (en) 2016-06-10 2021-11-30 OneTrust, LLC Data processing consent capture systems and related methods
US11651106B2 (en) 2016-06-10 2023-05-16 OneTrust, LLC Data processing systems for fulfilling data subject access requests and related methods
US10776517B2 (en) 2016-06-10 2020-09-15 OneTrust, LLC Data processing systems for calculating and communicating cost of fulfilling data subject access requests and related methods
US11586700B2 (en) 2016-06-10 2023-02-21 OneTrust, LLC Data processing systems and methods for automatically blocking the use of tracking tools
US11074367B2 (en) 2016-06-10 2021-07-27 OneTrust, LLC Data processing systems for identity validation for consumer rights requests and related methods
US11134086B2 (en) 2016-06-10 2021-09-28 OneTrust, LLC Consent conversion optimization systems and related methods
US10503926B2 (en) 2016-06-10 2019-12-10 OneTrust, LLC Consent receipt management systems and related methods
US11210420B2 (en) 2016-06-10 2021-12-28 OneTrust, LLC Data subject access request processing systems and related methods
US10803200B2 (en) 2016-06-10 2020-10-13 OneTrust, LLC Data processing systems for processing and managing data subject access in a distributed environment
US10896394B2 (en) 2016-06-10 2021-01-19 OneTrust, LLC Privacy management systems and methods
US11544667B2 (en) 2016-06-10 2023-01-03 OneTrust, LLC Data processing systems for generating and populating a data inventory
US10997315B2 (en) 2016-06-10 2021-05-04 OneTrust, LLC Data processing systems for fulfilling data subject access requests and related methods
US10708305B2 (en) 2016-06-10 2020-07-07 OneTrust, LLC Automated data processing systems and methods for automatically processing requests for privacy-related information
US10614247B2 (en) 2016-06-10 2020-04-07 OneTrust, LLC Data processing systems for automated classification of personal information from documents and related methods
US11151233B2 (en) 2016-06-10 2021-10-19 OneTrust, LLC Data processing and scanning systems for assessing vendor risk
US10592692B2 (en) 2016-06-10 2020-03-17 OneTrust, LLC Data processing systems for central consent repository and related methods
US11157600B2 (en) 2016-06-10 2021-10-26 OneTrust, LLC Data processing and scanning systems for assessing vendor risk
US11301796B2 (en) 2016-06-10 2022-04-12 OneTrust, LLC Data processing systems and methods for customizing privacy training
US10885485B2 (en) 2016-06-10 2021-01-05 OneTrust, LLC Privacy management systems and methods
US11366786B2 (en) 2016-06-10 2022-06-21 OneTrust, LLC Data processing systems for processing data subject access requests
US11238390B2 (en) 2016-06-10 2022-02-01 OneTrust, LLC Privacy management systems and methods
US11038925B2 (en) 2016-06-10 2021-06-15 OneTrust, LLC Data processing systems for data-transfer risk identification, cross-border visualization generation, and related methods
US10416966B2 (en) 2016-06-10 2019-09-17 OneTrust, LLC Data processing systems for identity validation of data subject access requests and related methods
US11227247B2 (en) 2016-06-10 2022-01-18 OneTrust, LLC Data processing systems and methods for bundled privacy policies
US11343284B2 (en) 2016-06-10 2022-05-24 OneTrust, LLC Data processing systems and methods for performing privacy assessments and monitoring of new versions of computer code for privacy compliance
US10839102B2 (en) 2016-06-10 2020-11-17 OneTrust, LLC Data processing systems for identifying and modifying processes that are subject to data subject access requests
US11475136B2 (en) 2016-06-10 2022-10-18 OneTrust, LLC Data processing systems for data transfer risk identification and related methods
US10909488B2 (en) 2016-06-10 2021-02-02 OneTrust, LLC Data processing systems for assessing readiness for responding to privacy-related incidents
US11138299B2 (en) 2016-06-10 2021-10-05 OneTrust, LLC Data processing and scanning systems for assessing vendor risk
US11438386B2 (en) 2016-06-10 2022-09-06 OneTrust, LLC Data processing systems for data-transfer risk identification, cross-border visualization generation, and related methods
US11675929B2 (en) 2016-06-10 2023-06-13 OneTrust, LLC Data processing consent sharing systems and related methods
US10798133B2 (en) 2016-06-10 2020-10-06 OneTrust, LLC Data processing systems for data-transfer risk identification, cross-border visualization generation, and related methods
US10873606B2 (en) 2016-06-10 2020-12-22 OneTrust, LLC Data processing systems for data-transfer risk identification, cross-border visualization generation, and related methods
US10607028B2 (en) 2016-06-10 2020-03-31 OneTrust, LLC Data processing systems for data testing to confirm data deletion and related methods
US10445356B1 (en) * 2016-06-24 2019-10-15 Pulselight Holdings, Inc. Method and system for analyzing entities
US10692014B2 (en) * 2016-06-27 2020-06-23 Microsoft Technology Licensing, Llc Active user message diet
US10275444B2 (en) 2016-07-15 2019-04-30 At&T Intellectual Property I, L.P. Data analytics system and methods for text data
US9569729B1 (en) * 2016-07-20 2017-02-14 Chenope, Inc. Analytical system and method for assessing certain characteristics of organizations
US10776846B2 (en) * 2016-07-27 2020-09-15 Nike, Inc. Assortment optimization
CN106897346A (en) 2016-08-04 2017-06-27 Alibaba Group Holding Limited Data processing method and device
US10942960B2 (en) 2016-09-26 2021-03-09 Splunk Inc. Automatic triage model execution in machine data driven monitoring automation apparatus with visualization
US10942946B2 (en) 2016-09-26 2021-03-09 Splunk, Inc. Automatic triage model execution in machine data driven monitoring automation apparatus
US10769213B2 (en) * 2016-10-24 2020-09-08 International Business Machines Corporation Detection of document similarity
US10880254B2 (en) 2016-10-31 2020-12-29 Actiance, Inc. Techniques for supervising communications from multiple communication modalities
US20180234234A1 (en) * 2017-02-10 2018-08-16 Secured FTP Hosting, LLC d/b/a SmartFile System for describing and tracking the creation and evolution of digital files
US10740193B2 (en) 2017-02-27 2020-08-11 Commvault Systems, Inc. Hypervisor-independent reference copies of virtual machine payload data based on block-level pseudo-mount
US10417269B2 (en) * 2017-03-13 2019-09-17 Lexisnexis, A Division Of Reed Elsevier Inc. Systems and methods for verbatim-text mining
US10628496B2 (en) * 2017-03-27 2020-04-21 Dell Products, L.P. Validating and correlating content
US10095716B1 (en) * 2017-04-02 2018-10-09 Sas Institute Inc. Methods, mediums, and systems for data harmonization and data mapping in specified domains
US10664352B2 (en) 2017-06-14 2020-05-26 Commvault Systems, Inc. Live browsing of backed up data residing on cloned disks
US10013577B1 (en) 2017-06-16 2018-07-03 OneTrust, LLC Data processing systems for identifying whether cookies contain personally identifying information
US11934937B2 (en) 2017-07-10 2024-03-19 Accenture Global Solutions Limited System and method for detecting the occurrence of an event and determining a response to the event
WO2019027259A1 (en) 2017-08-01 2019-02-07 Samsung Electronics Co., Ltd. Apparatus and method for providing summarized information using an artificial intelligence model
US10972299B2 (en) * 2017-09-06 2021-04-06 Cisco Technology, Inc. Organizing and aggregating meetings into threaded representations
CA3075865A1 (en) * 2017-09-15 2019-03-21 Financial & Risk Organisation Limited Systems and methods for cross-media event detection and coreferencing
EP3682400A4 (en) * 2017-09-15 2021-06-02 Financial & Risk Organisation Limited Systems and methods for cross-media event detection and coreferencing
US11423362B2 (en) * 2017-10-26 2022-08-23 Oliver Sterczyk Method of conducting workplace electronic communication traffic analysis
CN107944946B (en) * 2017-11-03 2020-10-16 清华大学 Commodity label generation method and device
US10510346B2 (en) * 2017-11-09 2019-12-17 Microsoft Technology Licensing, Llc Systems, methods, and computer-readable storage device for generating notes for a meeting based on participant actions and machine learning
US11354301B2 (en) * 2017-11-13 2022-06-07 LendingClub Bank, National Association Multi-system operation audit log
US11556520B2 (en) 2017-11-13 2023-01-17 Lendingclub Corporation Techniques for automatically addressing anomalous behavior
US11281647B2 (en) * 2017-12-06 2022-03-22 International Business Machines Corporation Fine-grained scalable time-versioning support for large-scale property graph databases
US11645329B2 (en) 2017-12-28 2023-05-09 International Business Machines Corporation Constructing, evaluating, and improving a search string for retrieving images indicating item use
US11055345B2 (en) 2017-12-28 2021-07-06 International Business Machines Corporation Constructing, evaluating, and improving a search string for retrieving images indicating item use
US11061943B2 (en) * 2017-12-28 2021-07-13 International Business Machines Corporation Constructing, evaluating, and improving a search string for retrieving images indicating item use
US10984109B2 (en) * 2018-01-30 2021-04-20 Cisco Technology, Inc. Application component auditor
US10606954B2 (en) * 2018-02-15 2020-03-31 International Business Machines Corporation Topic kernelization for real-time conversation data
US20190272492A1 (en) * 2018-03-05 2019-09-05 Edgile, Inc. Trusted Eco-system Management System
US11023495B2 (en) * 2018-03-19 2021-06-01 Adobe Inc. Automatically generating meaningful user segments
US10803853B2 (en) * 2018-05-04 2020-10-13 Optum Services (Ireland) Limited Audio transcription sentence tokenization system and method
US11463441B2 (en) 2018-05-24 2022-10-04 People.ai, Inc. Systems and methods for managing the generation or deletion of record objects based on electronic activities and communication policies
US10565229B2 (en) 2018-05-24 2020-02-18 People.ai, Inc. Systems and methods for matching electronic activities directly to record objects of systems of record
US11924297B2 (en) 2018-05-24 2024-03-05 People.ai, Inc. Systems and methods for generating a filtered data set
US10263799B1 (en) * 2018-08-29 2019-04-16 Capital One Services, Llc Managing meeting data
US11144675B2 (en) 2018-09-07 2021-10-12 OneTrust, LLC Data processing systems and methods for automatically protecting sensitive data within privacy management systems
US10803202B2 (en) 2018-09-07 2020-10-13 OneTrust, LLC Data processing systems for orphaned data identification and deletion and related methods
US11544409B2 (en) 2018-09-07 2023-01-03 OneTrust, LLC Data processing systems and methods for automatically protecting sensitive data within privacy management systems
US10936809B2 (en) * 2018-09-11 2021-03-02 Dell Products L.P. Method of optimized parsing unstructured and garbled texts lacking whitespaces
US11062042B1 (en) 2018-09-26 2021-07-13 Splunk Inc. Authenticating data associated with a data intake and query system using a distributed ledger system
CN109460541B (en) * 2018-09-27 2023-02-21 广州大学 Vocabulary relation labeling method and device, computer equipment and storage medium
US10992612B2 (en) * 2018-11-12 2021-04-27 Salesforce.Com, Inc. Contact information extraction and identification
US11010258B2 (en) 2018-11-27 2021-05-18 Commvault Systems, Inc. Generating backup copies through interoperability between components of a data storage management system and appliances for data storage and deduplication
US11010399B1 (en) * 2018-11-28 2021-05-18 Intuit Inc. Automated data scraping
US11698727B2 (en) 2018-12-14 2023-07-11 Commvault Systems, Inc. Performing secondary copy operations based on deduplication performance
RU2706467C1 (en) * 2018-12-29 2019-11-19 Николай Евгеньевич Ляпухов Method and device for fixing, recording and storing data on time of birth and user's life events
US11615309B2 (en) 2019-02-27 2023-03-28 Oracle International Corporation Forming an artificial neural network by generating and forming of tunnels
US20200327017A1 (en) 2019-04-10 2020-10-15 Commvault Systems, Inc. Restore using deduplicated secondary copy data
US11928114B2 (en) * 2019-04-23 2024-03-12 Thoughtspot, Inc. Query generation based on a logical data model with one-to-one joins
US11463264B2 (en) 2019-05-08 2022-10-04 Commvault Systems, Inc. Use of data block signatures for monitoring in an information management system
US11507562B1 (en) * 2019-05-22 2022-11-22 Splunk Inc. Associating data from different nodes of a distributed ledger system
US11269859B1 (en) 2019-05-22 2022-03-08 Splunk Inc. Correlating different types of data of a distributed ledger system
US11170029B2 (en) 2019-05-31 2021-11-09 Lendingclub Corporation Multi-user cross-device tracking
US10902190B1 (en) * 2019-07-03 2021-01-26 Microsoft Technology Licensing Llc Populating electronic messages with quotes
US11537816B2 (en) 2019-07-16 2022-12-27 Ancestry.Com Operations Inc. Extraction of genealogy data from obituaries
US11409744B2 (en) 2019-08-01 2022-08-09 Thoughtspot, Inc. Query generation based on merger of subqueries
US11544655B2 (en) * 2019-08-06 2023-01-03 International Business Machines Corporation Team effectiveness assessment and enhancement
US11222057B2 (en) * 2019-08-07 2022-01-11 International Business Machines Corporation Methods and systems for generating descriptions utilizing extracted entity descriptors
JP7354750B2 (en) * 2019-10-10 2023-10-03 富士フイルムビジネスイノベーション株式会社 information processing system
US20210133769A1 (en) * 2019-10-30 2021-05-06 Veda Data Solutions, Inc. Efficient data processing to identify information and reformat data files, and applications thereof
US11442896B2 (en) 2019-12-04 2022-09-13 Commvault Systems, Inc. Systems and methods for optimizing restoration of deduplicated data stored in cloud-based storage resources
US11580456B2 (en) 2020-04-27 2023-02-14 Bank Of America Corporation System to correct model drift in machine learning application
US11687424B2 (en) 2020-05-28 2023-06-27 Commvault Systems, Inc. Automated media agent state management
US11797528B2 (en) 2020-07-08 2023-10-24 OneTrust, LLC Systems and methods for targeted data discovery
WO2022026564A1 (en) 2020-07-28 2022-02-03 OneTrust, LLC Systems and methods for automatically blocking the use of tracking tools
US20230289376A1 (en) 2020-08-06 2023-09-14 OneTrust, LLC Data processing systems and methods for automatically redacting unstructured data from a data subject access request
WO2022060860A1 (en) 2020-09-15 2022-03-24 OneTrust, LLC Data processing systems and methods for detecting tools for the automatic blocking of consent requests
WO2022061270A1 (en) 2020-09-21 2022-03-24 OneTrust, LLC Data processing systems and methods for automatically detecting target data transfers and target data processing
US11487797B2 (en) 2020-09-22 2022-11-01 Dell Products L.P. Iterative application of a machine learning-based information extraction model to documents having unstructured text data
US20220138270A1 (en) * 2020-11-03 2022-05-05 Heyautofill, Inc. Process and system for data transferring and mapping between different applications
US11397819B2 (en) 2020-11-06 2022-07-26 OneTrust, LLC Systems and methods for identifying data processing activities based on data discovery results
US20220171773A1 (en) * 2020-12-01 2022-06-02 International Business Machines Corporation Optimizing expansion of user query input in natural language processing applications
WO2022132949A1 (en) * 2020-12-15 2022-06-23 ClearVector, Inc. Computer-implemented methods for narrative-structured representation of and intervention into a network computing environment
US11567996B2 (en) * 2020-12-28 2023-01-31 Atlassian Pty Ltd Collaborative document graph-based user interfaces
US20220237191A1 (en) * 2021-01-25 2022-07-28 Salesforce.Com, Inc. System and method for supporting very large data sets in databases
WO2022159901A1 (en) 2021-01-25 2022-07-28 OneTrust, LLC Systems and methods for discovery, classification, and indexing of data in a native computing system
US11442906B2 (en) 2021-02-04 2022-09-13 OneTrust, LLC Managing custom attributes for domain objects defined within microservices
EP4288889A1 (en) 2021-02-08 2023-12-13 OneTrust, LLC Data processing systems and methods for anonymizing data samples in classification analysis
US11601464B2 (en) 2021-02-10 2023-03-07 OneTrust, LLC Systems and methods for mitigating risks of third-party computing system functionality integration into a first-party computing system
WO2022178089A1 (en) 2021-02-17 2022-08-25 OneTrust, LLC Managing custom workflows for domain objects defined within microservices
US11546661B2 (en) 2021-02-18 2023-01-03 OneTrust, LLC Selective redaction of media content
EP4305539A1 (en) 2021-03-08 2024-01-17 OneTrust, LLC Data transfer discovery and analysis systems and related methods
US11615152B2 (en) 2021-04-06 2023-03-28 International Business Machines Corporation Graph-based event schema induction for information retrieval
US11562078B2 (en) 2021-04-16 2023-01-24 OneTrust, LLC Assessing and managing computational risk involved with integrating third party computing functionality within a computing system
US20230067265A1 (en) * 2021-08-25 2023-03-02 International Business Machines Corporation Force-directed network calendar
US11620142B1 (en) 2022-06-03 2023-04-04 OneTrust, LLC Generating and customizing user interfaces for demonstrating functions of interactive user environments
CN115345262B (en) * 2022-10-18 2022-12-27 南京工业大学 Neural network model key data mining method based on influence scores

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5689550A (en) * 1994-08-08 1997-11-18 Voice-Tel Enterprises, Inc. Interface enabling voice messaging systems to interact with communications networks
US6209100B1 (en) * 1998-03-27 2001-03-27 International Business Machines Corp. Moderated forums with anonymous but traceable contributions
US6275811B1 (en) * 1998-05-06 2001-08-14 Michael R. Ginn System and method for facilitating interactive electronic communication through acknowledgment of positive contributive
US6381579B1 (en) * 1998-12-23 2002-04-30 International Business Machines Corporation System and method to provide secure navigation to resources on the internet
WO2003067473A1 (en) 2002-02-04 2003-08-14 Cataphora, Inc. A method and apparatus for sociological data mining

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5819248A (en) * 1990-12-31 1998-10-06 Kegan; Daniel L. Persuasion organizer and calculator
US6564321B2 (en) * 1995-04-28 2003-05-13 Bobo Ii Charles R Systems and methods for storing, delivering, and managing messages
US5579466A (en) * 1994-09-01 1996-11-26 Microsoft Corporation Method and system for editing and formatting data in a dialog window
US5903646A (en) * 1994-09-02 1999-05-11 Rackman; Michael I. Access control system for litigation document production
US5758257A (en) 1994-11-29 1998-05-26 Herz; Frederick System and method for scheduling broadcast of and access to video programs and other data using customer profiles
US6628303B1 (en) * 1996-07-29 2003-09-30 Avid Technology, Inc. Graphical user interface for a motion video planning and editing system for a computer
US6304886B1 (en) * 1997-06-19 2001-10-16 International Business Machines Corporation System and method for building a web site using specific interface
US6865715B2 (en) * 1997-09-08 2005-03-08 Fujitsu Limited Statistical method for extracting, and displaying keywords in forum/message board documents
US6247011B1 (en) * 1997-12-02 2001-06-12 Digital-Net, Inc. Computerized prepress authoring for document creation
US6098070A (en) * 1998-06-09 2000-08-01 Hipersoft Corp. Case management for a personal injury plaintiff's law office using a relational database
US6598046B1 (en) * 1998-09-29 2003-07-22 Qwest Communications International Inc. System and method for retrieving documents responsive to a given user's role and scenario
US6728752B1 (en) * 1999-01-26 2004-04-27 Xerox Corporation System and method for information browsing using multi-modal features
US6567830B1 (en) * 1999-02-12 2003-05-20 International Business Machines Corporation Method, system, and program for displaying added text to an electronic media file
US6493702B1 (en) * 1999-05-05 2002-12-10 Xerox Corporation System and method for searching and recommending documents in a collection using share bookmarks
US6654726B1 (en) * 1999-11-05 2003-11-25 Ford Motor Company Communication schema of online system and method of status inquiry and tracking related to orders for consumer product having specific configurations
EP1248996A4 (en) * 2000-01-18 2003-03-12 Zantaz Com Internet-based archive service for electronic documents
WO2001057633A1 (en) * 2000-02-07 2001-08-09 John Rudolph Trust-based cliques marketing tool
US6311194B1 (en) * 2000-03-15 2001-10-30 Taalee, Inc. System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising
US20030101065A1 (en) * 2001-11-27 2003-05-29 International Business Machines Corporation Method and apparatus for maintaining conversation threads in electronic mail

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
See also references of EP1481346A4
SMITH M A ET AL.: "Visualization Components for Persistent Conversations", CHI 2001 CONFERENCE PROCEEDINGS. CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, 31 March 2001 (2001-03-31), pages 136-143, XP002301011, DOI: 10.1145/365024.365073

Cited By (57)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140089440A1 (en) * 2004-03-31 2014-03-27 Google Inc. Systems and Methods for Applying User Actions to Conversation Messages
AU2011201993B2 (en) * 2004-03-31 2013-04-04 Google Llc Searching conversations in a conversation-based email system
AU2013205898B2 (en) * 2004-03-31 2016-05-19 Google Llc Snoozing conversations in a conversation-based email system
US8346859B2 (en) 2004-03-31 2013-01-01 Google Inc. Method, system, and graphical user interface for dynamically updating transmission characteristics in a web mail reply
US8700717B2 (en) 2004-03-31 2014-04-15 Google Inc. Email conversation management system
US8533274B2 (en) 2004-03-31 2013-09-10 Google Inc. Retrieving and snoozing categorized conversations in a conversation-based email system
US9794207B2 (en) 2004-03-31 2017-10-17 Google Inc. Email conversation management system
US8560615B2 (en) 2004-03-31 2013-10-15 Google Inc. Displaying conversation views in a conversation-based email system
US9734216B2 (en) 2004-03-31 2017-08-15 Google Inc. Systems and methods for re-ranking displayed conversations
US8583747B2 (en) 2004-03-31 2013-11-12 Google Inc. Labeling messages of conversations and snoozing labeled conversations in a conversation-based email system
US8601062B2 (en) 2004-03-31 2013-12-03 Google Inc. Providing snippets relevant to a search query in a conversation-based email system
US9819624B2 (en) 2004-03-31 2017-11-14 Google Inc. Displaying conversations in a conversation-based email system
US8621022B2 (en) 2004-03-31 2013-12-31 Google, Inc. Primary and secondary recipient indicators for conversations
US8626851B2 (en) 2004-03-31 2014-01-07 Google Inc. Email conversation management system
AU2011201991B2 (en) * 2004-03-31 2012-11-01 Google Llc Conversation-based email with list of senders in a conversation
AU2011201994B2 (en) * 2004-03-31 2012-11-01 Google Llc Providing snippets relevant to a search query in a conversation-based email system
US9602456B2 (en) * 2004-03-31 2017-03-21 Google Inc. Systems and methods for applying user actions to conversation messages
US9418105B2 (en) 2004-03-31 2016-08-16 Google Inc. Email conversation management system
US9395865B2 (en) 2004-03-31 2016-07-19 Google Inc. Systems, methods, and graphical user interfaces for concurrent display of reply message and multiple response options
US10757055B2 (en) 2004-03-31 2020-08-25 Google Llc Email conversation management system
US9015257B2 (en) 2004-03-31 2015-04-21 Google Inc. Labeling messages with conversation labels and message labels
US9015264B2 (en) 2004-03-31 2015-04-21 Google Inc. Primary and secondary recipient indicators for conversations
US10706060B2 (en) 2004-03-31 2020-07-07 Google Llc Systems and methods for re-ranking displayed conversations
US9063990B2 (en) 2004-03-31 2015-06-23 Google Inc. Providing snippets relevant to a search query in a conversation-based email system
US9063989B2 (en) 2004-03-31 2015-06-23 Google Inc. Retrieving and snoozing categorized conversations in a conversation-based email system
US9071566B2 (en) 2004-03-31 2015-06-30 Google Inc. Retrieving conversations that match a search query
US9124543B2 (en) 2004-03-31 2015-09-01 Google Inc. Compacted mode for displaying messages in a conversation
US10284506B2 (en) 2004-03-31 2019-05-07 Google Llc Displaying conversations in a conversation-based email system
WO2005116887A1 (en) * 2004-05-25 2005-12-08 Arion Human Capital Limited Data analysis and flow control system
US8782156B2 (en) 2004-08-06 2014-07-15 Google Inc. Enhanced message display
US9002725B1 (en) 2005-04-20 2015-04-07 Google Inc. System and method for targeting information based on message content
US8554852B2 (en) 2005-12-05 2013-10-08 Google Inc. System and method for targeting advertisements or other information using user geographical information
US8601004B1 (en) 2005-12-06 2013-12-03 Google Inc. System and method for targeting information items based on popularities of the information items
US11188168B2 (en) 2010-06-04 2021-11-30 Apple Inc. Device, method, and graphical user interface for navigating through a user interface using a dynamic object selection indicator
US11709560B2 (en) 2010-06-04 2023-07-25 Apple Inc. Device, method, and graphical user interface for navigating through a user interface using a dynamic object selection indicator
US9262455B2 (en) 2011-07-27 2016-02-16 Google Inc. Indexing quoted text in messages in conversations to support advanced conversation-based searching
US8972409B2 (en) 2011-07-27 2015-03-03 Google Inc. Enabling search for conversations with two messages each having a query team
US8583654B2 (en) 2011-07-27 2013-11-12 Google Inc. Indexing quoted text in messages in conversations to support advanced conversation-based searching
US9037601B2 (en) 2011-07-27 2015-05-19 Google Inc. Conversation system and method for performing both conversation-based queries and message-based queries
US9009142B2 (en) 2011-07-27 2015-04-14 Google Inc. Index entries configured to support both conversation and message based searching
US10352049B2 (en) 2013-06-27 2019-07-16 Valinge Innovation Ab Building panel with a mechanical locking system
US9898162B2 (en) 2014-05-30 2018-02-20 Apple Inc. Swiping functions for messaging applications
US11226724B2 (en) 2014-05-30 2022-01-18 Apple Inc. Swiping functions for messaging applications
US10739947B2 (en) 2014-05-30 2020-08-11 Apple Inc. Swiping functions for messaging applications
US11916861B2 (en) 2014-05-31 2024-02-27 Apple Inc. Displaying interactive notifications on touch sensitive devices
US9887949B2 (en) 2014-05-31 2018-02-06 Apple Inc. Displaying interactive notifications on touch sensitive devices
US10771422B2 (en) 2014-05-31 2020-09-08 Apple Inc. Displaying interactive notifications on touch sensitive devices
US11190477B2 (en) 2014-05-31 2021-11-30 Apple Inc. Displaying interactive notifications on touch sensitive devices
US10416882B2 (en) 2014-06-01 2019-09-17 Apple Inc. Displaying options, assigning notification, ignoring messages, and simultaneous user interface displays in a messaging application
US11068157B2 (en) 2014-06-01 2021-07-20 Apple Inc. Displaying options, assigning notification, ignoring messages, and simultaneous user interface displays in a messaging application
KR20200128173A (en) * 2014-06-01 2020-11-11 애플 인크. Displaying options, assigning notification, ignoring messages, and simultaneous user interface displays in a messaging application
KR102333461B1 (en) * 2014-06-01 2021-12-02 애플 인크. Displaying options, assigning notification, ignoring messages, and simultaneous user interface displays in a messaging application
US11494072B2 (en) 2014-06-01 2022-11-08 Apple Inc. Displaying options, assigning notification, ignoring messages, and simultaneous user interface displays in a messaging application
WO2015187274A1 (en) * 2014-06-01 2015-12-10 Apple Inc. Displaying options, assigning notification, ignoring messages, and simultaneous user interface displays in a messaging application
US11868606B2 (en) 2014-06-01 2024-01-09 Apple Inc. Displaying options, assigning notification, ignoring messages, and simultaneous user interface displays in a messaging application
US9971500B2 (en) 2014-06-01 2018-05-15 Apple Inc. Displaying options, assigning notification, ignoring messages, and simultaneous user interface displays in a messaging application
US10620812B2 (en) 2016-06-10 2020-04-14 Apple Inc. Device, method, and graphical user interface for managing electronic communications

Also Published As

Publication number Publication date
CA2475319A1 (en) 2003-08-14
CA2475267A1 (en) 2003-08-14
EP1481346B1 (en) 2012-10-10
US7143091B2 (en) 2006-11-28
EP1481346A4 (en) 2008-03-26
EP1485825A1 (en) 2004-12-15
EP1481346A1 (en) 2004-12-01
EP1485825A4 (en) 2008-03-19
US20060253418A1 (en) 2006-11-09
AU2003207836A1 (en) 2003-09-02
CA2475267C (en) 2014-08-05
AU2003207856A1 (en) 2003-09-02
US20030182310A1 (en) 2003-09-25
WO2003067473A1 (en) 2003-08-14

Similar Documents

Publication Publication Date Title
EP1481346B1 (en) A method and apparatus to visually present discussions for data mining purposes
US7421660B2 (en) Method and apparatus to visually present discussions for data mining purposes
US20200402009A1 (en) Relational presentation of communications and application for transaction analysis
US11204963B2 (en) Digital processing systems and methods for contextual auto-population of communications recipients in collaborative work systems
US11036371B2 (en) Methods and apparatus for managing and exchanging information using information objects
US9792356B2 (en) System and method for supporting natural language queries and requests against a user's personal data cloud
US8676913B1 (en) Discussion-topic, social network systems
US8135711B2 (en) Method and apparatus for sociological data analysis
US7343365B2 (en) Computer system architecture for automatic context associations
US7519589B2 (en) Method and apparatus for sociological data analysis
US7692653B1 (en) System and method for presenting statistics
US20070226204A1 (en) Content-based user interface for document management
EP1910949A2 (en) An improved method and apparatus for sociological data analysis
CN111190965A (en) Text data-based ad hoc relationship analysis system and method
WO2022234273A1 (en) Project data processing method and apparatus

Legal Events

Date Code Title Description
AK Designated states Kind code of ref document: A1; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW
AL Designated countries for regional patents Kind code of ref document: A1; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase Ref document number: 2475319; Country of ref document: CA
WWE Wipo information: entry into national phase Ref document number: 2003706095; Country of ref document: EP
WWP Wipo information: published in national office Ref document number: 2003706095; Country of ref document: EP
NENP Non-entry into the national phase Ref country code: JP
WWW Wipo information: withdrawn in national office Country of ref document: JP