WO2008011104A2 - Real-time detection and prevention of bulk messages - Google Patents

Publication number
WO2008011104A2
Authority
WO
WIPO (PCT)
Prior art keywords
key
status
message
keys
rate
Prior art date
Application number
PCT/US2007/016374
Other languages
French (fr)
Other versions
WO2008011104A3 (en)
Inventor
Amit Jhawar
Original Assignee
Microsoft Corporation
Priority date
Filing date
Publication date
Application filed by Microsoft Corporation
Publication of WO2008011104A2
Publication of WO2008011104A3

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227 Filtering policies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21 Monitoring or handling of messages
    • H04L51/212 Monitoring or handling of messages using filtering or selective blocking

Definitions

  • Anti-virus software typically employs filtration techniques to protect computers against viruses, worms, Trojan horses, and other unwanted messages.
  • Filtration techniques classify an incoming message based on one or more rules that are applied to various attributes of the message.
  • A rule may indicate that an incoming message that contains a specific sender's electronic mail address is to be classified and treated as spam.
  • End users, such as recipients of messages, may also create and use rules to classify incoming messages to identify and handle spam appropriately.
  • a user of an electronic mail client program may set up an "inbox rule" to move all incoming messages with a subject heading including the text "$$$" to a "deleted items" folder.
  • One problem with filtration techniques is that commercial senders of messages (e.g., "spammers") can adapt their messages to known or commonly employed filters to ensure that messages are not classified as spam.
  • the filtration techniques react to existing or known attacks, and the filtration techniques may require a tremendous amount of manual intervention to keep up with the adaptations of the spammers. Any kind of manual intervention to react to an attack may be too late as a very large number of unwanted messages may have been processed as a result of the attack, causing additional burden on various resources - e.g., CPU, storage, network bandwidth, etc.
  • A detection server detects and prevents bulk messages in real-time by analyzing the network traffic pattern of attributes of messages, such as electronic mail (email) messages, that are passing through the network against an expected network traffic pattern.
  • the expected network traffic pattern may be specified as a combination of a rate and one or more thresholds, where each threshold has a corresponding status.
  • the rate specifies a quantity of the attribute measured with respect to a quantity of time.
  • a threshold specifies a number of times the specified rate needs to be exceeded, and a status associated with a threshold is attained when the rate is exceeded the requisite threshold number of times. The status indicates an action that is to be taken in processing the email message containing the attribute.
  • Upon receiving an indication of an attribute of an email message, the detection server checks to determine whether the indicated instance of the attribute of the email message causes the specified rate to be exceeded and a specified threshold to be crossed. Whenever an indicated instance of the attribute of the email message causes a specified threshold to be crossed, the detection server assigns to the attribute the status associated with the threshold. The email message can then be processed in accordance with the status assigned to the attribute of the email message.
  • Figure 1 is a high-level block diagram that illustrates an environment in which a detection server executes, according to some embodiments.
  • Figure 2 is a block diagram that illustrates selected components of the detection server, according to some embodiments.
  • Figure 3 is a data structure diagram that illustrates example logical data structures of the detection server, according to some embodiments.
  • Figure 4 is a flow diagram that illustrates the processing of a message delivery host, according to some embodiments.
  • Figure 5 is a flow diagram that illustrates the processing of the message delivery host to process a message in accordance with the status associated with keys, according to some embodiments.
  • Figure 6 is a flow diagram that illustrates the processing of the host component of the detection server, according to some embodiments.
  • Figure 7 is a flow diagram that illustrates the processing of the host component of the detection server to update database entries for a key, according to some embodiments.
  • Figure 8 is a flow diagram that illustrates the processing of the peer component of the detection server to advertise suspicious keys to peer detection servers, according to some embodiments.
  • Figure 9 is a flow diagram that illustrates the processing of the peer component of the detection server to process suspicious keys advertised by peer detection servers, according to some embodiments.
  • Figure 10 is a flow diagram that illustrates the processing of the central detection server to push suspicious keys to the detection servers, according to some embodiments.
  • Figure 11 is a flow diagram that illustrates the processing of the central detection server to push manual input, according to some embodiments.
  • Figure 12 is a flow diagram that illustrates the processing of the central server component of the detection server to process keys pushed by the central detection server, according to some embodiments.

DETAILED DESCRIPTION

  • a detection server detects and prevents bulk messages in real-time by analyzing the network traffic pattern of attributes of messages, such as electronic mail (email) messages, that are passing through the network against an expected network traffic pattern.
  • An attribute of an email message may be the sender's Internet Protocol (IP) address, a body part (e.g., text, HTML, image, document, etc.) of the email message, and the like.
  • An authorized user, such as a network or system administrator, may specify an expected network traffic pattern for the attributes of email messages.
  • attacks from undesirable bulk messages can be broadly categorized as (1) a single IP address hitting the network with either the same or similar message or potentially different messages, or (2) the same or similar message coming from multiple locations, which may be a virus attack.
  • the characteristics of the undesirable messages create a certain, distinctive undesirable network traffic pattern, and the user can specify an expected network traffic pattern that accounts for the undesirable network traffic pattern.
  • the expected network traffic pattern may be specified as a combination of a rate and one or more thresholds, where each threshold has a corresponding status.
  • the rate specifies a quantity of the attribute measured with respect to a quantity of time.
  • a rate may be specified as a number of instances of an attribute (e.g., ten instances of the attribute) detected within a specified period of time (e.g., one second).
  • a threshold specifies a number of times the specified rate needs to be exceeded, and a status associated with a threshold is attained when the rate is exceeded the requisite threshold number of times. The status indicates an action that is to be taken in processing the email message containing the attribute.
  • the detection server receives an indication of an attribute of an email message from a message delivery host. Upon receiving the indication of the attribute of the email message, the detection server checks to determine whether the indicated instance of the attribute of the email message causes the specified rate to be exceeded and a specified threshold to be crossed.
  • the detection server assigns to the attribute the status associated with the threshold.
  • the detection server provides to the message delivery host the current status assigned to the attribute, and the message delivery host processes the email message in accordance with the specified status.
  • the detection server is able to monitor the attributes of the email messages that are flowing through the network, and automatically take action in real-time, without any human intervention, whenever a threshold is crossed, thus indicating a potential anomaly in the network traffic pattern.
  • the detection servers apply a single rate to all attributes. In other embodiments, the detection servers may apply multiple rates to the attributes.
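
The combination of a rate and status thresholds described above can be sketched as a small data structure. This is a minimal illustration, not the patented implementation: the class and method names are hypothetical, the rate of ten instances per second is the example given in the text, and the threshold values (0, 1, 5, 10) are the example values the description gives for the status threshold table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    instances: int   # e.g., ten instances of the attribute
    seconds: float   # e.g., within one second


@dataclass(frozen=True)
class ExpectedPattern:
    """Hypothetical container for an expected network traffic pattern."""
    rate: Rate
    thresholds: dict  # status -> number of times the rate must be exceeded

    def status_for(self, times_exceeded: int) -> str:
        # Pick the status whose threshold is the highest one already reached.
        reached = [(n, s) for s, n in self.thresholds.items() if times_exceeded >= n]
        return max(reached)[1] if reached else "IGNORE"


pattern = ExpectedPattern(
    rate=Rate(instances=10, seconds=1.0),
    thresholds={"IGNORE": 0, "SCORE": 1, "HOLD": 5, "REJECT": 10},
)
```

With these example values, a key whose rate has been exceeded four times maps to SCORE, and one whose rate has been exceeded ten or more times maps to REJECT.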
  • a message delivery host may generate one or more keys from various attributes of an email message, and send each key to the detection server along with a request for the status associated with each key.
  • a key is a representation, such as a hash value, of an attribute of an email message, and the message delivery host may use any of a variety of well-known hashing functions or techniques to generate the keys from the attributes of the email message. For example, when processing an email message, the message delivery host may generate a hash value of the IP address of the sender of the email message, and hash values of the contents of various parts of the email message body.
  • the message delivery host then sends each of the generated keys to the detection server and, without delivering or otherwise further processing the email message, waits to receive from the detection server the status associated with each of the keys.
  • the message delivery host then processes the email message according to the respective statuses of the keys received from the detection server.
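
Key generation on the message delivery host might look like the following sketch. The description only says that any well-known hashing function may be used, so SHA-256 here is an assumption, as are the function names and the choice of one key per body part.

```python
import hashlib


def make_key(attribute: bytes) -> str:
    # A key is a representation (here, a hex digest) of one message attribute.
    return hashlib.sha256(attribute).hexdigest()


def keys_for_message(sender_ip: str, body_parts: list) -> list:
    # One key for the sender's IP address, then one key per body part.
    keys = [make_key(sender_ip.encode("utf-8"))]
    keys.extend(make_key(part) for part in body_parts)
    return keys


example_keys = keys_for_message("192.0.2.1", [b"Hello", b"<p>Hello</p>"])
```

Because the same attribute always hashes to the same key, identical body parts arriving from different senders produce the same key, which is what lets the detection server count them together.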
  • a status may specify an action such as accept, reject, hold, score, or ignore, and each of the keys of an email message may be associated with the same status or different statuses. Stated another way, the keys of an email message need not all be associated with the same status.
  • The message delivery host may apply a priority scheme (precedence order) to the received statuses to determine how to process the email message. For example, the precedence order of the statuses, from high priority to low priority, may be accept, reject, hold, score, and ignore.
  • the message delivery host accepts the email message (e.g., normally processes the email message) if any key of the email message is associated with a status of accept, irrespective of the statuses of the other keys of the email message. If no key of the email message is associated with a status of accept, and any key of the email message is associated with a status of reject, then the message delivery host rejects the email message (e.g., does not further process the email message). If no key of the email message is associated with a status of accept or reject, and any key of the email message is associated with a status of hold, then the message delivery host sends a copy of the email message to a preconfigured address.
  • the message delivery host can indicate that a copy of the email message has been retained for further examination, and continue processing the email message. If no key of the email message is associated with a status of accept, reject, or hold, and any key of the email message is associated with a status of score, then the message delivery host makes an indication that the email message may be suspicious, and accepts the email message (e.g., processes the email message with the indication that it may be suspicious). The message delivery host may identify the attribute of the email message that may have been suspicious. If no key of the email message is associated with a status of accept, reject, hold, or score, or all of the keys are associated with status of ignore, then the message delivery host accepts the email message.
  • the detection server is able to indicate the keys that may be suspicious and specify the action to be taken by the message delivery host.
  • the statuses reject, hold, and score indicate varying levels of suspiciousness of the key (i.e., the suspiciousness of the attribute of the email message that was used to generate the key) and, of these, the status reject indicates a confirmed bad key, while the status of accept serves as an "override" action.
  • the aforementioned number of statuses are only one example of indicating varying levels of suspiciousness and the corresponding actions to perform at each level, and one skilled in the art will appreciate that there may be a different number of levels and/or different actions to perform at each level.
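
The example precedence order (accept, reject, hold, score, ignore) can be expressed as a short lookup. This is a sketch only; the function name is hypothetical and the statuses are written in upper case for consistency with the flow diagrams.

```python
# Example precedence order from the description, highest priority first.
PRECEDENCE = ["ACCEPT", "REJECT", "HOLD", "SCORE", "IGNORE"]


def resolve_status(statuses):
    """Return the single status that governs the message."""
    for status in PRECEDENCE:
        if status in statuses:
            return status
    # No statuses at all is treated like IGNORE (the message is accepted).
    return "IGNORE"
```

For instance, `resolve_status(["SCORE", "ACCEPT", "REJECT"])` yields `"ACCEPT"`, reflecting that accept acts as an override irrespective of the other keys.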
  • the message delivery host may send the rejected email messages to a preconfigured address for further analysis by, for example, a spam analyst.
  • the message delivery host waits for a preconfigured "timeout" time period to receive a reply to the request for status associated with a key from the detection server.
  • the preconfigured timeout may be specified in a configuration file. If the message delivery host does not receive a reply from the detection server within the timeout time period, the message delivery host "skips" the key and continues processing the next key. In this case, the message delivery host can process the key as if the detection server returned a status of ignore for the key.
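
One way to sketch the timeout behavior is to run the status request in a worker thread and fall back to IGNORE when no reply arrives in time. The thread pool, the function names, and the `slow_lookup` stand-in for an unresponsive detection server are all assumptions made for illustration.

```python
import concurrent.futures
import time

# Shared worker pool (assumed); a real host would use its own networking layer.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def status_with_timeout(lookup, key, timeout_seconds):
    """Ask `lookup` for the status of `key`; skip the key on timeout."""
    future = _executor.submit(lookup, key)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # Treated as if the detection server had returned IGNORE.
        return "IGNORE"


def slow_lookup(key):
    # Stand-in for a detection server that replies too late.
    time.sleep(0.3)
    return "REJECT"
```

Calling `status_with_timeout(slow_lookup, "k", 0.05)` returns `"IGNORE"`, so a slow detection server delays the message by at most the timeout per key.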
  • The detection server advertises suspicious keys (e.g., keys associated with statuses of reject, hold, or score) to its peer detection servers.
  • An organization's computing environment, including its services (e.g., email services, web services, etc.) and network, may be distributed across multiple datacenters.
  • one datacenter may be servicing the organization's facilities located in North America
  • another datacenter may be servicing the organization's facilities located in Europe
  • still another datacenter may be servicing the organization's facilities located in Asia.
  • Each datacenter may be implemented using a multiple number of servers, such as the message delivery hosts, and the servers in a datacenter may be supported by a detection server.
  • When a detection server in one datacenter identifies a key to be suspicious, the detection server advertises the suspicious key to its peer detection servers in the other datacenters by broadcasting the key. The detection server may also broadcast information regarding the suspicious key, such as an indication of the number of times the rate has been exceeded in the particular datacenter.
  • the receiving detection server consolidates the received information regarding the suspicious key(s) with its own network traffic pattern information (e.g., the information regarding the keys detected in the receiving detection server's datacenter) to generate a more complete view of the distributed datacenters. For example, a key may be determined to be suspicious in a first datacenter and not in a second peer datacenter.
  • the detection server in the first datacenter advertises the suspicious key to its peer detection server in the second datacenter.
  • the detection server in the second datacenter can also identify the key received through the advertisement, which was not previously determined to be suspicious in the second datacenter, as a suspicious key in the second datacenter.
  • a key may have been detected in both datacenters to be suspicious (e.g., status of score), but not as a confirmed bad key (e.g., status of reject).
  • the detection server in the first datacenter advertises the suspicious key to its peer detection server in the second datacenter and, likewise, the detection server in the second datacenter advertises the suspicious key to its peer detection server in the first datacenter.
  • Upon receiving the advertisement of the suspicious key, the detection server in each datacenter consolidates the information regarding the suspicious key received via the advertisement with its own information and updates the status associated with the suspicious key accordingly. For example, the consolidated information may be sufficient to indicate that the suspicious key should now be a confirmed bad key (i.e., associate a status of reject to the suspicious key) even though the suspicious key was not identified as a confirmed bad key in either of the datacenters.
  • the sharing of information between peer detection servers allows the detection servers to consolidate the information regarding the distributed datacenters in real-time, automatically generate a consolidated view of the network traffic patterns, and create a more complete view of the network.
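
The consolidation step can be illustrated with a simple sum of local and peer counts. The description does not say how the counts are combined, so summing them is an assumption, as are the function names; the threshold values are the example values from the status threshold table.

```python
# Example thresholds, checked from the most severe status downward.
THRESHOLDS = [(10, "REJECT"), (5, "HOLD"), (1, "SCORE"), (0, "IGNORE")]


def status_for(times_exceeded):
    for threshold, status in THRESHOLDS:
        if times_exceeded >= threshold:
            return status
    return "IGNORE"


def consolidate(local_exceeded, peer_exceeded_counts):
    """Combine local and advertised counts (assumed: a simple sum)."""
    total = local_exceeded + sum(peer_exceeded_counts)
    return total, status_for(total)
```

Under this assumption, a key whose rate was exceeded six times locally and five times at a peer is only HOLD in each datacenter on its own, yet the consolidated count of eleven is enough for REJECT.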
  • A central detection server periodically pulls information regarding all the keys from the detection servers.
  • the central detection server then processes the pulled information to identify the suspicious keys, and pushes (e.g., redistributes) the information regarding the suspicious keys, including the confirmed bad keys, to the detection servers.
  • each detection server can consolidate the received information and use the consolidated information in its processing of the requests for status associated with a key.
  • the central detection server allows authorized users, such as system administrators, network administrators, spam analysts, and the like, to input keys into the central server for distribution to the detection servers.
  • The central detection server may provide a user interface (UI) for use in inputting a key or multiple keys.
  • A spam analyst may receive a confirmed bad key from a source, such as an anti-virus software provider. The analyst can then utilize the UI provided by the central detection server to input the confirmed bad key for distribution to the detection servers.
  • each detection server uses the received information regarding the confirmed bad key in its processing of the requests for status associated with a key.
  • the central detection server provides authorized users access to the keys, and the information related to the keys, and allows the users to input a different status for a key or multiple keys.
  • the central detection server distributes the indicated status and the key to the detection servers for use by the detection servers in their processing of the requests for status associated with a key.
  • A spam analyst may analyze a suspicious key and determine that the key is not suspicious. The analyst can then input a new status that indicates that the previously suspicious key is now a good key (e.g., the analyst can assign a new status of either accept or ignore to the previously suspicious key).
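
The pull, process, and push cycle of the central detection server, together with manual input, could be sketched as follows. The payload layout, the function name, and the rule that a key is suspicious once its combined count reaches one are assumptions for illustration; the description does not specify them.

```python
from collections import Counter


def build_push(pulled, manual_statuses, suspicious_at=1):
    """pulled: {server: {key: times rate exceeded}}; manual_statuses: {key: status}."""
    totals = Counter()
    for per_server in pulled.values():
        totals.update(per_server)  # adds the per-key counts together

    # Keys whose combined count marks them as suspicious.
    push = {
        key: {"exceeded_globally": count, "override": None}
        for key, count in totals.items()
        if count >= suspicious_at
    }
    # Manually entered keys and statuses are always distributed.
    for key, status in manual_statuses.items():
        entry = push.setdefault(
            key, {"exceeded_globally": totals.get(key, 0), "override": None}
        )
        entry["override"] = status
    return push
```

A detection server receiving such a payload could record the `override` value in its status override field and the count in its number of times rate exceeded globally field.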
  • FIG. 1 is a high-level block diagram that illustrates an environment in which a detection server executes, according to some embodiments.
  • the environment comprises a plurality of datacenters 102 coupled to a central detection server 104 via a communications link 106.
  • the environment is only one example of a suitable operating environment and is not intended to suggest any limitation as to the scope of use or functionality of the detection servers and the central detection server.
  • the datacenters may each correspond to a centralized repository, either physical or virtual, of computing services.
  • Each datacenter comprises a plurality of message delivery hosts 108 coupled to a detection server 110.
  • the hosts in the datacenter provide the computing services of the datacenter.
  • one or more hosts in the datacenter may be email delivery hosts that provide email services to, for example, an organization.
  • Each detection server services the hosts that are in its datacenter.
  • When a host in a datacenter receives a message to process, the host generates keys from various parts of the message and sends each key, along with a request for its status, to the detection server that is servicing the datacenter.
  • the detection server monitors the traffic pattern of the keys that are passing through the network, and assigns a status to each key based on the monitored traffic pattern.
  • the detection server looks up the statuses assigned to the keys and returns the statuses to the host. The host then processes the message according to the statuses provided by the detection server.
  • FIG. 2 is a block diagram that illustrates selected components of the detection server, according to some embodiments.
  • the detection server comprises a host component 202, a peer component 204, a central server component 206, and a key store 208.
  • the key store contains the keys received from the hosts and the central detection server and other data structures used by the detection server.
  • The host component is invoked when a request for a status of a key is received from a host.
  • the host component processes and responds to the received requests for statuses of keys received from the hosts.
  • the peer component is invoked to process interactions with peer detection servers.
  • the central server component is invoked to process interactions with the central detection server.
  • the computing device on which the detection server is implemented may include a central processing unit, memory, input devices (e.g., keyboard and pointing devices), output devices (e.g., display devices), and storage devices (e.g., disk drives).
  • the memory and storage devices are computer-readable media that may contain instructions that implement the detection server.
  • the data structures and message structures may be stored or transmitted via a data transmission medium, such as a signal on a communications link.
  • Various communication links may be used, such as the Internet, a local area network, a wide area network, a point-to-point dial-up connection, a cell phone network, and so on.
  • Embodiments of the detection server and the central detection server may be implemented in various operating environments that include personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, programmable consumer electronics, digital cameras, network PCs, minicomputers, mainframe computers, network devices, distributed computing environments that include any of the above systems or devices, and so on.
  • the computer systems may be cell phones, personal digital assistants, smart phones, personal computers, programmable consumer electronics, digital cameras, and so on.
  • FIG. 3 is a data structure diagram that illustrates example logical data structures of the detection server, according to some embodiments.
  • the data structures include a key table 302 and a status threshold table 304.
  • the key table includes an entry for each key that is monitored by the detection server.
  • Each entry in the key table comprises, by way of example, seven fields including a key field 306, a rate field 308, a hit count field 310, a number of times rate exceeded locally field 312, a number of times rate exceeded globally field 314, a status field 316, and a status override field 318.
  • the key field identifies a key.
  • the rate field specifies a rate for the identified key.
  • the hit count field contains a count of the number of times the identified key was detected by the detection server (e.g., processed by the detection server).
  • the number of times rate exceeded locally field contains a count of the number of times the detection server determined that the identified key exceeded the specified rate.
  • the number of times rate exceeded globally field contains a count of the number of times the identified key exceeded the specified rate globally (e.g., across the collection of peer detection servers).
  • the detection server accordingly updates the number of times rate exceeded globally field whenever it receives a count of the number of times other detection servers (e.g., the detection servers in other datacenters) determined that the identified key exceeded the specified rate. This count may be provided by peer detection servers or the central detection server.
  • the status field specifies a status that is associated with the identified key. The hosts use the status assigned to the key to determine how to process the message from which the key was generated.
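  • The status override field specifies a manually specified status for the identified key. When present, the specified status overrides the status specified in the status field.
  • The detection server can record an indication of the manually specified status in the status override field.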
  • the status threshold table contains an entry for each status that may be assigned to a key. Each entry in the status threshold table comprises, by way of example, two fields including a status field 320 and a threshold value field 322.
  • the status field identifies a status.
  • the threshold value field specifies a number of times the rate needs to be exceeded for a key to attain the corresponding status.
  • the detection server can create an entry in the status threshold table for each status and its corresponding threshold value specified by an authorized user.
  • the first entry identifies a status IGNORE and a threshold value of 0
  • the second entry identifies a status SCORE and a threshold value of 1
  • the third entry identifies a status HOLD and a threshold value of 5
  • the fourth entry identifies a status REJECT and a threshold value of 10.
  • These entries indicate that a key is assigned: a status of IGNORE when the rate is exceeded zero times; a status of SCORE when the rate is exceeded one time; a status of HOLD when the rate is exceeded five times; and a status of REJECT when the rate is exceeded ten times.
  • the rate may be specified in a separate record or table.
  • the data structures of the detection server may be tailored to the space/computation requirements of the detection server.
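
The seven fields of a key table entry might be modeled as below. The field names are hypothetical renderings of fields 306 through 318, the default rate value is an assumption, and `effective_status` simply encodes the statement that a status override takes precedence over the status field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyEntry:
    key: str                                # key field 306
    rate: int = 10                          # rate field 308 (assumed: hits per period)
    hit_count: int = 0                      # hit count field 310
    exceeded_locally: int = 0               # field 312
    exceeded_globally: int = 0              # field 314
    status: str = "IGNORE"                  # status field 316
    status_override: Optional[str] = None   # status override field 318

    def effective_status(self) -> str:
        # A manually specified status overrides the computed one.
        return self.status_override or self.status


entry = KeyEntry(key="3f2a", status="HOLD")
```

Setting `entry.status_override = "ACCEPT"` changes what the host is told without discarding the computed status, which remains available in `entry.status`.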
  • FIG. 4 is a flow diagram that illustrates the processing of a message delivery host, according to some embodiments.
  • the message delivery host processes messages as part of providing its services. For example, an email message delivery host processes email messages as part of providing email services to email clients.
  • the message delivery host receives a message to process.
  • the message delivery host generates one or more keys from various attributes of the message. For each generated key (block 406), the message delivery host performs block 408, until all of the keys are processed (block 410).
  • the message delivery host utilizes the services of its detection server to determine the status associated with (e.g., assigned to) the key. For example, the message delivery host may send each key to the detection server along with a request for status.
  • the message delivery host processes the message according to the statuses associated with the keys, and completes.
  • FIG. 5 is a flow diagram that illustrates the processing of the message delivery host to process a message in accordance with the status associated with keys, according to some embodiments.
  • the message delivery host receives the statuses associated with the keys from the detection server, for example, in response to one or more requests for status made by the message delivery host.
  • In decision block 502, if at least one of the statuses is an ACCEPT status, then the message delivery host continues at block 504, else the message delivery host continues at decision block 506.
  • In block 504, the message delivery host processes the message in a normal manner, and completes. In this instance, the message delivery host processes the message as a good message.
  • In decision block 506, if at least one of the statuses is a REJECT status, then the message delivery host continues at block 508, else the message delivery host continues at decision block 510. In block 508, the message delivery host rejects the message as a bad (e.g., bulk) message, and completes. In this instance, the message delivery host does not further process the message.
  • In decision block 510, if at least one of the statuses is a HOLD status, then the message delivery host continues at block 512, else the message delivery host continues at decision block 514. In block 512, the message delivery host sends a copy of the message to a preconfigured address of a holding location, and completes. This message may then be further analyzed, for example, by spam analysts.
  • the message delivery host may continue to process the message. In other embodiments, the message delivery host does not further process the message.
  • In decision block 514, if at least one of the statuses is a SCORE status, then the message delivery host continues at block 516, else the message delivery host continues at block 518.
  • In block 516, the message delivery host indicates that the message is suspicious and processes the message in the normal manner (block 518).
  • In block 518, the message delivery host processes the message as a good message in the normal manner, and completes.
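
The decision chain of FIG. 5 can be summarized as a function that maps the received statuses to the actions the host takes. The returned dictionary and its field names are hypothetical; the `continue_after_hold` flag reflects that some embodiments continue processing a held message and others do not.

```python
def disposition(statuses, continue_after_hold=True):
    """Map key statuses to host actions, following blocks 502 through 518."""
    def result(deliver, hold_copy=False, flag_suspicious=False):
        return {"deliver": deliver, "hold_copy": hold_copy,
                "flag_suspicious": flag_suspicious}

    if "ACCEPT" in statuses:      # decision block 502 -> block 504
        return result(deliver=True)
    if "REJECT" in statuses:      # decision block 506 -> block 508
        return result(deliver=False)
    if "HOLD" in statuses:        # decision block 510 -> block 512
        return result(deliver=continue_after_hold, hold_copy=True)
    if "SCORE" in statuses:       # decision block 514 -> block 516
        return result(deliver=True, flag_suspicious=True)
    return result(deliver=True)   # block 518
```

Only the HOLD and SCORE branches produce side effects beyond delivering or rejecting: a copy sent to the holding address, or an indication that the message may be suspicious.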
  • Figure 6 is a flow diagram that illustrates the processing of the host component of the detection server, according to some embodiments. The host component is passed the key received from a host and returns the status associated with the key.
  • the host component receives the request for status of a key.
  • the host component updates the database entries for the key.
  • the host component queries the database for the status of the key.
  • the host component returns the status to the requestor (e.g., the message delivery host that sent the request for status), and completes.
  • FIG. 7 is a flow diagram that illustrates the processing of the host component of the detection server to update database entries for a key, according to some embodiments.
  • the host component updates the entries corresponding to the key in the key table or creates a new entry for the key in the key table.
  • In decision block 702, if an entry for the key is not in the database, then the host component continues at block 704, else the host component continues at block 708.
  • In block 704, the host component creates an entry in the key table for the key.
  • The host component sets the status of the key to IGNORE, and completes. For example, the host component specifies a status of IGNORE in the status field of the entry for the key in the key table.
  • In block 708, the host component increments the hit count for the key.
  • The host component can increment the hit count maintained in the hit count field of the entry for the key in the key table.
  • In decision block 710, if the increment in the hit count does not cause the rate to be exceeded, then the host component completes, else the host component continues at block 712.
  • In block 712, the host component increments the number of times the rate has been exceeded locally to the detection server. For example, the host component can increment the count maintained in the number of times rate exceeded locally field of the entry for the key in the key table.
  • In decision block 714, if the occurrence of the rate being exceeded does not cause any of the specified thresholds to be crossed, then the host component completes, else the host component continues at block 716.
  • The host component can check the thresholds specified in the status threshold table to determine if any one of the specified thresholds is crossed.
  • In block 716, the host component assigns to the key the status associated with the threshold that was crossed, and completes.
  • the host component assigns to the key a status of HOLD.
  • the host component can specify a status of HOLD in the status field of the entry for the key in the key table. If the host component determines that the rate is not exceeded (block 710) or that a threshold is not crossed (block 716), the host component does not update the status (e.g., the status field of the entry for the key in the key table) of the key.
  • the host component can periodically clean up entries for keys in the key table.
  • Figure 8 is a flow diagram that illustrates the processing of the peer component of the detection server to advertise suspicious keys to peer detection servers, according to some embodiments.
  • the peer component periodically broadcasts to peer detection servers the keys that have been identified as being suspicious by the detection server.
  • the peer component identifies the suspicious keys. For example, the peer component can identify the keys in the key table whose statuses are specified to be either SCORE, HOLD, or REJECT.
  • the peer component sends an advertisement of the suspicious keys to the peer detection servers, and completes.
  • the peer component can broadcast to the peer detection servers a message that includes an indication of the suspicious keys and data corresponding to each suspicious key, such as, by way of example, the hit count, the number of times the rate was exceeded locally, other statistical information, and the like.
  • Figure 9 is a flow diagram that illustrates the processing of the peer component of the detection server to process suspicious keys advertised by peer detection servers, according to some embodiments.
  • the peer component is passed the advertisement of suspicious keys received from a peer detection server.
  • the peer component updates the entries corresponding to the suspicious keys in the key table or creates new entries for the suspicious keys in the key table.
  • the peer component receives an advertisement of suspicious keys from a peer detection server.
  • For each suspicious key advertised (block 904), the peer component performs blocks 906 to 910, until all of the advertised suspicious keys are processed (block 912).
  • the peer component updates the database entries for the key with the information regarding the suspicious key received with the advertisement. For example, if an entry for the suspicious key exists in the key table, then the peer component updates the entries corresponding to the suspicious key in the key table with the information regarding the suspicious key received with the advertisement. If an entry for the suspicious key does not exist in the key table, then the peer component creates a new entry for the suspicious key in the key table and updates the entries corresponding to the suspicious key in the key table with the information regarding the suspicious key received with the advertisement.
  • In decision block 908, if the advertised suspicious key and, in particular, the information regarding the suspicious key received with the advertisement causes any of the specified thresholds to be crossed, then the peer component continues at block 910, else the peer component continues at block 912 to process the next advertised key. In block 910, the peer component assigns to the suspicious key the status associated with the threshold that was crossed.
  • Figure 10 is a flow diagram that illustrates the processing of the central detection server to push suspicious keys to the detection servers, according to some embodiments.
  • the central detection server periodically pulls (obtains) information regarding the keys from the detection servers, processes the pulled information to identify suspicious keys, and pushes (distributes) the information regarding the suspicious keys to the detection servers.
  • the central detection server pulls from the detection servers information regarding the keys processed by the detection servers.
  • the central detection server consolidates the information regarding the keys pulled from the detection servers.
  • the central detection server identifies suspicious keys using the consolidated information.
  • the central detection server pushes information regarding the identified suspicious keys to the detection servers, and completes.
  • Figure 11 is a flow diagram that illustrates the processing of the central detection server to push manual input, according to some embodiments.
  • the central detection server may provide a UI that allows authorized users to view and/or input information regarding keys.
  • the central detection server pushes to the detection servers input provided by the authorized users.
  • the central detection server receives input regarding a key. For example, a spam analyst may input a key (which has not been detected by any of the detection servers) and indicate that the input key is a confirmed bad key (specify for the input key a status of REJECT).
  • the central detection server pushes the input information (the key and related information) to the detection servers, and completes.
  • Figure 12 is a flow diagram that illustrates the processing of the central server component of the detection server to process keys pushed by the central detection server, according to some embodiments.
  • the central server component receives information regarding keys from the central detection server, and updates the entries corresponding to the received keys in the key table or creates new entries for the received keys in the key table.
  • the central server component receives keys, and information regarding keys, from the central detection server. For each key or information regarding the key received (block 1204), the central server component performs block 1206, until all of the received keys are processed (block 1208).
  • the central server component updates the database entries for the key with the information regarding the key received from the central detection server.
  • the central server component creates a new entry for the key in the key table and updates the entries corresponding to the key in the key table with the information regarding the key received from the central detection server.
  • the central server component may have received information which indicates that a currently suspicious key should now be processed as a good, non-suspicious key (e.g., that the status of the suspicious key should be overridden with a status of ACCEPT).
  • the central server component specifies a status of ACCEPT in the status override field of the entry for the key in the key table.
  • the detection server can detect messages other than email messages.
  • the message delivery hosts may be web servers processing various web messages.
  • the message delivery hosts may generate keys from various attributes of a web message, and send the generated keys to an appropriate detection server to obtain the statuses of the keys.
  • the detection server can thus detect bulk or unwanted web messages.
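The Figure 7 update flow described in the items above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the fixed one-second counting window, the constants `RATE_LIMIT` and `RATE_WINDOW`, the contents of `STATUS_THRESHOLDS`, and the function name `update_key` are all assumptions introduced here. The source only gives "ten instances within one second" as an example rate and does not say how the rate is measured.

```python
import time

# Hypothetical configuration; the source gives these only as examples or not at all.
RATE_LIMIT = 10          # instances of a key allowed per window
RATE_WINDOW = 1.0        # window length in seconds
# Hypothetical status threshold table: times rate exceeded -> status.
STATUS_THRESHOLDS = {1: "SCORE", 3: "HOLD", 5: "REJECT"}

key_table = {}           # key -> entry (a dict standing in for a key table row)


def update_key(key, now=None):
    """Update the key table for one status request and return the key's status."""
    now = time.monotonic() if now is None else now
    entry = key_table.get(key)
    if entry is None:
        # Blocks 702 to 706: create the entry with a status of IGNORE and complete.
        entry = {"status": "IGNORE", "hit_count": 0, "exceeded_locally": 0,
                 "window_start": now, "window_hits": 0}
        key_table[key] = entry
        return entry["status"]

    # Block 708: increment the hit count.
    entry["hit_count"] += 1

    # Assumed rate measurement: count hits in fixed windows of RATE_WINDOW seconds.
    if now - entry["window_start"] >= RATE_WINDOW:
        entry["window_start"], entry["window_hits"] = now, 0
    entry["window_hits"] += 1

    # Block 710: complete unless this hit is the one that exceeds the rate.
    if entry["window_hits"] != RATE_LIMIT + 1:
        return entry["status"]

    # Block 712: record one more local occurrence of the rate being exceeded.
    entry["exceeded_locally"] += 1

    # Blocks 714 and 716: assign the status of a threshold that was just crossed.
    new_status = STATUS_THRESHOLDS.get(entry["exceeded_locally"])
    if new_status is not None:
        entry["status"] = new_status
    return entry["status"]
```

In this sketch the status only changes on the hit that crosses a threshold, which matches the statement that the host component does not update the status when the rate is not exceeded or no threshold is crossed.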

Abstract

A method and system for detecting and preventing bulk messages in real-time is provided. A detection server detects and prevents bulk messages in real-time by analyzing the network traffic pattern of attributes of messages, such as email messages, that are passing through the network against an expected network traffic pattern. The expected network traffic pattern may be specified as a combination of a rate and one or more thresholds, where each threshold has a corresponding status. The rate specifies a quantity of an attribute measured with respect to a quantity of time. A status associated with a threshold is attained when the rate is exceeded the requisite threshold number of times. The status indicates an action that is to be taken in processing the email message containing the attribute. An email message can then be processed in accordance with a status assigned to an attribute of the email message.

Description

REAL-TIME DETECTION AND PREVENTION OF BULK MESSAGES
BACKGROUND
[0001] The costs associated with sending and receiving electronic messages, such as email messages, have reduced significantly to a point where the marginal cost of sending or receiving such a message is nearly zero. Commercial and other entities are thus able to send millions of email messages without incurring significant costs. However, the increased reliance on email has exposed the individuals and corporations that use email to threats from, for example, email viruses, spam, phishing attacks, and the like. Commercially available anti-virus software packages are typically employed to combat such attacks.
[0002] Anti-virus software typically employs filtration techniques to protect computers against viruses, worms, Trojan horses, and other unwanted messages. Filtration techniques classify an incoming message based on one or more rules that are applied to various attributes of the message. By way of example, a rule may indicate that an incoming message that contains a specific sender's electronic mail address is to be classified and treated as spam. End users, such as a recipient of messages, may also create and use rules to classify incoming messages to identify and handle spam appropriately. As an example, a user of an electronic mail client program may set up an "inbox rule" to move all incoming messages with a subject heading including the text "$$$" to a "deleted items" folder. One problem with filtration techniques is that commercial senders of messages (e.g., "spammers") can adapt their messages to known or commonly employed filters to ensure that messages are not classified as spam. As a result, the filtration techniques react to existing or known attacks, and the filtration techniques may require a tremendous amount of manual intervention to keep up with the adaptations of the spammers. Any kind of manual intervention to react to an attack may be too late as a very large number of unwanted messages may have been processed as a result of the attack, causing additional burden on various resources - e.g., CPU, storage, network bandwidth, etc.
SUMMARY
[0003] A method and system for detecting and preventing bulk messages in real-time is provided. A detection server detects and prevents bulk messages in real-time by analyzing the network traffic pattern of attributes of messages, such as electronic mail (email) messages, that are passing through the network against an expected network traffic pattern. The expected network traffic pattern may be specified as a combination of a rate and one or more thresholds, where each threshold has a corresponding status. The rate specifies a quantity of the attribute measured with respect to a quantity of time. A threshold specifies a number of times the specified rate needs to be exceeded, and a status associated with a threshold is attained when the rate is exceeded the requisite threshold number of times. The status indicates an action that is to be taken in processing the email message containing the attribute. Upon receiving an indication of an attribute of an email message, the detection server checks to determine whether the indicated instance of the attribute of the email message causes the specified rate to be exceeded and a specified threshold to be crossed. Whenever an indicated instance of the attribute of the email message causes a specified threshold to be crossed, the detection server assigns to the attribute the status associated with the threshold. The email message can then be processed in accordance with the status assigned to the attribute of the email message.
[0004] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
[0005] Figure 1 is a high-level block diagram that illustrates an environment in which a detection server executes, according to some embodiments.
[0006] Figure 2 is a block diagram that illustrates selected components of the detection server, according to some embodiments.
[0007] Figure 3 is a data structure diagram that illustrates example logical data structures of the detection server, according to some embodiments.
[0008] Figure 4 is a flow diagram that illustrates the processing of a message delivery host, according to some embodiments.
[0009] Figure 5 is a flow diagram that illustrates the processing of the message delivery host to process a message in accordance with the status associated with keys, according to some embodiments.
[0010] Figure 6 is a flow diagram that illustrates the processing of the host component of the detection server, according to some embodiments.
[0011] Figure 7 is a flow diagram that illustrates the processing of the host component of the detection server to update database entries for a key, according to some embodiments.
[0012] Figure 8 is a flow diagram that illustrates the processing of the peer component of the detection server to advertise suspicious keys to peer detection servers, according to some embodiments.
[0013] Figure 9 is a flow diagram that illustrates the processing of the peer component of the detection server to process suspicious keys advertised by peer detection servers, according to some embodiments.
[0014] Figure 10 is a flow diagram that illustrates the processing of the central detection server to push suspicious keys to the detection servers, according to some embodiments.
[0015] Figure 11 is a flow diagram that illustrates the processing of the central detection server to push manual input, according to some embodiments.
[0016] Figure 12 is a flow diagram that illustrates the processing of the central server component of the detection server to process keys pushed by the central detection server, according to some embodiments.
DETAILED DESCRIPTION
[0017] A method and system for detecting and preventing bulk messages in real-time is provided. In some embodiments, a detection server detects and prevents bulk messages in real-time by analyzing the network traffic pattern of attributes of messages, such as electronic mail (email) messages, that are passing through the network against an expected network traffic pattern. An attribute of an email message may be the sender's Internet Protocol (IP) address, a body part (e.g., text, HTML, image, document, etc.) of the email message, and the like. An authorized user, such as a network or system administrator, may specify an expected network traffic pattern for the attributes of email messages. For example, attacks from undesirable bulk messages, such as email attacks, can be broadly categorized as (1) a single IP address hitting the network with either the same or similar message or potentially different messages, or (2) the same or similar message coming from multiple locations, which may be a virus attack. The characteristics of the undesirable messages create a certain, distinctive undesirable network traffic pattern, and the user can specify an expected network traffic pattern that accounts for the undesirable network traffic pattern. The expected network traffic pattern may be specified as a combination of a rate and one or more thresholds, where each threshold has a corresponding status. The rate specifies a quantity of the attribute measured with respect to a quantity of time. For example, a rate may be specified as a number of instances of an attribute (e.g., ten instances of the attribute) detected within a specified period of time (e.g., one second). A threshold specifies a number of times the specified rate needs to be exceeded, and a status associated with a threshold is attained when the rate is exceeded the requisite threshold number of times. The status indicates an action that is to be taken in processing the email message containing the attribute.
In a typical scenario, the detection server receives an indication of an attribute of an email message from a message delivery host. Upon receiving the indication of the attribute of the email message, the detection server checks to determine whether the indicated instance of the attribute of the email message causes the specified rate to be exceeded and a specified threshold to be crossed. Whenever an indicated instance of the attribute of the email message causes a specified threshold to be crossed, the detection server assigns to the attribute the status associated with the threshold. The detection server provides to the message delivery host the current status assigned to the attribute, and the message delivery host processes the email message in accordance with the specified status. In this manner, based on the specified rate and thresholds (i.e., the expected network pattern), the detection server is able to monitor the attributes of the email messages that are flowing through the network, and automatically take action in real-time, without any human intervention, whenever a threshold is crossed, thus indicating a potential anomaly in the network traffic pattern. [0018] In some embodiments, the detection servers apply a single rate to all attributes. In other embodiments, the detection servers may apply multiple rates to the attributes. For example, the detection server may apply one rate to one attribute and a second rate to a multiple number of different attributes. [0019] In some embodiments, a message delivery host may generate one or more keys from various attributes of an email message, and send each key to the detection server along with a request for the status associated with each key. A key is a representation, such as a hash value, of an attribute of an email message, and the message delivery host may use any of a variety of well-known hashing functions or techniques to generate the keys from the attributes of the email message. 
For example, when processing an email message, the message delivery host may generate a hash value of the IP address of the sender of the email message, and hash values of the contents of various parts of the email message body. The message delivery host then sends each of the generated keys to the detection server and, without delivering or otherwise further processing the email message, waits to receive from the detection server the status associated with each of the keys. The message delivery host then processes the email message according to the respective statuses of the keys received from the detection server.
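As a rough illustration of the key generation described above, the sketch below hashes the sender's IP address and each body part into a key. The choice of SHA-256 and the function names `make_key` and `keys_for_message` are assumptions made for this example; the source says only that any well-known hashing function may be used.

```python
import hashlib
from typing import List, Sequence


def make_key(attribute: bytes) -> str:
    """Return a key (hex digest) representing one message attribute.

    SHA-256 is an assumed choice; the source does not name a hash function.
    """
    return hashlib.sha256(attribute).hexdigest()


def keys_for_message(sender_ip: str, body_parts: Sequence[bytes]) -> List[str]:
    """Generate one key for the sender's IP address and one per body part."""
    keys = [make_key(sender_ip.encode("utf-8"))]
    keys.extend(make_key(part) for part in body_parts)
    return keys


# Example: a message with a text part and an HTML part yields three keys.
example_keys = keys_for_message("192.0.2.1", [b"hello", b"<p>hello</p>"])
```

Because identical attribute values always hash to the same key, the detection server can count repeated attributes without seeing the message content itself.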
[0020] By way of example, a status may specify an action such as accept, reject, hold, score, or ignore, and each of the keys of an email message may be associated with the same status or different statuses. Stated another way, the keys of an email message need not all be associated with the same status. Upon receiving the statuses associated with the keys, the message delivery host may apply a priority scheme (precedence order) to the received statuses to determine how to process the email message. For example, the precedence order of the statuses, from high priority to low priority, may be to accept, reject, hold, score, and ignore. Applying this precedence order, the message delivery host accepts the email message (e.g., normally processes the email message) if any key of the email message is associated with a status of accept, irrespective of the statuses of the other keys of the email message. If no key of the email message is associated with a status of accept, and any key of the email message is associated with a status of reject, then the message delivery host rejects the email message (e.g., does not further process the email message). If no key of the email message is associated with a status of accept or reject, and any key of the email message is associated with a status of hold, then the message delivery host sends a copy of the email message to a preconfigured address. The message delivery host can indicate that a copy of the email message has been retained for further examination, and continue processing the email message. If no key of the email message is associated with a status of accept, reject, or hold, and any key of the email message is associated with a status of score, then the message delivery host makes an indication that the email message may be suspicious, and accepts the email message (e.g., processes the email message with the indication that it may be suspicious). 
The message delivery host may identify the attribute of the email message that may have been suspicious. If no key of the email message is associated with a status of accept, reject, hold, or score, or all of the keys are associated with status of ignore, then the message delivery host accepts the email message. Using the statuses associated with the keys, the detection server is able to indicate the keys that may be suspicious and specify the action to be taken by the message delivery host. For example, the statuses reject, hold, and score indicate varying levels of suspiciousness of the key (i.e., the suspiciousness of the attribute of the email message that was used to generate the key) and, of these, the status reject indicates a confirmed bad key, while the status of accept serves as an "override" action. The aforementioned number of statuses are only one example of indicating varying levels of suspiciousness and the corresponding actions to perform at each level, and one skilled in the art will appreciate that there may be a different number of levels and/or different actions to perform at each level. In some embodiments, the message delivery host may send the rejected email messages to a preconfigured address for further analysis by, for example, a spam analyst. [0021] In some embodiments, the message delivery host waits for a preconfigured "timeout" time period to receive a reply to the request for status associated with a key from the detection server. The preconfigured timeout may be specified in a configuration file. If the message delivery host does not receive a reply from the detection server within the timeout time period, the message delivery host "skips" the key and continues processing the next key. In this case, the message delivery host can process the key as if the detection server returned a status of ignore for the key.
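The precedence order and the timeout behavior described above can be sketched as follows. The function name `resolve_action` and the use of `None` to represent a key whose status request timed out are assumptions made for this illustration.

```python
# Example precedence order from the description, high priority to low priority.
PRECEDENCE = ("accept", "reject", "hold", "score", "ignore")


def resolve_action(statuses):
    """Return the status that governs how the message is processed.

    `statuses` holds one entry per key; None (an assumed convention) marks a
    key whose request timed out, which is treated as a status of ignore.
    """
    normalized = ["ignore" if s is None else s.lower() for s in statuses]
    for status in PRECEDENCE:
        if status in normalized:
            return status
    return "ignore"  # no keys at all: process the message normally


# An accept on any key overrides the other keys' statuses.
assert resolve_action(["score", "reject", "accept"]) == "accept"
# A timed-out key is skipped, so hold governs here.
assert resolve_action(["hold", None, "score"]) == "hold"
```

A result of `ignore` corresponds to accepting the message, since the description has the message delivery host accept the message when every key has a status of ignore.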
[0022] In some embodiments, the detection server advertises suspicious keys (e.g., keys associated with statuses of reject, hold, or score) to its peer detection servers. By way of example, an organization's computing environment, including its services (e.g., email services, web services, etc.) and network, may be distributed into a multiple number of datacenters. For example, one datacenter may be servicing the organization's facilities located in North America, another datacenter may be servicing the organization's facilities located in Europe, and still another datacenter may be servicing the organization's facilities located in Asia. Each datacenter may be implemented using a multiple number of servers, such as the message delivery hosts, and the servers in a datacenter may be supported by a detection server. When a detection server in one datacenter identifies a key to be suspicious, the detection server advertises the suspicious key to its peer detection servers in the other datacenters by broadcasting the key. The detection server may also broadcast information regarding the suspicious key, such as an indication of the number of times the rate has been exceeded in the particular datacenter. When a detection server receives an advertisement of a suspicious key or keys from a peer detection server, the receiving detection server consolidates the received information regarding the suspicious key(s) with its own network traffic pattern information (e.g., the information regarding the keys detected in the receiving detection server's datacenter) to generate a more complete view of the distributed datacenters. For example, a key may be determined to be suspicious in a first datacenter and not in a second peer datacenter. In this instance, the detection server in the first datacenter advertises the suspicious key to its peer detection server in the second datacenter. 
Upon receiving the advertisement of the suspicious key, the detection server in the second datacenter can also identify the key received through the advertisement, which was not previously determined to be suspicious in the second datacenter, as a suspicious key in the second datacenter. In another example, a key may have been detected in both datacenters to be suspicious (e.g., status of score), but not as a confirmed bad key (e.g., status of reject). In this instance, the detection server in the first datacenter advertises the suspicious key to its peer detection server in the second datacenter and, likewise, the detection server in the second datacenter advertises the suspicious key to its peer detection server in the first datacenter. Upon receiving the advertisement of the suspicious key, the detection server in each datacenter consolidates the information regarding the suspicious key received via the advertisement with its own information and updates the status associated with the suspicious key accordingly. For example, the consolidated information may be sufficient to indicate that the suspicious key should now be a confirmed bad key (i.e., associate a status of reject to the suspicious key) even though the suspicious key was not identified as a confirmed bad key in either of the datacenters. The sharing of information between peer detection servers allows the detection servers to consolidate the information regarding the distributed datacenters in real-time, automatically generate a consolidated view of the network traffic patterns, and create a more complete view of the network.
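The second example above, where a key is only scored in each datacenter but becomes a confirmed bad key once the information is consolidated, can be illustrated with the sketch below. The source does not specify how advertised information is consolidated; summing the local and advertised "rate exceeded" counts is an assumption, as are the threshold values and the names `status_for` and `consolidate`.

```python
# Hypothetical status threshold table: (times rate exceeded, status), ascending.
THRESHOLDS = [(2, "score"), (4, "hold"), (6, "reject")]


def status_for(times_exceeded):
    """Return the status of the highest threshold reached, or ignore if none."""
    status = "ignore"
    for count, name in THRESHOLDS:
        if times_exceeded >= count:
            status = name
    return status


def consolidate(local_exceeded, advertised_exceeded):
    """Combine local and peer-advertised counts (assumed: a simple sum)."""
    total = local_exceeded + sum(advertised_exceeded)
    return total, status_for(total)


# Each datacenter alone has seen the rate exceeded three times: status score.
local_view = status_for(3)
# After receiving the peer's advertisement, the combined count reaches reject.
combined_total, combined_view = consolidate(3, [3])
```

Under these assumed thresholds, `local_view` is `score` in both datacenters, while `combined_view` is `reject`, mirroring the case where neither datacenter alone identified the key as a confirmed bad key.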
[0023] In some embodiments, a central detection server periodically pulls information regarding all the keys from the detection servers. The central detection server then processes the pulled information to identify the suspicious keys, and pushes (e.g., redistributes) the information regarding the suspicious keys, including the confirmed bad keys, to the detection servers. Upon receiving the information regarding the suspicious keys and the confirmed bad keys, each detection server can consolidate the received information and use the consolidated information in its processing of the requests for status associated with a key.
[0024] In some embodiments, the central detection server allows authorized users, such as system administrators, network administrators, spam analysts, and the like, to input keys into the central server for distribution to the detection servers. The central detection server may provide a user interface (UI) for use in inputting a key or multiple keys. By way of example, a spam analyst may receive a confirmed bad key from a source, such as an anti-virus software provider. The analyst can then utilize the UI provided by the central detection server to input the confirmed bad key for distribution to the detection servers. Upon receiving the confirmed bad key from the central detection server, each detection server uses the received information regarding the confirmed bad key in its processing of the requests for status associated with a key. In some embodiments, the central detection server provides authorized users access to the keys, and the information related to the keys, and allows the users to input a different status for a key or multiple keys. When a user inputs a status for a key, the central detection server distributes the indicated status and the key to the detection servers for use by the detection servers in their processing of the requests for status associated with a key. 
By way of example, a spam analyst may analyze a suspicious key and determine that the key is not suspicious. The analyst can then input a new status that indicates that the previously suspicious key is now a good key (e.g., the analyst can assign a new status of either accept or ignore to the previously suspicious key).
[0025] Figure 1 is a high-level block diagram that illustrates an environment in which a detection server executes, according to some embodiments. The environment comprises a plurality of datacenters 102 coupled to a central detection server 104 via a communications link 106. The environment is only one example of a suitable operating environment and is not intended to suggest any limitation as to the scope of use or functionality of the detection servers and the central detection server. The datacenters may each correspond to a centralized repository, either physical or virtual, of computing services. Each datacenter comprises a plurality of message delivery hosts 108 coupled to a detection server 110. The hosts in the datacenter provide the computing services of the datacenter. For example, one or more hosts in the datacenter may be email delivery hosts that provide email services to, for example, an organization. Other hosts may provide services such as web services, web content, various application services, and the like. Each detection server services the hosts that are in its datacenter. When a host in a datacenter receives a message to process, the host generates keys from various parts of the message and sends each key, along with a request for its status, to the detection server that is servicing the datacenter. The detection server monitors the traffic pattern of the keys that are passing through the network, and assigns a status to each key based on the monitored traffic pattern. Upon receiving the keys, the detection server looks up the statuses assigned to the keys and returns the statuses to the host. The host then processes the message according to the statuses provided by the detection server. The central detection server interacts with the detection servers in the various datacenters to generate and provide a global view of the datacenters. 
[0026] Figure 2 is a block diagram that illustrates selected components of the detection server, according to some embodiments. The detection server comprises a host component 202, a peer component 204, a central server component 206, and a key store 208. The key store contains the keys received from the hosts and the central detection server and other data structures used by the detection server. The host component is invoked when a request for a status of a key is received from a host. The host component processes and responds to the received requests for statuses of keys received from the hosts. The peer component is invoked to process interactions with peer detection servers. The central server component is invoked to process interactions with the central detection server.
[0027] The computing device on which the detection server is implemented may include a central processing unit, memory, input devices (e.g., keyboard and pointing devices), output devices (e.g., display devices), and storage devices (e.g., disk drives). The memory and storage devices are computer-readable media that may contain instructions that implement the detection server. In addition, the data structures and message structures may be stored or transmitted via a data transmission medium, such as a signal on a communications link. Various communication links may be used, such as the Internet, a local area network, a wide area network, a point-to-point dial-up connection, a cell phone network, and so on.
[0028] Embodiments of the detection server and the central detection server may be implemented in various operating environments that include personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, programmable consumer electronics, digital cameras, network PCs, minicomputers, mainframe computers, network devices, distributed computing environments that include any of the above systems or devices, and so on. The computer systems may be cell phones, personal digital assistants, smart phones, personal computers, programmable consumer electronics, digital cameras, and so on.
[0029] The detection server and the central detection server may be described in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, and so on that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
[0030] Figure 3 is a data structure diagram that illustrates example logical data structures of the detection server, according to some embodiments. The data structures include a key table 302 and a status threshold table 304. The key table includes an entry for each key that is monitored by the detection server. Whenever the detection server receives an indication of a key from a host, for example, along with a request for a status of the key, the detection server updates the entry in the key table for that key. Each entry in the key table comprises, by way of example, seven fields including a key field 306, a rate field 308, a hit count field 310, a number of times rate exceeded locally field 312, a number of times rate exceeded globally field 314, a status field 316, and a status override field 318. The key field identifies a key. The rate field specifies a rate for the identified key. The hit count field contains a count of the number of times the identified key was detected by the detection server (e.g., processed by the detection server). The number of times rate exceeded locally field contains a count of the number of times the detection server determined that the identified key exceeded the specified rate. The number of times rate exceeded globally field contains a count of the number of times the identified key exceeded the specified rate globally (e.g., across the collection of peer detection servers).
For example, the detection server accordingly updates the number of times rate exceeded globally field whenever it receives a count of the number of times other detection servers (e.g., the detection servers in other datacenters) determined that the identified key exceeded the specified rate. This count may be provided by peer detection servers or the central detection server. The status field specifies a status that is associated with the identified key. The hosts use the status assigned to the key to determine how to process the message from which the key was generated. When a status is specified in the status override field, the specified status overrides the status specified in the status field. For example, when a user manually specifies a status for a key, the detection server can record an indication of the manually specified status in the status override field. The status threshold table contains an entry for each status that may be assigned to a key. Each entry in the status threshold table comprises, by way of example, two fields including a status field 320 and a threshold value field 322. The status field identifies a status. The threshold value field specifies a number of times the rate needs to be exceeded for a key to attain the corresponding status. The detection server can create an entry in the status threshold table for each status and its corresponding threshold value specified by an authorized user. By way of example and as illustrated by the status threshold table in Figure 3, the first entry identifies a status IGNORE and a threshold value of 0, the second entry identifies a status SCORE and a threshold value of 1, the third entry identifies a status HOLD and a threshold value of 5, and the fourth entry identifies a status REJECT and a threshold value of 10. 
These entries indicate that a key is assigned: a status of IGNORE when the rate is exceeded zero times; a status of SCORE when the rate is exceeded one time; a status of HOLD when the rate is exceeded five times; and a status of REJECT when the rate is exceeded ten times. One skilled in the art will appreciate that this is only one example of the logical layout of data structures of the detection server. For example, rather than specifying the rate in each entry of the key table, the rate may be specified in a separate record or table. The data structures of the detection server may be tailored to the space/computation requirements of the detection server.
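By way of illustration only, the logical layout of Figure 3 might be modeled as in the following Python sketch. The names `KeyEntry`, `STATUS_THRESHOLDS`, `status_for_count`, and `effective_status` are hypothetical and do not appear in the specification. The threshold values are the example values of Figure 3, and treating a status as attained once the count reaches its threshold value is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Example status threshold table from Figure 3: (status, threshold value).
STATUS_THRESHOLDS = [("IGNORE", 0), ("SCORE", 1), ("HOLD", 5), ("REJECT", 10)]


@dataclass
class KeyEntry:
    """One key table entry; field names are illustrative, not from the specification."""
    key: str
    rate: int                          # rate specified for the key (units are an assumption)
    hit_count: int = 0
    exceeded_locally: int = 0
    exceeded_globally: int = 0
    status: str = "IGNORE"
    status_override: Optional[str] = None

    def effective_status(self) -> str:
        # A status in the override field takes precedence over the status field.
        return self.status_override if self.status_override is not None else self.status


def status_for_count(times_exceeded: int) -> str:
    """Return the status with the highest threshold value that the count has reached."""
    chosen = STATUS_THRESHOLDS[0][0]
    for status, threshold in sorted(STATUS_THRESHOLDS, key=lambda row: row[1]):
        if times_exceeded >= threshold:
            chosen = status
    return chosen
```

In this sketch a manually specified status, such as one entered by an authorized user, is stored in `status_override` and is returned in place of the computed status.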
[0031] Figure 4 is a flow diagram that illustrates the processing of a message delivery host, according to some embodiments. The message delivery host processes messages as part of providing its services. For example, an email message delivery host processes email messages as part of providing email services to email clients. In block 402, the message delivery host receives a message to process. In block 404, the message delivery host generates one or more keys from various attributes of the message. For each generated key (block 406), the message delivery host performs block 408, until all of the keys are processed (block 410). In block 408, the message delivery host utilizes the services of its detection server to determine the status associated with (e.g., assigned to) the key. For example, the message delivery host may send each key to the detection server along with a request for status. In block 412, the message delivery host processes the message according to the statuses associated with the keys, and completes.
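The key generation and status lookup of blocks 404 to 410 could be sketched as follows. The sketch assumes two message attributes, the sender IP address and the message body, in line with the attributes named in the claims. SHA-256 is an assumed choice of hash, and `generate_keys`, `collect_statuses`, and the `lookup_status` callable (standing in for the request sent to the detection server) are hypothetical names.

```python
import hashlib
from typing import Callable, Dict, List


def generate_keys(sender_ip: str, body: str) -> List[str]:
    """Block 404: derive keys from message attributes (attribute choice is illustrative)."""
    body_digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return [f"ip:{sender_ip}", f"body:{body_digest}"]


def collect_statuses(keys: List[str], lookup_status: Callable[[str], str]) -> Dict[str, str]:
    """Blocks 406-410: request a status for each generated key."""
    return {key: lookup_status(key) for key in keys}
```

Because the body key is a digest, two messages with identical bodies produce the same key, which is what allows a detection server to count repeated occurrences.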
[0032] One skilled in the art will appreciate that, for this and other processes and methods disclosed herein, the functions performed in the processes and methods may be implemented in differing order. Furthermore, the outlined steps are only exemplary, and some of the steps may be optional, combined with fewer steps, or expanded into additional steps.
[0033] Figure 5 is a flow diagram that illustrates the processing of the message delivery host to process a message in accordance with the status associated with keys, according to some embodiments. The message delivery host receives the statuses associated with the keys from the detection server, for example, in response to one or more requests for status made by the message delivery host. In decision block 502, if at least one of the statuses is an ACCEPT status, then the message delivery host continues at block 504, else the message delivery host continues at decision block 506. In block 504, the message delivery host processes the message in a normal manner, and completes. In this instance, the message delivery host processes the message as a good message. In decision block 506, if at least one of the statuses is a REJECT status, then the message delivery host continues at block 508, else the message delivery host continues at decision block 510. In block 508, the message delivery host rejects the message as a bad (e.g., bulk) message, and completes. In this instance, the message delivery host does not further process the message. In decision block 510, if at least one of the statuses is a HOLD status, then the message delivery host continues at block 512, else the message delivery host continues at decision block 514. In block 512, the message delivery host sends a copy of the message to a preconfigured address of a holding location, and completes. This message may then be further analyzed, for example, by spam analysts. In some embodiments, the message delivery host may continue to process the message. In other embodiments, the message delivery host does not further process the message. In decision block 514, if at least one of the statuses is a SCORE status, then the message delivery host continues at block 516, else the message delivery host continues at block 518. 
In block 516, the message delivery host indicates that the message is suspicious and then continues at block 518 to process the message in the normal manner. In block 518, the message delivery host processes the message as a good message in the normal manner, and completes.
[0034] Figure 6 is a flow diagram that illustrates the processing of the host component of the detection server, according to some embodiments. The host component is passed the key received from a host and returns the status associated with the key. In block 602, the host component receives the request for status of a key. In block 604, the host component updates the database entries for the key. In block 606, the host component queries the database for the status of the key. In block 608, the host component returns the status to the requestor (e.g., the message delivery host that sent the request for status), and completes.
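On the message delivery host side, the decision sequence of Figure 5 amounts to a fixed precedence among the statuses returned for a message's keys. A minimal sketch follows, in which `choose_action` and the action labels are hypothetical names chosen for illustration.

```python
from typing import Iterable

# Order in which Figure 5 tests the statuses (decision blocks 502, 506, 510, 514).
PRECEDENCE = [
    ("ACCEPT", "process_normally"),     # block 504
    ("REJECT", "reject"),               # block 508
    ("HOLD", "copy_to_holding"),        # block 512
    ("SCORE", "mark_suspicious"),       # block 516, followed by normal processing
]


def choose_action(statuses: Iterable[str]) -> str:
    """Return the action for the first status in precedence order that is present."""
    present = set(statuses)
    for status, action in PRECEDENCE:
        if status in present:
            return action
    return "process_normally"           # block 518: none of the tested statuses present
```

Under this ordering a single ACCEPT status outweighs a REJECT status on another key of the same message, consistent with decision block 502 being evaluated first.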
[0035] Figure 7 is a flow diagram that illustrates the processing of the host component of the detection server to update database entries for a key, according to some embodiments. The host component updates the entries corresponding to the key in the key table or creates a new entry for the key in the key table. In decision block 702, if an entry for the key is not in the database, then the host component continues at block 704, else the host component continues at block 708. In block 704, the host component creates an entry in the key table for the key. In block 706, the host component sets the status of the key to IGNORE, and completes. For example, the host component specifies a status of IGNORE in the status field of the entry for the key in the key table. In block 708, the host component increments the hit count for the key. For example, the host component can increment the hit count maintained in the hit count field of the entry for the key in the key table. In decision block 710, if the increment in the hit count does not cause the rate to be exceeded, then the host component completes, else the host component continues at block 712. In block 712, the host component increments the number of times the rate has been exceeded locally to the detection server. For example, the host component can increment the count maintained in the number of times rate exceeded locally field of the entry for the key in the key table. In decision block 714, if the occurrence of the rate being exceeded does not cause any of the specified thresholds to be crossed, then the host component completes, else the host component continues at block 716. The host component can check the thresholds specified in the status threshold table to determine if any one of the specified thresholds is crossed. In block 716, the host component assigns to the key the status associated with the threshold that was crossed, and completes.
Using the status threshold table of Figure 3 as an example, assuming that the threshold of 5 was crossed, the host component assigns to the key a status of HOLD. The host component can specify a status of HOLD in the status field of the entry for the key in the key table. If the host component determines that the rate is not exceeded (decision block 710) or that a threshold is not crossed (decision block 714), the host component does not update the status (e.g., the status field of the entry for the key in the key table) of the key. The host component can periodically clean up entries for keys in the key table.
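The update flow of Figure 7 could be sketched as follows. The specification does not define how the rate is measured, so the sketch assumes that the rate is a number of hits allowed per fixed time window and that counting restarts after each occasion on which the rate is exceeded. Both are assumptions, as are the names `Entry`, `update_key`, `window_start`, and `window_seconds`. The threshold values are the Figure 3 examples.

```python
from dataclasses import dataclass
from typing import Dict

THRESHOLDS = {1: "SCORE", 5: "HOLD", 10: "REJECT"}   # example values from Figure 3


@dataclass
class Entry:
    rate: int                 # assumed meaning: hits allowed per window
    window_start: float
    hit_count: int = 0        # hits in the current window (an assumption)
    exceeded_locally: int = 0
    status: str = "IGNORE"


def update_key(table: Dict[str, Entry], key: str, now: float,
               rate: int = 3, window_seconds: float = 60.0) -> str:
    """Update the entry for a key and return its status (blocks 702-716)."""
    entry = table.get(key)
    if entry is None:                                   # blocks 702-706: new key
        table[key] = Entry(rate=rate, window_start=now)
        return "IGNORE"
    if now - entry.window_start >= window_seconds:      # assumed windowing, not in the text
        entry.window_start, entry.hit_count = now, 0
    entry.hit_count += 1                                # block 708
    if entry.hit_count <= entry.rate:                   # block 710: rate not exceeded
        return entry.status
    entry.exceeded_locally += 1                         # block 712
    entry.hit_count = 0                                 # assumed: start counting afresh
    entry.window_start = now
    if entry.exceeded_locally in THRESHOLDS:            # block 714: threshold crossed?
        entry.status = THRESHOLDS[entry.exceeded_locally]   # block 716
    return entry.status
```

With a rate of 2, for example, the third hit after the entry is created exceeds the rate for the first time and moves the key from IGNORE to SCORE. The status changes to HOLD only when the rate has been exceeded five times.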
[0036] Figure 8 is a flow diagram that illustrates the processing of the peer component of the detection server to advertise suspicious keys to peer detection servers, according to some embodiments. The peer component periodically broadcasts to peer detection servers the keys that have been identified as being suspicious by the detection server. In block 802, the peer component identifies the suspicious keys. For example, the peer component can identify the keys in the key table whose statuses are specified to be either SCORE, HOLD, or REJECT. In block 804, the peer component sends an advertisement of the suspicious keys to the peer detection servers, and completes. For example, the peer component can broadcast to the peer detection servers a message that includes an indication of the suspicious keys and data corresponding to each suspicious key, such as, by way of example, the hit count, the number of times the rate was exceeded locally, other statistical information, and the like.
[0037] Figure 9 is a flow diagram that illustrates the processing of the peer component of the detection server to process suspicious keys advertised by peer detection servers, according to some embodiments. The peer component is passed the advertisement of suspicious keys received from a peer detection server. The peer component then updates the entries corresponding to the suspicious keys in the key table or creates new entries for the suspicious keys in the key table. In block 902, the peer component receives an advertisement of suspicious keys from a peer detection server. For each suspicious key advertised (block 904), the peer component performs blocks 906 to 910, until all of the advertised suspicious keys are processed (block 912). In block 906, the peer component updates the database entries for the key with the information regarding the suspicious key received with the advertisement.
For example, if an entry for the suspicious key exists in the key table, then the peer component updates the entries corresponding to the suspicious key in the key table with the information regarding the suspicious key received with the advertisement. If an entry for the suspicious key does not exist in the key table, then the peer component creates a new entry for the suspicious key in the key table and updates the entries corresponding to the suspicious key in the key table with the information regarding the suspicious key received with the advertisement. In decision block 908, if the advertised suspicious key and, in particular, the information regarding the suspicious key received with the advertisement causes any of the specified thresholds to be crossed, then the peer component continues at block 910, else the peer component continues at block 912 to process the next advertised key. In block 910, the peer component assigns to the suspicious key the status associated with the threshold that was crossed.
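One way the advertisement processing of Figure 9 might look is sketched below. The specification leaves open how advertised counts are combined. The sketch assumes that each peer advertises its own count of times the rate was exceeded, that only the latest value per peer is kept so that periodic re-advertisements are not counted twice, and that the status is chosen from the sum of the local and global counts. The names `process_advertisement`, `status_for`, and `per_peer` are hypothetical.

```python
from typing import Dict

THRESHOLDS = [(0, "IGNORE"), (1, "SCORE"), (5, "HOLD"), (10, "REJECT")]  # Figure 3 example


def status_for(count: int) -> str:
    """Status with the highest threshold value that a non-negative count has reached."""
    return max((t, s) for t, s in THRESHOLDS if count >= t)[1]


def process_advertisement(key_table: Dict[str, dict], peer_id: str,
                          advertised: Dict[str, int]) -> None:
    """Blocks 904-912: merge a peer's per-key 'rate exceeded' counts into the key table."""
    for key, peer_count in advertised.items():
        entry = key_table.setdefault(                       # create the entry if missing
            key, {"exceeded_locally": 0, "per_peer": {}, "exceeded_globally": 0,
                  "status": "IGNORE"})
        entry["per_peer"][peer_id] = peer_count             # block 906 (latest value wins)
        entry["exceeded_globally"] = sum(entry["per_peer"].values())
        total = entry["exceeded_locally"] + entry["exceeded_globally"]
        new_status = status_for(total)                      # block 908
        if new_status != entry["status"]:
            entry["status"] = new_status                    # block 910
```

Keeping one value per peer is a design choice of the sketch. An implementation that instead added every advertised count to a running total would escalate statuses faster whenever the same information is advertised repeatedly.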
[0038] Figure 10 is a flow diagram that illustrates the processing of the central detection server to push suspicious keys to the detection servers, according to some embodiments. The central detection server periodically pulls (obtains) information regarding the keys from the detection servers, processes the pulled information to identify suspicious keys, and pushes (distributes) the information regarding the suspicious keys to the detection servers. In block 1002, the central detection server pulls from the detection servers information regarding the keys processed by the detection servers. In block 1004, the central detection server consolidates the information regarding the keys pulled from the detection servers. In block 1006, the central detection server identifies suspicious keys using the consolidated information. In block 1008, the central detection server pushes information regarding the identified suspicious keys to the detection servers, and completes.
[0039] Figure 11 is a flow diagram that illustrates the processing of the central detection server to push manual input, according to some embodiments. The central detection server may provide a UI that allows authorized users to view and/or input information regarding keys. The central detection server pushes to the detection servers input provided by the authorized users. In block 1102, the central detection server receives input regarding a key. For example, a spam analyst may input a key (which has not been detected by any of the detection servers) and indicate that the input key is a confirmed bad key (specify for the input key a status of REJECT). In block 1104, the central detection server pushes the input information (the key and related information) to the detection servers, and completes.
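The consolidation and identification steps of Figure 10 (blocks 1004 and 1006) could be sketched as follows. The sketch assumes that the pulled information is, for each detection server, a mapping from key to the number of times the rate was exceeded, and that a key is suspicious when its consolidated count reaches a configurable minimum. Neither the data shape nor the criterion is given in the specification, and `consolidate`, `identify_suspicious`, and `minimum` are hypothetical names.

```python
from collections import Counter
from typing import Dict, List


def consolidate(pulled: List[Dict[str, int]]) -> Counter:
    """Block 1004: sum each key's 'rate exceeded' count across detection servers."""
    totals: Counter = Counter()
    for per_server in pulled:
        totals.update(per_server)
    return totals


def identify_suspicious(totals: Dict[str, int], minimum: int = 1) -> Dict[str, int]:
    """Block 1006: keep keys whose consolidated count reaches the assumed minimum."""
    return {key: count for key, count in totals.items() if count >= minimum}
```

A key that stays below every local threshold in each datacenter can still be identified in this global view, because its counts are summed before the comparison.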
[0040] Figure 12 is a flow diagram that illustrates the processing of the central server component of the detection server to process keys pushed by the central detection server, according to some embodiments. The central server component receives information regarding keys from the central detection server, and updates the entries corresponding to the received keys in the key table or creates new entries for the received keys in the key table. In block 1202, the central server component receives keys, and information regarding keys, from the central detection server. For each key or information regarding the key received (block 1204), the central server component performs block 1206, until all of the received keys are processed (block 1208). In block 1206, the central server component updates the database entries for the key with the information regarding the key received from the central detection server. If an entry for the key does not exist in the key table, then the central server component creates a new entry for the key in the key table and updates the entries corresponding to the key in the key table with the information regarding the key received from the central detection server. For example, the central server component may have received information which indicates that a currently suspicious key should now be processed as a good, non-suspicious key (e.g., that the status of the suspicious key should be overridden with a status of ACCEPT). In this instance, the central server component specifies a status of ACCEPT in the status override field of the entry for the key in the key table.
[0041] From the foregoing, it will be appreciated that specific embodiments of the detection server have been described herein for purposes of illustration, but that various modifications may be made without deviating from the spirit and scope of the invention.
Although the detection server was described in the context of detecting bulk or unwanted email messages, one skilled in the art will appreciate that the detection server can detect messages other than email messages. For example, the message delivery hosts may be web servers processing various web messages. In this instance, the message delivery hosts may generate keys from various attributes of a web message and send the generated keys to an appropriate detection server to obtain the statuses of the keys. The detection server can thus detect bulk or unwanted web messages. Accordingly, although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims

CLAIMS I/We claim:
1. A computer-implemented method for detecting unwanted messages in real-time at a message delivery host, the method comprising: generating at least one key for a message based on an attribute of the message (404); for each generated key, determining a status associated with the key (408); and processing the message according to the statuses associated with the generated keys (412).
2. The method of claim 1, wherein the attribute of the message is an Internet Protocol (IP) address of a sender of the message.
3. The method of claim 1, wherein the attribute of the message is a body part of the message.
4. The method of claim 1, wherein the key is a hash print of a body part of the message.
5. The method of claim 1, wherein the status is obtained from a remote detection server.
6. The method of claim 1, wherein the status indicates to normally process the message (504, 518).
7. The method of claim 1, wherein the status indicates to reject the message (508).
8. The method of claim 1, wherein the status indicates to send a copy of the message to a preconfigured address (512).
9. The method of claim 1, wherein the status indicates to indicate the message as suspicious (516).
10. The method of claim 1, wherein each status associated with a key is based on a count of a number of times a specified rate is exceeded and at least one threshold value, each threshold value specifies the number of times the rate has to be exceeded for the key to attain the corresponding status.
11. The method of claim 10, wherein the count of a number of times the specified rate is exceeded is a local count.
12. The method of claim 10, wherein the count of a number of times the specified rate is exceeded is a global count.
13. A computer-implemented method for assigning statuses to keys at a detection server, the keys based on attributes of messages, the method comprising: providing an association between a plurality of threshold values and corresponding statuses, wherein each threshold value specifies a number of times a rate has to be exceeded; for each key, providing a count of a number of times the rate is exceeded on the detection server; receiving from a message delivery host an indication of a key; determining whether the rate is exceeded (710); and upon determining that the rate is exceeded, incrementing the count of the number of times the rate is exceeded on the detection server (712); determining whether one of the threshold values is crossed (714); and upon determining that one of the threshold values is crossed, assigning to the key the status associated with the threshold value that is crossed (716).
14. The method of claim 13 further comprising: for each key, providing a count of a number of times the rate is exceeded globally; and upon determining that the rate is exceeded, updating the count of the number of times the rate is exceeded globally.
15. The method of claim 14 further comprising: receiving from a peer detection server an advertisement of a suspicious key (902); updating the count of the number of times the rate is exceeded globally for the suspicious key (906); determining whether one of the threshold values is crossed (908); and upon determining that one of the threshold values is crossed, assigning to the suspicious key the status associated with the threshold value that is crossed (910).
16. The method of claim 13, wherein the status assigned to the key indicates that the key is a suspicious key.
17. The method of claim 13 further comprising: identifying suspicious keys (802); and sending an advertisement of suspicious keys to peer detection servers (804).
18. A detection server comprising: a host component (202) that receives a request for a status of a key from a message delivery host, that determines the status of the specified key, and that sends the status of the key to the message delivery host; and a peer component (204) that sends advertisements of suspicious keys to peer detection servers, and that receives advertisements of suspicious keys from peer detection servers.
19. The server of claim 18, wherein the peer component (204) also assigns statuses to keys based on advertisements received from peer detection servers.
20. The server of claim 18, wherein the host component (202) also assigns statuses to keys based on requests for statuses of keys received from message delivery hosts.
PCT/US2007/016374 2006-07-18 2007-07-18 Real-time detection and prevention of bulk messages WO2008011104A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/458,342 US7734703B2 (en) 2006-07-18 2006-07-18 Real-time detection and prevention of bulk messages
US11/458,342 2006-07-18

Publications (2)

Publication Number Publication Date
WO2008011104A2 true WO2008011104A2 (en) 2008-01-24
WO2008011104A3 WO2008011104A3 (en) 2008-03-27

Family

ID=38957370

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/016374 WO2008011104A2 (en) 2006-07-18 2007-07-18 Real-time detection and prevention of bulk messages

Country Status (2)

Country Link
US (1) US7734703B2 (en)
WO (1) WO2008011104A2 (en)

Also Published As

Publication number Publication date
US7734703B2 (en) 2010-06-08
WO2008011104A3 (en) 2008-03-27
US20080021961A1 (en) 2008-01-24

Similar Documents

Publication Publication Date Title
US10181957B2 (en) Systems and methods for detecting and/or handling targeted attacks in the email channel
US10938694B2 (en) System and method for detecting sources of abnormal computer network messages
US7519818B2 (en) Method and system for processing a communication based on trust that the communication is not unwanted as assigned by a sending domain
JP5047624B2 (en) A framework that enables the incorporation of anti-spam techniques
US7734703B2 (en) Real-time detection and prevention of bulk messages
KR100938072B1 (en) Framework to enable integration of anti-spam technologies
US8145710B2 (en) System and method for filtering spam messages utilizing URL filtering module
US7899849B2 (en) Distributed security provisioning
US8763114B2 (en) Detecting image spam
US9361605B2 (en) System and method for filtering spam messages based on user reputation
US20160301705A1 (en) Suspicious message processing and incident response
US20140259142A1 (en) Systems and methods for detecting undesirable network traffic content
US20140366144A1 (en) Multi-dimensional reputation scoring
US20100281536A1 (en) Phish probability scoring model
EP2115642A1 (en) Web reputation scoring
US20130246592A1 (en) Electronic message manager system, method, and computer program product for scanning an electronic message for unwanted content and associated unwanted sites
US20200106791A1 (en) Intelligent system for mitigating cybersecurity risk by analyzing domain name system traffic metrics
US20090019121A1 (en) Message processing
Sadan et al. Social network analysis of web links to eliminate false positives in collaborative anti-spam systems
US20060075099A1 (en) Automatic elimination of viruses and spam
US8122498B1 (en) Combined multiple-application alert system and method
US7653812B2 (en) Method and system for evaluating confidence in a sending domain to accurately assign a trust that a communication is not unwanted
US20090210500A1 (en) System, computer program product and method of enabling internet service providers to synergistically identify and control spam e-mail
US7577984B2 (en) Method and system for a sending domain to establish a trust that its senders communications are not unwanted
EP3948609B1 (en) Defanging malicious electronic files based on trusted user reporting

Legal Events

Date Code Title Description
NENP Non-entry into the national phase (Ref country code: DE)
NENP Non-entry into the national phase (Ref country code: RU)
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 07836147; Country of ref document: EP; Kind code of ref document: A2)
122 EP: PCT application non-entry in European phase (Ref document number: 07836147; Country of ref document: EP; Kind code of ref document: A2)