WO2000060860A1 - Computer system and method for sharing a job on a computer network using IP multicast

Info

Publication number
WO2000060860A1
WO2000060860A1 PCT/US2000/005560 US0005560W WO0060860A1 WO 2000060860 A1 WO2000060860 A1 WO 2000060860A1 US 0005560 W US0005560 W US 0005560W WO 0060860 A1 WO0060860 A1 WO 0060860A1
Authority
WO
WIPO (PCT)
Prior art keywords
job
computer system
network
shared
computer systems
Application number
PCT/US2000/005560
Inventor
Thomas Alan Gall
Original Assignee
International Business Machines Corporation
Application filed by International Business Machines Corporation

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061: Partitioning or combining of resources
    • G06F9/5072: Grid computing
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00: Data switching networks
    • H04L12/02: Details
    • H04L12/16: Arrangements for providing special services to substations
    • H04L12/18: Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/185: Arrangements for providing special services to substations for broadcast or conference, e.g. multicast with management of multicast group membership

Definitions

  • This invention generally relates to data processing, and more specifically relates to the sharing of jobs between computers on a network.
  • A computer system on a network uses IP multicast to recruit other computer systems to share in the processing of a job. If a computer system on the network wants to be available to process shared jobs, it first registers for job sharing by invoking an IP multicast router at a particular IP address. All messages sent to the IP multicast router are broadcast to all computer systems that are registered with the router. When a computer system has a job to share, it recruits other computer systems to help process the job by sending a message to the IP multicast router that corresponds to a request to share the job. The candidate computer systems that receive the recruiter's broadcast determine if they can share the job according to one or more job sharing parameters.
  • If a computer system meets the parameters for taking on the particular job, it responds to the recruiter. If the recruiter still needs help (e.g., if not enough candidate systems have responded yet), the recruiter grants the response and delivers the job to the computer system. The computer system then performs the job (or task) and returns the results to the recruiter.
  • FIG. 1 is a block diagram of a computer system that may be networked with other computer systems in accordance with a preferred embodiment of the present invention;
  • FIG. 2 is a block diagram of several computer systems of FIG. 1 that are all coupled together on a network via an IP multicast router;
  • FIG. 3 is a block diagram showing several possible job sharing parameters; and
  • FIG. 4 is a flow diagram illustrating a method for sharing jobs via IP multicast in accordance with the preferred embodiments.
  • The present invention is accomplished through sharing portions of jobs on computers that are connected on a network.
  • Networking software typically defines a protocol for exchanging information between computers on a network. Many different network protocols are known in the art. Examples of commercially-available networking software are Novell Netware and Windows NT, which each implement different protocols for exchanging information between computers.
  • TCP/IP: Transmission Control Protocol/Internet Protocol
  • LANs: local area networks
  • A remote command "rem" may be invoked by a recruiter.
  • A computer system on the network, referred to herein as a "rem server", is dedicated to responding to the rem command to find an available candidate.
  • The rem server locates an idle machine on the network, and executes a command on the idle machine to share a job.
  • Known techniques of communicating between computers using text input and output protocols are used to communicate between the recruiter and the candidate. Note that invoking the rem command results in finding a single candidate for sharing a job.
  • Another known method for sharing a job relates to compiling source code.
  • Source code is computer code that is written in a high-level human-readable programming language that must be translated to a machine-readable version that is executable on a processor.
  • Source code is usually arranged in modules. Each module is generally a separate compilation unit, which means it can be compiled separately from other modules. When a large program is programmed in a modular fashion, the compilation of different modules can be performed by different computers. Compiling source code is a common step in software development that takes a very long time to perform for complex computer programs. By farming out the compilation of different modules to different computer systems on a network, the compilation time can be drastically reduced.
  • One known method for compiling different modules on different computer systems uses an "rcomp" command, which stands for "remote compile".
  • A computer system that is dedicated to handling rcomp commands (referred to herein as an "rcomp server") examines the available compile servers on the network to determine whether they are compatible compile servers, and whether they have sufficient capacity to perform the requested compilation. When searching for compile servers, preference is given to larger compile servers and to compile servers that do not have other rcomp jobs running on them. Once a candidate compile server is selected, the rcomp server executes a command on the candidate that sends messages via text output back to the recruiter, and that optionally receives messages via text input from the recruiter.
  • Rcomp is generally used for commands (such as compilations) that require more resources than are available on the recruiter. It is typically not intended for small or interactive jobs.
  • IP Multicast can best be understood by providing an analogy to commonly-known radio transmitters and receivers.
  • A radio transmitter, such as a transmitter for a local radio station, continuously broadcasts its programming on a particular frequency. To listen to the radio station, one must tune a radio receiver to the frequency corresponding to the radio station's transmitter.
  • An IP Multicast router performs functions analogous to a radio transmitter: it continuously broadcasts information to any computer systems that are "tuned in" to the multicast channel.
  • A computer system effectively "tunes in" by invoking a particular reserved Internet Protocol (IP) address corresponding to the multicast router to register with the multicast router. Once registered, the computer system will receive all messages broadcast by the multicast router.
  • IP: Internet Protocol
  • One significant difference between IP multicast and the radio analogy is that computer systems, once registered with the multicast router, can also send messages to the router for distribution to all of the registered computer systems. In this manner a computer system can communicate with a large number of other computer systems at the same time without individually communicating with each one, and without knowing what other computer systems are job sharing candidates.
  • A computer system registers with an IP multicast router for job sharing, and then receives all messages that are sent to that router.
  • When a computer wants help in processing a job, it is referred to herein as a recruiter, and sends a recruiting message to the IP multicast router, which routes the message to all registered computer systems.
  • These registered computer systems are candidates to share the job. Each candidate looks to see if it can share the job, and if it can, it responds to the recruiter. If the recruiter still needs help when the candidate responds, it sends the job to the candidate for processing. The candidate then processes the job and returns the results to the recruiter.
  • Computer system 100 is an enhanced IBM AS/400 computer system, and represents one suitable type of computer system that can be networked together in accordance with the preferred embodiment.
  • Computer system 100 comprises a processor 110 connected to a main memory 120, a mass storage interface 130, a terminal interface 140, and a network interface 150. These system components are interconnected through the use of a system bus 160.
  • Mass storage interface 130 is used to connect mass storage devices (such as a direct access storage device 155) to computer system 100.
  • One specific type of direct access storage device is a floppy disk drive, which may store data to and read data from a floppy diskette 195.
  • Main memory 120 contains data 122, an operating system 123, and a job sharing processor 124.
  • Job sharing processor 124 includes a user interface 125, a registration mechanism 126, a job recruiter 127, a job acceptor 128, and one or more job sharing parameters 129.
  • Job sharing processor 124 handles both requests by computer system 100 for sharing a job with other computer systems, as well as requests by other computer systems for computer system 100 to share a job.
  • Computer system 100 may be a recruiter for a job it controls, and can also be a candidate for jobs that are controlled by other computer systems.
  • Data 122 represents any data that serves as input to or output from any program in computer system 100.
  • Operating system 123 is a multitasking operating system known in the industry as OS/400; however, those skilled in the art will appreciate that the spirit and scope of the present invention is not limited to any one operating system.
  • Job sharing processor 124 includes a user interface 125 that allows a user to specify parameters relating to job sharing in general, relating to a specific job to be shared, or relating to performance, security, or other parameters.
  • User interface 125 provides a mechanism for a user to specify one or more acceptance parameters for allowing computer system 100 to undertake shared jobs from other computer systems. For example, a menu could allow a user to specify default times for allowable shared jobs, such as at lunch and during the hours the user is not at work. In addition, the user could dynamically enter allowable times for shared jobs. If the user has a two hour meeting in the afternoon, for example, during which the user will be away from his computer (and therefore not using it), the user could simply enter the time of the meeting as an allowable time period for shared jobs. In addition, the user interface 125 also provides an operation that allows a user to cancel (i.e., kill) jobs that are processing.
  • The user may abort the job to regain the full processing capacity of the user's computer system. The aborted job will then have to be re-started elsewhere on the network.
  • Job sharing processor 124 includes a registration mechanism 126 for registering computer system 100 for shared jobs.
  • Registration mechanism 126 includes intelligence for performing predefined functions required by the specifics of the implemented IP multicast protocol to add computer system 100 to the list of recipients for IP multicast messages. For some IP multicast systems, registration is simply a matter of invoking a command at a predetermined IP address that corresponds to the IP multicast router. The mechanics of how to register to receive IP multicasts are known in the art, and are not discussed in detail herein. All mechanisms and methods for registering computer system 100 for shared jobs are within the scope of the present invention.
  • Job sharing processor 124 includes a job recruiter 127 and a job acceptor 128.
  • Job recruiter 127 is that portion of job sharing processor 124 that recruits other computer systems to help in processing a shared job.
  • Job acceptor 128 is that portion of job sharing processor 124 that monitors recruiting requests from other computer systems to help in processing a shared job, and that determines whether or not to respond to the request.
  • Job sharing parameters 129 are a collection of attributes relating to job sharing.
  • Job sharing parameters 129 may include recruiting parameters for the job to be shared that candidate computer systems must meet to accept a job from the recruiter.
  • Job sharing parameters may include acceptance attributes that determine whether or not computer system 100 may be used to process the shared job. Examples of possible attributes contained within job sharing parameters 129 are discussed below with reference to FIG. 3.
  • Computer system 100 utilizes well known virtual addressing mechanisms that allow the programs of computer system 100 to behave as if they only have access to a large, single storage entity instead of access to multiple, smaller storage entities such as main memory 120 and DASD device 155. Therefore, while data 122, operating system 123, and job sharing processor 124 are shown to reside in main memory 120, those skilled in the art will recognize that these items are not necessarily all completely contained in main memory 120 at the same time. It should also be noted that the term "memory" is used herein to generically refer to the entire virtual memory of computer system 100.
  • Processor 110 may be constructed from one or more microprocessors and/or integrated circuits. Processor 110 executes program instructions stored in main memory 120. Main memory 120 stores programs and data that processor 110 may access.
  • When computer system 100 starts up, processor 110 initially executes the program instructions that make up operating system 123.
  • Operating system 123 is a sophisticated program that manages the resources of computer system 100. Some of these resources are processor 110, main memory 120, mass storage interface 130, terminal interface 140, network interface 150, and system bus 160.
  • Although computer system 100 is shown to contain only a single processor and a single system bus, those skilled in the art will appreciate that the present invention may be practiced using a computer system that has multiple processors and/or multiple buses.
  • The interfaces (called input/output processors in AS/400 terminology) that are used in the preferred embodiment each include separate, fully programmed microprocessors that are used to off-load compute-intensive processing from processor 110.
  • I/O adapters to perform similar functions.
  • Terminal interface 140 is used to directly connect one or more terminals 165 to computer system 100.
  • These terminals 165, which may be non-intelligent (i.e., dumb) terminals or fully programmable workstations, are used to allow system administrators and users to communicate with computer system 100. Note, however, that while terminal interface 140 is provided to support communication with one or more terminals 165, computer system 100 does not necessarily require a terminal 165, because all needed interaction with users and other processes may occur via network interface 150.
  • Network interface 150 is used to connect other computer systems and/or workstations (e.g., 175 in FIG. 1) to computer system 100 across a network 170.
  • The present invention applies equally no matter how computer system 100 may be connected to other computer systems and/or workstations, regardless of whether the network connection 170 is made using present-day analog and/or digital techniques or via some networking mechanism of the future.
  • Many different network protocols can be used to implement a network. These protocols are specialized computer programs that allow computers to communicate across network 170. TCP/IP (Transmission Control Protocol/Internet Protocol) is an example of a suitable network protocol.
  • Signal bearing media include: recordable type media such as floppy disks (e.g., 195 of FIG. 1) and CD ROM, and transmission type media such as digital and analog communications links.
  • One particular example of a networked computer system 200 in accordance with the preferred embodiments includes multiple computer systems 100 of FIG. 1, shown in FIG. 2 as 100A through 100E, all coupled to an IP multicast router 210. Any message sent by any of computer systems 100A through 100E to IP multicast router 210 is automatically broadcast to all computer systems that are registered with IP multicast router 210.
  • The configuration of FIG. 2 assumes that each of these computer systems 100A-100E has previously registered with IP multicast router 210, and is therefore a candidate for job sharing.
  • Each computer system has a corresponding job sharing processor 124.
  • Computer systems 100A and 100E are designated as servers, and computer systems 100B, 100C, and 100D are designated as clients.
  • Job sharing processor 124 includes the functionality for both recruiting other computer systems to help process a job, as well as helping other computer systems to process their jobs.
  • Job attributes 310 define attributes of the job to be shared.
  • Network attributes 320 define network performance parameters that must be met for a candidate computer system to be able to accept a job to be shared.
  • Security attributes 330 define attributes that must be met for a candidate computer system to be able to accept a job to be shared. Note that these three classes of attributes are shown by way of example, and other classes and types of attributes are clearly within the scope of the preferred embodiments disclosed herein.
  • Job attributes 310 include information relating to the job to be shared. Examples of suitable information that may be included in job attributes 310 include: where (i.e., what IP address) to apply to do the job; software that must be installed on the candidate computer system to do the job; status of data on the candidate computer system (e.g., is your database journal current to 03/15/99?); an estimate of time or CPU cycles required to perform the job; memory required to run the job; and disk space required to run the job. Job attributes 310 thus describe pre-requisites for the job in a way that gives enough information to the candidate computer systems that they know whether or not they are capable of accepting the job to be shared.
  • Network attributes 320 include information relating to required network performance for a computer system to accept a job.
  • Suitable network performance attributes include ping time, hops from a host, connection speed, and network congestion.
  • Ping time refers to the time it takes a candidate computer system to reply to a request from a recruiter, and is usually specified in milliseconds.
  • Specifying a maximum allowable ping time allows the recruiter to limit candidate computer systems to those that are reasonably close to the recruiter, and therefore have a fast ping time. The rationale for providing this ping time attribute is that it wouldn't make much sense for a computer in China to share a job with a computer in Mexico if other closer computer systems could be used.
  • Specifying maximum ping time is an easy way to restrict candidate computer systems to those that can be communicated with relatively quickly.
  • Hops from a host is another suitable network attribute that specifies how many routers are passed through to go between the recruiter and the candidate. Hops from a host is another measure of network performance, so the prospective candidates can be limited, for example, to those systems that are no more than two hops from the recruiter.
  • Another measure of network performance is connection speed. Specifying minimum connection speed allows a recruiter to specify the minimum required bandwidth for job sharing. If a job to be shared requires 200 megabytes to be loaded on the candidate computer system, a minimum connection speed could be specified to prevent low-bandwidth computers (such as those that have a 28.8 kbps modem connection) from accepting the job.
  • Another measure of network performance is network congestion, which is a measure of how busy a network is.
  • A computer system might have a 1 megabit per second network connection speed, but the network may be so congested that only 10 kilobits per second is getting through to the candidate computer system. Specifying allowable network congestion in bits per second of network throughput prevents overly-congested candidates from accepting the job.
  • Security attributes 330 allow specifying parameters that further limit which candidate computer systems can accept a job to be shared. The reason for providing security attributes 330 is to prevent some types of jobs from being shared with some types of computer systems. For example, a software compile job in a computer system that is in a software development group might be shared with an available candidate computer system in the accounting department, but payroll processing from the accounting department would probably not be allowed on computer systems in the software development group. A compile job of a proprietary computer program in the software group would likely be limited to the computers within the company, and may be further limited to computers within a particular group. Security attributes 330 allow specifying that the candidate computer systems must be in the same group or same company. In addition, other security attributes 330 may be defined to restrict job sharing to a predefined type of candidate computer system.
  • A method 400 for sharing a job in accordance with the present invention starts when a recruiter computer system has a job to share (step 410).
  • Part of the job sharing parameters for the job to be shared might include the number of nodes N that are needed to process the job.
  • The term "node" is used to refer to computer systems on the network, as is known in the art. Note that the flow diagram of FIG. 4 is divided by a vertical dotted line, with the left half representing the flow steps for the recruiter, and the right half representing the flow steps for each candidate node.
  • The recruiter makes its desire to share the job known by broadcasting an advertisement to all nodes on the network (step 412).
  • The recruiter's advertisement preferably includes job sharing parameters 129 that specify attributes that a candidate must satisfy to be able to accept the job to be shared.
  • The broadcasting of the recruiter's advertisement in the preferred embodiment corresponds to sending a message to the IP multicast router 210, which then transmits the message to all registered nodes (i.e., candidates).
  • The node broadcasts its acceptance to all nodes (step 434).
  • This acceptance is also via IP multicast to allow other candidate nodes to monitor when a node accepts the job in step 430.
  • The recruiter listens for responses to the recruiting advertisement (step 414), and accepts the first N responses (step 440).
  • The recruiter then sends a response to the node's acceptance that was sent in step 434, accepting the first N responses and rejecting other responses (step 440).
  • The response message from the recruiter could be sent to the node via IP multicast, but is more likely a unicast message to only the affected node.
  • The recruiter takes this processed job information and uses it, in conjunction with the processed job results from the other nodes, to process the information to complete the overall job (step 470).
  • The communication between recruiter and candidate in steps 440 through 470 of FIG. 4 is preferably performed in a unicast manner, directly between recruiter and candidate, rather than cluttering the IP multicast network with information that is only of use to these two computer systems.
  • The present invention allows for job sharing via IP multicast without requiring any job-specific intelligence to be put on potential job sharing candidate computer systems.
  • The pre-requisites for performing the job are specified in the job sharing parameters.
  • A candidate examines the job sharing parameters to see if it qualifies to take on the shared job. It is even possible to download executable software from the recruiter to the candidate so the candidate can then process the job using the downloaded software.
  • Job sharing using IP multicast is greatly simplified over the prior art methods for job sharing, which require job-specific intelligence to be installed on each candidate.
  • When a computer system has a job to be shared, it sends a recruiting message via IP multicast that includes an advertisement for the job along with the job sharing parameters.
  • Each candidate examines the job sharing parameters to determine if it qualifies to do the job.
  • The candidate can respond to the recruiter by accepting the job. Assuming that the recruiter accepts the candidate's response, the job is then passed from the recruiter to the candidate, which processes the job and returns the result to the recruiter. In this manner large computational problems may be distributed in discrete pieces to different computer systems on a computer network.
  • The most readily apparent application for job sharing in accordance with the present invention is in a company network, such as an Intranet, that interconnects computer systems within a company.
  • This type of job sharing could be tightly controlled using security attributes, password authorization, etc.
  • The invention can also be applied on a much larger scale in a much less secure environment.
  • Computers on the Internet could register to share jobs when they have spare computing capacity. For example, the Search for Extraterrestrial Intelligence agency of the U.S. government
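
As an illustration of the job sharing parameters of FIG. 3 described in the excerpts above, the following Python sketch models job attributes 310, network attributes 320, and security attributes 330 as simple records and shows how a candidate might test itself against them. The class names, field names, and units are assumptions made for this example only; the specification does not prescribe a data layout or a particular set of fields.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class JobAttributes:              # corresponds loosely to job attributes 310
    required_software: Set[str] = field(default_factory=set)
    memory_mb: int = 0            # memory required to run the job
    disk_mb: int = 0              # disk space required to run the job


@dataclass
class NetworkAttributes:          # corresponds loosely to network attributes 320
    max_ping_ms: Optional[float] = None      # None means "no limit specified"
    max_hops: Optional[int] = None
    min_speed_kbps: Optional[float] = None


@dataclass
class SecurityAttributes:         # corresponds loosely to security attributes 330
    required_group: Optional[str] = None
    required_company: Optional[str] = None


@dataclass
class JobSharingParameters:       # hypothetical container for parameters 129
    job: JobAttributes
    network: NetworkAttributes
    security: SecurityAttributes


@dataclass
class CandidateState:             # what a candidate knows about itself (assumed)
    installed_software: Set[str]
    free_memory_mb: int
    free_disk_mb: int
    ping_ms: float
    hops: int
    speed_kbps: float
    group: str
    company: str


def qualifies(params: JobSharingParameters, me: CandidateState) -> bool:
    """Return True only if the candidate meets every advertised attribute."""
    job, net, sec = params.job, params.network, params.security
    checks = [
        job.required_software <= me.installed_software,   # subset test
        me.free_memory_mb >= job.memory_mb,
        me.free_disk_mb >= job.disk_mb,
        net.max_ping_ms is None or me.ping_ms <= net.max_ping_ms,
        net.max_hops is None or me.hops <= net.max_hops,
        net.min_speed_kbps is None or me.speed_kbps >= net.min_speed_kbps,
        sec.required_group is None or me.group == sec.required_group,
        sec.required_company is None or me.company == sec.required_company,
    ]
    return all(checks)
```

Because every limit is optional, a recruiter in this sketch can advertise only the attributes that matter for a given job, which mirrors the statement that the three attribute classes are shown by way of example.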

Abstract

A computer system (100) on a network (170) uses IP multicast to recruit other computer systems (175) to share in the processing of a job. If a computer (100) on the network (170) wants to be available to process shared jobs, it first registers for job sharing by invoking an IP multicast router (210) at a particular IP address. All messages sent to the IP multicast router (210) are broadcast to all computer systems that are registered with the router. When a computer system (100) has a job to share, it recruits other computer systems to help process the job by sending a message to the IP multicast router (210) that corresponds to a request to share the job. The candidate computer systems (175) that receive the recruiter's broadcast determine if they can share the job according to one or more job sharing parameters (129). These parameters may relate to the job itself, network performance, security, or other criteria for sharing. If a computer system meets the parameters for taking on the particular job, it responds to the recruiter. If the recruiter still needs help (e.g., if not enough candidate systems have responded yet), the recruiter grants the response and delivers the job to the computer system. The computer system then performs the job (or task) and returns the results to the recruiter.
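
The registration step summarized above can be illustrated with standard sockets. In conventional IP multicast, the nearest analogue to "registering with the router" is joining a multicast group address, after which datagrams sent to that group are delivered to the host. The Python sketch below assumes that mapping; the group address, port, and function names are hypothetical choices for this example and are not taken from the specification.

```python
import socket
import struct

JOB_SHARE_GROUP = "239.1.2.3"   # hypothetical administratively scoped group address
JOB_SHARE_PORT = 5007           # hypothetical UDP port for job sharing traffic


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq value: group address followed by local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def register_for_job_sharing(group: str = JOB_SHARE_GROUP,
                             port: int = JOB_SHARE_PORT) -> socket.socket:
    """Join the multicast group so recruiting advertisements are received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # IP_ADD_MEMBERSHIP asks the host's IP stack to join the group.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def advertise(message: bytes, group: str = JOB_SHARE_GROUP,
              port: int = JOB_SHARE_PORT, ttl: int = 1) -> None:
    """Send one recruiting advertisement to every member of the group."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        # A TTL of 1 keeps the datagram on the local network segment.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                        struct.pack("b", ttl))
        sock.sendto(message, (group, port))
```

A candidate would call `register_for_job_sharing()` once and then read advertisements with `recvfrom`, while a recruiter would call `advertise()` with a serialized form of its job sharing parameters. Raising the TTL widens the scope of the advertisement, which relates to the "hops from a host" attribute.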

Description

COMPUTER SYSTEM AND METHOD FOR SHARING A JOB ON A COMPUTER NETWORK
USING IP MULTICAST
Background of the Invention
1. Technical Field
This invention generally relates to data processing, and more specifically relates to the sharing of jobs between computers on a network.
2. Background Art
Since the dawn of the computer age, computer systems have become indispensable in many fields of human endeavor including engineering design, machine and process control, and information storage and access. In the early days of computers, companies such as banks, industry, and the government would purchase a single computer which satisfied their needs, but by the early 1950s many companies had multiple computers and the need to move data from one computer to another became apparent. At this time computer networks began being developed to allow computers to work together.
Networked computers are capable of performing jobs that no single computer could perform. In addition, networks allow low cost personal computer systems to connect to larger systems to perform tasks that such low cost systems could not perform alone. Most companies in the United States today have one or more computer networks. The topology and size of the networks may vary according to the computer systems being networked and the design of the system administrator. It is very common, in fact, for companies to have multiple computer networks. Many large companies have a sophisticated blend of local area networks (LANs) and wide area networks (WANs) that effectively connect most computers in the company to each other.
With so many computers hooked together on a network, it soon became apparent that networked computers could be used to process large jobs by delegating different portions of the job to different computers on the network, which can then process their respective portions in parallel. In particular, many computers on a network may have excess computing capacity, or may have periods when they are not being used. These computers could be made productive by working on a portion of a large job with little or no expense, because the computing power is already present but unused.
Known techniques for sharing jobs among computers on a network require knowledge specific to processing the job to be included in the client software installed on each computer system. Thus, if a person defines a new job that would benefit from being processed on several different computers in the network, the client software on the computers must be upgraded to support the new job. The prior art thus effectively precludes dynamic recruiting of systems to work on new types of jobs. Without a mechanism for allowing computer systems on a network to dynamically interact to share jobs without having to pre-define the jobs being processed, the scope of shared jobs will be greatly limited, and excess computing capacity on computer networks will remain an untapped resource.
Disclosure of Invention
According to the present invention, a computer system on a network uses IP multicast to recruit other computer systems to share in the processing of a job. If a computer system on the network wants to be available to process shared jobs, it first registers for job sharing by invoking an IP multicast router at a particular IP address. All messages sent to the IP multicast router are broadcast to all computer systems that are registered with the router. When a computer system has a job to share, it recruits other computer systems to help process the job by sending a message to the IP multicast router that corresponds to a request to share the job. The candidate computer systems that receive the recruiter's broadcast determine if they can share the job according to one or more job sharing parameters. These parameters may relate to the job itself, network performance, security, or other criteria for sharing. If a computer system meets the parameters for taking on the particular job, it responds to the recruiter. If the recruiter still needs help (e.g., if not enough candidate systems have responded yet), the recruiter grants the response and delivers the job to the computer system. The computer system then performs the job (or task) and returns the results to the recruiter.
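
The recruit, respond, grant, and return sequence just described can be sketched as a small in-memory simulation. The Python example below is only an illustration under stated assumptions: the `MulticastBus` class stands in for the IP multicast router, a single memory requirement stands in for the job sharing parameters, and summing a list of numbers stands in for the job. None of these names or choices come from the specification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    name: str
    free_memory_mb: int

    def can_share(self, params: Dict[str, int]) -> bool:
        # Candidate-side check against the advertised parameters.
        return self.free_memory_mb >= params.get("memory_mb", 0)

    def process(self, task: List[int]) -> int:
        # Placeholder for real work: sum the numbers in the task.
        return sum(task)


class MulticastBus:
    """In-memory stand-in for the IP multicast router (an assumption)."""

    def __init__(self) -> None:
        self.registered: List[Candidate] = []

    def register(self, candidate: Candidate) -> None:
        self.registered.append(candidate)

    def broadcast(self, params: Dict[str, int]) -> List[Candidate]:
        # Every registered system sees the advertisement; those that
        # qualify respond, here in registration order.
        return [c for c in self.registered if c.can_share(params)]


def share_job(bus: MulticastBus, params: Dict[str, int],
              tasks: List[List[int]]) -> Dict[str, int]:
    """Recruiter side: advertise, grant the first N responses, collect results."""
    needed = len(tasks)                 # N nodes, one per task
    responders = bus.broadcast(params)
    granted = responders[:needed]       # later responders are rejected
    if len(granted) < needed:
        raise RuntimeError("not enough candidates responded")
    # Deliver one task to each granted candidate and gather the partial results.
    return {c.name: c.process(t) for c, t in zip(granted, tasks)}
```

In this sketch the recruiter would combine the returned partial results itself, for example with `sum(results.values())`. A real system would carry the advertisement over multicast and the grant, job delivery, and results over unicast connections.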
The foregoing and other features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings.
Brief Description of Drawings
The preferred embodiments of the present invention will hereinafter be described in conjunction with the appended drawings, where like designations denote like elements, and:
FIG. 1 is a block diagram of a computer system that may be networked with other computer systems in accordance with a preferred embodiment of the present invention;
FIG. 2 is a block diagram of several computer systems of FIG. 1 that are all coupled together on a network via an IP multicast router;
FIG. 3 is a block diagram showing several possible job sharing parameters; and
FIG. 4 is a flow diagram illustrating a method for sharing jobs via IP multicast in accordance with the preferred embodiments.
Best Mode for Carrying Out the Invention
The present invention is accomplished through sharing portions of jobs on computers that are connected on a network.
For those who are not familiar with networking concepts, the brief overview below provides background information that will help the reader to understand the present invention.
1. Overview
Networked Computer Systems
Connecting computers together on a network requires some form of networking software. Over the years, the power and sophistication of networking software has greatly increased. Networking software typically defines a protocol for exchanging information between computers on a network. Many different network protocols are known in the art. Examples of commercially-available networking software are Novell Netware and Windows NT, which each implement different protocols for exchanging information between computers.
One significant computer network that has recently become very popular is the Internet. The Internet grew out of a proliferation of computers and networks, and has evolved into a sophisticated worldwide network of computer systems. Using the Internet, a user may access computers all over the world from a single workstation. TCP/IP (Transmission Control Protocol/Internet Protocol) is an example of a network protocol that is in wide use today for communicating between computers on the Internet. In addition, the use of TCP/IP is also rapidly expanding to more local area networks (LANs) and Intranets within companies. With so many computers connected together both inside of a company and with others outside the company via the Internet, it would be very helpful if there were a way to efficiently share jobs between these computers.
Job Sharing on a Network Using Rem and Rcomp
Recent efforts have recognized that computers that are networked together may be used to process different portions of a large job. For the purpose of discussing job sharing in this patent application, a computer system that wants help in processing a job is referred to herein as a "recruiter", and a computer system that may possibly help in performing the job is referred to as a "candidate". Of course, a recruiter for one job can become a candidate for other jobs. The term "job" is used herein in a generic sense to refer both to large problems that have smaller portions that need to be apportioned to different computer systems, and to the individual portions (e.g., tasks) that are apportioned.
In one known method for job sharing, a remote command "rem" may be invoked by a recruiter. A computer system on the network referred to herein as a "rem server" is dedicated to responding to the rem command to find an available candidate. In response to the rem command, the rem server locates an idle machine on the network, and executes a command on the idle machine to share a job. Known techniques of communicating between computers using text input and output protocols are used to communicate between the recruiter and the candidate. Note that invoking the rem command results in finding a single candidate for sharing a job. Another known method for sharing a job relates to compiling source code. Source code is computer code that is written in a high-level human-readable programming language that must be translated to a machine-readable version that is executable on a processor. Source code is usually arranged in modules. Each module is generally a separate compilation unit, which means it can be compiled separately from other modules. When a large program is programmed in a modular fashion, the compilation of different modules can be performed by different computers. Compiling source code is a common step in software development that takes a very long time to perform for complex computer programs. By farming out the compilation of different modules to different computer systems on a network, the compilation time can be drastically reduced. One known method for compiling different modules on different computer systems uses an "rcomp" command, which stands for "remote compile".
Using the rcomp command assumes that there are certain machines on the network that are known as "compile servers", those machines with the appropriate compilers installed and that have available resources to run compilations. When an rcomp command is invoked by a recruiter, a computer system that is dedicated to handling rcomp commands (referred to herein as an "rcomp server") examines the available compile servers on the network to determine whether they are compatible compile servers, and whether they have sufficient capacity to perform the requested compilation. When searching for compile servers, preference is given to larger compile servers and to compile servers that do not have other rcomp jobs running on them. Once a candidate compile server is selected, the rcomp server executes a command on the candidate that sends messages via text output back to the recruiter, and that optionally receives messages via text input from the recruiter.
Rcomp is generally used for commands (such as compilations) that require more resources than are available on the recruiter. It is typically not intended for small or interactive jobs.
One problem with both rem and rcomp is that these commands are limited to certain types of tasks. Client software must be installed on each candidate system that only knows how to process particular tasks in pre-defined ways defined in the client software. There are many different types of relatively large and complex problems that could be solved using rem and rcomp techniques, but these techniques would require that the client software on each candidate system, as well as the recruiter, have specific knowledge and logic for processing pre-defined problems. This hurdle prevents a recruiter from using any candidate that does not support the function it needs to perform.
IP Multicast
The concept of IP Multicast can best be understood by providing an analogy to commonly-known radio transmitters and receivers. A radio transmitter, such as a transmitter for a local radio station, continuously broadcasts its programming on a particular frequency. To listen to the radio station, one must tune a radio receiver to the frequency corresponding to the radio station's transmitter. An IP Multicast router performs functions analogous to a radio transmitter: it continuously broadcasts information to any computer systems that are "tuned in" to the multicast channel. A computer system effectively "tunes in" by invoking a particular reserved Internet Protocol (IP) address corresponding to the multicast router to register with the multicast router. Once registered, the computer system will receive all messages broadcast by the multicast router. One significant difference between IP multicast and the radio analogy is that computer systems, once registered with the multicast router, can also send messages to the router for distribution to all of the registered computer systems. In this manner a computer system can communicate with a large number of other computer systems at the same time without individually communicating with each one, and without knowing what other computer systems are job sharing candidates.
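By way of illustration only, the act of "tuning in" can be sketched with the standard sockets interface, in which a host joins a multicast group address through the `IP_ADD_MEMBERSHIP` socket option. The group address 239.1.2.3, the port 5007, and all function names below are assumptions chosen for the example and do not come from this disclosure.

```python
import socket

# Hypothetical values chosen for illustration only.
JOB_SHARING_GROUP = "239.1.2.3"   # an administratively scoped multicast address
JOB_SHARING_PORT = 5007


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the 8-byte membership structure: group address, then local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def register_for_job_sharing(group: str = JOB_SHARING_GROUP,
                             port: int = JOB_SHARING_PORT) -> socket.socket:
    """Open a UDP socket and join the multicast group (the "tune in" step)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def send_to_group(message: bytes, group: str = JOB_SHARING_GROUP,
                  port: int = JOB_SHARING_PORT) -> None:
    """Send one datagram that is delivered to every host that joined the group."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                       socket.IPPROTO_UDP) as sock:
        # A time-to-live of 1 keeps the datagram on the local network segment.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(message, (group, port))
```

In this sketch a registered system would call `register_for_job_sharing()` once and then read datagrams with `recvfrom`, while any system may call `send_to_group()` to reach all registered systems with a single message.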
There exist a good number of different protocols and methods for performing IP multicast that are known in the art. The present invention expressly encompasses any and all methods, whether currently known or developed in the future, for performing IP multicast on a network.
2. Detailed Description
According to a preferred embodiment of the present invention, a computer system registers with an IP multicast router for job sharing, then receives all messages that are sent to that router. When a computer wants help in processing a job, it is referred to herein as a recruiter, and sends a recruiting message to the IP multicast router, which routes the message to all registered computer systems. These registered computer systems are candidates to share the job. Each candidate looks to see if it can share the job, and if it can, it responds to the recruiter. If the recruiter still needs help when the candidate responds, it sends the job to the candidate for processing. The candidate then processes the job and returns the results to the recruiter.
Referring to FIG. 1, a computer system 100 is an enhanced IBM AS/400 computer system, and represents one suitable type of computer system that can be networked together in accordance with the preferred embodiment. Those skilled in the art will appreciate that the mechanisms and apparatus of the present invention apply equally to any computer system that can be networked together with other computer systems. As shown in FIG. 1, computer system 100 comprises a processor 110 connected to a main memory 120, a mass storage interface 130, a terminal interface 140, and a network interface 150. These system components are interconnected through the use of a system bus 160. Mass storage interface 130 is used to connect mass storage devices (such as a direct access storage device 155) to computer system 100. One specific type of direct access storage device is a floppy disk drive, which may store data to and read data from a floppy diskette 195.
Main memory 120 contains data 122, an operating system 123, and a job sharing processor 124. Job sharing processor 124 includes a user interface 125, a registration mechanism 126, a job recruiter 127, a job acceptor 128, and one or more job sharing parameters 129. Job sharing processor 124 handles both requests by computer system 100 for sharing a job with other computer systems, as well as requests by other computer systems for computer system 100 to share a job. In other words, computer system 100 may be a recruiter for a job it controls, then can be a candidate for jobs that are controlled by other computer systems.
Data 122 represents any data that serves as input to or output from any program in computer system 100. Operating system 123 is a multitasking operating system known in the industry as OS/400; however, those skilled in the art will appreciate that the spirit and scope of the present invention is not limited to any one operating system. Job sharing processor 124 includes a user interface 125 that allows a user to specify parameters relating to job sharing in general, relating to a specific job to be shared, or relating to performance, security, or other parameters.
User interface 125 provides a mechanism for a user to specify one or more acceptance parameters for allowing computer system 100 to undertake shared jobs from other computer systems. For example, a menu could allow a user to specify default times for allowable shared jobs, such as at lunch and during the hours the user is not at work. In addition, the user could dynamically enter allowable times for shared jobs. If the user has a two hour meeting in the afternoon, for example, during which the user will be away from his computer (and therefore not using it), the user could simply enter the time of the meeting as an allowable time period for shared jobs. In addition, the user interface 125 also provides an operation that allows a user to cancel (i.e., kill) jobs that are processing. Thus, if the user blocks out a two hour time block as allowable job sharing time because of a meeting, and returns from the meeting an hour early to find his computer engaged in processing a shared job, the user may abort the job to regain the full processing capacity of the user's computer system. The aborted job will then have to be re-started elsewhere on the network.
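As one possible illustration of the allowable-time acceptance parameters described above, the following sketch checks whether a given time of day falls inside any window the user has opened for shared jobs. The class and function names, and the specific hours, are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import time
from typing import Sequence


@dataclass(frozen=True)
class TimeWindow:
    """A daily period during which shared jobs may run (hypothetical type)."""
    start: time
    end: time

    def contains(self, moment: time) -> bool:
        if self.start <= self.end:
            return self.start <= moment < self.end
        # The window wraps past midnight, e.g. 18:00 to 08:00.
        return moment >= self.start or moment < self.end


def sharing_allowed(moment: time, windows: Sequence[TimeWindow]) -> bool:
    """Return True if shared jobs may be accepted at the given time of day."""
    return any(window.contains(moment) for window in windows)


# Default windows: lunch, and the hours the user is not at work.
DEFAULT_WINDOWS = [
    TimeWindow(time(12, 0), time(13, 0)),
    TimeWindow(time(18, 0), time(8, 0)),
]

# A window entered dynamically for a two hour afternoon meeting.
MEETING = TimeWindow(time(14, 0), time(16, 0))
```

A user returning early from the meeting corresponds, in this sketch, to removing `MEETING` from the list and cancelling any job still running.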
In addition to setting times for allowable shared jobs, a user may also set up certain job sharing parameters 129 via user interface 125. These parameters may include attributes regarding the job to be shared, the candidates for sharing the job, network performance, and security. These specific attributes are discussed in more detail below with reference to FIG. 3.
Job sharing processor 124 includes a registration mechanism 126 for registering computer system 100 for shared jobs. Registration mechanism 126 includes intelligence for performing predefined functions required by the specifics of the implemented IP multicast protocol to add computer system 100 to the list of recipients for IP multicast messages. For some IP multicast systems, registration is simply a matter of invoking a command at a predetermined IP address that corresponds to the IP multicast router. The mechanics of how to register to receive IP multicasts are known in the art, and are not discussed in detail herein. All mechanisms and methods for registering computer system 100 for shared jobs are within the scope of the present invention.
Job sharing processor 124 includes a job recruiter 127 and a job acceptor 128. Job recruiter 127 is that portion of job sharing processor 124 that recruits other computer systems to help in processing a shared job. Job acceptor 128 is that portion of job sharing processor 124 that monitors recruiting requests from other computer systems to help in processing a shared job, and that determines whether or not to respond to the request.
Job sharing parameters 129 are a collection of attributes relating to job sharing. In the case of a job to be shared with other computer systems (i.e., when computer system 100 is a recruiter), job sharing parameters 129 may include recruiting parameters for the job to be shared that candidate computer systems must meet to accept a job from the recruiter. In the case of sharing a job from other computer systems (i.e., when computer system 100 is a candidate), job sharing parameters may include acceptance attributes that determine whether or not computer system 100 may be used to process the shared job. Examples of possible attributes contained within job sharing parameters 129 are discussed below with reference to FIG. 3.
Computer system 100 utilizes well known virtual addressing mechanisms that allow the programs of computer system 100 to behave as if they only have access to a large, single storage entity instead of access to multiple, smaller storage entities such as main memory 120 and DASD device 155. Therefore, while data 122, operating system 123, and job sharing processor 124 are shown to reside in main memory 120, those skilled in the art will recognize that these items are not necessarily all completely contained in main memory 120 at the same time. It should also be noted that the term "memory" is used herein to generically refer to the entire virtual memory of computer system 100. Processor 110 may be constructed from one or more microprocessors and/or integrated circuits. Processor 110 executes program instructions stored in main memory 120. Main memory 120 stores programs and data that processor 110 may access. When computer system 100 starts up, processor 110 initially executes the program instructions that make up operating system 123. Operating system 123 is a sophisticated program that manages the resources of computer system 100. Some of these resources are processor 110, main memory 120, mass storage interface 130, terminal interface 140, network interface 150, and system bus 160.
Although computer system 100 is shown to contain only a single processor and a single system bus, those skilled in the art will appreciate that the present invention may be practiced using a computer system that has multiple processors and/or multiple buses. In addition, the interfaces (called input/output processors in AS/400 terminology) that are used in the preferred embodiment each include separate, fully programmed microprocessors that are used to off-load compute-intensive processing from processor 110. However, those skilled in the art will appreciate that the present invention applies equally to computer systems that simply use I/O adapters to perform similar functions.
Terminal interface 140 is used to directly connect one or more terminals 165 to computer system 100. These terminals 165, which may be non-intelligent (i.e., dumb) terminals or fully programmable workstations, are used to allow system administrators and users to communicate with computer system 100. Note, however, that while terminal interface 140 is provided to support communication with one or more terminals 165, computer system 100 does not necessarily require a terminal 165, because all needed interaction with users and other processes may occur via network interface 150.
Network interface 150 is used to connect other computer systems and/or workstations (e.g., 175 in FIG. 1) to computer system 100 across a network 170. The present invention applies equally no matter how computer system 100 may be connected to other computer systems and/or workstations, regardless of whether the network connection 170 is made using present-day analog and/or digital techniques or via some networking mechanism of the future. In addition, many different network protocols can be used to implement a network. These protocols are specialized computer programs that allow computers to communicate across network 170. TCP/IP (Transmission Control Protocol/Internet Protocol) is an example of a suitable network protocol.
At this point, it is important to note that while the present invention has been and will continue to be described in the context of a fully functional computer system, those skilled in the art will appreciate that the present invention is capable of being distributed as a program product in a variety of forms, and that the present invention applies equally regardless of the particular type of signal bearing media used to actually carry out the distribution. Examples of suitable signal bearing media include: recordable type media such as floppy disks (e.g., 195 of FIG. 1) and CD ROM, and transmission type media such as digital and analog communications links.
Referring to FIG. 2, one particular example of a networked computer system 200 in accordance with the preferred embodiments includes multiple computer systems 100 of FIG. 1, shown in FIG. 2 as 100A through 100E, all coupled to an IP multicast router 210. Any message sent by any of computer systems 100A through 100E to IP multicast router 210 is automatically broadcast to all computer systems that are registered with IP multicast router 210. The configuration of FIG. 2 assumes that each of these computer systems 100A-100E has previously registered with IP multicast router 210, and is therefore a candidate for job sharing. Each computer system has a corresponding job sharing processor 124. Note that computer systems 100A and 100E are designated as servers, and computer systems 100B, 100C, and 100D are designated as clients. These designations are arbitrary, and simply show that jobs may be shared between client computer systems, between server computer systems, between clients and servers, or between servers and clients. In other words, computer systems 100 can be any suitable type of computer, and are not even limited to client or server computer systems. Job sharing processor 124 includes the functionality for both recruiting other computer systems to help process a job, as well as helping other computer systems to process their jobs.
Referring to FIG. 3, one specific implementation for job sharing parameters 129 of FIG. 1 includes three different classes of attributes. Job attributes 310 define attributes of the job to be shared. Network attributes 320 define network performance parameters that must be met for a candidate computer system to be able to accept a job to be shared. Security attributes 330 define attributes that must be met for a candidate computer system to be able to accept a job to be shared. Note that these three classes of attributes are shown by way of example, and other classes and types of attributes are clearly within the scope of the preferred embodiments disclosed herein.
Job attributes 310 include information relating to the job to be shared. Examples of suitable information that may be included in job attributes 310 include: where (i.e., what IP address) to apply to do the job; software that must be installed on the candidate computer system to do the job; status of data on the candidate computer system (e.g., is your database journal current to 03/15/99?); an estimate of time or CPU cycles required to perform the job; memory required to run the job; and disk space required to run the job. Job attributes 310 thus describe pre-requisites for the job in a way that gives enough information to the candidate computer systems that they know whether or not they are capable of accepting the job to be shared.
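A minimal sketch of how a candidate might compare job attributes 310 against its own resources is given below. The field names, units, and the `CandidateResources` type are assumptions introduced for illustration; the disclosure does not prescribe a data format.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class JobAttributes:
    """Pre-requisites advertised by the recruiter (hypothetical layout)."""
    apply_address: str                                  # where to apply to do the job
    required_software: Set[str] = field(default_factory=set)
    memory_mb: int = 0                                  # memory required to run the job
    disk_mb: int = 0                                    # disk space required to run the job
    estimated_cpu_seconds: float = 0.0                  # estimate of work involved


@dataclass
class CandidateResources:
    """What the candidate currently has available (hypothetical layout)."""
    installed_software: Set[str]
    free_memory_mb: int
    free_disk_mb: int


def meets_job_attributes(job: JobAttributes, me: CandidateResources) -> bool:
    """Return True if the candidate satisfies every advertised pre-requisite."""
    return (job.required_software <= me.installed_software   # subset test
            and job.memory_mb <= me.free_memory_mb
            and job.disk_mb <= me.free_disk_mb)
```

Because the check uses only data carried in the advertisement, the candidate needs no logic specific to the kind of job being shared.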
Network attributes 320 include information relating to required network performance for a computer system to accept a job. Examples of suitable network performance attributes are ping time, hops from a host, connection speed, and network congestion. Ping time refers to the time it takes a candidate computer system to reply to a request from a recruiter, and is usually specified in milliseconds. Specifying a maximum allowable ping time allows the recruiter to limit candidate computer systems to those that are reasonably close to the recruiter, and therefore have a fast ping time. The rationale for providing this ping time attribute is that it wouldn't make much sense for a computer in China to share a job with a computer in Mexico if other closer computer systems could be used. Specifying maximum ping time is an easy way to restrict candidate computer systems to those that can be communicated with relatively quickly.
Hops from a host is another suitable network attribute that specifies how many routers are passed through to go between the recruiter and the candidate. Hops from a host is another measure of network performance, so the prospective candidates can be limited, for example, to those systems that are no more than two hops from the recruiter. Another measure of network performance is connection speed. Specifying minimum connection speed allows a recruiter to specify the minimum required bandwidth for job sharing. If a job to be shared requires 200 megabytes to be loaded on the candidate computer system, a minimum connection speed could be specified to prevent low-bandwidth computers (such as those that have a 28.8 kbps modem connection) from accepting the job. Another measure of network performance is network congestion, which is a measure of how busy a network is. For example, a computer system might have a 1 megabit per second network connection speed, but the network is so congested that only 10 Kbit per second is getting through to the candidate computer system. Specifying allowable network congestion in bits per second of network throughput prevents overly-congested candidates from accepting the job.
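The network attributes 320 can be illustrated with a short sketch. Loading 200 megabytes over a 28.8 kbps connection takes roughly 200,000,000 x 8 / 28,800, or about 55,556 seconds (over 15 hours), which shows why a minimum connection speed is useful. The type names, field names, and threshold values below are assumptions for the example only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NetworkAttributes:
    """Limits set by the recruiter; None means the limit is not specified."""
    max_ping_ms: Optional[float] = None
    max_hops: Optional[int] = None
    min_connection_bps: Optional[float] = None
    min_throughput_bps: Optional[float] = None   # expresses allowable congestion


@dataclass
class MeasuredNetwork:
    """Values observed by the candidate (hypothetical layout)."""
    ping_ms: float
    hops: int
    connection_bps: float
    throughput_bps: float


def meets_network_attributes(required: NetworkAttributes,
                             measured: MeasuredNetwork) -> bool:
    """Return True if every specified network limit is satisfied."""
    if required.max_ping_ms is not None and measured.ping_ms > required.max_ping_ms:
        return False
    if required.max_hops is not None and measured.hops > required.max_hops:
        return False
    if (required.min_connection_bps is not None
            and measured.connection_bps < required.min_connection_bps):
        return False
    if (required.min_throughput_bps is not None
            and measured.throughput_bps < required.min_throughput_bps):
        return False
    return True


def transfer_seconds(size_bytes: float, bits_per_second: float) -> float:
    """Time needed to move size_bytes over a link of the given speed."""
    return size_bytes * 8 / bits_per_second
```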
Security attributes 330 allow specifying parameters that further limit which candidate computer systems can accept a job to be shared. The reason for providing security attributes 330 is to prevent some types of jobs from being shared with some types of computer systems. For example, a software compile job in a computer system that is in a software development group might be shared with an available candidate computer system in the accounting department, but payroll processing from the accounting department would probably not be allowed on computer systems in the software development group. A compile job of a proprietary computer program in the software group would likely be limited to the computers within the company, and may be further limited to computers within a particular group. Security attributes 330 allow specifying that the candidate computer systems must be in the same group or same company. In addition, other security attributes 330 may be defined to restrict job sharing to a predefined type of candidate computer system.
Note that other security measures may also be taken to assure the integrity of the job sharing system. For example, when a candidate computer system signals to the recruiter that it accepts a job, the recruiter could then require the candidate to enter a password or other identifying information to assure the candidate is authorized to receive the job. In addition, access to the IP multicast address could be restricted so that only authorized computer systems know how to register for job sharing of a particular type. These and other security measures are within the scope of the preferred embodiments.
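The same-group and same-company restrictions of security attributes 330 might be sketched as follows. The `Identity` and `SecurityAttributes` types are assumptions for illustration, and the sketch does not address how a candidate's claimed identity would be verified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    """Organizational placement of a computer system (hypothetical type)."""
    company: str
    group: str


@dataclass(frozen=True)
class SecurityAttributes:
    """Restrictions the recruiter places on who may accept the job."""
    same_company: bool = False
    same_group: bool = False


def meets_security_attributes(required: SecurityAttributes,
                              recruiter: Identity,
                              candidate: Identity) -> bool:
    """Return True if the candidate is permitted to accept the job."""
    if required.same_company and recruiter.company != candidate.company:
        return False
    # A group only matches when the company matches as well.
    if required.same_group and recruiter != candidate:
        return False
    return True
```

In this sketch a compile job from the software group could set `same_company=True`, while payroll processing from accounting could set `same_group=True`.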
Referring to FIG. 4, a method 400 for sharing a job in accordance with the present invention starts when a recruiter computer system has a job to share (step 410). Part of the job sharing parameters for the job to be shared might include the number of nodes N that are needed to process the job. For the flow diagram of FIG. 4, the term "node" is used to refer to computer systems on the network, as is known in the art. Note that the flow diagram of FIG. 4 is divided by a vertical dotted line, with the left half representing the flow steps for the recruiter, and the right half representing the flow steps for each candidate node. The recruiter makes its desire to share the job known by broadcasting an advertisement to all nodes on the network (step 412). The recruiter's advertisement preferably includes job sharing parameters 129 that specify attributes that a candidate must satisfy to be able to accept the job to be shared. The broadcasting of the recruiter's advertisement in the preferred embodiment corresponds to sending a message to the IP multicast router 210, which then transmits the message to all registered nodes (i.e., candidates).
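One way to picture the advertisement of step 412 is as a small self-describing message that carries the number of nodes N together with the job sharing parameters. The JSON encoding, the field names, and both function names below are assumptions made for this sketch; the disclosure does not define a wire format.

```python
import json
from typing import Any, Dict


def build_advertisement(job_id: str, nodes_needed: int,
                        parameters: Dict[str, Any]) -> bytes:
    """Encode a recruiting advertisement for transmission via IP multicast."""
    message = {
        "type": "advertisement",
        "job_id": job_id,
        "nodes_needed": nodes_needed,      # the number of nodes N
        "parameters": parameters,          # job, network, and security attributes
    }
    return json.dumps(message, sort_keys=True).encode("utf-8")


def parse_advertisement(datagram: bytes) -> Dict[str, Any]:
    """Decode a datagram and confirm that it is a recruiting advertisement."""
    message = json.loads(datagram.decode("utf-8"))
    if message.get("type") != "advertisement":
        raise ValueError("not a recruiting advertisement")
    return message
```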
When a candidate node receives the advertisement from the recruiter via the IP multicast router 210, it then determines whether it satisfies all the attributes specified in the job sharing parameters. In other words, the candidate determines if it has the proper software installed to run the job, if it has sufficient memory and hard disk space, and if it satisfies the network performance attributes and security attributes, etc. If the candidate node does not satisfy all of the specified job sharing parameters (step 420=NO), the candidate node does not reply to the advertisement from the recruiter (step 432). By simply not responding, as opposed to sending a rejection message, network traffic is minimized. If the candidate node determines that it satisfies all job sharing parameters, and can therefore process the job (step 420=YES), the node then determines whether it needs more work (step 422). Note that the order of steps 420 and 422 may be reversed. If the node does not need any more work (step 422=NO), the node does not respond to the recruiter's advertisement (step 432). If the node needs more work (step 422=YES), it waits for a small random amount of time (step 424) to see if enough other nodes will respond to the recruiter's request. What constitutes a "small" amount of time may be defined in terms of the properties of the network. For example, most nodes should wait at least the time it takes a packet to get to all nodes on the network under normal network load. Making each node wait a small random amount of time to see how many nodes respond prevents all nodes from simultaneously accepting the job, which would leave to the recruiter the potentially complex task of determining which nodes to actually give the job to. By waiting for a small random amount of time, it is more likely that responses from candidate nodes will be spread out over time rather than occurring nearly simultaneously. After the random wait, if enough other nodes have already accepted (step 430=YES), this candidate node does not respond (step 432) because enough candidates have already been recruited. If not enough nodes have accepted (step 430=NO), the node broadcasts its acceptance to all nodes (step 434). In the preferred embodiment, this acceptance is also sent via IP multicast to allow other candidate nodes, in step 430, to monitor when a node accepts the job. Meanwhile, the recruiter listens for responses to the recruiting advertisement (step 414), and accepts the first N responses (step 440).
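The candidate-side decision of steps 420 through 434 can be sketched as a single function. The function name, its parameters, and the default wait bounds are assumptions for illustration; in particular, `acceptances_seen` stands in for whatever mechanism counts the acceptances heard on the multicast channel.

```python
import random
import time
from typing import Callable, Optional


def candidate_should_accept(
    qualifies: bool,
    needs_work: bool,
    nodes_needed: int,
    acceptances_seen: Callable[[], int],
    min_wait_s: float = 0.01,     # assumed time for a packet to reach all nodes
    max_wait_s: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> bool:
    """Return True if the node should broadcast its acceptance (step 434)."""
    if not qualifies:                       # step 420=NO: stay silent (step 432)
        return False
    if not needs_work:                      # step 422=NO: stay silent (step 432)
        return False
    rng = rng or random.Random()
    sleep(rng.uniform(min_wait_s, max_wait_s))   # step 424: small random wait
    if acceptances_seen() >= nodes_needed:  # step 430=YES: enough nodes already
        return False
    return True                             # step 430=NO: accept (step 434)
```

Because each node draws its own wait, acceptances tend to arrive spread out in time, and nodes that wake later can see that N acceptances have already been broadcast and remain silent.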
The recruiter then sends a response to the node's acceptance that was sent in step 434, accepting the first N responses and rejecting other responses (step 440). The response message from the recruiter could be sent to the node via IP multicast, but is more likely a unicast message to only the affected node. Meanwhile, the node is awaiting a response from the recruiter (step 436). If the response is that the recruiter does not accept the node's acceptance sent in step 434 (step 450=NO), the node makes no response (step 432). If the recruiter accepts the node's acceptance in step 434 (step 450=YES), the node then requests the job from the recruiter (step 452), which sends the job to the node (step 460). The node then performs the job (step 454), and sends the completed job information to the recruiter (step 456). The recruiter takes this processed job information and uses it, in conjunction with the processed job results from the other nodes, to complete the overall job (step 470). Note that the communication between recruiter and candidate in steps 440 through 470 of FIG. 4 is preferably performed in a unicast manner, directly between recruiter and candidate, rather than cluttering the IP multicast network with information that is only of use to these two computer systems.
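On the recruiter side, step 440 amounts to granting the first N acceptances in arrival order and rejecting the rest. The following sketch shows that selection; the function name and the use of simple node identifiers are assumptions for the example.

```python
from typing import Iterable, List, Tuple


def grant_responses(responses: Iterable[str],
                    nodes_needed: int) -> Tuple[List[str], List[str]]:
    """Split acceptances, in arrival order, into granted and rejected nodes."""
    accepted: List[str] = []
    rejected: List[str] = []
    for node in responses:
        if node in accepted or node in rejected:
            continue                       # ignore a repeated acceptance
        if len(accepted) < nodes_needed:
            accepted.append(node)          # among the first N responses
        else:
            rejected.append(node)          # arrived after N were granted
    return accepted, rejected
```

For instance, with N equal to 2 and acceptances arriving from nodes 100B, 100D, 100C, and 100E in that order, nodes 100B and 100D would be granted the job and the other two rejected.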
The present invention allows for job sharing via IP multicast without requiring any job-specific intelligence to be put on potential job sharing candidate computer systems. The pre-requisites for performing the job are specified in the job sharing parameters. A candidate examines the job sharing parameters to see if it qualifies to take on the shared job. It is even possible to download executable software from the recruiter to the candidate so the candidate can then process the job using the downloaded software. Job sharing using IP multicast is greatly simplified over the prior art methods for job sharing, which require job-specific intelligence to be installed on each candidate. When a computer system has a job to be shared, it sends a recruiting message via IP multicast that includes an advertisement for the job along with the job sharing parameters. Each candidate examines the job sharing parameters to determine if it qualifies to do the job. If so, the candidate can respond to the recruiter by accepting the job. Assuming that the recruiter accepts the candidate's response, the job is then passed from the recruiter to the candidate, which processes the job and returns the result to the recruiter. In this manner large computational problems may be distributed in discrete pieces to different computer systems on a computer network.
The most readily apparent application for job sharing in accordance with the present invention is in a company network, such as an Intranet, that interconnects computer systems within a company. This type of job sharing could be tightly controlled using security attributes, password authorization, etc. However, the invention can also be applied to a much larger scale in a much less secure environment. Computers on the Internet could register to share jobs when they have spare computing capacity. For example, the Search for Extraterrestrial Intelligence agency of the U.S. government (SETI) has collected huge amounts of data regarding transmissions and celestial phenomena that needs to be processed. This type of mundane, non-secure processing could very easily be accomplished by sharing portions of the job with many computers on the Internet. The present invention expressly extends to job sharing between computers using IP multicast, regardless of the size of the job and the number or type of computers on the network.
One skilled in the art will appreciate that many variations are possible within the scope of the present invention. Thus, while the invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that these and other changes in form and details may be made therein without departing from the spirit and scope of the invention.

Claims

1. A networked computer system comprising: a plurality of computer systems that each includes: a network interface that couples each computer system via a network to a common network resource, the common network resource transmitting messages to other computer systems on the network via IP multicast and receiving messages from other computer systems on the network via IP multicast; a memory; and a job sharing processor residing in the memory, the job sharing processor broadcasting to the other computer systems via the common network resource when the job sharing processor has a job to be shared, the job sharing processor responding to broadcasts from the other computer systems to potentially accept a job from one of the other computer systems.
2. The computer system of claim 1 wherein the job sharing processor includes a registration mechanism for registering a computer system to be a job sharing candidate.
3. The computer system of claim 1 wherein the job sharing processor includes a user interface that allows a user to set at least one parameter for the job to be shared.
4. The computer system of claim 3 wherein the at least one parameter includes at least one job attribute that defines at least one characteristic of the job to be shared.
5. The computer system of claim 3 wherein the at least one parameter includes at least one network attribute that defines network performance requirements that job sharing candidate computer systems must meet to accept the job to be shared.
6. The computer system of claim 3 wherein the at least one parameter includes at least one security attribute that defines security requirements that job sharing candidate computer systems must meet to accept the job to be shared.
7. A networked computer system comprising: a plurality of computer systems that each includes: a network interface that couples each computer system via a network to a common network resource, the common network resource transmitting messages to other computer systems on the network via IP multicast and receiving messages from other computer systems on the network via IP multicast; a memory; and a job sharing processor residing in the memory, the job sharing processor comprising: a user interface that is used to set at least one acceptance parameter that determines whether the computer system may receive a job from the other computer systems on the network, the user interface also being used to set at least one recruiting parameter assigned to a specific job to be shared in the computer system; a registration mechanism for registering a computer system to be a job sharing candidate; a job recruiter that broadcasts to the other computer systems via the common network resource when the job sharing processor has a job to be shared; and a job acceptor that responds to broadcasts from the other computer systems if the computer system can receive the job to be shared from one of the other computer systems according to the at least one acceptance parameter and according to the at least one recruiting parameter assigned to the job to be shared.
8. The computer system of claim 7 wherein the at least one recruiting parameter includes: at least one job attribute that defines at least one characteristic of the job to be shared; at least one network attribute that defines network performance requirements that job sharing candidates must meet to accept the job to be shared; and at least one security attribute that defines security requirements that job sharing candidates must meet to accept the job to be shared.
9. A networked computer system comprising: a common network resource that transmits messages received from one computer system on a network to all computer systems on the network via IP multicast; a plurality of computer systems that each includes: a memory; means for coupling each computer system via the network to the common network resource; means residing in the memory for broadcasting to the other computer systems via the common network resource when the computer system has a job to be shared; and means residing in the memory for responding to broadcasts from the other computer systems to potentially accept a job received from one of the other computer systems.
10. A computer-implemented method for sharing jobs between computer systems on a network, the method comprising the steps of: a first computer system on the network broadcasting to the other computer systems via IP multicast when the first computer system has a job to share; each other computer system on the network responding to the broadcast from the first computer system to accept the job to share if the computer system satisfies at least one parameter transmitted by the first computer system for accepting the job to share.
11. The method of claim 10 wherein the step of responding to the broadcast from the first computer system is performed only when the computer system needs more work.
12. The method of claim 10 wherein the step of responding to the broadcast from the first computer system is performed only when the computer system detects that an insufficient number of computer systems have responded to the broadcast from the first computer system.
13. The method of claim 10 further including the step of: each computer system on the network that wants to share jobs registering for job sharing, said step of registering for job sharing making the computer system a job sharing candidate.
14. The method of claim 10 further including the step of: defining at least one parameter for a job to be shared on the network.
15. The method of claim 14 wherein the at least one parameter includes at least one job attribute that defines at least one characteristic of the job to be shared.
16. The method of claim 14 wherein the at least one parameter includes at least one network attribute that defines network performance requirements that job sharing candidates must meet to accept the job to be shared.
17. The method of claim 14 wherein the at least one parameter includes at least one security attribute that defines security requirements that job sharing candidates must meet to accept the job to be shared.
18. A computer-implemented method for sharing jobs on a network, the method comprising the steps of: providing a first computer system on the network; registering the first computer system for job sharing, making the first computer system a job sharing candidate; a user defining at least one parameter for a job to be shared on the network; the first computer system on the network broadcasting to the other computer systems via IP multicast that the first computer system has the job to be shared; each other computer system on the network responding to the broadcast from the first computer system to accept the job to be shared if all of the following are true: the computer system needs more work; the computer system detects that an insufficient number of computer systems have responded to the broadcast from the first computer system; and the computer system satisfies all of the following: at least one job attribute that defines at least one characteristic of the job to be shared; at least one network attribute that defines network performance requirements that candidate computer systems on the network must meet to accept the job to be shared; and at least one security attribute that defines security requirements that candidate computer systems on the network must meet to accept the job to be shared.
19. A program product comprising: a job sharing processor that broadcasts to other computer systems on a computer network using IP multicast when the job sharing processor has a job to be shared, the job sharing processor responding to broadcasts from the other computer systems to potentially accept a job from one of the other computer systems; and signal bearing media bearing the job sharing processor.
20. The program product of claim 19 wherein the signal bearing media comprises recordable media.
21. The program product of claim 19 wherein the signal bearing media comprises transmission media.
22. The program product of claim 19 wherein the job sharing processor includes a registration mechanism for registering a computer system to be a job sharing candidate.
23. The program product of claim 19 wherein the job sharing processor includes a user interface that allows a user to set at least one parameter for the job to be shared.
24. The program product of claim 23 wherein the at least one parameter includes at least one job attribute that defines at least one characteristic of the job to be shared.
25. The program product of claim 23 wherein the at least one parameter includes at least one network attribute that defines network performance requirements that job sharing candidate computer systems must meet to accept the job to be shared.
26. The program product of claim 23 wherein the at least one parameter includes at least one security attribute that defines security requirements that job sharing candidate computer systems must meet to accept the job to be shared.
27. A program product comprising: (A) a job sharing processor comprising: (A1) a user interface that is used to set at least one acceptance parameter that determines whether the computer system may receive a job from the other computer systems on the network, the user interface also being used to set at least one recruiting parameter assigned to a specific job to be shared in the computer system; (A2) a registration mechanism for registering a computer system to be a job sharing candidate; (A3) a job recruiter that broadcasts to the other computer systems via IP multicast when the job sharing processor has a job to be shared; and (A4) a job acceptor that responds to broadcasts from the other computer systems via IP multicast if the computer system can receive the job to be shared from one of the other computer systems according to the at least one acceptance parameter and according to the at least one recruiting parameter assigned to the job to be shared; and (B) signal bearing media bearing the job sharing processor.
28. The program product of claim 27 wherein the signal bearing media comprises recordable media.
29. The program product of claim 27 wherein the signal bearing media comprises transmission media.
30. The program product of claim 27 wherein the at least one recruiting parameter includes: at least one job attribute that defines at least one characteristic of the job to be shared; at least one network attribute that defines network performance requirements that job sharing candidates must meet to accept the job to be shared; and at least one security attribute that defines security requirements that job sharing candidates must meet to accept the job to be shared.
PCT/US2000/005560 1999-04-07 2000-03-02 Computer system and method for sharing a job on a computer network using ip multicast WO2000060860A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/287,435 US6356929B1 (en) 1999-04-07 1999-04-07 Computer system and method for sharing a job with other computers on a computer network using IP multicast
US09/287,435 1999-04-07

Publications (1)

Publication Number Publication Date
WO2000060860A1 (en) 2000-10-12

Family

ID=23102894

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/005560 WO2000060860A1 (en) 1999-04-07 2000-03-02 Computer system and method for sharing a job on a computer network using ip multicast

Country Status (2)

Country Link
US (3) US6356929B1 (en)
WO (1) WO2000060860A1 (en)

Families Citing this family (62)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6356929B1 (en) * 1999-04-07 2002-03-12 International Business Machines Corporation Computer system and method for sharing a job with other computers on a computer network using IP multicast
US7685311B2 (en) * 1999-05-03 2010-03-23 Digital Envoy, Inc. Geo-intelligent traffic reporter
US7844729B1 (en) 1999-05-03 2010-11-30 Digital Envoy, Inc. Geo-intelligent traffic manager
US6757740B1 (en) * 1999-05-03 2004-06-29 Digital Envoy, Inc. Systems and methods for determining collecting and using geographic locations of internet users
US6591290B1 (en) * 1999-08-24 2003-07-08 Lucent Technologies Inc. Distributed network application management system
GB2354090B (en) * 1999-09-08 2004-03-17 Sony Uk Ltd Distributed service provider
US7661107B1 (en) * 2000-01-18 2010-02-09 Advanced Micro Devices, Inc. Method and apparatus for dynamic allocation of processing resources
US7281168B1 (en) 2000-03-03 2007-10-09 Intel Corporation Failover architecture for local devices that access remote storage
US7266555B1 (en) 2000-03-03 2007-09-04 Intel Corporation Methods and apparatus for accessing remote storage through use of a local device
US7428540B1 (en) * 2000-03-03 2008-09-23 Intel Corporation Network storage system
US6952737B1 (en) * 2000-03-03 2005-10-04 Intel Corporation Method and apparatus for accessing remote storage in a distributed storage cluster architecture
US7203731B1 (en) 2000-03-03 2007-04-10 Intel Corporation Dynamic replication of files in a network storage system
US7506034B2 (en) * 2000-03-03 2009-03-17 Intel Corporation Methods and apparatus for off loading content servers through direct file transfer from a storage center to an end-user
US7092985B2 (en) * 2000-03-30 2006-08-15 United Devices, Inc. Method of managing workloads and associated distributed processing system
USRE42153E1 (en) 2000-03-30 2011-02-15 Hubbard Edward A Dynamic coordination and control of network connected devices for large-scale network site testing and associated architectures
US20090216641A1 (en) * 2000-03-30 2009-08-27 Niration Network Group, L.L.C. Methods and Systems for Indexing Content
US20090222508A1 (en) * 2000-03-30 2009-09-03 Hubbard Edward A Network Site Testing
US20010039497A1 (en) * 2000-03-30 2001-11-08 Hubbard Edward A. System and method for monitizing network connected user bases utilizing distributed processing systems
US8010703B2 (en) * 2000-03-30 2011-08-30 Prashtama Wireless Llc Data conversion services and associated distributed processing system
US6963897B1 (en) * 2000-03-30 2005-11-08 United Devices, Inc. Customer services and advertising based upon device attributes and associated distributed processing system
US20040103139A1 (en) * 2000-03-30 2004-05-27 United Devices, Inc. Distributed processing system having sensor based data collection and associated method
US6684250B2 (en) 2000-04-03 2004-01-27 Quova, Inc. Method and apparatus for estimating a geographic location of a networked entity
US20020016835A1 (en) * 2000-05-25 2002-02-07 Gary Gamerman System and method for cascaded distribution of processing
US7266556B1 (en) 2000-12-29 2007-09-04 Intel Corporation Failover architecture for a distributed storage system
US7797375B2 (en) * 2001-05-07 2010-09-14 International Business Machines Corporat System and method for responding to resource requests in distributed computer networks
US8200818B2 (en) * 2001-07-06 2012-06-12 Check Point Software Technologies, Inc. System providing internet access management with router-based policy enforcement
US7590684B2 (en) * 2001-07-06 2009-09-15 Check Point Software Technologies, Inc. System providing methodology for access control with cooperative enforcement
US20040107360A1 (en) * 2002-12-02 2004-06-03 Zone Labs, Inc. System and Methodology for Policy Enforcement
JP4155393B2 (en) * 2002-06-17 2008-09-24 富士通株式会社 File exchange apparatus, personal information registration / introduction server, transmission control method, and program
US20040015587A1 (en) * 2002-06-21 2004-01-22 Kogut-O'connell Judy J. System for transferring tools to resources
US20040001476A1 (en) * 2002-06-24 2004-01-01 Nayeem Islam Mobile application environment
US7774325B2 (en) * 2002-10-17 2010-08-10 Intel Corporation Distributed network attached storage system
US6850943B2 (en) * 2002-10-18 2005-02-01 Check Point Software Technologies, Inc. Security system and methodology for providing indirect access control
US7395536B2 (en) * 2002-11-14 2008-07-01 Sun Microsystems, Inc. System and method for submitting and performing computational tasks in a distributed heterogeneous networked environment
US20040160975A1 (en) * 2003-01-21 2004-08-19 Charles Frank Multicast communication protocols, systems and methods
US8136155B2 (en) * 2003-04-01 2012-03-13 Check Point Software Technologies, Inc. Security system with methodology for interprocess communication control
US20050135336A1 (en) * 2003-04-08 2005-06-23 Citizen Watch Co., Ltd. Internet access system, method of data transmission in the internet access system and information terminal using the internet access system
US7788726B2 (en) * 2003-07-02 2010-08-31 Check Point Software Technologies, Inc. System and methodology providing information lockbox
US8028292B2 (en) * 2004-02-20 2011-09-27 Sony Computer Entertainment Inc. Processor task migration over a network in a multi-processor system
US7685279B2 (en) * 2004-03-04 2010-03-23 Quova, Inc. Geo-location and geo-compliance utilizing a client agent
US8136149B2 (en) * 2004-06-07 2012-03-13 Check Point Software Technologies, Inc. Security system with methodology providing verified secured individual end points
US7627896B2 (en) * 2004-12-24 2009-12-01 Check Point Software Technologies, Inc. Security system providing methodology for cooperative enforcement of security policies during SSL sessions
US7620981B2 (en) * 2005-05-26 2009-11-17 Charles William Frank Virtual devices and virtual bus tunnels, modules and methods
US8819092B2 (en) 2005-08-16 2014-08-26 Rateze Remote Mgmt. L.L.C. Disaggregated resources and access methods
US9270532B2 (en) * 2005-10-06 2016-02-23 Rateze Remote Mgmt. L.L.C. Resource command messages and methods
US8141078B2 (en) * 2006-02-23 2012-03-20 International Business Machines Corporation Providing shared tasks amongst a plurality of individuals
US8116323B1 (en) 2007-04-12 2012-02-14 Qurio Holdings, Inc. Methods for providing peer negotiation in a distributed virtual environment and related systems and computer program products
US20090007099A1 (en) * 2007-06-27 2009-01-01 Cummings Gregory D Migrating a virtual machine coupled to a physical device
EP2130121A1 (en) * 2007-12-03 2009-12-09 Zircon Computing LLC Parallel processing system
US20090241121A1 (en) * 2007-12-24 2009-09-24 Gil Nechushtai Device, Method and Computer Program Product for Monitoring Collaborative Tasks
US7991882B1 (en) * 2008-06-12 2011-08-02 Lockheed Martin Corporation Communications network with flow control
US9324173B2 (en) * 2008-07-17 2016-04-26 International Business Machines Corporation System and method for enabling multiple-state avatars
US8957914B2 (en) * 2008-07-25 2015-02-17 International Business Machines Corporation Method for extending a virtual environment through registration
US8527625B2 (en) * 2008-07-31 2013-09-03 International Business Machines Corporation Method for providing parallel augmented functionality for a virtual environment
US10166470B2 (en) * 2008-08-01 2019-01-01 International Business Machines Corporation Method for providing a virtual world layer
US8112365B2 (en) * 2008-12-19 2012-02-07 Foster Scott C System and method for online employment recruiting and evaluation
US20100251259A1 (en) * 2009-03-31 2010-09-30 Howard Kevin D System And Method For Recruitment And Management Of Processors For High Performance Parallel Processing Using Multiple Distributed Networked Heterogeneous Computing Elements
US8443107B2 (en) * 2009-11-11 2013-05-14 Digital Envoy, Inc. Method, computer program product and electronic device for hyper-local geo-targeting
US9213574B2 (en) * 2010-01-30 2015-12-15 International Business Machines Corporation Resources management in distributed computing environment
US8694995B2 (en) 2011-12-14 2014-04-08 International Business Machines Corporation Application initiated negotiations for resources meeting a performance parameter in a virtualized computing environment
US8863141B2 (en) 2011-12-14 2014-10-14 International Business Machines Corporation Estimating migration costs for migrating logical partitions within a virtualized computing environment based on a migration cost history
US10536487B2 (en) * 2014-01-22 2020-01-14 Telefonaktiebolaget Lm Ericsson (Publ) End user controlled multi-service device priority setting

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5548728A (en) * 1994-11-04 1996-08-20 Canon Information Systems, Inc. System for reducing bus contention using counter of outstanding acknowledgement in sending processor and issuing of acknowledgement signal by receiving processor to indicate available space in shared memory
US6014545A (en) * 1997-03-27 2000-01-11 Industrial Technology Research Institute Growable architecture for high-speed two-way data services over CATV networks
US6049823A (en) * 1995-10-04 2000-04-11 Hwang; Ivan Chung-Shung Multi server, interactive, video-on-demand television system utilizing a direct-access-on-demand workgroup

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5317568A (en) * 1991-04-11 1994-05-31 Galileo International Partnership Method and apparatus for managing and facilitating communications in a distributed hetergeneous network
IL99923A0 (en) * 1991-10-31 1992-08-18 Ibm Israel Method of operating a computer in a network
US5689553A (en) * 1993-04-22 1997-11-18 At&T Corp. Multimedia telecommunications network and service
US5513126A (en) * 1993-10-04 1996-04-30 Xerox Corporation Network having selectively accessible recipient prioritized communication channel profiles
US5603029A (en) * 1995-06-07 1997-02-11 International Business Machines Corporation System of assigning work requests based on classifying into an eligible class where the criteria is goal oriented and capacity information is available
US5765033A (en) * 1997-02-06 1998-06-09 Genesys Telecommunications Laboratories, Inc. System for routing electronic mails
US5802146A (en) * 1995-11-22 1998-09-01 Bell Atlantic Network Services, Inc. Maintenance operations console for an advanced intelligent network
US5862329A (en) * 1996-04-18 1999-01-19 International Business Machines Corporation Method system and article of manufacture for multi-casting audio visual material
US5774660A (en) * 1996-08-05 1998-06-30 Resonate, Inc. World-wide-web server with delayed resource-binding for resource-based load balancing on a distributed resource multi-node network
US6751221B1 (en) * 1996-10-04 2004-06-15 Kabushiki Kaisha Toshiba Data transmitting node and network inter-connection node suitable for home network environment
US7145898B1 (en) * 1996-11-18 2006-12-05 Mci Communications Corporation System, method and article of manufacture for selecting a gateway of a hybrid communication system architecture
US5987118A (en) * 1997-10-21 1999-11-16 Mci Communiations Corporation Method and computer program logic for providing an intelligent network operator console with enhanced services
US6718387B1 (en) * 1997-12-10 2004-04-06 Sun Microsystems, Inc. Reallocating address spaces of a plurality of servers using a load balancing policy and a multicast channel
US6427071B1 (en) * 1998-12-08 2002-07-30 At&T Wireless Services, Inc. Apparatus and method for providing transporting for a control signal
US6356929B1 (en) * 1999-04-07 2002-03-12 International Business Machines Corporation Computer system and method for sharing a job with other computers on a computer network using IP multicast

Also Published As

Publication number Publication date
US8291090B2 (en) 2012-10-16
US20020016811A1 (en) 2002-02-07
US20080071854A1 (en) 2008-03-20
US6356929B1 (en) 2002-03-12
US7337208B2 (en) 2008-02-26

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): CA CZ IL JP KR PL

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase