CN103793296A - Method for assisting in backing-up and copying computer system in cluster - Google Patents


Info

Publication number
CN103793296A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410006210.1A
Other languages
Chinese (zh)
Inventor
聂磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN201410006210.1A priority Critical patent/CN103793296A/en
Publication of CN103793296A publication Critical patent/CN103793296A/en
Pending legal-status Critical Current

Abstract

The invention provides a secondary backup replication method for computer systems in a cluster. The method comprises: setting up a backup replication system in the cluster; assigning each primary and secondary backup copy a place in a hierarchical structure; when a failure of one of the copies is detected, replacing the failed copy with the copy in the next lower layer; and regenerating a copy at the lowest affected level, so that the primary copy, the secondary copy, and the secondary backup copy are re-established. The backup replication system has at least one client, at least one node, a primary copy, a secondary copy, and a secondary backup copy. Compared with the prior art, the method improves mission-critical and real-time application environments, is highly practical, has a wide range of application, and is easy to popularize.

Description

A secondary backup replication method for computer systems in a cluster
Technical field
The present invention relates to clustered computer system technology, and more specifically to a secondary backup replication method for computer systems in a cluster.
Background art
An inherent problem in cluster systems is their vulnerability to failure. When a single node in the cluster crashes, the availability of the whole system may be affected. To increase reliability, redundancy is introduced into the system, usually by replicating components. When a service or process in a distributed system is replicated, every copy must keep a consistent state. This consistency is guaranteed by a specific replication protocol. There are different ways of organizing process copies, and they are generally classified as active, passive, and semi-active replication.
In active replication, also known as the state machine approach, every copy processes the requests received from the client and sends a reply. The copies behave independently of one another, and the technique consists of guaranteeing that all copies receive the requests in the same order. This technique has a low response time in the event of a crash. However, because all copies process all requests in parallel, it incurs significant run-time overhead, which makes it an impractical choice for high-availability solutions in business applications.
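The ordering guarantee of active replication can be illustrated with a small sketch. The `CounterReplica` class and `broadcast_in_order` function are hypothetical names chosen for this illustration and do not come from the patent; the sketch assumes a deterministic state machine and an already agreed request order.

```python
class CounterReplica:
    """A deterministic state machine: the same requests in the same order give the same state."""

    def __init__(self):
        self.value = 0

    def apply(self, request):
        op, amount = request
        if op == "add":
            self.value += amount
        elif op == "mul":
            self.value *= amount
        else:
            raise ValueError(f"unknown operation: {op}")
        return self.value


def broadcast_in_order(replicas, requests):
    """Deliver every request to every replica in one agreed order and collect all replies."""
    replies = []
    for request in requests:
        replies.append([replica.apply(request) for replica in replicas])
    return replies


replicas = [CounterReplica() for _ in range(3)]
replies = broadcast_in_order(replicas, [("add", 2), ("mul", 5), ("add", 1)])
```

Every copy does the full work for every request, which is the run-time overhead the paragraph above refers to.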
In passive replication, also known as primary-backup replication, one of the copies, called the primary, receives the requests from the client and returns the responses. The backups receive state-update messages from the primary. If the primary fails, a backup takes over. Unlike active replication, passive replication needs less processing power and makes no assumptions about the determinism of request processing. However, its response time increases significantly in the event of a failure, which makes it unsuitable for time-critical applications.
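A minimal sketch of the primary-backup arrangement follows, assuming that a state update is simply a copy of the primary's key-value state. The names `Copy`, `PrimaryBackupGroup`, `handle`, and `fail_primary` are hypothetical and used only for illustration.

```python
class Copy:
    def __init__(self, name):
        self.name = name
        self.state = {}


class PrimaryBackupGroup:
    def __init__(self, primary, backups):
        self.primary = primary
        self.backups = list(backups)

    def handle(self, key, value):
        """Only the primary executes the request; backups just receive the new state."""
        self.primary.state[key] = value
        for backup in self.backups:
            backup.state = dict(self.primary.state)  # state-update message
        return "ok"

    def fail_primary(self):
        """Promote the first backup; raises IndexError if no backup is left."""
        self.primary = self.backups.pop(0)
        return self.primary


group = PrimaryBackupGroup(Copy("primary"), [Copy("backup-1")])
group.handle("x", 1)
new_primary = group.fail_primary()
```

The backup never executes a request itself, so non-deterministic processing on the primary is harmless; the cost appears at takeover, which a real system would have to detect and coordinate.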
Semi-active replication circumvents the non-determinism problem of active replication in the context of time-critical applications. The technique extends active replication with the notions of a leader and followers. Although the actual processing of a request is performed by all copies, it is the leader's responsibility to perform the non-deterministic parts of the processing and to inform the followers. The technique is close to active replication, with the difference that non-deterministic processing is possible. However, significant recovery-time overhead is incurred when the primary copy fails.
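The leader and follower roles can be sketched as below. This is an illustrative assumption rather than the patent's protocol: the non-deterministic step is modeled as a random draw, and the `Leader` and `Follower` classes are hypothetical.

```python
import random


class Leader:
    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self.log = []

    def process(self, request):
        # The non-deterministic part: only the leader draws the random value.
        decision = self._rng.randint(0, 999)
        self.log.append((request, decision))
        return decision  # forwarded to the followers


class Follower:
    def __init__(self):
        self.log = []

    def process(self, request, decision):
        # The follower still processes the request but adopts the leader's decision.
        self.log.append((request, decision))


leader, follower = Leader(), Follower()
for request in ["r1", "r2", "r3"]:
    follower.process(request, leader.process(request))
```

Because the follower never makes its own non-deterministic choice, both logs stay identical regardless of what the leader drew.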
Summary of the invention
The technical task of the present invention is to overcome the deficiencies of the prior art by providing a secondary backup replication method for computer systems in a cluster.
The technical solution of the present invention is implemented as follows. The secondary backup replication method for computer systems in a cluster proceeds as follows:
A backup replication system is set up in the cluster; the system has at least one client, at least one node, a primary copy, a secondary copy, and a secondary backup copy;
Each primary and secondary backup copy is assigned a place in a hierarchical structure;
When a failure of one of the copies is detected, the failed copy is replaced by the copy in the next lower layer;
A copy is regenerated at the lowest affected level, so that the primary copy, the secondary copy, and the secondary backup copy are re-established.
When the failed copy is the secondary copy, the secondary backup copy is promoted to be the new secondary copy, the group is reconfigured, and a new secondary backup copy is started.
When the failed copy is the secondary backup copy, the secondary copy clones itself to form a new secondary backup copy.
The replicated copy is a single operating system image, that is, an AIX or Linux operating system image.
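The replacement rule in the steps above can be sketched as a small function. This is one reading of the steps, under the assumption that the hierarchy has exactly three layers and that the regenerated copy always enters at the lowest layer; `ROLES`, `replace_failed`, and the placeholder copy names are hypothetical.

```python
ROLES = ("primary", "secondary", "secondary_backup")


def replace_failed(hierarchy, failed_role, new_copy):
    """Return a new role-to-copy mapping after the copy holding `failed_role` fails.

    Every copy below the failed layer moves up one layer, and `new_copy`
    is placed at the lowest layer so that all three roles are filled again.
    """
    if failed_role not in ROLES:
        raise ValueError(f"unknown role: {failed_role}")
    survivors = [hierarchy[role] for role in ROLES if role != failed_role]
    survivors.append(new_copy)
    return dict(zip(ROLES, survivors))


hierarchy = {"primary": "A", "secondary": "B", "secondary_backup": "C"}
after_secondary_failure = replace_failed(hierarchy, "secondary", "D")
```

In this sketch a failure of the secondary copy leaves the primary untouched, promotes `C` to secondary, and places the regenerated copy `D` in the secondary backup role.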
Compared with the prior art, the present invention has the following beneficial effects:
The secondary backup replication method of the present invention adopts a semi-active replication arrangement between the primary and secondary copies and adds a secondary backup relationship. This provides fast recovery or failover in a cluster system while ensuring low run-time overhead and near-instantaneous failover capability. For a cluster whose processes or systems are replicated in this way, continuous availability can be guaranteed, and response and recovery times in the event of a failure are significantly reduced. The method improves mission-critical and real-time application environments, is practical, and is easy to popularize.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of an embodiment of the invention.
Fig. 2 is a schematic diagram of the clustered computer system of an embodiment of the invention, showing nodes, a client, and a communication channel.
Fig. 3 is a flow chart of the process used when a failure of the primary copy is detected in an embodiment of the invention.
Fig. 4 is a flow chart of the process used when a failure of the current secondary copy is detected in an embodiment of the invention.
Detailed description of the embodiments
The secondary backup replication method for computer systems in a cluster is described in detail below with reference to the accompanying drawings.
The main object of the present invention is a replication scheme, "secondary backup replication", that reduces run-time overhead and recovery-time overhead at the same time while making no assumptions about the determinism of request processing. This makes it suitable for the high-availability and fault-tolerance management of mission-critical and time-critical applications.
Another object of the present invention is to provide a new replication technique for clustered computing systems, referred to as "secondary backup" replication. In this technique, a process or a node in a computer cluster is replicated as a group of three copies, or cloned into three process copies, which take part in the secondary backup protocol. In addition to the classical "primary" and "secondary" roles, the technique introduces a new role, referred to as "secondary backup" or simply "backup". The secondary backup copy is a copy of the process or system in the process group that acts as a hot standby for the secondary copy. The primary and secondary copies take part in a semi-active replication protocol, while a relationship similar to passive replication exists between the secondary copy and the secondary backup copy.
Another object of the present invention is to introduce a low-overhead protocol between the secondary copy and the third copy, that is, the secondary backup copy. In addition, only one "follower" ever takes part in the semi-active replication scheme adopted here.
The invention provides a secondary backup replication method for computer systems in a cluster, which proceeds as follows:
A backup replication system is set up in the cluster; the system has at least one client, at least one node, a primary copy, a secondary copy, and a secondary backup copy;
Each primary and secondary backup copy is assigned a place in a hierarchical structure;
When a failure of one of the copies is detected, the failed copy is replaced by the copy in the next lower layer;
A copy is regenerated at the lowest affected level, so that the primary copy, the secondary copy, and the secondary backup copy are re-established.
When the failed copy is the secondary copy, the secondary backup copy is promoted to be the new secondary copy, the group is reconfigured, and a new secondary backup copy is started.
When the failed copy is the secondary backup copy, the secondary copy clones itself to form a new secondary backup copy.
The replicated copy is a single operating system image, that is, an AIX or Linux operating system image.
Embodiment.
As shown in Fig. 1, the clustered computer system has one or more clients 12a to 12N, communication systems 13 and 14, nodes 16a to 16n, a disk bus 18, and one or more shared disks 20a to 20n. Other clusters in which the invention can be used may look very different, depending on the number of processors, the networks used, the choice of disks, and so on. A client 12 is a processor that can access the nodes 16 over a local area network (LAN), such as the private LAN shown at 13 or the public LAN shown at 14. Each client 12 runs a "front-end" or client application that queries the server application running on the cluster nodes 16. As also shown in Fig. 1, each node 16 has access to one or more shared external disk devices 20. Each disk device 20 can be physically connected to multiple nodes. The shared disks store mission-critical data and are usually configured for data redundancy. The nodes 16 form the core of the cluster system 10. A node 16 is a processor that runs the high-availability and fault-tolerance management software and the application software.
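The point that a shared disk is attached to several nodes can be made concrete with a small model of the topology. The classes and the `disks_reachable_after_failure` helper are hypothetical; only the reference numerals used as names (16a, 16b, 20a) are taken from the description of Fig. 1.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDisk:
    name: str
    attached_nodes: set = field(default_factory=set)


@dataclass
class Node:
    name: str
    disks: list = field(default_factory=list)

    def attach(self, disk):
        self.disks.append(disk)
        disk.attached_nodes.add(self.name)


def disks_reachable_after_failure(nodes, failed_name):
    """Names of shared disks still reachable through at least one surviving node."""
    return {d.name for n in nodes if n.name != failed_name for d in n.disks}


node_a, node_b = Node("16a"), Node("16b")
disk = SharedDisk("20a")
node_a.attach(disk)
node_b.attach(disk)
```

With the disk attached to both nodes, the data on it stays reachable when either node fails, which is why the shared disks are connected to more than one node.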
Disclosed is a new replication management technique, secondary backup replication, for managing a group of copies of a process in a highly available distributed system. In secondary backup replication, one copy acts as a backup of the secondary copy rather than of the primary copy. This differs from the common primary-backup method, in which the third copy would back up the primary copy.
Fig. 2 shows the secondary backup replication group of the cluster, with the client 1 and three copies 4, 5, and 6. Each copy can be regarded as a single process or container running on a single computer system or LPAR image. A copy can also represent a single operating system image, such as an AIX or Linux image. The three copies 4, 5, and 6 can also be regarded as three independent processes running on one computer system. Both the primary copy 4 and the secondary copy 5 process all client requests, but only the primary copy 4 is responsible for handling all non-deterministic operations. The secondary copy 5 is then forced to make the same decisions as the primary copy 4. The secondary copy 5 regularly updates the secondary backup copy 6 by checkpointing its state changes to it, which minimizes the effect of the secondary backup copy 6 on the run-time overhead of the cluster.
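The periodic update from the secondary copy to the secondary backup copy could look like the sketch below. The checkpoint interval, the class names, and the choice to ship the full state on each checkpoint are assumptions made for illustration; the description only says that the update happens regularly.

```python
import copy


class SecondaryBackupCopy:
    def __init__(self):
        self.state = []
        self.checkpoints_received = 0

    def receive_checkpoint(self, state):
        self.state = copy.deepcopy(state)
        self.checkpoints_received += 1


class SecondaryCopy:
    def __init__(self, checkpoint_every=3):
        self.state = []
        self.checkpoint_every = checkpoint_every
        self._since_checkpoint = 0

    def apply(self, request, decision, backup):
        # Follow the primary: record the request together with its decision.
        self.state.append((request, decision))
        self._since_checkpoint += 1
        if self._since_checkpoint >= self.checkpoint_every:
            backup.receive_checkpoint(self.state)
            self._since_checkpoint = 0


secondary, backup = SecondaryCopy(checkpoint_every=3), SecondaryBackupCopy()
for i in range(7):
    secondary.apply(f"req{i}", i * 10, backup)
```

The backup is touched only on every third request here, so it lags the secondary slightly but adds little work to the request path.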
Normally, the failure of a copy in a group changes the composition of the group and triggers a view change. The failure or loss of a copy is handled differently depending on the role the failed copy held. Because the secondary backup copy 6 does not take part in any interaction outside the group, its failure is completely transparent to the rest of the copy group.
Fig. 3 is a flow chart of the method used when a failure of the primary copy 4 is detected. In step 9, the failure of the primary copy is detected. In step 10, upon detection of the failed primary copy 4, the secondary copy 5 takes over instantly and continues the computation, assuming the role of the primary copy 4. At 12, the first thing the secondary copy 5 does is replay any pending events that it received from the failed primary copy 4, bringing itself up to date with the last known state of the primary copy 4. After processing all pending events, the secondary copy 5 continues execution and synchronizes itself with the secondary backup copy 6. Then, at 13, the secondary backup copy 6 is promoted to its new role as secondary copy.
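The replay step of the takeover can be sketched as follows, assuming that events are plain numbers accumulated into a running total. The `Secondary` class and its methods are hypothetical stand-ins, not the patent's implementation.

```python
class Secondary:
    def __init__(self):
        self.state = 0      # running total of all applied events
        self.pending = []   # events received from the primary, not yet applied

    def receive(self, event):
        self.pending.append(event)

    def apply_next(self):
        self.state += self.pending.pop(0)

    def take_over(self):
        """Replay every pending event, then report the role this copy now holds."""
        while self.pending:
            self.apply_next()
        return "primary"


secondary = Secondary()
for event in (5, 7, 11):
    secondary.receive(event)
secondary.apply_next()  # the primary fails after only one event was applied
role = secondary.take_over()
```

Replaying in arrival order is what brings the secondary to the last state it knows the primary reached before it starts serving requests itself.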
Fig. 4 is a flow chart of the process used when a failure of the current secondary copy 5 is detected. If the current secondary copy 5 fails, the failure is detected at 14. In step 15, the secondary backup copy 6 promotes itself to the secondary role. If additional resources are available, a new copy is started in the role of secondary backup copy 6 and the group is reconfigured, restoring the original degree of replication.
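The dependence on additional resources can be sketched as a small function. Modeling resources as a list of spare node names, and the function name `on_secondary_failure`, are assumptions made for this illustration.

```python
def on_secondary_failure(group, spare_nodes):
    """Promote the secondary backup; refill its role only if a spare node exists."""
    group = dict(group)  # work on a copy so the caller's mapping is unchanged
    group["secondary"] = group["secondary_backup"]
    if spare_nodes:
        group["secondary_backup"] = f"copy-on-{spare_nodes[0]}"
    else:
        group["secondary_backup"] = None  # degraded: two copies until resources appear
    return group


group = {"primary": "A", "secondary": "B", "secondary_backup": "C"}
restored = on_secondary_failure(group, ["n3"])
degraded = on_secondary_failure(group, [])
```

Without a spare node the group keeps running with two copies, so the original degree of replication is restored only when resources allow.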
The last case is the process used when a failure of the secondary backup copy 6 is detected. The failure of the secondary backup copy 6 does not affect the state of the cluster, because this copy does not take part in processing requests and responses. At 18, the secondary copy clones itself, if possible, to establish a new secondary backup copy 6.
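Cloning the secondary copy to obtain a new secondary backup copy can be sketched with a deep copy of its state. The `ReplicaCopy` class and `clone_as` method are hypothetical; a real system would clone a process or an operating system image rather than a Python object.

```python
import copy


class ReplicaCopy:
    def __init__(self, role, state):
        self.role = role
        self.state = state

    def clone_as(self, role):
        """Return an independent copy of this one, holding a different role."""
        return ReplicaCopy(role, copy.deepcopy(self.state))


secondary = ReplicaCopy("secondary", {"orders": [1, 2, 3]})
new_backup = secondary.clone_as("secondary_backup")
secondary.state["orders"].append(4)  # later work on the secondary
```

The deep copy keeps the clone independent, so later changes on the secondary reach the new backup only through subsequent updates.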
The foregoing describes only embodiments of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (4)

1. A secondary backup replication method for computer systems in a cluster, characterized in that it proceeds as follows:
A backup replication system is set up in the cluster; the system has at least one client, at least one node, a primary copy, a secondary copy, and a secondary backup copy;
Each primary and secondary backup copy is assigned a place in a hierarchical structure;
When a failure of one of the copies is detected, the failed copy is replaced by the copy in the next lower layer;
A copy is regenerated at the lowest affected level, so that the primary copy, the secondary copy, and the secondary backup copy are re-established.
2. The secondary backup replication method for computer systems in a cluster according to claim 1, characterized in that: when the failed copy is the secondary copy, the secondary backup copy is promoted to be the new secondary copy, the group is reconfigured, and a new secondary backup copy is started.
3. The secondary backup replication method for computer systems in a cluster according to claim 1, characterized in that: when the failed copy is the secondary backup copy, the secondary copy clones itself to form a new secondary backup copy.
4. The secondary backup replication method for computer systems in a cluster according to claim 2 or 3, characterized in that: the replicated copy is a single operating system image, that is, an AIX or Linux operating system image.
CN201410006210.1A 2014-01-07 2014-01-07 Method for assisting in backing-up and copying computer system in cluster Pending CN103793296A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410006210.1A CN103793296A (en) 2014-01-07 2014-01-07 Method for assisting in backing-up and copying computer system in cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410006210.1A CN103793296A (en) 2014-01-07 2014-01-07 Method for assisting in backing-up and copying computer system in cluster

Publications (1)

Publication Number Publication Date
CN103793296A true CN103793296A (en) 2014-05-14

Family

ID=50669003

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410006210.1A Pending CN103793296A (en) 2014-01-07 2014-01-07 Method for assisting in backing-up and copying computer system in cluster

Country Status (1)

Country Link
CN (1) CN103793296A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106716972A (en) * 2014-09-30 2017-05-24 微软技术许可有限责任公司 Semi-automatic failover
CN107204878A (en) * 2017-05-27 2017-09-26 国网山东省电力公司 A kind of annular escape system of certificate server and method
CN112711376A (en) * 2019-10-25 2021-04-27 北京金山云网络技术有限公司 Method and device for determining object master copy file in object storage system
CN113254536A (en) * 2021-06-09 2021-08-13 蚂蚁金服(杭州)网络技术有限公司 Database transaction processing method, system, electronic device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5212784A (en) * 1990-10-22 1993-05-18 Delphi Data, A Division Of Sparks Industries, Inc. Automated concurrent data backup system
CN1512375A (en) * 2002-12-31 2004-07-14 联想(北京)有限公司 Fault-tolerance approach using machine group node interacting backup
CN1752939A (en) * 2004-09-22 2006-03-29 微软公司 Method and system for synthetic backup and restore
US20080052327A1 (en) * 2006-08-28 2008-02-28 International Business Machines Corporation Secondary Backup Replication Technique for Clusters
CN101706795A (en) * 2009-11-30 2010-05-12 上海世范软件技术有限公司 Method for synchronizing data of database in active/standby server

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106716972A (en) * 2014-09-30 2017-05-24 微软技术许可有限责任公司 Semi-automatic failover
CN106716972B (en) * 2014-09-30 2020-12-15 微软技术许可有限责任公司 Semi-automatic failover
CN107204878A (en) * 2017-05-27 2017-09-26 国网山东省电力公司 A kind of annular escape system of certificate server and method
CN107204878B (en) * 2017-05-27 2018-01-02 国网山东省电力公司 A kind of certificate server annular escape system and method
CN112711376A (en) * 2019-10-25 2021-04-27 北京金山云网络技术有限公司 Method and device for determining object master copy file in object storage system
CN113254536A (en) * 2021-06-09 2021-08-13 蚂蚁金服(杭州)网络技术有限公司 Database transaction processing method, system, electronic device and storage medium
WO2022257719A1 (en) * 2021-06-09 2022-12-15 北京奥星贝斯科技有限公司 Database transaction processing method and system, electronic device, and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140514