CN103049322A - Vector target set balance partition method aiming at topological relation parallel computation - Google Patents
- Publication number
- CN103049322A (application number CN2012105863713A)
- Authority
- CN
- China
- Prior art keywords
- vector target
- vector
- topological relation
- computation
- parallel computation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a balanced partitioning method for vector target sets, aimed at the parallel computation of topological relations. Parallel computation of topological relations between vector targets is a non-compute-intensive algorithm: most of the computing resources are spent on testing whether the minimum bounding rectangles of vector targets intersect, while the topological computation itself takes only a small share. Partitioning therefore focuses on balancing the number of vector targets in each process rather than on the geometric complexity of the targets. For the parallel computation of vector target topological relations, the data are assigned to the processes by an efficient balanced partition, so that the number of vector targets, and hence the task load, is balanced across processes. With this method the computational load across processes is highly balanced, which improves the efficiency of the algorithm and provides an efficient vector data partitioning method for developing and serving topological relation software for massive data in single-machine multi-core, many-core, and high-performance cluster environments.
Description
Technical field
The invention belongs to the field of parallel computing, and in particular relates to a balanced partitioning method for vector target sets used in the parallel computation of topological relations between vector targets.
Background technology
A topological relation is a qualitative relation between spatial objects that remains unchanged under transformations such as stretching, translation, and rotation. It plays an important role in the organization, analysis, and querying of spatial data, as well as in spatial reasoning, and is an important part of geographic information systems (GIS). With the explosive growth of spatial data volumes, traditional serial spatial-relation algorithms can no longer meet the demands of large-scale spatial data analysis. A parallel algorithm that exploits parallel computer architectures is urgently needed to support fast computation and analysis of spatial relations for large-scale vector targets (spatial data). A parallel algorithm for vector target topological relations consists of three parts: 1) preprocessing of the target polygon set; 2) partitioning of the target polygon set; 3) computation and determination of the topological relations between target polygons.
Preprocessing of the target polygon set: preprocessing sorts the polygon data in parallel, using the vertex count of each target polygon as the sort key. Partitioning of the target polygon set: partitioning is both the focus and the main difficulty of a parallel topological relation algorithm, because load balance during the parallel computation is a major factor in the algorithm's efficiency.
The design of a parallel topological relation algorithm must address two issues: the spatial characteristics of the vector targets, and the partitioning of the vector target set. Because the parallel topological relation algorithm is non-compute-intensive, a data partitioning method suited to that characteristic can be chosen. Existing partitioning methods for vector target sets in parallel algorithms mainly include round-robin partitioning [1], range partitioning [1-3], hash partitioning [1,4], hybrid partitioning [1,5], and space-filling curve partitioning [6,7]. None of these methods, however, is tailored to the characteristics of topological relation algorithms, so none guarantees a balanced task load across the processes of a parallel topological relation algorithm, which hurts computational efficiency. The present invention therefore designs a new partitioning method around the characteristics of the topological relation algorithm; it balances the topological relation computation load for large-scale vector target sets and improves the efficiency of spatial relation computation for vector targets.
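For orientation, three of the existing strategies named above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the cited works: the function names, the use of CRC32 as the hash, the treatment of range partitioning as contiguous blocks of an already ordered sequence, and the toy vertex counts are all assumptions made here.

```python
import zlib
from typing import Callable, Hashable, List, Sequence, TypeVar

T = TypeVar("T")


def round_robin_partition(items: Sequence[T], p: int) -> List[List[T]]:
    """Deal items out in turn: item i goes to process i mod p."""
    parts: List[List[T]] = [[] for _ in range(p)]
    for i, item in enumerate(items):
        parts[i % p].append(item)
    return parts


def range_partition(items: Sequence[T], p: int) -> List[List[T]]:
    """Cut the (already ordered) sequence into p contiguous blocks."""
    size = -(-len(items) // p)  # ceiling division
    return [list(items[j * size:(j + 1) * size]) for j in range(p)]


def hash_partition(items: Sequence[T], p: int,
                   key: Callable[[T], Hashable]) -> List[List[T]]:
    """Send each item to process crc32(key) mod p (a deterministic hash)."""
    parts: List[List[T]] = [[] for _ in range(p)]
    for item in items:
        digest = zlib.crc32(repr(key(item)).encode("utf-8"))
        parts[digest % p].append(item)
    return parts


# Hypothetical vertex counts of eight targets, sorted in descending order.
vertex_counts = [90, 70, 50, 30, 20, 10, 8, 4]

for name, parts in [
    ("round-robin", round_robin_partition(vertex_counts, 2)),
    ("range", range_partition(vertex_counts, 2)),
    ("hash", hash_partition(vertex_counts, 2, key=lambda v: v)),
]:
    # Print the object count and the total vertex count per process.
    print(name, [len(part) for part in parts], [sum(part) for part in parts])
```

On this toy input the contiguous blocks give both processes four objects but vertex totals of 240 and 42, while dealing the objects out in turn gives 168 and 114. Hash partitioning does not guarantee equal object counts at all.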
References
[1] Zhao Chunyu. Research on key technologies for vector spatial data access and processing in high-performance parallel GIS [D]. Wuhan: Wuhan University, 2006.
[2] Ann Chervenak, Ian Foster, Carl Kesselman, Charles Salisbury, Steven Tuecke. The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets [J]. Journal of Network and Computer Applications, 2000, 23: 187-200.
[3] http://docs.oracle.com/cd/B28359_01/server.111/b32024/partition.htm
[4] Chengwen Liu, Hao Chen. A Hash Partition Strategy for Distributed Query Processing [C]. The 5th International Conference on Extending Database Technology (EDBT), Avignon, France, 1996, 1057: 371-387.
[5] Shahram Ghandeharizadeh, David J. DeWitt. Hybrid-range partitioning strategy: a new declustering strategy for multiprocessor database machines [C]. Proceedings of the Sixteenth International Conference on Very Large Databases, Brisbane, Australia, 1990: 481-492.
[6] Wang Yongjie, Meng Lingkui, Zhao Chunyu. Research on a massive spatial data partitioning algorithm based on the Hilbert space ordering code [J]. Journal of Wuhan University (Information Science Edition), 2007, 32(7): 650-653.
[7] Tian Guang. Research and implementation of partitioning strategies for vector spatial data in parallel computing environments [D]. China University of Geosciences, 2011.
3. Summary of the invention
(1) Algorithm steps
The invention provides a balanced partitioning method for vector target sets, intended for the parallel computation of topological relations. Based on the characteristics of that computation, the invention partitions the target set across processes using a balanced partitioning method that takes the number of targets into account, so that the tasks of the processes in the parallel topological relation computation are balanced (as shown in Fig. 1).
1. Balanced partitioning method for vector target sets that takes the number of targets into account (hereinafter "balanced partitioning")
The parallel computation of vector target topological relations (point, line, and area target topological relations) is a non-compute-intensive algorithm. Its main consumption of computing resources is testing whether the minimum bounding rectangles of vector targets intersect; the topological computation itself consumes only a small share. The partitioning of a vector target set should therefore focus not on the geometric complexity of the vector targets but on balancing the number of vector targets in each process.
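The minimum bounding rectangle test that dominates the cost can be sketched as follows. This is a minimal illustration under assumptions made here: targets are plain lists of (x, y) vertices, rectangles are axis-aligned, touching edges count as intersecting, and the names `mbr_of`, `mbrs_intersect`, and `candidate_pairs` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
MBR = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def mbr_of(vertices: List[Point]) -> MBR:
    """Minimum bounding rectangle of a target given as a vertex list."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def mbrs_intersect(a: MBR, b: MBR) -> bool:
    """True when two axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_pairs(targets: List[List[Point]]) -> List[Tuple[int, int]]:
    """Index pairs whose rectangles intersect; only these pairs would
    need the exact (more expensive) topological computation."""
    boxes = [mbr_of(t) for t in targets]
    return [
        (i, j)
        for i in range(len(boxes))
        for j in range(i + 1, len(boxes))
        if mbrs_intersect(boxes[i], boxes[j])
    ]


# Three hypothetical square parcels: the first two overlap, the third is far away.
squares = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(1, 1), (3, 1), (3, 3), (1, 3)],
    [(5, 5), (6, 5), (6, 6), (5, 6)],
]
print(candidate_pairs(squares))
```

The cost of each rectangle test is the same whatever the vertex count of the targets involved, which is consistent with the emphasis above on balancing the number of targets per process.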
Given p processes, balanced partitioning divides the vector target set (n objects) evenly into p parts, so that each process receives about n/p vector targets. The detailed procedure is shown in Fig. 2:
1) Let the vector target set contain n objects. From the vertex count of each object (the numbers in Fig. 2A are the vertex counts $v_i$), compute the weight $w_i$ of each vector target by Formula 1; each process then sorts (serializes) the vector target set by weight using a sorting algorithm (Fig. 2B);

$$w_i = f(v_i), \quad 1 \le i \le n \qquad \text{(Formula 1)}$$
2) Let the number of vector targets in each sub-set assigned per allocation round be $l$ ($0 < l \le \lfloor n/(k \times p) \rfloor$, $2 \le k < n/p$, $k$ even), and let $m$ ($0 \le m \le k$) be the number of the allocation round;
3) Following the sorted order, each round assigns $p$ adjacent sub-sets of $l$ vector targets each, $S_m = \{s_{m1}, s_{m2}, s_{m3}, \ldots, s_{m(p \times l)}\}$ (that is, $p \times l$ objects, where $m$ is the round number), to different processes. If $m$ is odd, the assignment order is $s_1 \to s_{p \times l}$; if $m$ is even, the order is $s_{p \times l} \to s_1$. This scheme is primarily designed to balance the number of objects in each process, but it also roughly balances the geometric complexity (vertex count) of the vector targets. Because the set is sorted by the weights $w_i = f(v_i)$ and the order is reversed between odd and even rounds, Formula 2 holds: process $p_1$ receives $S_{mi}$ in round $m$ and $S_{(m+1)(p \times l - i)}$ in round $m+1$, while process $p_2$ receives $S_{m(p \times l - i)}$ in round $m$ and $S_{(m+1)i}$ in round $m+1$, so the geometric complexity of the vector targets assigned to each process is roughly equal (Fig. 2C);

$$S_{mi} + S_{(m+1)(p \times l - i)} \approx S_{m(p \times l - i)} + S_{(m+1)i} \qquad \text{(Formula 2)}$$
4) Finally, the remaining vector targets, which are too few to form $p$ sub-sets of size $l$ (there are $n - m \times p \times l$ of them), are distributed evenly among the processes in the manner of step 2), until every vector target has been assigned to a process.
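Steps 1) to 4) can be sketched in Python as follows. This is a sketch under stated assumptions rather than the patented implementation: the function name `balanced_partition` is hypothetical, targets are sorted in descending order of weight, each sub-set holds $l = \lfloor n/(k \times p) \rfloor$ targets, exactly $k$ rounds are run, and the leftover targets are handed out one at a time starting from process 0.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def balanced_partition(targets: Sequence[T], p: int, k: int = 2,
                       weight: Callable[[T], float] = len) -> List[List[T]]:
    """Split targets among p processes in k alternating-direction rounds."""
    if p < 1 or k < 2 or k % 2 != 0:
        raise ValueError("need p >= 1 and an even k >= 2")
    # Step 1: serialize the set by weight (default: number of vertices).
    ordered = sorted(targets, key=weight, reverse=True)
    n = len(ordered)
    # Step 2: size of each sub-set handed to one process in one round.
    l = n // (k * p)
    parts: List[List[T]] = [[] for _ in range(p)]
    # Step 3: odd rounds assign sub-sets to processes 0..p-1,
    # even rounds assign them in the reverse order p-1..0.
    for m in range(1, k + 1):
        base = (m - 1) * p * l
        for j in range(p):
            chunk = ordered[base + j * l: base + (j + 1) * l]
            dest = j if m % 2 == 1 else p - 1 - j
            parts[dest].extend(chunk)
    # Step 4 (assumed policy): deal the leftover targets out one by one.
    for r, item in enumerate(ordered[k * p * l:]):
        parts[r % p].append(item)
    return parts


# Hypothetical example: 16 targets whose weights are simply 1..16.
weights = list(range(1, 17))
parts = balanced_partition(weights, p=4, k=2, weight=lambda v: v)
print([len(part) for part in parts])
print([sum(part) for part in parts])
```

In this toy case every process receives four targets, and the reversal between the two rounds pairs the heaviest sub-set with the lightest one, so each process ends up with a total weight of 34, which is the effect Formula 2 describes.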
(2) Beneficial effects
1. By adopting a balanced partitioning method that takes the number of targets into account and is tailored to the characteristics of topological relation algorithms, the invention distributes the vector target set evenly across processes, balances the task load between processes, and improves parallel computing efficiency; the parallel efficiency exceeds 60%.
2. Using single-machine multi-core, many-core, and high-performance cluster hardware environments, the invention makes it possible to develop high-performance parallel computing software for efficient spatial relation analysis of massive spatial data.
4. Description of the drawings
Fig. 1: Characteristics of the parallel topological relation algorithm and the corresponding partitioning method for vector target sets
Fig. 2: Balanced partitioning method for vector target sets that takes the number of targets into account
Fig. 3: Vector target set used in the balanced partitioning case
5. Embodiment
The following case illustrates the vector target set partitioning method for parallel topological relation computation.
(1) Case: balanced partitioning of a vector target set taking the number of targets into account
The invention uses balanced partitioning to distribute the data across processes, computes whether vector targets overlap (overlap being one kind of topological relation), and outputs the IDs of the spatial targets returned by the query. The vector target set in this case consists of 691,442 land parcels from a certain area (4,417,571 vertices in total), as shown in Fig. 3 (Fig. 3A shows the original parcel set used in the case, and Fig. 3B shows the complexity of the parcels).
1) The vector target set for which spatial topological relations are computed is DataSet; it contains N = 691,442 objects;
2) Set the weight of each vector target to $W_i = f(v_i)$, with weight function $f(v_i) = v_i$; that is, the vertex count of the vector target is used as its weight;
3) Sort (serialize) the objects in DataSet by weight in descending order using a standard sorting algorithm;
4) According to the number of processes P, use balanced partitioning to divide the DataSet vector target set among the processes. The data assigned to each process is called subDataSet, and each sub-set contains about 691,442/P objects. Take P = 4, with 691,442/(2 × P) objects per allocation and two allocation rounds, as an example. First round: process 0 receives 86,430 vector targets (1 ≤ ID ≤ 86,430, where ID is the sequence number of an object), process 1 receives 86,430 (86,431 ≤ ID ≤ 172,860), process 2 receives 86,430 (172,861 ≤ ID ≤ 259,290), and process 3 receives 86,430 (259,291 ≤ ID ≤ 345,720). Second round: process 0 again receives 86,430 (605,011 ≤ ID ≤ 691,440), process 1 receives 86,430 (518,581 ≤ ID ≤ 605,010), process 2 receives 86,430 (432,151 ≤ ID ≤ 518,580), and process 3 receives 86,430 (345,721 ≤ ID ≤ 432,150). Finally, the remaining 2 vector targets are assigned to process 0 and process 1. In total, process 0 holds 172,861 vector targets, process 1 holds 172,861, process 2 holds 172,860, and process 3 holds 172,860;
5) Each process computes the spatial topological relations between the vector targets in DataSet and those in its subDataSet, and the results from all processes are collected; 53 parcel overlaps are found. The results in Table 1 are for the parallel computation of vector topological relations based on balanced partitioning. The speed-up is pronounced: the running time falls as the number of processes increases, which greatly improves the efficiency of topological relation computation, and the speed-up ratio reaches 5.33 with 8 processes.
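The ID ranges in step 4) can be reproduced with a few lines of integer arithmetic. The sketch below assumes 1-based sequence numbers over the sorted set, a sub-set size of $\lfloor N/(k \times P) \rfloor$, and leftover targets handed out starting from process 0; the function name `allocation_plan` is hypothetical.

```python
from typing import Dict, List, Tuple


def allocation_plan(n: int, p: int, k: int
                    ) -> Tuple[List[Dict[int, Tuple[int, int]]], List[int]]:
    """Return, per round, the inclusive ID range given to each process,
    plus the final number of targets held by each process."""
    l = n // (k * p)  # targets per process per round
    rounds: List[Dict[int, Tuple[int, int]]] = []
    for m in range(1, k + 1):
        base = (m - 1) * p * l
        row: Dict[int, Tuple[int, int]] = {}
        for j in range(p):
            # Odd rounds go to processes 0..p-1, even rounds to p-1..0.
            dest = j if m % 2 == 1 else p - 1 - j
            row[dest] = (base + j * l + 1, base + (j + 1) * l)
        rounds.append(row)
    remainder = n - k * p * l
    totals = [
        k * l + remainder // p + (1 if proc < remainder % p else 0)
        for proc in range(p)
    ]
    return rounds, totals


rounds, totals = allocation_plan(691_442, 4, 2)
for m, row in enumerate(rounds, start=1):
    for proc in sorted(row):
        first, last = row[proc]
        print(f"round {m}, process {proc}: {first:,} <= ID <= {last:,}")
print("totals:", totals)
```

With N = 691,442, P = 4, and two rounds, each sub-set holds 86,430 targets, two targets are left over, and the four totals add up to 691,442.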
Table 1. Results of parallel vector topological relation computation based on balanced partitioning
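As a quick consistency check, and assuming the usual definition of parallel efficiency as the speed-up divided by the number of processes (the source does not state which definition it uses), the reported speed-up of 5.33 with 8 processes corresponds to:

$$E = \frac{S}{p} = \frac{5.33}{8} \approx 0.67$$

That is roughly 67%, which agrees with the parallel efficiency of more than 60% stated among the beneficial effects.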
The specific embodiment described above further explains the purpose, technical solution, and beneficial effects of the invention. It should be understood that the above is only a specific embodiment of the invention and does not limit the invention; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (1)
- 1. A vector target set partitioning method for the parallel computation of topological relations, characterized in that it is a balanced partitioning method for vector target sets that targets topological relation parallel computation and takes the number of targets into account, and comprises the following two features: 1) the vertex count ($v_i$) of each vector target is used as the parameter of its weight ($w_i$), $w_i = f(v_i)$ ($1 \le i \le n$, where $n$ is the number of vector targets); the vector target set is sorted (serialized) by the weight $w_i$ of each target, and sub-sets containing equal numbers of vector targets are assigned to each process so that the task load between processes is balanced; 2) during allocation, the vector target set $S_m = \{s_1, s_2, s_3, \ldots, s_p\}$ is assigned to different processes according to the parity of the allocation round number $m$: if $m$ is odd, the assignment order is $s_1 \to s_p$ (weights from large to small, or from small to large); if $m$ is even, the assignment order is $s_p \to s_1$ (weights from small to large, or from large to small).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012105863713A CN103049322A (en) | 2012-12-31 | 2012-12-31 | Vector target set balance partition method aiming at topological relation parallel computation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103049322A true CN103049322A (en) | 2013-04-17 |
Family
ID=48061972
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2012105863713A Pending CN103049322A (en) | 2012-12-31 | 2012-12-31 | Vector target set balance partition method aiming at topological relation parallel computation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103049322A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5630129A (en) * | 1993-12-01 | 1997-05-13 | Sandia Corporation | Dynamic load balancing of applications |
US20070250470A1 (en) * | 2006-04-24 | 2007-10-25 | Microsoft Corporation | Parallelization of language-integrated collection operations |
CN101944045A (en) * | 2010-10-18 | 2011-01-12 | 中国人民解放军国防科学技术大学 | Method for distributing parallel discrete event simulation objects based on community characteristics |
CN102591622A (en) * | 2011-12-20 | 2012-07-18 | 南京大学 | Grid data coordinate conversion parallel method based on similarity transformation model |
Non-Patent Citations (2)
Title |
---|
YAN ZHOU等: "Hilbert Curve Based Spatial Data Declustering Method for Parallel Spatial Database", 《REMOTE SENSING, ENVIRONMENT AND TRANSPORTATION ENGINEERING (RSETE), 2012 2ND INTERNATIONAL CONFERENCE ON》 * |
YANG YIZHOU等: "Research on distributed Hilbert R tree spatial index based on BIRCH clustering", 《2012 20TH INTERNATIONAL CONFERENCE ON GEOINFORMATICS》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110609744A (en) * | 2018-06-15 | 2019-12-24 | 伊姆西Ip控股有限责任公司 | Method, apparatus and computer program product for processing computing tasks |
CN110609744B (en) * | 2018-06-15 | 2023-06-09 | 伊姆西Ip控股有限责任公司 | Method, apparatus and computer program product for processing computing tasks |
CN109062679A (en) * | 2018-08-01 | 2018-12-21 | 中国科学院遥感与数字地球研究所 | A kind of general division methods of vector data towards parallel processing |
CN113886089A (en) * | 2021-10-21 | 2022-01-04 | 上海勃池信息技术有限公司 | Task processing method, device, system, equipment and medium |
CN113886089B (en) * | 2021-10-21 | 2024-01-26 | 上海勃池信息技术有限公司 | Task processing method, device, system, equipment and medium |
CN114780247A (en) * | 2022-05-17 | 2022-07-22 | 中国地质大学(北京) | Flow application scheduling method and system with flow rate and resource sensing |
CN114780247B (en) * | 2022-05-17 | 2022-12-13 | 中国地质大学(北京) | Flow application scheduling method and system with flow rate and resource sensing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20130417 |