CN103049322A - Vector target set balance partition method aiming at topological relation parallel computation - Google Patents


Info

Publication number
CN103049322A
Authority
CN
China
Prior art keywords
vector target
vector
topological relation
computation
parallel computation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012105863713A
Other languages
Chinese (zh)
Inventor
吴立新
杨宜舟
郭甲腾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN2012105863713A priority Critical patent/CN103049322A/en
Publication of CN103049322A publication Critical patent/CN103049322A/en
Pending legal-status Critical Current

Abstract

The invention discloses a balanced partitioning method for vector target sets, aimed at the parallel computation of topological relations. Parallel computation of topological relations between vector targets is not a compute-intensive algorithm: most of the computing resources are spent on judging whether the minimum bounding rectangles of vector targets intersect, while the topological computation itself takes only a small share. Partitioning therefore focuses on balancing the number of vector targets in each process rather than on the geometric complexity of the targets. Data are assigned to the processes by an efficient balanced partition, so that the number of vector targets, and hence the task load, is balanced across processes. With this method the computational load across processes is highly balanced, which improves the efficiency of the algorithm and provides an efficient vector data partitioning method for developing topological relation software for massive data in single-machine multi-core and many-core high-performance cluster environments.

Description

A balanced partitioning method for vector target sets for the parallel computation of topological relations
1. Technical field
The invention belongs to the field of parallel computing, and in particular relates to a balanced partitioning method for vector target sets used in the parallel computation of topological relations between vector targets.
2. Background
A topological relation is a qualitative relation between spatial objects that remains unchanged under transformations such as stretching, translation and rotation. It plays an important role in the organization, analysis and querying of spatial data, as well as in spatial reasoning, and is an important part of geographic information systems (GIS). With the explosive growth of spatial data volumes, traditional serial algorithms for spatial relations can no longer meet the demands of large-scale spatial data analysis. A parallel algorithm that exploits parallel computer architectures is urgently needed for the fast computation and analysis of spatial relations over large-scale vector target (spatial data) sets. A parallel algorithm for vector target topological relations consists of three parts: 1) preprocessing of the target polygon set; 2) partitioning of the target polygon set; 3) computation and judgment of the topological relations between target polygons.
Preprocessing of the target polygon set: the preprocessing step sorts the polygon data in parallel, using the number of vertices of each target polygon as the sort key. Partitioning of the target polygon set: partitioning is the key and most difficult problem of a parallel topological relation algorithm. Load balance during the parallel computation is a major factor in the efficiency of the parallel algorithm.
The design of a parallel topological relation algorithm must consider two issues: the spatial characteristics of the vector targets, and the partitioning of the vector target set. The parallel topological relation algorithm is not compute-intensive, and its data can be partitioned in a way that matches this characteristic. Existing partitioning methods for vector target sets in parallel algorithms mainly include round-robin partitioning [1], range partitioning [1-3], hash partitioning [1,4], hybrid partitioning [1,5] and space-filling curve partitioning [6,7]. None of these methods is tailored to the characteristics of the topological relation algorithm, so none can guarantee a balanced task load across the processes of the parallel algorithm, and computing efficiency suffers. The invention therefore designs a new partitioning method based on the characteristics of the topological relation algorithm. It achieves load balance when computing topological relations over large-scale vector target sets and improves the efficiency of spatial relation computation between vector targets.
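For orientation, the three simplest of the existing schemes named above can be sketched as follows. This is a generic Python illustration of the textbook definitions, not code from the cited references; the function names and the use of `zlib.crc32` as the hash function are assumptions made for this sketch.

```python
import zlib
from typing import List, Sequence


def round_robin_partition(ids: Sequence[int], p: int) -> List[List[int]]:
    """Deal objects out one at a time: position i goes to process i mod p."""
    parts: List[List[int]] = [[] for _ in range(p)]
    for pos, obj_id in enumerate(ids):
        parts[pos % p].append(obj_id)
    return parts


def range_partition(ids: Sequence[int], p: int) -> List[List[int]]:
    """Cut the sequence into p contiguous ranges of near-equal length."""
    n = len(ids)
    bounds = [n * j // p for j in range(p + 1)]
    return [list(ids[bounds[j]:bounds[j + 1]]) for j in range(p)]


def hash_partition(ids: Sequence[int], p: int) -> List[List[int]]:
    """Send each object to the process selected by a hash of its ID."""
    parts: List[List[int]] = [[] for _ in range(p)]
    for obj_id in ids:
        # crc32 is only a stand-in hash chosen for this sketch.
        parts[zlib.crc32(str(obj_id).encode()) % p].append(obj_id)
    return parts
```

None of these three schemes looks at the vertex count of an object, which is the property the invention uses as a weight.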
List of references
[1] Zhao Chunyu. Research on key technologies for vector spatial data access and processing in high-performance parallel GIS [D]. Wuhan: Wuhan University, 2006.
[2] Ann Chervenak, Ian Foster, Carl Kesselman, Charles Salisbury, Steven Tuecke. The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets [J]. Journal of Network and Computer Applications, 2000, 23: 187-200.
[3] http://docs.oracle.com/cd/B28359_01/server.111/b32024/partition.htm
[4] Chengwen Liu, Hao Chen. A Hash Partition Strategy for Distributed Query Processing [C]. The 5th International Conference on Extending Database Technology (EDBT), Avignon, France, 1996, 1057: 371-387.
[5] Shahram Ghandeharizadeh, David J. DeWitt. Hybrid-range partitioning strategy: a new declustering strategy for multiprocessor database machines [C]. Proceedings of the Sixteenth International Conference on Very Large Databases, Brisbane, Australia, 1990, 481-492.
[6] Wang Yongjie, Meng Lingkui, Zhao Chunyu. Research on a massive spatial data partitioning algorithm based on Hilbert space ordering codes [J]. Journal of Wuhan University (Information Science Edition), 2007, 32(7): 650-653.
[7] Tian Guang. Research and implementation of partitioning strategies for vector spatial data in a parallel computing environment [D]. China University of Geosciences, 2011.
3. Summary of the invention
(1) Algorithm steps
The invention provides a balanced partitioning method for vector target sets for the parallel computation of topological relations. Based on the characteristics of that computation, the invention uses a count-aware balanced partitioning method to divide the target set among the processes, so that the tasks of all processes in the parallel computation are balanced (see Fig. 1).
1. Count-aware balanced partitioning of vector target sets (hereinafter "balanced partitioning")
Parallel computation of vector target topological relations (point, line and area target topological relations) is not compute-intensive. Most of its computing resources are spent on judging whether the minimum bounding rectangles of vector targets intersect, and the topological computation itself consumes only a small share. The partitioning of the vector target set should therefore focus not on the geometric complexity of the vector targets but on balancing the number of vector targets in each process.
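Since the minimum bounding rectangle (MBR) intersection test is described as the dominant cost, a minimal Python sketch of that test may help. The tuple layout `(min_x, min_y, max_x, max_y)` and the names `mbr_of` and `mbrs_intersect` are assumptions for illustration, and rectangles that merely touch are treated as intersecting here.

```python
from typing import List, Tuple

Point = Tuple[float, float]
MBR = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def mbr_of(vertices: List[Point]) -> MBR:
    """Return the minimum bounding rectangle of a list of vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def mbrs_intersect(a: MBR, b: MBR) -> bool:
    """True if two axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]
```

The test costs four comparisons per pair regardless of how many vertices each object has, which fits the argument that the object count, rather than the vertex count, drives the load.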
Given the number of processes p, balanced partitioning divides the vector target set (of size n) evenly into p parts, so that each process receives about n/p vector targets. The detailed procedure is shown in Fig. 2:
1) Let the vector target set contain $n$ objects. From the vertex count $v_i$ of each object (the numbers in Fig. 2A), compute the weight $w_i$ of each vector target according to Formula 1. Each process then uses a sorting algorithm to serialize the vector target set by object weight (Fig. 2B);
$w_i = f(v_i), \quad 1 \le i \le n$ (Formula 1)
2) Let $l$ be the number of vector targets in each allocated subset ($0 < l \le \lfloor n/(k \times p) \rfloor$, $2 \le k < n/p$, $k$ even), and let $m$ be the index of the allocation round ($0 \le m \le k$);
3) Following the serialized order, each round assigns $p$ adjacent sub-sets of $l$ vector targets each, $S_m = \{s_{m1}, s_{m2}, s_{m3}, \ldots, s_{m(p \times l)}\}$ ($p \times l$ objects in total), to different processes. If $m$ is odd, the assignment order is $s_1 \to s_{p \times l}$; if $m$ is even, the order is $s_{p \times l} \to s_1$. This scheme primarily balances the number of objects in each process, and it also roughly balances the geometric complexity (vertex count) of the vector targets. Because the set is sorted by the weights $w_i$ and odd and even rounds run in opposite orders, process $p_1$ receives $S_{mi}$ in round $m$ and $S_{(m+1)(p \times l - i)}$ in round $m+1$, while process $p_2$ receives $S_{m(p \times l - i)}$ in round $m$ and $S_{(m+1)i}$ in round $m+1$. As Formula 2 expresses, the geometric complexity assigned to each process is then approximately equal (Fig. 2C);
$S_{mi} + S_{(m+1)(p \times l - i)} \approx S_{m(p \times l - i)} + S_{(m+1)i}$ (Formula 2)
4) The final remainder, which holds fewer than $p$ subsets of size $l$ (that is, $n - m \times p \times l$ objects), is allocated evenly to the processes as in step 2), until every vector target has been assigned to a process.
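Steps 1) to 4) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names `balanced_partition`, `vertex_counts` and `weight` are hypothetical, $l$ is taken at its upper bound $\lfloor n/(k \times p) \rfloor$, and the leftover objects of step 4) are dealt out one per process starting from process 0, which is one reading of "allocated evenly".

```python
from typing import Callable, List, Sequence


def balanced_partition(
    vertex_counts: Sequence[int],
    p: int,
    k: int = 2,
    weight: Callable[[int], float] = lambda v: v,
) -> List[List[int]]:
    """Split object indices into p parts using k alternating rounds."""
    if p < 1 or k < 2 or k % 2 != 0:
        raise ValueError("p must be >= 1 and k must be an even number >= 2")
    n = len(vertex_counts)
    # Step 1: serialize the objects by weight, largest first.
    order = sorted(range(n), key=lambda i: weight(vertex_counts[i]), reverse=True)
    # Step 2: size l of the sub-set one process receives in one round.
    l = n // (k * p)
    parts: List[List[int]] = [[] for _ in range(p)]
    # Step 3: each round hands out p adjacent chunks; even rounds reverse
    # the process order so that heavy and light chunks are paired.
    for m in range(1, k + 1):
        base = (m - 1) * p * l
        for j in range(p):
            chunk = order[base + j * l: base + (j + 1) * l]
            proc = j if m % 2 == 1 else p - 1 - j
            parts[proc].extend(chunk)
    # Step 4 (assumed rule): deal the leftover objects out one at a time.
    for offset, idx in enumerate(order[k * p * l:]):
        parts[offset % p].append(idx)
    return parts
```

A call such as `balanced_partition(counts, p=4, k=2)` returns four lists of indices into `counts`. The reversal in even rounds is what Formula 2 relies on: the process that received the heaviest chunk in one round receives the lightest chunk in the next.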
(2) Beneficial effects
1. The invention applies count-aware balanced partitioning, tailored to the characteristics of the topological relation algorithm, to divide the vector target set evenly among the processes. The task load is balanced across processes, which raises parallel computing efficiency; the parallel efficiency exceeds 60%;
2. With the invention, high-performance parallel software can be developed on single-machine multi-core and many-core high-performance cluster hardware to perform efficient spatial relation analysis on massive spatial data.
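The source does not define parallel efficiency. Assuming the conventional definition (speedup divided by the number of processes), the speedup of 5.33 on 8 processes reported in the embodiment corresponds to an efficiency of roughly 67%, consistent with the 60% figure. The helper name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup: float, processes: int) -> float:
    """Conventional parallel efficiency E = S / p, with S = T_serial / T_parallel."""
    if processes < 1:
        raise ValueError("processes must be >= 1")
    return speedup / processes


# Speedup of 5.33 on 8 processes, the figure reported in the embodiment.
efficiency = parallel_efficiency(5.33, 8)
print(f"parallel efficiency: {efficiency:.1%}")
```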
4. Description of the drawings
Fig. 1: Characteristics of the parallel topological relation algorithm and the corresponding partitioning method for vector target sets
Fig. 2: Count-aware balanced partitioning of vector target sets
Fig. 3: Vector target set used in the balanced partitioning case
5. Embodiment
The following case illustrates the vector target set partitioning method for the parallel computation of topological relations.
(1) Case: count-aware balanced partitioning of a vector target set
The invention uses balanced partitioning to assign data to different processes, computes whether vector targets overlap (overlap being one kind of topological relation), and outputs the IDs of the spatial objects returned by the query. The vector target set in this case consists of 691,442 land parcels of a certain region (4,417,571 vertices in total), as shown in Fig. 3 (Fig. 3A shows the parcel set used in the case, and Fig. 3B shows the complexity of the parcels).
1) The vector target set whose spatial topological relations are computed is DataSet, containing N = 691,442 objects;
2) The weight of each vector target is $W_i = f(v_i)$ with weight function $f(v_i) = v_i$, that is, the vertex count of the vector target is its weight;
3) A standard sorting algorithm serializes the objects in DataSet by weight in descending order;
4) Given the number of processes P, balanced partitioning divides the DataSet vector target set among the processes. The data assigned to each process is a subDataSet, and each subDataSet contains about 691,442/P objects. Take P = 4 with two allocation rounds, each round assigning ⌊691,442/(2 × P)⌋ = 86,430 objects per process. In the first round, process 0 receives 86,430 vector targets (1 ≤ ID ≤ 86,430, where ID is the sequence number of the object), process 1 receives 86,430 (86,431 ≤ ID ≤ 172,860), process 2 receives 86,430 (172,861 ≤ ID ≤ 259,290), and process 3 receives 86,430 (259,291 ≤ ID ≤ 345,720). In the second round, process 0 again receives 86,430 vector targets (605,011 ≤ ID ≤ 691,440), process 1 receives 86,430 (518,581 ≤ ID ≤ 605,010), process 2 receives 86,430 (432,151 ≤ ID ≤ 518,580), and process 3 receives 86,430 (345,721 ≤ ID ≤ 432,150). Finally, the remaining 2 vector targets are assigned to process 0 and process 1. The final counts are 172,861 vector targets for process 0, 172,861 for process 1, 172,860 for process 2 and 172,860 for process 3;
5) Each process computes the spatial topological relations between the vector targets in DataSet and those in its subDataSet, and the results from all processes are collected: 53 overlapping parcels are found. Table 1 gives the results of the parallel computation of vector topological relations based on balanced partitioning. The acceleration is clear: the computing time falls as the number of processes grows, which greatly improves the efficiency of topological relation computation, and the speedup reaches 5.33 with 8 processes.
Table 1. Results of the parallel computation of vector topological relations based on balanced partitioning
[Table 1 appears as an image in the original publication: Figure BSA00000832856600061]
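The ID ranges of step 4) can be reproduced with a short Python sketch. It assumes the same reading of the method as the case itself: IDs are 1-based positions in the sorted order, each round hands every process ⌊N/(k × P)⌋ objects, even rounds reverse the process order, and the leftover objects go one each to process 0, process 1, and so on. The function name `allocation_ranges` is hypothetical.

```python
from typing import List, Tuple


def allocation_ranges(n: int, p: int, k: int = 2) -> List[List[Tuple[int, int]]]:
    """Return, for each process, the inclusive 1-based ID ranges it receives."""
    l = n // (k * p)  # objects per process per round
    ranges: List[List[Tuple[int, int]]] = [[] for _ in range(p)]
    for m in range(1, k + 1):
        base = (m - 1) * p * l
        for j in range(p):
            # Odd rounds run process 0..p-1, even rounds run p-1..0.
            proc = j if m % 2 == 1 else p - 1 - j
            ranges[proc].append((base + j * l + 1, base + (j + 1) * l))
    # Leftover IDs are dealt out one per process (assumed rule).
    for offset, obj_id in enumerate(range(k * p * l + 1, n + 1)):
        ranges[offset % p].append((obj_id, obj_id))
    return ranges


ranges = allocation_ranges(691_442, 4)
totals = [sum(hi - lo + 1 for lo, hi in r) for r in ranges]
for proc, (r, total) in enumerate(zip(ranges, totals)):
    print(proc, r, total)
```

Under these assumptions the per-process totals come out as 172,861, 172,861, 172,860 and 172,860, which sum to N = 691,442.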
The specific embodiment described above further explains the purpose, technical solution and beneficial effects of the invention. It should be understood that the above is only a specific embodiment of the invention and does not limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (1)

  1. A partitioning method for vector target sets for the parallel computation of topological relations, characterized mainly by:
    a count-aware balanced partitioning method for vector target sets for the parallel computation of topological relations;
    the count-aware balanced partitioning method according to claim 1 is characterized by comprising the following 2 features:
    1) The vertex count $v_i$ of each vector target is the parameter of its weight $w_i$ ($w_i = f(v_i)$, $1 \le i \le n$, where $n$ is the number of vector targets). The vector target set is serialized by the size of $w_i$, and subsets containing the same number of vector targets are assigned to each process, so that the task load is balanced across processes.
    2) During allocation, the vector target set $S_m = \{s_1, s_2, s_3, \ldots, s_p\}$ is assigned to different processes according to the parity of the allocation round $m$. If $m$ is odd, the assignment order is $s_1 \to s_p$ (weights from large to small, or from small to large); if $m$ is even, the assignment order is $s_p \to s_1$ (weights from small to large, or from large to small).
CN2012105863713A 2012-12-31 2012-12-31 Vector target set balance partition method aiming at topological relation parallel computation Pending CN103049322A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012105863713A CN103049322A (en) 2012-12-31 2012-12-31 Vector target set balance partition method aiming at topological relation parallel computation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012105863713A CN103049322A (en) 2012-12-31 2012-12-31 Vector target set balance partition method aiming at topological relation parallel computation

Publications (1)

Publication Number Publication Date
CN103049322A true CN103049322A (en) 2013-04-17

Family

ID=48061972

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012105863713A Pending CN103049322A (en) 2012-12-31 2012-12-31 Vector target set balance partition method aiming at topological relation parallel computation

Country Status (1)

Country Link
CN (1) CN103049322A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109062679A (en) * 2018-08-01 2018-12-21 中国科学院遥感与数字地球研究所 A kind of general division methods of vector data towards parallel processing
CN110609744A (en) * 2018-06-15 2019-12-24 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for processing computing tasks
CN113886089A (en) * 2021-10-21 2022-01-04 上海勃池信息技术有限公司 Task processing method, device, system, equipment and medium
CN114780247A (en) * 2022-05-17 2022-07-22 中国地质大学(北京) Flow application scheduling method and system with flow rate and resource sensing

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5630129A (en) * 1993-12-01 1997-05-13 Sandia Corporation Dynamic load balancing of applications
US20070250470A1 (en) * 2006-04-24 2007-10-25 Microsoft Corporation Parallelization of language-integrated collection operations
CN101944045A (en) * 2010-10-18 2011-01-12 中国人民解放军国防科学技术大学 Method for distributing parallel discrete event simulation objects based on community characteristics
CN102591622A (en) * 2011-12-20 2012-07-18 南京大学 Grid data coordinate conversion parallel method based on similarity transformation model


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YAN ZHOU等: "Hilbert Curve Based Spatial Data Declustering Method for Parallel Spatial Database", 《REMOTE SENSING, ENVIRONMENT AND TRANSPORTATION ENGINEERING (RSETE), 2012 2ND INTERNATIONAL CONFERENCE ON》 *
YANG YIZHOU等: "Research on distributed Hilbert R tree spatial index based on BIRCH clustering", 《2012 20TH INTERNATIONAL CONFERENCE ON GEOINFORMATICS》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110609744A (en) * 2018-06-15 2019-12-24 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for processing computing tasks
CN110609744B (en) * 2018-06-15 2023-06-09 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for processing computing tasks
CN109062679A (en) * 2018-08-01 2018-12-21 中国科学院遥感与数字地球研究所 A kind of general division methods of vector data towards parallel processing
CN113886089A (en) * 2021-10-21 2022-01-04 上海勃池信息技术有限公司 Task processing method, device, system, equipment and medium
CN113886089B (en) * 2021-10-21 2024-01-26 上海勃池信息技术有限公司 Task processing method, device, system, equipment and medium
CN114780247A (en) * 2022-05-17 2022-07-22 中国地质大学(北京) Flow application scheduling method and system with flow rate and resource sensing
CN114780247B (en) * 2022-05-17 2022-12-13 中国地质大学(北京) Flow application scheduling method and system with flow rate and resource sensing

Similar Documents

Publication Publication Date Title
Hu et al. Tricore: Parallel triangle counting on gpus
US10007742B2 (en) Particle flow simulation system and method
Alresheedi et al. Improved multiobjective salp swarm optimization for virtual machine placement in cloud computing
CN103279330A (en) MapReduce multiple programming model based on virtual machine GPU computation
Hu et al. Trix: Triangle counting at extreme scale
Rossi et al. Towards robust algorithms for current deposition and dynamic load-balancing in a GPU particle in cell code
CN102110079A (en) Tuning calculation method of distributed conjugate gradient method based on MPI
CN103049322A (en) Vector target set balance partition method aiming at topological relation parallel computation
Yang et al. Efficient FPGA-based graph processing with hybrid pull-push computational model
CN103150214A (en) Vector target set balanced partitioning method aiming at spatial measure and direction relation concurrent computation
Chen et al. Efficient and portable ALS matrix factorization for recommender systems
Zhang et al. A survey of parallel particle tracing algorithms in flow visualization
CN103049329A (en) High-efficiency system based on central processing unit (CPU)/many integrated core (MIC) heterogeneous system structure
CN110110158B (en) Storage space division method and system for three-dimensional grid data
CN103294639A (en) CPU+MIC mixed heterogeneous cluster system for achieving large-scale computing
Chang et al. A novel energy-aware and resource efficient virtual resource allocation strategy in IaaS cloud
CN112948643B (en) Structured grid streamline integration method based on thread parallelism
Zhang et al. Task scheduling for gpu heterogeneous cluster
Khaled et al. Parallel study of 3-D oil reservoir data visualization tool using hybrid distributed/shared-memory models
Chandrashekar et al. Performance Model of HPC Application On CPU-GPU Platform
Kumar et al. Power and data aware best fit algorithm for energy saving in cloud computing
Liu et al. A-MapCG: an adaptive MapReduce framework for GPUs
Fatima et al. A heterogeneous dynamic scheduling minimized make-span for energy and performance balancing
Papagiannis et al. Scalable runtime support for data-intensive applications on the single-chip cloud computer
Liu et al. Multi-Objective Virtual Machine Placement Algorithm Based on Improved Discrete Differential Evolution

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130417