CA2279359A1 - A method of generating attribute cardinality maps

Info

Publication number
CA2279359A1
CA2279359A1 CA002279359A CA2279359A CA2279359A1 CA 2279359 A1 CA2279359 A1 CA 2279359A1 CA 002279359 A CA002279359 A CA 002279359A CA 2279359 A CA2279359 A CA 2279359A CA 2279359 A1 CA2279359 A1 CA 2279359A1
Authority
CA
Canada
Prior art keywords
range
elements
bin
mean
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CA002279359A
Other languages
French (fr)
Other versions
CA2279359C (en)
Inventor
Basantkumar John Oommen
Murali Thiyagarajah
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CA2279359A (CA2279359C)
Priority to CA2743462A (CA2743462C)
Priority to US09/487,328 (US6865567B1)
Publication of CA2279359A1
Application granted
Publication of CA2279359C
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G06F16/24542 Plan optimisation
    • G06F16/24545 Selectivity estimation or determination
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99932 Access augmentation or optimizing

Abstract

This invention provides a novel means for creating a histogram for use in minimizing response time and resource consumption when optimizing a query in a database, and other like structures. The histogram is created by placing ordered elements into a specific range until the next element to be considered for inclusion in the range is more than a predetermined distance from the (generalized) mean value associated with the elements within the range, whereupon that next element is placed in the following range. Similarly, the following ranges are closed when the next element to be considered for inclusion in the range is greater than a predetermined distance from the (generalized) mean value associated with the elements in that range, whereupon that next element is placed in the following range. For each range, the location and size of the range are recorded with, for example, the mean value, the slope or other attribute characterizing one or more elements in the range. The invention also has applications in pattern recognition, message routing, and actuarial sciences.

Claims (64)

1. A method of generating a histogram comprising the steps of:
(a) providing a data set representing a plurality of elements and a value associated with each element, the data set having a property defining an order of the elements therein;
(b) determining at least one range, each of the at least one range having at least an element, an arithmetic mean of each range equal to the arithmetic mean of the values associated with the at least an element within said range, a specific range from the at least a range comprising a plurality of elements from the data set adjacent each other within the defined order, wherein the arithmetic mean of the specific range is within a predetermined maximum distance from a value associated with an element within the specific range, the predetermined maximum distance independent of the number of elements within the specific range and their associated values;
and, (c) for each range storing at least a value related to an estimate of a value associated with an element within the range and at least data relating to the size and location of the range.
2. A method of generating a histogram as defined in claim 1 wherein the at least a range comprises a plurality of ranges, and wherein some ranges from the plurality of ranges have a different number of elements and some ranges from the plurality of ranges have different areas, an area of each range equal to the product of the arithmetic mean of said range and the number of elements within said range.
3. A method of generating a histogram as defined in claim 2 wherein the step of determining the at least a range is performed so as to limit variance between the values associated with the elements within a same range from the at least a range in a known fashion, the limitation forming further statistical data of the histogram.
4. A method of generating a histogram as defined in claim 3 wherein a value associated with each element within a range from the at least a range is within where τ is the tolerance value used in generating the R-ACM and i is the location of the element within a histogram sector of length l.
5. A method of generating a histogram as defined in claim 3 wherein the at least a value related to an estimate of a value for an element within the range includes a value relating to the arithmetic mean, and wherein the at least data relating to the range comprises data relating to both endpoints of the range.
6. A method of generating a histogram as defined in claim 3 wherein the step of determining at least a range comprises the steps of:

(a) using a suitably programmed processor, defining a first bin as a current bin;
(b) using the suitably programmed processor, selecting an element and adding it to the current bin as the most recently added element to the current bin;
(c) selecting an element from within the data set, the selected element not within a bin and adjacent the most recently added element;
(d) determining at least a mean of the values associated with elements within the bin;
(e) when the most recently selected element differs from a mean from the at least a mean by an amount less than a predetermined amount, adding the most recently selected element to the current bin as the most recently added element to the current bin and returning to step (c);

(f) when the selected element differs from the mean from the at least a mean by an amount more than the predetermined amount, creating a new bin as the current bin and adding the selected element to the new bin as the most recently added element to the current bin and returning to step (c);
and, (g) providing data relating to each bin including data indicative of a range of elements within each bin as the determined at least a range.
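The bin-building steps of claim 6 can be sketched as follows, assuming the arithmetic mean of claim 8 and a forward scan from the first element as in claim 9. The names `Bin`, `build_bins`, and `tolerance` are illustrative and do not appear in the claims. The claims do not say what happens when the difference equals the predetermined amount exactly; this sketch treats that case as within tolerance.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    start: int    # index of the first element in the bin
    end: int      # index of the last element in the bin (inclusive)
    total: float  # running sum of the values in the bin

    @property
    def count(self) -> int:
        return self.end - self.start + 1

    @property
    def mean(self) -> float:
        return self.total / self.count


def build_bins(values, tolerance):
    """Scan values in order and start a new bin whenever the next value
    differs from the current bin's arithmetic mean by more than tolerance."""
    if not values:
        return []
    bins = [Bin(0, 0, float(values[0]))]          # steps (a) and (b)
    for i in range(1, len(values)):               # step (c): next adjacent element
        current = bins[-1]
        v = values[i]
        if abs(v - current.mean) <= tolerance:    # steps (d) and (e)
            current.end = i
            current.total += v
        else:                                     # step (f): open a new bin
            bins.append(Bin(i, i, float(v)))
    return bins                                   # step (g): range data per bin


if __name__ == "__main__":
    for b in build_bins([10, 11, 12, 30, 31, 5], tolerance=3):
        print(b.start, b.end, b.mean)
```

Each resulting bin carries its endpoints and its mean, which is the per-range data that step (c) of claim 1 calls for.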
7. A method of generating a histogram as defined in claim 6 comprising the step of: (aa) providing a value, τ, of the predetermined maximum distance.
8. A method of generating a histogram as defined in claim 6 wherein the at least a mean comprises an arithmetic mean and wherein the mean from the at least a mean is the arithmetic mean.
9. A method of generating a histogram as defined in claim 6 wherein adjacent elements are selected from a start location within the data set in order sequentially toward an end of the data set.
10. A method of generating a histogram as defined in claim 6 wherein adjacent elements are selected from a start location in an alternating fashion toward a beginning of the data set and toward an end of the data set.
11. A method of generating a histogram as defined in claim 3 wherein the step of determining at least a range comprises the steps of:

(a) selecting an element from within the data set;
(b) determining a bin with which to associate the element;
(c) when the determined bin is empty, adding the element to the bin;

(d) when the determined bin is other than empty, determining at least a mean of the values associated with elements within the determined bin;
(e) when the most recently selected element differs from a mean from the at least a mean by an amount less than a predetermined amount, adding the most recently selected element to the determined bin and returning to step (a);
(f) when the selected element differs from the mean from the at least a mean by an amount more than the predetermined amount, adding the selected element to the determined bin and dividing the determined bin into one of two bins and three bins, one of which includes the selected element and returning to step (a); and, (g) providing data relating to each bin including data indicative of a range of elements within the bin.
12. A method of generating a histogram as defined in claim 11 wherein the at least a mean comprises an arithmetic mean and wherein the mean from the at least a mean is the arithmetic mean.
13. A method of generating a histogram as defined in claim 12 comprising the step of: determining a first arithmetic mean of a first selected bin; determining a second arithmetic mean of a second selected bin adjacent the first selected bin;
comparing the first and second arithmetic means; and, when the arithmetic means are within a predetermined distance of each other, merging the first selected bin and the second selected bin to form a single merged bin including all the elements of the first selected bin and all the elements of the second selected bin.
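The merge step of claim 13 can be sketched as below. This is a minimal illustration, not the patented procedure: bins are assumed to be `(start, end, mean)` tuples with inclusive indices, `merge_close_bins` and `distance` are hypothetical names, and the left-to-right pass that compares each bin with the already-merged bin before it is a design choice the claim does not dictate.

```python
def merge_close_bins(bins, distance):
    """Merge adjacent bins whose arithmetic means lie within `distance`.

    bins: list of (start, end, mean) tuples in element order, end inclusive.
    Returns a new list of (start, end, mean) tuples.
    """
    merged = []
    for start, end, mean in bins:
        if merged and abs(merged[-1][2] - mean) <= distance:
            p_start, p_end, p_mean = merged[-1]
            p_n = p_end - p_start + 1
            n = end - start + 1
            # The merged mean is the count-weighted average of the two means.
            new_mean = (p_mean * p_n + mean * n) / (p_n + n)
            merged[-1] = (p_start, end, new_mean)
        else:
            merged.append((start, end, mean))
    return merged


if __name__ == "__main__":
    print(merge_close_bins([(0, 1, 10.0), (2, 3, 11.0), (4, 4, 50.0)], 2))
```

Because the merged bin contains all elements of both source bins, its mean is recomputed from the counts rather than averaged directly.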
14. A method of generating a histogram as defined in claim 3 wherein the step of determining at least a range comprises the steps of:
(a) using a suitably programmed processor, defining a first bin as a current bin;

(b) using the suitably programmed processor, selecting an element and adding it to the current bin as the most recently added element to the current bin;
(c) selecting elements adjacent the most recently added element(s);
(d) determining a first mean of the values associated with elements within the bin and determining a second mean of the selected elements;
(e) when the second mean differs from the first mean by an amount less than a predetermined amount, adding the most recently selected elements to the current bin as the most recently added elements and returning to step (c);
(f) when the second mean differs from the first mean by an amount more than the predetermined amount, creating a new bin as the current bin and adding at least one of the selected elements to the new bin as the most recently added element(s) to the current bin and returning to step (c);
and, (g) providing data relating to each bin including data indicative of a range of elements within the bin.
15. A method of generating a histogram as defined in claim 14 wherein the step of (f) includes the steps of:
(f1) determining a first element within the selected elements to add to the new current bin;
(f2) adding the selected element(s) before the first element to the previous current bin; and, (f3) adding the selected element(s) from and including the first element to the new current bin.
16. A method of generating a histogram as defined in claim 3 wherein the step of determining at least a range comprises the steps of:
(a) using a suitably programmed processor, defining a first bin as a current bin;

(b) using the suitably programmed processor, selecting an element and adding it to the current bin as the most recently added element to the current bin;

(c) selecting an element from within the data set, the selected element not within a bin and adjacent the most recently added element;

(d) determining the Generalized positive-2 mean of the current bin as the square-root of the (sum of the squares of the values associated with elements within the bin divided by the number of the elements within the bin);
(e) determining the Generalized negative-2 mean of the current bin as the square of the (sum of the square-roots of the values associated with elements within the bin divided by the number of the elements within the bin);
(f) when the value associated with the selected element is lower than the said Generalized positive-2 mean, determining a difference between the value associated with the selected element and the said Generalized positive-2 mean, and when the value associated with the selected element is higher than the said Generalized negative-2 mean, determining a difference between the value associated with the selected element and the said Generalized negative-2 mean;
(g) when a difference is other than greater than the predetermined amount, adding the selected element to the current bin as the most recently added element to the current bin and returning to step (c);
(h) when a difference is greater than the predetermined amount, defining a new bin as the current bin, adding the selected element to the current bin as the most recently added element to the current bin, and returning to step (c); and, (i) providing data relating to each bin including data indicative of a range of elements within each bin as the determined at least a range.
17. A method of generating a histogram as defined in claim 3 wherein the step of determining at least a range comprises the steps of:

(a) using a suitably programmed processor, defining a first bin as a current bin;
(b) using the suitably programmed processor, selecting an element and adding it to the current bin as the most recently added element to the current bin;
(c) selecting an element from within the data set, the selected element not within a bin and adjacent the most recently added element;
(d) determining the Generalized positive-k mean of the current bin for a predetermined k as the k-th root of the (sum of the k-th powers of the values associated with elements within the bin divided by the number of the elements within the bin);
(e) determining the Generalized negative-k mean of the current bin as the k-th power of the (sum of the k-th roots of the values associated with elements within the bin divided by the number of the elements within the bin);
(f) when the value associated with the selected element is lower than the said Generalized positive-k mean, determining a difference between the value associated with the selected element and the said Generalized positive-k mean, and when the value associated with the selected element is higher than the said Generalized negative-k mean, determining a difference between the value associated with the selected element and the said Generalized negative-k mean;
(g) when a difference is other than greater than the predetermined amount, adding the selected element to the current bin as the most recently added element to the current bin and returning to step (c);
(h) when a difference is greater than the predetermined amount, defining a new bin as the current bin, adding the selected element to the current bin as the most recently added element to the current bin, and returning to step (c); and, (i) providing data relating to each bin including data indicative of a range of elements within each bin as the determined at least a range.
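The two generalized means of claim 17, and the comparison in steps (f) to (h), can be sketched as follows. The function names and the `tolerance` parameter are illustrative. The sketch assumes non-negative values (such as frequency counts) so that the k-th roots are real, and it follows the claim's own wording, in which the "negative-k" mean is the k-th power of the average of k-th roots.

```python
def positive_k_mean(values, k):
    """k-th root of the average of the k-th powers (step (d))."""
    return (sum(v ** k for v in values) / len(values)) ** (1.0 / k)


def negative_k_mean(values, k):
    """k-th power of the average of the k-th roots (step (e))."""
    return (sum(v ** (1.0 / k) for v in values) / len(values)) ** k


def exceeds_tolerance(bin_values, candidate, k, tolerance):
    """Return True when the candidate should open a new bin (step (h))."""
    upper = positive_k_mean(bin_values, k)
    lower = negative_k_mean(bin_values, k)
    diffs = []
    if candidate < upper:                  # step (f), first branch
        diffs.append(upper - candidate)
    if candidate > lower:                  # step (f), second branch
        diffs.append(candidate - lower)
    return any(d > tolerance for d in diffs)


if __name__ == "__main__":
    print(positive_k_mean([1, 9], 2), negative_k_mean([1, 9], 2))
    print(exceeds_tolerance([1, 9], 5, 2, 2.0))
    print(exceeds_tolerance([1, 9], 12, 2, 2.0))
```

Setting `k = 2` gives the positive-2 and negative-2 means described in claim 16.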
18. A method of generating a histogram as defined in claim 3 wherein the step of determining at least a range comprises the steps of:

(a) using a suitably programmed processor, defining a first bin as a current bin;
(b) using the suitably programmed processor, selecting an element and adding it to the current bin as the most recently added element to the current bin;
(c) selecting an element from within the data set, the selected element not within a bin and adjacent the most recently added element;
(d) determining a current largest value as the largest of the values associated with elements within the bin;
(e) determining a current smallest value as the smallest of the values associated with elements within the bin;
(f) when the value associated with the selected element is lower than the current largest value, determining a difference between the value associated with the selected element and the current largest value, and when the value associated with the selected element is higher than the current smallest value, determining a difference between the value associated with the selected element and the current smallest value;
(g) when a difference is other than greater than the predetermined amount, adding the selected element to the current bin as the most recently added element to the current bin and returning to step (c);
(h) when a difference is greater than the predetermined amount, defining a new bin as the current bin, adding the selected element to the current bin as the most recently added element to the current bin, and returning to step (c);
and, (i) providing data relating to each bin including data indicative of a range of elements within each bin as the determined at least a range.
19. A method of generating a histogram as defined in claim 18 wherein the predetermined amount is equal to 2τ.
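The largest/smallest-value test of claims 18 and 19 amounts to keeping each bin's spread (largest minus smallest value) at or below the predetermined amount. A minimal sketch, assuming that amount is twice a tolerance as in claim 19; `build_bins_minmax` and `tau` are hypothetical names introduced here for illustration.

```python
def build_bins_minmax(values, tau):
    """Group adjacent values so that the spread within a bin stays <= 2 * tau.

    Returns a list of (start, end) index pairs, end inclusive.
    """
    limit = 2 * tau
    bins = []
    lo = hi = None                      # current smallest and largest value
    for i, v in enumerate(values):
        # Comparing v against the current largest and smallest values is
        # equivalent to checking the spread the bin would have with v added.
        if bins and max(hi, v) - min(lo, v) <= limit:
            bins[-1][1] = i
            lo, hi = min(lo, v), max(hi, v)
        else:
            bins.append([i, i])         # open a new bin with v
            lo = hi = v
    return [tuple(b) for b in bins]


if __name__ == "__main__":
    print(build_bins_minmax([5, 6, 8, 9, 20, 21], tau=2))
```

With this rule every value in a bin lies within `2 * tau` of every other value in the same bin, which is the property recited in claim 34.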
20. A method of generating a histogram as defined in claim 1 wherein the step of determining at least a range comprises the steps of:

(a) selecting a group of elements within the data set and adjacent one another within the ordering;

(b) determining a mean of the values associated with each selected element within the selected group of elements;

(c) comparing a value associated with each selected element in the group to the mean value to determine a difference;

(d) when a value is different from the mean by more than a predetermined amount, returning to step (a); and, (e) when all values are different from the mean by less than or equal to the predetermined amount, creating a bin including the selected group of elements and returning to step (a).
21. A method of generating a histogram as defined in claim 20 comprising the steps of (f1) selecting an element adjacent the bin including the selected group of elements, the selected element other than an element within a bin;
(f2) determining a mean of the values associated with each element within the bin and the selected element; and, (f3) when the value of the selected element differs from the mean by less than or equal to the predetermined amount, adding the selected element to the bin and returning to step (f1).
22. A method as defined in claim 1 comprising the step of: providing the mean associated with the range as an estimate of a value associated with an element within said range.
23. A method as defined in claim 22 comprising the step of: estimating a reliability of the estimated value.
24. A method as defined in claim 23 wherein the estimated value is used for estimating the computational efficiency of a search within a database.
25. A method as defined in claim 1 comprising the step of: estimating a value associated with a selected range of elements as a sum of (products of (the arithmetic mean for each range and a number of elements within both the range and the selected range)).
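The estimate in claim 25 is a sum, over the stored ranges, of each range's mean multiplied by the number of elements it shares with the selected range. A minimal sketch, assuming each stored range is a `(start, end, mean)` tuple with inclusive indices; the function name and tuple layout are illustrative only.

```python
def estimate_range_sum(bins, query_start, query_end):
    """Estimate the total value over elements query_start..query_end.

    bins: list of (start, end, mean) tuples, indices inclusive.
    """
    total = 0.0
    for start, end, mean in bins:
        # Number of elements that fall in both the bin and the query range.
        overlap = min(end, query_end) - max(start, query_start) + 1
        if overlap > 0:
            total += mean * overlap
    return total


if __name__ == "__main__":
    stored = [(0, 2, 11.0), (3, 4, 30.5), (5, 5, 5.0)]
    print(estimate_range_sum(stored, 1, 3))
```

For the values `[10, 11, 12, 30, 31, 5]`, the true sum over positions 1 to 3 is 53, while the estimate from the three stored means is 52.5.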
26. A method as defined in claim 1 wherein a plurality of ranges are determined and the determined ranges are used for dividing the elements into groups having similar statistical properties for use in actuarial calculations.
27. A method as defined in claim 1 comprising the steps of: determining a routing table in dependence upon the histogram; and, estimating a value for use in network routing in dependence upon the routing table.
28. A method as defined in claim 22 comprising the step of using the estimate for determining an approach to searching within a plurality of different databases given a predetermined limited time for conducting the search, wherein the approach is selected to approximately maximise the probability of successfully completing the search.
29. A method as defined in claim 1 comprising the step of improving a discriminant function used in a process of pattern recognition in dependence upon the histogram.
30. A method of generating a histogram comprising the steps of:

(a) providing a data set representing a plurality of elements and a value associated with each element, the data set having a property defining an order of the elements therein;
(b) determining a plurality of ranges, each of the plurality of ranges having at least an element, a known statistical correlation existing between values associated with elements in a same range, some ranges from the at least a range comprising a plurality of elements from the data set adjacent each other within the defined order, the statistical correlation for those ranges indicative of a maximum error between an estimated value associated with an element within the range and the value associated with the element, the maximum error other than (the total area of the range minus the estimated value), wherein an area of each range is equal to the product of the arithmetic mean of said range and the number of elements within said range; and, (c) for each range storing at least a value related to an estimate of a value associated with an element within the range and at least data relating to the size and location of the range.
31. A method of generating a histogram as defined in claim 30 wherein a range from the some ranges define a range of values associated with elements within the range, and wherein the range of values has a maximum upper limit and a minimum lower limit, the lower limit other than zero and the upper limit other than the area of the range.
32. A method of generating a histogram as defined in claim 31 wherein a value associated with each element within a range from the some ranges is within a known maximum error of an estimated value for that element and wherein the maximum error is different for some elements in a same range than for others.
33. A method of generating a histogram comprising the steps of:

(a) providing a data set representing a plurality of elements and a value associated with each element, the data set having a property defining an order of the elements therein;

(b) determining at least one range having a length such that the value associated with at least one element within the range is within a predetermined maximum distance of at least one other element within the range; and, (c) for each range storing at least a value related to an estimate of a value associated with an element within the range and at least data relating to the size and location of the range.
34. A method of generating a histogram comprising the steps of:

(a) providing a data set representing a plurality of elements and a value associated with each element, the data set having a property defining an order of the elements therein;

(b) determining at least one range having a length such that the value associated with every element within the range is within a predetermined maximum distance of every other element within the range; and, (c) for each range storing at least a value related to an estimate of a value associated with an element within the range and at least data relating to the size and location of the range.
35. An article of manufacture comprising a computer usable medium having data determinative of the following:

(a) computer readable program code embodied therein for providing a data set representing a plurality of elements and a value associated with each element, the data set having a property defining an order of the elements therein;
(b) computer readable program code embodied therein for determining at least one range, each of the at least one range having at least an element, an arithmetic mean of each range equal to the arithmetic mean of the values associated with the at least an element within said range, a specific range from the at least a range comprising a plurality of elements from the data set adjacent each other within the defined order, wherein the arithmetic mean of the specific range is within a predetermined maximum distance from a value associated with an element within the specific range, the predetermined maximum distance independent of the number of elements within the specific range and their associated values;
and, (c) computer readable program code embodied therein for storing for each range at least a value related to the mean and at least data relating to the size and location of the range.
36. A method of generating a histogram comprising the steps of:
(a) providing a data set representing a plurality of elements and a value associated with each element, the data set having a property defining an order of the elements therein;
(b) determining a range within the data set, the range comprising a plurality of elements from the data set and adjacent within the order; and, (c) storing a plurality of values indicative of a straight line defining an approximate upper boundary of the values associated with each element within the range, the straight line indicating different values for different elements within the range.
37. A method of generating a histogram as defined in claim 36, wherein the provided range has a mean equal to the mean of the values associated with the at least an element within said range and an area of the range equal to the product of the mean of said range and the number of elements within said range; and comprising the steps of:
(d) providing a value associated with specific elements within the range; and, (e) determining a straight line approximating the determined value for each of the specific elements and having an area therebelow determined according to (the value of the straight line at the first element in the range + the value of the straight line at the last element in the range) divided by 2 all times the number of elements within the range, the area below the straight line approximately equal to the area of the range.
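The area condition in claim 37 can be illustrated with a short sketch. If the area under the line is (first value + last value) / 2 times the number of elements, and that area must equal the mean times the number of elements, then fixing the line's value at the first element determines its value at the last element as 2 × mean − first value. The function names below are hypothetical, and choosing the first-element value as the free parameter is an assumption of this sketch.

```python
def fit_line_matching_area(values, first_value):
    """Return (first_value, last_value) of a straight line whose area
    (first + last) / 2 * n equals the area of the range, mean * n."""
    n = len(values)
    if n < 2:
        raise ValueError("a range needs at least two elements")
    mean = sum(values) / n
    last_value = 2 * mean - first_value
    return first_value, last_value


def line_value(first_value, last_value, n, i):
    """Value of the line at position i (0-based) in a range of n elements."""
    return first_value + (last_value - first_value) * i / (n - 1)


if __name__ == "__main__":
    data = [10, 8, 7, 3]
    a, b = fit_line_matching_area(data, first_value=10)
    print(a, b, [line_value(a, b, len(data), i) for i in range(len(data))])
```

Summing the line's values at the evenly spaced element positions reproduces the same area, so estimates taken from the line preserve the range total.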
38. A method of generating a histogram as defined in claim 37, comprising the steps of:
(f) providing a second range within the data set, the second range adjacent the first range and comprising a plurality of elements from the data set and adjacent within the order, the provided second range having a mean equal to the mean of the values associated with the at least an element within said range and an area of the range equal to the product of the mean of said range and the number of elements within said range;

(g) providing a value associated with specific elements within the second range;
(h) determining a second straight line approximating the determined value for each of the specific elements within the second range and having an area therebelow determined according to (the value of the second straight line at the first element in the second range + the value of the second straight line at the last element in the second range) divided by 2 all times the number of elements within the second range, the area below the straight line approximately equal to the area of the second range;
and, storing a plurality of values indicative of the second straight line.
39. A method of generating a histogram as defined in claim 38, wherein the step of determining the straight line and of determining the second straight line are performed such that the adjacent endpoints of the first straight line and the second straight line are a same point.
40. A method of generating a histogram as defined in claim 39, wherein the step of determining the second straight line is performed in dependence upon the endpoint of the first straight line adjacent the second range.
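Claims 39 and 40 describe adjacent lines that share an endpoint, with each later line determined from the endpoint of the one before it. One way to sketch this, under the area formula of claim 37, is to start each line where the previous one ended and solve for the far end from the range mean. The name `chain_lines` and the choice to propagate from left to right are assumptions of this sketch, and no clamping of negative endpoint values is attempted.

```python
def chain_lines(range_means, first_value):
    """Return one (start_value, end_value) pair per range.

    Each line starts at the previous line's end value and satisfies
    (start_value + end_value) / 2 == mean of its range.
    """
    lines = []
    start = first_value
    for mean in range_means:
        end = 2 * mean - start      # area condition solved for the far endpoint
        lines.append((start, end))
        start = end                 # shared endpoint with the next range
    return lines


if __name__ == "__main__":
    print(chain_lines([7.0, 5.0, 6.0], first_value=10.0))
```

Storing only the boundary values then suffices to reconstruct every line, since each interior boundary serves two adjacent ranges.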
41. A method of generating a histogram as defined in claim 37, wherein the step of determining a range is performed so as to limit variance between values associated with elements in the range and the straight line in a known fashion, the limitation forming further statistical data of the histogram.
42. A method of generating a histogram as defined in claim 41, wherein the step of determining a range is performed so as to limit average error between some values associated with elements in the range and the straight line in a known fashion.
43. A method of generating a histogram as defined in claim 41, wherein the step of determining a range is performed so as to limit least squared error between some values associated with elements in the range and the straight line in a known fashion.
44. A method of generating a histogram as defined in claim 36, wherein the plurality of values are indicative of a range beginning, a range ending, a value at the range beginning and a value at the range ending.
45. A method of generating a histogram as defined in claim 44, wherein the plurality of values includes data for determining a straight line approximating the values associated with elements within the range, and differing therefrom by an amount less than a known amount less than each associated value.
46. A method of generating a histogram comprising the steps of:
(a) providing a data set representing a plurality of elements and a value associated with each element, the data set having a property defining an order of the elements therein;
(b) providing a range within the data set, the range comprising a plurality of elements from the data set and adjacent within the order;
(c) determining a straight line indicating different values for different elements within the range; and, (d) storing a plurality of values indicative of a straight line defining an approximate upper boundary of the values associated with each element within the range, the straight line indicating different values for different elements within the range.
47. A method of generating a histogram as defined in claim 46, wherein the provided range has a mean equal to the mean of the values associated with the at least an element within said range and an area of the range equal to the product of the mean of said range and the number of elements within said range; and wherein the step (c) comprises the steps of:
(c1) providing a value associated with specific elements within the range;
and, (c2) determining a straight line approximating the determined value for each of the specific elements and having an area therebelow determined according to (the value of the straight line at the first element in the range + the value of the straight line at the last element in the range) divided by 2 all times the number of elements within the range, the area below the straight line approximately equal to the area of the range.
48. A method of generating a histogram as defined in claim 47, comprising the steps of:
(e) providing a second range within the data set, the second range adjacent the first range and comprising a plurality of elements from the data set and adjacent within the order, the provided second range having a mean equal to the mean of the values associated with the at least an element within said range and an area of the range equal to the product of the mean of said range and the number of elements within said range;
(f) providing a value associated with specific elements within the second range;
(g) determining a second straight line approximating the determined value for each of the specific elements within the second range and having an area therebelow determined according to (the value of the second straight line at the first element in the second range + the value of the second straight line at the last element in the second range) divided by 2 all times the number of elements within the second range, the area below the straight line approximately equal to the area of the second range; and,
storing a plurality of values indicative of the second straight line.
49. A method of generating a histogram as defined in claim 48, wherein the step of determining the straight line and of determining the second straight line are performed such that the adjacent endpoints of the first straight line and the second straight line are a same point.
50. A method of generating a histogram as defined in claim 49, wherein the step of determining the second straight line is performed in dependence upon the endpoint of the first straight line adjacent the second range.
51. A method of generating a histogram as defined in claim 47, wherein the step of determining the straight line is performed so as to limit variance between values associated with elements in the range and the straight line in a known fashion, the limitation forming further statistical data of the histogram.
52. A method of generating a histogram as defined in claim 51, wherein the step of determining the straight line is performed so as to limit average error between some values associated with elements in the range and the straight line in a known fashion.
53. A method of generating a histogram as defined in claim 51, wherein the step of determining the straight line is performed so as to limit least squared error between some values associated with elements in the range and the straight line in a known fashion.
54. A method as defined in claim 36, comprising the step of estimating a value associated with an element based on the location of the straight line at the element.
55. A method as defined in claim 54, comprising the step of estimating a reliability of the estimated value.
56. A method as defined in claim 46, comprising the step of estimating a value associated with an element based on the location of the straight line at the element.
57. A method as defined in claim 56, comprising the step of estimating a reliability of the estimated value.
58. A method as defined in claim 56, comprising the step of using the estimated value, estimating the computational efficiency of a search within a database.
59. A method as defined in claim 46, comprising the steps of determining a routing table in dependence upon the histogram; and, determining an estimate of a value within the routing table for determining a network routing.
60. A method as defined in claim 56, comprising the step of: using the estimate for determining an approach to searching within a plurality of different databases given a predetermined limited time for conducting the search, wherein the approach is determined to approximately maximise the probability of successfully completing the search.
61. A method as defined in claim 46, comprising the step of: improving the discriminant function used in a process of pattern recognition in dependence upon the histogram.
62. A method as defined in claim 36, comprising the step of estimating a value associated with a selected range of elements as a sum of areas below the straight lines for portions of ranges within the selected range.
63. A method as defined in claim 46, comprising the step of estimating a value associated with a selected range of elements as a sum of areas below the straight lines for portions of ranges within the selected range.
64. An article of manufacture comprising a computer usable medium having data determinative of the following:
(a) computer readable program code embodied therein for providing a data set representing a plurality of elements and a value associated with each element, the data set having a property defining an order of the elements therein;
(b) computer readable program code embodied therein for providing a range within the data set, the range comprising a plurality of elements from the data set and adjacent within the order;
(c) computer readable program code embodied therein for determining a straight line indicating different values for different elements within the range; and,
(d) computer readable program code embodied therein for storing a plurality of values indicative of a straight line defining an approximate upper boundary of the values associated with each element within the range, the straight line indicating different values for different elements within the range.
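By way of illustration only, the area relation recited in claim 47 and the estimation steps of claims 56 and 63 may be sketched as follows. The sketch is not the claimed implementation: the names LineSegment, fit_segment and estimate_range are hypothetical, and it is assumed that each range is a run of consecutive integer indices, that the line is fixed by its heights at the first and last elements, and that the starting height defaults to the first element's own value. Under those assumptions, choosing the end height as twice the mean minus the start height makes (start + end) / 2 times the number of elements equal to the sum of the values in the range.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class LineSegment:
    """Values stored for one range: its bounds and the line heights there."""
    first: int    # index of the first element in the range
    last: int     # index of the last element in the range (inclusive)
    start: float  # height of the line at the first element
    end: float    # height of the line at the last element

    @property
    def count(self) -> int:
        return self.last - self.first + 1

    def value_at(self, index: int) -> float:
        """Estimate one element's value from the line's height at that element."""
        if not self.first <= index <= self.last:
            raise IndexError("index lies outside this range")
        if self.count == 1:
            return self.start
        t = (index - self.first) / (self.count - 1)
        return self.start + t * (self.end - self.start)

    def area(self, lo: int, hi: int) -> float:
        """Area below the line for elements lo..hi: (first + last) / 2 * count."""
        return (self.value_at(lo) + self.value_at(hi)) / 2 * (hi - lo + 1)


def fit_segment(values: Sequence[float], first: int, last: int,
                start: Optional[float] = None) -> LineSegment:
    """Fit a line whose area equals mean * count for values[first..last].

    Assumption: when no start height is supplied, the first element's own
    value is used. The end height then follows from (start + end) / 2 = mean.
    """
    chunk = values[first:last + 1]
    mean = sum(chunk) / len(chunk)
    if start is None:
        start = chunk[0]
    return LineSegment(first, last, float(start), 2 * mean - start)


def estimate_range(segments: List[LineSegment], lo: int, hi: int) -> float:
    """Sum the areas below the lines for the portions of ranges inside lo..hi."""
    total = 0.0
    for seg in segments:
        a, b = max(lo, seg.first), min(hi, seg.last)
        if a <= b:
            total += seg.area(a, b)
    return total


if __name__ == "__main__":
    values = [4, 9, 2, 7, 3, 5, 7, 9]
    segments = [fit_segment(values, 0, 3), fit_segment(values, 4, 7)]
    print(segments[1].value_at(5))          # estimate for a single element
    print(estimate_range(segments, 2, 5))   # estimate for a selected range
    print(estimate_range(segments, 0, 7), sum(values))
```

In this sketch an estimate over a whole range reproduces the true total by construction, whereas an estimate over part of a range is only as good as the linear approximation within it. A start height taken from the end of the preceding line could be passed as the start argument when adjacent lines are to share a height, as contemplated in claims 49 and 50; note that the end height computed this way is not constrained to be non-negative.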
CA2279359A 1999-07-30 1999-07-30 A method of generating attribute cardinality maps Expired - Fee Related CA2279359C (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CA2279359A CA2279359C (en) 1999-07-30 1999-07-30 A method of generating attribute cardinality maps
CA2743462A CA2743462C (en) 1999-07-30 1999-07-30 A method of generating attribute cardinality maps
US09/487,328 US6865567B1 (en) 1999-07-30 2000-01-19 Method of generating attribute cardinality maps

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CA2279359A CA2279359C (en) 1999-07-30 1999-07-30 A method of generating attribute cardinality maps

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CA2743462A Division CA2743462C (en) 1999-07-30 1999-07-30 A method of generating attribute cardinality maps

Publications (2)

Publication Number Publication Date
CA2279359A1 (en) 2001-01-30
CA2279359C (en) 2012-10-23

Family

ID=4163900

Family Applications (2)

Application Number Title Priority Date Filing Date
CA2279359A Expired - Fee Related CA2279359C (en) 1999-07-30 1999-07-30 A method of generating attribute cardinality maps
CA2743462A Expired - Fee Related CA2743462C (en) 1999-07-30 1999-07-30 A method of generating attribute cardinality maps

Family Applications After (1)

Application Number Title Priority Date Filing Date
CA2743462A Expired - Fee Related CA2743462C (en) 1999-07-30 1999-07-30 A method of generating attribute cardinality maps

Country Status (2)

Country Link
US (1) US6865567B1 (en)
CA (2) CA2279359C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112837250A (en) * 2021-01-27 2021-05-25 武汉华中数控股份有限公司 Infrared image self-adaptive enhancement method based on generalized histogram equalization

Families Citing this family (139)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6532458B1 (en) * 1999-03-15 2003-03-11 Microsoft Corporation Sampling for database systems
JP4563558B2 (en) * 2000-07-31 2010-10-13 株式会社ターボデータラボラトリー Data compiling method and storage medium storing compiling method
US7085769B1 (en) * 2001-04-26 2006-08-01 Ncr Corporation Method and apparatus for performing hash join
US20030033582A1 (en) * 2001-05-09 2003-02-13 Wavemarket, Inc. Representations for estimating distance
US6732085B1 (en) * 2001-05-31 2004-05-04 Oracle International Corporation Method and system for sample size determination for database optimizers
US6907422B1 (en) * 2001-12-18 2005-06-14 Siebel Systems, Inc. Method and system for access and display of data from large data sets
JP2003203076A (en) * 2001-12-28 2003-07-18 Celestar Lexico-Sciences Inc Knowledge searching device and method, program and recording medium
GB0205000D0 (en) * 2002-03-04 2002-04-17 Isis Innovation Unsupervised data segmentation
US7174344B2 (en) * 2002-05-10 2007-02-06 Oracle International Corporation Orthogonal partitioning clustering
US7707145B2 (en) * 2002-07-09 2010-04-27 Gerald Mischke Method for control, analysis and simulation of research, development, manufacturing and distribution processes
US7047230B2 (en) * 2002-09-09 2006-05-16 Lucent Technologies Inc. Distinct sampling system and a method of distinct sampling for optimizing distinct value query estimates
US20040093413A1 (en) * 2002-11-06 2004-05-13 Bean Timothy E. Selecting and managing time specified segments from a large continuous capture of network data
US7031958B2 (en) * 2003-02-06 2006-04-18 International Business Machines Corporation Patterned based query optimization
WO2004086185A2 (en) * 2003-03-19 2004-10-07 Unisys Corporation Rules-based deployment of computing components
CA2427209A1 (en) * 2003-04-30 2004-10-30 Ibm Canada Limited - Ibm Canada Limitee Optimization of queries on views defined by conditional expressions having mutually exclusive conditions
US7299226B2 (en) * 2003-06-19 2007-11-20 Microsoft Corporation Cardinality estimation of joins
US7149735B2 (en) * 2003-06-24 2006-12-12 Microsoft Corporation String predicate selectivity estimation
US7155585B2 (en) * 2003-08-01 2006-12-26 Falconstor Software, Inc. Method and system for synchronizing storage system data
EP1510932A1 (en) * 2003-08-27 2005-03-02 Sap Ag Computer implemented method and according computer program product for storing data sets in and retrieving data sets from a data storage system
US7739263B2 (en) * 2003-09-06 2010-06-15 Oracle International Corporation Global hints
US7260563B1 (en) * 2003-10-08 2007-08-21 Ncr Corp. Efficient costing for inclusion merge join
US7167848B2 (en) * 2003-11-07 2007-01-23 Microsoft Corporation Generating a hierarchical plain-text execution plan from a database query
US20050223019A1 (en) * 2004-03-31 2005-10-06 Microsoft Corporation Block-level sampling in statistics estimation
US20060005121A1 (en) * 2004-06-30 2006-01-05 Microsoft Corporation Discretization of dimension attributes using data mining techniques
US7283990B2 (en) * 2004-07-27 2007-10-16 Xerox Corporation Method and system for managing resources for multi-service jobs based on location
US20060074875A1 (en) * 2004-09-30 2006-04-06 International Business Machines Corporation Method and apparatus for predicting relative selectivity of database query conditions using respective cardinalities associated with different subsets of database records
US8046354B2 (en) * 2004-09-30 2011-10-25 International Business Machines Corporation Method and apparatus for re-evaluating execution strategy for a database query
US20060116983A1 (en) * 2004-11-30 2006-06-01 International Business Machines Corporation System and method for ordering query results
US7359922B2 (en) * 2004-12-22 2008-04-15 Ianywhere Solutions, Inc. Database system and methodology for generalized order optimization
US7716162B2 (en) * 2004-12-30 2010-05-11 Google Inc. Classification of ambiguous geographic references
US8935273B2 (en) * 2005-06-23 2015-01-13 International Business Machines Corporation Method of processing and decomposing a multidimensional query against a relational data source
US7630977B2 (en) * 2005-06-29 2009-12-08 Xerox Corporation Categorization including dependencies between different category systems
CA2519021A1 (en) * 2005-09-13 2007-03-13 Cognos Incorporated System and method of providing date, arithmetic, and other functions for olap sources
CA2519001A1 (en) * 2005-09-13 2007-03-13 Cognos Incorporated System and method of data agnostic business intelligence query
US7570821B2 (en) * 2005-11-02 2009-08-04 Kitakyushu Foundation For The Advancement Of Industry, Science And Technology Apparatus and method for image coding
US7685098B2 (en) * 2005-12-08 2010-03-23 International Business Machines Corporation Estimating the size of a join by generating and combining partial join estimates
US7185004B1 (en) 2005-12-09 2007-02-27 International Business Machines Corporation System and method for reverse routing materialized query tables in a database
US8732138B2 (en) * 2005-12-21 2014-05-20 Sap Ag Determination of database statistics using application logic
US7565266B2 (en) * 2006-02-14 2009-07-21 Seagate Technology, Llc Web-based system of product performance assessment and quality control using adaptive PDF fitting
US20070208696A1 (en) * 2006-03-03 2007-09-06 Louis Burger Evaluating materialized views in a database system
US7873628B2 (en) * 2006-03-23 2011-01-18 Oracle International Corporation Discovering functional dependencies by sampling relations
US7478083B2 (en) * 2006-04-03 2009-01-13 International Business Machines Corporation Method and system for estimating cardinality in a database system
US7702699B2 (en) * 2006-05-31 2010-04-20 Oracle America, Inc. Dynamic data stream histograms for large ranges
US7395270B2 (en) * 2006-06-26 2008-07-01 International Business Machines Corporation Classification-based method and apparatus for string selectivity estimation
US7962499B2 (en) 2006-08-18 2011-06-14 Falconstor, Inc. System and method for identifying and mitigating redundancies in stored data
US8694524B1 (en) * 2006-08-28 2014-04-08 Teradata Us, Inc. Parsing a query
US20080065616A1 (en) * 2006-09-13 2008-03-13 Brown Abby H Metadata integration tool, systems and methods for managing enterprise metadata for the runtime environment
US20080195578A1 (en) * 2007-02-09 2008-08-14 Fabian Hueske Automatically determining optimization frequencies of queries with parameter markers
JP5063151B2 (en) * 2007-03-19 2012-10-31 株式会社リコー Information search system and information search method
US20080288527A1 (en) * 2007-05-16 2008-11-20 Yahoo! Inc. User interface for graphically representing groups of data
US8122056B2 (en) 2007-05-17 2012-02-21 Yahoo! Inc. Interactive aggregation of data on a scatter plot
US7739229B2 (en) 2007-05-22 2010-06-15 Yahoo! Inc. Exporting aggregated and un-aggregated data
US7756900B2 (en) * 2007-05-22 2010-07-13 Yahoo!, Inc. Visual interface to indicate custom binning of items
US7668952B2 (en) * 2007-08-27 2010-02-23 International Business Machines Corporation Apparatus, system, and method for controlling a processing system
US7774336B2 (en) * 2007-09-10 2010-08-10 International Business Machines Corporation Adaptively reordering joins during query execution
JP5018487B2 (en) * 2008-01-14 2012-09-05 富士通株式会社 Multi-objective optimization design support apparatus, method, and program considering manufacturing variations
JP5003499B2 (en) * 2008-01-14 2012-08-15 富士通株式会社 Multi-objective optimization design support apparatus, method, and program
US20090182538A1 (en) * 2008-01-14 2009-07-16 Fujitsu Limited Multi-objective optimum design support device using mathematical process technique, its method and program
US7987177B2 (en) * 2008-01-30 2011-07-26 International Business Machines Corporation Method for estimating the number of distinct values in a partitioned dataset
US20090307187A1 (en) * 2008-02-28 2009-12-10 Amir Averbuch Tree automata based methods for obtaining answers to queries of semi-structured data stored in a database environment
US7987195B1 (en) 2008-04-08 2011-07-26 Google Inc. Dynamic determination of location-identifying search phrases
CN102084363B (en) * 2008-07-03 2014-11-12 加利福尼亚大学董事会 A method for efficiently supporting interactive, fuzzy search on structured data
US9189523B2 (en) * 2008-07-05 2015-11-17 Hewlett-Packard Development Company, L.P. Predicting performance of multiple queries executing in a database
US9910892B2 (en) 2008-07-05 2018-03-06 Hewlett Packard Enterprise Development Lp Managing execution of database queries
US8473327B2 (en) * 2008-10-21 2013-06-25 International Business Machines Corporation Target marketing method and system
US8195496B2 (en) * 2008-11-26 2012-06-05 Sap Aktiengesellschaft Combining multiple objective functions in algorithmic problem solving
US8214352B2 (en) * 2008-11-26 2012-07-03 Hewlett-Packard Development Company Modular query optimizer
JP5163472B2 (en) * 2008-12-17 2013-03-13 富士通株式会社 Design support apparatus, method, and program for dividing and modeling parameter space
US20100287015A1 (en) * 2009-05-11 2010-11-11 Grace Au Method for determining the cost of evaluating conditions
US8117224B2 (en) * 2009-06-23 2012-02-14 International Business Machines Corporation Accuracy measurement of database search algorithms
US20110184934A1 (en) * 2010-01-28 2011-07-28 Choudur Lakshminarayan Wavelet compression with bootstrap sampling
US8812484B2 (en) 2010-03-30 2014-08-19 Hewlett-Packard Development Company, L.P. System and method for outer joins on a parallel database management system
US8650218B2 (en) * 2010-05-20 2014-02-11 International Business Machines Corporation Dynamic self configuring overlays
US9785904B2 (en) * 2010-05-25 2017-10-10 Accenture Global Services Limited Methods and systems for demonstrating and applying productivity gains
US8356027B2 (en) * 2010-10-07 2013-01-15 Sap Ag Hybrid query execution plan generation and cost model evaluation
US20120136879A1 (en) * 2010-11-29 2012-05-31 Eric Williamson Systems and methods for filtering interpolated input data based on user-supplied or other approximation constraints
US8229917B1 (en) * 2011-02-24 2012-07-24 International Business Machines Corporation Database query optimization using clustering data mining
US9208462B2 (en) * 2011-12-21 2015-12-08 Mu Sigma Business Solutions Pvt. Ltd. System and method for generating a marketing-mix solution
US20130212085A1 (en) * 2012-02-09 2013-08-15 Ianywhere Solutions, Inc. Parallelizing Query Optimization
US9087361B2 (en) 2012-06-06 2015-07-21 Addepar, Inc. Graph traversal for generating table views
US9015073B2 (en) * 2012-06-06 2015-04-21 Addepar, Inc. Controlled creation of reports from table views
US9411853B1 (en) * 2012-08-03 2016-08-09 Healthstudio, LLC In-memory aggregation system and method of multidimensional data processing for enhancing speed and scalability
US9208198B2 (en) 2012-10-17 2015-12-08 International Business Machines Corporation Technique for factoring uncertainty into cost-based query optimization
US8972378B2 (en) * 2012-10-22 2015-03-03 Microsoft Corporation Formulating global statistics for distributed databases
GB2508223A (en) * 2012-11-26 2014-05-28 Ibm Estimating the size of a joined table in a database
US9465826B2 (en) * 2012-11-27 2016-10-11 Hewlett Packard Enterprise Development Lp Estimating unique entry counts using a counting bloom filter
GB2508603A (en) 2012-12-04 2014-06-11 Ibm Optimizing the order of execution of multiple join operations
US9105062B2 (en) 2012-12-13 2015-08-11 Addepar, Inc. Transaction effects
US9135300B1 (en) * 2012-12-20 2015-09-15 Emc Corporation Efficient sampling with replacement
US10642918B2 (en) * 2013-03-15 2020-05-05 University Of Florida Research Foundation, Incorporated Efficient publish/subscribe systems
US9870415B2 (en) * 2013-09-18 2018-01-16 Quintiles Ims Incorporated System and method for fast query response
US9785645B1 (en) * 2013-09-24 2017-10-10 EMC IP Holding Company LLC Database migration management
US9361339B2 (en) 2013-11-26 2016-06-07 Sap Se Methods and systems for constructing q, θ-optimal histogram buckets
US10223410B2 (en) * 2014-01-06 2019-03-05 Cisco Technology, Inc. Method and system for acquisition, normalization, matching, and enrichment of data
US9953074B2 (en) * 2014-01-31 2018-04-24 Sap Se Safe synchronization of parallel data operator trees
US9842152B2 (en) * 2014-02-19 2017-12-12 Snowflake Computing, Inc. Transparent discovery of semi-structured data schema
US9792328B2 (en) 2014-03-13 2017-10-17 Sybase, Inc. Splitting of a join operation to allow parallelization
US9836505B2 (en) 2014-03-13 2017-12-05 Sybase, Inc. Star and snowflake join query performance
GB201409214D0 (en) * 2014-05-23 2014-07-09 Ibm A method and system for processing a data set
US10007644B2 (en) * 2014-06-17 2018-06-26 Sap Se Data analytic consistency of visual discoveries in sample datasets
US10671917B1 (en) 2014-07-23 2020-06-02 Hrl Laboratories, Llc System for mapping extracted Neural activity into Neuroceptual graphs
US10360506B2 (en) 2014-07-23 2019-07-23 Hrl Laboratories, Llc General formal concept analysis (FCA) framework for classification
US10740331B2 (en) * 2014-08-07 2020-08-11 Coupang Corp. Query execution apparatus, method, and system for processing data, query containing a composite primitive
US9424333B1 (en) 2014-09-05 2016-08-23 Addepar, Inc. Systems and user interfaces for dynamic and interactive report generation and editing based on automatic traversal of complex data structures
US9244899B1 (en) 2014-10-03 2016-01-26 Addepar, Inc. Systems and user interfaces for dynamic and interactive table generation and editing based on automatic traversal of complex data structures including time varying attributes
US9218502B1 (en) 2014-10-17 2015-12-22 Addepar, Inc. System and architecture for electronic permissions and security policies for resources in a data system
US10409835B2 (en) * 2014-11-28 2019-09-10 Microsoft Technology Licensing, Llc Efficient data manipulation support
US9798775B2 (en) * 2015-01-16 2017-10-24 International Business Machines Corporation Database statistical histogram forecasting
US10318866B2 (en) * 2015-03-05 2019-06-11 International Business Machines Corporation Selectivity estimation using artificial neural networks
US9875087B2 (en) * 2015-04-10 2018-01-23 Oracle International Corporation Declarative program engine for large-scale program analysis
CN107710239A (en) * 2015-07-23 2018-02-16 赫尔实验室有限公司 PARZEN window feature selecting algorithms for form concept analysis (FCA)
US10204135B2 (en) 2015-07-29 2019-02-12 Oracle International Corporation Materializing expressions within in-memory virtual column units to accelerate analytic queries
US10366083B2 (en) * 2015-07-29 2019-07-30 Oracle International Corporation Materializing internal computations in-memory to improve query performance
US11443390B1 (en) 2015-11-06 2022-09-13 Addepar, Inc. Systems and user interfaces for dynamic and interactive table generation and editing based on automatic traversal of complex data structures and incorporation of metadata mapped to the complex data structures
US10732810B1 (en) 2015-11-06 2020-08-04 Addepar, Inc. Systems and user interfaces for dynamic and interactive table generation and editing based on automatic traversal of complex data structures including summary data such as time series data
US10372807B1 (en) 2015-11-11 2019-08-06 Addepar, Inc. Systems and user interfaces for dynamic and interactive table generation and editing based on automatic traversal of complex data structures in a distributed system architecture
US9747338B2 (en) 2015-12-16 2017-08-29 International Business Machines Corporation Runtime optimization for multi-index access
US10452656B2 (en) * 2016-03-31 2019-10-22 Sap Se Deep filter propagation using explicit dependency and equivalency declarations in a data model
WO2017175433A1 (en) * 2016-04-06 2017-10-12 三菱電機株式会社 Map data generation system and method for generating map data
US10706354B2 (en) 2016-05-06 2020-07-07 International Business Machines Corporation Estimating cardinality selectivity utilizing artificial neural networks
US10162859B2 (en) * 2016-10-31 2018-12-25 International Business Machines Corporation Delayable query
US10642832B1 (en) 2016-11-06 2020-05-05 Tableau Software, Inc. Reducing the domain of a subquery by retrieving constraints from the outer query
US11055331B1 (en) 2016-11-06 2021-07-06 Tableau Software, Inc. Adaptive interpretation and compilation of database queries
US10565286B2 (en) * 2016-12-28 2020-02-18 Sap Se Constructing join histograms from histograms with Q-error guarantees
US10346398B2 (en) 2017-03-07 2019-07-09 International Business Machines Corporation Grouping in analytical databases
US10949438B2 (en) 2017-03-08 2021-03-16 Microsoft Technology Licensing, Llc Database query for histograms
WO2019147201A2 (en) * 2017-07-26 2019-08-01 Istanbul Sehir Universitesi Method of estimation for the result cluster of the inquiry realized for searching string in database
US10977240B1 (en) 2017-10-21 2021-04-13 Palantir Technologies Inc. Approaches for validating data
US11422983B2 (en) 2017-12-13 2022-08-23 Paypal, Inc. Merging data based on proximity and validation
US11080276B2 (en) * 2018-02-23 2021-08-03 Sap Se Optimal ranges for relational query execution plans
US11226955B2 (en) 2018-06-28 2022-01-18 Oracle International Corporation Techniques for enabling and integrating in-memory semi-structured data and text document searches with in-memory columnar query processing
US11153400B1 (en) 2019-06-04 2021-10-19 Thomas Layne Bascom Federation broker system and method for coordinating discovery, interoperability, connections and correspondence among networked resources
CN111159316B (en) * 2020-02-14 2023-03-14 北京百度网讯科技有限公司 Relational database query method, device, electronic equipment and storage medium
US11392572B2 (en) * 2020-03-02 2022-07-19 Sap Se Selectivity estimation using non-qualifying tuples
US11455302B2 (en) * 2020-05-15 2022-09-27 Microsoft Technology Licensing, Llc Distributed histogram computation framework using data stream sketches and samples
CN111784246B (en) * 2020-07-01 2023-04-07 深圳市检验检疫科学研究院 Logistics path estimation method
US11782918B2 (en) * 2020-12-11 2023-10-10 International Business Machines Corporation Selecting access flow path in complex queries
CN113220806B (en) * 2021-03-15 2022-03-18 中山大学 Large-scale road network direction judgment method and system based on derivative parallel line segments
US11928128B2 (en) * 2022-05-12 2024-03-12 Truist Bank Construction of a meta-database from autonomously scanned disparate and heterogeneous sources

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US588968A (en) * 1897-08-31 Vehicle-brake
US4950894A (en) * 1985-01-25 1990-08-21 Fuji Photo Film Co., Ltd. Radiation image read-out method
US5167228A (en) * 1987-06-26 1992-12-01 Brigham And Women's Hospital Assessment and modification of endogenous circadian phase and amplitude
US4956774A (en) * 1988-09-02 1990-09-11 International Business Machines Corporation Data base optimizer using most frequency values statistics
US5883968A (en) * 1994-07-05 1999-03-16 Aw Computer Systems, Inc. System and methods for preventing fraud in retail environments, including the detection of empty and non-empty shopping carts
US5950185A (en) * 1996-05-20 1999-09-07 Lucent Technologies Inc. Apparatus and method for approximating frequency moments
US5987468A (en) * 1997-12-12 1999-11-16 Hitachi America Ltd. Structure and method for efficient parallel high-dimensional similarity join
US6278989B1 (en) * 1998-08-25 2001-08-21 Microsoft Corporation Histogram construction using adaptive random sampling with cross-validation for database systems
US6146830A (en) * 1998-09-23 2000-11-14 Rosetta Inpharmatics, Inc. Method for determining the presence of a number of primary targets of a drug
US6401088B1 (en) * 1999-02-09 2002-06-04 At&T Corp. Method and apparatus for substring selectivity estimation

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112837250A (en) * 2021-01-27 2021-05-25 武汉华中数控股份有限公司 Infrared image self-adaptive enhancement method based on generalized histogram equalization
CN112837250B (en) * 2021-01-27 2023-03-10 武汉华中数控股份有限公司 Infrared image self-adaptive enhancement method based on generalized histogram equalization

Also Published As

Publication number Publication date
US6865567B1 (en) 2005-03-08
CA2743462A1 (en) 2001-01-30
CA2743462C (en) 2012-10-16
CA2279359C (en) 2012-10-23

Similar Documents

Publication Publication Date Title
CA2279359A1 (en) A method of generating attribute cardinality maps
WO2021189729A1 (en) Information analysis method, apparatus and device for complex relationship network, and storage medium
JP4875200B2 (en) Method for searching for an object appearing in an image, apparatus therefor, computer program, computer system, and computer-readable storage medium
TW455794B (en) System and method for detecting clusters of information
Hu et al. Distance indexing on road networks
US8768893B2 (en) Identifying computer users having files with common attributes
CN105843956A (en) Paging query method and system
CA2523128A1 (en) Information retrieval and text mining using distributed latent semantic indexing
CN102737123B (en) A kind of multidimensional data distribution method
CN102169491B (en) Dynamic detection method for multi-data concentrated and repeated records
US5995970A (en) Method and apparatus for geographic coordinate data storage
Kriegel et al. Similarity search in structured data
CN103473268B (en) Linear element spatial index structuring method, system and search method and system thereof
US20140074538A1 (en) Stack handling operation method, system, and computer program
CN105279524A (en) High-dimensional data clustering method based on unweighted hypergraph segmentation
CN112434031A (en) Uncertain high-utility mode mining method based on information entropy
Wicaksono et al. The comparison of apriori algorithm with preprocessing and FP-growth algorithm for finding frequent data pattern in association rule
CN101140583A (en) Text searching method and device
CN106484782B (en) A kind of large-scale medical image retrieval based on the study of multicore Hash
KR101030250B1 (en) Data processing method and data processing program
JP4440246B2 (en) Spatial index method
JP3938815B2 (en) Node creation method, image search method, and recording medium
CN104715002A (en) SPACE DIVISION METHOD and SPACE DIVISION DEVICE
KR101319647B1 (en) Method for providing scalable collaborative filtering framework with overlapped information
Low et al. Colour-based relevance feedback for image retrieval

Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed

Effective date: 20170731