CN1517914A - Retrieval of structured documents - Google Patents
Retrieval of structured documents
- Publication number: CN1517914A
- Authority: CN (China)
- Prior art keywords: file, path, user, term, query
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F16/338—Presentation of query results
- G06F16/3341—Query execution using boolean model
- G06F16/951—Indexing; Web crawling techniques
- A61J3/07—Devices or methods specially adapted for bringing pharmaceutical products into particular physical or administering forms into the form of capsules or similar small containers for oral use
- A61J2200/42—Heating means
- A61J2200/72—Device provided with specific sensor or indicating means for temperature
- Y10S707/956—Hierarchical
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99935—Query augmenting and refining, e.g. inexact access
- Y10S707/99936—Pattern matching access
- Y10S707/99937—Sorting
Description
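No. | Query |
---|---|
1 | History of China |
2 | Qing dynasty |
3 | The atomic bomb in American history |
4 | Ford Motor in World War II |
5 | What was Newton's impact on calculus? |
6 | What is Microsoft's attitude toward the World Wide Web? |
7 | What was Lincoln's influence on American history? |
8 | Fleet Street in London |
9 | Military aircraft in Desert Storm |
10 | What missiles can a nuclear submarine carry? |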
Threshold | TFIDF Para | Scalable retrieval | % improvement |
---|---|---|---|
0.1 | 0.4225 | 0.6596 | 56.14 |
0.2 | 0.4669 | 0.6638 | 42.17 |
0.3 | 0.5161 | 0.6668 | 29.21 |
0.4 | 0.5490 | 0.6668 | 21.46 |
0.5 | 0.5204 | 0.7139 | 37.19 |
0.6 | 0.4851 | 0.6995 | 44.20 |
0.7 | 0.4995 | 0.7641 | 52.97 |
0.8 | 0.4278 | 0.7258 | 69.66 |
0.9 | 0.2845 | 0.7351 | 158.40 |
Average | 0.4616 | 0.7910 | 71.35 |
Average + standard deviation | 0.4694 | 0.6167 | 31.39 |
No threshold | 0.2137 | 0.6596 | 208.74 |
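
The last column of the table above appears to be the relative gain of the third column over the second. The following is a minimal Python sketch of that arithmetic, assuming the improvement is computed as (scalable − TFIDF) / TFIDF × 100; the record itself does not state the formula, and the names `ROWS`, `relative_improvement`, and `max_discrepancy` are hypothetical. Under that assumption the recomputed values agree with the listed ones to within about 0.1 percentage point, a gap that could plausibly come from the scores being rounded to four decimals.

```python
# Rows copied from the table above:
# (threshold setting, TFIDF Para score, scalable-retrieval score, listed % improvement)
ROWS = [
    ("0.1", 0.4225, 0.6596, 56.14),
    ("0.2", 0.4669, 0.6638, 42.17),
    ("0.3", 0.5161, 0.6668, 29.21),
    ("0.4", 0.5490, 0.6668, 21.46),
    ("0.5", 0.5204, 0.7139, 37.19),
    ("0.6", 0.4851, 0.6995, 44.20),
    ("0.7", 0.4995, 0.7641, 52.97),
    ("0.8", 0.4278, 0.7258, 69.66),
    ("0.9", 0.2845, 0.7351, 158.40),
    ("average", 0.4616, 0.7910, 71.35),
    ("average + std dev", 0.4694, 0.6167, 31.39),
    ("no threshold", 0.2137, 0.6596, 208.74),
]


def relative_improvement(baseline: float, improved: float) -> float:
    """Percentage gain of `improved` over `baseline` (assumed formula)."""
    return (improved - baseline) / baseline * 100.0


def max_discrepancy(rows) -> float:
    """Largest absolute gap between the recomputed and the listed improvement."""
    return max(
        abs(relative_improvement(baseline, improved) - listed)
        for _, baseline, improved, listed in rows
    )


if __name__ == "__main__":
    for label, baseline, improved, listed in ROWS:
        recomputed = relative_improvement(baseline, improved)
        print(f"{label:>18}: recomputed {recomputed:7.2f}  listed {listed:7.2f}")
    print(f"largest gap: {max_discrepancy(ROWS):.3f} percentage points")
```

The check is only a consistency test on the published figures; it says nothing about how the underlying scores were measured.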
Claims (58)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/337,138 | 2003-01-06 | ||
US10/337,138 US7111000B2 (en) | 2003-01-06 | 2003-01-06 | Retrieval of structured documents |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1517914A (zh) | 2004-08-04 |
CN100568229C (zh) | 2009-12-09 |
Family
ID=32507431
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2004100016133A, Expired - Lifetime, CN100568229C (zh) | 2003-01-06 | 2004-01-06 | Retrieval of structured documents |
Country Status (5)
Country | Link |
---|---|
US (4) | US7111000B2 (zh) |
EP (1) | EP1435581B1 (zh) |
JP (1) | JP4425641B2 (zh) |
KR (1) | KR101120760B1 (zh) |
CN (1) | CN100568229C (zh) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
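CN1318974C (zh) * | 2005-08-05 | 2007-05-30 | Beijing Jiuzhou Huibao Software Co., Ltd. | Method for compressing and querying database backup data |
CN102521270A (zh) * | 2010-11-22 | 2012-06-27 | Microsoft Corporation | Decomposable ranking for efficient precomputing |
CN102929901A (zh) * | 2006-06-26 | 2013-02-13 | The Nielsen Company (US), LLC | Methods and apparatus for improving data warehouse performance |
CN101681377B (zh) * | 2007-05-23 | 2013-03-27 | Microsoft Corporation | User-defined relevance ranking for search |
CN103488681A (zh) * | 2009-06-19 | 2014-01-01 | Blekko, Inc. | Slashtags |
CN104572620A (zh) * | 2014-12-31 | 2015-04-29 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus for displaying chapter content |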
US9195745B2 (en) | 2010-11-22 | 2015-11-24 | Microsoft Technology Licensing, Llc | Dynamic query master agent for query execution |
US9342582B2 (en) | 2010-11-22 | 2016-05-17 | Microsoft Technology Licensing, Llc | Selection of atoms for search engine retrieval |
US9424351B2 (en) | 2010-11-22 | 2016-08-23 | Microsoft Technology Licensing, Llc | Hybrid-distribution model for search engine indexes |
US9529908B2 (en) | 2010-11-22 | 2016-12-27 | Microsoft Technology Licensing, Llc | Tiering of posting lists in search engine index |
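CN106815266A (zh) * | 2015-12-01 | 2017-06-09 | Beijing Gridsum Technology Co., Ltd. | Method and apparatus for retrieving judgment documents |
CN108959573A (zh) * | 2018-07-05 | 2018-12-07 | BOE Technology Group Co., Ltd. | Desktop-cloud-based data migration method and apparatus, electronic device, and storage medium |
CN110990017A (zh) * | 2019-09-11 | 2020-04-10 | Wuxi Jiangnan Institute of Computing Technology | Feature storage and matching method based on a trusted tree |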
Families Citing this family (73)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7111000B2 (en) * | 2003-01-06 | 2006-09-19 | Microsoft Corporation | Retrieval of structured documents |
JP4049317B2 (ja) | 2003-05-14 | 2008-02-20 | International Business Machines Corporation | Search support apparatus and program |
PL363397A1 (en) * | 2003-11-12 | 2005-05-16 | Advanced Digital Broadcast Ltd. | System for data search and definition in tree formats and method for data search and definition in tree formats |
US20050198059A1 (en) * | 2004-03-04 | 2005-09-08 | Peilin Chou | Database and database management system |
JP2005309727A (ja) * | 2004-04-21 | 2005-11-04 | Hitachi Ltd | File system |
US7487145B1 (en) | 2004-06-22 | 2009-02-03 | Google Inc. | Method and system for autocompletion using ranked results |
US7836044B2 (en) | 2004-06-22 | 2010-11-16 | Google Inc. | Anticipated query generation and processing in a search engine |
JP4309818B2 (ja) * | 2004-07-15 | 2009-08-05 | Toshiba Corporation | Structured document management apparatus, search apparatus, storage method, search method, and program |
US20060031760A1 (en) * | 2004-08-05 | 2006-02-09 | Microsoft Corporation | Adaptive document layout server/client system and process |
US20060047656A1 (en) * | 2004-09-01 | 2006-03-02 | Dehlinger Peter J | Code, system, and method for retrieving text material from a library of documents |
US20060085401A1 (en) * | 2004-10-20 | 2006-04-20 | Microsoft Corporation | Analyzing operational and other data from search system or the like |
US7499940B1 (en) | 2004-11-11 | 2009-03-03 | Google Inc. | Method and system for URL autocompletion using ranked results |
US20060106769A1 (en) | 2004-11-12 | 2006-05-18 | Gibbs Kevin A | Method and system for autocompletion for languages having ideographs and phonetic characters |
US8090736B1 (en) * | 2004-12-30 | 2012-01-03 | Google Inc. | Enhancing search results using conceptual document relationships |
US9189481B2 (en) * | 2005-05-06 | 2015-11-17 | John M. Nelson | Database and index organization for enhanced document retrieval |
US20060259475A1 (en) * | 2005-05-10 | 2006-11-16 | Dehlinger Peter J | Database system and method for retrieving records from a record library |
US8156097B2 (en) * | 2005-11-14 | 2012-04-10 | Microsoft Corporation | Two stage search |
US8010523B2 (en) * | 2005-12-30 | 2011-08-30 | Google Inc. | Dynamic search box for web browser |
US7809711B2 (en) * | 2006-06-02 | 2010-10-05 | International Business Machines Corporation | System and method for semantic analysis of intelligent device discovery |
CN100573520C (zh) * | 2006-08-29 | 2009-12-23 | International Business Machines Corporation | Method and apparatus for preprocessing a plurality of documents for retrieval |
US8401841B2 (en) * | 2006-08-31 | 2013-03-19 | Orcatec Llc | Retrieval of documents using language models |
WO2008049092A2 (en) * | 2006-10-18 | 2008-04-24 | Google Inc. | Generic online ranking system and method suitable for syndication |
US7836085B2 (en) * | 2007-02-05 | 2010-11-16 | Google Inc. | Searching structured geographical data |
US7831587B2 (en) * | 2007-05-10 | 2010-11-09 | Xerox Corporation | Event hierarchies and memory organization for structured data retrieval |
US7822752B2 (en) * | 2007-05-18 | 2010-10-26 | Microsoft Corporation | Efficient retrieval algorithm by query term discrimination |
US9256594B2 (en) | 2007-06-06 | 2016-02-09 | Michael S. Neustel | Patent analyzing system |
US8160306B1 (en) * | 2007-06-06 | 2012-04-17 | Neustel Michael S | Patent analyzing system |
US20090119281A1 (en) * | 2007-11-03 | 2009-05-07 | Andrew Chien-Chung Wang | Granular knowledge based search engine |
US8069179B2 (en) * | 2008-04-24 | 2011-11-29 | Microsoft Corporation | Preference judgements for relevance |
US8161036B2 (en) * | 2008-06-27 | 2012-04-17 | Microsoft Corporation | Index optimization for ranking using a linear model |
US8171031B2 (en) * | 2008-06-27 | 2012-05-01 | Microsoft Corporation | Index optimization for ranking using a linear model |
US8312032B2 (en) | 2008-07-10 | 2012-11-13 | Google Inc. | Dictionary suggestions for partial user entries |
US20100125566A1 (en) * | 2008-11-18 | 2010-05-20 | Patentcafe.Com, Inc. | System and method for conducting a patent search |
US20100287152A1 (en) | 2009-05-05 | 2010-11-11 | Paul A. Lipari | System, method and computer readable medium for web crawling |
US10303722B2 (en) * | 2009-05-05 | 2019-05-28 | Oracle America, Inc. | System and method for content selection for web page indexing |
KR101122394B1 (ko) * | 2009-05-08 | 2012-03-23 | NHN Corp. | Method and apparatus for providing search results using entropy scores |
CN102483752A (zh) | 2009-06-03 | 2012-05-30 | Google Inc. | Autocompletion for partially entered queries |
WO2011000165A1 (en) * | 2009-07-03 | 2011-01-06 | Hewlett-Packard Development Company,L.P. | Apparatus and method for text extraction |
US9507827B1 (en) * | 2010-03-25 | 2016-11-29 | Excalibur Ip, Llc | Encoding and accessing position data |
US8370330B2 (en) * | 2010-05-28 | 2013-02-05 | Apple Inc. | Predicting content and context performance based on performance history of users |
US20120084291A1 (en) * | 2010-09-30 | 2012-04-05 | Microsoft Corporation | Applying search queries to content sets |
US8620907B2 (en) | 2010-11-22 | 2013-12-31 | Microsoft Corporation | Matching funnel for large document index |
US8713024B2 (en) | 2010-11-22 | 2014-04-29 | Microsoft Corporation | Efficient forward ranking in a search engine |
US9098570B2 (en) * | 2011-03-31 | 2015-08-04 | Lexisnexis, A Division Of Reed Elsevier Inc. | Systems and methods for paragraph-based document searching |
US20120271844A1 (en) * | 2011-04-20 | 2012-10-25 | Microsoft Corporation | Providng relevant information for a term in a user message |
KR101454677B1 (ko) * | 2011-10-31 | 2014-10-27 | Naver Corporation | Method and apparatus for providing search results using entropy scores |
US8965904B2 (en) * | 2011-11-15 | 2015-02-24 | Long Van Dinh | Apparatus and method for information access, search, rank and retrieval |
US20130297657A1 (en) * | 2012-05-01 | 2013-11-07 | Gajanan Chinchwadkar | Apparatus and Method for Forming and Using a Tree Structured Database with Top-Down Trees and Bottom-Up Indices |
JP6590481B2 (ja) * | 2012-12-07 | 2019-10-16 | Canon Electronics Inc. | Virus intrusion route identification apparatus, virus intrusion route identification method, and program |
US9916284B2 (en) * | 2013-12-10 | 2018-03-13 | International Business Machines Corporation | Analyzing document content and generating an appendix |
JP6461992B2 (ja) | 2014-11-05 | 2019-01-30 | Canon Electronics Inc. | Identification apparatus, control method therefor, and program |
US9875288B2 (en) | 2014-12-01 | 2018-01-23 | Sap Se | Recursive filter algorithms on hierarchical data models described for the use by the attribute value derivation |
US10776376B1 (en) * | 2014-12-05 | 2020-09-15 | Veritas Technologies Llc | Systems and methods for displaying search results |
US11281639B2 (en) * | 2015-06-23 | 2022-03-22 | Microsoft Technology Licensing, Llc | Match fix-up to remove matching documents |
US10733164B2 (en) | 2015-06-23 | 2020-08-04 | Microsoft Technology Licensing, Llc | Updating a bit vector search index |
US11392568B2 (en) | 2015-06-23 | 2022-07-19 | Microsoft Technology Licensing, Llc | Reducing matching documents for a search query |
US10242071B2 (en) | 2015-06-23 | 2019-03-26 | Microsoft Technology Licensing, Llc | Preliminary ranker for scoring matching documents |
US10467215B2 (en) | 2015-06-23 | 2019-11-05 | Microsoft Technology Licensing, Llc | Matching documents using a bit vector search index |
US10229143B2 (en) | 2015-06-23 | 2019-03-12 | Microsoft Technology Licensing, Llc | Storage and retrieval of data from a bit vector search index |
US10565198B2 (en) | 2015-06-23 | 2020-02-18 | Microsoft Technology Licensing, Llc | Bit vector search index using shards |
JP6890593B2 (ja) * | 2015-12-24 | 2021-06-18 | Koninklijke Philips N.V. | Device and method for determining the length of a relevant history |
US20180165265A1 (en) * | 2016-12-08 | 2018-06-14 | International Business Machines Corporation | Indicating property inheritance in object hierarchies |
KR102594625B1 (ko) * | 2017-03-19 | 2023-10-25 | Ofek-Eshkolot Research and Development Ltd. | System and method for generating filters for K-mismatch search |
WO2020075062A1 (en) * | 2018-10-08 | 2020-04-16 | Arctic Alliance Europe Oy | Method and system to perform text-based search among plurality of documents |
US11074262B2 (en) * | 2018-11-30 | 2021-07-27 | International Business Machines Corporation | Automated document filtration and prioritization for document searching and access |
US11061913B2 (en) * | 2018-11-30 | 2021-07-13 | International Business Machines Corporation | Automated document filtration and priority scoring for document searching and access |
US11068490B2 (en) * | 2019-01-04 | 2021-07-20 | International Business Machines Corporation | Automated document filtration with machine learning of annotations for document searching and access |
US11721441B2 (en) | 2019-01-15 | 2023-08-08 | Merative Us L.P. | Determining drug effectiveness ranking for a patient using machine learning |
US10977292B2 (en) | 2019-01-15 | 2021-04-13 | International Business Machines Corporation | Processing documents in content repositories to generate personalized treatment guidelines |
US11537581B2 (en) * | 2019-03-22 | 2022-12-27 | Hewlett Packard Enterprise Development Lp | Co-parent keys for document information trees |
US11531818B2 (en) * | 2019-11-15 | 2022-12-20 | 42 Maru Inc. | Device and method for machine reading comprehension question and answer |
US20210349888A1 (en) * | 2020-05-11 | 2021-11-11 | Dropbox, Inc. | Personalized Spelling Correction |
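CN112307356A (zh) * | 2020-10-30 | 2021-02-02 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Information search method and apparatus, electronic device, and storage medium |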
Family Cites Families (55)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5020019A (en) * | 1989-05-29 | 1991-05-28 | Ricoh Company, Ltd. | Document retrieval system |
JPH03122770A (ja) * | 1989-10-05 | 1991-05-24 | Ricoh Co Ltd | Keyword-association document retrieval method |
US5404514A (en) * | 1989-12-26 | 1995-04-04 | Kageneck; Karl-Erbo G. | Method of indexing and retrieval of electronically-stored documents |
US5321833A (en) * | 1990-08-29 | 1994-06-14 | Gte Laboratories Incorporated | Adaptive ranking system for information retrieval |
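JP2943447B2 (ja) * | 1991-01-30 | 1999-08-30 | Mitsubishi Electric Corporation | Text information extraction apparatus, text similarity matching apparatus, text retrieval system, text information extraction method, text similarity matching method, and question analysis apparatus |
JPH05101107A (ja) * | 1991-10-07 | 1993-04-23 | Hitachi Ltd | Apparatus and method for narrowed-down data retrieval using relevance ratio |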
GB9220404D0 (en) * | 1992-08-20 | 1992-11-11 | Nat Security Agency | Method of identifying,retrieving and sorting documents |
JP2770715B2 (ja) * | 1993-08-25 | 1998-07-02 | Fuji Xerox Co., Ltd. | Structured document retrieval apparatus |
EP0645757B1 (en) * | 1993-09-23 | 2000-04-05 | Xerox Corporation | Semantic co-occurrence filtering for speech recognition and signal transcription applications |
US5692176A (en) * | 1993-11-22 | 1997-11-25 | Reed Elsevier Inc. | Associative text search and retrieval system |
US5574840A (en) | 1994-08-29 | 1996-11-12 | Microsoft Corporation | Method and system for selecting text utilizing a plurality of text using switchable minimum granularity of selection |
JP2896634B2 (ja) | 1995-03-02 | 1999-05-31 | Fuji Xerox Co., Ltd. | Full-text registered-word search apparatus and full-text registered-word search method |
US5826260A (en) * | 1995-12-11 | 1998-10-20 | International Business Machines Corporation | Information retrieval system and method for displaying and ordering information based on query element contribution |
US5752242A (en) * | 1996-04-18 | 1998-05-12 | Electronic Data Systems Corporation | System and method for automated retrieval of information |
JP3598742B2 (ja) * | 1996-11-25 | 2004-12-08 | Fuji Xerox Co., Ltd. | Document retrieval apparatus and document retrieval method |
US6098065A (en) * | 1997-02-13 | 2000-08-01 | Nortel Networks Corporation | Associative search engine |
US5873081A (en) * | 1997-06-27 | 1999-02-16 | Microsoft Corporation | Document filtering via directed acyclic graphs |
US5933822A (en) * | 1997-07-22 | 1999-08-03 | Microsoft Corporation | Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision |
US6014639A (en) | 1997-11-05 | 2000-01-11 | International Business Machines Corporation | Electronic catalog system for exploring a multitude of hierarchies, using attribute relevance and forwarding-checking |
US5999664A (en) * | 1997-11-14 | 1999-12-07 | Xerox Corporation | System for searching a corpus of document images by user specified document layout components |
US6801916B2 (en) * | 1998-04-01 | 2004-10-05 | Cyberpulse, L.L.C. | Method and system for generation of medical reports from data in a hierarchically-organized database |
US6389425B1 (en) | 1998-07-09 | 2002-05-14 | International Business Machines Corporation | Embedded storage mechanism for structured data types |
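JP2000029902A (ja) | 1998-07-15 | 2000-01-28 | Nec Corp | Structured document classification apparatus and recording medium storing a program for implementing the apparatus on a computer, and structured document retrieval system and recording medium storing a program for implementing the system on a computer |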
US6446061B1 (en) * | 1998-07-31 | 2002-09-03 | International Business Machines Corporation | Taxonomy generation for document collections |
JP2000090098A (ja) | 1998-09-09 | 2000-03-31 | Hitachi Ltd | Database query method, apparatus for implementing the same, and medium recording its processing program |
US6363378B1 (en) * | 1998-10-13 | 2002-03-26 | Oracle Corporation | Ranking of query feedback terms in an information retrieval system |
JP2001160066A (ja) | 1998-12-25 | 2001-06-12 | Matsushita Electric Ind Co Ltd | Data processing apparatus, data processing method, recording medium, and program for causing a computer to execute the data processing method |
US6385611B1 (en) * | 1999-05-07 | 2002-05-07 | Carlos Cardona | System and method for database retrieval, indexing and statistical analysis |
US7225182B2 (en) * | 1999-05-28 | 2007-05-29 | Overture Services, Inc. | Recommending search terms using collaborative filtering and web spidering |
US6380947B1 (en) * | 1999-07-22 | 2002-04-30 | At&T Corp. | Method and apparatus for displaying and tree scrolling a hierarchical data structure |
US20020052692A1 (en) | 1999-09-15 | 2002-05-02 | Eoin D. Fahy | Computer systems and methods for hierarchical cluster analysis of large sets of biological data including highly dense gene array data |
US7287214B1 (en) * | 1999-12-10 | 2007-10-23 | Books24X7.Com, Inc. | System and method for providing a searchable library of electronic documents to a user |
US6397211B1 (en) * | 2000-01-03 | 2002-05-28 | International Business Machines Corporation | System and method for identifying useless documents |
US7333983B2 (en) * | 2000-02-03 | 2008-02-19 | Hitachi, Ltd. | Method of and an apparatus for retrieving and delivering documents and a recording media on which a program for retrieving and delivering documents are stored |
DE60044423D1 (de) * | 2000-02-03 | 2010-07-01 | Hitachi Ltd | Method and apparatus for retrieving and outputting documents, and storage medium with corresponding program |
KR20040041082A (ko) * | 2000-07-24 | 2004-05-13 | Vivcom, Inc. | System and method for multimedia bookmarks and virtual editing of video |
KR100426382B1 (ko) * | 2000-08-23 | 2004-04-08 | Kimpo College | Document-cluster-based ranking adjustment method using entropy information and Bayesian SOM |
KR100434902B1 (ko) * | 2000-08-28 | 2004-06-07 | AgentExpert Co., Ltd. | Knowledge-based customized information providing system and service method thereof |
US6804662B1 (en) * | 2000-10-27 | 2004-10-12 | Plumtree Software, Inc. | Method and apparatus for query and analysis |
US6693651B2 (en) * | 2001-02-07 | 2004-02-17 | International Business Machines Corporation | Customer self service iconic interface for resource search results display and selection |
US7225234B2 (en) * | 2001-03-02 | 2007-05-29 | Sedna Patent Services, Llc | Method and system for selective advertisement display of a subset of search results |
US20020123989A1 (en) * | 2001-03-05 | 2002-09-05 | Arik Kopelman | Real time filter and a method for calculating the relevancy value of a document |
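KR100498574B1 (ko) * | 2001-03-08 | 2005-07-01 | Diquest Inc. | Natural language question-answer retrieval system using a paragraph-level real-time response index |
JP3842577B2 (ja) | 2001-03-30 | 2006-11-08 | Toshiba Corporation | Structured document retrieval method, structured document retrieval apparatus, and program |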
US20020198962A1 (en) * | 2001-06-21 | 2002-12-26 | Horn Frederic A. | Method, system, and computer program product for distributing a stored URL and web document set |
US20050108200A1 (en) * | 2001-07-04 | 2005-05-19 | Frank Meik | Category based, extensible and interactive system for document retrieval |
US7403938B2 (en) * | 2001-09-24 | 2008-07-22 | Iac Search & Media, Inc. | Natural language query processing |
US20030115191A1 (en) * | 2001-12-17 | 2003-06-19 | Max Copperman | Efficient and cost-effective content provider for customer relationship management (CRM) or other applications |
US7080059B1 (en) * | 2002-05-13 | 2006-07-18 | Quasm Corporation | Search and presentation engine |
EP1532542A1 (en) * | 2002-05-14 | 2005-05-25 | Verity, Inc. | Apparatus and method for region sensitive dynamically configurable document relevance ranking |
US7231395B2 (en) * | 2002-05-24 | 2007-06-12 | Overture Services, Inc. | Method and apparatus for categorizing and presenting documents of a distributed database |
US7139778B2 (en) * | 2002-06-28 | 2006-11-21 | Microsoft Corporation | Linear programming approach to assigning benefit to database physical design structures |
US20040037734A1 (en) * | 2002-08-23 | 2004-02-26 | Toomey Patrick J. | Method for removal of mold from a structure |
US7111000B2 (en) * | 2003-01-06 | 2006-09-19 | Microsoft Corporation | Retrieval of structured documents |
US20070260627A1 (en) * | 2006-05-03 | 2007-11-08 | Lucent Technologies Inc. | Method and apparatus for selective content modification within a content complex |
2003
- 2003-01-06: US application US10/337,138, patent US7111000B2 (Active)
- 2003-12-15: EP application EP03028647.0A, patent EP1435581B1 (Expired - Lifetime)
2004
- 2004-01-06: JP application JP2004001489A, patent JP4425641B2 (Expired - Fee Related)
- 2004-01-06: CN application CNB2004100016133A, patent CN100568229C (Expired - Lifetime)
- 2004-01-06: KR application KR1020040000739A, patent KR101120760B1 (IP Right Cessation)
2006
- 2006-03-23: US application US11/277,344, patent US7428538B2 (Expired - Lifetime)
- 2006-03-23: US application US11/277,345, publication US20060161532A1 (Abandoned)
2008
- 2008-09-16: US application US12/211,793, patent US8046370B2 (Expired - Lifetime)
Cited By (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
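CN1318974C (zh) * | 2005-08-05 | 2007-05-30 | Beijing Jiuzhou Huibao Software Co., Ltd. | Method for compressing and querying database backup data |
CN102929901B (zh) * | 2006-06-26 | 2016-12-14 | The Nielsen Company (US), LLC | Methods and apparatus for improving data warehouse performance |
CN102929901A (zh) * | 2006-06-26 | 2013-02-13 | The Nielsen Company (US), LLC | Methods and apparatus for improving data warehouse performance |
CN101681377B (zh) * | 2007-05-23 | 2013-03-27 | Microsoft Corporation | User-defined relevance ranking for search |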
US11055270B2 (en) | 2009-06-19 | 2021-07-06 | International Business Machines Corporation | Trash daemon |
US10877950B2 (en) | 2009-06-19 | 2020-12-29 | International Business Machines Corporation | Slashtags |
US10078650B2 (en) | 2009-06-19 | 2018-09-18 | International Business Machines Corporation | Hierarchical diff files |
US11080256B2 (en) | 2009-06-19 | 2021-08-03 | International Business Machines Corporation | Combinators |
US11176114B2 (en) | 2009-06-19 | 2021-11-16 | International Business Machines Corporation | RAM daemons |
US10437808B2 (en) | 2009-06-19 | 2019-10-08 | International Business Machines Corporation | RAM daemons |
US11487735B2 (en) | 2009-06-19 | 2022-11-01 | International Business Machines Corporation | Combinators |
CN103488681A (zh) * | 2009-06-19 | 2014-01-01 | Blekko, Inc. | Slashtags |
US10095725B2 (en) | 2009-06-19 | 2018-10-09 | International Business Machines Corporation | Combinators |
US9607085B2 (en) | 2009-06-19 | 2017-03-28 | International Business Machines Corporation | Hierarchical diff files |
US10997145B2 (en) | 2009-06-19 | 2021-05-04 | International Business Machines Corporation | Hierarchical diff files |
CN102521270B (zh) * | 2010-11-22 | 2015-04-01 | 微软公司 | Decomposable ranking for efficient precomputing |
US9529908B2 (en) | 2010-11-22 | 2016-12-27 | Microsoft Technology Licensing, Llc | Tiering of posting lists in search engine index |
US9424351B2 (en) | 2010-11-22 | 2016-08-23 | Microsoft Technology Licensing, Llc | Hybrid-distribution model for search engine indexes |
US9342582B2 (en) | 2010-11-22 | 2016-05-17 | Microsoft Technology Licensing, Llc | Selection of atoms for search engine retrieval |
US10437892B2 (en) | 2010-11-22 | 2019-10-08 | Microsoft Technology Licensing, Llc | Efficient forward ranking in a search engine |
US9195745B2 (en) | 2010-11-22 | 2015-11-24 | Microsoft Technology Licensing, Llc | Dynamic query master agent for query execution |
US8805755B2 (en) | 2010-11-22 | 2014-08-12 | Microsoft Corporation | Decomposable ranking for efficient precomputing |
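CN102521270A (zh) * | 2010-11-22 | 2012-06-27 | 微软公司 | Decomposable ranking for efficient precomputing |
CN104572620A (zh) * | 2014-12-31 | 2015-04-29 | 百度在线网络技术(北京)有限公司 | Method and apparatus for displaying chapter content |
CN106815266A (zh) * | 2015-12-01 | 2017-06-09 | 北京国双科技有限公司 | Method and apparatus for retrieving judgment documents |
CN106815266B (zh) * | 2015-12-01 | 2020-06-16 | 北京国双科技有限公司 | Method and apparatus for retrieving judgment documents |
CN108959573B (zh) * | 2018-07-05 | 2022-07-15 | 京东方科技集团股份有限公司 | Desktop-cloud-based data migration method, apparatus, electronic device, and storage medium |
CN108959573A (zh) * | 2018-07-05 | 2018-12-07 | 京东方科技集团股份有限公司 | Desktop-cloud-based data migration method, apparatus, electronic device, and storage medium |
CN110990017A (zh) * | 2019-09-11 | 2020-04-10 | 无锡江南计算技术研究所 | Feature storage and matching method based on a trusted tree |
CN110990017B (zh) * | 2019-09-11 | 2022-09-09 | 无锡江南计算技术研究所 | Feature storage and matching method based on a trusted tree |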
Also Published As
Publication number | Publication date |
---|---|
KR101120760B1 (ko) | 2012-06-12 |
US20060161532A1 (en) | 2006-07-20 |
US7111000B2 (en) | 2006-09-19 |
JP4425641B2 (ja) | 2010-03-03 |
CN100568229C (zh) | 2009-12-09 |
KR20040063822A (ko) | 2004-07-14 |
EP1435581A3 (en) | 2005-09-28 |
JP2004213675A (ja) | 2004-07-29 |
EP1435581A2 (en) | 2004-07-07 |
US20060155690A1 (en) | 2006-07-13 |
US8046370B2 (en) | 2011-10-25 |
US20090012956A1 (en) | 2009-01-08 |
EP1435581B1 (en) | 2013-04-10 |
US20040133557A1 (en) | 2004-07-08 |
US7428538B2 (en) | 2008-09-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1517914A (zh) | | Retrieval of structured documents |
Mork et al. | | 12 years on–Is the NLM medical text indexer still useful and relevant? |
US8903810B2 (en) | | Techniques for ranking search results |
US7406459B2 (en) | | Concept network |
JP5632124B2 (ja) | | Rating method, search result sorting method, rating system, and search result sorting system |
US6920448B2 (en) | | Domain specific knowledge-based metasearch system and methods of using |
US7809717B1 (en) | | Method and apparatus for concept-based visual presentation of search results |
US6829599B2 (en) | | System and method for improving answer relevance in meta-search engines |
US7870147B2 (en) | | Query revision using known highly-ranked queries |
US9552420B2 (en) | | Feature engineering and user behavior analysis |
US20060253428A1 (en) | | Performant relevance improvements in search query results |
CN103902597B (zh) | | Method and device for determining the search relevance category corresponding to a target keyword |
CN101075942A (zh) | | Social network expert information processing system and method based on an expert-value propagation algorithm |
US20130297612A1 (en) | | System for enhancing expert-based computerized analysis of a set of digital documents and methods useful in conjunction therewith |
CN101055587A (zh) | | Method for re-ranking search engine retrieval results based on user behavior information |
CN1882943A (zh) | | Systems and methods for search processing using superunits |
US20080065631A1 (en) | | User query data mining and related techniques |
CN1489089A (zh) | | Document retrieval system and question answering system |
CN1846210A (zh) | | Method and apparatus for storing and retrieving data using ontologies |
CN1760867A (zh) | | Method and apparatus for intranet search |
CN1653448A (zh) | | System and method for searching data sources |
CN1629838A (zh) | | Method, apparatus, and system for processing, browsing, and extracting information from electronic documents |
CN1750002A (zh) | | Method for providing search results |
US8489643B1 (en) | | System and method for automated content aggregation using knowledge base construction |
US20040186833A1 (en) | | Requirements-based knowledge discovery for technology management |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
ASS | Succession or assignment of patent right | Owner name: MICROSOFT TECHNOLOGY LICENSING LLC; Free format text: FORMER OWNER: MICROSOFT CORP.; Effective date: 20150515 |
C41 | Transfer of patent application or patent right or utility model | ||
TR01 | Transfer of patent right | Effective date of registration: 20150515; Address after: Washington State; Patentee after: MICROSOFT TECHNOLOGY LICENSING, LLC; Address before: Washington State; Patentee before: Microsoft Corp. |
C41 | Transfer of patent application or patent right or utility model | ||
TR01 | Transfer of patent right | Effective date of registration: 20160713; Address after: Grand Cayman, Georgetown, Cayman Islands; Patentee after: Microsoft Corp.; Address before: Washington State; Patentee before: MICROSOFT TECHNOLOGY LICENSING, LLC |
CX01 | Expiry of patent term | Granted publication date: 20091209 |
CX01 | Expiry of patent term |