DE602006021384D1 - Generation of descriptions for classes and clusters of documents - Google Patents
Generation of descriptions for classes and clusters of documents
- Publication number
- DE602006021384D1
- Authority
- DE
- Germany
- Prior art keywords
- clusters
- classes
- descriptions
- documents
- generation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/312,764 US7813919B2 (en) | 2005-12-20 | 2005-12-20 | Class description generation for clustering and categorization |
Publications (1)
Publication Number | Publication Date |
---|---|
DE602006021384D1 (de) | 2011-06-01 |
Family
ID=37726701
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE602006021384T Active DE602006021384D1 (de) | 2005-12-20 | 2006-12-14 | Generation of descriptions for classes and clusters of documents |
Country Status (3)
Country | Link |
---|---|
US (1) | US7813919B2 (de) |
EP (2) | EP2302532A1 (de) |
DE (1) | DE602006021384D1 (de) |
Families Citing this family (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7809733B2 (en) * | 2006-03-02 | 2010-10-05 | Oracle International Corp. | Effort based relevance |
US8364467B1 (en) | 2006-03-31 | 2013-01-29 | Google Inc. | Content-based classification |
US7917492B2 (en) * | 2007-09-21 | 2011-03-29 | Limelight Networks, Inc. | Method and subsystem for information acquisition and aggregation to facilitate ontology and language-model generation within a content-search-service system |
JP5082374B2 (ja) * | 2006-10-19 | 2012-11-28 | 富士通株式会社 | Phrase alignment program, translation program, phrase alignment device, and phrase alignment method |
US7757163B2 (en) * | 2007-01-05 | 2010-07-13 | International Business Machines Corporation | Method and system for characterizing unknown annotator and its type system with respect to reference annotation types and associated reference taxonomy nodes |
US7856351B2 (en) * | 2007-01-19 | 2010-12-21 | Microsoft Corporation | Integrated speech recognition and semantic classification |
US8108413B2 (en) * | 2007-02-15 | 2012-01-31 | International Business Machines Corporation | Method and apparatus for automatically discovering features in free form heterogeneous data |
US8996587B2 (en) | 2007-02-15 | 2015-03-31 | International Business Machines Corporation | Method and apparatus for automatically structuring free form hetergeneous data |
WO2009009192A2 (en) * | 2007-04-18 | 2009-01-15 | Aumni Data, Inc. | Adaptive archive data management |
US8856123B1 (en) * | 2007-07-20 | 2014-10-07 | Hewlett-Packard Development Company, L.P. | Document classification |
JP5379138B2 (ja) * | 2007-08-23 | 2013-12-25 | グーグル・インコーポレーテッド | Domain dictionary creation |
US7917355B2 (en) * | 2007-08-23 | 2011-03-29 | Google Inc. | Word detection |
US7983902B2 (en) * | 2007-08-23 | 2011-07-19 | Google Inc. | Domain dictionary creation by detection of new topic words using divergence value comparison |
US8140584B2 (en) * | 2007-12-10 | 2012-03-20 | Aloke Guha | Adaptive data classification for data mining |
US8189930B2 (en) * | 2008-07-17 | 2012-05-29 | Xerox Corporation | Categorizer with user-controllable calibration |
US8788497B2 (en) * | 2008-09-15 | 2014-07-22 | Microsoft Corporation | Automated criterion-based grouping and presenting |
US8266148B2 (en) * | 2008-10-07 | 2012-09-11 | Aumni Data, Inc. | Method and system for business intelligence analytics on unstructured data |
US8339680B2 (en) * | 2009-04-02 | 2012-12-25 | Xerox Corporation | Printer image log system for document gathering and retention |
US8386437B2 (en) * | 2009-04-02 | 2013-02-26 | Xerox Corporation | Apparatus and method for document collection and filtering |
US9405456B2 (en) * | 2009-06-08 | 2016-08-02 | Xerox Corporation | Manipulation of displayed objects by virtual magnetism |
US8165974B2 (en) | 2009-06-08 | 2012-04-24 | Xerox Corporation | System and method for assisted document review |
US8566349B2 (en) | 2009-09-28 | 2013-10-22 | Xerox Corporation | Handwritten document categorizer and method of training |
JP2011095905A (ja) * | 2009-10-28 | 2011-05-12 | Sony Corp | Information processing apparatus and method, and program |
US8756503B2 (en) | 2011-02-21 | 2014-06-17 | Xerox Corporation | Query generation from displayed text documents using virtual magnets |
US8860763B2 (en) | 2012-01-31 | 2014-10-14 | Xerox Corporation | Reversible user interface component |
US8880525B2 (en) | 2012-04-02 | 2014-11-04 | Xerox Corporation | Full and semi-batch clustering |
US9189473B2 (en) | 2012-05-18 | 2015-11-17 | Xerox Corporation | System and method for resolving entity coreference |
US9977829B2 (en) * | 2012-10-12 | 2018-05-22 | Hewlett-Packard Development Company, L.P. | Combinatorial summarizer |
US20140289260A1 (en) * | 2013-03-22 | 2014-09-25 | Hewlett-Packard Development Company, L.P. | Keyword Determination |
CN103678274A (zh) * | 2013-04-15 | 2014-03-26 | 南京邮电大学 | Text classification feature extraction method based on improved mutual information and entropy |
US20150127323A1 (en) * | 2013-11-04 | 2015-05-07 | Xerox Corporation | Refining inference rules with temporal event clustering |
JP6044963B2 (ja) | 2014-02-12 | 2016-12-14 | International Business Machines Corporation | Information processing apparatus, method, and program |
CN104991891B (zh) * | 2015-07-28 | 2018-03-30 | 北京大学 | Short text feature extraction method |
CN105045913B (zh) * | 2015-08-14 | 2018-08-28 | 北京工业大学 | Text classification method based on WordNet and latent semantic analysis |
TWI571756B (zh) | 2015-12-11 | 2017-02-21 | 財團法人工業技術研究院 | Method and system for analyzing browsing history and its documents |
CN107967912B (zh) * | 2017-11-28 | 2022-02-25 | 广州势必可赢网络科技有限公司 | Human voice segmentation method and device |
US11893500B2 (en) | 2017-11-28 | 2024-02-06 | International Business Machines Corporation | Data classification for data lake catalog |
US11301629B2 (en) | 2019-08-21 | 2022-04-12 | International Business Machines Corporation | Interleaved conversation concept flow enhancement |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5537488A (en) | 1993-09-16 | 1996-07-16 | Massachusetts Institute Of Technology | Pattern recognition system with statistical classification |
US5857179A (en) * | 1996-09-09 | 1999-01-05 | Digital Equipment Corporation | Computer method and apparatus for clustering documents and automatic generation of cluster keywords |
US6137911A (en) | 1997-06-16 | 2000-10-24 | The Dialog Corporation Plc | Test classification system and method |
US6104835A (en) * | 1997-11-14 | 2000-08-15 | Kla-Tencor Corporation | Automatic knowledge database generation for classifying objects and systems therefor |
US6424971B1 (en) | 1999-10-29 | 2002-07-23 | International Business Machines Corporation | System and method for interactive classification and analysis of data |
US6862586B1 (en) | 2000-02-11 | 2005-03-01 | International Business Machines Corporation | Searching databases that identifying group documents forming high-dimensional torus geometric k-means clustering, ranking, summarizing based on vector triplets |
US7035431B2 (en) * | 2002-02-22 | 2006-04-25 | Microsoft Corporation | System and method for probabilistic exemplar-based pattern tracking |
US7165024B2 (en) | 2002-02-22 | 2007-01-16 | Nec Laboratories America, Inc. | Inferring hierarchical descriptions of a set of documents |
US7031909B2 (en) * | 2002-03-12 | 2006-04-18 | Verity, Inc. | Method and system for naming a cluster of words and phrases |
US6931347B2 (en) * | 2002-03-29 | 2005-08-16 | International Business Machines Corporation | Safety stock determination |
US7085771B2 (en) * | 2002-05-17 | 2006-08-01 | Verity, Inc | System and method for automatically discovering a hierarchy of concepts from a corpus of documents |
US20030233232A1 (en) * | 2002-06-12 | 2003-12-18 | Lucent Technologies Inc. | System and method for measuring domain independence of semantic classes |
US7139754B2 (en) | 2004-02-09 | 2006-11-21 | Xerox Corporation | Method for multi-class, multi-label categorization using probabilistic hierarchical modeling |
US7457808B2 (en) | 2004-12-17 | 2008-11-25 | Xerox Corporation | Method and apparatus for explaining categorization decisions |
US20060287848A1 (en) * | 2005-06-20 | 2006-12-21 | Microsoft Corporation | Language classification with random feature clustering |
US7849087B2 (en) | 2005-06-29 | 2010-12-07 | Xerox Corporation | Incremental training for probabilistic categorizer |
US7630977B2 (en) | 2005-06-29 | 2009-12-08 | Xerox Corporation | Categorization including dependencies between different category systems |
US8209335B2 (en) * | 2005-09-20 | 2012-06-26 | International Business Machines Corporation | Extracting informative phrases from unstructured text |
2005
- 2005-12-20: US application US11/312,764, patent US7813919B2 (en); not active (Expired - Fee Related)
2006
- 2006-12-14: EP application EP10184892A, publication EP2302532A1 (de); not active (Ceased)
- 2006-12-14: EP application EP06126091A, patent EP1801714B1 (de); not active (Expired - Fee Related)
- 2006-12-14: DE application DE602006021384T, publication DE602006021384D1 (de); active (Active)
Also Published As
Publication number | Publication date |
---|---|
EP1801714A3 (de) | 2007-09-05 |
EP1801714B1 (de) | 2011-04-20 |
EP2302532A1 (de) | 2011-03-30 |
US20070143101A1 (en) | 2007-06-21 |
US7813919B2 (en) | 2010-10-12 |
EP1801714A2 (de) | 2007-06-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE602006021384D1 (de) | | Generation of descriptions for classes and clusters of documents |
DE602006013810D1 (de) | | Polyester derived from biomass resources and production method therefor |
DE602006011834D1 (de) | | Start-up circuit and start-up method for bandgap voltage generators |
DE602007006394D1 (de) | | Microphone and mounting arrangement |
DE602006020661D1 (de) | | Aqueous dispersion of polymer particles |
ATE480322T1 (de) | | Ceramic particles |
DE602006013788D1 (de) | | Cyclo-1 gene from maize and promoter |
EP1960952A4 (de) | | Analysis of administrative healthcare claims data and other data sources |
DE602006013761D1 (de) | | Production method for water-absorbing particles, and water-absorbing particles |
DE502006004466D1 (de) | | Electrical component |
DE602007013962D1 (de) | | Hydrogen compressor system |
DE602008001533D1 (de) | | Photomask data generation method, photomask manufacturing method, exposure method, and device manufacturing method |
DE602007000901D1 (de) | | File splitting by splitting the clusters and the management information |
DE602006001272D1 (de) | | Anesthesia box and microscope |
FI20055408A0 (fi) | | Creation of a finite computer model |
DE602006017437D1 (de) | | Polyareneazole/thermoplastic pulp and production method therefor |
BRPI0908668A2 (pt) | | Hearing protector with loudspeakers |
AU314482S (en) | | Dust cover for an electrical connector |
DE602006005502D1 (de) | | Developer regulating member and developing device |
DE502005004418D1 (de) | | Turbocharger |
DE602007006881D1 (de) | | Air switch, opening spring for the air switch, and connection method therefor |
AT503152A3 (de) | | Installation of large electrical components in double-deck multiple-unit trains |
DE502006005974D1 (de) | | Use of variables in multiple automation systems |
DE602007002258D1 (de) | | Current preamplifier and associated current comparator |
UY3519Q (es) | | Cover for air purifier |