
Method and apparatus for open data collection

 David G. Stork
A method and apparatus for open data collection is provided. The method of machine learning comprises setting up a system for learning, the system having certain goals. The method further comprising presenting queries to non-expert netizens over a network, the netizens participating in the...
Inventor: David G. Stork
Assignees: Ricoh Co., Ltd.
Primary Examiners: Wilbert L. Starks, Jr.
Attorneys: Blakely, Sokoloff, Taylor & Zafman LLP

U.S. Classification
706/12; 706/45

International Classification
G06N005/00


Citations

Patent Number | Title | Issue date
5586218 | Autonomous learning and reasoning agent | Dec 17, 1996
5671333 | Training apparatus and method | Sep 23, 1997
5675710 | Method and apparatus for training a text classifier | Oct 7, 1997
5802509 | Rule generation system and method of generating rule | Sep 1, 1998
5806056 | Expert system and method employing hierarchical knowledge base, and interactive multimedia/hypermedia applications | Sep 8, 1998
5819247 | Apparatus and methods for machine learning hypotheses | Oct 6, 1998
5901246 | Ergonomic man-machine interface incorporating adaptive pattern recognition based control system | May 4, 1999
5946673 | Computer implemented machine learning and control system | Aug 31, 1999
5960422 | System and method for optimized source selection in an information retrieval system | Sep 28, 1999
5963940 | Natural language information retrieval system and method | Oct 5, 1999
5970482 | System for data mining using neuroagents | Oct 19, 1999
5974412 | Intelligent query system for automatically indexing information in a database and automatically categorizing users | Oct 26, 1999
6026388 | User interface and other enhancements for natural language information retrieval system and method | Feb 15, 2000
6112176 | Speech data collection over the world wide web | Aug 29, 2000
6128380 | Automatic call distribution and training system | Oct 3, 2000
6128608 | Enhancing knowledge discovery using multiple support vector machines | Oct 3, 2000
6144989 | Adaptive agent-oriented software architecture | Nov 7, 2000
6269368 | Information retrieval using dynamic evidence combination | Jul 31, 2001
6289353 | Intelligent query system for automatically indexing in a database and automatically categorizing users | Sep 11, 2001
6411932 | Rule-based learning of word pronunciations from training corpora | Jun 25, 2002
6427141 | Enhancing knowledge discovery using multiple support vector machines | Jul 30, 2002
6493686 | Computer implemented machine learning method and system including specifically defined introns | Dec 10, 2002

Claims

What is claimed is:

1. A method of machine learning using a training process to train a learning system, the method comprising:

presenting queries to non-expert netizens over a network, the netizens participating in the training process; and

continually updating the system and refining the queries based on responses to the queries provided by the netizens,

wherein the queries are multiple choice queries.

2. The method of claim 1, wherein the system has certain goals including accumulating data.

3. The method of claim 2, wherein at least one goal comprises a goal selected from among the following: handwriting recognition, voice recognition, building a database of queries to recognize an object, building a database of common sense.

4. The method of claim 2, wherein the goals of the system evolve as the system is updated.

5. The method of claim 4, wherein the goals comprise a plurality of intermediate goals, that change in response to the responses while approaching a final goal.

6. The method of claim 5, wherein one of the plurality of intermediate goals is to recognize a certain letter of the alphabet in handwriting.

7. The method of claim 5, wherein one of the plurality of intermediate goals is to recognize a sound corresponding to a certain set of letters, in context.

8. The method of claim 1, further comprising providing access to a domain expert to resolve conflicts between the responses of netizens, if a conflict arises.

9. The method of claim 1, wherein setting up the system comprises:

implementing a plurality of rules for presenting questions;

implementing an architecture for interacting with the netizens to enable netizens to access the system; and

generating a database for storing the responses.

10. The method of claim 9, further comprising:

evaluating a reliability rating for each of the netizens; and

weighting the response of each of the netizens according to the reliability rating.
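
As an illustration of the reliability weighting recited in claim 10, the following is a minimal sketch only. The patent text here does not specify how a reliability rating is computed or applied, so the function names (`weighted_vote`, `update_reliability`), the default weight of 0.5, and the update rule that nudges a rating toward 1 on agreement with the weighted consensus and toward 0 otherwise are all assumptions made for this example.

```python
from collections import defaultdict


def weighted_vote(responses, reliability, default_weight=0.5):
    """Pick the option with the largest total reliability weight.

    responses:   dict mapping netizen id -> chosen option
    reliability: dict mapping netizen id -> weight in [0, 1]
    Netizens without a rating get `default_weight` (an assumption).
    """
    totals = defaultdict(float)
    for netizen, choice in responses.items():
        totals[choice] += reliability.get(netizen, default_weight)
    winner = max(totals, key=totals.get)
    return winner, dict(totals)


def update_reliability(reliability, responses, consensus,
                       rate=0.1, default_weight=0.5):
    """Return new ratings, moved slightly toward 1.0 for netizens who
    matched the consensus and toward 0.0 for those who did not.
    This update rule is hypothetical, not taken from the patent."""
    updated = dict(reliability)
    for netizen, choice in responses.items():
        old = updated.get(netizen, default_weight)
        target = 1.0 if choice == consensus else 0.0
        updated[netizen] = old + rate * (target - old)
    return updated
```

With this scheme a single highly rated netizen can outweigh several low-rated ones, which is one possible reading of "weighting the response of each of the netizens according to the reliability rating".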

11. A system coupled to a network to present queries to and receive responses from a plurality of netizens over the network, the system comprising:

a user interface to present the queries and receive the responses;

a data aggregation logic to organize the responses;

a query formulation logic to formulate a next query based on the plurality of responses to the last query; and

reliability evaluation logic to weight each response according to a reliability of the netizen providing the response.

12. The system of claim 11, further comprising:

conflict resolution logic to resolve conflicts between responses provided by the netizens using domain experts.
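
The conflict resolution logic of claim 12 could be sketched as follows. The claims do not define what counts as a conflict, so this example assumes one: responses conflict when the most popular option falls below an agreement threshold. The function name `resolve`, the `min_agreement` parameter, and the `ask_expert` callback standing in for a domain expert are hypothetical.

```python
from collections import Counter


def resolve(responses, ask_expert, min_agreement=0.7):
    """Return (answer, source) for one multiple choice query.

    responses:  list of options chosen by netizens
    ask_expert: callable taking the list of contested options and
                returning the expert's choice (stand-in for a human)
    A conflict is assumed when the top option's share of responses
    is below `min_agreement`.
    """
    if not responses:
        raise ValueError("no responses to resolve")
    counts = Counter(responses)
    top_choice, top_count = counts.most_common(1)[0]
    agreement = top_count / len(responses)
    if agreement >= min_agreement:
        return top_choice, "consensus"
    # Netizens disagree: escalate the contested options to the expert.
    return ask_expert(sorted(counts)), "expert"
```

In a deployed system `ask_expert` would presumably queue the query for a person rather than return immediately; the synchronous callback just keeps the sketch self-contained.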

13. A method of data aggregation over a network comprising:

presenting a question to a plurality of participants over a network;
receiving responses to the question;
analyzing the plurality of responses to the question from the plurality of participants;
formulating a next question based on the plurality of responses; and
presenting the next question to the plurality of participants.

14. The method of claim 13, further comprising:

resolving a conflict between the plurality of responses provided by the netizens using domain experts, if the conflict arises.

15. The method of claim 13, further comprising:

evaluating a reliability rating for each of the netizens; and

weighting the response of each of the netizens according to the reliability rating.
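
The cycle in claim 13 (present a question, receive responses, analyze them, formulate the next question) can be sketched as one round of a loop. The refinement strategy shown, keeping only the most frequently chosen options for the next question, is an assumption for illustration; the claims do not say how the next question is derived. Participants are modeled as plain callables, and all names are hypothetical.

```python
from collections import Counter


def analyze(responses):
    """Rank options by how often participants chose them."""
    return Counter(responses).most_common()


def formulate_next_question(question, ranked, keep=2):
    """Narrow the question to the `keep` most-chosen options
    (an assumed refinement rule, not one stated in the claims)."""
    kept = [option for option, _ in ranked[:keep]]
    return {"prompt": question["prompt"], "options": kept}


def run_round(question, participants):
    """Present `question` to every participant, collect and analyze
    the responses, and return the next question plus the ranking."""
    responses = [answer(question) for answer in participants]
    ranked = analyze(responses)
    return formulate_next_question(question, ranked), ranked
```

Calling `run_round` repeatedly, feeding each returned question back in, gives the iterative narrowing that the claim describes in its last two steps.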

16. A method of interacting with a user comprising:

presenting a query to the user over a network;
receiving a response to the query from the user, the response transmitted to a learning system; and
informing the user of a result generated based on the response to the query, such that the user is rewarded by being informed of the content and state of data being gathered based on the response.
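
One way to picture the feedback step of claim 16 is a session object that records each response and immediately returns a summary of the data gathered so far. This is a sketch under assumptions: the class name `FeedbackSession`, the fields in the returned summary, and the choice to report the current leading answer are all invented for the example and are not drawn from the patent.

```python
from collections import Counter


class FeedbackSession:
    """Stores responses per query and reports the state back to the user."""

    def __init__(self):
        self.counts = {}  # query id -> Counter of chosen options

    def submit(self, query_id, choice):
        """Record one response and return what the user is told."""
        tally = self.counts.setdefault(query_id, Counter())
        tally[choice] += 1
        total = sum(tally.values())
        return {
            "your_answer": choice,
            "total_responses": total,
            # Fraction of all responses so far that match this answer.
            "share_agreeing": tally[choice] / total,
            "current_leader": tally.most_common(1)[0][0],
        }
```

The returned dictionary plays the role of the "result generated based on the response"; a real interface would render it rather than hand back raw values.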

17. A machine readable medium having stored thereon data representing sequences of instructions, which when executed by a computer system, cause said computer system to perform the operations of:

presenting multiple choice queries to non-expert netizens over a network, the netizens participating in a training process of a learning system; and

continually updating the learning system and refining the multiple choice queries based on responses to the queries provided by the netizens.

18. The machine readable medium of claim 17, wherein the system includes a plurality of goals, and one of the goals is to accumulate data.

19. The machine readable medium of claim 17, further comprising:

rewarding netizens for their participation in the training process.

20. A system for implementing a training process comprising:

a means for presenting queries to and receiving responses from non-expert netizens over a network, the netizens participating in the training process;
a means for continually updating the system and refining the queries based on the responses to the queries provided by the netizens; and
a means for rewarding the netizens for participation in training the system.

21. The system for training of claim 20, further comprising:

a means for storing the responses of the netizens; and
a means for weighting the responses of each netizen based on a reliability of the netizen.
