|Publication number||US20050273461 A1|
|Application number||US 11/133,160|
|Publication date||Dec 8, 2005|
|Filing date||May 18, 2005|
|Priority date||Jun 21, 2001|
|Also published as||CA2352577A1, US6785664, US7020644, US20030009429, US20030028500|
The present invention is a division of U.S. patent application Ser. No. 09/885079, filed Jun. 21, 2001, by Kevin Jameson.
The present invention uses inventions from the following patent applications that are filed contemporaneously herewith, and which are incorporated herein by reference:
U.S. patent application Ser. No. 09/885078, Collection Information Manager, Kevin Jameson.
This invention relates to automated software systems for processing collections of computer files in arbitrary ways, thereby improving the productivity of software developers, web media developers, and other humans and computer systems that work with collections of computer files.
The general problem addressed by this invention is the low productivity of human knowledge workers who use labor-intensive manual processes to work with collections of computer files. One promising solution strategy for this software productivity problem is to build automated systems to replace manual human effort.
Unfortunately, replacing arbitrary manual processes performed on arbitrary computer files with automated systems is a difficult thing to do. Many challenging subproblems must be solved before competent automated systems can be constructed. As a consequence, the general software productivity problem has not been solved yet, despite large industry investments of time and money over several decades.
The present invention provides one piece of the overall functionality required to implement automated systems for processing collections of computer files. In particular, the current invention has a practical application in the technological arts because it provides automated collection processing systems with a means for obtaining useful, context-sensitive knowledge about variant computational processes for processing collections.
Introduction to Process Knowledge
This discussion starts at the level of automated collection processing systems, to establish a context for the present invention. Then the discussion is narrowed down to the present invention, a Collection Knowledge System.
The main goal of automated collection processing systems is to process collections of computer files, by automatically generating and executing arbitrary sequences of computer commands that are applied to the collections.
One critical part of automated collection processing systems is automatically calculating a variant computational process to execute on the current collection. This calculation is quite difficult to carry out in practice, for many significant reasons. One reason is that the computation can be arbitrarily complex. Another reason is that many files can be involved in the computation. Another reason is that many different application programs can be involved in the computation. Another reason is that platform dependent processes can be involved in the computation. Another reason is that various site preferences and policies can be involved in the computation. And so on. The list of complicating factors is long. Generating and executing arbitrary computational processes is a complex endeavor even for humans, let alone for automated collection processing systems.
For communication convenience, this discussion is now narrowed down to focus on the general domain of software development, and in particular on the field of automated software build systems. This is a useful thing to do for several reasons. First, it will ground the discussion in a practical application in the technical arts. Second, it will bring to mind examples of appropriate complexity to readers who are skilled in the art. Third, it will provide several concrete problems that are solved by the present invention.
Even though the discussion is narrowed here, readers should keep in mind that automated collection processing systems have a much wider application than only to the particular field of software build systems. The goal of automated collection processing systems is to automatically generate and execute arbitrary computational processes on arbitrary collections of computer files.
Software build processes are good examples for the present invention because they effectively illustrate many of the problem factors described above. That is, software build processes can be large, complex, customized, and platform dependent, and can involve many files and application programs. For example, large multi-platform system builds can easily involve tens of computing platforms, hundreds of different software programs on each platform, and tens of thousands of data files that participate in the build. The automatic calculation and execution of such large software builds is a challenging task, regardless of whether manual or automated means are used.
One of the most difficult aspects of calculating and executing software builds is accommodating variance in the calculated processes. For example, typical industrial software environments often contain large amounts of process variance in tools, processes, policies, data, and in almost everything else involved in computational processes.
Process variance is difficult to handle even for human programmers, because variance usually requires the simultaneous, peaceful co-existence and co-execution of a plurality of variant instances of complex processes. Peaceful co-existence and co-execution are not easy to achieve within industrial software environments. For example, two variant processes might both use a large customer database that is impractical to duplicate. Or multiple working programs in an existing working process might be incompatible with a required new variant program. And so on. In typical cases, many interacting process issues must be resolved before variant processes of significant size can peacefully co-exist in industrial software environments.
As a simple model of variant process complexity, consider a single long strand of colored beads on a string. Beads represent programs, data files, and process steps. Beads can have variant colors, shapes, and versions. Individual beads represent steps in a computational process, individual programs, particular input or output data files, or particular sets of control arguments to particular program beads. Finally, consider that software build problems of medium complexity are represented by several hundred or several thousand beads on the string.
This bead model can now be used to illustrate how process variance affects complexity. To begin with, it is reasonable to say that most complex industrial software processes must be varied in some way to meet the needs of various computational situations. In the bead model, this is equivalent to saying that most beads on the string will need to be varied at some time, by color, by shape or by version, in order to create a new, particular variant process to meet a new, particular variant software development situation.
As one example, it is often the case that some original data files will be incompatible with a proposed new computing platform bead or a proposed new program bead, requiring that multiple data beads be changed whenever certain platform or program beads are changed. This example illustrates coupling among particular changes within a particular computational process. This example also illustrates the point that it is not always possible to make only one change in a process; in many situations, multiple bead changes must be coordinated in order to create a desired variant computational process.
As a second example of bead model complexity, large industrial software environments can easily contain tens of long product strings and new product version strings, each containing many hundreds or thousands of beads. Further, tens or hundreds of programmers can be continuously improving the various product strings by adding new feature beads, fixing software bug beads, modifying existing feature beads in some way, or cloning, splitting, or merging whole strings of beads. Since each string change must be tested, it follows that many variant combinations of data beads, feature beads, bug fix beads, and process beads must peacefully co-exist together in host industrial software environments for arbitrary periods of time.
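The bead model described above can be sketched in code. The following is a minimal illustrative sketch only; the `Bead` type, its fields, and the `make_variant` helper are hypothetical names invented for this illustration, not part of the invention:

```python
from dataclasses import dataclass

# Illustrative sketch of the bead model: each bead is one program, data
# file, or process step, and a process is a string (list) of beads.
@dataclass(frozen=True)
class Bead:
    kind: str     # "program", "data", or "step"
    name: str
    version: str  # varying this field yields a variant bead

def make_variant(process, name, version):
    """Return a new bead string in which one named bead is varied.

    Coupled changes (e.g. a new platform bead forcing new data beads)
    would require several coordinated calls like this one.
    """
    return [Bead(b.kind, b.name, version) if b.name == name else b
            for b in process]

build = [Bead("program", "compiler", "1.0"), Bead("data", "main.c", "1.0")]
variant = make_variant(build, "compiler", "2.0")  # a new variant process
```

Note that the original `build` string is left untouched; the variant process co-exists with it, as the disclosure requires.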
Thus it can be seen that automated collection processing systems are not faced with only simple problems involving single complex computational processes. Instead, automated systems face a far more difficult, more general problem that involves whole families of related, complex, coupled, customized, and platform-dependent computational processes.
For each individual computational situation, a competent automated system must calculate, create, and execute a precisely correct variant computational process. In order to do that, automated systems require access to a large amount of variant process knowledge. One mechanism for providing the required knowledge is a Collection Knowledge System, the subject of the present invention.
Problems To Be Solved
This section lists several important problems that are faced by automated collection processing systems, and that are solved by the present Collection Knowledge System invention.
The Knowledge Organization Problem is one important problem that must be solved to enable the construction of automated collection processing systems. It is the problem of how to organize knowledge for variant processes, in one place, with one conceptual model, for use by multiple programs in variant processing situations.
Some interesting aspects of the Knowledge Organization Problem are these: an arbitrary number of programs can be involved; an arbitrary amount of knowledge for each program can be involved; knowledge used by programs can have arbitrary structure determined by the program; and knowledge can exist in various binary or textual forms.
The Customized Knowledge Problem is another important problem to solve. It is the problem of how to customize stored knowledge for use in variant processing situations.
Some interesting aspects of the Customized Knowledge Problem are these: knowledge can be customized for arbitrary programs; arbitrary amounts of knowledge can be customized; knowledge can be customized for sites, departments, projects, teams, and for individual people; knowledge can be customized by purpose (for example, debug versus production processes); and various permutations of customized knowledge may even be required.
The Platform-Dependent Knowledge Problem is another important problem. It is the problem of representing platform dependent knowledge in ways that promote human understanding, reduce knowledge maintenance costs, provide easy automated access to stored knowledge, and enable effective sharing of platform dependent knowledge across multiple platforms within particular application programs.
Some interesting aspects of the Platform-Dependent Knowledge Problem include these: many platforms may be involved; platforms can be closely or distantly related; platforms can share a little or a lot of information; new platform knowledge is sometimes added; old platform knowledge is sometimes discarded; knowledge can be shared among many or only a few platforms within an application.
The Coupled-Application Knowledge Problem is another important problem. It is the problem of multiple applications being indirectly coupled to each other by their shared use of the same processing knowledge for the same computing purpose. As a consequence of coupling, knowledge changes made for one program may require knowledge changes to be made in other programs. For example, a single knowledge change to enable software “debug” compilations typically requires changes to both compiler and linker control arguments. The compiler is told to insert debugging symbol tables, and the linker is told not to strip symbol tables out of the linked executable file. Thus two bodies of knowledge for two apparently independent programs are coupled by the purpose of debugging.
Some interesting aspects of the Coupled-Application Knowledge Problem are these: multiple applications may be involved in a coupling relationship; applications may be coupled by data file formats, control arguments, or execution sequences; coupling relationships can vary with the current processing purpose; multiple sets of coupled programs may be involved; and multiple coupled processes involving multiple sets of coupled applications may be involved.
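The debug example above can be sketched as follows. This is an illustrative assumption about one possible representation (the `PURPOSE_KNOWLEDGE` table and its layout are invented for this sketch), not the invention's actual storage format; the `-g` and `--strip` arguments are merely typical compiler and linker flags:

```python
# Illustrative sketch: one symbolic purpose selects coordinated control
# arguments for several coupled programs, so the compiler and linker
# cannot fall out of sync when the purpose changes.
PURPOSE_KNOWLEDGE = {
    "debug":      {"compiler": ["-g"],  "linker": []},           # keep symbol tables
    "production": {"compiler": ["-O2"], "linker": ["--strip"]},  # strip symbol tables
}

def options_for(purpose, program):
    """Look up the control arguments for one program under one purpose."""
    return PURPOSE_KNOWLEDGE[purpose][program]
```

A single change of purpose from "production" to "debug" thus updates the knowledge delivered to both coupled programs at once.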
The Shared Knowledge Problem is another important problem. It is the problem of how to share knowledge among multiple programs regardless of variant processing purposes or coupling relationships. This problem is not the same as the Coupled-Application Knowledge Problem, which considers indirect coupling among multiple applications according to computational purpose (e.g. debugging). Instead, the Shared Knowledge Problem considers deliberate sharing of knowledge among applications and multiple platforms to reduce multiple copies of the same knowledge.
Some interesting aspects of the Shared Knowledge Problem are these: shared knowledge may be platform dependent; shared knowledge may be customized by site, project, person, purpose, and so on; multiple applications may share one piece of knowledge; and the set of multiple applications that share a piece of knowledge may change with variant processing purpose.
The Scalable Knowledge Delivery Problem is another important problem. It is the problem of how to deliver arbitrary amounts of complex, customized, shared, and platform-dependent knowledge to arbitrary programs, in ways that are resistant to scale-up failure.
Some interesting aspects of the Scalable Knowledge Delivery Problem are these: arbitrary amounts of knowledge can be involved; the format of delivered knowledge can be a text string, a text pair, a list, a text or binary file, a set of files, a directory, or even a tree of files; network filesystem mounting methods such as NFS (Network File System) are sometimes inappropriate or unreliable; frequently used knowledge should be cached for faster retrieval; and cached knowledge must be flushed when the underlying original knowledge is updated or removed.
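The caching and flushing requirement above can be sketched as follows; this is a minimal illustration under assumed names (the `KnowledgeCache` class is hypothetical), not the invention's delivery mechanism:

```python
# Illustrative sketch of the caching requirement: frequently used knowledge
# is cached for faster retrieval, and cached entries are flushed when the
# underlying original knowledge is updated or removed.
class KnowledgeCache:
    def __init__(self, backing_store):
        self.store = backing_store   # stands in for any knowledge server
        self.cache = {}

    def lookup(self, key):
        if key not in self.cache:    # miss: fetch from the backing store
            self.cache[key] = self.store[key]
        return self.cache[key]

    def flush(self, key):
        """Discard a cached entry after the original knowledge changes."""
        self.cache.pop(key, None)

store = {"cc.flags": "-O2"}
cache = KnowledgeCache(store)
first = cache.lookup("cc.flags")    # cached on first use
store["cc.flags"] = "-g"            # underlying knowledge is updated
cache.flush("cc.flags")             # stale cached value must be flushed
second = cache.lookup("cc.flags")
```

Without the flush, the second lookup would silently return the stale "-O2" value, which is exactly the failure this problem statement warns against.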
The Mobile Knowledge Problem is another important problem. It is the problem of how to encapsulate knowledge within a collection, so that knowledge can be shipped around the network in the form of collections of computer (knowledge) files. Importantly, application programs working within the natural filesystem boundaries of the collection directory subtree should be able to use the knowledge stored within the mobile collection. Mobile collections provide an implementation of the idea of location-sensitive knowledge.
Some interesting aspects of the Mobile Knowledge Problem are these: arbitrary amounts of knowledge may be involved; knowledge for multiple programs may be involved; customized knowledge may be involved; variant knowledge may be involved; location-sensitive knowledge should override static knowledge stored in the system if so desired; and mobile knowledge arriving at a site should be installable at the receiving site.
The Workspace Knowledge Problem is another important problem. It is the problem of how to configure a computer filesystem workspace to contain particular sets of hierarchically-organized collections that each contain multiple knowledge files, such that the knowledge files become available to application programs that work within the directory subtree that defines the workspace. Workspaces provide an implementation of the idea of location-sensitive knowledge.
Some interesting aspects of the Workspace Knowledge Problem are these: arbitrary amounts of knowledge can be involved; workspaces lower in the subtree should share or “inherit” knowledge stored above the workspaces in the subtree; knowledge located lower in the subtree should override knowledge located higher in the subtree; and knowledge should become available to programs only when their current working directory is within the workspace subtree.
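The inheritance and override behavior above can be sketched as follows; this is an assumption-laden illustration (the `resolve` helper and the in-memory `workspace` table are invented stand-ins for knowledge files stored on disk):

```python
from pathlib import PurePosixPath

# Illustrative sketch of workspace knowledge lookup: knowledge stored lower
# in the subtree overrides knowledge stored higher, and directories with no
# entry of their own "inherit" from ancestor directories.
def resolve(workspace_knowledge, cwd, key):
    """Search from cwd upward; the deepest directory defining key wins."""
    for directory in (cwd, *cwd.parents):
        entry = workspace_knowledge.get(directory, {})
        if key in entry:
            return entry[key]
    return None   # cwd lies outside any knowledge-bearing workspace

workspace = {
    PurePosixPath("/ws"):      {"cc.flags": "-O2", "cc.name": "gcc"},
    PurePosixPath("/ws/proj"): {"cc.flags": "-g"},  # overrides the parent
}
```

A program working in `/ws/proj/src` thus sees the overriding "-g" flags from `/ws/proj` but still inherits the compiler name from `/ws`, while a program working in `/tmp` sees no workspace knowledge at all.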
The Aggregated Knowledge Problem is another important problem. It is the problem of aggregating various smaller bodies of knowledge into larger bodies of knowledge that are intended to serve a particular focus, purpose, area of endeavor, or problem domain.
Some interesting aspects of the Aggregated Knowledge Problem are these: arbitrary amounts of knowledge can be involved; aggregated collections of knowledge should be named for convenient reference; aggregated knowledge can be associated with particular filesystem locations or subtrees; and aggregated knowledge should be accessible by name, independent of filesystem location.
The Installable Knowledge Problem is another important problem. It is the problem of how to create, install, and maintain smaller, named, encapsulated subsets of system knowledge, thereby reducing the complexity of the overall system knowledge management problem.
Some interesting aspects of the Installable Knowledge Problem are these: arbitrary amounts of knowledge can be involved; previously installed knowledge must be uninstalled before newer installable knowledge can be installed; installable knowledge should not be coupled to previously existing knowledge; programs must dynamically detect and use installable knowledge; and uninstallation must cause the flushing of previously cached versions of the old installable knowledge.
As the foregoing material suggests, knowledge management for supporting variant computational processes is a complex problem. Many important issues must be solved in order to create a competent knowledge management and delivery system.
General Shortcomings of the Prior Art
A professional prior art search for the present invention was performed, but produced no meaningful, relevant works of prior art. Therefore the following discussion is general in nature, and highlights the significant conceptual differences between the program-oriented knowledge storage mechanisms of the prior art, and the novel collection-oriented knowledge management mechanisms represented by the present invention.
Prior art approaches lack support for collections. This is the largest limitation of all because it prevents the use of high-level collection abstractions that can significantly improve productivity.
Prior art approaches lack support for managing many simultaneous and different customizations of program knowledge, thereby making it impossible for one set of knowledge to simultaneously serve the needs of many software programs that participate in many variant computational processes.
Prior art approaches lack support for managing the knowledge of coupled application programs in synchronization, thereby making it difficult for humans to coordinate the actions of chains of coupled programs in variant computational processes, and thereby increasing human programming costs.
Prior art approaches lack support for sharing knowledge among multiple unrelated programs, thereby requiring humans to provide each program with its own copy of shared knowledge, and thereby increasing knowledge maintenance costs.
Prior art approaches lack support for using a single scalable means to deliver operational knowledge to many programs within typical industrial software environments, thereby making it more difficult to centrally manage knowledge, and thereby increasing knowledge maintenance costs.
Prior art approaches lack support for partitioning program knowledge into encapsulated subsets of mobile knowledge that can be easily moved around and utilized within a filesystem or computer network. This discourages the sharing and mobility of knowledge, and discourages the use of mobile, location-sensitive knowledge in particular computing situations.
Prior art approaches lack support for associating knowledge with particular directories in filesystem subtrees, thereby making it impossible to configure hierarchical computer workspaces to contain particular sets of knowledge.
Prior art approaches lack support for aggregating smaller bodies of knowledge into larger, named bodies of knowledge that can be referenced by name or that can be associated with physical filesystem subtrees. This discourages the association of bodies of aggregated knowledge with particular computational problems or computational workspaces.
Prior art approaches lack support for partitioning program knowledge into encapsulated subsets of installable knowledge that can be individually created, installed, and maintained, thereby increasing the monolithic nature of most stored program knowledge, and thereby increasing knowledge creation and maintenance costs.
As can be seen from the above description, prior art mechanisms in general have several important disadvantages. Notably, they do not provide support for collections, coupled applications, shared knowledge, customized knowledge, installable knowledge, or mobile knowledge.
In contrast, the present Collection Knowledge System has none of these limitations, as the following disclosure will show.
Specific Shortcomings in Prior Art
One main example of prior art knowledge delivery systems is the common technique of storing application program data on a local hard disk, where it can be accessed by an application program.
For example, preference options for spreadsheets and word processors on personal computers are generally stored using this technique. It is fair to say that historically, this particular approach has been the main approach used by the industry to store application program knowledge.
However, as described previously, this approach has many significant limitations with respect to supporting applications that participate in variant computational processes. Indeed, it is fair to say that this simple approach is one of the main causes of difficulty in treating variant processes, for all the reasons listed earlier.
For example, this prior art approach limits the sharing of knowledge among applications. It cannot represent the idea of coupled applications. It cannot associate variant processes with relevant knowledge. It cannot represent different customizations of application knowledge. And so on.
As can be seen from the above description, the main prior art approach used within the software industry has many significant limitations. Most importantly, it is oriented toward storing knowledge for single, isolated applications. It cannot represent knowledge for entire variant processes, and cannot use a combination of customized knowledge from many different application programs to satisfy the knowledge needs of entire variant processes.
In contrast, the present Collection Knowledge System invention has none of these limitations, as the following disclosure will show.
A Collection Knowledge System provides context-sensitive knowledge delivery services to application programs, thereby enabling application programs to better work with variant computational processes.
In operation, a Collection Knowledge System receives knowledge requests from application programs, performs local and remote lookups into structured trees of knowledge, and finally returns the obtained knowledge to the requesting programs.
Importantly, requested knowledge is retrieved using customizable search rules, thereby making it possible to override default knowledge values with higher-precedence knowledge values.
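One way such customizable search rules could behave is sketched below, purely as an assumption for illustration (the scope names "person", "project", and "site" and the `lookup` helper are invented here, not the invention's defined rule set):

```python
# Illustrative sketch of precedence-ordered search rules: a key is searched
# through an ordered, customizable list of scopes, so higher-precedence
# values override default knowledge values.
DEFAULT_SEARCH_RULES = ("person", "project", "site")   # highest precedence first

def lookup(knowledge, key, rules=DEFAULT_SEARCH_RULES):
    for scope in rules:
        scoped = knowledge.get(scope, {})
        if key in scoped:
            return scoped[key]
    raise KeyError(key)

knowledge = {
    "site":    {"cc.flags": "-O2", "parallel.limit": "4"},
    "project": {"cc.flags": "-g"},   # overrides the site-wide default
}
```

Here the project-level flags override the site default, while the parallelism limit, defined only at site level, is still found by falling through the rule list.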
Collection Knowledge Systems can store and deliver customized knowledge for entire variant computational processes, thereby enabling automated collection processing systems and individual application programs to calculate, create, and execute complex variant computational processes in automated, scalable ways that were not previously possible.
The main object of the present Collection Knowledge System invention is to manage the knowledge of entire, industrial-strength variant processes that involve many application programs and data files. The present Collection Knowledge System invention enables humans and automated programs to work with knowledge at the variant process level, rather than at the individual program level, thereby significantly improving human productivity by automatically processing variant computational process knowledge in ways that were not previously possible.
Another object is to provide a general, scalable, and automated Collection Knowledge System, thereby promoting the construction of general and scalable automated collection processing systems.
Another object is to provide support for representing and sharing platform dependent knowledge, thereby making it possible for collection knowledge systems that execute on one computing platform to serve up knowledge for many computing platforms.
Another object is to provide support for customized knowledge, thereby making it possible for humans to store customized company, department, project, team, individual, and purpose-oriented policy decisions within a collection knowledge system.
Another object is to provide support for context-sensitive knowledge, thereby making it possible to aggregate and utilize particular knowledge for particular variant processes, using symbolic context names that are associated with those variant processes.
Another object is to provide support for an automated, scalable, centralized means of managing variant process knowledge, thereby promoting increased sharing of knowledge within the central storage system, promoting increased convenience for customizing knowledge according to desired operational policies, and promoting decreased knowledge maintenance costs.
Another object is to provide support for mobile knowledge, whereby process knowledge is stored within a collection subtree on a filesystem where it can be accessed in a location-sensitive way by programs operating within the directory subtree containing the collection. Mobile knowledge mechanisms encapsulate variant computational process knowledge into mobile collections that can be transported across networks and installed into remote knowledge systems.
Another object is to provide support for workspace knowledge, whereby knowledge in the form of mobile collections is stored in a hierarchical way within a computer filesystem, thereby creating a physical computer workspace containing particular bodies of knowledge at particular locations within the computer filesystem subtree, and thereby enabling locations within the tree to “inherit” knowledge located above them within the subtree.
Another object is to provide support for aggregated knowledge, whereby knowledge in the form of mobile collections is aggregated into named sets of knowledge that can be referenced by name, or that can be associated with physical filesystem subtrees, thereby enabling application programs to access large bodies of knowledge that are specifically related to the location, purpose, or computational situation at hand.
Another object is to provide support for partitioned subsets of process knowledge in the form of installable knowledge collections, thereby enabling custom knowledge for an entire particular variant process to be stored within a single collection, and also thereby enabling the convenient installation and management of the installable knowledge.
As can be seen from the objects above, collection knowledge systems can provide many useful services to both humans and programs that process collections of computer files. Collection Knowledge Systems can significantly improve human productivity by supporting the automatic calculation, creation, and execution of complex variant computational processes, in scalable, automated ways that were not previously possible.
Further advantages of the present Collection Knowledge System invention will become apparent from the drawings and disclosures that follow.
Overview of Collections
This section introduces collections and some related terminology.
Collections are sets of computer files that can be manipulated as a set, rather than as individual files. Collection information comprises three major parts: (1) a collection specifier that contains information about a collection instance, (2) a collection type definition that contains information about how to process all collections of a particular type, and (3) optional collection content in the form of arbitrary computer files that belong to a collection.
Collection specifiers contain information about a collection instance. For example, collection specifiers may define such things as the collection type, a text summary description of the collection, collection content members, derivable output products, collection processing information such as process parallelism limits, special collection processing steps, and program option overrides for programs that manipulate collections.
Collection specifiers are typically implemented as simple key-value pairs in text files or database tables.
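A key-value specifier of the kind just described might look like the following sketch; the specific key names shown are assumptions for illustration, not a format defined by this disclosure:

```python
# Illustrative sketch of a collection specifier stored as simple key-value
# pairs in a text file; the key names are invented for this example.
SPECIFIER_TEXT = """\
collection-type = c-homepage
description     = My home page collection
parallel-limit  = 4
"""

def parse_specifier(text):
    """Parse 'key = value' lines into a dict, skipping blanks and comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs

specifier = parse_specifier(SPECIFIER_TEXT)
```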
Collection type definitions are user-defined sets of attributes that can be shared among multiple collections. In practice, collection specifiers contain collection type indicators that reference detailed collection type definitions that are externally stored and shared among all collections of a particular type. Collection type definitions typically define such things as collection types, product types, file types, action types, administrative policy preferences, and other information that is useful to application programs for understanding and processing collections.
Collection content is the set of all files and directories that are members of the collection. By convention, all files and directories recursively located within an identified set of subtrees are usually considered to be collection members. In addition, collection specifiers can contain collection content directives that add further files to the collection membership. Collection content is also called collection membership.
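The membership convention above (recursive subtree membership plus specifier content directives) can be sketched as follows; the function name and tree layout are illustrative assumptions:

```python
import os
import tempfile

# Illustrative sketch of collection membership: every file recursively
# below the collection root directory is a member, and the specifier may
# name extra members via content directives.
def collection_members(root, content_directives=()):
    members = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            members.append(os.path.join(dirpath, filename))
    members.extend(content_directives)   # specifier-added membership
    return members

# Build a tiny example collection tree for demonstration.
root = tempfile.mkdtemp(prefix="c-myhomepage-")
os.mkdir(os.path.join(root, "s"))
open(os.path.join(root, "collection-specifier.txt"), "w").close()
open(os.path.join(root, "s", "index.html"), "w").close()
members = collection_members(root)
```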
Collection is a term that refers to the union of a collection specifier and a set of collection content.
Collection information is a term that refers to the union of collection specifier information, collection type definition information, and collection content information.
Collection membership information describes collection content.
Collection information managers are software modules that obtain and organize collection information from collection information stores into information-rich collection data structures that are used by application programs.
First is a policy to specify that the root directory of a collection is a directory that contains a collection specifier file. In this example, the root directory of a collection 100 is a directory named “c-myhomepage”.
Second is a policy to specify that all files and directories in and below the root directory of a collection are part of the collection content. Therefore directory “s”
Collection Information Types
Suppose that an application program means
Collection specifiers 102 are useful because they enable all per-instance, non-collection-content information to be stored in one physical location. Collection content 103 is not included in collection specifiers because collection content 103 is often large and dispersed among many files.
All per-collection-instance information, including both collection specifier 102 and collection content 103, can be grouped into a single logical collection 100 for illustrative purposes.
Collection Application Architectures
Collection type definition API means 112 provides access to collection type information available from collection type definition server means 115. Collection specifier API means 113 provides access to collection specifier information available from collection specifier server means 116. Collection content API means 114 provides access to collection content available from collection content server means 117.
API means 112-114, although shown here as separate software components for conceptual clarity, may optionally be implemented wholly or in part within a collection information manager means 111, or within said server means 115-117, without loss of functionality.
API means 112-114 may be implemented by any functional communication mechanism known to the art, including but not limited to command line program invocations, subroutine calls, interrupts, network protocols, or file passing techniques.
Server means 115-117 may be implemented by any functional server mechanism known to the art, including but not limited to database servers, local or network file servers, HTTP web servers, FTP servers, NFS servers, or servers that use other communication protocols such as TCP/IP, etc.
Server means 115-117 may use data storage means that may be implemented by any functional storage mechanism known to the art, including but not limited to magnetic or optical disk storage, digital memory such as RAM or flash memory, network storage devices, or other computer memory devices.
Collection information manager means 111, API means 112-114, and server means 115-117 may each or all optionally reside on a separate computer to form a distributed implementation. Alternatively, if a distributed implementation is not desired, all components may be implemented on the same computer.
Collection Data Structures
In particular, preferred implementations would use collection datastructures to manage collection information for collections being processed. The specific information content of a collection datastructure is determined by implementation policy. However, a collection specifier typically contains at least a collection type indicator
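The collection datastructure described above might be sketched as follows. This is purely an illustrative sketch: the field and class names (`CollectionSpecifier`, `Collection`, `coll_type`, `content`) are assumptions for exposition, not names taken from the specification.

```python
from dataclasses import dataclass, field

@dataclass
class CollectionSpecifier:
    """Per-instance, non-collection-content information (hypothetical shape)."""
    coll_type: str                  # collection type indicator, referencing a
                                    # shared, externally stored type definition
    attributes: dict = field(default_factory=dict)

@dataclass
class Collection:
    """Union of a collection specifier and a set of collection content."""
    root_dir: str                   # directory containing the specifier file
    specifier: CollectionSpecifier  # per-instance information
    content: list = field(default_factory=list)  # member files/directories

# Example instance, echoing the "c-myhomepage" collection from the text
c = Collection("c-myhomepage",
               CollectionSpecifier("homepage"),
               ["s/index.html"])
```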
This section defines various terms used in this document; each kind of knowledge named here is explained in detail later in the document.
Knowledge is a term that refers to information that is stored in a Collection Knowledge System (CKS system) for use by application programs. Stored knowledge can include program control information, template files, type definition information, or in general, any information that is useful to an application program in producing desired application results.
Knowledge tree is a term that refers to a hierarchical organization of information, such as a tree of directories and files on a computer disk, or a logical tree of information stored as hierarchically related tables and records in a computer database.
Knowledge tree namespace is a term that refers to a set of named knowledge trees. Knowledge tree namespaces have names by which they can be referenced.
Customized knowledge is a term that refers to knowledge accessed by search rules that impose a precedence ordering on requested knowledge. For example, in a linear search rule implementation, the first knowledge found according to the search rules used would have a higher precedence than subsequent occurrences of the same knowledge. Thus the first knowledge found is said to be a customized version of the default, later-found knowledge.
Shared knowledge is a term that refers to knowledge shared among a plurality of application programs. No particular program “owns” such knowledge, since it is a shared resource.
Context knowledge is a term that refers to knowledge found using a particular named set of customized search rules that define a context. A context is a set of search rules that specify an ordered sequence of places to look for requested knowledge. For example, a “debug” context would specify knowledge search rules that gave precedence to special pieces of knowledge for supporting debugging activities.
Platform dependent knowledge is a term that refers to knowledge that varies with computing platform. Thus an application program using platform dependent knowledge would normally expect different platforms to return different knowledge results from the same original knowledge request.
Mobile knowledge is a term that refers to knowledge that is both stored and accessible in a mobile collection. The main idea of mobile knowledge is that collections can contain knowledge that can be easily moved around a network, and that can be used by application programs located at receiving locations. In this approach, application knowledge becomes a commodity resource that can be stored, shipped, and used in essentially the same way as other common computer files. Mobile knowledge can be used as a pure knowledge customization mechanism, as a pure knowledge extension mechanism, or as a combination of both, depending on the precedence positioning of mobile knowledge trees in a set of search rules. The main intent of mobile knowledge is to act as a knowledge extension mechanism.
Workspace Knowledge is a term that refers to knowledge that is contained within a hierarchically organized set of mobile knowledge collections. The main idea of workspace knowledge is to associate mobile knowledge collections with particular directories and subtrees in computer filesystems, thereby making it possible for humans to create physical, hierarchical subtree workspaces that contain knowledge relevant to the computational tasks performed at various levels within the hierarchical workspaces.
Aggregated knowledge is a term that refers to knowledge that is contained within named sets of mobile knowledge collections, or “knowledge spaces.” Aggregated knowledge can be used as a pure knowledge customization mechanism, as a pure knowledge extension mechanism, or as a combination of both techniques. The main intent of aggregated knowledge is to act as a knowledge extension mechanism.
Remote knowledge is a term that refers to knowledge that is stored remotely on a network, and that is accessed using a network program or network protocol. A good example of remote knowledge is the remote knowledge contained in a client-server CKS system, such as the one described later in this document. Two important advantages of remote knowledge are that it encourages centralized administration and widespread sharing of the remote knowledge.
Remote aggregated knowledge is a term that refers to aggregated knowledge that is provided by a remote knowledge delivery system. For example, aggregated knowledge that is stored on a server in a client-server CKS system is remote aggregated knowledge. Remote server-side aggregated knowledge is implemented identically to local client-side aggregated knowledge, and indeed, appears as local aggregated knowledge to the server program.
Installable knowledge is a term that refers to encapsulated mobile knowledge that can be easily installed into, and be uninstalled from, a knowledge delivery system. The main advantage of installable knowledge is that it partitions large knowledge sets into more easily managed subsets of knowledge that are conveniently contained within mobile installable knowledge collections.
Cached knowledge is a term that refers to knowledge that has been added to a computer memory cache for increased system performance.
Knowledge Types and Search Rules
This section describes several types of knowledge used by the present Collection Knowledge System invention, and shows how various kinds of knowledge can be represented and accessed by search rules.
Fundamental Knowledge Organization
The present Collection Knowledge System invention fundamentally organizes knowledge into hierarchical trees, because trees help to separate knowledge used by multiple application programs within one physical location. For example,
Tree structures are a simple and preferred mode of implementation, but databases using tables and records are also possible. Those skilled in the art will understand that both approaches have advantages and disadvantages, leading to various tradeoffs for each implementation.
Search Rules And Knowledge Precedence
Because any one set of knowledge cannot meet all computational situations over time, knowledge delivery systems must provide means for customizing and extending the default knowledge content of a knowledge system.
The present CKS invention uses search rules as a mechanism for both customizing and extending stored knowledge. Search rules are composed of one or more directory pathnames that tell a knowledge system where to look for knowledge files. Application programs traverse a list of search rules, looking in successive search directories to find the desired knowledge in particular knowledge files.
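The traversal just described can be sketched in a few lines. The sketch models the filesystem as a simple directory-to-filenames mapping so it is self-contained; the rule pathnames and filenames are illustrative assumptions.

```python
def find_knowledge(search_rules, filename, tree):
    """Traverse search rules in order and return the full pathname of the
    first directory that contains the requested knowledge file, or None.
    `tree` maps a directory pathname to its set of knowledge filenames."""
    for directory in search_rules:
        if filename in tree.get(directory, set()):
            return directory + "/" + filename
    return None

# Customization rule appears first, so its copy takes precedence.
rules = ["/home/alice/.myapp", "/usr/share/myapp"]
tree = {"/home/alice/.myapp": {"options.tbl"},
        "/usr/share/myapp": {"options.tbl", "colors.tbl"}}
```

Note how precedence falls out of ordering alone: `options.tbl` resolves to the customized copy under `/home/alice/.myapp`, while `colors.tbl`, which has no customized version, falls through to the default directory.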
New search rules are added to a list of search rules for two main reasons.
The first reason is to customize the existing set of knowledge files. In this case, a newly added search rule should specify a directory that contains customized versions of existing knowledge files. The new search rule must appear early enough in the list of search rules to ensure that the desired “customized” versions of knowledge files are found before non-customized versions of the knowledge files.
The second reason is to extend the set of knowledge files that can be accessed using the search rules. In this case, the added search rule should point at a directory that contains knowledge files that are not available through any other search rule. Thus the total set of knowledge files accessible through the search rules is extended.
In summary, search rules provide a means for both customizing and extending the knowledge stored in a Collection Knowledge System. By controlling the membership and ordering of rules within a search rule list, humans and applications can ensure that the desired knowledge files will be found during a search. This is an important point: search rules have a large and consequential effect on the utility and outputs of knowledge delivery systems.
Static and Dynamic Search Rules
Search rules can be static or dynamic in nature. Static search rules do not vary over long periods of time, so they can be precisely specified as valid "hardcoded" directory pathnames in search rules. In contrast, dynamic search rules do vary over time and are temporary in nature, so they cannot be precisely specified as hardcoded directory pathnames.
Instead, dynamic search rules are best represented by partially specified directory pathnames that contain placeholder strings. Placeholder strings are replaced at runtime with values appropriate for the current knowledge request, thereby forming a complete and valid directory pathname for use as a search rule.
For example, in Line 8, the placeholder string “$HOME” would be replaced at runtime by the home directory of the person (actually, of the login account) that was using the search rules. In Line 8 also, the placeholder string “_app_” would be replaced at runtime by the name of the application program under which requested knowledge was stored. In Line 10, the placeholder string “_COLL_” would be replaced at runtime by the name of the current mobile collection, if any, that was being processed by the application program.
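The runtime replacement of placeholder strings can be sketched as follows. The placeholder names (`$HOME`, `_app_`, `_COLL_`) follow the text; the template pathnames themselves are illustrative assumptions.

```python
def expand_rule(template, values):
    """Replace placeholder strings in a dynamic search rule template with
    values appropriate for the current knowledge request, yielding a
    complete and valid directory pathname."""
    rule = template
    for placeholder, value in values.items():
        rule = rule.replace(placeholder, value)
    return rule

# Expand a per-user, per-application dynamic rule at request time.
rule = expand_rule("$HOME/knowledge/_app_",
                   {"$HOME": "/home/alice", "_app_": "myeditor"})
```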
Dynamic search rules provide a powerful mechanism for adapting search rules, and thus the searched set of knowledge, to the particular computational situation that is in progress at search time.
Extending and Customizing Knowledge
Individual search rules can be used to extend knowledge, to customize knowledge, or to do a combination of both. Specifically, to extend knowledge, a search rule should make new knowledge files available. To customize knowledge, a search rule should make customized versions of existing knowledge files available. To both extend and customize at once, a search rule should make both new knowledge files and customized files available.
Using one search rule to both extend and customize knowledge may have the undesirable side effect of coupling the extended and customized knowledge to each other, which complicates overall system knowledge management. In such a case, neither the pure extension knowledge nor the pure customization knowledge can be used independently; both types must travel together, since they are coupled by virtue of residing in the same knowledge tree, referenced by one search rule. Having said this, the particular knowledge content provided by a search rule is determined by implementation policy, so mixed combinations of extension and customization knowledge may well be used together if implementation policy so dictates.
Overview of Search Rule Construction
This section describes the overall mechanism of constructing lists of search rules. The main goal of constructing search rule lists is to ensure that the final list provides access to the desired knowledge set, and that the order of search rules provides the desired knowledge precedence relationships.
Search rules for customizing knowledge should appear first in the search rule list, so that customized knowledge is found first. Afterwards, search rules that extend or provide the default knowledge set can follow in any suitable order determined by implementation policy.
This document discusses construction by starting with customization search rules, appending other rules, and finishing with rules for the default knowledge set. Readers skilled in the art will immediately appreciate that the list could also be constructed in reverse, starting with the default set, prepending extension rules, and finishing with customization rules. Both approaches are valid.
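Both construction orders described above can be sketched directly; the directory names are illustrative assumptions. The forward form appends, the reverse form prepends, and the two produce identical precedence.

```python
def build_search_rules(custom, extension, default):
    """Forward construction: customization rules first, then extension
    rules, then the default knowledge set (list order = precedence)."""
    return list(custom) + list(extension) + list(default)

forward = build_search_rules(["/ctx/debug"],
                             ["/mobile/coll1"],
                             ["/usr/share/app"])

# Equivalent reverse construction: start from the default set and prepend.
reverse = ["/usr/share/app"]
for rule in reversed(["/mobile/coll1"]):   # extension rules
    reverse.insert(0, rule)
for rule in reversed(["/ctx/debug"]):      # customization rules last, on top
    reverse.insert(0, rule)
```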
A useful general association can be made between search rules and knowledge types, assuming the use of preferred implementation policies that do not combine both knowledge extension and customization activities within individual search rules.
In particular, the following knowledge types are generally associated with search rules: customized, context, mobile, workspace, aggregated, and remote knowledge. Knowledge types are explained later in this document.
Search rules for customized and context knowledge are generally used to customize system knowledge. Search rules for mobile, workspace, aggregated, and remote knowledge are generally used to extend system knowledge.
The discussion will now introduce various types of knowledge.
The main idea of customized knowledge is to override existing knowledge values with customized values that are preferred by particular knowledge requests. Customized knowledge is important because it is one of the primary mechanisms for representing variant process knowledge.
For example, customized knowledge is useful for representing debugging knowledge, test case knowledge, new versions of knowledge, personal knowledge workspaces, and for representing the use of customized programs, data sets, preferences, and so on.
Search rules are the main technical mechanism for implementing customized knowledge.
The main idea of shared knowledge is to share common knowledge among multiple application programs that each require access to the shared knowledge. Shared knowledge reduces complexity and knowledge maintenance costs by sharing one copy of information, in comparison to the alternative, which is to maintain multiple copies of the same information.
The main technical mechanism for sharing knowledge is the use of a virtual application program name in the fundamental knowledge tree structure.
In practice, application programs look up shared knowledge by using the name of the shared information directory instead of using the name of the directory containing their own application data.
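The shared-directory lookup just described might be sketched as follows. The knowledge-tree layout (root, then application name, then virtual platform, then file) and the virtual shared name `shared` are assumptions for illustration.

```python
def knowledge_path(tree_root, app_name, platform, filename):
    """Build a pathname into the fundamental knowledge tree structure
    (hypothetical root/app/platform/file layout)."""
    return "/".join([tree_root, app_name, platform, filename])

# A program looks up its own knowledge under its own application name ...
own = knowledge_path("/kn", "myeditor", "p1", "options.tbl")

# ... but looks up shared knowledge under a virtual shared-directory name,
# so that one copy serves multiple application programs.
shared = knowledge_path("/kn", "shared", "p1", "colors.tbl")
```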
The main idea of context knowledge is to give names to particular sets of search rules, so that search rule lists can be constructed with the help of conveniently named blocks of search rules that represent particular ideas or computational situations.
For example, rules in a context named “debug” could point to directories containing useful debugging knowledge values. Rules in another context named “production” could point to directories containing knowledge values that were optimized for high-performance software production processes.
Once contexts have been defined, they can be combined into context stacks that mirror their desired position in a constructed search rule list.
When a final search rule list is constructed, all search rules in the “debug” context will appear in the final list before all search rules in the “mine” context, thereby implementing the desired search rule precedence specified in the original context stack expressions.
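The flattening of a context stack into a final search rule list can be sketched as below. The context names "debug" and "mine" come from the text; the rule pathnames are illustrative assumptions.

```python
# Named contexts: each maps to a block of search rules.
contexts = {
    "debug": ["/kn/ctx/debug/tools", "/kn/ctx/debug/data"],
    "mine":  ["/kn/home/alice"],
}

def expand_context_stack(stack, contexts):
    """Flatten a context stack into one search rule list; contexts named
    earlier in the stack receive higher precedence in the final list."""
    rules = []
    for name in stack:
        rules.extend(contexts[name])
    return rules

rules = expand_context_stack(["debug", "mine"], contexts)
```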
Platform Dependent Knowledge
The main idea of platform dependent knowledge is to vary knowledge by computing platform within an application knowledge set. This is useful because many knowledge values used in variant computational processes differ from one computing platform to another.
For example, the following things often vary by platform in commercial software environments: application programs, filenames and suffixes, command line options, data formats, build processes, link libraries used, products built, and so on.
The main technical means for supporting platform dependent knowledge is the use of virtual platforms within knowledge tree structures. A virtual platform is a symbolic name for a platform.
Platform dependent knowledge helps to reduce knowledge complexity and maintenance costs by sharing knowledge among virtual platforms within individual application programs. Note that sharing of knowledge among platforms within an application program is not the same as sharing arbitrary knowledge among multiple application programs, which was described earlier.
Knowledge can be shared at any appropriate virtual platform level within
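A platform dependent lookup with sharing at a generic virtual platform level might be sketched as follows. The virtual platform names (`win`, `unix`, `generic`) and the knowledge keys are illustrative assumptions.

```python
def platform_lookup(knowledge, app, platform, key):
    """Return the platform-specific value for `key` if one exists, else
    fall back to the value shared under a generic virtual platform."""
    app_tree = knowledge.get(app, {})
    for vp in (platform, "generic"):
        if key in app_tree.get(vp, {}):
            return app_tree[vp][key]
    return None

# Object-file suffixes vary by platform; source suffix is shared.
kn = {"compiler": {"win":     {"obj_suffix": ".obj"},
                   "unix":    {"obj_suffix": ".o"},
                   "generic": {"src_suffix": ".c"}}}
```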
There are two main ideas associated with mobile knowledge.
The first main idea of mobile knowledge is to encapsulate a knowledge tree structure within a collection, so that the encapsulated knowledge can then become as mobile as collections are, and can be transported around filesystems and networks for use at receiving locations.
The second main idea of mobile knowledge is to dynamically extend the amount of knowledge that is accessible by an application program at a receiving location. This goal is achieved by dynamically adding search rules to point to encapsulated knowledge trees within mobile collections.
The first idea, encapsulation, is achieved technically by placing the desired knowledge tree within the collection.
The second idea, adding dynamic search rules for mobile knowledge, is shown by
In a conceptual sense, mobile knowledge is intended to mimic the human idea of associating additional knowledge with particular physical locations. For example, additional reference knowledge would be available to humans in a library location, and additional commerce knowledge would be available to humans in a bank location. Similarly, additional mobile knowledge becomes available to application programs that work within a mobile knowledge collection subtree.
Mobile knowledge allows human programmers to change directories into special mobile knowledge collections to access the knowledge that is stored there. Thus by changing directories, humans can influence the knowledge that is accessible to them. For example, human programmers can conveniently use mobile collections to set up special test environments containing special test environment knowledge, and then can ship the test environment around to their friends or coworkers.
It is not always beneficial to have all known knowledge available to programs at all times, because knowledge sets can conflict with each other. Instead, it is often quite useful to separate and partition knowledge into distinct chunks, and to change directories into particular mobile knowledge collections to temporarily extend or customize existing knowledge in particular ways.
The main technical mechanism for accessing mobile knowledge is dynamic search rules such as shown in
Workspace knowledge is nested mobile knowledge. That is, workspace knowledge is knowledge that is contained in a series of nested collections within a computer filesystem. A workspace is defined as the series of nested collections that runs from the current working collection up toward the root directory of the filesystem. The top end of a workspace can terminate at the root directory of the filesystem, or at a directory level determined by implementation policy. For example, the implementation could specify that user home directories (e.g. “/home/user”) were the top limit, instead of the root directory of the filesystem.
The main idea of workspace knowledge is for application programs to have access to knowledge contained in ancestor collections of the current working collection. Each ancestor collection above the current collection may or may not contain mobile knowledge.
For example, suppose a first mobile knowledge collection contains knowledge that is required to process a second group of several other collections. How can knowledge in the first mobile knowledge collection be accessed while applications are processing the other collections in the second group?
Workspace knowledge provides a mechanism for doing so, as follows. The first mobile knowledge collection is placed in a filesystem directory, and then all collections in the second group are placed within the collection subtree of the first mobile knowledge collection. Thus all collections in the second group become child subtrees of the first parent collection. Subsequently, when application programs work within the subtree of one of the second group of collections, the first parent collection is an ancestor of the current working collection. Application programs can then search upward through ancestor directories to detect and utilize the mobile knowledge contained within those ancestor collections.
Thus workspace knowledge makes it possible for an application program working within a collection subtree to access additional mobile knowledge trees stored within ancestors of the current collection subtree location, by virtue of dynamic workspace knowledge search rules.
In addition, the workspace rule on Line 5 would be expanded (replicated) as many times as necessary to point to knowledge trees contained in ancestor collections. Thus a second copy of rule 5 would point to the mobile knowledge tree rooted at Line 7 in the ancestor collection. Expanded workspace rules are discussed in more detail later in this document.
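The upward directory search that expands workspace rules can be sketched as below. The knowledge-subtree name `kn`, the example paths, and the caller-supplied collection test are all illustrative assumptions.

```python
def workspace_rules(cwd, is_collection_root):
    """Walk from the current working directory up toward the filesystem
    root, emitting one expanded workspace rule per ancestor collection
    found. Nearer collections appear first (higher precedence)."""
    rules = []
    path = cwd
    while True:
        if is_collection_root(path):
            rules.append(path + "/kn")   # assumed knowledge-tree subdir name
        parent = path.rsplit("/", 1)[0]
        if parent == path or parent == "":
            break                        # reached the filesystem root
        path = parent
    return rules

# Nested mobile knowledge collections: c-inner lives inside c-outer.
roots = {"/ws/c-outer", "/ws/c-outer/c-inner"}
rules = workspace_rules("/ws/c-outer/c-inner/src", lambda p: p in roots)
```

A real implementation would stop at a policy-determined top directory (such as a user's home directory) rather than always searching to the filesystem root.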
Aggregated knowledge is non-nested mobile knowledge. That is, aggregated knowledge consists of named groups of non-nested mobile knowledge collections. The main idea of aggregated knowledge is to construct a named set of knowledge trees from multiple mobile knowledge collections.
In practice, aggregated knowledge can be explicitly referenced by an application program, or it can be dynamically picked up by an application program through an association with a physical filesystem location, as is done with workspace knowledge.
Aggregated knowledge is similar to workspace knowledge in almost all aspects except one. The primary difference between the two is that workspace knowledge organizes mobile knowledge collections by nesting them within each other, to support the idea of inheritance and automatic directory upsearches to locate ancestor knowledge collections.
In contrast, aggregated knowledge imposes no physical organizations on mobile knowledge collections. Instead, aggregated knowledge is represented by a “knowledge space” definition file comprised of a list of explicit references or explicit pathnames to mobile knowledge collections, wherever they are located. Thus aggregated knowledge can be implemented by collections located immediately beside each other, far away from each other, nested within each other, or by collections organized in some other manner.
When a final search rule list is constructed, all search rules in the “agk-p1” knowledge space will appear in the final list before all search rules in the “dept1” knowledge space, thereby implementing the desired knowledge precedence specified in the original aggregated knowledge space expressions.
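Reading a "knowledge space" definition file might be sketched as follows. The one-pathname-per-line format with `#` comments is purely an assumption; the text specifies only that the file contains explicit references or pathnames to mobile knowledge collections.

```python
def load_knowledge_space(definition_text):
    """Parse a knowledge space definition into an ordered list of
    pathnames to mobile knowledge collections, wherever they reside
    (assumed line-oriented format)."""
    return [line.strip() for line in definition_text.splitlines()
            if line.strip() and not line.strip().startswith("#")]

# Collections may be beside each other, far apart, or nested.
space = load_knowledge_space("""
# knowledge space agk-p1
/projects/c-tools
/archive/c-testdata
""")
```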
Remote knowledge is knowledge that is accessed over a network communication mechanism, rather than through a local computer filesystem. The two main ideas of remote knowledge are to centralize the administration of knowledge, and to increase the sharing of knowledge among remote people, sites, platforms, and application programs.
The main technical mechanisms for accessing remote knowledge are search rules containing remote lookup expressions, and a client-server knowledge delivery system. Remote lookup expressions contain four parts: (1) a remote knowledge treespace name, (2) a network server host name, (3) a knowledge tree name, and (4) a virtual platform name.
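Parsing a remote lookup expression into its four parts might look like the sketch below. The `::`-separated syntax and the example host name are illustrative assumptions; the text specifies only the four constituent parts.

```python
def parse_remote_rule(expression):
    """Split a remote lookup expression into its four parts: remote
    knowledge treespace name, network server host name, knowledge tree
    name, and virtual platform name (assumed '::' syntax)."""
    treespace, host, tree, platform = expression.split("::")
    return {"treespace": treespace, "host": host,
            "tree": tree, "platform": platform}

parts = parse_remote_rule("rks1::kserver.example.com::compiler::p1")
```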
Remote Aggregated Knowledge
Remote aggregated knowledge is similar to the local aggregated knowledge model described previously, except that remote aggregated knowledge is stored in remote mobile knowledge collections instead of local mobile knowledge collections.
The two main ideas of remote aggregated knowledge are to centralize the administration of knowledge and increase the sharing of knowledge among remote people, sites, platforms, and application programs.
Remote aggregated knowledge helps to reduce system complexity by centralizing both the administration and sharing of knowledge. When remote aggregated knowledge is used, fewer knowledge trees are required by the overall system, and they can all be maintained in one place.
Installable knowledge is knowledge that can be reversibly installed into, or removed from, a knowledge tree.
The main idea of installable knowledge is to partition large bodies of knowledge into smaller chunks of knowledge that can be more easily managed. In particular, installable knowledge is intended to support the convenient construction of large, custom sets of knowledge by a relatively simple process of additively installing a series of installable knowledge packages into one or more constructed knowledge trees.
The two main technical mechanisms of installable knowledge are named installable knowledge collections and automated command sequences for installing and removing installable knowledge.
Special directory names are not required. For example, user-defined installable knowledge directory names are possible within an installable knowledge collection. However, it is both practical and convenient to define a few standard installable knowledge directory names within an implementation, so that standard command sequences can be automatically used by the implementation to recognize and install or uninstall installable knowledge. If non-standard names are used, the implementation must be capable of automatically determining how to install knowledge contained in non-standard installable knowledge directories.
Installation of installable knowledge is typically done by copying subtrees into knowledge trees. Removal of installable knowledge is done by deleting the previously installed subtrees. Therefore those skilled in the art can appreciate that the technical mechanisms required for installing and removing installable knowledge are well known to the art and are simple to implement.
One important idea of installable knowledge is that installed knowledge retains an association with its original knowledge set even after installation, so that the installed knowledge can be easily uninstalled as a set. Maintaining such an association is very important. If no association is maintained after installation, the installed knowledge looks like any other knowledge in the post-installation knowledge tree, and cannot be easily identified for uninstallation or upgrading.
In preferred filesystem implementations of installable knowledge, the association is maintained as follows. All installable knowledge consists of two parts: an index file that lists all knowledge in the installable knowledge set, and a data directory that contains all knowledge within the installable knowledge set. Installation is accomplished simply by copying both the index file and the knowledge directory into a special installable knowledge directory within the destination knowledge tree. Uninstallation is accomplished simply by deleting the index file and knowledge directory from the special installable knowledge directory within the destination knowledge tree.
The general correspondence pattern between index filenames and data directory names in this example is “idx-xxx.tbl” and “z-xxx”, where “xxx” is chosen to describe the particular installable knowledge set. Particular matching patterns are determined by implementation policy. The particular matching pattern used here causes all index files to sort to the top of an alphabetic directory listing, thereby making it more convenient to list installable knowledge index names in small computer display windows without having to see matching data directories.
To install installable knowledge, index files and corresponding data directories are copied from the originating collection
It is advantageous, but not necessary, to dynamically calculate the index file from the contents of the knowledge data directory at the time of installation. Typically, a simple list of installable knowledge elements and their locations within the data directory is placed within the index file. This way, humans can make additions and deletions to the installable knowledge data files without worrying about index maintenance procedures or associated labor costs.
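Dynamically deriving the index from the data directory contents, using the "idx-xxx.tbl"/"z-xxx" naming pattern from the text, might be sketched as follows. The file list is passed in directly so the sketch stays self-contained; a real implementation would scan the directory.

```python
def build_index(data_dir_name, files):
    """Derive the index filename and index entries for an installable
    knowledge set from its data directory name and contents, following
    the 'idx-xxx.tbl' / 'z-xxx' correspondence pattern."""
    assert data_dir_name.startswith("z-"), "expected 'z-xxx' data directory"
    xxx = data_dir_name[2:]
    index_name = "idx-%s.tbl" % xxx
    # A simple sorted list of element locations within the data directory.
    entries = sorted(data_dir_name + "/" + f for f in files)
    return index_name, entries

name, entries = build_index("z-testenv", ["options.tbl", "scripts/run"])
```

Because the index is computed at installation time, humans can add or delete installable knowledge data files without performing any index maintenance by hand.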
Note that installable knowledge is always installed underneath a virtual platform directory (such as “pi”) which is itself stored underneath an application program name in the knowledge tree. This is required to support platform dependent installable knowledge within application knowledge sets.
To install installable knowledge in client-server systems such as shown in
To uninstall installable knowledge, index files and corresponding data subtrees are simply deleted from the special “d-i” directory in the destination knowledge tree. One way of identifying the files and directories to be uninstalled is to provide complete names as parameters to the uninstall operation. A second way is to provide appropriate “xxx” strings to the uninstall operation, which could calculate the actual index and data directory names according to the implementation patterns “idx-xxx.tbl” and “z-xxx”. A third way is to perform the uninstallation from within the original installable knowledge collection, where the original index and data directory names can be obtained by listing the contents of the original installable knowledge collection. In all cases, uninstalled knowledge must be flushed from all knowledge caches maintained by the implementation.
To uninstall installable knowledge in client-server systems such as shown in
To upgrade, old installable knowledge is uninstalled, and new installable knowledge is installed, thereby replacing old installable knowledge with new installable knowledge and accomplishing the intent of the upgrade.
To maintain associations between the original installable knowledge set and the installed installable knowledge in database implementations of installable knowledge, each newly installed database record or table must carry a unique installable knowledge identification value to support the removal of all installable knowledge that was in the original installable knowledge set.
In preferred client-server implementations, a Collection Knowledge System that is capable of installations and uninstallations over the network is called a Writeable Collection Knowledge System. Writeable Collection Knowledge Systems are very useful because they permit many clients to easily install and upgrade knowledge on a central server.
For example, a Writeable Collection Knowledge System would be useful to a community of people who wanted to share knowledge among themselves. They could install such knowledge onto a Collection Knowledge System shared by the community members.
Because Collection Knowledge Systems provide knowledge directly to application programs, human users of installed knowledge are generally not required to understand the details of the installed knowledge. Instead, it is more typical that human users simply instruct application programs to access installed knowledge, to thereby carry out whatever operations the installed knowledge supports. This approach is advantageous, because it reduces the knowledge burden on human users. That is, they can obtain the benefits of installed knowledge through application programs, without really having to know anything about the details of the installed knowledge.
As can be seen from the foregoing discussion, installable knowledge is both convenient to use and to manage. Importantly, installable knowledge enables the efficient and widespread sharing of knowledge among remote people, sites, and application programs.
Cached knowledge is knowledge that has previously been retrieved and stored for future use.
The main idea of cached knowledge is to provide increased performance on future lookups of knowledge that has been previously retrieved. Cached knowledge is held in application program memory using caching techniques that are well known to the art.
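A lookup cache of the kind described above might be sketched as follows. This is a hedged illustration; the class and method names are hypothetical and chosen only for clarity.

```python
class KnowledgeCache:
    """Minimal in-memory cache of previously retrieved knowledge,
    keyed by the lookup request itself (hypothetical sketch)."""

    def __init__(self):
        self._store = {}

    def get(self, request):
        """Return previously cached knowledge for a request, or None."""
        return self._store.get(request)

    def put(self, request, result):
        """Store retrieved knowledge for faster future lookups."""
        self._store[request] = result

    def flush(self):
        """Discard all cached knowledge (e.g. after an uninstall)."""
        self._store.clear()
```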
Collection Knowledge System
A CKS system has three major components.
One component is knowledge itself, which is stored in various knowledge structures including knowledge trees, mobile collections, and knowledge spaces.
A second component is a set of search rules, which are used to locate requested knowledge. The construction and use of knowledge search rules is perhaps the most critical part of understanding a CKS system.
A third component is CKS program software, which is responsible for building and using search rules to retrieve requested knowledge from knowledge stores on behalf of application programs.
In overall operation, an application program uses a CKS system to satisfy a knowledge request. In turn, the CKS system dynamically constructs a set of search rules for the particular incoming request, and uses the search rules to locate requested knowledge. Finally, the requested knowledge is returned to the caller.
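The three-step operation just described (construct search rules for the particular request, use them to locate knowledge, return the results) can be sketched as follows. The function and parameter names are hypothetical; the three stages are passed in as callables purely for illustration.

```python
def cks_satisfy(request, get_runtime_info, build_search_rules, perform_lookups):
    """Top-level CKS flow: gather runtime information, dynamically build a
    per-request set of search rules, then use the rules to locate knowledge
    and return it to the caller (hypothetical sketch)."""
    runtime = get_runtime_info()                      # environment, arguments
    rules = build_search_rules(runtime, request)      # per-request rule set
    return perform_lookups(runtime, request, rules)   # results for the caller
```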
The following discussion explains the overall architecture and operation of a Collection Knowledge System.
CKS-Enabled Application Architecture
CKS Manager Architecture
Module Get Runtime Info 131 obtains runtime information required to support the knowledge retrieval operation. Information so retrieved typically includes environment variable values and invocation control and data arguments.
Module Build Search Rules 140 dynamically constructs a particular set of search rules for each incoming knowledge request. Typical search rule lists might specify 10 to 20 knowledge tree locations that should be searched to find requested knowledge.
Module Perform Knowledge Lookups 150 performs lookups to satisfy the incoming knowledge request, using the search rules that were constructed by Build Search Rules 140.
Module Organize and Return Results 132 organizes retrieved knowledge into convenient forms for return to the calling module CKS Manager 130.
In operation, CKS Manager 130 proceeds according to the simplified algorithm shown in
First, Module Get Runtime Information 131 obtains the values of various environment variables, including the location of an initial set of search rules and various possible invocation control arguments and knowledge request values. Runtime information is returned to the calling module.
Next, Module Build Search Rules 140 uses runtime and incoming knowledge request information to dynamically construct a particular set of search rules to satisfy the current knowledge request. Search rules may consider various kinds of knowledge stores, including customized knowledge, shared knowledge, platform dependent knowledge, mobile knowledge, workspace knowledge, aggregated knowledge, remote knowledge, and installable knowledge. Previously cached knowledge is always implicitly considered, even though the cache location is not explicitly represented by constructed search rules.
Next, Module Perform Knowledge Lookups 150 uses runtime, knowledge request, and search rule information to perform the requested knowledge lookups. Multiple lookups may be requested within one CKS Manager 130 invocation. In general, Module Perform Knowledge Lookups 150 traverses the constructed search rules one by one, searching for knowledge. For requests that specify that the “first found” knowledge be returned, the knowledge lookup operation terminates immediately after the first knowledge match is found. In contrast, some requests specify that all available knowledge matches be returned. In those cases, the entire set of search rules is traversed to find all available knowledge matches.
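The "first found" versus "all matches" traversal behavior just described might be sketched as follows. The names are hypothetical; `lookup_one` stands in for a single-rule search such as those performed by Module Perform Knowledge Lookups 150.

```python
def perform_lookups(rules, lookup_one, want_all=False):
    """Traverse the constructed search rules one by one, searching for
    knowledge. Terminate after the first match for "first found" requests;
    traverse all rules when all matches are requested (hypothetical sketch)."""
    results = []
    for rule in rules:
        found = lookup_one(rule)      # search one rule location
        if found is not None:
            results.append(found)
            if not want_all:
                break                 # "first found": stop immediately
    return results
```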
Finally, Module Organize and Return Results 132 organizes and returns retrieved knowledge results to Module CKS Manager 130, for eventual use by the originating application program 120.
Build Search Rules
Module Get Runtime Search Rule Information 141 obtains search-specific runtime information, such as determining the existence of “context.root” and “akspace.root” anchor files in the ancestor directories above the invocation working directory. (Anchor files are explained below.) Modules Upsearch For Context Root 142 and Upsearch For Aggregated Knowledge Space Root 143 are subordinate modules that actually perform the respective upsearches.
Module Get Initial Context Rules 144 uses environment variable values provided by Get Runtime Information 131 to locate an initial set of search rules.
Module Add Customized Knowledge 145 uses the initial search rules obtained by Get Initial Context Rules 144 and one or more context names (search rule set names) to construct a second set of search rules. The second set of search rules is used to look up the requested knowledge. Specifically, Module Add Customized Knowledge uses the initial set of search rules to find a “context.tbl” context definition table as the first step in constructing the second set of search rules.
Recall that a context is a named set of search rules that specify an ordered sequence of places to look for requested knowledge. Supposing that customized knowledge for debugging was desired, Add Customized Knowledge 145 would look up the “debug” name in the context name table
Module Instantiate Mobile Knowledge 146 instantiates existing search rules for mobile knowledge by replacing placeholder strings such as “_COLL_” with a pathname to the current working collection. Recall that mobile knowledge is knowledge that is stored and accessible in a mobile collection.
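The placeholder replacement performed by Module Instantiate Mobile Knowledge 146 might be sketched as follows; the function name is hypothetical, and only the "_COLL_" placeholder string comes from the disclosure.

```python
def instantiate_mobile_rules(rules, collection_path):
    """Instantiate generic mobile knowledge search rules by replacing the
    "_COLL_" placeholder with the pathname of the current working collection;
    rules without the placeholder pass through unchanged (hypothetical sketch)."""
    return [rule.replace("_COLL_", collection_path) for rule in rules]
```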
Module Instantiate Workspace Knowledge 147 instantiates existing search rules for workspace knowledge by replacing placeholder strings such as “_WKSPC_” with a pathname to workspace mobile knowledge collections. Recall that workspace knowledge is knowledge that is stored in mobile knowledge collections that are hierarchical ancestors of the current working collection.
Module Add Aggregated Knowledge 148 appends additional search rules, for aggregated knowledge, to the growing second set of search rules. Recall that aggregated knowledge is defined as a named set of mobile knowledge collections, or “knowledge spaces.”
In operation, Module Build Search Rules 140 proceeds according to the simplified algorithm shown in
Build Search Rules 140 first calls Get Runtime Search Rule Information 141 to locate anchor files such as “context.root”
Anchor files represent the idea of associating particular named contexts and named aggregated knowledge spaces with particular physical locations in a computer filesystem. That way, application programs that are invoked in particular working directories can dynamically access context knowledge and aggregated knowledge that has been associated with those working directories. In effect, this approach mimics the human experience of associating work tools with particular physical spaces such as kitchens, workshops, or office spaces. That is, application programs can access additional knowledge when they execute within physical filesystem subtrees that have been associated with context or workspace knowledge by anchor files.
Get Runtime Search Rule Information 141 calls subordinate modules Upsearch For Context Root 142 and Upsearch For Knowledge Space Root 143 to actually perform the upsearches. Each subordinate module traverses filesystem directories upward from the current working directory, checking each successive ancestor directory for the existence of anchor files.
Anchor file upsearches can terminate for several reasons: for example, when an anchor file is found, when a particular ancestor directory is reached, or when the filesystem root directory is reached. Particular termination behaviors are determined by implementation policy. In preferred implementations, upsearches usually terminate at a specific directory (such as a home directory).
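The upsearch traversal and its termination conditions might be sketched as follows. This is a hypothetical illustration; the function name and signature are assumptions, and the anchor file names (such as "context.root") come from the disclosure.

```python
import os

def upsearch(start_dir, anchor_name, stop_dir=None):
    """Traverse filesystem directories upward from start_dir, checking each
    successive ancestor for an anchor file such as "context.root". Terminate
    when the anchor is found, when stop_dir (e.g. a home directory) is
    reached, or at the filesystem root (hypothetical sketch)."""
    d = os.path.abspath(start_dir)
    while True:
        if os.path.isfile(os.path.join(d, anchor_name)):
            return d                          # anchor found in this ancestor
        parent = os.path.dirname(d)
        if d == stop_dir or parent == d:
            return None                       # stop directory or root reached
        d = parent
```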
Operation—Initial Context Rules
Next, Module Get Initial Context Rules 144 is called to obtain an initial set of search rules. The initial set of search rules is used to locate the context name table and context definition files that are used to construct the second set of search rules. The initial set of search rules is typically defined by a runtime environment variable such as shown by
Next, Module Add Customized Knowledge 145 is called to begin the construction of a second set of search rules. The module must first identify a list of customized contexts (search rules for customized knowledge) that should be used. A list of such contexts is called a “context stack.”
For each context on the context stack, Add Customized Knowledge 145 finds a context definition file for the current context by looking up the context name in a context name table that is located using the initial search rules.
Using the first context name “debug” from the context stacks shown in
The context definition file name “debug.def” is looked up using the initial search rules
Continuing, since the “debug” context search rule
Module Add Customized Knowledge 145 repeats the process above for each context on the current context stack, appending search rules as contexts specify. In particular, the “mine” context is the second context on the context stack of
After all contexts on the explicit context stack have been added to the growing second list of search rules, search rules for the “default” context are automatically appended.
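The context stack processing just described, including the automatic appending of the "default" context, can be sketched as follows. The names are hypothetical; `load_context_rules` stands in for the lookup of a context definition file via the context name table.

```python
def build_second_rules(context_stack, load_context_rules):
    """Construct the second set of search rules: append the rules for each
    named context on the explicit context stack in order, then automatically
    append the rules for the "default" context (hypothetical sketch).
    load_context_rules is assumed to return a list of rules for a name."""
    rules = []
    for name in context_stack + ["default"]:
        rules.extend(load_context_rules(name))
    return rules
```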
Note that the default context search rules both extend and customize knowledge. Knowledge is extended by the use of mobile, workspace, and aggregated knowledge. Knowledge is customized by the use of the customized knowledge search rules in Lines 17-18. Vendor knowledge in Line 19 is usually considered to be the most default and uncustomized knowledge set, since it is usually the first set of knowledge installed in a collection knowledge system.
As discussed before, mixing customization with extension in search rules may conflict with particular implementation policy goals. However, as long as there is no need to use the customized knowledge rules in Lines 17-18 to customize the extended knowledge provided by Lines 4-14, this set of search rules would cause no precedence problems.
Note that the search rules for mobile and workspace knowledge are generic, and contain various placeholder strings that are explained below. Placeholder strings in these rules are instantiated with specific values for the current knowledge request before the rules are used by Module Perform Knowledge Lookups 150.
The discussion now continues with operational explanations of mobile, workspace, and aggregated knowledge search rules.
Next, Module Instantiate Mobile Knowledge 146 is called to continue the construction of a second set of search rules.
In contrast to customized knowledge search rules, which are appended to the second set of search rules according to named contexts on a context stack, mobile knowledge search rules are not appended in that way. Instead, mobile knowledge search rules enter the second set of search rules as part of a context search rule set. Once in place, placeholder strings in mobile knowledge search rules are replaced at runtime with values appropriate to the current lookup situation. Module Instantiate Mobile Knowledge 146 performs the placeholder string replacements.
In particular, the search rules shown in
Lines 5-8 show four search rules for mobile knowledge, and illustrate the use of platform dependent search rules implemented by four virtual platforms. Other rules in
From the foregoing it should be clear that Module Instantiate Mobile Knowledge 146 does not add new mobile knowledge search rules to the second set of search rules. Instead, it only instantiates generic mobile knowledge rules that were added by the “default” context rules. Instantiation is done using specific replacement values for the current knowledge request.
Next, Module Instantiate Workspace Knowledge 147 is called to continue the construction of a second set of search rules.
Like mobile knowledge search rules, workspace rules must be added to the second set of search rules as part of a context search rule set. However, workspace search rules can be expanded (that is, replicated and instantiated) in situ, in linear proportion to the number of ancestor collections discovered by upsearch operations. Only one set of platform dependent generic rules need be added to the second set of rules, such as shown by the workspace knowledge rule in
Placeholder strings in expanded workspace knowledge search rules are replaced at runtime with values appropriate to the current situation. Module Instantiate Workspace Knowledge 147 performs the placeholder string replacements.
For example, suppose that a workspace upsearch operation discovered three ancestor collections. Then the set of search rules in
From the foregoing it should be clear that Module Instantiate Workspace Knowledge 147 does not truly add new workspace knowledge search rules to the second set of search rules. Instead, it only replicates and instantiates generic search rules that were added by the “default” context rules. Generic workspace search rules are expanded during instantiation in linear proportion to the number of ancestor collections discovered in the workspace.
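The replication and instantiation of generic workspace rules, in linear proportion to the number of discovered ancestor collections, might be sketched as follows. The function name is hypothetical; only the "_WKSPC_" placeholder string comes from the disclosure.

```python
def expand_workspace_rules(generic_rules, ancestor_collections):
    """Replicate each generic workspace search rule once per ancestor
    collection discovered by the workspace upsearch, replacing the "_WKSPC_"
    placeholder with each collection's pathname (hypothetical sketch)."""
    expanded = []
    for ancestor in ancestor_collections:
        for rule in generic_rules:
            expanded.append(rule.replace("_WKSPC_", ancestor))
    return expanded
```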
Next, Module Instantiate Aggregated Knowledge 148 is called to continue the construction of the second set of search rules by adding search rules for spatial knowledge. Aggregated knowledge anchor files are found and used in essentially the same way as are context anchor files. Aggregated knowledge search rules are replicated and instantiated in essentially the same way as are workspace search rules.
Module Instantiate Aggregated Knowledge 148 must first identify a list of aggregated knowledge spaces that should be instantiated. A list of such knowledge spaces is called an “aggregated knowledge space stack” or more simply, a “knowledge stack” or “akspace stack.”
For each knowledge space on the knowledge stack, Instantiate Aggregated Knowledge 148 finds a knowledge space definition file for the current aggregated knowledge space by looking up the knowledge space name in an aggregated knowledge space name table that is located using the initial search rules. For example, consider the first knowledge space “agk-p1” named in
Module Instantiate Aggregated Knowledge 148 would first locate a knowledge space name table
From the foregoing it should be clear that Module Instantiate Aggregated Knowledge 148 does not truly add new aggregated knowledge search rules to the second set of search rules. Instead, it only replicates and instantiates generic aggregated knowledge rules that were originally added to the search rules as part of search rules for the “default” context.
This completes discussion of how the second set of search rules is constructed. The next section explains how the second set of search rules is used to locate requested knowledge.
Perform Knowledge Lookups
Module Do Local Lookups 151 performs lookups using local knowledge stores to satisfy knowledge requests. In a preferred filesystem implementation, local knowledge stores consist of various knowledge trees stored on a local computer disk.
Local Cache Manager 152 caches the results of local lookups to improve performance on successive lookups that reference the same local knowledge files.
Module Do Remote Lookup 153 performs lookups using remote knowledge stores to satisfy knowledge requests. In a preferred filesystem implementation, remote knowledge stores consist of various knowledge trees stored on remote computer disks. Remote knowledge is accessed using a Module CKS System Client 160 that is part of a client-server software architecture
Remote Cache Manager 154 caches the results of remote lookups to improve performance on successive lookups that reference the same remote knowledge files.
In operation, Module Perform Knowledge Lookups 150 proceeds according to the simplified algorithm shown in
Module Perform Knowledge Lookups 150 begins by traversing each search rule in the second set of search rules previously constructed by Build Search Rules 140.
Continuing, Module Perform Knowledge Lookups 150 performs a knowledge lookup for each rule in the second set of search rules until the original knowledge request is satisfied. Some lookups require that only the first instance of the desired knowledge be found and returned, whereas other lookups require that all instances of the desired knowledge be found and returned. Therefore the termination criterion for each lookup is particular to that lookup.
Module Do Local Lookup 151 is called if the current search rule is a search rule for local knowledge. Module Do Local Lookup 151 immediately calls Local Cache Manager 152 to see if the requested knowledge is resident in the cache. If so, the cached knowledge is returned. If not, Module Do Local Lookup 151 performs a lookup search by constructing a complete pathname to a potential knowledge file by prepending the search rule directory to the desired knowledge filename that was provided as a knowledge request parameter.
Module Do Local Lookup 151 checks to see if the desired knowledge file exists in the current search directory. If the local search succeeds, knowledge results are returned and added to a lookup cache by Local Cache Manager 152. The lookup operation terminates if the knowledge request is satisfied. If the knowledge request still requires further searching, other searches are made using subsequent search rules in the second set of search rules, until the lookup operation terminates.
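The local lookup just described, including the cache check and the pathname construction by prepending the search rule directory, might be sketched as follows. The function name and the dictionary cache are hypothetical simplifications of Module Do Local Lookup 151 and Local Cache Manager 152.

```python
import os

def do_local_lookup(rule_dir, filename, cache):
    """Check the lookup cache first; otherwise construct a complete pathname
    by prepending the search rule directory to the requested knowledge
    filename, and test whether that knowledge file exists. Successful results
    are added to the cache (hypothetical sketch)."""
    key = (rule_dir, filename)
    hit = cache.get(key)
    if hit is not None:
        return hit                       # cached knowledge returned directly
    path = os.path.join(rule_dir, filename)
    if os.path.isfile(path):
        cache[key] = path                # cache the successful lookup
        return path
    return None                          # not found under this search rule
```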
Module Do Remote Lookup 153 is called if the current search rule is a search rule for remote knowledge. Module Do Remote Lookup 153 immediately calls Remote Cache Manager 154 to see if the requested knowledge is resident in the cache. If so, the cached knowledge is returned to the caller. If not, Module Do Remote Lookup 153 calls Module CKS System Client 160 to oversee the remote lookup.
Parameters from the current remote search rule are passed to CKS System Client 160 to specify the remote search. If the remote search succeeds, knowledge results are returned and added to a lookup cache by Remote Cache Manager 154. The remote lookup terminates if the knowledge request is satisfied. If the knowledge request still requires further searching, other searches are made using subsequent search rules in the second set of search rules, until the lookup operation terminates.
When Module CKS System Client 160 is called to resolve a remote lookup request, the client software passes the lookup request to Module CKS System Server 170. The server software is responsible for performing the physical lookup using knowledge stores accessible to the server.
Lookup algorithms on the server side mirror lookup algorithms for the local side, which have been described above. In particular, Module CKS System Server 170 functions almost exactly as does Module CKS Manager 130 as far as lookups are concerned, with the main difference being that the server can only perform “local” server lookups on local server knowledge stores. (Note that “local” on the server side means local to the server; what is “local” to the server is “remote” from the client perspective.)
Module CKS System Server 170 uses the same algorithms as were described above to obtain initial search rules, to build a second set of search rules, and to perform the requested lookups. In essence, it is fair to say that, if client-server network mechanisms are ignored, Module CKS Manager 130 on the local side effectively calls a peer Module CKS Manager 130 on the remote side to perform remote lookups. Those skilled in the art will immediately recognize the similarity in architectures, algorithms, and knowledge stores, as well as the strong opportunity for software reuse that is identified here.
On the server side of the network connection, parameters from the current remote search rule in the second set of search rules on the local side are used to construct physical search rules for the second set of search rules on the server side. Two kinds of remote search rules are possible: remote customized knowledge, and remote aggregated knowledge.
Lookups—Remote Customized Knowledge
Remote customized knowledge is slightly easier to explain, and so will be considered first here.
In operation, the server build search rules module locates a knowledge treespace name table such as
Lookups—Remote Aggregated Knowledge
Remote aggregated knowledge requires one extra lookup.
In operation, the server locates an aggregated knowledge namespace name table such as
This completes discussion of the main architecture and algorithms of CKS System 122 and Module CKS Manager 130.
Knowledge Lookup Functions
The algorithms of all functions shown in
As can be seen from the foregoing discussion, a collection knowledge system manages several kinds of useful knowledge. In particular, the use of context knowledge provides a flexible, extensible means for implementing site knowledge policies. Thus collection knowledge systems provide a practical and useful service to application programs that use knowledge lookups to process collections in automated, scalable ways that were not previously possible.
The present Collection Knowledge System invention provides practical solutions to ten important knowledge delivery problems faced by builders of automated collection processing systems. The problems are: (1) the knowledge organization problem, (2) the customized knowledge problem, (3) the platform dependent knowledge problem, (4) the coupled-application knowledge problem, (5) the shared knowledge problem, (6) the scalable knowledge delivery problem, (7) the mobile knowledge problem, (8) the workspace knowledge problem, (9) the aggregated knowledge problem, and (10) the installable knowledge problem.
As can be seen from the foregoing disclosure, the present collection knowledge system invention provides application programs with a very practical means for obtaining precise answers to knowledge lookup requests in an automated, customizable, and scalable way that was not previously available.
Although the foregoing descriptions are specific, they should be considered as sample embodiments of the invention, and not as limitations. Those skilled in the art will understand that many other possible ramifications can be imagined without departing from the spirit and scope of the present invention.
General Software Ramifications
The foregoing disclosure has recited particular combinations of program architecture, data structures, and algorithms to describe preferred embodiments. However, those of ordinary skill in the software art can appreciate that many other equivalent software embodiments are possible within the teachings of the present invention.
As one example, data structures have been described here as coherent single data structures for convenience of presentation. But information could also be spread across a different set of coherent data structures, or could be split into a plurality of smaller data structures for implementation convenience, without loss of purpose or functionality.
As a second example, particular software architectures have been presented here to more strongly associate primary algorithmic functions with primary modules in the software architectures. However, because software is so flexible, many different associations of algorithmic functionality and module architecture are also possible, without loss of purpose or technical capability. At the under-modularized extreme, all algorithmic functionality could be contained in one software module. At the over-modularized extreme, each tiny algorithmic function could be contained in a separate software module.
As a third example, particular simplified algorithms have been presented here to generally describe the primary algorithmic functions and operations of the invention. However, those skilled in the software art know that other equivalent algorithms are also easily possible. For example, if independent data items are being processed, the algorithmic order of nested loops can be changed, the order of functionally treating items can be changed, and so on.
Those skilled in the software art can appreciate that architectural, algorithmic, and resource tradeoffs are ubiquitous in the software art, and are typically resolved by particular implementation choices made for particular reasons that are important for each implementation at the time of its construction. The architectures, algorithms, and data structures presented above comprise one such conceptual implementation, which was chosen to emphasize conceptual clarity.
From the above, it can be seen that there are many possible equivalent implementations of almost any software architecture or algorithm, regardless of most implementation differences that might exist. Thus when considering algorithmic and functional equivalence, the essential inputs, outputs, associations, and applications of information that truly characterize an algorithm should also be considered. These characteristics are much more fundamental to a software invention than are flexible architectures, simplified algorithms, or particular organizations of data structures.
A collection knowledge system can be used in various practical applications.
One possible application is to improve the productivity of human computer programmers, by providing them with a way to store and access knowledge for variant computational processes.
Another application is to share application program knowledge among a community of application program users, thereby allowing them to more easily advance the productivity of the community by virtue of shared knowledge stored in a collection knowledge system.
Another application is to centralize the administration of knowledge within a community of users, thereby shifting the burden of understanding and maintaining the knowledge from the many to the few.
Another application is to manage knowledge customization preferences for a community of people, in a way that permits people to share the customization preferences of other people. Thus new members of the community can receive both application program training and software environment preferences from the same coworkers.
Another application is to manage and distribute reusable software fragments to software reuse application programs, thereby promoting software reuse and reducing software development costs for application programs, websites, and other software projects.
Another application is to provide knowledge delivery services to collection makefile generator programs, which could use a collection knowledge system to obtain the precise, scalable, and customized knowledge that is required for generating makefiles for variant computational processes.
Another application is to provide IDE integrated development environments with a collection knowledge system that could provide customized knowledge to the integrated development system.
Another application is to provide knowledge to a network knowledge delivery server that is used by large numbers of application programs and users to obtain centralized policy settings for the overall network environment.
One possible functional enhancement is to modify a CKS system to automatically share knowledge with other CKS systems, thereby creating a robust, distributed network of knowledge delivery servers on a network, where servers could replicate and share knowledge among themselves.
A four-level hierarchical organization of knowledge was presented here, comprised of the following levels: knowledge tree roots, contexts, applications, and virtual platforms. Special installable knowledge directories were also described.
However, other hierarchical organizations are also possible, with different orderings of the same levels, or with either more or fewer levels in the tree.
One possible alternate organization of the same number of levels would be to place the virtual platform level above the application level, thereby grouping all platform dependent information for a platform together. This organization is convenient for extending existing knowledge to new platforms, because it makes it easier to find all the platform dependent knowledge within a knowledge tree. However, it does so at the cost of distributing application knowledge among several platform dependent subtrees, rather than keeping all application knowledge under one application subtree.
Another simpler organization is to remove the context level. This would make the system considerably less flexible and less powerful, but would result in considerably less complexity. For example, context stacks would not be required, and a second set of constructed search rules would not be required.
Other simpler organizations could be created by removing mobile knowledge, workspace knowledge, aggregated knowledge, installable knowledge, or virtual platform mechanisms. Each removal would reduce complexity, at a tradeoff cost of reduced power and flexibility.
The foregoing discussion was based on a preferred implementation that used simple text files to store knowledge. Using text files to represent knowledge has many advantages, including simplicity, ease of use, and flexibility. However, other representations for knowledge are also possible.
As one example, databases could be used to store knowledge. In particular, databases could be a useful representation for large quantities of well-structured knowledge. In such cases, CKS systems could advantageously provide database support underneath a CKS API (Application Programming Interface) that was offered to application programs. As always, tradeoffs exist. Search rule formats would have to be altered to contain database search parameters instead of knowledge tree pathnames. Performance would likely be slower for large database systems than for smaller filesystem implementations. Particular choices of knowledge representations are determined by implementation policy.
As another example, SGML or XML representations could be used to store knowledge. This approach would offer more formal structuring and labelling of knowledge, by virtue of the markup tags used in SGML and XML. In addition, formal software parsers and GUI programs for manipulating XML knowledge files could be constructed. This would make it more convenient for at least some humans to work with knowledge files using tools specifically designed for the purpose, instead of using simple text editors on text files. One disadvantage of using XML or some other markup language is that it would complicate the look and content of knowledge files for humans, potentially increasing knowledge maintenance costs. Particular choices of knowledge representations are determined by implementation policy.
Initial Search Rules
The foregoing discussion described the use of an environment variable to point at an initial set of search rules. However, other methods are also possible.
For example, in some cases environment variables are not convenient or not available, such as in computational processes spawned by Unix cron (automated scheduler) programs. Cron invokes programs with a very minimal set of environment variables, none of which points to an initial set of search rules.
One possible solution to this problem is to place a special file somewhere in the executable PATH that is specified by the default cron environment variables. That way, programs that need to find an initial set of search rules can search the directories listed in their PATH environment variables for the name of the special file. The special file would contain a pathname pointer to an initial set of search rules, corresponding to the value side of the environment variable assignment shown in
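The PATH-based bootstrap described above can be sketched as follows. This is a sketch under stated assumptions: the special filename `cks-rules-pointer` is a hypothetical choice, not a name given in the specification.

```python
import os

# Hypothetical name of the special file placed in a PATH directory;
# its contents are a pathname pointing at an initial set of
# search rules.
SPECIAL_FILE = "cks-rules-pointer"

def find_initial_rules_pointer(path_env):
    """Search each directory listed in a PATH-style string for the
    special file, and return the search-rules pathname it contains."""
    for directory in path_env.split(os.pathsep):
        candidate = os.path.join(directory, SPECIAL_FILE)
        if os.path.isfile(candidate):
            with open(candidate) as f:
                return f.read().strip()
    return None
```

Because cron's default environment does include a PATH, a program started by cron can bootstrap its search rules this way without any CKS-specific environment variable.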
The foregoing discussion described a particular preferred implementation of search rules. However, other variations are possible.
One possible variation is to use mobile, workspace, aggregated, and installable knowledge in the initial set of search rules. Although this technique was not shown for reasons of simplicity, it can provide more search rule power at the cost of some complexity.
Another possible variation is to deliberately mix extension and customization knowledge within the same knowledge tree. This approach can reduce knowledge maintenance costs by reducing the number of required knowledge trees, at the tradeoff cost of coupling the two kinds of knowledge.
Another possible variation is to use non-linear search rule representations, such as a recursive tree traversal mechanism based on a tree of search rules. This approach would allow for “context subtrees” rather than linear sets of search rules in a context, and would allow for greater flexibility in creating complex search rule orders.
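The context-subtree idea can be sketched as a depth-first flattening of a rule tree into an ordered rule list. The node layout used here (a rule, or None for a grouping node, plus a list of child nodes) is an assumption for illustration.

```python
def flatten_rule_tree(node):
    """Depth-first traversal of a search rule tree, yielding the
    ordered linear rule list that a lookup would actually consult.

    node is a (rule_or_None, children) pair; None marks a pure
    grouping node that contributes no rule of its own."""
    rule, children = node
    rules = [rule] if rule is not None else []
    for child in children:
        rules.extend(flatten_rule_tree(child))
    return rules
```

A context subtree can then be grafted into any position of a larger rule tree, giving the more flexible search orders described above while still reducing to a linear rule list at lookup time.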
Another possible variation is to dynamically vary the composition of the search rule set in response to the particular knowledge that is the target of the lookup. For example, different sets of rules could be constructed at runtime depending on particular parameters of the lookup, such as application name, or desired filename, or key-value pair. Another example could use conditional runtime selection of rules in the constructed set, depending on the lookup parameters. Another example is to use more or fewer levels in the virtual platform hierarchy.
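Runtime construction of a rule set conditioned on lookup parameters can be sketched as follows. The tree names and predicates here are illustrative assumptions, not rules defined by the specification; the sketch shows both techniques named above, inserting an application-specific tree and conditionally selecting an extra tree by filename.

```python
def build_rules(app_name, filename):
    """Construct a search rule set at runtime from lookup parameters."""
    rules = ["user-tree", "site-tree"]  # hypothetical baseline rule set
    if app_name:
        # Application-specific knowledge tree searched first.
        rules.insert(0, "app-%s-tree" % app_name)
    if filename and filename.endswith(".mk"):
        # Conditional rule: consult a makefile knowledge tree only
        # when the lookup target is a makefile fragment.
        rules.append("makefile-tree")
    return rules
```

The same mechanism could also vary the number of virtual platform levels consulted, depending on the lookup parameters.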
The preferred embodiment described above uses a four-level virtual platform hierarchy to organize makefile fragment information into specific, generic, family, and platform independent operating system categories. However, other organizational schemes are also possible.
One example is to use a linear virtual platform structure that aggregates related information into fewer abstraction levels, perhaps by removing the generic level. Another example is to use a hierarchical structure organized by company, department, team, and individual.
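The four-level hierarchy described above can be sketched as an ordered lookup of makefile fragments, from most specific to platform independent. The concrete platform strings and the `pi` abbreviation for the platform-independent level are assumptions made for illustration.

```python
# Hypothetical mapping of platforms onto the four virtual platform
# levels: specific OS, generic OS, OS family, platform independent.
PLATFORM_LEVELS = {
    "linux": ("linux-gcc", "linux", "unix", "pi"),
    "solaris": ("solaris-cc", "solaris", "unix", "pi"),
}

def makefile_fragment_order(platform):
    """Return makefile fragment names in search order, most
    specific first and platform independent last."""
    specific, generic, family, independent = PLATFORM_LEVELS[platform]
    return ["%s.mk" % level for level in (specific, generic, family, independent)]
```

Removing the generic level, as the linear variation suggests, would simply shorten this list to three entries per platform.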
As can be seen by one of ordinary skill in the art, many other ramifications are also possible within the teachings of this invention.
The full scope of the present invention should be determined by the accompanying claims and their legal equivalents, rather than from the examples given in the specification.
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US7594225 *||Dec 20, 2004||Sep 22, 2009||Microsoft Corporation||Integrated setup for generating customization XML|
|US7685591||Dec 20, 2004||Mar 23, 2010||Microsoft Corporation||Customizing a software application through a patch file|
|US8145641 *||Jan 18, 2008||Mar 27, 2012||Oracle International Corporation||Managing feature data based on spatial collections|
|US20060136872 *||Dec 20, 2004||Jun 22, 2006||Microsoft Corporation||Integrated setup for generating customization XML|
|US20090187587 *||Jul 23, 2009||Oracle International Corporation||Managing feature data based on spatial collections|
|U.S. Classification||1/1, 707/999.003|
|International Classification||G06F9/445, G06N5/02, G06F12/00, G06F9/44, G06F17/30, G06F7/00|
|Cooperative Classification||G06F8/20, G06N5/022|
|European Classification||G06F8/20, G06N5/02K|
|Feb 14, 2008||AS||Assignment|
Owner name: COVERITY, INC., CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CODEFAST, INC.;REEL/FRAME:020507/0400
Effective date: 20071206
|Dec 2, 2014||AS||Assignment|
Owner name: SYNOPSIS, INC, CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:COVERITY, INC;REEL/FRAME:034514/0496
Effective date: 20141027
|Mar 3, 2015||AS||Assignment|
Owner name: SYNOPSYS, INC, CALIFORNIA
Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE NAME PREVIOUSLY RECORDED AT REEL: 034514 FRAME: 0496.ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNOR:COVERITY, INC;REEL/FRAME:035121/0570
Effective date: 20141027