US 20050114801 A1
Complex multidimensional datasets generated by digital imaging spectroscopy can be organized and analyzed by applying software and computer-based methods comprising sorting algorithms. Applying combinations of these algorithms to images and graphical data allows pixels or features to be rapidly and efficiently classified into meaningful groups according to defined criteria. Multiple rounds of pixel or feature selection may be performed based on independent sorting criteria. In one embodiment, sorting by spectral criteria (e.g., intensity at a given wavelength) is combined with sorting by temporal criteria (e.g., absorbance at a given time) to identify microcolonies of recombinant organisms harboring mutated genes encoding enzymes having desirable kinetic attributes and substrate specificity. Restricting the set of pixels analyzed in a subsequent sort based on criteria applied in an earlier sort (“sort and lock” analyses) minimizes the computational and storage resources required. User-defined criteria can also be incorporated into the sorting process by means of a graphical user interface that comprises visualization tools, including a contour plot, a sorting bar and a grouping bar, an image window, and a plot window, that allow run-time interactive identification of pixels or features meeting one or more criteria, and display of their associated spectral or kinetic data. These methods are useful for extracting information from imaging data in applications ranging from biology and medicine to remote sensing.
23. A graphical user interface for display and analysis of digital image data comprising:
(a) a reference window for displaying a reference image comprising pixels;
(b) a contour plot window for indicating pixel location along a first dimension, indicating a non-positional variable along a second dimension, and indicating pixel intensity by a variable signal appearing along the second dimension, said contour plot window further comprising (i) a grouping bar for grouping together pixels for analysis; and (ii) a selection bar for selecting pixels that are thereby indicated in the reference window and plotted in the plot window;
(c) a plot window for displaying a plot of pixel intensity as a function of the non-positional variable.
24. The graphical user interface of
25. The graphical user interface of
26. The graphical user interface of
27. The graphical user interface of
28. The graphical user interface of
This application is a divisional of co-pending U.S. patent application Ser. No. 09/767,595, filed Jan. 22, 2001, now U.S. Pat. No. ______, which claims the benefit of U.S. Provisional Application No. 60/177,575, filed Jan. 22, 2000 and U.S. Provisional Application No. 60/186,034, filed Mar. 1, 2000, the entire disclosures of which are hereby incorporated by reference in their entirety.
The U.S. Government has certain rights in this invention pursuant to Grant No. R44GM5555470 awarded by the National Institutes of Health.
The current invention relates generally to the visualization and processing of multidimensional data, and in particular, to data formed from a series of images.
Sophisticated analysis of imaging data requires software that can rapidly identify meaningful regions of the image. Depending on the size and number of regions, this process may require evaluating very large datasets, and thus efficient sorting of the data is essential for finding the desirable elements. In the present invention, regions of interest (ROIs) in previous feature-based imaging spectroscopy are extended to include pixel-based analyses. This requires new algorithms, since the size of a pixel-based analysis can be more than 1000 times larger than that of a feature-based analysis. In addition to requiring a burdensome amount of processing time, prior art sorting algorithms that may have been adequate to categorize and classify relatively noiseless feature data are not necessarily successful in sorting single-pixel spectra without additional parameters or human intervention.
In cases in which human intervention is advantageous, the present invention includes a means for combining machine and human intelligence to enhance image analysis. For example, the present invention provides a method for combining sorting by spectral criteria (e.g., intensity at a given wavelength) and sorting by temporal criteria (e.g., absorbance at a given time). Sorting enables the user to classify large amounts of data into meaningful and manageable groups according to defined criteria. The present invention also allows for multiple rounds of pixel or feature selection based on independent sorting criteria. Methods are presented for extracting useful information by combining the analyses of multiple datasets and datatypes (e.g., absorbance, fluorescence, or time), such as those obtained using the instruments and methods disclosed in U.S. Pat. Nos. 5,859,700 and 5,914,245, and in U.S. patent application Ser. No. 09/092,316.
The methods described herein are useful for a number of applications in biology, chemistry and medicine. Biomedical applications include high-throughput screening (e.g., pharmaceutical screening) and medical imaging and diagnostics (e.g., oximetry or retinal examination). Biological targets include live or dead biological cells (e.g., bacterial colonies or tissue samples), as well as cell extracts, DNA or protein samples, and the like. Sample formats for presenting the targets include microplates and other miniaturized assay plates, membranes, electrophoresis gels, microarrays, macroarrays, capillaries, beads and particles, gel microdroplets, microfluidic chips and other microchips, and compact discs. More generally, the methods of the present invention can be used for analysis of polymers, optical materials, electronic components, thin films, coatings, combinatorial chemical libraries, paper, food, packaging, textiles, water quality, mineralogy, printing and lithography, artwork, documents, remote sensing data, computer graphics and databases, or any other endeavor or field of study that generates multidimensional data.
The present invention provides methods, systems and computer programs for analyzing and visualizing multidimensional data. Typically, the first two dimensions are spatial and the third dimension is either spectral or temporal. (Although the term spectra or kinetics may be used herein, the methods described are of general applicability to both forms of vector data.) The invention includes a graphical user interface and method that allows for the analyses of multiple data types. For example, datastacks of fluorescence emission intensity, absorbance, reflectance and kinetics (changes in signal over time) can be analyzed either independently or on the same sample for the same field of view. Fluorescence measurements involving fluorescence resonance energy transfer (FRET) can also be analyzed. A key feature of the present invention is that data analysis can be performed in series. Thus, for example, the results of sorting pixels or features within one image stack can be applied to subsequent sorts within image stacks. The present invention also includes methods to prefilter data. Thus, for example, pixel-based analysis can be performed, wherein features are selected based on particular criteria and a subsequent sort is restricted to pixels that lie within the selected features. These sorting methods are guided by the heuristics of parameters input by the user. This is especially beneficial when expert knowledge is available. Thus, for example, the user can select a particular spectrum with desirable characteristics (a target spectrum) from a spectral stack, and the program will automatically classify all of the spectra obtained from the image stack by comparing each of the unclassified spectra to the target spectrum, calculating a distance measure, and sorting the spectra based on their distance measure. The classified (sorted) spectra are then displayed in the contour plot window or other plot windows.
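The target-spectrum classification described above can be sketched as follows. This is a minimal illustration only: the text does not name a particular distance measure, so Euclidean distance is assumed here, and the function name `sort_by_target` and the sample spectra are hypothetical.

```python
import math


def sort_by_target(spectra, target):
    """Return row indices ordered by increasing distance to the target spectrum.

    spectra: list of equal-length lists, one spectrum per pixel or feature.
    target:  the user-selected target spectrum.
    Euclidean distance is an assumption made for this sketch.
    """
    def distance(spectrum):
        return math.sqrt(sum((s - t) ** 2 for s, t in zip(spectrum, target)))

    return sorted(range(len(spectra)), key=lambda i: distance(spectra[i]))


# Hypothetical three-point spectra for four pixels.
spectra = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.2, 0.9, 0.1],
    [0.0, 0.0, 1.0],
]
# Pixel 1 is used as the target, so it sorts first, followed by its nearest neighbor.
order = sort_by_target(spectra, target=spectra[1])
```

The returned index list gives the top-to-bottom row order that a sorted contour plot would use.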
Sorting can also be used for sequentially analyzing images and graphical data, such that the pixels that are ultimately displayed are restricted by at least two independent criteria. For example, pixels or features that have been extracted based on selected spectral criteria (e.g., absorbance) can be further analyzed based on temporal criteria (e.g., kinetics). This method of combined analysis provides a means for rapidly and efficiently extracting useful information from massive amounts of data. A further embodiment of sequential sorting involves discarding unwanted data during the sorting process. This ‘sort and lock’ procedure provides a useful new tool for data compression. This method for sorting and displaying multidimensional data from an image stack comprises the steps of: (a) selecting a subset of pixels from an image by a first algorithm; (b) discarding the pixels that are not selected; (c) selecting a subset of the remaining pixels by a second sorting algorithm; and (d) automatically indicating the final selection of pixels by back-coloring the corresponding pixels in the image. This type of multidimensional analysis can also be performed by manipulating the contour plot window. The method comprises the steps of (a) sorting the pixels by a first algorithm; (b) automatically indicating on the contour plot pixels sorted by the first algorithm; (c) selecting a subset of pixels in the contour plot; (d) sorting the subset of pixels by applying a second algorithm; (e) selecting a reduced subset of pixels in the contour plot; and (f) automatically indicating the final selection of pixels by backcoloring the reduced subset of pixels in the image. The present invention also provides a method for displaying a grouping bar that can be used to analyze images and graphical data within the graphical user interface (“GUI”). 
The grouping bar enables the user to segregate groups of pixels or features within a contour plot, and thereby facilitates independent sorting and backcoloring of the individual groups of pixels or features in the image. The methods of the present invention are applicable to a variety of problems involving complex, multidimensional, or gigapixel imaging tasks, including (for example) automated screening of genetic libraries expressing enzyme variants.
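Steps (a) through (d) of the sequential sort described above can be sketched as below. This is an illustrative outline under stated assumptions: the scoring functions, the keep counts, the record fields `abs` and `rate`, and the magenta overlay color are all hypothetical choices, not details taken from the disclosure.

```python
def two_stage_select(pixels, first_score, second_score, keep_first, keep_second):
    """Select pixels by two independent criteria applied in sequence.

    pixels: dict mapping (row, col) -> per-pixel data record.
    first_score, second_score: callables returning a number (higher ranks first).
    keep_first, keep_second: number of pixels retained after each stage.
    """
    # (a) select by the first criterion; (b) unselected pixels are discarded.
    stage1 = sorted(pixels, key=lambda p: first_score(pixels[p]), reverse=True)
    stage1 = stage1[:keep_first]
    # (c) the second sort only ever sees the surviving pixels.
    stage2 = sorted(stage1, key=lambda p: second_score(pixels[p]), reverse=True)
    return stage2[:keep_second]


def back_color(image_shape, selected, color=(255, 0, 255)):
    """(d) Build an overlay that marks the final selection in the image."""
    rows, cols = image_shape
    overlay = [[None] * cols for _ in range(rows)]
    for r, c in selected:
        overlay[r][c] = color
    return overlay


# Hypothetical records: a spectral value and a kinetic value per pixel.
pixels = {
    (0, 0): {"abs": 0.9, "rate": 0.1},
    (0, 1): {"abs": 0.8, "rate": 0.5},
    (1, 0): {"abs": 0.2, "rate": 0.9},
    (1, 1): {"abs": 0.7, "rate": 0.3},
}
selected = two_stage_select(
    pixels,
    first_score=lambda d: d["abs"],
    second_score=lambda d: d["rate"],
    keep_first=3,
    keep_second=1,
)
overlay = back_color((2, 2), selected)
```

In this example pixel (1, 0) has the highest kinetic value but is never considered in the second sort, because it was discarded by the spectral criterion.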
According to one embodiment of the invention, a method for analyzing digital image data is provided, said method comprising (a) loading into a computer memory a plurality of data stacks wherein each data stack comprises pixel intensity data for a plurality of images, the pixel intensity data expressed as a function of: (i) pixel position, (ii) a first non-positional variable, and (iii) a second non-positional variable, wherein within a data stack, the value of the first non-positional variable is not constant and the value of the second non-positional variable is constant, and wherein between data stacks, the value of the second non-positional variable differs; (b) generating for a plurality of pixels within a first data stack, a plurality of first functions that relate pixel intensity to the first non-positional variable; (c) sorting the pixels within the first stack according to a first value obtained by applying a mathematical operation to the first functions generated for the plurality of pixels; (d) selecting a first set of sorted pixels; (e) generating for a plurality of pixels within the first set, a plurality of second functions that relate pixel intensity to the second non-positional variable; and (f) sorting the pixels within the first set according to a second value obtained by applying a second mathematical operation to the second functions generated for the plurality of pixels within the first set. The non-positional variables may be selected from a wide range of different parameter types that indicate, e.g., the time the data were captured, or, e.g., a condition such as wavelength, temperature, pH, chemical activity (such as, e.g., the concentration of an enzyme substrate or enzyme inhibitor, or the concentration of a drug or other chemical component), pressure, partial pressure of a gaseous chemical, or ionic strength, etc. under which the data were captured.
According to another embodiment, the invention provides a graphical user interface (“GUI”) for display and analysis of digital image data comprising (a) a reference window for displaying a reference image comprising pixels; (b) a contour plot window for indicating pixel location along a first dimension, indicating a non-positional variable (such as, e.g., time, wavelength, temperature, pH, chemical activity, pressure, partial pressure of a gaseous chemical, or ionic strength, etc.) along a second dimension, and indicating pixel intensity by a variable signal appearing along the second dimension, said contour plot window further comprising (i) a grouping bar for grouping together pixels for analysis; and (ii) a selection bar for selecting pixels that are thereby indicated in the reference window and plotted in the plot window; (c) a plot window for displaying a plot of pixel intensity as a function of the non-positional variable.
The file of this patent contains at least one drawing executed in color. Copies of this patent with color drawing(s) will be provided by the Patent and Trademark Office upon request and payment of the necessary fee.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice and testing of the present invention, suitable methods and materials are described below.
All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present application, including definitions, will control. In addition, the materials, methods, and examples described herein are illustrative only and not intended to be limiting.
Other features and advantages of the invention will be apparent from the following detailed description, the drawings, and from the claims.
Layout of the Graphical User Interface
The graphical user interface (“GUI”) is diagrammed in
The layout of a typical Kcat display configuration is shown in
Each project in the workspace may contain one or more analysis files, which contain data calculated as a function of the non-spatial dimension in the image stacks. Each one of these data vectors corresponds to a row in the contour. When the non-spatial variable is wavelength, the vector is referred to as a spectrum. When this variable is time, the vector is commonly referred to as velocity or a kinetic trace. Although the terms “spectrum” and “kinetics” are used in the description of the present invention, it must be noted that the methods are of general applicability to both forms of graphical data. Three interactive windows, consisting of a reference image, a contour plot and conventional plot window, are displayed. The results of a given analysis can be saved and used in subsequent data processing after the file is reopened. This stored information is referred to as a template, and it includes resultant sorts of the contour plot and pixel groupings which can later be applied to the same datastack or to alternative datastacks.
The image stack category accommodates multiple image stacks. Data can be calculated from a raw image stack or pre-processed image stacks. Pre-processing can include a simple division or subtraction to correct for background, or more involved processing to correct for spectral overlap exhibited by multiple fluorescent dyes in an imaged sample. In kinetic experiments, it is useful to divide all images in the stack by the time zero image. A flat field correction may also be used. This algorithm corrects for background and restricts the grayvalue range of the processed image. This is done by using a background image which has previously been divided by its average grayvalue. With large image files, it may be beneficial to perform a prefiltering step which creates a smaller image stack consisting of subsets of regions of interest in the larger image. For example, this subset may comprise a collage of microcolonies satisfying a particular criterion. Such preprocessing has the advantage of concisely displaying rare positive microcolonies while significantly reducing computation time and file storage space requirements. In certain experiments, it is also possible to simultaneously acquire multiple datastacks representing different wavelengths for the same timepoint or multiple timepoints for the same wavelength. This four dimensional data concept is illustrated in
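The flat-field correction mentioned above, in which the background image is first divided by its average grayvalue, can be sketched as follows. This is a minimal sketch assuming NumPy arrays and a background image with no zero-valued pixels; the function name `flat_field` is hypothetical.

```python
import numpy as np


def flat_field(raw, background):
    """Divide a raw image by a background image normalized to a mean of 1.

    Normalizing the background first keeps the corrected image in roughly
    the same grayvalue range as the raw image.
    Assumes the background contains no zero-valued pixels.
    """
    raw = np.asarray(raw, dtype=np.float64)
    background = np.asarray(background, dtype=np.float64)
    normalized = background / background.mean()
    return raw / normalized


# Hypothetical uneven illumination: a uniform sample of value 50 appears
# darker on the left and brighter on the right.
background = np.array([[50.0, 100.0], [100.0, 150.0]])
raw = np.array([[25.0, 50.0], [50.0, 75.0]])
corrected = flat_field(raw, background)
```

After correction every pixel of the uniform sample has the same value, which is the intended effect of removing the illumination pattern.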
The image category is used to store images that are not necessarily part of the stack. These images can be unprocessed monochrome images acquired under special experimental conditions, or they can be image-processed monochrome or RGB pseudocolored images. Any images in the workspace can be used as a reference image in an analysis.
The stack and image computation described here can be menu driven or incorporated into a wizard, or incorporated into firmware.
The workspace window has been hidden in
The contour plot is a convenient visualization tool for concisely displaying three-dimensional data on a two-dimensional screen. Each thin row in the contour plot represents the data for a particular pixel or feature. The x-axis represents wavelength (for spectral data) or time (for kinetics data). Thus, each wavelength or time point is represented by a discrete column in the contour plot. Of course, this arrangement can be altered without departing from the scope of the invention, as by, e.g., rotating the contour plot through 90 degrees so that each row represents wavelength or time, and each column represents the data for a particular pixel or feature. The intensity of the measured signal (e.g., absorbance) at a given wavelength or time point is indicated by a color code, whose scale is depicted at the bottom of the contour plot window. Black/blue color indicates no/low absorbance and red/white indicates high/maximum absorbance. Thus, for spectral data, the spectrum of a given pixel or feature having a single absorption maximum may be represented by a row in the contour plot which has a white or red segment within the column corresponding to the wavelength of maximum absorption. This absorption maximum is flanked by other colors representing progressively decreasing absorbance. Absorption, reflectance, or fluorescence data can be displayed for every pixel or feature in a scene. As those skilled in the art will readily appreciate, many alternatives to the above-described color code may be used to represent the intensity of the measured signal such as intensity variation (i.e., brighter or lighter regions along the contour plot row), and variations in any other type of visually distinguishable display variation such as stippling, cross-hatching patterns, or any other plotting symbol that can be related to signal intensity in a manner analogous to the exemplified color bar. 
When the pixels or features are sorted, the various rows are re-ordered from top to bottom in the contour plot window. Thus, sorting tends to create more easily recognized groupings of pixels or features.
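The relationship between an image stack and the rows of a contour plot can be illustrated with a short sketch. It assumes the stack is held as a NumPy array of shape (N, H, W), with N wavelengths or time points; the function names and the choice of sorting by peak position are hypothetical.

```python
import numpy as np


def contour_rows(stack):
    """Reshape an (N, H, W) stack into an (H*W, N) matrix, one row per pixel."""
    n = stack.shape[0]
    return stack.reshape(n, -1).T


def row_to_pixel(row, image_shape):
    """Map a contour-plot row index back to its (row, col) image position."""
    r, c = np.unravel_index(row, image_shape)
    return (int(r), int(c))


# Hypothetical stack: 3 wavelengths, 2 x 2 pixels.
stack = np.arange(12).reshape(3, 2, 2)
rows = contour_rows(stack)

# One possible sort: re-order rows by the column at which each row peaks.
order = np.argsort(rows.argmax(axis=1), kind="stable")
sorted_rows = rows[order]
```

Keeping the `order` array alongside the sorted matrix preserves the link between each contour row and its pixel, which is what allows a selected row to be highlighted in the image.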
Single Pixel Versus Blob Analysis
Using the GUI diagrammed above, pixels can be grouped into features by conventional image processing techniques, and all four of the windows within the GUI then act to coordinate feature-based (rather than pixel-based) information. While feature-based analysis can increase the signal-to-noise ratio in certain low-light applications, we find feature extraction to be inferior to pixel-based analysis in many applications. This is due largely to problems associated with separating neighboring pixels into different features that may be adjacent or overlapping.
Another reason for basing analyses on pixels rather than features is that problems with ‘edge’ pixels can be minimized (
Pedagogical Test Target
A pedagogical, easily recognized sample is used in this section to demonstrate various aspects of the software.
The GUI is highly interactive at run time. The computer's mouse can be used to point to a pixel in the image window and thereby initiate two actions: 1) the pixel spectrum is displayed in the plot window, and 2) a tick mark appears next to the associated row in the contour window. Alternatively, pointing to a row in the contour plot causes two analogous actions: 1) the corresponding pixel is highlighted in the image window, and 2) the associated conventional plot is updated for that particular pixel. Dragging the mouse vertically over the contour plot while the mouse button is held down selects multiple rows, whose spectra are then plotted simultaneously. Likewise, dragging out a box in the image window simultaneously indicates the corresponding spectra in both the plot and contour windows. Coordinated keyboard and mouse actions can be used to make multiple selections. Once selections are made, options for color display, plot linewidth, spectral averaging, and further processing are enabled using menu, mouse and keyboard manipulations well known to MS Windows users.
Sorting and Display of Contour Plots for a Single Group
Contour plots are very effective for visualizing and extracting useful information from massive amounts of spectral data. The concise display of data is possible because of the application of a series of sorting algorithms that group pixels with similar properties. In the case of pedagogical M&M's candies, these properties are due to visible light absorption. Aspects of this sorting process are shown below in
Target Vector Selection
The software is capable of using a variety of different target vectors that can be specified by the user. There are many possible candidates for the target vector. This flexibility can be used in a reiterative method for rapid compression (requiring human intervention) or as a ‘single pass’ mining tool. For example, a researcher may be interested only in knowing whether certain spectral characteristics exist in an image stack. In this case, a form of target set analysis can be implemented, wherein a previously stored reference spectrum is used as the target vector. The results of the first iteration of sorting will ‘mine’ spectrally similar pixels into one category in the contour plot. These pixels can then be color-coded on the image and removed from subsequent sorting. In this ‘sort and lock’ procedure, average spectra and variances can be calculated and displayed. This process can be repeated using different target vectors until all pixels are categorized. Thus, for example, a new target spectrum can be selected by the user based on the appearance of the contour plot produced by the previous target spectrum.
In cases in which spectral components are not known, or in which single-pixel spectra contain contributions from multiple components including instrumental or lighting artifacts, a randomly generated spectrum can be used. Alternatively, the spectrum from a random pixel or feature can be selected as a first reference. This latter selection is similar to the procedure described for Panels D and E in
Demonstration of Target Vector Selection and its Use with Multiple Groups
Given the many possibilities for target vector selection, we demonstrate its application in the context of multiple groups created using the GUI. In Panel E of
The user interface is very flexible and allows for repeated individual sorting within groups. Groups can also be ungrouped, combined and regrouped as necessary to refine a given analysis. Numerous support functions for placement of a group within a contour plot are also available. Individualized group sorting is demonstrated in
Creating an Analysis
A flowchart outlining one embodiment of steps involved in creating absorbance spectra from a spectral-datastack is shown in
The individual steps of
In this flowchart, ROI determination is based on contrast enhancement of the Reference Image. This is done automatically within the code using preset parameters (e.g., pixels whose values fall within the top 10% of pixel values in the image), which a user can override by dragging the two slider bars beneath the image. This allows one to change the respective high and low values used to determine the ROIs. Additionally, a user can paint on the image with a user-defined brush size to erase and/or add ROIs. Similar functionality is enabled for identifying the I0 reference in the Set I0 GUI. I0 pixel values are incorporated in the Beer-Lambert equation (Abs = log(I0/I)) in order to calculate absorbances. These absorbances are then displayed within a contour plot.
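The preset ROI threshold and the absorbance calculation can be sketched as follows. Two assumptions are made: "top 10% of pixel values" is interpreted as the pixels at or above the 90th percentile, and the logarithm in the Beer-Lambert equation is taken as base 10, as is conventional for absorbance. The function names are hypothetical.

```python
import numpy as np


def top_fraction_mask(reference, fraction=0.10):
    """Mark pixels whose values lie in the top `fraction` of the image."""
    cutoff = np.percentile(reference, 100.0 * (1.0 - fraction))
    return reference >= cutoff


def absorbance(intensity, i0):
    """Abs = log10(I0 / I), applied element-wise.

    Assumes all intensities are strictly positive.
    """
    intensity = np.asarray(intensity, dtype=np.float64)
    return np.log10(i0 / intensity)


# Hypothetical 10 x 10 reference image with values 0..99.
reference = np.arange(100, dtype=np.float64).reshape(10, 10)
roi = top_fraction_mask(reference)

# Hypothetical transmitted intensities against an I0 of 1000.
abs_values = absorbance([[100.0, 10.0]], i0=1000.0)
```

In an interactive tool, the `fraction` argument would correspond to the value the user changes with the slider bars.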
Reference Image and ROI Determination
Determination of ROIs is sometimes highly correlated with the determination of the reference image. Since ROIs are determined by pixel value and heuristics such as morphology, image processing and enhancing an image using physically relevant parameters is important. Frequently, useful information is already apparent from the reference image. Therefore, the generation of a reference image can also be considered a prefiltering step which minimizes the amount of data to be processed. For example, in screening microcolonies, one does not want to compute spectra for parts of the image which do not contain microcolonies. Therefore, a reference image is used to extract only those regions in the image which do correspond to microcolonies. In certain instances, there is no single reference image from which ROIs are extracted. In this case, the reference image serves as a visual aid only and ROIs are calculated using images and parameters entered into the software. An example of this situation is given below.
In the simplest embodiment, a reference image is an unprocessed monochrome image taken at a specific timepoint under specific wavelength illumination and detection conditions. These images can also be background subtracted or flat-fielded to correct for optical and other artifacts. In all computation processes, information loss must be taken into consideration. For example, if division is used the resulting number may be very small. Therefore, integral pixel values are first converted to floating point notation prior to division and rescaled before the conversion back to integers takes place. To facilitate display, final display values are often rescaled to an eight-bit range between 0-255.
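The information loss described above, and the float conversion that avoids it, can be illustrated with a short sketch. It assumes NumPy arrays, a denominator with no zero-valued pixels, and a simple min-max rescale to the 0-255 range; the function name `divide_and_rescale` is hypothetical.

```python
import numpy as np


def divide_and_rescale(numerator, denominator):
    """Divide two integer images in floating point, then rescale to 8 bits."""
    num = np.asarray(numerator, dtype=np.float64)
    den = np.asarray(denominator, dtype=np.float64)
    ratio = num / den
    lo, hi = ratio.min(), ratio.max()
    if hi == lo:
        # A constant ratio image carries no contrast to stretch.
        return np.zeros(ratio.shape, dtype=np.uint8)
    scaled = (ratio - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


# Hypothetical 16-bit images.
a = np.array([[10, 20], [30, 40]], dtype=np.uint16)
b = np.array([[100, 100], [100, 100]], dtype=np.uint16)

integer_result = a // b          # every pixel truncates to 0
display = divide_and_rescale(a, b)
```

Integer division collapses all four pixels to zero, while the floating-point path preserves their relative values across the full display range.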
One embodiment of a reference image combines images taken at wavelengths corresponding to known spectral parameters in the sample. For example, in fluorescence and absorbance, spectral images corresponding to peak maxima or minima can be selectively combined in an arithmetic or algebraic manner. Similarly, images can be ratioed using any of a combination of wavelengths.
Another embodiment of a reference image for prefiltering uses timecourse images. If the raw timecourse datastack has already been flat-fielded by the T0 image, a later timepoint image may contain kinetic data. Single timepoint images such as this are background corrected and they save feature or pixel information according to parameters set for maximum absorption, rather than maximum change of absorption over time.
When four-dimensional data is available, as in an RGB image, multiple channel information can be combined for a particular timepoint. This embodiment of a reference image represents the change in absorption of a target over time. An RGB image is created by subtracting an image obtained at an early time-point from an image obtained at a later time-point. This resulting image will be black (RGB values of zero) wherever there is no increase in absorption and will be colored (positive RGB values) where increased absorption occurred over time. Another alternative is to derive the reference image by dividing one image by another. For example, a 24 minute RGB image can be divided by a 2 minute RGB image after synchronous induction of a chromogenic reaction. This method removes fluctuations in the background intensity between different images. If division is used, the program converts the individual RGB values from integers to floating point variables during the operation and re-scales the values before converting them back to integers. Otherwise, the resulting RGB values will be reduced to a narrow range, and therefore there will also be a loss of information.
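The subtraction-based reference image can be sketched as below. The sketch assumes 8-bit unsigned RGB arrays and clips negative differences to zero, so that pixels with no increase appear black as the text describes; the function name `difference_image` is hypothetical. Widening to a signed type before subtracting matters because unsigned 8-bit subtraction would otherwise wrap around.

```python
import numpy as np


def difference_image(later, earlier):
    """Subtract an early RGB image from a later one, clipping at zero."""
    diff = later.astype(np.int16) - earlier.astype(np.int16)
    return np.clip(diff, 0, 255).astype(np.uint8)


# Hypothetical 1 x 2 RGB images: the first pixel is unchanged,
# the second changes in each channel.
earlier = np.array([[[10, 10, 10], [200, 50, 50]]], dtype=np.uint8)
later = np.array([[[10, 10, 10], [220, 40, 90]]], dtype=np.uint8)
reference = difference_image(later, earlier)
```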
One embodiment of the dialog boxes in
Using the dialog box in
The updated image, with selected pixels colored magenta, is shown
In addition to the Boolean type processing above, multichannel information can be evaluated by color distance criteria by formulating equations that compare the color of all of the pixels in a series of images to preselected target values. Any image can be used, including one of the reference images described above. Using the three-channel RGB system as an example, a target of 200, 10, and 30, corresponding to R, G, and B, can be set. This target can be selected from the image or predefined based on previous experiments. A distance metric corresponding to the sum of the absolute differences between the target and each pixel's RGB value is determined and then compared to a specified cutoff value. If the distance cutoff were, for example, set to 30, a pixel with a value of 210, 19, 20 (distance 29) would be selected and a pixel with a value of 231, 10, 30 (distance 31) would be rejected. Color distance criteria also may comprise alternative equations such as the sum of the squares of the differences, i.e., an actual color distance in RGB color space.
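The color distance test can be written out directly from the numbers given above. In this sketch the distance is the sum of the absolute channel differences, and a pixel is selected when its distance is strictly below the cutoff; the strictness of the comparison is an assumption, since the text does not say how a distance exactly equal to the cutoff is treated. The function names are hypothetical.

```python
def color_distance(pixel, target):
    """Sum of the absolute differences between two RGB triples."""
    return sum(abs(p - t) for p, t in zip(pixel, target))


def select_by_color(pixels, target, cutoff):
    """Keep pixels whose color distance to the target is below the cutoff."""
    return [p for p in pixels if color_distance(p, target) < cutoff]


target = (200, 10, 30)
candidates = [(210, 19, 20), (231, 10, 30)]
selected = select_by_color(candidates, target, cutoff=30)
```

The first candidate has a distance of 10 + 9 + 10 = 29 and is kept; the second has a distance of 31 + 0 + 0 = 31 and is rejected.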
While making a transition from feature to pixel processing, we saw an opportunity to contribute to hyperspectral database management after realizing that contour plots can facilitate new data compression methods. Hyperspectral information can be significantly compressed by using novel algorithms which eliminate data loss when used in the context of a client/server protocol. Based on the initial rapid preview of highly compressed data, a subsequent request for more specific information can be sent. This integrated approach to hyperspectral data management is needed in many fields where spectral datacubes are beginning to emerge as new instrumentation is developed. These fields include remote sensing and telemedicine where data is shared and transmitted to individual researchers over communication lines.
Contour plots are readily linked to data compression and take advantage of spectral heuristics, unlike common graphics image compression methods, which do not. Because image stacks are formed from grayscale 2D images, well known formats such as JPEG and GIF will either be poor at spectral compression or generate loss. These data compression methods do not take into account the relationships of information in an added dimension which can be used to enable the compression in the special cases discussed here.
Using the M&M sorting examples above, the compressed image stack can be reduced into data elements consisting of one color-coded image and the spectra and variances of each category. A stack of N images is essentially reduced into one image, a desampled contour plot, and spectral summary information. The compression factor is approximately equal to the number of images in the stack, N.
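The reduction of an image stack to one category-coded image plus per-category spectra and variances can be sketched as follows. It assumes the stack is a NumPy array of shape (N, H, W) and that each pixel has already been assigned an integer category by sorting; the function name `compress_stack` and the sample data are hypothetical.

```python
import numpy as np


def compress_stack(stack, labels):
    """Summarize an (N, H, W) stack by category.

    Returns the label image and, for each category, the mean spectrum and
    the per-wavelength variance of the pixels in that category.
    """
    summary = {}
    for category in np.unique(labels):
        spectra = stack[:, labels == category]      # shape (N, n_pixels)
        summary[int(category)] = (spectra.mean(axis=1), spectra.var(axis=1))
    return labels, summary


# Hypothetical stack: 4 wavelengths, 2 x 2 pixels, two categories.
stack = np.ones((4, 2, 2))
stack[:, 1, 0] = [1.0, 2.0, 3.0, 4.0]
stack[:, 1, 1] = [3.0, 4.0, 5.0, 6.0]
labels = np.array([[0, 0], [1, 1]])

label_image, summary = compress_stack(stack, labels)
mean_1, var_1 = summary[1]

# Ratio of stack values to label-image values, ignoring the small summary.
factor = stack.size / label_image.size
```

With the summary spectra treated as negligible in size, the ratio equals the number of images in the stack, in line with the compression factor of approximately N stated above.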
In cases where spectral categories are not well defined, as in the red and pink M&M's, compressed data can be supplied with the red and pink categories hypothetically grouped as one. Based on the spectra and variances also supplied, a spectral envelope and variance of these pixels can be generated and displayed in the conventional plot window of the GUI as part of the compressed data. In a mock client/server scenario, this transmitted information showing the large variance at longer wavelengths is indicative of a distribution of spectra which can be separated into more than one category. Such an initial preview of compressed information would prompt the client to request more detailed information, which can be isolated to a smaller and more specific subset of pixels.
The GUI platform described above is amenable to a ‘sort and lock’ procedure, which can be used to reduce computation time and facilitate compression. Multiple steps in a spectral analysis process can be performed to produce a series of contour plots, each one resulting in the identification of one or more spectral categories. Once the pixels in a category are defined, they can be locked out and excluded from subsequent processing, thereby decreasing the number of pixels to process in the subsequent step. This ‘sort and lock’ procedure is presented as an alternative to an MNF transformation and end-member analyses (Green, A. A., Berman, M., Switzer, P., & Craig, M. D. (1988) A transformation for ordering multispectral data in terms of image quality with implications for noise removal. IEEE Transactions on Geoscience and Remote Sensing, v. 26, no. 1, pp. 65-74; Boardman, J. W., & Kruse, F. A. (1994) Automated spectral analysis: A geologic example using AVIRIS data, north Grapevine Mountains, Nevada. In: Proceedings, Tenth Thematic Conference on Geologic Remote Sensing, Environmental Research Institute of Michigan, Ann Arbor, Mich., Vol. 1, pp. 407-418), which also seek to reduce the amount of data processed. A useful method of the present invention is to sort the data in the contour plot such that they can be compressed to representative spectra.
Examples of Sorting Strategies
Here, we demonstrate a series of steps which can be used to screen a bacterial library for enzyme variants with the fastest kinetics as well as the highest specificity for a particular reaction. For example, combinatorial cassette mutagenesis has been used to generate a recombinant library of over 10 million variants of Agrobacterium beta-glucosidase (Abg), a sugar-cleaving enzyme. Since this enzyme has a broad substrate specificity, different substrates, such as glucoside and galactoside, can be tagged with different chromogenic reporters. Experiments were conducted using two indolyl derivatives: Red-gal and X-glu. Galactoside and glucoside specificities were identified by absorbance at 540 nm and 615 nm, corresponding to the λmax of the respective indigo products formed from each derivative. Thus, the ‘bluest’ pixels would correspond to variants having the highest substrate specificity for glucoside and the ‘reddest’ pixels would correspond to variants having the highest substrate specificity for galactoside. In the following examples, a time-based image stack was first acquired from T0 to Tn, corresponding to time 0 to time 2700 seconds, at a wavelength of λ = 610 nm. Following this, a spectral stack was acquired over the wavelength range λ1 to λm, corresponding to 500 nm to 700 nm. These two datastacks were stored in separate projects called Timecourse and Absorbance, respectively. These examples illustrate how the images can be analyzed so that the pixels ultimately displayed are restricted by at least two independent criteria. A generalized flowchart of steps including those described in Example 1A is shown in
In this example, an analysis was first performed using the spectral data obtained at the end of a 45-minute kinetic run to select pixels with the greatest 610 nm:540 nm absorbance ratio. After data acquisition, the following steps were followed:
In this second sorting example, an analysis was first performed using the timecourse data obtained during a 45-minute kinetic run to select pixels meeting specific temporal criteria. In other examples, this kinetic run can be longer or shorter. In this case, the temporal criterion is the fastest absorbance increase at 610 nm. The following steps were followed:
In a third sorting example, spectral data obtained at the end of a kinetic run (or during the run) are used to determine ROIs meeting specific spectral criteria without performing a complete contour-plot-based spectral analysis. This is done by generating a reference image from absorbance images as previously described. Using the Abg experiment as an example, the 610 nm image can be divided by the 540 nm image, and the pixels with the lowest grayvalues would correspond to the ‘bluest’ pixels. If a satisfactory pixel cutoff value has been previously determined, one can use this cutoff value to select ROIs without performing the entire spectral analysis and sorting described in steps 1-5 of EXAMPLE 1 above. A single reference image based on spectral data is generated and this image is used for the kinetic analysis as listed in steps 7-8.
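The ratio-and-cutoff shortcut of this third example can be sketched as follows. The sketch assumes that the grayvalues are raw transmitted intensities, so that strong absorbance at 610 nm gives a low 610 nm:540 nm ratio; the function names, the toy grayvalues, and the cutoff of 0.5 are hypothetical.

```python
def ratio_image(numerator, denominator, eps=1e-6):
    """Divide one grayscale image by another, pixel by pixel."""
    return [
        [n / max(d, eps) for n, d in zip(num_row, den_row)]
        for num_row, den_row in zip(numerator, denominator)
    ]


def select_roi(image, cutoff):
    """Boolean mask of pixels whose value is at or below the cutoff."""
    return [[value <= cutoff for value in row] for row in image]


# Toy 2 x 2 images (grayvalues are illustrative only).
img_610 = [[40, 200], [180, 60]]
img_540 = [[200, 190], [175, 210]]

reference = ratio_image(img_610, img_540)
roi_mask = select_roi(reference, cutoff=0.5)  # previously determined cutoff (assumed)
```

Only the pixels marked True in roi_mask would then be carried into the kinetic analysis.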
In a fourth sorting example, timecourse data are used to determine ROIs meeting specific temporal criteria without performing a complete contour-plot-based kinetic analysis.
This is done by generating a reference image from timecourse images as previously described. Using the Abg experiment as an example, the T=600 second image can be flatfielded by the T0 image. This would be meaningful only if it is separately determined that the timepoint selected, in this case T=600, represents a linear rate of change of product formation with time.
In this case, the pixels with the lowest grayvalues would then correspond to the ‘fastest’ pixels.
In this example, the entire timecourse analysis and sorting described in steps 1-5 of EXAMPLE 2 above may not be necessary. A single reference image based on timecourse data is generated and this image is used for the spectral analysis as listed in steps 7-8.
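A minimal sketch of this fourth example follows. It assumes the grayvalues are transmitted intensities, so that dividing the T=600 second image by the T0 image gives a transmittance, and the standard relation A = -log10(I/I0) converts it to absorbance; this is why the lowest flatfielded grayvalues mark the pixels with the most product formed. The function names and toy values are hypothetical, and the linearity of the rate at T=600 is taken as already established.

```python
import math


def flatfield(image_t, image_t0, eps=1e-6):
    """Divide the image at time T by the T0 image, pixel by pixel."""
    return [
        [a / max(b, eps) for a, b in zip(row_t, row_t0)]
        for row_t, row_t0 in zip(image_t, image_t0)
    ]


def absorbance(transmittance, eps=1e-6):
    """Convert transmittance to absorbance, A = -log10(I / I0)."""
    return [[-math.log10(max(t, eps)) for t in row] for row in transmittance]


def fastest_pixels(flat, k):
    """Return (row, col) of the k pixels with the lowest flatfielded grayvalue."""
    ranked = sorted(
        (value, r, c) for r, row in enumerate(flat) for c, value in enumerate(row)
    )
    return [(r, c) for _, r, c in ranked[:k]]


# Toy 2 x 2 images (grayvalues are illustrative only).
img_t0 = [[200, 200], [200, 200]]
img_t600 = [[20, 180], [100, 198]]

flat = flatfield(img_t600, img_t0)
absorb = absorbance(flat)
fastest = fastest_pixels(flat, k=2)
```

In this toy case the pixel at (0, 0) has the lowest flatfielded value and the highest absorbance, so it ranks first.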
Hematoxylin and eosin (H&E) stains are performed on almost all biopsied tissues before any other special stain or immunochemical analysis is considered. As a result, there are approximately 10,000 H&E stained thin sections analyzed per day in the United States. However, the staining process is difficult to control, and information obtained from a stained thin section is often based on very subtle color differences. Standardization and visual enhancement of such differences can be achieved by employing imaging spectroscopy, and this capability could benefit the entire histology community. Here we demonstrate how several of the sorting algorithms of the present invention can be used to analyze datastacks acquired by imaging a slide of H&E stained tissue.
The process begins by sorting the single-pixel spectra based on maximum absorbance value. This initial sort tends to move all of the pixels representing heavily to moderately stained regions in the image to the top of the contour plot, whereas unstained or poorly stained regions in the image are sorted to the bottom of the contour plot. By clicking and dragging a grouping bar (dark green) next to the low-absorbance pixels in the contour plot, these pixels can be locked out of the subsequent sort. A second sort is then performed on the remaining high-absorbance pixels based on the ratio of absorbance at 540 nm to the absorbance at 610 nm. Pixels having a high ratio (i.e., regions stained primarily with eosin) are thereby sorted to the top and can be grouped for further processing using the violet-colored grouping bar. Pixels having a lower ratio due to the presence of a shoulder at 610 nm (i.e., regions that have been stained with hematoxylin) are sorted beneath the high-ratio group, and fall into the middle of the contour plot. This small group of pixels can also be grouped for further processing using the light blue grouping bar.
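The two-stage sort just described can be sketched as follows. In the GUI the low-absorbance pixels are locked out interactively with a grouping bar; in this sketch a numerical threshold (lock_below, a hypothetical parameter) stands in for that user action. The function name and the toy spectra are also assumptions made for illustration.

```python
def sort_and_lock(spectra, wavelengths, lock_below, num_nm=540, den_nm=610):
    """Return the row order of pixels (top to bottom) after two sorts.

    Sort 1 orders all pixels by maximum absorbance; pixels below lock_below
    are locked out. Sort 2 reorders only the remaining pixels by the ratio
    of absorbance at num_nm to absorbance at den_nm.
    """
    i_num = wavelengths.index(num_nm)
    i_den = wavelengths.index(den_nm)

    # Sort 1: highest maximum absorbance first (top of the contour plot).
    order = sorted(range(len(spectra)), key=lambda p: max(spectra[p]), reverse=True)
    locked = [p for p in order if max(spectra[p]) < lock_below]
    active = [p for p in order if max(spectra[p]) >= lock_below]

    # Sort 2: only the active (unlocked) pixels are processed.
    def ratio(p):
        return spectra[p][i_num] / max(spectra[p][i_den], 1e-6)

    active.sort(key=ratio, reverse=True)
    return active + locked


# Toy absorbance spectra (illustrative values only).
wavelengths = [500, 540, 610, 650]
spectra = [
    [0.02, 0.05, 0.03, 0.01],  # pixel 0: unstained
    [0.30, 0.90, 0.10, 0.05],  # pixel 1: mostly eosin
    [0.25, 0.70, 0.45, 0.10],  # pixel 2: shoulder at 610 nm (hematoxylin)
    [0.35, 0.80, 0.08, 0.04],  # pixel 3: mostly eosin
    [0.03, 0.08, 0.06, 0.02],  # pixel 4: poorly stained
]
row_order = sort_and_lock(spectra, wavelengths, lock_below=0.2)
```

The second sort touches three pixels rather than five, which is the saving that locking is meant to provide; on a full image the locked fraction can be much larger.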
Each of the three classes of pixels can also be selected by clicking and dragging a selection/mapping bar next to the appropriate section of the contour plot. The average spectrum for a selected group of pixels is thereby displayed in the plot window and the pixels are pseudocolored in the image window. In this example, the pixels representing low-staining or unstained regions of the tissue (the bottom third of the contour plot) have been selected by clicking and dragging a light green selection bar next to this portion of the plot. The average spectrum of these pixels is displayed in light green in the plot window, and the corresponding pixels are backcolored light green in the image window. The small number of pixels in the middle of the contour plot that represent tissue regions stained with hematoxylin have been selected with a dark blue selection/mapping bar. Their average spectrum (which has a shoulder at about 610 nm) is shown in the plot window, and the corresponding pixels are backcolored dark blue in the image window. Note that the backcolored areas for these pixels correspond predominantly to the cell nuclei. Finally, the pixels at the top of the contour plot (with absorbance primarily at 540 nm) have been selected with a red selection/mapping bar. Their average spectrum is shown in red in the plot window, and the corresponding pixels have been backcolored red in the image window. These pixels highlight areas in the tissue that have been stained primarily with eosin.
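The effect of a selection/mapping bar can be sketched as two small operations: averaging the spectra of the selected pixels for the plot window, and assigning a color to those pixels for the image window. The sketch assumes flattened pixel indices and RGB tuples; the function names, colors, and toy spectra are hypothetical.

```python
def mean_spectrum(spectra, selected):
    """Average the spectra of the selected pixels, wavelength by wavelength."""
    n_bands = len(spectra[0])
    return [sum(spectra[p][i] for p in selected) / len(selected) for i in range(n_bands)]


def backcolor(n_pixels, groups, default=(0, 0, 0)):
    """Build a flat pseudocolor overlay from (color, pixel indices) pairs."""
    overlay = [default] * n_pixels
    for color, pixels in groups:
        for p in pixels:
            overlay[p] = color
    return overlay


# Toy data: five pixels, two wavelengths (540 nm and 610 nm), illustrative values.
spectra = [
    [0.05, 0.03],
    [0.20, 0.80],
    [0.70, 0.45],
    [0.40, 0.60],
    [0.08, 0.06],
]
LIGHT_GREEN = (144, 238, 144)
DARK_BLUE = (0, 0, 139)
RED = (255, 0, 0)

groups = [(LIGHT_GREEN, [0, 4]), (DARK_BLUE, [2]), (RED, [1, 3])]
overlay = backcolor(len(spectra), groups)
red_mean = mean_spectrum(spectra, [1, 3])
```

Each group yields one averaged curve for the plot window and one color in the overlay, mirroring the pairing of bar color, plotted spectrum, and backcolored pixels described above.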