Image analysis methods: preliminary results as of July 2013

According to the Grant Program of the project, Task 3 focuses on designing and developing image analysis methods and software to discover statistically significant links between the morphological structure of lung X-ray and CT images of TB patients and their drug resistance status. Immediately below we provide links to the input data and the results obtained in this study, followed by a detailed description of the research work accomplished. These materials include segmented CT and X-ray lung images as well as a data table containing the extracted image features. The data cover the study group of 107 patients (see the Materials and Methods section below). The image features are stored in an Excel file together with selected patient data sub-sampled from the main TB database.

Supplementary materials:

Introduction

Today, about one hundred and thirty years after the discovery of Mycobacterium tuberculosis, the disease remains a persistent threat and a leading cause of death worldwide [1]. The greatest disaster that can happen to a patient with tuberculosis (TB) is that the organisms become resistant to two or more of the standard drugs [1, 2]. Recent reports on global anti-TB drug resistance show that TB resistance is a real challenge worldwide [2].

This work represents the research part of the Belarus TB Database and TB Portal project, which is aimed at creating comprehensive, freely available information resources on the TB resistance problem at a national level (see the dedicated portal [3]). The core of these resources is the patient database with clinical data, X-ray and CT chest scans of each patient, as well as sequences of M. tuberculosis isolates from patients with different levels of resistance (genomic data to be added in 2013/2014). In addition, a very large collection of CT chest images of several thousand TB patients with no clinical data can be freely downloaded and used for research purposes.

The goal of this document is to present preliminary results of an exploratory study of possible correlations between drug resistance and textural (structural) features of radiological images of lung tuberculosis patients. To the best of our knowledge, this is a very challenging task which has so far been addressed only to a small extent, and only using traditional radiological (i.e., visual, non-computerized) image findings [4, 5].

Materials and Methods

Both chest X-ray and CT image data were used independently in searching for possible links between the quantitative features of radiological images of TB patients and their drug resistance status. Chest radiographs were taken using a KODAK Point-of-Care 260 system with 2248×2248 pixel resolution. CT scanning was performed with a GE LightSpeed Pro 16 scanner with a slice thickness of 2.5 mm and the number of axial slices varying from 100 to 160 depending on the region of interest. Lung regions were segmented manually on X-ray images and semi-automatically in the case of CT, with final inspection performed by a radiologist.

Initially, 150 TB patients with all clinical data available were sampled from those treated at TB hospitals. Of these, 24 patients were excluded because their CT scans were inconsistent by date or not available at all. From the remaining 126 patients, 1 patient was excluded because of persistent CT lung segmentation problems, 14 others were withdrawn because their chest CT scans were taken on machines other than the GE LightSpeed Pro, and the last 4 patients were removed due to the presence of HIV infection. Finally, 107 patients, 64 males and 43 females (mean age 42.7 and 48.3 years respectively, no significant age difference, p=0.11), were used as the study group. The major patient characteristic employed in this study was the patient’s drug sensitivity variable, with 0 standing for a sensitive case and 1 denoting any of the drug-resistant ones.
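The selection flow described above can be checked with simple bookkeeping. The snippet below is only an illustrative sketch; the variable names are ours and are not part of the project software.

```python
# Patient selection flow as reported in the text (names are illustrative).
initial = 150
exclusions = [
    ("CT scans inconsistent by date or unavailable", 24),
    ("CT lung segmentation problems", 1),
    ("CT acquired on a different scanner", 14),
    ("HIV infection", 4),
]

remaining = initial
for reason, count in exclusions:
    remaining -= count
    print(f"{reason}: -{count} -> {remaining} patients")

study_group = remaining
males, females = 64, 43
```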

The approach followed. In this work we capitalize on the concept of extended, multi-sort, multi-dimensional co-occurrence matrices, which describe image structure and content. The approach is suited for virtually any image type, starting from simple 2D (edges) and 3D (mesh) object shapes and traditional gray-scale and color 2D images, through 3D (volumetric) images produced by various tomographies, and going up to space+time 4D image data, spectroscopic measurements, and so on.

The concept of generalized co-occurrence was inspired by Robert Haralick’s classical gray level co-occurrence matrices, typically abbreviated as GLCM. We introduced our generalization in the early 1990s, with the first paper in English appearing in 1996 [6]. For readers with a solid mathematical background, it should be pointed out that generalized co-occurrence is essentially a particular case of the fundamental notion of the general system S as defined by Mesarovic & Takahara in their elegant General Theory of Systems, i.e. as S⊆X×Y, where × denotes the Cartesian product of the sets X and Y. In terms closer to the information technology and pattern recognition domains, this is equivalent to the notion of N-tuples (corteges). At present, the generalized co-occurrence approach is a well-elaborated method which has demonstrated its productivity in a number of biomedical image-based studies (see, for example, [7-10]).
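For readers less familiar with the classical starting point, a gray level co-occurrence matrix can be sketched in a few lines. The example below is a minimal illustration of the Haralick-style GLCM for a single pixel offset, not the generalized matrices used in this study; the function name `glcm` and the toy image are assumptions made for illustration. The result can be read as a relation over pairs of gray levels (a subset of X×Y) with multiplicities.

```python
from collections import Counter

def glcm(image, dx=1, dy=0):
    """Count co-occurring gray-level pairs (a, b) for one pixel offset."""
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                counts[(image[y][x], image[y2][x2])] += 1
    return counts

# Toy 3x3 image with three gray levels.
image = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 1],
]
horizontal = glcm(image)  # pairs of horizontally adjacent pixels
```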

Basic procedure. The following procedure was implemented for searching for correlations between drug resistance and textural image features, as well as for visualizing the image regions the features originate from.

Step-1: Calculating quantitative image descriptors. Because of the characteristic textural appearance of the images, we employed variants of the extended multi-sort, multi-dimensional co-occurrence matrices introduced above. More specifically, X-ray lung images were described with the help of a modified version of the four-dimensional matrices [8], abbreviated as IIID, which count the frequency of mutual occurrence of triplets of pixels with certain intensities (I) at different Euclidean distances (D). For describing lung CT image structure we employed the most general, six-dimensional matrices of the IIGGAD type [7], which count voxel pairs with certain intensities (I), gradient magnitudes (G), and mutual angles (A) between the gradient vectors at different distances (D) in 3D image space.
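To make the idea of adding a distance axis concrete, the sketch below builds a reduced, three-dimensional analogue (intensity, intensity, distance) over pixel pairs in a 2D image. It is deliberately simpler than the IIID and IIGGAD matrices of [7, 8]: it uses pairs rather than triplets and has no gradient or angle axes. The function name `iid_matrix`, the quantization into `levels` bins and the rounding of distances are all assumptions made for illustration.

```python
import math
from collections import Counter

def iid_matrix(image, levels=4, max_value=255, max_dist=3):
    """Count pixel pairs by (intensity bin, intensity bin, rounded distance)."""
    rows, cols = len(image), len(image[0])
    # Quantize intensities into a small number of bins.
    q = [[min(v * levels // (max_value + 1), levels - 1) for v in row]
         for row in image]
    counts = Counter()
    for y1 in range(rows):
        for x1 in range(cols):
            for y2 in range(y1, min(rows, y1 + max_dist + 1)):
                for x2 in range(max(0, x1 - max_dist),
                                min(cols, x1 + max_dist + 1)):
                    if (y2, x2) <= (y1, x1):
                        continue  # visit each unordered pair once
                    d = round(math.hypot(y2 - y1, x2 - x1))
                    if 1 <= d <= max_dist:
                        a, b = sorted((q[y1][x1], q[y2][x2]))
                        counts[(a, b, d)] += 1
    return counts

# Toy 2x2 image: a dark left column and a bright right column.
demo = iid_matrix([[0, 255], [0, 255]])
```

Sorting the two intensity bins makes the descriptor independent of the order in which a pair is visited, which is one common convention; other conventions are possible.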

Step-2: Creating the study group data table. At this step the descriptor of every image is stored as a separate row of a standard object-feature matrix, in which each co-occurrence matrix element (a data table column) is treated as a separate feature.
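A minimal sketch of this step is given below, assuming each image descriptor is held as a sparse mapping from matrix bins to counts. The helper name `to_feature_rows` and the normalization of counts to frequencies are assumptions made for illustration, not details confirmed by the text.

```python
def to_feature_rows(descriptors):
    """Flatten sparse co-occurrence counters into equal-length rows."""
    # The union of all bins seen in any image defines the column order.
    columns = sorted({key for d in descriptors for key in d})
    rows = []
    for d in descriptors:
        total = sum(d.values()) or 1
        # Assumption: counts are normalized so image size does not dominate.
        rows.append([d.get(key, 0) / total for key in columns])
    return columns, rows

# Two hypothetical image descriptors with (intensity, intensity, distance) bins.
descriptors = [
    {(0, 0, 1): 3, (0, 1, 1): 1},
    {(0, 1, 1): 2, (1, 1, 2): 2},
]
columns, table = to_feature_rows(descriptors)
```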

Step-3: Reducing the number of features by Principal Component Analysis (PCA). Since the number of original features (co-occurrence matrix elements) is very large and they are heavily correlated, PCA is applied, which typically reduces the number of features down to 5-15 uncorrelated components (see [9] for details of a similar technique).
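The sketch below illustrates how the number of retained components can be chosen from the eigenvalues of the feature correlation matrix using the eigenvalue-greater-than-one rule (Kaiser's criterion, the rule named in the Results section). It is a dependency-free illustration on a tiny synthetic table, not the software used in the study: the eigenvalues are obtained with a plain cyclic Jacobi iteration, and all function names are ours.

```python
import math

def correlation_matrix(rows):
    """Pearson correlation matrix of the columns (no constant columns)."""
    n, m = len(rows), len(rows[0])
    cols = [[r[j] for r in rows] for j in range(m)]
    means = [sum(c) / n for c in cols]
    centered = [[v - mu for v in c] for c, mu in zip(cols, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(m)] for i in range(m)]

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                theta = (math.pi / 4 if diff == 0
                         else 0.5 * math.atan(2 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def kaiser_count(eigenvalues):
    """Number of components with eigenvalue above 1 (Kaiser's criterion)."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

# Synthetic table: the second feature duplicates the first up to scale.
table = [[1, 2, 5], [2, 4, 3], [3, 6, 6], [4, 8, 2], [5, 10, 4]]
eigenvalues = jacobi_eigenvalues(correlation_matrix(table))
n_components = kaiser_count(eigenvalues)
```

In practice a numerical library would be used for the decomposition; the hand-written iteration only keeps the example self-contained.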

Step-4: Statistical analysis of possible correlations between the image features (principal components, PCs) obtained at the previous step and patients’ data using conventional statistical methods, and identification of the PCs that correlate significantly with the TB resistance status.
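As an illustration of this step, the sketch below correlates a component score with a binary resistance label (Pearson's r with a 0/1 variable is the point-biserial correlation) and estimates significance with a permutation test. The permutation test is our substitution for the unspecified "conventional statistical methods", chosen because it needs no statistical library; the data are synthetic and the function names are hypothetical.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of label shuffles with |r| at least as large."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Synthetic data: 10 sensitive (0) and 10 resistant (1) cases.
status = [0] * 10 + [1] * 10
scores = [0.1 * i for i in range(10)] + [0.5 + 0.1 * i for i in range(10)]
r = pearson_r(scores, status)
p = permutation_p_value(scores, status)
```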

Step-5: Projection of the resultant components onto the input co-occurrence features and mapping them back to the original images to visualize (highlight) the corresponding image regions.
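One plausible reading of this step is sketched below: each co-occurrence bin receives a weight (for example, its loading in the selected component), and every pixel accumulates the weights of the bins it contributes to. This is an assumption about how such a mapping could work, not the authors' documented procedure; the function name `highlight_map`, the single fixed offset and the weights are hypothetical.

```python
def highlight_map(image, weights, dx=1, dy=0):
    """Sum, per pixel, the weights of the co-occurrence bins it contributes to."""
    rows, cols = len(image), len(image[0])
    score = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                w = weights.get((image[y][x], image[y2][x2]), 0.0)
                # Both pixels of the pair share the weight of their bin.
                score[y][x] += w
                score[y2][x2] += w
    return score

image = [[0, 1, 1],
         [0, 0, 1]]
weights = {(0, 1): 0.8, (1, 1): -0.2}  # hypothetical loadings of one component
heat = highlight_map(image, weights)
```

Pixels with large absolute scores would then be the ones highlighted on the original image.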

Step-6: Testing the established correlations with the help of multivariate statistical analysis involving patients’ clinical data as well as k-fold and leave-one-out cross-validation techniques.
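The cross-validation schemes mentioned in this step can be sketched as follows. Leave-one-out is treated as the special case of k-fold with k equal to the number of patients. The nearest-class-mean rule on a single component score is only a stand-in classifier chosen for brevity, and the data are synthetic; none of this is taken from the study software.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k disjoint test folds with their train sets."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [(sorted(set(range(n)) - set(test)), test) for test in folds]

def nearest_mean_accuracy(x, y, splits):
    """Cross-validated accuracy of a one-feature nearest-class-mean rule."""
    correct = 0
    for train, test in splits:
        means = {}
        for label in (0, 1):
            # Assumes every training set contains both classes.
            values = [x[i] for i in train if y[i] == label]
            means[label] = sum(values) / len(values)
        for i in test:
            predicted = min(means, key=lambda lab: abs(x[i] - means[lab]))
            correct += predicted == y[i]
    return correct / len(x)

x = [0.1, 0.3, 0.2, 0.4, 0.9, 1.1, 0.8, 1.2]  # synthetic component scores
y = [0, 0, 0, 0, 1, 1, 1, 1]                  # synthetic resistance labels
acc_4fold = nearest_mean_accuracy(x, y, k_fold_indices(len(x), 4))
acc_loo = nearest_mean_accuracy(x, y, k_fold_indices(len(x), len(x)))
```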

Results

CT images. Principal component analysis with Kaiser’s criterion resulted in 5 mutually uncorrelated components, one of which (PC3) correlated significantly with drug resistance (r=0.34, p=0.00038). Since it was also found that PC3 correlates with the patient’s weight (r=-0.29, p=0.0025) and that drug resistance in turn correlates with the presence of recurring treatment (r=0.31, p=0.0013), an ANOVA multivariate analysis was applied to all possible combinations of these three factors (one independent vs. two dependent). As a result, it was established that the corresponding partial link between PC3 and resistance status remains significant with p=0.001 or better. To verify these results, similar multivariate analyses were performed using the discovered component PC3, drug resistance status, and every other quantitative or categorical variable of the patient database (age, gender, symptoms, lung volume, etc.). In all the tests the link between image feature PC3 and drug resistance remained significant with p=0.00135 or better. Finally, example results of mapping feature PC3 back to the original CT images are presented in Fig. 1.

X-ray images. Possible correlations of features derived from lung radiographs with the patient’s resistance status were examined by the same procedure as given above for CT. In this case, only the last extracted component, PC6, was significantly correlated with drug resistance (r=0.314, p=0.0010). In contrast to CT, image feature PC6 did not correlate with the patient’s weight (r=0.0446, p=0.648). However, the correlation with recurrent treatment was again significant (r=-0.281, p=0.0034). ANOVA performed by the scheme PC6 vs. (resistance) + (recurrent treatment) gave significance scores of p=0.0093 and p=0.0037 for resistance and recurrence respectively. All the multivariate analyses involving PC6, resistance, and every other patient characteristic demonstrated significance of the partial correlation of PC6 vs. resistance of p=0.00114 or better. Example results of mapping feature PC6 back to the original X-ray images are presented in Fig. 2.
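As a rough plausibility check of the reported significance levels, the correlation coefficients can be converted to approximate two-sided p-values with Fisher's z-transform. The sketch assumes that all 107 patients entered each correlation and uses a normal approximation, so only agreement in order of magnitude with the reported values should be expected; the test actually used by the authors is not stated in the text.

```python
import math
from statistics import NormalDist

def fisher_p_value(r, n):
    """Approximate two-sided p-value for a Pearson r via Fisher's z-transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

n = 107  # assumption: complete data for the whole study group
p_ct = fisher_p_value(0.34, n)     # CT component PC3 vs. resistance
p_xray = fisher_p_value(0.314, n)  # X-ray component PC6 vs. resistance
print(f"CT: p ~ {p_ct:.5f}, X-ray: p ~ {p_xray:.5f}")
```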

Fig. 1. Results of mapping of CT image features correlated with drug resistance status for two tuberculosis patients.

Fig. 2. Results of mapping of X-ray image features correlated with drug resistance status for two tuberculosis patients.

CT vs. X-ray. Examining possible links between the characteristic image features of CT (PC3) and X-ray (PC6) images revealed no correlation between them (r=-0.0017, p=0.986), which is not surprising considering the very different basic principles of image formation and the effect of anatomical “summation” in X-ray.

Conclusions

The results obtained in this study allow the following conclusions to be drawn:

  1. There are statistically significant links between TB drug resistance and features of CT (r=0.34, p=0.0004) and X-ray (r=0.31, p=0.0010) images.
  2. No other factors were found that could trivially explain the discovered correlations.
  3. The discovered links appear to be unbiased and reliable with respect to certain (technical) parameters of the method.

However, further research is necessary to uncover the biomedical substrate of the quantitative features, to explain the possible links and their causes, and to evaluate their potential diagnostic utility.

Acknowledgements. This work was partly funded by the National Institute of Allergy and Infectious Diseases, National Institutes of Health, U.S. Department of Health and Human Services, USA through the CRDF project BOB 1-31055-MK-11.

References

1. Ferguson L.A. and Rhoads J. (2009). Multidrug-resistant and extensively drug-resistant tuberculosis: The new face of an old disease, J Am Acad Nurse Practitioners, 21(11):603-609.

2. Chiang C.Y., Centis R., and Migliori G.B. (2010). Drug-resistant tuberculosis: Past, present, future, Respirology, 15(3):413-432.

3. http://tuberculosis.by

4. Cha J, Lee HY, Lee KS, Koh W-J, Kwon OJ, Yi CA, Kim TS, Chung MJ. (2009). Radiological Findings of Extensively Drug-Resistant Pulmonary Tuberculosis in Non-AIDS Adults: Comparisons with Findings of Multidrug-Resistant and Drug-Sensitive Tuberculosis, Korean J Radiol, 10:207-216.

5. Lee ES, Park CM, Goo JM, Yim J-J, Kim H-R, Lee HJ, Lee IS, Im J-G. (2010). Computed Tomography Features of Extensively Drug-Resistant Pulmonary Tuberculosis in Non-HIV-Infected Patients, J Comput Assist Tomogr, 34:559-563.

6. Kovalev V. and Petrou M. (1996). Multidimensional Co-occurrence Matrices for Object Recognition and Matching, Graphical Models and Image Processing, 58(3):187-197.

7. Kovalev V.A., Kruggel F., Gertz H.-J., and von Cramon D.Y. (2001). Three-dimensional texture analysis of MRI brain datasets, IEEE Transactions on Medical Imaging, 20(5):424-433.

8. Kovalev V., Dmitruk A., Safonau I., Frydman M., and Shelkovich S. (2011). A method for identification and visualization of histological image structures relevant to the cancer patient conditions, In: Computer Analysis of Images and Patterns (CAIP-2011), A. Berciano et al. (Eds), Springer, LNCS, 6854(1):460-468.

9. Kovalev V.A., Kruggel F., and von Cramon D.Y. (2003). Gender and age effects in structural brain asymmetry as measured by MRI texture analysis, NeuroImage, 19:896-905.

10. Kovalev V.A., Petrou M., and Suckling J. (2003). Detection of structural differences between the brains of schizophrenic patients and controls, Psychiatry Research: Neuroimaging, 124:177-189.

Appendix: Slides of the presentation on this topic given on 27 June 2013 at the International Congress on Computer Assisted Radiology and Surgery (26-30 June 2013, Heidelberg, Germany):

V. Kovalev, V. Liauchuk, I. Safonau, A. Astrauko, A. Skrahina, A. Tarasau (2013). Is there any correlation between the drug resistance and structural features of radiological images of lung tuberculosis patients? Int J Comp Assist Radiol Surg, Springer, vol. 8, supp. 1, June 2013, pp. S18–S20.

[Slides 1-17: images not reproduced here.]