Purpose
This report describes the results of the validation study detailed in 1-VAL-3106 Bacteriology Algorithm Clinical Study Protocol. The purpose of this study was to validate the intended use of the Techcyte bacteriology algorithm against the reference method. The reference method for this study is direct visual examination of the slide under a microscope.
Scope
The scope of this data analysis document is the results from the Bacteriology Algorithm Clinical Study performed at Ketterthill. Other data, whether internal or external, were used only to compare the accuracy of the algorithm to the manual method.
This study was limited to Gram-stained vaginal smear slide samples obtained from the laboratory at Ketterthill in Luxembourg. The scope of this clinical study was to compare the methods used in microscopic analysis of vaginal swab samples prepared with the Gram stain and a fuchsin counterstain and fixed on glass slides.
The analysis was limited to the targets listed below:
- Döderlein Bacilli (Lactobacillus)
- Corynebacteria
- Gram negative Bacilli
- Gram positive Bacilli
- Mobiluncus
- Gram-variable Gardnerella Vaginalis
- Gram positive Cocci
- Epithelial cells
- Clue cells
- Leukocytes
- Erythrocytes
- Yeasts
- Filamentous Mycelium
Targets or sample types not listed in the intended use are out of scope for this study. All images used for digital microscopic viewing and analysis with the Techcyte Algorithm were scanned using the Motic EasyScan Pro 6. Analysis of slides prepared with another counterstain or scanned with other digital microscopy scanners is beyond the scope of this study.
Definitions
Term | Definition |
Raw Scores | The scores that were generated by the Techcyte Bacteriology Algorithm and were not yet reviewed by the technologist. |
Digital Review | The scores generated by the algorithm after they have been reviewed, and changed if necessary, by the technologist. |
Intended Use Statement
The Techcyte algorithm is a software as a medical device (SaMD) that automates cell locating from a digital image of a cytological smear from a glass slide, intended for in-vitro diagnostic use. The software locates, counts and classifies cells from a digital image that has been uploaded to the Techcyte platform.
The Techcyte bacteriology algorithm may be used to identify bacteria, epithelial cells, leukocytes, erythrocytes, clue cells and yeasts from samples of vaginal origin.
A viewer enables a skilled operator to quickly identify and confirm or challenge the suggested semi-quantitative scores proposed by the software.
The Techcyte bacteriology algorithm may also be used in the comparison of samples that are concurrently grown in culture. In this case, a visual inspection of the cultured bacterial specimen is made by the skilled operator and compared with the visual examination of the Gram-stained specimen.
The Techcyte bacteriology algorithm can also be used as a screening tool prior to running culture, PCR or other analyses.
Assessment Methods
Manual Review
In this validation, the procedure used by the examiners for the manual microscopy review was as follows:
After randomly choosing the areas to be examined, the examiner performed a semi-quantitative analysis of the objects of interest, using the following evaluation grades: absence (0), rare (1+), some (2+), several (3+), many (4+).
Comments were added as needed by the examiner regarding doubt or confusion between different classes or any observation regarding the aspect of the sample.
If the examiner was not able to evaluate the semi-quantitative grading of one or several classes, the corresponding evaluation for the semi-quantitative score was left blank, and an explanation and rationale for non-evaluation was provided in writing.
The Hay-Ison score was generated by a predetermined computer program in use in the Clinical Study Site. This scoring method was based on the semi-quantitative analysis performed by the examiner for Döderlein Bacilli (Lactobacillus), Gardnerella vaginalis, and Mobiluncus as defined above.
If the examiner was not able to evaluate the semi-quantitative grading of one or several classes included in the calculation of the Hay-Ison score, an inconclusive score was reported.
The formula used by the Clinical Study Site for computing the Hay-Ison score was developed internally and validated by comparison with the PCR results for Mycoplasma hominis (a bacterial species without a cell wall that is not visible with Gram stain microscopy), which proliferates during the course of bacterial vaginosis and is associated with Gardnerella vaginalis.
Digital Review
Qualified examiners reviewed the slides as presented in the Techcyte Algorithm viewer to make a final determination on the classification of each class of objects. As the examiners viewed the scanned slide images, they either confirmed the classifications and associated semi-quantitative scores proposed by the Techcyte software or changed the semi-quantitative scores based on their visual observations; any such change could in turn alter the Hay-Ison score calculated within the Techcyte software. The semi-quantitative scores from the technologists’ digital review of the Techcyte Algorithm’s raw results, and the Hay-Ison Grades subsequently calculated from them, were the results compared to the manual method for the validation.
Hay-Ison Score
The Hay-Ison score is a simplified assessment method commonly used for evaluating Gram stain smears for bacterial vaginosis [1,2]. The Hay-Ison score is based on an estimation of the bacterial composition observed in the Gram stain. This scoring method is based on the semi-quantitative analysis performed by the examiner for Döderlein Bacilli (Lactobacillus), Gardnerella vaginalis (including Clue Cells) and Mobiluncus:
- Grade 0: Epithelial cells only (No Lactobacilli, No Gardnerella Vaginalis, No Mobiluncus)
- Grade I: Normal – Lactobacillus morphotypes predominate
- Grade II: Intermediate – Mixed flora (some Lactobacilli present, Gardnerella and/or Mobiluncus morphotypes also present.)
- Grade III: Bacterial Vaginosis – Gardnerella and/or Mobiluncus morphotypes predominate. Clue cells may also be present. Few or absent Lactobacilli.
Adjudication Process
Assessment of Gram-stained vaginal smear samples is known for its variable and sometimes subjective nature [2]. This is often manifested as variability in results between technologists and between laboratories [3]. Various diagnostic and assessment methods have been developed to reduce this variability and subjectivity in the diagnosis of bacterial vaginosis. Amsel’s criteria, Spiegel’s criteria, Nugent’s method, and the Hay-Ison method are examples of such methods [1,2,3,4].
In this clinical validation study, the Hay-Ison Grade (in use at the clinical study site) calculated from the semi-quantitative analysis performed by manual microscopy review was compared with the Hay-Ison Grade calculated from the digital microscopy review. For any sample in which discordant results were observed between the manual review and the digital review and human error could not be ruled out as the cause, the adjudication committee was asked to determine the definitive diagnosis for the purposes of the study. Where a discordant result was corrected, the committee provided a rationale.
Only the discordant results impacting the Hay-Ison Grade for a sample were considered for adjudication.
The adjudication panel consisted of 3 reviewers: one technologist – the Bacteriology department head of the Clinical Study Site – and two biological pharmacists from the Clinical Study Site. Together, the reviewers considered the manual review, digital review, and PCR results when determining the final result.
Ground Truth
The manual review defined above was the reference method for comparison with the digital review. The adjudicated result as determined by the adjudication committee was considered the ground truth result for each slide analyzed in this validation.
Of the 240 slides included in this study, 166 slides had concordant Hay-Ison scores. The 74 slides with discordant Hay-Ison scores were adjudicated by the committee: 24 slides were adjudicated in favor of the manual review, 48 slides in favor of the digital review, and 2 slides in favor of neither.
In cases where the PCR result supported the digital review findings, the adjudication committee elected to accept the digital review result. This applied to 29 slides: 4 had an inconclusive manual read Hay-Ison Grade; 8 had a manual read Hay-Ison Grade 0 although PCR detected bacteria; 14 had a manual read Hay-Ison Grade I with overestimated lactobacilli and/or undetected Gardnerella vaginalis (GV) and/or Mobiluncus; and 3 had a manual read Hay-Ison Grade II with underestimated lactobacilli or with bacteria reported in the absence of bacteria.
8 slides were adjudicated as inconclusive in favor of the manual read (3) or in favor of the digital read (5).
8 slides were adjudicated in favor of the manual read where the underlying presence of Gram positive cocci led to an overestimation of the GV clusters or of the lactobacilli (indistinguishable cocci chains). 1 slide was adjudicated in favor of the digital read under the same circumstances.
8 slides for which the technician hesitated and relied on the culture for confirmation were adjudicated in favor of the manual read.
5 slides were adjudicated in favor of the manual read by reevaluation of the semi-quantitative prevalence score evaluated by the digital read.
13 slides were adjudicated in favor of the digital read (for 6 of them the technician requested the culture for confirmation).
2 slides were adjudicated neither in favor of the manual read, nor in favor of the digital read (request of the culture for confirmation was noted by the technician for both).
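The adjudication counts above can be cross-checked with a few lines of arithmetic. The Python sketch below simply tallies the sub-groups reported in this section; it assumes the sub-groups are mutually exclusive, which the report does not state explicitly but the totals support. The dictionary keys are shorthand labels chosen here, not terms from the study.

```python
# Counts taken from the Ground Truth section of this report.
total_slides = 240
concordant = 166
discordant = total_slides - concordant

# Sub-groups adjudicated in favor of the digital review (labels are shorthand).
favor_digital = {
    "PCR supported the digital review": 29,
    "adjudicated as inconclusive": 5,
    "Gram positive cocci interference": 1,
    "other slides in favor of digital": 13,
}

# Sub-groups adjudicated in favor of the manual review.
favor_manual = {
    "adjudicated as inconclusive": 3,
    "Gram positive cocci interference": 8,
    "technician relied on culture": 8,
    "digital score re-evaluated": 5,
}

favor_neither = 2

# Breakdown of the 29 PCR-supported slides by manual read grade.
pcr_breakdown = {"Inconclusive": 4, "Grade 0": 8, "Grade I": 14, "Grade II": 3}

digital_total = sum(favor_digital.values())
manual_total = sum(favor_manual.values())

print("discordant slides:", discordant)
print("in favor of digital:", digital_total)
print("in favor of manual:", manual_total)
print("accounted for:", digital_total + manual_total + favor_neither)
```

Under that assumption the sub-groups account for all 74 discordant slides (48 digital, 24 manual, 2 neither).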
Success Criteria
For this validation study, the results will be measured in terms of sensitivity and specificity of the Hay-Ison score. Positive predictive value, negative predictive value, and Cohen’s Kappa statistic will also be calculated to assess the agreement between the results obtained.
Since the Hay-Ison score has high variability throughout the industry, the results of the manual method and the digital method will both be compared to the adjudicated definitive results and assessed for agreement.
If the digital review using the Techcyte Bacteriology Algorithm has sensitivity and specificity equivalent to or better than those of the manual method, the algorithm would be considered successful.
Data Analysis Method
McNemar’s chi-square test was used because the Hay-Ison results being compared were paired results from two different methods. The test determines whether there is a statistically significant difference in the proportions of the paired data. The null hypothesis is that both methods classify samples as positive or negative at the same rate, meaning that the contingency table is symmetric; the alternative hypothesis is that the rates are not the same. Additionally, the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated from the various 2×2 tables. Sensitivity and specificity were calculated to compare the grading of the digital review with the adjudicated results. These measures characterize agreement within this study; the sample and the prevalence of findings in it are not representative of the entire population. Likewise, the PPV and NPV depend on the prevalence in the samples provided, so they describe this sample and are not comparable to the larger population. The kappa statistic was calculated to measure the level of agreement between the two groups. All statistics were calculated using R 4.2.1.
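As a rough illustration of these calculations, the Python sketch below computes McNemar’s test and the four accuracy measures from a paired 2×2 table. It is not the R 4.2.1 code used in the study, and the function names are hypothetical. A continuity-corrected chi-square statistic is assumed because it reproduces the p-values reported in this document. The example counts are the Grade II positive collapse of Table 3 (manual review versus adjudicated result). The reported percentages are reproduced when the first class (Grade I) is treated as the event class for sensitivity and PPV; that convention is inferred from the arithmetic and is not stated in the report.

```python
import math


def mcnemar_p(b, c):
    """Two-sided McNemar p-value with continuity correction.

    b and c are the two discordant cell counts of a paired 2x2 table.
    Returns NaN when there are no discordant pairs (statistic undefined).
    """
    if b + c == 0:
        return float("nan")
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2))


def accuracy_measures(a, b, c, d):
    """Accuracy measures with the FIRST class treated as the event class.

    a: test first class, reference first class
    b: test first class, reference second class
    c: test second class, reference first class
    d: test second class, reference second class
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }


# Manual review (test) vs adjudicated result (reference), Grade II positive:
# first class = Grade I, second class = Grades II and III (from Table 3).
a, b, c, d = 119, 23, 5, 58

p_value = mcnemar_p(b, c)
measures = accuracy_measures(a, b, c, d)

print(f"McNemar p = {p_value:.3f}")
for name, value in measures.items():
    print(f"{name}: {value:.0%}")
```

With these counts the sketch gives p ≈ 0.001 and 96%, 72%, 84%, and 92% for the four measures, matching the Grade II Positive column of Table 4.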
Three pairings were compared: manual review vs digital review, manual review vs adjudicated review, and digital review vs adjudicated review. The Hay-Ison Grade 0 results were ignored because this grade is not associated with bacterial vaginosis. The definitively inconclusive results were also ignored because they could not be reliably used for diagnosis of bacterial vaginosis. The three remaining Hay-Ison grades were then compared by classifying Grade II in three ways. The first ignored Grade II, leaving Grade I as negative and Grade III as positive; the second classified Grades I and II as negative and Grade III as positive; and the third classified Grade I as negative and Grades II and III as positive. This approach to handling Grade II results has been cited in the literature for evaluating the reliability of different assessment methods for Gram stain analysis related to bacterial vaginosis [1,2].
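The three treatments of Grade II can be sketched as a collapse of a 5×5 grade table into a 2×2 table. The Python example below applies this to the Table 1 counts (digital review in rows, manual review in columns). It assumes that a slide is dropped whenever either method reported an excluded grade (Grade 0, Inconclusive, or Grade II under the first treatment); the report does not spell this out, but the resulting counts are consistent with Table 2. The names `POLICIES` and `collapse` are hypothetical.

```python
GRADES = ["0", "I", "II", "III", "Inconclusive"]

# Table 1 counts: rows = digital review grade, columns = manual review grade.
TABLE_1 = [
    [6, 0, 3, 0, 0],
    [2, 113, 9, 0, 3],
    [10, 22, 28, 0, 4],
    [1, 3, 2, 19, 2],
    [1, 9, 1, 2, 0],
]

# Each policy maps a grade to "neg" or "pos"; unmapped grades are excluded.
POLICIES = {
    "Grade II ignored": {"I": "neg", "III": "pos"},
    "Grade II negative": {"I": "neg", "II": "neg", "III": "pos"},
    "Grade II positive": {"I": "neg", "II": "pos", "III": "pos"},
}


def collapse(table, policy):
    """Collapse a 5x5 grade table to 2x2 counts keyed by (row_class, col_class)."""
    counts = {(r, c): 0 for r in ("neg", "pos") for c in ("neg", "pos")}
    for i, row_grade in enumerate(GRADES):
        for j, col_grade in enumerate(GRADES):
            row_class = policy.get(row_grade)
            col_class = policy.get(col_grade)
            # Skip the pair if either method gave an excluded grade.
            if row_class is not None and col_class is not None:
                counts[(row_class, col_class)] += table[i][j]
    return counts


for name, policy in POLICIES.items():
    print(name, collapse(TABLE_1, policy))
```

Keeping the policies as plain mappings makes it easy to see which grades are excluded under each treatment and to apply the same collapse to Tables 3 and 5.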
Since the adjudication of the results only considered the Hay-Ison grades from the manual and digital reviews, the object class semi-quantitative scores from the manual review and the digital review were compared using the Wilcoxon Signed Rank Test. This test was used because the data were paired and measured on an ordinal (semi-quantitative) scale, which calls for a non-parametric method based on ranks rather than on the actual values. The null hypothesis is that the median of the differences between the paired data is 0; the alternative is that it is not 0. The α level was set at 0.05 for this analysis. In the results below, a conclusion of “reject null hypothesis” means that the p-value was smaller than α and the null hypothesis was rejected in favor of the alternative, that is, the median difference between the pairs is different from 0. A conclusion of “fail to reject null hypothesis” means that there was insufficient evidence that the median difference differs from 0. All tests were run using R 4.2.1.
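For readers who want to see the mechanics of the test, the following Python sketch implements a two-sided Wilcoxon signed-rank test using the normal approximation with tie correction and a continuity correction. It is an illustration under stated assumptions, not the study’s R code: zero differences are dropped, tied absolute differences receive average ranks, and no exact p-value is computed, so small tie-free samples may give slightly different results from software that uses the exact distribution. The scores are hypothetical values on the 0 to 4 scale, not study data.

```python
import math


def wilcoxon_signed_rank_p(x, y):
    """Two-sided Wilcoxon signed-rank p-value (normal approximation)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        return 1.0  # every pair identical: no evidence of a shift

    ordered = sorted(abs(d) for d in diffs)
    rank_of = {}      # average rank for each distinct absolute difference
    tie_term = 0.0    # sum of t^3 - t over tie groups
    i = 0
    while i < n:
        j = i
        while j < n and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j

    v = sum(rank_of[abs(d)] for d in diffs if d > 0)  # sum of positive ranks
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    shift = v - mean
    correction = 0.5 if shift > 0 else -0.5 if shift < 0 else 0.0
    z = (shift - correction) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical semi-quantitative scores (0 = absence .. 4 = many).
manual = [0, 1, 2, 3, 4, 2, 1, 0, 3, 2]
digital = [1, 2, 3, 4, 4, 3, 2, 1, 4, 3]

p = wilcoxon_signed_rank_p(manual, digital)
alpha = 0.05
print(f"p = {p:.4f}:", "reject" if p < alpha else "fail to reject", "null hypothesis")
```

In this made-up example the digital score is one grade higher on nine of ten slides, so the test rejects the null hypothesis at α = 0.05.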
Results
Hay-Ison Grade
Table 1 cross-tabulates the Hay-Ison Grades from the manual review and the digital review, and Table 2 summarizes the comparison under the three treatments of Grade II. With Grade II ignored and with Grade II negative, the p-value was greater than the set α of 0.05, so we failed to reject the null hypothesis and treated the two methods as equivalent in proportions. With Grade II positive, the p-value was smaller than α, so the null hypothesis was rejected in favor of the alternative, meaning that the proportions differed between the two methods. With Grade II ignored and with Grade II negative, the sensitivity and specificity were 97% and 100%, respectively, showing that the manual and technologist-adjusted digital results were very similar. Both cases also had a PPV of 100%, meaning that when the grade was classified as negative there was 100% agreement, with NPVs of 86% and 79%. With Grade II positive, the sensitivity, specificity, and PPV remained high (82%, 84%, and 93%) while the NPV was lower (66%); overall, the two methods still provided broadly similar results. With Grade II ignored and with Grade II negative, kappa was above 0.81, indicating near-perfect agreement, while the Grade II positive case (kappa = 0.61) would be classified as substantial agreement.
Table 1: Hay-Ison Grade Truth Table Between Digital Review (rows) and Manual Review (columns)
Digital Review Grade | Manual Grade 0 | Manual Grade I | Manual Grade II | Manual Grade III | Manual Inconclusive |
Grade 0 | 6 | 0 | 3 | 0 | 0 |
Grade I | 2 | 113 | 9 | 0 | 3 |
Grade II | 10 | 22 | 28 | 0 | 4 |
Grade III | 1 | 3 | 2 | 19 | 2 |
Inconclusive | 1 | 9 | 1 | 2 | 0 |
Table 2: Hay-Ison Grade Comparison Between Digital Review and Manual Review
Statistical Measure | Grade II Ignored | Grade II Negative | Grade II Positive |
McNemar p-value | 0.248 | 0.074 | 0.010 |
Sensitivity | 97% | 97% | 82% |
Specificity | 100% | 100% | 84% |
PPV | 100% | 100% | 93% |
NPV | 86% | 79% | 66% |
Kappa | 0.91 | 0.87 | 0.61 |
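The kappa values in Table 2 can be recomputed from the collapsed 2×2 counts of Table 1. The Python sketch below is a minimal illustration, not the study’s R code. The descriptive bands follow the widely used Landis and Koch convention, which is assumed here because it matches the report’s wording (“substantial”, “near perfect”); the report itself does not name its source for these labels. The function names are hypothetical.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a paired 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from the row and column marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Descriptive label for kappa (Landis and Koch convention, assumed)."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    if kappa >= 0:
        return "slight"
    return "poor"


# 2x2 counts (digital rows, manual columns) collapsed from Table 1:
# a = both negative class, b = digital neg / manual pos,
# c = digital pos / manual neg, d = both positive class.
cases = {
    "Grade II ignored": (113, 0, 3, 19),
    "Grade II negative": (172, 0, 5, 19),
    "Grade II positive": (113, 9, 25, 49),
}

for name, counts in cases.items():
    k = cohen_kappa(*counts)
    print(f"{name}: kappa = {k:.2f} ({agreement_band(k)})")
```

These counts give kappa values of about 0.91, 0.87, and 0.61, in line with Table 2.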
Table 3 cross-tabulates the Hay-Ison Grades from the manual review and the adjudicated results, and Table 4 summarizes the comparison under the three treatments of Grade II. With Grade II ignored and with Grade II negative, the p-value was greater than the set α of 0.05, so we failed to reject the null hypothesis and treated the two methods as equivalent in proportions. With Grade II positive, the p-value was smaller than α, so the null hypothesis was rejected in favor of the alternative, meaning that the proportions differed between the two methods. With Grade II ignored and with Grade II negative, the sensitivity and specificity were 100% and 88%, respectively, showing that the manual and adjudicated results were very similar. Both cases also had a PPV of 98%, meaning that when the grade was classified as negative there was very high agreement, and an NPV of 100%, so the two always agreed on those classified as positive. With Grade II positive, the sensitivity, PPV, and NPV remained high (96%, 84%, and 92%) while the specificity was lower (72%); overall, the two methods still provided broadly similar results. With Grade II ignored and with Grade II negative, kappa was above 0.81, indicating near-perfect agreement, while the Grade II positive case (kappa = 0.70) would be classified as substantial agreement.
Table 3: Hay-Ison Grade Truth Table Between Manual Review (rows) and Adjudicated Result (columns)
Manual Review Grade | Adjudicated Grade 0 | Adjudicated Grade I | Adjudicated Grade II | Adjudicated Grade III | Adjudicated Inconclusive |
Grade 0 | 10 | 1 | 8 | 1 | 0 |
Grade I | 0 | 119 | 20 | 3 | 5 |
Grade II | 1 | 5 | 37 | 0 | 0 |
Grade III | 0 | 0 | 0 | 21 | 0 |
Inconclusive | 0 | 3 | 2 | 1 | 3 |
Table 4: Hay-Ison Grade Comparison Between Manual Review and Adjudicated Result
Statistical Measure | Grade II Ignored | Grade II Negative | Grade II Positive |
McNemar p-value | 0.248 | 0.248 | 0.001 |
Sensitivity | 100% | 100% | 96% |
Specificity | 88% | 88% | 72% |
PPV | 98% | 98% | 84% |
NPV | 100% | 100% | 92% |
Kappa | 0.92 | 0.93 | 0.70 |
Table 5 cross-tabulates the Hay-Ison Grades from the digital review and the adjudicated results, and Table 6 summarizes the comparison under the three treatments of Grade II. With Grade II ignored, there was 100% agreement, so no p-value could be calculated because there were no discordant pairs. Accordingly, the sensitivity, specificity, PPV, and NPV were all 100%, and kappa was 1, meaning perfect agreement. With Grade II negative and with Grade II positive, the p-value was greater than the set α of 0.05, so we failed to reject the null hypothesis and treated the two methods as equivalent in proportions. In these two cases, the sensitivity and specificity were at least 98% and 95%, respectively, showing that the technologist-adjusted digital results and the adjudicated results were very similar. Both cases also had a PPV of at least 97%, meaning that samples classified as negative were almost always in agreement, and an NPV of at least 92%, so agreement was also high for samples classified as positive. In both cases, kappa was above 0.81, indicating near-perfect agreement.
Table 5: Hay-Ison Grade Truth Table Between Digital Review (rows) and Adjudicated Result (columns)
Digital Review Grade | Adjudicated Grade 0 | Adjudicated Grade I | Adjudicated Grade II | Adjudicated Grade III | Adjudicated Inconclusive |
Grade 0 | 7 | 0 | 2 | 0 | 0 |
Grade I | 1 | 122 | 4 | 0 | 0 |
Grade II | 2 | 3 | 57 | 0 | 2 |
Grade III | 1 | 0 | 2 | 23 | 1 |
Inconclusive | 0 | 3 | 2 | 3 | 5 |
Table 6: Hay-Ison Grade Comparison Between Digital Review and Adjudicated Result
Statistical Measure | Grade II Ignored | Grade II Negative | Grade II Positive |
McNemar p-value | N/A | 0.480 | 1.0 |
Sensitivity | 100% | 99% | 98% |
Specificity | 100% | 100% | 95% |
PPV | 100% | 100% | 97% |
NPV | 100% | 92% | 96% |
Kappa | 1.00 | 0.95 | 0.93 |
Object classes
The semi-quantitative prevalence scores assigned to each class of bacteria or cells were also evaluated individually. The original analysis compared the manual read to the digital read (Table 7). A conclusion of “reject null hypothesis” means that the p-value was smaller than α and the null hypothesis was rejected in favor of the alternative, that is, the median difference between the pairs is non-zero. A conclusion of “fail to reject null hypothesis” means that there was insufficient evidence that the median difference differs from 0. Only Lactobacilli and Filamentous Mycelium failed to reject the null hypothesis, so the two methods were considered equivalent for these two classes.
Table 7: Comparison of Semi-Quantitative Prevalence Scores Between Digital Review and Manual Review
Object Class | Wilcoxon Signed Rank | Conclusion |
Bacilli Gram Negative | p-value = 1.797e-09 | reject null hypothesis |
Bacilli Gram Positive | p-value = 3.157e-14 | reject null hypothesis |
Clue Cells | p-value = 0.02388 | reject null hypothesis |
Cocci Gram Positive | p-value = 3.786e-07 | reject null hypothesis |
Corynebacteria | p-value < 2.2e-16 | reject null hypothesis |
Lactobacilli | p-value = 0.8488 | fail to reject null hypothesis |
Epithelial Cells | p-value < 2.2e-16 | reject null hypothesis |
Erythrocytes | p-value < 2.2e-16 | reject null hypothesis |
Filamentous Mycelium | p-value = 0.8853 | fail to reject null hypothesis |
Gardnerella Vaginalis | p-value = 0.000121 | reject null hypothesis |
Leukocytes | p-value = 1.654e-07 | reject null hypothesis |
Mobiluncus | p-value = 1.018e-05 | reject null hypothesis |
Yeasts | p-value = 0.000186 | reject null hypothesis |
To further evaluate the semi-quantitative prevalence scores for each of the object classes, scores that differed by no more than 1 grade between the digital review and the manual review were considered equivalent (Table 8). The same null hypothesis and conclusions as above apply. Under this criterion, Clue Cells, Cocci Gram Positive, Lactobacilli, Filamentous Mycelium, and Mobiluncus fail to reject the null hypothesis and can be considered equivalent.
Table 8: Comparison of Semi-Quantitative Prevalence Scores Between Digital Review and Manual Review (Scores Within ±1 Considered Equivalent)
Object Class | Wilcoxon Signed Rank (Adjusted Within ±1) | Conclusion | % Agreement |
Bacilli Gram Negative | p-value = 1.807e-07 | reject null hypothesis | 73% |
Bacilli Gram Positive | p-value = 0.002074 | reject null hypothesis | 81% |
Clue Cells | p-value = 0.7029 | fail to reject null hypothesis | 93% |
Cocci Gram Positive | p-value = 0.151 | fail to reject null hypothesis | 87% |
Corynebacteria | p-value = 7.336e-12 | reject null hypothesis | 52% |
Lactobacilli | p-value = 0.5769 | fail to reject null hypothesis | 90% |
Epithelial Cells | p-value = 2.134e-07 | reject null hypothesis | 87% |
Erythrocytes | p-value = 1.251e-09 | reject null hypothesis | 67% |
Filamentous Mycelium | p-value = 1 | fail to reject null hypothesis | 99% |
Gardnerella Vaginalis | p-value = 0.02132 | reject null hypothesis | 85% |
Leukocytes | p-value = 0.0001341 | reject null hypothesis | 89% |
Mobiluncus | p-value = 1 | fail to reject null hypothesis | 98% |
Yeasts | p-value = 0.04098 | reject null hypothesis | 95% |
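One plausible reading of the ±1 adjustment is that paired differences of at most one grade are recoded as zero before the signed-rank test is run. The Python sketch below illustrates only that recoding step. The report does not specify the exact procedure or how the % Agreement column was derived, so the helper names, the tolerance parameter, and the sample scores are all hypothetical, and the share printed here is not claimed to correspond to that column.

```python
def tolerant_differences(manual, digital, tolerance=1):
    """Paired differences (manual - digital); gaps within the tolerance become 0."""
    return [
        0 if abs(m - d) <= tolerance else m - d
        for m, d in zip(manual, digital)
    ]


def share_within(manual, digital, tolerance=1):
    """Fraction of slide pairs whose scores differ by at most `tolerance`."""
    pairs = list(zip(manual, digital))
    return sum(abs(m - d) <= tolerance for m, d in pairs) / len(pairs)


# Hypothetical scores on the 0 to 4 scale (absence .. many); not study data.
manual_scores = [0, 1, 2, 0, 4, 3, 0, 2]
digital_scores = [1, 1, 4, 0, 3, 3, 3, 2]

adjusted = tolerant_differences(manual_scores, digital_scores)
print("adjusted differences:", adjusted)
print("share within one grade:", share_within(manual_scores, digital_scores))
```

Only the two slides whose scores differ by more than one grade keep a non-zero difference, so only they would contribute to a signed-rank test run on the adjusted values.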
Discussion
Semi-Quantitative Prevalence Scores
Given the variability and subjectivity inherent in reviewing Gram stains for bacterial vaginosis, the differences observed between the digital review and the manual review for the semi-quantitative prevalence scores are not unexpected. The Hay-Ison grading system was developed to reduce the variability of these reviews [1], and in this study, success is based on the comparison results related to the Hay-Ison score. The object classes whose semi-quantitative results are considered in the calculation of the Hay-Ison grade are Lactobacillus, Gardnerella vaginalis, and Mobiluncus. It is notable that the two methods were similar in identifying and quantifying Lactobacillus (p = 0.8488). The distinctive visual appearance of Lactobacillus may be responsible for this result. When viewed under Gram stain, Lactobacillus may be more easily quantified than Gardnerella vaginalis, which is arranged in clusters rather than as single rods.
When semi-quantitative prevalence scores within ±1 of each other were treated as equivalent, Lactobacillus and Mobiluncus failed to reject the null hypothesis (p = 0.5769 and p = 1, respectively). Given the importance of Lactobacillus and Mobiluncus, along with Gardnerella vaginalis, in calculating the Hay-Ison score, this result may imply that between-reviewer variation in semi-quantitative prevalence scores is typically within one grade, which supports the need for methods such as the Hay-Ison grading system that simplify the evaluation so that more variation is tolerated for individual object classes. The agreement between the two methods in identifying and quantifying clue cells (p = 0.7029) is also important because the presence of clue cells is an important factor in Amsel’s criteria for diagnosing bacterial vaginosis [4].
When semi-quantitative prevalence scores within ±1 of each other were treated as equivalent, Gardnerella vaginalis (GV) rejected the null hypothesis (p = 0.02132), so the two methods cannot be considered equivalent for this class. For 14 of the 240 slides included in this study, the GV semi-quantitative prevalence scores differed by more than 1 (Table 9). For 13 of these 14 slides, the score was higher on the digital read than on the manual read. Only one slide had a higher score on the manual read (several) than on the digital read (rare). For that slide, the digital read agreed with the PCR result (rare), and the Hay-Ison Grade was the same for the manual read and the digital read (Grade III) because of the large number of Mobiluncus present, which could have influenced the manual read. Of the 13 other slides, the manual and digital Hay-Ison Grades agreed for 2, the adjudicated Hay-Ison Grade validated the digital read for 8, and the adjudicated Hay-Ison Grade validated the manual read for 3. For these last 3 slides, the manual read reported “absence” of GV. For 2 of them, clusters of Gram positive cocci influenced the digital read (“many” GV and “several” GV); the manual read technician had requested GV confirmation by culture for one of these 2 slides. For the third slide, the manual read technician had also requested GV confirmation by culture because the bacteria could not be classified as either GV or Corynebacteria.
Table 9: Comparison of Gardnerella Vaginalis Semi-Quantitative Scores Between Digital Review and Manual Review
Manual Read Gardnerella Vaginalis | Digital Read Gardnerella Vaginalis | Technician Comments on Manual Read | Technician Comments on Digital Read | Adjudication Comments |
0 – Absence | 4 – Many | Punctuated lactobacilli / Rare to some coryne / Gram Negative Bacilli ? | Very dark Gram – See culture. | Technician personal decision. |
0 – Absence | 4 – Many | Presence of cocci clusters. | ||
3 – Several | 1 – Rare | Presence of a lot of mobiluncus. | ||
0 – Absence | 2 – Some | Technician personal decision. | ||
0 – Absence | 3 – Several | If GV, some GV (no clue cells). If no GV, some Gram Negative Bacilli. | Presence of cocci clusters. | |
0 – Absence | 3 – Several | Technician personal decision. | ||
0 – Absence | 4 – Many | Technician personal decision. | ||
0 – Absence | 4 – Many | Many GV and rare clue cells or many coryne and many Gram Negative Bacilli? | Technician personal decision. | |
0 – Absence | 2 – Some | Agreement on the Hay-Ison Grade. Not included in the adjudicated slides. | ||
0 – Absence | 2 – Some | Agreement on the Hay-Ison Grade. Not included in the adjudicated slides. | ||
0 – Absence | 4 – Many | Many GV and/or corynebacteria? If GV, then rare Clue cells | Thick gram. | |
0 – Absence | 4 – Many | Classification difficulties due to the bacteria morphology. | ||
0 – Absence | 4 – Many | Technician personal decision. | ||
0 – Absence | 4 – Many | Many GV and-or coryne / If GV, some clue cells / Some or several Gram Negative Bacilli? | Classification difficulties due to the bacteria morphology. |
Hay-Ison Grade
When the Hay-Ison Grades from the digital review were compared with those from the manual review, we failed to reject the null hypothesis in the scenarios where Grade II was ignored (p = 0.248) or considered negative (p = 0.074). This may indicate that the methods could be considered equivalent under these circumstances; however, where Grade II was considered positive (p = 0.01), we rejected the null hypothesis, so we cannot consider the two methods equivalent under this circumstance. The difference observed when Grade II is considered positive may be due to the number of slides that were Grade II according to the adjudicated results but were given Grade I by the manual review. The manual review assigned 20 adjudicated Grade II results to Grade I (Table 3), whereas the digital review assigned only 4 adjudicated Grade II results to Grade I (Table 5).
When the Hay-Ison Grades from the digital review and the manual review were each compared to the adjudicated Hay-Ison Grades, the digital review demonstrated agreement under all three conditions for Grade II: agreement was perfect where Grade II was ignored, so no p-value could be calculated, and the p-value was 0.48 or greater under the other two conditions. The manual review demonstrated equivalence with the adjudicated Hay-Ison Grades where Grade II was ignored (p = 0.248) or considered negative (p = 0.248), but where Grade II was considered positive (p = 0.001), equivalence was not observed.
The sensitivity and specificity for the manual review compared to the adjudicated grades provide some insight here: all three conditions for Grade II demonstrated a sensitivity of 96% or better, while specificity was 88% where Grade II was ignored or considered negative and 72% where Grade II was considered positive. This indicates that the manual review method has a high true positive rate but a comparatively low true negative rate.
When comparing the digital review method to the adjudicated results, sensitivity was 98% or better for all of the conditions for Grade II, and specificity was 95% or better demonstrating a greater true negative rate than that of the manual method under all Grade II conditions.
Use of the Techcyte Bacteriology Algorithm
The technologists who performed the digital review using the Techcyte Bacteriology Algorithm expressed their opinions that they were comfortable examining the slides through the Techcyte Platform to reach an accurate conclusion. The current design allows the technologist to change the semi-quantitative prevalence according to their professional judgement. The technicians expressed confidence that they would be able to reach the right conclusions through use of the Techcyte Platform when assessing Gram stained slides.
The Techcyte Bacteriology Algorithm is deterministic in its classification and quantification of cells and bacteria, and the use of the system may provide a more standardized approach to Gram stain analysis thus limiting the effect of the variability known to exist between technicians when performing Gram stain analysis by manual microscopy.
Conclusion
The comparison of the digital review with the adjudicated results demonstrates that the observed sensitivity and specificity of the algorithm are equivalent to or better than those observed when comparing the manual review with the adjudicated results. Therefore, Techcyte concludes that the Techcyte Bacteriology Algorithm is equivalent to or better than the manual method for examining Gram-stained vaginal samples.
No study of efficiency or time saving between the manual and digital review was conducted as part of this clinical validation study. Further evaluation is needed to determine what efficiency gains (if any) may be obtained from use of the Techcyte Bacteriology Algorithm.
No comparison was made between the digital review and manual review methods to determine inter-operator error, as the two methods were not performed by the same set of technologists (Appendix I). While it is known that inter-operator error may affect the reading of Gram stain slides, the extent of its effect on the results of this study is not known. Consequently, the adjudication process was employed in the present study to reduce the effect of inter-operator variability on the final results, and the adjudication results were used as ground truth for this validation. More evaluation may be needed to assess inter-operator variability in a future study.
Appendix I: Personnel
The personnel who performed the study are listed below. The four examiners who performed the manual review were no longer associated with the Clinical Study Site at the time of the digital review.
Manual Review | Facility | Name |
Examiner A | Ketterthill | Stana Vitas |
Examiner B | Ketterthill | Sabine Rosborski |
Examiner C | Ketterthill | Jeremy Salard |
Examiner D | Ketterthill | Sabine Gryglicki |
Digital Review | Facility | Name |
Examiner A | Ketterthill | Françoise Gillardin |
Examiner B | Ketterthill | Lise Bigel |
Examiner C | Ketterthill | Nancy Moinil |
Examiner D | Ketterthill | Hélène Henrot |
References
1. Ison, C. A., and P. E. Hay. “Validation of a simplified grading of Gram stained vaginal smears for use in genitourinary medicine clinics.” Sexually Transmitted Infections 78.6 (2002): 413-415.
2. Chawla, Rohit, et al. “Comparison of Hay’s criteria with Nugent’s scoring system for diagnosis of bacterial vaginosis.” BioMed Research International 2013 (2013).
3. Nugent, Robert P., Marijane A. Krohn, and Sharon L. Hillier. “Reliability of diagnosing bacterial vaginosis is improved by a standardized method of gram stain interpretation.” Journal of Clinical Microbiology 29.2 (1991): 297-301.
4. Spiegel, Carol A., R. Amsel, and K. K. Holmes. “Diagnosis of bacterial vaginosis by direct Gram stain of vaginal fluid.” Journal of Clinical Microbiology 18.1 (1983): 170-177.