Established methods for determining stroke prevalence, such as nationwide surveys, are labor-intensive. Recent advances in search engine query analytics have produced a new disease-surveillance metric for evaluating symptomatic phenomena, such as influenza. The authors hypothesized that search engine query data can be used to determine the prevalence of stroke.
The Google Insights for Search database was accessed to analyze anonymized search engine query data. The authors' search strategy used queries commonly entered when attempting either to identify the signs and symptoms of a stroke or to perform stroke education. The search logic was as follows: (stroke signs + stroke symptoms + mini stroke − heat), covering January 1, 2005, to December 31, 2010.
The relative number of searches performed (the interest level) for this search logic was established for all 50 states and the District of Columbia. A Pearson product-moment correlation coefficient was calculated between these interest levels and previously reported state-specific stroke prevalence data.
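The correlation step can be illustrated with a minimal sketch. The function name `pearson_r` and the two short input lists are hypothetical placeholders chosen for illustration; they are not the study's interest levels or prevalence figures, and the authors' actual software is not stated in the source.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


# Toy placeholder values, NOT data from the study.
toy_interest = [1, 2, 3, 4, 5]
toy_prevalence = [2, 4, 5, 4, 5]
print(round(pearson_r(toy_interest, toy_prevalence), 4))
```

In the study's setting, each list would hold 51 values (one per state plus the District of Columbia), paired by state.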
Web search engine interest level was available for all 50 states and the District of Columbia for the period January 1, 2005, to December 31, 2010. The interest level was highest in Alabama and Tennessee (100 and 96, respectively) and lowest in California and Virginia (58 and 53, respectively). The Pearson correlation coefficient (r) was calculated to be 0.47 (p = 0.0005, 2-tailed).
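As a rough plausibility check on the reported significance, the sketch below converts r into a t statistic and integrates the Student t density numerically. It assumes n = 51 (the 50 states plus the District of Columbia) and the standard t-test for a correlation coefficient with n − 2 degrees of freedom; the source does not state which test the authors used, and the function names are illustrative.

```python
import math


def t_statistic(r, n):
    """t statistic for testing a Pearson r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def t_pdf(t, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_half)


r, n = 0.47, 51  # r from the text; n = 51 is an assumption (50 states + DC)
t = t_statistic(r, n)
p = two_tailed_p(t, n - 2)
print(f"t = {t:.2f}, two-tailed p = {p:.4f}")
```

Under these assumptions, t is about 3.73 on 49 degrees of freedom, which gives a two-tailed p-value of the same order as the reported 0.0005.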
Search engine query data analysis allows for the determination of relative stroke prevalence. Further investigation will reveal how reliable this metric is for analyzing temporal patterns and determining prevalence in this and other symptomatic diseases.