This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
Although autism is often characterized in the literature by the presence of repetitive behavior, individuals with autism spectrum disorder (ASD) have been found to examine more options in a given time period than controls in structured decision tasks.
We aimed to examine whether this investigative tendency emerges in information searches conducted via the internet.
In total, 1746 search engine users stated that they had ASD in 2019. This group’s naturally occurring responses following 1491 unique general queries and 78 image queries were compared to those of all other users of the search engine. The main dependent measure was scrolled distance, which denoted the extent to which additional results were scanned beyond the initial results presented on-screen. Additionally, we examined the number of clicks on search results as an indicator of the degree of search outcome exploitation and assessed whether there was a trade-off between increased search range and the time invested in viewing initial search results.
After issuing general queries, individuals with self-stated ASD scanned more results than controls. The scrolled distance in the results page of general queries was 45% larger for the group of individuals with ASD (
Individuals who self-stated that they had ASD scrutinized more general search results and fewer image search results than the controls. Thus, our results at least partially support the notion that individuals with ASD exhibit investigative behaviors and suggest that textual searches are an important context for expressing such tendencies.
Characterizing the internet browsing style of specific populations is important for understanding how individuals behave during naturally occurring circumstances and can ultimately be used for tailoring interfaces and content to people’s diverse styles and capabilities [
Laboratory studies have discovered that high-functioning individuals with ASD tend to make selections from assorted options in a relatively short period of time [
The internet has been described as a “newly autism-compatible environment” [
To discern whether individuals with self-stated ASD (s-sASD) examine internet search results more thoroughly than individuals without ASD, we analyzed their naturally occurring interactions with an internet search engine. Our research strategy was different from that of most autism studies, which typically use small samples of psychiatrically diagnosed participants [
Our main dependent variable was the scanned range of search results, that is, the extent to which individuals chose to present themselves with more results in addition to those shown on the initial results screen. In addition, we examined the number of clicks on search results (links to other pages) as an estimate of the exploitation of search outcomes. Finally, we tested whether there was a trade-off between increased search range and the time invested in viewing initial search results.
This study was preapproved by the Technion Research Ethics Committee (approval number: 2020003). The data set used in this study is proprietary to Microsoft Corporation and was kept anonymous for privacy reasons. This data set was approved for research, which was to be conducted by the second author in his capacity as a Microsoft Research Senior Principal Researcher. By using a data set of all English-language queries made by people in the United States on the Bing search engine in 2019, we extracted each user’s anonymized ID and their query text. Queries were filtered to identify users who described themselves as autistic in one of their queries. This was determined by the usage of one of the following phrases in query text:
In study 1, we targeted all general web searches in Bing made by individuals with s-sASD in November 2019. Additionally, we examined identical searches that were made by the control group during the same time period. Our main dependent variable was the total distance scrolled (in pixels) in the search results screen, which denoted the extent to which additional results were scanned after the initially presented, on-screen results were scanned. For validation purposes, we also compared the number of scroll events. Scroll events are discrete increases in search range that have unspecified lengths and are produced by the user (using a keyboard or mouse). Additionally, we studied the number of clicked links in the search results. Furthermore, we compared the two groups’ response times, that is, the amount of time until the first scroll event (ie, before the search range was increased) and, for control purposes, the time to the first mouse movement. We included unique queries that were searched at least 10 times. Additionally, we only included queries that were followed by mouse movements, which denoted a user’s response. The total number of queries was 1491, and this represented over 300 million searches; 37,810 were conducted by individuals with s-sASD, and the rest were conducted by the control group.
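The inclusion rules described above can be sketched as a small filter. This is a minimal illustration and not the authors' pipeline: the record layout (`query`, `group`, `mouse_moved`), the group labels, and the function name `eligible_queries` are hypothetical. The count threshold is applied here to searches pooled over both groups, which is only one possible reading of "searched at least 10 times."

```python
from collections import Counter


def eligible_queries(searches, min_count=10):
    """Return the set of unique queries that pass the inclusion rules.

    searches: iterable of dicts with hypothetical keys
      'query' (str), 'group' ('s-sASD' or 'control'), 'mouse_moved' (bool).
    """
    # Rule 1: keep only searches that were followed by a mouse movement.
    responded = [s for s in searches if s["mouse_moved"]]

    # Rule 2: count searches per unique query
    # (assumption: counts are pooled over both groups).
    counts = Counter(s["query"] for s in responded)

    # Rule 3: a query must occur in both groups so that it can be paired.
    groups_per_query = {}
    for s in responded:
        groups_per_query.setdefault(s["query"], set()).add(s["group"])

    return {
        q
        for q, n in counts.items()
        if n >= min_count and groups_per_query[q] == {"s-sASD", "control"}
    }


# Toy records, invented for illustration only.
toy = (
    [{"query": "weather", "group": "control", "mouse_moved": True}] * 9
    + [{"query": "weather", "group": "s-sASD", "mouse_moved": True}] * 2
    + [{"query": "maps", "group": "control", "mouse_moved": True}] * 12
    + [{"query": "news", "group": "s-sASD", "mouse_moved": False}] * 15
)

print(sorted(eligible_queries(toy)))  # ['weather']
```

In the toy data, "maps" is dropped because it has no match in the s-sASD group, and "news" is dropped because none of its searches were followed by a mouse movement.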
In study 2, we targeted all image searches in Bing made by individuals with s-sASD in November 2019 and all matching image searches made by the control group. Initially, in study 2a, we used the same search breadth variables as those in study 1: the total distance scrolled (in pixels), the number of scroll events, and the number of image clicks. Because the criterion of at least 10 searches per unique query yielded too few queries, we included unique image queries that were searched at least 5 times by both the study and control populations. As in study 1, we only included queries that were followed by mouse movements. The total number of matching queries was only 38, which represented 364,386 searches (258 made by individuals with s-sASD). The relatively small number of matching image queries was likely due to the exclusion of devices that do not have a mouse, such as tablets and cellphones. To broaden the sample, in study 2b, we used an alternative dependent variable: the number of thumbnail images displayed to the user. This measure is similar to the total distance scrolled; however, it also includes the images that are available before scrolling. In this substudy, we also included searches with no mouse movements. The total number of image queries in study 2b was 78 (approximately 1.6 million searches, of which 698 were conducted by individuals with s-sASD).
Search indices were averaged across each unique query. Each unique query therefore had a pair of data points—one data point set from the group of individuals with s-sASD and one from the control group. Query topic categories were identified by a proprietary classifier. Age and gender data were also available for a subset of users who were registered with Bing. Our main analysis involved a comparison of the differences between groups in terms of search indices across queries (using two-tailed paired
To further validate our self-statement method, we examined all Bing searches conducted by individuals with s-sASD in 2019 that included the words
The top query topics of the individuals with s-sASD and the control group in November 2019 are shown in
The top 3 query topics that were distinctly popular in either the group of individuals with s-sASD or the control group during November 2019. Odds ratios (ORs) and 95% CIs are presented.
Topics | OR (95% CI)
[group heading missing in source]
[subgroup heading missing in source]
Television shows | 8.83 (8.49-9.18)
Books | 6.16 (5.73-6.63)
Consumer electronics | 4.36 (3.95-4.80)
[subgroup heading missing in source]
Video games | 4.20 (3.09-5.72)
Things to do | 2.73 (2.42-3.08)
Automobiles | 2.70 (2.46-2.95)
[group heading missing in source]
[subgroup heading missing in source]
Flights | 4.82 (4.11-5.66)
Travel guide | 4.70 (4.19-5.27)
Things to do | 4.47 (4.04-4.94)
[subgroup heading missing in source]
Travel guide | 7.08 (5.52-9.10)
Restaurants | 2.96 (2.50-3.52)
Television shows | 1.69 (1.53-1.86)
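For readers less familiar with the statistic reported in this table, an odds ratio and its 95% CI can be computed from a 2x2 table of counts. The sketch below uses the standard log (Wald) interval as an assumption; the source does not state which interval method was used. The function name `odds_ratio_ci` is hypothetical, and the counts are invented purely for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald-type CI for a 2x2 table of counts.

    a: group 1 searches on the topic, b: group 1 searches on other topics,
    c: group 2 searches on the topic, d: group 2 searches on other topics.
    z=1.96 corresponds to a two-sided 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, lower, upper = odds_ratio_ci(a=120, b=880, c=300, d=9700)
print(f"OR {or_value:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1, as in every row of the table, indicates that the topic was more popular in one group than in the other.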
Study 1 results: the mean values of general search parameters for individuals with self-stated autism spectrum disorder (s-sASD) and the control groups.
Variables | Values per queryᵃ, mean (SE) | | Average per individual searchᵇ, mean (SE) | |
| Individuals with s-sASD | Controls (matched) | Individuals with s-sASD | Controls (matched) | Controls (all other users)
Search breadth indices
Scrolled distance (pixels) | 745.38 (30.62) | 513.44 (12.66) | 835.29 (54.81) | 136.07 (12.10) | 231.16 (9.05)
Number of scroll events | 1.33 (0.05) | 1.18 (0.03) | 1.35 (0.08) | 0.32 (0.02) | 0.55 (0.02)
Number of clicked links | 0.94 (0.04) | 0.86 (0.03) | 1.11 (0.10) | 0.93 (0.20) | 0.86 (0.10)
Response times
Time to first scroll | 35.99 (0.59) | 40.81 (0.30) | 33.83 (2.88) | 51.34 (9.48) | 48.74 (4.96)
Time to first mouse movement | 3.28 (0.12) | 3.91 (0.05) | 3.21 (0.23) | 4.48 (0.97) | 4.14 (0.36)
ᵃThe average values for each query give the same weight to each query.
ᵇThe average values for each individual search were weighted by the relative volume of searches for each unique query.
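The two averaging schemes described in the table footnotes differ only in their weights. The following is a minimal sketch under stated assumptions: the function names and the toy values are hypothetical, and only the final ratio uses figures from the scrolled-distance row of the table.

```python
def mean_per_query(values):
    """Unweighted mean: every unique query counts equally."""
    return sum(values) / len(values)


def mean_per_search(values, volumes):
    """Volume-weighted mean: each query counts in proportion to
    the number of searches made for it."""
    total = sum(volumes)
    return sum(v * n for v, n in zip(values, volumes)) / total


# Invented example: mean scrolled distance (pixels) for three queries
# and the number of searches made for each query.
scroll = [900.0, 600.0, 300.0]
volume = [1000, 50, 10]

print(mean_per_query(scroll))           # 600.0
print(mean_per_search(scroll, volume))  # dominated by the most popular query

# Per-query scrolled distance ratio, using the values reported in the table.
ratio = 745.38 / 513.44
print(round(ratio, 2))  # 1.45
```

When a group scrolls farther on its most popular queries, the volume-weighted mean exceeds the per-query mean, which is the pattern the table shows for individuals with s-sASD.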
Data for study 1 (search breadth indices stratified by group). The mean scrolled distance, number of scroll events, and number of clicked links for each query are presented. s-sASD: self-stated autism spectrum disorder.
The search breadth indices of the two groups were even more distinct when we compared individual searches (
The effect of relative search volume on differences between groups in study 1. Queries were divided into the top 50% of queries in both groups, those in the control group, those in the group of individuals with s-sASD, and those in neither group. The error bars denote 95% CIs. s-sASD: self-stated autism spectrum disorder.
The differences between groups seemed consistent with the notion that ASD is associated with increased investigative behaviors. However, an alternative interpretation is that these differences were due to a demographic disparity between groups. Populations with ASD typically have a gender ratio of about 4:1 (males to females) [
Finally, because we matched the control group’s queries to those of individuals with s-sASD, the studied queries represented 48.7% of the searches conducted by those with s-sASD and 15.2% of the searches conducted by the control group. In order to ensure that the subsample of searches conducted by the control group was not biased, we extracted all unique queries that were searched more than 10 times in the control group, except those that were included in our original sample (a total of 23,071 unique queries). We focused on indices for individual searches, since these queries were different from those included in this study. As can be seen in
Study 1 results: an examination of the effects of age and gender in the control group. The adjusted r² denotes the fit of the regression model (proportion of explained variance).
Predicted variables | Age, unstandardized coefficient (SE) | Gender, unstandardized coefficient (SE) | Adjusted r²
Scrolled distance | 0.07 (0.02) | 18.91 (0.71) | 0.004ᵃ
Scroll events | 0.0008 (0.0001) | 0.05 (0.002) | 0.02ᵃ
ᵃSignificant at the
As in study 1, the tendency of participants with s-sASD to exhibit greater search breadth was more distinct when considering all individuals’ searches rather than their average per query. This interaction was significant in terms of scrolled distance (
Similar findings emerged in study 2b, which included a somewhat larger number of queries (
We also examined whether the null effect in study 2b was due to queries mainly producing images of faces. We divided the image queries into those including individual persons’ names (eg, “Bjork smiling”: n=21; other queries: n=57) and compared the number of images presented in each category. The results indicated that for people-related searches, the control group was presented with, on average, 101.8 (SE 8.4) images, whereas 88.5 (SE 8.3) images were presented to the group of individuals with s-sASD. In contrast, for nonpeople-related searches, the control group was presented with 89.2 (SE 4.9) images, whereas 95.4 (SE 10.8) images were presented to the group of individuals with s-sASD. However, this crossover interaction trend was not significant in the repeated measures analysis (
Study 2 results: the mean values of image search parameters for individuals with self-stated autism spectrum disorder (s-sASD) and the controls.
Variables | Values per queryᵃ, mean (SE) | | Values per individual searchᵇ, mean (SE) |
| Individuals with s-sASD | Controls | Individuals with s-sASD | Controls
Study 2a
Scrolled distance | 2555.5 (419.5) | 2680.5 (384.1) | 3111.7 (299.62) | 668.23 (278.39)
Scroll events | 3.75 (0.75) | 3.56 (0.59) | 4.43 (1.07) | 1.03 (0.43)
Clicked images | 0.90 (0.12) | 1.08 (0.10) | 1.09 (0.30) | 0.28 (0.14)
Study 2b
Images displayed | 93.51 (8.16) | 92.60 (4.24) | 93.88 (13.32) | 55.10 (17.39)
Clicked images | 0.42 (0.06) | 0.73 (0.07) | 0.47 (0.11) | 0.12 (0.04)
ᵃThe average values for each query give the same weight to each query.
ᵇThe average values for each individual search were weighted by the relative volume of searches for each unique query.
Data for study 2a stratified by group. The mean scrolled distance, number of scroll events, and number of clicked images for each query are presented. s-sASD: self-stated autism spectrum disorder.
The first major finding we observed was that there was a large difference between how the individuals with s-sASD investigated search results for general searches and image searches. In general searches, participants with s-sASD scanned more results than controls by scrolling down on the screen of results. For example, for general queries, the scrolled distance of participants with s-sASD was 1.45 times larger than that of the control group (745.38 pixels vs 513.44 pixels). In image searches, no such trend emerged. The increased search range in general searches made by individuals with s-sASD supports the validity of laboratory studies showing that high-functioning individuals with ASD are prone to inspective behaviors [
The differences between image searches and general searches may have been due to the fact that image searches involve greater visual load than typical general searches. In individuals with ASD, the capacity for selective attention was found to be more impaired by high visual load when compared to that capacity in typically developing persons [
The second major finding of this study was that the investigative tendency of individuals with s-sASD was more pronounced in individuals’ average search parameters (across all of the queries they made) than in parameters averaged at the query level. This interaction was driven by an increase in the gap in highly popular queries between groups. For example, in study 1, with regard to the 20 most frequently searched queries in each group, the scrolled distance of individuals with s-sASD was 2.16 times higher than the mean scrolled distance of the control group. In contrast, with regard to the 20 least frequently performed queries, the scrolled distance for individuals with s-sASD was only about 1.26 times higher. Similar patterns emerged in study 2. The fact that differences between groups emerged mainly for queries that were highly popular among individuals with s-sASD is consistent with the principal tendency of such individuals to make large efforts for relatively specific areas of interest [
An important limitation of this study is the fact that individuals with s-sASD were not diagnosed with autism. Indeed, the only information about their identity stemmed from the statements that they made while searching for information on the web. Nevertheless, our validation analysis showed that these individuals were also actively making queries regarding autism-related websites. Furthermore, the favorite queries of individuals with s-sASD (
In our opinion, our findings make two contributions to the study of autism. First, they shed light on how individuals with self-reported ASD engage in naturally occurring internet searches. The answer appears to differ when the search mainly produces text and when it produces images, and these differences are affected by the queries’ relative popularity. Therefore, an interesting open question is whether the web search style of individuals with autism, which involves the fast scanning of many search results, can be adapted by individuals with autism and those without autism.
Second, our findings provide a new method for assessing autism via the “digital footprints” of one’s search statements. Therefore, we transitioned from the use of single or multiple keywords [
List of exclusion terms.
ASD: autism spectrum disorder
s-sASD: self-stated autism spectrum disorder
EY-T is an employee of Microsoft, owner of Bing.