This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Online interventions increasingly target cognitive outcome measures, but so far no quick and easy self-monitors for cognition have been validated or proven reliable and feasible.
This study examines a new instrument called the Brain Aging Monitor–Cognitive Assessment Battery (BAM-COG) for its alternate forms reliability, face and content validity, and convergent and divergent validity. Also, reference values are provided.
The BAM-COG consists of four easily accessible, short, yet challenging puzzle games that have been developed to measure working memory (“Conveyer Belt”), visuospatial short-term memory (“Sunshine”), episodic recognition memory (“Viewpoint”), and planning (“Papyrinth”). A total of 641 participants were recruited for this study. Of these, 397 adults, 40 years and older (mean 54.9, SD 9.6), were eligible for analysis. Study participants played all games three times with 14 days in between sets. Face and content validity were based on expert opinion. Alternate forms reliability (AFR) was measured by comparing scores on different versions of the BAM-COG and expressed with an intraclass correlation (ICC: two-way mixed; consistency at 95%). Convergent validity (CV) was provided by comparing BAM-COG scores to gold-standard paper-and-pencil and computer-assisted cognitive assessment. Divergent validity (DV) was measured by comparing BAM-COG scores to the National Adult Reading Test IQ (NART-IQ) estimate. Both CV and DV are expressed as Spearman rho correlation coefficients.
Three out of four games showed adequate results on AFR, CV, and DV measures. The games Conveyer Belt, Sunshine, and Papyrinth have AFR ICCs of .420, .426, and .645 respectively. Also, these games had good to very good CV correlations: rho=.577 (
This study provides evidence for the use of the BAM-COG test battery as a feasible, reliable, and valid tool to monitor cognitive performance in healthy adults in an online setting. Three out of four games have good psychometric characteristics to measure working memory, visuospatial short-term memory, and planning capacity.
With the rise of the Internet and the introduction of eHealth, the new research area of online health care has evolved rapidly over the last decade [
From a health-behavior change perspective, both eHealth and gaming are of high interest. Widespread Internet access provides the behavior-change researcher with the platform necessary to reach large populations. In Europe and North America, Internet penetration ranges between 63.2-78.6% of the total population [
An important drawback of the Internet is that its content has to be fast and entertaining [
The effects of aging on cognitive functions have been studied increasingly [
Online cognitive testing has already been proven valid and reliable in children aged 10-12 years [
To our knowledge, this is the first study to describe, validate, and examine an online self-monitor for cognitive functioning that makes use of visually attractive, easy-to-instruct puzzle games. The BAM-COG was not developed as a diagnostic tool (eg, for the assessment of pathological cognitive aging such as dementia), nor was it designed to predict cognitive decline over time. The aim of the BAM-COG was to enable users to establish their cognitive performance and to monitor their personal cognitive development over time. This is of major importance because it greatly expands the possibilities for online research on cognitive functioning, increases reach, and reduces costs in both money and time.
The hypotheses for this study are that the BAM-COG games have good alternate forms reliability and that the face and content validity of the four newly developed puzzle games of the BAM-COG transfer into good convergent and divergent validity, compared with standard paper-and-pencil and computer-assisted cognitive assessment.
We set out to validate the BAM-COG in a cohort of community-dwelling individuals aged 40 years and older
The research website was available to participants for four months. Upon enrollment, we registered sex, age, and education level—the latter ranging from 1-8, where 1 is the lowest value (elementary school) and 8 is the highest value (university level education; see [
On their first two visits, participants performed the same BAM-COG games (see
There were two parts in this study. Part 1 involved the data collection for AFR analyses and reference values, which was done exclusively via the Internet. Participants in Part 1 were estimated to need approximately 45 minutes per session to complete the BAM-COG. In total, after three rounds of BAM-COG puzzles within 28 days, participants were estimated to have spent approximately 135 minutes on the BAM-COG. This group will be referred to as the “Online group” from this point on. Part 2 involved the data collection necessary to calculate the BAM-COG’s convergent validity (CV) and divergent validity (DV). For this procedure, in addition to playing the BAM-COG games online, participants visited the Radboud University Medical Center (RUMC) once (this group will be referred to as the “RUMC group”). This group of participants performed both computerized cognitive tests (subtests from the Cambridge Automated Neuropsychological Test Battery or CANTAB) and paper-and-pencil neuropsychological tests (PnP) (see
For the group of participants visiting the RUMC, two additional inclusion and exclusion criteria were applied. Potential participants were excluded if they had a score ≤24 on the Mini-Mental State Examination (MMSE [
BAM-COG (Brain Aging Monitor–Cognitive Assessment Battery) game details.
BAM-COG game | Cognitive domain | Total levelsa | Range of scores | Short description |
Conveyer Belt | Working memory | 7 | 4-10 | This game shows a participant a grocery list on screen. After 1 second, the conveyer belt turns on. Groceries run down the belt and participants need to select only those products that are on their list. |
Sunshine | Visuospatial short-term memory | 8 | 3-10 | In this game, a sun creates visual patterns in a 5x5 cloud matrix. This visual pattern dissolves and, after it has completely disappeared, participants are asked to reproduce this pattern in the exact same order as it initially appeared on screen. |
Viewpoint | Episodic recognition memory | 8 | 1-8 | This game presents a 5x5 matrix filled with stimuli (asterisks) to the participant. The participant gets 3 seconds to memorize this presented pattern before it disappears from the screen. After 3 seconds, 3 answer possibilities appear on screen from which the participant is to pick the answer that is an exact match to the previously shown matrix. |
Papyrinth | Executive function - planning | 5 | 3-7 | This game starts by presenting the participant with a scrambled path. The participant's task is to unscramble the path so their pawn can move from start to finish unobstructed. Clearing the route is done by sliding columns and rows in the correct order so that all pieces of road end up connected to each other. |
aExcluding the practice level.
BAM-COGa domains and proposed matching computerized and paper-and-pencil cognitive tests.
BAM-COG game (domain) | CANTABb | Paper and pencil |
Conveyer Belt | Spatial Working Memory [ | Letter-Number Sequencing Task from WAIS-IIIc [ |
Sunshine | Spatial Span [ | Spatial Span subtest from WMS-IIId [ |
Viewpoint | Pattern Recognition [ | Continuous Visual Memory Task [ |
Papyrinth | Stockings of Cambridge [ | Zoo Map Task, part of the BADSe [ |
aBAM-COG: Brain Aging Monitor–Cognitive Assessment Battery. For a short description of the BAM-COG games, see
bCANTAB: Cambridge Automated Neuropsychological Test Battery.
cWAIS-III: Wechsler Adult Intelligence Scale, third edition.
dWMS-III: Wechsler Memory Scale, third edition.
eBADS: Behavioral Assessment of the Dysexecutive Syndrome.
According to our sample size calculations for CV and DV, we needed 37 participants for Part 2 of our study (alpha error probability <.05; power, ie, 1 − beta error probability, =.8). Sample size calculation was performed using GPower 3.1 [
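To make the sample size reasoning concrete, the sketch below applies the common Fisher z approximation for detecting a correlation. It is not the GPower routine used in the study, and the expected effect size is not stated in the text: the value r=.4 is an illustrative assumption (chosen only because rho≥.4 is the threshold used here for good CV), as are the one-tailed test and the function name `n_for_correlation`.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80, one_tailed=True):
    """Approximate n needed to detect a correlation r (Fisher z method)."""
    z = NormalDist()
    # Critical value for the chosen alpha (one- or two-tailed).
    z_alpha = z.inv_cdf(1 - alpha) if one_tailed else z.inv_cdf(1 - alpha / 2)
    # Quantile corresponding to the desired power (1 - beta).
    z_beta = z.inv_cdf(power)
    # Fisher z-transform of the assumed population correlation.
    fisher_z = math.atanh(r)
    return math.ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)


# Assumed effect size r = .4 (hypothetical; not reported in the text).
print(n_for_correlation(0.4))  # prints 38
```

Under these assumptions the approximation gives 38, close to the 37 participants reported; exact methods can differ from this approximation by a participant or two.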
The BAM-COG consists of four puzzle games developed to measure working memory, visuospatial short-term memory, episodic recognition memory, and executive function-planning (see
Subjects in the RUMC group also performed CANTAB and PnP tasks matched to the BAM-COG's cognitive domains (see
Based on expert opinion from two neuropsychologists, a geriatrician, a public health researcher, and a professional game-design team, the four puzzle games were considered to cover the chosen cognitive constructs of working memory, visuospatial short-term memory, episodic recognition memory, and planning. After this initial assessment, the instrument outline was discussed with a broader group of health care professionals consisting of neuropsychologists, epidemiologists, public health care researchers, and general psychologists. It was agreed that from a content point of view, it would be impossible to cover every cognitive domain that decreases in functionality across the lifespan, when fast and easy access are key criteria. It was decided that choosing three executive functions and one specific memory function, all of which have been established to decline in normal aging and neurodegenerative syndromes [
Alternate forms reliability (AFR) was determined by comparing the three batches of BAM-COG games, administered at different time points. Every batch represents a parallel version of the BAM-COG containing an equal number of levels and trials. Theoretically, these batches do not differ from one another in difficulty. The AFR was determined with an intraclass correlation (ICC: two-way mixed; consistency at 95%) on the results of the second and third round performances of the participants. With respect to interpretation of the ICCs, we needed to take into consideration that the study was executed outside of a clinical laboratory setting, in an environment where people could be easily distracted, which may affect the test’s reliability. Therefore, ICC values between .4 and .6 were considered sufficient to support AFR for the BAM-COG. This is in line with another online validation study [
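For readers who want to reproduce this kind of statistic outside SPSS, the following minimal sketch computes a two-way, consistency-type ICC from a participants-by-sessions table. The text does not state whether single or average measures were reported, so single measures is an assumption here; the function name `icc_consistency` and the example scores are hypothetical.

```python
def icc_consistency(scores):
    """Two-way consistency ICC, single measures.

    scores: list of rows, one per participant, each holding k session scores.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical round 2 and round 3 scores for five participants.
example = [[4, 5], [6, 6], [5, 7], [8, 7], [5, 4]]
print(round(icc_consistency(example), 3))
```

Because this is the consistency form, a constant shift between sessions (for example, everyone scoring one level higher in the later round) does not lower the coefficient.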
To further analyze possible systematic differences between measurements, Bland-Altman plots were calculated. In these plots, the differences between two sessions were plotted against their mean. Furthermore, the scores’ means and limits of agreement were calculated as the mean of the difference between the two measurements ±2 SD of these differences. The standard error of measurement and the 95% confidence intervals for the mean difference between the two measurements were also calculated. If the 95% confidence interval does not include zero, this indicates a systematic and undesirable change in the mean [
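The Bland-Altman quantities described above can be sketched as follows. The limits of agreement use the mean difference ±2 SD, as in the text. The 95% confidence interval uses a normal critical value, which is a simplifying assumption of this sketch (a t critical value would be more exact for small samples), and the function name and example scores are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def bland_altman(first, second, sd_multiplier=2.0):
    """Return Bland-Altman summary statistics for two paired sessions."""
    diffs = [b - a for a, b in zip(first, second)]
    pair_means = [(a + b) / 2 for a, b in zip(first, second)]
    bias = mean(diffs)          # mean difference between the sessions
    sd = stdev(diffs)           # sample SD of the differences
    # Normal-approximation 95% CI for the mean difference (assumption).
    half_width = NormalDist().inv_cdf(0.975) * sd / math.sqrt(len(diffs))
    ci95 = (bias - half_width, bias + half_width)
    return {
        "points": list(zip(pair_means, diffs)),  # x = mean, y = difference
        "bias": bias,
        "limits": (bias - sd_multiplier * sd, bias + sd_multiplier * sd),
        "ci95": ci95,
        # A CI that excludes zero suggests a systematic change in the mean.
        "systematic": not (ci95[0] <= 0 <= ci95[1]),
    }


# Hypothetical scores from two sessions of the same game.
result = bland_altman([4, 5, 6, 5, 7], [5, 5, 6, 6, 7])
print(result["bias"], result["limits"], result["systematic"])
```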
The CV determines whether the cognitive domain supposedly measured by the BAM-COG game is actually assessed, using validated cognitive tasks as gold standards. In contrast, the DV examines to what extent the BAM-COG correlates with cognitive domains it should not correlate with. By comparing the BAM-COG game scores to a non-related cognitive construct (in this study, IQ scores derived from the Dutch version of the National Adult Reading Test, NART), the distinctive capacities of the BAM-COG are established. Due to non-normal data distribution on BAM-COG outcome measures and small samples, both CV and DV of the BAM-COG are calculated using a one-tailed Spearman’s rho correlation coefficient.
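As an illustration of the rank-based approach, the sketch below computes Spearman's rho as a Pearson correlation on tie-averaged ranks. The one-tailed P value is obtained with a permutation test, which is a substitute technique chosen to keep the example dependency-free; it is not the procedure used in SPSS, and it tests only the positive direction. All function names and data are hypothetical.

```python
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def one_tailed_p(x, y, n_perm=5000, seed=0):
    """Permutation P value for the hypothesis rho > 0."""
    rng = random.Random(seed)
    observed = spearman_rho(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if spearman_rho(x, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical game scores and reference-test scores.
game = [4, 5, 5, 6, 7, 8, 6, 9]
reference = [10, 12, 11, 14, 15, 18, 13, 17]
print(round(spearman_rho(game, reference), 3), one_tailed_p(game, reference))
```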
For interpretation purposes, the data from the three batches were aggregated into one measure for the calculation of CV and DV. This enables us to judge the task as one entity instead of three separate batches. Single test statistics were generated based on participants’ average game scores (for more information on scoring, see Instruments). Reference values are provided for the games to provide some insight into the expected distribution of scores in a normal aging population of people aged 40 years and older. For every analysis, participants with a raw test score of 0 were excluded. This was done as these participants had either viewed the instructions but not started playing or played only one or two trials out of the necessary three to advance to the next level.
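One possible reading of this aggregation step is sketched below: a participant's scores for a game are averaged across the rounds they played, and the participant is left out of that game's analysis if any recorded score is 0. The exact exclusion rule and data layout are assumptions of this sketch, and the function name and participant identifiers are hypothetical.

```python
from statistics import mean


def aggregate_game_scores(rounds_by_participant):
    """Average each participant's scores for one game across rounds.

    rounds_by_participant maps a participant id to a list of round scores,
    with None for rounds that were not played. Participants with a raw
    score of 0 (or with no played rounds) are excluded.
    """
    aggregated = {}
    for pid, rounds in rounds_by_participant.items():
        played = [s for s in rounds if s is not None]
        if not played or any(s == 0 for s in played):
            continue  # excluded from this game's analysis
        aggregated[pid] = mean(played)
    return aggregated


# Hypothetical Sunshine scores over three rounds.
sunshine = {
    "p1": [5, 6, 7],
    "p2": [0, 4, None],        # raw score of 0: excluded
    "p3": [None, None, None],  # never played: excluded
    "p4": [4, None, 5],
}
print(aggregate_game_scores(sunshine))
```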
This study was deemed exempt from formal ethical evaluation by the local medical ethics committee (region Arnhem-Nijmegen, registration number: 2011/490). All statistical analyses were performed using IBM SPSS Statistics for Windows, Version 20.0. The Bland-Altman plots were performed with GraphPad Prism version 5.03 for Windows.
BAM-COG’s feasibility was assessed based on the total number of registrations and dropouts, the percentage of participants who played and completed the first, second, and third rounds, and examination of the score distributions for floor and ceiling effects.
Through our research website, 641 participants were enrolled in this study, of whom 124 (19.3%) were excluded because they did not fulfill the age criterion. Immediately after registering, each participant was asked to perform the BAM-COG test battery for the first time. A total of 76.8% (397/517) of the remaining participants played at least one game and were therefore eligible for analyses; 78.6% (312/397) of these were women. The mean age was 54.9 (SD 9.6) years and the mode of education level was 6 (range 1-8).
We recruited 56 participants for Part 2 of the study. Of these 56 participants, 41 were willing to register online; their mean age was 60.8 (SD 8.2) years, 58.5% (24/41) were female, and the mode of education level was 7 (range 1-8). All participants were native Dutch speakers. All were able to successfully complete the CANTAB Motor Screening Task. In total, 21 (51.2%) of the 41 participants completed the CANTAB tasks first, and 20 (48.8%) completed the PnP tasks first.
In
Mean (SD) for age, MMSEa, NART-IQb, and BAM-COGc scores and mode (range) for education for RUMCd and online group.
Characteristic | Online group | RUMC group |
Age, years, mean (SD) | 54.9 (9.6) | 60.8 (8.2) |
Education, mode (range) | 6 (1-8) | 7 (1-8) |
MMSE, mean (SD) | -- | 29.4 (1.07) |
NART-IQ, mean (SD) | -- | 123.2 (12.83) |
Conveyer Belt score | 5.95 (n=217) | 6.33 (n=26) |
Sunshine score | 4.60 (n=236) | 5.10 (n=24) |
Viewpoint score | 3.97 (n=306) | 3.90 (n=28) |
Papyrinth score | 4.64 (n=152) | 5.30 (n=21) |
aMMSE: Mini Mental State Examination.
bNART-IQ: National Adult Reading Test–Intelligence Quotient.
cBAM-COG: Brain Aging Monitor–Cognitive Assessment Battery.
dRUMC: Radboud University Medical Center.
Alternate forms reliability (AFR) of BAM-COGa games in intraclass correlations (ICCb).
BAM-COG game | AFR | 95% CI |
Conveyer Belt (n=55) | .420 | 0.17-0.62 |
Sunshine (n=78) | .426 | 0.23-0.59 |
Viewpoint (n=101) | .167 | −0.04 to 0.36 |
Papyrinth (n=37) | .645 | 0.41-0.80 |
aBAM-COG: Brain Aging Monitor–Cognitive Assessment Battery.
bAll ICC values >.4 are considered to support sufficient AFR.
With the exception of Viewpoint, the BAM-COG games have good (rho≥.4) to very good (rho≥.6) CV in comparison to both the CANTAB and PnP tasks (see
To check that the individual games did not load heavily on the same cognitive domain, we performed Spearman correlation analysis using aggregated game scores. As was expected with a large sample, most correlations are significant. However, the size of the correlations ranges from very small (rho=.143,
Convergent and divergent validity of BAM-COGa games (Spearman rho correlation coefficients).
BAM-COG game | Convergent validityb: cognitive test | rho (P value) | Divergent validityc: cognitive test | rho (P value) |
Conveyer Belt | WAIS-IIId Letter Number Sequencing | .577 (.001) | National Adult Reading Test | −.029 (.44) |
Conveyer Belt | Spatial Working Memory | −.577 (.001) | | |
Sunshine | WMS-IIIe Spatial Span Task | .669 (<.001) | National Adult Reading Test | −.029 (.45) |
Sunshine | Spatial Span | .620 (.001) | | |
Viewpoint | Continuous Visual Memory Test | .202 (.152) | National Adult Reading Test | −.162 (.21) |
Viewpoint | Pattern Recognition | −.157 (.212) | | |
Papyrinth | BADSf Zoo Map | .400 (.036) | National Adult Reading Test | −.134 (.28) |
Papyrinth | Stockings of Cambridge | .424 (.028) | | |
aBAM-COG: Brain Aging Monitor–Cognitive Assessment Battery.
bAll convergent validity values of rho≥.4 are considered to support good CV; values of rho≥.6 are considered very good.
cAll divergent validity values of rho<.2 are considered to support good DV.
dWAIS-III: Wechsler Adult Intelligence Scale, third edition.
eWMS-III: Wechsler Memory Scale, third edition.
fBADS: Behavioral Assessment of the Dysexecutive Syndrome.
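The interpretation thresholds in the table footnotes can be expressed as a small helper, shown here as a sketch. Applying the thresholds to the absolute value of rho is an assumption of this example (made because the negative Spatial Working Memory correlation is counted as supporting CV), and the label "insufficient" and the function names are hypothetical.

```python
def classify_cv(rho):
    """Convergent validity: |rho| >= .6 very good, |rho| >= .4 good."""
    size = abs(rho)
    if size >= 0.6:
        return "very good"
    if size >= 0.4:
        return "good"
    return "insufficient"


def classify_dv(rho):
    """Divergent validity: |rho| < .2 counts as good."""
    return "good" if abs(rho) < 0.2 else "insufficient"


# Paper-and-pencil CV and NART DV correlations taken from the table above.
results = {
    "Conveyer Belt": (0.577, -0.029),
    "Sunshine": (0.669, -0.029),
    "Viewpoint": (0.202, -0.162),
    "Papyrinth": (0.400, -0.134),
}
for game, (cv, dv) in results.items():
    print(f"{game}: CV {classify_cv(cv)}, DV {classify_dv(dv)}")
```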
We present reference values for all games (
BAM-COGa reference values.
Score | Conveyer Belt (n=217): frequency | Conveyer Belt: percentage | Sunshine (n=236): frequency | Sunshine: percentage | Viewpoint (n=306): frequency | Viewpoint: percentage | Papyrinth (n=152): frequency | Papyrinth: percentage |
1 | NAb | NA | NA | NA | 145 | 27.5 | NA | NA |
2 | NA | NA | NA | NA | 57 | 10.8 | NA | NA |
3 | NA | NA | 75 | 19.7 | 32 | 6.1 | 57 | 25.3 |
4 | 78 | 24.4 | 148 | 38.9 | 90 | 17.1 | 82 | 36.4 |
5 | 100 | 31.3 | 79 | 20.8 | 70 | 13.3 | 27 | 12.0 |
6 | 26 | 8.1 | 55 | 14.5 | 41 | 7.8 | 15 | 6.7 |
7 | 43 | 13.5 | 15 | 3.9 | 13 | 2.4 | 44 | 19.6 |
8 | 58 | 18.2 | 6 | 1.7 | 79 | 15 | NA | NA |
9 | 12 | 3.8 | 2 | 0.5 | NA | NA | NA | NA |
10 | 2 | 0.7 | 0 | 0 | NA | NA | NA | NA |
aBAM-COG: Brain Aging Monitor–Cognitive Assessment Battery.
bNA: Not Applicable, as this score is not a possible outcome for this game.
The number of registrations totaled 641 participants. The BAM-COG received nationwide attention on two national radio shows and in several regional and national newspapers and magazines. Of the 517 eligible participants, only 397 participants played at least one game out of any of the three batches (76.8%).
The Conveyer Belt game was played most often at all three assessments (314, 143, and 107 times respectively) and Papyrinth was played least often (189, 123, and 87 times respectively). On average, 75.7% of participants played all four games and, of the participants who finished the last game in a previous round, on average 80.7% returned to play the next round.
Only 8 participants quit while in the middle of playing a game. All the other participants continued until the “game over” message appeared and either continued with the next game or decided to quit playing after this message. The 8 participants who dropped out all stopped while playing Papyrinth, which is the only game that does not have an integrated time limit.
No real floor or ceiling effects were present in the data. The only possible exception to this may be a slight ceiling effect on Papyrinth and Viewpoint (with 19.6%, 44/225 and 15.0%, 79/527 respectively, completing the highest level). Otherwise, the percentages of participants completing the tasks were very low for Sunshine and Conveyer Belt (0.5%, 2/380 and 0.7%, 2/319 respectively).
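The ceiling percentages quoted for Papyrinth and Viewpoint follow directly from the frequencies in the reference values table, as this short sketch shows. The helper name `share_at_score` is hypothetical; the frequencies are copied from the table.

```python
def share_at_score(frequencies, score):
    """Percentage of all recorded results that fall on the given score."""
    total = sum(frequencies.values())
    return 100 * frequencies.get(score, 0) / total


# Score -> frequency, taken from the reference values table.
papyrinth = {3: 57, 4: 82, 5: 27, 6: 15, 7: 44}
viewpoint = {1: 145, 2: 57, 3: 32, 4: 90, 5: 70, 6: 41, 7: 13, 8: 79}

for name, freqs in (("Papyrinth", papyrinth), ("Viewpoint", viewpoint)):
    top = max(freqs)  # highest attainable score for this game
    print(f"{name}: {share_at_score(freqs, top):.1f}% at the highest score ({top})")
```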
This article provides substantial support for the use of the BAM-COG game battery as an online self-monitor for cognitive performance. Three out of four games appear to be adequate measures of the related cognitive concepts (working memory, visuospatial short-term memory, and planning). Conveyer Belt, Sunshine, and Papyrinth all have good alternate forms reliability and turned out to be feasible for use in aging adults. Furthermore, they all have good to very good convergent and divergent validity, and reference values for the games are now available. Since all games were designed to measure some cognitive domain, it stands to reason that the correlations between the games are statistically significant. Their size, however, is either considerably smaller than or equal to the games' correlations with outside gold-standard measurement tools. The game Viewpoint, designed to assess episodic recognition memory, did not show adequate validity or reliability and is not suitable for inclusion in an online assessment battery. In addition, a strength of our setup is the comparison of the BAM-COG scores with the gold-standard CANTAB and PnP tasks. The fact that the BAM-COG games proved to be solid measures of the intended cognitive domains gives reason to expect that these results can be replicated in other samples and that the BAM-COG can be put to use for its intended purpose.
Even though the current findings are promising with respect to the BAM-COG’s applicability, some adjustments can be recommended on the basis of these results. First, we occasionally received feedback of technical difficulties, in particular with the performance of the Conveyer Belt game. Small-sized stimuli (in this case, groceries such as apples and pears) appeared difficult to click resulting in unintentional missed responses. However, although we cannot fully rule out technical issues on some remote systems, this may have also been due to suboptimal mouse handling by individual participants. This explanation is likely since neither the software developers nor the researchers have been able to replicate this problem on different systems with different operating systems and Internet browsers. Moreover, the problem did not emerge so frequently (n=19 out of n=314) that it would have severely influenced the outcomes of our analyses. Second, feedback was given that there is a need for additional practice levels. Apparently just one trial to get acquainted with the task was not always enough for all participants to fully comprehend what was requested of them. This may have resulted in a slight underachievement in average scores. In a future release of the BAM-COG battery, this can easily be taken into account. Third, regardless of our follow-up efforts (one additional phone call and one personal reminder email), 15 participants in the RUMC group failed to register online even after they had visited the memory clinic. Reasons for this dropout could have been a sole interest in the neuropsychological screening at the research center, time restrictions, loss of motivation, or the relative ease with which reminder emails and online interventions can be ignored and forgotten. Additionally, the limited amount of personal contact with the researchers and the ease of the registration process may increase attrition [
In the interpretation of these results, we need to take into account the naturalistic setting in which the games were performed. Laboratory studies, in which results are produced under highly controlled conditions, typically yield higher ICCs and correlations. The BAM-COG assessments in this study were all performed in the participants’ home environment without any supervision by the research team. Because the BAM-COG is not designed to be used in a laboratory setting, we feel the present design is a valid approach to examine its feasibility, validity, and reliability. If anything, the results presented in this study may underestimate the real reliability and validity of the BAM-COG's tasks [
The fact that our population consisted mainly of women (78.6%, 312/397 and 58.5%, 24/41 for Part 1 and Part 2 respectively) somewhat decreases the external validity of this study. However, this type of research and these types of puzzle games have previously been shown to attract more female participants than males [
In sum, this study provides evidence for the use of the BAM-COG test battery as a feasible, reliable, and valid tool to monitor cognitive performance in healthy adults in an online setting. Three out of four games were found to have good to very good psychometric characteristics to measure working memory, visuospatial short-term memory, and planning capacity. It should be stressed that the results can by no means be used to either diagnose neurodegenerative disorders or predict cognitive performance. The BAM-COG is suitable for use in practice for online monitoring of cognition and for stimulating eHealth interventions for healthy brain aging.
Overview of the BAM-COG games.
Short video of BAM-COG’s game play - Conveyer Belt.
Short video of BAM-COG’s game play - Sunshine.
Short video of BAM-COG’s game play - Viewpoint.
Short video of BAM-COG’s game play - Papyrinth.
Overview of the Bland-Altman plots for alternate forms reliability.
AFR: alternate forms reliability
BAM-COG: Brain Aging Monitor–Cognitive Assessment Battery
CANTAB: Cambridge Automated Neuropsychological Test Battery
CV: convergent validity
DV: divergent validity
ICC: intraclass correlation
IQ: Intelligence Quotient
ISCED: International Standard Classification of Education
MMSE: Mini Mental State Examination
NART: National Adult Reading Test
RUMC: Radboud University Medical Center
We would like to thank Keesing Games for their support and effort in developing the games. We would also like to thank Maurice Rijnaard for his contribution in recruiting and examining the participants. This project was funded by a QuickResult grant of the National Initiative Brain and Cognition (NIHC, grant #056-12-011), embedded in the pillar “The Healthy Brain, Program Healthy Cognitive Aging”. RPCK was funded by a QuickResult grant of the National Initiative Brain and Cognition (NIHC, grant # 056-11-011), embedded in the pillar “The Healthy Brain, Program Cognitive Rehabilitation”. The publication fee for this manuscript was funded by an NWO Open Access grant awarded to MGMOR.
None declared.