<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.0 20040830//EN" "journalpublishing.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="2.0" xml:lang="en" article-type="article-commentary"><front><journal-meta><journal-id journal-id-type="nlm-ta">J Med Internet Res</journal-id><journal-id journal-id-type="publisher-id">jmir</journal-id><journal-id journal-id-type="index">1</journal-id><journal-title>Journal of Medical Internet Research</journal-title><abbrev-journal-title>J Med Internet Res</abbrev-journal-title><issn pub-type="epub">1438-8871</issn><publisher><publisher-name>JMIR Publications</publisher-name><publisher-loc>Toronto, Canada</publisher-loc></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">v28i1e101190</article-id><article-id pub-id-type="doi">10.2196/101190</article-id><article-categories><subj-group subj-group-type="heading"><subject>Commentary</subject></subj-group></article-categories><title-group><article-title>Beyond Time Saved: Implementation, Equity, and the Utility Threshold for Nursing AI Scribes</article-title></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><name name-style="western"><surname>Ronquillo</surname><given-names>Charlene E</given-names></name><degrees>MSN, PhD, RN</degrees><xref ref-type="aff" rid="aff1"/></contrib></contrib-group><aff id="aff1"><institution>School of Nursing, Faculty of Health and Social Development, University of British Columbia, Okanagan Campus</institution><addr-line>1147 Research Road</addr-line><addr-line>Kelowna</addr-line><addr-line>BC</addr-line><country>Canada</country></aff><contrib-group><contrib contrib-type="editor"><name name-style="western"><surname>Law</surname><given-names>Stephanie</given-names></name></contrib><contrib contrib-type="editor"><name 
name-style="western"><surname>Leung</surname><given-names>Tiffany</given-names></name></contrib></contrib-group><author-notes><corresp>Correspondence to Charlene E Ronquillo, MSN, PhD, RN, School of Nursing, Faculty of Health and Social Development, University of British Columbia, Okanagan Campus, 1147 Research Road, Kelowna, BC, V1V 1V7, Canada, 1 250-807-8180, <email>charlene.ronquillo@ubc.ca</email></corresp></author-notes><pub-date pub-type="collection"><year>2026</year></pub-date><pub-date pub-type="epub"><day>27</day><month>5</month><year>2026</year></pub-date><volume>28</volume><elocation-id>e101190</elocation-id><history><date date-type="received"><day>12</day><month>05</month><year>2026</year></date><date date-type="accepted"><day>13</day><month>05</month><year>2026</year></date></history><copyright-statement>&#x00A9; Charlene E Ronquillo. Originally published in the Journal of Medical Internet Research (<ext-link ext-link-type="uri" xlink:href="https://www.jmir.org">https://www.jmir.org</ext-link>), 27.5.2026. </copyright-statement><copyright-year>2026</copyright-year><license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (<ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. 
The complete bibliographic information, a link to the original publication on <ext-link ext-link-type="uri" xlink:href="https://www.jmir.org/">https://www.jmir.org/</ext-link>, as well as this copyright and license information must be included.</p></license><self-uri xlink:type="simple" xlink:href="https://www.jmir.org/2026/1/e101190"/><abstract><p>Schwabe et al&#x2019;s pre-post time-motion study of a domain-specific artificial intelligence (AI) speech assistant used by nurses in German long-term care provides one of the few real-world, full-shift evaluations of an AI scribe deployed to a nonphysician workforce, with paired objective observation and self-reported outcomes. This commentary points to the implications of these findings that extend well beyond the time savings headline. The study reports substantial reduction in self-reported documentation time and increased satisfaction with the documentation system, yet workplace satisfaction and the perception that AI scribes are &#x201C;a good idea to implement&#x201D; did not improve. Taken together, these findings show three undertheorized issues for AI scribe implementation in nursing and long-term care. First, postimplementation increases in time spent reviewing entries and retrieving information indicate that AI scribes redistribute cognitive effort from authoring to verification, with unknown consequences for satisfaction, mastery, and error detection. Second, the apparent paradox of rising documentation satisfaction alongside falling expectations of AI quality represents user calibration. Third, the substantial equity considerations of automatic speech recognition documentation reflect a broader trend of AI scribe studies that treat equity as a caveat, rather than treating equitable performance as empirically measurable and testable across variations in linguistic styles, dialects, and social linguistic dimensions. 
To advance the field, the next generation of nursing AI scribe research must treat documentation as a heterogeneous bundle of authoring, reviewing, retrieving, and verifying activities with distinct satisfaction and error profiles; specify and validate end-user&#x2013;defined anchor utilities, rather than having a narrow focus on diffuse improvement; and treat equity testing and reporting of both automatic speech recognition systems and workforce adoption as standard reporting expectations, rather than caveats.</p></abstract><kwd-group><kwd>artificial intelligence</kwd><kwd>long-term care</kwd><kwd>nursing informatics</kwd><kwd>speech recognition</kwd><kwd>health equity</kwd><kwd>digital health</kwd></kwd-group></article-meta></front><body><p>Schwabe et al&#x2019;s [<xref ref-type="bibr" rid="ref1">1</xref>] study is among the few carefully designed real-world evaluations of an automatic speech recognition (ASR) artificial intelligence (AI) scribe in nursing to date, a domain substantially underevaluated relative to physician documentation [<xref ref-type="bibr" rid="ref2">2</xref>]. The full-shift, pre-post time-motion design and the choice of a domain-specific speech recognition system trained on nursing language are methodological advances over the small physician-focused pilots that dominate the AI scribe literature [<xref ref-type="bibr" rid="ref2">2</xref>]. Like much of that literature, Schwabe et al [<xref ref-type="bibr" rid="ref1">1</xref>] is a vendor-conducted evaluation: the first author and all observers were GmbH employees, developers of the ASR AI scribe used in the study. The methodological advance credited here lies in evaluation setting, duration, and analytic transparency, not independence from the vendor. 
This commentary draws attention to three findings that raise research questions the field has yet to engage substantively: (1) documentation work is reshaped, not only reduced; (2) the paradox of rising satisfaction alongside falling quality expectations can be interpreted as a utility threshold; and (3) the equity considerations of ASR-based documentation are treated as a caveat rather than empirically tested.</p><sec id="s2"><title>Documentation Work Is Reshaped, Not Only Reduced</title><p>The reduction in long-term care nurses&#x2019; self-reported documentation time (&#x0394;=&#x2212;31.14 min) is accompanied by an acknowledged but less-discussed compositional change from authoring toward verification and information retrieval: time spent reviewing entries rose from 0.82 to 1.47 minutes, and information retrieval rose from 3.48 to 4.85 minutes [<xref ref-type="bibr" rid="ref1">1</xref>]. The authors note that reduced documentation time could not be directly linked to changes in documentation quality. This raises an open empirical question that the existing AI scribe literature has rarely addressed [<xref ref-type="bibr" rid="ref2">2</xref>]: whether the review and verification of ASR-produced documentation produces the same satisfaction, mastery, and confidence as clinician authoring, or whether it changes error detection, omission, or copy-forward propagation. 
The next generation of evaluation should treat documentation as a heterogeneous bundle of authoring, reviewing, retrieving, and verifying activities, each potentially carrying distinct satisfaction and error profiles.</p></sec><sec id="s3"><title>The Utility Threshold and the &#x201C;Good Idea&#x201D; Paradox</title><p>Nurses&#x2019; expectations that AI scribe use would improve documentation quality were lower at posttest than at baseline, even as satisfaction with the documentation system itself rose, and perceptions that the AI scribe was a &#x201C;good idea to implement&#x201D; slightly decreased [<xref ref-type="bibr" rid="ref1">1</xref>]. Building on technology acceptance literature [<xref ref-type="bibr" rid="ref3">3</xref>], this pattern can be interpreted as signaling calibration rather than disappointment, from which the construct of a <italic>utility threshold</italic> emerges: the minimum demonstrable day-to-day clinical value an AI scribe must produce for nurses to endorse its sustained use. Perhaps counter to developer perceptions, the utility threshold may not be for an AI scribe to &#x201C;do everything well&#x201D; but to &#x201C;do at least one thing well enough that I notice it in my shift.&#x201D; The reduction in overall documentation time per shift may indeed be sufficient for clinicians to continue using the tool, even where documentation quality is not markedly improved. The findings on reduced daily interruptions deserve separate emphasis: interruptions to nursing work are associated with loss of concentration and focus, delays in planned tasks, and incomplete work, and are major contributors to medication errors [<xref ref-type="bibr" rid="ref4">4</xref>,<xref ref-type="bibr" rid="ref5">5</xref>]. Any tool that meaningfully reduces interruptions warrants a dedicated patient safety study. 
Finally, the privacy and confidentiality work that nurses absorb when dictating at the bedside, in corridors, and in shared spaces was not addressed in the study and offers one explanation for why &#x201C;good idea&#x201D; perceptions did not rise alongside other satisfaction measures. AI scribe design should validate explicitly identified anchor utilities such as overall time savings, reduced interruptions, privacy trade-offs, or yet-to-be-identified gains, rather than diffuse improvement. The greatest promise of clinical benefit will arguably stem from these anchor utilities being defined by end users.</p></sec><sec id="s4"><title>Equity as an Underexamined Dimension</title><p>The equity stakes of ASR-based documentation are substantial but underexplored. In a US home health care setting, an evaluation of four ASR systems (two commercial: AWS General, AWS Medical; two open-source: Whisper, Wav2Vec 2.0) against gold-standard transcriptions of patient-nurse encounters found that all systems performed significantly worse for Black patients than White patients, with the largest discrepancies in affective and social linguistic dimensions, particularly relevant for nursing assessment of social cues [<xref ref-type="bibr" rid="ref6">6</xref>]. The mechanism is structural. Training data underrepresent dialect and register variation [<xref ref-type="bibr" rid="ref6">6</xref>], resulting in a documented pattern across major commercial ASR systems with differences in average word error rates among speakers of different racial backgrounds [<xref ref-type="bibr" rid="ref7">7</xref>]. It is useful to distinguish speaker-level ASR bias (dialect, accent, vocal aging) from system-level training bias (sampled at training): speaker-level bias can sometimes be mitigated through fine-tuning, whereas system-level bias requires upstream intervention beyond downstream implementers&#x2019; reach. 
The acknowledgment by Schwabe et al [<xref ref-type="bibr" rid="ref1">1</xref>] of differential performance across dialects, accents, and terminology, while a good start, illustrates a pattern in which AI scribe studies treat equity as a caveat rather than an empirically testable and measurable dimension of technical performance. For example, the German nursing workforce in the study [<xref ref-type="bibr" rid="ref1">1</xref>] includes regional dialect, generational, and language-of-origin variation that can serve as axes along which equitable ASR performance is empirically tested and measured. A second equity dimension lies at the intersection of workforce characteristics and AI scribe adoption. The authors acknowledge that continued tool engagement may differ by career stage, training status, and role-related workload [<xref ref-type="bibr" rid="ref1">1</xref>]; such differences may extend to age, language of origin, prior digital health experience, and other established predictors of technology acceptance. Addressing the ASR equity gap rather than treating it as an afterthought requires conducting subgroup analyses by context-specific axes of linguistic variation, including accent, dialect, and professional characteristics as suggested [<xref ref-type="bibr" rid="ref1">1</xref>], and disaggregated reporting on adoption, dropout, and equity metrics of ASR performance.</p></sec><sec id="s5"><title>A Research Agenda</title><p><xref ref-type="table" rid="table1">Table 1</xref> summarizes research priorities for each of these threads. The Schwabe et al [<xref ref-type="bibr" rid="ref1">1</xref>] study, despite limitations, sets what will hopefully become a precedent for stronger designs and longer time horizons in AI scribe and nursing AI studies. While the 8-week pilot is relatively brief, it offers important insight into implementation factors that determine sustained use, including workflow fit, staffing stability, and organizational readiness [<xref ref-type="bibr" rid="ref8">8</xref>]. 
The time-motion methodology and pairing of objective and self-reported outcomes are the kind of evidence nursing AI scribe research needs more of. The implementation, equity, and utility-specification work, however, remains.</p><table-wrap id="t1" position="float"><label>Table 1.</label><caption><p>Schwabe et al [<xref ref-type="bibr" rid="ref1">1</xref>] findings, commentary interpretation, and research priorities for nursing artificial intelligence scribe research.</p></caption><table id="table1" frame="hsides" rules="groups"><thead><tr><td align="left" valign="bottom">Schwabe et al [<xref ref-type="bibr" rid="ref1">1</xref>] finding</td><td align="left" valign="bottom">Commentary interpretation</td><td align="left" valign="bottom">Research priority</td></tr></thead><tbody><tr><td align="left" valign="top">&#x2193; Self-reported documentation time (&#x0394;=&#x2212;31.14, SE 6.57 min)</td><td align="left" valign="top">Time savings real; sustainability beyond 8 weeks unknown</td><td align="left" valign="top">Longitudinal (&#x2265;12 mo) implementation studies anchored in an implementation science framework (eg, CFIR<sup><xref ref-type="table-fn" rid="table1fn1">a</xref></sup> inner-setting constructs [<xref ref-type="bibr" rid="ref8">8</xref>,<xref ref-type="bibr" rid="ref9">9</xref>])</td></tr><tr><td align="left" valign="top">&#x2191; Reviewing entries (0.82 &#x2192; 1.47 min); &#x2191; information retrieval (3.48 &#x2192; 4.85 min)</td><td align="left" valign="top">Clinical documentation task composition shifts from authoring to review and verification</td><td align="left" valign="top">Disaggregate documentation as a heterogeneous bundle of activities; examine satisfaction, mastery, and error implications of each</td></tr><tr><td align="left" valign="top">&#x2191; Documentation satisfaction; &#x2193; quality expectations; &#x201C;good idea&#x201D; not improved</td><td align="left" valign="top">Utility threshold reached; unaddressed privacy and confidentiality 
labor a possible explanation for stagnant &#x201C;good idea&#x201D; perceptions</td><td align="left" valign="top">Specify and validate end-user&#x2013;defined anchor utilities; investigate privacy and confidentiality protocols for bedside dictation</td></tr><tr><td align="left" valign="top">&#x2193; Daily interruptions</td><td align="left" valign="top">Patient safety relevance, not workflow nicety [<xref ref-type="bibr" rid="ref4">4</xref>,<xref ref-type="bibr" rid="ref5">5</xref>]</td><td align="left" valign="top">Test whether interruption reductions translate into measurable reductions in nursing error, omission, or delay</td></tr><tr><td align="left" valign="top">Workplace satisfaction and &#x201C;good idea to implement&#x201D; unchanged</td><td align="left" valign="top">Adoption gains &#x2260; workforce benefit; possible utility threshold not yet reached for sustained use</td><td align="left" valign="top">Mechanism studies of what drives sustained use vs initial adoption</td></tr><tr><td align="left" valign="top">ASR<sup><xref ref-type="table-fn" rid="table1fn2">b</xref></sup> equity not empirically tested</td><td align="left" valign="top">Bias likely along regional, generational, language-of-origin axes [<xref ref-type="bibr" rid="ref6">6</xref>,<xref ref-type="bibr" rid="ref7">7</xref>]</td><td align="left" valign="top">Equity audits as standard reporting: ASR performance disaggregated by context- and domain-relevant patient/speaker characteristics; adoption disaggregated by workforce demographic</td></tr></tbody></table><table-wrap-foot><fn id="table1fn1"><p><sup>a</sup>CFIR: Consolidated Framework for Implementation Research.</p></fn><fn id="table1fn2"><p><sup>b</sup>ASR: automatic speech recognition.</p></fn></table-wrap-foot></table-wrap></sec></body><back><ack><p>During the preparation of this work, the author used Claude Sonnet 4.6 to identify and review relevant literature and make the text more concise and readable. 
After using this tool, the author reviewed and edited the content as needed and takes full responsibility for the content of the published article.</p></ack><notes><sec><title>Funding</title><p>The author declares that no financial support was received for this work.</p></sec></notes><fn-group><fn fn-type="conflict"><p>None declared.</p></fn></fn-group><glossary><title>Abbreviations</title><def-list><def-item><term id="abb1">AI</term><def><p>artificial intelligence</p></def></def-item><def-item><term id="abb2">ASR</term><def><p>automatic speech recognition</p></def></def-item></def-list></glossary><ref-list><title>References</title><ref id="ref1"><label>1</label><nlm-citation citation-type="journal"><person-group person-group-type="author"><name name-style="western"><surname>Schwabe</surname><given-names>K</given-names> </name><name name-style="western"><surname>Ferizaj</surname><given-names>D</given-names> </name><name name-style="western"><surname>Neumann</surname><given-names>S</given-names> </name><name name-style="western"><surname>Strube-Lahmann</surname><given-names>S</given-names> </name><name name-style="western"><surname>Lahmann</surname><given-names>N</given-names> </name></person-group><article-title>Time savings through an AI speech assistant for nursing documentation: pre-post time-motion study in German long-term care</article-title><source>J Med Internet Res</source><year>2026</year><month>04</month><day>8</day><volume>28</volume><fpage>e86078</fpage><pub-id pub-id-type="doi">10.2196/86078</pub-id><pub-id pub-id-type="medline">41950503</pub-id></nlm-citation></ref><ref id="ref2"><label>2</label><nlm-citation citation-type="journal"><person-group person-group-type="author"><name name-style="western"><surname>Kanaparthy</surname><given-names>NS</given-names> </name><name name-style="western"><surname>Villuendas-Rey</surname><given-names>Y</given-names> </name><name name-style="western"><surname>Bakare</surname><given-names>T</given-names> 
</name><etal/></person-group><article-title>Real-world evidence synthesis of digital scribes using ambient listening and generative artificial intelligence for clinician documentation workflows: rapid review</article-title><source>JMIR AI</source><year>2025</year><month>10</month><day>10</day><volume>4</volume><fpage>e76743</fpage><pub-id pub-id-type="doi">10.2196/76743</pub-id><pub-id pub-id-type="medline">41071988</pub-id></nlm-citation></ref><ref id="ref3"><label>3</label><nlm-citation citation-type="journal"><person-group person-group-type="author"><name name-style="western"><surname>Davis</surname><given-names>FD</given-names> </name></person-group><article-title>Perceived usefulness, perceived ease of use, and user acceptance of information technology</article-title><source>MIS Q</source><year>1989</year><month>09</month><day>1</day><volume>13</volume><issue>3</issue><fpage>319</fpage><lpage>340</lpage><pub-id pub-id-type="doi">10.2307/249008</pub-id></nlm-citation></ref><ref id="ref4"><label>4</label><nlm-citation citation-type="journal"><person-group person-group-type="author"><name name-style="western"><surname>McGillis Hall</surname><given-names>L</given-names> </name><name name-style="western"><surname>Pedersen</surname><given-names>C</given-names> </name><name name-style="western"><surname>Hubley</surname><given-names>P</given-names> </name><etal/></person-group><article-title>Interruptions and pediatric patient safety</article-title><source>J Pediatr Nurs</source><year>2010</year><month>06</month><volume>25</volume><issue>3</issue><fpage>167</fpage><lpage>175</lpage><pub-id pub-id-type="doi">10.1016/j.pedn.2008.09.005</pub-id><pub-id pub-id-type="medline">20430277</pub-id></nlm-citation></ref><ref id="ref5"><label>5</label><nlm-citation citation-type="journal"><person-group person-group-type="author"><name name-style="western"><surname>Stratton</surname><given-names>KM</given-names> </name><name 
name-style="western"><surname>Blegen</surname><given-names>MA</given-names> </name><name name-style="western"><surname>Pepper</surname><given-names>G</given-names> </name><name name-style="western"><surname>Vaughn</surname><given-names>T</given-names> </name></person-group><article-title>Reporting of medication errors by pediatric nurses</article-title><source>J Pediatr Nurs</source><year>2004</year><month>12</month><volume>19</volume><issue>6</issue><fpage>385</fpage><lpage>392</lpage><pub-id pub-id-type="doi">10.1016/j.pedn.2004.11.007</pub-id><pub-id pub-id-type="medline">15637579</pub-id></nlm-citation></ref><ref id="ref6"><label>6</label><nlm-citation citation-type="journal"><person-group person-group-type="author"><name name-style="western"><surname>Zolnoori</surname><given-names>M</given-names> </name><name name-style="western"><surname>Vergez</surname><given-names>S</given-names> </name><name name-style="western"><surname>Xu</surname><given-names>Z</given-names> </name><etal/></person-group><article-title>Decoding disparities: evaluating automatic speech recognition system performance in transcribing Black and White patient verbal communication with nurses in home healthcare</article-title><source>JAMIA Open</source><year>2024</year><month>12</month><day>10</day><volume>7</volume><issue>4</issue><fpage>ooae130</fpage><pub-id pub-id-type="doi">10.1093/jamiaopen/ooae130</pub-id><pub-id pub-id-type="medline">39659993</pub-id></nlm-citation></ref><ref id="ref7"><label>7</label><nlm-citation citation-type="journal"><person-group person-group-type="author"><name name-style="western"><surname>Koenecke</surname><given-names>A</given-names> </name><name name-style="western"><surname>Nam</surname><given-names>A</given-names> </name><name name-style="western"><surname>Lake</surname><given-names>E</given-names> </name><etal/></person-group><article-title>Racial disparities in automated speech recognition</article-title><source>Proc Natl Acad Sci U S 
A</source><year>2020</year><month>04</month><day>7</day><volume>117</volume><issue>14</issue><fpage>7684</fpage><lpage>7689</lpage><pub-id pub-id-type="doi">10.1073/pnas.1915768117</pub-id><pub-id pub-id-type="medline">32205437</pub-id></nlm-citation></ref><ref id="ref8"><label>8</label><nlm-citation citation-type="journal"><person-group person-group-type="author"><name name-style="western"><surname>Damschroder</surname><given-names>LJ</given-names> </name><name name-style="western"><surname>Reardon</surname><given-names>CM</given-names> </name><name name-style="western"><surname>Opra Widerquist</surname><given-names>MA</given-names> </name><name name-style="western"><surname>Lowery</surname><given-names>J</given-names> </name></person-group><article-title>Conceptualizing outcomes for use with the Consolidated Framework for Implementation Research (CFIR): the CFIR Outcomes Addendum</article-title><source>Implement Sci</source><year>2022</year><month>01</month><day>22</day><volume>17</volume><issue>1</issue><fpage>7</fpage><pub-id pub-id-type="doi">10.1186/s13012-021-01181-5</pub-id><pub-id pub-id-type="medline">35065675</pub-id></nlm-citation></ref><ref id="ref9"><label>9</label><nlm-citation citation-type="journal"><person-group person-group-type="author"><name name-style="western"><surname>Damschroder</surname><given-names>LJ</given-names> </name><name name-style="western"><surname>Reardon</surname><given-names>CM</given-names> </name><name name-style="western"><surname>Widerquist</surname><given-names>MAO</given-names> </name><name name-style="western"><surname>Lowery</surname><given-names>J</given-names> </name></person-group><article-title>The updated Consolidated Framework for Implementation Research based on user feedback</article-title><source>Implement Sci</source><year>2022</year><month>10</month><day>29</day><volume>17</volume><issue>1</issue><fpage>75</fpage><pub-id pub-id-type="doi">10.1186/s13012-022-01245-0</pub-id><pub-id 
pub-id-type="medline">36309746</pub-id></nlm-citation></ref></ref-list></back></article>