Journal Articles & Proceedings

Who is Trusted for a Second Opinion? Comparing Collective Advice from a Medical AI and Physicians in Biopsy Decisions After Mammography Screening


H.H.J. Detjen, L. Densky, N. von Kalckreuth, M. Kopka
2025
Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems
https://doi.org/10.1145/3706598.3713898

Summary: In a mixed-methods study, we tested how a second opinion from an AI impacts trust in physicians and patients’ decision-making. We found that patients appreciate a four-eyes principle, generally trust physicians over the AI, and trust a physician’s opinion over the AI opinion in cases of conflicting advice. However, patients also often followed the AI advice if it was the risk-averse recommendation.

Impact of a Symptom Checker Application on Patient-Physician Interaction Among Self-Referred Walk-In Patients in the Emergency Department: A Multi-Centre, Parallel-Group, Randomized, Controlled Trial


M.L. Schmieding, M. Kopka, M. Bolanaki, H. Napierala, M.B. Altendorf, D. Kuschick, S.K. Piper, L. Scatturin, K. Schmidt, C. Schorr, A. Thissen, C. Wäscher, C. Heintze, M. Möckel, F. Balzer, A. Slagman
2025
Journal of Medical Internet Research
https://doi.org/10.2196/64028

Summary: We implemented a symptom checker in multiple emergency departments and assessed its effectiveness in improving satisfaction with patient-physician interaction. Although we could not find an objective improvement, patients and physicians perceived the symptom checker to be helpful.

Accuracy of Online Symptom Assessment Applications, Large Language Models, and Laypeople for Self-Triage Decisions


M. Kopka, N. von Kalckreuth, M.A. Feufel
2025
npj Digital Medicine
https://doi.org/10.1038/s41746-025-01566-6

Summary: We reviewed publications on the self-triage accuracy of symptom checkers, large language models, and laypeople, and found that accuracy varies depending on the specific application. Some systems perform well in identifying emergencies, while others are better at identifying cases in which self-care or watchful waiting is sufficient. These applications should not be universally dismissed; rather, their utility depends on the specific use case.

Trustworthiness of the Electronic Health Record in Germany: An Exploratory, User-Centered Analysis


N. von Kalckreuth, M. Kopka, C. Schmid, C. Kratzer, A. Reptuschenko, M. Feufel
2025
Frontiers in Digital Health
https://doi.org/10.3389/fdgth.2025.1473326

Summary: We identified several factors that influence perceived trustworthiness of the Electronic Health Record among German citizens. These were the provider’s reputation, feedback from other users, user experience of content and functions, and user data control.

Technology-Supported Self-Triage Decision Making


M. Kopka, S.M. Wang, S. Kunz, C. Schmid, M. Feufel
2025
npj Health Systems
https://doi.org/10.1038/s44401-024-00008-x

Summary: We developed a model to explain how medical laypeople use technology when deciding if/where to seek care. This model suggests that users consult technology for assistance in their decision-making but do not completely offload their decision. We also found that well-performing symptom-assessment applications can effectively support laypeople, while large language models in their current form do not offer any value.

Statistical Refinement of Patient-Centered Case Vignettes for Digital Health Research


M. Kopka, M. Feufel
2024
Frontiers in Digital Health
https://doi.org/10.3389/fdgth.2024.1411924

Summary: We developed a (statistical) procedure for ensuring internal and external validity of case vignette sets.

The RepVig Framework for Designing Use-Case Specific Representative Vignettes and Evaluating Triage Accuracy of Laypeople and Symptom Assessment Applications


M. Kopka, H. Napierala, M. Privoznik, D. Sapunova, S. Zhang, M. Feufel
2024
Scientific Reports
https://doi.org/10.1038/s41598-024-83844-z

Summary: We developed and validated a framework for developing clinical case vignettes that are representative of real patient cases. Using different vignette sets, we demonstrate that traditional vignettes may not be generalizable to real-world contexts.

German mHealth App Usability Questionnaire (G-MAUQ) and short version (G-MAUQ-S): Translation and Validation Study


M. Kopka, A. Slagman, C. Schorr, H. Krampe, M. Altendorf, F. Balzer, M. Bolanaki, D. Kuschick, M. Möckel, H. Napierala, L. Scatturin, K. Schmidt, A. Thissen, M.L. Schmieding
2024
Smart Health
https://doi.org/10.1016/j.smhl.2024.100517

Summary: We translated the mHealth App Usability Questionnaire to German so that the usability of German mHealth applications can be measured. Additionally, a short version can be used to measure usability faster than before.

Generalizability in Real-World Trials


A.F. Näher*, M. Kopka*, F. Balzer, M. Schulte-Althoff
2024
Clinical and Translational Science
https://doi.org/10.1111/cts.13886

Summary: Despite improvements in reporting sampling methods, many real-world evidence (RWE) trials still do not fully utilize random sampling and sample correction procedures. This limits the full potential of RWE in enhancing trial generalizability.

symptomcheckR: An R Package for Analyzing and Visualizing Symptom Checker Triage Performance


M. Kopka, M.A. Feufel
2024
BMC Digital Health
https://doi.org/10.1186/s44247-024-00096-7

Summary: We developed the R package symptomcheckR to standardize the metrics in symptom checker evaluations. This tool aims to enhance the reliability and efficiency of such evaluations to ultimately improve patient safety and resource allocation.

Effects of Face Mask Mandates on COVID-19 Transmission in 51 Countries: Retrospective Event-Study


A.F. Näher, M. Schulte-Althoff, M. Kopka, F. Balzer, F. Pozo-Martin
2024
JMIR Public Health and Surveillance
https://doi.org/10.2196/49307

Summary: Face mask mandates led to an increase in public mask usage and a decrease in SARS-CoV-2 reproduction numbers. The study strengthens the evidence for mask effectiveness in controlling the spread of acute respiratory infections.

How suitable are clinical vignettes for the evaluation of symptom checker apps? A test theoretical perspective


M. Kopka, M.A. Feufel, E.S. Berner, M.L. Schmieding
2023
Digital Health
https://doi.org/10.1177/20552076231194929

Summary: We have demonstrated that current evaluation practices both overestimate and underestimate the accuracy of symptom checkers. We propose guidelines for researchers to enhance the methodology and get more reliable results.

Characteristics of Users and Nonusers of Symptom Checkers in Germany: Cross-Sectional Survey Study


M. Kopka, L. Scatturin, H. Napierala, D. Fürstenau, M.A. Feufel, F. Balzer, M.L. Schmieding
2023
Journal of Medical Internet Research
https://doi.org/10.2196/46231

Summary: The previously described characteristics (gender, education, age) are associated not with the willingness to use symptom checkers, but rather with knowledge about them. Mental health plays a significant role in interactions with symptom checkers.

Exploring How Informed Mental Health App Selection May Impact User Engagement and Satisfaction


M. Kopka, E. Camacho, S. Kwon, J. Torous
2023
PLOS Digital Health
https://doi.org/10.1371/journal.pdig.0000219

Summary: Cost, condition support and app features are the most influential factors when choosing mental health apps. The app selection database MIND could help maintain higher engagement and satisfaction despite overall low engagement rates.

The Triage Capability of Laypersons: Retrospective Exploratory Analysis


M. Kopka, M.A. Feufel, F. Balzer, M.L. Schmieding
2022
JMIR Formative Research
https://doi.org/10.2196/38977

Summary: Medical laypeople tend to be risk-averse when deciding if they need medical care and often miss identifying emergencies.

Examining the impact of a symptom assessment application on patient-physician interaction among self-referred walk-in patients in the emergency department (AKUSYM): study protocol for a multi-center, randomized controlled, parallel-group superiority trial


H. Napierala, M. Kopka, M.B. Altendorf, M. Bolanaki, K. Schmidt, S.K. Piper, C. Heintze, M. Möckel, F. Balzer, A. Slagman, M.L. Schmieding
2022
Trials
https://doi.org/10.1186/s13063-022-06688-w

Summary: We investigate the effects of implementing symptom checkers in emergency departments.

Triage Accuracy of Symptom Checker Apps: 5-Year Follow-up Evaluation


M.L. Schmieding, M. Kopka, K. Schmidt, S. Schulz-Niethammer, F. Balzer, M.A. Feufel
2022
Journal of Medical Internet Research
https://doi.org/10.2196/31810

Summary: On average, triage accuracy of symptom checkers has not improved over the last 5 years. However, accuracy varies drastically between different systems.

Determinants of Laypersons’ Trust in Medical Decision Aids: Randomized Controlled Trial


M. Kopka, M.L. Schmieding, T. Rieger, E. Roesler, F. Balzer, M.A. Feufel
2022
JMIR Human Factors
https://doi.org/10.2196/35129

Summary: Trust in symptom checkers is high overall. Displaying a picture of a physician or a visualization of an AI did not have any effect on trust.

Interactive Versus Static Decision Support Tools for COVID-19: Randomized Controlled Trial


A. Röbbelen, M.L. Schmieding, M. Kopka, F. Balzer, M.A. Feufel
2022
JMIR Public Health and Surveillance
https://doi.org/10.2196/33733

Summary: COVID-19 related decisions were improved when using a decision aid. It didn’t matter whether the presentation was interactive or static. The transparent nature of static decision aids should be considered when the decision space is limited.


Invited Talks

Kopka M. (2026, 04. February). Strengths and Weaknesses of AI-Supported Tools in Patient Navigation [Invited Talk]. 8th Care Dialogue of BARMER and the State Association for Health & Academy for Social Medicine of Lower Saxony Bremen, Hanover, Germany

Kopka M. (2025, 12. December). AI for Patients and Emergency Medicine [Invited Talk]. Frankenderby der Akut-, Notfall- und Intensivmedizin (FANI) 25, Nuremberg, Germany

Kopka M. (2024, 04. December). A Conceptual Framework for Evaluating Digital Health Tools Using Representative Design [Invited Talk]. Colloquium of the Department of Psychology & Ergonomics at the Berlin Institute of Technology, Berlin, Germany

Kopka M. (2024, 19. July). Symptom Checker & Human Factors: An Introduction to User-Centered Design. AI in Medicine Module. Charité – Universitätsmedizin Berlin, Berlin, Germany

Kopka M. (2024, 11. July). Efficient Tools for Decision-Making: Fast-and-Frugal Decision Trees. Plattform Versorgungsforschung: Forum Methoden der Versorgungsforschung, Berlin, Germany

Kopka M., Heibges, M., Schmid C., Feufel M.A. (2024, 26. June). Using Large Language Models for Patients in Healthcare. Francisc I. Rainer Anthropological Research Centre, The Romanian Academy, Bucharest, Romania.


Presentations & Posters

Kopka, M., & Feufel, M.A. (2025, September). The Impact of Accuracy Transparency In Self-Diagnosis Applications [Presentation]. 70th Annual Conference of the German Society for Medical Informatics, Biometry and Epidemiology (GMDS). Jena, Germany.

Detjen H.J., Densky L., von Kalckreuth, N. & Kopka, M. (2025, 25. April). Who is Trusted for a Second Opinion? Comparing Collective Advice from a Medical AI and Physicians in Biopsy Decisions After Mammography Screening [Presentation]. 2025 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan. https://doi.org/10.1145/3706598.3713898

Kopka M. & Feufel, M.A. (2024, 13. December). Including Patients’ Perspectives in Medical Case Vignettes: A Brunswikian Approach [Presentation]. 30th International Meeting of the Brunswik Society, online.

von Kalckreuth, N., Kopka, M., Schmid, C., Kratzer, C., Reputschenko, A., & Feufel, M.A. (2024, 14. November). Trustworthiness of EHRs: Key Factors from a User-Centered Study in Germany [Poster]. 17th European Public Health Conference, Lisbon, Portugal.

Näher, A.F., Kopka, M., & Schulte-Althoff, M. (2024, 30. October). Digital Health Technologies & Representativity: Evidence from clinicaltrials.gov [Poster]. 1. DGDM Symposium on Digital Medicine, Potsdam, Germany

von Kalckreuth, N., Kopka, M., Appel, J., & Feufel, M.A. (2024, 18. June). Unlocking the Potential of the Electronic Health Record – The Influence of Transparency Features [Presentation]. The 32nd European Conference on Information Systems, Paphos, Cyprus. https://aisel.aisnet.org/ecis2024/track18_healthit/track18_healthit/5

Heibges M.*, Kopka M.*, Feufel M.A., Schmid C. (2024, 12. April). Large Language Models in Healthcare: Perspectives from Situated Ergonomics [Presentation]. Perspectives on Hybrid Human-AI Systems, Munich, Germany

Kopka M., von Kalckreuth N., Feufel M.A. (2024, 13. March). How to increase the uptake of digital health technologies? Fast-And-Frugal Trees as tools to predict and act on patients’ intentions to use [Poster]. 25. Jahrestagung des EbM-Netzwerks, Berlin, Germany

von Kalckreuth N., Kopka M., Feufel M.A. (2024, 13. March). Extending the Privacy Calculus once more: Toward predicting the use of the Electronic Health Record (EHR) [Poster]. 25. Jahrestagung des EbM-Netzwerks, Berlin, Germany

Schmieding M.L., Napierala H., Kopka M., Heintze C., Fürstenau D., Balzer, F. (2023, 09. September). Gesundheits-Apps: Versorgungsalltag für Patient*innen, aber nicht für Hausärzt*innen? [Presentation]. 57. Kongress für Allgemeinmedizin und Familienmedizin, Berlin, Germany

Thiel L., Kopka M., Stein C., Wilhelm L.O., Kolodziejczak K., Zipper V., Fleig, L. (2023, 10. May). Efficacy and mechanisms of mHealth interventions for the prevention and treatment of low back pain: Work in progress of a systematic review and content coding of behavior change techniques [Poster]. 2. Deutscher Psychotherapie Kongress, Berlin, Germany

Kopka M., Koch K.M., Feufel M.A. (2023, 26. April). Effects of performance transparency in symptom assessment applications on the detection of medical emergencies [Presentation]. Human Factors and Ergonomics Society Europe Chapter Annual Meeting 2023, Liverpool, Great Britain.

Kopka M., Scatturin L., Napierala H., Fürstenau D., Feufel M.A., Balzer F., Schmieding M.L. (2023, 24. March). Bekanntheit, Nutzung und Nützlichkeit von Symptom Checkern in Deutschland [Poster]. 24. Jahrestagung des Deutschen Netzwerks Evidenzbasierte Medizin e.V., Potsdam, Germany.

Kopka M., Scatturin L., Napierala H., Fürstenau D., Feufel M.A., Balzer F., Schmieding M.L. (2023, 17. January). Nutzung von Symptom Checkern in Deutschland [Presentation]. 4. Charité-Versorgungsforschungskongress, Berlin, Germany.

Kopka, M., Schmieding, M. L., Staenicke, D., & Feufel, M. A. (2022, 22. April). The Influence of Symptom Checkers on Users’ Trust and Performance [Presentation]. Human Factors and Ergonomics Society Europe Chapter Annual Meeting, Turin, Italy.

Kopka, M., Schmieding, M. L., & Feufel, M. A. (2022, 20. April). Self-triage Capabilities of Laypersons: Implications for Use Cases of Decision Support Systems [Poster]. Human Factors and Ergonomics Society Europe Chapter Annual Meeting, Turin, Italy.

Kopka, M., Schmieding, M. L., & Feufel, M. A. (2021, 22. September). Determinants of Trust in Medical Decision Aids for Laypersons [Presentation]. DGPs 12. Fachgruppentagung AOW-Psychologie und Ingenieurpsychologie, Chemnitz, Germany.

Kopka, M., & Kastner, K. (2021, 06. September). Can You Help Me? Testing HMI Designs and Psychological Influences on Intended Helping Behavior Towards Autonomous Cargo Bikes [Short Paper]. Mensch & Computer 2021, Ingolstadt, Germany. https://doi.org/10.1145/3473856.3474015

Kopka, M. (2018, 07. November). The Influence of Differently Tuned Songs on Listeners’ Arousal [Presentation]. Society for Music Business and Music Culture Research, Most Wanted Music 2018, Berlin, Germany.


Book Chapters

Kopka, M. (2022). The Influence of Music Tuned to 440 Hz & 432 Hz on The Perceived Arousal. In Musik & Marken (pp. 227-245). Springer VS, Wiesbaden. https://doi.org/10.1007/978-3-658-36472-4_10