DOI: https://doi.org/10.1080/15265161.2023.2296402
PMID: https://pubmed.ncbi.nlm.nih.gov/38226965
Publication Date: 2024-01-16
A Personalized Patient Preference Predictor for Substituted Judgments in Healthcare: Technically Feasible and Ethically Desirable
© 2024 The Author(s). Published with license by Taylor & Francis Group, LLC.
Published online: 16 Jan 2024.
Abstract
When making substituted judgments for incapacitated patients, surrogates often struggle to guess what the patient would want if they had capacity. Surrogates may also agonize over having the (sole) responsibility of making such a determination. To address such concerns, a Patient Preference Predictor (PPP) has been proposed that would use an algorithm to infer the treatment preferences of individual patients from population-level data about the known preferences of people with similar demographic characteristics. However, critics have suggested that even if such a PPP were more accurate, on average, than human surrogates in identifying patient preferences, the proposed algorithm would nevertheless fail to respect the patient’s (former) autonomy since it draws on the ‘wrong’ kind of data: namely, data that are not specific to the individual patient and which therefore may not reflect their actual values, or their reasons for having the preferences they do. Taking such criticisms on board, we here propose a new approach: the Personalized Patient Preference Predictor (P4). The P4 is based on recent advances in machine learning, which allow technologies including large language models to be more cheaply and efficiently ‘fine-tuned’ on person-specific data. The P4, unlike the PPP, would be able to infer an individual patient’s preferences from material (e.g., prior treatment decisions) that is in fact specific to them. Thus, we argue, in addition to being potentially more accurate at the individual level than the previously proposed PPP, the predictions of a P4 would also more directly reflect each patient’s own reasons and values. In this article, we review recent discoveries in artificial intelligence research that suggest a P4 is technically feasible, and argue that, if it is developed and appropriately deployed, it should assuage some of the main autonomy-based concerns of critics of the original PPP. We then consider various objections to our proposal and offer some tentative replies.
KEYWORDS
INTRODUCTION
continue this treatment, or to withdraw it and allow her to die. However,
- A P4 trained on writing produced directly by an individual, such as emails, blog posts, or social media posts. Such text might then be supplemented by additional digital information reflecting the individual’s past choices or behavior, such as treatment decisions encoded in electronic health records (or even Facebook ‘liking’ activity; see Lamanna and Byrne 2018). For technical reasons, such information would need to be stored as writing; however, one important way in which such information could be obtained would be through advances in speech-to-text transcription software. For example, physicians might, with permission, record and automatically transcribe conversations with individual patients.
- An enhanced P4 trained, instead or in addition, on explicit responses provided by an individual, while competent, to questions relating to their hypothetical treatment preferences under various conditions (i.e., an individual-level version of the population-level surveys proposed for the original PPP). This could take the form of questionnaires or interviews with healthcare providers, perhaps as part of a regular checkup, while waiting for care, or in the context of more structured advance care planning (for a detailed proposal as to how this might be done in practice, see Ferrario, Gloeckler, and Biller-Andorno 2023a).
- Perhaps more ambitiously, and to get at underlying values or preferences that might not be consciously accessible to most people (i.e., for purposes of self-report), individuals could be incentivized to participate in specially designed, value-eliciting discrete choice experiments (see, e.g., Ryan 2004) in which they would need to decide between options in a sequence of tradeoffs pitting various decision-relevant factors against one another. These could potentially be ‘gamified’ and delivered by way of a downloadable mobile app, an appropriately secured computer interface in healthcare waiting rooms, a publicly accessible internet-based platform associated with user accounts, etc. For a technical description of how preferences elicited in this, or a similar manner, might be integrated with other information (e.g., medical data) in a shared decision-making context, see the work by Sacchi et al. (2015).
- A P4 trained on the above types of information, if available and appropriately authorized, but if not (or in addition), on responses to questions concerning a patient’s likely or known medical preferences made by surrogate decision-makers and other persons close to the patient. Most likely, such data would be collected after a patient loses capacity, with the responses from surrogates integrated and weighed according to the parameters of the algorithm (i.e., for purposes of predicting what the patient would choose in the particular situation that has arisen).
- A P4 fine-tuned on any of the above-mentioned datasets, but whose base model is not a generic LLM but one trained on population-level data, whether responses from large-scale surveys as in the original PPP proposal (Rid and Wendler 2014a), or population-level electronic health record data linked to social media activity, as per the suggestion of Lamanna and Byrne (2018).
example, a PPP could be used when sufficient data or time to develop a P4 are not available). In any case, given the importance of improving treatment decisions for incapacitated persons in various time-sensitive healthcare situations, we suggest that the development of such preference predictors, both technically and in terms of crafting associated ethical guardrails, should urgently be pursued with the involvement of all relevant stakeholders. These include ethicists, healthcare professionals, AI experts, patients, and members of the general public.
THE P4
with their surrogate’s permission, it would draw on various types of personal data as described in Table 1 in order to predict (a) their first-order treatment preferences during periods of decisional incapacity, (b) their second-order preferences for how treatment decisions are made for them during these periods (e.g., with respect to the type or degree of desired family involvement, assuming that such preferences were not explicitly recorded in an advance decision-making instrument), and (c) how certain the patient is about these preferences and how strong the preferences are: for example, are their preferences regarding which treatments they receive stronger or less strong than their preferences regarding family involvement?
reasons employed in those individuals’ prior writings (Porsdam Mann et al. 2023). Another LLM fine-tuned on philosopher Daniel Dennett’s writings has produced outputs convincingly similar to Dennett’s own responses to novel questions not addressed in the model’s training set (Schwitzgebel, Schwitzgebel, and Strasser 2023).
human values and preferences: both in general (Askell et al. 2021; Gabriel 2020; Christian 2020; Kenton et al. 2021), and for specific individuals (Kirk et al. 2023). In either case, the aim of research is to identify a process that can successfully adapt LLMs to reflect human values and preferences.
IMPLEMENTATION, PRIVACY, AND CONSENT
As in the case of the PPP, some may argue that family members alone, rather than an algorithm such as the proposed P4, should be relied upon to indicate what should be done in situations when their loved one lacks capacity. This may either be out of a belief that family members have an independent claim over the patient’s treatment decisions (a belief that is contrary to the legal situation in many jurisdictions) or out of respect for the patient’s wishes that their family be involved in any surrogate decision-making process (Brock 2014). The level of expected involvement of family members is also likely to vary between cultures.
cases where there are no other feasible options for determining a patient’s preferences (e.g., human surrogates are not available) as argued by Jardas, Wasserman, and Wendler (2022). This may be a large proportion of cases of patients who lack capacity. Despite concerted efforts to improve uptake, too few patients have completed an advance directive (Wendler et al. 2016). Even among those who have, there are often difficulties in documenting treatment preferences without adequate counseling due to both missing or mistaken knowledge about future medical possibilities and their concrete implications (Dresser 2014),
ADVANTAGES OF THE P4
AUTONOMY-BASED OBJECTIONS TO PATIENT PREFERENCE PREDICTION
preferences when directly asked by clinicians. Given this, it may be problematic to expect that either a P4 or a human surrogate should have to “appreciate” the reasons or values behind a patient’s preferences to respect their autonomy.
determination, an aspect of autonomy which might be valued greatly by some patients.
PRACTICAL AND EPISTEMIC LIMITATIONS OF THE P4
by a P4 could lead to inappropriate reliance on its output due to an inability to determine the degree to which such statements are based on plausible inferences from training data. This is an important point that needs to be addressed before clinical use of P4s is considered. Methods of addressing it could include technical work aimed at allowing a P4 to compute and express its degree of confidence. Indicating how uncertain the prediction is might reduce over-reliance and would also provide a more realistic representation of the preferences of patients who are themselves uncertain about what to do in difficult cases. These confidence intervals, along with additional information concerning the functioning, strengths, and weaknesses of LLMs in general and P4s in particular, could be provided to surrogates and clinical decision-makers.
CONCLUSION
consent has been explicitly obtained. This is an important first step for addressing feasibility and privacy concerns. However, prototype P4s using alternative LLMs capable of being stored locally (thus obviating privacy concerns) also need to be explored, either in parallel or once feasibility has been established. Likewise, the accuracy of P4s across languages other than those for which LLMs are best suited (English, Chinese, and Spanish) needs to be tested and, if necessary, language-specific improvements should be pursued.
using P4 in concrete decision making as well as how to deal with interpretive uncertainties in the assessment of indicated preferences.
DISCLOSURE STATEMENT
DISCLAIMER
FUNDING
Walter Sinnott-Armstrong: WSA’s work on this paper was supported in part by grants from OpenAI and Duke University. These funders are not responsible for the content.
Dominic Wilkinson: This research was funded in whole, or in part, by the Wellcome Trust [203132/Z/16/Z]. The funders had no role in the preparation of this manuscript or the decision to submit for publication. For the purpose of open access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.
Julian Savulescu: This research was funded in whole, or in part, by the Wellcome Trust [Grant number WT203132/Z/16/Z]. For the purpose of open access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. Julian Savulescu, through his involvement with the Murdoch Children’s Research Institute, received funding from the Victorian State Government through the Operational Infrastructure Support (OIS) Program. This research is supported by the Singapore Ministry of Health’s National Medical Research Council under its Enablers and Infrastructure Support for Clinical Trials-related Activities Funding Initiative (NMRC Project No. MOH-000951-00), by the Chen Su Lan Research Funding, and by the National University of Singapore under the NUS Start-Up Grant (NUHSRO/2022/078/Startup/13).
Annette Rid: This work was supported in part by the Clinical Center Department of Bioethics, which is in the Intramural Program of the National Institutes of Health. The views expressed here are those of the author and do not necessarily reflect the policies of the National Institutes of Health or the U.S. Department of Health and Human Services.
David Wendler: This work was funded, in part, by the Intramural Research Program at the NIH Clinical Center. However, opinions expressed are the author’s own. They do not represent the position or policy of the National Institutes of Health or the US Department of Health and Human Services.
ORCID
Karin Jongsma: http://orcid.org/0000-0001-8135-6786
Matthias Braun: http://orcid.org/0000-0002-6687-6027
Dominic Wilkinson: http://orcid.org/0000-0003-3958-8633
Walter Sinnott-Armstrong: http://orcid.org/0000-0003-2579-9966
David Wendler: http://orcid.org/0000-0002-9359-4439
Julian Savulescu: http://orcid.org/0000-0003-1691-6403
REFERENCES
Askell, A., Y. Bai, A. Chen, D. Drain, D. Ganguli, T. Henighan, A. Jones, N. Joseph, B. Mann, N. DasSarma, et al. 2021. A general language assistant as a laboratory for alignment. arXiv preprint (1):1-48. doi: 10.48550/arXiv.2112.00861.
Benzinger, L., J. Epping, F. Ursin, and S. Salloch. 2023. Artificial Intelligence to support ethical decision-making for incapacitated patients: A survey among German anesthesiologists and internists. Preprint available at https://www.researchgate.net/publication/374530025.
Berger, J. T. 2005. Patients’ interests in their family members’ well-being: An overlooked, fundamental consideration within substituted judgments. The Journal of Clinical Ethics 16 (1):3-10. doi: 10.1086/JCE200516101.
Biller-Andorno, N., A. Ferrario, S. Joebges, T. Krones, F. Massini, P. Barth, G. Arampatzis, and M. Krauthammer. 2022. AI support for ethical decision-making around resuscitation: Proceed with care. Journal of Medical Ethics 48 (3):175-183. doi: 10.1136/medethics-2020-106786.
Biller-Andorno, N., and A. Biller. 2019. Algorithm-aided prediction of patient preferences – an ethics sneak peek. The New England Journal of Medicine 381 (15):1480-1485. doi: 10.1056/NEJMms1904869.
Christian, B. 2020. The alignment problem. New York: W. W. Norton & Company.
de Kerckhove, D. 2021. The personal digital twin, ethical considerations. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 379 (2207):20200367. doi: 10.1098/rsta.2020.0367.
Earp, B. D. 2022. Meta-surrogate decision making and artificial intelligence. Journal of Medical Ethics 48 (5):287-289. doi: 10.1136/medethics-2022-108307.
Ferrario, A., S. Gloeckler, and N. Biller-Andorno. 2023a. Ethics of the algorithmic prediction of goal of care preferences: From theory to practice. Journal of Medical Ethics 49 (3):165-174. doi: 10.1136/jme-2022-108371.
Ferrario, A., S. Gloeckler, and N. Biller-Andorno. 2023b. AI knows best? Avoiding the traps of paternalism and other pitfalls of AI-based patient preference prediction. Journal of Medical Ethics 49 (3):185-186. doi: 10.1136/jme-2023-108945.
Giubilini, A., and J. Savulescu. 2018. The artificial moral advisor. The “ideal observer” meets artificial intelligence. Philosophy & Technology 31 (2):169-181.
Gloeckler, S., A. Ferrario, and N. Biller-Andorno. 2022. An ethical framework for incorporating digital technology into advance directives: Promoting informed advance decision making in healthcare. The Yale Journal of Biology and Medicine 95 (3):349-353.
Houts, R. M., W. D. Smucker, J. A. Jacobson, P. H. Ditto, and J. H. Danks. 2002. Predicting elderly outpatients’ lifesustaining treatment preferences over time: The majority rules. Medical Decision Making: An International Journal of the Society for Medical Decision Making 22 (1):39-52. doi: 10.1177/0272989X0202200104.
Hubbard, R., and J. Greenblum. 2020. Surrogates and artificial intelligence: Why AI trumps family. Science and Engineering Ethics 26 (6):3217-3227. doi: 10.1007/s11948-020-00266-6.
Jardas, E. J., D. Wasserman, and D. Wendler. 2022. Autonomy-based criticisms of the patient preference predictor. Journal of Medical Ethics 48 (5):304-310. doi: 10.1136/medethics-2021-107629.
John, S. 2014. Patient preference predictors, apt categorization, and respect for autonomy. The Journal of Medicine and Philosophy 39 (2):169-177. doi: 10.1093/jmp/jhu008.
John, S. D. 2018. Messy autonomy: Commentary on patient preference predictors and the problem of naked statistical evidence. Journal of Medical Ethics 44 (12):864-864. doi: 10.1136/medethics-2018-104941.
Jost, L. A. 2023. Affective experience as a source of knowledge. PhD Thesis, University of St Andrews. doi: 10.17630/sta/387.
Kang, W. C., J. Ni, N. Mehta, M. Sathiamoorthy, L. Hong, E. Chi, and D. Z. Cheng. 2023. Do LLMs understand user preferences? Evaluating LLMs on user rating prediction. arXiv preprint (1):1-11. doi: 10.48550/arXiv.2305.06474.
Kim, S. Y. 2014. Improving medical decisions for incapacitated persons: Does focusing on ‘accurate predictions’ lead to an inaccurate picture? The Journal of Medicine and Philosophy 39 (2):187-195.
Kim, J., and B. Lee. 2023. AI-augmented surveys: Leveraging large language models for opinion prediction in nationally representative surveys. arXiv preprint (1): 1-18. doi: 10.48550/arXiv.2305.09620.
Kirk, H. R., B. Vidgen, P. Röttger, and S. A. Hale. 2023. Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalized feedback. arXiv preprint (1):1-37. doi: 10.48550/arXiv.2303.05453.
Lewis, J., J. Demaree-Cotton, and B. D. Earp. 2023. Bioethics, experimental approaches. In Encyclopedia of the philosophy of law and social philosophy, edited by M. Sellers and S. Kirste. Dordrecht: Springer. doi: 10.1007/978-94-007-6730-0_1053-1.
Lindemann, H., and J. L. Nelson. 2014. The surrogate’s authority. The Journal of Medicine and Philosophy 39 (2):161-168.
Mainz, J. T. 2023. The patient preference predictor and the objection from higher-order preferences. Journal of Medical Ethics 49 (3):221-222. doi: 10.1136/jme-2022-108427.
O’Neil, C. 2022. Commentary on ‘Autonomy-based criticisms of the patient preference predictor’. Journal of Medical Ethics 48 (5):315-316. doi: 10.1136/medethics-2022-108288.
Perry, J. E., L. R. Churchill, and H. S. Kirshner. 2005. The Terri Schiavo case: Legal, ethical, and medical perspectives. Annals of Internal Medicine 143 (10):744-748. doi: 10.7326/0003-4819-143-10-200511150-00012.
Rid, A., and D. Wendler. 2014b. Use of a patient preference predictor to help make medical decisions for incapacitated patients. The Journal of Medicine and Philosophy 39 (2):104-129.
Sacchi, L., S. Rubrichi, C. Rognoni, S. Panzarasa, E. Parimbelli, A. Mazzanti, C. Napolitano, S. G. Priori, and S. Quaglini. 2015. From decision to shared-decision: Introducing patients’ preferences into clinical decision analysis. Artificial Intelligence in Medicine 65 (1):19-28. doi: 10.1016/j.artmed.2014.10.004.
Savulescu, J., and H. Maslen. 2015. Moral enhancement and artificial intelligence: Moral AI? In Beyond artificial intelligence: Topics in intelligent engineering and informatics, edited by J. Romportl, E. Zackova, and J. Kelemen, vol. 9, 79-95. Cham: Springer. doi: 10.1007/978-3-319-09668-1_6.
Schwartz, S. M., K. Wildenhaus, A. Bucher, and B. Byrd. 2020. Digital twins and the emerging science of self: Implications for digital health experience design and “small” data. Frontiers in Computer Science 2:31. doi: 10.3389/fcomp.2020.00031.
Schwitzgebel, E., D. Schwitzgebel, and A. Strasser. 2023. Creating a large language model of a philosopher. arXiv Preprint (1):1-36. doi: 10.48550/arXiv.2302.01339.
Senthilnathan, I., and W. Sinnott-Armstrong. Forthcoming. Patient preference predictors: Options, implementations, and policies. Working paper.
Shalowitz, D. I., E. Garrett-Mayer, and D. Wendler. 2006. The accuracy of surrogate decision makers: A systematic review. Archives of Internal Medicine 166 (5):493-497. doi: 10.1001/archinte.166.5.493.
Shalowitz, D. I., E. Garrett-Mayer, and D. Wendler. 2007. How should treatment decisions be made for incapacitated patients, and why? PLoS Medicine 4 (3):E35. doi: 10.1371/journal.pmed.0040035.
Silveira, M. J. 2022. Advance care planning and advance directives. UpToDate. https://www.uptodate.com/contents/advance-care-planning-and-advance-directives.
Sinnott-Armstrong, W., and J. A. Skorburg. 2021. How AI can aid bioethics. Journal of Practical Ethics 9 (1):1-22. doi: 10.3998/jpe.1175.
Smucker, W. D., R. M. Houts, J. H. Danks, P. H. Ditto, A. Fagerlin, and K. M. Coppola. 2000. Modal preferences predict elderly patients’ life-sustaining treatment choices as well as patients’ chosen surrogates do. Medical Decision Making: An International Journal of the Society for Medical Decision Making 20 (3):271-280. doi: 10.1177/0272989X0002000303.
Stocking, C. B., G. W. Hougham, D. D. Danner, M. B. Patterson, P. J. Whitehouse, and G. A. Sachs. 2006. Speaking of research advance directives: Planning for future research participation. Neurology 66 (9):1361-1366. doi: 10.1212/01.wnl.0000216424.66098.55.
Tomasello, M., M. Carpenter, J. Call, T. Behne, and H. Moll. 2005. Understanding and sharing intentions: The origins of cultural cognition. The Behavioral and Brain Sciences 28 (5):675-691. doi: 10.1017/S0140525X05000129.
Tooming, U., and K. Miyazono. 2023. Affective forecasting and substantial self-knowledge. Emotional Self-Knowledge. New York: Routledge. doi: 10.4324/9781003310945-3.
van Kinschot, C. M. J., V. R. Soekhai, E. W. de Bekker-Grob, W. E. Visser, R. P. Peeters, T. M. van Ginhoven, and C. van Noord. 2021. Preferences of patients and clinicians for treatment of Graves’ disease: A discrete choice experiment. European Journal of Endocrinology 184 (6):803-812. doi: 10.1530/EJE-20-1490.
Wasserman, D., and D. Wendler. 2023. Response to commentaries: ‘Autonomy-based criticisms of the patient preference predictor’. Journal of Medical Ethics 49 (8):580-582. doi: 10.1136/jme-2022-108707.
CONTACT: Brian D. Earp, brian.earp@philosophy.ox.ac.uk, Uehiro Centre for Practical Ethics, Faculty of Philosophy, University of Oxford, UK.
*Equal contribution: joint first authors. Please note that some of this research was carried out while BDE was Visiting Senior Research Fellow at the Centre for Biomedical Ethics, National University of Singapore.
This case is loosely based on the case of Terri Schiavo (see Perry, Churchill, and Kirshner 2005). In that case, part of the legal and ethical dispute centred on what Terri’s wishes would have been about treatment. Her husband and parents disagreed about this.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The terms on which this article has been published allow the posting of the Accepted Manuscript in a repository by the author(s) or with their consent.

Others have suggested variations on this proposal, such as a PPP targeted at specific conditions (Ditto and Clark 2014) or a PPP for ‘meta-surrogate’ decision-making (i.e., to predict the proxy decision of an incapacitated surrogate; see Earp 2022).

Or, perhaps, the reasons, values, or evidence that the patient endorsed, or would have endorsed, as appropriate grounds for making treatment decisions. In fact, there are several different ways of understanding such criteria, which are meant to capture, in one way or another, what constitutes an individual’s ‘true’ reasons (etc.) for their preferences.
We’ve shortened to ‘P4’ rather than ‘PPPP’ to more readily distinguish the current proposal from references to the original version (the ‘PPP’) in what follows.

Similar opportunities exist with digital physical twins that simulate the body. There is already a burgeoning literature discussing ways in which autonomous persons might deliberately interact with their own digital physical twins, raising numerous ethical and philosophical questions (see Braun 2021, 2022 for overviews). What about persons who lack decision-making capacity, however, such as in our opening example? Given the trajectory of developments in medical AI, we anticipate that it will be possible in the reasonably near future for one’s physical digital twin to be appropriately connected to, or integrated with, one’s psychological digital twin (such as an advanced P4) so as to derive potentially even more precise and reliable inferences about what one would choose or want in the given circumstances, i.e., based on two different sources of information: (a) their known or extrapolated values and preferences from the P4, as applied to (b) the specifics of their current health situation as represented by their physical digital twin. This could also be an important avenue to explore for persons who may not lack capacity entirely, but who for various reasons are not able clearly to articulate their physical and psychological health needs. For further discussion of the ethics of (primarily physical) digital twins in healthcare, see Braun (2021, 2022); see also Schwartz et al. (2020). For possibilities regarding integration of physical and psychological digital twins (albeit primarily for purposes of creating a ‘personal assistant’ AI), see de Kerckhove (2021).

Although Biller-Andorno and Biller (2019) float an idea that is broadly similar to the one we are exploring here, they do so in passing without much specification: “It could be argued that algorithms trained on vast amounts of individual-level data are unwieldy or even superfluous. Who needs an algorithm to suggest the same decisions people would make themselves? Such a function might become critical, however, when choices have to be made, for instance, regarding continued life support for someone who can no longer make decisions. Algorithms would not only be able to find patterns within our own past decision making but could also compare them to patterns and decisions of many other people” (1481).

To be clear, Lamanna and Byrne (2018) do not solely discuss population-level data sets; they, too, briefly discuss the possibility of factoring in “data provided by the patient themselves, be it implicitly through [choices] recorded on their EHR or more explicitly through social media activity” (906). We do not see an essential conflict between these different approaches or emphases. Rather, they could be seen as complementary. For example, as noted in Table 1 above, individual-level patient information could be added as a final, fine-tuned layer on top of a more general base-model LLM that was itself derived, in part, from population-level data. It may be that the greater overall volume of data afforded by such an approach (i.e., combining broad demographic correlations with person-specific information) would improve predictive accuracy in certain cases. This might be the case, for instance, in situations where the individual-level data available or authorized for a given patient is so sparse that a P4 trained exclusively on such data is unable to generate sufficiently reliable predictions.
However, we acknowledge that some individuals might prefer, or only be willing to authorize, a P4 that does not include any generic or population-level correlations, even as part of an underlying base-model that is additionally trained on person-specific material.
It is important to note that, in addition to the fine-tuning mechanism we describe in this paper, there are other ways of potentially “personalizing” an LLM’s output, in the sense of adapting its output to be specific to an individual. For example, it is possible to create custom knowledge bases using writings produced by an individual, or to use custom instructions or in-context learning guided by the individual to adapt an LLM’s output to better reflect them personally. By contrast, in this paper, we envision the technical implementation of a P4 as utilizing fine-tuning as described above, due to the much larger volume of information that can be used in this method as compared to custom instructions or in-context learning, as well as the presumed ability of fine-tuned models to infer patient preferences (as opposed to, e.g., custom knowledge bases, which would use embeddings to reproduce existing information verbatim rather than infer preferences). Ultimately, however, it may be that a robust combination of personalization methods will be necessary to develop a successful LLM-based P4.

This is potentially also an objection to the use of a P4, as at least some of the information used by a P4 to infer preferences would have been produced in such a way as not to reflect the most up-to-date or accurate medical information. It is not clear that decisions based on such faulty information could reflect the patient’s real preferences. However, this is a general problem for preference prediction, and it could be addressed by supplementing (and perhaps privileging, in terms of weighting, as described above) the P4 training data with responses by the individual patient to questionnaires, surveys, or choice experiments in which accurate and up-to-date information relevant to treatment decisions had been provided to them in advance. This would partly address concerns regarding the influence of ignorance or mistaken belief about treatment options on the part of patients (i.e., it would make the preferences expressed more informed).

Another practical limitation of the P4 relates to introspective contributions to preference formation. Jost (2023) makes the case that introspective access to one’s affective state is a potential source of knowledge. Without access to such in-context affective experiences, an LLM may be limited in its ability to predict medically relevant preferences. However, an LLM might well be able to pick up affective contextual information through natural language processing of written text. The extent to which this is the case is an empirical question that will require further work for its resolution.
From another point of view, it could be argued that a study should compare conventional surrogate decision-making without a P4 (involving patient surrogates and healthcare professionals) to decision-making that additionally includes a P4. This is because the use of surrogates, PPPs, and P4s is not mutually exclusive: it is possible that each of these approaches provides valuable information to be incorporated into a larger decision-making process.