Abstract
China accounts for approximately 50% of global new liver cancer cases, predominantly driven by hepatitis B virus (HBV)-related hepatocellular carcinoma. The disease presents a distinctive socio-medical profile characterized by a prolonged etiological chain (chronic hepatitis B → cirrhosis → liver cancer), late-stage diagnosis, and a cultural norm of family-first disclosure, whereby physicians typically inform family members before patients themselves. These features position family caregivers, rather than patients, as the primary information seekers and communicators in online health discourse. Drawing on the Comprehensive Model of Information Seeking (CMIS) and Communication Privacy Management (CPM) theory, this study analyzes 1,857 liver cancer-related user comments collected from Xiaohongshu (Little Red Book), a Chinese lifestyle-oriented social media platform with over 300 million monthly active users, between December 2024 and February 2026. BERTopic, a neural topic modeling framework suited to short Chinese social media texts, was employed for inductive topic extraction. The analysis yielded 15 interpretable topics, organized into five thematic categories: clinical diagnosis and treatment (34.8%), emotional support and caregiving (26.3%), upstream chronic disease management (14.2%), symptom and tumor assessment (12.5%), and online health consultation (9.6%). Notably, hepatitis B and cirrhosis medication management emerged as the second-largest topic, directly mirroring the HBV-dominant etiological chain characteristic of Chinese liver cancer. Family-role terms ("father," "mother," "husband") and Buddhist prayer language appeared as high-salience keywords across multiple topics, underscoring the caregiver-driven and culturally embedded nature of the discourse.
The study proposes the concept of "proxy information seeker" as a theoretical extension of CMIS, reveals privacy boundary tensions arising from caregivers' routine public disclosure of patient health information under China's non-disclosure culture, and provides empirical grounding for caregiver-oriented platform governance and digital health literacy interventions.

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2026 Hongyu Jiang, Liyao Xiao

