Agents built on large language models (LLMs) have excelled in turn-by-turn human-AI collaboration but struggle with simultaneous tasks requiring real-time interaction. Latency issues and the challenge of inferring variable human strategies hinder their ability to make autonomous decisions without explicit instructions. Through experiments with current independent System 1 and System 2 methods, we validate the necessity of using Dual Process Theory (DPT) in real-time tasks. We propose DPT-Agent, a novel language agent framework that integrates System 1 and System 2 for efficient real-time simultaneous human-AI collaboration. DPT-Agent’s System 1 uses a Finite-state Machine (FSM) and code-as-policy for fast, intuitive, and controllable decision-making. DPT-Agent’s System 2 integrates Theory of Mind (ToM) and asynchronous reflection to infer human intentions and perform reasoning-based autonomous decisions. We demonstrate the effectiveness of DPT-Agent through further experiments with rule-based agents and human collaborators, showing significant improvements over mainstream LLM-based frameworks. To the best of our knowledge, DPT-Agent is the first language agent framework that achieves successful real-time simultaneous human-AI collaboration autonomously. Code for DPT-Agent is available at https://github.com/sjtu-marl/DPT-Agent.
@article{zhang2025dpt,title={Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration},author={Zhang*, Shao and Wang*, Xihuai and Zhang, Wenhao and Li, Chaoran and Song, Junru and Li, Tingyu and Qiu, Lin and Cao, Xuezhi and Cai, Xunliang and Yao, Wen and Zhang, Weinan and Wang, Xinbing and Wen, Ying},year={2025},journal={Preprint Under Review},}
Language Games
Language Games as the Pathway to Artificial Superhuman Intelligence
The evolution of large language models (LLMs) toward artificial superhuman intelligence (ASI) hinges on data reproduction, a cyclical process in which models generate, curate, and retrain on novel data to refine capabilities. Current methods, however, risk getting stuck in a data reproduction trap: optimizing outputs within fixed human-generated distributions in a closed loop leads to stagnation, as models merely recombine existing knowledge rather than explore new frontiers. In this paper, we propose language games as a pathway to expanded data reproduction, breaking this cycle through three mechanisms: (1) role fluidity, which enhances data diversity and coverage by enabling multi-agent systems to dynamically shift roles across tasks; (2) reward variety, embedding multiple feedback criteria that can drive complex intelligent behaviors; and (3) rule plasticity, iteratively evolving interaction constraints to foster learnability, thereby injecting continual novelty. By scaling language games into global sociotechnical ecosystems, human-AI co-evolution generates unbounded data streams that drive open-ended exploration. This framework redefines data reproduction not as a closed loop but as an engine for superhuman intelligence.
@article{wen2025language,title={Language Games as the Pathway to Artificial Superhuman Intelligence},author={Wen, Ying and Wan, Ziyu and Zhang, Shao},journal={Preprint Under Review},year={2025}}
2024
Language Agent
Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task
Theory of Mind (ToM), a crucial capability for understanding others, significantly impacts human collaboration and communication. When AI agents with ToM capability collaborate with humans, Mutual Theory of Mind (MToM) arises in such human-AI teams (HATs). The MToM process, which involves interactive communication and ToM-based strategy adjustment, affects the team’s performance and collaboration process. To explore the MToM process, we conducted a mixed-design experiment using a large language model-driven AI agent with ToM and communication modules in a real-time shared-workspace task. We find that the agent’s ToM capability does not significantly impact team performance but enhances human understanding of the agent and the feeling of being understood. Most participants in our study believe verbal communication increases human burden, and the results show that bidirectional communication leads to lower HAT performance. We discuss the results’ implications for designing AI agents that collaborate with humans in real-time shared workspace tasks.
@article{zhang2024mutualtheorymindhumanai,title={Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task},author={Zhang*, Shao and Wang*, Xihuai and Zhang, Wenhao and Chen, Yongshan and Gao, Landi and Wang, Dakuo and Zhang, Weinan and Wang, Xinbing and Wen, Ying},year={2024},eprint={2409.08811},archiveprefix={arXiv},primaryclass={cs.HC},journal={Preprint Under Review}}
Zero-shot Coordination
ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination
Zero-shot coordination (ZSC) is a new challenge focusing on generalizing learned coordination skills to unseen partners. Existing methods train the ego agent with partners from pre-trained or evolving populations. The agent’s ZSC capability is typically evaluated with a few evaluation partners, including humans and agents, and reported by mean returns. Current evaluation methods for ZSC capability still need improvement in constructing diverse evaluation partners and in comprehensively measuring ZSC capability. We aim to create a reliable, comprehensive, and efficient evaluation method for ZSC capability. We formally define the ideal ‘diversity-complete’ evaluation partners and propose the best response (BR) diversity, which is the population diversity of the BRs to the partners, to approximate the ideal evaluation partners. We propose an evaluation workflow including ‘diversity-complete’ evaluation partner construction and a multi-dimensional metric, the Best Response Proximity (BR-Prox) metric. BR-Prox quantifies the ZSC capability as the performance similarity to each evaluation partner’s approximate best response, demonstrating generalization capability and improvement potential. We re-evaluate strong ZSC methods in the Overcooked environment using the proposed evaluation workflow. Surprisingly, the results in some of the most used layouts fail to distinguish the performance of different ZSC methods. Moreover, the evaluated ZSC methods need to produce more diverse and high-performing training partners. Our proposed evaluation workflow calls for a change in how we efficiently evaluate ZSC methods as a supplement to human evaluation.
@article{wang2023quantifying,title={ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination},author={Wang*, Xihuai and Zhang*, Shao and Zhang, Wenhao and Dong, Wentao and Chen, Jingxiao and Wen, Ying and Zhang, Weinan},year={2024},eprint={2310.05208},archiveprefix={arXiv},primaryclass={cs.AI},journal={The 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Track on Datasets and Benchmarks},}
Social Dilemma
Aligning Individual and Collective Objectives in Multi-Agent Cooperation
Yang Li, Wenhao Zhang, Jianhong Wang, Shao Zhang, Yali Du, Ying Wen, and Wei Pan
The 38th Conference on Neural Information Processing Systems (NeurIPS 2024), Dec 2024
@article{li2024aligning,title={Aligning Individual and Collective Objectives in Multi-Agent Cooperation},author={Li, Yang and Zhang, Wenhao and Wang, Jianhong and Zhang, Shao and Du, Yali and Wen, Ying and Pan, Wei},journal={The 38th Conference on Neural Information Processing Systems (NeurIPS 2024)},year={2024},month=dec,}
Data Annotation
StorySpark: Expert-Annotated QA Pairs with Real-World Knowledge for Children Storytelling
@inproceedings{chen2024storyspark,title={StorySpark: Expert-Annotated {QA} Pairs with Real-World Knowledge for Children Storytelling},author={Chen, Jiaju and Lu, Yuxuan and Zhang, Shao and Yao, Bingsheng and Dong, Yuanzhe and Xu, Ying and Li, Yunyao and Wang, Qianwen and Wang, Dakuo and Sun, Yuling},booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing},year={2024}}
Zero-shot Coordination
Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination
Yang Li*, Shao Zhang*, Jichen Sun, Wenhao Zhang, Yali Du, Ying Wen, Xinbing Wang, and Wei Pan
Securing coordination between an AI agent and teammates (human players or AI agents) in contexts involving unfamiliar partners continues to pose a significant challenge in Zero-Shot Coordination (ZSC). The issue of cooperative incompatibility becomes particularly prominent when an AI agent is unsuccessful in synchronizing with certain previously unknown partners. Traditional algorithms have aimed to collaborate with partners by optimizing fixed objectives within a population, fostering diversity in strategies and behaviors. However, these techniques may lead to learning loss and an inability to cooperate with specific strategies within the population, a phenomenon named cooperative incompatibility in learning. In order to solve cooperative incompatibility in learning and effectively address the problem in the context of ZSC, we introduce the Cooperative Open-ended LEarning (COLE) framework, which formulates open-ended objectives in two-player cooperative games using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy. We present two practical algorithms, COLE_SV and COLE_R, which incorporate insights from game theory and graph theory. We also show that COLE can effectively overcome cooperative incompatibility through theoretical and empirical analysis. Subsequently, we created an online Overcooked human-AI experiment platform, the COLE platform, which enables easy customization of questionnaires, model weights, and other aspects. Utilizing the COLE platform, we enlisted 130 participants for human experiments. Our findings reveal a preference for our approach over state-of-the-art methods across a variety of subjective metrics. Moreover, objective experimental outcomes in the Overcooked game environment indicate that our method surpasses existing ones when coordinating with previously unencountered AI agents and the human proxy model. Our code and demo are publicly available at https://sites.google.com/view/cole-2023.
@article{10.1613/jair.1.15884,author={Li*, Yang and Zhang*, Shao and Sun, Jichen and Zhang, Wenhao and Du, Yali and Wen, Ying and Wang, Xinbing and Pan, Wei},title={Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination},year={2024},issue_date={Sep 2024},publisher={AI Access Foundation},address={El Segundo, CA, USA},volume={80},issn={1076-9757},url={https://doi.org/10.1613/jair.1.15884},doi={10.1613/jair.1.15884},journal={J. Artif. Int. Res.},month=jul,numpages={47},}
LLM Agents
Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach
Bin Zhang, Hangyu Mao, Jingqing Ruan, Ying Wen, Yang Li, Shao Zhang, Zhiwei Xu, Dapeng Li, Ziyue Li, Rui Zhao, and others
@inproceedings{zhang2023controlling,title={Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach},author={Zhang, Bin and Mao, Hangyu and Ruan, Jingqing and Wen, Ying and Li, Yang and Zhang, Shao and Xu, Zhiwei and Li, Dapeng and Li, Ziyue and Zhao, Rui and others},booktitle={LLM Agents Workshop@ICLR2024},year={2024},month=may,}
AI-in-the-Loop
Rethinking Human-AI Collaboration in Complex Medical Decision Making: A Case Study in Sepsis Diagnosis
Shao Zhang, Jianing Yu, Xuhai Xu, Changchang Yin, Yuxuan Lu, Bingsheng Yao, Melanie Tory, Lace M. Padilla, Jeffrey Caterino, Ping Zhang, and Dakuo Wang
In Proceedings of the CHI Conference on Human Factors in Computing Systems, May 2024
Today’s AI systems for medical decision support often succeed on benchmark datasets in research papers but fail in real-world deployment. This work focuses on decision making for sepsis, an acute life-threatening systemic infection that requires an early diagnosis, under high uncertainty, from the clinician. Our aim is to explore the design requirements for AI systems that can support clinical experts in making better decisions for the early diagnosis of sepsis. The study begins with a formative study investigating why clinical experts abandon an existing AI-powered sepsis predictive module in their electronic health record (EHR) system. We argue that a human-centered AI system needs to support human experts in the intermediate stages of a medical decision-making process (e.g., generating hypotheses or gathering data), instead of focusing only on the final decision. Therefore, we build SepsisLab based on a state-of-the-art AI algorithm and extend it to predict the future projection of sepsis development, visualize the prediction uncertainty, and propose actionable suggestions (i.e., which additional laboratory tests can be collected) to reduce such uncertainty. Through heuristic evaluation with six clinicians using our prototype system, we demonstrate that SepsisLab enables a promising human-AI collaboration paradigm for the future of AI-assisted sepsis diagnosis and other high-stakes medical decision making.
@inproceedings{zhang2023rethinking,author={Zhang, Shao and Yu, Jianing and Xu, Xuhai and Yin, Changchang and Lu, Yuxuan and Yao, Bingsheng and Tory, Melanie and Padilla, Lace M. and Caterino, Jeffrey and Zhang, Ping and Wang, Dakuo},title={Rethinking Human-AI Collaboration in Complex Medical Decision Making: A Case Study in Sepsis Diagnosis},year={2024},month=may,isbn={9798400703300},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/3613904.3642343},doi={10.1145/3613904.3642343},booktitle={Proceedings of the CHI Conference on Human Factors in Computing Systems},articleno={445},numpages={18},keywords={Human-AI collaboration, Medical decision making, Sepsis diagnosis},location={Honolulu, HI, USA},series={CHI '24},}
AI-in-the-Loop
Talk2Care: An LLM-based Voice Assistant for Communication between Healthcare Providers and Older Adults
Ziqi Yang, Xuhai Xu, Bingsheng Yao, Ethan Rogers, Shao Zhang, Stephen Intille, Nawar Shara, Guodong Gordon Gao, and Dakuo Wang
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., May 2024
Despite the plethora of telehealth applications to assist home-based older adults and healthcare providers, basic messaging and phone calls are still the most common communication methods, which suffer from limited availability, information loss, and process inefficiencies. One promising solution to facilitate patient-provider communication is to leverage large language models (LLMs) with their powerful natural conversation and summarization capability. However, there is a limited understanding of LLMs’ role during the communication. We first conducted two interview studies with both older adults (N=10) and healthcare providers (N=9) to understand their needs and opportunities for LLMs in patient-provider asynchronous communication. Based on the insights, we built an LLM-powered communication system, Talk2Care, and designed interactive components for both groups: (1) For older adults, we leveraged the convenience and accessibility of voice assistants (VAs) and built an LLM-powered conversational interface for effective information collection. (2) For health providers, we built an LLM-based dashboard to summarize and present important health information based on older adults’ conversations with the VA. We further conducted two user studies with older adults and providers to evaluate the usability of the system. The results showed that Talk2Care could facilitate the communication process, enrich the health information collected from older adults, and considerably save providers’ efforts and time. We envision our work as an initial exploration of LLMs’ capability in the intersection of healthcare and interpersonal communication.
@article{yang2023talk2care,author={Yang, Ziqi and Xu, Xuhai and Yao, Bingsheng and Rogers, Ethan and Zhang, Shao and Intille, Stephen and Shara, Nawar and Gao, Guodong Gordon and Wang, Dakuo},title={Talk2Care: An LLM-based Voice Assistant for Communication between Healthcare Providers and Older Adults},year={2024},issue_date={May 2024},publisher={Association for Computing Machinery},address={New York, NY, USA},volume={8},number={2},url={https://doi.org/10.1145/3659625},doi={10.1145/3659625},journal={Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.},month=may,articleno={73},numpages={35},keywords={Large-language-model, Older adults, Patient-provider communication},}
LLM
More Samples or More Prompts? Exploring Effective Few-Shot In-Context Learning for LLMs with In-Context Sampling
Bingsheng Yao, Guiming Chen, Ruishi Zou, Yuxuan Lu, Jiachen Li, Shao Zhang, Yisi Sang, Sijia Liu, James Hendler, and Dakuo Wang
In Findings of the Association for Computational Linguistics: NAACL 2024, Jun 2024
While most existing works on LLM prompting techniques focus only on how to select a better set of data samples inside one single prompt input (In-Context Learning or ICL), why not design and leverage multiple prompts together to further improve the LLM’s performance? In this work, we propose In-Context Sampling (ICS), a low-resource LLM prompting technique to produce confident predictions by optimizing the construction of multiple ICL prompt inputs. Extensive experiments with three open-source LLMs (FlanT5-XL, Mistral-7B, and Mixtral-8x7B) on four NLI datasets (e-SNLI, Multi-NLI, ANLI, and Contract-NLI) and one QA dataset (CommonsenseQA) illustrate that ICS can consistently enhance LLMs’ performance. An in-depth evaluation with three data similarity-based ICS strategies suggests that these strategies can further elevate LLMs’ performance, which sheds light on a new yet promising future research direction.
@inproceedings{yao-etal-2024-samples,title={More Samples or More Prompts? Exploring Effective Few-Shot In-Context Learning for {LLM}s with In-Context Sampling},author={Yao, Bingsheng and Chen, Guiming and Zou, Ruishi and Lu, Yuxuan and Li, Jiachen and Zhang, Shao and Sang, Yisi and Liu, Sijia and Hendler, James and Wang, Dakuo},editor={Duh, Kevin and Gomez, Helena and Bethard, Steven},booktitle={Findings of the Association for Computational Linguistics: NAACL 2024},month=jun,year={2024},address={Mexico City, Mexico},publisher={Association for Computational Linguistics},url={https://aclanthology.org/2024.findings-naacl.115},pages={1772--1790}}
2023
Zero-shot Coordination
Cooperative Open-ended Learning Framework for Zero-Shot Coordination
Yang Li*, Shao Zhang*, Jichen Sun, Yali Du, Ying Wen, Xinbing Wang, and Wei Pan
Proceedings of the 40th International Conference on Machine Learning (ICML 2023), Jun 2023
Zero-shot coordination in cooperative artificial intelligence (AI) remains a significant challenge, which means effectively coordinating with a wide range of unseen partners. Previous algorithms have attempted to address this challenge by optimizing fixed objectives within a population to improve strategy or behavior diversity. However, these approaches can result in a loss of learning and an inability to cooperate with certain strategies within the population, known as cooperative incompatibility. To address this issue, we propose the Cooperative Open-ended LEarning (COLE) framework, which constructs open-ended objectives in cooperative games with two players from the perspective of graph theory to assess and identify the cooperative ability of each strategy. We further specify the framework and propose a practical algorithm that leverages knowledge from game theory and graph theory. Furthermore, an analysis of the learning process of the algorithm shows that it can efficiently overcome cooperative incompatibility. The experimental results in the Overcooked game environment demonstrate that our method outperforms current state-of-the-art methods when coordinating with different-level partners. Our code and demo are available at https://sites.google.com/view/cole-2023.
@article{COLE,author={Li*, Yang and Zhang*, Shao and Sun, Jichen and Du, Yali and Wen, Ying and Wang, Xinbing and Pan, Wei},title={Cooperative Open-ended Learning Framework for Zero-Shot Coordination},journal={Proceedings of the 40th International Conference on Machine Learning (ICML 2023)},volume={PMLR 202},year={2023},keywords={Cooperative MARL, Zero-shot coordination, Open-ended learning},}
AI-in-the-Loop
GeoDeepShovel: A platform for building scientific database from geoscience literature with AI assistance
With the rapid development of big data science, the research paradigm in the field of geosciences has also begun to shift to big data-driven scientific discovery. Researchers need to read a huge amount of literature to locate, extract, and aggregate relevant results and data that are published and stored in PDF format for building a scientific database to support big data-driven discovery. In this paper, based on the findings of a study about how geoscientists annotate literature and extract and aggregate data, we propose GeoDeepShovel, a publicly available AI-assisted data extraction system to support their needs. GeoDeepShovel leverages state-of-the-art neural network models to support researchers in easily and accurately annotating papers (in PDF format) and extracting data from tables, figures, maps, etc., in a human–AI collaboration manner. As a part of the Deep-Time Digital Earth (DDE) program, GeoDeepShovel has been deployed for 8 months; there are already 400 users from 44 geoscience research teams within the DDE program using it to construct scientific databases on a daily basis, and more than 240 projects and 50,000 documents have been processed for building scientific databases.
@article{zhang2023geodeepshovel,author={Zhang, Shao and Xu, Hui and Jia, Yuting and Wen, Ying and Wang, Dakuo and Fu, Luoyi and Wang, Xinbing and Zhou, Chenghu},title={GeoDeepShovel: A platform for building scientific database from geoscience literature with AI assistance},journal={Geoscience Data Journal},year={2023},keywords={artificial intelligence, big data-driven discovery, data extraction, scientific database, human-computer interaction},doi={10.1002/gdj3.186},url={https://rmets.onlinelibrary.wiley.com/doi/abs/10.1002/gdj3.186},}
LLM
Human Still Wins over LLM: An Empirical Study of Active Learning on Domain-Specific Annotation Tasks
Yuxuan Lu, Bingsheng Yao, Shao Zhang, Yun Wang, Peng Zhang, Tun Lu, Toby Jia-Jun Li, and Dakuo Wang
@article{lu2023human,title={Human Still Wins over LLM: An Empirical Study of Active Learning on Domain-Specific Annotation Tasks},author={Lu, Yuxuan and Yao, Bingsheng and Zhang, Shao and Wang, Yun and Zhang, Peng and Lu, Tun and Li, Toby Jia-Jun and Wang, Dakuo},journal={arXiv preprint arXiv:2311.09825},year={2023},}
2022
AI-in-the-Loop
KnowledgeShovel: An AI-in-the-Loop Document Annotation System for Scientific Knowledge Base Construction
Constructing a comprehensive, accurate, and useful scientific knowledge base is crucial for human researchers synthesizing scientific knowledge and for enabling AI-driven scientific discovery. However, the current process is difficult, error-prone, and laborious due to (1) the enormous amount of scientific literature available; (2) the highly-specialized scientific domains; (3) the diverse modalities of information (text, figure, table); and, (4) the silos of scientific knowledge in different publications with inconsistent formats and structures. Informed by a formative study and iterated with participatory design workshops, we designed and developed KnowledgeShovel, an AI-in-the-Loop document annotation system for researchers to construct scientific knowledge bases. The design of KnowledgeShovel introduces a multi-step multi-modal human-AI collaboration pipeline that aligns with users’ existing workflows to improve data accuracy while reducing the human burden. A follow-up user evaluation with 7 geoscience researchers shows that KnowledgeShovel can enable efficient construction of scientific knowledge bases with satisfactory accuracy.
@article{zhang2022knowledgeshovel,title={KnowledgeShovel: An AI-in-the-Loop Document Annotation System for Scientific Knowledge Base Construction},author={Zhang, Shao and Jia, Yuting and Xu, Hui and Wang, Dakuo and Li, Toby Jia-Jun and Wen, Ying and Wang, Xinbing and Zhou, Chenghu},journal={arXiv preprint arXiv:2210.02830},year={2022},}
Convex Approximation
Investigating the geometric structure of neural activation spaces with convex hull approximations
Neural networks have achieved great success in many tasks, including data classification and pattern recognition. However, how neural networks work and what representations they learn are still not fully understood. For any data sample fed into a neural network, we wondered how its corresponding vectors expanded by activated neurons change throughout the layers and why the final output vector could be classified or clustered. To formally answer these questions, we define the data sample outputs of each layer as activation vectors and the space expanded by them as the activation space. Then, we investigate the geometric structure of the high-dimensional activation spaces of neural networks by studying the geometric characteristics of the massive activation vectors through approximated convex hulls. We find that the different layers of neural networks have different roles, where the former and latter layers can disperse and gather data points, respectively. Moreover, we also propose a novel classification method based on the geometric structures of activation spaces, called nearest convex hull (NCH) classification, for the activation vectors in each layer of a neural network. The empirical results show that the geometric structure can indeed be utilized for classification and often outperforms the original neural networks. Finally, we demonstrate that the relationship among the convex hulls of different classes could be a good metric to help us optimize neural networks in terms of over-fitting detection and network structure simplification.
@article{JIA202293,title={Investigating the geometric structure of neural activation spaces with convex hull approximations},journal={Neurocomputing},volume={499},pages={93-105},year={2022},issn={0925-2312},doi={https://doi.org/10.1016/j.neucom.2022.05.019},url={https://www.sciencedirect.com/science/article/pii/S0925231222005501},author={Jia, Yuting and Zhang, Shao and Wang, Haiwen and Wen, Ying and Fu, Luoyi and Long, Huan and Wang, Xinbing and Zhou, Chenghu},keywords={Neural networks, Activation space understanding, Convex hull}}
AI-in-the-Loop
DeepShovel: An Online Collaborative Platform for Data Extraction in Geoscience Literature with AI Assistance
Geoscientists, as well as researchers in many fields, need to read a huge amount of literature to locate, extract, and aggregate relevant results and data to enable future research or to build a scientific database, but no existing system supports this use case well. In this paper, based on the findings of a formative study about how geoscientists collaboratively annotate literature and extract and aggregate data, we propose DeepShovel, a publicly-available AI-assisted data extraction system to support their needs. DeepShovel leverages state-of-the-art neural network models to support researchers in easily and accurately annotating papers (in PDF format) and extracting data from tables, figures, maps, etc., in a human-AI collaboration manner. A follow-up user evaluation with 14 researchers suggested that DeepShovel improved users’ efficiency of data extraction for building scientific databases and encouraged teams to form a larger-scale but more tightly-coupled collaboration.
@article{zhang2022deepshovel,title={DeepShovel: An Online Collaborative Platform for Data Extraction in Geoscience Literature with AI Assistance},author={Zhang, Shao and Jia, Yuting and Xu, Hui and Wen, Ying and Wang, Dakuo and Wang, Xinbing},journal={arXiv preprint arXiv:2202.10163},year={2022},}
2019
Data Visualization
Keyword Analysis Visualization for Chinese Historical Texts
Historical texts form the basis of the study of antiquities. In the case of Chinese historical texts, different genres exist, e.g., chronological and biographical works. The contents of these texts normally consist of complex and interrelated information covering long time periods. Traditional history research relies heavily on information extraction and analysis by human researchers. With the recent development of the internet, data science, and visualization technologies, digital history has gradually attracted more and more attention and in turn significantly impacted the field of historical study by altering the accessibility of source materials, narrative strategies, and analytical methodologies. This paper presents a system that enhances Chinese historical research using word segmentation, text analysis, and visualization technologies. We can improve the workflow of traditional historical research by automatically detecting important keywords in Chinese historical texts and extracting, analyzing, and visualizing the relations between a keyword and other words. This not only accelerates text-based historical study but also greatly increases the scope of the search and analysis of keywords in Chinese historical texts, which used to be limited by the capacity of human researchers.
@inproceedings{10.1145/3356422.3356441,author={Zeng, Jihui and Zhan, Beibei and Zhang, Shao and Bie, Jiajun and Xiao, Sheng},title={Keyword Analysis Visualization for Chinese Historical Texts},year={2019},isbn={9781450376266},publisher={Association for Computing Machinery},address={New York, NY, USA},url={https://doi.org/10.1145/3356422.3356441},doi={10.1145/3356422.3356441},booktitle={Proceedings of the 12th International Symposium on Visual Information Communication and Interaction},articleno={14},numpages={4},keywords={Digital History, Word Segmentation, Data Visualization, Historical Document},location={Shanghai, China},series={VINCI'2019},}