This version is not peer-reviewed
Submitted: 06 April 2024
Posted: 17 April 2024
Stage | Query | Search Engine/Database | Search Results (# Papers) |
---|---|---|---|
Initial stage | mainQRY | Google Scholar | 201 |
Initial stage | mainQRY | Scopus | 13 |
Final stage | mainQRY AND (“participants” OR “interview” OR “questionnaire” OR “quantitative” OR “qualitative”) | Google Scholar | 69 |
Final stage | mainQRY AND (“participant*” OR “interview” OR “questionnaire” OR “quanti*” OR “quali*”) | Scopus | 2 |
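
The final-stage queries in the table above follow one pattern: the base query is AND-ed with an OR-group of method keywords, and the Scopus variant uses wildcard stems. The following is a minimal sketch of that composition. It assumes `mainQRY` is kept as an opaque placeholder (its content is not given in this section), uses straight quotes instead of the typographic quotes shown in the table, and the helper name `build_final_query` is hypothetical.

```python
def build_final_query(main_query: str, terms: list[str]) -> str:
    """AND the base query with an OR-group of quoted method keywords."""
    or_group = " OR ".join(f'"{term}"' for term in terms)
    return f"{main_query} AND ({or_group})"


# Placeholder only: the actual text of mainQRY is not given in this section.
MAIN_QRY = "mainQRY"

# Keyword lists copied from the final-stage rows of the table.
GOOGLE_SCHOLAR_TERMS = ["participants", "interview", "questionnaire",
                        "quantitative", "qualitative"]
SCOPUS_TERMS = ["participant*", "interview", "questionnaire",
                "quanti*", "quali*"]

scholar_query = build_final_query(MAIN_QRY, GOOGLE_SCHOLAR_TERMS)
scopus_query = build_final_query(MAIN_QRY, SCOPUS_TERMS)

print(scholar_query)
print(scopus_query)
```

Keeping the keyword lists separate per database makes the one real difference between the two final-stage queries, the wildcard stems used for Scopus, explicit.
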
Paper | Goal | Approach |
---|---|---|
Foerster et al. [28] | Learning a binary (in execution mode) communication protocol | DRQN |
Jorge et al. [29] | Learning to play Guess Who? with two agents (asker, answerer) | Based on [28] |
Sukhbaatar and Fergus [30] | Learning continuous communication between a dynamically changing set of agents for fully cooperative tasks | New model |
Havrylov and Titov [31] | Learning to communicate with sequences of discrete symbols (referential game) | LSTM |
Das et al. [32] | Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning | VGG-16, LSTM |
Mordatch and Abbeel [33] | Formulating the discovery of the agents’ action and communication protocols jointly as a reinforcement learning problem | LSTM |
Jiang et al. [34] | Modeling multi-agent environment as a graph | New model |
Celikyilmaz et al. [35] | Addressing the challenges of representing a long document for abstractive summarization by deep communicating agents in an encoder-decoder architecture | B-LSTM, Attention |
Das et al. [36] | Proposing an architecture for multi-agent reinforcement learning that allows targeted continuous communication between agents via a sender-receiver soft attention mechanism and multiple rounds of collaborative reasoning | Actor-critic, GRU |
Cogswell et al. [37] | Developing an implicit model of cultural transmission and compositionality in deep neural dialog agents, where language is transmitted from generation to generation because it helps agents achieve their goals | Based on [38] |
Performance Metrics | Paper(s) |
---|---|
Normalized rewards | Foerster et al. [28], Das et al. [32], Mordatch and Abbeel [33], Jiang et al. [34] |
Failure rates / Win rates | Sukhbaatar and Fergus [30], Havrylov and Titov [31], Das et al. [36] |
Mean error | Sukhbaatar and Fergus [30] |
Loss | Havrylov and Titov [31] |
Accuracy, precision, and recall | Cogswell et al. [37] |
Custom measure | Jorge et al. [29], Jiang et al. [34], Celikyilmaz et al. [35] |
Paper | Number of Environments (Ordinal) | Scalability (Ordinal (0-2)) |
---|---|---|
Foerster et al. [28] | 2 | 1 |
Jorge et al. [29] | 1 | 0 |
Sukhbaatar and Fergus [30] | 5 | 2 |
Havrylov and Titov [31] | 1 | 0 |
Das et al. [32] | 1 | 0 |
Mordatch and Abbeel [33] | 1 | 0 |
Jiang et al. [34] | 1 | 2 |
Celikyilmaz et al. [35] | 1 | 2 |
Das et al. [36] | 4 | 2 |
Cogswell et al. [37] | 2 | 0 |
Data availability | Paper(s) |
---|---|
Create new dataset | Cogswell et al. [37] |
Used public dataset | Foerster et al. [28], Havrylov and Titov [31], Celikyilmaz et al. [35], Das et al. [36] |
Implementing a new environment | Foerster et al. [28], Jorge et al. [29], Sukhbaatar and Fergus [30], Das et al. [32], Mordatch and Abbeel [33] |
Using the existing environment | Sukhbaatar and Fergus [30], Jiang et al. [34], Das et al. [36], Cogswell et al. [37] |
Code Availability | Paper(s) |
---|---|
Not available | Havrylov and Titov [31], Das et al. [32], Mordatch and Abbeel [33], Celikyilmaz et al. [35], Das et al. [36] |
Provided pseudo-code | Foerster et al. [28], Cogswell et al. [37] |
Available | Foerster et al. [28], Jorge et al. [29], Sukhbaatar and Fergus [30], Jiang et al. [34], Cogswell et al. [37] |
Paper | Number of Comparisons with Baseline Models (Ordinal) | Number of Internal Comparisons (Ordinal) | Experimental Setup (Ordinal (0-2)) |
---|---|---|---|
Foerster et al. [28] | 1 | 4 | 2 |
Jorge et al. [29] | 0 | 5 | 1 |
Sukhbaatar and Fergus [30] | 3 | 0 | 2 |
Havrylov and Titov [31] | 0 | 4 | 0 |
Das et al. [32] | 0 | 2 | 1 |
Mordatch and Abbeel [33] | 0 | 2 | 0 |
Jiang et al. [34] | 3 | 2 | 2 |
Celikyilmaz et al. [35] | 7 | 7 | 1 |
Das et al. [36] | 4 | 3 | 1 |
Cogswell et al. [37] | 5 | 0 | 0 |
Research Design | Paper | Goal |
---|---|---|
Quasi-experimental | Tucker et al. [39] | To investigate human judgments of the robot, agent, or human actions using a dynamic survey |
Quasi-experimental | Strouse et al. [40] | To test how effectively the FCP agents collaborate with humans in a zero-shot setting |
Quasi-experimental | Miura et al. [41] | To investigate whether using legibility as an objective would improve the interpretability of agents’ goals by humans |
Quasi-experimental | Woodward and Wood [42] | To evaluate whether the proposed POMDP representation produces robots that are robust to teacher error, that can accurately infer task details, and that are perceived to be intelligent |
Quasi-experimental | Wang et al. [43] | To investigate the impact of a robot’s embodiment, its explanation, and its promise to learn from mistakes on trust and team performance |
Experimental study | Buehler et al. [44] | To evaluate the benefits of the assistive communication on task performance between robot and human |
Data Collection Method | Paper | Details | Method of Recording |
---|---|---|---|
Online survey | Tucker et al. [39] | 253 participants via Amazon Mechanical Turk | Online answers |
Online survey | Miura et al. [41] | 26 participants via Amazon Mechanical Turk. The only requirement for participation was the ability to read English. | Online answers |
Online survey | Woodward and Wood [42] | 26 participants, consisting of undergraduate and graduate students aged 18 to 31 (mean age 22). Four of the participants were randomly selected for the “human robot” role, leaving the remainder for the “teacher” role. | Online answers |
Online survey + Questionnaire | Wang et al. [43] | 61 participants from a higher-education military school in the United States; 14 women, 39 men; age range: 18-23 | Online answers |
Online survey + Questionnaire | Strouse et al. [40] | 114 participants from Prolific, an online participant recruitment platform; 37.7% female, 59.6% male, 1.8% nonbinary; median age between 25–34 years. At the end of the study, an open-ended question asked for feedback on participants’ partners. | Online answers |
Observational study + Questionnaire | Buehler et al. [44] | 14 participants, randomly divided into two groups: one started with an assisted trial, the other started unassisted. The participants had no prior experience with the task. | Recorded actions |
Paper | Limitations (Mentioned or Identified) |
---|---|
Tucker et al. [39] | Inadequate analysis; lack of verification of human judgments |
Strouse et al. [40] | Inadequate analysis |
Miura et al. [41] | It is not always possible to significantly improve legibility over policies maximizing underlying rewards. Their initial evaluations are limited to MazeWorld instances using BST belief update. |
Woodward and Wood [42] | Inadequate analysis; lack of full explanation of how to collect data; lack of full explanation of the test scenario |
Wang et al. [43] | Lack of full explanation of how to collect data |
Buehler et al. [44] | Inadequate explanation of the questionnaire; lack of full explanation of how to collect data; lack of full explanation of the test scenario |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 MDPI (Basel, Switzerland) unless otherwise stated