Frameworks for Collective Intelligence: A Systematic Literature Review

Extracts from ACM Computing Surveys article https://doi.org/10.1145/3368986

Dec 18, 2020

Introduction

The concept of ‘Collective Intelligence’ (CI) (i.e., collaborative problem solving and decision making) has been of keen interest to researchers since the 18th century [41, 63]. Since then, the applications of CI and its associated concepts have extended across a wide spectrum of research domains, including sociology, psychology, biology, management, economics, and computer science, among many others [50]. In our work, we focus on CI in Information and Communications Technology (ICT), and therefore, we adhere to the widely accepted formal definition of CI in the ICT domain, proposed by Pierre Lévy in 1995 [43]. Lévy defined CI as a “form of universally distributed intelligence, constantly enhanced, coordinated in real time, and resulting in the effective mobilization of skills” [43]. Some of the CI platforms of the early period include WikiWikiWeb, Experts-Exchange and Google [50]. Since then, advancements in ICT such as Web 2.0 [65, 71], the Semantic Web [28, 44] and Crowdsourcing [7, 17] have enabled and drastically eased large-scale collaboration over the Internet, leading to the development of well-known CI platforms like WaterWiki [16, 62], Climate CoLab [34, 51], DDtrac [26, 27], WikiCrimes [19, 68] and Goldcorp [4], which facilitate knowledge sharing, problem solving and decision making among individual users and groups through web-based interactions and collaborations.

The success of these systems can be credited to their underlying architectures or frameworks (hereinafter referred to as ‘models’). Unfortunately, most of these models are defined using system-specific elements, principles, attributes, requirements or their combinations [39], and are based on specific problems [21]. Since each of these CI systems is designed for a specific problem or use case, the models proposed for these systems are often presented as completely different entities. However, comparing these models shows that although each new CI system and model expands our current understanding of CI, many of these systems bear a few similarities [48]. Sadly, this abundance of diverse knowledge has not yet led to the development of a unified CI model [13, 56, 67] that can support the development of new CI systems based on systematic knowledge rather than intuition [39]. Also, many of the existing CI systems are proprietary and are therefore not available in the scientific literature. Moreover, systems that are described in the scientific literature focus more on the theoretical foundations, usability, and future applications of collective intelligence [21] than on implementation [39]. This lack of well-defined and systematic knowledge about the architecture and principles of the underlying CI systems has led to a reproducibility crisis.

To achieve comprehensive knowledge of CI systems, it is imperative that we extensively investigate the published scientific literature, irrespective of the so-called proposed models. We are convinced that although different CI systems are defined in different ways, they must share more than just a few common characteristics. Identifying these characteristics could help us achieve a unified formal model for designing CI systems, irrespective of their application. To this end, we contribute by conducting a first-of-its-kind Systematic Literature Review (SLR) of Collective Intelligence models in ICT. In this SLR, we extensively investigate the characteristics of 12 CI models, selected from a pool of 219 scientific publications, which were identified after thoroughly exploring 9,418 scholarly articles on CI published since 2000. Based on the results of our review, we develop a novel framework that can be utilized to understand existing CI systems. The proposed framework provides a generic model and a set of requisites that would enable the creation of novel CI systems, regardless of their domains. This is achieved by exhaustively combining all attributes of the studied CI models into the proposed framework.

Additionally, to better explain the functioning of CI systems with respect to the proposed framework, we examine the different components of six ongoing CI projects: CAPSELLA, hackAIR, openIDEO, Climate CoLab, WikiCrimes, and Threadless.

In particular, through our work we aim to answer the following research questions:

RQ1: What are the underlying models of existing CI systems? What are the common terminologies used to describe CI models? What are their components? And, how are these components associated with each other?

RQ2: Do any of the available CI models appropriately define all CI systems, irrespective of their applications? Can these models be used to create CI systems for novel challenges?

RQ3: If not, then can we somehow combine the available knowledge of CI models and systems to create a unified model that could define all CI systems?

Methodology

To answer the research questions through a transparent and objective approach, we decided to conduct this review based on Kitchenham’s “Guidelines for performing Systematic Literature Reviews in Software Engineering” [37]. An SLR summarizes, critically appraises, and identifies valid and applicable evidence in available research by using explicit methods to perform a thorough literature search [9, 37, 66]. Based on Kitchenham’s guidelines, we performed this SLR in five stages:

Search Strategy: Based on the previously identified research questions, we selected a set of search terms. We then used combinations of these search terms to look for relevant research articles in different academic databases. After this, we applied the inclusion criteria to the identified articles and shortlisted the most relevant articles (which we refer to as “Primary Studies”). Following Kitchenham’s guidelines, we then evaluated the primary studies using the quality assessment criteria. Finally, the selected studies were investigated in the data extraction and synthesis stages of the SLR.

Study Selection: To identify the articles relevant to our research questions, we applied a two-phase selection process (please refer to Tables 3 and 4). During this process, two researchers of this review independently analysed the identified articles and selected the studies that were most likely related to our research questions.

Study Quality Assessment: The intention of this phase is to determine the relevance of selected studies while limiting bias in the study selection process. In this phase, all three researchers of this review independently assessed the primary studies by answering the questions presented in Table 4. For each primary study, the researchers answered each question as “Yes,” “Partly,” or “No,” scoring 1, 0.5, and 0, respectively. The individual scores for each question were then added to derive a total score for each primary study. The studies that scored 3 or higher were finally selected for the data synthesis stage. Any conflicts of opinion about the process and results of the quality assessment were discussed among all three researchers to reach a consensus (please refer to Table 6 for the final “Selected Studies”).
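The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the answer-to-score mapping and the threshold of 3 come from the text, while the function names, the study labels, and the number of questions per study (five here) are hypothetical and not taken from the review.

```python
# Scores per the review: "Yes" = 1, "Partly" = 0.5, "No" = 0.
ANSWER_SCORES = {"Yes": 1.0, "Partly": 0.5, "No": 0.0}

# Studies scoring 3 or higher move on to the data synthesis stage.
SELECTION_THRESHOLD = 3.0


def quality_score(answers):
    """Sum the per-question scores for one primary study."""
    return sum(ANSWER_SCORES[answer] for answer in answers)


def is_selected(answers):
    """Return True if the study's total score reaches the threshold."""
    return quality_score(answers) >= SELECTION_THRESHOLD


# Hypothetical answers to five hypothetical questions (not real data).
example_studies = {
    "study_a": ["Yes", "Yes", "Partly", "Partly", "No"],  # total 3.0
    "study_b": ["Partly", "No", "Yes", "No", "Partly"],   # total 2.0
}

selected = [name for name, answers in example_studies.items()
            if is_selected(answers)]
print(selected)  # ['study_a']
```

The sketch scores one agreed set of answers per study; how the three researchers' independent answers were reconciled before scoring is handled by discussion in the review and is not modelled here.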

Data Extraction and Synthesis: The intention of the data extraction stage is to identify the main contributions of the selected studies and to present a summary of the work. Table 7 presents the data items extracted from the 12 selected studies. The contributions, i.e., models and elements of the Selected Studies, are presented in Table 16. The goal of the data synthesis stage is to collate and summarize the contributions of the selected studies. We first catalogue the definition types and classifications of the studied CI models; we then identify all unique and synonymous characteristics, levels, requirements, properties, and building blocks and classify them into 24 distinct attributes (presented in Table 16). Finally, based on these findings, we answer the first two research questions (RQ1 and RQ2) in Section 4 and the final research question (RQ3) in Section 5.

Novel ‘Generic’ CI Framework

The literature shows that CI is a multidisciplinary field, drawing concepts and techniques from a number of different disciplines including computer science [23], organizations [25], social media [69], complexity sciences [70], and psychology [84]; therefore, different scholars have described CI from different perspectives. However, over the years only three definitions of CI have been widely adopted in ICT, two of which were proposed in this decade. The first formal definition of collective intelligence (in ICT) was proposed by Pierre Lévy (1997) [43], followed by Jerome C. Glenn (2013) [23] and Thomas W. Malone (2015) [50]. Although each of the definitions describes CI in its own distinct way, when examined together, the definitions express CI as having three main components, i.e., individuals (with data/information/knowledge); coordination and collaboration activities (according to a predefined set of rules); and means/platform for real-time communication (viz., hardware/software). When combined, these components enable intelligent behaviour in groups or crowds. Table 17 is the result of segregating all the characteristics defined in Section 3 in terms of these three main components of CI systems.

Now, using the findings from the data extraction phase of the SLR, we attempt to contribute to the available CI models by proposing a unified framework for CI that combines the 24 unique attributes (see Table 17) of CI models identified from studies S1 to S12. The purpose of the proposed framework is to answer the final research question (RQ3) and provide additional insights and explanations that can help us better understand CI systems in general. Combining the knowledge of the CI models studied in this SLR, we propose a novel framework that describes CI systems in a fine-grained manner. We do so by comprehensively classifying all components of the studied CI models into the previously mentioned unique attributes, and then categorizing them into three sections:

· a “generic” model that defines all CI systems (see Figure 1)

· additional requisites for CI systems (see Section 5.2)

· CI as a complex adaptive system (see Section 5.3)

Taking inspiration from the building blocks for CI proposed by Malone et al. [52], combined with the findings from Section 4.1.1, we propose a model that describes CI systems by means of staff, process, goal, and motivation. Designed as an extension to Malone’s concept of building blocks, the proposed generic model segregates the originally proposed genes into more fine-grained types; introduces a new classification, namely, interactions; and suggests vital properties for the staff and goal building blocks of the generic model. The remaining attributes that could not be accommodated into the building blocks are aggregated into the additional requisites category. Finally, the last three components that are necessary to enable CI in ICT systems are described as complex adaptive system attributes.
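To make the three-part structure more concrete, the framework could be represented as a small data structure. The following Python sketch is an assumption for illustration only: the four building blocks (staff, process, goal, motivation), the notions of types, interactions and properties, and the two extra categories come from the text, but the class names, field names, the helper method, and all example values are hypothetical placeholders rather than attributes from the paper. Interactions are attached to individual blocks here purely for simplicity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BuildingBlock:
    """One building block of the generic model (hypothetical layout)."""
    types: List[str] = field(default_factory=list)
    interactions: List[str] = field(default_factory=list)
    # The text suggests properties for the staff and goal blocks only;
    # the field is left empty for the other blocks in this sketch.
    properties: List[str] = field(default_factory=list)


@dataclass
class CIFramework:
    """The generic model plus the two remaining framework sections."""
    staff: BuildingBlock
    process: BuildingBlock
    goal: BuildingBlock
    motivation: BuildingBlock
    additional_requisites: List[str] = field(default_factory=list)
    complex_adaptive_attributes: List[str] = field(default_factory=list)

    def undescribed_blocks(self) -> List[str]:
        """Names of building blocks for which no type has been recorded."""
        blocks = {
            "staff": self.staff,
            "process": self.process,
            "goal": self.goal,
            "motivation": self.motivation,
        }
        return [name for name, block in blocks.items() if not block.types]


# Placeholder description of an imaginary platform (not real data).
platform = CIFramework(
    staff=BuildingBlock(types=["contributors"],
                        properties=["placeholder property"]),
    process=BuildingBlock(types=["idea submission"]),
    goal=BuildingBlock(types=["collect ideas"]),
    motivation=BuildingBlock(),
)
print(platform.undescribed_blocks())  # ['motivation']
```

A structure along these lines could serve as a checklist when describing a platform: any block returned by `undescribed_blocks()` still needs to be characterised.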

Threats to Validity

The primary threats to the validity of this Systematic Literature Review include bias in search strategy, bias in selection process, and inaccuracies in data extraction.

The selection of studies relied on the search strategy, which included the selection of search terms and literature resources, and the search process. The search terms were selected based on both the research questions and an initial literature review; followed by a three-step process to construct the search string as described in Section 1. We then chose four prominent academic databases of computer science and used the formulated search string to identify relevant literature. Table 2 presents the number and types of research articles identified from each of the academic databases. To avoid bias in our search strategy and to identify relevant technical reports, books, and theses, we conducted a manual search on Google Scholar.

To avoid bias in the study selection process, we first reviewed the titles and abstracts of the identified studies and then selected only those studies that fulfilled the inclusion criteria. We then studied these selected articles and manually checked their references to make sure that we did not miss any relevant articles during the search process. Finally, the selected studies were evaluated based on the quality assessment criteria. As a result of the study selection phase, we were able to identify the most relevant studies with respect to our research questions.

To eliminate inaccuracies in data extraction, each primary study was independently studied by all researchers and any disparities in findings were resolved through discussions. During the process, we found two pairs of studies, i.e., S1, S2 and S6, S9, which shared a couple of similarities. The first pair (S1, S2) described the characteristics of CI systems using similar classifications, while the second pair (S6, S9) was written by the same authors. By consensus, we decided to keep both pairs in our selected studies, as S1 and S2 described CI systems from different perspectives, whereas S6 and S9 provided different contributions.

Conclusion

The objective of this article was to analyse different collective intelligence models described in the scientific literature and to identify a generic model that could be utilized to design new CI platforms. To this end, we conducted a Systematic Literature Review, in which we identified 9,418 articles on collective intelligence models. Out of these articles, we selected 12 studies based on an exhaustive selection process. We then critically analysed these selected studies and found that none of the models provided a generic view of CI systems, as each of the models was designed based on specific perspectives. Moreover, the models that could potentially be used to design domain-independent CI systems lacked granularity and needed to be researched further. To fill this research gap, we aggregated the components of the CI models described in the selected studies and proposed a unified framework for understanding CI systems. The proposed framework describes CI systems in three parts: first, a generic model, which describes CI systems as a combination of goals, staff, motivation, and processes, which are further described as types, interactions, and properties; second, a list of requisites necessary for CI systems to work effectively; and third, guidelines that could enable complex adaptive behaviour in CI platforms. To evaluate whether the proposed model could define CI systems from different domains, we selected a set of ongoing CI projects and observed user activities within the platforms over a duration of 6 months. After this, we systematically organized our observations and segregated them according to the different components of our proposed generic model. We found that our model successfully described the components of each of the CI platforms and revealed some interesting relations between the types of actors, their activities, and motivations.
The evaluation of the proposed model also gave us the opportunity to present our unified CI framework by means of examples (i.e., six ongoing CI initiatives). It was imperative that we describe the components of these CI platforms in terms of the proposed CI model, so that both researchers and system designers/developers in the field could utilize our novel model to design and develop new CI systems. The 24 unique attributes that describe the proposed framework could provide initial insights to system designers and developers and could be beneficial during the requirement elicitation process when developing new CI systems. We recognize that we need to further examine the proposed framework by comparing it to a larger set of CI platforms, as doing so would help us gain a deeper understanding about how the proposed framework could be used to design new CI systems. Additionally, we would like to evaluate the proposed framework by conducting qualitative interviews with domain experts and researchers working on upcoming CI initiatives. And finally, we would also like to investigate different trust and reputation models that could be utilized to reduce user bias within CI platforms, thereby enhancing user experience and enabling a smooth exchange of knowledge and information within communities.

Relevance of Our Work for Science

CI on the web can be found in various forms, and looking at the scientific literature, web-based CI applications can be primarily categorized into three different paradigms. First, given how easy it has become for us to connect to other people around the world, we are now able to learn about other people’s cultures, religions, politics, medicine, music, art and more, thereby enhancing our understanding and increasing our acceptance of one another. Through platforms that support crowdsourcing and crowdfunding (like OpenIDEO and Kickstarter), people from opposite parts of the world can contribute to each other’s initiatives and are able to provide not only financial support but also mental support in creating new and innovative ideas. Second, organizations are now able to both learn from and share knowledge with general web users through platforms (like Stack Overflow and Medium); more and more organizations now engage members of the general public to learn about user needs, requirements and perceptions, allowing institutions to keep pace with ever-changing consumer markets. Within organizations, the use of web-based social platforms has enabled knowledge creation and exchange, so much so that more and more organizations are transforming themselves into ‘knowledge creating companies’. And finally, this overall rise in interest in the social web and collective intelligence has prompted government institutions to empower citizens with more decision-making capabilities when carrying out governance activities. Thanks to such platforms (like WikiCrimes and CitizenOS), citizens are now able to participate more proactively in their own governance and can even guide their governments in coming up with new policies and regulations.

Consider more recent examples from this year, be it the COVID-19 pandemic or the worldwide protests against racial discrimination that followed the police killing of George Floyd. During these critical times, we have seen how the general public, irrespective of nationality, religion and socio-economic background, came closer together as communities and contributed a plethora of sociotechnical innovations. From individuals and small/medium enterprises manufacturing face masks/shields, PPE suits and ventilators, to employees of organizations like Amazon, Facebook and Google demanding policy changes regarding facial recognition technologies and racial discrimination, the last few months have clearly illustrated how powerful and innovative the crowd can be, especially under daunting circumstances. Drawing from current events and the slowly changing digital/social landscape, it is not difficult to fathom that in the near future we, as citizens of the world, will come even closer to one another and work together to solve wicked challenges with global impact.

Taking the above-mentioned factors into consideration, it has become apparent over the years that we are transitioning into a new age of governance and problem solving. Unfortunately, even with the wide variety of ICT innovation and research, the seemingly simple task of designing Collective Intelligence (CI) platforms is still rather time-consuming and can often require months, if not years, of analysis and planning, making it difficult for small/medium enterprises and governing bodies to design such solutions.

That said, through our research, we would like to argue that providing institutes with CI platforms (based on our ‘generic’ CI framework) in which stakeholders could simply combine the different components required to enable CI, thereby empowering them to develop their own CI platforms on the go, should make it easier to design such CI/crowdsourcing/open innovation platforms. Utilizing the findings of our work, it should become easier for researchers/developers and other stakeholders (including governing bodies) to develop new CI platforms while reducing the amount of time and money required to do so. As an added advantage, the findings from our work could also enable young/upcoming researchers to understand how collective intelligence is enabled/achieved in different ICT systems. By doing so, it should allow individuals, countries and organizations from around the world to tap into the enormous knowledge pool and innovative capabilities of world citizens. Hopefully, this will lead to a future where such platforms are utilized not only for knowledge acquisition but also for information dissemination, thereby helping us evolve into collectively intelligent societies that are closer, more connected and more empathetic towards each other.