As biomedical research projects become increasingly interdisciplinary and complex, collaboration with appropriate individuals, teams, and institutions becomes ever more crucial to project success. While social networks are extremely important in determining how scientific collaborations are formed, social networking technologies have not yet been studied as a tool to help form scientific collaborations. Many currently emerging expertise locating systems include social networking technologies, but it is unclear whether they make the process of finding collaborators more efficient and effective.
This study was conducted to answer the following questions: (1) Which requirements should systems for finding collaborators in biomedical science fulfill? and (2) Which information technology services can address these requirements?
The background research phase encompassed a thorough review of the literature, affinity diagramming, contextual inquiry, and semistructured interviews. This phase yielded five themes suggestive of requirements for systems to support the formation of collaborations. In the next phase, the generative phase, we brainstormed and selected design ideas for formal concept validation with end users. Then, three related, well-validated ideas were selected for implementation and evaluation in a prototype.
Five main themes of systems requirements emerged: (1) beyond expertise, successful collaborations require compatibility with respect to personality, work style, productivity, and many other factors (compatibility); (2) finding appropriate collaborators requires the ability to effectively search in domains other than your own using information that is comprehensive and descriptive (communication); (3) social networks are important for finding potential collaborators, assessing their suitability and compatibility, and establishing contact with them (intermediation); (4) information profiles must be complete, correct, up-to-date, and comprehensive and allow fine-grained control over access to information by different audiences (information quality and access); (5) keeping online profiles up-to-date should require little or no effort and be integrated into the scientist’s existing workflow (motivation). Based on the requirements, 16 design ideas underwent formal validation with end users. Of those, three were chosen to be implemented and evaluated in a system prototype, “Digital|Vita”: maintaining, formatting, and semi-automated updating of biographical information; searching for experts; and building and maintaining the social network and managing document flow.
In addition to quantitative and factual information about potential collaborators, social connectedness, personal and professional compatibility, and power differentials also influence whether collaborations are formed. Current systems only partially model these requirements. Services in Digital|Vita combine an existing workflow, maintaining and formatting biographical information, with collaboration-searching functions in a novel way. Several barriers to the adoption of systems such as Digital|Vita exist, such as potential adoption asymmetries between junior and senior researchers and the tension between public and private information. Developers and researchers may consider one or more of the services described in this paper for implementation in their own expertise locating systems.
Social networking technologies have become one of the latest “killer applications” on the Internet, with some sites such as MySpace and Facebook amassing large numbers of users in a very short period of time [
Social networking approaches have the potential to help scientists find appropriate collaborators more quickly and efficiently than is currently the case. Over the past several decades, science has become significantly more collaborative, both generally [
Currently, most researchers use one of two primary methods to find new collaborators [
A third method for finding collaborators is to use databases of researchers partially or exclusively designed for the purpose. Knowledge management systems of this type, which include “expertise locating systems” [
The computer-supported cooperative work (CSCW) literature contains numerous examples of systems designed to connect people with each other to solve specific problems [
In this study, we are focusing on the much bigger challenge of establishing the long-term collaborations typical in biomedical science. In this case, not only are researchers looking for the most qualified expert, but they also will most likely enter into a long-term relationship. Evaluating an individual’s promise for such a long-term relationship requires information, engagement, and effort much beyond what is needed for finding an expert for singular (or even episodic) problem solving. A thorough literature search located only one report of a system [
In contrast to the dearth of reports in the literature, electronic systems that purport to help scientists find collaborators abound. Similar to social networking sites such as Facebook, such systems endeavor to help individuals make connections to others that are not likely to be made in an off-line context [
However, there are good reasons to suspect that expertise locating systems could help scientists find the most appropriate collaborator(s) more quickly and efficiently than is currently the case [
The confluence of three trends (increasing scientific collaboration, the emergence of social networking as a powerful mediator of social interaction, and the growing availability of information about scientists and their work) presents a significant opportunity to investigate whether expertise locating systems can make the process of finding collaborators more effective and efficient. Current systems are relatively new and have an uncertain track record. One immediate question is whether those systems are responsive to the requirements of scientists seeking collaborators. To address this question, the main goal of this study was to develop preliminary, generalizable requirements for expertise locating systems for biomedical scientists. Its second goal was to design a set of services responsive to these requirements, implement them in a prototype system, and formatively evaluate them with representative end users.
The main focus of this paper is to describe services and functions useful for expertise locating systems in general, not their implementation in a specific system. This study has been conducted as part of the University of Pittsburgh’s Clinical and Translational Science Institute in response to the core challenge to accelerate scientific discovery and the application of its results. As the other 23 current CTSA awardees in the United States are pursuing the same goal, our results are highly significant in that context. In addition, we hope that scientists and developers of expertise locating systems consider our results in the context of their own projects, potentially adopt/implement them, and conceptualize and design additional services as necessary.
This project proceeded in two phases: the background research phase and the generative phase. While the background research phase of the project emphasized discovering as much as possible about the relevant problem domain, the generative phase was intended to develop as many viable solutions as possible and then to choose one or more approaches to implement in a prototype. The project team included two faculty members from the Center for Dental Informatics (TS and HS) and one faculty member from the Katz Graduate School of Business (BB) at the University of Pittsburgh, as well as two faculty members (Susan Fussell and Brad Myers) and five senior master's students (SS, DW, LP, PR, and GM) from the Human-Computer Interaction Institute at Carnegie Mellon University. The project took place from January to August 2007. We describe the two main project phases briefly below.
We began the background research phase with a systematic literature review on relevant topics from the computer-mediated communication, social network theory, and computer-supported cooperative work literature. Keywords included “expertise locating systems,” “expertise management systems,” “knowledge communities,” “knowledge management,” “knowledge management systems,” “communities of practice,” and “virtual communities.” We searched Medline, the ISI Web of Science, the ACM Portal, and the IEEE Digital Library (all available years). From this material, we generated an affinity diagram [
Researcher affiliation, gender, seniority, collaborator count, and perceived collaborative load

| School | Gender | Seniority | Number of Collaborators | Perceived Collaborative Load |
| --- | --- | --- | --- | --- |
| Medicine | M | Junior | 3-4 | too few |
| Medicine | F | Junior | 4 | too few |
| Medicine | M | Senior | 4 | too few |
| Medicine | F | Junior | 7 | too few |
| Dental | M | Junior | 7 | too few |
| Rehabilitation | M | Senior | 9 | too few |
| Public Health | F | Junior | 10 | too few |
| Pharmacy | F | Junior | 25 | too few |
| Pharmacy | M | Junior | 6 | just right |
| Medicine | M | Senior | 6-8 | just right |
| Medicine | F | Junior | 8 | just right |
| Nursing | F | Junior | 8 | just right |
| Rehabilitation | M | Senior | 8 | just right |
| Pharmacy | M | Junior | 9 | just right |
| Pharmacy | M | Senior | 9 | just right |
| Medicine | M | Junior | 10 | just right |
| Medicine | M | Senior | 10 | just right |
| Dental | M | Senior | 15 | just right |
| Nursing | F | Junior | 20 | just right |
| Medicine | F | Junior | 20 | just right |
| Public Health | M | Senior | 30 | just right |
| Rehabilitation | M | Junior | 30-50 | just right |
| Public Health | M | Senior | 16-20 | too many |
| Medicine | M | Senior | 24 | too many |
| Public Health | F | Senior | 40 | too many |
| Public Health | M | Senior | 7 | n/a |
| Dental | M | Senior | 15 | n/a |
In a parallel study, we conducted semistructured interviews with 27 scientists at the University of Pittsburgh (see
We analyzed the semistructured interviews using grounded theory [
We modeled three of the semistructured interviews in accordance with the CI method described above and added the resulting workflow, sequence, and cultural models to the 10 sets of models developed during the CI phase. We did this in order to increase the variety of observations and add insights that may have been articulated during the interviews but not during the CI sessions. Subsequently, we consolidated the data into single flow, sequence, and cultural models. The flow model provided a good view of actors and their roles and the flow of information among them. The cultural model identified the cultural aspects that have a strong influence on whether and how collaborations are formed.
We then derived a detailed set of requirements from the consolidated models and the results of the interviews and categorized them into five main themes: compatibility, communication, intermediation, information quality and access, and motivation. These themes served as the basis for developing the design ideas during the generative phase, which we describe next.
The generative phase began with brainstorming design ideas for systems to help facilitate the establishment of collaborations in light of the system requirements we had formulated. Two initial brainstorming sessions resulted in a total of over 40 ideas. The ideas included semiautomatic updating of online profiles; locating collaborators through colleagues or matching research interests in published papers; utilizing online journal clubs, online video presentations, and live question-and-answer sessions; social tagging of research papers; facilitating directed social contact through methods such as ride sharing and hobby groups; and creating systems to support matchmaking through “social hubs,” such as department chairs. Several of the ideas drew on functions available in the Web 2.0 and ubiquitous/mobile computing technology spaces.
Sixteen of the more than 40 ideas generated during the brainstorming phase were selected for formal concept validation. During this phase, we evaluated the design concepts with nine researchers at the University of Pittsburgh Health Science Campus. The participants represented scientists at the junior, senior, and executive levels with varying research foci (basic, clinical, and translational) at several schools. We presented each design idea as a real-life scenario to the participant and solicited feedback on its functionality and usefulness. Thus, we used the viewpoint of the end user as a central guiding principle for shaping our designs, an approach crucial to the development of user-centered applications [
For each design idea, the individual ratings of the researchers were combined into a summary score that ranged from 1 (not needed) to 4 (very much needed). At the same time, the project team rated the feasibility of implementing each idea on a scale from 1 (low) to 3 (high). The feasibility rating integrated judgments about how difficult it would be to implement each idea based on technical, environmental, and cultural considerations.
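The rating scheme above can be illustrated with a short sketch. The text does not state how individual ratings were combined into the summary score, so the arithmetic mean used here is an assumption, as is ranking by need first and feasibility second. The function names, idea labels, and ratings are hypothetical and are not study data.

```python
from statistics import mean


def summarize(need_ratings, feasibility):
    """Combine participant need ratings (1-4) with a team feasibility rating (1-3)."""
    if not all(1 <= r <= 4 for r in need_ratings):
        raise ValueError("need ratings must be between 1 and 4")
    if not 1 <= feasibility <= 3:
        raise ValueError("feasibility must be between 1 and 3")
    # Assumption: the summary score is the mean of the individual ratings.
    return {"need": round(mean(need_ratings), 2), "feasibility": feasibility}


def rank_ideas(ideas):
    """Order idea names by need, then feasibility, highest first."""
    scored = {name: summarize(r, f) for name, (r, f) in ideas.items()}
    return sorted(
        scored,
        key=lambda n: (scored[n]["need"], scored[n]["feasibility"]),
        reverse=True,
    )


# Hypothetical ratings, for illustration only.
example_ideas = {
    "semiautomated profile updating": ([4, 4, 3, 4], 3),
    "online journal club": ([2, 1, 2, 2], 2),
    "expert search": ([4, 3, 3, 4], 2),
}
ranking = rank_ideas(example_ideas)
```

Keeping need and feasibility as separate values, rather than collapsing them into one number, preserves the distinction between what participants wanted and what the team judged practical.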
Based on the feedback from the concept validation sessions, we selected three related ideas for implementation and evaluation in a prototype. We implemented the design first as a wireframe, then as a high-fidelity prototype. We performed think-aloud evaluations with four scientists using three use cases. The use cases described common scenarios that we asked participants to complete using the Digital|Vita prototype. Two observers kept notes on the interaction of each participant with the system, focusing on functions that were found to be either problematic or useful. The development team then brainstormed system improvements and implemented as many of them as possible. The high-fidelity prototype was used to produce a video about the system, which served as a way to solicit input from senior decision makers and external reviewers.
The studies conducted as part of the background research phase and generative phase were approved by the University of Pittsburgh Institutional Review Board (IRB approval numbers: 0612065 and PRO07050299).
The background literature review, affinity diagramming, contextual inquiry, and semistructured interviews yielded five themes of requirements for systems to support the formation of collaborations. The themes are compatibility, communication, intermediation, information quality and access, and motivation. We briefly describe the themes below.
Beyond expertise, successful collaborations require compatibility with respect to personality, work style, productivity, and many other factors [
Finding appropriate collaborators requires the ability to effectively search in domains other than your own using information that is comprehensive and descriptive [
Social networks are very important for finding potential collaborators, assessing their suitability and compatibility, and establishing contact with them. Established researchers often use existing connections with colleagues as their primary resource for locating new collaborators. However, junior researchers with few or no contacts within the desired field may have significant difficulty initiating collaborations [
Information profiles must be complete, correct, up-to-date, and comprehensive and allow fine-grained control over access to information by different audiences. Missing, incorrect, and out-of-date information and poor indexing (for instance, through the use of nonstandard vocabularies) of information profiles [
Keeping online profiles up-to-date should require little or no effort and be integrated into the scientist’s existing workflow. In order to provide relevant and up-to-date information to colleagues, researchers must have an incentive to supply the information and keep it current. For example, our study participants regularly invested time updating information in their biographical and professional documents, such as their CV, biosketches for grants, and faculty evaluation forms. They were highly motivated to do so because these documents must be up-to-date in order to obtain grant funding, provide background information when invited to lecture or consult, and participate in university evaluations such as promotion and/or tenure decisions. There is no such motivation to update online profiles. In addition, our contextual inquiries showed that a major problem with the existing workflow is that researchers need to reformat and update the same information in multiple documents. We recognized this as an opportunity to draw researchers into using an online system. If it were possible to streamline the process by synchronizing information in multiple documents, the system would provide an incentive to keep information updated. Therefore, before a database of personal researcher profiles can be used as a tool to initiate collaboration, it must be adopted as a repository of biographical and research-related information by a wide range of researchers. This “critical mass” problem is one of the classic challenges to the adoption of CSCW systems [
Results of the concept validation phase for 16 design ideas using nine scientist participants; need rating: 1 (low) to 4 (high); feasibility rating: 1 (low) to 3 (high)
As described in the Methods section, the research team generated a large number of ideas for one or more systems to support researchers in locating collaborators.
1. Maintaining, formatting, and semiautomated updating of biographical information: This set of services allows users to maintain biographical information and output it to several standard formats.
2. Searching for experts: These services provide capabilities for searching for potential collaborators using a range of search criteria and allow searchers to exploit the social network represented in Digital|Vita in the process.
3. Building and maintaining the social network and managing document flow: These services allow users to build a network of social connections, group colleagues into teams, and manage the flow of biographical documents within their teams.
The following sections briefly describe these sets of services.
This service is provided by the My Information (see
The My Information component in Digital|Vita allows the user to enter and update biographical information through manual or semiautomated processes
Three services in My Information allow the user to enter and update biographical information:
Importing information from existing sources: The primary method for populating biographical information is extraction from existing sources such as the National Library of Medicine’s Medline and the National Institutes of Health’s CRISP databases. A similar approach to retrieving and aggregating data from existing sources is being used in many other systems [
Propagating information through social networks: A second mechanism for acquiring biographical information is the semiautomated synchronization of updates made by colleagues in Digital|Vita. The process is semiautomated because all affected scientists are automatically notified about updated information, but each of them has to manually approve the update for inclusion in their own information. For instance, when Digital|Vita users manually enter a paper, they have the option of selecting coauthors from within the Digital|Vita system. When an entry is saved, Digital|Vita automatically propagates this update to the coauthors and displays it on each user’s Digital|Vita home page (see
Manually entering and updating information: The third mechanism for entering and updating information is manual entry. This is appropriate for data for which existing sources are neither available nor accessible. Examples of such data include professional appointments, degrees, and publications indexed in services that do not allow automated retrieval.
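The semiautomated propagation described above (automatic notification, manual approval) can be sketched as follows. This is an illustrative model only, not the Digital|Vita implementation; the class, function, and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    name: str
    publications: list = field(default_factory=list)
    pending: list = field(default_factory=list)  # updates awaiting manual approval


def add_publication(author, title, coauthors):
    """Save a paper for the author and notify coauthors without changing their records."""
    author.publications.append(title)
    for coauthor in coauthors:
        # Skip coauthors who already list the paper or have already been notified.
        if title not in coauthor.publications and title not in coauthor.pending:
            coauthor.pending.append(title)


def review_update(profile, title, approve):
    """Record a coauthor's decision: approved updates are added, rejected ones dropped."""
    profile.pending.remove(title)
    if approve:
        profile.publications.append(title)


# Hypothetical users and paper title.
alice = Profile("Alice")
bob = Profile("Bob")
add_publication(alice, "Paper A", [bob])
```

Holding propagated entries in a separate pending list is what keeps the process semiautomated: nothing reaches a coauthor's own information until that coauthor approves it.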
The My Documents component provides functions to output biographical information to several standard formats, customize information content, archive old versions, and include updates to biographical information selectively
While My Information allows the user to input and manage their biographical information, the My Documents function helps the user produce and archive several forms of output from that data. My Documents includes three services:
Output to several standard formats: The current design of Digital|Vita provides for several standard output formats for biographical information such as a university-specific CV, a brief CV, and NIH and NSF biosketches. Users can choose the desired output format, generate the new document, and edit it according to their preferences.
Customization and versioning: The ability to easily customize document content was deemed essential for the researchers we interviewed because they typically adapt biographical documents for specific grant applications, even if the format required for each is the same. With this service, users can customize documents with a simple checkbox approach—if an item is checked, it is included in the specific document. My Documents also supports versioning so that older versions of a specific document are available on demand.
Selective updating: The system makes it explicit when the existing version of a document does not include recently updated information (see
As the user edits a specific document, the system displays the length of the document in pages in order to allow the user to observe page limits. In addition, the user can preview the printed version of the document; send it to colleagues in their professional network and recipients through email; and save the document in predetermined file formats.
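The checkbox-based customization, versioning, and selective updating services could be modeled roughly as shown below. This is a minimal sketch under our own assumptions about the data model; the `Document` class and its fields are hypothetical and do not describe the prototype's internals.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    fmt: str                                      # e.g. "NIH biosketch"
    included: set = field(default_factory=set)    # items whose checkbox is checked
    known: set = field(default_factory=set)       # items already offered to this document
    versions: list = field(default_factory=list)  # archived snapshots, oldest first

    def toggle(self, item, checked):
        """Checkbox behavior: a checked item is included, an unchecked one is left out."""
        self.known.add(item)
        if checked:
            self.included.add(item)
        else:
            self.included.discard(item)

    def save_version(self):
        """Archive the current selection so that older versions remain available."""
        self.versions.append(sorted(self.included))

    def pending_updates(self, profile_items):
        """Profile items added since this document was last edited."""
        return [item for item in profile_items if item not in self.known]


# Hypothetical biographical items.
profile = ["Paper A", "Paper B"]
doc = Document("NIH biosketch")
doc.toggle("Paper A", True)
doc.toggle("Paper B", False)
doc.save_version()
profile.append("Paper C")  # new information entered after the document was saved
```

Tracking which items a document has already been offered separates a deliberate exclusion ("Paper B") from information the document has simply not seen yet ("Paper C").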
Our background research indicated that researchers consider a variety of factors when choosing potential collaborators. For many, searching Medline and Google Scholar is only the first step in acquiring several types of information about their colleagues. The purpose of the My Information section in Digital|Vita is to store rich and comprehensive profiles of researchers in the database and make them available for flexible and powerful searching by others.
Simple and advanced search of profiles: The first step in finding an expert within Digital|Vita is to allow users to query profiles flexibly. While the simple search in Digital|Vita only offers the capability to query profiles using keywords, the advanced search adds institution, department, location (for institutions with multiple campuses), publication activity, and relevance. (Relevance is a score indicating the level of expertise of the “hit” regarding the desired research topic.) Search results return key information about each hit (see
Exploiting the social network to search: The search results page also displays connections through colleagues (identified by an icon symbolizing a social network, see icon next to "Wendy Roberts" in
A sample search results screen in Digital|Vita shows brief profiles of potential collaborators
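A keyword search combined with a lookup of connections through colleagues might look like the following sketch. The text does not specify how connections are computed; the breadth-first search used here is our assumption, and all names and data are hypothetical.

```python
from collections import deque


def keyword_search(profiles, keyword):
    """Simple search: names of profiles whose keywords include the query term."""
    term = keyword.lower()
    return [name for name, kws in profiles.items() if term in (k.lower() for k in kws)]


def connection_path(colleagues, start, target):
    """Breadth-first search for the shortest chain of colleagues from start to target."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbor in colleagues.get(path[-1], ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no connection through the network


# Hypothetical profiles and colleague links.
profiles = {
    "Ann": ["proteomics", "oral cancer"],
    "Raj": ["proteomics"],
    "Lee": ["biostatistics"],
}
colleagues = {"Me": ["Lee"], "Lee": ["Me", "Ann"], "Ann": ["Lee"]}

hits = keyword_search(profiles, "Proteomics")
paths = {hit: connection_path(colleagues, "Me", hit) for hit in hits}
```

In this example, one hit is reachable through a shared colleague and the other has no connection, which is the distinction a search results page could surface to the searcher.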
This service is managed in the My Colleagues section of Digital|Vita. It is intended for researchers to keep track of their collaborators, colleagues within their department, and general professional network of colleagues within Digital|Vita. It is the area in the system where users build the social network that they are able to exploit when they search for collaborators (see above) and where they manage document flow between themselves and their research teams.
Creating links to colleagues: The value of social networks for recommending collaborators has been discussed earlier [
Assembling research teams: My Colleagues also provides a simple mechanism to label and organize groups of colleagues. The primary use of these groups is to manage the flow of biographical documents among them. The rationale for this feature is that many collaborations in academia arise within the context of pursuing a particular funding opportunity. Since one key activity in preparing grant applications is collecting biographical information from each team member, it was logical to add functions to Digital|Vita to support this effort. Users can create any number of named groups drawing from their list of colleagues on record in the system. As in real life, individual colleagues can belong to more than one group. Groups can be annotated with relevant information, for instance with the identifier of the funding opportunity the group is working on.
Managing biographical document flow: Once a Digital|Vita user has created a named group, he or she can issue an electronic request to the group specifying the type of document requested (eg, NIH biosketch), the purpose for the request, and the date the information is needed by. Team members respond to requests through the system, which gives each person the opportunity to customize the requested document before it is sent. Digital|Vita issues automatic reminders to team members who have not responded by the due date. (Requesting documents in this manner is also possible between individuals.) The status of requests to and from other colleagues, as well as responses, is tracked in a Document Inbox. The Document Inbox allows users to send or request a document, as well as view and manage their recent document requests. Historical requests are accessible through a link to an archive. Before sending a document, users can preview it to ensure they are sending the correct document and that it contains the desired information. If new additions to the user’s biographical information have been made, the document can be edited directly before sending. Users can also decline a document request. Requests are archived automatically after the due date of the document has passed or when the user has sent the requested document.
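The request lifecycle described above can be summarized in a small state model. This is a sketch only. In particular, it assumes that the reminder is issued on the due date itself, which the text leaves open; the class, function, and field names, as well as the example data, are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DocumentRequest:
    requester: str
    recipient: str
    doc_type: str          # e.g. "NIH biosketch"
    purpose: str
    due: date
    status: str = "open"   # open -> sent | declined

    def send(self):
        self.status = "sent"

    def decline(self):
        self.status = "declined"

    def needs_reminder(self, today):
        """Assumption: an unanswered request triggers a reminder on its due date."""
        return self.status == "open" and today == self.due

    def is_archived(self, today):
        """Archived once the document has been sent or the due date has passed."""
        return self.status == "sent" or today > self.due


def request_from_group(requester, members, doc_type, purpose, due):
    """Issue one request per member of a named group."""
    return [DocumentRequest(requester, m, doc_type, purpose, due) for m in members]


# Hypothetical group request.
due = date(2007, 6, 1)
requests = request_from_group("PI", ["Ann", "Raj"], "NIH biosketch", "grant application", due)
requests[0].send()
```

Modeling a group request as a set of individual requests matches the statement that requesting documents is also possible between individuals.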
In summary, maintaining, formatting, and semiautomated updating of biographical information; searching for experts; and building and maintaining the social network and managing document flow are three sets of services designed to make the process of finding collaborators more efficient and effective and so facilitate the establishment of collaborations. We have focused on describing the Digital|Vita functionality as separate services in order to allow other researchers and developers to implement them selectively or all together in other systems.
The preceding section presents a relatively abbreviated description of the functionality of the Digital|Vita system. A video illustrating a prototype of the system and its use is available in the Multimedia Appendix. In addition, the final report (dated July 2007) about the Digital|Vita prototype project, which includes a comprehensive description of the problem space, research, and development methods and the Digital|Vita design and functionality, including the design rationale, is available online. At present, the Digital|Vita development team is writing detailed system specifications for the development of a production application.
The problem of connecting scientists with each other is not new. However, doing so efficiently and effectively has taken on particular relevance and urgency in an age when much of science is migrating to a multidisciplinary, collaborative, and team-oriented model. At the same time, while electronic systems to help connect scientists have existed for some time, to this point they appear to have played only a minor role in helping scientists form collaborations.
Systematic approaches to designing systems to help researchers find collaborators are only in their infancy. We began this study with two basic research questions: (1) What requirements should systems for finding collaborators in biomedical science fulfill? and (2) Which information technology services can address these requirements? We believe that we have made an important contribution to the design of expertise locating systems with regard to both questions. The five main themes we have identified as requirements for such systems (compatibility, communication, intermediation, information quality and access, and motivation) show that collaboration seeking is a complex activity that does not depend simply on the ability to retrieve factual information about potential collaborators. It is clear from our exploration of these themes that social connectedness, personal and professional compatibility, and power differentials influence the formation of collaborations. This means that systems that do not model and leverage the social context are at a clear disadvantage in satisfying the social requirements for establishing collaborations.
On the other hand, a rich informational representation of potential collaborators also appears to be important. Checking PubMed and Google for publications of a potential collaborator was only a starting point for many of our interviewees. Detailed investigation included other information resources, such as the NIH’s CRISP, as well as patent and other databases. Because of the fragmentation of information about potential candidates, a thorough background search on potential collaborators is time and effort intensive. The cost of a search, therefore, appeared to be a barrier to finding the most appropriate and qualified collaborators. It was therefore logical for our design to focus on the most comprehensive and up-to-date, but customarily also least accessible, information profile available: the CV of the individual scientist.
Systems such as the Community of Science have long made biographical information a centerpiece of researcher profiles. Why do we think Digital|Vita may succeed where others have failed? Digital|Vita is centered on one component of a workflow that scientists almost universally perform on an ongoing basis—maintaining and updating the CV—and adds functions to support the establishment of collaborations. This design mirrors Payton’s [
Managing biographical information within Digital|Vita not only requires no extra effort from a scientist compared to the traditional approach, it actually reduces effort because the raw biographical information is converted automatically to several frequently used standard formats. Making this workflow a central feature of Digital|Vita may prompt researchers to at least explore the collaboration-seeking functionality of the system.
However, the simple availability of features to search for collaborators does not mean that they will be used. Encouraging researchers to seek collaborators through Digital|Vita as opposed to traditional methods faces significant obstacles. For instance, established researchers often are so well-informed and well-connected that they, on average, will outperform any electronic system. We therefore anticipate that Digital|Vita may be primarily attractive to younger scientists (who may be using social networking tools in their life outside of work) and scientists who are new to the University of Pittsburgh or who are planning to collaborate with individuals in disciplines that they are not very familiar with. Digital|Vita also faces a complex challenge in keeping information about a researcher private while at the same time marketing that researcher to maximum effect. We believe that the granular control Digital|Vita provides in determining what information is public and what is not will help individuals adjust their public profile to their preferences. Other potential barriers to adoption include establishing an initial critical mass of profiles adequate for finding and choosing collaborators and integrating the system and its capabilities with the regular work practice of the institution and individual researchers.
Future work on the Digital|Vita system will take two major directions. After development and implementation of the production system, we plan to design additional functions intended to improve the matching process among potential collaborators. Most likely, this research strand will focus on the development of algorithms to help pinpoint the most promising collaborators and bring new potential collaborative opportunities to a researcher’s attention. A second direction for the Digital|Vita effort will be to identify other information technology services to help scientists find and access resources that are useful for their work. For instance, we are currently working on a directory of computational resources at the University of Pittsburgh to support scientific problem solving.
It is clear that electronic systems in support of research, and specifically those supporting the establishment of collaborations, will become increasingly important in the future. As more and more science goes “digital,” both in its execution as well as in its documentation, systems such as Digital|Vita will become essential to the everyday life and activities of scientists.
We gratefully acknowledge the support and efforts of the many scientists who participated in the studies leading to the development of the Digital|Vita project. A special thank you goes to Susan Fussell, Brad Myers, Robert Kraut, and Nosh Contractor who contributed many ideas. We appreciate Patricia Anderson’s, Ellen Detlefsen’s, and Ann Majchrzak’s comments on this manuscript, as well as Michael Dziabiak’s help with formatting and submission. Also gratefully acknowledged are Ellen Detlefsen and Erin Nordenberg, for their help with interviewing scientists, and Janice Stankowicz for her help with managing the flow of interviews.
This project was, in part, supported by grant number UL1 RR024153 from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH) and NIH Roadmap for Medical Research, and its contents are solely the responsibility of the authors and do not necessarily represent the official view of NCRR or NIH.
None declared.
Prototype of the Digital|Vita system (Video)
CI: contextual inquiry
CRISP: Computer Retrieval of Information on Scientific Projects
CV: curriculum vitae
NIH: National Institutes of Health