Volume 6, No. 2, Art. 36 – May 2005

Acquiring Qualitative Data for Secondary Analysis

Louise Corti & Gill Backhouse

Abstract: Qualidata was launched in 1994 as a proactive service for the location, documentation and preservation of qualitative social science research data. Over the course of its relatively short history, Qualidata has succeeded in gaining acceptance for the deposit and re-use of qualitative data material amongst the academic community. Initially concentrating on acquiring important early social studies, then working with the UK's Economic and Social Research Council (ESRC) to operate a "datasets policy," ESDS Qualidata, now merged into the UK Data Archive, is in a good position to look back and review the progress made in acquiring data. This paper draws on ESDS Qualidata's pioneering experiences in acquiring and making available qualitative data. It reflects on strategies that have proved successful and comments on others that have been less productive, developing these into guidance for others wishing to embark on the acquisition and deposit of qualitative social science research data.

Key words: social science data archives, qualitative data archives, creating data, depositing data, acquisitions policy, data sharing policy, secondary analysis of qualitative data, consent, confidentiality

Table of Contents

1. Introduction

2. The Early Years of Qualidata—Starting to Acquire Qualitative Data

3. The Acquisition of ESRC Funded Data

4. Consolidation of Qualidata—Continuing the Acquisition Process

5. The Evolution and Comparison of Qualidata's Acquisition Policy

6. Criteria Used in Evaluating Qualitative Data for Archiving

7. Ethical Issues in Archiving Data

8. Summary of Main Points for a Successful Qualitative Data Acquisition Strategy

Acknowledgments

Notes

References

Authors

Citation

 

1. Introduction

The ESRC Data Archive was established in 1967 to preserve the most significant machine-readable data from research funded by the Economic and Social Research Council (ESRC). Until recently, most machine-readable data was statistical, based on surveys. Qualitative research was paper based and therefore only a small proportion of qualitative research data funded by the ESRC was archived. In 1991 the ESRC commissioned Paul THOMPSON (1998) to carry out a small pilot study to find out what was happening to qualitative data from projects that it had funded. The results of the survey showed that 90% of social science qualitative research material from projects funded by the ESRC was at risk or already lost. [1]

As a consequence of the survey report, the ESRC established the ESRC Qualitative Data Archival Resource Centre (Qualidata), with additional resources for accommodation and office expenses provided by the University of Essex. Paul THOMPSON was appointed as the Director, Louise CORTI (a scientist and social scientist by training) as Senior Administrator, and Janet FOSTER (an archivist) as Senior Research Officer. This team embarked on Qualidata's remit, which was twofold: first, to undertake a salvage operation to rescue the most significant material created by research from previous years; and secondly, to work with the ESRC and the ESRC Data Archive to ensure that, for current and future projects, the unnecessary waste of the past did not continue (THOMPSON 1998). [2]

It is important to explain here that Qualidata was not set up as an archive, but to act as a broker between researchers and existing archives. Qualidata's role was to locate and evaluate data, process and catalogue the data and arrange for their deposit in an appropriate archive. The existence of archived research materials available for re-use would then be made known to the research and teaching community. [3]

2. The Early Years of Qualidata—Starting to Acquire Qualitative Data

As Qualidata is not an archive, the Center's staff needed to explore existing repositories across the UK that would be suitable to hold qualitative research material. Not only did this survey indicate potential archives to work with, it also revealed a significant amount of archived qualitative data. [4]

The main challenge, however, was to establish how much significant qualitative research material had survived from earlier research projects. Three different surveys were conducted by contacting researchers who had collected qualitative data, sometimes as far back as 1945. ESRC funded projects that had generated qualitative data, including earlier SSRC projects, were surveyed, and produced records extending back to 1970. From these surveys, details of each individual research project were entered into a database making a total of 2,565 records. [5]

There was an urgency concerning the acquisition of material from earlier social studies as it was discovered that many important datasets were already lost. One such dataset was the research material on child-rearing which John and Elizabeth NEWSON had been collecting in Nottingham since the early 1960s, consisting of over 3,000 high quality in-depth interviews with parents and children. Only weeks before the inauguration of Qualidata, the Newsons decided that their lifetime's research collection should be destroyed (THOMPSON 1998; BACKHOUSE & THOMPSON 2000). The results of this work can be seen in Table 1 with 4 datasets archived initially and then 29 in the subsequent year.

 

| | Mar 95-Sept 95 | Oct 95-Sept 96 | Oct 96-Sept 97 | Oct 97-Sept 98 | Oct 98-Sept 99 | Oct 99-Sept 2000 |
| --- | --- | --- | --- | --- | --- | --- |
| Number of datasets evaluated | 16 | 20 | 43 | 25 | 76 | 130 |
| Number of datasets archived | 4 | 29 | 13 | 35 | 36 | 20 |
| Number of datasets cataloged 1) | 4 | 64 | 32 | 37 | 101 | 40 |
| % of staff effort devoted to "classic studies" | 80 | 50 | 35 | 30 | 25 | 10 |

Table 1: Performance measurement statistics 1995 – 2000 2) [6]

One of the most significant data collections archived in the first two years was the lifetime's research materials of Peter TOWNSEND. These include data from "The Family Life of Old People" (1957), "The Last Refuge" (1962), "Poverty in the UK" (1979), and an unused set of interviews carried out in the 1950s with residents of Katherine Buildings in East London, which had been a housing experiment in the 1880s. No archive was able to take large paper-based collections on social policy and social change and therefore Qualidata set up the National Social Policy and Social Change Archive (NSPSCA) at Essex in 1996, with funding from the Joseph Rowntree Foundation. Without this action, an extremely rich source of material on poverty and old age, drawn from large scale and in-depth studies, would have been lost to social science researchers and historians. [7]

Paul THOMPSON's interviews, which formed the basis for his book "The Edwardians" (1975), were archived at Essex and have been re-used by numerous researchers, leading to many publications on a wide range of topics (THOMPSON 2000). 450 in-depth interviews, from a quota sample of men and women across the UK born between 1870 and 1906, were collected by the research team for the project. This rich and unique dataset includes themes on gender issues, childhood, work and family life, and continues to provide scope for further exploration by researchers. [8]

Additional significant material that was archived includes Dennis MARSDEN's interviews from "Mothers Alone: Poverty and the Fatherless Family" (1969), and "Workless: some Unemployed Men and their Families" (1975). [9]

3. The Acquisition of ESRC Funded Data

The ESRC Datasets Policy was established in 1995 and reinforces the ESRC's stated position relating to the acquisition and use of datasets, the requirements of which are now a condition of ESRC research funding (ESRC Guide to Research Funding Section 17). The ESRC requires all award-holders to offer for deposit copies of both machine-readable quantitative data, and machine- and non-machine-readable qualitative data, within three months of the end of the award. This relates not only to datasets arising as a result of primary data collection, but also to derived datasets resulting from ESRC-funded work. [10]

The Datasets Policy requires that datasets must be deposited to a standard that would enable the data to be used by a third party, including the provision of adequate documentation. Depositors are advised to contact the two Resource Centers at the earliest opportunity should the nature of the data be such that it may be difficult to archive. The earlier in the research process these discussions occur, the more likely researchers are to create datasets which are well-documented, free of confidentiality or license constraints, and usable for secondary analysis. [11]

Certainly the ESRC is far ahead of other funders in the UK in terms of its support for archiving data through its Datasets Policy, by funding the UK Data Archive and Qualidata, and thereby making data available to other researchers for secondary analysis. However, further action is needed by the ESRC to enable this Policy to be fully implemented. CORTI and WRIGHT (2002) have made recommendations for "a set of changes to the operational procedures that would create a more robust, systematic and accountable policy." They make the following recommendations:

These changes would considerably advance the progress made in pursuing the aims of the Datasets Policy. [13]

4. Consolidation of Qualidata—Continuing the Acquisition Process

The period from October 1996 to 2000 was one of consolidation for Qualidata. With an established comprehensive database of research projects, procedures in place for evaluating datasets and depositing them in appropriate archives, this period saw a broadening of the range of disciplines of the datasets acquired. The post of Researcher Support Officer was established in 1998 providing advice and information to ESRC grant applicants and award holders on archiving data and undertaking outreach work with the ESRC Research Programmes; in all, adopting a proactive approach to the acquisition of data. [14]

Qualidata has contacted other major UK funders about their archival policies. The Joseph Rowntree Foundation has no formal datasets policy but does encourage both quantitative and qualitative data to be offered for archiving. The Nuffield Foundation has adopted an archival policy for social science datasets for its research. The Leverhulme Trust leaves the decision on archiving data with the researchers but does send information on awards to the Data Archive. Qualidata has been in discussion with other funders of data to explore the possibilities of working together on archival policies. [15]

As Table 1 shows, from 1996 to 1997 there was a perhaps surprising fall in the number of datasets archived. This can be explained partly by staff changes, partly by less time being spent on acquiring earlier studies, and partly by the fact that it was too early for the Datasets Policy to have had an effect. Table 2 gives figures for datasets identified for archiving, and for those that have been archived, broken down by discipline. The large number of datasets for sociology reflects the fact that social studies were surveyed in the pilot work and again in the researchers' surveys.

| Discipline | Total no. collections identified for acquisition (%) | Awaiting response from investigator (%) | Data destroyed (%) | Investigator not willing (%) | Collections acquired (%) | Other (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Area Studies / Environmental Planning | 1.1 | 0.9 | 1.2 | 0.5 | 1.0 | 0.9 |
| Criminology | 5.1 | 7.1 | 3.1 | 1.9 | 1.9 | 5.1 |
| Economic and Social History | 1.7 | 1.3 | 2.5 | 1.5 | 3.1 | 1.8 |
| Education | 11.0 | 12.0 | 9.2 | 10.7 | 12.2 | 11.7 |
| Geography | 4.8 | 5.0 | 4.3 | 5.3 | 2.6 | 4.5 |
| Management and Business Studies | 11.8 | 14.6 | 15.3 | 12.1 | 4.1 | 12.2 |
| Political Science and Industrial Relations | 4.3 | 5.6 | 0.0 | 5.8 | 2.1 | 4.4 |
| Social Psychology | 5.3 | 5.0 | 6.1 | 4.4 | 5.0 | 5.0 |
| Social Administration / Social Policy 3) | 3.2 | 2.8 | 5.5 | 6.3 | 3.3 | 3.5 |
| Social Anthropology | 7.5 | 8.2 | 7.4 | 10.2 | 6.4 | 8.0 |
| Socio-legal Studies | 1.7 | 1.4 | 1.8 | 1.9 | 3.1 | 1.8 |
| Socio-Linguistics | 2.9 | 3.0 | 1.8 | 3.4 | 4.5 | 3.2 |
| Sociology 4) | 39.6 | 33.1 | 41.7 | 35.9 | 50.6 | 37.8 |
| | | | | | | 184 |

Table 2: Researcher survey progress: datasets selected for acquisition and outcomes by discipline at October 2000 5) [16]

The range of disciplines for which Qualidata evaluated and archived data widened from October 1998 as contact was made with all ESRC Research Programmes to provide advice and information on archiving data. What is clear from Table 2 is the very low response rate from researchers to requests for details on datasets for potential deposit. To achieve an archived dataset, not only does the researcher have to be in favor of the principle of archiving data, but the confidentiality and consent agreements from interviewees must be satisfactory. In addition, researchers must be willing to prepare and, in some cases, resurrect their data for archiving. Archiving retrospectively is time consuming and often low in priority for an academic's workload. [17]

Significant datasets that were archived between 1996 and 2000 were Stan COHEN's "Folk Devils and Moral Panics: The Creation of the Mods and Rockers" (1971); "The Affluent Worker in the Class Structure" (1969) study undertaken by John GOLDTHORPE, David LOCKWOOD, Frank BECHHOFER and Jennifer PLATT; and Tony COXON's research SIGMA Sexual Diaries (1982-1998). [18]

By September 2000, Qualidata had deposited 137 qualitative datasets and added details of a further 140 existing collections to its catalogue. During this period a large proportion of Qualidata's staff were engaged in data processing. The majority of acquisitions, until 2000, were paper-based, and some of the larger collections amounted to thousands of transcript pages. Paul THOMPSON's dataset for the Edwardians comprises 34,172 pages; the Affluent Worker collection contains 28,479 transcript pages. Most datasets are smaller in size, but all require cataloging to prescribed standards and careful listing and labelling to enable the user to identify and make full use of the material. Anonymization may also be required, involving specialist processing work which can be very time-consuming. Nor does the processing task for machine-readable data necessarily take less time than for paper transcripts. Processing machine-readable texts may vary according to the individual requirements of each dataset, including differing demands for anonymization (CORTI 2002; CORTI & THOMPSON 1999). [19]
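The figure of 137 deposited datasets can be checked against the yearly counts in Table 1. The following is a minimal sketch in Python, with the counts transcribed from the table; the variable names are illustrative only and do not correspond to any Qualidata system.

```python
# Yearly counts transcribed from Table 1 (March 1995 to September 2000).
# All variable names are illustrative assumptions.
periods = [
    "Mar 95-Sept 95", "Oct 95-Sept 96", "Oct 96-Sept 97",
    "Oct 97-Sept 98", "Oct 98-Sept 99", "Oct 99-Sept 2000",
]
evaluated = [16, 20, 43, 25, 76, 130]
archived = [4, 29, 13, 35, 36, 20]

# Totals across the whole reporting period
total_evaluated = sum(evaluated)
total_archived = sum(archived)

# Reporting period with the largest number of deposits
busiest = periods[archived.index(max(archived))]

print(f"Datasets evaluated: {total_evaluated}")
print(f"Datasets archived:  {total_archived}")
print(f"Most deposits in a single period: {busiest}")
```

Summing the "archived" row reproduces the 137 deposits reported in the text.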

5. The Evolution and Comparison of Qualidata's Acquisition Policy

First, a policy for the acquisition of qualitative data has to take account of the current thematic priorities of funders and researchers and cater for current user demand for the re-use of existing datasets, which may encompass wider themes. Secondly, an acquisition policy must adopt a long-term perspective on preserving important datasets that are not currently a preoccupation, but may be in the future. Potential acquisitions are thus always evaluated in an attempt to predict future research trends and the long-term needs of researchers. [20]

During the development of the service, and in response both to changing formats for data collection and storage and to researchers' demands for archived material for secondary analysis, Qualidata has formulated a number of criteria for evaluating qualitative data with potential for secondary analysis (THOMPSON 1998). Not all the criteria are essential for a dataset to be accepted for archiving, and some criteria are more important than others. For example, the life's work of a significant researcher, even though in a problematic format, may be deemed worthy of additional resources to convert the data into an acceptable format for preservation and re-use. [21]

6. Criteria Used in Evaluating Qualitative Data for Archiving

It is interesting to compare these criteria with Qualidata's current acquisition priorities. Qualidata's priorities in 2002 center on:

Certain criteria are deemed to be essential: that data are documented to a minimum standard, are in appropriate formats, are complete, and that confidentiality, data protection and copyright issues have been addressed. More emphasis is now placed on data in particular study topics, and on data collected using both qualitative and quantitative research methods. Experience of data archiving and re-use suggests that data based on a national sample, and datasets that have not been fully explored in the original study, provide the greatest potential for secondary analysis. [24]
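The essential criteria listed above act as a checklist in which every item must be satisfied before deposit. A minimal sketch in Python of how such a check might be expressed is given below; the class, field and function names are hypothetical illustrations and do not describe any actual Qualidata procedure or software.

```python
from dataclasses import dataclass, fields


@dataclass
class EssentialCriteria:
    """Hypothetical checklist; field names are illustrative assumptions."""
    documented_to_minimum_standard: bool
    appropriate_formats: bool
    complete: bool
    confidentiality_addressed: bool
    data_protection_addressed: bool
    copyright_addressed: bool


def unmet_criteria(check: EssentialCriteria) -> list[str]:
    """Return the names of the essential criteria that are not yet satisfied."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


# Example: a well-documented dataset whose consent position is still unclear
example = EssentialCriteria(
    documented_to_minimum_standard=True,
    appropriate_formats=True,
    complete=True,
    confidentiality_addressed=False,
    data_protection_addressed=True,
    copyright_addressed=True,
)

missing = unmet_criteria(example)
print("Ready for deposit" if not missing else f"Outstanding: {missing}")
```

Listing the unmet items, rather than returning a single yes or no, reflects the point that outstanding issues such as confidentiality can often be resolved in discussion with the depositor.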

THOMPSON, reflecting on his own personal experience of re-using qualitative data, argues that: "the most valuable qualitative datasets for future re-analysis are likely to have three qualities: firstly, the interviewees have been chosen on a convincing sample base; secondly, the interviews are free-flowing but follow a life story form, rather than focusing narrowly on the researcher's immediate themes; thirdly, when practicable re-contact is not ruled out" (THOMPSON 2000, para 41). [25]

In comparison, the US national qualitative data archive, the Murray Research Center, based at Harvard University, has a number of different collection priorities:

Founded in 1976, the Murray Research Center holds both quantitative and qualitative data and emphasizes the in-depth multi-disciplinary study of individual lives. Of particular note in its collection priorities are the longitudinal design of the data and the possibility of further follow-up of the sample by new researchers. [27]

Recent consultancy work carried out by the UK Data Archive for the Medical Research Council (CORTI & WRIGHT 2002) reported on the features of data that have a high potential for longevity and secondary analysis. Longitudinal studies have obvious long term potential. They also have the advantage of addressing broader sets of questions and thereby providing excellent opportunities to ask new questions. For many investigators, studies with the greatest value had a long-term follow-up design. Studies should be of high quality and well documented. For secondary analysis value, a dataset requires focus and breadth of investigation, complexity and high data quality. [28]

There exists considerable consensus on the most important features of a dataset for preservation and re-use between Qualidata, the Murray Research Center and the Medical Research Council. Together these provide valuable guidance for others newly embarking on the acquisition of qualitative data. There is agreement that longitudinal studies with the possibility of follow-up of subjects, a national sample base, high quality important data with a breadth of investigation, and data that have not been fully explored in the original study, are the most desirable qualities for acquiring qualitative data for secondary analysis. [29]

7. Ethical Issues in Archiving Data

It is essential that the issues of confidentiality and consent are resolved prior to the acquisition of data (CORTI, DAY & BACKHOUSE 2000; ESRC 1999). Where these issues have not been clarified and agreements have not been made with interviewees, they are often the major obstacles to archiving a dataset. At the end of a research project, confidentiality can be preserved through anonymization, and consent can be obtained by the researcher re-contacting informants, but both measures are resource intensive. Researchers often do not fully appreciate the complexity of these issues and many assume that confidentiality is an imperative, without full exploration of the consequences with interviewees (GRINYER 2002). In a number of studies, especially in oral history, research participants have been very willing for their interview material to be archived without anonymization. Full consideration must be given to these issues by the researcher in consultation with the research participants. [30]

8. Summary of Main Points for a Successful Qualitative Data Acquisition Strategy

Acknowledgments

Qualidata is the Qualitative Data Service of the UK Data Archive, which is jointly funded by the Economic and Social Research Council, the Joint Information Systems Committee and the University of Essex. Material for this paper has been drawn from Qualidata internal reports. For more information on ESDS Qualidata (this name has superseded "Qualidata"), see the website: http://www.esds.ac.uk/qualidata/.

Notes

1) Includes existing qualitative research material archived in public repositories but not handled by Qualidata.

2) Taken from Report of the ESRC Qualitative Data Archival Resource Centre (Qualidata) October 1999 – October 2000. Figures are not presented for October 2000 onwards (the service's core ESRC funding was drastically reduced while the ESRC undertook a holistic review of their longer-term data archiving strategy).

3) Some studies in this discipline were primarily classified under Sociology.

4) Sociology includes the disciplines of cultural, gender and media studies.

5) From Annual Report to the ESRC of the ESRC Qualitative Data Archival Resource Centre (Qualidata) Oct 1999 – Oct 2000.

References

Backhouse, Gill & Thompson, Paul (2000). On the Hunt for Research Data from "Classic Social Studies". Newsletter of the British Sociological Association Network, October 2000.

Corti, Louise (2000). Progress and Problems of Preserving and Providing Access to Qualitative Data for Social Research—The International Picture of an Emerging Culture [58 paragraphs]. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research [Online Journal], 1(3), Art. 2. Available at: http://www.qualitative-research.net/fqs-texte/3-00/3-00corti-e.htm [Date of Access: July 20, 2004].

Corti, Louise (2002). Qualitative Data Processing Guidelines, Qualidata, UK Data Archive, University of Essex, UK.

Corti, Louise & Thompson, Paul (1999). Quinquennial Report to the Economic and Social Research Council, October 1994 - September 1999, Qualidata, University of Essex, UK.

Corti, Louise & Wright, Melanie (2002). Consultants' Report to the Medical Research Council on the MRC Population Data Archiving and Access Project, UK Data Archive, University of Essex, UK.

Corti, Louise; Day, Annette & Backhouse, Gill (2000). Confidentiality and Informed Consent: Issues for Consideration in the Preservation of and Provision of Access to Qualitative Data Archives [46 paragraphs]. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research [Online Journal], 1(3), Art. 7. Available at: http://www.qualitative-research.net/fqs-texte/3-00/3-00cortietal-e.htm [Date of Access: July 20, 2004].

Economic and Social Data Service (2002). Guidelines for social science researchers: Ethical and legal considerations. University of Essex. Available at: http://www.esds.ac.uk/aandp/create/ethical.asp [Date of Access: June 10, 2002].

Economic and Social Research Council (2002). ESRC Datasets Policy. Swindon, UK: ESRC.

Economic and Social Research Council (2002). Guide to Research Funding. Swindon, UK: ESRC.

Grinyer, Anne (2002). The Anonymity of Research Participants: Assumptions, Ethics and Practicalities. Social Research Update, 36. Available at: http://www.soc.surrey.ac.uk/sru/SRU36.html [Date of Access: June 10, 2002].

Thompson, Paul (1998). Sharing and Reshaping Life Stories. In Mary Chamberlain & Paul Thompson (Eds.), Narrative and Genre (pp.168-181). London, UK: Routledge.

Thompson, Paul (2000). Re-using Qualitative Research Data: a Personal Account [48 paragraphs]. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research [Online Journal], 1(3), Art. 27. Available at: http://www.qualitative-research.net/fqs-texte/3-00/3-00thompson-e.htm [Date of Access: July 20, 2004].

Authors

Louise CORTI

Present position: Associate Director of the UK Data Archive at the University of Essex, Economic and Social Data Service (ESDS), and Head of ESDS Qualidata (formerly the Qualitative Data Service), the Outreach & Training and Acquisitions Sections of the ESDS. Past position: Deputy Director Qualidata, Department of Sociology, University of Essex.

Major research areas: statistical literacy/using data in teaching; qualitative data archiving and secondary analysis of qualitative data; mixed methods data analysis.

Contact:

Louise Corti

ESDS Qualidata
University of Essex
Colchester CO4 3SQ, UK

E-mail: corti@essex.ac.uk
URL: http://www.esds.ac.uk/

 

Gill BACKHOUSE

Present position: Senior Acquisitions and Advice Officer at the UK Data Archive, Economic and Social Data Service (ESDS), University of Essex.

Major research areas: qualitative data archiving; informed consent and confidentiality of qualitative data.

Contact:

Gill Backhouse

UKDA Acquisitions
University of Essex
Colchester CO4 3SQ, UK

E-mail: acquisitions@esds.ac.uk
URL: http://www.esds.ac.uk/

Citation

Corti, Louise & Backhouse, Gill (2005). Acquiring Qualitative Data for Secondary Analysis [31 paragraphs]. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 6(2), Art. 36, http://nbn-resolving.de/urn:nbn:de:0114-fqs0502361.

Revised 3/2007

Forum Qualitative Sozialforschung / Forum: Qualitative Social Research (FQS)

ISSN 1438-5627

Creative Commons Attribution 4.0 International License