Abstract: Qualitative social and cultural research is increasingly engaging with visual data. Starting from the premise "all is data" in grounded theory methodology (GTM), we propose a general framework to realize a visual grounded theory methodology (VGTM). Referring to exploratory visual methods based on objective hermeneutics, the documentary method, and segment analysis, as well as existing GTM discourses, we discuss how this text-centered procedure can be applied to visual data. We focus on the (re)formulation of procedural steps (such as making an inventory, segmentation and coding, memo writing, and sampling strategies), and the examination of images in relation to GTM logic.

Key words: grounded theory methodology (GTM); pictorial turn; image analysis; documentary method; objective hermeneutics; segment analysis; visual data

1. Introduction: Variety of Data as a Challenge

2. First Considerations Toward a Visual Grounded Theory Methodology

3. Approaches to Image Analysis From the Field of Cultural Semiotics and Art History

4. From Text to Image: Various Image-Analytical Elaborations

4.1 Objective hermeneutic image interpretation

4.2 Documentary image interpretation

4.3 Segment analysis

4.4 Interim results

5. Basics of a Visual Grounded Theory Methodology

5.1 Contextualization

5.2 Description/inventory

5.3 Segmentation

5.4 Memo writing and coding as an interwoven interpretation process

5.5 Interpretation and the integration of forms of knowledge

5.6 Formation of categories

5.7 Continuation: Expansion of the material (sampling)

5.8 Integration of image/text categories

6. Summary and Outlook: Discourses on Visuality and Grounded Theory Methodology







1. Introduction: Variety of Data as a Challenge

In qualitative social research, visual data are in great demand. Given the technological possibilities such as recording and reproduction, visual data have become increasingly popular in social science research. In particular, images/photographs and films/videos, less frequently drawings, and occasionally also objects/artifacts, are considered significant research material. Today, non-textual data are frequently included in various research designs. As a result, an "all is data" mentality has become the implicit assumption for many qualitative research projects. [1]

From its beginnings, "all is data" has been one of the central premises of grounded theory methodology (GTM) as conceived by Barney GLASER and Anselm STRAUSS (1967). According to GTM, the various types of data directly shape the development of its subsequent theories. GLASER (2007, n.p.) specifically encouraged GTM researchers to take the full scope of data into account (although he did not explicitly mention visual data):

"By diverse I mean whatever may come the GT researcher's way while theoretically sampling: documents and current statistics, newspaper articles, questionnaire results, social structural and interactional observations, interview, casual comments, global and cultural statements, historical documents, whatever, whatever as it bears on the categories. [...] GT is a general methodology usable on any data, and it is up to the researcher to figure out exactly what the data is." [2]

As we can see, "all is data" relates to questions of integrating various types of data, i.e., questions of triangulation and meaningful connections of various types of data and mixed methods designs (MORSE & NIEHAUS, 2009). [3]

Because visual data are increasingly being used throughout qualitative research projects alongside more traditional forms of data, integration becomes an even more challenging task. As a result, we witness a growing need for appropriate evaluation procedures. However, as of today we lack such accurate tools of analysis. Arnulf DEPPERMANN (BREUER et al., 2014, p.274) points out that, given the widely assumed mentality of "all is data," we tend to believe that accurate transcription or description of recorded data represents good (image/video) analysis. [4]

In a similar vein, Ralf BOHNSACK lamented a few years ago:

"When examining the development of qualitative methods during the last twenty years, we come to an observation which, at first sight, seems to be a paradox: the growing sophistication and systematization of qualitative methods has been accompanied by the marginalization of the picture. The considerable progress in qualitative methods during the last twenty years is—especially in Germany—essentially associated with the interpretation of texts. This is partly due to the so-called linguistic turn" (2008, §2). [5]

By now, however, images—or visual data in general—have become a fundamental part of the methodological discussion in the social and cultural sciences. Visual methods have grown considerably, referencing important pioneers of the field such as Roland BARTHES, Max IMDAHL, Erwin PANOFSKY, and others (see Section 3). Today, various analytical perspectives are available. Moreover, an ever-growing number of projects enrich the discussion (see KNOBLAUCH, BAER, LAURIER, PETSCHKE & SCHNETTLER, 2008; MARGOLIS & PAUWELS, 2011; ROSE, 2001). [6]

In the following, we will focus on image analysis without differentiating between static or moving images (videos/films). We first present an outline of visual data and GTM by Krzysztof KONECKI, and expand his perspective with our own criteria for Visual Grounded Theory Methodology (VGTM) (Section 2). Secondly, we present a selection of several classic approaches to image analysis from the fields of cultural semiotics and art history (Section 3), as well as current methods of image interpretation (Section 4). In a third step, we suggest key features of a VGTM (Section 5), and conclude by discussing possible repercussions (Section 6). [7]

2. First Considerations Toward a Visual Grounded Theory Methodology

Interestingly, GTM is not yet established in discussions and publications about visual methods. Despite occasional video-related research projects based on GTM (e.g., HABIB & HINOJOSA, 2015), there is just one outline of a VGTM, proposed by the Polish sociologist Krzysztof KONECKI (2011). KONECKI develops the concept of "multislice imaging," arguing that images not only show multiple layers of meaning but can also be interpreted from multiple perspectives:

"The multislice imagining is a grammar of visual narrations analysis that accents the following stages: a) an act of creating pictures and images (analysis of context of creation); b) participation in demonstrating/communicating visual images; c) the visual product, its content and stylistic structure; d) the reception of an "image" and visual aspects of presenting/ representing something" (2011, p.131). [8]

Elaborating on this concept, KONECKI references CLARKE's (2005), SCHUBERT's (2006), and SUCHAR's (1997) GTM-oriented frameworks. He relies specifically on SCHUBERT's videographic study, which is based on the depiction of slices (i.e., image-oriented layers of meaning)1) from which categories are created (KONECKI, 2011, p.137). He also incorporates "specification memos" as suggested by CLARKE (KONECKI, 2011, p.146). In drawing on his image-based studies (about the practice of yoga as well as homelessness), however, KONECKI eventually proposes a framework of analysis that differs from those of the authors he mentions: KONECKI suggests first reconstructing the layers of meaning of an image from its context of production and reception. Secondly, the explicit requirements for the researcher's image analysis as well as implicit assumptions of the interpretation of the image should be considered. Thirdly, a sociocultural analysis of the image context should be conducted. These analytical steps are repeated in order to allow for the image to be analyzed in a multi-faceted way and from different perspectives. [9]

In doing so, KONECKI integrates essential elements of the GTM research logic. Significant for GTM is both the application of constant comparison and a theoretical sampling strategy. Driven by the general research question "What does homelessness mean?" KONECKI considers different dimensions of visuality, such as the visualization of homelessness based on images homeless people have taken, contrasted with journalistic depictions. Other data types and sources are also taken into account in order to clarify the research question. As we can see, KONECKI follows classic GTM sampling strategies in his approach, essentially rendering images as a source of data among others. [10]

In addition, KONECKI introduces the concept of "theoretical sensitivity." For him, expert knowledge needs to be made explicit during the research endeavor in order to comprehend the image as part of the overall sociocultural ensemble. Finally, he argues in favor of coding visual material during the later stages of the analyses, once categories are already established. [11]

It is safe to say that KONECKI's approach is very ambitious. It aims to include the circumstances of production of the image and the image itself, as well as the context of reception and the sociocultural framework. As promising as this may sound, we also feel that such a broad perspective potentially neglects the more image-immanent components, specifically questions of composition and aesthetics. In light of our own research,2) we understand the image as a medium in need of explication, specifically with respect to its formal composition. We reject the transformation of an image into mere text and textual interpretation, encoded as latent forms of knowledge. Instead, reflection and coding of image elements are driven by "the image as such" (including the reciprocal relation between the image elements), and guided by the composition of the image. [12]

In this light, we think of a GTM-based visuality in three ways:

  • Firstly, it is important to closely examine VGTM from a theoretical point of departure to deepen and expand KONECKI's efforts.

  • Secondly, and modeled after existing coding strategies for written data, we feel the need to spell out concrete analytical steps to ensure systematic and rule-based analysis of non-textual data. This is particularly important because existing guidelines usually focus on pragmatic issues and often merely refer to the potential applicability of various software tools.

  • Thirdly, an outline of existing suggestions about triangulation is needed in order to specify how codes and categories can be integrated vis-à-vis different forms of data. [13]

To further explicate the theoretical foundations of VGTM, we explore the theoretical connections to other academic discourses on images, particularly within the disciplines of cultural semiotics in the wake of Roland BARTHES, as well as art history with reference to Erwin PANOFSKY and Max IMDAHL (Section 3). We then take a closer look at how text-oriented approaches such as objective hermeneutics, the documentary method, and (genuinely image-oriented) interpretative sociology accomplish the conceptualization of visual data. Last but not least, it is particularly important for us to shed light on how those approaches treat the mediality of data, and how this treatment can be put to use in VGTM (Section 4). [14]

3. Approaches to Image Analysis From the Field of Cultural Semiotics and Art History

The emergence of the pictorial turn (MITCHELL, 1992) is often regarded as the starting point of image analysis (cf. HARPER, 2000; STIEGLER 2010). However, PEIRCE's (1932) semiotics already laid claim to the significance of visual signs almost half a century before the pictorial turn appeared in the social sciences. [15]

The works of Roland BARTHES, who dealt predominantly with photography in cultural semiotics, underscore that images (as a special case of the visual) demand their own approach and, in that respect, specifically tailored analytical perspectives. In his single-case studies (Panzani advertisements, BARTHES, 1977 [1964]; family portraits, Paris Match magazine cover, BARTHES, 2001 [1957]), images are analyzed more closely according to their meanings. In the Panzani pasta advertisement case (BARTHES, 1977 [1964]), he defines the messages of the image. BARTHES differentiates three levels: the linguistic message (text), the symbolic message (connoted image), and the literal message (denoted image). For him, the image is to be understood as an arrangement of signifiers, which are decoded from the recipient's point of view and thus are location-bound. [16]

BARTHES (1981 [1980]) introduces two terms in his classic essay "Camera Lucida": studium and punctum, both of which play a role in discourses about visuality to this day. While studium refers to an acquired interest in the image, including a collectively shared impact on socialization (e.g., "being affected" in the face of war images), punctum designates the effect of the image on personal experiences. (BARTHES, for example, mentions an image of his mother, which "struck him like lightning," p.35.) [17]

In art history, past and contemporary approaches accentuated the image differently. The art historian Erwin PANOFSKY (esp. 1972 [1955]) is considered to be one of the founders of iconology. His differentiation of iconography and iconology became highly influential for contemporary discourses on methods. PANOFSKY's concepts were critically considered and further developed by Max IMDAHL (see especially 1996). [18]

In his iconology, PANOFSKY (1972 [1955], pp.43f.) follows three steps: The pre-iconographic description provides a detailed account of image contents, focusing on primary or, as he calls it, "natural" layers of meaning (1). The iconographic analysis investigates the image composition by means of extra-pictorial (e.g., literary) references. Symbols and motifs are understood as carriers of a secondary or conventional layer of meaning, which can be discovered by expert recipients as deliberate arrangements of meaning by the artist (2). The subsequent iconological interpretation conceives the image as a temporal or epoch-related document and locates it in a broader historical and sociocultural context (3). Image arrangements are not only considered as intentionally created by artists, but also as trans-intentional articulations of the spirit of an epoch. IMDAHL's iconicity criticized the latter in particular. He argued that an overemphasis on the image as a product of its epoch is created to the disadvantage of the reflection of its aesthetic features. IMDAHL further claimed that PANOFSKY's analysis effectively reduced the image elements to their function, treated them merely as reference points, and ignored the more image-immanent features and the inherent relationship of the various image elements. Moreover, IMDAHL argued that pre-existing iconographic knowledge and standards of interpretation, if applied, lead to a mere recognition and categorization of the image in traditional terms. Forms of appreciation of the image as such, so IMDAHL argued, are rendered impossible in PANOFSKY's approach. In contrast, IMDAHL asks for a different perspective of interpretation, which he terms the seeing view (1996, pp.84-96). [19]

The contributions of PANOFSKY and IMDAHL have been widely echoed throughout the debates on image interpretation in the social and cultural sciences (e.g., BRECKNER, 2007, 2010; BRECKNER & PRIBERSKY, 2016; PRZYBORSKI & SLUNECKO, 2012a; SCHNETTLER & RAAB, 2008). Particularly, documentary image interpretation as developed by Ralf BOHNSACK takes them into consideration (2009; for a short summary see BOHNSACK, 2008). [20]

In the ongoing debate, concrete analytical questions within the framework and discussion of PANOFSKY and IMDAHL gain significance: How much space should the formal structures of the images occupy ("seeing viewing" [sehendes Sehen])? What status does contextual knowledge have ("recognizing viewing" [wiedererkennendes Sehen])? Can interpretations be validated by consultation of texts and/or further images?3) [21]

In order to establish a VGTM within the current methodical discourse of social and cultural studies, we consider it useful to reflect on some well-established qualitative approaches (mostly developed by German researchers) in order to better understand how to gain access to the image from a methodological point of view. For this purpose, we will turn to Roswitha BRECKNER's segment analysis. BRECKNER's image segmentation shows potential for a more detailed elaboration of a VGTM, specifically because segmentation is also applied in text-related GTM. Concerning the extension of GTM from text to image analysis, it finally seems instructive to engage with objective hermeneutics and the documentary method.4) [22]

4. From Text to Image: Various Image-Analytical Elaborations

The approaches mentioned come with theoretical and methodological differences: While the documentary method and segment analysis are localized in the sociology of knowledge, OEVERMANN (2013) disagrees with such a classification, as he claims a unique position for his method. [23]

In reference to GTM, the diversity of approaches promises novel insights and productive connections, because GTM was indeed closely affiliated with the interpretative paradigm and pragmatism/symbolic interactionism. At the same time, over the course of its development GTM has witnessed ever-new adjustments, including constructionist, postmodern, and reflexive dimensions (for an overview see BRYANT & CHARMAZ, 2007; RUPPEL & MEY, 2017). [24]

4.1 Objective hermeneutic image interpretation

Objective hermeneutics focuses on latent layers of meaning and is primarily applied to textual data (e.g., OEVERMANN, 2008). Only occasionally are objects like archaeological findings considered (OEVERMANN, 2006). Recently, OEVERMANN (2014) also introduced objective hermeneutic image analysis. The roots of this development reach back to the 1990s, with verbal data (PEEZ, 2006) being drawn on to verify the findings of visual analyses (e.g., LOER, 1994). Within this tradition, we also rely on PEEZ (2006), who discusses the integration of interpretations stemming from text and image data respectively. [25]

To OEVERMANN (2014), differences in treatment of text and image on the level of methods are less central. For him, the (as he puts it) "epistemological" conceptualization of the concept "image" has to take center stage. Images, in OEVERMANN's opinion, are not depictions of reality with a genuine representational function (p.31). Rather, he sees their distinctive characteristic in a contouring effect that represents more than just a formal and constitutive element of the image. It is this contouring frame that sets the image apart from a background, and thereby constitutes it. Moreover, OEVERMANN emphasizes that images capture their contents out of the stream of events and bring them to a standstill. Only in this way do we become able to look at them more closely. [26]

Unfortunately, the discussion of the contouring frame is the only reference to the image's mediality in OEVERMANN's work. He does not attempt to interpret images based on their inherent mediality but instead relies on the interpretation of their function. His perspective is one of cultural anthropology: He understands images as the very first products, and as such, artifacts of human conduct (p.33). [27]

A discussion of media-related differences in the sense of iconicity versus textuality is beyond OEVERMANN's scope. His perspective is to treat image and text in a very similar way on the epistemological level in order to make them usable for objective hermeneutics. The social or cultural reality that images articulate and the authenticity of such valid expression are of primary interest to OEVERMANN (p.34). [28]

A different position can be found in PEEZ's work (2006). He combines the interpretation of photographs with participant observation protocols in the school context. For PEEZ, text and image interpretation do not compete; rather, they complement each other. The protocol of participant observation visualizes temporal sequences that could not be recognized in an image. While the textual protocol provides linguistic utterances and dialogs for analysis, the image captures "atmospheric" details and spatial characteristics. PEEZ conceptualizes his image analysis as a balancing act: On the one hand, he follows the premises of objective hermeneutics when he, like OEVERMANN, understands images (like all data) as texts, that is, as protocols of (social) reality subject to the principle of sequence analysis. On the other hand, PEEZ recognizes that images are perceived diachronically and not sequentially like texts. In his approach, however, the image is still analyzed sequentially as text. To take the simultaneity of the image into account, PEEZ emphasizes the iconic paths that guide the view of the interpreter. He assumes a certain sequence of perception that is organized by the arrangement of the elements of the image. In agreement with LOER, however, he points out that an image can be observed from a variety of perspectives (and in that respect, different pathways have to be taken into consideration). [29]

PEEZ holds that the formal features of an image are highly relevant. He differentiates, however, between formal aspects (e.g., typical for snapshots) and compositional elements (a term he reserves for more choreographed types of images). In his example the image is characterized as a snapshot, and so the "formal aspects" guide the view in a specific way. It is this order of perception that should also be reflected in the image analysis protocol. By collecting ever more formal aspects of the image, a thick description is created from which the interpretation emerges. Over the course of the analysis, such descriptions of the whole image move into the background, allowing for the description of individual elements of the image. Eventually, the interpretations of text and image are compared so as to make sure that they follow the same discursive frame (2006, p.138). [30]

It is apparent that in PEEZ's approach the simultaneous character of the image is analyzed traditionally, that is, sequentially. The iconic paths dictate the course of the text, ultimately producing a text about an image. Moreover, images are thought to be less capable than texts of transporting meaning, at least inasmuch as PEEZ deems additional context-specific observation data necessary. Without (verbal language) protocols of the context (based on participant observation), no valid assertion seems to be possible to him. [31]

4.2 Documentary image interpretation

The documentary method and image analysis are linked predominantly in BOHNSACK's work stemming back to the early 2000s. Over the years, BOHNSACK continued his work on qualitative interpretation of image and video (esp. 2009, see also BOHNSACK 2008). [32]

With respect to BARTHES's and PANOFSKY's image analysis, BOHNSACK (2009) identifies the same methodical-methodological issue he had previously pointed out in the case of text analysis: In his view, different forms of knowledge (i.e., atheoretical knowledge, implicit knowledge, communicative knowledge) can collide during the interpretation process. Images or image elements are often interpreted based on a "mode of association," that is, based on external standards, not image-immanent factors. Specifically, implicit and communicative knowledge tends to influence the researcher/interpreting person to base their interpretation primarily on factors external to the concrete data at hand. According to BOHNSACK, this is true for images even more than it is for texts (p.35). [33]

To remedy this shortcoming, BOHNSACK argues for a compositional analysis following PANOFSKY and IMDAHL. Accordingly, the image needs to be understood in its non-text equivalent regularity. As such, a reconstruction of the formal composition is required. Referring to IMDAHL, BOHNSACK insists that the image cannot be interpretively "abandoned" too quickly by relying on knowledge located "outside the image." He instead calls for an integration of iconographic knowledge. [34]

Additionally, for BOHNSACK a departure from sequence analytic methods, which have a raison d'être for texts but not images, is indicated to deal with the mediality of the image. Texts are characterized by "narrativity" and temporal succession, while images exhibit simultaneous presence of their elements. [35]

Notwithstanding the text/image distinction, BOHNSACK suggests integrating text-based qualitative research procedures with the interpretation of images. In line with the concept of fictional or empirical horizons of comparison as applied in qualitative research, the comparison with other images is regarded as a significant step during interpretation. Possible questions might be: What other ways can be thought of for treating the topic at hand within the same discourse? How is the topic negotiated in other discourses? BOHNSACK places this procedure at the stage of reflective interpretation in his method of image interpretation. In contrast to the methods discussed earlier, the compositional analysis of the image takes center stage. This includes planimetrics (the reconstruction of the overall composition of the image),5) perspectivity (e.g., central perspective), and scenic choreography (how groups of persons or objects are related to one another or separated from one another; e.g., BOHNSACK, 2009, pp.58-72). [36]

4.3 Segment analysis

Roswitha BRECKNER (2010) has developed an interdisciplinary approach that she calls "segment analysis" [Segmentanalyse]. In contrast to objective hermeneutics and the documentary method, segment analysis was developed specifically for the purpose of analyzing images. BRECKNER conceives segment analysis in accordance with BOHNSACK as "simultaneous and multidimensional," i.e., not characterized by sequentiality (BRECKNER, 2010, p.270). [37]

For BRECKNER, the interpreting subject and their "line of sight" [Blickrichtung] are most important. Line of sight, in this view, is not understood as contingent but—in accordance with LOER's (1994) concept of iconic paths—as a function of the structure of the image (BRECKNER, 2010, p.274). BRECKNER relies on Rudolf ARNHEIM, who in 1984 already underlined the importance of the image structure as created by "scanning" the image. BRECKNER also refers to OEVERMANN and his (text-related) sequence analysis, although she alters OEVERMANN's procedures in significant ways. Most crucially, she argues that the image or the image element should first be interpreted without relying on external knowledge. Instead, the first interpretation should purely be accomplished in terms of various potential "perceptual modes" [Sehweisen]. [38]

BRECKNER argues that evidence and plausibility of certain hypotheses are obtained with respect to the gestalt of an image and not external information. She also argues that hypotheses should have the following structure: First, a pictorial element, a segment, should be isolated from the image/visual context and interpreted independently. For this purpose, various contexts are created in which the element makes sense, i.e., in which it would "demonstrate" something. It is crucial to consider as many contexts as possible and to allow for opposing views as well. By including ever more image elements into an interpretative context, the plausibility of the reading can be assessed (BRECKNER, 2010, pp.275f.). [39]

In contrast to OEVERMANN's method, segment analysis considers the image in its mediality. BRECKNER's approach is characterized by the documentation of the process of perception, and three further steps: Analysis of the formal pictorial design; investigation of the image composition by consulting PANOFSKY and IMDAHL (planimetric structure, perspective projection, scenic choreography); and the reconstruction of the image's concept. Subsequently, three additional steps are dedicated to editing the result of the analysis (BRECKNER, 2010, p.285). [40]

4.4 Interim results

In contrast to objective hermeneutics, which either treats the image in its mediality in a predominantly theoretical fashion or like a text during the actual analysis (OEVERMANN), the documentary method involves a detailed comparison of the media types text and image. [41]

BRECKNER's and BOHNSACK's approaches in particular assume legitimacy of the image without text by analyzing the image without reference to external/discursive meanings. To depart from the text-based approach of sequentiality also influences how we look at perception. BRECKNER argues that during the process of perception, a gestalt of the whole image is formed successively by establishing the part-whole relation for every image element, thereby constituting the gestalt of the image (BRECKNER, 2010, p.273). In order to capture this gestalt formation during the analytical process, segment analysis unites three important factors: First, it acknowledges the simultaneity of images, as described in the documentary method as well as in objective hermeneutics; secondly, it applies compositional analysis to salvage the structure of the image (as originated in the documentary method); and thirdly, it applies sequence analysis as pioneered in objective hermeneutics to establish a fully-fledged interpretation of the image. [42]

To sum up, three insights appear to be essential for us to further elaborate a VGTM:

  • Image is not text and cannot be analyzed in terms of sequentiality. Images are not characterized by succession but by simultaneity. As such, criteria are required for a chronological order and "levels of meaning" for the interpretation (BOHNSACK, BRECKNER).

  • Criteria to establish the gestalt and sequence of levels of meaning during the interpretation process of an image are not justifiable on the epistemological level alone (OEVERMANN), but have to be derived from the mediality of the image (BRECKNER, BOHNSACK).

  • If the mediality of the image is supposed to guide the sequence of the level of meaning, the referential framework needs to be spelled out. This can be accomplished in two ways: The composition of the image itself takes center stage to analyze the levels of meaning (BOHNSACK); or the line of sight of the researcher functions as the organizing principle for the analysis, and the composition is merely informing the image analysis (BRECKNER). [43]

5. Basics of a Visual Grounded Theory Methodology

The key question for VGTM is this: How can GTM be adapted to fit the particularities of image mediality? Specifically, how can the coding procedures be modified? Despite varying linguistic terms and diverging procedural steps,6) GTM, overall, targets micro-analytical studies. In a first step to approach the given data (e.g., the transcript of an interview), they are divided into units of meaning (these can be single words, parts of a sentence, complete sentences, or entire passages of text). The segmented units of meaning are coded and condensed into categories. The aim here is to transcend the level of pure description in order to gain access to the text's conceptual content. Over the course of this analysis the findings are differentiated, continuously compared, and summarized in more comprehensive categories, as well as related to each other in order to extract data-based information about connections (relations, patterns, and types). The coding steps are fixed in memos, which are continuously expanded and revised (for an overview see RUPPEL & MEY, 2017). [44]
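
For readers who document their coding in software, the process logic just described (segmenting, coding, condensing into categories, and fixing steps in memos) can be pictured as a small data model. The following Python sketch is only an illustration under our own assumptions: the names Segment, Memo, and condense, the "uncategorized" heading, and all example codes and passages are hypothetical, invented for this purpose, and not taken from any GTM text or software package.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One unit of meaning (a word, sentence, or passage) with its open codes."""
    text: str
    codes: list = field(default_factory=list)


@dataclass
class Memo:
    """A memo that fixes a coding step; it is expanded and revised over time."""
    about: str
    notes: list = field(default_factory=list)

    def revise(self, note):
        self.notes.append(note)


def condense(segments, code_to_category):
    """Group coded segments under the categories assigned by the researcher."""
    categories = defaultdict(list)
    for segment in segments:
        for code in segment.codes:
            # Codes without a category yet stay under a provisional heading.
            category = code_to_category.get(code, "uncategorized")
            if segment not in categories[category]:
                categories[category].append(segment)
    return dict(categories)


# Hypothetical example data, invented for illustration only.
segments = [
    Segment("I sleep near the station", ["finding shelter"]),
    Segment("People look away", ["being ignored"]),
    Segment("The guard lets me stay", ["finding shelter", "being tolerated"]),
]
mapping = {"finding shelter": "securing space", "being tolerated": "securing space"}
categories = condense(segments, mapping)

memo = Memo(about="securing space")
memo.revise("Compare with segments coded 'being ignored'.")
```

The mapping from codes to categories is deliberately supplied by the researcher rather than computed: the sketch records interpretative decisions and makes them traceable, but it does not replace constant comparison.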

Based on this process logic, "open coding" should be applied to visual material. As such, the segments to be coded need to be identified. The following procedural steps can be understood as a framework of orientation for the investigation of images. [45]

5.1 Contextualization

As a first step of image analysis, a decision has to be made whether context information is sought and should be compiled, and in what way. Context information can be used as an indicator for image formation (as suggested by KONECKI, 2011, who, in his yoga example, elaborates on the perceptive situation of the image, the space, etc.), or context information can inform about the producers of the image and the location of publication (e.g., in magazines). However, a context-free description of images is also possible (as explicitly postulated by OEVERMANN, 2014; for GTM, see GLASER, 2004). [46]

This decision depends largely on the interpreters' level of knowledge, as well as the research question at hand. Moreover (and with more far-reaching implications), this decision depends on the intended application of potential external information and its status within the analysis (the question of contextual knowledge is discussed in more detail in Section 5.5). [47]

5.2 Description/inventory

Creating a description or inventory does not necessarily include a detailed list of (visible) image elements. Instead, this step aims at a preliminary analysis of the space created by the image, i.e., what is shown and in which perspective, etc. Producing the inventory is interpretative work: an active construction rather than a simple list of image elements. Whether this interpretation proceeds in terms of the fore-, middle-, and background of an image (as outlined by BOHNSACK, 2009, p.60, who locates the detailed image description at the pre-iconographic level) has to be determined in relation to the image and its composition and the concrete issue under investigation. [48]

5.3 Segmentation

The sequence of interpretation of pictorial elements and, as such, the segmentation process itself, are inextricably bound to the concrete image. Thus, the procedure can be outlined here in broad terms only: Images depicting a main character, for example, can be segmented easily by the compositional analysis following the documentary method (planimetrics, scenic choreography, perspectivity). Spatially less complex images can be segmented according to the line of sight (following the iconic paths) as suggested by BRECKNER. As a third, more demanding option, the BRECKNERian approach can be applied in conjunction with the documentary method by comparing the researcher's line of sight with findings from a compositional analysis, in line with BOHNSACK's approach. Such a combination allows for the line of sight analysis to be legitimized by the results of the more formal compositional analysis. However, according to GTM, it is of utmost importance in all three cases that the image be subject to an intersubjectively comprehensible segmentation of elements. [49]

5.4 Memo writing and coding as an interwoven interpretation process

In the case of image analysis in particular, memo writing and open coding can be closely interlinked: In contrast with KONECKI's approach, the focus is not on interpreting a text that results from the interpreter's own image analysis. Instead (and given the inherent logic of the image and GTM), segmentation makes it possible to interpret the image directly, without translating it into a text. Interpretation in GTM means creating codes, in this case codes that refer to the concepts of each image element. During the coding process visual data have to be "broken up," and the process of breaking up itself has to be documented in memos. In short, this means posing so-called "generative questions" (as already established for textual analysis in GTM, see synoptically RUPPEL & MEY, 2017), also referred to as WH questions ("what," "who," "when/how long," "where," "why," "with which," and "what for"). Working through all the WH questions ensures that elements beyond the eye-catching aspects are considered. Each image segment receives its own code, which is registered in the code list. In following this procedure, interpreters moreover become aware of potential (semantic) relationships between pictorial elements. Multiple coding procedures are necessary to consider all relationships within the image, and constant comparison is indispensable. Of course, and as a result of the constant comparison method, these codes should also be registered in the code list. [50]
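
The bookkeeping described in this step (one code per image segment, a shared code list, and memos documenting answers to the WH questions) can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the procedure proposed here: the names `Segment`, `CodeList`, `WH_QUESTIONS`, and `open_questions`, as well as the example segment, code, and memo texts, are hypothetical. The interpretative work itself remains with the researchers; the sketch merely shows how its results might be recorded.

```python
from dataclasses import dataclass, field

# Generative (WH) questions listed in the text; the constant name is hypothetical.
WH_QUESTIONS = ("what", "who", "when/how long", "where", "why", "with which", "what for")


@dataclass
class Segment:
    """One image segment with its code and the memos written about it."""
    label: str                                  # illustrative description of the segment
    code: str = ""                              # conceptual code assigned by the interpreter
    memos: dict = field(default_factory=dict)   # WH question -> memo text

    def open_questions(self):
        """Return the WH questions not yet addressed in a memo."""
        return [q for q in WH_QUESTIONS if q not in self.memos]


@dataclass
class CodeList:
    """Registry of all codes created during open coding."""
    codes: dict = field(default_factory=dict)   # code -> labels of coded segments

    def register(self, segment):
        """Add a coded segment to the list; uncoded segments are rejected."""
        if not segment.code:
            raise ValueError(f"segment {segment.label!r} has no code yet")
        self.codes.setdefault(segment.code, []).append(segment.label)


# Hypothetical example: one segment, two memos, one code.
segment = Segment(label="figure in foreground")
segment.memos["who"] = "A single person, facing the viewer."
segment.memos["where"] = "Placed centrally, in front of a blurred interior."
segment.code = "self-presentation"

code_list = CodeList()
code_list.register(segment)
print(code_list.codes)
print(segment.open_questions())
```

Listing the unanswered WH questions for each segment is one simple way to check that attention has not been limited to the eye-catching aspects of the image.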

5.5 Interpretation and the integration of forms of knowledge

More profound interpretations are required early on during the analytical process. Various interpretation procedures can ideally be distinguished in terms of their level of integration into forms of knowledge (see for more details STRAUB, 2006). The spectrum ranges from powerful image-immanent interpretations using only the bare minimum of everyday knowledge (as mainly with the documentary method), to interpretations strongly relying on contextual and expert knowledge. An image segment in this approach first of all represents an extra-pictorial field of semantics. The latter form of interpretation focuses on semiotic traces, which according to KONECKI (2011, p.140), constitute the analysis of the "outer context" of the image, and the "visual cultures and subcultures" or "social worlds." The goal of the analysis is to align extra-pictorial discourses, visual cultures, or connotations of objects with the image segment itself (see also RAAB, 2012, who presents a systematic approach to image analysis with reference to BARTHES and GOFFMAN). Comparative procedures, as provided, for instance, by the documentary method (albeit only in the final reflective interpretation) and by VGTM in accordance with KONECKI, help to carve out image-related similarities and differences by means of comparison with other existing images. [51]

Whether an interpretation aims at a (far-reaching) suspension of contextual knowledge or at (selective) claims of forms of knowledge depends on the overall research question. However, the central discussion within GTM on "forcing versus emerging" (KELLE, 2005) is always at issue. In our view, depending on the research question, contextual or expert knowledge can be integrated or suspended at various levels. However, applied knowledge and the steps of interpretation need to be explicated in the presentation of the research process to assure intersubjective traceability (see MRUCK & MEY, 2007, 2018 on GTM and reflexivity). To accomplish this, memo writing is essential (reflexive memos to capture pre-conceptions and pre-structures as demonstrated in contextual knowledge; theoretical memos to record conceptual work; and organizational memos to explicate, for example, additional data collection). [52]

5.6 Formation of categories

As mentioned above, the goal of an image interpretation (as with texts) is to establish codes, which (as with texts) can be condensed into categories to reflect the conceptual content of the image. For this purpose, it is advisable to further elaborate the findings captured in the memos. Looking at the connections of categories and subcategories is also vital to the coding task. [53]
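
The condensation of codes into categories can likewise be illustrated as a simple grouping step. The following is a minimal sketch under stated assumptions: the function name `condense` and the example codes and categories are hypothetical, and the assignment of each code to a category is an interpretative decision made by the researchers, not something the code derives.

```python
from collections import defaultdict

# Hypothetical assignment of codes to categories, decided by the interpreters.
code_to_category = {
    "self-presentation": "staging of the self",
    "direct gaze": "staging of the self",
    "blurred interior": "spatial framing",
}


def condense(assignments):
    """Group codes under the category the interpreters assigned to them."""
    categories = defaultdict(list)
    for code, category in assignments.items():
        categories[category].append(code)
    return dict(categories)


categories = condense(code_to_category)
print(categories)
```

Keeping the assignment explicit in this way makes it easier to revisit a category when later memos or comparisons suggest that a code belongs elsewhere.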

With respect to the categories, it is important to take the semantic meaning into account. Beyond that, however, the formal constitution of the image is important in relation to the overall research question as well. If the image genre is of importance to the research question, for example, matters of contrast, color, etc. might be more important to the interpretation. [54]

5.7 Continuation: Expansion of the material (sampling)

Over the course of the interpretation, theoretical sampling becomes increasingly pivotal. Theoretical sampling is used to decide whether additional material is needed to conduct the analysis and answer the research question. As is common in GTM, theoretical sampling is guided by the categories already established as well as their re-evaluation according to the research question (KONECKI, for example, carried out theoretical sampling in accordance with the spatio-social arrangements of the image). [55]

5.8 Integration of image/text categories

To ensure continuous integration of the research material and the categories, comparisons must be made constantly. In this regard, all sequentially used data should potentially be considered during the analysis. These could include, to name a few: further images; texts on images; ethnographic explorations of the contexts of the images; interviews with recipients or producers of the images; or textual contexts of the images, for instance journal articles. [56]

At this point, the question also arises: What is the relationship of different data types, and how are these data shaped by their medium? The relationship of image and text needs to be clarified (for example in the case of magazine article and magazine cover). During the analysis, it may become clear that text categories and image categories are highly interlinked (cf. PEEZ). In such a scenario, the image would represent what can be found in the text as well. Contrasting or oppositional relationships of image and text are, however, also possible. Media-specifically, it could become apparent that images produce contents that are detectable in the texts as notifications, intensifications, emphasis, etc. In such a scenario, the text would capture more complex and more differentiated semantics than the image because of its "communicative concentration." [57]

6. Summary and Outlook: Discourses on Visuality and Grounded Theory Methodology

This article has offered an orienting frame for the implementation of a visual grounded theory methodology based on a critical reflection of already established approaches for the analysis of images in qualitative social research. For this purpose, we have adapted and modified essential procedural steps of GTM (coding procedures, memo writing, categorization, sampling) to analyze images as simultaneously composed material with a different mode of sequentiality as compared to textual forms of data. After the inventory of the image's elements is generated, the next step is an image-oriented compositional segmentation of the image. A comprehensive interpretation takes place during the process of coding and segmentation. Through the subsequent condensation of codes, categories should be constructed that reflect the basic concepts of the images, and help to develop a strategy for the theoretical sampling procedure in terms of both a textual and formal trace for interpretation that can be followed to explore the material. Moreover, we have demonstrated the decisions that must be taken before and after the analysis (inclusion of contextual knowledge: if so, how much and with what benefit? How relevant is the image composition to the research issue? How does this affect the construction of codes and categories?). In the research fields of the documentary method, objective hermeneutics, and segment analysis, far-reaching groundwork on the (theoretical and methodical-methodological) interpretation of images has been laid. The project of a VGTM, however, has thus far been stimulated primarily by KONECKI (and authors such as Charles SUCHAR or Adele CLARKE, considered by him). [58]

Our thoughts on a VGTM are based on the increase of visual data in many research projects. At the same time, in many research fields the prerequisites to include visual data in empirical research projects still have to be spelled out (as is the case for example in cultural psychology or cultural sociology). We consider it mandatory to work on such research issues with GTM or to conceptualize an analysis of non-textual material for GTM and to make procedural suggestions. A combination of GTM with other approaches to image analysis entails GTM-external procedural steps and requires the researcher to be aware of very different theoretical and epistemological/methodological foundations (and partially their incommensurability). The GLASERian credo "all is data" can only be put into practice if we develop ways to deal with different kinds of data, text-based or visual. First steps in this direction have been made (e.g., with regard to narrations, see LAL, SUTO & UNGAR, 2012; RUPPEL & MEY, 2015). [59]

VGTM cannot deal with "static" images alone. Instead, it needs to be applicable to moving images like films as well. First attempts at integrating videography into GTM are being made (DIETRICH & MEY, 2018b; HABIB & HINOJOSA, 2015). Entertainment movies and music videos are, however, virtually unexplored sources. Their potential benefit to social research should also be considered in GTM. This applies equally to media-related hybrid presentations like, for instance, websites combining text, images, and moving images, and other network-based formats. [60]

GTM allows for all potentially relevant data to be used, irrespective of their media-related aspects. Consequently, to look at text-based data alone is insufficient. The pictorial turn is in the past; a material turn lies ahead of us: How can artifacts or objects be captured in terms of GTM (see also CLARKE, 2005; KAUTT, 2017)? The investigation of culture and society with sociological methods does not end with spoken, written, or illustrated data. [61]


