Volume 9, No. 3, Art. 26 – September 2008

The Interpretation of Pictures and the Documentary Method

Ralf Bohnsack

Abstract: The considerable progress in qualitative methods is directly connected with developments in the field of text-interpretation. On the basis of a thorough reconstruction of their formal structures, texts are treated as autonomous domains of self-referential systems. Such a methodological status has been denied to pictures in empirical research in the field of social sciences up until now. The documentary method, based on Karl MANNHEIM's Sociology of Knowledge, opens up methodical access to pictures. Methodologies from art history (PANOFSKY, IMDAHL) can thus become relevant for empirical research in social sciences. Connections to semiotics (BARTHES, ECO) and philosophy (FOUCAULT) are worked out in their consequences for qualitative methods. Thus verbal contextual and pre-knowledge can be controlled methodically in the documentary interpretation of pictures. The reconstruction of the formal structure of pictures becomes centrally important in the analysis. All of this will be demonstrated by examples from research practice.

Key words: documentary method; interpretation of pictures; iconology; sociology of knowledge; art history; semiotics; formal structure of pictures; comparative analysis

Table of Contents

1. Introduction

2. The Increasing Progress of Qualitative Methods and the Marginalization of the Picture

3. An Understanding through Pictures versus an Understanding about Pictures

4. The Change in Analytic Stance: From "What" to "How", from Iconography to Iconology, from Immanent to Documentary Meaning

5. The Difference between the Habitus of the Representing and the Habitus of the Represented Picture Producers

6. The Importance of Formal Structure and the Methodically Controlled Suspension of Parts of Iconographic Knowledge

7. Example of a Private Family Photo

8. Example of an Advertising Photo

9. The Analysis of the Formal Structure Opens up an Access to the Picture in its Entirety

10. Sequence Analysis, Reconstruction of Simultaneity and the Importance of Comparative Analysis

11. Conclusions






1. Introduction

Some general remarks concerning the development of picture interpretation in the field of qualitative methods will open up this contribution. Then I will come to the question of how it may be possible to develop a social scientific method which is designed to treat pictures as self-contained, autonomous domains that can be subjected to analysis in their own terms. As I would like to demonstrate, the methodological background for this method can be found in Karl MANNHEIM's Sociology of Knowledge in connection with methods and theories of art history, and to some extent of semiotics. Consequences for the practice of the documentary interpretation of pictures will be demonstrated through private and public photographs. [1]

2. The Increasing Progress of Qualitative Methods and the Marginalization of the Picture

When examining the development of qualitative methods during the last twenty years, we come to an observation which, at first sight, seems to be a paradox: the growing sophistication and systematization of qualitative methods has been accompanied by the marginalization of the picture. The considerable progress in qualitative methods during the last twenty years is—especially in Germany—essentially associated with the interpretation of texts. This is partly due to the so-called linguistic turn (see also: BOHNSACK, 2007c). [2]

In the field of empirical social sciences, the concept of the linguistic turn succeeded easily, because it was preceded by a premise in empirical research which has been concisely articulated by Karl POPPER (1959, pp.95ff.): Reality must, if it is to become scientifically relevant, be articulated by way of "protocol sentences" or "basic statements," and that means in the form of a text. Qualitative research has not only followed this premise, but has also developed it further. Only original research data which consist of the linguistic action of research subjects, meaning texts which are produced by the actors themselves, need not be transformed into protocol sentences. In the field of picture interpretation, however, this transformation is especially necessary, which consequently makes picture interpretation suspect of being invalid. [3]

The orientation towards the paradigm of the text and its formal structures has led to enormous progress in the precision of qualitative methods. One of the reasons for this success can be seen in the methodological device of treating the text as a self-referential system or—as Harvey SACKS (1995, p.536) has put it: "If one is doing something like a sociology of conversation, what one wants to do is to see what the system itself provides as bases, motives, or what have you, for doing something essential to the system." This device or premise, which was first applied in the field of Conversation Analysis, was later followed by other methodologies pertaining to the area of text interpretation. However, up until now this premise has not yet become relevant in a strict sense for those qualitative methods which deal with the interpretation of pictures1). The focus on this methodological device—meaning the treatment of pictures in empirical research as self-referential systems—is one of the central concerns of my paper. [4]

Acknowledging that pictures have the methodological status of self-referential systems also has consequences for the ways of understanding pictures as a medium of communication. We can differentiate between two quite distinct means of iconic understanding. A communication about pictures is to be distinguished from an understanding through pictures, as I would like to put it. [5]

3. An Understanding through Pictures versus an Understanding about Pictures

For the most part, an immediate understanding through pictures, or within the medium of the picture and thus beyond the medium of language and text, has been excluded tacitly or without further explanation from methodology and also from the theory of action. Theory, methodology and practical research should be in a position "to no longer explain pictures through texts, but to differentiate them from texts," as the art historian Hans BELTING (2001, p.15) has put it with reference to William J.T. MITCHELL (1994). [6]

To speak of an understanding through pictures means that our world, our social reality, is not only represented by, but also constituted or produced by pictures and images. William MITCHELL (1994, p.41) has devoted a great deal of attention to this subject. Constructing the world through images, however, may be understood in at least two ways. One way of understanding only takes into consideration the interpretation and explanation of the world as essentially applied in the medium of iconicity. A more extensive understanding also includes the importance of pictures or images for practical action, their quality and capacity to provide orientation for our actions and our everyday practice. [7]

The latter aspect has been widely neglected in theories of action, communication and human development. Pictures provide orientation for our everyday practice on the quite elementary levels of understanding, learning, socialization and human development—and here we are not speaking primarily of the influence of mass media. Behavior in social situations or settings as well as forms of expressions through gestures and the expressions of faces are learned through the medium of mental images. They are adopted mimetically (compare: GEBAUER & WULF, 1995) and are stored in memory through the medium of images. [8]

Images are implicated in all signs or systems of meaning. In the terms of semiotics, a specific "signified" which is associated with a specific "signifier" (for instance a word) is not a thing, but a mental image. In the semiotics of Roland BARTHES (1967, p.43) we can read: "the signified of the word ox is not the animal ox but its mental image." And according to Alfred SCHUTZ (1964, p.3) every symbol or—more precisely: every typification is based on the "imagination of hypothetical sense presentation." These images are based to a great extent on iconic knowledge. [9]

The understanding and the orientation of action and everyday practice through the medium of iconicity are mostly pre-reflexive. This mode of understanding is performed below the level of conceptual or verbal explication. Iconic or image-based understanding is embedded in tacit knowledge, in "atheoretical" knowledge, as it is called by Karl MANNHEIM (1982). [10]

It is above all habitual, routinized action which is structured by atheoretical or tacit knowledge. Tacit knowledge is also imparted through the medium of text and through the genres of narrations and descriptions in the form of metaphors, that is, of metaphorical (image-based) depictions of social settings. In a fundamental and elementary way, however, atheoretical or tacit knowledge is imparted by the medium of iconicity, for instance in the medium of pictures or images about social settings, and by incorporated practices of actions. The medium of atheoretical knowledge is thus generally that of "imagery" ("Bildlichkeit"), if we define the concept of imagery in the sense of Gottfried BOEHM (1978, p.447) in the way that "picture and language are participating at a joint level of imagery." This dimension of imagery belongs to the sphere of tacit or atheoretical knowledge. [11]

The transition in interpretation from the sphere of explicit knowledge to that of tacit or atheoretical knowledge is, in the terms of Erwin PANOFSKY (1955), the transition from Iconography to Iconology. As a historian of the arts, PANOFSKY was in his time essentially influenced by the discussion in the social sciences—especially by his contemporary Karl MANNHEIM and by MANNHEIM's Documentary Method of Interpretation (see also: BOHNSACK, 2007a). [12]

4. The Change in Analytic Stance: From "What" to "How", from Iconography to Iconology, from Immanent to Documentary Meaning

Long before devoting attention to the interpretation of pictures, I worked with the Documentary Method of Interpretation myself. The Documentary Method is widely known as an essential element of the Ethnomethodology of Harold GARFINKEL (1967). Having been influenced by GARFINKEL since the 1970s, I went back to the roots of the Documentary Method in MANNHEIM's Sociology of Knowledge (BOHNSACK, 2006). On the basis of MANNHEIM's methodology, we began to develop a method for the interpretation of talk, especially of group discussions (among others: BOHNSACK, 2004), and then of all sorts of texts in general (BOHNSACK, 2008a and BOHNSACK, PFAFF & WELLER, 2008). [13]

The change from the immanent or literal meaning to the documentary meaning, the change from iconography to iconology, is a change in perspective and analytic mentality. It can be characterized in correspondence with Martin HEIDEGGER (1986), Niklas LUHMANN (1990) and especially Karl MANNHEIM as the change from the question of What to the question of How. It is the change from the question of what cultural or social phenomena are all about to the question of how they are produced. Following PANOFSKY, the question What includes not only the level of iconography, but also the so-called pre-iconographic level.

Diagram 1: Dimensions of meaning and interpretation in the picture [14]

The difference between iconography and pre-iconography is relevant not only to art history, but also to the social sciences and action theory. This becomes evident when PANOFSKY (1955, pp.52-54) explains these two levels or steps of interpretation, not in the field of works of art, but in the field of "everyday life" (p.53), as he himself calls it. As an example, PANOFSKY describes the gesture of an acquaintance. This gesture, which at the pre-iconographical level will at first be identified as the "lifting of a hat" (p.54), can only at the iconographical level be analyzed as a "greeting" (p.52) (see Diagram 1). [15]

When we elaborate PANOFSKY's argumentation in the framework of social sciences, the step from the pre-iconographical to the iconographical level of interpretation can be characterized as the step to the ascription of motives, more precisely: to the ascription of "in-order-to-motives," as Alfred SCHUTZ (1964, p.31) has called it: The acquaintance then is lifting his hat, in order to greet. On the level of iconographical interpretation, we search for subjective intentions—as we always do in the realm of common sense. This sort of iconographical interpretation is only on a sound methodical basis as long as we are dealing with action within the framework of institutions and roles. Otherwise, the iconographical interpretation is based on introspection and ascriptions, on the construction of motives, which cannot be the object of direct empirical observation. [16]

In contrast to the iconographical approach to analysis, iconological interpretation is characterized by "the rupture with the presuppositions of lay and scholarly common sense," as we can call it in Pierre BOURDIEU's terms (1992, p.247). The iconological stance of analysis, its analytic mentality, is radically different from asking the question What. It is searching for the How, for the modus operandi of the production, or the emergence, or the process of the formation of a gesture. Asking in this way, we can—according to PANOFSKY—gain access to the "intrinsic meaning or content" of a gesture (1955, p.40), to its "characteristic meaning" or its "documentary meaning" (1932, pp.115, 118), as PANOFSKY formulates with reference to MANNHEIM. By way of iconological interpretation,

"we will receive the impression of a specific disposition from the gesture (…), which documents itself in the act of greeting, as clearly and independently from the intent and the consciousness of the greeting person as it would document itself in any other utterance of the life of the person concerned" (1932, pp.115f.). [17]

This characteristic meaning (in German: "Wesenssinn"), "which documents itself," is also called "habitus" by PANOFSKY. As is generally known, BOURDIEU adopted this concept from PANOFSKY. The conception of habitus can refer to individuals or to collective phenomena like milieus: for instance to the "proletarian" or the "bourgeois" habitus. It may be the expression of a phase of contemporary history or of a specific generation: for instance the habitus of the "68-generation." Or it may be understood—as it was in the original intention of PANOFSKY—as the expression of a historical epoch in general: for example of the Gothic or the Renaissance period. [18]

5. The Difference between the Habitus of the Representing and the Habitus of the Represented Picture Producers

According to PANOFSKY, in reconstructing iconological meaning, we are searching for the habitus of the picture's producer. Especially in the area of photography, however, it seems to be necessary to proceed beyond PANOFSKY and to differentiate between two fundamental dimensions or kinds of picture producers: On one hand we have the representing picture producers, as I would like to call them, such as the photographer or the artist, as well as all of those who are acting behind the camera and who are participating in the production of the picture, even after the photographical record. On the other hand we have the represented picture producers. These are all the persons, beings, and social scenes which are part of the subject of the picture and are acting in front of the camera. [19]

The methodical problems which result from the complex relation between these two different kinds of picture producers can be solved easily as long as both belong to the same milieu, to the same "(conjunctive) space of experience" (in German: "[konjunktiver] Erfahrungsraum"), as we call it using the terminology of Karl MANNHEIM (1982).2) This is, for instance, the case when a member of a family is producing a family photo or when (as it is with historical paintings which are meant to give us insight into a historical epoch) the painter as well as the models or pictured scenes belong to the same epoch3). It is the main concern of iconological and documentary interpretation to gain access to the space of experience of the picture producers. And a central element of this space of experience is the individual or collective habitus. [20]

All this becomes methodically much more complex when the habitus of the represented picture producer does not correspond to, or is not congruent with, that of the representing picture producer, for instance the photographer or the painter. I have tried to demonstrate this with a photo of a family of farm workers from Brazil (see Picture 1), which was taken by a professional photographer with artistic ambitions. Careful interpretation can show that the incongruities between the habitus of the representing and the represented picture producers refer to incongruities of the different spaces of experience, the different milieus they both belong to and to their relation in society (BOHNSACK, 2008a, pp.249ff.).

Picture 1: Sebastião Salgado: Family with eleven children in Sertão de Tauá. Ceará 1983 (from: Sebastião SALGADO, 1997, p.98) [21]

Returning to PANOFSKY, it can be seen as one of his most extraordinary achievements to have worked out the concept of habitus or the documentary meaning (for instance of an epoch like the Renaissance) by way of homologies (that means: structural identities) between quite different media or quite different genres of art from the same epoch (from literature to painting, and architecture to music). Exactly this extraordinary achievement has become the point of reference for the art historian Max IMDAHL to ask what then is singular to the picture medium or to iconicity in PANOFSKY's interpretations. PANOFSKY is not primarily interested in those meanings which are conveyed through pictures alone, but in those which are also imparted through pictures and other media. [22]

6. The Importance of Formal Structure and the Methodically Controlled Suspension of Parts of Iconographic Knowledge

In this context, Max IMDAHL (1996, pp.89ff.) also criticized the reduced significance of "forms" and "formal compositions" in the work of PANOFSKY. Forms and compositions are reduced to the function of arranging pictured objects in their concreteness, and of arranging iconographical narrations (for example a text from the Bible) in a recognizable manner. IMDAHL (1996a, pp.89f.) contrasts this so-called "recognizing view" ("wiedererkennendes Sehen") with the "seeing view" ("sehendes Sehen"), which has its point of reference not in pictured objects in their concreteness, but in their relation to the overall context and to the entire composition of the picture. [23]

The "seeing view," in opposition to the "recognizing view," is the basis of IMDAHL's method, which he has called "iconic" ("Ikonik" in German) (IMDAHL 1994 and 1996a). Iconical interpretation is based primarily on formal composition and on pre-iconographical description. According to IMDAHL, iconical interpretation can abstain from the ascription of iconographical meanings or iconographical pre-knowledge—and that means from textual knowledge. Iconic interpretation can—as IMDAHL has put it—"refrain from the perception of the literary or scenic content of the picture, it is particularly successful when the knowledge of the represented subject is—so to speak—methodically suppressed" (1996b, p.435). [24]

Such a "suppression" or "suspension" of textual pre-knowledge seems to be methodically necessary if we seek to comprehend a picture in IMDAHL's sense (1979, p.190) as a "system, which is constructed according to inherent laws and its evident autonomy." In terms of the social sciences this means comprehending the picture as a "self-referential system" (LUHMANN 1987, pp.31f.). If we follow Max IMDAHL and attempt to grasp the relevance of his approach for the social sciences, we will be simply—as I have already mentioned—making use of a device which has been the source of enormous progress in qualitative methods as far as the field of text interpretation is concerned. Now the question is how we can manage to transfer this device to the interpretation of pictures, to iconicity and its inherent laws. [25]

As far as the suspension of the textual knowledge, as stipulated by Max IMDAHL, is concerned, we can find correspondences or analogies to semiotics in the work of both of its prominent representatives: Umberto ECO as well as Roland BARTHES. Beyond the differences between them, both agree that we must begin our interpretation of pictures below the level of connotations in order to advance to the autonomy and inherent laws of the picture. The level of connotation, however, as ECO (1968, p.143) emphasizes, corresponds in several respects to PANOFSKY's level of iconography4). [26]

The singularity of the picture in contrast to text, and the specific system of meaning, the singular message of the pictorial, iconical signs, is thus determined on the pre-iconographical or denotative level. When decoding these messages, however, we must always pass through the next higher level: the level of iconographical or connotative code, which somehow obtrudes upon our minds and which Roland BARTHES (1991, p.45) has called the "obvious meaning" ("sens obvie"). In our common sense-interpretations, we usually tend to interpret non-abstract pictures by beginning with a mental construction of actions and stories which might have taken place in the picture. In the territory of common sense, we thus tend towards an iconographical interpretation. [27]

The decoding of a message which can be imparted exclusively by a picture thus must always go through iconographical or connotative code. However the message must "get rid of its connotations" as Roland BARTHES (1991, p.31) has put it, and "is first of all a residual message, constituted by what remains in the picture when we (mentally) erase the signs of connotation"5).

Picture 2: Diego Velázquez: Las Meninas, 1656. Madrid, Museo del Prado (from GREUB, 2001, p.295) [28]

At this point, some parallels with FOUCAULT's well-known interpretation of the painting "Las Meninas" by Diego Velázquez become apparent (see Picture 2). In his interpretation, FOUCAULT (1989, p.10) emphasized: "We must therefore pretend not to know." According to FOUCAULT, it is not so much the knowledge about institutions and roles which should be suspended (in the example of "Las Meninas," this would mean suspending our knowledge about the institution of the Spanish Court with its courtiers, maids of honor and gnomes). It is much more "proper names," as FOUCAULT (1989, p.10) says, which should be "erased." This means that our knowledge about the case-specific or the milieu-specific peculiarity of what is presented, and of its concrete history, should be omitted, "if one wishes to keep the relation of language to vision open, if one wishes to treat their incompatibility as a starting point for speech instead of as an obstacle to be avoided" (FOUCAULT, 1989, p.10). [29]

As my last expositions suggest, it appears that certain correlations can be worked out between prominent approaches and traditions in the area of picture interpretation. These correlations suggest that specific meanings or specific elements of knowledge on the connotative or iconographical level, which are primarily formed by narrations and by our textual knowledge, need to be—so to speak—suspended or ignored. In this way it seems to be possible to "keep open" the relation or tension between picture and language or picture and text in FOUCAULT's sense (1989, p.10). [30]

The precondition for this openness is to avoid, from the outset, the subordination of the picture to the logic of language and text. Up until now this problem has not been consistently taken into account in qualitative methods. In the field of semiotics, it was Roland BARTHES who presented a number of exemplary interpretations which follow the method of suspension outlined here, a method which begins "when we (mentally) erase the signs of connotation," as BARTHES (1991, p.31) has put it. [31]

BARTHES calls the system of meaning which is the result of these interpretations the "obtuse meaning" (1991, pp.53ff.) ("sens obtus"). In the medium of text or language, the significance of this system of pictorial meaning can be transmitted only in the form of ambiguities and contrariness. With reference to photographs from the Eisenstein movie "The Battleship Potemkin," Roland BARTHES has shown that the facial expression of a weeping old woman, for instance, is neither a face which is tragic in the classic sense, nor does it cross the line into being comical. In a similar way, Umberto ECO (1994, p.146) speaks of the "productive ambiguity" ("ambiguità produttiva") in the deeper semantic structure of the picture. [32]

The iconic meaning, which is Max IMDAHL's term for this deeper semantic structure, has—according to IMDAHL—its peculiarity in a "complexity of meaning which is characterized by transcontrariness" (in German: "eine Sinnkomplexität des Übergegensätzlichen") (1996a, p.107).

Picture 3: Giotto, The capture of Christ, about 1305. Padua, Arena-Kapelle (from: IMDAHL, 1996a, Abbildungsverzeichnis, p.45, the slanting line was drawn by me according to IMDAHL) [33]

IMDAHL (1994, p.312) explains this with the example of Giotto's famous fresco "The Capture of Christ" (see Picture 3) and tries to demonstrate that "due to a specific pictorial composition, Christ appears in a position of being inferior and superior at the same time." This complexity of meaning, which transcends simple iconography, is essentially based upon the so-called "planimetric composition" ("planimetrische Komposition"), that means: upon the composition of the picture as a plane. In the case of Giotto's "Capture of Christ" it is only one slanting line which—according to IMDAHL—is decisive for the composition of the picture. The complexity of meaning in its transcontrariness can hardly be expressed in words, and the verbal transmission of its meaning can succeed only in direct reference to the picture. [34]

Whereas—according to IMDAHL—it is not completely futile to attempt to verbalize this complexity of meaning, Roland BARTHES (1991, p.59) insists that "we can locate theoretically but not describe" that deeper semantic structure of the picture which he calls the "obtuse meaning." And a further quotation: "The obtuse meaning is not in the language system" (1991, pp.51 and 54). [35]

On the basis of Roland BARTHES' theory of semiotics, there seems to be no successful way to develop a method for the interpretation of pictures which is relevant for the social sciences and is able to transcend the surface of iconographical or connotative meanings. It seems to be more promising to attempt to do this in the tradition of PANOFSKY's theory and its modifications and advancements through Max IMDAHL. In the framework of social sciences, however, several methodical specifications seem to be required, especially with respect to the suspension of iconographical or connotative meaning, that is, the disregarding of parts of verbal and textual knowledge. In the field of social scientific interpretations of pictures, these specifications seem to be especially necessary, because here iconographical knowledge is not transmitted in a codified manner—as we will find in the history of arts, for instance in the form of Biblical texts. [36]

FOUCAULT emphasizes (as I have already mentioned) that in the case of the interpretation of pictures we should not suspend all of our knowledge about names—not all names should be "erased," only the "proper names." Taking a family photo as an example, we should, or must, proceed on the assumption (or on the basis of secured information) that the pictured persons are a family. Thus we have to activate our knowledge about the institution of the family and its role-relations. If we know that it is the "Johnson" family, we should also draw upon our knowledge about the role-relations of the presented picture producers: mother, father, aunt, uncle and so on. We should, however, suspend or ignore as completely as possible all of the knowledge we have about the concrete biography and history of the "Johnson" family. [37]

In the framework of the Documentary Method and Karl MANNHEIM's Sociology of Knowledge, which we call the "Praxeological Sociology of Knowledge" (BOHNSACK, 2006), the two forms of knowledge which are to be differentiated here can be categorized as communicative knowledge on the one hand and conjunctive knowledge on the other (see Diagram 1). Communicative knowledge concerns generalized and mostly stereotyped, more precisely: institutionalized knowledge. In the understanding of Peter L. BERGER and Thomas LUCKMANN (1966, p.51): "Institutionalization occurs whenever there is a reciprocal typification of habitualized actions by types of actors." This knowledge concerns role-relations in society. From this communicative knowledge, we must differentiate the conjunctive knowledge which is connected with proper names. This sort of knowledge about the "Johnson" family concerns its individual, case-specific peculiarity on the one hand, and its milieu-specific character on the other. [38]

Even when we are endowed with valid knowledge about the biography of the family in a verbal-textual form (maybe on the basis of interviews or the analysis of family conversations), we should suspend or ignore this in the course of the interpretation of the photos. [39]

Thus we must begin as far as possible below or beside the iconographical level, that is, on the pre-iconographical level and on the level of the formal structure (see Diagram 1). [40]

With Max IMDAHL (1996a, Chapter II) we can differentiate among three dimensions in the formal compositional structure of the picture: the "planimetric structure," the "scenic choreography" and the "perspectivic projection." Perspectivity has its function primarily in the identification of concrete objects in their spatiality and corporality. Perspectivity is thus orientated to the regularity of the world which is presented in the picture, to the world outside, and within the environment of the picture. With reference to scenic choreography, the same is true for the social scenes in the world outside. In contrast to that, the reconstruction of the planimetric composition, of the picture's formal structure as a plane, leads us to the principles of design and to the inherent laws of the picture itself. It is first of all the planimetric composition which leads us to the picture as a "system, which is designed according to its inherent laws and is evident in its autonomy" (IMDAHL 1979, p.190). [41]

If we thus succeed in gaining access to the picture as a self-referential system, then we will also attain systematic access to inherent laws of the picture producer's realms of experience—for example to the realms of experience of a family with its specific collective habitus. [42]

7. Example of a Private Family Photo

Picture 4: Family photo [43]

To illustrate this, I would like to refer to an example from a research project about traditions in families from Eastern Germany, from the former GDR. In addition to family photos, we also based our interpretation on conversations at the living room table and on group discussions with parents and grandparents (for a more comprehensive interpretation see: BOHNSACK, 2008b; for another interpretation of family photos on the basis of the documentary method see: NENTWIG-GESEMANN, 2006).

Picture 5: Family photo: planimetry [44]

Here we have a photo of a family celebration, a photo of a First Communion in the GDR at the beginning of the 1980s (see Picture 4). The planimetric composition of the picture is strictly dominated by vertical and horizontal lines (see Picture 5). The representing picture producer and the represented picture producer have chosen a prefabricated building with GDR-typical slabs and the large trees with the harsh contrasts of vertical lines as the background. Moreover, the group is positioned on a path paved with slabs, so that the photo on the whole is dominated by a vertical and horizontal structuring which gives it harshness and a rigid order. [45]

Essential elements of the milieu of this family, of its realms of experience, are thus expressed in an immediate way. A precondition for the validity of such a far-reaching interpretation, however, is that homologous elements can also be worked out in other dimensions of the picture, especially at the level of pre-iconographic description. Harshness and rigidity are documented not only in the planimetric composition, but also in the facial expressions, in gestures and in posture, which is characterized by a strictly vertical body axis. [46]

This rigidity and harshness stand in contrast to the provisional character of other parts of the foreground. The path on which the group is positioned is not yet completed. It seems to lead nowhere and its provisional cordon is destroyed. This impression of being unfinished and unsure or insecure is increased by the picture's design, with the background being moved far away, and by the absence of a middle ground. Thus the small group seems to be isolated in a special way and removed from relationships in which they could be held and embedded. The group seems to be a little bit "lost". [47]

Altogether, we have a tense relationship between the impression of being provisional, insecure, and isolated on the one hand, and harshness and rigidity on the other. This tense relationship makes up the atmosphere of the picture and gives us some insight into the family's habitus. In a verbal-textual manner, this habitus can only be formulated through "transcontrariness": as the habitus of rigidity and harshness in the context of provisionality and insecurity. As I have already mentioned, the specific quality of the iconic meaning, or rather of its verbalization, is seen by Max IMDAHL in its "complexity of meaning characterized by transcontrariness", which becomes immediately evident in the picture but can hardly be formulated in a verbal-textual manner. [48]

8. Example of an Advertising Photo

Picture 6: Advertising photo I: Burberry London. From Vogue 2005 Russia [49]

As another example of such a "complexity of meaning characterized by transcontrariness" and as an example of the importance of formal structure, I would like to present a quite different family photo to you (see Picture 6): here we have an advertising photo from the clothing company Burberry, which is meant to target markets in Russia and the USA.

Picture 7: Advertising photo I: Burberry London: planimetry. From Vogue 2005 Russia (lines were drawn by me) [50]

A closer interpretation of this advertising photo (see Picture 7) can give us insight into the lifestyle which is being promoted here. Taking a look at the planimetric composition, it becomes evident that we have two groups. The group on the right is being viewed favorably by the group on the left. The distinct styling of the group on the right makes it evident that this group is the primary vehicle of the advertising message, and also the addressee of the message. The right-hand group represents a specific generation: the generation in transition from the pre-family to the family phase of its life cycle. Through the benevolence and acceptance on the part of the group on the left, which is constituted by representatives of other generations, the right-hand group and the lifestyle which it stands for are integrated into a trans-generational context, and at the same time, into the context of the extended family. [51]

In contrast to the compositional arrangement, and to the physical closeness of the members of the right-hand group, we can observe the absence of any visual contact. The impression of belonging, unity, and community which is produced by the planimetric composition and scenic choreography is thus negated by the absence or denial of visual contact. The protagonists of our photo are members of a community, and at the same time they are isolated individuals. The Burberry Style as a lifestyle of clothing—which seems to be the message here—can enable us to experience belonging and community without requiring us to forfeit our individualism. [52]

However, we recognize that the presentation of individuality and autonomy has taken the specific form of a negation. This is due to the peculiar form of presentation in advertising. Advertising depends on the medium of the pose (see also: BOHNSACK, 2007b and IMDAHL, 1996c), the "hyper-ritualization" as Erving GOFFMAN (1979, p.84) has called it, and is confronted with the paradoxical challenge of expressing individuality through the medium of poses and stereotypes. In our case, this is accomplished through the absence or denial of visual contact. This effect is even more evident in the photo which is intended for the German advertising market (see Picture 8).

Picture 8: Advertising photo II: Burberry London: planimetry. From Vogue 2005 Germany [53]

Thus the photo demonstrates yet another form of transcontrariness in its iconic or iconological meaning: the presentation of individuality by posing or using stereotyped postures.

Picture 9: Advertising photo I: Burberry London: planimetry and golden section. From Vogue 2005 Russia

Picture 10: Advertising photo I: Burberry London: perspectivity. From Vogue 2005 Russia (lines were drawn by me) [54]

If we return to the photo for the Russian and American markets, we can see that one person is standing in the planimetric center (see Picture 9), which is here marked by the intersection of the circles, as well as in the so-called golden section, and also in the perspectival center, the vanishing point (see Picture 10). That person is the supermodel Kate Moss, who personifies the propagated lifestyle to the extreme (for a more comprehensive interpretation see: BOHNSACK, 2007d and 2008b).
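As a brief geometric aside, the golden section divides a length so that the whole relates to the larger part as the larger part relates to the smaller. A minimal worked form follows; the symbols W (picture width), a (larger part) and b (smaller part) are introduced here for illustration only and do not appear in the original analysis:

```latex
\varphi \;=\; \frac{a+b}{a} \;=\; \frac{a}{b} \;=\; \frac{1+\sqrt{5}}{2} \;\approx\; 1.618,
\qquad
a+b = W \;\Rightarrow\; a = \frac{W}{\varphi} \approx 0.618\,W, \quad b = W - a \approx 0.382\,W .
```

Under this assumption, the two vertical golden-section lines of a picture lie at roughly 0.382 W and 0.618 W from its left edge. Which of the two is drawn in Picture 9 is not stated in the text.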

Picture 11: Family photo: perspectivity [55]

Returning to the photo of the First Communion (see Picture 11), we can now see that it is not the most important person of the ritual, the child receiving First Communion, who has been moved into the perspective's center, but rather the grandmother. The photographer or representing picture producer (the child's aunt) has positioned herself eye-to-eye with the grandmother. The focus of perspective, the vanishing point, is on the level of the grandmother's eyes and close to them. Perspectivity can reveal insights into the perspective of the representing picture producers and their philosophy, their "Weltanschauung," as PANOFSKY (1992) has elaborated in his essay on the "perspective as a 'symbolic form'." [56]
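The vanishing point discussed here is, geometrically, the point where the extensions of edges that are parallel in the depicted scene meet in the picture plane. The following is a minimal sketch of that construction, not the procedure used in the study (where the lines were drawn by hand). The function names and all coordinates are hypothetical and are not measurements taken from the photo:

```python
def line_through(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 for the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def vanishing_point(edge_a, edge_b):
    """Intersect the extensions of two picture edges, each given as a pair of points.

    Returns (x, y), or None if the two edges are parallel in the picture plane.
    """
    a1, b1, c1 = line_through(*edge_a)
    a2, b2, c2 = line_through(*edge_b)
    w = a1 * b2 - a2 * b1          # zero when the lines do not intersect
    if abs(w) < 1e-12:
        return None
    x = (b1 * c2 - b2 * c1) / w
    y = (c1 * a2 - c2 * a1) / w
    return (x, y)


# Hypothetical pixel coordinates of two receding edges (e.g. the borders of a path).
left_edge = ((0, 0), (4, 2))
right_edge = ((0, 8), (4, 6))
print(vanishing_point(left_edge, right_edge))  # (8.0, 4.0)
```

In an interpretation of the kind described above, the y-coordinate of such a point could then be compared with the eye level of the persons depicted. This comparison remains an interpretive step and is not automated by the sketch.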

Here, a gender-specific hierarchy with generation-specific elements is documented. We have a predominance of women, especially the older women in the family. Homologous to the focus of the photographer's perspective, that is, of the representing picture producer, the group (the represented picture producers) has positioned itself around the grandmother. Such observations concerning the structure of this family could later be validated on the basis of the interpretation of texts from group discussions and from table conversation. [57]

9. The Analysis of the Formal Structure Opens up an Access to the Picture in its Entirety

By thoroughly reconstructing the formal, especially the planimetric, composition of a picture, we are in a sense forced to interpret the picture's elements not in isolation from each other, but as an ensemble, in the context of the other elements. In contrast to that, in a common-sense interpretation, we are inclined to pick single elements out of the picture's context. [58]

Analogies to methodological devices for the interpretation of texts become apparent here. As we know from the field of Ethnomethodology, it is indispensable for the proper understanding of an utterance to consider the overarching context which is produced by the speakers themselves. The single elements of a text as well as the elements of a picture arrange themselves as contexts and settings, and attain their proper meaning only through the settings which they are part of. In the area of Ethnomethodology, this mutual relation has been called reflexivity. According to Harold GARFINKEL (1961 and 1967), the method of interpretation which allows access to the structures of meaning constituted by this reflexivity is the documentary method. We are only able to validly reconstruct context if we succeed in identifying formal structures. They are documents for the natural order which has been produced by the actors themselves. [59]

Conversation Analysis has done pioneering work here. The reconstruction of formal structures is an important instrument for the interpretation of deeper semantics. In Germany, for example, this has been verified by the analysis of communicative genres (GÜNTHNER & KNOBLAUCH, 1995) as well as by the reconstruction of textual genres with the method of narrative interviews (SCHÜTZE, 1987), and also through the reconstruction of discourse organization in our own interpretations of conversation on the basis of the documentary method (BOHNSACK & PRZYBORSKI, 2006). In the field of the interpretation of pictures, however, the reconstruction of formal structures is still in its infancy. For the further development of methodology, it seems to be useful to make use of the preliminary work concerning formal aesthetics in the field of art history. [60]

10. Sequence Analysis, Reconstruction of Simultaneity and the Importance of Comparative Analysis

The interpretation of texts and the interpretation of pictures have in common the methodological device of gaining access to the inherent laws of meaning by way of formal structure. However, the procedures and strategies for its application are quite different. As IMDAHL has emphasized, we are only successful in interpreting the inherent meaning of a picture if we comprehend its fundamental structure of simultaneity (see Note 6). IMDAHL (1996a, p.23) describes this in his idiosyncratic language as "the coincidence of composition and endowment with meaning," where "the entirety is totally present from the outset." [61]

Here we have an essential difference from the qualitative methods in the field of text interpretation, where sequence analysis is the central methodical device. If we tried to transfer this device to the interpretation of pictures, we would ignore their inherent structures. Sequence analysis, however, can be understood as being derived from the more general principle of comparative analysis, the principle of operating with horizons of comparison. [62]

The specific structure of conversational meaning or of narration, for instance, is made accessible when I comparatively contrast it with alternative courses of conversation or narration (BOHNSACK, 2001). In the interpretation of pictures we are dependent on horizons of comparison as well (see also: BOHNSACK, 2003). Access to the interpretation of the formal composition of a picture in its individuality can be gained, as Max IMDAHL (1994) has shown, by contrasting it with other contingent possibilities of composition. These can be designed by thought experiments or, even more validly, the interpretation can be guided by empirical horizons of comparison (for instance when comparing the photo of a First Communion with those from different milieus or different cultures, such as Eastern and Western Germany; BOHNSACK, 2008b). [63]

11. Conclusions

When developing qualitative methods for the interpretation of pictures, it seems to be important not to explain pictures by texts, but to differentiate them from texts. Nevertheless, it seems equally important to develop common standards or methodological devices which are relevant for the interpretation of texts, as well as for the interpretation of pictures. Examples of common standards are: to treat the text as well as the picture as a self-referential system, to differentiate between explicit and implicit (atheoretical) knowledge, to change the analytic stance from the question What to the question How, to reconstruct the formal structures of texts as well as pictures in order to integrate single elements into the over-all context, and—last but not least—to use comparative analysis. The application or realization of these common standards and methodological devices in the field of the interpretation of pictures, however, has to be quite different from that of the interpretation of texts, if we intend to advance to iconicity as a self-contained domain, to its inherent laws and to its autonomy independent from texts. [64]


1) And this is also true for the analysis of videos and movies in the social sciences. In those areas of video analysis which locate themselves in the tradition of Conversation Analysis and Ethnomethodology (and also of Cultural Studies), the picture only has a supplementary function to the analysis of talk, that is, a supplementary function to the text (see also: BOHNSACK, 2008b). Charles GOODWIN (2001, p.157) has made this explicit in a very clear manner: "However in the work to be described here neither vision, nor the images (…) are treated as coherent, self-contained domains that can be subjected to analysis in their own terms. Instead it quickly becomes apparent that visual phenomena can only be investigated by taking into account a diverse set of semiotic resources (…). Many of these, such as structure provided by current talk, are not in any sense visual, but the visible phenomena (…) cannot be properly analysed without them."

Whereas GOODWIN regards it as impossible to analyze visible phenomena without reference to talk, Conversation Analysis has a long tradition of analyzing talk, that is, verbal phenomena, without reference to other semiotic resources, especially visible phenomena. Neither here nor in other publications in the realm of Conversation Analysis could I find a comprehensive reasoning for this fundamental difference concerning the methodological and theoretical status of pictures and texts.

For a video analysis on the basis of the documentary method see also BOHNSACK (2008b) and Monika WAGNER-WILLI (2006).

2) Here the question arises whether amateur photographs and the habitus of the amateur photographer can be interpreted according to the standards and methods of art history. The answer has already been given by Pierre BOURDIEU (1990) with the title of his book about family photography: "Photography. A Middle-brow Art" (in French: "Un art moyen"). And in the book he explains: "In fact, while everything would lead one to expect that this activity (…) would be delivered over to the anarchy of individual improvisation, it appears that there is nothing more regulated and conventional than photographic practice and amateur photographs" (1990, p.7). The stylistic preferences, the habitus, "the system of schemes of perception, thought and appreciation common to a whole group" (1990, p.6), constitutes a selectivity, which has its consequences also for the snapshot and especially for the snapshot (for a more comprehensive discussion see: BOHNSACK, 2008b).

In the field of qualitative text-interpretation it is a matter of course to interpret profane products like pieces of art, as artful practices with inherent laws and a strict order, or, as it is called in Ethnomethodology: "as an ongoing accomplishment (…) with the ordinary, artful ways of that accomplishment" (GARFINKEL, 1967, p.vii). But up to now this device has not really been transferred to the interpretation of pictures.

3) Different from the English translation in MANNHEIM 1982 (p.204), where we can find the formulation: "conjunctive experiential space," I prefer to translate the German term "konjunktiver Erfahrungsraum" (MANNHEIM 1980, p.227) with "conjunctive space of experience."

4) Concerning the correspondences between Roland BARTHES and Erwin PANOFSKY see also: van LEEUWEN (2001).

5) Here I am not following the English translation in BARTHES (1991, p.31): "(…) is first of all a privative message, constituted by what remains in the image, when we (mentally) erase the signs of connotation."

6) Whereas IMDAHL as a historian of the arts is focusing on the picture as a performance of the representing picture producer, the structure of simultaneity is also valid for the performance of the represented picture producers, as has already been worked out by Ray L. BIRDWHISTELL (1952) in his classic on the interpretation of gestures, of Kinesics. Hubert KNOBLAUCH (2006, p.78) has pointed to this "dimension of simultaneity" concerning video analysis (without more concrete references to research practice, however). For the importance of simultaneity in video analysis in methodology and research practice on the basis of the documentary method see BOHNSACK (2008b) and Monika WAGNER-WILLI (2006).


Barthes, Roland (1967). Elements of semiology. London: Jonathan Cape.

Barthes, Roland (1991). The responsibility of forms. Critical essays on music, art and representation. Berkeley: University of California Press.

Belting, Hans (2001). Bild-Anthropologie. Entwürfe für eine Bildwissenschaft. München: Fink.

Berger, Peter L. & Luckmann, Thomas (1966). The social construction of reality. Garden City, New York: Doubleday.

Birdwhistell, Ray L. (1952). Introduction to kinesics (An annotation system for analysis of body motion and gesture). Louisville: University of Louisville.

Boehm, Gottfried (1978). Zu einer Hermeneutik des Bildes. In Hans-Georg Gadamer & Gottfried Boehm (Eds.), Seminar: Die Hermeneutik und die Wissenschaften (pp.444–471). Frankfurt a.M.: Suhrkamp.

Bohnsack, Ralf (2001). Dokumentarische Methode. Theorie und Praxis wissenssoziologischer Interpretation. In Theo Hug (Ed.), Wie kommt Wissenschaft zu Wissen? Bd. 3: Einführung in die Methodologie der Kultur- und Sozialwissenschaften (pp.326-345). Baltmannsweiler.

Bohnsack, Ralf (2003). Qualitative Methoden der Bildinterpretation. Zeitschrift für Erziehungswissenschaft (ZfE), II, 159-172.

Bohnsack, Ralf (2004). Group discussion. In Uwe Flick, Ernst von Kardorff & Iris Steinke (Eds.), A companion to qualitative research (pp.214-220). London: Sage.

Bohnsack, Ralf (2006). Mannheims Wissenssoziologie als Methode. In Dirk Tänzler, Hubert Knoblauch & Hans-Georg Soeffner (Eds.), Neue Perspektiven der Wissenssoziologie (pp.271-291). Konstanz: UVK.

Bohnsack, Ralf (2007a). Die dokumentarische Methode in der Bild- und Fotointerpretation. In Ralf Bohnsack, Iris Nentwig-Gesemann & Arnd-Michael Nohl (Eds.), Die dokumentarische Methode und ihre Forschungspraxis. Grundlagen qualitativer Sozialforschung (2nd edition, pp.67-90). Wiesbaden: VS-Verlag.

Bohnsack, Ralf (2007b). "Heidi": Eine exemplarische Bildinterpretation auf der Basis der dokumentarischen Methode. In Ralf Bohnsack, Iris Nentwig-Gesemann & Arnd-Michael Nohl (Eds.), Die dokumentarische Methode und ihre Forschungspraxis. Grundlagen qualitativer Sozialforschung (2nd edition, pp.323-337). Wiesbaden: VS-Verlag.

Bohnsack, Ralf (2007c). Zum Verhältnis von Bild- und Textinterpretation in der qualitativen Sozialforschung. In Barbara Friebertshäuser, Heide von Felden & Burkhard Schäffer (Eds.), Bild und Text – Methoden und Methodologien visueller Sozialforschung in der Erziehungswissenschaft (pp.21-45). Opladen: Barbara Budrich.

Bohnsack, Ralf (2007d). Dokumentarische Bildinterpretation am Beispiel eines Werbefotos. In Renate Buber & Hartmut Holzmüller (Eds.), Qualitative Marktforschung. Konzepte. Methoden. Analysen (pp.951-978). Stuttgart: Gabler.

Bohnsack, Ralf (2008a). Rekonstruktive Sozialforschung. Einführung in qualitative Methoden (7th edition). Opladen: UTB.

Bohnsack, Ralf (2008b). Qualitative Bild- und Videointerpretation. Einführung in die dokumentarische Methode. Opladen: UTB.

Bohnsack, Ralf & Przyborski, Aglaja (2006). Diskursorganisation, Gesprächsanalyse und die Methode der Gruppendiskussion. In Ralf Bohnsack, Aglaja Przyborski & Burkhard Schäffer (Eds.), Das Gruppendiskussionsverfahren in der Forschungspraxis (pp.233-248). Opladen & Farmington Hills: Barbara Budrich.

Bohnsack, Ralf; Pfaff, Nicolle & Weller, Wivian (Eds.) (2008). Qualitative analysis and documentary method in international educational research. Opladen & Farmington Hills: Barbara Budrich.

Bourdieu, Pierre (1990). Photography. A middle-brow art. Stanford: Stanford University Press.

Bourdieu, Pierre (1992). The practice of reflexive sociology (The Paris workshop). In Pierre Bourdieu & Loïc J.D. Wacquant (Eds.), An invitation to reflexive sociology (pp.217-260). Cambridge: Polity Press.

Eco, Umberto (1968). La struttura assente. Milano: Bompiani.

Eco, Umberto (1994). Einführung in die Semiotik (8th edition). München: Fink.

Foucault, Michel (1989). The order of things. An archaeology of the human sciences. London: Routledge.

Garfinkel, Harold (1961). Aspects of common sense knowledge of social structures. In International Sociological Association (Ed.), Transactions of the Fourth World Congress of Sociology (Vol. IV, pp.51-65). Louvain: International Sociological Association.

Garfinkel, Harold (1967). Studies in ethnomethodology. Englewood Cliffs, New Jersey: Prentice-Hall.

Gebauer, Günther & Wulf, Christoph (1995). Mimesis. Culture—art—society. Berkeley: University of California Press.

Goffman, Erving (1979). Gender advertisements. New York: Harper and Row.

Goodwin, Charles (2001). Practices of seeing visual analysis: An ethnomethodological approach. In Theo van Leeuwen & Carey Jewitt (Eds.), Handbook of visual analysis (pp.157-182). Los Angeles: Sage.

Greub, Thierry (Ed.) (2001). Las Meninas im Spiegel der Deutungen. Eine Einführung in die Methoden der Kunstgeschichte. Berlin: Reimer.

Günthner, Susanne & Knoblauch, Hubert (1995). Culturally patterned speaking practices—The analysis of communicative genres. Pragmatics, 5(1), 1-32.

Heidegger, Martin (1986). Sein und Zeit. Tübingen: Mohr. [Orig.1927]

Imdahl, Max (1979). Überlegungen zur Identität des Bildes. In Odo Marquard & Karlheinz Stierle (Eds.), Identität (Reihe: Poetik und Hermeneutik, Bd. VII) (pp.187-211). München: Fink.

Imdahl, Max (1994). Ikonik. Bilder und ihre Anschauung. In Gottfried Boehm (Ed.), Was ist ein Bild? (pp.300-324). München: Fink.

Imdahl, Max (1996a). Giotto – Arenafresken. Ikonographie – Ikonologie – Ikonik. München: Fink.

Imdahl, Max (1996b). Wandel durch Nachahmung. Rembrandts Zeichnung nach Lastmanns "Susanna im Bade". In Max Imdahl, Zur Kunst der Tradition. Gesammelte Schriften, Vol.2 (pp.431-456). Frankfurt a.M.: Suhrkamp.

Imdahl, Max (1996c). Pose und Indoktrination. Zu Werken der Plastik und Malerei im Dritten Reich. In Max Imdahl, Reflexion − Theorie − Methode. Gesammelte Schriften, Vol.3 (pp.575-590). Frankfurt a.M.: Suhrkamp.

Knoblauch, Hubert (2006). Videography. Focused ethnography and video analysis. In Hubert Knoblauch, Bernt Schnettler, Jürgen Raab & Hans-Georg Soeffner (Eds.), Video analysis. Methodology and methods. Qualitative audiovisual data analysis in sociology (pp.69-83). Frankfurt a.M.: Peter Lang.

Leeuwen, Theo van (2001). Semiotics and iconography. In Theo van Leeuwen & Carey Jewitt (Eds.), Handbook of visual analysis (pp.92-118). Los Angeles: Sage.

Luhmann, Niklas (1987). Soziale Systeme. Grundriss einer allgemeinen Theorie. Frankfurt a.M.: Suhrkamp.

Luhmann, Niklas (1990). Die Wissenschaft der Gesellschaft. Frankfurt a.M.: Suhrkamp.

Mannheim, Karl (1952). On the interpretation of Weltanschauung. In Karl Mannheim, Essays in the sociology of knowledge (pp.33-83). London: Routledge & Kegan Paul.

Mannheim, Karl (1980). Strukturen des Denkens. Frankfurt a.M.: Suhrkamp.

Mannheim, Karl (1982). Structures of Thinking. London: Routledge & Kegan Paul.

Mitchell, William J.T. (1994). Picture theory. Essays on verbal and visual representation. Chicago & London: The University of Chicago Press.

Nentwig-Gesemann, Iris (2006). The ritual culture of learning in the context of family vacation: a qualitative analysis of vacation pictures. In Tobias Werler & Christoph Wulf (Eds.), Hidden dimensions of education. Rhetoric, rituals and anthropology (pp.135-148). Münster: Waxmann.

Panofsky, Erwin (1932). Zum Problem der Beschreibung und Inhaltsdeutung von Werken der Bildenden Kunst. Logos, XXI, 103-119.

Panofsky, Erwin (1955). Iconography and iconology: An introduction to the study of Renaissance art. In Erwin Panofsky, Meaning in the visual arts (pp.51-81). Harmondsworth, Middlesex: Penguin Books.

Panofsky, Erwin (1992). Die Perspektive als "symbolische Form". In Erwin Panofsky (Ed.), Aufsätze zu Grundfragen der Kunstwissenschaft (pp.99-167). Berlin: Wissenschaftsverlag Spiess. [Orig. 1927]

Popper, Karl R. (1959). The logic of scientific discovery. London: Hutchinson & Co.

Salgado, Sebastião (1997). Terra. Frankfurt a.M.: Zweitausendeins.

Sacks, Harvey (1995). Lectures on conversation. Vol. I & II. Oxford (UK) and Cambridge (USA): Blackwell.

Schutz, Alfred (1964). Collected papers I. Den Haag: Martinus Nijhoff.

Schütze, Fritz (1987). Das narrative Interview in Interaktionsfeldstudien: Erzähltheoretische Grundlagen. Studienbrief der Fernuniversität Hagen.

Wagner-Willi, Monika (2006). On the multidimensional analysis of video-data. Documentary interpretation of interaction in schools. In Hubert Knoblauch, Bernt Schnettler, Jürgen Raab & Hans-Georg Soeffner (Eds.), Video analysis. Methodology and methods. Qualitative audiovisual data analysis in sociology (pp.143-153). Frankfurt a.M.: Peter Lang.


Ralf BOHNSACK, Dr. rer. soc., Dr. phil. habil., Dipl.-Soziologe, University Professor, Director of the Department of Qualitative Research on Human Development at the Free University of Berlin.

Main Areas of Research: Reconstructive Social Research; Sociology of Knowledge; Documentary Method; Analysis of Talk; Interpretation of Pictures and Films; Evaluation Research; Research on Milieu, Generation, Youth and Deviance.


Prof. Dr. Ralf Bohnsack

Freie Universität Berlin
Arbeitsbereich Qualitative Bildungsforschung
Arnimallee 11
14165 Berlin, Germany

Phone: +49-30-83854228
E-mail: bohnsack@zedat.fu-berlin.de


Bohnsack, Ralf (2008). The Interpretation of Pictures and the Documentary Method [64 paragraphs]. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 9(3), Art. 26, http://nbn-resolving.de/urn:nbn:de:0114-fqs0803267.

Forum Qualitative Sozialforschung / Forum: Qualitative Social Research (FQS)

ISSN 1438-5627

Creative Commons Attribution 4.0 International License