Is Attention Really Immaterial? Visual Culture after Post-Fordism

Seb Franklin


Responding to the “Visual Culture Questionnaire” proposed by the editors of October in 1996, Susan Buck-Morss makes a pair of statements that point towards the troubled relationship that exists between theories of the visual and digital culture. Firstly, responding to the ease with which Walter Benjamin’s claim that “[i]mages in the mind motivate the will” moves from a feting of surrealism to a slogan for the advertisement industry in the society of the spectacle, Buck-Morss argues that “a critical analysis of the image as a social object is needed more urgently than a program that legitimates its ‘culture’.”1 In itself this represents an admirable proposition, one that we might do well to follow. A few sentences later, Buck-Morss turns to the internet as a realm of the visual and asserts that “[i]t is striking to anyone who has visited the internet how visually impoverished a home page can be […] The possibility of computer screens replacing television screens may mean a great deal to stockholders of telephone companies, but it will not shake the world of the visual image.”2 Powerful as many of Buck-Morss’s claims in this response are, the conundrum that emerges between these two specific examples is striking. On the one hand, we need a “critical analysis of the image as a social object.” On the other hand, the visual dimension of the internet (and, by implication, software) is “impoverished” and cannot be defined as comprising an aesthetic experience. So, by extension, my question is this: what if the visual dimension of software (including websites and video games) constitutes a distinct set of social practices from those surrounding the production and consumption of painting, photography, film and television? The analysis of these more recent social relations, according to Buck-Morss’s formulation, would constitute a study of visual culture, but not one of aesthetics. 
Buck-Morss implies as much when she goes on to suggest that “perhaps the era of images that are more than information is already behind us.”3 While this is an overly pessimistic suggestion—after all, the range of visual practices that existed before the emergence of techniques for their discretization continues to exist—it is clear that the question of how we might study these new types of images, if not in aesthetic terms, has yet to be adequately addressed.

At first impression the pursuit of a distinctive mode of visual cultural analysis for software might appear confined to little more than disciplinary nitpicking, or worse, a form of intellectual activity that is directed towards producing the “digital creative” of the future from today’s classrooms.4 There is, however, a political dimension of this analysis that has far broader implications for the forms of labor and power that characterize the present historical period. For today software, unlike painting, photography or cinema in prior historical periods, is not only a dominant form of visual cultural production (however “impoverished” these visuals might be) but also shapes the dominant form of work in industrial countries.5 To put this another way: the social object that software constitutes—that is, the dominant (if aesthetically “impoverished”) visual culture of today—is also the foremost mode of production. This is not only true in an analogical sense, as in the relationship between cinema and the factory, but also quite literally. As McKenzie Wark demonstrates in Gamer Theory, the logic of web surfing, video game play or smartphone use is exactly the same logic that is required by contemporary office work, from data entry to stock trading to battlefield management.6

When Gilles Deleuze, late in life, was first moved to define the social and political conditions of a late twentieth century, where computers, cybernetic logic and neoliberal economism become central to the management of industrial societies, as the constituents of control societies, it was not in a self-contained work on such societies but in an aside to an essay on the cinema of Jean-Marie Straub and Danièle Huillet. In “Having an Idea in Cinema” Deleuze invokes control societies in order to express, as he sees it, the fundamental intractability between the work of art and the work of communication—with the erasure of the former under the latter configured as a crucial indicator of historical periodization.7 The implication here, echoed in Buck-Morss’s response to the “Visual Culture Questionnaire,” is that, under the conditions of control, the aesthetic is overwritten by the communicative, a procedure that is emblematic of wider transformations in the political-economic logic of industrial societies and the broader global interests they control. To be clear, both Deleuze and Buck-Morss posit this supremacy of communication or information as ideological: it is not the case that the aesthetic realm no longer exists, but rather that recent transformations of social conditions elevate communication and information above it. Both texts implore us to locate the markers of social practices in visual culture, but at the same time suggest that to carry out such a project today necessitates an engagement with the conditions of an era in which the aesthetic is reframed as the communicative or the technical. Ultimately, what both Buck-Morss and Deleuze suggest is that to focus on the content of the image alone—be this in aesthetic, ontological or indexical terms—is to focus on the wrong aspect of the visual for the critical analysis of the present era.

Certainly we do not have to look hard to observe that analyses of the visual dimension of so-called new media are predominantly concerned with communication, or questions of how digital images are made and, by extension, prepared for storage and transmission. The range, or perhaps more accurately the number, of texts attempting such a project is substantial. Representative texts in this canon might include William J. Mitchell’s exhaustive analysis of the technics of digital images in The Reconfigured Eye (1992), Bernard Stiegler’s short essay “The Discrete Image” (1995), Lev Manovich’s location of numerical representation as the cornerstone element of any new media object in The Language of New Media (2001), and D.N. Rodowick’s analysis of the implications of the digital image for film philosophy in The Virtual Life of Film (2007).8

All of these approaches—despite their virtues and uses—fail to summon a historically specific ethics or politics of the digital image. Such a project would not be limited to evaluations of the “truth” of a given image at the level of its material substrate and manipulability, but would also have to entail an analysis of what a given image is asking us to do, or—to borrow W.J.T. Mitchell’s phrase—what a given picture wants. Critics of visual culture have to come to terms with the choice between that which is experienced as radically new and that which is not. Periodization, however roughly hewn, becomes an indispensable tool here because the emergence of digital technologies for the discretization, mathematical representation and subsequent total manipulability of analogue images is tightly bound up with the historical emergence of post-Fordist economism and the associated extension of productive activity, which now encompasses a range of leisure practices and forms-of-life alongside classical forms of work. The changing function of images is crucial to this turn.

Jonathan Beller’s The Cinematic Mode of Production, focused as it is on attention economy, foregrounds both the necessity and the difficulty of analyzing the instrumental role of the image under post-Fordism.9 This difficulty is evident in the book’s opening sentences, where Beller states that:

The Cinematic Mode of Production remands to the reader the following idea: Cinema and its succeeding (if still simultaneous) formations, particularly television, video, computers, and the internet, are deterritorialized factories in which spectators work, that is, in which we perform value-productive labor. It is in and through the cinematic image and its legacy, the gossamer imaginary arising out of a matrix of socio-psycho-material relations, that we make our lives.10

While Beller’s general analysis—of a transformation in capitalism throughout the twentieth century whereby supposedly immaterial objects such as information, affects and attention attain commodity status—is descriptive enough of a historical turn clearly evidenced in sources as heterogeneous as management literature and critical accounts of late capitalism, his fixation on a conceptual change (the cinematic production of specific forms of vision) at the root of these transformations fails to address the relationship between post-Fordist economism and the specific materiality of its emblematic technology: the computer.11

Fig. 1: “Cloud Power” (detail). Microsoft Cloud Computing advertisement, 2010.

Visuality provides a crucial entry point into the location of the materiality of computer technology and its uses because it foregrounds a set of essential problems around the notion of immateriality—problems that are at the centre of the present critiques of post-Fordism. By not engaging with the materiality of computation, the assorted discussions of novel forms of production and valorizable activity found in the work of Maurizio Lazzarato, Paolo Virno, Michael Hardt and Antonio Negri, amongst others, curiously mirror the presentation of computation and networking found in software advertisements such as the Microsoft “Cloud Power” campaign. This supposed immateriality flies in the face of a range of evidence to the contrary: from the coltan reserves in the Democratic Republic of the Congo to mass waste dumps such as the now-infamous Guiyu computer graveyard in China to the growing prominence of wrist supports, gloves, ergonomic keyboards and mice designed to ward off RSI in the workplace, the materiality of digital culture is plain to see. All of this is not to suggest that the conceptual facets of the current industrial vanguard should be ignored, however: the reason visuality provides such a rich entry point is that it both facilitates material practices through the production of affects and ideas (if we suppose for the moment that these things are indeed immaterial) and comprises specific material practices of production and use. In short, if there is to be a critique of post-Fordism that functions at the level of its definitive practices and technologies, it must come to terms with the materiality of the new forms of valorizable activity as well as the conceptual and affective transformations that make these activities both possible and desirable.

To focus on a single form of so-called immaterial economy in order to form a theoretical response to the broader situation we might begin from the following question: how can attention be measured? An analysis of the specific, technically-facilitated practices that make such a thing as an attention economy materially (as opposed to just conceptually) possible will be a small but crucial step towards a critical theory that addresses the ways in which post-Fordism configures the social as a technical system, rather than the technical function of the machines in isolation.

Here we might do well to think of Geert Lovink and Florian Schneider’s provocation that the emergence of the network as the infrastructural form of global capital necessitates a movement of critical approaches away from the spectacular and towards “invisible, subtle processes and feedback loops.”12 This is a historical claim that echoes Buck-Morss’s more patient analysis of the transforming diagrams of capital in her essay “Envisioning Capital: Political Economy on Display,” in which the “amorphous” sociogram produced by system-dynamic approaches to economism is contrasted to classical hierarchical diagrams of corporate structure.13 While Lovink and Schneider’s critique of visual-cultural studies of the digital represents a crucial historical intervention in the analysis of digital media, its bluntness masks the crucial, if transformed, role played by the visual under post-Fordism. If digital networks expand and intensify the range of activities that can be functionally productive, as the range of analyses from Italian political theory suggests they do, then questions of how these activities are both motivated and measured must be addressed, and this requires an analysis of the social, as well as the technical, dimension of media objects. The concept of attention economy, theorized as it is in both business literature and critical theory, represents an optimal point from which to analyze the complex of aesthetic, technical and social factors that constitute the historical dimension of the digital image.

In the second of his cinema books, Deleuze defines two classes of cinematic image that are produced through distinctive editing schemes. The two ways in which the cut functions, rationally and irrationally, are defined in the following manner:

[In classical cinema] the cuts which divide up two series of images are rational, in the sense that they constitute either the final image of the first series or the first image of the second [...] when there is a pure optical cut, and likewise when there is false continuity, the optical cut and the false continuity function as simple lacunae, that is, as voids which are still motor, which the linked images must cross. In short, rational cuts always determine commensurable relations between series of images.14

[In e.g. Godard’s cinema] the image is unlinked and the cut begins to have an importance in itself. The cut, or interstice, between two series of images no longer forms part of either of the two series: it is the equivalent of an irrational cut, which determines the non-commensurable relations between images. It is thus no longer a lacuna that the associated images would be assumed to cross: the images are certainly not abandoned to chance, but there are only relinkages subject to the cut, rather than cuts subject to the linkage.15

The rational cut predominantly maintains continuity between series of images, while the irrational cut draws attention to itself by breaking continuity, subordinating the series of images to the cut. To map this onto Deleuze’s own distinction between forms of cinema, the rational cut is that of the movement-image while the irrational cut is that of the time-image. It should be noted that this is also a distinction, in Deleuze, between classical and avant-garde modes of production—where the movement-image belongs to the former and the time-image to the latter. As we well know by now, the movement-image of classical, continuity cinema aims to produce a continuum of attention, where no component emerges that would draw a viewer’s attention towards the artificiality of the experience. To echo Guy Debord’s rephrasing of Marx, we could say that the movement-image of classical, commodity or spectacle cinema functions as an opiate. The time-image, by contrast, aims to introduce jolts, or spikes, that explicitly draw attention to the artificiality of the editing process. This type of image (or combination of images), however, also suggests a distinct narcotic effect: not that of an opiate but of a stimulant.

So what kind of image would result if the principal function of commercial (rather than avant-garde) media were no longer to opiate, but instead produced a situation in which the subject internalizes the logic of the classical commodity form in a way that stimulates action and thus generates measurable attention? In other words, what would the image-type be if media were valorized not only through their purchase as commodities but also through the active process of consumption they stimulated? The advertising image that informs Guy Debord’s theorization of spectacular culture implores us to respond, to buy the advertised product, but at some time in the future. By contrast, the image produced for software interfaces, video games and websites demands an instant response, a motor response which in many cases functions more quickly than the time needed to fully process the image.16 This type of image would be not a movement-image or a time-image but an action-image—an image that is made to motivate user input. A spike or jolt of attention would be an essential component of this image because each significant change in the image would ideally necessitate some form of input, a mouse-click, keystroke or button-press. 
We might here recall Friedrich Kittler’s rejoinder—made in his essay “Computer Graphics: A Semi-Technical Introduction”—that while modern graphical computing presents not only a user interface but also images and video that directly reference the lineage of photography, film and television in their appearance and relations, the fading memory of a computer screen populated by white dots on an amber or green background serves to remind us that the “techno-historical roots of computers lie not in television, but in radar.”17 The crucial aspect of the radar image, of course, is not the visual form it takes but the fact that the user “must be able to address the dots, which represent attacking enemy planes, in all dimensions and to shoot them down with the click of a mouse.”18 The image, in this setup, may present a model of the world or a model with no actual correlate, but in any case it is always aimed at motivating user action.

If the irrational cut—the cut that “spurns all syntax and figurality between images,” “draws attention to the possibility that the images being joined are of different kinds” and “deplores closure”—forms the basic functional condition of television, as Richard Dienst suggests it does, this is in part because of the possibility for the viewer to change channels in the middle of a given shot or edited stream of images.19 At the basic level of channel switching on television, then, the degree of interactivity introduces the attention spike of the time-image to everyday mediated experience. User input—from the possibility of changing channels on television to the pause, fast-forward and rewind functions of video, to the various forms of interaction afforded by digital media—introduces a distinction between the series of images that executes through editing and the image that is executable.

The action-image thus moves beyond the types of visuality defined by Deleuze by presenting, instead of the cut, a moment of user instantiation. The concept of execution—the fundamental experiential unit of computation—becomes primary. Here Manovich’s notion of spatial montage, in which cutting becomes not a change from one frame to another but the appearance and disappearance of multiple windows within the frame of the screen, is worth recalling. The problem with Manovich’s formulation, however, is its focus on the produced image rather than the way in which the image is produced. In other words, the way in which windows open and close as a result of user input and the way programs invite such input through graphical and aural cues appear of little interest in Manovich’s analysis. Spatial montage is strictly created through one-way manipulation of images by a filmmaker, even when the multiple-window layout of the computer’s Graphical User Interface presents the originary model:

a number of images, potentially of different sizes and proportions, appearing on the screen at the same time does not [by itself] result in montage; it is up to the filmmaker to construct a logic which drives which images appear together, when they appear and what kind of relationships they enter with each other.20

Spatial montage, as Manovich describes it, might describe the content of computational images but it does not structure the experience of computer use: the appearance of a new window or button to press, or the movement of a non-player character (NPC) in a video game, represents a change within the frame intended to produce a momentary spike of attention and thus a user input. As a bidirectional medium, the fundamental experiential unit of the computer—its equivalent of the cut in unidirectional visual media such as film—must be user input rather than a predetermined relation between visual content. Visual elements within the frame of the computer screen always bookend action: one output to motivate input, and then one output to show that the input has registered. A closed folder is viewed on the desktop; a mouse button is double-clicked; a folder opens. A turtle moves towards Mario; the “jump” button is pressed; Mario jumps. This is the fundamental structure of software use. To repeat: when it comes to computational media objects, the equivalent of the cut is input.

If we focus on the image as directed towards motivating user action rather than aesthetic reflection, the complex elevation of execution above presence as the recordable measure of attention can be productively traced from cinema and successive electronic media in terms of the practice of demographics and the integration of user input with this practice. Rather than claim that there is a clean break between cinema, television and video on the one hand and digital media on the other hand, we should elaborate a developmental genealogy through electronic media that culminates in the highly detailed measurement of attention based on instances of user input. The present role of the computer as the emblematic site of data collection in attention economies can be traced through the progressive emergence of new systems for recording user data connected to each successive medium that are increasingly based in user activity. This latter type of data collection is based not on time spent in front of a screen but on instances of user input: mouse clicks and keystrokes. Putting aside software, including websites and video games, we might observe the emergence of interactivity as a mode of attention capture through the increase of user input from the channel-switching, volume and image adjustments (brightness, contrast, saturation) introduced by the television to the time-based inputs (fast-forward, rewind, pause, slow-motion) enabled by video, which culminate in the executable media of software and video games.
A highly instructive example of the convergence of these technologies with the monetization of attention can be seen in the StopWatch service released in 2006 by the manufacturers of the TiVo digital video recorder; this is a technology that combines elements of television, video and computers.21 StopWatch technically both augments and supersedes Nielsen, the predominant provider of audience ratings data, by both recording and making available second-by-second viewing data, including instances of pausing, rewinding and fast-forwarding during each programme.22 Through this particular example, which ties together experiential and technical facets of both spectacular and executable media, it is possible to grasp 1) the massive increase in collectible demographic data and 2) the focus of this data on user actions rather than a simple equation of attendance with attention through which the mass field of media consumers can be cast into highly specialized sets defined by user input.

In the end the question of immateriality as the right or wrong concept through which to analyze and critique the extraction of attentive labor comes down to the methodological centrality of materialist history in one’s analysis. If attention economies function through the internalization of a certain logic of mechanically-ordered images, then cinema cannot be the root technology; we would have to incorporate a history of mechanically-structured, discrete technical vision which, as Kittler has argued, incorporates a range of pre-cinematic technologies including the camera obscura, linear perspective, and the printing press.23 This lack of historical specificity would not be a problem in itself, provided one abandons the belief that attention economies demarcate a distinctive period in the history of capitalism. If we want to acknowledge that today attention economies are a fully-formed and material reality rather than a nascent or even a purely conceptual one, however, then we must come to terms with the fact that computer graphics and computation in general are not the same thing—and that the produced inseparability of the two promises (or threatens) to make ideas and actions functionally interchangeable. Consider for a moment an implication of this turn that proves highly instructive in grasping the modification of critical response it requires: under the material conditions of attention economies the distribution of ‘shock’ images such as the hello.jpg of and the 2 Girls 1 Cup video does not imply the depths of aesthetic, moral or ethical wrongness but rather confers on the distributor the role of ideal citizen, the instigator of a torrent of mouse clicks and keystrokes as the content is viewed, redistributed, commented upon and remixed.

The centrality of computation to actual systems of attention monetization at large in the world today, emblematized by Google’s constant refinement of their search algorithm and their PageRank and DoubleClick technologies through data generated by user activity, suggests that we must bolster our analysis of the visual not only with an understanding of the material systems that enable technical visuality, but also with an understanding of the real subsumption of images made possible by computation. It is these functions of computer technologies, not those that enable image-making itself, which make the discretization, parsing and valorizing of attention possible. If we are to grasp the relationship between visuality and material labor in the era of post-Fordism, it is towards the historical and technical conditions of the action-image that we must look.

Seb Franklin is a writer and teacher based in Brighton, UK. He received his doctorate from the University of Sussex in 2010 and is currently Postdoctoral Research Fellow in the Cultures of the Digital Economy Research Institute at Anglia Ruskin University.



