Four Axes of Rhetorical Convergence

by Anders Fagerjord

This essay presents a theoretical model of genre relations in multimedia. Any text may be described according to the four axes Mode of Distribution (the balance of amount of material and time between authoring and reading); Mode of Restrictions (range and detail in space and time); Mode of Acquisition (the reading process required of the reader); and Mode of Signification (the particular combination of sign systems). Rhetorical convergence is when a text is similar to one genre on one axis and another genre on another axis. However, the model implies that rhetorical divergence may be a better description.

1. Rhetorical Convergence
2. Mode of Distribution: Bandwidth
3. Mode of Restrictions: Canvas
4. Mode of Acquisition
5. Mode of Signification: Sign systems
6. Limits of Technology
7. Rhetorical Divergence?

1. Rhetorical Convergence

How do you describe a Web service? Often, the easiest solution is to compare it to earlier media. In the case of a modern news site, we might say "it is like a newspaper, but you can watch video there, and it is constantly updated."

As we recognise a Web site as similar to several earlier media, it is the patterns and styles of writing, of photographing, of editing we recognise. All these different kinds of details may be covered with the term rhetoric. Earlier, I have suggested the term rhetorical convergence for Web sites that in this way are "between" earlier media ("Rhetorical Convergence").

Rhetorical convergence is the combination in one medium of rhetorical forms or devices that were earlier only seen in separate media. The present essay is an attempt to chart the ease or difficulty with which rhetorics in different media combine. Which forms may be merged, which have to be altered, and which are mutually exclusive? I will propose a multidimensional model of media rhetoric, collected in four "axes" that enable an overview of the different convergences of media forms.

Along the way we will examine a long series of examples, all perceived to be mixing rhetorics of several established media. Each example shows a form very similar in one respect to what we are used to seeing in an earlier medium, but in another respect just like another earlier medium; just as a daughter may be the spitting image of her mother, yet no one can deny she has her father's eyes. To gain an overview, I have collected all such similarities "in one respect" and "in another respect," and tried to reduce the number of "respects" by assembling them into groups and constructing a theoretical model.

This is done regularly within statistics; one may for example measure a set of objects in many ways with many different instruments. For the statistician, each reading would then be one dimension. Using mathematics, the number of dimensions may further be reduced, for example to three, by projecting the measurements down to three dimensions. The objects may then be compared using only these three numbers, a much more manageable task.
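The reduction described here can be illustrated with a minimal sketch. It is only an illustration of the statistical analogy, not a part of the model proposed in this essay. Statisticians would normally let a method such as principal component analysis choose the projection; the sketch below swaps in a much simpler technique and averages fixed groups of readings. The function name `project`, the grouping in `GROUPS`, and all the readings are hypothetical.

```python
# Illustrative only: six readings per object are reduced to three composite
# dimensions by averaging fixed groups of readings. The grouping and the
# numbers are hypothetical assumptions, not data from any study.

def project(measurements, groups):
    """Collapse one object's readings into one averaged value per group.

    measurements: list of numeric readings, one per instrument.
    groups: list of index lists; each inner list names the readings that
            are merged into a single composite dimension.
    """
    return [sum(measurements[i] for i in idx) / len(idx) for idx in groups]


# Readings 0-1, 2-3 and 4-5 are assumed to measure related things.
GROUPS = [[0, 1], [2, 3], [4, 5]]

objects = {
    "object_a": [2.0, 4.0, 1.0, 1.0, 9.0, 7.0],
    "object_b": [3.0, 5.0, 0.0, 2.0, 8.0, 8.0],
}

# Each object is now described by three numbers instead of six.
reduced = {name: project(values, GROUPS) for name, values in objects.items()}

for name, point in reduced.items():
    print(name, point)
```

The two objects differ on all six readings, but after the reduction they differ on one dimension only, which is what makes the comparison manageable.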

When Web sites are compared to texts in earlier media, the many aspects that we will discuss below cannot always be expressed quantitatively. I am not proposing a mathematical or statistical model; I am using statistical methods as an analogy for the process of collecting related aspects into larger categories.

The approach is not without precedent in studies of texts either. Barthes' method of textual analysis, developed over a number of essays and perfected in S/Z, consisted of dividing the text into small units, and registering the presence of five co-present "codes." There are several different sub-types under each code. Chronology, descriptions of time and place, and common sayings are, for example, all coded as the "referential code" as they all refer to common knowledge within a culture. The code of actions consists of all the diverse actions performed by the characters in a story. To explain this, Barthes used the analogy of a polyphonic musical score, where several voices develop relatively independently in the same piece. Genette, on the other hand, used the term dimensions in The Architext: An Introduction when arguing that genres are recognised by (at least) shared themes, modes of enunciation and "formal 'medium' of imitation (in what language, in what meter, etc.)" (78), aspects that in their turn are made of different parts or dimensions, such as different meters or different enunciative modes (of which all of narratology is just one part).

I will use the analogy of "axes" for this way of ordering the large number of different qualities a text has, so it is easier to get an overview of all the details. Still, it is not my ambition to reduce the complexities of architextual references we here call rhetorical convergence. Rather, I want to present a perspective that enables us to see the full range of complex relations, just as Barthes did in S/Z and Genette in Architext.

In constructing the following model, I have primarily relied on textual analysis of Web sites, but earlier literature has been an important source of inspiration. One cannot read much into the topic of computer media without noticing all the deliberations on two themes I will summarise as multimedia and interactivity. Under the heading multimedia I put texts about the computer's ability to combine different sign systems such as writing, images, video and music. [1] Although the term interactivity is contested, I use it here as a heading for literature concerned with how computer texts respond to user input. [2]

Writers of hypertext theory and Web design [3] often comment on the fact that hypertext systems such as the World Wide Web are simultaneous media that allow for continuous updates and alterations, and may allow users to comment on or modify texts. This can be treated in relation to the stress on live broadcasting within radio and television that is theorised within media studies [4]. A Web site may be live and archive at the same time.

More practical writers on Web design [5] never fail to stress how Web authors are constrained by technical limitations such as bandwidth, screen resolution and colour depth. We will see that much of the "interactivity" in Web sites is actually a remedy to compensate for low screen resolution and bandwidth.

I have chosen to project the many dimensions of Web rhetoric onto four axes. Multimedia, interaction, "liveness" and technical limitations on the Web were refined into four axes of rhetorical convergence and labelled signification, acquisition, distribution and restrictions. They are called "axes" for several reasons; the first is that they are co-present. Texts and genres are characterised by how authors have chosen to combine these four aspects. They are different sides of a text, different dimensions. Furthermore, each of these four aspects is a range of choices, on which texts will position themselves differently. A broadcast is either live or not. A text may be written, spoken, or both. When discussing the four axes below, I will discuss different modalities, or positions, along each axis, and a text's position will thus be called its "mode of" distribution, acquisition, restrictions, and signification.

I propose these four axes as a projection, a way of getting an overview over the many dimensions of rhetoric. I think of rhetorical convergence as taking place in a four-dimensional space (to the extent I am able to visualise such a construct). When we discuss popular media what we tend to discuss is really their typical genres. Examples from television might be the news show, the serial, the soap opera and the talk show, from newspapers, the report, the feature, and the op-ed. Each of these media-specific realisations of different genres would occupy a unique space in my four-dimensional space if their modalities along the four axes were mapped. Media texts that combine aspects of different genres would then position themselves between the positions of established media, similar to one genre along one axis, similar to another genre along another axis.

We may thus rephrase rhetorical convergence as being a form from one medium or genre that is copied in another medium, but where the copy differs from the original by substituting its mode on one or more of the axes with the mode of a form we recognise from a different medium's rhetoric.
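This rephrased definition can be sketched as a small data structure. What follows is a minimal sketch under strong simplifying assumptions: each axis is reduced to a single label, although the axes are really ranges of choices, and the class `TextProfile`, the functions `shared_axes` and `is_convergent`, and every mode label are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TextProfile:
    """A text's (or genre's) mode on each of the four axes, as a plain label."""
    distribution: str
    restrictions: str
    acquisition: str
    signification: str


def shared_axes(text, genre):
    """Return the names of the axes on which text and genre have the same mode."""
    return {
        f.name
        for f in fields(TextProfile)
        if getattr(text, f.name) == getattr(genre, f.name)
    }


def is_convergent(text, genre_a, genre_b):
    """True if the text resembles genre_a on at least one axis where it does
    not resemble genre_b, and resembles genre_b on at least one axis where it
    does not resemble genre_a."""
    like_a = shared_axes(text, genre_a)
    like_b = shared_axes(text, genre_b)
    return bool(like_a - like_b) and bool(like_b - like_a)


# Hypothetical mode labels, chosen only to make the comparison visible.
print_news = TextProfile("periodic", "page", "reading", "writing and stills")
tv_news = TextProfile("continuous", "screen", "viewing", "video and speech")
web_news = TextProfile("continuous", "screen", "reading", "writing and stills")

print(sorted(shared_axes(web_news, print_news)))
print(sorted(shared_axes(web_news, tv_news)))
print(is_convergent(web_news, print_news, tv_news))
```

In this toy example the Web newspaper shares two axes with the print newspaper and the other two with television news, so it counts as convergent, while the print newspaper compared with itself does not.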

Before the computer, the mode of distribution, mode of acquisition, mode of restrictions, and mode of signification were usually given by technological constraints or conventions of the medium. What makes rhetorical convergence an interesting study object is that in a computer medium, few of these aspects of rhetoric are given. Instead they are ranges of choices open to the author.

2.1. Mode of Distribution: Bandwidth

Let us use the terms creation and distribution to describe the process of getting the messages from author to audience, a process that takes time. Authors are aware of the time it takes, and this influences media rhetoric. Newspapers are printed during the night and distributed to subscribers and newsstands in the early morning. It is a carefully timed process, and all journalists have to deliver within deadline or the newspaper is delayed. In broadcast media, the distribution from sender to receiver is almost instantaneous, but the creation of the broadcast takes a lot more time. Recording and editing sound or video may take a lot of time, and the raw material has to be transported to the editing facility, and then to the sender facility. For fast evolving news, the first reports are likely to be read from studio or reported on telephone from a reporter in the field, and only in later newscasts does edited video material appear. That creation and distribution times determine the rhetoric is true for all media. But the Web is fundamentally different from earlier media as its distribution time is variable.

A Web page is not loaded before the reader asks for it, by typing in a Web address (a URI) or following a link. Following Jan L. Bordewijk and Ben van Kaam's typology from the article "Towards a New Classification of Tele-Information Services", the Web's traffic pattern is consultation, as it is a relation of one sender (the server of the page being read) to (potentially) many receivers, and the receivers initiate the transaction. In contrast, the traffic pattern of earlier media is transmission in this model. In order to see how this difference makes Web rhetoric different, it is useful to look at three different aspects of distribution which we will call bandwidth, latency, and permanence.

As the Internet is a packet-switching network of many different networks with different standards, the capacity of transmission, known as bandwidth, varies enormously. If the files that are transported over the net are small, however, the user may not necessarily notice the bandwidth difference; research in Human-Computer Interaction has shown that we experience everything that takes less than about a second as instantaneous (Nielsen 42-44). For static pages (without sound or moving parts), Nielsen's studies conclude that we normally do not care much about differences of a couple of seconds either (44). But some sign systems are slower to distribute than others. When different sign systems are coded in computers, the file formats take up different amounts of space. Text and numbers are coded extremely economically. Images are coded less economically, sound takes up more space than images, and video takes the most space. Images, sound, and video may be compressed to be less voluminous, at the expense of image or sound quality. Heavy compression also requires more computer power, both to "wrap for shipping" through the net and to "unwrap" or "deflate" at the receiving end. A slow computer will thus slow down the display of compressed media, but again, as long as the decompression takes less than a second, the reader won't notice.
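The arithmetic behind this paragraph can be made concrete with a small sketch. The one-second threshold is the "about a second" figure cited from Nielsen; everything else is an assumption made for illustration: the file sizes, the two connection speeds, and the names `transfer_seconds` and `feels_instant` are hypothetical, and the transfer time deliberately ignores latency and protocol overhead.

```python
# "About a second" is the perception threshold cited from Nielsen above.
PERCEPTION_THRESHOLD_S = 1.0


def transfer_seconds(size_bytes, bandwidth_bits_per_s):
    """Time to move a file over a line, ignoring latency and overhead."""
    return size_bytes * 8 / bandwidth_bits_per_s


def feels_instant(size_bytes, bandwidth_bits_per_s):
    """True if the transfer stays under the perception threshold."""
    return transfer_seconds(size_bytes, bandwidth_bits_per_s) < PERCEPTION_THRESHOLD_S


# Hypothetical file sizes in bytes, ordered as in the text: writing is the
# most economical sign system, video the least.
SIZES = {
    "page of writing": 4_000,
    "photograph": 60_000,
    "minute of sound": 500_000,
    "minute of video": 5_000_000,
}

# Hypothetical line capacities in bits per second.
MODEM = 56_000
BROADBAND = 2_000_000

for name, size in SIZES.items():
    print(
        f"{name}: modem {transfer_seconds(size, MODEM):.1f} s, "
        f"broadband {transfer_seconds(size, BROADBAND):.1f} s"
    )
```

With these assumed numbers only the page of writing feels instantaneous on the modem line, while the photograph does so on the broadband line as well.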

A Web designer who is aware of this fact will thus face choices that are unique to computer communication. Many pictures in a magazine or quality footage in television may slow down the production process and bring up production cost, but the images always arrive in the newsstand with the rest of the magazine. Television would be a very different experience indeed if we had to wait for a video segment in television news, while a talking head was instantaneous. In Web television, this may very well be the case, so in Web design, "heavy" or "slow" elements such as big colour pictures or video have to be used sparingly if the pages are meant to be appreciated by any Web surfer.

The "interactive feature" "Sights and Sounds of the Way West" in Nationalgeographic.com provides an example of an aesthetic grown out of bandwidth concerns. "The Way West" is a movie created solely of still photographs that are "made moving" as if filmed by a movie camera that moved over each image, zooming in or out. The projection of the show is timed to a soundtrack with music, sound effects and narration. Animating still images by moving across them would only be used in mainstream television when there is no video footage available, as is often the case in history documentaries. In "The Way West", on the other hand, the whole feature is made in this manner. In the introduction, for example, photographer Jim Richardson, who is also the narrator, introduces himself in voice-over, illustrated with two shots of him doing field work. In television, it would be rare indeed not to show the moving image of Richardson speaking. On the Web, on the other hand, it makes more sense to do it in this way, as the still images made moving can be large, fine-grained and have rich colour while keeping the movie files relatively small. Video would take up much more computer memory, and thus require more bandwidth (or patience during download) to be played.

The introduction sequence of "The Way West" shows the influence of bandwidth concerns on rhetoric in more detail. "The Way West" is made in Flash, and requires that a fairly large file be downloaded before one can start watching. To cover the download time, the authors have taken advantage of the fact that writing loads faster than recorded sound and colour photographs. While the download is still in progress, an animated title sequence appears.

Opening screen of "The Way West".

The title screen shown above appears one word at a time. After a while (how long depends on the download), the letters are animated as "blowing away," accompanied by wind sounds.

Words animated as "blowing away" from the opening screen.

A written introduction to the sequence then appears three lines at a time. The appearance is timed to the download, and on a fast connection, the next three lines will appear a little before one is finished reading the previous at normal reading speed.

The three first lines of the written page.

The finished text page.

A little slower than a television introduction, this "splash sequence" as it is called in industry lingo cleverly fills the download time with interest, and at the same time makes a smooth transition to the moving images that are to follow.

This is just one example of how slide-motion films such as "The Way West" stand midway between television and print rhetoric. The introduction contains more writing than is normal on television, but it is dynamic, it appears over time. In the body of "The Way West", narration resembles radio (as it with few exceptions is totally comprehensible without the images), but is illustrated; the illustrations are not video but still images made for print; yet they are not completely still, as they are viewed through a moving frame. "The Way West" is not exactly print, radio, or television, but borrows a little from each; and it makes a reasonable compromise of bandwidth concern and television rhetoric.

The concern for bandwidth is principally about reducing the time readers have to wait for a page to load. I call this waiting time latency, and it is worth considering in itself.

2.2. Mode of Distribution: Latency

Radio and television - the broadcast media - eliminated distribution time when they were introduced. Radio or television signals travel from sender to receiver in almost no time, and in the years before storage media such as magnetic tape were introduced to broadcasting production, there was no delay between creation and broadcast either. Nowadays, most television programs are broadcast from an edited tape, but the possibility (and the history) of broadcasting live is still what gives television both its particular aesthetic, and its particular place in our lives, according to Raymond Williams, John Ellis, and Umberto Eco (Kant and the Platypus), to mention just a few.

We saw in 2.1 that bandwidth is a concern to all Web designers, as the Web is slow for many users. Still, live broadcast is an absolute possibility on the Web. In order to see the differences and similarities between broadcasting and Internet media, we have to consider two different measures of network speed. Computer scientists speak of bandwidth and latency as two related, but not identical features. Bandwidth, as we saw above, is a line's capacity to carry large amounts of information. Latency is the time it takes to transmit a single packet over a certain line. A simplified explanation could be that latency measures the time it takes before the user gets the very first part of a message, while bandwidth measures the time it takes before the user gets the very last part, and the message is complete. A physical line has the same bandwidth all the way, while latency increases the longer the line is [6]. An example may clarify the distinction: With popular "streaming media" formats such as Real Media, Windows Media or QuickTime, it is possible to broadcast live video even over narrowband modem connections. The image will be smaller and with less detail than what would be possible on a broadband connection, but the latency is short enough that it is reasonable to call it live.
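The simplified explanation of the two measures can be put into a few lines of code. This is a minimal sketch of that simplification only: the function name `arrival_times`, the message size, and the figures for the two lines are hypothetical, and real networks add effects (queuing, retransmission) that the sketch leaves out.

```python
def arrival_times(latency_s, bandwidth_bits_per_s, size_bytes):
    """Return (first, last): when the first and the last part of a message arrive.

    The first part arrives after the line's latency; the last part arrives
    once the whole message has also been pushed through at the line's bandwidth.
    """
    first = latency_s
    last = latency_s + size_bytes * 8 / bandwidth_bits_per_s
    return first, last


MESSAGE_BYTES = 7_000  # hypothetical message size

# A short, narrow line: low latency, low bandwidth (assumed figures).
short_narrow = arrival_times(0.1, 56_000, MESSAGE_BYTES)

# A long, wide line: higher latency, high bandwidth (assumed figures).
long_wide = arrival_times(0.6, 2_000_000, MESSAGE_BYTES)

print("short narrow line:", short_narrow)
print("long wide line:   ", long_wide)
```

On the short narrow line the first part arrives sooner but the complete message later; on the long wide line it is the other way around. This is why a narrowband stream can still be felt as live: what matters for liveness is how soon the first part arrives, not how much can be carried per second.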

Both latency and bandwidth may determine the rhetoric chosen, as authors strive to match their texts to the expected bandwidth and the latency they wish to obtain. The "Web TV" made by Yahoo! FinanceVision is an excellent example of this.

Until 2001, FinanceVision was a live "Webcast". It was a page where a video pane showed live streaming video of a talk show about finance and stock markets in fairly traditional television style, while stock quotes and other investor information appeared in another pane whenever a company was mentioned. FinanceVision's creators chose to make it a live show, maybe to capture a fast-moving market, maybe to obtain connotations of "being there when it happens," maybe both. The visual style and the semi-scripted conversation closely resembled what can be seen on television every day in a lot of countries. At the same time, keeping the clothes of the presenters in few, clear colours and having a similarly clean and monochrome studio made video compression more effective, as there was less information in the image. Few camera movements and edits, and programme hosts sitting fairly still added to this. By ensuring there is as little difference as possible from one frame of the video to another, compression algorithms work more effectively to get high quality video over low bandwidth, while ensuring a short enough latency to call it live.
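The effect of small frame-to-frame differences can be demonstrated with a toy experiment. The sketch below is not the codec FinanceVision used, which is not known here; it swaps in byte-wise frame differences compressed with Python's general-purpose `zlib` module as a stand-in for real video compression. The frame dimensions, the two synthetic "videos", and the function names are hypothetical.

```python
import random
import zlib

WIDTH, HEIGHT, FRAMES = 32, 24, 10  # a hypothetical, tiny greyscale video
PIXELS = WIDTH * HEIGHT


def delta(prev, cur):
    """Byte-wise difference between two frames (zero where nothing changed)."""
    return bytes((c - p) % 256 for p, c in zip(prev, cur))


def compressed_size(frames):
    """Size in bytes of the first frame plus all frame-to-frame differences,
    after general-purpose compression."""
    payload = frames[0] + b"".join(delta(a, b) for a, b in zip(frames, frames[1:]))
    return len(zlib.compress(payload))


rng = random.Random(0)  # fixed seed so the experiment is repeatable

# "Calm" studio: a flat background where only a few pixels change per frame.
calm = [bytes([128]) * PIXELS]
for _ in range(FRAMES - 1):
    frame = bytearray(calm[-1])
    for _ in range(5):
        frame[rng.randrange(PIXELS)] = rng.randrange(256)
    calm.append(bytes(frame))

# "Busy" scene: every pixel takes a new random value in every frame.
busy = [bytes(rng.randrange(256) for _ in range(PIXELS)) for _ in range(FRAMES)]

print("calm video:", compressed_size(calm), "bytes")
print("busy video:", compressed_size(busy), "bytes")
```

Both videos contain the same number of pixels, but the calm one compresses to a small fraction of the busy one, because most of its frame differences are zero.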

Stig Hjarvard defines live as the near simultaneous events of media creation, broadcasting and consumption. Near simultaneous, for in fact, there will always be some latency - for satellite broadcasts, it may approach a second. Following this definition, Web cameras are also "live". Web cameras take snapshots of a scene at regular intervals, for example, every minute, and the latest image is always available from the server connected to the camera. Many Web camera pages have text next to the camera pane reading "watch place x now!", and the parallel to television cameras used for surveillance is easily drawn, as Bolter and Grusin have treated in depth. Bolter and Grusin also point out how similar the promises of Web cameras are to television's aura of being live.

But if a Web camera is "live", then any Web page is live, as it is available from the moment it is put on a server. During the high-profile Orderud murder trial in Norway in 2001, tape recorders were prohibited in the courtroom. Bypassing the ban, a journalist of the Web newspaper VG Nett did his best to type everything that was said throughout the trial. Every time he finished a couple of sentences, he pushed a button, and the new lines were sent via a wireless Internet connection and added to the transcript on VG Nett's Web page. By Hjarvard's definition, this is live, as creation, distribution and consumption may be almost simultaneous. What differs is that the events and what is said are mediated by a journalist, and written by him on a keyboard. This is a slower and more filtering process than a video camera, and outside what is normally thought of as live.

These examples show that latency is a variable on an ungraded continuum. The shortest latencies will be felt as "live", but what the border values are cannot be pinpointed exactly. Towards the other end of the continuum, we find delayed release. Up to this point we have written as if media texts are always distributed as soon as they are ready, which of course is not always the case. Most earlier media are periodic, and a story has to wait to be published in one "issue", whether it is a printing or a special segment in a broadcasting schedule. On the Web, a story that is kept from immediate publication can be released at any point in time; whether to publish at the same time every day or week, whenever material is ready, or even "live" is up to the author to choose.

Latency is thus the sum of two parts: the time from event to broadcast, which in the example of the live show was (almost) none; and the time from when the user enters the page until the video appears in the video pane. The first, which we may call creation latency, is a rhetorical choice the authors made. Transmission latency, the other part, is a physical limit of the line, the file sizes, and the speed of the computers involved.

Like bandwidth, the transmission latency needs to be considered by Web authors if they want to achieve a low latency overall, such as live or almost live distribution of an event. The (rhetorical) choice of whether to transmit live or not may also be a consequence of the third aspect of distribution: permanence.

2.3. Mode of Distribution: Permanence

A certain kind of live broadcast that is very widespread on television becomes very peculiar on the Web: the practice of interviewing someone live during a newscast. What is interesting for our present discussion is not the interview as such, but the fact that it is conducted live during the newscast, while all the information normally has been known for several hours. Very little news happens to take place in front of a camera team during the newscast's programme slot. Live interviews in television newscasts appear to be very current, bringing the very latest information, but in fact they only bring information that has been known for quite a while, at least by the news desk. More important than the actual freshness of what is said is the appearance of freshness. Live interviews project an image of the television news desk as "standing in the middle of the stream of current affairs," as Hjarvard puts it.

A live broadcast needs an audience to be worthwhile, however. It does not seem likely that a Web news site would put up a Web camera and post a journalist in front of it, so he could be interviewed live whenever someone logged on to the server. Live streaming on the Web is thus normally reserved for special, heavily marketed events such as sports events or important speeches. Unlike television news, live Web events have to be advertised over a time period in order to attract an audience. On television, the audience shows up every night to watch the news without being asked. The evening news is a performance, a spectacle of announcing the events of the day, heightened by live reports from studio and locations around the world, performed for the audience that has gathered. This audience has gathered because it has little choice; if they turn on the television half an hour later, the news is over.

It is this lack of permanence that is the source of television's peculiar rhetoric of live interviews. After the interview is finished, it is too late. The live interview is a way to make up for the lack of permanence by reducing latency. But this is a trick, a sliding in story time. What actually happened earlier is turned into an event that happens now; or rather, an event is staged, whose sole content is an earlier event. On the Web the recording of an interview may be stored and offered as video-on-demand to readers. It may become a (relatively) permanent offer. To stage a live conversation at a point in time with only a random relation to the time of the actual event makes no sense in a technology allowing for permanence.

What technologies allow for permanence, then? Print is in principle permanent, at least until the paper withers away. In practice, however, only libraries are able to keep complete collections of periodic publications. It seems that in earlier media, permanence goes hand in hand with latency, both being results of the distribution technology. The shorter the latency from sender to receiver is, the shorter the permanence. A live broadcast is gone when the broadcast is over, while books are kept in the bookshelves. For print products, the user decides whether an item should be saved or thrown away; but few people collect magazines, even fewer save newspapers longer than about a week, while bound books are rarely thrown away.

In Web sites, permanence is a choice of the author. A Web server is a public repository of files, and any new file may be archived forever if the author wants it. There are no longer connections between latency and permanence, as a live video feed may be available from an archive immediately after the live event has ended. Most periodic Web sites make use of this possibility and keep extensive archives of older material.

When whether to keep an archive becomes a rhetorical choice, a rhetoric of the archive becomes a possibility, although not all sites take the opportunity. In newspapers, few news reports stand very well by themselves, as most news stories are follow-ups of earlier stories, bringing new developments, comments or reactions to issues already reported on. When an old news article is pulled from an archive, the original context is often lost, and it requires more thinking and guessing from the reader to understand what is reported. The requirement of background knowledge is well known within journalism, however, so many stories will contain a summary of earlier events, typically at the end of the article. Thus, for readers of archived reports, the problem of lack of background may be reversed: it is common to find several articles with almost identical summaries of the background. What is regarded as significant also changes over time. Not everything deemed newsworthy in the day-to-day running of a news desk stands the test of time, and in hindsight, it is easy to see that some developments are more important than others. Writing history means to summarise, and in an archive, the lines of developments may be lost in the details. (Not to mention the many "related articles" that are found in some automated archives that turn out not to be related at all.)

In contrast stand those periodic Web sites that keep background material, overview articles, biographies and timelines in one place, and consistently link to this material whenever these issues are reported on. MSNBC's "interactive" on Palestine, titled "Searching for Peace", is an example of this. From a central timeline, a large number of maps and overview articles about Israel and Palestine are available, as well as biographies of key players in international Middle East politics. Another example that stands out (despite being old by Web standards) is Nationalgeographic.com's 1996 feature "Gaza", where an article about Gaza is linked to parts of earlier articles from several years of National Geographic Magazine with great effect. What makes both these sites different from mere archives of earlier articles is that they are edited as overview sites, and care is taken to provide overview and understanding and to avoid redundancy or excessive detail.

These few examples demonstrate the whole new range of rhetorical possibilities that opens with a technology that enables the author to choose how long a text should be available. Permanence, together with bandwidth and latency, forms the axis of distribution.

2.4. Distribution and Rhetorical Convergence

A Web form is perceived to be the result of a rhetorical convergence if it blends rhetorical properties of two or more genres and media. The axis of distribution is one set of rhetorical properties: it describes how a text is transported from author to reader. It has two related dimensions which find a balance in each text: the kind and amount of material to be transported; and the time between writing and reading. (In the case of reports of real events a third dimension is added: the time between event and writing.) The axis of distribution is useful for describing certain kinds of rhetorical convergence, as may be expressed in this way:

A Web form with a balance of amount of material and time between authoring and reading similar to one genre in one medium, but in other respects similar to another genre in another medium is perceived as a result of rhetorical convergence between the two.

We see this in many popular Web media: continually updated Web newspapers do not change all articles from one edition to the other, but add new stories whenever they are ready, in a way we recognise from news channels on TV or radio. When recorded newscasts are offered as "video-on-demand", they appear more like print newspapers, as they may be "read" at any time. Bandwidth concerns tend to find compromises between video and print practices, as when moving images are replaced by stills in a documentary, or when a small video pane is placed on a page surrounded by paragraphs of writing, as an illustrative image would be.

Genre theorists will tell us that any text is understood or recognised via its affinities with texts we have encountered before. Rhetorical convergence is when we recognise a single text as being similar to other typical texts of two or more different media. A Web newspaper is an example of rhetorical convergence, as both writing and lay-out are very similar to print, but its mode of distribution is broadcasting-like. News in newspapers and television is compared to Web newspapers and video-on-demand in the table below. By registering aspects from the axes of distribution and signification, we see clearly how Web newspapers and video-on-demand may be seen as results of rhetorical convergences of print and television news [7].

                  Mode of Distribution                   Mode of Signification
                  Transmission latency   Permanence      Dominant sign systems
Print newspaper   8 hours                approx. 1 week  writing, still images
Web newspaper     ≈ 0                    3 years         writing, still images
Web video         ≈ 0                    3 months        moving images, speech
Television news                                          moving images, speech

3.1. Mode of Restrictions: Canvas

The second of the four axes is the mode of restrictions, the limits of a text's "canvas." We will use the term canvas for the material that the signifiers are made of, or in the words of Umberto Eco's Theory of Semiotics, the continuum within which the signs are shaped (217). Let us initially use the term for the paper in print, the silver screen in cinema, the loudspeakers of a radio, or the screens of television sets and computers. When these membranes are written upon or vibrated to produce sound, they have different physical qualities. Wide-screen cinema has far better image resolution and colour range than television. This has implications for the rhetoric; cinematographers may photograph subtle images that would be wasted on TV, as the subtlety would be invisible.

The properties of the canvas are based in technical limitations and possibilities. Why introduce a special term, canvas, for what seems to be merely technology? The answer is that the canvas is not just the technology in use. In using a painter's canvas as a metaphor, I want to direct attention both to the physical properties of a technology, as in the way painting on canvas is different from painting on paper or a dry wall, and to the choices the author makes, as when a painter cuts the canvas to the desired proportions before painting, thus setting the maximum size of the image.

These choices of the author are restrictions. Just as a painter's canvas is a reduction from the original piece of canvas, any text is realised within a subset of the possibilities the technology offers. Each medium has some fixed standards, some flexible ones. The frame in cinema is normally limited to a few standard ratios of height and width (American, wide-screen, et cetera), while the length of a feature is much less standardised; most feature films are roughly between 90 and 150 minutes long, but there are many exceptions. Television screen ratios are even more fixed than cinema's, while books can be of virtually any size a person can handle without special tools.

Thus, the author sets many restrictions for the text. It appears to me that we form an overview of the many properties of the physical appearance of a text by seeing it as existing within a limited range and having a limited number of details. The limits of range and detail operate both in space and time. Range is the distance between the outer limits of a continuum for sign-production. A visible text is restricted within a vertical and a horizontal range. In addition, it has a range of contrast, the span between the colours that contrast the most. Sound has little spatial existence, but is nonetheless restricted at any point in time by a frequency range (the span between the highest and lowest frequencies possible) and a dynamic range (the span between the softest and the loudest sounds).

The range used in a text is often less than the largest range technology allows. In most Web sites, the area used in any page is significantly smaller than my computer screen. This is normal practice, in order not to annoy those readers who have small screens. Just as Web designers need to design for different bandwidths, they also need to allow for the different screens used by their target audience.

Within the range(s), there is a lowest level of detail, that is, there is a lower limit to how small a part can be, which also sets an upper limit to the number of physical parts. In computer images, these are called resolution (the largest possible number of details within a range) and colour depth (the number of possible colours). In painting too, which does not have any graded differences in the same manner as computer graphics, there is a limit to how fine a line is, set by the paint, the surface, the brushes and the painter's technique. These limits are also rhetorical choices: if the painting is a ceiling decoration, for example, the smaller details are not called for, as all viewers will stand several meters away from the artwork.

Computer screens have very low resolution compared to cinema and print [8], and this puts a limit on the use of detailed graphics. Much of what is called "interactivity" consists of rhetorical devices designed to reduce or bypass this disadvantage. One example is the "Congo Trek" site made by Nationalgeographic.com. The site is an archive of more than seventy-five letters from conservationist Michael Fay, who spent fifteen months walking across the rainforest of central Africa. A map of the route Fay followed contains links to all the seventy-five letters, but the necessary details of the map are not visible on a computer monitor. To remedy this problem, the authors programmed a "magnifying glass" that the reader can direct over the map to reveal the details.

Screenshot of the "Congo Trek" Web site

Range and detail also have a temporal element. A dynamic text has a range (time span), and a number of possible changes within this range (the frame rate in film, sample rate in recording, screen refresh rate for computer monitors, polygons per second in graphic computer games).

The temporal range is often a clear indicator of genre. A feature film or novel is supposed to be of a certain length, while an American television sitcom like Friends is adapted to a half-hour playing time, made to be interrupted by commercial messages. In computer media, temporal detail (frame rate, for example) is often a trade-off between technical quality and bandwidth concerns, a part of what we called mode of distribution above. A high frame rate is necessary to show fluent motion, but it also drives up the file size. It is thus a necessary decision for a Web author, and one that will have rhetorical consequences. The table below places some technical terms that may describe a text's mode of restrictions in relation to range and detail in space and time.




|       | Range                        | Detail                               |
| Space | Aspect ratio, contrast range | Colour depth, sample depth           |
| Time  |                              | Frame rate, refresh rate, sample rate |
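
The dependence of file size on temporal detail, noted above, is simple arithmetic, and a rough sketch in Python may make it concrete. The frame size, colour depth and frame rates used here are assumed values picked for illustration, not measurements from any of the sites discussed, and the calculation ignores compression, which Web video normally relies on.

```python
def uncompressed_bytes_per_second(width, height, bits_per_pixel, frames_per_second):
    """Raw (uncompressed) data rate of a video stream, in bytes per second."""
    bits_per_frame = width * height * bits_per_pixel  # spatial range and detail
    return bits_per_frame * frames_per_second / 8     # temporal detail


# Assumed example values: a small video pane with 24-bit colour.
WIDTH, HEIGHT, COLOUR_DEPTH = 320, 240, 24

# The same canvas at two levels of temporal detail (both rates are assumptions).
rate_at_25 = uncompressed_bytes_per_second(WIDTH, HEIGHT, COLOUR_DEPTH, 25)
rate_at_5 = uncompressed_bytes_per_second(WIDTH, HEIGHT, COLOUR_DEPTH, 5)

print("25 frames per second:", rate_at_25, "bytes per second")
print(" 5 frames per second:", rate_at_5, "bytes per second")
print("ratio:", rate_at_25 / rate_at_5)
```

Under these assumptions, cutting the frame rate from 25 to 5 frames per second divides the raw data rate by five. Motion becomes visibly less fluent, which is why the choice is rhetorical as well as technical.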

3.2. Canvas and Rhetorical Convergence

A Web form is perceived to be the result of a rhetorical convergence if it blends rhetorical properties of two or more genres and media. The mode of restrictions is one set of rhetorical properties. It describes the canvas: the range and detail in time and space available for sign production in a text. Rhetorical convergence may be perceived through a shared canvas:

A Web form with a canvas similar to one genre in one medium, but in other respects similar to a genre in another medium is perceived as a result of rhetorical convergence between the two.

A unique feature of computer texts is that they are flexible, a fact that has been deliberated at some length by, for example, Jay David Bolter in Writing Space and Nicholas Negroponte in Being Digital. A consequence of this is that a text may have a flexible canvas: it may vary between different modes of restrictions. Most file formats have not set all the restrictions we have discussed, but let the author specify them for each new text. While, for example, the frame rate of television is a given, it may be different in two subsequent computer videos, or even change during the course of a film. Some restrictions may be left for the reader to decide; so-called "fluid" designs will fit the computer window whatever width or height the user has set it to. The possibilities of combining modes of restrictions from different media's rhetorics thus seem virtually unlimited. A few examples are given here only to demonstrate rhetorical convergence along the axis of restrictions.

In 2003, the visual quality of computer displays is so poor compared to paper that there are very few visual forms from these earlier media that can be reused on the Web without changes. The only typography from paper that can be recreated on the Web is that of small formats: newspaper layout would be impractical on the Web, as it would require a lot of scrolling both horizontally and vertically [9]. Almost any Web form derived from print would be perceived as converging with computer rhetoric (or computer interface conventions, which is an alternative term) because of the difference in canvas. The "magnifying glass" from "Congo Trek" discussed above is clearly related to the "locator" interface in the popular image editor application Adobe Photoshop.

"Congo Trek" is the Web version of the story about Michael Fay's expedition. The same year, Fay's story was also told in National Geographic Magazine and in a televison documentary in the National Geographic Explorer series. The table below renders aspects of three of the four axes when comparing the map in the "Congo Trek" Web site with the similar maps in the National Geographic Magazine and the television program. Although the Web site has a canvas with even less detail than a television screen, it uses a map almost as detailed as the one in print. To make the map readable, a mode of acquisition ("display control", that is, the magnifying glass) from computer media is used, exemplified in the table with Acrobat Reader, an application used to view on a computer documents designed for print. Like Photoshop, Acrobat reader uses a "magnifying glass" as an interface for moving images larger than the screen. "Congo Trek" is thus a convergence of three different forms. The flexibility of computer technology opens for a multitude of positions between established media's modes of restrictions. Web camera pages are a good example, as they are midway between films and still images, technologically speaking. A Web camera is in fact a video camera, but it is connected to a computer that is programmed to copy the camera's image at a much lower temporal detail (frequency or frame rate) than video has, so it will not be perceived as moving. Still, it is a dynamic image, it does change over time. All earlier media are either dynamic to the degree that they have enough temporal detail to make us perceive continuous movement or sound, or else they are still. The computer has opened up a whole range of rhetorical possibilities between these two perceptual poles.


|                    | Mode of Restrictions  | Mode of Signification              | Mode of Acquisition       |
| Television program | 720x576 pixels        | Small maps, few details, animation |                           |
| Web site           | 351x280 pixels        | Large map, many details; writing   | Reading + Display control |
| Magazine           | 12000x16200 dots [10] | Large map, many details; writing   |                           |
| Acrobat Reader     | 1280x1024 pixels      | Large map, many details; writing   | Reading + Display control |

4.1. Mode of Acquisition

Among the established media, there are marked differences in how they are read. Cinema is viewed collectively in one sitting from beginning to end, in a darkened, public theatre. A newspaper is read privately, can be taken anywhere, put down and taken up again, flipped through and read in any sequence. Few readers read the whole paper; instead, most readers rely on headlines, leads, images and captions to decide what to read. We will discuss such differences here under the heading mode of acquisition.

Mode of acquisition is how the reader accesses the signs of a text. Again, this notion would often seem superfluous in earlier media. In computer media, on the other hand, the kind of reading process the text enables and encourages is a rhetorical choice, and encompasses the different devices often contained under the label interactivity.

While most films and novels are told in one sequence which the reader is expected to follow, Web sites rarely are. Most Web sites offer a partial list of their contents as a menu of choices for the reader. The reader chooses what links to follow, and only when a link is activated (typically clicked on) by a user does a new page appear.

The reader's choose-and-click reading activity is named interaction by some writers, while others reserve this term for other kinds of reading and writing activities, with computers or between human beings only [11]. A less disputed term for the reading activity involved in Web reading would be ergodic, coined by Espen Aarseth in Cybertext. Aarseth defines ergodic literature as literature in which "nontrivial effort is required to allow the reader to traverse the text," and having made this effort, the reader "will have effectuated a semiotic sequence" (1). Reading traditional novels, watching films or television of course requires cognitive effort, but apart from interpretation [12] the effort is considered trivial, as the reader only has to turn pages, the viewer to watch. In hypertext or computer games, however, the reader/player constantly has to make decisions as to what to do with the text [13].

These "non-trivial efforts" are manipulations of the text as a material and mechanical object. Manipulations may be throwing dice or coins, ordering pieces of paper, or controlling a computer interface with mouse, keyboard or other input devices. To focus in this way on the material structure of a text and the ergodic effort involved in reading is what Aarseth terms a cybertextual perspective (22). To view a text as a cybertext is to view it as a textual machine "for the production of a variety of expression" (3), a machine that according to some principle combines pieces of text into the text the reader reads. The machine thus has three important sets of parts: the textons, the pieces of text that may be combined; the scriptons, which are the pieces of text the reader is expected to read; and the traversal function, "the mechanism by which scriptons are revealed or generated from textons and presented to the user of the text" (62). Examples of traversal functions are the links in a hypertext, the simulation and representation engines in an adventure game, or throwing coins in the case of I Ching.

We can in principle treat any text from this cybertextual perspective. In a traditional printed novel the textons and the scriptons would then be identical, and the traversal function nothing. By stating that, however, we have said very little about the novel; the cybertextual perspective is only interesting when studying ergodic texts. In a static Web site, the pages stored at the server are the textons, the links between them the traversal function, and the pages displayed to the reader when links are followed are the scriptons. When developing his typology of ergodic texts, however, Aarseth also uses cybertext as the name of a genre. Understood in this way, a cybertext (perhaps we might call it a cybertext proper) is a text where the traversal function involves some principle of calculation (75), so the text will not look the same in two readings. In this typology the static Web site would not be a cybertext proper, while a computer game would.
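
The static Web site described above can be sketched in a few lines of Python. This is only an illustration of the vocabulary, not Aarseth's own formalism; the page names, the page texts and the function name `traverse` are hypothetical.

```python
# Textons: the pages stored at the server (hypothetical names and texts).
textons = {
    "home": "Welcome. Read the news or browse the archive.",
    "news": "Today's stories.",
    "archive": "Older stories.",
}

# The links between the pages; together with the reader's choices they
# make up the traversal function of this small static site.
links = {
    "home": ["news", "archive"],
    "news": ["home"],
    "archive": ["home"],
}


def traverse(start, choices):
    """Return the scriptons displayed when the reader follows `choices`.

    `choices` is the sequence of page names the reader clicks on. A choice
    that is not linked from the current page raises ValueError.
    """
    current = start
    scriptons = [textons[current]]
    for target in choices:
        if target not in links[current]:
            raise ValueError(f"no link from {current!r} to {target!r}")
        current = target
        scriptons.append(textons[current])
    return scriptons


# Two readings of the same textons give two different sequences of scriptons.
reading_a = traverse("home", ["news", "home", "archive"])
reading_b = traverse("home", ["archive"])
```

Because the links are fixed and nothing is calculated, every reader who makes the same choices sees the same pages. This is why such a site is ergodic without being a cybertext proper.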

Viewed from the cybertextual perspective, most Web media are constructed differently from earlier media. The Web technology was initially designed to support distributed hypertext, and the basic structure of saving Web pages as separate files makes it a manifestation of the cybertext model. Links, and more advanced techniques such as JavaScript and server-side scripting, allow for the construction of a wide range of traversal functions, opening up an even wider range of modes of acquisition. It is rare to see forms from other media being adapted to Web media without adding cybertextual features such as links and search routines.

What I call mode of acquisition should then be understood as the reading process that results from the mechanical (cybertextual) construction. Modes of acquisition in computer media have been classified by different theories.

Based on a text's mechanical (or cybertextual) construction, Espen Aarseth is able to classify it as inviting the user to perform one of four user functions. The interpretative function is to read and comprehend; the explorative also requires the user to decide where to read next. These two user-functions are required by unicursal and multicursal texts respectively [14].

Dynamic texts, that is, texts where the number of scriptons is not constant, open up two other user functions. The configurative function is to let the reader "configure their scriptons by rearranging textons or changing variables" (64), while allowing the reader to add textons is a textonic user function.

The mode of acquisition is also related to the sign system(s) used. Sound and moving images exist over time, and have to limit the reader at least for a moment in order to exist as texts at all, while all writing in principle is open to be read in any sequence. Still, a looping video of wind in trees, or a long sound recording of waves may not be very different from a still image in the way it is read.

Following Kant's distinction of objective and subjective sequences, Gunnar Liestøl has distinguished between different kinds of activity in reading texts. "With the consumption of dynamic information (audio/video) the dominant activity is located in the textual object itself, as object-action; with the consumption of static information, however, the dominant activity is located with the user-subject, as subject-action" (45). Of subject-action, there are two kinds, intervening and non-intervening. Non-intervening subject-action is the activity of reading a static text, while intervening subject-action is to actively choose where to read on. Reading hypermedia is to alternate between parts of the text dominated by object-action and the different kinds of subject-action, according to Liestøl.

Yet another way of distinguishing between different modes of acquisition is proposed by Jens F. Jensen in "Interactivity: Media Studies' Blind Spot?" Jensen proposes a typology of twelve different media positions, depending on whether the media allow for registrational interactivity and conversational interactivity, and which of three kinds of selective interactivity is offered (none, transmissional or consultational). Based on Bordewijk and van Kaam's typology of "traffic patterns," Jensen's typology orders media after the power relations they set up between providers and consumers. Registrational interactivity is "a measure of a medium's potential ability to register information from and thereby also adapt and or respond to a given user's needs and actions [...]" (60), or in other terms, a measure of the flow of information from consumer to provider. Conversational interactivity is information exchange between consumers "in a two-way media system" (60). Selective interactivity is the possibility for consumers to select between different available programs or texts made by providers. If the providers control the distribution and consumers only select what to read, it is of the transmissional kind; if consumers initiate distribution, it is of the consultational kind.
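
The number twelve follows from combining the three dimensions: two values for registrational interactivity, two for conversational interactivity, and three for selective interactivity. A minimal sketch in Python enumerates the combinations; the variable names and the use of True and False for "allowed" and "not allowed" are my own shorthand, not Jensen's notation.

```python
from itertools import product

# Each dimension of the typology as a list of possible values.
registrational = [False, True]   # can the medium register user information?
conversational = [False, True]   # can consumers exchange information?
selective = ["none", "transmissional", "consultational"]

# Every media position is one combination of the three dimensions.
positions = list(product(registrational, conversational, selective))

for position in positions:
    print(position)

print(len(positions), "positions in all")
```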

Aarseth, Liestøl and Jensen all provide perspectives from which the modes of acquisition may be described and assessed; perspectives of material construction, of sign system, and of power relations respectively; perspectives that are not necessarily contradictory, although they divide the range of texts differently.

In my earlier paper "Linearity and Multicursality," I have argued that the selection of links by readers is governed very much by how the links are signified. I am thus able to see a range of different kinds of what we may call "explorative user functions", "intervening subject-actions" or "consultational selective interactivity" in the different vocabularies reproduced above by identifying some ideal types of acquisition mode.

Imagine a continuum of how much influence and control a reader has of the sequence of parts in a Web site. At one end we find movies: videos, animations and narrated slide shows. The most television- or film-like of these run in one sequence by themselves; I have called this cinematic mode. Most movie streams, however, have a pause button and perhaps VCR-style controls for fast forward and "rewind." A similar control is handed over to the user when Web pages are linked in a chain with a "next" link on every page. Similar to reading a novel, there is only one next page, but the reader chooses when to go to it. Although these two modes have different dominant actions (static Web pages with "next" links are dominated by subject-action, while VCR-controls introduce subject-action to texts dominated by object-action), I have chosen to group them under the term progress control. All these texts would be what we call linear, sequential, or unicursal.

At the other side of the continuum's middle are the multicursal sites; sites where most pages have more than one link. In several of them, there are links, but there is either so little information of what will be at the other end of each link, or the text of each page depends so much on what was written on a previous one that one sequence is strongly prioritised over all others. I have called this default sequence, using a term inherited from Jane Yellowlees Douglas' and Jill Walker's discussions of the novel Afternoon by Michael Joyce. Further along the continuum, we find texts where there are more possible or probable sequences, but the reader still has limited control. These are sites where links load random pages, and sites without a prioritised sequence, but also so little information about the links' destination that the reader is navigating blindfolded through a labyrinth. In my paper "Linearity and Multicursality", I called this oblique linking. An example of this kind of text would be Michael Joyce's canonical hypertext novel Afternoon. At the end of the continuum furthest removed from films are Web sites where there are not only many links, but where the structure of the linking is made so explicit that the reader can navigate freely, as in a building she knows well. I will argue it is only this mode of acquisition that deserves the name multicursal in a strict sense. While other forms of linking may open the possibility of many courses through the text, only an explicit linking practice makes it likely that the reader experiences the text as making multiple courses possible.

In the figure below, my five modes of acquisition are related to Espen Aarseth's user functions and textual positions, and to Gunnar Liestøl's kinds of activity. As these five modes only describe texts with static textons, the figure only maps what Jens F. Jensen calls the dimension of "selective interactivity", and is not able to grasp the "conversational" and "registrational" dimensions, or Aarseth's "configurative" or "textonic" user functions.


Figure: Modes of acquisition on a continuum from less user influence to more user influence (cinematic mode, progress control, default sequence, oblique linking), set against Liestøl's non-intervening and intervening subject-action.

This way of mapping modes of acquisition implies that hypertext or computer rhetoric has two levels: a cybertextual level and a rhetorical level. Default sequence, oblique linking and multicursality are all results of similar cybertextual constructions; it is the writing on the pages and links that separates them. Clearly, link-node hypertext is a certain figure of cybertextual construction that may give rise to many different rhetorical figures.

Espen Aarseth touched upon this separation in his essay "Nonlinearity and Literary Theory", and later in Cybertext. Following Pierre Fontanier's nineteenth-century rhetoric, he distinguishes between tropes and "les figures non-tropes", or semantic and syntactic figures, as Aarseth terms the two kinds (Cybertext, 91). In "Nonlinearity" Aarseth lists some "figures of nonlinearity" as syntactic figures, or figures; static Web sites and other link-node hypertexts typically make use of the figure linking/jumping. In Cybertext (91), he addresses the other side of the pair, the "tropes" or "semantic figures", in the analysis of Michael Joyce's Afternoon.

As this particular use of the traditional concept pair trope and figure for semantic and syntactic figures respectively is far from universal in the rhetorical tradition, I will prefer the more common terms figure of thought and figure of diction, leaving trope for figures that transform the meaning of words or expressions, such as metaphors and metonymies (compare Plett 309). But Aarseth's point, that the various cybertextual mechanisms for textual production are different figures of diction rather than figures of thought, is an important one.

Many cybertextual constructions are neutral techniques that can be used with different effects in different texts. Static Web sites are examples of "link-node hypertext", a cybertextual figure where stable scriptons are bound together with stable links, so the scriptons in the text always have the same relation to each other. I call this structure a figure of diction, a way of constructing a text. It is a structure of scriptons and their relations that may be filled with any semantic content, or with mere gibberish, and the structure will remain the same.

Aarseth's two hypertextual figures of thought, aporia and epiphany, describe the user's frustration when lost in the textual labyrinth of Michael Joyce's Afternoon, and then the bliss when a new direction of reading is found. These two figures of thought are put on top of a certain structure, a use of the link-node figure and a figure of restricted access, both figures of diction. The very same structure, with the same conditional links, could have been made explicit, with every link and block explained to the reader. In that case, the figures of thought would have been different.

As with all other rhetorical figures, the hypertextual figures we have discussed here, both figures of diction and figures of thought, may be combined with most other rhetorical or poetic devices. A thoroughly worked-out system of navigation links, for example, is usually associated with professional business sites, while hypertext novels of the "Eastgate school" usually employ relation links, but this need not always be the case. Bobby Rabyd's Web novel Sunshine '69 is an example of a serious hypertext fiction using a clear and understandable set of navigation links, resulting in a truly multicursal work of fiction. In Hamlet on the Holodeck, Janet Murray lists numerous experiments in immersive computer fiction, among them multiform stories, stories that present "a single situation or plotline in multiple versions, versions that would be mutually exclusive in our ordinary experience" (30). When she gives multiform stories as a writing assignment to her students, many respond with a "violence hub" story, Murray reports (135). These are stories where something violent or traumatic has taken place, and the reader is invited to follow links to explore how a number of characters react to the event. In these texts, many rhetorical figures are at work simultaneously. The "violence hub" is a certain realisation of the more general multiform story. None of these need to be multicursal works; in fact, most of Murray's examples of multiform stories are Hollywood movies. But some are multicursal computer texts, and these need to communicate this structure to the reader, using certain linking figures. All these devices, multiform, violence hub, and linking figures, are figures of thought, used within a certain figure of diction, the link-node structure. Similarly, the decision to posit the reader as a character in the diegesis, as in many computer games, is a figure of thought not concerning the cybertextual construction of a text, a figure that also has been used in codex novels, such as Italo Calvino's If on a Winter's Night a Traveler.

Mode of acquisition should thus be understood as the reading effort and experience the text invites and expects from its readers. It comprises the required handling of the text's mechanical structure, the way this requirement is communicated to the reader, and the way it is aligned with the text's message.

4.2. Mode of Acquisition and Rhetorical Convergence

As mentioned, a Web form is perceived to be the result of a rhetorical convergence if it blends rhetorical properties of two or more genres and media. The mode of acquisition is one set of rhetorical properties: it describes the reading process required of the reader to read a text.

A Web form with a required reading process similar to one genre in one medium, but in other respects similar to a genre in another medium is perceived as a result of rhetorical convergence of the two.

The mode of acquisition is an integral part of all earlier media, something which is witnessed by the large literature on the difference between the "linearity" of print and the "nonlinearity" of hypertext. As with mode of distribution and mode of restriction, a change in the mode of acquisition will be perceived as changing the rhetoric towards the rhetoric of a different medium. It further seems rare to recreate any rhetoric on the Web without introducing some linking beyond the imitation of page turning or the controls of a VCR. Introducing a cybertextual figure is a convergence towards computer rhetoric, but as we have seen, this may result in a wide range of different rhetorical forms. A simple example may show how the convergence of video and ergodic texts may take place:

Many television networks offer video recordings of their news on the Web. Many such newscasts are chaptered, so the individual reports can be selected from a list, and viewed out of sequence. Viewing television in this form becomes more like reading a newspaper. The reader may select what to view when, and is not likely to have the patience or interest in following the original sequence of the newscast, a sequence that probably was made with care. Following her own priorities, the reader will continually consider whether a story is worth her attention, and be ready to break it off at any point by selecting another item from the menu. And it is likely that she will skip some items entirely, as they do not interest her. Changing the mode of acquisition in this way is a profound change from the kind of ceremony the evening news is on broadcast television. The next table renders the rhetorical convergence along two axes:

|                 | Mode of Acquisition          | Dominant sign systems |
| Print newspaper | Multicursal/progress control | Writing, still images |
| Chaptered video | Multicursal/progress control | Moving images, speech |
| Television news |                              | Moving images, speech |

5.1. Mode of Signification: Sign systems

What I here will call mode of signification is not difficult to grasp: it is simply the sign systems used in a text; for example, video, writing, spoken language, or still images. The similarities and differences between the sign systems are complex, however. As with the other three axes of rhetorical convergence, the mode of signification encompasses many modalities a text can occupy. And as with some of the other axes, the modalities are intertwined.

Any printed matter, sound recording, video, film or broadcast may be reproduced, or re-represented, or copied, in the computer. It will be stored as one of four classes of file formats: text (mainly ASCII, but other formats exist), images (formats such as JPEG, GIF, TIFF, PICT, BMP, PNG), sound (formats such as AIFF, WAV, MPEG, AAC), and video (formats such as AVI, QuickTime, Real, Windows Media, MPEG). To the human ear and eye, however, the distinctions blur. A photograph of a poster would be stored in the computer as an image, but we read it as writing nevertheless. Similarly, we may film a still image, or draw an image with ASCII signs. Audio may contain recognisable sounds of all kinds, including music and spoken language. To form an understanding of multimedia as communication between humans, we must instead distinguish between the different kinds of signification these formats store.

There are many ways of relating the four classes of sign systems listed above. We will in the following list several distinctions between characteristics of sign systems, in order to show similarities and differences between different modes of signification.

Philosophers and scholars of rhetoric, poetics, aesthetics, linguistics, and semiotics have discussed at length the differences between different communication systems such as language and image, spoken and written language, the eye and the ear, the spatial and the temporal. These dichotomies are all intertwined, as we may illustrate by putting the four sign systems into a table:

| Writing      | Speech |
| Still images | Video  |

Of the sign systems in this table, only speech is perceived with the ear, the other three by the eye. The top two, writing and speech, are based on natural language; the bottom two are kinds of imagery. The right half, speech and video, exist in time: they are dynamic or temporal, while the left half are fixed, spatial. One way of labelling the rows and columns is thus:

|          | Fixed, spatial | Dynamic, temporal |
| Language | Writing        | Speech            |
| Imagery  | Still images   | Video             |

Many of the differences between popular genres in different media can be sorted along these two dichotomies. Alphabetic writing is abstract, for example, and the length of lines and pages does not matter in many writing styles, while alterations of proportions or size in images do change the impact of the image. When writing and images are combined, writing will have to give up some flexibility to fit with the pictures. Adding moving images to explanatory graphics is another example that makes the difference stand out. Using animation, processes and time relations may be rendered more effectively, but at the same time the ability a reader has to scan and compare the different parts of a still image is taken away.

Within each row in the table, the differences are also much discussed in literature. The difference between spoken and written language has been a topic for language philosophy since Socrates, renewed by present-day linguistics. The film theories of, for example, Mitry or Deleuze are concerned with the specificities of the moving image as opposed to the still.

What is lost in our labelling of the categories in this manner is the distinction between eye and ear mentioned above. This is an important distinction, however, when discussing the combination of forms. It is often easier to comprehend combinations of sign systems involving two different senses, a point we will return to in 5.2 below.

Sound is always temporal, always dynamic [15]. But not all sound is speech: music and sound effects are obvious examples. To fit this distinction in, we might have to reduce the language/image dichotomy to language and non-linguistic signs. Thus:

                  Static          Dynamic
Language          Writing         Animated writing, Speech
Non-linguistic    Still images    Video, Music, Sound effects

This division also makes us aware of animated writing as a distinct form. It occupies a middle position between writing on one hand and video and sound on the other, and Gunnar Liestøl has demonstrated in "Aesthetic and Rhetorical Aspects" how animated writing may smooth the transition from writing to video (a point we also touched upon while discussing bandwidth above). Moving writing loses one of the powerful aspects of alphabetic writing, however: the ability to read at different speeds. Reading more than a few words of moving writing is thus likely to annoy many readers.

One might very well suspect that the "non-linguistic" category is too broad, and this is brought to the surface if we consider Web pages combining photographs and diagrams, two very different forms of images.

In September 2001, the Spanish Web newspaper El Mundo carried an "interactive graphic" of the September 11th disaster, titled "Oleada de atentados en Estados Unidos." In the feature, schematic drawings and photographs of the World Trade Center are combined with great effect.

A screen from "Oleada" showing a plane crashing into the South Tower in photos and a Flash animation.

The drawing explains what is happening (a logos appeal), while the photograph is a witness, and much more emotional (a pathos appeal). In Languages of Art, Nelson Goodman clarifies the distinction between the two, as well as the distinction between language and image. In his vocabulary, languages are differentiated, while images are dense [16]. As in Saussure's semiology, words are seen as disjunct, differentiated signs with differentiated meanings in Goodman's theory [17]. Dense sign systems (or dense notational schemes in Goodman's vocabulary) do not have differentiated positions. Thus, between any two signs, there is a potentially infinite number of other signs. His example is a mercury thermometer without a grid marking the temperature scale. In such a thermometer, any position of the mercury column would be meaningful. Dense sign systems may further be either relatively attenuated or relatively replete. In a photograph or a painting any aspect of the image is potentially meaningful, so the image would be a different one if any detail were changed. In a map, on the other hand, choices of colour or thickness of line are relatively arbitrary. On a world map, it has little importance if Zimbabwe is coloured green or pink as long as its colour is different from the colours of neighbouring Zambia or South Africa. Maps and diagrams are relatively more attenuated than photographs, as some dimensions of the visuals carry meaning while others do not. Below, Goodman's distinctions are drawn into our diagram.





                              Static                  Dynamic
Differentiated (digital)      Writing                 Animated writing, Speech
Dense (analog), attenuated    Diagrams, Typography    Moving diagrams
Dense (analog), replete       Photographs             Video, Sound effects

This further subdivision not only helps us discern diagrams from other images, it also tempts us to place other sign systems used in Web sites, such as typography and layout. Music may also be distinguished from sound effects in this manner.

When we claim that photographic images of the World Trade Center catastrophe are "witnesses", it is based on the knowledge that photography is a process of chemistry and optics. In Charles Sanders Peirce's terms, the photograph is both iconic, as it resembles its motif, and indexical, as it is a physical imprint caused by another physical imprint of the light rays that were reflected off the motif. Now that we have introduced Peirce's canonical trichotomy of signs (iconic sign, index and symbol), we see that in the above table, all the differentiated sign systems are symbolic, while all the replete are iconic. The attenuated, however, will have aspects of both, while there is no separate place in the table for the indexical. Furthermore, a shot in a film of a landscape with a column of smoke rising from a distant hill would be iconic first, in that the image resembles an actual scene, then indexical second, as the smoke is a sign that there is a campfire burning. Our table cannot capture Peirce's typology of signification while maintaining the differences we have charted so far. To map all dimensions of signification in one diagram would be overly complex.

Only a little reflection on how music communicates will complicate this yet further. Music may be seen as a system in which some parts (rhythm, scales, harmonies) are parts of a differentiated system, while others (timbre, volume, pulse, phrasing) are dense. In addition, music always carries strong connotational meanings. A simple example is found in the "Becoming Human: The Documentary" section of the Becoming Human Web site, where (supposedly) Ethiopian music lends connotations of "Africanness" to a description of an excavation in Ethiopia. These secondary meanings (in addition to the primary, denotational meanings) apply to any interpretant of any of the sign systems involved, and cannot be captured by the above table either.

5.2. Mode of Signification and Rhetorical Convergence

We have repeatedly stated that a Web form is perceived to be the result of a rhetorical convergence if it blends rhetorical properties of two or more genres and media. The mode of signification is one set of rhetorical properties: it describes the particular combination of sign systems used in a text.

A Web form with a combination of sign systems similar to one genre in one medium, but in other respects similar to a genre in another medium is perceived as a result of rhetorical convergence of the two.

Changing the sign system while keeping the rest of the rhetoric is the most obvious rhetorical convergence. All the examples used to illustrate the other three axes used the mode of signification as one dimension in the comparison. Many of them also contain convergence in their combinations of sign systems. The narrated slide show "Sights and Sounds of the Way West" described under 2.1 above combines a dynamic sound track with still writing and imagery, but both photographs and written words are made dynamic by moving the frame and fading words in and out. Yahoo! FinanceVision, the example from 2.2, put paragraphs of written text next to a small video pane. The map from "Congo Trek" discussed under 3.1 uses the power of diagrams and maps to provide an overview of a large number of written parts.

When two different modes of signification with different characteristics in all the ways listed above have to be aligned, two principles govern the combinations: the limits of the senses and perception, and what I call containment.

Our vision cannot read writing and images simultaneously (an observation elaborated by many scholars, for example by Michel Foucault in his analysis of Magritte's aesthetics in This is Not a Pipe), so we will have to move back and forth between the two. Thus, if a lengthy text is projected on top of a video segment for a short while only, it will be very difficult not to miss either some of the text or some of the images. Eye and ear may cooperate nicely, however, as when a voice-over explains images in a documentary film. It also seems to me that our capacity for language processing is such that we not only are unable to comprehend two people speaking at the same time, but also that reading and comprehending several paragraphs of writing while listening to a speech is equally impossible.

Containment is a word I use to describe the fact that sign systems do not appear next to each other on the Web, but are nested inside one another. In Web pages, either a video pane is inserted into a text page, or text is inserted into a video window. Digital video is always rectangular, and text will normally either be within or around the rectangle. The fact that a video clip has to be a separate file from the HTML page makes it even harder to penetrate the edges of the video rectangle, if the author should desire to do so. Apart from these technical reasons, photographic video has always been separated by a frame, a basis for theories on film by Jean Mitry, Lev Manovich, and others. To escape the frame, or for text to penetrate it, the video would have to lose all depth, all sense of foreground and background. It is imaginable to shoot a video against a monochrome background, matte it out, and script text to blend in with parts of it, but I have never seen it done in an actual Web page. What emerges is a master-servant or parent-child relationship.

The parent-child relationship between the containing semiotic system and the contained can also be treated as a time relation; one reads the parent before the child. This makes it possible to treat, for example, video inserted in a text page similarly to video that opens in a separate window from a link in a text page. It is an advantage to do so, as the two often are used with comparable effect. To classify a page in one of the two categories, one would ask which signs reach readers first: those of the text or those of the video. Again, it is possible to imagine a page designed in a manner that makes this distinction impossible, so that it is random what one reads first (which would require that the video loaded as quickly as the HTML). I have not seen it in reality, but if it were to be found, it would be a third category requiring its own analysis.

Let us go through the different distinctions between modes of signification. We will identify different examples of rhetorical convergence, and note the new intermediate forms that appear.

Eye and Ear. In the opening paragraph from "The Way West," music (harmonica, acoustic guitar, and double bass) plays as a background accompaniment, contributing to the Western pioneering mood. The two sign systems complement each other, each bringing one part of a combined message. Using music to evoke connotations in this way is common in film, but here it is coupled with an imitation of a printed page.

Introductory screen from "The Way West."

Later on in the same feature, still photography is coupled with radio - a different kind of eye/ear combination. The landscape photographs in the middle of next page are shown while a narrator reads the story of the pioneers, and the sounds of birds and a waterfall are also heard.

Screenshot from "The Way West", showing a clearing in a wood.

(Bird song) Voice-over: Out onto the plains of Kansas. New voice: "Now we were out of civilisation and the influences of civilised society entirely, and cut out from the rest of the world to take care of ourselves for a while." Alicia D. Perkins 1849.

Screenshot from "The Way West", showing a waterfall.

(Sound of waterfall) Voice-over: When they came to Alcove springs in Kansas, they described it as the most beautiful site on the whole trail, even though their whole trail experience had just been a few days.

Rather than bringing different messages, sound and visuals here have the same content; the message is doubled. The sense of reality and immersion is heightened when they align. It is a little more like standing there, experiencing the landscape, than images or sound alone would be.

Static and Dynamic. In these two examples from "The Way West," sound and visuals also bridge another division: that between static and dynamic sign systems. As sound also exists over time, it adds a temporal dimension and dynamic to the scene. In the still image of the waterfall, one can almost see the water moving when the sound is added.

Another effective combination of static and dynamic sign systems is found in an exercise page from the British Men's Health site, which combines written words and print-like layout with video. The exercise program is explained in writing, and can be consulted over and over, while the exercises are demonstrated in video, thus showing the actual movements.

Exercise page with video and writing.

Differentiated and Dense. Not just different in being static and dynamic, writing and video are also differentiated and dense respectively. This adds to the effect of rhetorical convergence in the above example from Men's Health; as the video images are dense, all details are recorded, and may be studied by the aspiring weight lifter, including the finer points not mentioned in the written instructions.

Another combination of differentiated and dense sign systems is of course images and writing, a combination so common that it hardly deserves the name rhetorical convergence (was there ever a time when people never drew images on the same surface they were writing on?). A computer version of the combination, however, is to let the user make the writing visible at will. Nationalgeographic.com's "Columbia River" is a moving panoramic image of the river. When the reader positions the mouse over an element in the image, a written label appears, and when the reader clicks, a smaller pane with more writing opens. This particular combination of image and writing makes it possible to combine a large-scale, detailed image with explanatory labels without cluttering the image with letters.

In the illustration below, a fish in the water was clicked, bringing up the pane with writing and images.

Screenshot from "Columbia River".

Screenshot from "Columbia River" with a popup with writing and photography.

Attenuated and Replete. The image in "Columbia River" is a stylised drawing, which allows the artist to draw attention to the details he considers to be important. In Goodman's vocabulary, it is relatively attenuated. When the fish is clicked, however, a photograph opens, showing what the fish "really" looks like in all its details. The photograph is replete. In 5.1, we noticed how effectively "Oleada" combined photography and drawing to reap the benefits of both the repleteness of photography and the attenuation of drawings, providing both overview and detail.

Screenshot from "Oleada" combining animated drawings with photography.

Iconic, Indexical, and Symbolic. "Oleada" also demonstrates the combination of iconic, indexical, and symbolic signs. The photographs are indexical and iconic, the physical traces of the catastrophe, while the drawing is iconic but not indexical.

Similarly, BBC News regularly combines the indexical with the symbolic, by linking to parts of radio programs from newspaper-like written news stories. A quote in writing may thus be backed by a recording of how the statement was delivered, and the recording will also reveal the tone of voice of the speaker.

The ability to combine so many modes of signification within a Web site was the starting point for our investigation of rhetorical convergence on the Web, and its most visible and basic manifestation. But the obvious multitude of possible combinations of signifiers with different properties, and of different kinds of semiosis, makes the term convergence seem a less fitting description for the actual resulting text. Does not the discussion above indicate a divergence of rhetorical forms? We will return to that question towards the end of this essay, but first we should view the four axes of rhetorical convergence together.

The table below lists the four axes and the different terms we have discussed. (Neither their placement nor the dividing lines have any significance; the figure is merely meant as a summary.)

A summarising figure.

6. Limits of Technology

We have discussed the four axes of rhetorical convergence: mode of distribution, mode of restrictions, mode of acquisition, and mode of signification. Any rhetoric may be described by registering the variables listed for each of the axes, including established, well-known genre rhetorics. Most Web texts will score so close to a genre rhetoric known from earlier media that we recognise them as fairly similar. In many cases, however, there are also some variables that are different; they will show similarities to a different rhetoric of a different established medium. It is this simultaneous resemblance to two or more established rhetorics we call rhetorical convergence.

What we have left to clarify is the relation between the four axes. It might be tempting to align the four axes with other familiar distinctions, for instance, to say that mode of distribution and canvas are aspects of technology or even medium, mode of acquisition is syntax, and mode of signification is semantics. Such a division would not stand up to inspection; technology is present as a factor in all four axes. As well as setting the limits for distribution speed and canvas, technology also governs which sign systems may be used, and the possibilities for user input and influence on the text. The mode of acquisition and the mode of distribution chosen by the authors will also influence the semantic content of a text. We have already seen how a text may be read differently if it is live, or how bandwidth limits the use of video. Furthermore, multicursal aspects or merely playback controls open up a different reading and understanding of a text.

A perhaps more helpful way of relating the axes is to view them as a process of communication.

A communication model.

Then the mode of distribution would be how the message is brought to the reading surface, canvas is the properties of that surface, the mode of acquisition governs how the reader manipulates the surface to experience the signs, and the signs in turn are what the reader reads.

Such a highly abstracted view of a reading process may be helpful to memory, but like all models, this way of aligning the axes also obscures some relations as it highlights others. Although reading and comprehension take place in the meeting of text and reader, the rhetoric of the text is shaped by all earlier stages of the communication process. The axes of rhetorical convergence are not events that take place one after the other, but simultaneous dimensions that may describe any given rhetoric.

It is further important to realise that technology determines all four sets of aspects. Technology sets the premises for the possible modes of distribution and acquisition, technology dictates which sign systems may be used, and the canvas is just a subset of the possibilities of the technology. As such, the four axes may also be seen as a set of limitations, of subsets of subsets of technology's possibilities, as in the following drawing.


The canvas is a set of chosen limits within the possibilities of the technology, a subset that sets the limits for which sign systems can be used and in which ways. The canvas also determines the distribution process, as a large and detailed canvas requires more bandwidth. The mode of signification and the mode of distribution are relatively independent, however, as the mode of distribution describes the communication as process, and not the semiosis itself. It is a way of expressing the temporal relation between signifier and signified, between story and discourse.

As modes of acquisition have to be signified, these depend in turn on the sign systems. The mode of distribution also determines the mode of acquisition, as all acquisition modes other than the cinematic require some permanence.

Such a theoretical model is not a description of a procedure or design workflow; it is probably rarely the case that screen size is decided on first, then writing and editing, and then linking. It remains a concern, however, that if one is to link from a video, for example, the presence of a link needs to be signified.

This perspective, too, obscures one important relation: that the distribution process is a prerequisite for the communication to be possible at all. Each of these alternative views of the relations between the axes has its strengths and weaknesses, which is why it seems justified to view them as interrelated axes on which a multidimensional space is projected.

How does this multidimensional model relate to the rhetorical heritage? Traditional rhetorical figures reside inside what is signified, and are thus not visible in this model. This does not mean that they are not important; on the contrary, I have argued that choices along the four axes of rhetorical convergence affect the message. Still, this effect is probably smaller in terms of persuasion than that of the message conveyed by the signs. In our larger understanding of human communication, rhetorical convergence remains a footnote.

7. Rhetorical Divergence?

In an area of the Pacific, known as the intertropical convergence zone, the trade winds from the northeast collide with the trade winds from the southeast. It is a phenomenon meteorologists know as convergence. The colliding winds bend upwards, and the high pressure is released as the winds spread out again in a divergence at a higher altitude.

I have argued that the perceived convergence of media may be viewed as combinations of earlier forms, or forms sliding towards each other as variables change. Some combinations are time-tested and traditional; others are new and creative. Combinations adhere to certain restrictions and follow certain patterns, described in the sections above using mode of distribution, mode of restrictions, mode of acquisition, and mode of signification.

As each of the four axes spans many modes and dimensions, their possible combinations are numerous. Traditional media utilise only a few combinations of modes; many more are possible on the computer. What has become possible is a veritable divergence of rhetorics. I have neither hope nor ambition of describing all aspects of rhetorical convergence, but even the small number of perspectives I have discussed can combine in a vast number of different forms, as a little math would easily demonstrate. And the likely finding of just one other rhetorical dimension would drastically increase the total number of possibilities. Hence, although I have argued that convergence is a meaningful term to use, this convergence can only result in a divergence of forms. Computer media allow authors to choose many rhetorical modes that previously were dictated by the various media's technologies. Media convergence has, as it were, broken apart the building blocks of genres in earlier media, and given them all to the digital author to set up new rhetorical constructions with many combinations of materials hitherto unheard of. We haven't seen most of them yet.
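The "little math" can be sketched in a few lines of Python. The sketch is only an illustration under stated assumptions: it takes the five distinctions discussed under mode of signification, treats them as independent choices (which they are not entirely, since only dense systems are attenuated or replete), and adds a hypothetical sixth dimension with three values. The counts are assumptions made for the example, not measurements.

```python
from math import prod

# Assumed number of values per distinction; the essay does not fix these
# counts, so they are illustrative only.
dimensions = {
    "eye / ear": 2,
    "static / dynamic": 2,
    "differentiated / dense": 2,
    "attenuated / replete": 2,
    "iconic / indexical / symbolic": 3,
}


def count_combinations(options_per_dimension):
    """Count the ways to pick exactly one value on every dimension."""
    return prod(options_per_dimension.values())


# Combinations of the five distinctions above, treated as independent.
base = count_combinations(dimensions)

# The same count after adding one hypothetical three-valued dimension.
extended = count_combinations({**dimensions, "one further dimension": 3})

print(base, extended)
```

With these assumed counts, five distinctions give 48 combinations, and a single additional three-valued dimension triples the total to 144. Counting the variables of the other three axes in the same way would multiply the figure further.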

Works Analysed

Africa Extreme. National Geographic Explorer. MSNBC. 18 March 2001.

BBC News. British Broadcasting Corporation. 3 January 2003 <http://news.bbc.co.uk/>.

Becoming Human. 2001. Institute of Human Origins. 28 September 2002 <http://www.becominghuman.org/>.

"Columbia River." Nationalgeographic.com. Ed. Bella Desai. 2001. 20 April 2003 <http://www.nationalgeographic.com/earthpulse/columbia/>.

"Congo Trek." Nationalgeographic.com. Ed. Michael Heasley. 23 January 2001. National Geographic Society. 26 July 2002 <http://www.nationalgeographic.com/congotrek/>.

"Gaza." Nationalgeographic.com. Ed. Margaret Zackowiz. 1996. National Geographic Society. 27 June 2003 <http://www.nationalgeographic.com/gaza/>.

Men's Health. U.K. ed. 3 January 2003 <http://www.menshealth.co.uk/>.

"Oleada de atentados en Estados Unidos." Elmundo.es. Eds. David Alameda, et al. September 2001. 10 September 2002 <http://www.elmundo.es/elmundo/2001/graficos/septiembre/

Quammen, David. "Megatransect: Across 1,200 Miles of Untamed Africa on Foot." National Geographic Magazine October 2000: 2-29.

"Searching for Peace." MSNBC. Eds. Michael Moran, et al. 2000. 10 September 2002 <http://www.msnbc.com/modules/mideast_peace/>.

"Sights and Sounds from the Way West." Nationalgeographic.com. Ed. Valerie May. September 2000. National Geographic Society. 10 April 2003 <http://www.nationalgeographic.com/ngm/0009/feature2/media2.html>.

"Trippeldrapet." VG Nett. 2001. VG Multimedia AS. 10 May 2001 <http://www.vg.no/nyheter/spesial/trippeldrapet/>.

Yahoo! FinanceVision. Yahoo! 27 February 2002 <http://vision.yahoo.com/>.

Works Cited

Aarseth, Espen J. Cybertext: Perspectives on Ergodic Literature. Thesis, University of Bergen, 1996. Baltimore: Johns Hopkins University Press, 1997.

---. "Nonlinearity and Literary Theory." Hyper/Text/Theory. Ed. George P. Landow. Baltimore: Johns Hopkins University Press, 1994. 51-86.

Andersen, Peter Bøgh. A Theory of Computer Semiotics. Cambridge: Cambridge University Press, 1990.

Aristotle. Aristotle in 23 Volumes. Vol. 22. Trans. J. H. Freese. Cambridge: Harvard University Press, 1926.

Barthes, Roland. S/Z. Trans. Richard Miller. Oxford: Blackwell, 1993. Trans. of S/Z. Paris: Seuil, 1970.

Bernstein, Mark. "Hypertext Gardens." 1998. Eastgate Systems. 23 August 2000 <http://www.eastgate.com/garden/>.

---. "Patterns of Hypertext." HyperText 98: The 9th ACM Conference of Hypertext and Hypermedia. Pittsburgh: ACM, 1998. 21-29.

Black, Roger and Sean Elder. Web Sites That Work. San Jose: Adobe Press, 1997.

Bolter, Jay David. Writing Space: The Computer, Hypertext, and the History of Writing. Hillsdale, New Jersey: Erlbaum, 1991.

Bolter, Jay David and Richard Grusin. Remediation: Understanding New Media. Cambridge, Massachusetts: MIT Press, 1999.

Bordewijk, Jan L. and Ben van Kaam. "Towards a New Classification of Tele-information Services." Intermedia 14.1 (1986): 16-21.

Bordwell, David and Kristin Thompson. Film Art: An Introduction. Fourth ed. New York: McGraw-Hill, 1993.

Calvino, Italo. If on a Winter's Night a Traveler. Trans. William Weaver. New York: Harcourt Brace Jovanovich, 1981. Trans. of Se una notte d'inverno un viaggiatore. Torino: Einaudi, 1979.

Deleuze, Gilles. Cinema 1: The Movement-Image. Trans. Hugh Tomlinson and Barbara Habberjam. London: Athlone, 1986. Trans. of Cinéma 1: L'image-mouvement. Paris: Minuit, 1983.

Douglas, Jane Yellowlees. "'How do I Stop This Thing?': Closure and Indeterminacy in Interactive Narratives." Hyper/Text/Theory. Ed. George P. Landow. Baltimore: Johns Hopkins University Press, 1994. 159-88.

Eco, Umberto. Kant and the Platypus: Essays on Language and Cognition. Trans. Alastair McEwen. London: Secker & Warburg, 1999.

---. A Theory of Semiotics. Bloomington: Indiana University Press, 1976.

Ellis, John. Visible Fictions: Cinema, Television, Video. 1982. Revised ed. London: Routledge, 1992.

Foucault, Michel. This is Not a Pipe. Trans. James Harkness. Berkeley: University of California Press, 1983. Trans. of Ceci n'est pas une pipe. n.p.: Fata Morgana, 1973.

Genette, Gérard. The Architext: An Introduction. Trans. Jane Lewin. Berkeley: University of California Press, 1992. Trans. of Introduction à l'architexte. Paris: Seuil, 1979.

Goodman, Nelson. Languages of Art: An Approach to a Theory of Symbols. Indianapolis: Bobbs-Merrill, 1968.

Greenspun, Philip. Philip and Alex's Guide to Web Publishing. San Francisco: Kaufmann, 1999.

Hjarvard, Stig. "Live: om tid og rum i TV-nyheder." Mediekultur 19 (1992).

Jensen, Jens F. "'Interactivity': Media Studies' Blind Spot?" Interactive Television: TV of the Future or the Future of TV? Eds. Jens F. Jensen and Cathy Toscan. Aalborg: Aalborg University Press, 1999.

Joyce, Michael. Afternoon, a story. Watertown: Eastgate, 1990.

---. Of Two Minds: Hypertext Pedagogy and Poetics. Ann Arbor: University of Michigan Press, 1995.

Landow, George P. Hypertext: The Convergence of Contemporary Critical Theory and Technology. Baltimore: Johns Hopkins University Press, 1992.

Lanham, Richard A. The Electronic Word. Chicago: University of Chicago Press, 1993.

Liestøl, Gunnar. "Aesthetic and Rhetorical Aspects of Linking Video in Hypermedia." ECHT '94. Proc. of European Conference on Hypermedia Technology, 1994.

---. "Essays in Rhetorics of Hypermedia Design." Dr. Philos. Thesis, University of Oslo, 1999.

Manovich, Lev. The Language of New Media. Cambridge: MIT Press, 2001.

Mitry, Jean. Semiotics and the Analysis of Film. Trans. Christopher King. London: Athlone, 2000.

Murray, Janet. Hamlet on the Holodeck: The Future of Narrative in Cyberspace. Cambridge, Massachusetts: MIT Press, 1997.

Negroponte, Nicholas. Being Digital. London: Hodder and Stoughton, 1995.

Nelson, Theodore Holm. Computer Lib/Dream Machines. Revised Microsoft ed. Redmond: Tempus, 1987.

Nielsen, Jakob. Designing Web Usability: The Practice of Simplicity. Indianapolis: New Riders, 2000.

---. "Print Design vs. Web Design." Alertbox. 24 January 1999. 15 April 2003 <http://www.useit.com/alertbox/990124.html>.

Peirce, Charles Sanders. "Nomenclature and Divisions of Triadic Relations, as Far as They Are Determined." 1903. The Essential Peirce: Selected Philosophical Writings. Eds. Nathan Houser, et al. Bloomington: Indiana University Press, 1998. 289-99. Originally part 5 of "A Syllabus of Certain Topics of Logic". MS 478. First published in CP 2.233-72.

Rabyd, Bobby. Sunshine '69. 1996. Sonicnet. 27 March 2003 <http://www.sunshine69.com/>.

Renov, Michael. "Toward a Poetics of Documentary." Theorizing Documentary. Ed. Michael Renov. New York: Routledge, 1993. 12-36.

Sano, Darell. Designing Large-Scale Web Sites: A Visual Design Methodology. New York: Wiley, 1996.

Veen, Jeffrey. The Art and Science of Web Design. Indianapolis: New Riders, 2001.

---. HotWired Style: Principles for Building Smart Web Sites. San Francisco: Wired, 1997.

Walker, Jill. "Piecing Together and Tearing Apart: Finding the Story in Afternoon." Hypertext '99. Proc. of The Tenth ACM Conference on Hypertext and Hypermedia: Returning to Our Diverse Roots, 1999, Darmstadt. New York: ACM, 1999.

Williams, Raymond. Television: Technology and Cultural Form. New York: Schocken, 1975.


  1. Writers on multimedia include Nelson, Negroponte, Liestøl ("Essays"), Bolter and Grusin, Manovich, and others.(Return to article)

  2. As writers on interactivity, I would include Nelson, Negroponte, Bolter, Lanham, Joyce (Of Two Minds), Bernstein ("Patterns of Hypertext"; "Hypertext Gardens"), Landow (Hypertext), Aarseth ("Nonlinearity"; Cybertext), Liestøl ("Essays"), and others. (Return to article)

  3. Bolter, Joyce (Of Two Minds), Aarseth (Cybertext), Landow (Hypertext), Greenspun and Veen (Hotwired Style) have all commented on the liveness of hypertext. (Return to article)

  4. For theories of live television, see, for example, Ellis, Hjarvard, or Eco (Kant and the Platypus). (Return to article)

  5. Among the many writers on Web design, we could list Sano, Siegel, Veen (Hotwired Style; Art and Science), Black, Greenspun and Nielsen (Designing). (Return to article)

  6. I am indebted to Bjørn Remseth for pointing out this distinction to me.(Return to article)

  7. I have entered typical values in the table, assuming that the newspaper would arrive at the newsstand eight hours after deadline, and that the Web sites would keep written articles for three years and video for three months. Web sites are registered with a transmission latency close to nil, as they have the possibility to transmit live. (Return to article)

  8. At the same time, the range of contrast in images is much better on computer screens than in print, it is almost as good as in positive film projection (slides or cinema). (Return to article)

  9. The consequences of this for the "usability" of Web sites are discussed by Nielsen in "Print Design vs. Web Design." Nielsen also uses the metaphor of the canvas, but in a more restricted sense than it is used here. (Return to article)

  10. Resolution for the printed map is calculated as if printed on a 1200 dpi printer. Actual resolution is probably higher. For Adobe Acrobat, it is given as the full viewable area of what in 2003 is a fairly large monitor. (Return to article)

  11. Aarseth uses the Husserlian formulation "in an extranoematic sense" (1). (Return to article)

  12. See Jens F. Jensen's "Interactivity: Media Studies' Blind Spot?" for a collection and comparison of many of the contradictory uses of the terms interaction and interactivity. (Return to article)

  13. As Aarseth shows, ergodic texts are not limited to computers. He lists a large number of examples, from Egyptian hieroglyphic murals and the Chinese I Ching to twentieth-century experiments in books that may be read in many different ways. In the same way, computer texts are not automatically ergodic texts or cybertexts (ergodic texts that involve calculation, so they will change each time they are read). A computer text may be perfectly linear and non-changing, even more so than a book: a book can always be opened on any page, while a computer system may block access to a page until the reader has read the preceding page, Aarseth argues. (Return to article)

  14. The terms unicursal and multicursal, taken from a study of labyrinths, are Aarseth's alternative to terms such as sequential and non-sequential; linear and non-linear or multilinear (44). (Return to article)

  15. At least, sound is always temporal. It might be said that some sounds are static, for instance a ship's engine noise. A mechanic would take the sound as a static sign that the engine is working steadily. A change in the sound would be a sign that the speed is changed, for example, and/or that something is wrong. (Return to article)

  16. Goodman also uses the synonyms digital and analog, which are practical, but dismissed in our present discussion to avoid confusion with other uses of the word digital. (Return to article)

  17. Goodman distinguishes further between disjunct and differentiated both on the syntactic and the semantic plane, in order to define proper notational schemes (such as music notation), but this is beyond the needs of the present discussion. (Return to article)