Distant Reading

Reading more is always a good thing but not the solution. (Moretti)

In the quote above, Moretti draws attention to the fact that no one person can read every book in a particular field of study. A student of Victorian literature, for example, faces ‘as many as 60,000 novels published in 19th-century England.’ (Schulz) Crane also noted that:

The Greek historian Herodotus has the Athenian sage Solon estimate the lifetime of a human being at c. 26,250 days. If we could read a book on each of these days, it would take almost forty lifetimes to work through every volume in a single million book library. (Crane)
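Crane's arithmetic is easy to check. The short Python sketch below is only an illustration: the function name and the one-book-a-day default are assumptions made here, while the figures of 26,250 days, one million books and 60,000 novels come from the quotations above.

```python
# Figure quoted by Crane (after Herodotus): days in one human lifetime.
DAYS_PER_LIFETIME = 26_250


def lifetimes_to_read(books, books_per_day=1.0):
    """Return how many lifetimes it takes to read `books` volumes.

    `books_per_day` is an assumed reading pace, not a figure from the sources.
    """
    return books / (books_per_day * DAYS_PER_LIFETIME)


if __name__ == "__main__":
    # A million book library at one book a day: Crane's "almost forty".
    print(round(lifetimes_to_read(1_000_000), 1))
    # The 60,000 Victorian novels cited by Schulz, at the same pace.
    print(round(lifetimes_to_read(60_000), 1))
```

At one book a day the million book library comes to roughly 38 lifetimes, which matches Crane's ‘almost forty’, and even the Victorian novels alone would take more than two.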

Distant reading seems to offer a solution to this problem. It allows users to understand and analyse what Margaret Cohen calls the ‘great unread.’ (Qtd in Moretti) Moretti notes that distant reading allows users to study large corpora in ways that would be impossible without computational methods. The prime example is Google Books, a project that has been digitising a vast number of books. However, it has run into many problems with copyright and has faced a ‘class action lawsuit from book publishers and authors.’ (Bohannon) As ambitious as it is to digitise a million books, complications can ensue, and copyright is one of the main ones.

There are questions as to whether close reading is a more effective method of analysing a subject than distant reading. Moretti points out that the major problem with close reading is that it ‘depends on a small canon.’ (Moretti) Given that as many as 60,000 novels were published in 19th-century England, studying the full scope of Victorian literature by close reading alone is physically impossible. Moretti describes distant reading as a ‘condition of knowledge. It allows you to focus on units that are much smaller or much larger than the text: devices, themes, tropes - or genres and systems.’ (Moretti) Kathryn Schulz noted that Moretti’s latest project was to discover hidden aspects of the plot of Hamlet. She explains that Moretti turned the characters in Hamlet into nodes and their verbal exchanges into connections. Schulz objected that, because Hamlet contains many soliloquies, this approach thins out the plot and so detracts from the analysis. Moretti, however, claims that such methods allow for experimentation, which in turn lets the user see the plot of Hamlet in a new light.
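The idea of turning characters into nodes and verbal exchanges into connections can be sketched in a few lines of Python. This is not Moretti's own data or code: the list of exchanges is a small hand-picked sample chosen for illustration, and the function names are hypothetical. A soliloquy is recorded here with no addressee, so it adds no connection, which is one way to see the point Schulz raises.

```python
from collections import defaultdict

# Hypothetical, hand-picked sample of (speaker, addressee) exchanges.
# An addressee of None marks a soliloquy: someone speaks, nobody is addressed.
EXCHANGES = [
    ("Hamlet", "Horatio"),
    ("Hamlet", "Claudius"),
    ("Hamlet", "Gertrude"),
    ("Hamlet", "Ophelia"),
    ("Hamlet", None),
    ("Claudius", "Gertrude"),
    ("Claudius", "Laertes"),
    ("Polonius", "Ophelia"),
    ("Claudius", None),
]


def build_network(exchanges):
    """Map each character (node) to the set of characters they exchange words with."""
    network = defaultdict(set)
    for speaker, addressee in exchanges:
        network[speaker]  # make sure the speaker exists as a node
        if addressee is None:
            continue  # a soliloquy creates no connection
        network[speaker].add(addressee)
        network[addressee].add(speaker)
    return dict(network)


def degree_ranking(network):
    """Rank characters by number of connections, most connected first."""
    pairs = ((name, len(neighbours)) for name, neighbours in network.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for name, degree in degree_ranking(build_network(EXCHANGES)):
        print(name, degree)
```

In this toy sample Hamlet has the most connections, but his soliloquy leaves no trace in the network at all. A fuller version would need the complete play and a rule for deciding who is speaking to whom.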

Moretti discusses how literature can be studied using methods such as graphs and maps. In the paper ‘Conjectures on World Literature,’ he discusses the terms ‘tree’ and ‘wave,’ which are used to study systems of knowledge. The tree metaphor originates from Darwinism (the most obvious image of this metaphor is the family tree, with its different branches of generations). The wave metaphor is used mostly in linguistics (i.e. for the overlaps of commonality between languages). He posits that metaphors such as these, already used in fields such as sociology, history and biology, would be very effective for studying literature. Again, this is in contrast with the method of close reading.

Schulz was largely critical of distant reading in her article in the New York Times. Her main criticism of Moretti is that literature is not a science and should not be studied as one. Schulz was deeply unimpressed with Moretti’s statement: ‘Literature is a collective system that should be grasped as such.’ (Qtd in Schulz) She seemed to feel that literature cannot be classified in such a broad way; literature cannot be ‘counted on to obey a set of laws.’ (Schulz) Matthew Jockers, a colleague of Moretti, was very quick to defend him and to refute Schulz’s criticism. Jockers claimed that her article was under-researched and exaggerated. He also noted that many people had pointed out errors in her criticism of distant reading. (He did not, however, say what the errors were.)

Jockers discusses distant reading on his blog, where he prefers the term macroanalysis, which he feels is more appropriate. He refers to close reading as microanalysis. He states:

By way of an analogy, we might think about interpretive close-readings as corresponding to microeconomics while quantitative macroanalysis corresponds to macroeconomics. Consider, then, that in many ways the study of literary genres or literary periods is a type of macro approach to literature. (Jockers)

He then gives the example of someone studying 20th-century poetry. He notes that this scholar may be called upon to give a generalised reading of the period. Macroanalysis allows for that kind of reading, which would benefit someone wanting a general overview of the topic. He notes that macroanalysis is not there to replace microanalysis, which remains a necessary process; rather, the two forms of analysis are there to complement each other.

Dan Cohen also wrote a very interesting paper on data mining of large digital collections, in which he draws on Borges’s short story The Library of Babel. In the story the narrator is lost in a vast, maze-like library with stacks and stacks of unread books: infinite knowledge that is impossible for one person to consume in a lifetime. Cohen uses this allegory to portray the futility of trying to study such a vast collection of books by reading alone. This is where computer scientists come in and take the approach of looking for patterns in large volumes of books, i.e. data mining digital collections in an attempt to understand the vast array of knowledge stored within a million books.
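To give a rough sense of what looking for patterns across many books can mean in practice, the Python sketch below counts how many texts in a collection contain each word. It is not Cohen's method: the three one-sentence ‘books’ are invented placeholders, and the function names are assumptions made for this illustration.

```python
import re
from collections import Counter

# Invented placeholder texts standing in for a digitised collection.
CORPUS = {
    "book_a": "The library holds every book and every shelf.",
    "book_b": "No reader can finish the library in one lifetime.",
    "book_c": "A reader counts patterns across every shelf.",
}


def tokenize(text):
    """Split a text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def document_frequency(corpus):
    """Count, for each word, the number of texts in which it appears."""
    counts = Counter()
    for text in corpus.values():
        # set() ensures a word is counted once per text, however often it recurs
        counts.update(set(tokenize(text)))
    return counts


if __name__ == "__main__":
    frequencies = document_frequency(CORPUS)
    for word in ("library", "reader", "lifetime"):
        print(word, frequencies[word])
```

The same loop would run unchanged over a million texts, which is the sense in which a computer can ‘read’ a collection that no single person could.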

