We can run Proust through social network analysis (SNA) software to generate network models of its information nodes. The visualization to the left shows the association Venice (an association is basically an uncategorized tag I use to label church-related passages) as it is networked among passages (referenced by their ID numbers and pagination codes) and notes on narrative context. When manipulated in real time, the visualization highlights the links to other nodes and their related concepts or passages. What this means for the study of Proust is that we can think of the novel as a network of nodes consisting of concepts, characters, narrative elements, and any other unit of meaning that might enhance exploration of its text.
We can even include various texts and external information for a genetic or contextual study of the novel. In a hypothetical archive containing digitized avant-texte and published variants of the Recherche, we could potentially see, in strikingly visual terms, something like the impact of WWI on the development of different sections. This could provide new insights into Proust's writing process as the work continually ballooned and changed during and after the war. What kinds of associations got the most development during and immediately after the war, and at what points in the narrative did they occur? Which churches received the most attention, and where are they located in both fictional and real space? There is a wealth of traditional scholarship addressing genetic and contextual issues like these. However, SNA presents an opportunity to view all of the information nodes simultaneously, which makes it a much more powerful (and accurate) tool for the study of a book than other print books are. In that way, we would see and move around in the Recherche as a writerly text that responds to its own inner needs in reference to the war.
Tools like SNA also present new pastures for narratology. If the text is marked up appropriately, all instances of a particular narrative device or structure could be instantly recalled by a researcher and viewed in relation to any parameters desired. It would be an even more comprehensive supplement to the text than the work of Barthes and Genette.
All of this goes to say that a more rigorous taxonomy of the Recherche would be necessary for a meaningful SNA application. At present, the interpretive apparatus of the Ecclesiastical Proust Archive consists solely of associations, which are uncategorized tags denoting concepts, themes, important details, architectural elements of the churches described, and so on.
It would be far more meaningful to tag separately the characters, churches, architectural elements, themes, concepts, and plot elements that make up the rich density of this novel, as well as the images and other media added here to illustrate it, so that they become individual nodes within the information network. Then a far more rigorous and powerful visualization of the novel would be possible, and new discoveries would almost certainly be made. But so far, these notions pertain only to my particular study of the church motif. A far richer application would be possible if we took an entire electronic text (or, better, all of the variants and translations) and allowed researchers to mark it up and add media by way of illustration. In that way, the Proust archive would become a collaborative, electronic research and editing environment that takes shape from individuals' own scholarly pursuits.
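To make the idea of separately tagged nodes a little more concrete, here is a minimal sketch in Python. It is not how the archive is currently built. The `Node` and `ArchiveNetwork` names, the node kinds, the passage IDs, and the links between them are all hypothetical placeholders chosen for illustration. The point is only that once each node carries a category, a researcher can ask for "every passage linked to Venice" or "every church linked to this passage" instead of browsing one flat list of associations.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One unit of meaning in the network (hypothetical model)."""
    kind: str   # e.g. "passage", "church", "character", "theme"
    label: str  # human-readable name or passage ID


class ArchiveNetwork:
    """Undirected network of categorized nodes."""

    def __init__(self):
        self._edges = defaultdict(set)

    def link(self, a, b):
        # Record the connection in both directions.
        self._edges[a].add(b)
        self._edges[b].add(a)

    def neighbors(self, node, kind=None):
        # Return linked nodes, optionally restricted to one category.
        linked = self._edges.get(node, set())
        return sorted(
            (n for n in linked if kind is None or n.kind == kind),
            key=lambda n: (n.kind, n.label),
        )


# Invented sample data: these links are placeholders, not archive records.
venice = Node("theme", "Venice")
jealousy = Node("theme", "jealousy")
san_marco = Node("church", "San Marco")
p1 = Node("passage", "passage-1")
p2 = Node("passage", "passage-2")

net = ArchiveNetwork()
net.link(p1, venice)
net.link(p1, san_marco)
net.link(p2, venice)
net.link(p2, jealousy)

# Which passages are linked to Venice?
print([n.label for n in net.neighbors(venice, kind="passage")])
```

With uncategorized associations, the `kind` filter would have nothing to work with, which is the practical argument for a more rigorous taxonomy.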
Social network analysis (SNA) software combines a variety of methods commonly used in digital humanities research, such as text mining, visualization, and modeling. SNA software can pore over the data and metadata in the archive's XML files and generate a network of nodes. It could be trained to recognize and normalize names, or even pseudonyms, and the metadata provided by readers would tell it whether a given passage contained the idea of Venice, or the subject/object distinction, or jealousy (or all three). If it had a qualitative analysis component, it could even recognize concepts. And of course scholars would need to be able to add and tag information about the documents in the archive. This is a daunting task, but eminently possible with the aid of text and data mining software.
There are pitfalls, of course. The accuracy of any analytic tool depends on the quality of the data it operates upon. We must always be aware that tools like these, powerful and impressive though they are, represent only a given state of the information realm. This is no different from traditional, print-based scholarship, but it bears consideration given the sometimes exaggerated hype of digital humanities at the time of this writing.
So, now that AccessTEI has provided us an XML file with structural TEI markup, I'll be looking for ways to mine it with text analysis and SNA software. Stay tuned for more updates.
As an aside, SNA has tremendous possibilities for the study of modernist magazine culture, which is an actual publishing network. See my post from earlier today at the Magazine Modernisms blog.
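For readers curious about what "generating a network of nodes" from XML might involve, here is a rough sketch using only Python's standard library. The element names (`archive`, `passage`, `association`) and the sample passages are assumptions made up for this example. They do not reflect the archive's actual schema or the TEI file from AccessTEI. The sketch links two associations whenever they are tagged on the same passage and counts how often that happens, which is one simple way (among many) to derive weighted edges.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical markup: invented element names and invented passages.
SAMPLE_XML = """
<archive>
  <passage id="p1">
    <association>Venice</association>
    <association>jealousy</association>
  </passage>
  <passage id="p2">
    <association>Venice</association>
    <association>jealousy</association>
    <association>subject/object distinction</association>
  </passage>
  <passage id="p3">
    <association>Venice</association>
  </passage>
</archive>
"""


def cooccurrence_edges(xml_text):
    """Count how often each pair of associations shares a passage."""
    root = ET.fromstring(xml_text.strip())
    weights = Counter()
    for passage in root.iter("passage"):
        # Deduplicate and sort so each pair has one canonical order.
        tags = sorted(
            {(a.text or "").strip() for a in passage.iter("association")} - {""}
        )
        for pair in combinations(tags, 2):
            weights[pair] += 1
    return weights


edges = cooccurrence_edges(SAMPLE_XML)
for (a, b), weight in edges.most_common():
    print(f"{a} <-> {b}: {weight}")
```

The resulting edge list could then be handed to a visualization or SNA package. As noted above, the network is only as good as the tagging behind it: a passage with a missing or inconsistent association simply drops out of the picture.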