Livres sans frontières and with search capability

Dennis Baron begins “From Pencils to Pixels,” the conclusion to A Better Pencil, by discussing Google’s digitization efforts as a means of reflecting on the various issues of authenticity, authority, and reader-/authorship explored throughout the book as a whole. Massive online projects like Internet Archive and Google Books may be to many (as Baron puts it) “the future of the book and its death” (227), the former because they re-produce easily accessible and searchable versions of out-of-print or rare texts; the latter because they re-present many of these texts with additions/substitutions that fundamentally change the way in which the text is approached. For instance, covers are often changed and advertisements and endpapers elided for the ostensible purpose of streamlining reading, yet this also makes a statement about what types of text are deemed important by those not necessarily invested in their study.

Original WW cover
This original cover of the Woman’s World was specifically commissioned after Oscar Wilde–an ardent proponent of tasteful ornamentation–took over the editorship of the periodical.
Google WW cover
Interestingly, this Google Books digitization not only changes the cover image (adapting it from an illustration in the first issue), but is in fact made from a facsimile of this periodical rather than an original.

I can safely say that without digitization, much of my own scholarly work would be nearly impossible, and this is likely the case for most literature and book history scholars at this point. The ephemeral nature and far-flung archives of many of the Victorian periodicals that I study, as well as the sheer bulk of their production, combine to make this a challenging field indeed for anyone not willing to use digital tools to facilitate access. Without such tools, one would have to either resign oneself to limited access or face being overwhelmed by an avalanche of (often uncatalogued) moldering pages, a dichotomy Baron also hints at (231-232). Citing Anthony Grafton, Baron notes that despite the fears of technophobes, “scanning is no replacement for the actual physical print object,” not only because of accidental omissions, but also because “the print artifacts themselves tell the reader more than the words on the page” (229). Even so, scholarly sites committed to maintaining the authenticity (oh troubling word!) of the texts they digitize, such as The Yellow Nineties Online and the Wellesley Index to Victorian Periodicals, are still naturally more attentive to producing fully searchable versions of books and periodicals in their entirety. Others, like the Modernist Journals Project, even create a hierarchy that explicitly shows which texts have been prioritized for digitization: the MJP Directory explains, “we have indicated on this list the journals we consider most suitable for digitization (in red type), and others that we consider interesting but would put second in order of priority (in blue type). Journals in purple type are those we have already digitized (wholly or partly).” This adds another layer to the ways in which remediation is often a means of rewriting, at least in part, the text in question.

As once exclusively archival texts are re-mediated in online formats and made available to the public, these “publication” processes necessarily lead us to contemplate what is written over during digitization, even while other things are written out more clearly.

7 thoughts on “Livres sans frontières and with search capability”

  1. Petra,
    If Baron talks about the death of printed books, or its possibility, the way that we all might feel in the recurrent moments of nostalgia, fear and prediction, your response clearly enough reminds us of an undeniable revival that should be valued and celebrated every day. None of us can imagine how one could carry out the simplest research project just twenty years ago. Simply put, and beyond any theoretical sophistication, let’s be grateful!
    Still, whether a scanned text is an image of the real text or the text itself is a Platonic question in the realm of archival documents and texts. Are we really getting far from the truth of each text as we digitize it and publish it in a new format? Is form dependent on format? I do not stand for any anti-digital approach that exaggerates the catastrophe of digitization, but I also believe that each printed work bears a portion of the text’s identity which is not replaceable or transferable. Is an online version capable of building its own identity through the passage of time?

  2. I’m with you both on the incredible wealth of materials and texts that open up to us because of digitization and archivization.

    With my work in 19th-century African American newspapers, what I have found the most problematic is how metadata is selected and organized. The person I was interested in, William Parker, was on the run from the law for murder and treason charges from 1851 through the end of his life; he lived in Canada but returned (anonymously, in the newspapers, anyway) to PA at least once. Because he isn’t named, a straight search for his name won’t turn him up, even though I know he came back.

    In that sense, having the original newspapers (besides being incredibly fragile and hard to search) wouldn’t be any better for me as a researcher than reading the online versions, because the search feature wouldn’t help me in either case.

    Petra, have you come up against similar searchability issues with the sources you research? Are there times when you just have to get your hands on the original to understand it in a way that digitization doesn’t permit?

  3. I have run into this sort of problem so many times, and my field isn’t as necessarily historical (I mean, I still do lots of history, but most of it is post-internet, rather than pre-internet). A lot of the research I do involves texts which are not really built to last (popular paperbacks) or considered not important enough to archive and digitize immediately. Documents get waitlisted, and sometimes those waitlists exceed the reasonable lifespan of the document. For example, two years ago, I needed articles from the last six months of Indian newspapers. Those documents are low-priority for digitization (for some very practical and some sadly colonialist reasons), and even when they do get archived (like at Columbia), they rarely get archived, are in poorly maintained hardcopy, or get sabotaged by metadata formats built for specific cultural instances. (A search for a journalist contact’s work by her last name in an archive of Indian national papers, for example, was enormously counterproductive.)

    Short version: You are so, so right.

  4. I agree with you entirely about the wealth of materials made available by digitization. My senior thesis paper was on Southern history, particularly dealing with the mythology and civil religion of the Confederacy. All of my supporting primary source research was done online. I still feel a bit strange about that, because I’ve been trained to regard REAL research as happening in document archives and historical societies, not from the comfort of my couch.

    If things hadn’t been digitized, however, my paper would not have happened. Traveling to an archive/research center in North Carolina, where most original documents were housed, was not an option. Accessing original copies of mid-1860s newspapers would have been impossible. Digitization, whether through scanned images or transcriptions, allowed me to indulge my interests and support my thesis from several hundred miles away.

    Through all of my research, I had never thought that any of my sources would be altered in a way that fundamentally changes their meaning. I suppose that even considering the relative value of digital research versus that of hands-on investigation demonstrates the impact that the option of online work has had, despite the fact that the final synthesized product has the same meaning either way.

  5. Petra,

    This is such a rich area of digital writing/reading to think about. Last semester, when I was working on that paper for Dr. Brückner’s class on _The History of Constantius and Pulchera_ (1789), I was very excited to find that the magazine in which the text had originally been serialized had been scanned and digitally archived by ProQuest. In fact, the argument I ended up making would have been impossible without access to that text– so I was, as others have expressed, incredibly grateful.

    But for ease of searching, ProQuest chose to archive each individual piece within the magazine separately, with each column chunk as a separate PDF, rather than scanning complete pages. Good in theory– but the result was this incredibly strange, fragmented reading experience– one that was certainly different from the experience that its contemporary audience would have had. This was a particular concern for me, since I was trying to look for clues about the text’s framing and intended reception. Even more disconcerting was what got left out as a result– what didn’t fit neatly into a category of a certain article, and thus wasn’t available. (I’m thinking in particular of an engraving that might have lent some clarity to how the text was being positioned for the audience.)

    Baron tells us that “[t]he trick to dealing with too much information is filtering out what to ignore” (Loc 3954). But such decisions must include an awareness of the purposes to which a text might be put, must account for the various contexts in which that text might have meaning. In the case of ProQuest’s archiving of _Constantius and Pulchera_, I think it’s safe to say that searchability was prioritized over completeness or “authenticity.” I’m not implying that this choice was wrong– the fact is, I probably wouldn’t have located the text if it wasn’t so well archived. But the decision of how to digitize it also constrained the uses to which I was able to put it.

    Anyway, thanks for the thought-provoking post! Hope we talk more about this in the future.


  6. Petra,
    To enhance your argument, I would like to mention TEI (Text Encoding Initiative) as a way that we can think about programming (encoding) texts such that they try to maintain the kind of features that give us clues (smudges on documents; someone’s handwriting or drawing at the top of a letter) to who the author was and how they were writing.

    I only know a few bits of TEI myself, but the programming allows us to theorize about types of marks on pages, positions of words and objects, and really try to understand how an archival document was put together and how we can understand it better (which means different things to different people).

    I think this sort of thing is especially useful for us Englishy folk, as we like to not only think about cool, old documents, but we also like to theorize about what these things are and how the parts of these documents make up the wholes of something as simple as a letter from Elizabeth Gaskell.

  7. A thoughtful post, Petra, which prompted a series of engaged responses. For the most part, I share the interest of everyone here in “cool, old documents,” to borrow Katie’s nice phrase, although I do worry a bit that this sort of discussion sounds a bit like one of those conversations in which people who really love vinyl complain that digitized music is cold and remote and filled with tiny gaps that only the super-sensitive can detect. I mean, there are times when I just want to focus on the text rather than the document.

    And while not a digital fanatic, I’m also a little wary of a sentimental attachment to print. Doesn’t digitization sometimes improve not only access but quality? I’m thinking here of something like the William Blake Archive, which allowed me to see why people make such a fuss over his plates in a way that looking at them in books never did.

    Interesting stuff to think about!

