“Why don’t you just digitise it all?”

If we had a £ for every time we have heard this question…

There are many reasons why archive services do not scan all historical documents and make them available electronically – one of the main reasons being the inherent instability of digital files. Most professionals would now dispute that we are really heading for a digital dark age, but that doesn’t mean we can be laissez-faire about the preservation of digital content.

As technology changes so rapidly, preservation of digital data actually requires much more active management than most of our paper and parchment collections – the computer I’m typing this on doesn’t even have a CD/DVD drive, let alone a floppy disk drive (although fortunately, we do have access to both of these within the office).

There are many examples of lost digital data, the loss of over 50 million songs from MySpace being the most recent – see “If it’s online, it’s not permanent. Internet archives can disappear.”

Here at Derbyshire Record Office we have been thinking about how we preserve digital content for many years, but this is still something very much in development. However, in the last few weeks we have made good progress and more digital archives are now being received. Watch this space for further developments.


4 thoughts on “Why don’t you just digitise it all?”

  1. Helpful and experienced staff with local knowledge beats convenience on online searching any day. Keep up the good work preserving our heritage for future generations.

  2. It’s not that simple.
    There are numerous examples of important historic libraries being lost to disaster. Many of us have had an accident: spilling a drink on a book, throwing the wrong document away, or finding that stored volumes have been affected by a water leak. Many valuable volumes have been stolen from libraries. In any case, paper has a limited “life expectancy”, and books degrade a little every time they are used. That’s why valuable historic works are stored in controlled atmospheres, perhaps only accessible to limited numbers of experts wearing cotton gloves. Digitising books is one way of ensuring that the bulk of the content remains available if the original is lost, stolen or destroyed.

    Digitising has the additional benefit that, depending on the technology, the entire body of text can be searched or analysed digitally in seconds. The obvious benefit is no longer needing to read large numbers of books to find specific references, but there can be unexpected benefits too. When the Complete Oxford Dictionary became available in fully digitised format, linguists were able to find not just the definition, etymology and examples of a word but also where that word had been used in the documentation of another entry.

    A digitised document can be made available on the internet to anyone anywhere in the world within seconds. That’s not to dismiss the importance of original documents, especially historic or handwritten works: indistinct text may be open to re-interpretation, and there may be valuable marginal annotations. So yes, we should be taking great care of original hard-copy documents, but digitising them delivers numerous benefits.

    Digitisation can be done by taking an essentially “photographic” copy or by using OCR (Optical Character Recognition). OCR has the advantage of converting the content into fully machine-readable text, in a format like a word-processor document, but is imperfect in other respects. OCR can’t (yet) cope with handwriting, and it will stumble if a document has been annotated or the print quality is poor (e.g. from an old typewriter). Both methods have their merits; in some cases the best approach is to use both.

    As the article says, digital files can be destroyed, but it’s so easy and inexpensive to store numerous copies in different locations and using different technologies that there is no longer any excuse for a complete loss. As I always say to someone who asks me for help having lost a digital file: “just restore it from your backup”. It’s at that point that they begin to regret not listening to the widespread advice to have a backup regime for anything you value. That’s true of the quoted MySpace loss: some individuals placed all their reliance on that single storage resource, while the more canny kept local copies. Most digital material on MySpace was a copy of a PC file; the users HAD their own copy, but some had subsequently deleted it. To quote MySpace themselves: “Make sure you always have an extra copy of the content you uploaded.”

    • Hi Rob, I think you’ve absolutely hit the nail on the head. Archives are defined (theoretically, if not always in practice) by their uniqueness, and in Derbyshire we very much believe that the purpose of preservation (i.e. storing the material) is to provide access to it. Perhaps in a world of infinite resources it should all be digitised and made available online – resource being another of the key factors prohibiting such an approach. I agree too that the ease with which multiple copies in different formats and locations can be made makes digital records superior to traditional archives, at least in that respect. Such backup procedures are most certainly essential, and would certainly have prevented such catastrophic losses of data. But to ensure that digital content remains accessible in the future, there are other things that archivists, records managers and those that create records in the first place need to do, and this is what we are really getting into at the moment. It is most certainly overdue, which is even more reason for us to really get to grips with it now.

      P.S. We recently came across a reference to AI technology now being developed that would enable text recognition of handwriting and other content that OCR cannot cope with.
