Crowdsourcing

Crowdsourcing lets archives that offer online access to scanned documents and other materials enlist the general public to perform tasks that would usually be handled by in-house staff.  At present, this work consists primarily of transcribing digital documents or annotating digital scans.  Crowdsourcing has several advantages: it draws new users to a site; it expands the “reach” of a collection beyond its core users in the academic or specialist community; and it helps transcribe or otherwise annotate documents and other digital images efficiently, without the expense of hiring outside freelancers or diverting current staff.

For our assignment, we looked primarily at sites that use crowdsourcing to transcribe handwritten documents (The Collected Works of Jeremy Bentham and Papers of the War Department), correct machine-generated transcriptions (Trove), and annotate digital imagery (NYPL Building Inspector).  The findings reported by these sites were remarkably consistent: most of the work is done by a small coterie of “power users,” and users tend to be highly educated, retired, and driven by a sense of serving the “common good.”  Many institutions initiate a crowdsourcing project because they lack the budget or manpower to do the work on their own. I thought it was telling, then, that the one site that evaluated the cost benefits of crowdsourcing (the Jeremy Bentham papers) found that the money spent on hiring two project managers to oversee the volunteer transcribers could have been better spent on having the two managers perform the transcription work themselves.  Of course, this doesn’t factor in the cost of having outside editors review the managers’ work, which would also be necessary.

Indeed, crowdsourcing sites must be carefully designed to be easy to use, with few barriers to participation; otherwise, few volunteers will complete the work.  Further, full-time curators are needed to assist the volunteers, which, as the Bentham experience shows, is not inexpensive.  Site management and design can be quite costly to implement, and there is not much information yet on the long-term benefits of this approach.  Will people stay engaged with a site long enough to transcribe what are often massive amounts of papers or annotate a great number of digital scans?

My own main motivation for participating in these activities was interest in the content itself.  The task of transcribing is fairly tedious, and the handwritten documents are difficult to read. Then again, those who are fascinated by the subject matter are willing to perform what can be time-consuming work.  I am personally skeptical of the thinking behind NYPL’s Building Inspector project: that individuals will use their spare time while waiting in line to correct the tracing of building footprints on old fire insurance maps. This is not the kind of engaging “gamification” that one finds in Candy Crush or similar addictive apps and websites.  It will be interesting to see over time whether enough material is reviewed to achieve the project’s goals.
