LOC/NEH Chronicling America Bulk Exports Available

Kalev Leetaru

14 years ago

While at the Library of Congress to present the keynote address opening the 2012 IIPC General Assembly, Kalev met with the staff behind the Library of Congress / National Endowment for the Humanities Chronicling America initiative to talk about ways the collection could be made more readily available for data mining. One of Kalev's recommendations to the Library was to make bulk export files containing the collection's entire OCR output available to researchers. As a result of those discussions, the Library of Congress has created a set of bulk exports of the collection, which it now makes available for academic research.