Freedom, They Printed
AI on Bridges Solves Mystery of Who Printed Seminal Works on Liberty
In the 17th century, you could get jailed or even executed for criticizing the government of England. But a flood of books on civil liberties, produced at great risk by anonymous printers, helped change that. An artificial intelligence (AI) analysis of irregular letters using PSC’s Bridges platform has helped a Carnegie Mellon team solve the mystery of who printed nine of these seminal works.
Why It’s Important
Prior to the 18th century, expressing your ideas on politics, religion, even divorce—anything your country’s leaders found threatening—could get you arrested in England. Could get you killed.
The 17th century English Civil War proved a boon to free speech. Censorship was disrupted. A tidal wave of forbidden publications flooded England. It was still dangerous to write and print such books. Some authors were anonymous. Others were willing to take the risk of putting their names to their work. But the vital printers—without whom circulation of the books would have been far more limited—were mostly anonymous. Historians don’t know the printers of an astonishing 25 percent of English-language books in the 17th century.
“Take a second and think about a world without the First Amendment. There’s no such thing as freedom of the press; publishing is tightly regulated by the guild, the Stationers’ Company, [with additional control] by the crown and Parliament. They all had their own interests, but frequently combined to tamp down anything they deemed threatening.”—Christopher Warren, Carnegie Mellon University
One of the most important of these books was John Milton’s Areopagitica of late 1644. Its plea for freedom of speech helped transform England. It led indirectly to the U.S. First Amendment. But its printers were unknown.
Researchers at Carnegie Mellon University—Christopher Warren, Taylor Berg-Kirkpatrick, Max G’Sell, and Shruti Rijhwani—wondered if they could solve that mystery. They turned to PSC’s Bridges.
How PSC Helped
In the 17th century, printers would set lead type—one piece for each letter—in a wooden rack, backwards. They smeared the type with ink and pressed a piece of paper onto it. The ink transferred right-way letters onto the paper. But type pieces were imperfectly cast, and the pieces suffered damage from use. Some developed tiny irregularities. These would show up every time that type piece was used.
David Como of Stanford University proved that such irregularities were as good as fingerprints. But the human labor of matching irregular letters between books by unknown and known printers made the work very slow. The CMU team wondered whether they could use Ocular, a computer program for analyzing type, to compare the type in the first edition of Areopagitica and several other forbidden books of the period with the type in about 100 books whose printers were known.
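The fingerprint idea can be illustrated with a toy sketch. It is not Ocular and not the researchers' method; the tiny bitmaps, the printer labels, and the overlap score are all assumptions made up for illustration. Each impression of a letter is compared with a clean letterform, the differing pixels are treated as that type piece's damage, and the unknown impression is assigned to the known printer whose damage overlaps most.

```python
# Toy illustration only: bitmaps and printer labels are invented for this sketch.
# "#" marks ink, "." marks blank paper.

CLEAN_L = [
    "..#..",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
]

# Hypothetical damaged type pieces from two printers whose books are known.
KNOWN_SORTS = {
    "printer_A": ["..#..", "..#..", "...#.", "..#..", "..#.."],  # bent stem
    "printer_B": ["..#..", "..#..", "..#..", "..#..", ".##.."],  # spur at foot
}

# Impression from an anonymous book: the bent stem plus a stray ink speck.
UNKNOWN = ["..#..", "..#..", "...#.", "..#..", "..#.#"]


def ink(rows):
    """Return the set of (row, col) positions that carry ink."""
    return {(r, c) for r, row in enumerate(rows)
            for c, ch in enumerate(row) if ch == "#"}


def damage(rows, clean=CLEAN_L):
    """Pixels where an impression differs from the undamaged letterform."""
    return ink(rows) ^ ink(clean)


def similarity(a, b):
    """Jaccard overlap of two damage sets (1.0 means identical damage)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def best_match(unknown, known):
    """Return the known printer whose damage best overlaps the unknown's."""
    target = damage(unknown)
    scores = {name: similarity(target, damage(rows))
              for name, rows in known.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


MATCH, SCORE = best_match(UNKNOWN, KNOWN_SORTS)
print(MATCH, round(SCORE, 2))
```

Real page scans vary in inking, paper, and alignment, so a fixed pixel comparison like this would be far too brittle in practice. That is part of why a statistical model of the printing process is needed.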
They used a machine learning (ML) approach to recognize text in old printed documents and to match irregular characters across books from the same printer. Their method was a custom probabilistic generative model of the printing press, in which the AI simultaneously inferred the parameters of the printing process and the actual text that was printed. The researchers applied this model to type from both known and unknown volumes. Bridges was particularly helpful because it can store data and move it fluidly between many parallel nodes. The system’s user-friendly architecture also made it possible to run Ocular, which had not been written for supercomputers. The researchers supplemented the machine analysis with human expertise.
“We recognized we needed a flexible combination of a lot of storage and number of cycles to do the OCR [optical character recognition] process … What we were trying to do on every page was to use a number of variables to predict what letter each set of pixels in the image represents.”—Christopher Warren, Carnegie Mellon University
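Predicting a letter from a set of pixels can be sketched in a few lines. This is a deliberately simplified stand-in, not the team's model: it assumes three invented 3x3 letter templates, a fixed chance (`FLIP`) that any pixel prints wrong, and equal prior odds for each letter. The real system also infers the printing parameters; here they are held fixed and only the letter is inferred.

```python
import math

# Hypothetical 3x3 letter templates; "#" is ink, "." is blank.
TEMPLATES = {
    "I": [".#.", ".#.", ".#."],
    "L": ["#..", "#..", "###"],
    "T": ["###", ".#.", ".#."],
}

FLIP = 0.1  # assumed probability that a pixel prints differently from its template


def log_likelihood(observed, template, flip=FLIP):
    """Log-probability of the observed pixels if `template` was the letter set."""
    total = 0.0
    for obs_row, tmpl_row in zip(observed, template):
        for o, t in zip(obs_row, tmpl_row):
            total += math.log(1 - flip) if o == t else math.log(flip)
    return total


def posterior(observed, templates=TEMPLATES):
    """Posterior over letters under a uniform prior (Bayes' rule)."""
    scores = {letter: log_likelihood(observed, tmpl)
              for letter, tmpl in templates.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {letter: math.exp(s - top) for letter, s in scores.items()}
    norm = sum(weights.values())
    return {letter: w / norm for letter, w in weights.items()}


# An "L" whose bottom-right corner failed to print.
OBSERVED = ["#..", "#..", "##."]

POSTERIOR = posterior(OBSERVED)
BEST = max(POSTERIOR, key=POSTERIOR.get)
print(BEST, POSTERIOR)
```

The observed image differs from the "L" template in one pixel and from the other two templates in five, so "L" receives nearly all of the probability even though the impression is imperfect.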
The solution to the mystery was worthy of an Agatha Christie novel—specifically, Murder on the Orient Express, in which (spoiler alert) they all did it. Historians had long suspected that printer Matthew Simmons had been involved in publishing Areopagitica. He was known to print forbidden books and had printed Milton’s non-forbidden publications. But they lacked hard evidence. The ML analysis showed that type in Areopagitica matched books known to have been Simmons’. More surprising, the type also matched works by Simmons’ ex-partner, Thomas Paine. The researchers don’t know what this means; historians had thought the two had broken their partnership by late 1644 and had never worked together after.
The type also matched works printed previously by Gregory Dexter. But he had been shut down in a government raid early in 1644. Soon after, Dexter left England for the colony that was to become the U.S. state of Rhode Island. The researchers don’t know how Dexter’s type pieces came to appear in Areopagitica. Possibly, Simmons or Paine bought them when Dexter’s business was liquidated. As Warren and his coauthors write in the Spring 2020 edition of the journal Milton Studies, their analysis “raises nearly as many questions as it answers.”
The next step will be to expand the analysis. The Milton Studies paper identified Simmons and Paine as the printers of eight other books on civil liberties, including one by Roger Williams, the founder of Rhode Island. But hundreds still lack an identified printer. The team is investigating deep learning, in which layers of inference build up to create a more sophisticated artificial intelligence, to tackle this much larger problem. They’d also like the analysis to work without enhancement from human expertise and to be sensitive to more subtle measures of type irregularity. This will involve 10,000 books, covering all the anonymous books and every known printer over more than a decade in the 1600s. Such work will use the deep-learning-specialized Bridges-AI and the future Bridges-2 system, each of which contains many coupled AI-optimized graphics processing units (GPUs) for large-scale deep learning.