Wednesday, July 11, 2007

A distinguished guest




We just spent a riveting afternoon with Michael Hart, the founder of Project Gutenberg.
Michael and I are both on the Book People listserv put out by the University of Pennsylvania. Last spring he noticed something that we'd done in systematically harvesting Google books and adding the results to our web pages. He wrote me that he visits the Hamden area each July, and asked if he could visit and get a look at what we are doing with book digitization. I wrote back and told him that we would be honored if he came by to see our operation.
Eventually we settled on the date of July 10 - the day after he spoke in Cambridge for Marvin Minsky's institute and the day before he gave the same presentation in a venue in Hartford. In the past I had read that he is reclusive, so I expected a very quiet person. When he came into my office just before noon, I found out just how wrong I was.
He was very gregarious and high-energy, passing out DVDs to everyone nearby and giving us permission to copy anything inside and redistribute the texts. We immediately launched into a discussion of baseball after seeing my memorabilia lining the room. I joked that it must have been total culture shock to go from Minsky to Ballard. I showed him a few of the things we were doing with access to ebooks, starting with the work in progress of VERSO, the graphic interface to electronic reference books, and then the systematic access to Google books on particular topics. Later, I created a record for him that gave access to a dynamic link of books written by Freud, available in full on Google. I normally add pictures to these records, but there isn't a picture of Freud in my image holdings. Hart suggested that I contact a colleague of his in Honolulu to get a public domain Freud image.
During lunch at Luce's (a popular Italian lunch spot near campus), I asked him about July 4, 1971, the day that he put up the first text on a computer network. He said that he'd been to a fireworks show that evening and didn't feel like going home - opting instead for the computer lab at the University of Illinois, where there was good air conditioning. He had stopped to pick up some food along the way at a small grocery store. This was when America was ramping up for the Bicentennial, and the grocer slipped a faux parchment reproduction of the Declaration of Independence in his bag. When he got to the computer lab, the paper fell out, and a light clicked in his head. He had been pondering the idea of doing something that would endure on computers forever. He took the Declaration and began typing it manually on a teletype machine. He said that by the time he finished it was past 1 AM on the 5th, but he still counts July 4 as the anniversary of etext. Hart is the undisputed father of etexts, and he is unabashedly proud of that. At the time, he wanted to email the file to others on the networks, but learned that this would have taken down the entire net - after all, the file was a full 5 kilobytes. They worked out a system so that remote users could ask for a certain 9-track tape to be mounted containing the file, and thus the first Project Gutenberg downloads began.

He said that he had a variety of careers between 1971 and the mid 1980's. Among other things, he spent some time in San Francisco as a folk singer - both at restaurants and as a street performer. In the mid 1980's he got another career change when he happened to be running a bicycle repair operation to get money for his bike racing habit. He was called in to tune up a bicycle for a monk who was friends with the new Provost at Benedictine College. The Provost asked him what other skills he had, and Hart mentioned that he could work wonders with computers. Hart built the man a computer that had multiple floppy drives and two hard drives sharing data in such a way that if one broke the other was an automatic backup. This souped-up computing was so impressive to the Provost that he hired Hart as an adjunct professor, and gave him the task of creating the world's first electronic library.

By 1991, when the Internet was starting to become an everyday reality for academics, his library had grown to 10 titles - other government documents such as the Constitution and the Bill of Rights, and the Bible. He said the addition of Alice in Wonderland "changed everything." "The big difference with Alice was that people of all ages read it, and kids brought their parents and grandparents to the computer to read it, and vice versa. . .it was our first 'big hit,' and I knew from a few events in 1989, prior to the official release, that the whole eBook thing was actually working. . .people read them, end to end to end. . .even people I never expected!!!"
He said that many of the projects that happen in technology are subject to the 'S' curve. At the beginning, nobody believes that a project can possibly be done. Then it does happen, and gathers so much momentum that nobody thinks it could ever stop. Then it hits a point where it can no longer sustain the growth and it slows down dramatically. The classic example of this was the "Dot Com Bubble Burst." He quoted an old Chinese proverb that "The person who thinks something cannot be done should not interrupt the person who is doing it."

In the early 1990s, Hart set a goal of 10,000 books freely available online. He attracted enough attention that a volunteer army formed of people who typed public domain works into the computer. Many people said that the goal was unworkable, but he did reach it in 2003. He admits that he is a workaholic - doing his job until he drops from exhaustion nearly every day. He said that he had a new goal of a billion books. Yes, a billion. The way this could happen is to take every one of the public domain books on the Internet and translate them into every major language on Earth - 250 languages with at least one million speakers. He asked for guesses as to what the five most prevalent languages on the internet are. The first few are easy - English, Chinese and Spanish. After that it gets rocky - Charles correctly guessed Hindi. Hart said "You'll never guess the last one." I chimed in with "Urdu," and to my great surprise, that was correct. He said that web translation machines are at about the point that OCR was in the late 1980's: barely good enough to make them a starting point for a project, knowing that much intervention will be necessary. He mentioned that in 1998, he was one of Wired magazine's "Wired 25." He was flown to Los Angeles and hosted at a red carpet party with the other 24 - including Steve Jobs and the man he sat next to - Robert Altman.
I walked Hart to his car after four and a half hours of high energy brainstorming. We all felt quite privileged to meet a major pioneer of access to information.






From now until August 4, you can visit Hart's latest project - an online ebook festival at www.worldebookfair.com/, and choose from 750,000 ebook files for free.

