The fascinating story of Perl’s role in the dynamic web spans newsgroups and mailing lists, computer science labs, and continents.
The web’s early history is generally remembered as a few seminal events: the day Tim Berners-Lee announced the WWW-project on Usenet, the document with which CERN released the project’s code into the public domain, and of course the first version of the NCSA Mosaic browser in January 1993. Although these individual moments were certainly crucial, the period is far richer and reveals that technological development is not a set of discrete events, but rather a range of interconnected stories.
One such story is how exactly the web became dynamic, which is to say, how we got web servers to do more than serve static HTML documents. This is a story that spans newsgroups and mailing lists, computer science labs, and continents—its focus is not so much one person as one programming language: Perl.
CGI scripts and infoware
In the mid- to late-1990s, Perl and the dynamic web were nearly synonymous. As a relatively easy-to-learn interpreted language with powerful text-processing features, Perl made it easy to write scripts to connect a website to a database, handle form data sent by users, and of course create those unmistakable icons of the ’90s web, hit counters and guestbooks.
Such website features came in the form of CGI scripts, named for the Common Gateway Interface, first implemented by Rob McCool in the NCSA HTTPD server in November 1993. CGI was designed to allow for drop-in functionality, and within a few years one could easily find archives of pre-cooked scripts written in Perl. An infamous case was Matt’s Script Archive, a popular source whose scripts unintentionally carried security flaws and inspired members of the Perl community to create a professional alternative called Not Matt’s Scripts.
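To make the drop-in idea concrete, here is a minimal sketch of what a CGI script does, written in Python rather than Perl purely for illustration. Under CGI, the server passes request details to the script through environment variables such as QUERY_STRING, and the script prints headers, a blank line, and then the page. The function name `build_response` and the form field `name` are assumptions made up for this example, not taken from any historical script.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(environ):
    """Build a CGI response (headers, blank line, body) from the request environment."""
    # The server passes the URL's query string in the QUERY_STRING variable.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # Hypothetical form field "name"; fall back to a default when it is absent.
    name = params.get("name", ["visitor"])[0]
    # Escape user input before placing it in HTML.
    body = "<p>Hello, {}!</p>".format(html.escape(name))
    # A CGI script writes headers, then a blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by a web server as a CGI script, the response goes to standard output.
    sys.stdout.write(build_response(os.environ))
```

Because the script only reads its environment and writes to standard output, it can be dropped into a server's script directory without changing the server itself, which is the property that made archives of ready-made scripts possible.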
At the same time that amateur and professional programmers took up Perl to create dynamic websites and applications, Tim O’Reilly coined the term “infoware” to describe how the web and Perl were part of a sea change in the computing industry. With innovations by Yahoo! and Amazon in mind, O’Reilly wrote: “Traditional software embeds small amounts of information in a lot of software; infoware embeds small amounts of software in a lot of information.” Perl was the perfect small-but-powerful tool (the Swiss Army Chainsaw) that powered informational media from large web directories to early platforms for user-generated content.
Forks in the road
Although Perl’s relationship to CGI is well documented, the links between the programming language and the rise of the dynamic web go deeper. In the brief period between the appearance of the first website (just before Christmas 1990) and McCool’s work on CGI in 1993, much of what defined the web in the 1990s and beyond (from forms to bitmaps and tables) was up in the air. Although Berners-Lee was often deferred to in these early years, different people saw different potential uses for the web, and pushed it in various directions. On the one hand, this resulted in famous disputes, such as questions of how closely HTML should follow SGML, or whether to implement an image tag. On the other hand, change was a slower process without any straightforward cause. The latter best describes how the dynamic web developed.
In one sense, the first gateways can be traced to 1991 and 1992, when Berners-Lee and a handful of other computer scientists and hypertext enthusiasts wrote servers that connected to specific resources, such as particular CERN applications, general applications such as Oracle databases, and wide area information servers (WAIS). (WAIS was the late 1980s precursor to the web developed by, among others, Brewster Kahle, a digital librarian and founder of the Internet Archive.) In this way, a gateway was a custom web server designed to do one thing: connect with another network, database, or application. Any dynamic feature meant running another daemon on a different port (read, for example, Berners-Lee’s description of how to add a search function to a website). Berners-Lee intended the web to be a universal interface to diverse information systems, and encouraged a proliferation of single-purpose servers. He also noted that Perl was “a powerful (if otherwise incomprehensible) language with which to hack together” one.
However, another sense of “gateway” suggested not a custom machine but a script, a low-threshold add-on that wouldn’t require a different server. The first of this kind was arguably Jim Davis’s Gateway to the U Mich Geography server, released to the WWW-talk mailing list in November 1992. Davis’s script, written in Perl, was a kind of proto-Web API, pulling in data from another server based on formatted user queries. Highlighting how these two notions of gateway differed, Berners-Lee responded to Davis requesting that he and the author of the Michigan server “come to some arrangement,” as it would make more sense “from the network point of view” to only have one server providing this information. Berners-Lee, as might be expected of the person who invented the web, preferred an orderly information resource. Such drop-in gateways and scripts that pulled data in from other servers meant a potential qualitative shift in what the web could be, extending but also subtly transforming Berners-Lee’s original vision.
Going Wayback to the Perl HTTPD
An important step between Davis’s geography gateway and the standardization of such low-threshold web scripting through CGI was the Perl HTTPD, a web server written entirely in Perl by grad student Marc Van Heyningen at Indiana University in Bloomington in early 1993. Among the design principles Van Heyningen laid out was easy extensibility—beyond the fact that using Perl meant no compiling was necessary, the server included “a feature to restart the server when new features are added to the code with zero downtime,” making it “trivial” to add new functionality.
The Perl HTTPD stood in contrast to the idea that servers should have a single, dedicated purpose. Instead, it hinted at an incremental, permanently beta approach to software products that would eventually be considered common sense in web work. Van Heyningen later wrote that his reason for building a server from scratch was that there was no easy way to create “virtual documents” (i.e., dynamically generated pages) with the CERN server, and joked that the easiest way to do this was to use “the language of the gods.” Among the scripts he added early on were a web interface to Sun’s man pages and a gateway to Finger, an early protocol for sharing information about a computer system or user.
Although Van Heyningen’s server at Indiana University was primarily used to connect to existing information resources, he and his fellow students also saw the potential for personal publishing. One of its more popular pages from 1993 to 1994 published documents, photographs, and news stories around a famous Canadian court case for which national media had been gagged.
The Perl HTTPD wasn’t necessarily built to last. Today, Van Heyningen remembers it as a “hacked up prototype.” Its original purpose was to demonstrate the web’s usefulness to senior staff who had chosen Gopher to be the university’s network interface. Van Heyningen’s argument-in-code included an appeal to his professors’ vanity in the form of a web-based, searchable index of their publications. In other words, a key innovation in server technology was created to win an argument, and in that sense the code did all that was asked of it.
Despite the server’s temporary nature, the ideas that accompanied the Perl HTTPD would stick around. Van Heyningen began to receive requests for the code and shared it online, with a note that one would need to know some Perl (or someone who did) to port the server to other systems. Soon after, Austin-based programmer Tony Sanders created a portable version called Plexus. Sanders’s web server was a fully fledged product that cemented the kind of easy extensibility that the Perl HTTPD suggested, while adding a number of new features such as image decoding. Plexus in turn directly inspired Rob McCool to create an “htbin” for scripts on the NCSA HTTPD server, and soon after that the implementation of the Common Gateway Interface.
Alongside this historical legacy, the Perl HTTPD is also preserved in a more tangible form—thanks to the wonderful Internet Archive (the Wayback Machine), you can still download the tarball today.
For all the tech world’s talk of disruption, technological change is in fact a contradictory process. Existing technologies are the basis for thinking about new ones. Archaic forms of programming inspire new ways of doing things today. Something as innovative as the web was very much an extension of older technologies—not least, Perl.
To go beyond simple timelines of seminal events, perhaps web historians could take a cue from Perl. Part of the challenge is material. Much of what must be done involves wrangling structure from the messy data that’s available, gluing together such diverse sources as mailing lists, archived websites, and piles of books and magazines. And part of the challenge is conceptual—to see that web history is much more than the release dates of new technologies, that it encompasses personal memory, human emotion, and social processes as much as it does protocols and Initial Public Offerings, and that it is not one history but many. Or as the Perl credo goes, “There’s More Than One Way To Do It.”
This is the first article in Opensource.com’s Open Community Archive, a new community-curated collection of stories about the history of open source technologies, projects, and people.