Explorations: the Cornell Theory Center's Online Science Book

Margaret Corbit
Cornell Theory Center
Cornell University
Ithaca, NY USA

Michael Herzog
Cornell Theory Center
Cornell University
Ithaca, NY USA

1.0 Introduction: Audience and Scope of Project

As part of its mission, the Cornell Theory Center (CTC), one of four National Science Foundation-supported supercomputing centers in the US, communicates information about research conducted using its resources (hardware, software, and staff expertise) through a variety of media at several levels of technical detail (http://www.tc.cornell.edu). This work includes the production of a periodic book featuring scientific and engineering applications of interest to a lay audience.

CTC's 1996 online science book, Explorations, is intended for a mixed audience that includes researchers, college students, the media, K-12 students, legislators, and the general public. Our goal was to retain the graphic appeal of hardcopy publications and to enhance the publication through the incorporation of hypermedia technologies. We wanted to involve the viewers at the same time that we entertained them.

Target audiences access the World Wide Web via all possible computing platforms, from high-end workstations with accelerated graphics processors to outdated personal computers donated to schools and libraries. In general, CTC tries to accommodate low-end users when providing online information. Since a goal of this project was to experiment with leading-edge technologies for the communication of science, development of Explorations was not constrained in this way. Instead, we tried to identify early on both the new technologies to use and the hardware to which they would be ported, and we aimed the technical development at midlevel machines.

Emerging technologies featured in Explorations include inline animation using a server push, the frames function that was new with Netscape 2.0*, and viewing of 3D files in Virtual Reality Modeling Language (VRML). By designing with frames, we were able to enhance the visual appeal and the graphical navigation of the book. Frames were supported on all platforms at the time this paper was prepared. In contrast, porting of the technology for viewing VRML files has not been consistent among platforms. Acceptable viewers were not available for Macintosh* or UNIX* platforms at this writing. These factors influenced the development process.

Because we wanted to avoid presenting the audience with unnecessary surprises or frustrations, we incorporated a CGI script to detect each reader's browser and its version on first entry to the site. This information is presented on a gateway page along with links to sites for downloading the necessary browser and other technologies before entering the publication (these resources are also accessible from within the publication). For VRML, we link to the VRML Repository (http://www.sdsc.edu/vrml), which maintains current technical information as well as access to downloadable software. In addition, we provide a navigation tips screen that graphically and succinctly explains the features of the document and buttonbar functions, a text-based table of contents, and an index-based search capability.
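The kind of check such a gateway script performs can be sketched as follows. This is a minimal illustration in Python, not the script used for Explorations: the function names and the version threshold are assumptions, and the only facts relied on are that a CGI program receives the browser identification in the HTTP_USER_AGENT environment variable and that Netscape identified itself there as "Mozilla".

```python
import re

# Assumed threshold: Netscape 2.0 is the release named in the text as
# introducing frames. The constant and function names are hypothetical.
FRAMES_MIN_VERSION = (2, 0)


def parse_user_agent(user_agent):
    """Split a 'Name/major.minor ...' string into (name, (major, minor)).

    Returns (None, None) when the string does not follow that pattern.
    """
    match = re.match(r"\s*([^/\s]+)/(\d+)\.(\d+)", user_agent or "")
    if match is None:
        return None, None
    return match.group(1), (int(match.group(2)), int(match.group(3)))


def supports_frames(user_agent):
    """Guess whether the browser can display frames.

    This only recognizes browsers that report themselves as "Mozilla"
    at or above the assumed minimum version.
    """
    name, version = parse_user_agent(user_agent)
    return name == "Mozilla" and version is not None and version >= FRAMES_MIN_VERSION
```

In a CGI program the argument would typically come from os.environ.get("HTTP_USER_AGENT", ""), and the result would decide which advice and download links the gateway page shows.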

2.0 Technical Development of Explorations

2.1 Implementation of Frames

Explorations comprises a series of feature stories describing research in computational science. Through the use of hypertext, the stories are supported with background information, examples from practical applications related to the research, and links to related sites. We incorporated scientific visualizations wherever possible to illustrate the concepts being presented. The structure of the book emerged through a team effort as we explored the possibilities of the frames function.

As one of our first steps after committing to publishing a multimedia science publication on the Web, we created a changeable storyboard using 4x5 file cards, which provided space for a small representation of each screen type and explanatory notes and comments. These aided us in determining the interrelationships of the publication screens within the whole and within a story. Shifting them around and changing their order facilitated planning by trial and error. Once the flow concept was set, the plan was documented and communicated to the team members via a generalized storyboard diagram created in Adobe PageMaker*. The frame structure emerged from this planning process.

Netscape likens frames to window panes in the browser screen, each with a view of a different, but related, part of the site. By clicking in a frame, you can move deeper, either as that particular frame refreshes itself with new files, or as it calls up a file into another frame on the page. In the second model, the original frame remains the same and serves as a reference point. We used both models in designing Explorations, but focused our efforts on exploring the power of the second.

Figure 1. The frames function was used in Explorations as a graphical navigation tool. Here an image map in the left frame holds links to two deeper layers of the story presented in the large frame. The small frame (buttonbar, lower right) gives the viewer access to the whole site as well as help, feedback, and search functions.

At the beginning of the project, we rejected the use of controlled formats such as Adobe Acrobat* or PostScript*. Although they would have provided more design flexibility and artistic control, we saw this as adopting hardcopy technology for online publication, rather than employing new technologies to develop the publication completely in a hypertext environment. We began production of Explorations with the first beta version of Netscape 2.0 (Netscape was estimated to be preferred by approximately three quarters of the browsing audience when we began this project). Cornell University student interns Daniel Cane and Timothy Chi developed the file structure and coding methods used for the Explorations frames.

The use of frames offered several advantages in design and layout of the publication, establishing a "place" for each major component of the pages (graphic image maps, text, illustrations) and, therefore, a consistency of format. Frames also enabled us to maintain a logical and pleasing relationship among the scrolling story, the illustrations, and utility items such as the buttonbar.

All of the top level pages are divided into three frames. On the left, a vertical frame presents a graphic image that is usually a composite of illustrations from the pages at the level below. These images act as graphic sidebars to the pages, enhancing the visual interest at the same time that they present an intuitive graphical guide to the next level in the text. They are sized to fit the initial template and mapped with links to the pages in the hypertext structure to which they refer. Brief text overlays give additional indication of links. The new version of the server software NCSA HTTPd (HTTPd NCSA/1.5.0a), which was installed on our system soon after we started, greatly simplified setting up the image map links.

The major frame for each page is the focal point for that level in the site. Each contains the body of the text, the full illustrations, and often text links to other levels. This is the one frame that we expect to generate scrollbars--since size and font are set by the viewer, we did not attempt to prevent scrollbars here. However, we limited the text length for any segment to no more than three screens in a relatively large font (14 point Times on a Macintosh, "huge" on an SGI). A horizontal frame at the bottom of the page allows the viewer to navigate at a larger scale within the publication. Buttons access the top level of the current story, the entry page for the entire publication, a text-based table of contents, and the navigation tips page.
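The three-frame layout described above can be sketched as a small Python helper that writes the frameset markup. This is an illustration only: the file names, frame names, and pixel sizes are assumptions, and nesting the buttonbar under the story frame (rather than across the full page width) is inferred from the Figure 1 caption, not taken from the Explorations source.

```python
def build_frameset(nav_src, story_src, buttons_src, nav_width=200, button_height=60):
    """Return frameset markup for a left image-map frame, a main story
    frame, and a buttonbar frame beneath the story.

    All sizes and frame names are hypothetical placeholders.
    """
    return "\n".join([
        # Outer frameset: fixed-width vertical frame on the left,
        # remaining width for the right-hand column.
        f'<FRAMESET COLS="{nav_width},*">',
        f'  <FRAME SRC="{nav_src}" NAME="nav">',
        # Inner frameset: story on top, fixed-height buttonbar below.
        f'  <FRAMESET ROWS="*,{button_height}">',
        f'    <FRAME SRC="{story_src}" NAME="story">',
        f'    <FRAME SRC="{buttons_src}" NAME="buttonbar">',
        '  </FRAMESET>',
        '</FRAMESET>',
    ])


if __name__ == "__main__":
    print(build_frameset("nav.html", "story.html", "buttons.html"))
```

With a layout like this, a link in the left frame can load a new file into the main frame by naming it in the link's TARGET attribute (TARGET="story" here), which is the second navigation model described in Section 2.1.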

Any frame on the page will automatically generate scrollbars within its allotted space if the files that are called into it are wider or longer than the space provided. This behavior was inconsistent across viewing platforms, so we had to compromise, determining the final image template sizes by trial and error. (There is an HTML tag that will override the scrollbars, but the results are extremely problematic.) We attempted to minimize the presence of scrollbars on the pages by offering a screen sizing page that also functions as a cover page at the beginning of the site. Viewers are asked to size their browser window to fit the image on this cover page. Because the images in the graphic vertical frames are standardized, scrollbars become necessary only to view oversized illustrations or to scroll text that is larger than the area of the major frame.

2.2 Additional Features

2.2.1 Feedback

We elicit feedback from viewers wherever they may be in Explorations so that we can make continual improvements in our online science publications. A feedback form is readily accessible via the buttonbar, and the messages are sent to the project coordinator as electronic mail.

2.2.2 Search

Explorations is not only an informational book but also an educational and reference tool. We incorporated an ALIWEB* CGI search mechanism, available via the buttonbar, to improve access for students and researchers. This search engine will permit major Web indexing sites to point to the articles in Explorations for the entire World Wide Web audience. Searches are based on the IAFA (Internet Anonymous FTP Archive) template and look for titles, keywords, and descriptions that are provided in the source code for individual pages. This system requires an investment of time by the developers to index the pages manually, but it ensures that the pages are described accurately to broad searches.
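A search over manually written templates of this kind can be sketched as below. This is a simplified illustration in Python, not the ALIWEB software: it assumes records written as "Field: value" lines separated by blank lines, it does not handle continuation lines, and the sample entries are invented for the example.

```python
# Fields consulted by the search, following the description in the text.
SEARCH_FIELDS = ("title", "keywords", "description")

# Hypothetical index entries, invented purely for illustration.
SAMPLE_INDEX = """\
Template-Type: DOCUMENT
Title: Example feature on mountain weather
URI: /example/weather.html
Description: A sample story about modeling wind over steep terrain.
Keywords: meteorology, Matterhorn, simulation

Template-Type: DOCUMENT
Title: Example feature on enzymes
URI: /example/enzyme.html
Description: A sample story about the structure of an enzyme molecule.
Keywords: biochemistry, molecule, VRML
"""


def parse_templates(text):
    """Parse blank-line-separated 'Field: value' records into dicts.

    Field names are lowercased; lines without a colon are ignored.
    """
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                records.append(current)
                current = {}
            continue
        field, sep, value = line.partition(":")
        if sep:
            current[field.strip().lower()] = value.strip()
    if current:
        records.append(current)
    return records


def search(records, query):
    """Return records whose title, keywords, or description contain the
    query as a case-insensitive substring."""
    needle = query.lower()
    return [
        record for record in records
        if any(needle in record.get(field, "").lower() for field in SEARCH_FIELDS)
    ]
```

For example, search(parse_templates(SAMPLE_INDEX), "vrml") returns only the second sample record, because the match is found in its Keywords field.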

3.0 Telling the Story with Words: Writing for Hypertext

CTC participated in the development of the National Science Foundation MetaCenter Computational Science Highlights project (http://www.tc.cornell.edu/Research/MetaScience/) during 1994-95, generating new science stories specifically for that site. This experience drove home the importance of breaking apart the text of a traditional, linear feature article into "chunks" of information that are linked and yet stand alone, analogous to, but more complex than, the use of sidebars in conventional print media.

In the hypermedia environment, conventional wisdom suggests that short pages are better. We began with linear features on the research and then took out sections that could become related chunks. Once the story was dissected, it was recast with references to the chunks. Top level pages averaged more than 500 words in length.

Chunks were rewritten to add necessary, though sometimes redundant, information. At this point, chunks often evolved and deepened, especially where we presented practical applications of the research. Chunk length varied from fewer than 100 to more than 350 words, depending on how far each chunk had evolved. The hypertext structure enhances this style of development. By the end of the project, we were writing in hypertext chunks.

While we used images to attract viewers and as lures to deeper levels of information, we also wanted to ensure that people read the text. We therefore kept text and images close together on all the pages, for example interpreting illustrations on the same page as the image. We wrote captions after we incorporated animations and images into the structure of the story to ensure that the context of the illustrations was clear.

4.0 Telling the Story with Pictures: Incorporating Scientific Visualizations and Illustrations

CTC visualization specialists and individual researchers collaborate to present research results graphically, often in three or more dimensions. In some cases, these visualizations represent subjects, such as the human heart, that are easily recognized by a lay audience; in others, the images may be extremely abstract, for example, a solution surface for a complex series of equations. Inevitably, there are aspects of the visualizations that require careful explanation and interpretation. The choice of format is thus important and the complexity of the information included in the images must be considered in terms of clarity as well as file size.

We were able to incorporate still images, animations, and 3D files based on researchers' results to illustrate the stories, although in many instances, these files required reprocessing. In particular, there was a great deal of staff time invested in translating 3D files to the VRML format while controlling for file size and image quality. In addition, most of the animations required extensive editing and reformatting before being included in the site.

4.1 Still Images

We sought to place the research in context by providing examples of related applications. Illustrations for these examples were critically important to the publication, both because the structure required them and because they balanced scientific visualizations which were often exotic in appearance and difficult to interpret. These additional still images were chosen for their ability to add interest and to clarify the stories and were gathered from a variety of sources, both hardcopy and electronic. We found these supporting images almost exclusively via the Web. For example, the image of the Matterhorn was taken from a photograph shot in the summer of 1995 by a mountain climber. We found him by searching the Web using the mountain's name as the keyword and were delighted when he agreed to digitize his best image and then provided the file to us free of charge.

Although we enhanced the visual interest of the book by creating the compound images for the vertical frame, we did not modify the actual images to which these were linked, in order to ensure accurate representation of the researchers' work and fair representation of borrowed images. However, many scientific visualizations were reformatted, resized, or cropped; several were extracted from animations. All were reviewed and approved by the researchers.

4.2 Movies

The animations presented in the book were made available to us in a number of formats ranging from video tape to MPEG movies. Almost all were too lengthy (input video files generated movies of several hundred megabytes), and often too slow-paced to be translated directly into an online animation. We compromised by identifying clips from the tapes that would illustrate concepts presented in the text. We did not attempt to edit any existing soundtracks to fit the clips.

The animations that appear automatically on the pages that present the four sections of the book (for example, Down to Earth) are server pushes. They are generated, image by image, by the server and sent sequentially to the browser. Viewers do not have to wait for an entire file (as much as 25 megabytes) to be downloaded, nor do they have to store such large files on their machines. When the viewer has a reasonable connection to the network that is not overloaded at the time of viewing, this is a very satisfying option for presenting animations.
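The server push mechanism can be sketched as follows. This minimal Python illustration assumes the multipart/x-mixed-replace content type that Netscape introduced for server push; the boundary string, function name, and frame data are placeholders, and it is not the program used for Explorations.

```python
def server_push_stream(frames, boundary="FrameBoundary", content_type="image/gif"):
    """Yield the byte chunks of a server push response, one per frame.

    `frames` is an iterable of bytes objects (for example, GIF images).
    Each part replaces the previous one in the browser, so the viewer
    never has to hold the whole animation at once.
    """
    # Response header announcing a multipart stream and its boundary.
    header = f"Content-type: multipart/x-mixed-replace;boundary={boundary}\r\n\r\n"
    yield header.encode("ascii")
    for frame in frames:
        # Each part carries its own content type, then the image data.
        part_header = f"--{boundary}\r\nContent-type: {content_type}\r\n\r\n"
        yield part_header.encode("ascii") + frame + b"\r\n"
    # Closing boundary tells the browser that the animation has ended.
    yield f"--{boundary}--\r\n".encode("ascii")
```

A CGI program would write each chunk to standard output and flush after every frame, optionally pausing between frames to control the playback rate.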

With the exception of the four server push files, virtually all the animations were edited in Adobe Premiere* and output as flattened QuickTime* movies (the format required by our UNIX server). In many cases, we adjusted the quality (increasing the contrast during input, for example, to improve the translation from video tape) and compromised on file size through a variety of approaches in order to keep the files as small as possible. Even so, many still fall outside our desired 5 megabyte upper limit for file size.

We chose the QuickTime format for several reasons. First, viewers were available for all platforms. (We make the general UNIX viewer, xanim, available for downloading from our server.) Second, because there was no simple editing approach for the MPEG format, saving MPEG files as QuickTime movies was the only way to access them without going back to the original digital files (often no longer available). Finally, QuickTime plays more smoothly than MPEG on a Macintosh.

4.3 3D Files

Scientific visualization offers intuitively accessible information about research results and is recognized by many researchers as an important part of the scientific process. In the laboratory--whether it is a physical location with test tubes or an immersive environment in cyberspace--3D visualization of data enables exploratory analyses and aids in processes such as interactive steering through databases. With Web-based technologies, the lay viewer now has the opportunity to fly into and through these files in much the same way as the experienced researcher.

We chose to include a limited number of VRML files in features where exploration of a 3D data set would enhance the information presented and the experience of the viewer. This work depended on conversion of visualization files created for CTC users into VRML format.

The two initial applications approach this goal in different ways. The first, a biomolecular model, affords an experience similar to that of the researcher, presenting a challenge to the viewer in terms of navigating the file. It has an embedded hypertext link that functions as an illustration caption, explaining what the viewer sees. This allows the viewer to explore the environment and learn about the chemistry of the enzyme molecule by clicking on a part of it. We intend to expand the use of this technique in future features.
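An embedded link of this kind can be sketched with the WWWAnchor node of VRML 1.0, which makes the geometry inside it clickable. The Python helper below writes a tiny file as an illustration; it assumes VRML 1.0 syntax, uses a sphere as a stand-in for real molecular geometry, and the URL and description are placeholders, not content from the Explorations models.

```python
def vrml_anchor(url, description, radius=1.0):
    """Return a minimal VRML 1.0 scene in which clicking a sphere
    follows a hypertext link.

    The sphere is a placeholder for real model geometry.
    """
    return "\n".join([
        "#VRML V1.0 ascii",
        "Separator {",
        "  WWWAnchor {",
        f'    name "{url}"',                  # link followed on click
        f'    description "{description}"',   # text a viewer may display
        f"    Sphere {{ radius {radius} }}",  # clickable geometry
        "  }",
        "}",
    ])


if __name__ == "__main__":
    print(vrml_anchor("caption.html", "Explanation of this part of the molecule"))
```

Because the anchor wraps only the geometry placed inside it, different parts of a model can link to different captions.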

The second example of 3D illustration, a set of sample files from a geographic information systems (GIS) database of New York State, allows the viewer to fly over selected regions of the state looking at the distribution of vegetation, for example, and is geared more toward entertainment.

To accommodate viewers new to VRML, we grouped the VRML files together with access through an introductory page, Virtual Reality Tours, where introductory information and a link to the VRML repository are provided. The repository maintains up-to-date information on software and formats. Moving from the Virtual Reality Tours page into a model, viewers find a user interface (designed by Cornell University undergraduate Daniel Switkin) in the form of a remote control. At this point, they select a specific model, load it, and fly into it. Viewers exit the models through links in the models themselves or through text links on the Virtual Reality Tours page, which remains visible during their VRML session.

5.0 Self-evaluation and Future Plans

Development of Explorations ended on April 1, 1996, when it was released publicly, and while the site has been refined in response to feedback, no major changes have been made to the publication since. Our aim in developing Explorations was to apply the latest available technologies to create a highly interactive, exciting, and information-filled multimedia experience through the World Wide Web. This forced us to explore the limits of such new technology and to develop new capabilities for online documents. This is an ongoing process.

5.1 Overall Design

We encountered an unwelcome design constraint in designing for cross-platform use. Because of the effect that differences in screen resolution have on the size of the images on the page, we enlarged the design as much as possible to make the publication look reasonable on high resolution screens; this necessitated additional vertical scrolling for low-end Macintosh screens (13" RGB monitors). In the future, we hope that new technologies such as Java* will resolve this problem by allowing the images to be resized on the fly.

This was our first use of frames, and we hope that the artificial boundaries they enforced can be overcome through design and a more flexible frame structure similar to the grid system used in conventional publications. Depending on the evolution of Web standards, we may look for alternatives to the frames approach to page design. Access to the broadest possible audience will be an important factor in the decision-making process.

5.2 Text

Five hundred words is probably too much text to include in one piece of hypertext. To further reduce the size of the text pages we will have to rethink our approach to telling the stories. We are also considering a discreet method for providing hypertext definitions--one that will not detract from the flow of the stories for those choosing not to access the definitions.

To enhance the illustration captions, we plan to incorporate audio explanations of the animations and still images into the features. In addition, sound files from scientific data will be included whenever possible, and we are currently studying sound file formats for these files.

5.3 Illustrations

In this publication, we did not pay a lot of attention to image file size reduction. This will be a priority in upcoming publications. To make this easier, we have acquired Equilibrium Technology's DeBabelizer*. We will also reconsider the use of the JPEG format. We had rejected JPEG in favor of the GIF* format, which permits transparent images, interlacing, and viewing as inline or spawned images on any browser.

VRML and animated illustrations in online publications can be very effective. However, they are not easy to incorporate into the page. Several limitations exist; some will be resolved by advances in Web technology. Others call for careful planning of the publication and close communication with the researchers and visualization specialists. For example, we hope that providing links to sources of information about VRML and download sites for browsers will limit the frustration of those without this capability. However, lumping these illustrations together makes it easier for people interested only in VRML to browse the files and leave without experiencing the rest of the site--for better or worse depending on your point of view. We look forward to the time when inline VRML browsers are commonplace.

6.0 Future Plans

We plan to begin development of our next science publication later this year. With this in mind, we are currently tackling issues such as the potential of Java, JavaScript* or other client-side programming methods and the use of multimedia programs such as Macromedia Shockwave*. One goal is to incorporate small applications into the site as online exercises in computational science. These will allow viewers to explore data sets interactively, for example by watching soil temperature change in response to surface wind speed in a simplified meteorological model.

In the next version we will enhance interactivity by richer use of VRML and possibly through interactive software applets. In addition, there is a potential for the development of interrelated themes within a more interactive publication structure. For example, intermediate pages could connect treatments of particle physics concepts that appear in both astrophysics and nuclear physics features.

By nature, hypermedia publications depend on the availability of images, animations, sound, and 3D files. As mentioned above, 3D files created by research teams in the process of scientific work did not necessarily suit the editorial and production needs of Explorations. Developing effective illustrations for online publication requires early contact and ongoing communication with the researchers as well as independent resources and expertise to translate and edit files for transfer and presentation on the World Wide Web. The importance of this interaction will only increase as science features become richer and more interactive.


Special thanks: We would like to acknowledge the contributions of Linda Callahan, CTC director of external relations, and Dan Dwyer, leader of CTC's Online Information Project, and to thank visualization experts Richard Gillilan, Chris Pelkie, Wayne Lytle, and Bruce Land for their help in many different ways during the course of the project.

This work was conducted as part of the production of the CTC's online science book, Explorations. CTC, one of four high performance computing and communications centers supported by the National Science Foundation, operates a 512-processor IBM SP system. Activities of the Center are also funded by New York State, the Advanced Research Projects Agency, the National Center for Research Resources at the National Institutes of Health, IBM, and other members of CTC's Corporate Partnership Program.

* Trademark or Registered trademark