U.S. Patent Apr. 3,1990 Sheet 4 of 5 4,914
GARBAGE COLLECTOR FOR HYPERMEDIA
FIELD OF THE INVENTION
This invention relates to distributed digital computer systems (including systems having heterogeneous operating environments) and, more particularly, to methods and means for (1) enabling users of such systems to create, manipulate, share and make virtually unrestricted use of data intensive files, such as digital voice and music files, "scanned-in" image files, and animated or full motion video files, while (2) avoiding the need for making, storing and handling multiple copies of such files, and (3) reclaiming storage space allocated to such files when they are no longer needed.
BACKGROUND OF THE INVENTION
Traditional distributed computer systems have limited users to alphanumeric and simple graphics communications (collectively referred to herein as "textual communications"), even though voice, video and even musical media frequently are more efficient and effective for interpersonal communications. Others have recognized the need to extend such systems sufficiently to support non-textual communications as an alternative or supplement to textual communications. For example, substantial effort and expense have been devoted to the development of voice message systems, as well as to the development of multi-media systems for annotating text with voice. Voice is the non-textual communication medium that has been most widely investigated for use in distributed computer systems, so this invention will be described in that context to provide a representative example. Nevertheless, it will be understood that the broader aspects of this invention are also applicable to other types of data intensive, non-textual communications in distributed computing systems.
Several interesting and potentially important advantages flow from treating voice and other non-textual media as data in a distributed computing environment. See Nicholson, "Integrating Voice in the Office World," Byte, Vol. 8, No. 12, December 1983, pp. 177-184. It enables non-textual media to be incorporated easily into electronic mail messages, and into annotations applied to ordinary text files, as well as into prompts and other interactive messages provided by the user interface to the computing environment. In short, such a treatment permits users to create, manipulate and share these non-textual data files in much the same way as they handle conventional text files, and enables programmers to implement functions involving such non-textual data files in generally the same way as they implement functions involving text files.
However, voice and other non-textual data files differ significantly from ordinary textual data files. For example, classical workstations cannot record or play voice data files in analog form, so special devices are needed for that purpose. Even more significantly, voice data files typically are much larger than text files containing the identical words. Indeed, the recording of standard telephone quality, uncompacted voice consumes roughly 64K bits of storage per second, which is several orders of magnitude greater than the storage capacity required for an equivalent passage of typed text. Still another factor to be taken into account is that there are stringent real time requirements on transferring voice because unintended pauses or chopping of words during the playback of voice creates a perceptual problem that may interfere with or even defeat the effort to communicate.
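The storage comparison above can be made concrete with a rough calculation. The sketch below assumes 8-bit samples taken 8,000 times per second (which yields the 64K bits per second cited above) and, purely for illustration, a speaking rate of 150 words per minute with about 6 bytes of text per word. The speaking rate, the bytes-per-word figure, and the function names are assumptions that do not appear in this specification.

```python
# Rough storage comparison: telephone quality voice versus typed text.
# Assumed for illustration only: 150 words/minute, 6 bytes per word.

SAMPLE_RATE_HZ = 8000    # samples per second for telephone quality voice
BITS_PER_SAMPLE = 8      # 8000 * 8 = 64,000 bits per second


def voice_bytes(seconds):
    """Bytes needed to store uncompacted voice of the given duration."""
    return seconds * SAMPLE_RATE_HZ * BITS_PER_SAMPLE // 8


def text_bytes(seconds, words_per_minute=150, bytes_per_word=6):
    """Bytes needed to store the same passage as typed text (assumed rates)."""
    return seconds * words_per_minute * bytes_per_word / 60


if __name__ == "__main__":
    minute = 60
    v = voice_bytes(minute)
    t = text_bytes(minute)
    print(f"voice: {v} bytes, text: {t:.0f} bytes, ratio: {v / t:.0f}x")
```

Under these assumed rates, one minute of voice occupies 480,000 bytes against roughly 900 bytes of text. The exact ratio depends heavily on the speaking rate and text encoding chosen.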
Users of distributed computer systems sometimes reside in heterogeneous computing environments having diverse network services implemented through the use of different communication protocols. Moreover, traffic between such computing environments may be routed through a variety of common or private carriers which conceivably may involve different path switching schemes. Gateways have been developed for exchanging textual message traffic between heterogeneous environments, so the value and use of non-textual communications in such systems may depend in significant part on the ease with which non-textual communications may be transferred through such gateways.
Others have addressed some of the issues that need to be resolved to carry out voice and similar non-textual communications in distributed computing environments. The Sydis Information Manager utilizes special workstations (called "VoiceStations") for recording, editing and playing back voice. See Nicholson, "Integrating Voice in the Office World," Byte, Vol. 8, No. 12, December 1983, pp. 177-184. Additionally, a system for integrating voice and data for simple workstation applications has been described. See Ruiz, "Voice and Telephone Applications for the Office Workstation," Proceedings 1st International Conference on Computer Workstations, San Jose, Ca., November 1985, pp. 158-163. Speech storage systems having facilities for recording, editing and playing back voice have been proposed. See Maxemchuck, "An Experimental Speech Storage and Editing Facility," Bell System Technical Journal, Vol. 58, No. 9, October 1980, pp. 1383-1395.
Even more to the point, there are systems that enable users to share documents containing embedded references to non-textual media objects residing on a shared file service, and that "garbage collect" those objects to reclaim the storage space allocated to them when there no longer are any documents or document folders containing references to them. See Thomas et al., "Diamond: A Multimedia Message System Built on a Distributed Architecture," Computer, Vol. 18, No. 12, December 1985, pp. 65-78. Systems, such as the Diamond System, which employ textually embedded references to refer to voice, video and other diverse types of non-textual data sometimes are referred to as "hypermedia systems." See Yankelovich et al., "Reading and Writing the Electronic Book," Computer, Vol. 18, No. 10, October 1985, pp. 15-30. Unlike most of the other systems that have been proposed, the embedded references used by the Diamond System avoid the need to include copies of the non-textual data files (e.g., voice files) in each document file with which they are associated. However, the simple reference count based garbage collection scheme of the Diamond System is incompatible with permitting references to internally stored objects to be included in documents or document folders that are stored outside the system.
Interesting prior art relating to the garbage collection of ordinary data files also has been uncovered. The Cambridge File Server requires clients to take an explicit action to prevent files from being garbage collected, because it automatically deletes files that are not accessible from client updated and server maintained indices. See Mitchell et al., "A Comparison of Two Network-Based File Servers," Communications of the ACM, Vol. 25, No. 4, April 1982, pp. 233-245. Somewhat less relevant, but still interesting as an example of how to build a highly reliable reference server, is the system described in Liskov et al., "Highly-Available Distributed Services and Fault Tolerant Distributed Garbage Collection," Proceedings of Symposium on Principles of Distributed Computing, Alberta, Canada, August 1986, pp. 29-39. The garbage collection scheme they envision requires all sites that store references to remotely stored, shared objects to run a garbage collector locally for purposes of sending information about distributed references to a common reference server.
At least two issues still have to be resolved. In view of the very large size of most non-textual data files (e.g., voice data files), it is important that a technique be developed for editing those files through the use of simple databases, without requiring that the files be moved, copied, or decrypted (if they are stored in encrypted form), and for describing the results of the editing operations. Also, an improved technique is needed for using simple databases to support a garbage collector for automatically reclaiming storage space allocated to obsolete non-textual data files. The management and editing of hypermedia is addressed in our concurrently filed, copending and commonly assigned U.S. patent application of Swinehart et al., which was filed under Ser. No. 07/118,492 on a "Server Based Facility for Managing and Editing Embedded References in Hypermedia Systems," (D/87278), so this application is directed to the garbage collection issue.
SUMMARY OF THE INVENTION
In accordance with the present invention, a database of interests is maintained in a distributed computing system to register the individual interests of users in centrally stored non-textual media files, such as digital voice, music, scanned-in image, and video files. Uniquely named piece table style persistent data structures are employed to give users controlled access to the underlying non-textual media files by embedded name reference to such piece tables in ordinary message or text files, so a database of piece tables is also maintained. A garbage collector periodically enumerates the interest database to delete interest entries which have been invalidated. Aged piece tables are deleted from the reference database when there no longer are any recorded interests referring to them, and non-textual media files are deleted to reclaim the storage space allocated to them when there no longer are any piece tables referring to them.
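The three-stage reclamation summarized above can be sketched as follows. This is a minimal in-memory illustration, not the implementation described in this specification: the class names, the `valid` flag used to mark an invalidated interest, the `min_age` threshold used to decide when a piece table counts as "aged", and the dictionary-based "databases" are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Piece:
    file_id: str   # underlying media file this piece points into
    start: int     # offset of the piece within that file
    length: int    # length of the piece


@dataclass
class PieceTable:
    name: str             # unique name embedded in documents
    pieces: List[Piece]   # ordered pieces; the media files are never copied
    created: float        # creation time, used for the age test


@dataclass
class Interest:
    table_name: str       # piece table the interest refers to
    holder: str           # hypothetical: the document or user registering it
    valid: bool = True    # hypothetical flag: False once invalidated


def collect(interests: List[Interest],
            tables: Dict[str, PieceTable],
            files: Dict[str, bytes],
            now: float,
            min_age: float) -> Tuple[List[Interest],
                                     Dict[str, PieceTable],
                                     Dict[str, bytes]]:
    """Return the interests, piece tables, and files that survive one pass."""
    # Stage 1: delete interest entries that have been invalidated.
    live_interests = [i for i in interests if i.valid]
    wanted = {i.table_name for i in live_interests}

    # Stage 2: delete aged piece tables that no interest refers to.
    # Young tables are kept so a newly created one is not collected
    # before anyone has had a chance to register an interest in it.
    live_tables = {name: t for name, t in tables.items()
                   if name in wanted or now - t.created < min_age}

    # Stage 3: delete media files that no surviving piece table refers to.
    used = {p.file_id for t in live_tables.values() for p in t.pieces}
    live_files = {fid: data for fid, data in files.items() if fid in used}

    return live_interests, live_tables, live_files
```

Because each stage consults only the survivors of the stage before it, deleting an invalidated interest can release a piece table, which in turn can release a media file, all in a single pass of this sketch.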
BRIEF DESCRIPTION OF THE DRAWINGS
Still other features and advantages of this invention will become apparent when the following detailed description is read in conjunction with the attached drawings, in which:
FIG. 1 is a schematic illustrating a pair of local area networks which are on command linked by gateways and a communications channel, with the networks being configured in accordance with this invention to support voice communications in addition to ordinary textual communications;
FIG. 2 is a workstation screen illustrating a suitable user interface for recording, editing, and playing back voice files;
FIG. 3 is a logically layered schematic of a voice manager for a local area network;
FIG. 4 is a schematic illustrating the correlation of voice files with the data structures that are used to reference them;
FIG. 5 is a simplified functional flow diagram of an interest garbage collector;
FIG. 6 is a simplified functional flow diagram of a voice rope garbage collector;
FIG. 7 is a simplified functional flow diagram of a voice file garbage collector;
FIG. 8 is a simplified partial functional flow diagram for an integrated voice rope/voice file garbage collector; and
FIG. 9 is a simplified partial functional flow diagram for an integrated interest/voice rope garbage collector.
DETAILED DESCRIPTION OF THE ILLUSTRATED EMBODIMENT
While the invention is described in some detail hereinbelow with reference to a single, illustrated embodiment, it is to be understood that there is no intent to limit it to that embodiment. On the contrary, the intent is to cover all alternatives, modifications and equivalents falling within the spirit and scope of the invention as defined by the appended claims.
Turning now to the drawings, and at this point especially to FIG. 1, there is a distributed computer system 21 (shown only in relevant part) comprising a local area network ("LAN") 22 with a gateway 23 for interfacing it with another LAN (not shown), directly or possibly via a switched communications facility. In keeping with the usual configuration of CSMA/CD (i.e., Ethernet) networks, the LAN 22 has a linear topology for linking a plurality of workstations 24a and 24b, but it will be evident that it or any other LAN with which it is interfaced may have a different topology, such as a ring-like topology. Still other examples of the heterogeneous environment that may exist given the diverse characteristics of commercially available workstations and LANs will be apparent, so it is to be understood that the gateway 23 performs the reformatting and retiming functions that are required to transfer data from a LAN operating in accordance with one communications protocol to another LAN operating in accordance with a different communications protocol. Additional gateways 23 may be provided if it is desired to extend the system 21 to include still more LANs (not shown).
To enable users to transmit and receive voice messages via the LAN 22 in addition to or in lieu of the usual textual messages and data files that they exchange through their workstations 24a and 24b, there are microprocessor based digital telephone instruments 31a and 31b, located near, but not physically connected to, the workstations 24a and 24b, respectively. These telephone instruments convert voice into the digital data format required to satisfy the communications protocol of the LAN to which they are connected. For example, the telephone instruments 31a and 31b digitize, packetize and encrypt telephone quality voice for direct transmission over the Ethernet-style LAN 22. For a more detailed description of how that is accomplished, see Swinehart et al., "Adding Voice to an Office Computer Network," Proceedings IEEE GlobeCom '83, November 1983, and Swinehart et al., "An Experimental Environment for Voice System Development," IEEE Office Knowledge Engineering Newsletter, February 1987. Both of those references are hereby incorporated by reference. As previously pointed out, the telephone instruments 31a and 31b are not directly attached to the