(12) United States Patent
Douceur et al.
(10) Patent No.: US 7,272,630 B2
(45) Date of Patent: Sep. 18, 2007
(54) LOCATING POTENTIALLY IDENTICAL
OBJECTS ACROSS MULTIPLE COMPUTERS
BASED ON STOCHASTIC PARTITIONING
(75) Inventors: John R. Douceur, Bellevue, WA (US);
Marvin M. Theimer, Bellevue, WA
(US); Atul Adya, Bellevue, WA (US);
William J. Bolosky, Issaquah, WA (US)
(73) Assignee: Microsoft Corporation, Redmond, WA (US)
( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 36 days.
(21) Appl. No.: 10/991,571
(22) Filed: Nov. 18, 2004
(65) Prior Publication Data
US 2005/0097148 A1 May 5, 2005
Related U.S. Application Data
(62) Division of application No. 09/876,376, filed on Jun. 6, 2001.
(51) Int. Cl.
(52) U.S. Cl. 709/203; 709/201; 709/219;
(58) Field of Classification Search 709/203,
709/201, 219; 715/739
See application file for complete search history.
(56) References Cited
U.S. PATENT DOCUMENTS
5,202,982 A 4/1993 Gramlich et al.
5,317,728 A 5/1994 Tevis et al.
5,371,794 A 12/1994 Diffie et al.
5,452,447 A 9/1995 Nelson et al.
5,588,147 A 12/1996 Neeman et al.
Potentially identical objects (e.g., files) are located across multiple computers based on stochastic partitioning of workload. For each of a plurality of objects stored on a plurality of computers in a network, a portion of object information corresponding to the object is selected. The object information can be generated in a variety of manners (e.g., based on hashing the object, based on characteristics of the object, and so forth). Any of a variety of portions of the object information can be used (e.g., the least significant bits of the object information). A stochastic partitioning process is then used to identify which of the plurality of computers to communicate the object information to for identification of potentially identical objects on the plurality of computers.
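The flow described in the abstract can be illustrated with a minimal sketch. This is a simplified stand-in, not the patented stochastic partitioning process: it assumes the object information is a SHA-256 hash of the object's contents, the selected portion is the 16 least significant bits, and the destination computer is chosen by a plain modulo over a fixed number of computers. All function names and parameters here are hypothetical.

```python
import hashlib
from collections import defaultdict


def object_info(data: bytes) -> int:
    """Hypothetical object information: SHA-256 of the contents as an integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def select_portion(info: int, bits: int = 16) -> int:
    """Select a portion of the object information: its least significant bits."""
    return info & ((1 << bits) - 1)


def target_computer(portion: int, num_computers: int) -> int:
    """Assumed mapping from the selected portion to a computer index (modulo)."""
    return portion % num_computers


def potential_duplicates(objects: dict, num_computers: int) -> list:
    """Route each object's information to one computer, then report, per
    computer, the groups of object names whose object information matches."""
    inbox = defaultdict(lambda: defaultdict(list))
    for name, data in objects.items():
        info = object_info(data)
        dest = target_computer(select_portion(info), num_computers)
        inbox[dest][info].append(name)
    groups = []
    for records in inbox.values():
        for names in records.values():
            if len(names) > 1:
                groups.append(sorted(names))
    return sorted(groups)


if __name__ == "__main__":
    files = {"a.txt": b"hello", "b.txt": b"hello", "c.txt": b"world"}
    print(potential_duplicates(files, num_computers=4))
```

Because objects with identical contents yield identical object information, their records always arrive at the same computer, so each computer only has to compare the records it receives. Matching object information marks objects as potentially identical; it does not prove that they are.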
17 Claims, 15 Drawing Sheets
Federal Information Processing Standards Publication 186: Digital
Signature Standard (DSS). 1994, no date.
Borg, Digital Signatures Keep Cyberstreets Safe for Shoppers,
Computer Technology Review, vol. 16, No. 2, Feb. 1996 p. 1.
Hu, Some Thoughts on Agent Trust and Delegation, Available at
http://www.cs.nccu.edu.tw/jong, 2001, pp. 489-496.
E. Adar and B. Huberman, "Free Riding on Gnutella," Xerox PARC
Technical Report, pp. 1-22, Aug. 2000.
R. Anderson, "The Eternity Service," PRAGOCRYPT, pp. 242-252, Oct. 1996.
T. Anderson, M. Dahlin, J. Neefe, D. Patterson, D. Roselli, and R.
Wang, "Serverless Network File Systems," 15th Symposium on
Operating Systems Principles, pp. 109-126, Dec. 1995.
W. Bolosky, J. Douceur, D. Ely, M. Theimer, "Feasibility of a
Serverless Distributed File System Deployed on an Existing Set of
Desktop PCs", Proceedings of the International Conference on
Measurement and Modeling of Computer Systems, pp. 34-43, Jun.
W. Bolosky, S. Corbin, D. Goebel, and J. Douceur, "Single Instance
Storage in Windows® 2000," Proceedings of the 4th USENIX
Windows Systems Symposium, pp. 13-24, Aug. 2000.
G. Cabri, A. Corradi, F. Zambonelli, "Experience of Adaptive
Replication in Distributed File Systems", 22nd IEEE
EUROMICRO, 10 pages, Sep. 1996.
M. Castro and B. Liskov, "Practical Byzantine Fault Tolerance," Proceedings of the Third Symposium on Operating Systems Design and Implementation, 14 pages, Feb. 1999.
M. Castro and B. Liskov, "Proactive Recovery in a Byzantine-Fault-Tolerant
System," 4th Symposium on Operating Systems Design
and Implementation, pp. 273-287, Oct. 2000.
I. Clarke, O. Sandberg, B. Wiley, and T. Hong, "Freenet: A Distributed
Anonymous Information Storage and Retrieval System,"
ICSI Workshop on Design Issues in Anonymity and Unobservability,
21 pages, Jul. 2000.
J. Douceur and W. Bolosky, "A Large-Scale Study of File-System
Contents," SIGMETRICS, pp. 59-70, May 1999.
L. Fan, P. Cao, J. Almeida, and A. Broder, "Summary Cache: A
Scalable Wide-Area Web Cache Sharing Protocol", ACM
SIGCOMM, pp. 254-265, 1998.
A. Goldberg and P. Yianilos, "Towards an Archival Intermemory,"
IEEE International Forum on Research and Technology Advances in
Digital Libraries, pp. 147-156, Apr. 1998.
J. Howard, M. Kazar, S. Menees, D. Nichols, M. Satyanarayanan, R. Sidebotham, and M. West, "Scale and Performance in a Distributed File System," ACM Transactions on Computer Systems, pp. 51-81, Feb. 1988.
J. Kistler and M. Satyanarayanan, "Disconnected Operation in the Coda File System," ACM Transactions on Computer Systems, vol. 10, No. 1, pp. 3-25, Feb. 1992.
J. Kubiatowicz et al., "OceanStore: An Architecture for Global-Scale
Persistent Storage," Proceedings of the Ninth International
Conference on Architectural Support for Programming Languages
and Operating Systems, 12 pages, Nov. 2000.
E. Lee and C. Thekkath, "Petal: Distributed Virtual Disks," Seventh
International Conference on Architectural Support for Programming
Languages and Operating Systems, pp. 84-92, Oct. 1996.
D. Mazieres, M. Kaminsky, M. F. Kaashoek, and E. Witchel,
"Separating Key Management from File System Security", 17th
ACM Symposium on Operating Systems Principles, pp. 124-139,
D.L. McCue, M.C. Little, "Computing Replica Placement in Distributed Systems", IEEE Second Workshop on Replicated Data, pp. 58-61, Nov. 1992.
M. K. McKusick, W. N. Joy, S. J. Leffler, and R. S. Fabry, "A Fast File System for Unix," ACM Transactions on Computer Systems, vol. 2, No. 3, pp. 181-197, Aug. 1984.
The OceanStore Project web pages, http://oceanstore.cs.berkeley.edu/info/overview.html, 2 pages, last modified Jul. 8, 2002.
C. Plaxton, R. Rajaraman, and A. Richa, "Accessing Nearby Copies of Replicated Objects in a Distributed Environment", Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 311-320, 1997.
C. Plaxton, R. Rajaraman, and A. Richa, "Accessing Nearby Copies of Replicated Objects in a Distributed Environment", Theory of Computing Systems, pp. 32:241-280, 1999.
R. T. Reich and D. Albee, "S.M.A.R.T. Phase-II," No. WP-9803-001,
Maxtor Corporation, 3 pages, Feb. 1998.
J. D. Saltzer and M. D. Schroeder, "The Protection of Information
in Computer Systems," Proceedings of the IEEE 63(9), pp. 1278-1308,
Sep. 1975.
R. Sandberg, D. Goldberg, S. Kleiman, D. Walsh, and B. Lyon,
"Design and Implementation of the Sun Network Filesystem,"
Summer USENIX Conference, pp. 119-130, Jun. 1985.
A. Sweeney, D. Doucette, W. Hu, C. Anderson, M. Nishimoto, and
G. Peck, "Scalability in the XFS File System," USENIX Annual
Technical Conference, 15 pages, 1996.
C. Thekkath, T. Mann, and E. Lee, "Frangipani: A Scalable Distributed File System," 16th ACM Symposium on Operating Systems Principles, pp. 224-237, 1997.
W. Vogels, "File system usage in Windows NT 4.0," 17th ACM Symposium on Operating Systems Principles, pp. 93-109, Dec. 1999.
J. Wylie, M. Bigrigg, J. Strunk, G. Ganger, H. Kiliccote, and P. Khosla, "Survivable Information Storage Systems," IEEE Computer, pp. 33(8):61-68, Aug. 2000.
Evans, Matt, "FTFS: The Design of A Fault Tolerant Distributed File-System," May 2000, pp. 1-49.
Cheriton, David R. and Mann, Timothy P., "Decentralizing a Global Naming Service for Improved Performance and Fault Tolerance," ACM Transactions on Computer Systems, vol. 7, No. 2, May 1989, pp. 147-183.
Miller et al., "Strong Security for Distributed File Systems", 2001 IEEE, pp. 34-40.
Ferbrache, "A Pathology of Computer Viruses", Springer-Verlag London Limited, 1992, pp. 1-6.
* cited by examiner