A centralized system for sharing sensitive content
I have too many stupid ideas which I’ll never have enough time to implement. Despite that, some of them I’d really like to see implemented because I bloody need them. So please someone steal this and implement it, even though it’s a stupid piece of trash ;_;
Overview
The goal is to combine the easy-to-use native interfaces of DropBox (http://www.dropbox.com/) with the paranoid, strong cryptography of Tarsnap (http://www.tarsnap.com/) to create a cloud-based shareable storage system where you can share content with yourself and other people, but where not even the server providers can see the content being shared.
Deficits in Existing Systems
DropBox
DropBox is a service for easily storing and sharing content in the cloud — after registering an account, it effectively presents itself as a file share on your local machine (Windows, OSX, Linux, etc). Any changes to the data on the file share are automatically and seamlessly propagated to the central server, and from there to any other clients looking at those files. Effectively, it’s a USB drive that’s stored on the internet.
Their shell integration is critical to their success — a naive user can simply run the software and interact with it in the same manner as a USB thumb drive. Because it exposes itself as a logical volume, applications can interface with it out-of-the-box.
Despite the amazing ease-of-use, DropBox is completely insecure and unsuitable for use in a sensitive environment:
- It relies on password authentication
- The server software they use is buggy; numerous critical security holes are constantly found
- Password reset doesn’t deauth clients, http://forums.dropbox.com/topic.php?id=12645
- Able to reset any password, http://pastebin.com/yBKwDY6T
- Data is not encrypted; hosting providers (or anyone who can get access) have all your data
Tarsnap
Tarsnap is an online backup system “for the truly paranoid”. After registering, you provide Tarsnap with a public key to authenticate all data requests. There are two methods of operation: put data and get data. All data is automatically encrypted by the client software with your public key, then signed with your private key, then sent to the server. As soon as the data leaves your system, no one can read it ever again without your private key.
Despite the extreme caution it takes with data security, Tarsnap is completely unusable for the majority of DropBox’s use cases:
- All core functionality is exposed in command-line tools rather than shell integration
- Designed around loading large, static files; no support for inter-file metadata (directories, etc)
- Everything is done with a single key pair, so you cannot share data with other users without giving them your private key
Solution Criteria
We need something that combines the ease-of-use of DropBox’s data-sharing characteristics with the data paranoia of Tarsnap. In particular, it should fulfill the following criteria:
- No data sent over the network or stored on the server should be unencrypted
- The server should not be able to decrypt any of the data it contains
- Private keys must never be shared
- It must be possible for one user to share a single binary file with multiple users without duplicating the binary content
- The system must present itself to the end-user as if it were a USB drive (e.g., seamless shell integration)
Proposed Solution
Transport-Level Details
Data is represented in an encrypted unit which will be henceforth termed a “blob”. A blob consists of the following data segments:
- The binary payload itself, encrypted with a single-use symmetric key, X⁰
- A list of Pⁿ, where each Pⁿ is the known public key of a friend the user authorizes to view the data
- A list of Xⁿ, where each Xⁿ is a copy of X⁰ encrypted with the corresponding Pⁿ
Each blob is identified by the SHA256 (or equivalent) hash of its contents (henceforth referred to as the blob ID).
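To make the layout concrete, here's a rough sketch of a blob and its ID. The class names, the length-prefixed serialization, and the choice to hash the whole serialized blob (rather than just the payload) are all my assumptions; the ciphertexts are treated as opaque bytes, so no actual encryption happens here.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipient:
    public_key: bytes   # Pn: a friend's public key (opaque bytes in this sketch)
    wrapped_key: bytes  # Xn: X0 encrypted with Pn (opaque bytes in this sketch)


@dataclass(frozen=True)
class Blob:
    payload: bytes                          # file content encrypted with X0
    recipients: tuple[Recipient, ...] = ()  # one entry per authorized friend

    def serialize(self) -> bytes:
        """Length-prefix every field so the encoding is unambiguous (assumed format)."""
        parts = [self.payload]
        for r in self.recipients:
            parts += [r.public_key, r.wrapped_key]
        return b"".join(len(p).to_bytes(8, "big") + p for p in parts)

    def blob_id(self) -> str:
        """SHA-256 over the serialized blob, as a hex string."""
        return hashlib.sha256(self.serialize()).hexdigest()
```

One consequence of hashing the whole blob: under this assumption, adding a recipient changes the blob ID, so anything pointing at the old ID would have to be updated.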
Like Tarsnap, the transport provides two operations – putting content on the server, and getting content from the server.
Sending Content
To put content on the server, one blob for each logical file is created, signed with the user’s private key, then uploaded to the server. The server can then verify that the payload was sent by the user and is what the user intended to send. Furthermore, it can see who the user has authorized to view the data (so it can quickly send access denied messages to people who don’t have access to the content).
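A rough sketch of what the server side of an upload might look like. Everything named here (Server, put, can_read, encode, the verify hook) is hypothetical. Python's standard library has no public-key signatures, so the signature check is passed in as a function; a real implementation would plug in something from a vetted crypto library. The wrapped keys are left out to keep it short.

```python
import hashlib
from typing import Callable

# verify(public_key, message, signature) -> bool
# Hypothetical hook: plug in a real signature check from a vetted library.
Verifier = Callable[[bytes, bytes, bytes], bool]


class UploadRejected(Exception):
    """Raised when the signature on an upload does not check out."""


def encode(payload: bytes, readers: list[bytes]) -> bytes:
    """Length-prefixed encoding of the payload plus authorized public keys."""
    return b"".join(len(p).to_bytes(8, "big") + p for p in [payload, *readers])


class Server:
    def __init__(self, verify: Verifier):
        self._verify = verify
        self._blobs: dict[str, bytes] = {}         # blob ID -> stored bytes
        self._readers: dict[str, set[bytes]] = {}  # blob ID -> authorized keys

    def put(self, owner_key: bytes, payload: bytes,
            readers: list[bytes], signature: bytes) -> str:
        message = encode(payload, readers)
        # The signature covers the payload AND the reader list, so neither
        # can be swapped out in transit without detection.
        if not self._verify(owner_key, message, signature):
            raise UploadRejected("bad signature")
        blob_id = hashlib.sha256(message).hexdigest()
        self._blobs[blob_id] = message
        self._readers[blob_id] = set(readers)
        return blob_id

    def can_read(self, blob_id: str, requester_key: bytes) -> bool:
        """Cheap access check: no decryption needed, just a set lookup."""
        return requester_key in self._readers.get(blob_id, set())
```

Note that the server never touches a decryption key: it only checks a signature and remembers which public keys were listed.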
Receiving Content
Likewise, a client can receive content by sending a request for a specific blob ID. The request is signed with the user’s key for authentication purposes. If the client is authenticated, the server then transmits the blob.
The client then thumbs through the blob and finds the copy of the single-use symmetric key encrypted with their public key. They decrypt it with their private key, then use the decrypted key to decrypt the payload of the blob.
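The client-side lookup could go something like the sketch below. The function and exception names are made up, and the two crypto steps are passed in as functions (unwrap for the public-key decryption, decrypt for the symmetric one) because the standard library doesn't provide them; a real client would use primitives from a vetted crypto library.

```python
from typing import Callable, Sequence

# Hypothetical hooks for real primitives:
#   unwrap(private_key, wrapped_key) -> symmetric key  (public-key decryption)
#   decrypt(symmetric_key, ciphertext) -> plaintext    (symmetric decryption)
Unwrap = Callable[[bytes, bytes], bytes]
Decrypt = Callable[[bytes, bytes], bytes]


class NotAuthorized(Exception):
    """Raised when the blob carries no wrapped key for this user."""


def open_blob(payload: bytes,
              recipients: Sequence[tuple[bytes, bytes]],
              my_public_key: bytes,
              my_private_key: bytes,
              unwrap: Unwrap,
              decrypt: Decrypt) -> bytes:
    """Find our copy of the single-use key, unwrap it, and decrypt the payload.

    `recipients` is a sequence of (public_key, wrapped_key) pairs.
    """
    for public_key, wrapped_key in recipients:
        if public_key == my_public_key:
            session_key = unwrap(my_private_key, wrapped_key)
            return decrypt(session_key, payload)
    raise NotAuthorized("no wrapped key for this public key")
```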
Listing/Removing Content
Since the server knows effectively nothing about the content, these are pretty easy use-cases: the client simply sends a signed request to the server. In the former, the server sends a list of blob IDs back to the client (in addition to possible metadata, like file size, for billing purposes). In the latter, the client simply sends a blob ID (or list of IDs) to the server and the server removes them.
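As a sketch, listing and removal boil down to dictionary operations on the server. The names below are hypothetical, the request signature is assumed to have been checked already, and the rule that only the uploader may remove a blob is my assumption (the idea above doesn't say who is allowed to).

```python
class BlobStore:
    """In-memory stand-in for the server's storage."""

    def __init__(self):
        # blob ID -> (owner's public key, stored bytes)
        self._blobs: dict[str, tuple[bytes, bytes]] = {}

    def put(self, owner_key: bytes, blob_id: str, data: bytes) -> None:
        self._blobs[blob_id] = (owner_key, data)

    def list_blobs(self, owner_key: bytes) -> list[tuple[str, int]]:
        """Return (blob ID, size in bytes) pairs; size is the billing metadata."""
        return sorted(
            (blob_id, len(data))
            for blob_id, (owner, data) in self._blobs.items()
            if owner == owner_key
        )

    def remove_blobs(self, owner_key: bytes, blob_ids: list[str]) -> list[str]:
        """Delete the requested blobs owned by the caller; return what was removed."""
        removed = []
        for blob_id in blob_ids:
            entry = self._blobs.get(blob_id)
            if entry is not None and entry[0] == owner_key:
                del self._blobs[blob_id]
                removed.append(blob_id)
        return removed
```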
Providing a Seamless User Experience
What’s been described thus far is effectively Tarsnap with a form of content sharing built-in. As such, it is only suitable for client consumption, not end-user consumption. In addition to transmitting, storing and receiving binary blobs, the user must be able to attach metadata to each blob. Some likely forms of metadata include:
- Symbolic name of the content (e.g., a filename)
- Hierarchical organization of the content (e.g., file directory structure)
- Other tidbits normally expected of filesystems to provide (atime/mtime, etc)
Support for metadata is built entirely on top of the existing transport infrastructure: metadata for all files belonging to a user is encoded as a single, separate blob which contains a hierarchy of metadata objects, each of which contains the blob IDs of the data it references.
In addition to the actual metadata, as listed above, each metadata object also contains one of the following:
- A blob ID which references the blob containing the content of the file, OR
- A set of “child” metadata objects (e.g., this one is a directory) OR
- A blob ID which references another metadata blob (e.g., a shared directory)
The “shared directory” is an abstraction on top of the transport-level permission details that serves two purposes: it provides a way to share metadata with other users that goes beyond all-or-nothing, and it provides an intuitive way to do directory-level sharing (e.g., having a “Shared with Alice and Bob” directory, though the client would have to make sure every blob referenced in that tree carried the appropriate encrypted keys).
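Here's one way the metadata objects might be modelled, plus a walk that collects every blob ID a tree references (the list a client would have to go through when re-keying a shared directory). The field and function names are made up for illustration; the only rule taken from the description above is that each object is exactly one of: a file, a directory, or a reference to another metadata blob.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class MetaNode:
    name: str                                    # symbolic name, e.g. a filename
    mtime: float = 0.0                           # example of "other tidbits"
    content_blob: Optional[str] = None           # file: blob ID of the content
    children: Optional[list["MetaNode"]] = None  # directory: child objects
    shared_blob: Optional[str] = None            # link: blob ID of another metadata blob

    def __post_init__(self):
        kinds = (self.content_blob, self.children, self.shared_blob)
        if sum(k is not None for k in kinds) != 1:
            raise ValueError(
                "exactly one of content_blob, children, shared_blob must be set"
            )


def referenced_blobs(node: MetaNode) -> Iterator[str]:
    """Yield every blob ID reachable from this node, depth-first."""
    if node.content_blob is not None:
        yield node.content_blob
    elif node.shared_blob is not None:
        # Not followed here: the shared metadata blob has to be fetched
        # and decrypted before its own tree can be walked.
        yield node.shared_blob
    else:
        for child in node.children:
            yield from referenced_blobs(child)
```

An empty directory is represented as children=[] rather than None, so it still counts as a directory.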
At this point, we’ve effectively built, from the ground-up, a centralized file-sharing system with no shared secrets.
Good luck making it financially viable ;_;
EDIT: Apparently it already exists. lol.