this is a work in progress
bfs uses public key crypto and key-based RBAC to authenticate peers and control access to data repos. in tomo the mesh will be able to use a few different types of keys to authenticate peers.
peers who replicate a link will also publish a copy of this link object, analogous to seeding in the BitTorrent protocol. each peer keeps an index of the bfs-link objects within their FOAF range. individual files may be fetched on-demand as long as an active peer can be found. private sharing can be accomplished by publishing private links. bfs doesn't make any assumptions about the target files, which may be encrypted or chunked using another tool.
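a rough sketch of the index idea, assuming the FOAF range is just a hop limit over a follow graph. the `follows` and `links_by_peer` dicts, the function names, and the default of 2 hops are all made up for illustration; none of this is bfs's actual data model.

```python
from collections import deque


def peers_within_range(follows, start, max_hops):
    """Breadth-first walk of the follow graph; returns {peer: hop_count}."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        peer = queue.popleft()
        if seen[peer] == max_hops:
            continue  # don't expand past the FOAF range
        for friend in follows.get(peer, ()):
            if friend not in seen:
                seen[friend] = seen[peer] + 1
                queue.append(friend)
    return seen


def build_link_index(follows, links_by_peer, start, max_hops=2):
    """Map each link hash to the set of in-range peers that publish it."""
    index = {}
    for peer in peers_within_range(follows, start, max_hops):
        for link in links_by_peer.get(peer, ()):
            index.setdefault(link, set()).add(peer)
    return index
```

with an index like this, "can i fetch this link?" becomes a lookup for any in-range peer that lists it.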
the "hashed-directory-tree" allows clients to verify the integrity of their downloads progressively, one file at a time. a stricter, chunked merkle-dag might be more efficient, but unless that can be accomplished without altering or copying the source files we will have to make do with the directory tree.
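here's a rough sketch of one way a hashed directory tree could work. the layout is an assumption, not something bfs defines here: hash every file with sha256, build a sorted manifest of `hash  path` lines, then hash the manifest to get the tree hash. `hash_file`, `hash_tree` and `verify_file` are made-up names. it only reads the source files, so nothing gets altered or copied.

```python
import hashlib
import os


def hash_file(path, chunk_size=1 << 20):
    """sha256 of a file, read in chunks so big files don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Return (tree_hash, entries); entries maps relative path -> file hash."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subdirectories in a stable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            entries[rel] = hash_file(full)
    # the manifest is sorted by path, so the tree hash is deterministic
    manifest = "".join(f"{h}  {p}\n" for p, h in sorted(entries.items()))
    return hashlib.sha256(manifest.encode("utf-8")).hexdigest(), entries


def verify_file(root, rel_path, entries):
    """Check one downloaded file against the manifest entry for it."""
    return hash_file(os.path.join(root, rel_path)) == entries.get(rel_path)
```

a client that already trusts the tree hash can check the manifest once, then call `verify_file` as each file lands instead of waiting for the whole download.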
files are transferred using rsync. partial files are always cached so that downloads can be resumed. because files are content addressed, a client can resume downloads from any peer that has a given hash. bfs-links can be downloaded and validated in parallel by fetching multiple files at a time from one or more peers.
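a toy version of "resume from any peer", assuming sha256 addresses. real transfers go over rsync; here each peer is faked as a function `fetch(hash, offset)` that returns the remaining bytes or `None`, and everything is held in memory. `resume_download` and the peer interface are invented for this sketch.

```python
import hashlib


def resume_download(expected_hash, partial, peers):
    """Ask each peer for the bytes after len(partial); return verified data or None."""
    for fetch in peers:
        rest = fetch(expected_hash, len(partial))
        if rest is None:
            continue  # this peer doesn't have the blob
        candidate = partial + rest
        # the content address tells us whether the finished file is good,
        # no matter which peer supplied the tail
        if hashlib.sha256(candidate).hexdigest() == expected_hash:
            return candidate
    return None
```

the cached partial file is kept across attempts, so a bad or missing peer only costs one retry.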
- how to have editable resources? could work like pub/sub, but for changes to a filesystem. what about having multiple agents edit the same resource?
this could follow the ideas on ssb around editing a message by publishing edit messages. a link could be "edited" by publishing a message pointing at a new version of a given file. links are always immutable, but you can create a chain of commits pointing at the latest data. this also allows for forking. merging binary data is, of course, a manual process, but you can express each step as a message pointing at the resulting file.
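a small sketch of how a client might find the latest version(s) from a pile of edit messages. the message shape (`id`, `parent`, `target`) is an assumption for illustration, not the ssb or bfs format, and each message has a single parent, so merges aren't modeled.

```python
def find_heads(messages):
    """Return the messages that no other message names as its parent."""
    parents = {m["parent"] for m in messages if m["parent"] is not None}
    return [m for m in messages if m["id"] not in parents]


# hypothetical log: v1 is edited to v2, then v2 is edited twice independently
log = [
    {"id": "m1", "parent": None, "target": "&v1"},
    {"id": "m2", "parent": "m1", "target": "&v2"},
    {"id": "m3", "parent": "m2", "target": "&v3a"},
    {"id": "m4", "parent": "m2", "target": "&v3b"},
]
heads = find_heads(log)
```

one head means there is a single latest version. more than one head is a fork, and somebody has to merge by hand and publish the result.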
bfs only cares about data integrity. confidentiality must be enforced by another program that wraps the rsync stream.
a virtual interface allows clients to content address blobs, without requiring user data to be stored in a particular way. users can opt to hard link their data into bfs, but this isn't the default behavior.
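roughly what that virtual interface could look like, assuming sha256 addresses: an index from hash to wherever the file already lives, with hard linking as an opt-in extra. `BlobIndex` and its methods are hypothetical names.

```python
import hashlib
import os


class BlobIndex:
    """Maps content hashes to existing file paths; never moves or copies data."""

    def __init__(self):
        self._paths = {}

    def add(self, path):
        """Hash the file where it sits and remember its location."""
        digest = hashlib.sha256()
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        key = digest.hexdigest()
        self._paths[key] = os.path.abspath(path)
        return key

    def open(self, key):
        """Open a blob by hash, straight from the user's own file."""
        return open(self._paths[key], "rb")

    def hard_link_into(self, key, store_dir):
        """Opt-in: hard link the blob into a store directory, named by its hash."""
        os.makedirs(store_dir, exist_ok=True)
        dest = os.path.join(store_dir, key)
        if not os.path.exists(dest):
            os.link(self._paths[key], dest)  # same filesystem only
        return dest
```

the catch with the default mode is that the index goes stale if the user edits or moves a file, so a real version would need to re-hash or watch for changes.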
why DHT tho?
like, yeah content addressing is dope, but DHT isn't just content addressing. there's the whole discovery process, but why does it operate at such a low level? when i grab an infohash there's no way for me to know if i will be able to complete the download until the transfer stalls. not helpful. either you have the file or you don't, 30% of $whatever is ~(the same) as not having the damn thing.
so we're back to content addressing. super easy shit: run sha256sum on $whatever and you get the hash. you can grab it from a server with
rsync rsync://firstname.lastname@example.org:$(sha256sum "$whatever" | cut -d' ' -f1) .
it's even resumable, but then you run into routing and discovery issues.
so we need something to make NAT go away and a durable way to transmit hashes and routing information.
something where routes can be determined using pubkeys maybe?
swarming is a nice thing to have too. for hash/addresses with multiple files, this naive rsync system can already do it by fetching different files from different peers. i can imagine using an rsync library to work at a lower level to transfer (and resume) on a delta/block level so i can download a particular file from more than one peer at a time.
it might even be reasonable to embed all of this context inside the structure of the response when you ask a peer about a given blob. so, rather than publish a message describing a blob, you publish a reference to it. peers look up a given blob using a modified DHT that operates in parallel to the gossip network.
for editable resources, some metadata stored on the updated version can point at its parent using the &<blob> name that is used elsewhere. the advantage here for "link" objects is that the version metadata is stored in an immutable log. it may not be possible to follow an edit chain all of the way back to the first version if it isn't explicitly stored in a user's log.
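a sketch of walking an edit chain backwards and noticing when it runs out. `parent_of` is an assumed lookup built from whatever version metadata is in the logs you have: blob name -> parent blob name, or `None` for a first version. the `history` function is made up for illustration.

```python
def history(blob, parent_of):
    """Follow parent pointers from blob; return (chain, complete)."""
    chain = [blob]
    seen = {blob}
    current = blob
    while True:
        if current not in parent_of:
            # no metadata for this version in the logs we hold
            return chain, False
        parent = parent_of[current]
        if parent is None:
            return chain, True  # reached the first version
        if parent in seen:
            raise ValueError("cycle in edit chain")
        chain.append(parent)
        seen.add(parent)
        current = parent


# hypothetical metadata: &c was made from &b, which was made from &a
known = {"&c": "&b", "&b": "&a", "&a": None}
full_chain, complete = history("&c", known)
```

when `complete` is false, the last entry in the chain is the oldest version you know the name of, even though its own metadata never showed up in a log you hold.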