Aug 16
bullshit rant, part 1
If you haven’t watched Josh Berkus’s presentation, “Scale Fail”, consider doing so now.
I’m a real sucker for using Reddit/prog/it to choose what programming language/framework/libraries to use for my upcoming projects. Sometimes it’s worse than usual, and I find myself trying to figure out which project to shoehorn one thing or another into.
Naturally, right now I’m running high on the Node.js wave and tweaking along the Go rave. Naturally, I’ve had wildly different experiences with each.

I started using Node.js back when it was version 0.2.0, back before npm was widely used, when Express was first brewing, even before hipsters decided how to be cool. To be honest, since then it’s had a fairly stable API, and the ecosystem gets better every day. In my mind though, it still has two massive warts:
Continuation passing can make code pig disgusting
Quite a few people have bitched about continuation passing in Node.js being a pain for a variety of reasons.
It makes the code look like piss
This is not that big of a deal — if your code looks like piss because of the continuation passing style, then you’re not doing it right. Your code needs to either be restructured to accurately reflect what you’re trying to do, or you should use one of several existing libraries for manipulating control flow in a CPS-friendly manner.
Lexical scoping can make refactoring code “fun”
Almost every Node.js application is going to have code that looks like
function(foo, bar) {
  foo.do(function(x) {
    bar.do(function(y) {
      /* snip */
    });
  });
}
Assume that you want to lift the callback passed to foo.do into a separate, named function:
function lifted(x) {
  bar.do(function(y) {
    /* snip */
  });
}
The problem is that we’re referencing bar which was previously closed on from the parent environment. By lifting the function out of the lexical scope (presumably to use the same functionality elsewhere), we completely sacrifice all the automatic loveliness of closures. Instead, we have to explicitly go through the function to lift (which may be fairly long and complex, mind you), annotate exactly what’s used from the parent scope, then manually bind it into a closure:
function lifted(bar) {
  return function(x) {
    bar.do(function(y) {
      /* snip */
    });
  };
}
function(foo, bar) {
  foo.do(lifted(bar));
}
To add insult to injury, the compiler will accept whatever you give it, and fault you at runtime for any mistakes.
If you’ve got some Javascript curry magic, please let me know :(
When spaghetti is what’s on the menu
The biggest issue I have with continuation passing is that (since I’m dealing mostly with web applications) it’s exceedingly difficult to trace a failure back to a specific request. Error propagation in Node.js manifests itself in one of three forms:
- A thrown exception
- An 'error' event is emitted from an EventEmitter
- An error parameter is passed to a callback
Thrown Exceptions
Exceptions are used solely to communicate errors in synchronous calls, which naturally are few and far between. The handler is almost always in the same lexical scope as the calling code (since letting the exception propagate further may trigger an 'uncaughtException' event on process, which should almost always kill the application as cleanly as possible), so it’s not that big of a deal.
If you’re balking at the aforementioned “'uncaughtException' should kill the application as cleanly as possible”: the logic here is that you really have no idea where the hell the exception came from, and even if you do, you have no access to the scope from which it was thrown. You can’t guarantee that your application is in a consistent state, so your only recourse is to SHUT. DOWN. EVERYTHING. (If this is not the case and you can guarantee a consistent state, somehow, please let me know. That would be amazing.)
Error Events
'error' events are fun little things: whenever you emit an 'error' event on any EventEmitter, if there are no listeners for that event, the emitter throws the error instead, and unless something catches it the process exits (as far as I can tell it goes down the same 'uncaughtException' path as any other uncaught throw).
This actually works out really well, in most cases. Whenever you’re binding listeners for something, you should bind the error listener too — you’ve got all the stuff you need to identify from whence the error originated within lexical scope.
The one massive snafu is that when you’re abstracting an abstraction that uses EventEmitter internally, you MUST remember to handle and forward all the error events. You might be reading this and say “oh but that’s easy to remember it’ll never happen”. It actually happens more often than you think — the built-in http.Client functionality didn’t properly catch and forward errors from the internal net.Socket for a long time. You had to manually get the undocumented socket member and attach a listener manually. I think they fixed this in the new http.request interface.
Errors in callbacks — lexical scope strikes again
I think the most common form of error passing is by just returning an error code in a callback parameter. This works really well, until you start refactoring stuff and realize how much shit you’re stuffing in a closure.
Actually I don’t remember where I was going on this one — it seems that it’s trivially solved by just using the “pass it forward” error code C-ism.
Writing native (C++) extensions is a bitchface
I uhh, this is getting kind of long. I’m gonna break this into a multipart/post and write up section 2 of Node.js bitchings and then eventually write up a section on Go bitchings. Hooray!
Tagged with: bullshit rant, javascript, mention of go, node.js, tl;dr, why did i quit smoking
Jul 17
Invoking mount(2) in FreeBSD 8.x
So I’m still writing Go bindings for a lot of common FreeBSD functionality. Yesterday I implemented a means to list all mounted filesystems, so today I’m writing the bindings to mount(2) to mount/umount them.
If you look at the man page for mount, you’ll see that the function signature looks like this:
int mount(const char *type, const char *dir, int flags, void *data);
The void* should scare you.
I haven’t been able to dig up any information about what the fuck should be passed to it (granted, I haven’t looked very hard because, judging from the contents of src/sbin/mount_*/*.c in the FreeBSD sources, it’s been entirely superseded by nmount):
int nmount(struct iovec *iov, u_int niov, int flags);
Poking around, struct iovec (eventually included from sys/uio.h) is defined as this:
struct iovec {
  void *iov_base;
  size_t iov_len;
};
In effect, nmount takes an array of these structs which form a flattened vector of (key, value) tuples. As far as I can tell, iov_base is always a NUL-terminated char*, and iov_len should be strlen(iov_base) + 1 (for the NUL terminator).
Unfortunately, the only hints that man 2 nmount gives us are
The following options are required by all file
systems:
fstype file system type name (e.g., ``procfs'')
fspath mount point pathname (e.g., ``/proc'')
Depending on the file system type, other options may be recognized or
required; for example, most disk-based file systems require a ``from''
option containing the pathname of a special device in addition to the
options listed above.
So far, the only way I’ve been able to find the actual options is to dig through mount_* sources and see what they use, but it’s pretty gross. Take, for example, the following two filesystems:
- nullfs simply layers one vnode on top of another, effectively grafting one directory over another.
- unionfs (roughly) does the same thing, but still lets you access the grafted-over directory in read-only mode (and can be configured to do cool shit like copy-on-write).
They’re pretty close, but let’s look at the arguments that each of them take:
nullfs
- fstype: “nullfs”
- fspath: Path to the directory to graft over.
- target: Path of the directory that’s being grafted onto another.
IMHO, "target" should be "from", bikesheds, et al.
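For what it's worth, the flattened (key, value) layout that nmount wants can be sketched in plain Go without touching cgo at all. This is only a rough illustration: the iovec type and the flatten helper are made-up names, the paths are placeholders, and nothing here actually calls nmount. It just builds the NUL-terminated buffers and the strlen + 1 lengths for the nullfs options listed above.

```go
package main

import "fmt"

// iovec is a hypothetical Go-side stand-in for C's struct iovec:
// base plays the role of iov_base, length the role of iov_len.
type iovec struct {
	base   []byte
	length int
}

// flatten turns ordered (key, value) pairs into the flat
// key, value, key, value... vector that nmount expects.
// Every entry is NUL-terminated and its length counts the terminator.
func flatten(pairs [][2]string) []iovec {
	out := make([]iovec, 0, len(pairs)*2)
	for _, kv := range pairs {
		for _, s := range kv {
			b := append([]byte(s), 0) // strlen(s) + 1 bytes
			out = append(out, iovec{base: b, length: len(b)})
		}
	}
	return out
}

func main() {
	// Option names come from the nullfs list above; the paths are placeholders.
	iov := flatten([][2]string{
		{"fstype", "nullfs"},
		{"fspath", "/mnt/graft"},
		{"target", "/usr/src"},
	})
	for _, v := range iov {
		fmt.Printf("%q len=%d\n", v.base, v.length)
	}
}
```

A real binding would still have to copy each buffer into C memory and hand nmount a pointer to the array, which is where the cgo pain starts.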
unionfs
- fstype: “unionfs”
- fspath: Path to the directory where the unionfs will be mounted.
- from: Same as “target”, above.
- below: Makes “fspath” writable, “from” read-only (swaps default behavior)
- errmsg: …I have no fucking idea, a char[255] which presumably is used as a buffer instead of errno?
- …anything else passed as -oyour=mom passed to mount_unionfs?!
Maybe this is more a gripe that unionfs seems to be very shitty. And maybe I just haven’t found a nice magical table of options that every filesystem takes. But FFFFFF SERIOUSLY >:(
Tagged with: c, freebsd, mention of go, nuclear farting program, where are the goddamn docs
Jul 16
getmntinfo(2) from Go — a foray into cgo
Go is a fun esoteric language that strives for system-level usage. Currently in all real operating systems, C is the dominant systems language and as such, all the functionality for interfacing with core features is exposed as raw C APIs. Go provides a C FFI layer called cgo, which handles all the preprocessing and linking magic in the background. Unfortunately, there’s little-to-no documentation available for cgo, just a couple of toy examples in Go’s misc/cgo directory (there’s actually a shitton of production examples in the Go package sources though; fucking everything uses cgo).
So, what I want to do is expose getmntinfo, which simply lists the metadata for all mounted filesystems. In C, this is pretty trivial:
#include <sys/param.h>
#include <sys/ucred.h>
#include <sys/mount.h>
#include <stdio.h>
int main() {
  struct statfs *bufs;
  int i = getmntinfo(&bufs, 0);
  int j = 0;
  for (j = 0; j < i; ++j) {
    struct statfs fs = bufs[j];
    printf("[%s] %s -> %s\n", fs.f_fstypename,
           fs.f_mntfromname, fs.f_mntonname);
  }
  return 0;
}
This, however, presents a variety of problems for the Go implementation –
- We don’t really know how many struct statfs we’re getting back.
- The memory allocated is actually allocated statically; we just get an opaque pointer back to an in-library address.
- The fields of struct statfs are char[N]s rather than char*s.
Thankfully, calling getmntinfo is pretty trivial –
func GetMntInfo() []MntInfo {
  var tmp *C.struct_statfs
  i := int(C.getmntinfo(&tmp, 0))
It’s pretty close to the C version — we just allocate a pointer, and pass a pointer to it in. getmntinfo sets the value of the pointer to an internal array of struct statfs‘s and lets us go along our merry way. Naturally, we want to marshal it to the appropriate Go types.
  info := make([]MntInfo, i)
  for j := range info {
So we create an array to marshal values into and begin to iterate through it.
This is where it gets nasty. All we have right now is an opaque pointer to a struct statfs — in C we’d just use pointer arithmetic to get the other entries in the array. Go, fortunately, explicitly disallows pointer arithmetic. I’m not sure what the appropriate method to get values out of it is. First, I tried something like
foo := (*[]MntInfo)(unsafe.Pointer(tmp))
item := (*foo)[j]
But that seems to cause a panic (no idea why). I got tired of dicking with it and threw in the cards, simply exposing the following C function in the cgo header –
struct statfs* offset(struct statfs *v, int i) {
  return v + i;
}
With that, there’s no need to dick with much of anything, so we can get the current struct statfs of the iteration pass via
s := C.offset(tmp, C.int(j))
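A likely culprit for the earlier panic: (*[]MntInfo)(unsafe.Pointer(tmp)) makes Go treat the C memory as a slice header (data pointer, length, capacity) rather than as the array itself, so indexing it reads garbage. It also uses MntInfo where the memory really holds struct statfs. Much newer Go releases (1.17 and up, so nothing available when this was written) have unsafe.Slice for exactly this "pointer to first element plus a count" case. Here's a rough sketch in pure Go, where the statfs type is a hypothetical stand-in for C.struct_statfs and the entries helper is made up for illustration:

```go
package main

import (
	"fmt"
	"unsafe"
)

// statfs is a hypothetical stand-in for C.struct_statfs.
type statfs struct {
	id   int32
	name [16]byte
}

// entries views n consecutive structs starting at first as a Go slice.
// No copying happens; the slice aliases the original memory.
// Requires Go 1.17 or newer for unsafe.Slice.
func entries(first *statfs, n int) []statfs {
	return unsafe.Slice(first, n)
}

func main() {
	// Pretend this array is the buffer that getmntinfo handed back.
	backing := [3]statfs{{id: 1}, {id: 2}, {id: 3}}

	for j, s := range entries(&backing[0], len(backing)) {
		fmt.Println(j, s.id)
	}
}
```

With cgo the same call would presumably take tmp and the count returned by getmntinfo, and the C offset helper wouldn't be needed.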
Finally, the char[16] values need to be marshaled out. Unfortunately, the C.GoString marshaling function only takes a char* and it’s too damn stubborn to take an implicitly-convertible type (noting that X* != X[]). The other beef is that cgo’s type system processes a char[] strangely as a []_C_char_type, so we can index it perfectly fine (but not implicitly coerce it into a pointer).
So we juggle some types and shit all over unsafe.Pointer and make it do what we want –
    info[j].FsType = C.GoString(
      (*C.char)(unsafe.Pointer(&s.f_fstypename[0])))
    info[j].MntFrom = C.GoString(
      (*C.char)(unsafe.Pointer(&s.f_mntfromname[0])))
    info[j].MntOn = C.GoString(
      (*C.char)(unsafe.Pointer(&s.f_mntonname[0])))
  }
  return info
}
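If the unsafe.Pointer juggling feels too gross, the char[N] fields can also be marshaled by hand, since cgo lets you index them just fine. A minimal sketch in pure Go, assuming C.char maps to int8 (true on common x86 targets, not guaranteed everywhere); goString is a made-up helper name, not part of cgo:

```go
package main

import "fmt"

// goString copies a fixed-size C-style char buffer into a Go string,
// stopping at the first NUL byte (or at the end of the buffer if
// there is no terminator).
func goString(c []int8) string {
	b := make([]byte, 0, len(c))
	for _, ch := range c {
		if ch == 0 {
			break
		}
		b = append(b, byte(ch))
	}
	return string(b)
}

func main() {
	// Stand-in for a char[16] field such as f_fstypename.
	var name [16]int8
	for i, ch := range "nullfs" {
		name[i] = int8(ch)
	}
	fmt.Println(goString(name[:]))
}
```

It costs an extra copy per field, but there's no pointer casting and it can't run off the end of the array if the terminator is missing.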
And, after several hours of not finding any fucking documentation and screaming at the fucking monitor the damn thing finally works. I’m completely glossing over the terrible shitty build system they’ve got set up (it basically only provides functionality to INSTALL built cgo packages; I haven’t found a way to actually build and link them otherwise). Will probably have to read through all the fucking makefiles that do evil shit.
At some point just doing everything in C is easier, I suspect :|
will post full code listing in a sec
Tagged with: batman's cock is huge, cgo, fuck documentation, go
Jul 11
Calling a templated member function of a typedef’d template class
C++ is insane.
Assume you have a templated Object:
template
And you want to wrap up the instance in a Proxy object:
template
Pretty straightforward, but when you actually try to invoke Proxy
struct Foo {};
int main() {
Proxy
g++ shits itself completely:
$ g++ test1.cpp
test1.cpp: In static member function ‘static void Proxy
Fucking fantastic.
Some tinkering reveals that the compiler is getting confused as to what the fuck obj.func is somewhere. The following implementation of Func works fine (but defeats the point of using templates) --
static void Func() {
Proxy
I searched for awhile and turned up jack diddly squat, then a co-worker informed me the fix is to use the following:
static void Func() {
Proxy
I don't know what the fuck this instance.template function<..>() bullshit is, but apparently MSVC implicitly puts it in there for you. I've certainly never seen it before and it's completely orthogonal to any fix I would have assumed.
tl;dr C++ is a clusterfuck.
EDIT: A stack overflow post which contains a reference to the C++03 standard (14.2/4) in the answers. fml.
Tagged with: c, c++ is a clusterfuck, seriously wtf, stupid shit
Jun 20
A centralized system for sharing sensitive content
I have too many stupid ideas which I’ll never have enough time to implement. Despite that, some of them I’d really like to see implemented because I bloody need them. So please someone steal this and implement it, even though it’s a stupid piece of trash ;_;
Overview
The goal is to combine the easy-to-use native interfaces of DropBox (http://www.dropbox.com/) with the paranoid strong-encryption cryptography of Tarsnap (http://www.tarsnap.com/) to create a cloud-based sharable storage system where you can share content with yourself and other people, but not even the server providers can see the content being shared.
Deficits in Existing Systems
DropBox
DropBox is a service for easily storing and sharing content in the cloud — after registering an account, it effectively presents itself as a file share on your local machine (Windows, OSX, Linux, etc). Any changes to the data on the file share are automatically and seamlessly propagated to the central server, and from there to any other clients looking at those files. Effectively, it’s a USB drive that’s stored on the internet.
Their shell integration is critical to their success — a naive user can simply run the software and interact with it in the same manner as a USB thumb drive. Because it exposes itself as a logical volume, applications can interface with it out-of-the-box.
Despite the amazing ease-of-use, DropBox is completely insecure and unsuitable for use in a sensitive environment:
- It relies on password authentication
- The server software they use is buggy; numerous critical security holes are constantly found
- Password reset doesn’t deauth clients, http://forums.dropbox.com/topic.php?id=12645
- Able to reset any password, http://pastebin.com/yBKwDY6T
- Data is not encrypted; hosting providers (or anyone who can get access) have all your data
Tarsnap
Tarsnap is an online backup system “for the truly paranoid”. After registering, you provide tarsnap with a public key to authenticate all data requests. There are two methods of operation: put data and get data. All data is automatically encrypted by the client software with your public key, then signed with your private key, then sent to the server. As soon as the data leaves your system, no one can access it ever again without your private key.
Despite the extreme caution it takes with data security, Tarsnap is completely unusable for the majority of DropBox’s use cases:
- All core functionality is exposed in command-line tools rather than shell integration
- Designed around loading large, static files; no support for inter-file metadata (directories, etc)
- Everything done with a single key pair: cannot share data with other users without giving them your private key
Solution Criteria
We need something that combines the ease-of-use of DropBox’s data-sharing characteristics with the data paranoia of Tarsnap. In particular, it should fulfill the following criteria:
- No data sent over the network or stored on the server should be unencrypted
- The server should not be able to decrypt any of the data it contains
- Private keys must never be shared
- It must be possible for one user to share a single binary file with multiple users without duplicating the binary content
- The system must present itself to the end-user as if it were a USB drive (e.g., seamless shell integration)
Proposed Solution
Transport-Level Details
Data is represented in an encrypted unit which will be henceforth termed a “blob”. A blob consists of the following data segments:
- The binary payload itself, encrypted with a single-use symmetric key, X⁰
- A list of Pⁿ, where each Pⁿ is the known public key of a friend the user authorizes to view the data
- A list of Xⁿ, where each Xⁿ is X⁰ encrypted with the corresponding Pⁿ
Each blob is identified by the SHA256 (or equivalent) hash of its contents (henceforth referred to as the blob ID).
Like Tarsnap, the transport provides two operations – putting content on the server, and getting content from the server.
Sending Content
To put content on the server, one blob for each logical file is created, signed with the user’s private key, then uploaded to the server. The server can then verify that the payload was sent by the user and is what the user intended to send. Furthermore, it can see who the user has authorized to view the data (so it can quickly send access denied messages to people who don’t have access to the content).
Receiving Content
Likewise, a client can receive content by sending a request for a specific blob ID. The request is signed with the user’s key for authentication purposes. If the client is authenticated, the server then transmits the blob.
The client then thumbs through the blob and finds the copy of the single-use symmetric key encrypted with their public key. They decrypt it with their private key, then use the decrypted key to decrypt the payload of the blob.
Listing/Removing Content
Since the server knows effectively nothing about the content, these are pretty easy use-cases: the client simply sends a signed request to the server. In the former, the server sends a list of blob IDs back to the client (in addition to possible metadata, like file size, for billing purposes). In the latter, the client simply sends a blob ID (or list of IDs) to the server and the server removes them.
Providing a Seamless User Experience
What’s been described thus far is effectively Tarsnap with a form of content sharing built-in. As such, it is only suitable for client consumption, not end-user consumption. In addition to transmitting, storing and receiving binary blobs, the user must be able to append metadata to that blob. Some likely forms of metadata include
- Symbolic name of the content (e.g., a filename)
- Hierarchical organization of the content (e.g., file directory structure)
- Other tidbits normally expected of filesystems to provide (atime/mtime, etc)
Support for metadata is built entirely on top of the existing transport infrastructure: metadata for all files belonging to a user is encoded as a single, separate blob which contains a hierarchy of metadata objects, each of which contains the blob IDs of the data it references.
In addition to the actual metadata, as listed above, each metadata object also contains one of the following:
- A blob ID which references the blob containing the content of the file, OR
- A set of “child” metadata objects (e.g., this one is a directory) OR
- A blob ID which references another metadata blob (e.g., a shared directory)
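The three flavors of metadata object can be sketched as a single Go type. The field names, the Kind helper and the Files walker below are all made up for illustration, and the blob IDs are placeholder strings rather than real hashes.

```go
package main

import "fmt"

// Meta is a hypothetical metadata object. Exactly one of ContentID,
// Children or SharedID is expected to be set.
type Meta struct {
	Name      string  // symbolic name, e.g. a filename
	ContentID string  // blob ID of the file's content
	Children  []*Meta // child objects, i.e. this one is a directory
	SharedID  string  // blob ID of another metadata blob (shared directory)
}

// Kind reports which of the three variants this object is.
func (m *Meta) Kind() string {
	switch {
	case m.ContentID != "":
		return "file"
	case m.SharedID != "":
		return "shared"
	default:
		return "directory"
	}
}

// Files walks the hierarchy and maps each file path to its content blob ID.
// Shared directories are skipped here, since resolving them would mean
// fetching and decrypting another metadata blob.
func Files(root *Meta) map[string]string {
	out := map[string]string{}
	var walk func(m *Meta, parent string)
	walk = func(m *Meta, parent string) {
		path := parent + "/" + m.Name
		if m.Kind() == "file" {
			out[path] = m.ContentID
		}
		for _, c := range m.Children {
			walk(c, path)
		}
	}
	walk(root, "")
	return out
}

func main() {
	root := &Meta{Name: "home", Children: []*Meta{
		{Name: "notes.txt", ContentID: "blob-aaa"},
		{Name: "shared-with-alice", SharedID: "blob-bbb"},
	}}
	for path, id := range Files(root) {
		fmt.Println(path, "->", id)
	}
}
```

A client would presumably build the fake USB drive view from a walk like this and only fetch content blobs when a file is actually opened.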
The “shared directory” is an abstraction on top of the transport-level permission details that serves two purposes: it provides a way to share metadata with other users that goes beyond all-or-nothing, and it provides an intuitive way to do directory-level sharing (e.g., having a “Shared with Alice and Bob” directory, though the client would have to make sure every blob referenced in that tree was appended with the appropriate encrypted keys).
At this point, we’ve effectively built, from the ground-up, a centralized file-sharing system with no shared secrets.
Good luck making it financially viable ;_;
EDIT: Apparently it already exists. lol.
Tagged with: dropbox, lots of bullet points, tarsnap, terrible idea, too much wanking; not enough free time
