- = GPSD-NG: A Case Study in Application Protocol Evolution
- Eric S. Raymond <esr@thyrsus.com>
- v1.5.3, Jan 2021
- :author: Eric S. Raymond
- :description: A case study in the evolution of the gpsd protocol
- :email: <esr@thyrsus.com>
- :keywords: GPSD, protocol, evolution
- :robots: index,follow
- :sectlinks:
- :toc: macro
- include::inc-menu.adoc[]
- This document is mastered in asciidoc format. If you are reading it in HTML,
- you can find the original at the GPSD project website.
- == Introduction
- GPSD is a service daemon that collects data from serial and USB GPS
- sensors attached to a host machine and presents it in a
- simple-to-parse form on TCP/IP port 2947. This is a less trivial task
- than it sounds, because GPS sensor interfaces are both highly variable
- and really badly designed (see http://esr.ibiblio.org/?p=801[Why GPSes
- suck, and what to do about it] for a description of NMEA 0183 and
- other horrors).
- In this paper, however, we will be ignoring all the dodgy stuff that
- goes on at GPSD's back end to concentrate on what happens at the front
- - the request-response protocol through which client programs get
- access to the information that GPSD acquires from its devices and
- internal computations.
- The GPSD request-response protocol is entering its third generation of
- design, and I think the way it has evolved spotlights some interesting
- design issues and long-term trends in the design of network protocols
- in general. To anticipate, these trends are: (1) changing tradeoffs
- of bandwidth economy versus extensibility and explicitness, (2) a
- shift from lockstep conversational interfaces to event streams, (3)
- changes in the "sweet spot" of protocol designs due to increasing use
- of scripting languages, and (4) protocols built on metaprotocols.
- Carrying these trends forward may even give us a bit of a glimpse at
- the future of application-protocol design.
- == The first version: a simple conversational protocol
- The very first version of GPSD, back in the mid-1990s, handled
- NMEA GPSes only and was designed with a dead-simple request-response
- protocol. To get latitude and longitude out of it, you'd connect
- to port 2947 and have a conversation that looked like this:
- -------------------------------------------------------------------------
- -> P
- <- GPSD,P=4002.1207 07531.2540
- -------------------------------------------------------------------------
- That is GPSD reporting, the only way it could in the earliest protocol
- version, that I'm at about latitude 40 north and longitude 75 west.
- If you are a mathematician or a physicist, you're probably noticing
- some things missing in this report. Like a timestamp, and a circular
- error estimate, and an altitude. In fact, it was possible to get some
- of these data using the old protocol. You could make a compound request
- like this:
- -------------------------------------------------------------------------
- -> PAD
- <- GPSD,P=4002.1207 07531.2540,A=351.27,D=2009:07:11T11:16Z
- -------------------------------------------------------------------------
- For some devices (not all) you could add E and get error estimates.
- Other data such as course and rate of climb/sink might be available
- via other single-letter commands. I say "might be" because in those
- early days gpsd didn't attempt to compute error estimates or velocities
- if the GPS didn't explicitly supply them. I fixed that, later, but
- this essay is about protocol design so I'm going to ignore all the
- issues associated with the implementation for the rest of the discussion.
- The version 1 protocol is squarely in the tradition of classic textual
- Internet protocols, even though it doesn't look much like (say) SMTP
- transactions - requests are simple to emit and responses are easy to
- parse. It was clearly designed with the more specific goal of
- minimizing traffic volume between the daemon and its clients. It
- accomplishes that goal quite well.
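To make the "easy to parse" claim concrete, here is a minimal Python sketch that splits a version-1 reply into its lettered fields. The function name `parse_v1_response` is hypothetical, and the layout it assumes (a `GPSD` prefix followed by comma-separated `letter=value` items) is inferred only from the sample replies shown above.

```python
def parse_v1_response(line):
    """Split a version-1 reply such as 'GPSD,P=...,A=...' into {letter: value}."""
    prefix, _, body = line.strip().partition(",")
    if prefix != "GPSD":
        raise ValueError("not a GPSD response: %r" % line)
    fields = {}
    for item in body.split(","):
        # Each item is one request letter, an equals sign, and the value text.
        letter, _, value = item.partition("=")
        fields[letter] = value
    return fields


reply = "GPSD,P=4002.1207 07531.2540,A=351.27,D=2009:07:11T11:16Z"
fields = parse_v1_response(reply)
# The P value carries two space-separated numbers: latitude, then longitude.
latitude_text, longitude_text = fields["P"].split()
```

Values are left as text here; interpreting units and number formats is a matter for the protocol documentation, not for this sketch.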
- == The second version: from conversational to streaming
- However, when I started work on GPSD in 2004 there was already
- pressure from the existing userbase to change at least one of the
- protocol's major assumptions - that is, that the client would poll
- whenever it wanted data. It's usually more convenient to be able to
- say to the daemon "Speak!" and have it stream TPV
- (time/position/velocity) reports back at you at the sensor's sampling
- rate (usually once per second). Especially when, as with GPSD, you
- have a client library that can spin in a thread picking up the updates
- and dropping them in a struct somewhere that you specify.
- This was the first major feature I implemented. I called it "watcher
- mode", and it required me to add two commands to the protocol. There
- were already so many single-shot commands defined that we were close
- to running out of letters for new ones; I was able to grab "W" for the
- command that enables or disables watcher mode, but was left with the
- not-exactly-intuitive "O" for the streaming TPV report format. Here's
- how it looks:
- -------------------------------------------------------------------------
- -> W=1
- <- GPSD,W=1
- <- GPSD,O=MID2 1118327700.280 0.005 46.498339529 7.567392712 1342.392 36.000 32.321 10.3787 0.091 -0.085 ? 38.66 ? 3
- <- GPSD,O=MID2 1118327701.280 0.005 46.498339529 7.567392712 1342.392 48.000 32.321 10.3787 0.091 -0.085 ? 50.67 ? 3
- <- GPSD,O=MID2 1118327702.280 0.005 46.498345996 7.567394427 1341.710 36.000 32.321 10.3787 0.091 -0.085 ? 38.64 ? 3
- <- GPSD,O=MID2 1118327703.280 0.005 46.498346855 7.567381517 1341.619 48.000 32.321 10.3787 0.091 -0.085 ? 50.69 ? 3
- <- GPSD,Y=MID4 1118327704.280 8:23 6 84 0 0:28 7 160 0 0:8 66 189 45 1:29 13 273 0 0:10 51 304 0 0:4 15 199 34 1:2 34 241 41 1:27 71 76 42 1:
- <- GPSD,O=MID2 1118327704.280 0.005 46.498346855 7.567381517 1341.619 48.000 32.321 10.3787 0.091 -0.085 ? ? ? 3
- -> W=0
- <- GPSD,W=0
- -------------------------------------------------------------------------
- The fields in the O report are tag (an indication of the device
- sentence that produced this report), time, time error estimate,
- latitude, longitude, altitude, horizontal error estimate, vertical
- error estimate, course, speed, climb/sink, error estimates for
- those last three fields, and mode (an indication of fix quality). If
- you care about issues like reporting units, read the documentation.
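The positional nature of the 'O' report is easiest to see in code. The following Python sketch assigns names to the fifteen whitespace-separated fields by position alone. The short field names are borrowed from the JSON report shown later in this paper, except `epd` and `epc`, which are hypothetical labels for the course and climb error estimates. Treating `?` as "no value" and placing latitude before longitude are both inferences from the sample data, not statements from the protocol documentation.

```python
O_FIELDS = ("tag", "time", "ept", "lat", "lon", "alt", "eph", "epv",
            "track", "speed", "climb", "epd", "eps", "epc", "mode")


def parse_o_report(line):
    """Turn one 'GPSD,O=...' line into a dict keyed by field position."""
    head, _, body = line.strip().partition("=")
    if head != "GPSD,O":
        raise ValueError("not an O report: %r" % line)
    tokens = body.split()
    if len(tokens) != len(O_FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(O_FIELDS), len(tokens)))
    report = {}
    for name, token in zip(O_FIELDS, tokens):
        if token == "?":
            report[name] = None          # assumed: '?' marks an unavailable value
        elif name == "tag":
            report[name] = token
        elif name == "mode":
            report[name] = int(token)
        else:
            report[name] = float(token)
    return report


sample = ("GPSD,O=MID2 1118327700.280 0.005 46.498339529 7.567392712 1342.392 "
          "36.000 32.321 10.3787 0.091 -0.085 ? 38.66 ? 3")
report = parse_o_report(sample)
```

Nothing in the line itself says which number is which; the meaning lives entirely in `O_FIELDS`, so adding or reordering a field would silently break every existing client.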
- The 'Y' report is a satellite skyview, giving elevation, azimuth,
- and signal strength for each of the visible satellites.
- GPSes usually report this every five cycles (seconds).
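A rough Python sketch of unpacking the 'Y' line from the transcript follows. The layout it assumes (a tag, a timestamp, a satellite count, then colon-terminated groups of five integers) is inferred from the sample alone. Reading each group as satellite ID, elevation, azimuth, signal strength and an in-use flag is likewise an inference, made by matching the sample numbers against the JSON SKY report shown later in this paper.

```python
def parse_y_report(line):
    """Unpack one 'GPSD,Y=...' skyview line into a dict with a satellite list."""
    head, _, body = line.strip().partition("=")
    if head != "GPSD,Y":
        raise ValueError("not a Y report: %r" % line)
    chunks = body.split(":")
    tag, timestamp, count = chunks[0].split()
    satellites = []
    for chunk in chunks[1:]:
        numbers = chunk.split()
        if len(numbers) != 5:
            continue  # skips the empty piece left after the final colon
        prn, el, az, ss, used = (int(n) for n in numbers)
        satellites.append({"PRN": prn, "el": el, "az": az, "ss": ss,
                           "used": bool(used)})
    return {"tag": tag, "time": float(timestamp),
            "reported": int(count), "satellites": satellites}


sample = ("GPSD,Y=MID4 1118327704.280 8:23 6 84 0 0:28 7 160 0 0:"
          "8 66 189 45 1:29 13 273 0 0:10 51 304 0 0:4 15 199 34 1:"
          "2 34 241 41 1:27 71 76 42 1:")
sky = parse_y_report(sample)
used_prns = [sat["PRN"] for sat in sky["satellites"] if sat["used"]]
```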
- The 'W', 'O' and 'Y' sentences, together, effectively constituted
- version 2 of the protocol - designed for streaming use. The other
- single-shot commands, though still supported, rapidly became
- obsolescent.
- Attentive readers may wonder why I designed a novel 'O' format rather
- than writing the watcher-mode command so that it could specify a
- compound report format (like PADE) every second. Part of the answer
- is, again, that we were running out of letters to associate with new
- data fields like the error estimates. I wanted to use up as little of
- the remaining namespace as I could get away with.
- Another reason is, I think, that I was still half-consciously thinking
- of bit bandwidth as a scarce resource to be conserved. I had a bias
- against designs that would associate "extra" name tags with the
- response fields ("A=351.27") even though the longest tagged response
- GPSD could be expected to generate would still fit in the payload of a
- single standard Ethernet frame (1500 bytes).
- == Pressure builds for a redesign
- Along about 2006, despite my efforts to conserve the remaining
- namespace, we ran out of letters completely. As the PADE example
- shows, the protocol parser interprets command words letter
- by letter, so trying to wedge longer commands in by simple
- fiat wouldn't work. Recruiting non-letter characters as
- command characters would have been ugly and only postponed
- the problem a bit, not solved it.
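The namespace problem follows directly from how requests are read. Here is a toy Python dispatcher in the style described: it walks the request one character at a time. The canned values come from the earlier transcripts; the `CANNED` table, the `respond` function, and the `?` placeholder for letters with no value are all hypothetical, not the daemon's actual code.

```python
# Canned answers standing in for live sensor data.
CANNED = {
    "P": "4002.1207 07531.2540",
    "A": "351.27",
    "D": "2009:07:11T11:16Z",
}


def respond(request):
    """Answer a request by treating every character as its own command."""
    parts = ["GPSD"]
    for letter in request.strip():
        value = CANNED.get(letter, "?")   # hypothetical placeholder for unknowns
        parts.append("%s=%s" % (letter, value))
    return ",".join(parts)


compound = respond("PAD")
```

Because `respond("PAD")` is by construction the same as asking for P, A and D in turn, there is no room for a multi-letter command named "PAD": the parser cannot tell the two apart.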
- 'H' is actually still left, but at the time I believed we couldn't
- commit the last letter (whatever it was) because we'd need it as an
- inline switch to a new protocol. I started feeling pressure to
- actually design a new protocol. Besides running out of command
- namespace in the old one, a couple of things were happening that
- implied we'd need to define new commands.
- What had used up the last of the command namespace was multi-device
- support. Originally, GPSD could only monitor one GPS at a time. I
- re-engineered it so it could monitor multiple GPSes, with GPS streams
- available as data channels to which a client could connect one at a
- time. I was thinking about use cases like this one: spot two GPSes on
- either end of an oil tanker, use the position delta as a check on
- reported true course.
- (For those of you wondering, this wasn't the huge job it may sound
- like. I had carefully structured GPSD as a relatively small (about
- 5.5 KLOC) networking and dispatcher top-level calling a 30 KLOC driver
- and services library, all of which was designed from the get-go to use
- re-entrant structures. Thus, only the top layer needed to change, and
- at that only about 1 KLOC of it actually did. Building the test
- framework to verify the multi-device code in action was a bigger job.)
- Note that the "one at a time" limitation was imposed by the
- protocol design, notably the fact that the 'O' record didn't contain
- the name of the device it was reporting from. Thus, GPSD could not
- mix reports from different devices without effectively discarding
- information about where they had come from.
- Though I had just barely managed to cram in multi-GPS support without
- overrunning the available command space, we were starting to look at
- monitoring multiple *kinds* of devices in one session - RTCM2
- correction sources and NTRIP were the first examples. (These are both
- protocols that support
- http://www.esri.com/news/arcuser/0103/differential1of2.html[differential
- GPS correction].) My chief lieutenant was muttering about making GPSD
- report raw pseudorange data from the sensors that allow you to get at
- that. It was abundantly clear that broadening GPSD's scope was going
- to require command-set extensions.
- Even though I love designing application protocols only a little bit
- less than I love designing domain-specific minilanguages, I dragged my
- feet on tackling the GPSD-NG redesign for three years. I had a strong
- feeling that I didn't understand the problem space well enough, and
- that jumping into the effort prematurely might lock in some mistakes
- that I would come to gravely regret later on.
- == JSON and the AISonauts
- What finally got me off the dime in early 2009 were two developments - the
- push of AIS and the pull of JSON.
- AIS is the marine http://www.navcen.uscg.gov/enav/ais/[Automatic
- Identification System]. All the open-source implementations of AIS
- packet decoding I could find were sketchy, incomplete, and not at a
- quality level I was comfortable with. It quickly became apparent that
- this was due to a paucity of freely available public information about
- the applicable standards.
- http://esr.ibiblio.org/?p=888[I fixed that problem] - but having done
- so, I was faced with the problem of just how GPSD is supposed to
- report AIS data packets to clients in a way that can't be confused
- with GPS data. This brought the GPSD-NG design problem to the front
- burner again.
- Fortunately, my AIS-related research also led me to discover
- http://www.json.org/[JSON], aka JavaScript Object Notation. And JSON
- is *really nifty*, one of those ideas that seem so simple and
- powerful and obvious once you've seen it that you wonder why it wasn't
- invented sooner.
- In brief, JSON is a lightweight and human-readable way to serialize
- data structures equivalent to Python dictionaries, with attributes
- that can be numbers, strings, booleans, nested dictionary objects,
- or variable-extent lists of any of these things.
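As a quick illustration of that mapping, the Python standard library's `json` module turns a JSON object directly into a dictionary with natively typed values. The sample object below is invented for the purpose and is not a GPSD message.

```python
import json

# An invented object exercising each kind of value named above.
text = ('{"name":"example","count":3,"ratio":0.5,"ok":true,'
        '"nested":{"inner":"value"},"items":[1,"two",false]}')

obj = json.loads(text)

# Record which Python type each attribute arrived as.
kinds = {key: type(value).__name__ for key, value in obj.items()}

# Serializing and parsing again yields an equal structure.
round_trip = json.loads(json.dumps(obj))
```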
- == GPSD-NG is born
- I had played with several different protocol design possibilities
- between 2006 and 2009, but none of them really felt right. My
- breakthrough moment in the GPSD-NG design came when I thought this:
- "Suppose all command arguments to GPSD-NG commands, and their
- responses, were self-describing JSON objects?"
- In particular, the equivalent of the 'O' report shown above looks like
- this in GPSD-NG (with some whitespace added to avoid hard-to-read
- linewraps):
- -------------------------------------------------------------------------
- {"class":"TPV","tag":"MID50","device":"/dev/pts/1",
- "time":"2005-06-09T14:35:11.79",
- "ept":0.005,"lat":46.498333338,"lon":7.567392712,"alt":1341.667,
- "eph":48.000,"epv":32.321,"track":60.9597,"speed":0.161,"climb":-0.074,
- "eps":50.73,"mode":3}
- -------------------------------------------------------------------------
- To really appreciate what you can do with object-valued attributes,
- however, consider this JSON equivalent of a 'Y' record. The skyview
- is a sublist of objects, one per satellite in view:
- -------------------------------------------------------------------------
- {"class":"SKY","tag":"MID2","device":"/dev/pts/1",
- "time":"2005-06-09T14:35:11.79",
- "reported":8,"satellites":[
- {"PRN":23,"el":6,"az":84,"ss":0,"used":false},
- {"PRN":28,"el":7,"az":160,"ss":0,"used":false},
- {"PRN":8,"el":66,"az":189,"ss":40,"used":true},
- {"PRN":29,"el":13,"az":273,"ss":0,"used":false},
- {"PRN":10,"el":51,"az":304,"ss":36,"used":true},
- {"PRN":4,"el":15,"az":199,"ss":27,"used":false},
- {"PRN":2,"el":34,"az":241,"ss":36,"used":true},
- {"PRN":27,"el":71,"az":76,"ss":43,"used":true}
- ]}
- -------------------------------------------------------------------------
- (Yes, those "el" and "az" attributes are elevation and azimuth. "PRN"
- is the satellite ID; "ss" is signal strength in decibels, and "used"
- is a flag indicating whether the satellite was used in the current solution.)
- These are rather more verbose than the 'O' or 'Y' records, but have several
- compensating advantages:
- * Easily extensible. If we need to add more fields, we just add named
- attributes. This is especially nice because...
- * Fields with undefined values can be omitted. This means extension
- fields don't weigh down the response format when we aren't using them.
- * It's explicit. Much easier to read by eyeball than the corresponding
- 'O' record.
- * It includes the name of the device reporting the fix. This opens up
- some design possibilities I will discuss in more detail in a bit.
- * It includes, up front, a "class" tag that tells client software what it
- is, which can be used to drive a parse.
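Two of these advantages, the "class" tag driving the parse and omitted fields costing nothing, can be sketched in a few lines of Python. The handler names and the `HANDLERS` table are hypothetical, the SKY sample is trimmed to two satellites for brevity, and `epd` is an assumed name for an error estimate that this particular report happens not to carry.

```python
import json

TPV_LINE = ('{"class":"TPV","tag":"MID50","device":"/dev/pts/1",'
            '"time":"2005-06-09T14:35:11.79",'
            '"ept":0.005,"lat":46.498333338,"lon":7.567392712,"alt":1341.667,'
            '"eph":48.000,"epv":32.321,"track":60.9597,"speed":0.161,'
            '"climb":-0.074,"eps":50.73,"mode":3}')

SKY_LINE = ('{"class":"SKY","tag":"MID2","device":"/dev/pts/1",'
            '"reported":2,"satellites":['
            '{"PRN":23,"el":6,"az":84,"ss":0,"used":false},'
            '{"PRN":8,"el":66,"az":189,"ss":40,"used":true}]}')


def handle_tpv(report):
    return {"device": report["device"],
            "position": (report["lat"], report["lon"]),
            # An omitted attribute simply reads back as None.
            "course_error": report.get("epd")}


def handle_sky(report):
    return {"device": report["device"],
            "used": [sat["PRN"] for sat in report["satellites"] if sat["used"]]}


HANDLERS = {"TPV": handle_tpv, "SKY": handle_sky}


def dispatch(line):
    """Parse one report and route it by its class tag."""
    report = json.loads(line)
    handler = HANDLERS.get(report.get("class"))
    if handler is None:
        return None  # classes this client does not know are skipped
    return handler(report)


fix = dispatch(TPV_LINE)
sky = dispatch(SKY_LINE)
```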
- My first key decision was that these benefits are a good trade for the
- increased verbosity. I had to wrestle with this a bit; I've been
- programming a long time, and (as I mentioned previously) have reflexes
- from elder days that push me to equate "good" with "requiring minimum
- computing power and bandwidth". I reminded myself that it's 2009 and
- machine resources are cheap; readability and extensibility are the goals
- to play for.
- Once I had decided that, though, there remained another potential
- blocker. The implementation language of gpsd and its principal client
- library is C. There are lots of open-source JSON parsers in C out
- there, but they all have the defect of requiring malloc(3) and handing
- back a dynamic data structure that you then have to pointer-walk at
- runtime.
- This is a problem, because one of my design rules for gpsd is no use
- of malloc. Memory leaks in long-running service daemons are bad things;
- using only static, fixed-extent data structures is a brutally effective
- strategy for avoiding them. Note, this is only possible because the maximum
- size of the packets gpsd sees is fairly small, and its algorithms are O(1)
- in memory utilization.
- "Um, wait..." I hear you asking "...why accept that constraint when
- gpsd hasn't had a requirement to parse JSON yet, just emit it as
- responses?" Because I fully expected gpsd to have to parse structured
- JSON arguments for commands. Here's an example, which I'll explain fully
- later; for now, take it as a hint at the (approximate) GPSD-NG equivalent
- of a 'W+R+' command.
- -------------------------------------------------------------------------
- ?WATCH={"raw":1,"nmea":true}
- -------------------------------------------------------------------------
- Even had I not anticipated parsing JSON arguments in gpsd, I try to
- limit malloc use in the client libraries as well. Before the
- new-protocol implementation the client library only used two calloc(3)
- calls, in very careful ways. Now it uses none at all.
- So my next challenge was to write and verify a tiny JSON parser that
- is driven by sets of fixed-extent structures - they tell it what shape
- of data to expect and at which static locations to drop the actual
- parsed data; if the shape does not match what's expected, error out.
- Fortunately, I am quite good at this sort of hacking - the
- result, after a day and a half of work, fit in 310 LOC including
- comments (but not including 165 LOC of unit-test code).
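The C parser itself is not reproduced here. The Python sketch below only mirrors the control idea described above: a declared shape says which attributes to expect and of what type, values land in a destination that already exists, and any mismatch is an error. It leans on `json.loads` for tokenizing, which the malloc-free C original obviously cannot do. The names `TPV_SHAPE` and `unpack` are hypothetical.

```python
import json

# Expected attribute names and types: the "shape" of an acceptable object.
TPV_SHAPE = {"class": str, "device": str, "lat": float, "lon": float, "mode": int}


def unpack(text, shape, target):
    """Copy attributes from JSON text into target, rejecting anything off-shape."""
    parsed = json.loads(text)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object")
    for name, value in parsed.items():
        if name not in shape:
            raise ValueError("unexpected attribute %r" % name)
        expected = shape[name]
        if isinstance(value, bool) and expected is not bool:
            raise ValueError("attribute %r must not be a boolean" % name)
        if expected is float and isinstance(value, int):
            value = float(value)  # JSON writes 7 and 7.0 interchangeably
        if not isinstance(value, expected):
            raise ValueError("attribute %r has the wrong type" % name)
        target[name] = value
    return target


# The destination is laid out in advance, like a static C struct.
fix = {"class": "", "device": "", "lat": 0.0, "lon": 0.0, "mode": 0}
unpack('{"class":"TPV","device":"/dev/pts/1","lat":46.498333338,"lon":7,"mode":3}',
       TPV_SHAPE, fix)
```

Attributes missing from the input leave the preset defaults in place, which matches the protocol's habit of omitting undefined values.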
- == Un-channeling: the power
- Both gpsd and its C client library could now count on parsing JSON;
- that gave me my infrastructure. And an extremely strong one, too;
- the type ontology of JSON is rich enough that I'm not likely to ever
- have to replace it. Of course this just opened up the next question -
- now that I can readily pass complex objects between gpsd and its
- client libraries, what do I actually do with this capability?
- The possibility that immediately suggested itself was "get rid of channels".
- In the old interface, subscribers could only listen to one device at
- a time - again, this was a consequence of the fact that 'O' and 'Y'
- reports were designed before multi-device support and didn't include a
- device field. JSON reports can easily include a device field and
- thus need not have this problem.
- Instead of a channel-oriented interface, then, how about one where the
- client chooses what classes of message to listen to, and then gets
- them from all devices?
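From the client side, that model might look like the Python sketch below: pick the message classes you care about, then sort whatever arrives by its device field. The stream contents, the device paths, and the `collect` function are all invented for illustration.

```python
import json
from collections import defaultdict

# An invented mixed stream from three hypothetical devices.
STREAM = [
    '{"class":"TPV","device":"/dev/ttyUSB0","lat":46.498333338,"lon":7.567392712,"mode":3}',
    '{"class":"SKY","device":"/dev/ttyUSB0","satellites":[]}',
    '{"class":"TPV","device":"/dev/ttyUSB1","lat":46.498345996,"lon":7.567394427,"mode":3}',
    '{"class":"AIS","device":"/dev/ttyUSB2","msgtype":5}',
]


def collect(lines, wanted_classes):
    """Keep reports of the wanted classes, grouped by originating device."""
    by_device = defaultdict(list)
    for line in lines:
        report = json.loads(line)
        if report.get("class") in wanted_classes:
            by_device[report.get("device")].append(report)
    return dict(by_device)


fixes = collect(STREAM, {"TPV"})
```

Because every report names its source, mixing devices in one stream loses no information, which is exactly what the 'O' record could not offer.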
- Note, however, that including the device field raises some problems of
- its own. I do most of my gpsd testing with a utility I wrote called
- gpsfake, which feeds one or more specified data logs through pty
- devices so gpsd sees them as serial devices. Because X also uses pty
- devices for virtual terminals, the device names that a gpsd instance
- running under gpsfake sees may depend on random factors like the
- number of terminal emulators I have open. This is a problem when
- regression-testing! I thought this issue was going to require me
- to write a configuration command that suppresses device display; I
- ended up writing a sed filter in my regression-test driver instead.
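The actual filter is a sed script and is not shown in this paper. As a rough Python equivalent of the idea, the sketch below rewrites whatever pty name appears in a report's device field to a fixed placeholder, so that two test runs compare equal. The pattern and the `/dev/pts/N` placeholder are assumptions.

```python
import re

# Matches a device attribute whose value is a numbered pty.
DEVICE_FIELD = re.compile(r'"device":"/dev/pts/\d+"')


def normalize(line):
    """Replace the run-dependent pty number with a stable placeholder."""
    return DEVICE_FIELD.sub('"device":"/dev/pts/N"', line)


first = normalize('{"class":"TPV","device":"/dev/pts/1","mode":3}')
second = normalize('{"class":"TPV","device":"/dev/pts/7","mode":3}')
```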
- Now we come back to our previous example:
- -------------------------------------------------------------------------
- ?WATCH={"raw":1,"nmea":true}
- -------------------------------------------------------------------------
- This says: "Stream all reports from all devices at me, setting raw
- mode and dumping as pseudo-NMEA if it's a binary protocol." The way to
- add more controls to this is obvious, which is sort of the point --
- nothing like this could have fit in the fixed-length syntax of the old
- pre-JSON protocol.
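To show how little machinery such a request needs, here is a Python sketch that splits a GPSD-NG style command into its identifier and its JSON argument object. The `parse_request` function and its error handling are hypothetical, as is the choice to accept a bare identifier with no arguments. The sample request is written with both attribute names quoted, as JSON requires.

```python
import json


def parse_request(line):
    """Split '?NAME={...}' into the request name and its argument dict."""
    line = line.strip()
    if not line.startswith("?"):
        raise ValueError("requests start with '?'")
    name, separator, payload = line[1:].partition("=")
    if not name.isalpha():
        raise ValueError("bad request identifier: %r" % name)
    # Assumption: a bare identifier means "no arguments".
    args = json.loads(payload) if separator else {}
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    return name, args


name, args = parse_request('?WATCH={"raw":1,"nmea":true}')
```

A new control is just another attribute in the argument object; the request syntax itself never has to change.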
- This is not mere theory. At the time of writing, the ?WATCH command is
- fully implemented in gpsd's Subversion repository, and I expect it to
- ship ready for use in our next release (2.90). Total time to build
- and test the JSON parsing infrastructure, the GPSD-NG parser, and the
- gpsd internals enhancements needed to support multi-device listening?
- About a working week.
- Just to round out this section, here is an example of what an
- actual AIS transponder report looks like in JSON.
- -------------------------------------------------------------------------
- {"class":"AIS","msgtype":5,"repeat":0,"mmsi":"351759000","imo":9134270,
- "ais_version":0,"callsign":"3FOF8","shipname":"EVER DIADEM",
- "shiptype":70,"to_bow":225,"to_stern":70,"to_port":1,"to_starboard":31,
- "epfd":1,"eta":"05-15T14:00Z","draught":122,"destination":"NEW YORK",
- "dte":0}
- -------------------------------------------------------------------------
- The above is an AIS type 5 message identifying a ship - giving, among
- other things, the ship's name and radio callsign and destination
- and ETA. You might get this from an AIS transceiver, if you had one
- hooked up to your host machine; gpsd would recognize those data
- packets coming in and automatically make AIS reports available as
- an event stream.
- == The lessons of history
- In the introduction, I called out four trends apparent over time in
- protocol design. Let's now consider these in more detail, beginning
- with the first three; metaprotocols get a section of their own.
- === Bandwidth economy versus extensibility and explicitness
- First, I noted *changing tradeoffs of bandwidth economy versus
- extensibility and explicitness*.
- One way you can compare protocols is by the amount of overhead they
- incur. In a binary format this is the percentage of the bit stream
- that goes to magic numbers, framing bits, padding, checksums, and
- the like. In a textual format the equivalent is the percentage
- of the bitstream devoted to field delimiters, sentence start and
- sentence-end sentinels, and (in protocols like NMEA 0183) textual
- checksum fields.
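One crude way to put a number on textual overhead is sketched below in Python: count the characters that belong to the values themselves and treat everything else (prefixes, delimiters, quotes, attribute names) as overhead. That accounting rule is an assumption made for this sketch, not a standard metric. The two sample lines are an 'O' report and a JSON TPV report taken from this paper.

```python
import json

O_LINE = ("GPSD,O=MID2 1118327700.280 0.005 46.498339529 7.567392712 1342.392 "
          "36.000 32.321 10.3787 0.091 -0.085 ? 38.66 ? 3")

TPV_LINE = ('{"class":"TPV","tag":"MID50","device":"/dev/pts/1",'
            '"time":"2005-06-09T14:35:11.79",'
            '"ept":0.005,"lat":46.498333338,"lon":7.567392712,"alt":1341.667,'
            '"eph":48.000,"epv":32.321,"track":60.9597,"speed":0.161,'
            '"climb":-0.074,"eps":50.73,"mode":3}')


def overhead(total_length, payload_length):
    """Fraction of the message that is not value text."""
    return 1.0 - payload_length / total_length


# O report: the payload is the whitespace-separated tokens after the '='.
o_values = O_LINE.partition("=")[2].split()
o_overhead = overhead(len(O_LINE), sum(len(v) for v in o_values))

# JSON report: parse_float=str and parse_int=str keep each number exactly as
# it was written, so value lengths can be measured on the original text.
tpv_values = json.loads(TPV_LINE, parse_float=str, parse_int=str).values()
tpv_overhead = overhead(len(TPV_LINE), sum(len(v) for v in tpv_values))
```

By this measure the JSON report spends a noticeably larger share of its bytes on structure than the 'O' report does, which is the cost side of the tradeoff.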
- Another way you can compare protocols is by implicitness versus
- explicitness. In the old GPSD protocol, you know the semantics of a
- request parameter within a request implicitly, by where it is in
- the order. In GPSD-NG, you know more explicitly because every parameter is a
- name-value pair and you can inspect the name.
- Extensibility is the degree to which the protocol can have new
- requests, responses, and parameters added without breaking old
- implementations.
- In general, *both extensibility and overhead rise with the degree
- of explicitness in the protocol*. The JSON-based TPV record has
- much higher overhead than the O record it replaces, but what
- we gain from that is lots and *lots* of extensibility room. We
- win three different ways:
- * The command/response namespace is inexhaustibly huge.
- * Individual requests and responses can readily be extended by adding
- new attributes without breaking old implementations.
- * The type ontology of JSON is rich enough to make passing arbitrarily
- complex data structures through it very easy.
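The second point can be demonstrated with a short Python sketch: a client written against the original attribute set keeps working when the report later grows a new attribute. The function name and the `geoid_sep` attribute are hypothetical.

```python
import json


def old_client_position(line):
    """A client that only knows about lat and lon, and looks them up by name."""
    report = json.loads(line)
    return report["lat"], report["lon"]


original = '{"class":"TPV","lat":46.498333338,"lon":7.567392712,"mode":3}'
# The same report after a later release adds an attribute (name is hypothetical).
extended = ('{"class":"TPV","lat":46.498333338,"lon":7.567392712,'
            '"mode":3,"geoid_sep":47.2}')

before = old_client_position(original)
after = old_client_position(extended)
```

The old client never sees the new attribute, so nothing it relies on has moved. A positional format offers no such guarantee once a field is inserted anywhere but the end.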
- With respect to the tradeoffs between explicitness/extensibility and
- overhead, we're at a very different place on the cost-benefit curves
- today from when the original GPSD protocol was designed.
- Communications costs for the pipes that GPSD uses have
- dropped by orders of magnitude in the decade-and-change since GPSD
- was designed. Thus, squeezing every last bit of overhead out of the
- protocol representation doesn't have the real economic payoff it used to.
- Under modern conditions, there is a strong case that implicit,
- tightly-packed protocols are false economy. If (as with the first GPSD
- protocol) they're so inextensible that natural growth in the
- software breaks them, that's a clear down-check. It's better to design
- for extensibility up front in order to avoid having to throw out
- a lot of work later on.
- The direction this points in for the future is clear, especially
- in combination with the increasing use of metaprotocols: toward
- protocols that spend bytes on explicitness to buy extensibility.
- === From lockstep to streaming
- Second, I noted *a shift from lockstep conversational interfaces to
- event streams*.
- The big change in the second protocol version was watcher mode. One
- of the possibilities this opens up is that you can put the report
- interpreter into an asynchronous thread that magically updates a C
- struct for you every so often, without the rest of your program having
- to know or care how that is being done (except possibly by waiting on a
- mutex to ensure it doesn't read a partially-updated state).
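The pattern is easy to sketch in Python, with a lock standing in for the mutex and a small holder object standing in for the C struct. This is only an illustration of the idea, not the GPSD client library's API. The `FixHolder` class, the `watcher` function, and the canned report list (which replaces a live socket) are all hypothetical.

```python
import json
import threading


class FixHolder:
    """Shared state: the most recent fix, guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._fix = {}

    def update(self, report):
        with self._lock:
            self._fix = dict(report)

    def snapshot(self):
        # Readers take the lock too, so they never see a half-written fix.
        with self._lock:
            return dict(self._fix)


def watcher(lines, holder):
    """Background loop: parse each streamed report and publish TPV fixes."""
    for line in lines:
        report = json.loads(line)
        if report.get("class") == "TPV":
            holder.update(report)


# Canned reports stand in for lines read from the daemon's socket.
REPORTS = [
    '{"class":"TPV","lat":46.498339529,"lon":7.567392712,"mode":3}',
    '{"class":"SKY","satellites":[]}',
    '{"class":"TPV","lat":46.498345996,"lon":7.567394427,"mode":3}',
]

holder = FixHolder()
thread = threading.Thread(target=watcher, args=(REPORTS, holder))
thread.start()
thread.join()  # a real client would keep running and call snapshot() at will
latest = holder.snapshot()
```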
- Analogous developments have been visible in other Internet protocols
- over roughly the same period. Compare, for example, POP3 to IMAP. The
- former is a lockstep protocol, the latter designed for streaming - it's
- why IMAP responses have a transaction ID tying them back to the
- requesting command, so responses that are out of order due to
- processing delays can be handled sanely.
- Systems software has generally been moving in a similar direction,
- propelled there by distributed processing and networks with unavoidable
- variable delays. There is a distant, but perceptible, relationship
- between GPSD-NG's request-response objects and the way transactions
- are handled within (for example) the X window system.
- This trend, too, seems certain to continue, as the Internet becomes
- ever more like one giant distributed computing system.
- === Type ontology recapitulates trends in language design
- Third, *changes in the "sweet spot" of protocol designs
- due to increasing use of scripting languages.*
- The most exciting thing about JSON to me, speaking as an application
- protocol designer, is the rich type ontology - booleans, numbers,
- strings, lists, and dictionaries - and the ability to nest them to any
- level. In an important sense that is orthogonal to raw bandwidth,
- this makes the pipe wider - it means complex, structured data can more
- readily be passed through with a minimum of fragile and bug-prone
- serialization/deserialization code.
- The fact that I could build a JSON parser to unpack to fixed-extent C
- structures in 300-odd LOC demonstrates that this effect is a powerful
- code simplifier even when the host language's type ontology is limited
- to fixed-extent types and poorly matched to that of JSON (C lacks not
- only variable-extent lists but also dictionaries).
- JSON is built on dictionaries; in fact, almost every JSON object is a
- legal structure literal in the dictionary-centric Python language (the
- qualified exceptions are the JSON constants true, false, and null,
- which Python spells True, False, and None). It seems like a simple
- idea in 2009, but the apparent simplicity relies on folk knowledge we
- didn't have before Perl introduced dictionaries as a first-class data
- type (c.1986) and Python built an object system around them (after
- 1991).
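The kinship with Python literals can be checked directly. In the sketch below, `ast.literal_eval` (which evaluates Python literal syntax only) reads a JSON object unchanged and agrees with the real JSON parser, until a JSON constant such as `false` appears, which is not Python syntax for a literal. The sample objects are invented.

```python
import ast
import json

# A JSON object containing only strings, numbers, lists and nested objects.
plain = '{"class":"TPV","lat":46.498333338,"satellites":[{"PRN":23,"el":6}]}'
same = ast.literal_eval(plain) == json.loads(plain)

# A JSON constant is where the two notations part ways.
with_constant = '{"PRN":23,"used":false}'
try:
    ast.literal_eval(with_constant)
    accepted_as_python = True
except ValueError:
    accepted_as_python = False

parsed_as_json = json.loads(with_constant)
```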
- Thus, GPSD-NG (and the JSON it's built on) reflects and recapitulates
- long-term trends in language design, especially those associated with
- the rise of scripting languages and of dictionaries as a first-class
- type within them.
- This produces several mutually reinforcing feedback loops. The
- rise of scripting languages makes it easier to use JSON to its full
- potential, if only because deserialization is so trivial. JSON will
- probably, in turn, promote the use of these languages.
- I think, in the future, application protocol designers will become
- progressively less reluctant to rely on being able to pass around
- complex data structures. JSON distils the standard type ontology of
- modern scripting languages (Perl, Python, Ruby, and progeny) into a
- common data language that is far more expressive than the structs of
- yesteryear.
- == Protocols on top of metaprotocols
- GPSD-NG is an application of JSON. Not a completely pure one; the
- request identifiers are, for convenience reasons, outside the JSON
- objects. But close enough.
- In recent years, metaprotocols have become an important weapon in
- the application-protocol designer's toolkit. XML, and its
- progeny SOAP and XML-RPC, are the best known metaprotocols. YAML
- (of which JSON is essentially a subset) has a following as well.
- Designing on top of a metaprotocol has several advantages. The most
- obvious one is the presence of lots of open-source software to use for
- parsing the metaprotocol.
- But it is probably more important in the long run that it saves one
- from having to reinvent a lot of wheels and ad-hoc representations
- at the design level. This effect is muted in XML, which has a weak
- type ontology, but much more pronounced in YAML or JSON. As a
- relevant example, I didn't have to think three seconds about the right
- representation even for the relatively complex SKY object.
- == Paths not taken
- Following the first public release of this paper, the major questions
- to come up from early readers were "Why not XML?" and "Why not a
- super-efficient packed binary protocol?"
- I would have thought the case against packed binary application
- protocols was obvious from my preceding arguments, but I'll make it
- explicit here: generally, they are even more rigid and inextensible
- than a textual protocol relying on parameter ordering, and hence more
- likely to break as your application evolves. They have significant
- portability issues around things like byte order in numeric fields.
- They are opaque; they cannot be audited or analyzed without bug-prone
- special-purpose tools, adding a forbidding degree of complexity and
- friction to the life-cycle maintenance costs.
- When the type ontology of your application includes only objects like
- strings or numbers that (as opposed to large binary blobs like images)
- have textual representations differing little in size from packed
- binary, there is no case at all for incurring these large overheads.
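A small Python experiment makes the size and byte-order points tangible for one value, the latitude from the TPV example. It uses the standard `struct` module to pack the number as an 8-byte IEEE double in both byte orders and compares that with the textual form. The choice of a double as the binary representation is an assumption; a real packed protocol could choose differently.

```python
import struct

latitude = 46.498333338

# Textual form, as it would appear inside a JSON report.
text_form = repr(latitude).encode("ascii")

# Packed binary form, in each of the two byte orders.
little_endian = struct.pack("<d", latitude)
big_endian = struct.pack(">d", latitude)

# Reading the bytes with the wrong byte order yields a different number.
misread = struct.unpack("<d", big_endian)[0]
```

The text costs a few more bytes than the packed double, but it reads the same on every machine, while the binary form is only correct if both ends agree on byte order.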
- The case against XML is not as strong. An XML-based protocol at least
- need not be rigidly inextensible and opaque. XML's problem is that,
- while it's a good basis for document interchange, it doesn't naturally
- express the sorts of data structures cooperating applications want to
- pass around.
- While such things can be layered over XML with an appropriate schema,
- the apparatus required for schema-aware parsing is necessarily
- complicated and heavyweight - certainly orders of magnitude more so
- than the little JSON parser I wrote. And XML itself is pretty
- heavyweight, too - one's data tends to stagger under the bulk
- of the markup parts.
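The typing gap shows up even on a single satellite entry. The Python sketch below parses the same data from JSON and from one plausible XML rendering using only the standard library and no schema. The XML element layout is invented for this comparison; other layouts are possible, but without a schema all of them hand back text.

```python
import json
import xml.etree.ElementTree as ET

JSON_TEXT = '{"PRN":23,"el":6,"az":84,"ss":0,"used":false}'

# One hypothetical XML encoding of the same satellite entry.
XML_TEXT = ("<satellite><PRN>23</PRN><el>6</el><az>84</az>"
            "<ss>0</ss><used>false</used></satellite>")

from_json = json.loads(JSON_TEXT)
from_xml = {child.tag: child.text for child in ET.fromstring(XML_TEXT)}

# JSON delivers typed values; schema-less XML delivers strings.
json_types = {key: type(value).__name__ for key, value in from_json.items()}
xml_types = {key: type(value).__name__ for key, value in from_xml.items()}
```

Note that the XML-derived `used` value is the non-empty string "false", which a careless truth test would treat as true. Turning such strings back into numbers and booleans is exactly the job a schema layer has to take on.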
- == Envoi
- Finally, a note of thanks to the JSON developers...
- I think JSON does a better job of nailing the optimum in metaprotocols
- than anything I've seen before - its combination of simplicity and
- expressiveness certainly isn't matched by XML, for reasons already
- called out in my discussion of paths not taken.
- I have found JSON pleasant to work with, liberating, and
- thought-provoking; hence this paper. I will certainly reach for this
- Swiss-army knife first thing, next time I have to design an
- application protocol.