title: Researchcation, concluded
date: 2014-06-03 14:00
author: Christine Lemmer-Webber
tags: researchcation, mediagoblin, kinda-mediagoblin

I just got done with a three week thing I've dubbed "researchcation". It's exactly what it sounds like, research + vacation.
It's hard for me to take time away from MediaGoblin right now and have it still meet its goals as a project. On the other hand, we have a lot planned for the year ahead, and for some of it I'm not well enough prepared to make good decisions. In addition, the last year and a half has not given me much of a break at all, and running a crowdfunding campaign (not to mention two over two years) is really exhausting. (Not that I'm complaining about success!)
I was feeling pretty close to burnout, but given how much there is to get done, I decided to compromise on this break... instead of taking a full-fledged vacation, I'd take a "researchcation": three weeks to recharge my batteries and step away from the day to day of the project. In the meantime, I'd work on some projects to prepare me for the year ahead. A number of good things came out of it, though not exactly the things I expected coming in. But I think it was worth the time invested.
My original plan going in was that I would work on two things: something related to the Pump API and federation, and something related to deployment. It turns out I didn't get around to the deployment part, but working on the federation part was insightful, though not in all the ways I anticipated. Though I've read the Pump API document and helped advise a bit on the design of PyPump (not to take credit for that, clearly credit belongs to Jessica Tallon generally, not me), there's nothing really like having a solid project to toss you into things, and I wanted to take a non-MediaGoblin-codebase approach to playing around with the Pump API.
I started out by hacking on a small project called PumpBus, which was going to be a daemon which wrapped pypump and exposed a d-bus API. I figured this would make writing clients easier (and even make it possible to write an emacs client... yeah I know). I got far enough to where I was able to post a message from emacs lisp, then decided that what I was working on just wasn't that interesting and wasn't teaching me much more than I already knew.
Given that there was both the "research" and the "-cation" component to this, I figured the risks of failure were low, so I'd up the challenge of what I was working on a bit. I instead started working on something I've dubbed Pydraulics: a python-powered implementation of the Pump API. If worst came to worst, I'd learn a few things.
I decided from the outset to keep a few assumptions related to pydraulics:
So, what came out of it?
Pondering asynchronous coding developments and MediaGoblin/libgoblin/pydraulics turned out to be fruitful. Mostly I have been looking at "what would it take for libgoblin to be usefully integrated into asyncio?"
This turns out to be a bit more challenging than it appears at the outset for one reason: mg_globals. mg_globals is a pretty sad design in MediaGoblin that I'd like to get rid of; basically it makes it easy to write functions that don't need the database session, template environment, and friends passed into them, because those are set as global variables. That works (but is nasty) as long as you're not in a multithreaded environment, but breaks as soon as you are. I recently created a ticket reflecting this, suggesting we switch over to werkzeug context locals (Flask makes heavy use of these). Werkzeug's hack is clever: it uses thread locals so that even in a multithreaded environment the objects you're accessing are still globals, but they're the right globals.
But Werkzeug's solution is not good enough for integration with asyncio, where you might be doing "asynchronous" behavior in the same thread, suspending and resuming coroutines, coming back to tasks, and so on. As such, it's almost guaranteed in this system that you'll be clobbering the variables another task needs.
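To make the "right globals" trick concrete, here is a minimal sketch using the standard library's threading.local as a stand-in. It is not Werkzeug's own implementation, and the names (_locals, handle_request, the "session-for-..." strings) are made up for illustration.

```python
import threading

# One module-level object, like a global, but each thread sees its own
# attribute values on it.
_locals = threading.local()


def handle_request(name, barrier, results):
    # Hypothetical request handler: stash a per-thread "database session".
    _locals.db = "session-for-" + name
    # Wait until every thread has written, so a plain global would
    # certainly have been overwritten by now.
    barrier.wait()
    # Read it back the way a helper deep in the call stack would.
    results[name] = _locals.db


def run_threads(names):
    results = {}
    barrier = threading.Barrier(len(names))
    threads = [
        threading.Thread(target=handle_request, args=(n, barrier, results))
        for n in names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = run_threads(["alice", "bob"])
print(results)
```

Each thread reads back the value it wrote itself, even though both go through the same module-level name.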
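Here is a small sketch of that clobbering, assuming plain threading.local storage (again a stand-in, not Werkzeug's code) and a modern Python where asyncio.run() exists. The handler name and the "session-for-..." values are hypothetical.

```python
import asyncio
import threading

# Thread-local storage: isolated per thread, but every asyncio task in
# this event loop runs in the same thread and so shares it.
_locals = threading.local()


async def handle_task(name, results):
    _locals.db = "session-for-" + name
    # Yield to the event loop, as any real I/O would. Other tasks run
    # here, in the same thread, and overwrite _locals.db.
    await asyncio.sleep(0)
    results[name] = _locals.db


async def main():
    results = {}
    await asyncio.gather(
        handle_task("alice", results),
        handle_task("bob", results),
    )
    return results


results = asyncio.run(main())
print(results)
```

By the time the first task resumes, the second has already replaced the shared value, so both tasks end up reading the second task's session.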
What to do? I did research to see if anyone had ideas. It looks like you could do such a thing with Task.current_task() in asyncio, and that would be fairly equivalent. I think you'd need careful implementation though... if you're not paying close attention you might not attach the right things to the right subtask, and that whole thing just seems... fragile. But it still is a neat idea to play around with.
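A rough sketch of what that might look like, written against modern Python, where asyncio.current_task() replaced the Task.current_task() classmethod (removed in Python 3.9). The registry, the State class, and current_state() are all hypothetical names invented for this example, not anything from MediaGoblin or asyncio itself.

```python
import asyncio
import weakref

# Hypothetical registry: one attribute bag per running task. Weak keys let
# an entry disappear once its task is garbage collected.
_task_state = weakref.WeakKeyDictionary()


class State(object):
    """Plain attribute bag."""


def current_state():
    """Return the attribute bag belonging to the currently running task."""
    task = asyncio.current_task()
    if task is None:
        raise RuntimeError("no task is running")
    try:
        return _task_state[task]
    except KeyError:
        state = _task_state[task] = State()
        return state


async def handle_task(name, results):
    current_state().db = "session-for-" + name
    # Other tasks run during this pause, but they write to their own bags.
    await asyncio.sleep(0)
    results[name] = current_state().db


async def main():
    results = {}
    await asyncio.gather(
        handle_task("alice", results),
        handle_task("bob", results),
    )
    return results


results = asyncio.run(main())
print(results)
```

The fragility shows up as soon as a handler starts a subtask: a coroutine launched with asyncio.create_task() runs as a different task, so in this sketch it would get a fresh, empty State unless you copy things over by hand.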
But here's some ideas that I think are neat all combined, related to this problem:
But you don't really know whether or not some bit of code is using an asyncio task, a web request, or whatever to pass around. Here's the thing though: it doesn't really matter most of the time. With rare exceptions, you're just looking for $OBJECT.db or $OBJECT.templates or something. You just need some kind of object you can tack attributes onto.
So that's my idea in libgoblin/pydraulics: when you have an application and you want to do something with it (handle a request, execute a task, etc.), you need an object to tack stuff onto. So either create a fresh context object to tack stuff onto or just start tacking things onto an object you already have!
Currently, this looks like:
```python
# Pydraulics -- Easy Pump API integration into your software.
#
# Copyright (C) 2014 Pydraulics contributors. See AUTHORS.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# Large parts of this heavily borrowed from MediaGoblin.
# Copyright (C) 2011-2014 MediaGoblin contributors, see MEDIAGOBLIN-AUTHORS

import jinja2

# Context and url_map are defined elsewhere in the codebase (not shown).


class PydraulicsApp(object):
    """
    Pydraulics "basic" WSGI application
    """
    # ...

    def gen_context(self, request=None):
        """
        Generate or apply a context.

        If we have a request, use that as the context instead.
        """
        if request is not None:
            c = request
            c.app = self
        else:
            c = Context(self)

        c.template_env = jinja2.Environment(
            loader=self.template_loader, autoescape=True,
            undefined=jinja2.StrictUndefined,
            extensions=[
                'jinja2.ext.autoescape'])

        # Set up database
        c.db = self.Session()

        if request is not None:
            c.url_map = url_map.bind_to_environ(
                request.environ, server_name=self.netloc)
        else:
            c.url_map = url_map.bind(
                self.netloc,
                # Maybe support this later
                script_name=None)

        # Hand the context back so callers without a request can use it
        return c
```
Anyway, simple enough. With a request, you end up with request.db set up; if you've just got a command line script and you need the equivalent, and you already have your instantiated application, just run application.gen_context(). Thus, for utilities that work with this application and need a variety of instantiated things (the database, the template engine, and so on), it's easy enough to just accept "context" as the first argument of the function, then use context.db and so on. (I've considered using just "c" or "ctx" instead of "context" as the variable name, since it's so common and since "context" conflicts a bit with template context and friends, though that's not very explicit.) So, this seems good.
At one point I got frustrated with the massive amount of porting to libgoblin I was having to do and thought "I really should probably just use Django or Flask itself." However, I found that neither framework really addresses the asyncio stuff I was dealing with above, and once I got enough of libgoblin ported over, libgoblin-based development was very fast and comfortable.
That said, working through those things took up enough time that I didn't complete the Pump implementation. That's okay, I've got enough to do what's required on my end from MediaGoblin (and we've got good direction and help on the federation end this upcoming year, where the most important thing is that I have a good understanding). I still think pydraulics is a pretty neat idea, and I may finish it, though it'll be back-burner'ed for now.
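As a rough illustration of the "context as first argument" convention, here is a self-contained sketch. FakeApp, FakeRequest, and display_name are hypothetical stand-ins made up for this example (no Jinja2 or real database involved), and the Context class is only a guess at the shape of the real one.

```python
class Context(object):
    """Attribute bag tied to an application (a guess at the real class)."""

    def __init__(self, app):
        self.app = app


class FakeRequest(object):
    """Hypothetical request; any object that accepts attributes works."""


class FakeApp(object):
    """Hypothetical stand-in for the application, with a dict as its "db"."""

    def __init__(self):
        self.users = {"alice": "Alice"}

    def gen_context(self, request=None):
        # Reuse the request as the context if there is one, else make one.
        if request is not None:
            c = request
            c.app = self
        else:
            c = Context(self)
        c.db = self.users  # stand-in for a real database session
        return c


def display_name(context, username):
    # Utilities take the context first and never ask what kind of object
    # it is; they only look for the attributes they need.
    return context.db.get(username, "(unknown)")


app = FakeApp()

# From a command line script: no request, so a fresh context is created.
from_script = display_name(app.gen_context(), "alice")

# From a web request: the request object itself becomes the context.
from_request = display_name(app.gen_context(FakeRequest()), "alice")

print(from_script, from_request)
```

The utility function is identical in both cases, which is the point: it only cares that whatever it was handed has a .db attribute.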
However, libgoblin is something I'm likely to extract. I'm convinced that MediaGoblin is at a point where it's stable enough to know what works and doesn't about the technical design, so that gives me a good basis to know what to build from here. There are other applications I'd like to build which should mesh nicely with MediaGoblin but which really don't belong as part of MediaGoblin itself, and would be kind of hacky add-ons. Clearly this is not the most important development, but towards the end of the summer as we hopefully get the Python 3 branch merged, I will be looking towards this.
Aside from this, on the "-cation" end of things, I took some time to relax and also reapproach my health. I may have a separate post on that soon.
So, that's that. Overall it was productive, but again, not quite in the ways I was expecting. I feel okay about that though... I wanted to do some hacking and not feel deeply pressured or stressed about it... if that wasn't true, I think the "-cation" part wouldn't have held up. So I feel okay that I wandered a bit, and the other things I worked on / found I think are important anyhow, and have me much better prepared for the year ahead. Not to mention the most important part: I feel pretty refreshed and capable of taking it on!
What's next for the coming week? Well now that this is all over, I'm organizing plans so we can get rewards out the door and do project planning for the year ahead. We've got a lot of promises to fulfill. Better get to it!