Planet Open Plans (dev)

July 02, 2008

openplansdev tags

June 30, 2008

OpenCore Blog Comments

Comment on Hi, I’m the Opencore Release Manager by ra

Okay, well, I guess I think it was a mistake for z3c.autoinclude to go down the road of dependency spidering in the first place. I much prefer ZCMLLoader’s very tight scope, which is simply to expose the explicit loading of specific packages’ ZCML as a setuptools entry point. I won’t keep pushing, but I’d still personally rather see us fix ZCMLLoader’s loading order and stick w/ the simpler tool.

by ra at June 30, 2008 10:28 PM

Comment on Hi, I’m the Opencore Release Manager by ejucovy

@ra: z3c.autoinclude only spiders dependencies if you tell it to explicitly with a directive (same for plugins), so it wouldn’t be introducing any complexity that we don’t ask for. It’s also used by other people and has a test suite, which are pretty strong points in its favor.

ZCMLLoader is also broken w.r.t. include ordering in a way that is fixed with z3c.autoinclude; this is blocking our use of it for oc-geotagging. (IIRC its ordering is a:meta, a:configure, a:overrides, b:meta, b:configure, b:overrides.) It wouldn’t be difficult to fix, but I don’t think there’s any reason to.
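
To make the ordering issue concrete, here is a small sketch in plain Python. It loads no real ZCML and is not code from ZCMLLoader or z3c.autoinclude; the function names are made up for illustration, and the stage-by-stage order is only an assumption about what a corrected loader would do.

```python
# Illustrative only: models include ordering, loads no real ZCML.
STAGES = ("meta", "configure", "overrides")


def per_package_order(packages):
    """Finish all three stages of one package before starting the next."""
    return ["%s:%s" % (pkg, stage) for pkg in packages for stage in STAGES]


def per_stage_order(packages):
    """Run one stage across every package before moving to the next stage."""
    return ["%s:%s" % (pkg, stage) for stage in STAGES for pkg in packages]
```

For packages a and b, per_package_order reproduces the order quoted above, while per_stage_order puts every package's meta stage ahead of any configure stage.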

by ejucovy at June 30, 2008 06:29 PM

Comment on Hi, I’m the Opencore Release Manager by ra

First, heartfelt thanks to you, Paul, for taking this on. I think this is a very important role, and look forward to supporting you and the greater OpenPlans user community however I can.

One small issue, though… you talk about switching from ZCMLLoader to z3c.autoinclude. I actually think this is not a good idea. ZCMLLoader is very, very small in scope, and it does everything we need. I think the ongoing maintenance burden of ZCMLLoader will be minimal, possibly nil. z3c.autoinclude tries to do some interesting magic, IMO, such as spidering the dependencies of the included packages to figure out what other things need to be installed. If there are features in autoinclude that we need, then I’d say let’s consider switching, but as long as ZCMLLoader is meeting our needs (and AFAIK that is still the case) I’d advocate for staying with the simpler implementation and avoiding the complexity that z3c.autoinclude will introduce.

by ra at June 30, 2008 05:57 PM

TOPP Engineering

Week 1: Xinha, no more popups

I spent most of my second week doing some Xinha UI work with Doug. Our broad directive was to improve the Xinha user experience, so we started by talking with Sonali and Bryan about known interface bugs and quirks, and got a good idea of where some of the major usability shortcomings are. My general thoughts on WYSIWYG editors for HTML-based content are that they’re almost always mediocre and are hard to do well as full-fledged desktop applications, and are even tougher to do well as web-based apps. To date, I’ve never seen any browser-based WYSIWYG editor that didn’t require me to either fight its idiosyncrasies or delve into the source to get it to do exactly what I wanted it to. Xinha has come a long way, but there’s definitely still a lot of work to be done.

We decided to start by concentrating on Xinha’s popup dialogs, which are used in plugins all over the place, e.g., for adding a link or inserting an image. Since many people find popups annoying, not to mention the fact that they interfere with people’s popup-blockers, we decided to look into converting the existing popups into lightboxes. Not wanting to reinvent the wheel, we looked at several different options for lightbox libraries. I hadn’t realized just how many lightbox libraries have (no pun intended) popped up over the past year or two. Unfortunately, many of them have significant downsides: some aren’t released under free licenses, and more than a few appear to no longer be under active development. We ultimately opted to give nyroModal a shot, since it has had more than one release, uses jQuery, is relatively full-featured, and is released under an MIT license.

Not long after we started using nyroModal, however, we came across a branch of Xinha called new-dialogs that was started a few months ago by one of the core Xinha developers. As its name suggests, the branch is a major update to Xinha’s dialog system and has had a significant amount of work put into it. Details of its development can be found in the change history for ticket #1176. In short, it does almost exactly what we were hoping to do, and overall, does it quite well.

Interestingly, new-dialogs does not use any of the aforementioned libraries for its lightboxes. Instead, it includes its own lightbox code written from scratch. Its dialogs are resizable, draggable, and even dockable, though I’m not sure how necessary (or even desirable) the first two are for a modal system.

Much of what’s left to do with new-dialogs is to transition all of the plugins that use the old popup dialogs over to the new system. Doug and I have started working on this process by updating the UnFormat plugin.  Doug has since done the BackgroundImage plugin and I’ve done the Abbreviation and FindReplace plugins. There are still quite a few plugins to go, but once the rest are updated there shouldn’t be much holding us back from making the jump to a popup-free Xinha.

by nicholasbs at June 30, 2008 05:01 AM

OpenCore Blog Comments

Comment on Hi, I’m the Opencore Release Manager by k0s

# Opencore releases will be on a time-based schedule, TBD.
# Features that aren’t ready in time for a release can just wait until the next release.
# Rationale: Feature-driven releases inevitably get delayed and delayed, and the features don’t get done any faster.

a big +1 on all of this; congrats, slinkp! i’m sure you’ll do an awesome job

by k0s at June 30, 2008 01:15 AM

June 28, 2008

OpenCore Blog Comments

Comment on Viewlets for the oc topnav by ejucovy

OK, I think the way to do this is to register the viewlets for an interface on the *request*, not the context. Then we just use a pre-traversal hook to mark the request as having passed through a project, a member, etc.

by ejucovy at June 28, 2008 05:10 PM

June 27, 2008

OpenCore Blog Posts

Hi, I’m the Opencore Release Manager

I’m going to be playing Opencore Release Manager, starting now. Let me explain what that means, and relay some plans.

Scope

I’m just talking about opencore, the Plone extension at the heart of the openplans software stack.

I’m not talking about any other part of the openplans software stack (tasktracker, wordpress, deliverance, …)

I’m definitely not talking about openplans.org the website.

Opencore needs a release manager?

Opencore was originally created by TOPP for www.openplans.org. So opencore releases have historically been driven by the needs of www.openplans.org’s deployment schedule. Since our deployment process doesn’t require a formal opencore release, releases tend to be a rare afterthought.

But there are at least two things that more formal releases would help with:

  1. Lower the barrier for other community members to get started using opencore.
  2. Let the world know about what we’re doing.

My overarching goal is to promote a lively community of developers and users for opencore, and provide some balance to TOPP’s path of least resistance–to only worry about our own deployments.

Why me?

Ethan asked me to do it and I was happy to volunteer. (If you don’t know who Ethan is either, this picture should suffice: http://www.openplans.org/people/ejucovy )

Who?

I’m Paul Winkler, a Zope user and developer since 1999, a TOPP employee since October 2007, and Opencore’s release manager since right now. http://www.openplans.org/people/slinkp

What will determine the release schedule?

I want to decouple Opencore’s release cycle from the TOPP deployment cycle. Douglas Mayle has volunteered to act as the openplans.org deployment manager; he and I have tentatively agreed on this plan:

  • Opencore releases will be on a time-based schedule, TBD.
  • Features that aren’t ready in time for a release can just wait until the next release.

    Rationale: Feature-driven releases inevitably get delayed and delayed, and the features don’t get done any faster.

  • Openplans.org deployments will be on an independent time-based schedule. (I believe the plan is to deploy weekly.)
  • “Must-have” openplans.org features and fixes can be deployed independently of the opencore release cycle by putting them in a new package that layers on top of opencore.
  • Openplans-specific code and text (like the current copyright notice) will be gradually moved out into this new package.
  • The aforementioned openplans.org customization package does not exist yet; we need to create one ASAP.

Of course TOPP’s own needs will continue to drive a lot of opencore development, which is a good thing; but that shouldn’t be a burden on you, the wider opencore user/developer community.

Does the Release Manager have to be a TOPP employee?

No. I’m just the first. It should be a rotating job. It’s my hope that before long, people outside TOPP will be willing and able to play this role.

When will the job rotate?

Don’t know yet.

What does the Opencore Release Manager do?

  • Manage the release schedule (not TOPP’s deployment schedule!). This does not mean I’m the dictator, just the point man and coordinator.
  • Shepherd release candidates through a stabilization process.
  • Create and announce final release packages.
  • Document upgrade process.
  • Document any backward incompatibilities.
  • Draft and enforce relevant dev process standards (e.g. which branches or plugins should you commit bugfixes or features to).
  • Document this process for the next release manager.

At this stage, I’ll also be acting as a sort of ambassador between TOPP and the wider community. Which is a bit silly, as every opencore contributor does that to whatever extent they wish. But I think for now it’s useful to consciously formalize it as part of my job. I will make an effort to:

  • Solicit community input about decisions we need to make for opencore.
  • Remind TOPP employees to discuss and make decisions about opencore out in the open. If you have a real-life relevant conversation I’ll try to remind you to post a summary to the mailing list.
  • Keep the community updated on changes to our build process. If we break your builds, you can complain to the opencore-user or opencore-dev mailing list, and you should expect a response from me.
  • Lobby for the community: e.g. prioritizing and scheduling feature requests and bugfixes.

There will of course be other forces at TOPP driving things with other priorities, but *my* role will be to ensure that the community’s needs don’t get ignored. This is more about release management than development management: I’m not in charge of deciding which features get done when. But whenever good code comes from the community, we have a responsibility to get that code released in a timely manner, and generally be as responsive as we can.

What about the rest of the stack?

We realize that most of the non-TOPP users of opencore are using some or all of the entire www.openplans.org stack. We’d eventually like to be making formal releases of all the packages, and possibly some kind of “batteries included” big meta-package; but that’s in the future. For now, opencore is our fastest-moving target and the most in need of release management.

Release Numbers

We supposedly have a policy now (Ethan proposed it here: http://tinyurl.com/4ybpro) … but we haven’t been sticking with it. I’m going to draft a proposal for a simpler convention and process. More on this in a later message.

Releases, Eggs, and Builds

I’d like to be releasing versioned eggs of opencore to PyPI. These should be installable into a suitable Zope/Plone instance using nothing more than easy_install.

(We have made a few releases to PyPI, but they aren’t actually usable … some files are missing.)

For bootstrapping a full stack including Zope, Plone, and other non-egg stuff, TOPP will continue to use and develop our Fassembler build tools. But it should become easier to do without that, if you want.

Branch Policy


We are currently trying to follow a “Stable Trunk” practice. Experimental and risky code should not be on the trunk.

Developer Infrastructure


This is only tangentially related to release management. I just wanted to note in passing that we’re planning a more thought-out integrated home on the web for opencore development. We’re calling this the “TOPP Dev Center” for now. Other open-source packages from TOPP will live there too.

In the meantime:

Wiki and mailing lists are at: http://www.openplans.org/projects/opencore

Trac (bug tracking) is at: http://trac.openplans.org/openplans

If you need a Trac account, just ask. (I wish we could take anonymous bug reports, but we got trac spam.)

If you want SVN commit access, first you should send some bugfixes and features as patches (either in trac or on the dev list). We’ll happily give commit access to people who have a history of submitting good patches. There will be a contributor agreement you’ll have to sign, somewhat similar to Zope’s.

Current Road Map


Some things coming up in the short term:

  • Plone 3

    We will soon be releasing a version of opencore that runs on Plone 3. Rob Miller has done all the hard work and we just need to QA this branch and merge it back to the trunk.

    Currently, the plan is for TOPP to create a release candidate from this branch, QA it, and deploy it to www.openplans.org. When that goes well, I will:

    1. Branch off the current opencore trunk to an opencore-plone2.5 branch
    2. Make a final stable release from the opencore-plone2.5 branch (to be followed by bugfix releases as necessary) 
    3. Merge the plone 3 branch back to the trunk. 

    This will definitely happen this summer; more concrete dates will be forthcoming. 

    If people in the community expect to continue using the Plone 2.5 version for a while, I would like to hear about it so I can serve you better.

  • Easier “plugin” management using z3c.autoinclude. This will replace our usage of zcmlloader, doing the same job but better (and other people are actively developing it).


Some ongoing maintenance tasks I’d love community help with:

  • Opencore is slow. Let’s make things faster. (But let’s get plone 3 merged to trunk first, it already helps substantially.)
  • The test suite has a lot of problems. More on this in a later message.
  • Dependency cleanup. I suspect there is cruft in the Products “bundle” that we don’t actually use.
  • What do you guys want?


Going forward, this road map should have a permanent home prominently linked in the aforementioned dev center. There is already such a page at http://www.openplans.org/projects/opencore/planning but it badly needs an update. I’ll be looking to revive that.

Questions?


Whew. Sorry for the length. Feedback would be most welcome.

- PW

by slinkp at June 27, 2008 11:18 PM

openplansdev tags

[from ianb] YAML Builder | A tool for visual development of YAML based CSS layouts

Yet Another CSS framework, but with a nice builder and seems more reasonable than a couple I've seen

by ianb at June 27, 2008 06:56 PM

June 26, 2008

TOPP Engineering Comments

Comment on Digging on the Windmill by ra

Yup, I’ve spent some time w/ Selenium as well, and we evaluated it for OpenPlans testing back when we settled on twill and flunc. It’s been around for a lot longer, and is definitely a lot more stable. I’m not a fan of the Selenium testing language, however, and Selenium doesn’t lend itself to automation as nicely as Windmill. I’m hopeful that the Windmill developers will continue to be responsive and will fix the show-stoppers that we hit.

by ra at June 26, 2008 06:50 PM

Comment on Digging on the Windmill by nickyg

Thanks Rob! I spent some time experimenting with Windmill a while back and can definitely see its potential. However, I ran into the “waits” problem, which at the time seemed like a blocker. I subsequently spent some time playing with Selenium, which was much better (flawless, actually) in that regard.

Anyway, I really like the idea of using a testing tool that has a real-time IDE for creating tests. They aren’t perfect, but they certainly make the process faster.

by nickyg at June 26, 2008 06:17 PM

TOPP Engineering

Digging on the Windmill

The Promise

In preparation for a ramp-up in testing the OpenPlans Plone 3 upgrade efforts, I’ve been revisiting OSAF’s Windmill functional testing tool. The story is compelling; if you’re going to be launching a browser and clicking around to test a site anyway, why not turn on a recorder that will auto-generate test suites that can reproduce your actions? Even if it ends up making the original testing take a little while longer, that pays for itself the first time you re-run the tests.

With something concrete to accomplish, then, I sat down to see if Windmill delivers on this promise.

Bootstrapping

Getting started went pretty smoothly. Installing was easy, on Ubuntu Gutsy, anyway; just regular setuptools stuff. I’m working from the Windmill trunk, so I created and activated a python 2.5 virtualenv, checked the code out from http://svn.osafoundation.org/windmill/trunk/, and ran ‘python setup.py develop’ to install it into the environment.

Windmill is then launchable with ‘windmill firefox URL’. This will open firefox, albeit without any of the customizations that you might have set up in your profile. It also opens a smaller Windmill IDE browser window, and starts a Windmill “controller service”, which exposes the API for manipulating the browser window.

Capturing Tests

The IDE is pretty simple. It’s implemented in HTML and Javascript, and it provides 4 primary features: a test recorder, a test runner, a DOM explorer, and an “Assert” explorer. The test recorder is what I’m first interested in. I click the record button, and the browser window jumps to the foreground. So I start clicking around, entering text and submitting forms, and, lo and behold, I can see what I’m doing being captured in the IDE. Good start.

To start, I make an incorrect login attempt, and, upon failure, try to create a new account, getting all the way to the “check your email” confirmation screen. I turn off the recorder, and click on the ‘save’ link in the IDE window. Windmill can export the test suites as either python or JSON. I’ve chosen python, so I get another window with the following text:

# Generated by the windmill services transformer
from windmill.authoring import WindmillTestClient

def test():
    client = WindmillTestClient(__name__)

    client.click(link=u'Sign in')
    client.type(text=u'bogus', id=u'__ac_name')
    client.type(text=u'bobobobo', id=u'__ac_password')
    client.click(name=u'login')
    client.click(link=u'Create account')
    client.type(text=u'testuser1', id=u'id')
    client.type(text=u'Test UserOne', id=u'fullname')
    client.type(text=u'test1@example.com', id=u'email')
    client.type(text=u'testy', id=u'password')
    client.type(text=u'testy', id=u'confirm_password')
    client.click(name=u'task|join')

I paste this into an emacs buffer and save it as windmill_tests.py. Then I shut down the windmill process (I have to ‘kill’ it to make sure everything terminates correctly :P) and pass in the test I just created with ‘windmill firefox URL test=windmill_tests.py’. Sure enough, the browser launches, hits the site, and starts following links and entering text! Initial success makes much happy.

Controller API

The ‘client’ variable in the python code above is a handle to a WindmillTestClient object, which exposes the Windmill Controller API. The controller is what is driving the browser. One fun trick is that you can put a pdb in your test code and you’ll get an interactive prompt which you can use to control the live browser window. Now enjoy the power of a python command line interface to teh interwebs, while still being able to access all the rich multimedia that today’s users demand! It’s definitely more fun than it should be to fill out web forms and click on links by typing in ‘client.type’ and ‘client.click’ commands.

Assertions

Astute readers will have noticed that the second run of the user creation test would have a different result than the first. Indeed, when I used Windmill to run the test steps I’d captured, it ended with an error message informing me that the username was already taken. Real tests of course will want to make assertions, to ensure that the behaviour is really what you want. Enter the Assertion Explorer. At any point, you can click on the Assertion Explorer button, and then click somewhere on the page, and the IDE will generate a best-guess assertion for you for the element that you selected.

The generated assertions are sometimes spot-on. Whenever I clicked on a portal status message, for instance, it asserts (using XPath) that the PSM element exists on the page, and that it contains the specified text. When I click on regular page text, however, it only checks for the existence of the clicked node, when I want it to check the text as well.

The asserts are a part of the Controller API, and are pretty easy to understand. Here’s an example of a couple that I generated:

    client.asserts.assertNode(xpath=u'/html/body/div/div/div/div/div')
    client.asserts.assertText(xpath=u'/html/body/div/div/div/div/div', validator=u'Welcome! You have signed in.')

    client.asserts.assertNode(xpath=u'/html/body/div/div/div/div/div')
    client.asserts.assertText(xpath=u'/html/body/div/div/div/div/div', validator=u'Your changes have been saved.')
    client.asserts.assertNode(link=u'something else+')

Nothing particularly mysterious there.
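
Coming back to the username collision from the start of this section: one way to keep a recorded test re-runnable is to generate a fresh username on every run instead of hard-coding one. The helper below is a minimal sketch of that idea; unique_username is a hypothetical name and not part of Windmill.

```python
import time
import uuid


def unique_username(prefix="testuser"):
    """Return a username that is very unlikely to repeat between runs."""
    # The timestamp keeps names roughly sortable by run; the random uuid
    # fragment guards against two runs starting within the same second.
    return "%s%d%s" % (prefix, int(time.time()), uuid.uuid4().hex[:6])
```

In the generated test, the fixed u'testuser1' value passed to client.type could then be swapped for a call to this helper.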

The Warts

Despite the early successes, it wasn’t long before I hit a couple bumps. Here’s an overview of the issues that came up for me:

  • Hard to extract info from the page

    The Controller API works pretty well for controlling the page and making assertions, but there don’t seem to be very good hooks for extracting information from the page itself into the python environment. For instance, in order to complete the user registration, the test will need to log in as an admin and hit a special page which will return the user’s confirmation key. This will need to be extracted from the response and then used as part of the URL of a subsequent request. Even a simpler case, such as simply extracting the URL of the page that the browser is currently visiting, doesn’t come clear to me from a thorough comb of the documentation.
  • The ‘waits’ stuff seems to be broken ATM

    The Controller API has a whole selection of ‘waits’ methods, which tell the test runner not to move on until some criterion is met: a specific amount of time has elapsed, the page has loaded, an element shows up on the page, etc. For me, running FF2.0.0.14 on Ubuntu Gutsy, these were completely broken. Any time I try to use one of these calls, the test suite just stops right there. This is a show-stopper, since I very quickly hit false failures due to the tests running more quickly than the browser was responding.
  • Xinha typing didn’t get picked up by the recorder

    Most of the actions I performed in the browser were dutifully recorded by the IDE. The only exception to this is any text I would enter into the Xinha editor. Whenever I would edit a wiki page, the recorder would skip directly from clicking the ‘edit’ link to clicking on the ‘save’ button. I think we can work around this by just adding these commands by hand to the generated python. I’m not 100% on this, though.
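
Until the ‘waits’ calls behave, a plain-Python polling loop is one possible stopgap for the timing failures. This is only a sketch: wait_until is a hypothetical helper that does not come from Windmill, and how the condition callable inspects the page is left open.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.25):
    """Poll condition() until it is truthy or `timeout` seconds have passed.

    Returns True if the condition was met, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The condition is always checked at least once, so a timeout of zero still gives a single attempt.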

Python is Nice

As I’ve mentioned, it’s possible to export the generated tests and assertions as either JSON or python code. Having such a straightforward way to drive the tests from pure python is a big plus for those of us who like that sort of thing. We can use try:finally to make sure clean-up code gets hit, and httplib2 to talk to the server to actually perform the clean-up. Control is easy in python. A quick peek at the code indicates that it does pass unrecognized options in to the test suites themselves, which means that we can code up test suites that understand additional control parameters as we need.
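
The try/finally pattern can be sketched in a few lines. The names here are hypothetical, and the cleanup callable stands in for whatever request to the server would remove the test data.

```python
def run_with_cleanup(test_steps, cleanup):
    """Run test_steps(); always call cleanup(), even if the steps raise."""
    try:
        return test_steps()
    finally:
        # Runs on success and on failure, so test data never lingers.
        cleanup()
```

Because the exception is not swallowed, a failing step still shows up as a test failure after the cleanup has run.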

The Good News

While I did stumble on a couple of issues, I am very happy to report that the folks in the #windmill channel on freenode are very helpful. Even better than that, they’re very happy to get my bug reports, and are very responsive about fixing them. I spent about a day playing with Windmill a few months back. In that time I uncovered one bug and one usability weirdness. They were both fixed within days. The issues I’ve raised this time are already recorded as issues in their tracker, and I’m told that they should be resolved by the end of the week.

Conclusions

Takeaways? I’d say “cautiously wildly enthusiastic” best describes how I feel. Windmill is very close to delivering on its promise, making it ridiculously easy to generate robust, JavaScript-supporting test suites that can be trivially run on IE, Safari, and Firefox. If the pattern of developer responsiveness continues, and the issues that come up are either easily worked around or are resolved within Windmill itself, then I think this really hits the sweet spot. It didn’t take long for me to hit a couple of pretty big issues, however. If it turns out that there are a lot of similar bombs in there, that the problems stack up faster than the developers can deal with them, then it would probably end up being a headache for us.

I don’t really expect this, however. I know OSAF is using Windmill internally, and the developers that I’ve had contact with seem very enthusiastic about getting my reports and getting the problems resolved. I’m going to continue my exploration, and I hope to generate a thorough set of Windmill tests for the Plone 3-based OpenPlans stack over the next week.

by ra at June 26, 2008 01:37 AM

June 25, 2008

openplansdev tags

OpenCore Blog Comments

June 24, 2008

openplansdev tags

[from scrollie] Choosing the right things to say no to - (37signals)

I’ve made much more money by choosing the right things to say no to than by choosing things to say yes to. I measure it by the money I haven’t lost and the quality I haven’t sacrificed.

by scrollie at June 24, 2008 12:09 PM

June 22, 2008

openplansdev tags

June 21, 2008

Law 2.0 - Anil Makhijani

Drupal on Mac OS X

I tried installing Drupal on Mac OS X Leopard the other night and almost everything worked right out of the box.  There were two things I had trouble with:

  1. The PHP on OS X does NOT come with the GD image library installed.  I found a very good tutorial to get it working here.  (The one problem with the tutorial is that the php.ini file that comes with the download did not seem to work on my machine.  Instead I used the php.ini file that comes with Mac OS X.)
  2. The .htaccess file that comes with Drupal seems to be missing a line, which breaks “clean urls” on OS X.  Adding the rule “RewriteBase /path/to/drupal” seemed to fix things.

Great, now that I have Drupal cleanly installed, time to try experimenting …

by Anil Makhijani at June 21, 2008 11:37 PM

June 20, 2008

TOPP Engineering

Week 0: Trac and new user registration

This week was my first as a regular TOPP employee, after having not been here since I was an intern last summer. The current plan is for me to do a few rounds of rotations, pairing with folks for a week or so to get a taste for what they’re doing and (hopefully) contributing something helpful. After getting settled in a bit, I began my first such pairing with Jeff, working on adding CAPTCHA support to the new user registration page on Trac provided by the AccountManager plugin, per ticket #8.


Broadly speaking, there are two primary ways to tackle this problem. The first is to write a stand-alone plugin that implements the IRequestFilter and ITemplateStreamFilter interfaces. The second  approach comes in two parts, the first of which is to modify the AccountManager plugin to provide a new extension point for making additions to the user registration page. The second part is to then write a plugin that implements this new interface to actually add the CAPTCHA to the form and then verify it is correct upon submission.


As an example of the first approach, I wrote a quick and dirty prototype, CaptchaAuth. To explore the second option, I wrote RegistrationConfirmationPatch to modify AccountManager to include a minor variation of the IRegistrationConfirmation interface suggested in ticket #8. I then wrote SimpleCaptcha to implement this interface and provide very basic CAPTCHA support.


As it stands, the pure plugin approach, CaptchaAuth, is a little hackish. While it does have the advantage of not requiring any modifications to AccountManager, it has some downsides. First, it relies on an XPath to grab on to the portion of the form to add the CAPTCHA to. This means that future changes to AccountManager could inadvertently break it. Second, error messages are poorly integrated, and as of now raise an ugly TracError instead of looking like other form validation errors on the page.


The RegistrationConfirmationPatch/SimpleCaptcha approach, on the other hand, has the downside of requiring modification to AccountManager, but does not have the other problems mentioned above. It also has the advantage that the patch provides a generalized interface that could be used to add any additional elements to the new user registration form. As an example, the same interface could be used for e-mail verification of new users.


I’ve e-mailed the Trac dev listserv to get some input from the core developers and the community at large. Hopefully we’ll be able to work together to come up with a flexible and clean solution for allowing plugins to add form elements to the new user registration page.

by nicholasbs at June 20, 2008 11:03 PM

OpenCore Blog Comments

Comment on Our Build Process by ianb

By “pushback” I mean making changes in the code to facilitate the build, instead of making the build always facilitate the code as it exists. We’ve already done some of that.

As for BuildIt, well… it’s also what filemaker does. Different implementations.

The change could be as incremental as a change to zc.buildout. That is, we don’t have to make our build process instantly relocatable. Some things like virtualenv would have to be changed, and .pth files. For simple projects this is pretty much enough. For other cases we might need to use an environmental variable to point to the base directory (which we could make sure is set in the startup scripts) and then make sure configuration and such can use things like %(BASE_DIR)s. Another option is to have a kind of fix-up script(s), which would write any files that needed to have paths rewritten.
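
As a rough illustration of the %(BASE_DIR)s idea, here is a sketch using the standard-library configparser from Python 3. The function name, the [paths] section and the log_dir option are all invented for the example; only the BASE_DIR variable comes from the suggestion above.

```python
import configparser
import os


def load_relocatable_config(text, base_dir=None):
    """Parse INI text, expanding %(BASE_DIR)s to the build's base directory."""
    if base_dir is None:
        # Assumes the startup script exported BASE_DIR; falls back to cwd.
        base_dir = os.environ.get("BASE_DIR", os.getcwd())
    # Defaults are visible to every section, so any value may refer to
    # %(BASE_DIR)s and configparser substitutes it when the value is read.
    parser = configparser.ConfigParser(defaults={"BASE_DIR": base_dir})
    parser.read_string(text)
    return parser


EXAMPLE = """
[paths]
log_dir = %(BASE_DIR)s/var/log
"""
```

Moving the build then means changing one environment variable rather than rewriting every configuration file.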

For platform needs one option is things like platform-dependent eggs, where we ship binaries for everything. This is what Macs use for a lot of platform issues, I think. Unfortunately there are small issues that can cause problems, like if Python was compiled for UCS2 or UCS4 internal unicode. One option for cases like this is a kind of lazy build system, where on startup we make checks and recompile or otherwise manipulate the environment based on things we detect.

All of which seems like a lot of trouble, but of course only applies to small subsets of the software we are using — most of it is easily relocatable. I’m not *sure* the work gives enough benefit. But there’s something intriguing about the concrete nature of the build in that case. For example, I think it’d make it much easier to figure out our workflow when setting up staging and production builds. Or you could do diffs and get an accurate idea of what’s changed. Or easily archive exact versions of the complete build.

by ianb at June 20, 2008 07:03 PM

Comment on Our Build Process by ra

Heh… interestingly, this “we’re just pushing files around” philosophy is what buildit, which we tried at one point, claims to be about.

My first reaction to this is cautiously negative, but then I’ve gotten comfortable with our current build process, so I’m a little reluctant to mix things up too much, especially so soon after things have settled. And, of course, we do depend on some C code, which needs to be recompiled for each platform.

I’m interested in hearing more, though. How do you imagine handling the binary executable stuff? And I’m not sure what you mean by “pushback”… can you elaborate?

by ra at June 20, 2008 05:45 PM

OpenCore Blog Posts

Our Build Process

Rob Miller has been thinking about zc.buildout lately, and providing a buildout-based bootstrap for our stack (even if it only calls out immediately to fassembler).  This got me to thinking, and I finished a reflection post about fassembler.

At the end I almost talked myself into refactoring stuff to make the good parts of fassembler usable in zc.buildout, so that we could move over to that in the future if we want to.  Then… right at the end… pow!  Files!  All files, only files!

What if our build process just built files?  Relocatable files.  Not-system-dependent files.  You could tar the whole thing up, copy it to another location or server, unpack it, and get the exact same site.  That tarball would basically be equivalent to the build itself.

It’s not that App Engine seems particularly awesome.  More that it seems like it might be Just The Right Kind Of Dumb.  It’s that this makes it so easy to think about the build, about what it does, about stuff like the workflow for testing and deployment and rolling back.  Easy to explain to other people, easy to audit, easy to debug.

The problem of course is that it requires pushback into other products.  Lots of other products.  But we can do that.

Thoughts?

by ianb at June 20, 2008 04:50 AM

Ian Bicking

My Experience Writing a Build System

Lately there’s been some interest in build processes among various people — Vellum was announced a while back, Ben has been looking for a tool and looking at Fabric, and Kevin announced Paver. At the same time zc.buildout is starting to gain some users outside of the Zope world, and I noticed Minitage as an abstraction on top of zc.buildout.

A while ago I started working on a build project for Open Plans called fassembler. I think the result has been fairly successful and maintainable, and I thought I’d share some of my own reflections on that tool.

Update: what we were trying to accomplish

I didn’t make it clear in the post just what we were trying to do, and what this build system would accomplish.

Our site (openplans.org) is made up of several separate servers with an HTML-rewriting proxy on the front end. We have a Zope server running a custom application, Apache running WordPress MU, and some servers running Pylons or other Python web applications for portions of our site. We needed a way to consistently reproduce this entire stack, all the pieces, plugged together so that the site would actually work. Two equally important places where we had to reproduce the stack are for developer rigs and the production site.

Our code is primarily Python and we use a lot of libraries, developed both internally and externally. Setting up the site is primarily a matter of installing the right libraries and configuration and setting up any databases (both a ZODB database and several MySQL databases). We use a few libraries written in C, but distutils handles the compilation of those pretty transparently.

For this case we really don’t care about build tools that focus on compilation. We don’t care about careful dependency tracking because we are compiling very little software.

make doesn’t make sense

Update 2: If you think the make model makes lots of sense, read the preceding section — it makes sense for a different problem set than what we’re doing.

We initially had a system based on BuildIt, which is kind of like make with Python as the control code. It wasn’t really a good basis for our build tool, and I think it added a lot of confusion, compounded by the fact that we weren’t quite sure what we wanted our build to do. Ultimately I think the make model of building doesn’t make sense.

The make model is based on the idea that you really want to save work. So you detect changes and remake things only as necessary. For compilation this might make sense, because you edit code and recompile a lot and it’s tedious to wait. But we are building a website, and installing software, and none of that style of efficiency matters. make-style detection of work to be done doesn’t even save any time. But it does make the build more fragile (e.g., if you define a dependency incorrectly) and much harder to understand, and you constantly find yourself wiping the build and starting from scratch because you don’t trust the system.

The metaphor for the new build system was much simpler: do a list of things, top to bottom. There’s no effort into detecting changes in the build, or changes in the settings, or anything else.
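The "list of things, top to bottom" model is small enough to sketch in a few lines. This is not fassembler's code; the Task class and the task names are hypothetical.

```python
class Task:
    """One named step of the build."""

    def __init__(self, name, action):
        self.name = name
        self.action = action

def run_build(tasks, log=print):
    """Run every task in order; the first exception aborts the build."""
    for number, task in enumerate(tasks, start=1):
        log(f"[{number}/{len(tasks)}] {task.name}")
        task.action()

steps = []  # records what ran, standing in for real work
build = [
    Task("create directories", lambda: steps.append("dirs")),
    Task("install libraries", lambda: steps.append("libs")),
    Task("write configuration", lambda: steps.append("config")),
]
run_build(build)
```

There is no dependency graph and no change detection: every run does everything, in the order written.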

Do things carefully

In the build system almost all actions go through the filemaker module. This is kind of a file abstraction library. But the goals are entirely different than convenience: the goal is transparency and safety. In contrast Paver uses path.py for convenience, but I’m not sure what the win would be if we used a model like that.

filemaker itself is heavily tied to the framework that it’s written for, specifically user interaction and logging. Most tasks just do things, and rely on filemaker to detect problems and ask the user questions. For example, every time a file is written, it checks if the file exists, and if it has the same content. If it exists with other content, it asks the user what to do. It doesn’t overwrite files without asking (at least by default). I think this makes the tool more humane, as the default behavior for a build is to be careful and transparent. The build author has to go out of their way to make things difficult.
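The write-a-file behavior described above might look roughly like this. The function name ensure_file and its return values are invented for the sketch; this is not filemaker's actual API.

```python
import os
import tempfile

def ensure_file(path, content, ask=input, log=print):
    """Write content to path, but never silently clobber different content."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            existing = f.read()
        if existing == content:
            log(f"{path} is already up to date")
            return "unchanged"
        answer = ask(f"{path} exists with different content. Overwrite? [y/N] ")
        if answer.strip().lower() != "y":
            log(f"Keeping existing {path}")
            return "kept"
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    log(f"Wrote {path}")
    return "written"

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "app.ini")
    results = [
        ensure_file(target, "port = 8080\n"),
        ensure_file(target, "port = 8080\n"),
        # Simulate a user who answers "n" to the overwrite question.
        ensure_file(target, "port = 9090\n", ask=lambda prompt: "n"),
    ]
print(results)
```

The careful path is the default; a build author who really wants to overwrite has to pass a different ask callback on purpose.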

Many zc.buildout recipes will blithely overwrite all sorts of files which always made me very uncomfortable with the product. It’s the recipes in zc.buildout which do this, not the buildout framework itself, but because buildout made overwriting the easy thing to do, and didn’t start with humane conventions or tools, this behavior is the norm.

What I think filemaker most accomplished was the ability to do file operations while also asserting the expected state of the system, and so makes build bugs noticeable earlier instead of getting a build process that finishes successfully but creates a buggy build, or having an exception show up far from where the error was originally introduced.

Also, because it won’t overwrite your work in progress, the build has avoided engendering the deep feelings of hatred that come when a tool destroys your changes. It’s hard to detect this absence of hatred, but I know that I’ve felt that hatred with other systems.

Update: a corollary: ignore no errors

One question you might wonder about: why not a shell script? We did prototype some things as shell scripts, but we’ve consistently moved to Python at some point, even things that seemed really trivial. The problem with shell scripts is they have horribly bad behavior with respect to errors. Ignoring errors is really really easy, noticing errors is really hard.

This is absolutely unacceptable for builds. Builds must not ignore errors. The build may mostly work despite an error. It might be totally broken, but the error message is lost in all sorts of useless output. The error message probably makes no sense. The context is lost. No suggestion is given to the user.

When builds work, that’s great. Builds do not always work. They always fail sometimes, and some poor sucker (usually in some hot potato-like arrangement) has to figure out what went wrong. You have to plan for these problems.

Everything in the build tries to be careful about errors. All places where it is not, it is a bug. The resolution isn’t to see something appear to work, but create a broken build, and say "oh, you forgot to set X". The resolution is to make sure when you forget to set X it gives you an error that tells you to set X.
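A small sketch of "forget to set X, get told to set X". BuildError, require_setting and the setting name are hypothetical names, not taken from fassembler.

```python
class BuildError(Exception):
    """An error written for the person running the build."""

def require_setting(settings, name, hint):
    """Return settings[name], or fail right away with an actionable message."""
    value = settings.get(name)
    if not value:
        raise BuildError(f"Setting {name!r} is not set. {hint}")
    return value

try:
    require_setting({}, "db_password", "Add db_password to your build settings.")
except BuildError as error:
    message = str(error)
    print(message)
```

The check runs before any work that depends on the setting, so the failure shows up next to its cause instead of as a broken site later.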

This is one of the more important and more often ignored principles of a good build/deployment system. Maybe it’s gotten better, but when I first used zc.buildout (very early in its development) the poor handling of errors was by far the biggest problem and it left me with a bad taste in my mouth. easy_install and setuptools in general are also very flawed in this respect.

Log interesting things

I tried to make a compromise between logging very verbosely, and being too quiet. As a user, I want to see everything interesting and leave out everything boring. Determining what is interesting and what is boring can be a bit difficult, but it really just requires some attention and tweaking.

To make it possible to visually parse the output of the tool I found both indentation and color to be very useful. Indentation is used to represent subtasks, and color to make sections and warnings stand out.
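Indentation for subtasks and color for warnings can be sketched with a context manager. The IndentLogger class is an illustration only, not fassembler's logger; the color is a plain ANSI escape sequence.

```python
from contextlib import contextmanager

class IndentLogger:
    """Logger that indents messages emitted inside a subtask."""

    def __init__(self, write=print, use_color=False):
        self.write = write
        self.use_color = use_color
        self.depth = 0

    def info(self, message):
        self.write("  " * self.depth + message)

    def warn(self, message):
        text = "Warning: " + message
        if self.use_color:
            text = "\033[33m" + text + "\033[0m"  # ANSI yellow
        self.write("  " * self.depth + text)

    @contextmanager
    def subtask(self, title):
        """Log the title, then indent everything logged inside the block."""
        self.info(title)
        self.depth += 1
        try:
            yield
        finally:
            self.depth -= 1

lines = []
logger = IndentLogger(write=lines.append)
with logger.subtask("Installing libraries"):
    logger.info("example-lib")
    logger.warn("no version pinned")
logger.info("Done")
print("\n".join(lines))
```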

The default verbosity setting is not to be completely quiet. Silence is a Unix convention that just doesn’t work for build tools. Silence gets you interactions like this:

$ build-something target-directory/
(much time passes)
Error: cannot write /home/ianb/builds/20080426/target-directory/products/AuxInput/auxinput/config/configuration.xml

Why did it want to write that file? Why can’t it write that file? Is the build buggy? Did I misconfigure it? Does the directory exist?

The typical way of handling this is either to run the build again with logging setup or otherwise make it more verbose, or to get in the habit of always running it verbose.

Mixing code and configuration

BuildIt, which we were using before, had the ability to put variables in settings, and you could read an option from another section with something like ${section/option}. It was limited to simple (but recursive) variable substitution, and had some clever but very confusing rules that created a kind of inheritance.

I liked the ability to do substitution, but wasn’t happy with the compromise BuildIt made. I wasted a lot of time trying to figure out the context of substitutions. So, I saw two directions. One was to remove the cleverness and just do simple substitution. This is the choice zc.buildout made. The other was to go whole-hog. With a bit of trepidation I decided to go for it, and I made the choice to treat all configuration settings as Tempita templates. All configuration is generally accessed via config.setting_name, and that lazily interpolates the setting (it took me quite a while to figure out how to avoid infinite loops of substitution). Because evaluation is done lazily, settings can depend on each other and be overridden and have lots of code in defaults (e.g., a default that is calculated based on the value of another setting), and it works out okay. Most settings just ended up having a smart default, and as a result very little tweaking of the configuration is necessary.
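To make the lazy interpolation concrete, here is a stand-in that does not use Tempita at all: it evaluates {{expression}} markers with a regular expression and eval(), and tracks which settings are being resolved so that a loop raises an error instead of recursing forever. The LazyConfig class and the setting names are hypothetical.

```python
import re

class LazyConfig:
    """Settings whose values are templates, interpolated only when read."""

    _expr = re.compile(r"\{\{(.*?)\}\}")

    def __init__(self, **raw):
        self._raw = raw
        self._resolving = []  # names currently being interpolated

    def __getattr__(self, name):
        if name.startswith("_") or name not in self._raw:
            raise AttributeError(name)
        if name in self._resolving:
            chain = " -> ".join(self._resolving + [name])
            raise ValueError(f"circular setting reference: {chain}")
        self._resolving.append(name)
        try:
            return self._expr.sub(self._evaluate, str(self._raw[name]))
        finally:
            self._resolving.pop()

    def _evaluate(self, match):
        # eval() is only reasonable because build settings are trusted code.
        return str(eval(match.group(1), {"config": self}))

config = LazyConfig(
    base_dir="/srv/site",
    base_port="8000",
    zope_port="{{int(config.base_port) + 1}}",
    log_file="{{config.base_dir}}/logs/zope-{{config.zope_port}}.log",
)
print(config.log_file)
```

Because nothing is computed until a setting is read, overriding base_port is enough to change zope_port and log_file as well, which is the "smart default" effect described above.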

Somewhat ironically the result was a kind of atrophying of the settings, because no one actually set them, instead we just tweaked the defaults to get it right. Now I’m not entirely sure what exactly the "settings" are setting, or who they should really belong to. To the build? To the tasks? While this is conceptually confusing, in practice it isn’t so bad. This mixing of code and configuration has been distinctly useful, and not nearly as problematic to debug as I worried it would be. In some ways it was a way of building lambda into every string, and the lazy evaluation of those strings has been really important. But it’s not clear if they are really settings.

Would normal string interpolation have been enough (e.g., with string.Template)? I’m pretty sure it wouldn’t have been. The ability to do a little math or use functions that read things from the environment has been very important.

Managing Python libraries

fassembler uses virtualenv for building each piece of the stack. Generally it creates several environments and installs things into them — it doesn’t run inside the environments itself. This works fine.

zc.buildout in comparison does some fancy stuff to scripts where specific eggs are enabled when you run a script. Each script has a list of all the eggs to enable. You can’t install things or manage anything manually, even to test — you always have to go through buildout, and it will regenerate the scripts for you. zc.buildout was implemented at the same time as workingenv (the predecessor to virtualenv), and I actually finished virtualenv with fassembler in mind, so I can’t blame zc.buildout for not using virtualenv. That said, I don’t think the zc.buildout system makes any sense. And it’s really complicated and has to access all sorts of not-really-public parts of easy_install to work.

Isolation is only the start. easy_install makes sure each library’s claimed dependencies are satisfied. You might then think easy_install would do all the work to make the stack work. It is nowhere close to making the stack work. setup.py files can/should contain the bare minimum that is known to be necessary to make a package work. But they can’t predict future incompatibilities, and they can’t predict interactions. And you don’t want all your packages changing versions arbitrarily. If you work with a lot of libraries you need those libraries to be pinned, and only update them when you want to update them, not just because an update has been released.

So for each piece of the stack we have a set of "requirements". This is a flat file that lists all the packages to install. They can have explicit versions, far more restrictive than anything you should put in setup.py. It also can check out from svn, including pinning to revisions. This installation plan can go in svn, you can do diffs on it, you can branch and copy and do whatever. Maybe at some point we could use it to keep cached copies of the libraries. For now it mostly uses easy_install (and python setup.py develop for checkouts).
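The post does not show the file syntax, so the sketch below assumes a simplified format modeled on the requirements files pip later adopted: Name==version lines for pinned packages and -e lines for editable checkouts. The package names and the svn URL are placeholders.

```python
def parse_requirements(text):
    """Split a requirements file into pinned packages and editable checkouts."""
    pinned, editable = {}, []
    for number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("-e "):
            editable.append(line[3:].strip())
        elif "==" in line:
            name, version = line.split("==", 1)
            pinned[name.strip()] = version.strip()
        else:
            raise ValueError(
                f"line {number}: {line!r} is not pinned to an exact version"
            )
    return pinned, editable

# Hypothetical requirements file content.
EXAMPLE = """
# libraries for one piece of the stack
ExampleLib==1.0.2
OtherLib==0.9
-e svn+https://svn.example.org/project/trunk@1234#egg=project
"""

pinned, editable = parse_requirements(EXAMPLE)
print(pinned, editable)
```

Rejecting unpinned lines is a design choice for the sketch: it keeps a package from changing versions just because an update was released.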

In parallel we have a command-line program for just installing packages using files like this, called PoachEggs. I want to make this better, and have fassembler use it, but I mostly note it because it implements a feature that can "freeze" all your packages to a requirements file. You take a working build and freeze its requirements, giving explicit (==) versions for packages, and pin all the svn checkouts to a revision, so that the frozen requirements file will install exactly the packages you know work.
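The freezing idea can be approximated today with the standard library's importlib.metadata, which lists installed distributions. This is not PoachEggs code, and it only covers the name==version part; pinning svn checkouts to a revision is not handled.

```python
from importlib import metadata

def freeze():
    """Return sorted 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        version = dist.version
        if name and version:  # skip distributions with broken metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)

frozen = freeze()
print(f"{len(frozen)} packages frozen")
```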

An alternative to this is what the Repoze guys are doing, which is to create a custom index that only includes the versions of libraries that you want. You then tell easy_install to use this instead of PyPI. It works with zc.buildout (and anything that uses easy_install), but I can’t get excited about it compared to a simple text file. I also want svn checkouts instead of tarballs created from the checkout; I like an editable environment, because the build is just as much to support developers as to support deployment.

The structure

A big part of the development of fassembler was nailing down the structure of our site, and moving to use tools like supervisor to manage our processes. A lot of these expectations are built into the builds and fassembler itself. This is part of what makes the build Work — the pieces all conform to a common structure with some basic standards. But this isn’t the build tool itself, it’s just a set of conventions.

I don’t know quite what to make of this. Extracting the conventions from the builds leads to a situation where you can more easily misconfigure things, and the installation process ends up being more documentation-based instead of code-based. We do not want to rely on documentation, because the need for documentation generally points to a flaw in the build process that needs explaining. It’s faster for everyone if the code is just right. Maybe these conventions could be put into code, separate from the build. The abstraction worries me, though: too much to keep track of?

What we don’t get right

The biggest problem is that fassembler is our own system and no one else uses it. If someone wants to use just a piece of our stack they either have to build it manually or they have to use our system which is meant to build all our pieces together with our conventions. There’s some pressure to use zc.buildout to make pieces more accessible to other Zope users. We’ve also found things that build with zc.buildout that we’d like to use (e.g., setups for varnish).

We haven’t figured out how to separate the code for building our stuff from the build software itself. There’s a bootstrapping problem: you need to get the build code to build a project, and so it can’t be part of the project you are building. zc.buildout uses configuration files (that aren’t code, so they lack the bootstrap problem) and it uses recipes (a kind of plugin) and has gone to quite a bit of effort to bootstrap everything. virtualenv also supports a kind of bootstrap which we use to do the initial setup of the environment, but it doesn’t support code organization in the style of zc.buildout.

Builds are also fairly tedious to write. They aren’t horrible, but they feel much longer than they should be. Part of their length, though, is that over time we put in more code to guard against environment differences or build errors, and more code to detect the environment. But compared to zc.buildout’s configuration files, it doesn’t feel quite as nice, and if it’s not as nice sometimes people are lazy and do ad hoc setups.

The future

We haven’t really decided, but as you might have noticed zc.buildout gets a lot of attention here. There are quite a few things I don’t like about it, but a lot of these have to do with the recipes available. We don’t have to use the standard zc.buildout egg installation recipe. In fact that would be first on the chopping block, replaced with something much simpler that assumes you are running inside a virtualenv environment, and probably something that uses requirement files.

Also, we could extract filemaker into a library and recipes could use that. Possibly logging could be handled the same way (the logging module just isn’t designed for an interactive experience like a build tool). Then if we used other people’s recipes we might feel grumpy, since they’d use neither filemaker nor our logging, but it would still work. And our recipes would be full of awesome. The one thing I don’t think we could do is introduce the template-based configuration. Or, if we did, it would be hard.

That said, there is a very different direction we could go, one inspired more by App Engine. In that model we build files under a directory, and that directory is the build. Wherever you build, you get the same files, period. All paths would be relative. All environmental detection would happen in code at runtime. Things that aren’t "files" exactly would simply be standard scripts. E.g., database setup would not be done by the build, but would be a script put in a standard location.

This second file-based model of building is very much different than the principles behind zc.buildout. zc.buildout requires rebuilding when anything changes, and does so without apology. It requires rebuilding to move the directories, or to move to different machines. Using a file-based model requires a lot of push-back into the products themselves. Applications have to be patched to accept smart relative paths. They have to manage themselves a lot more, detect their environment, handle any conflicts or ambiguities, and be graceful about stuff like databases, because the files have to be universal. In an extreme case I could imagine going so far as to only keep a template for a configuration file, and write the real configuration file to a temporary location before starting a server (if the server cannot be patched to accept runtime location information).
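The extreme case at the end, keeping only a template and writing the real configuration just before startup, could be sketched like this with string.Template and tempfile. The template text and the render_runtime_config name are assumptions for illustration.

```python
import os
import string
import tempfile

def render_runtime_config(template_text, base_dir):
    """Fill in runtime paths and write the result to a temporary file."""
    content = string.Template(template_text).substitute(base_dir=base_dir)
    fd, path = tempfile.mkstemp(suffix=".conf")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(content)
    return path

# Hypothetical template that would ship as part of the relocatable build.
TEMPLATE = "logfile = ${base_dir}/logs/server.log\n"

config_path = render_runtime_config(TEMPLATE, "/srv/site")
with open(config_path, encoding="utf-8") as f:
    rendered = f.read()
os.remove(config_path)
print(rendered, end="")
```

A startup script would pass the temporary path to the server; the files under the build directory never contain an absolute path.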

So this is the choice ahead. I’m not sure when we’ll make this choice (if ever!) — build systems are dull and somewhat annoying, but they are no more dull and annoying than dealing with a poor build system. Actually, they are definitely less dull than working with a build system that isn’t good enough or powerful enough, or one that simply lacks the TLC necessary to keep builds working. So no choice is a choice too, and maybe a bad choice.

by Ian Bicking at June 20, 2008 04:41 AM

June 19, 2008

OpenCore Blog Posts

GrassyKnoll: a pluggable search engine in Python

This past Tuesday I attended the NYC Python User’s Group meeting at the offices of DayLife.

The presentation this week was by Peter Fein about GrassyKnoll, a text search engine written in Python.

From TOPP’s point of view there are several interesting things about it:

  • It could provide us with a sitewide search solution for openplans.org and livablestreets.com.
  • It can use any of a number of pluggable back ends (notably PyLucene).
  • You interact with it entirely via a simple REST API.
  • A GrassyKnoll client can also trivially be a GrassyKnoll server. (Peter gave a live demo of this.) In theory this could allow for fun things like smart clustering, where one server gets your query and dispatches queries to N other servers, and then merges the results appropriately. The end client still sees the same simple REST API.
  • It’s written entirely in Python and is Free.
  • They’re looking for more people to help out and would love for us to get involved.

The major down side is that they’re still some months away from a production-ready release. (There were a few glitches in the live demo.) When I pressed him about this, Peter said “definitely by the end of the year.” I got the impression that more hands helping would speed that up.

Also, it doesn’t define any kind of common query language; it just passes them along to the back end. So you do have to know what you’re really talking to.

Peter said he was hopefully coming to the next NYCPUG meeting that we’re hosting here at TOPP on July 15.

by slinkp at June 19, 2008 11:16 PM

openplansdev tags

Law 2.0 - Anil Makhijani

Drupal, Drupal, Drupal

Last night I went to a Net Squared meet-up at the Oxfam International office in New York City.   Attending the meeting were representatives from a number of non-profits around the world (change.org, TheElders.org etc).  In retrospect, there was one theme that tied all of these organizations together: Drupal.  Joshua Wiese, the Online Director for TheElders.org, mentioned how his organization has needs for online mapping software.  When I approached him after the meeting and talked to him about Open Layers, the first question he asked me was if there was a Drupal plugin for it.  After the meeting I also had a long conversation with a board member from Community Mediation Services (CMS).  CMS is looking for volunteers to help revive their website.  What platform are they planning to create this website on?  Surprise, surprise: Drupal.

After talking to a few co-workers about Drupal earlier this week, I am pretty sure I know why it is the non-profit framework of choice.  Drupal is trivial to set up and simple to configure, and has an enormous developer community.  Sure PHP takes its shots, but people seem to get a lot of things done with it.   Since I am a member of the non-profit tech community, many of my peers are using Drupal, and thus I think it is worth my while to learn some basics.  As I experiment more with the framework, I will be sure to blog about my experiences.

by Anil Makhijani at June 19, 2008 07:20 AM

June 18, 2008

Chris Holmes

Social Business thoughts


A month or two ago I came across an amazing piece by Muhammad Yunus, where he introduces the ‘Social Business’. I do hope the meme builds momentum, and I’m hoping my ‘dot-org‘ concept can grow to be the technological sector of the social business world. It’s a great narrowing of the term ‘Social Enterprise‘, which I feel has almost been too broadly adopted so as to become less meaningful. I’m looking for a more narrow focus – entities that are fundamentally both in service of a mission and operated to draw income from the market, not from charity. He articulates this better than I’ve seen before, with his full cost recovery social business - leaving behind the dependency of a charity, one enters the business world with ‘limitless possibilities’.

I especially liked his points about a Social Stock Market, that it needs to be a set of separate concepts, with different measures and media outlets. But I find his ideas on how to get there a bit weak – having a design competition that rewards the best ones with funding. He also suggests that someone soon could just hatch a Social Stock Market, and a little research found that Rockefeller is already moving to fund one. But I’m fearful that it’s too much too soon, and that a bunch of not so great social enterprises will give it a bad name.

Though maybe it would just be a sector of the Social Stock Market over time, one thing I’ve been thinking about, especially as OpenGeo progresses towards full cost recovery, is making it so money generated by social businesses is viral. The current thought with OpenGeo is that relatively soon we’ll spin it off into its own company, fully owned by the foundation (TOPP), just like the Mozilla Corporation. Most profit will then go back to TOPP (with a portion towards employee profit sharing), which is a not for profit and by definition has to reinvest the money back in service of its mission. We’ve been thinking about what ‘outside investment’ might look like, and I’m pretty sure we’d want it to operate like the money our funder is putting in now – all returns must go back in service of our mission. But I’m thinking that outside investors could then have control over which TOPP initiatives their returns could go towards, and could choose to direct a new project, leveraging a team of programmers and designers from TOPP. If there was a social stock market, however, it would make perfect sense that their returns could go to other ventures that also guarantee to put their returns back on the alternate market.

Thus money would operate like source code with the GPL – it only helps those who agree to the same set of principles, towards a commons that can be used by others with the same values. I like this because I think it cuts a nice middle ground between non-profit charity giving and current social enterprise investment, which has little way of knowing how much investors actually care about the profit bottom line. I suppose the money could also go back into non-profits that are more geared towards charity. This would allow foundations to make their endowments work for good, instead of just having the capital in traditional investments. But I’d hope that instead of just returning to foundation endowments it bootstraps social venture capital firms and incubators and more capital in service of social businesses. And thus the social stock market (or at least a sector thereof) becomes a real alternative with teeth, where success breeds further capital for more success, instead of just a nice idea.

I have not read any wider literature on these things, so this may be very naive, or an idea already tried, but it sounds potentially cool to me, and it’s really just trying to extend the proto-model we have at TOPP. But I hope to read more on where things are at with social enterprise and where they can move forward, as we’ve mostly been operating in a vacuum.

by cholmes at June 18, 2008 06:12 AM

June 17, 2008

OpenCore Blog Comments

Comment on XML and ZPT by slinkp

I have nothing against the builder approach. But for XHTML, I think you typically do want to “see the whole structure”.

by slinkp at June 17, 2008 09:05 PM

Comment on XML and ZPT by ianb

I personally prefer to use a builder for XML, as I seldom feel a need for something template-like (i.e., see the whole XML structure with substitution markers), and a builder makes the construction much safer and more convenient. Something like the ElementTree builder: http://effbot.org/zone/element-builder.htm
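For comparison, here is what building XML programmatically looks like with the standard library's xml.etree.ElementTree. This uses Element and SubElement rather than the E builder from the linked page, and the feed structure is invented for the example.

```python
import xml.etree.ElementTree as ET

def build_feed(title, entries):
    """Build a small XML document; text and attributes are escaped for us."""
    feed = ET.Element("feed")
    ET.SubElement(feed, "title").text = title
    for href, text in entries:
        entry = ET.SubElement(feed, "entry", href=href)
        entry.text = text
    return ET.tostring(feed, encoding="unicode")

xml_text = build_feed("A & B", [("http://example.org/1", "First <post>")])
print(xml_text)
```

The safety argument is visible here: the ampersand and angle brackets in the data cannot break the markup, because they never pass through string substitution.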

by ianb at June 17, 2008 08:47 PM

June 16, 2008

openplansdev tags

[from ianb] pyx Spawning 0.1 Released

Yet another HTTP WSGI server. Should be fast, but more interesting is that it manages worker processes with code monitoring etc. (like paster --reload, but probably more efficient and usable in production).

by ianb at June 16, 2008 08:52 PM

June 13, 2008

openplansdev tags

[from whitmo] Eventlet 0.5 Released

New release of the Python concurrency lib spawned at Linden Lab, originally based on greenlets (now supporting libevent). Includes a new WSGI server.

by whitmo at June 13, 2008 07:08 AM

June 12, 2008

openplansdev tags

[from scrollie] @ Future of Journalism: Adrian Holovaty's vision for data-friendly journalists | PDA: The Digital Content Blog | guardian.co.uk

It's not so much that what Adrian Holovaty describes is very radical, it's more that it highlights a few engrained cultural prejudices and perhaps a little shortsightedness that have stopped news organisations exploring the 'raw news' potential of data.

by scrollie at June 12, 2008 07:30 AM