Posted 13 days ago
I just released RelStorage 1.4.0b1. New features:
More documentation.
Support for history-free storage on PostgreSQL, MySQL, and Oracle. This reduces the need to pack and makes RelStorage more appropriate for session storage.
Speed. New tests prompted several optimizations that reduced the effect of network latency in both read and write operations. Memcached support is now integrated in a much better way.
Support for asynchronous database replication. Previous versions of RelStorage worked with MySQL replication, but did not keep ZODB caches in sync when failing over to a slave that was slightly out of date.
The Oracle adapter now uses PL/SQL for speed and lock timeouts. Lock timeouts are important for preventing cluster lockup.
Moved the speed test script into a separate package named zodbshootout, making it easier for developers and administrators to run comparative performance tests.
The adapter code is more modular, making it easier to support new kinds of databases and database adapter modules.
The zodbshootout script tells me this release of RelStorage is faster than ever. It reports objects read or written per second, so unlike the previous charts I’ve made, bigger is now better. Here are the results:
PostgreSQL now beats MySQL in some of the tests. Oracle (not on this chart) is now looking pretty good too.
The new features led to far more automated tests. My private Buildbot, which tests RelStorage with several combinations of Python, ZODB, and operating systems (in virtual private servers), now takes 2 hours to run all the tests. Maybe I need to upgrade that server or investigate the possibility of making Buildbot launch an Amazon EC2 instance.
The previous release was 1.3.0b1, which added ZODB blob support. Several customers asked for new features right after I released 1.3.0b1, so I decided to jump to version 1.4.0b1 rather than finalize the 1.3 series. The 1.2 series has had more extensive testing, so use that for a while if you have trouble with 1.4.0b1.
This new release should be particularly interesting for Plone users, since Plone is always hungry for faster infrastructure.
Posted 13 days ago
I really like using GenericSetup to build Plone instances.
But I was not able to do some things with it, like setting up content (folders for structure, documents for some example pages, …), keywords and portlets.
Note: The default import structure step of Plone doesn’t do enough for me, and I find it hard to use. So I have contributed to some third-party modules:
CSVReplicata
csvreplicata first gives you a way to import and export content in the CSV file format, so you can import/export AT content types (and more) with it. I have added a csvimportStep so that it can be used with GenericSetup:
def importcsvStep(context):
    """This step gives you a way to import content into your website with
    csvreplicata.

    How it works:
    read the replicata.cfg file to load the configuration
    read replicata.csv and call csvreplicata with the config.

    Config:
    [replicable_types]
    #List schemata for each types you want to import
    #Folder : default, categorization, dates, ownership, settings
    #News Item : default
    Document: default
    [settings]
    #list here every csv settings that will be given to replicator.importcsv
    encoding:utf-8
    delimiter:,
    stringdelimiter:"
    datetimeformat:%d/%m/%Y
    wf_transition = publish
    conflict_winner:SERVER
    """
So you just have to drop the replicata.csv and replicata.cfg files into your profile to be able to import content into your Plone site. Another way of doing it is to create the content with Plone, then export it and save the file in your profile.
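To give an idea of what reading such a configuration involves, here is a minimal sketch that parses a replicata.cfg-style file with Python’s standard configparser module. It is not csvreplicata’s own code: the function name parse_replicata_cfg and the sample text are assumptions made up for this illustration.

```python
import configparser

# Sample configuration, modelled on the docstring above (an assumption,
# not a file shipped with csvreplicata).
SAMPLE_CFG = """\
[replicable_types]
# List the schemata to import for each content type
Document: default

[settings]
encoding:utf-8
delimiter:,
stringdelimiter:"
datetimeformat:%d/%m/%Y
wf_transition = publish
conflict_winner:SERVER
"""


def parse_replicata_cfg(text):
    """Return (types, settings) parsed from a replicata.cfg-style string."""
    # interpolation=None keeps the '%' signs of datetimeformat untouched.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep option names case-sensitive so "Document" is not lowercased.
    parser.optionxform = str
    parser.read_string(text)
    types = {
        name: [schema.strip() for schema in value.split(",")]
        for name, value in parser.items("replicable_types")
    }
    settings = dict(parser.items("settings"))
    return types, settings


types, settings = parse_replicata_cfg(SAMPLE_CFG)
print(types)
print(settings["datetimeformat"])
```

Both ":" and "=" are accepted as separators by configparser, which is why the mixed style of the sample works.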
PloneKeywordManager
I have done the same with PloneKeywordManager. There is a new import step named ‘keywords’. It reads a ‘keywords.txt’ file in the current profile and creates a document called ‘keywords’. The file must have one keyword per line. It is simple and it works.
Note: To use this you currently need the svn collective trunk versions, but they will be released soon.
Portlet ??
Now I’m trying to do the same thing with portlets, but I have no idea how to do it. If anyone has ideas about this, please mail me; I would be glad to implement it!
Posted 13 days ago
Yesterday morning I came in to work to find that my (previously fine) plone 3 buildout kept failing to rebuild. After much debugging and googling, I tracked it down to the fact that over the weekend a new alpha version of plone.recipe.zope2instance was put up on pypi (4.0a1). Buildout merrily downloaded the new version which, by its own admission, introduces changes which can break plone 3 (see the changelog):
http://pypi.python.org/pypi/plone.recipe.zope2instance/#changelog
Now, you could argue this is my fault for not pinning my versions. In my defence, I had pinned the majority of versions but had left out major recipes like plone.recipe.zope2instance, assuming they were safe.
The upside to this issue was my discovery of buildout's 'prefer-final' option, which causes it to pick older, final releases over newer, alpha ones. You can add this option to the [buildout] section of your buildout file as follows:
[buildout]
parts = zope2
        instance
...
prefer-final = true
...
According to the buildout user guide, this option will default to 'true' in future versions of zc.buildout:
http://pypi.python.org/pypi/zc.buildout#preferring-final-releases
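To illustrate the idea behind 'prefer-final', here is a small Python sketch of the selection rule as I understand it: final releases win whenever at least one exists. This is not buildout's actual implementation. The helper names and the version list are made up for the example (only 4.0a1 comes from the story above), and the version parsing is deliberately crude.

```python
import re


def is_final(version):
    """Treat a version as final unless it ends in an alpha, beta, rc or dev marker."""
    return re.search(r"(a|b|rc|dev)\d*$", version) is None


def release_tuple(version):
    """Return the leading numeric part of a version, e.g. '4.0a1' -> (4, 0)."""
    numeric = re.match(r"\d+(\.\d+)*", version).group(0)
    return tuple(int(part) for part in numeric.split("."))


def pick_version(available, prefer_final=True):
    """Pick the newest version, preferring final releases when asked to."""
    candidates = list(available)
    if prefer_final:
        finals = [v for v in candidates if is_final(v)]
        if finals:  # fall back to pre-releases only if nothing final exists
            candidates = finals
    return max(candidates, key=release_tuple)


# Hypothetical list of releases found on the package index.
available = ["3.6", "3.7", "4.0a1"]
print(pick_version(available))                      # newest final release
print(pick_version(available, prefer_final=False))  # newest release of any kind
```

Pinning every version explicitly is still the safer route; the option only protects you from pre-releases.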
Posted 13 days ago
We just spent two hours chasing down a bug in our code. Internally, we use
subversion as we have to manage xml content that gets edited by multiple
people and that we want to manage multiple versions of. So subversion matches
quite well.
At some point in time, it was decided that py.path's svn wrapper would be handy. For
some things, it really is.
But watch out: there's an internal well-hidden cache for storing svn info
for up to 20 seconds. This includes revision number information. So in our
doctest we modify something, commit it (using py.path), do an "svn up" (using
py.path) and... the revision number is still stuck at the old value.
What? Debugging time! Checking the directory: yeah, everything is updated
just fine. So where's the problem? This is where the power of open source
comes in: you can just look at the original source code and discover where the
problem is. Ah, some information is stored for performance reasons in a
module-level dictionary that functions as a singleton.
In our code (luckily only in one place) we now solve it by clearing the
dictionary that is used as a cache:
from py.__.path.svn import cache
# cache is a dictionary that is used as a singleton

def some_method():
    ...
    cache.clear()
    ...
That py.__.path.svn looks like madness. Double underscores in an import?
It is apparently valid. Probably an "inventive" way to keep stuff private.
I'd rather not have such inventive dirty rotten maggot-infested flea-ridden
junk in my code, however.
Update: a less intrusive fix is to pass usecache=False to the
info() call. You need to do that everywhere. Luckily we already make a
subclass of the related py.path class to override some things, so a custom
info() method doesn't hurt that much.
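To make the failure mode concrete, here is a minimal sketch of a module-level
cache with a 20 second lifetime. It is not py.path's actual code: the names
_cache, _repository, info() and commit() are made up for the illustration, and
the "repository" is just a dictionary standing in for the svn server.

```python
import time

CACHE_SECONDS = 20
_cache = {}                    # module-level dictionary shared by every caller
_repository = {"trunk": 41}    # stand-in for the real svn server


def commit(path):
    """Pretend to commit: the revision number on the 'server' goes up."""
    _repository[path] += 1


def info(path, usecache=True):
    """Return the revision for path, reusing cached answers for up to 20 seconds."""
    now = time.time()
    if usecache and path in _cache:
        stamp, revision = _cache[path]
        if now - stamp < CACHE_SECONDS:
            return revision    # possibly stale
    revision = _repository[path]
    _cache[path] = (now, revision)
    return revision


first = info("trunk")                   # fills the cache
commit("trunk")
stale = info("trunk")                   # served from the cache: old revision
fresh = info("trunk", usecache=False)   # bypasses the cache: new revision

commit("trunk")
_cache.clear()                          # the blunt fix: empty the shared dictionary
after_clear = info("trunk")
print(first, stale, fresh, after_clear)
```

Both fixes work in the sketch, but clearing the dictionary affects every caller
in the process, while usecache=False only affects the call that asks for it.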
Posted 13 days ago
Gmail's search function works very well.
Subversion mails me the svn diff for all the checkins.
I cannot remember where I added a certain helper function but I'm sure I did
it somewhere. I might remember a term I put in the commit message.
Combination: just search in gmail. Handy when it is on a branch you haven't
checked out locally, for instance.
Of course I'll stick to using trusty old grep in most cases, but this is a
handy addition to my arsenal. And the results are better than what I'm
getting out of trac.
Posted 13 days ago
When to use regular expressions and "abort".
Posted 14 days ago
(You are encouraged to read this article with its formatting and typography intact, instead of in this RSS reader)
Introduction & Rationale
What if there was a backwards compatible way to transfer all of the resources that are used on every single page in your site (CSS, JS, images, anything else) in a single HTTP request at the start of the first visit to the page? This is what Resource Package support in browsers will let you do.
When it comes to browser performance, it’s widely known that a lot of the time is spent waiting for HTTP requests. You are probably familiar with the issue; a well-known optimization technique is to reduce the number of HTTP requests that are done for a given web site, since browsers only do 2–6 requests in parallel. This is why techniques like image spriting exist.
There are problems with image spriting, though. In addition to potentially severe memory penalties, sprites obfuscate the code (“What is this icon at 704px, exactly?”), and every time you add a new icon, you have to update the sprite file, which adds to the maintenance burden.
Some images can’t be sprited (think about YouTube, which easily serves up 40 JPEG thumbnails on a given page), and there are also other resources like JavaScript & CSS, which, while possible to combine, at the very least need one file each. You can see how this quickly saturates the available parallel HTTP pipelines.
Even if bandwidth is getting better, and the internet is getting faster, ping times are actually getting worse in a lot of cases. With mobile internet browsing, and to some extent US domestic cable internet and DSL, the round-trip time for a single request can be slow, even if it downloads relatively fast once the transfer starts.
While there are lots of workarounds to solve this class of problems, we suggest a standard approach that all browser makers can easily implement, and that is backwards compatible with browsers that do not support it. We also want a solution that works for all types of resources, not only image bitmaps, and one that doesn’t require any new tools, file types or protocols.
Goals
This proposal has the following goals:
Make it possible to serve all the resources (images, stylesheets, javascript) required by a page in a single HTTP request, freeing up the other parallel requests to fetch resources that are page-specific.
Be as simple to implement as possible, so anyone with a passing familiarity with HTML should be able to perform the optimization.
Be entirely transparent to browsers that do not support it.
Avoid retransmission of existing resources.
Use existing tools that are widely used on all platforms.
Support the “80% use case” over adding a lot of complexity to the spec.
Non-goals
Some explicit non-goals of this proposal:
Invent new file formats
Invent new compression formats
Implementation
While Zip files do not have the most elegant or efficient packing format out there, they have the following very desirable traits:
Easily available reference implementations.
Can be unpacked even in a partial state, which means that we can stream the file and put CSS and JavaScript first in the archive, and they will be unpacked and made available before the entire file has been downloaded.
Excellent toolchain support: zip/unzip is available on all major platforms, so it’s easy for web developers to use.
We propose this markup to signal a zipped resource package:
<link rel="resource-package"
type="application/zip"
href="site-resources.zip" />
The default MIME type for a resource package is application/zip, and you can omit it in documents where it is valid, like in HTML5, where an equivalent would be:
<link rel="resource-package"
href="site-resources.zip" />
This will tell the browser to download this file first, and use the resources contained in the file instead of the referenced images, style sheets and javascript files — or for that matter, any other file. Browsers should prefer the files in the resource package, and do individual requests for images that are not contained in the package.
A given browser will probably block downloading any resources until the lists of files that are available in resource packages have been accounted for — or there may be a way to do opportunistic requests or similar, we leave this up to the browser vendor unless there’s a compelling reason to specify how this should work.
Older browsers that do not support resource packages will simply ignore this tag, and fetch the files normally, with one HTTP request for each.
Path handling
Paths will be rendered relative to where the resource package is located, so you can supply additional directories inside the Zip to mimic existing site structure.
The resource package is referenced as follows in a page somewhere on the site www.example.com:
<link rel="resource-package"
type="application/zip"
href="/static/site-resources.zip" />
The Zip file has this internal structure:
manifest.txt
javascript/jquery.js
css/reset.css
css/grid.css
css/main.css
images/save.png
images/info.png
In this example, the resolved path to the main.css file would be http://www.example.com/static/css/main.css. Notice how the path inside the zip file is added to the path where the actual zip file is located.
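The path rule above is ordinary relative URL resolution, so it can be sketched with Python’s standard urljoin. The helper name resolve_in_package and the page URL are assumptions made for this example; only the package location and the member path come from the text.

```python
from urllib.parse import urljoin


def resolve_in_package(page_url, package_href, member_path):
    """Resolve a file inside a resource package to the URL it stands in for."""
    # First resolve the href of the <link> tag against the page URL,
    # then resolve the path inside the zip against the package URL.
    package_url = urljoin(page_url, package_href)
    return urljoin(package_url, member_path)


resolved = resolve_in_package(
    "http://www.example.com/some/page.html",  # hypothetical page
    "/static/site-resources.zip",
    "css/main.css",
)
print(resolved)
```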
The manifest file
To give the browser the ability to know up front what files are in the zip file without reading the entire file first, we support an optional manifest file that can contain this information. This file can be supplied as a separate file (useful if combining with Offline Resources), or as the first file in the zip file itself.
Example manifest.txt file:
javascript/jquery.js
styles/reset.css
styles/grid.css
styles/main.css
images/save.png
images/info.png
This file must be the first file in the archive.
This file must be named manifest.txt when supplied as part of the archive.
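As a rough sketch of how a package following these two rules could be produced, the snippet below uses Python’s standard zipfile module to write manifest.txt first, followed by the CSS and JavaScript files and then everything else. The function name build_resource_package, the sample file contents and the choice to store PNG files uncompressed are assumptions for illustration, not part of the proposal.

```python
import io
import zipfile


def build_resource_package(files):
    """Build a zip (as bytes) from a dict mapping archive paths to bytes."""
    # CSS and JavaScript go first so they can be used while the rest streams in.
    ordered = sorted(
        files, key=lambda name: (not name.endswith((".css", ".js")), name)
    )
    manifest = "\n".join(ordered) + "\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # The manifest must be the first entry and must be named manifest.txt.
        package.writestr("manifest.txt", manifest, compress_type=zipfile.ZIP_DEFLATED)
        for name in ordered:
            # Assumption: PNG data is already compressed, so it is stored as-is.
            kind = zipfile.ZIP_STORED if name.endswith(".png") else zipfile.ZIP_DEFLATED
            package.writestr(name, files[name], compress_type=kind)
    return buffer.getvalue()


package_bytes = build_resource_package({
    "images/save.png": b"\x89PNG placeholder bytes",
    "styles/main.css": b"body { margin: 0; }",
    "javascript/jquery.js": b"/* placeholder */",
})
names = zipfile.ZipFile(io.BytesIO(package_bytes)).namelist()
print(names)
```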
If this simple format looks familiar, that’s not a coincidence. Initially, we were looking at using either an XML or JSON format to specify this, but we believe it’s easier to add a couple of new abilities to the offline resource specification instead. When using resource packages with Offline Resources (which are also part of the HTML5 spec), we’d like it to be easy to extend the rules, so the offline manifest with resource package support could look like this:
CACHE: resources=/static/siteresources.zip
javascript/jquery.js
styles/reset.css
styles/grid.css
styles/main.css
images/save.png
images/info.png
# The above section lists the files in resource.zip. To start a new section, do:
CACHE:
images/outside-package.png
The only thing we’d need to add to the HTML5 offline spec is that it should ignore anything on the same line after CACHE: if it doesn’t know how to handle it. This means that you could potentially put the resource package definitions in your Offline Resources manifest — we would also support doing it the other way around, and put the Offline Resources manifest inside the resource package.
This isn’t a requirement for implementing the initial version of resource packages, however — but could be an easy way to add support for it to the offline resource specification. If there’s a better way to do it, let me know.
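To show that the extended manifest stays easy to process, here is a minimal Python sketch of a parser for the syntax shown above. It only covers what the example uses (CACHE: sections, an optional resources= value, and # comments); the function name parse_manifest and the returned structure are assumptions, and it is not a full Offline Resources manifest parser.

```python
def parse_manifest(text):
    """Split a manifest into CACHE: sections with an optional resource package."""
    sections = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("CACHE:"):
            rest = line[len("CACHE:"):].strip()
            package = None
            if rest.startswith("resources="):
                package = rest[len("resources="):]
            current = {"package": package, "files": []}
            sections.append(current)
        elif current is not None:
            current["files"].append(line)
    return sections


SAMPLE = """\
CACHE: resources=/static/siteresources.zip
javascript/jquery.js
styles/reset.css
# a comment line
CACHE:
images/outside-package.png
"""

sections = parse_manifest(SAMPLE)
for section in sections:
    print(section["package"], section["files"])
```

A parser that does not know about resource packages would simply drop the text after CACHE: and treat the listed files as regular cache entries.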
Fallback
There should be no compatibility issues with old browsers, as they will just load the individual files instead of the zip file.
Browsers that don’t implement this will seem slow in comparison to other browsers. Luckily, it should be a simple addition to any of the modern browsers.
Two simple examples
It’s not hard to see where Resource Package could be useful in existing sites, but two main categories would be:
Supply the core layout & functionality of a site
Typically, you would ship over all the CSS, JS and images that are used on every page in the site. These could be cached quite aggressively, and even use ETags to invalidate the zip file when needed.
The thumbnail search result case
Consider a typical YouTube search results page. It contains 20-40 thumbnails of videos, and there’s no easy way to add all these images into an image sprite, since the long tail of search results would vary a lot. Resource Packages would let you build a zip of search result thumbnails on the fly, and ship them all over in one HTTP request. It would require some CPU power, but would be much faster for the end-user. This wouldn’t have to be cached, or could be cached on a per-search basis.
Other approaches
There are several other approaches that could solve parts of this problem, but
HTTP pipelining
This is a more aggressive way of utilizing the HTTP keep-alive mode, but is not implemented correctly by all web servers. Proxies have a hard time with it, some browsers also do, so it’s not really working unless you want to be aggressive and/or whitelist/blacklist certain servers.
Multipart MIME
Hard for integrators, requires special packing that isn’t trivial to do, and has poor browser support.
JAR files
No reasonable fallback mode, as the file name is embedded in the href/src link, and a browser that doesn’t support it just won’t render it.
SPDY
While this effort from Google aims to make everything faster, it is largely orthogonal to what we’re trying to do with Resource Package. It also requires you to retrofit both web browsers and web servers to make it work, which means it will take quite a while before this will be in common use. Resource Packages work without any changes to the web server software, and will work as soon as any browser supports it — with no adverse effects to the browsers that don’t.
Additional notes
The Zip format doesn’t have MIME type support, so this will have to be solved by the browser based on filename extensions or other heuristics. We don’t believe this to be a problem, since browsers already have to do this.
All the resources in the package will have the same headers (expiry, ETags, etc.) as the zip file itself. If you need different expiry dates or other caching settings, you should specify multiple resource files with different cache headers.
You can specify a charset in the resource package definition. If unspecified, it is assumed that any non-binary files inside are UTF-8.
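The extension-based type detection mentioned in the first note can be sketched with Python’s standard mimetypes module. This only illustrates the idea; browsers use their own tables and sniffing rules, and the helper name guess_member_type and the fallback type are assumptions.

```python
import mimetypes


def guess_member_type(member_path, default="application/octet-stream"):
    """Guess the MIME type of a file inside the package from its extension."""
    mime_type, _encoding = mimetypes.guess_type(member_path)
    return mime_type or default


css_type = guess_member_type("css/main.css")
png_type = guess_member_type("images/save.png")
unknown_type = guess_member_type("data/blob.unknownext")
print(css_type, png_type, unknown_type)
```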
Next steps
We have sent this out to the major browser vendors for feedback, and we will be implementing this in the next release of Firefox (which tentatively has the version number 3.7, but this may change).
FAQ
Does zipping up multiple optimized PNGs or other files work with zip? Can it potentially increase file size or lead to a high unpacking CPU overhead?
Zip automatically chooses the best of deflation or no compression. Images will usually not be compressed, since they already are; text files like CSS/JS will be. In general, CPU impact from unzipping is negligible, even on slow devices.
How does this affect mobile devices, which have limited CPUs?
More realistic concerns are cacheability and bandwidth, as well as the ping time on mobile networks, and memory. A lot of mobile browsers only keep things in the browser cache at all if the individual file is something like 20kB or less. For returning visitors, you suddenly need to download one large file again, instead of having multiple small files locally.
In general, mobile browsers clear their caches quite aggressively, although with the resource package spec, one would hope that they would implement more optimal handling and prioritize caching these, since they are more likely to be valuable for browsing performance than another random image/CSS/JS file in a site.
How would Resource Packages work with CDNs?
There would be no special handling; these mirrors would just carry the resource package file like any other file they are supplying.
Acknowledgments
Mike Solomon from YouTube for encouraging me to propose a solution to this issue.
David Baron & Elika Etemad from Mozilla for comments on the implementation feasibility, and for helping identify prior art.
Vladimir Vukićević & Jonas Sicking from Mozilla for help with adapting the Offline Resources standard to handle Resource Packages.
Dion Almaer & Ben Galbraith from Palm, Steve Souders & Alex Russell from Google & the Chrome team for feedback on the proposal from an implementer’s perspective.
Improvements & feedback
If you have any suggestions on how to improve this proposal, send me an email at limi@mozilla.com or even better, comment in the open thread over at Mozilla’s dev.platform forum. It has been filed as bug #529208 in Bugzilla for those of you that want to subscribe to progress updates on this.
Proposal State: Draft
Revision 4: Nov 16th, 2009 — first published & widely circulated version, added Offline Resources support
Revision 3: Nov 10th, 2009
Revision 2: Sep 1st, 2009
Revision 1: Jun 15th, 2009
Posted 14 days ago
Over the last few months I have been maintaining a branch of TinyMCE that can be very interesting for Italian users of Plone (but not only...)
Posted 15 days ago
The latest releases of mr.developer (1.2 and 1.3) reduce the number of surprises.
The packages from auto-checkout are now automatically added and removed from the list of development packages when you switch buildout configurations. Before, you had to run develop reset and rerun buildout for that to work. If you have an existing checkout, you may want to reset it, so that this change is picked up.
The last used buildout configuration is now read directly. That means if you change source declarations in your buildout configuration, then you don’t have to rerun buildout anymore for mr.developer to pick up those changes. You still have to run buildout after changing the develop status of a package, but that’s the same as with a plain buildout without mr.developer.
Posted 16 days ago
Our experiences with code forks in the Zope world and customer projects
Copyright © 2009 Geeknet, Inc., All Rights Reserved.