Posted
20 days
ago
PostgreSQL RPM packages for Fedora-12 released:
http://yum.pgsqlrpms.org/news-fedora12-packages-released.php
Posted
20 days
ago
by
epa...@googlemail.com
Its now possible to make web sites based on PostgreSQL by writing ONLY SQL (or PLPGSQL), HTML (plus a few special HTML comments), and some XML configuration. No PHP or C++ or Java required.
Posted
20 days
ago
No Perl content today, I'm afraid. I'm just back from my trip to Japan and wanted to post this very nice video of my talk. Unlike the versions from other conferences, this one focuses primarily on the slides, with me appearing in audio only. This makes it really easy to follow. Enjoy.
Read More »
Posted
20 days
ago
by
nor...@blogger.com (Dave Page)
I finally got my photos from the JPUG 10th Anniversary conference in Tokyo online. Most are actually from the events before and after the actual conference which involved some sightseeing, and lots of discussions of PostgreSQL on topics such as
... [More]
infrastructure issues and server virtualisation, how to be more supportive of new developers, and how to make MERGE work nicely with the PostgreSQL rules system.
After hitting the limits on my free Flickr account I figured it was time to move to Smugmug, so you can find the full set of pics over there.
Thanks again to all the JPUG folks for an exceptional conference! [Less]
Posted
21 days
ago
by
nos...@example.com (Magnus Hagander)
I've finally had the time to summarize the feedback we received from pgday.eu.
We received feedback from about 35 people, which is obviously way less than we were hoping for. Ideas for how to improve this for next time are very welcome! This
... [More]
also means that the figures we have are not very exact - but they should give a general hint about what our attendees thought.
I just sent out the individual session feedback summaries to each individual speaker. These will not be published - it's of course fine for each speaker to publish his own feedback if he wants to, but the conference organizers will not publish the detailed per-session data.
The statistics we do have show that most of our speakers did a very good job, and that the attendees were in general very happy with the sessions. We have also received a fairly large amount of comments - both to the conference and the speakers - which will help us improve specific points for next year!
I'll show a couple of graphs here with the total across all sessions and speakers. In these graphs, 5 is the highest score and 1 is the lowest.
The attendees also seemed to be very happy with our speakers, which is something I'm very happy to hear about. It's also good to see that almost nobody felt the speakers didn't know very well what they were talking about - always a worry with a conference that has so many experienced community people attending.
Actually trying to figure out which speaker is best using this data is very difficult. But here's a list of the top speakers based on speaker quality, who had more than 5 ratings on their talks. The list includes all speakers with an average score of at least 3.5. There are a lot more hovering around that line, but there has to be a cutoff somewhere... Again note that there are still not that many ratings to consider, so values are pretty unstable. I've included the standard deviation as well to make sure this is visible.
Place
Speaker
Score
Stddev
Num
1
Gavin M. Roy
4.9
0.5
18
2
Guillaume Lelarge
4.9
0.4
7
3
Robert Hodges
4.8
0.4
13
4
Magnus Hagander
4.8
0.4
20
5
Jean-Paul Argudo
4.8
0.5
8
6
Joshua D. Drake
4.6
0.7
9
7
Simon Riggs
4.6
0.6
17
8
Dimitri Fontaine
4.5
0.5
14
9
Greg Stark
4.3
0.5
8
10
Vincent Moreau
4.1
0.6
8
11
Mark Cave-Ayland
4.0
0.6
11
12
David Fetter
3.9
1.1
9
13
Gabriele Bartolini
3.7
1.0
15
14
Heikki Linnakangas
3.6
0.7
9
All of these are clearly very good numbers.
So once again, a big thanks to our speakers for their good work. And also a very big thanks to those who did fill out the session feedback forms - your input is very valuable!
Update: Yes, these graphs were made with a python script calling the Google Charts API. Does anybody know of a native python library that will generate goodlooking charts without having to call a remote web service? [Less]
Posted
23 days
ago
by
nos...@example.com (Leo Hsu and Regina Obe)
In this exercise, we'll go thru our steps for upgrading a Redhat Enterprise Linux 64-bit PostgreSQL
box from PostgreSQL 8.3 to PostgreSQL 8.4. If you don't have any kind of PostgreSQL installed on your box,
you can skip the Upgrade
... [More]
step.
Updgraing from PostgreSQL 8.* to PostgreSQL 8.4
If you are starting from scratch -- just skip this section.
If you are upgrading from PostgreSQL 8.3 to 8.4, in theory you can use PgMigrator, but in practice,
particularly with a Yum install, you are bound to run into obstacles. If you are running an older version, you must dump and restore.
Continue reading "An almost idiot's guide to Install and Upgrade to PostgreSQL 8.4 with Yum" [Less]
Posted
23 days
ago
by
in...@datanamic.com
Datanamic is pleased to announce the release of DeZign for Databases V6.0, its database design and modeling tool. This new version introduces more than 20 new features.
Posted
23 days
ago
by
nor...@blogger.com (Pavel Stěhule)
Searching and calculating Median in databases was terrible. Still median isn't ANSI SQL aggregate function. There are two commons method how to calculate median of some column. First - very old, and very slow based on self join alchemy, second - new
... [More]
- based on analytic function. Now, I will be test some newer methods on one million rows large table:
postgres=# create table milrows(a real);
CREATE TABLE
Time: 7,975 ms
postgres=# insert into milrows select random()*1000000 from generate_series(1,1000000);
INSERT 0 1000000
Time: 6863,575 ms
simple scan on this table takes 200ms
Frompostgres=# select avg(a) from milrows ;
avg
──────────────────
499515.883033113
(1 row)
Time: 200,176 ms
In 8.4 we can use analytic functions. These functions uses TupleStore - internal store feature - it allows work with very large tables - limit is free space on disc.
Analytic methods--Joe Celko's method
postgres=# SELECT avg(a)::float
FROM (SELECT a, row_number() OVER (ORDER BY a asc) AS hi,
count(*) OVER () + 1 - row_number() OVER (ORDER BY a) AS lo
FROM milrows) qs
WHERE hi IN (lo-1,lo,lo+1);
avg
───────────────
499188.546875
(1 row)
Time: 4922,678 ms
-- Andrew Gierth's method
postgres=# select avg(a)
from ( select a, row_number() over (order by a),count(*) over () from milrows ) s
where row_number between floor((count::float8-1)/2+1) and ceil((count::float8-1)/2+1)
;
avg
───────────────
499188.546875
(1 row)
Time: 5021,001 ms
-- modified Andrew's method (count(*) over () is slow)
postgres=# select avg(a)
from ( select a, row_number() over (order by a),(select count(*) from milrows) as count from milrows ) s
where row_number between floor((count::float8-1)/2+1) and ceil((count::float8-1)/2+1)
;
avg
───────────────
499188.546875
(1 row)
Time: 3931,922 ms
Array based methodsNext methods are based on using an arrays. These methods are fast, but limit for this methods is size of operation memory. For very very large tables could to take all application memory.
--Regina's method -- it's not 100% correct http://www.postgresonline.com/journal/index.php?/archives/67-Build-Median-Aggregate-Function-in-SQL.html#extended
CREATE OR REPLACE FUNCTION array_median(double precision[])
RETURNS double precision AS
$$
SELECT CASE WHEN array_upper($1,1) = 0 THEN null
ELSE asorted[ceiling(array_upper(asorted,1)/2.0)]::double precision END
FROM (SELECT ARRAY(SELECT $1[n]
FROM generate_series(1, array_upper($1, 1)) AS n
WHERE $1[n] IS NOT NULL
ORDER BY $1[n]) As asorted) As foo
$$ LANGUAGE 'sql' IMMUTABLE;
CREATE AGGREGATE median(double precision) (
SFUNC=array_append,
STYPE=double precision[],
FINALFUNC=array_median
);
postgres=# select median(a) from milrows ;
^CCancel request sent
ERROR: canceling statement due to user request -- killed 5 minutes !don't use array_append for bigger arrays (length > 10000)
postgres=#
--My method
postgres=# create or replace function median(anyarray)
returns double precision as $$
select ($1[array_upper($1,1)/2+1]::double precision + $1[(array_upper($1,1)+1) / 2]::double precision) / 2.0;
$$ language sql immutable strict;
CREATE FUNCTION
Time: 1,557 ms
Time: 2574,677 ms
postgres=# select median(array(select a from milrows where a is not null order by a));
median
───────────────
499188.546875
(1 row)
Time: 2555,342 ms
This week I added support for median aggregate to orafce package. You can download it from url http://pgfoundry.org/frs/download.php/2472/orafce-3.0.2-devel.tar.gz . Function median use some fetures 8.4 and needs 8.4 - it isn't supported on PostgreSQL 8.3 and older.
-- orafce 3.0.2 median (needs PostgreSQL 8.4 and higher)
postgres=# select median(a::float8) from milrows;
median
───────────────
499188.546875
(1 row)
Time: 687,577 ms
It's very fast - if your table has about one million rows (1000000) you can use it (for this table size takes max. 15MB RAM (for one column)). [Less]
Posted
24 days
ago
So there it is, this newer contribution of mine that I presented at PGDay is
now in debian NEW queue. pg_staging will empower you with respect to what
you do about those nightly backups (pg_dump -Fc or something).
The tool provides a
... [More]
lot of commands to either dump or restore a database. It
comes with documentation covering about it all, except for the londiste
support part, which will be there in time for 1.0.0 release. The Todo list
is getting smaller and smaller, the version you'll soon find in debian sid
is already called 0.9.
So, how do you go about using this software, and what service it implements?
it's all about deriving a staging environment from your backups
To validate backups, you want to restore them and check the database you get
from them. And your developers will want to sometime refresh the database
they're working with. And you could have both an integration environment and
a pre-live one: On the former, you develop new code atop a stable set of
data; while on the latter you test stable enough code (ready to go live) on
a set of data as near as live data as possible.
And you want to be flexible about it, so that there's not a fulltime job to
handle retoring databases each and every days, for project A integration or
project B pre-live testing, or project C accounting snapshot. Or you name
it.
And of course you want to have a single point of control of all your
databases. Let's call it the controler.
setting up pg_staging
The pg_staging setup consists of one pg_staging.ini file wherein you
describe your different target databases (those dev and prelive ones), and
of course where to get the production backups from. Currently you have to
serve the backups file in a format suitable for pg_restore (that means you
use either pg_dump -Ft or pg_dump -Fc) on an apache folder. The produced
HTML will get parsed.
So you setup the DEFAULT section with common settings, then one section per
target: the databases you want to restore. Tell pg_staging where they are
(host), etc, and it'll be able to drive them.
In order to being able to host more than a single restored dump on a staging
server, for the same database, we use pgbouncer:
pg_staging> pgbouncer some_db.dev
some_db some_db_20091029 :5432
some_db_20090717 some_db_20090717 :5432
some_db_20091029 some_db_20091029 :5432
So as explained into the pg_staging(1) man page, you have to open
non-interactive SSH connection from the controler to the hosts where the
databases will get restored. Then you have to do a minimal setup pgbouncer
on the hosts with a trust connection. It'll get used from pg_staging for
adding newly restored database and have them accessible. Then you can also
switch the new database to being the virtual some_db so that you avoid
editing any connection string on your softwares.
Also, install the pgstaging-client package on every host you target. The
client is a simple shell script that must run as root (sudo is used) in
order to replace your pgbouncer setup or manage your londiste services.
See man 5 pg_staging for available options, including schemas to filter out
either completely or just skipping data restoring in those.
pg_staging usage
Now you're all setup, you can begin to enjoy using pgstaging. Enter the
console and see what you have in there.
$ pg_staging
Welcome to pg_staging 0.9.
pg_staging> databases
...
pg_staging> restore some_db.dev
...
pg_staging> pgbouncer some_db.dev
...
pg_staging> dbsizes --all some_db.dev
...
pg_staging> psql some_db.dev
some_db_20091125=#
And as you can see in man pg_staging there are a lot of commands
already. You can for example obtain a new pg_restore catalog from a dump
file, with some schemas commented out. It will even comment out triggers
that are using a function which is defined in a filtered out schema, for
example a PGQ trigger. And much much more.
pg_staging will even allow you to dump your production databases, but
consider installing a separate instance of it on the machine serving the
backups to your local network thanks to an apache directory listing!
Roadmap to 1.0.0
What's remain to be done is testing and having PITR based restoring to work,
and adding some documentation (tutorial, which this blog post about is; and
londiste support). At this point, unless some reader here asks for a new
feature (set), I'll consider pg_staging ready for 1.0.0. After all, we're
using it about daily here :)
Consider commenting, you should be able to easily spot my private mail
address... [Less]
Posted
24 days
ago
by
pr...@datanamic.com
Datanamic is pleased to announce the release of DeZign for Databases V6.0, its database design and modeling tool.