Posted
about 5 hours
ago
by
pet...@gmx.net (Peter Eisentraut)
Like many people, we are interested in deploying solid-state drives (SSDs) for our database systems. Jignesh posted some performance test results a while ago, but as I had commented there, this test ran with the write cache on, which concerned me.
The Write Cache
Interlude: The disk write cache is the feature that causes you to lose your data when the server machine crashes or loses power. Just like the kernel may lie to the user-land application about writing stuff to disk unless the user-land application calls fsync(), the disk may lie to the kernel about writing stuff to the metal unless the write cache is turned off. (There is, as far as I know, no easy way to explicitly flush the cache. So write cache off is kind of like open_sync, if you know what that means.) As PostgreSQL pundits know, PostgreSQL does fsyncs at the right places unless you explicitly turn this off and ignore all the warning signs on the way there. By contrast, however, the write cache is on by default on consumer-grade ATA disks, including SATA disks and, as it turns out, also including "enterprise" SSD SATA devices.
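To make the first of those two layers concrete, here is a minimal Python sketch of a user-land application asking the kernel to flush its buffers with fsync(). The helper name durable_write and the file name are made up for illustration. Note that this only covers the kernel side: with the disk write cache on, the drive may still be holding the data in volatile memory after fsync() returns.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> int:
    """Write data to path and ask the kernel to flush it to the device.

    Hypothetical helper for illustration only. fsync() covers the
    kernel's buffers; it cannot guarantee that a drive with its write
    cache enabled has put the data on the medium.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, data[written:])
        os.fsync(fd)  # without this, the data may sit in the page cache
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")
        print(durable_write(target, b"x" * 8192), "bytes written and fsynced")
```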
To query the state of the write cache on a Linux system, use something like hdparm -W /dev/sda. To turn it off, use hdparm -W0 /dev/sda, to turn it back on, hdparm -W1 /dev/sda. If this command fails, you probably have a higher-grade RAID controller that does its own cache management (and doesn't tell you about it), or you might not have a disk at all. ;-) Note to self: None of this appears to be described in the PostgreSQL documentation.
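If you want to check this from a script, something like the following Python sketch might do. It shells out to the same hdparm -W command mentioned above. The function names are made up, and the parser assumes that hdparm prints a line of the form "write-caching = 1 (on)"; that output format is an assumption here and may differ between hdparm versions, so verify it on your system. Running hdparm usually requires root.

```python
import re
import subprocess
from typing import Optional


def parse_write_cache(hdparm_output: str) -> Optional[bool]:
    """Return True/False for cache on/off, or None if nothing was recognized.

    Assumes a line like ' write-caching =  1 (on)' in the output;
    this format is an assumption, not a documented guarantee.
    """
    match = re.search(r"write-caching\s*=\s*(\d+)", hdparm_output)
    if match is None:
        return None
    return match.group(1) != "0"


def query_write_cache(device: str) -> Optional[bool]:
    """Run 'hdparm -W <device>' and interpret the result.

    Returns None if the command fails, for example behind a RAID
    controller that does its own cache management.
    """
    result = subprocess.run(
        ["hdparm", "-W", device],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return parse_write_cache(result.stdout)
```

A call such as query_write_cache("/dev/sda") would then return None in exactly the situation described above, where the command fails.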
It has been mentioned to me, however, that SSDs require the write cache for write wear leveling, and turning it off may significantly reduce the lifetime of the device. I haven't seen anything authoritative on this, but it sounds unattractive. Anyone know?
The Tests
Anyway, we have now gotten our hands on an SSD ourselves and gave this a try. It's an Intel X25-E from the local electronics shop, because the standard, big-name vendor can't deliver it. The X25-E appears to be the most common "enterprise" SSD today.
I started with the sequential read and write tests that Greg Smith has described. (Presumably, an SSD's advantage over a hard disk is much larger for random access than for sequential access, so this is a good worst-case baseline.) And then some bonnie++ numbers for random seeks, which is where the SSDs should excel. So to the numbers ...
Desktop machine with a single hard disk with LVM and LUKS over it:
Write 16 GB file, write caching on: 46.3 MB/s
Write 16 GB file, write caching off: 27.5 MB/s
Read 16 GB file: 59.8 MB/s (same with write cache on and off)
Hard disk that they put into the server that we put the SSD in:
Write 16 GB file, write caching on: 49.3 MB/s
Write 16 GB file, write caching off: 14.8 MB/s
Read 16 GB file: 54.8 MB/s (same with write cache on and off)
Random seeks: 210.2/s
This is pretty standard stuff. (Yes, the file size is at least twice the RAM size.)
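For readers who want to try a measurement of this general kind, here is a rough Python sketch of a sequential write test. It is not the procedure used for the numbers in this post; the function name, the 8 kB block size, and the choice of 1 MB = 1,000,000 bytes are assumptions for illustration. The file is fsynced before the clock stops so that data still sitting in the kernel's buffers is not counted as written. For meaningful results the file should be at least twice the RAM size, as noted above; the demo at the bottom uses a tiny file only to show the call.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(path: str, total_bytes: int,
                              block_size: int = 8192) -> float:
    """Write total_bytes of zeros sequentially and return MB/s.

    Illustrative sketch only; block_size and the MB definition
    (10**6 bytes) are assumptions.
    """
    block = b"\0" * block_size
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        remaining = total_bytes
        while remaining > 0:
            # os.write returns the number of bytes actually written.
            remaining -= os.write(fd, block[:min(block_size, remaining)])
        os.fsync(fd)  # include the flush to the device in the timing
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total_bytes / 1_000_000 / elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rate = sequential_write_mb_per_s(os.path.join(tmp, "bench.dat"),
                                         16 * 1024 * 1024)
        print(f"{rate:.1f} MB/s")
```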
SSD Intel X25-E:
Write 16 GB file, write caching on: 220 MB/s
Write 16 GB file, write caching off: 114 MB/s
Read 16 GB file: 260 MB/s (same with write cache on and off)
Random seeks: 441.4/s
So I take it that sequential speed isn't a problem for SSDs. I also repeated this test with the disk half full to see if the performance would then suffer because of the write wear leveling, but I didn't see any difference in these numbers.
A 10-disk RAID 10 of the kind that we currently use:
Write 64 GB: 274 MB/s
Read 64 GB: 498 MB/s
Random seeks: 765.1/s
(This device didn't expose the write cache configuration, as explained above.)
So a good disk array still beats a single SSD. In a few weeks, we are expecting an SSD RAID setup (yes, RAID from big-name vendor, SSDs from shop down the street), and I plan to revisit this test then.
Check the approximate prices of these configurations:
plain-old hard disk: < 100 €
X25-E 64 GB: 816.90 € retail, 2-5 weeks delivery
RAID 10: 5-10k €
For production database use, you probably want at least four X25-E's in a RAID 10, to have some space and reliability. At that point you are approaching the price of the big disk array, but probably pass it in performance (to be tested later, see above). Depending on whether you more desperately need space or speed, SSDs can be cost-reasonable.
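The arithmetic behind the four-drive estimate can be spelled out in a few lines of Python. This is just a back-of-the-envelope sketch using the retail price quoted above; the function name is made up, prices are kept in cents to avoid rounding noise, and the cost of a RAID controller and chassis is not included.

```python
X25E_PRICE_CENTS = 81690  # 816.90 EUR retail, as quoted above
X25E_CAPACITY_GB = 64


def raid10_cost_and_capacity(drive_count: int, price_cents: int,
                             capacity_gb: int) -> tuple:
    """Return (total price in cents, usable GB) for a RAID 10 of equal drives.

    RAID 10 mirrors every drive, so usable space is half the raw space.
    Drives only; controller and enclosure costs are ignored.
    """
    if drive_count < 4 or drive_count % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    total_cents = drive_count * price_cents
    usable_gb = drive_count * capacity_gb // 2
    return total_cents, usable_gb


if __name__ == "__main__":
    cents, usable = raid10_cost_and_capacity(4, X25E_PRICE_CENTS,
                                             X25E_CAPACITY_GB)
    print(f"{cents / 100:.2f} EUR for {usable} GB usable")
```

With four drives that comes to a bit under 3,300 € for 128 GB of usable space, which is where the comparison with the 5-10k € disk array comes from.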
There are of course other factors to consider when comparing storage solutions, including space and energy consumption, ease of management, availability of the hardware, and reliability of the devices. It looks like it's still a tie there overall.
Next up are some pgbench tests. Thanks Greg for all the performance testing instructions.
(picture by XaYaNa CC-BY)
Posted
about 6 hours
ago
An article at High Scalability referenced this article explaining HBase on a conceptual level. It's a very good starting point for understanding the basic concept of HBase (and BigTable), and it's no more than a five-minute read.
Posted
1 day
ago
PostgreSQL 8.4.0 released.
http://www.postgresql.org/about/press/features84.html
Posted
1 day
ago
by
nor...@blogger.com (Selena Deckelmann)
Yesterday, I traveled to a Michelin (yes, the tire company!) plantation for a party thrown in honor of the new Secretary to the Ondo State Government, Dr. Aderotimi Adelola.
Michelin grows rubber trees on this sprawling estate. It took nearly 20 minutes to get from the highway to the primary school deep inside the plantation where the celebration was held. Tapped rubber trees pictured below!
I was invited to a table inside the Governor's main tent, and spent most of the time just looking around at all the government officials, and chatting with the Chairman of SITEDEC, Cyril Egunlayi.
The high point of the afternoon was Dr. Olusegun Mimiko's speech welcoming Dr. Adelola to the government. He's a charismatic speaker. The people around the perimeter pressed closer, and were attentively silent for his 10 or 15 minute speech. He emphasized education -- his hometown's slogan is "Home of Education". He also said that despite Ondo State's history of leading Nigeria in educational opportunities, the state had regressed and needed to catch up again. Mimiko speaking:
The car ride out and back to the plantation took about two hours each way. I spent much of that time talking about open source options for various IT infrastructure, where something like Google Apps might fit in for them, and passed on information I'd gotten about microwave links from a Portland WiMax provider, Stephouse Wireless. I also told Cyril about feedback regarding a replacement for Exchange. My followers on Twitter universally recommended Zimbra, and that was confirmed by at least one End Point coworker, Adam Volrath.
We also stopped by the office on our way home to check in on a new wireless repeater the engineers were installing on the tower they have out behind the SITEDEC center. We still have a few details to work out for the class arrangements.
In the evening, I enjoyed some Nigerian barbecue with Deji Agbebi. Originally from Lagos, he worked for a Canadian firm in the early 90s whose goal was to provide clean drinking water to villages in Ondo State. For various reasons, including a military coup, that business failed. Now Deji works in the US. He's a friend of Cyril's, and is here in Akure, hoping to help with the work the government is trying to complete before January.
Posted
1 day
ago
Also yesterday, and again by Peter Eisentraut, a patch by Guillaume Smet was committed:
Add log_line_prefix placeholder %e to contain the current SQL state
Author: Guillaume Smet
What exactly does it do, and what does the state look like? Let’s find out.
I defined log_line_prefix to be '%m %u@%d %p %r {%e} '.
log_min_duration_statement was set to 0.
Then I issued some queries. Logs looked [...]