Committed to Code

The Apache HTTP Server Project is a collaborative software development effort aimed at creating a robust, commercial-grade, feature-rich, and freely-available source code implementation of an HTTP (Web) server. The project is jointly managed by a group of volunteers located around the world, using the Internet and the Web to communicate, plan, and develop the server and its related documentation. This project is part of the Apache Software Foundation. In addition, hundreds of users have contributed ideas, code, and documentation to the project.

Code Analysis


Recent Highlights

Large commit — Rebuild

More than 1000 lines of source code were added or removed in this commit.

In commit /p/apache/commits/177746611 by Rich Bowen (Using name ‘rbowen’) on 2012-05-11 (11 days ago)

Large commit — update transformation

More than 1000 lines of source code were added or removed in this commit.

In commit /p/apache/commits/177457387 by nd on 2012-05-08 (13 days ago)

Large commit — xforms

More than 1000 lines of source code were added or removed in this commit.

In commit /p/apache/commits/177457368 by Daniel Gruno (Using name ‘humbedooh’) on 2012-05-08 (14 days ago)

Large commit — xforms

More than 1000 lines of source code were added or removed in this commit.

In commit /p/apache/commits/177457363 by Daniel Gruno (Using name ‘humbedooh’) on 2012-05-08 (14 days ago)

Large commit — xforms

More than 1000 lines of source code were added or removed in this commit.

In commit /p/apache/commits/177457359 by Daniel Gruno (Using name ‘humbedooh’) on 2012-05-08 (14 days ago)


News

Isabel Drost: GeeCon 2012 - part 1

Devoxx, Java Posse, Qcon, Goto Con, an uncountable number of local Java User Groups – aren’t there enough conferences on just Java, that weird programming language that “makes developers stupid by letting them type too much boiler plate” (Keith Braithwaite)? I spent Thursday and Friday last week in Poznan at a conference called GeeCon – their main focus is on anything Java, including TDD, Agile and testability. It’s all community organised – switching between Poznan and Krakow on a yearly basis, backed by two corresponding Java User Groups with a clear focus on good speakers and interesting content. Really well done, though I wish they could have fit more talks into each of these days: five tracks in parallel left one with just around 4 regular talks + keynotes each day. That does make for a very human start and end time – but it feels like there’s so much going on in parallel that most likely you miss some of the particularly interesting content. Looking forward to the videos!

One note: If you are ever invited as a speaker to GeeCon: Do accept! It’s really well organised, with an incredibly friendly atmosphere and a really tasty speaker’s dinner. One thing that caught me by surprise this morning: My room was all paid for even though I stayed longer and had offered to cover the additional nights myself - Thanks guys, you rock!

Watch this space for more details on the talks in the coming days.


Ceki Gulcu: Dealing with ignorable errors

In real-world applications assembled out of heterogeneous parts or interacting with systems beyond your control, error conditions are bound to occur. Oftentimes, some of these errors can be safely ignored. For example, the qos.ch e-commerce site is crawled by googlebot which for some unknown reason insists on visiting invalid URLs about 30-50 times a day. These invalid requests cause the wicket


Jean-Baptiste Onofré: Apache Karaf Cellar 2.2.4

Apache Karaf Cellar 2.2.4 has been released. This release is a major release, including a bunch of bug fixes and new features.

Here’s the list of key things included in this release.

Consistent behavior
Cellar is composed of two parts:

the distributed resources, a datagrid maintained by each cluster node and containing the current cluster status (for instance, the state of the bundles, features, etc.)
the cluster event, which is broadcast from one node to the others

Cluster shell commands, cluster MBeans, synchronizers (called at startup) and listeners (called when a local event is fired, such as feature installed) update the distributed resource and broadcast cluster events.

To broadcast cluster events, we use an event producer. The cluster event is consumed by a consumer, which delegates the handling of the cluster event to a handler. We have a handler for features, bundles, etc.

Now, all Cellar “producers” do:

check if the cluster event producer is ON
check if the resource is allowed, checking in the blacklist/whitelist configuration
update the distributed resources
broadcast the cluster event
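
The four producer steps above can be sketched roughly as follows. This is only an illustration in Python, not Cellar's actual Java API: `ClusterGroup`, `is_allowed` and `produce` are hypothetical names, the "broadcast" is just a list append, and the whitelist/blacklist matching is simplified to exact names plus a `*` wildcard.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterGroup:
    """Hypothetical stand-in for a cluster group's configuration and state."""
    producer_on: bool = True
    whitelist: set = field(default_factory=lambda: {"*"})
    blacklist: set = field(default_factory=set)
    distributed_resources: dict = field(default_factory=dict)
    broadcast_log: list = field(default_factory=list)


def is_allowed(group, resource):
    """Allowed means whitelisted (by name or '*') and not blacklisted."""
    whitelisted = "*" in group.whitelist or resource in group.whitelist
    return whitelisted and resource not in group.blacklist


def produce(group, resource, state):
    """Run the four steps in order; return True only if an event was broadcast."""
    if not group.producer_on:             # 1. is the cluster event producer ON?
        return False
    if not is_allowed(group, resource):   # 2. blacklist/whitelist check
        return False
    group.distributed_resources[resource] = state   # 3. update distributed resources
    group.broadcast_log.append((resource, state))   # 4. broadcast the cluster event
    return True
```

The point of the ordering is that nothing is written to the shared state unless both checks pass, so a blocked resource never appears on the other nodes.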

Only use hazelcast.xml
The org.apache.karaf.cellar.instance.cfg file has disappeared. It’s now fully replaced by the hazelcast.xml.

This fixes issues around the network configuration and allows new configuration options, especially around encryption.

OSGi event support
The cellar-event feature now provides OSGi event support in Cellar. It uses the eventadmin layer. Every local event generates a cluster event, which is broadcast to the cluster. This allows remote nodes to be kept in sync.

Better shell commands
Now, all cluster:* shell commands mimic the core Karaf commands. It means that you will find nearly the same arguments and options, and similar output.

The cluster:group-* shell commands have been improved and fixed.

A new shell command has been introduced: cluster:config-propappend to append a String to a config property.

Check everywhere
We added a bunch of checks to ensure a consistent situation on the cluster and predictable behavior.

It means that the MBeans and shell commands check if a cluster group exists, if a cluster event producer is on, if a resource is allowed on the cluster (for the given cluster group), etc.

You get clear messages informing you about the current status of your commands.

Improvement on the config layer
The Cellar config layer has been improved. It now uses a karaf.cellar.sync property to avoid infinite loops. Support for the config delete operation has been added, including the cluster:config-delete command.

Feature repositories
Previously, the handling of feature repositories was hidden from users.

Now, you have full access to the set of distributed features repositories. It means that you can see the distributed repositories for a cluster group, add a new features repository to a cluster group, and remove a features repository from a cluster group.

To do that, you have the cluster:feature-url-* shell commands.

CellarOBRMBean
Cellar provides an MBean for all parts of the cluster resources (bundles, features, config, etc).

However, if a user installed the cellar-obr feature, they got the cluster:obr-* shell commands but no corresponding MBean.

The CellarOBRMBean has been introduced and is installed with the cellar-obr feature.

Summary
Karaf Cellar 2.2.4 is really a major release, and I think it should have been named 2.3.0 given the number of bug fixes and new features: we fixed 77 Jiras in this release and performed a lot of manual tests.

The quality has been heavily improved in this release compared to the previous one.

I encourage all Cellar users to update to Karaf Cellar 2.2.4 and I hope you will be pleased with this new release.


Yoav Shapira: I'm going to travel a bit: #RTW2012

Later next week I'm leaving Boston on a round-the-world (RTW) trip.  This blog post contains some initial details, primarily for the benefit of interested family and friends.

I've wanted to do such a trip for a long time, and I have a great opportunity right now.  I left HubSpot last month, as described here earlier.  I have been talking to a variety of people about the next job, and I'm lucky to have many options.  The timing now is perfect, before I start something new that I'm passionate about, meaning I'll spend a lot of hours on it.

The details below are subject to change.  I'm extending an open invitation to friends and readers of this blog to meet me during the trip and tag along for any portions they want, however they want.  Just let me know in advance, please, since the plans described here might change without my updating this post.

One of the hardest things about this kind of trip is deciding what to do.  There are so many options, so many places to visit, so many fun ideas, that "analysis paralysis" is a real threat.

Eventually, I figured out that I really want to stick with big famous cities, see their culture, enjoy authentic local food (street food and otherwise), and have fun with local nightlife.  I will visit museums and famous attractions, but only the ones that interest me, and at my pace, not as part of a tour group.  I think of this trip as a "degustation" or "tasting menu."  If I find places I really like, I can hopefully come back to them in the future.

I am not planning to do much hiking, trekking, scuba diving, kiteboarding, or some of the other sports which I do enjoy, but which require gear.  I do plan to play local pickup basketball and soccer every place I can, in order to meet locals and get a workout in.  I also hope to volunteer in various ways at various stops, again to meet locals, and help out.

There will be separate upcoming blog posts about lodging choices, packing choices, planning, and more.  All my posts, tweets, foursquare updates, etc on this topic will have the #RTW2012 keyword / hashtag.

I'm not the only one using the above #RTW2012 hashtag.  That's intentional: a few other folks are doing their own RTW trips and sharing that hashtag, which I like.  You can filter just for me easily enough if you want.

For my tech-curious friends: yes, this is why I recently resumed using foursquare after more than two years away.

I considered a ton of places, as the world offers many things.  I'll explain some of the omissions, and the reasoning for them, below.  So here we go, with the actual current plan, approximate dates.

San Francisco ~ 5/25 - 5/30, possible side trips to Santa Cruz and wine country.

Tokyo ~ 5/31 - 6/6, side trip to Kyoto.

Hong Kong ~ 6/6 - 6/11, side trips to Kowloon and Macau.

Bangkok ~ 6/11 - 6/16, side trip to Angkor Wat (Cambodia), maybe Phuket.

Delhi ~ 6/16 - 6/21, side trips to the Taj Mahal (Agra), and "maximum city" Mumbai.

Istanbul ~ 6/21 - 6/25, probably no side trips.

Tel-Aviv ~ 6/25 - 7/14, family time but also some business.

Russia ~ 7/14 - 7/23, splitting time between Moscow and St. Petersburg.

Scandinavia ~ 7/23 - 7/30, splitting time between Stockholm, Copenhagen, and Oslo.

Reykjavik, Iceland ~ 7/30 - 8/3.

Berlin and Prague ~ 8/3 - 8/10, maybe also a side trip to Vienna or Barcelona.

Paris ~ 8/10 - 8/13.

Barcelona ~ 8/13 - 8/16, unless I go there before Paris.

London ~ 8/16 - 8/20.

Then it's back to Boston, at least for now, unless I miss the Barcelona stop earlier in August.

There are also some places that I thought about for a while, that I would really like to visit, but that don't fit the trip for one reason or another:

South Africa and safaris: would be awesome, but not in line with the theme, and require a lot of time.

Australia and New Zealand: same as South Africa above.  Would love to visit these on a dedicated trip in the future.  Also, Sydney for New Year's Eve is on my "bucket list."

The Maldives: going to go on a diving trip there with my dive center instead, before the islands sink.

Seoul, Singapore: just not interesting enough to spend a "segment" or flight leg(s) of my RTW ticket on them.

Hanoi, Saigon: beautiful and fascinating places, just couldn't make it work this trip.  If I extend my trip, it would be to spend time here.

Latin and Central America: been to a bunch of places here already, and going to my major gap (Brazil) in 2014 anyways.  Also, it's a fairly easy flight from the US, so I can (relatively) easily get there in the future.

Alright, this post is long enough for now.  I'm really excited about this trip and looking forward to it.  Upcoming blog posts will cover more of the packing, planning, hacking, logistics, and related things.

One last note: I'm not sure yet how "online" I'll be during the trip.  I will not be checking email, but I will be posting the occasional photo + caption to my foursquare account.  That gets cross-posted to Twitter and Facebook for the curious.  This is my "I'm still alive" notification for friends and family.

Long-form blog posts will be difficult, and thus rare, while I travel.  I'm not taking a laptop, on purpose, and will generally try to unplug as much as possible.


Steve Loughran: Possible Stonebraker Trajectories

In my time I've managed to say bad things about SQL in an email discussion that had the (now officially late) Jim Gray in the CC:. I also faulted the first generation of grid engines for believing the storage vendors when they said "don't worry about storage" when I was participating in a panel with the lead of the Condor project. Despite this, the ACM doesn't give me a blog page.

It's a shame, then, to see the ACM letting Stonebraker publish another of his rants, "Possible Hadoop Trajectories".

First he gets a dig in at Java developers "discovering parallel processing". Actually, they've had threads and small clusters for a long time. What Hadoop brings to the planet is the ability for lots of people to work with thousands of cores and PB of data. The query languages, Pig and Hive, mean that you don't need to learn Java either -any more than you need Objective C skills to use an iPhone.

Then he tries the "it's so inefficient the planet dies" story. At least they are planning to build their next supercomputer by a hydro plant -but it's still going to have a power budget of Megawatts, so calling Hadoop environmentally unfriendly is ironic. It's like Airbus saying their planes are greener than Boeing's -when both are flying people across the Atlantic(*).

If there's one thing that really annoys me here it's a bit from the opening paragraph:
"we applaud Hadoop for its success in this area, which we believe is due largely to the simplicity and accessibility of its environment."

Exactly.

It is simple to use because Map(x):(k,y) and Reduce(k, [y1,...,yn]): z are easy to understand and play with. You can write efficient routines without knowing relational algebra and set theory, unlike, say, RDBMSs [Codd71].

It is accessible because it runs (slowly) on your laptop and massively faster on your production cluster. It is also economically accessible because it is free to download and start to play with. You may need training or support -which Hortonworks will gladly offer -but that payment is optional. You can learn through the books and trial and error. You can learn to support your own system. Doing so does give you the duty to rummage through the code yourself, but if you contribute any fixes you have made back, even your in-house support efforts benefit the community as a whole.
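
To make the Map(x):(k,y) and Reduce(k, [y1,...,yn]): z signatures concrete, here is a minimal word-count sketch. It is plain single-process Python, not Hadoop's API: `map_fn`, `reduce_fn` and `run_job` are illustrative names, and the grouping loop stands in for the shuffle phase that a real cluster would distribute across machines.

```python
from collections import defaultdict


def map_fn(line):
    """Map(x) -> (k, y): emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_fn(key, values):
    """Reduce(k, [y1, ..., yn]) -> z: sum the counts collected for one word."""
    return sum(values)


def run_job(lines):
    """Group mapped pairs by key, then reduce each group (a toy 'shuffle')."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}
```

Calling `run_job(["to be or", "not to be"])` counts each word across both lines. The two user-written functions never need to know how many machines the grouping step runs on, which is most of the appeal.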
There is nothing wrong with simplicity and accessibility. This is why PHP is one of the key development platforms of Facebook. When Facebook wanted those PHP developers to work with Hadoop, they didn't say "go learn Java". They said "here's an SQL bridge", called Hive. For those people who already know SQL, Hive lets you work with Hadoop without having to write a line of Java. There is nothing wrong with that and it does not make sense to denounce Hadoop because someone wrote tooling to help SQL experts work with it.

That does not mean that SQL is a good language. That little fact has been forgotten since RDBMSs became widespread, when developers learned to write things like "SELECT * FROM users WHERE name='steve'". SQL is a language designed to make script injection the default operation; something like "SELECT * FROM users WHERE name=''; DROP TABLE users".
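
A small sketch of why string-built queries invite injection, using Python's standard sqlite3 module and an in-memory table. The table contents and the helper names (`make_db`, `find_unsafe`, `find_safe`) are made up for illustration. The payload here uses an OR clause rather than a second DROP TABLE statement, because sqlite3's `execute()` refuses to run more than one statement per call; drivers that accept multiple statements are the ones exposed to the DROP TABLE variant.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("steve",), ("alice",)])
    return conn


def find_unsafe(conn, name):
    # String concatenation: the caller's input becomes part of the SQL text.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_safe(conn, name):
    # Placeholder binding: the input is passed as a value, never parsed as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]
```

With the input `' OR '1'='1`, `find_unsafe` returns every row in the table, while `find_safe` treats the same string as an ordinary (non-matching) name and returns nothing.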

SQL started out as SEQUEL: "Structured English Query Language". It was written on the expectation that business people -presumably the same people that COBOL was targeted at- would sit at their shiny new IBM teletype and type in an 'ad-hoc query'. That's right: SQL was not targeted at developers, but "normal people" -and to be easy for COBOL and PL/I developers to embed. A key goal of the SQL language was to present the same capabilities, and a consistent syntax, to users of the PL/I and COBOL host languages and to ad hoc query users.[Chamberlin81].

Nowadays, the main experts in SQL are people like Facebook's PHP devs, and script hackers. Java developers run from it, hence the broad set of O/R mapping tools. Enterprise Java Beans were first; someone had a vision that people would write reusable "beans" to represent enterprise entities (User, Customer, Purchase), and that there would be some kind of market for that. Well, that died, but Hibernate and Spring keep letting Java devs write distributed database transactions without having to learn SQL. Where are Stonebraker's snide language-elitist comments then? Why no ACM article saying "ORM tools have finally brought the power of the database to Java developers"? Is it because he felt that ORM was a good idea, or that he recognizes that tools to make working with databases easier benefited him?

The harsh truth is that SQL is not a particularly good language for expressing relations and predicates. Back in 1984 the illustrious C.J. Date (as in "Introduction to Database Systems" Date) published a 47 page dot-matrix-printed critique of the language [Date84] -an article whose criticism of the difficulty of embedding SQL into PL/I is effectively the precursor to all critiques of O/R mapping. It's SQL/Language mapping, and there have been problems mixing code and SQL ever since System R first booted. A key problem is that all SQL does is read and write data from the DB, but for programs you need more than that, so you end up mixing SQL queries in that COBOL-esque syntax with the real code, either through some contrived ORM process or some hand-rolled string construction thing that at best is a maintenance task and at worst leaves your entire site's credit card records up on pastebin.

If you did want to work with databases properly, you'd need a programming language which makes relations and predicate calculus integral parts of the language: Prolog, Linq and, effectively, Erlang. Linq interests me as it is the most recent attempt, and because Dryad/Linq showed that it could do more than just database lookups.

Returning to System R, the database from which DB2 and Oracle DB are derived, [Chamberlin81] concludes with a lovely sentence:
"We feel that our experience with System R has clearly demonstrated the feasibility of applying a relational database system to a real production environment."

Which can be translated as: "even though people preferred more efficient low-level data storage techniques, hand-tuned for the specific application, pre-written in assembly language, COBOL or PL/I, the System R team -including the illustrious and now sadly absent Jim Gray- felt that making working with data easier outweighed the alternative."

That's something Stonebraker appears to have missed. The RDBMS isn't an end in itself -it's a means to an end. A tool. As is Hadoop. A tool to let you work with data at a scale and price point that the commercial RDBMSs can't play at.

Is MapReduce the meta-algorithm to solve everything? Of course not. The Stratosphere team in Berlin and the Asterix team at UC Berkeley are key leaders in the academic space -there are both ideas and code to pick up here. Then there are the real-world projects coming out of the web companies, who do have to work at a scale and price point that RDBMSs can't match: Pig, HBase, Hama, Giraph, S3; other key-value stores nearby: Cassandra, Project Voldemort. All of these worked for their organizations.

Which is why I have a quote; a slight mutation of the System R conclusion based on the experiences of all the Hadoop users:
"We feel that our experience with Hadoop has clearly demonstrated the feasibility of applying a Hadoop system to a real production environment."

For anyone interested in things like Stratosphere, the Graph Layer, what Yarn allows &c, there's a two day workshop after Berlin Buzzwords, "Beyond MapReduce" -free for all conference attendees. Stonebraker is cordially invited to attend the conference and the workshop. I'll gladly sit next to him on a panel and say things he won't agree with.

(*) This post was written on an A340-400 between SFO and LHR. I do have all the cited papers on my laptop. If you are going to argue with the RDBMS people, you need to know where they are coming from.

[Chamberlin81] D. Chamberlin et al., A History and Evaluation of System R, 1981.
[Codd71] E. F. Codd, A Database Sublanguage Founded on the Relational Calculus, 1971.
[Date84] C. J. Date, A Critique of the SQL Database Language, 1984.

