Sites Rejecting Apache 2?
An anonymous reader writes "Vnunet reports that the low adoption of Apache 2 has caused its producers to advocate freezing development of the open-source Web server until makers of add-in software catch up. Almost six months after the launch of Apache 2, less than one percent of sites use it, due to a lack of suitable third-party modules." I'm not sure where they are getting the part about freezing Apache development; there is more talk about forking for 2.1 right now on the httpd mailing list. The article does have it right, though, that adoption is not going to happen until there is a reason to upgrade and the modules are in place. While the cores of both Perl and PHP are thread-safe, the third-party modules are not. This renders one of the larger reasons to use Apache 2.0, the threaded HTTP support, useless for applications using either of these application layers. It comes down to the question of whether the third-party module writers are better off supporting what is used or what is new.
Third party modules? (Score:2, Interesting)
I think the fact that it's not being adopted is more because there is no need for the new version from most sites. What they have works and is stable, so there is no reason to upgrade.
cl
Re:Third party modules? (Score:2, Informative)
Like MySQL, GD, ImageMagick, etc.
In PHP at least, they are a very important part of site writing.
Re:Third party modules? (Score:4, Informative)
Re:Third party modules? (Score:3, Informative)
There are no RedHat or Debian packages of Apache 2.0 (official as in from RedHat or Debian, and part of their stable distribution). There are a few Debian people who are packaging Apache 2.0 (namely Thom May, who is the current package bunny...err...maintainer *grin*), but last I heard they were having a horrible time getting it working, and it's still only in unstable (sid), and hasn't made it to testing (sarge).
If it gets into RedHat and Debian's stable distributions, chances are it'll make a higher percentage mark on site usage. Till then, I don't think things are going to change much.
Re:Third party modules? (Score:2, Informative)
Apache 2 is in Mandrake contribs (not really supported nor officially maintained), so if you buy the 9.0 ProSuite, it will be available. I am hearing talk from Mandrake that Apache 2 will be the default web server in Mandrake 9.1.
Re:Third party modules? (Score:4, Informative)
I agree it will get a LOT more use once the Linux and BSD distros start shipping it by default, and once PHP and mod_perl are solidified for it. The Red Hat beta includes both, so they should be about ready.
Apache Module Popularity Survey (Score:5, Informative)
--CTH
Re:Apache Module Popularity Survey (Score:2)
This would certainly outrage Apache users, but in the case of Open Source would have the secondary effect of promoting forking of the codebase.
I'll do you one better. The beauty of open source means that even a fork isn't really needed, just an official "unofficial" database of patches and/or patched packages. I could see this happening so long as the patches were for security and bugfixes only. (Not features.)
But I doubt it'll ever have to come about. If the Apache people are as smart as they usually appear, they'll wait until all but a few percent of total Apache users are switched to 2.x before they drop support for 1.x or hand it off to another group that's interested in supporting it.
Re:Third party modules? (Score:2)
Personally, I haven't upgraded either of my personal [smutcraft.net] servers [diaspora.gen.nz], because I fail to see any real benefit from doing so. Both are on separate 128Kbit links with adequate horsepower to serve pages behind them, so why mess with a new mod_perl?
I'm still waiting on PHP (Score:4, Interesting)
We'll all get to Apache 2, it just takes time to migrate.
Re:I'm still waiting on PHP (Score:2)
I'd love to migrate to Apache2.0, but until PHP works properly I can't do that.
As it is now, our company's main webserver runs Apache 1.3.26 and will continue to do so, even with the problems we're experiencing with it.
There appears to be a memory leak somewhere that makes Apache consume more and more memory until we restart it. It doesn't happen that often, but we do have a script that kills off Apache about once a month.
While looking into what was wrong I got the impression that this was a known error, but I couldn't isolate the problem.
My setup is as follows: Apache/1.3.26, PHP/4.2.1, mod_perl/1.27, mod_ssl/2.8.10 and OpenSSL/0.9.6a on a Solaris7 box.
Let's just hope Apache 2 solves my problem with this memory leak too.
Re:I'm still waiting on PHP (Score:4, Informative)
Why not just use MaxRequestsPerChild?
This way you can knock off each Apache child one by one after a given period of use without having to restart Apache completely.
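To make the idea concrete, here is a toy model of what a per-child request limit buys you. It is only a sketch: `Worker` and `serve` are made-up names, the "leak" is simulated as one unit per request, and none of this is Apache's actual code.

```python
class Worker:
    """Stand-in for one server child process (hypothetical model)."""

    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.served = 0
        self.leaked = 0  # pretend every request leaks one unit of memory

    def handle(self, request):
        self.served += 1
        self.leaked += 1

    def exhausted(self):
        return self.served >= self.max_requests


def serve(requests, max_requests_per_child):
    """Serve all requests, retiring each worker once it reaches its limit.

    Returns (number of workers used, worst leak seen in any one worker).
    """
    worker = Worker(max_requests_per_child)
    workers_used = 1
    peak_leak = 0
    for request in requests:
        worker.handle(request)
        peak_leak = max(peak_leak, worker.leaked)
        if worker.exhausted():
            # Retire the child; its leaked memory goes away with it.
            worker = Worker(max_requests_per_child)
            workers_used += 1
    return workers_used, peak_leak


summary = serve(range(10), 3)
```

The leak never grows past the per-child limit, and the server as a whole is never restarted; only individual children are replaced.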
Re:I'm still waiting on PHP (Score:2)
And as someone noticed below, this is a known problem with Solaris.
It's been a while since I tinkered with it. It is Solaris after all; it's solid as a rock.
Do you Hattrick [hattrick.org]?
PHP safe mode is what we use (Score:2)
This is not fixed in Apache 2, AFAIK.
Re:PHP safe mode is what we use (Score:3, Informative)
Of course, both open_basedir and safe_mode are crappy solutions to a problem that needs to be solved higher up. Like with the Apache2 perchild MPM, but that is a long way from being production quality on a couple of different levels.
Why fix what ain't broken?? (Score:5, Insightful)
My main question is, what would it matter if sites weren't using apache 2.0, isn't it enough that open source software is being used??
Re:Why fix what ain't broken?? (Score:2)
I've installed Apache 1.3x on numerous machines over the past few years. All of the webhosting companies I've worked with still run 1.3.23 or 1.3.26. I know the process of installing Apache 1.3.x with PHP and MySQL ("LAMP" or "FAMP" servers) like the back of my hand. I've written shell scripts to do it for me. As long as the tried-and-true Apache keeps running, and is still being actively bugfixed, I see no reason to switch production servers to Apache 2.0.
"Why fix what ain't broken" is a damn good way to sum it up, IMO. This is coming from a guy who's perfectly happy running MacOS 8.6.1 on his G4, and WinME on his Windows boxes. There's no sense upgrading if everything's working fine now. Along the same train of thought, why take the time to learn the new configuration/installation options for Apache 2.0x, not to mention updating scripts or doing the actual installs, when 1.3.26 works just as well as it ever has? The benefits of 2.0x simply haven't won me over yet.
Someday, but not yet.
Shaun
Re:Why fix what ain't broken?? (Score:2)
"Why fix what ain't broken" is a damn good way to sum it up, IMO. This is coming from a guy who's perfectly happy running MacOS 8.6.1 on his G4, and WinME on his Windows boxes.
Many would say that you broke your Windows boxes when you "upgraded" to WinME from the far superior Win98.
Re:Why fix what ain't broken?? (Score:5, Informative)
Apache 1.x has a big problem when it comes to dynamic/updating data in shared hosting environments: security, or lack thereof.
All php, mod_perl, (and pretty much anything except suexec cgi) based pages are run as the same uid/gid as the apache server. Everything your scripts have read/write access to, so does everyone else on the same machine.
So, for instance, if your database passwords are in a PHP script, or a file that your PHP script reads, the webserver must have read access to that data in order for it to work. Since everyone else's scripts also run with the webserver uid/gid, they also have read access to your database username/password info, and can therefore connect to your database and do all the damage they want.
To address this problem, Apache 2 has the perchild MPM [apache.org], which allows a virtual host to have its own process fork, uid/gid, and thread pool. Unfortunately, the perchild MPM is not presently stable.
With that being unstable, and php and mod_perl also being "experimental", Apache 2 doesn't really offer an advantage over 1.3 yet. ...But don't be so certain that Apache 1.x "ain't broken".
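A rough illustration of the permission problem described above. This is a simplified sketch: the uids are invented, `can_read` is a made-up helper that looks only at owner and "other" bits (groups and root are ignored), and it models the idea rather than anything Apache itself does.

```python
import stat


def can_read(file_uid, file_mode, proc_uid):
    """Simplified Unix read check: owner bits if the uids match,
    otherwise the 'other' bits. Groups and root are ignored here."""
    if proc_uid == file_uid:
        return bool(file_mode & stat.S_IRUSR)
    return bool(file_mode & stat.S_IROTH)


# Hypothetical uids, made up for the example.
WWW, ALICE, BOB = 99, 1001, 1002

# 1.3 style: every script runs as the server uid, so Alice's password
# file has to be world-readable (0o644) for her own scripts to work.
shared_alice = can_read(ALICE, 0o644, WWW)  # Alice's script, running as www
shared_bob = can_read(ALICE, 0o644, WWW)    # Bob's script, also running as www

# perchild style: each virtual host runs under its own uid, so the
# file can be locked down to its owner (0o600).
split_alice = can_read(ALICE, 0o600, ALICE)
split_bob = can_read(ALICE, 0o600, BOB)
```

In the shared case both checks come out true, which is exactly the complaint; with one uid per virtual host only the owner gets in.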
Re:Why fix what ain't broken?? (Score:2)
Can someone explain how this works? If I'm understanding it correctly... 1) There's still a "main" server, which still runs as nobody (or maybe root now?) and which listens to the port(s) and accepts incoming connections 2) Each virtual host has its own multithreaded process 3) The main server determines the virtualhost of the request and pipes the data to and from the appropriate VH process.
is that about right or am I missing something? It seems like that might have some serious performance and/or memory use implications.
This very much sounds like a killer feature, especially if it works with mod_perl and PHP.
Re:Why fix what ain't broken?? (Score:3, Informative)
Re:Why fix what ain't broken?? (Score:2)
Is this similar to IIS's ability to let each cgi-bin run as its own, user-specified user? Like if I create the user Fred, and only allow him NTFS permissions on his own cgi-bin, and nothing else, that cgi instance will only be able to read Fred's cgi-bin files.
Does this work with an ACL addon to Linux?
utter crap... (Score:3, Informative)
cLive
Re:utter crap... (Score:4, Informative)
If I would like to use threaded server... (Score:2, Insightful)
It's a stable and tested technology.
For my project I stuck with apache 1.3.x
here's what would make me switch .. (Score:5, Insightful)
-- References. Have any high profile apache sites migrated? While my sites are small
-- PHP Support. As of 4.2.0, Apache2 support was experimental. The change log does not show anything which says its supported.
-- Mod_gzip support. This is a big one. Mod_gzip makes my sites download extremely fast when users on dialup lines log in. This is especially true for low-bandwidth countries in Asia. Mod_gzip support has left me fairly confused
Even with all of this.. I'm not likely to change unless there is a perceptible difference in the load / performance stats on my system during the switch.
Re:here's what would make me switch .. (Score:2, Informative)
PHP support seems to be somewhat stable on apache2 using the prefork mpm. The threaded mpm's don't work on FreeBSD, so I didn't really have a choice.
The performance seems to be pretty good after I removed the unneeded modules. --Matt
Re:here's what would make me switch .. (Score:2)
Re:here's what would make me switch .. (Score:2)
PHP works fine thank you, (Score:3, Interesting)
Well, my server has been running nicely for quite some time now.
I haven't encountered a single problem. Well, except that the default config is more secure and I had to manually change it to run legacy apps.
HTTP/1.1 200 OK
Date: Tue, 10 Sep 2002 08:18:09 GMT
Server: Apache/2.0.39 (Unix) PHP/4.2.2 DAV/2
Last-Modified: Sun, 24 Feb 2002 15:50:43 GMT
ETag: "2d405e-d7-4ac5ac0"
Accept-Ranges: bytes
Content-Length: 215
Content-Type: text/html; charset=ISO-8859-1
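If you want to see what a server announces about itself, a Server line like the one above can be pulled apart in a few lines. A minimal sketch only: `parse_server_header` is a made-up helper name, and it assumes the simple product/version layout shown in this response.

```python
import re


def parse_server_header(value):
    """Split a Server header value into (product, version) pairs.

    Parenthesised comments such as "(Unix)" are dropped first.
    """
    value = re.sub(r"\([^)]*\)", " ", value)
    products = []
    for token in value.split():
        name, _, version = token.partition("/")
        products.append((name, version))
    return products


info = parse_server_header("Apache/2.0.39 (Unix) PHP/4.2.2 DAV/2")
```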
I'd like to upgrade but i have those darn issues.. (Score:2)
I see the writer's point. It does appear that the Apache group is pretty much only patching Apache 1.3.x at this point to solve issues, versus improving and/or adding things, so that's probably a good start to get people moving. However, till the other things catch up (and honestly, how long was 2.0.x in beta? They should have been able to work against the dev tree and come out with compatible products, although I am not an Apache developer so I don't truly know what's involved)
Re:I'd like to upgrade but i have those darn issue (Score:2)
Until mod_perl is ported I won't be actually using it myself.
Don't need the headache... (Score:2)
By the way, I'd like to know who the hell came up with this god-awful colour scheme?!!
Why did Apache 2.0 need to break compatibility? (Score:4, Insightful)
I know Apache does not have any "customers" to support, but why were they so eager to break compatibility for Apache 1.3 modules in Apache 2.0? I know backwards compatibility code isn't sexy, but couldn't they keep the old module API and thunk it to the new API? Then Apache 2.0 could ship with rock-solid mod_php and mod_perl. Let modules developers migrate slowly on their own schedule.
Here's an interesting perspective from Ole Eichorn, the CTO of Aperio Technologies [userland.com]:
One of the more significant recent discontinuities occurred with the release of Apache 2.0. Although it has been under-reported, Apache 2.0 is significantly discontinuous (non-backward-compatible) with Apache 1.3. Many webmasters have decided not to upgrade for now, rather than have to recode their custom modules. And many of the custom modules out there are 3rd party, so the resources to make the changes are not readily available.
It is not clear to me why the discontinuity was required. There was no technical reason not to maintain backward compatibility. I think your essay gets it right, the people who made these decisions were not involved in the original development, and were not sufficiently aware of the impact their decisions would have on their developer community. Multi-threading processes, which inspired most of the discontinuity, primarily benefits Windows sites - a small proportion of Apache installations - and most Windows sites use IIS and aren't going to change.
I bet in a few years we'll be able to track Apache's decline as the leading web server back to this point.
Decline or fork? (Score:4, Insightful)
That, or where it started to fork. If people are unwilling to go 2.x, they'll put the effort into adding new stuff into 1.x. Are we seeing Open Source at work?
Xix.
Re:Why did Apache 2.0 need to break compatibility? (Score:2, Informative)
On a side note, I'd have to disagree with the CTO of Aperio Technologies, Solaris also gets a serious performance improvement with Apache 2, albeit not as good as Windows, but still decent.
Re:Why did Apache 2.0 need to break compatibility? (Score:2)
yes, but a proposed backwards compatibility API (which could thunk to the new API) could take care of the thread synchronization and communication BEHIND the API, without the old Apache 1.3 modules knowing the differences. As long as the old API maintains the same interface promises, then old modules should continue to run (but probably with performance problems).
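Something like the following is what I have in mind, sketched in Python rather than C to keep it short. Everything here is hypothetical (`thunk`, `legacy_counter`, the single global lock); it is not the Apache module API, just the shape of the idea: the compatibility layer serializes calls into old modules so they never see two threads at once.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One lock shared by every legacy module: crude, but it keeps the old
# "only one request touches my globals at a time" promise.
_legacy_lock = threading.Lock()


def thunk(legacy_handler):
    """Wrap a non-thread-safe handler so only one thread runs it at a time."""
    def wrapper(request):
        with _legacy_lock:
            return legacy_handler(request)
    return wrapper


hits = {"count": 0}


def legacy_counter(request):
    # Read-modify-write on shared state: unsafe without the lock above.
    current = hits["count"]
    hits["count"] = current + 1
    return hits["count"]


safe_counter = thunk(legacy_counter)


def run(n):
    """Push n requests through the wrapped handler from 8 threads."""
    hits["count"] = 0
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(safe_counter, range(n)))
    return hits["count"]
```

The cost is the one already mentioned: every request that goes through an old module waits on the same lock, so those modules run correctly but gain nothing from threading.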
I'm surprised there hasn't been more work to create something like a "mod_apache13" to ease module transition, instead of forcing module developers to break everything all at once. Someone created a mod_aolserver [eveander.com] to allow
Re:Why did Apache 2.0 need to break compatibility? (Score:2)
I'm sure it's a trivial piece of work that would take less time to write than to whine about on slashdot.
Re:Why did Apache 2.0 need to break compatibility? (Score:2, Insightful)
as you can run over toes of anything. Apache 1.3 was built for dirty coding. Apache 2.0 has to expect a level of code quality. You have to be very careful.
In other news:
It is not clear why water company pipes cannot carry electricity. Electric companies are stunned at the discontinuity of the water utility corporation.
BESIDES: when people write modular code, indent it, and comment it, things are dead easy to rework. If people are sloppy, that will be a lesson for them, or a pain for the inheritors of the code and a good precedent for requiring clean code in the future.
Re:Why did Apache 2.0 need to break compatibility? (Score:2, Insightful)
This is a double edged sword. Say what you will about MS, they've done a very good job maintaining compatibility with previous versions of Windows - because their customers insisted.
OTOH, a lot of the problems with both security and stability came from this backward compatibility.
It's quite possible that, by breaking compatibility, Apache 2.0 will avoid those same pitfalls.
Re:Why did Apache 2.0 need to break compatibility? (Score:2)
Every empire crumbles eventually. Apache 1.x will decline and disappear some day, just because, once you're at the top, there's no place to go but down.
With Apache 2.0 there's a good chance that the next dominant web server will be from the same family.
Unlike with commercial companies, however, there's nothing compelling Apache 1.3.x users to move before they're ready. I'm sure there will still be bug fixes on the 1.3.x tree for as long as there are a significant number of users.
Re:Why did Apache 2.0 need to break compatibility? (Score:2, Informative)
Re:Why did Apache 2.0 need to break compatibility? (Score:2)
Here's why (Score:3, Informative)
Now as to why they did it: Apache 1.3 is great. I love it, but it is not as cross-platform as it pretends to be (it does not perform well on Windows) and it really is not built for speed. If you need these things, you need multithreading, a better abstraction model so you are not assuming POSIX compatibility (and hence emulating it on Windows), etc. This means you break the compatibility, pure and simple, but in the end you get a better product.
Think of Apache 2.0 as Apache-- Next Gen. It is not yet widely supported, but when it is, it will be more competitive than 1.3.x because it has a better architecture.
Re:Why did Apache 2.0 need to break compatibility? (Score:3, Informative)
Threaded programming is more difficult than non-threaded programming (just like mod_perl programming is more difficult than plain Perl programming). Usually, this is because globals are used. Web servers are typically easier to thread (because each transaction doesn't usually interfere with others).
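To show what goes wrong with globals and one common way around it, here is a small sketch in Python. The names (`_request`, `handle`, `run`) are invented for the example; the point is only that per-thread storage keeps each request's data apart, where a plain global would be overwritten by whichever thread wrote last.

```python
import threading

# Per-thread storage: each thread sees only its own attributes.
_request = threading.local()


def handle(user, results, barrier):
    _request.user = user   # with a plain global, other threads would overwrite this
    barrier.wait()         # force every thread to have written before anyone reads
    results[user] = _request.user


def run(users):
    """Handle one 'request' per user, all overlapping in time."""
    results = {}
    barrier = threading.Barrier(len(users))
    threads = [
        threading.Thread(target=handle, args=(u, results, barrier))
        for u in users
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


seen = run(["alice", "bob", "carol"])
```

Each request reads back its own user even though all three threads wrote before any of them read.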
A single threaded server takes one request at a time, processes it, and then takes another request. The way Apache got around this was to have multiple processes, each which could take requests.
The problem is one of scale. While it is possible to have 1000 people simultaneously hit your web site at the same instant, it is unlikely that you will have 1000 processes running to take their requests. So some users have to wait. But it is possible to have a small number of processes with 1000 threads available to take requests.
Threads reduce memory usage. For example, each process has to load the code for the executable into memory, which multithreaded processes share. Also, if there is server file caching, multiple threads can share the cache, but multiple processes can't.
Also, threads can make more efficient use of resources. Let's say your application connects to a database on the back end (which is probably multithreaded, by the way). Let's also suppose that some transactions take longer than others. The first problem in a non-threaded application is that each process has to have its own database connections. They cannot be shared between processes. Also, each process has to first wait for the TCP connection, then wait for the database to respond, then wait for the data to be sent out. While they are waiting, they cannot process other requests. The problem is that all the processes could block on the database doing long connections, while other requests that might not even require database connections wait. In a threaded model (with enough threads), many transactions can be started, while only the ones that actually have to do database connections block on the database.
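A small sketch of that last point, under loud assumptions: the "database" is just an event that never answers until the static requests are done, and `serve` is a made-up function, not anything from Apache. It shows that with enough threads the slow requests block only themselves.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def serve(requests, workers):
    """Run requests on a thread pool; names starting with 'db' wait on a
    simulated slow database. Returns the order in which requests finished."""
    db_count = sum(1 for r in requests if r.startswith("db"))
    if workers <= db_count:
        # Too few threads: every worker could end up stuck on the database,
        # which is the "all processes blocked" situation described above.
        raise ValueError("need more workers than database requests")

    db_ready = threading.Event()  # stands in for a slow database reply
    finished = []
    lock = threading.Lock()

    def handle(req):
        if req.startswith("db"):
            db_ready.wait()       # this thread blocks; the others keep going
        with lock:
            finished.append(req)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(handle, r) for r in requests]
        # Wait for the non-database requests, then let the database answer.
        for future, req in zip(futures, requests):
            if not req.startswith("db"):
                future.result()
        db_ready.set()
    return finished


order = serve(["db1", "db2", "static1", "static2"], workers=4)
```

The two static requests always finish first, even though the database requests were submitted ahead of them.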
Finally, threaded programs are more efficient in a multi-processor environment. These days, more and more servers have more than one processor. Because each thread can run on a separate processor, you can more efficiently use the hardware.
Threading is the way of the future. That is why Java caught on on the server side: it supports threading in the language (something that C or C++ don't do). The Apache writers were looking towards the future, not at the past.
Progress is good and all but... (Score:3, Funny)
Re:Progress is good and all but... (Score:2, Interesting)
>> wb 3.5/3.9? HYPErions OS4? Berniethlon? AROS?
>> DEad? MorphOS? OH yeah, C='s Unix for the
>> A3000Us?
> Ummm, UNIX isn't AmigaOS.
Yet it was an official Amiga OS for the unix A3000 (tape back up and all).
> MorphOS is a PPC kernel isn't it?
It also emulates 68K code for supposedly seamless operation, IIRC.
> I haven't heard of DEad or Berniethlon before.
DEad (I think it was originally called Digital Environment, or something like that) is Amino's (current owners of Amiga Inc) repackaged version of Tao Group's Java for PDAs/cellphones/STBs, now called AmigaAnywhere. There has been some pretty heavy spin posted on
> Classic (typically OSes 2-3.1 but not excluding
> 1.x) and 3.5-3.9 are not dissimilar (though the
> later ones have undealt with problems) and are
> usually what is meant by AmigaOS.
3.5 - 3.9's code (minus the grab bag of other code they may or may not have paid for from 3rd party developers) belongs to Haage&Partner and not to Amino (current owners of Amiga Inc). HYPErion is having to use AOS/WB 3.1 because they can't get access to the source code of 3.5 - 3.9 for their HYPEOS4, for which they own the source code.
> OS 4 depends on how Amiga like it is (if its a
> freak good luck).
Again, that's HYPErion's product, which they evidently can continue to develop regardless of Amino's or its Amiga Inc's legal status. One of the two really smart things HYPErion has done: get DOPUS-Magellan and a legal agreement that they can continue when Amiga Inc goes belly up.
> I even think the Amiga is pretty dead though, I
> just wish there was a good OS/hardware thingy to
> replace it.....
Once they (HYPErion and the MOS crew) locked themselves into PPC, I knew it was the end of the line. AROS (http://www.aros.org) and Berniethlon are the only two Amiga-related OSs that I hope will survive.
Dammy
Why change what works? (Score:2)
Yeah, one of these days I'll upgrade my webservers, (probably when I decide to do a full install of the latest version of a distro that includes it) but there's no particular rush at the moment.
Re:Why change what works? (Score:2)
That may be true for some folks, but my sites are relatively low volume, run on Unix (well, Linux), and don't yet need the Module Enhancements (although I ought to take a closer look at what's available), so it doesn't really apply in my case.
(OTOH, I'm about to download Tomcat 4.1 for my JSPs.)
It's the PHP Stupid. (Score:2)
Yeah, yeah, I hear everyone saying "PHP 4.2 works fine with Apache2" Well, we're not touching it as long as it labels apxs2 support as "experimental"
Re:It's the PHP Stupid. (Score:5, Informative)
Of course, if you run the non-threaded pre-fork mpm, it should be ok. But really, what is the point then? That's why PHP support has been slow going. We develop stuff because we need it ourselves for something. Right now spending a lot of energy on supporting Apache 2 seems somewhat futile. What we need here is a concentrated effort on the part of many different projects to pool their knowledge and generally improve the thread safetyness of all common libraries. I have written a summary and started this work here:
Thread Safety Issues [apache.org]
I would very much appreciate comments and additions to this. I don't think Apache 2.0 is dead in the water, it just needs better overall infrastructure in terms of non-buggy kernels and a push to make all libraries threadsafe before it can really become a viable solution for sites needing dynamic content.
Or, alternatively, we might start pushing the FastCGI architecture more to separate the Apache process-model from the PHP one.
Re:It's the PHP Stupid. (Score:2, Interesting)
Until then, PHP is an executable just like Perl and Python, and if that costs too much performance I'll shove another cheap pizzabox in the rack (that's why everyone is using a load-balancer these days
MS did get that right though... (Score:4, Informative)
Of course, there are some issues: when you let the code executed by the request of user A create an object in an STA and move that into a container which can hold both STA's and MTA's, and let code executed by the request of user B access that user A's STA object, you get thread unsafety and possible crap.
However: the OS's functionality offers the option to do it threadsafe and still have multi-threading in full effect. Perhaps a thing to look at for the thread/process guys in the Linux kernel team.
(It has been a long time, but AFAIK a simple fork() does not fork off a complete new process, but rather a child process which runs as a thread inside the mother process, or am I mistaken? If not, why the thread-safety crap NOW, since a fork() would result in the same issues?)
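For what it's worth, on Unix fork() does create a separate process with its own copy of the parent's address space, not a thread inside the parent, which is why forked servers don't hit the same thread-safety issues. A minimal sketch to check this yourself (Unix only; `fork_demo` is a made-up name, and the pipe is just there to report the child's value back):

```python
import os


def fork_demo():
    """Fork once; the child changes its copy of a variable and reports it
    through a pipe. Returns (parent's value, child's value)."""
    counter = 0
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: works on its own copy of `counter`.
        os.close(read_fd)
        counter += 100
        os.write(write_fd, str(counter).encode())
        os._exit(0)  # leave immediately, without running parent cleanup
    # Parent process: its `counter` is untouched by the child.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return counter, child_value


parent_value, child_value = fork_demo()
```

The parent still sees 0 after the child has set its copy to 100. Two threads in one process would be looking at the same variable, and that sharing is where the locking problems come from.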
Not FreeBSD's fault (Score:4, Informative)
As it stands, it is fully compliant with the POSIX threads standard.
If it is not working for Apache, it is because Apache is not a POSIX compliant threads client implementation.
From looking at the code, we can see this is the case, with the Apache code having an assumption of kernel threads, which you are not permitted by the POSIX standard to assume.
Although I have not yet verified it, an examination of the code *seems* to indicate that it has "the Netscape problem", which is an assumption about scheduling coming back to a given thread in a group of threads after involuntary preemption by the kernel when the process quantum has expired.
In older versions of Netscape, this displayed as a bug in the Java GIF rendering code, which was not thread reentrant, in that if you used a Java application as a web UI, and moved the mouse before all the pictures were loaded, the browser would crash. After I explained this, Netscape corrected their assumption, and the problem went away.
Ignorance of the requirements for writing threaded applications which will work on all POSIX compliant threads implementations is no excuse, nor is it a valid reason for blaming the host OS, unless you make it known what your requirements are, above and beyond the standard contract offered by POSIX, and that you are stricter than an application written to the POSIX interface, without such additional assumptions.
You will find that you have these same problems on MacOS 9 (NOT FreeBSD-derived), Mac OS X (uses Mach threads), Mach, Plan 9, VxWorks, OpenVMS, etc.
You will find you do NOT have these problems on systems with implied contracts above and beyond those provided by the POSIX standard: Solaris, UnixWare, Windows, and Linux. You may have *other* problems in Windows, related to implied contracts over virtual address space issues (see other posting).
-- Terry
Re:Not FreeBSD's fault (Score:3, Informative)
Re:No, it's not. (Score:2)
I thought java was all about write once, run anywhere.. what's the point of java if you're using it server side? We're not exactly changing webservers every other day. Why not just use C++ or C# instead?
Java is a language without a purpose these days. No one wants to use it client side, and server side just doesn't make sense when compared to the initial purpose of the language (being able to run the same binary anywhere makes no sense when it's only going to be running on your webserver).
Support everything new (Score:3, Insightful)
As a software author, you really need to worry about your own users outpacing you. For instance, if someone likes a feature in Apache 2, and every module they use, except yours, works with Apache 2, people quickly discover that they don't need your module all that much anyhow.
Wasn't that everyone's experience when switching from Windows? You can't get program XYZ for Unix, so you discover that you never really needed it that much anyhow...
As a programmer, it always pays to be everywhere you possibly can. But, when it's open source, programmers don't care what's best for the user, so don't expect it to happen.
Re:Support everything new (Score:2)
And how is that bad? Commercial software houses have an incentive to confuse users into buying zillions of useless packages, needed or not. But for open source software, both the maintainer's and the user's interests are aligned: if it isn't needed, nobody should waste their time on it.
common factor (Score:5, Informative)
everyone will "just use" it. Of course there would
be some rejection rate, of stubborn people. 1.3
development would stop and everyone would slowly roll over to 2.0.
pro 2.0:
- threaded stuff is blindingly fast. On most systems threads are faster than processes
- other new technologies, like layered content filtering, are great for developers of high-traffic sites.
pro 1.3:
Very many people using Apache use Linux. Linux threads have almost the same performance as processes. Due to a kernel limitation, you can stack only so many threads per process. Plus, the threaded model does not account for stability: one NULL pointer dereference and you're gone. Apache 2.0 of course uses bundles of threads, so you still have a multiprocess model kicking around.
Expect 2.0 to gain popularity on systems like Sun, BSD and Win32, where process handling is relatively expensive. Threads are dirt cheap.
As everything, things take time. Just like well brewed beer.
cheers.
So it's Linux fault? (Score:2)
So, because of limitations in Linux' kernel design, Apache 2.0 is held back? Interesting. What I wondered when reading your remark quoted above was: surely Apache can't be the only program which will benefit from multi-threading? I mean: a server with a database system on it will benefit greatly from using threads for query processing. Processes are nice, and I know Unix' schedulers mainly first schedule processes and then threads, but if Apache or another program puts the spotlight on a flaw in Linux, why isn't it fixed?
Multi-threading is more efficient than multi-process, so why are Linux kernel designers still on the route to multi-process and not multi-thread? To me, this sounds like a flaw which Linus and friends don't want to solve for some reason.
Re:So it's Linux fault? (Score:3, Insightful)
Re:So it's Linux fault? (Score:5, Informative)
Actually it is the other way around. Linux has the smallest process creation and process switching overhead of any Unix with virtual memory. It is simply not possible for threads to be all that much faster than that. Apache 2 is optimizing something that simply was not all that expensive on Linux in the first place.
Re:So it's Linux fault? (Score:2)
Use the new, everybody wins (Score:2)
Here's the obvious... (Score:2)
Redhat 8.0/Apache 2.0 (Score:2, Informative)
Threads killed Apache 2 (Score:5, Insightful)
This will certainly not win me friends in the "everything should use threads because it's easier to do linear programming than to build a session reentrant state machine" camp, but...
Threads are useful for SMP scalability, but they aren't very useful for much else (I/O interleaving is adequately handled by most network stacks, the I/O interfaces themselves, and the fact that almost all the bytes being moved are being moved from the server to the client: the protocol is very asymmetric, even if you aren't pushing multimedia files). In most cases, threads are a liability.
Under Windows, they introduce data marshalling issues that have to be accounted for in user code -- not just in the modules which implement interpreters for that user code.
Under UNIX, threads are generally a loss, unless there is specific scheduler support for thread group affinity, when threads are running on the same processor, and CPU negaffinity, when there are multiple processors, to ensure that there is maximal usage of computer resources.
If you do the first, then you have the possibility of starvation deadlock for other applications: basically, it's not possible to do it correctly in the scheduler, you have to do it by means of quantum allocation, outside the scheduler. This means a threading approach such as scheduler activations, async call gates, or a similar technique. If you do the second, then you pay a serious penalty in bus bandwidth any time locality spans multiple CPUs -- in other words, it's useless to use SMP, if you have, for example, a shopping cart session that needs to follow a client cookie around.
Overall, this means that you were much better off using session state objects to maintain session state, rather than using thread stacks to do the same job. This is actually pretty obvious for HTTP, in any case, where requests are handled independently as a single request/response pair, and connection persistence isn't generally overloaded to imply session information (you can't do that because of NAT on the client side, multiple client connections by a browser on the client side, and server load balancing on the server side, etc.).
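To make "session state objects" concrete, here is a minimal sketch. The store, the cookie values and the `handle` function are all invented for illustration; the point is that the state is looked up by the client's cookie on every request, so it does not matter which connection, thread or process handles that request.

```python
import threading

# Hypothetical in-memory session store: cookie value -> session state.
_sessions = {}
_sessions_lock = threading.Lock()


def handle(cookie, item=None):
    """Handle one independent request: find the session by cookie,
    optionally add an item to its cart, and return the cart contents."""
    with _sessions_lock:
        cart = _sessions.setdefault(cookie, {"items": []})
        if item is not None:
            cart["items"].append(item)
        return list(cart["items"])


# Requests from two clients, interleaved and possibly arriving on
# different connections each time.
handle("cookie-abc", "book")
handle("cookie-xyz", "pen")
cart_abc = handle("cookie-abc", "lamp")
cart_xyz = handle("cookie-xyz")
```

Nothing about the shopping cart lives on a thread's stack between requests, so no request has to come back to the same thread.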
Overall, this factors out into threads bringing additional pain for module writers, without any significant performance or other benefit, unless you go SMP, and have a really decent threads and scheduler implementation -- which means you are running a recent IRIX or Solaris, which is a really limited fraction of the total web server market.
Frankly, they would have been a lot better off putting the effort into the management of connection state and MTCP or a similar failover mechanism, and worried about NUMA-based scaling, rather than shared memory multiprocessor with particular threads implementation scaling. The cost for what you get out of the switch is just too high.
-- Terry
Now be honest (Score:4, Funny)
Re:Threads killed Apache 2 (Score:5, Insightful)
Unfortunately this ideal is sometimes hard to achieve because non-blocking APIs are not always available. (e.g. there is no way to poll/select a pipe on Windows, and true asynchronous file I/O is still in the testing stages on Linux)
Keeping this on topic - there are plenty of HTTP servers out there with more sane concurrency models - thttpd [acme.com] is one of many... (I can't really fault Apache for making the choices they did; their goals are more standards conformance and portability than raw speed).
Thread/CPU affinity, and starvation (Score:3, Informative)
Basically, the promise of threads is that you will not be paying the equivalent of a full process context switch overhead, because your VM and other process-specific things will not have to change when context switching from one thread in a process and another thread in a process.
On a machine that has 1001 processes, where yours is the one process in question and has five threads in its thread group (process), you basically have a 4 out of 1004 chance of one of your threads being picked as the next thing to get a quantum when one of your threads makes a blocking call and is no longer runnable.
What that means is that you have just reneged on the promise of lower context switch overhead, if you run thread #1, then run "cron", and then run thread #2.
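The arithmetic behind the "4 out of 1004" figure can be sketched as follows. This assumes, purely for illustration, that the scheduler picks uniformly at random among all runnable entities and treats every thread as its own schedulable entity; real schedulers use priorities, so take the number as a rough intuition only. The function name is made up.

```python
from fractions import Fraction


def chance_next_is_sibling(other_processes, threads_in_group, blocked=1):
    """Chance that the next quantum goes to a thread of the same process.

    Assumes a uniform random pick among runnable entities (a simplification).
    """
    runnable_siblings = threads_in_group - blocked
    total_runnable = other_processes + runnable_siblings
    return Fraction(runnable_siblings, total_runnable)


# 1001 processes total: 1000 others plus ours, which has five threads,
# one of which has just blocked.
p = chance_next_is_sibling(other_processes=1000, threads_in_group=5)
print(p, float(p))
```

Under those assumptions, in the overwhelming majority of cases the next thing to run belongs to a different process, and the full context switch is paid anyway.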
So you have to play favorites, and say "I know "cron" has been waiting a long time, but I just blocked processing on thread #1, and thread #2 is runnable, so I'm going to preferentially run thread #2, because it lets me avoid the VM switch, and the TLB shootdown, and the other overhead of a full process context switch, and therefore lets me keep my promise about threads being lower overhead than processes".
Any time you play favorites, you starve your non-favorites; just like a Robin or Sparrow with a Cuckoo's Egg in its nest.
So then you have to add all sorts of arcane accounting and other crap to avoid the starvation of other processes, and your scheduler becomes very, very complicated.
Compare this with Scheduler Activations, or an async call gate, where you give a quantum to a process -- and the quantum belongs to that process. In this case, your process runs until either there are no more threads to be run, or until its quantum is used up.
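A toy model of the difference, in Python. This is only a counting sketch under stated assumptions, not how any real kernel implements scheduler activations: each entry in `bursts` is the CPU time a thread uses before it blocks, every burst is assumed to fit within one quantum, and in the per-thread case the worst case is assumed (every block hands the CPU to some other process). Function names are hypothetical.

```python
def switches_per_thread_scheduling(bursts):
    """Worst case: every blocking call gives the CPU to another process,
    so each burst costs one full process context switch."""
    return len(bursts)


def switches_quantum_owned(bursts, quantum):
    """The quantum belongs to the process: when one thread blocks, the next
    runnable thread of the same process continues in the same quantum."""
    switches = 0
    remaining = 0
    for burst in bursts:
        if burst > remaining:
            # Quantum used up (leftover is discarded); the process must be
            # switched in again, which costs one full context switch.
            switches += 1
            remaining = quantum
        remaining -= burst
    return switches


bursts = [2, 2, 2, 2, 2]  # five threads, each runs 2 time units then blocks
per_thread = switches_per_thread_scheduling(bursts)
owned = switches_quantum_owned(bursts, quantum=10)
print(per_thread, owned)
```

With a quantum of 10 all five bursts fit in one quantum, so the process is switched in once instead of five times; with a shorter quantum the saving shrinks accordingly.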
Things are actually more complicated than even this; for example, you want a threaded program to compete as multiple processes for quantum, or you are encouraging people to write programs that fork multiple children, instead of threads, in order to allocate themselves more quantum. On the other hand, you want to set some upper bound on the amount of unfair competition a single unprivileged program can engage in, relative to other processes on the system.
If you attack thread group affinity as a scheduler problem, the amount of complexity you introduce is substantial, and there will always be corner cases.
There's actually been a huge amount of research on this; check the NEC CS search engine for "scheduling" and "load balancing" and "parallel".
-- Terry
Interesting (Score:2, Insightful)
I faced this dilemma when I started offering web hosting and related services as another part of my services earlier this year. So far my pages are simple and don't require much third party software, and 2.0.x seems to be working fine.
Was I to install the new 2.0 version or stick with what everyone else was using? And yes, it does say on the apache web site that 2.0 is not fully backward compatible. After a little thought I decided on 2.0.x for a few reasons.
First, as my complexity needs have risen over time I always find a way to use the software to accomplish what I need done. And when I decide to take on a new level of services, it gives me time to familiarize myself with the process before turning it into a paid service. Second, if I have been familiar with a version of server that my competition doesn't feel any need to learn, it may turn into an advantage down the road.
The points raised about add-on modules are very interesting to me, and well observed. But I must be honest, call me a sick freak, but I LOVE the challenge of getting something to work for the first time.
Since you have to upgrade anyway... (Score:2, Informative)
You might as well go to 2.0, if for nothing else, then for intellectual curiosity. I did, it was a little painful, but php, perl, and mod_ssl work like a charm.
very hard to switch from good to unknown (Score:3, Insightful)
So, IMHO it's a negative positive problem :), Apache 1.3x is just way too good.
this is an issue for open source (Score:5, Insightful)
As Joel Spolsky points out [joelonsoftware.com], this is sloppy thinking. Programmer time might not cost an open source project any money, but that doesn't mean it is not scarce or does not have value.
The same applies to Apache. So much of the value of the server is tied up in the various modules. It might not have been technically elegant or easy to program in backward compatibility, but reading the comments in this thread, it's clear it would not have been *that* hard either -- especially compared to the programmer time it will take to rewrite the modules, and the degree to which 2.0 development will slow as people drag their heels adopting it.
This is one thing Microsoft consistently gets right. It has certainly hurt them when it comes to security, but is critical to their dominance on the desktop.
Re:this is an issue for open source (Score:5, Insightful)
Yes, you are right. I guess the reason for that is that it is a lot more fun hacking new code and adding new things without giving any consideration to your current users' needs. Keeping up with the Linux binary kernel modules is a nightmare. Why can't I just put a third party Linux kernel module in some directory and forget about it? With every upgrade I have to make sure that I recompile the third party kernel modules for the Linux version I run. If you use a binary-only kernel module, then you can't even install a kernel update unless the vendor has released an updated kernel module.
This is not the way it should be. Look at Solaris. I have seen kernel modules from 2.4 run on 7 and from 2.6 run on 8, etc.
Third-party modules, aye (Score:2)
If the Apache team wants to speed acceptance of 2.0, they're going to have to either build a 1.x module compatibility layer or spend some time porting existing third-party modules. Clearly the third-party module authors are in no hurry to support 2.0.
Not distributed with Linux (Score:2)
Until then, we'll just wait and watch adoption be gradual.
Gradual adoption is great, though. That means that the late adopters can be more sure that the platform is stable and efficient.
Don't fix what is not broken (Score:2)
1.3 just works (Score:2, Interesting)
Now, we've set up a test platform, and when our customers are happy we'll move it into production in a month or so, but secondary to our 1.3 setup. In about a year, we'll shut down the old setup and 'force-migrate' anyone that's still using it.
Targeting the SME market, we need to provide that sort of stability because my customers typically are not I-want-to-run-the-latest-and-greatest geeks and, having paid a lot of cash for their website, they're happy it runs and they don't care on what version it runs.
I think that most of my colleagues are in the same position, so 1.3 will probably be the major version for at least a year to come.
(Modules aren't the issue for me - in fact, I've not built the PHP module for 2.x because with all the script kiddies hacking around, I have decided to forward
Who needs the hassle? (Score:2)
I wouldn't be surprised if many UNIX users don't ever go for this and Apache 1.x just branches off into a separate project. Apache 2 can turn into some kind of specialized Apache derivative for platforms that just can't handle forking; we shouldn't keep burdening UNIX software with accommodating those other kludgy operating systems.
Re:Who needs the hassle? (Score:2)
Not true. Do you like forking a 15MB process for every concurrent connection? With Apache 1.3 the number of concurrent users you can serve suddenly becomes a function of the machine's memory. A multithreaded model is certainly more scalable. I can imagine that this is going to help a lot for large sites. Small sites can continue using 1.3.x just fine for now, however.
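As a back-of-the-envelope sketch of that claim in Python: the 15MB figure is the one quoted above, while the 1024MB of RAM is an assumed value chosen only for the example. The calculation treats every process's memory as fully private, which is a pessimistic simplification, since forked children on UNIX share unmodified pages copy-on-write.

```python
def max_prefork_connections(ram_mb, per_process_mb):
    """Upper bound on concurrent connections with one process per connection,
    assuming (pessimistically) that no memory is shared between processes."""
    return ram_mb // per_process_mb


# Assumed machine: 1024 MB of RAM; 15 MB per process as quoted in the post.
limit = max_prefork_connections(ram_mb=1024, per_process_mb=15)
print(limit)
```

How much of that 15MB is really private per child decides whether the bound bites in practice.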
Re:Who needs the hassle? (Score:3, Insightful)
Re:Who needs the hassle? (Score:2)
Note that JVMs are much bigger, but unlike Apache, a JVM can actually do threading safely in a single address space.
Re:Who needs the hassle? (Score:2)
Just because you have a hammer (kernel threading) on your particular pet operating system (BSD? Solaris? NT?) doesn't mean that you need to go around hitting everything with it.
failing to adopt != rejecting (Score:2)
Apache 2.0 has only recently been released and has not even made it into a large number of server OS distributions (certainly not in the way Apache 1.x has).
After its inclusion in a few OS distributions and after support for mod_p{erl,php} becomes stable, then we will be in a position to judge whether or not it is being rejected, but certainly not now.
Backwards Compatibility (Score:2, Insightful)
Would it be possible to create a patch/module for Apache 2 that allows old modules to be used?
Upgrademania and Incompatibility (Score:3, Insightful)
The other thing is suggested by the author of the original post, and has to do with the fact that Apache 2.0 breaks compatibility with old modules. Downward compatibility is one of the Commandments in software development, and it's quite possible that this is a major reason for admins to be reluctant to switch to Apache 2.0.
Interestingly, both expecting people to upgrade to a product that almost certainly contains yet-to-be-discovered bugs, and breaking compatibility with previous releases are frequently observed in the practices of the Great Stan of Redmond. It may therefore not be surprising that those admins running Apache (rather than It Isn't Secure) would not go with it.
It's slower than 1.3 (Score:2, Insightful)
Threading? (Score:2)
Threading in general is a really really bad idea unless you absolutely need it. Stick with a process model, with IPC if needed, unless you're one of those poor sods who absolutely has to have threading.
In fact, the only engineering idea that could be worse for Apache would be to include C++ code... can you say 'unresolved symbol in xxxxx'? You'd never find two binary-only modules that could be loaded into the same server. I do so love trying to figure out exactly which version of which compiler I have to compile Apache with to link it to the proprietary modules we, unfortunately, have.
Re:Threading? (Score:2)
The advantage of processes (using IPC for communications if you need it) is that the code becomes simpler and you have a very well defined interface for the shared data.
Threading inherently violates Keep It Simple.
A threaded apache may very well be 'better' in some ways, but it will be more difficult to maintain and while it might not crash more often, more time will have to be spent ensuring that it remains of the same quality as a process based apache.
No need to upgrade here... (Score:2)
We do have one Windows machine also running Apache 1.3.26 - basically we needed a Windows web server for some web-based data drivers, and I really didn't want to use IIS for obvious reasons. (Basically, I think Microsoft would be doing themselves a favour by scrapping IIS and taking out a licence on Apache.)
Does anyone know how well Apache 2 works in its Windows variant? I heard it had significant improvements over 1.3.x, so it might be worth upgrading.
Apache 2 (Score:2, Insightful)
Totally dumb question about hyperthreading (Score:2)
I know, I'll get modded into the basement for asking, but I wonder if Apache 2.x will do any better on intel's new hyperthreading processors.
There's an article here [theregister.co.uk] that mentions intel's future offerings and how they will all feature hyperthreading, and while the 25% performance increases must be mostly a marketing scam, I wonder how this new bullet item on the P4 feature list will work out.
Okay, I'm buying some of the hype for the time being, so sue me.
I Upgraded Because... (Score:2)
It took a while to get mod_webapp working on FreeBSD (with enough research done that I wasn't opening any new ports to the outside world). But once I was comfortable with the new setup, I was back.
I must admit, it does seem slower sometimes, but that might be because I upgraded to Tomcat 4 at the same time. Since I don't get nearly so much traffic that it makes a difference (it's a hobby site), Apache 2 works fine for me.
Why threads are important (Score:2)
Quality... (Score:4, Informative)
The build process has been slowed down and, IMO, gone entirely broken. Previously I ran the configure script, which took a minute or so, compiled and installed. It worked.
Now I run a monstrous ./configure, which calls itself recursively and takes about ten minutes to complete, at which time any and all warnings have scrolled well past the top of the window. It does not report easy mistakes such as trying to make "so" a shared module until it is almost finished. And the libraries are not linked against the modules properly, so attempting to use a static libssl or libm is not possible.
An upgrade from 1.3.x to 1.3.x+1 took about half an hour. An upgrade from 1.3.x to 2.0.x has taken me the better part of two days, including reinstalling openssl shared so that mod_ssl works at all, for no immediate gain.
I can understand that people do not make the switch.
current modules will not ever work 100% (Score:3, Interesting)
In the final analysis, the major Apache 1.3 modules will never work correctly on 2.0, to the point where code for one works well in the other, and vice-versa. The sad truth is that, as with Apache 1.x, the modules will slowly creep in to replace the CGIs, and that took a few years to happen, and mainly with mod_perl replacing Perl CGIs.
Yeah, that might suck donkeys, but it's the sad way of human nature. We simply want to make it like we used to have it in 1.3, and whatever. This it will never be again. Totally new modules should be written, and used by the upcoming generation of coders, those who are not corrupted by what we older folks have become used to. I'm 26 btw.
For example, the syntax of php is very good, and so are many of its ways of structuring things. But php itself needs to be thrown away as it stands now. Perl cannot speak of good syntax; it is simply one of the ugliest, yet most useful languages there ever was. Yet mod_perl has a good chance of remaining viable on Apache2. This is what confuses most folks, because they don't understand how something that is, to them, the elegant code they write could not work well in another environment. And when your apache module becomes a place that itself is a launch pad for other modules, then what? For example, in php... most folks like to have mysql as a module, or GD, or whatever. However, now you have to wonder whether, in Apache2, mysql could be a direct module to Apache2 itself, and php, or perl, just share the common thread. Do you suppose that php or perl could be written in a way to share their connections to MySQL? No... they're probably not going to play nice like that.
People just have to get past the notion that their development environment is just plain bad. The people at the Apache foundation knew it, and probably expected this sort of crap; why they want to mess things up in the next release to confound the module writers is beyond me.
Re:But mod_perl and libphp4 already work on 2.0 (Score:2)
Re:But mod_perl and libphp4 already work on 2.0 (Score:2, Insightful)
There are two very different angles to look at this problem from. Those who hack on linux in their spare time, and those who run mission-critical systems for their living.
Here in my basement, I run Debian Sid. I play with 2.5 series kernels. CVS doesn't faze me. And all is well, for if something gets screwed up, the only loss is my time, and there is more to be gained from the experience than there was lost to it. However, when the sun rises, and I make my way to work, the story is much different.
In the server room at work, I am responsible for the servers that host our clients' websites, email, and DNS records. If something hits a bug, that something malfunctions. Maybe it hiccups, maybe it takes the entire box to its knees. Given my druthers, I'd take the former; after all, if it just hiccups, it doesn't interfere with everything else. Now, I may think that I have a very firm grasp on what is happening on those boxes. I even pretend to think that I have a firm grasp on what is happening on my system here at home. My boss even trusts and respects my judgement. If I decided, for example, to replace our very stable and definitely efficient-enough Apache 1.3.26, PHP, mod_gzip, etc. with Apache 2.x and the corresponding modules, he wouldn't blink an eye. Why? Because he trusts me to make good decisions. I can't think of any better reason to stick with what is known to be stable, versus something that is "cooler," newer, or jus' phatter.
After all, as much as we like to think it, they don't pay us systems administrators to sit there and hack. They pay us to deliver systems that work.
Re:Synchronous Access to Old Modules? (Score:2)