
Install an optimized BIND 9.2.3Beta UNIX
I'm the DNS (and DHCP!) administrator for a large midwestern bank, and I like to keep current with the software available for my platforms. A newer beta version of BIND came online at the beginning of May, and I decided to compile it on my Mac for use as the local DNS caching server. It compiled fine out of the box. But I wondered ... I have the standard Apple build tools (gcc), and gcc has optimization options that the standard ./configure script doesn't pick up. Here is what I set in the CFLAGS environment variable the next time around:
-mabi=altivec -maltivec -mcpu=7450 -mtune=G4 -faltivec -O3 -fast
For ./configure, I used the --with-openssl option. On the dual-CPU machine at home, I'll also run it with --with-threads. The binary is about half the size of the original unoptimized BIND build, in spite of the unrolled loops. It also seems to handle load better, by ten percent or so according to my guesstimate from top (queries vs. application time).

I don't know if the AltiVec unit is even being touched, but there is some improvement.
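
One partial way to look into the AltiVec question is to ask the compiler whether AltiVec support was switched on for a given build. The sketch below relies on GCC predefining the `__ALTIVEC__` macro when AltiVec is enabled; the function names are hypothetical and are not part of BIND. It only reports how the file was compiled. It does not show that any vector instructions were actually generated for plain C code.

```c
#include <stdio.h>

/* Returns 1 if this file was compiled with AltiVec support enabled
 * (GCC predefines __ALTIVEC__ in that case), 0 otherwise.
 * Hypothetical helper for illustration, not part of BIND. */
int altivec_enabled_at_compile_time(void)
{
#ifdef __ALTIVEC__
    return 1;
#else
    return 0;
#endif
}

/* Prints the compile-time AltiVec status on one line. */
void report_altivec(void)
{
    printf("AltiVec %s at compile time\n",
           altivec_enabled_at_compile_time() ? "enabled" : "not enabled");
}
```

Calling `report_altivec()` from a small `main` built with the same CFLAGS as above shows whether the AltiVec flags took effect for that compile.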

Install an optimized BIND 9.2.3Beta
Authored by: Greedo on May 14, '04 11:47:40AM

If you are looking for a simple, lightweight caching nameserver (that can also act as your DHCP server), take a look at dnsmasq.

I only have experience running it on a Linux machine, but it does compile under Mac OS X. It's considerably smaller than BIND.



Install an optimized BIND 9.2.3Beta
Authored by: mhorn on May 14, '04 10:12:03PM

How do you start it up? Do you know of any websites that talk about configuring dnsmasq?



Install an optimized BIND 9.2.3Beta
Authored by: mule on May 14, '04 02:30:07PM

Well, last time I checked, there was no AltiVec-enabled code in BIND, so I don't quite see what kind of improvement this should get you. Furthermore, binary size is no proportional measure of speed: there are very large binaries that execute faster than small ones. Unrolling loops does get you some speed boost, but generally speaking, for various reasons, I do not feel Mac OS X to be the best platform when you run a large DNS server.



What reasons?
Authored by: woodgie on May 15, '04 06:57:19AM

As I'm possibly going to have to set up DNS for a company I'm doing work for I'm curious as to why you say: "...for various reasons, I do not feel Mac OS X to be the best platform when you run a large DNS server..."

In which case, what platform WOULD you recommend?



Install an optimized BIND 9.2.3Beta
Authored by: Arakageeta on May 14, '04 10:32:09PM
I was about to reply that -O3 does not unroll loops, but I decided to double-check first in the gcc man page on 10.2.3. To my surprise, -floop-optimize is enabled for all -Ox levels (excluding x=0). However, it is not clear whether -O3 actually unrolls loops by asserting this flag. The man page states:


...and optionally do strength-reduction and loop unrolling as well.

Enabled at levels -O, -O2, -O3, -Os.

Optionally?

Under the versions of Linux that I've used, loop unrolling has to be explicitly enabled even if you compile with -O3. What is the case on OS X? My guess is that loop unrolling will not take place unless explicitly requested, since the -floop-optimize description uses the word "optionally". Options typically need to be stated or selected. Anyone agree, disagree, prove, or disprove? If -floop-optimize does not unroll loops, then the point above about loop unrolling is moot, since no unrolling ever took place.


Install an optimized BIND 9.2.3Beta
Authored by: c15zyx on May 15, '04 07:53:18PM

-O3 will unroll loops, but will determine by heuristics which ones are worth unrolling. To unroll all loops you need a separate flag.
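
For anyone unsure what unrolling actually changes, here is a minimal hand-written sketch in C. The function names and the unroll factor of four are assumptions chosen for illustration. This is not what gcc emits for BIND, only the general shape of the transformation. The explicit GCC switches for requesting it are `-funroll-loops` and `-funroll-all-loops`.

```c
#include <assert.h>
#include <stddef.h>

/* Plain loop: one addition and one loop test per element. */
long sum_rolled(const int *values, size_t count)
{
    assert(values != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Same result, unrolled by four: one loop test per four additions.
 * The second loop handles the 0 to 3 elements left over. */
long sum_unrolled4(const int *values, size_t count)
{
    assert(values != NULL || count == 0);
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        total += values[i];
        total += values[i + 1];
        total += values[i + 2];
        total += values[i + 3];
    }
    for (; i < count; i++)
        total += values[i];
    return total;
}
```

The unrolled version has more instructions in the loop body, which is why unrolling is normally expected to make a binary larger rather than smaller.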



Install an optimized BIND 9.2.3Beta
Authored by: Arakageeta on May 14, '04 10:50:09PM

By the way, there's a good chance that multi-threading will boost performance on single processor machines too.

Say you have 100 units of computing to do, and each unit must be done in order. A single-threaded app will then likely perform the task faster than a multi-threaded app, since multi-threaded apps carry a certain amount of thread-management overhead. However, if the units of computing can be broken into smaller blocks, one thread can handle each block and all blocks can be processed at once.

So what's the big deal if you only have one processor? All computation has to go through one processor, right? So what are the gains?

Consider this general case. Of your 100 units, 10 require disk accesses. Disk accesses are the biggest bottleneck in computing today. In a single-threaded app, we have to wait for each disk access before moving on to the next unit. THIS ADDS UP! Why wait when there is other code that can be executed? Say that the 100 units can be broken into 10 blocks of 10, each containing a single disk access. When one thread blocks (jargon for when a thread must stop while waiting for a resource, like a disk), the next thread can go ahead and do its work.

Your mileage may vary depending on how well threading is implemented in an app. You'll get the most out of threads in problems where the data processing can be broken into separate, unrelated tasks. An example would be a web server, or any application that processes queries or requests.
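
A rough sketch of the 100-unit example, using POSIX threads in C. Everything here is an assumption made for illustration: the names, the 20 ms `nanosleep` standing in for a disk access, and the "work" being a simple sum of unit numbers. It has nothing to do with how BIND itself uses threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define BLOCKS 10
#define UNITS_PER_BLOCK 10

struct block {
    int first_unit;  /* number of the first unit in this block */
    long result;     /* sum of the unit numbers, filled in by the worker */
};

/* Stand-in for a disk access: the calling thread blocks for 20 ms. */
static void fake_disk_access(void)
{
    struct timespec wait = { 0, 20000000L };
    nanosleep(&wait, NULL);
}

/* Processes one block: one blocking "disk access", then ten units of work. */
static void *process_block(void *arg)
{
    struct block *b = arg;
    assert(b != NULL);
    fake_disk_access();
    b->result = 0;
    for (int u = 0; u < UNITS_PER_BLOCK; u++)
        b->result += b->first_unit + u;
    return NULL;
}

/* Single-threaded: every wait is paid in full, one after another. */
long run_serial(void)
{
    long total = 0;
    for (int i = 0; i < BLOCKS; i++) {
        struct block b = { i * UNITS_PER_BLOCK + 1, 0 };
        process_block(&b);
        total += b.result;
    }
    return total;
}

/* One thread per block: while one thread waits, the others can run.
 * Returns -1 if a thread could not be created. */
long run_threaded(void)
{
    pthread_t threads[BLOCKS];
    struct block blocks[BLOCKS];
    long total = 0;
    int started = 0;
    int failed = 0;

    for (int i = 0; i < BLOCKS; i++) {
        blocks[i].first_unit = i * UNITS_PER_BLOCK + 1;
        blocks[i].result = 0;
        if (pthread_create(&threads[i], NULL, process_block, &blocks[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        total += blocks[i].result;
    }
    return failed ? -1 : total;
}
```

Both functions produce the same total. The difference is in the waiting: the serial version should spend about ten waits back to back, while the threaded one lets the waits overlap, even on a single processor.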



Install an optimized BIND 9.2.3Beta
Authored by: uridium on Nov 04, '05 01:50:02PM

If you're optimising, you can also make it more portable. If, for instance, you run it over a mix of older machines (such as 750GX G3s) and newer kit (e.g. G5s), want a common binary to roll out over {nfs,afp,smb}, and have the latest Tiger dev tools, you can create a fat binary using MABs. If you set your CFLAGS to include "-arch ppc -arch ppc64", the same file will run in 64-bit mode on your G5 and in 32-bit mode on your G3. OS X has the $CLUE to pick which part of the binary to run, without any switches. E.g., a simple program dumping the word size from sizeof(long):

(on the 32bit 750GX g3-400)
downshift:~ %./mabLuv-arch_DPaction

32 bit

(On the 64bit 970 g5-1.8ghz)
springer:~ %./mabLuv-arch_DPaction

64 bit

As far as the $BLOAT factor goes, it's not that bad. Here are the file sizes for the 32-bit ppc, 64-bit ppc64, and fat MAB binaries:

-rwxr-xr-x 1 uridium uridium 13800 Oct 23 23:43 mabLuv-arch_ppc
-rwxr-xr-x 1 uridium uridium 12688 Oct 23 23:43 mabLuv-arch_ppc64
-rwxr-xr-x 1 uridium uridium 34280 Oct 23 23:42 mabLuv-arch_DPaction

A bit more than double: the fat binary is about 2.5 times the size of the 32-bit one. Nice, though, if you're messing with a few dozen machines or a cluster and have a policy of rolling out a common binary without having to compile on each node.
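
The test program itself is not shown in the comment. A minimal guess at what such a word-size dump might look like follows. The function names are hypothetical, and only the `sizeof(long)` idea and the "32 bit" / "64 bit" output format come from the transcripts above.

```c
#include <limits.h>
#include <stdio.h>

/* Width of a long in bits for the architecture this code was compiled for:
 * typically 32 for -arch ppc and 64 for -arch ppc64. */
int word_size_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}

/* Prints the width in the same "NN bit" format as the transcripts. */
void print_word_size(void)
{
    printf("%d bit\n", word_size_bits());
}
```

Called from `main` and compiled with both `-arch` flags in CFLAGS, each slice of the fat binary carries its own answer, and the system runs whichever slice matches the machine.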


