May 6, 2011

Blast from the past : grpof

Filed under: history,programming,skills — Mark Dalrymple @ 1:48 pm
Tags: , ,

Dino rific

I stumbled across this little tutorial I wrote back in the mists of time, probably around 1996 or 1997.  And it was based on a tutorial I wrote at Visix, probably in 1993 during one of our Optimization Parties.  It describes how to read the output of gprof, a profiling tool available on most unix systems.  It’s even still there on Mac OS X.  So you kids with your fancy Shark and Instruments, here’s what some of your elders used.

gprof is not a GNU tool, even though it has the leading “g”.  That “g” probably stands for “call Graph” profiler. You’ll need to check your system’s documentation (e.g. man gprof) for exact instructions on getting gprof to work, but usually it just involves compiling and linking with -pg, running your program, and doing gprof gmon.out > oopack.

Here’s a 300K sample of output from gprof on the Dec Alpha if you want to take a look at it. This particular report is from a run of AOLServer 2.2.1 which involved fetching index.html 53,623 times.  The links that follow go to anchors in that 300K sample.  What was I wanting to profile?  I wanted a gut check to make sure that life in the server was sane, and if there were any obvious bottlenecks that maybe I could address if I had the time.  The test was to fetch index.html over and over again.

There’s 4 parts to gprof output:

  • Built-in documentation: Short form of everything here, and more.
  • Call-graph: Each function, who called it, whom it called, and how many times said calling happened.
  • Flat profile How many times each function got called, the total times involved, sorted by time consumed.
  • Index: Cross-reference of function names and gprof identification numbers numbers.

When I first start looking at gprof output, I go to the flat profile section. The big time consumers are usually pretty obvious. You’ll notice that each function has a [number] after it. You can search on that number throughout the file to see who called that function and what functions that function calls. Emacs incremental search is really nice for bouncing around the file like this.

Here you can see that DString is a big time gobbler:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 17.7       3.72     3.72 13786208     0.00     0.00  Ns_DStringNAppend [8]
  6.1       5.00     1.28   107276     0.01     0.03  MakePath [10]
  2.9       5.60     0.60  1555972     0.00     0.00  Ns_DStringFree [35]
  2.7       6.18     0.58  1555965     0.00     0.00  Ns_DStringInit [36]
  2.3       6.67     0.49  1507858     0.00     0.00  ns_realloc [40]

Out of 21.05 seconds of total clock time, Ns_DStringNAppend consumed about 4 seconds, or about 18% of the time in and of itself. It was called 13 million times.

MakePath consumed one and a half seconds itself, and its children consumed three and a half seconds. At least one individual call to this consumed 0.01, and at least one individual call took a total of 0.03 seconds in MakePath and its children.

Handy tip – the function numbers in brackets are approximately sorted by time consumption, so a function with a [low number] will generally be more interesting than one with a [high number].

Now that you know that Ns_DStringNAppend is called a bunch of times, this could be a useful target for optimization, I’d look at its entry in the call graph section.

Before doing that, just for illustration, take a look at AllocateCa [33] since it has all of the interesting pieces of the call graph in a more compact size:

                0.04        0.18   53622/160866      Ns_CacheNewEntry [62]
                0.04        0.18   53622/160866      Ns_CacheDoStat [58]
                0.04        0.18   53622/160866      Ns_CacheLockURL [64]
[33]     3.0    0.11        0.53  160866         AllocateCa [33]
                0.16        0.17  160866/321890      Ns_DStringVarAppend [30]
                0.06        0.00  160866/1555972     Ns_DStringFree [35]
                0.06        0.00  160866/1555965     Ns_DStringInit [36]
                0.04        0.00  160866/1341534     Ns_LockMutex [43]
                0.03        0.00  160866/1341534     Ns_UnlockMutex [53]

The entries above AllocateCa [33] are the functions that call AllocateCa. The entries below that are the functions that AllocateCa calls. There are two numbers separated by a slash: the first number is the number of calls that the function has made, while the second number is the total number of invocations of that function.

In other words, for 160866/321890 Ns_DStringVarAppend [30], AllocateCa called Ns_DStringVarAppend 160866 times. Across all of AOLServer, Ns_DStringVarAppend was called 321890 times.

Similarly, for 53622/160866 Ns_CacheNewEntry [62], means that Ns_CacheNewEntry called AllocateCa 53622 times, and AllocateCa was called 160866 times total.

So, just by looking at this snippet, you know that the three Ns_Cache functions each call AllocateCa about once per serving of index.html, and that AllocateCa makes a single call to Ns_DStringVarAppend, Ns_DStringFree, etc… each time. What’s also interesting to note is that someone is calling Ns_DStringFree more than Ns_DStringInit. This may be (or may not) be a bug in AOLServer. You can go see Ns_DStringInit and Ns_DStringFree yourself and track down who the culprit is.

The floating “3.0” on the left is the percent of total time that the function consumed. The two columns of numbers are the amount of time (in seconds) that the function consumed itself (AllocateCa took 0.11 seconds of time total to run its own code) and the amount of time in the function’s children (0.53 seconds were spent in its children)

Getting back to real analysis of DStringNAppend, you can see that MakePath made 50% of the Ns_DStringNAppend calls. Since you know that there were 53623 fetches of index.html, that means that for each page, MakePath was called twice, and for each call to MakePath, Ns_DStringNAppend was called 64 times.

If one call to MakePath could be elided (since it’s getting called twice), or if fewer than 64 Ns_DStringNAppends could be done per call, we could see a performance boost.

Just browsing the gprof output can be an illuminating exercise. If you have a gut feeling that a particular function is a hot spot (say, Ns_LockMutex [43]), you can see the call graph for that function, see if it’s consuming lots of time, or if it’s being called a whole bunch.  Here it was called 1,341,534 times, or about 25 times per page serve. Maybe that’s high.  Maybe that’s right.  Sometimes a suspected culprit isn’t there, or you find a surprising time waster.

Because this sample gprof output was done on a Dec Alpha system, there was some suckage involved, such as no explicit time recorded for system calls. So we don’t know if, for example, select() blocked for a long time on each call.

(slightly edited and re-posted over at the miniblog)


April 22, 2011

Certifications as a Learning Tool

Filed under: 20-minute,programming,skills — Mark Dalrymple @ 1:18 am
Tags: , ,

Shoktar devourer of worlds

Matt Kloskowski had a posting on his Lightroom Killer Tips blog about LightRoom 3 ACE (Adobe Certified Expert) certification test prep materials becoming available. Then there were a couple of paragraphs defending the whole certification process: basically, if you’re looking for a career that is based on Lightroom or Photoshop, having the ACE sticker on your resumé is a good thing. It shows potential employers that you have a certain base-level of knowledge, and the sitck-withitedness to see the ACE through to its completion.

Possessing a certification doesn’t mean you’re good, of course. I’ve worked with Certified Oracle DBAs that were freaking amazing. (xux, I’m looking at you), and others that barely knew how to analyze a table. Both were certified Oracle DBAs. Both had passed the tests. But quite a difference in knowledge, ability, and work ethic.

That being said, I kind of like certification tests. Even if you don’t need the certification and don’t intend on taking the tests, they certification test contents provide a curriculum you can use to learn stuff.

I rarely admit it, but I have an Oracle 8i DBA certification. I took the five tests, passed them on the first try, and got my little certificate and card. Did I want to be a DBA? Hell no. Good DBAs live a high-stress life keeping twitchy and cranky systems up and running. But I wanted to be a better developer. I like being self-sufficient. If I could set up my own Oracle instance I could do local experiments and not rely on the remote systems. If I knew how all the different moving parts interact I can design the physical layout of my schemas so that disk I/O is somewhat optimized. If I knew how the profiling tools work I can rapidly tune queries, or at least figure out what tradeoffs I need to make to achieve adequate performance. If I knew how backups worked I would know the difference between a good backup scheme and one that’ll bite us in the ass if the DB turns toes-up. I could also restore to a test cluster to run tests against “production data” that wouldn’t actually affect the production site.

Oracle’s exams were broken into five tests. I used Oracle’s training materials as well as the “Exam Cram” series of books to steep myself in the material for each test. I could have skipped taking the test of course, but the company was willing to pay the $100 or so for each one, so it was a no-brainer to take. That little certificate might have come in handy later on. (It didn’t)

After my Oracle work, I was needing to learn Java quickly for a contracting gig. At everyone’s recommendation I tried working through Bruce Eckel’s introductary book Thinking in Java. Unfortunately, it’s what I call a “loops are cool!” book. As a developer with fifteen years of experience under my belt, I just couldn’t fathom slogging through beginner programmer material.

Then I found the Java2 Exam Cram books. I really should have looked for those first. They were geared, of course, to giving you the info to pass Sun’s Java certification tests. This was perfect for an experienced developer. The books were a distillation of all that was different with Java from other languages, as well as a discussion of any truly peculiar portions of the language. A week of reading a couple of these books prepared me for kicking butt on-site.

So, looping back to Lightroom, will I take the Lightroom 3 ACE test? Nope. But I’ll probably pick up some of the educational materials so I can find out more about the software product I use after every photo shoot I do. I’ve already spent a couple hundred dollars on software, and hundreds of hours of my life learning bits and pieces of it, and I know I’ve really only scratched the surface of what it can do. So I believe in the long run it’ll be money well spent.


April 19, 2011

Learning iPhone Programming

Filed under: 20-minute,cocoaheads,programming,skills — Mark Dalrymple @ 5:15 pm
Tags: , , ,


D31 2840 1

This last weekend I taught an Objective-C and iOS bootcamp to a group of students and faculty at WVU down in Morgantown WV. They have a cool “AppLaunch” project going on, to inspire students to write real applications on iDevices and encourage an entrepreneurial spirit, and they invited me down to kick off the technical portion.

A common question I got beforehand is “what’s the best way to get up to speed on this stuff?” That’s kind of like asking “what’s the best kind of pizza”, it all depends on where you’re coming from, where you want to go, and how fast.

If money is no object, take a Big Nerd Ranch class. In addition to teaching there, I have taken a number of Ranch classes from a bunch of different instructors, and they are all top-notch. The Ranch has a way of doing things that ends up with a really high quality product, enjoyable to both instructor and student. This will cost you a couple thousand dollars and take a week of your life, but you will be well on your way to iPhone programming studliness. Check out AnneKate Halsall’s Taming the Wild Dogcow tumblog for impressions and ah-has during the course of a class.

There are a number of video courses available. Stanford CS 193P iPhone Application Development is online. I had come across another one that was really awesome, but I’ve lost the link :-( . I’ll update this posting if I come across it.

The next level down is books. I love books. I learned to program from books and magazines. Old folks may remember back in the day when computer magazines had pages of BASIC program listings. Keying those in and debugging the inevitable typos is how I learned to program. There are two books I really like for iOS programming: the Apress Beginning iPhone 4 Development: Exploring the iOS SDK by Dave Mark, Jack Nutting, and Jeff LaMarche; and the Big Nerd Ranch iPhone Programming, the Big Nerd Ranch Guide by Aaron Hillegass and Joe Conway. I recommend people read both of them. If one book glosses over a topic the other covers in depth. Disclaimer: I’ve been the technical reviewer for the Apress Beginning iPhone books since the first edition.

What order to read them? If you’re strapped for time, read the Ranch one first. It’s short and to the point. Dedicate a weekend or a couple of evenings and type in everything. Then start working through the Apress book at your leisure.

If you’ve got more time, or you’re working over a longer period of time with other people, such as the Pittsburgh CocoaHeads Learn iPhone project, use the Apress book. It’s longer and wordier (380 vs 630 pages), but goes into topics in more detail. Some of the code is repetitious so you might not want to type in everything.

You’ll want some introductory books if you’ve never programmed before. Stephen Kochan’s Objective-C 2.0 programming Language good. I’m partial to the Apress Learn C on the Mac by Dave Mark, followed by Learn Objective-C on the Mac the latter written by me and my hero Scott Knaster. This pair was designed to take you from “loops are cool!” up through Categories, Properties, and Predicates. If you already know C you can go straight into Learn Objective-C. If you already know how to program in something else and just want a quick brush-up on what’s peculiar to C, I’ve broken out the first two chapters of the first edition of ore Mac OS X Programming into a C Refresher. Learn Objective-C has an appendix on what weirdnesses to expect if you’re coming from other languages like VB or Java. I know I get frustrated when I have to wade through “loops are cool!” when picking up a new language, so it’s nice having different places you can catch the train.

Finally, take a look around your community. You may have an active CocoaHeads or an NSCoderNight chapter, or perhaps an iPhone programming MeetUp. If there’s not one now, start one! There’s nothing like having living breathing people to ask questions of, and to generally hang around with. You might discover one-off classes like what I did at WVU, or longer-term learning projects like what we’re doing at our local CocoaHeads.

(slightly edited and re-posted over at the miniblog)

April 4, 2011

My first professional bug

Filed under: 20-minute,history,programming — Mark Dalrymple @ 12:10 am
Tags: , ,

Clearning with rooking grass

Mike Ash’s recent Friday Q&A mentioned SIGWINCH, the hearing of which always sends me down memory lane.  My first professional bug was centered around SIGWINCH.  By “professional bug”, I mean a bug that someone paid me actual money to fix during a period of employment.

Straight out of college in 1990 I went to work for a company called Visix, which at the time sold a product called Looking Glass, a file browser much like the Macintosh Finder but for Unix.  Looking Glass supported the major graphical windowing systems of the time: X11, Intergraph’s Environ V, and Sun’s SunView.  The image at the top of this posting is the only screen shot I could find of the version of Looking Glass I worked on running on SunView.

I was hired into tech support, and our team’s duties were phone support (typically debugging network configurations and X server font paths) and porting Looking Glass to other platforms.  Being the Lo Mein on the totem pole, I got given the old platform nobody wanted to touch any more: SunView.

SunOS 4.1.X had just come out, and Looking Glass would hang randomly.  It worked fine on 4.0.3.  My job was to find and fix this hang.  This was my first introduction to a lot of things: unix systems, windowing systems, navigating large code bases, debuggers, and vendor documentation that wasn’t Apple.  Luckily the SunView version didn’t sell terribly well any more because everyone was moving to X11, but there were a couple of customers bitten by this problem.

So what is SunView?  SunView is a windowing system.  Different programs run displaying graphical output into a window.  Nowadays that’s common place, but back when SunView came out it was pretty cool.  SunView was one of the earlier windowing systems, so it had a number of peculiarities: the biggest was that each window on the screen was represented by an honest-to-god device.  /dev/wnd5 is a window, as would be /dev/wnd12.  There were a finite number of these window devices, so once the system ran out of windows no more could be opened.

There was a definite assumption of “one window to one process” in SunView.  Your window was your only playground.  Looking Glass was different, it could open multiple windows.  Because of the finite number of windows available system-wide, on launch we created the alert that said “You can’t open any more windows because you’re out of windows”, thereby consuming a precious window resource, and hide it offscreen.  It was the only way we could reliably tell users why they couldn’t open any more windows.  Glad I wasn’t the one that had to make this work in the first place.

The other peculiarity is that you never got window events.  Even in the 1.0 version of the Macintosh toolbox you could figure out if the user dragged the window, or resized it, or changed its stacking order.  In SunView, you just got a signal. SIGWINCH, for Window Change, and hence the memory-lane trigger.  The user moved a window?  SIGWINCH.  The user resized it?  SIGWINCH.  The user changed the z-order?  SIGWINCH.

With just one window that’s not too bad.  Just query the window for its current size.  For us, though, we had to cache the window’s location, size, and stacking order.  Upon receipt of a SIGWINCH we’d have to walk all of our windows and compare it to the cached version, and see if anyone was resized, and then relayout the window views.

So, back to my bug.  It took me a solid month to fix.  Mainly because it was part time work in amongst my other responsibilities, and also because it was difficult to reproduce.  Spastic clicking and dragging could make it lock up, but not reliably.  Using the debugger was pointless – a 4 meg Sun 3/50 swapped eternally trying to get Looking Glass into gdb.  I ended up using a lot of caveman debugging.

Event queues

The architecture we used is shown in this diagram.  Each window had an event queue (remember that one window to one process assumption).  Upon receipt of events, we would walk our windows: read the events, handle them, then move on to the next window.

I was getting some printouts, though, showing an window receiving mouse-downs and mouse-drags, but no mouse-up.  Occasionally I would see a mouse-up, with no mouse-downs.  Ah-ha!  The mouse-up was being delivered to the wrong window’s event queue.  The fix was easy once I found it : just merge the events from all the windows first, and then process them.

It was then I learned how expensive malloc is.  I malloc’d and free’d event structures, but performance became dog-slow, especially during mouse drags.  Caching the structures made life fast again.

Memories like these make me so happy with the cool tech we get to play with these days.


(subsequently republished and edited quite heavily to the Miniblog)

March 28, 2011

Borkware’s First-Timer’s Guide to WWDC

Filed under: "humor",20-minute,programming — Mark Dalrymple @ 4:19 pm
Tags: , ,


Atm machienSo, you’ve just purchased your first WWDC ticket. Congratulations!  Many folks have have published their “first-timer’s guide to WWDC”, so being a veteran of 6 or 7 of them, both in the modern age and during the Dark Times, I figured I’d hop on the bandwagon.

1) The ticket is expensive, so you’re probably short on cash now.  Don’t worry about booking a hotel.  The weather in San Francisco is really nice.  It hardly ever rains.  And if it does, there are many store fronts and office building entrances you can use for shelter.  It’s also pleasantly warm 24/7.

2) Go to as many sessions, labs, BOF sessions, and parties you can at the Moscone center.  It’s a virtual firehose of firehoses of information and activity.  You won’t have time to bathe, so don’t even bother.

3) Get into the keynote line early.  Most hardcore attendees start lining up Sunday afternoon.  You’ll be guaranteed of a good spot if you get there late Saturday night.  There’s really only 700 spaces in the keynote room, even though the videos make it look deceptively large (*cough* CGI *cough*).  Due to health concerns, Steve’s Reality Distortion Field doesn’t extend past 10 or 15 rows these days.  :-(

4) Don’t worry about food.  In fact, you don’t have to really bring any money, credit cards, or Automatic ATM Machine cards.  I can never remember my PIN Number anyway.  Apple always lays out a huge spread of food from dusk to dawn and back to dusk again.  Make sure to hang around friday evening for Prime Rib and Champagne night, in celebration of the end of a good conference.

5) A secret: you don’t have to wait until the end of a session for Q&A.  There are microphones around the room.  If the one you are at happens to be turned off, no problem.  Bring your own bullhorn.

6) When asking questions in sessions, be sure to state your name, where you work, which platform you work on, which version of Xcode you prefer, and your opinion on the App store and C++ vs Objective-C.  Be sure to complement the speaker on their sartorial choices.  The sound systems are run rather hot, so please don’t speak too loudly into the microphone.  Of course, if you brought a bullhorn, you can tailor its output to the conditions of the room.

7) Follow proper Labs etiquette.  The labs where you can chat with Apple engineers are an invaluable resources.  It is a scarce, shared resource, so treat it like you would computationaly: pretend to be a mutex. You walk into the lab you want and shout “I AM ATTEMPTING TO OBTAIN A LOCK ON THE MEDIA PLAYER FRAMEWORK ENGINEERS”.  If an engineer is free, you’ll hear “LOCK SUCCEDED” from the back, and you can go to the engineer who just shouted and ask your questions.  If no one responds wait until you time out, and try again.  Expert tip: “spinlock”.

8) We’re all friends at WWDC.  If a session looks to be standing room only, feel free to find an available lap.

9) The Thursday night beer bash is actually just a giant mosh pit.

10) Don’t forget that recording devices are forbidden.  So please leave your voice recorder, iPhone, video camera, DSLR, pens and paper at home.  The TSA has been contracted to provide session information security.

Have a great time!  WWDC is an awesome experience.

(subsequently republished and edited a bit to the Miniblog)

Blog at