Parsing man pages


SOUR-Monkey
2005.05.29, 06:01 AM
Firstly, I must apologize for posting this here as opposed to iDA. Unfortunately I can't seem to register there, so for now I'll just have to plague iDG with non-game-related posts. Very sorry.

Now for the problem: I want to be able to read in a man page and display it in an NSTextView with nice, pretty formatting, but it's proving to be somewhat more difficult than I initially expected. As you'd expect.

It would appear that man pages are formatted using one of the many *roff variations, which in itself is not too big a deal (although I must say, *roff formatting is pretty disgusting to look at). I could have simply written something that parsed the various tags and left it at that.
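To give a feel for what "parsing the various tags" involves, here is a minimal sketch in Python that handles only a tiny subset of the man macros (.SH, .B, .I and the paragraph macros) plus the \f font escapes and the \- escape. The function name, the style labels and the choice of subset are assumptions made for illustration; a real page uses many more macros, and everything else is simply ignored here.

```python
import re

# Inline font escapes such as \fB, \fI, \fR and \fP.
FONT_ESCAPE = re.compile(r"\\f[BIRP]")


def clean(text):
    """Drop font escapes and turn the \\- escape into a plain hyphen."""
    return FONT_ESCAPE.sub("", text).replace("\\-", "-")


def parse_man_source(text):
    """Turn a small subset of man macros into (style, text) pairs."""
    runs = []
    for line in text.splitlines():
        if line.startswith('.\\"'):
            continue  # roff comment line
        if line.startswith("."):
            parts = line[1:].split(None, 1)
            if not parts:
                continue  # a lone "." does nothing
            macro = parts[0]
            arg = clean(parts[1].strip().strip('"')) if len(parts) > 1 else ""
            if macro == "SH":
                runs.append(("heading", arg))
            elif macro == "B":
                runs.append(("bold", arg))
            elif macro == "I":
                runs.append(("italic", arg))
            elif macro in ("PP", "P", "LP"):
                runs.append(("break", ""))
            # Every other macro is ignored in this sketch.
        elif line.strip():
            runs.append(("plain", clean(line)))
    return runs
```

Each (style, text) pair could then be mapped to whatever attributes the text view should use for that style.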

However, depending on whether the page is in a cat or a man directory, it uses a different style of formatting. This is not good, as it would mean I have to write two interpreters (I was not impressed with the prospect of having to write one).
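For the cat directories, the pages are already formatted output rather than *roff source. Assuming they use the traditional backspace overstrike convention (a character printed over itself for bold, an underscore printed under a character for underline), the second "interpreter" can be quite small. The sketch below is in Python; the function name and style labels are made up for illustration.

```python
def decode_overstrike(line):
    """Split one line of overstruck text into (style, text) runs."""
    runs = []
    i = 0
    n = len(line)
    while i < n:
        if i + 2 < n and line[i + 1] == "\b":
            first, ch = line[i], line[i + 2]
            if first == "_":
                style = "underline"   # underscore, backspace, character
            elif first == ch:
                style = "bold"        # character, backspace, same character
            else:
                style = "plain"       # unknown combination, keep the last char
            i += 3
            # Some formatters overstrike more than once; skip the repeats.
            while i + 1 < n and line[i] == "\b":
                i += 2
        else:
            style, ch = "plain", line[i]
            i += 1
        # Merge with the previous run when the style has not changed.
        if runs and runs[-1][0] == style:
            runs[-1] = (style, runs[-1][1] + ch)
        else:
            runs.append((style, ch))
    return runs
```

One known ambiguity: an underscore overstruck with an underscore is treated as underlined here, although it could equally be a bold underscore.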

The lazy-Sam solution is just to run system("man [page of interest]"); and chuck the result of that into my text view, but that doesn't allow me to do all the fancy formatting I would like to, and quite possibly won't even preserve the standard formatting done by man.

Bugger.

So, does anyone know how I should go about formatting the pages? I am sure that I will have to write my own interpreter if I want to do my own formatting (for the NSTextView), as any pre-written programs certainly won't do it for me. However, I'm having difficulty finding information on the exact format used for man pages, as all Google searches just return pages and pages of man pages on man.

Of course, it's entirely possible that, since I know nothing whatsoever about *roff formats, I'm completely missing the whole thing and writing an interpreter is a fairly trivial (if time-consuming) process. I'm hoping someone somewhat more in-the-know than myself can enlighten me here.

So with no further ado, let the enlightenment begin!

blobbo
2005.05.29, 03:38 PM
Why not throw the contents of system("man $appname") into a string variable or temporary text file, and then format the result? Or do you want more control?
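One caveat with this idea: C's system() only hands back the exit status, so the command's output has to be captured some other way (popen in C, or NSTask in Cocoa). As a rough sketch of the capture-into-a-string approach, here it is in Python; capture_command is a made-up helper name, and the man invocation in the comment is only an example.

```python
import subprocess


def capture_command(argv):
    """Run a command and return everything it wrote to stdout as text."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.decode("utf-8", errors="replace")


# Example (needs man installed, so it is not run here):
#   page_text = capture_command(["man", "ls"])
```

The captured string can then be post-processed before it goes into the text view.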

The source code to "man" must be available. If not, I'm sure someone's done this before. Perhaps you can find a perl or python parser that's already been written?

I'm guessing this is for a dashboard widget?

OneSadCookie
2005.05.29, 04:35 PM
There is a groff command-line tool that can output PostScript, which can be passed to pstopdf; the resulting PDF can be displayed by CoreGraphics or PDFKit.

SOUR-Monkey
2005.05.30, 03:42 AM
Keith, are there any particular options I need to specify for groff to output proper PostScript data? I tried just running groff on a normal man page and piping that into pstopdf, however the result was weird weird weird. The text went off the top of the page, it looked as if a lot of words had simply been dropped from the final page, and everything was just one big paragraph. Not quite as pretty as I'd hoped for :-p

I had a look at the man page for groff, but I didn't see anything that looked like it might be related to PostScript output. The pstopdf man page had absolutely nothing of interest, either.

Am I missing something?

OneSadCookie
2005.05.30, 08:04 AM
cat /usr/share/man/man3/glTexImage2D.3 | groff -Tps -mandoc -c | pstopdf -i -o ~/Desktop/glTexImage2D.pdf

extracted from the man pages for groff, man and pstopdf :p
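If the pipeline has to be driven from a program rather than a shell, something along these lines should work. This is a sketch in Python using the same flags as the command above; the function names are made up, and it passes the page to groff as a file argument instead of going through cat.

```python
import subprocess


def build_pipeline(man_source, pdf_path):
    """Return the argument lists for the groff and pstopdf stages."""
    groff_cmd = ["groff", "-Tps", "-mandoc", "-c", man_source]
    pstopdf_cmd = ["pstopdf", "-i", "-o", pdf_path]
    return groff_cmd, pstopdf_cmd


def render_man_to_pdf(man_source, pdf_path):
    """Pipe groff's PostScript output straight into pstopdf."""
    groff_cmd, pstopdf_cmd = build_pipeline(man_source, pdf_path)
    groff = subprocess.Popen(groff_cmd, stdout=subprocess.PIPE)
    subprocess.run(pstopdf_cmd, stdin=groff.stdout, check=True)
    groff.stdout.close()
    if groff.wait() != 0:
        raise RuntimeError("groff exited with an error")


# Example (needs groff and pstopdf installed, so it is not run here):
#   import os
#   render_man_to_pdf("/usr/share/man/man3/glTexImage2D.3",
#                     os.path.expanduser("~/Desktop/glTexImage2D.pdf"))
```

Note that the shell expands ~ in the original command; from a program it has to be expanded explicitly, as in the commented example.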

OneSadCookie
2005.05.30, 08:07 AM
BTW, Xcode already does a decent job of this... it's in the help menu.

TomorrowPlusX
2005.05.31, 02:25 PM
Also, the app ManOpen has been doing a good job of this since -- I gather -- the NeXTSTEP days. Except it's broken for me on Tiger...

Also, Sogudi does a good job via Safari -- just type man:whatever