Questions and answers - what do you think?

I'm thinking of creating a Q&A database on the TuxRadar site, to help people find answers to Linux problems that are archived away in our old magazines - because, let's face it, problems such as compiling the kernel, fixing Grub, etc, are just as important today as they were last year or even longer ago.

The problem is, all our Answers stuff exists in PDF form, and to get it into a database requires some conversion - conversion we don't have time to do. So I'm asking you, rather than trashing this idea because it's unworkable, do you think it's worth us releasing the PDFs and asking the community to help convert them to some simple XML?

The actual conversion we're looking for is fairly simple. Here's an example answer, taken straight from a PDF:

5 Automated virtualisation
QOn start up I want Ubuntu to start
without the need for me to input a
username and password and then
open VMware and start a virtual machine. I
know how to do all of these things manually,
but is it possible for me to write a small
batch file (if that’s the right expression) to
do it for me automatically?
Ffilc7373, from the forums
A There are two steps here:
logging in to your user’s desktop
automatically, and running a
program after logging in. The first is
achieved in Ubuntu 8.10 by selecting the
option to automatically log your user in
during installation. If you have an earlier
Ubuntu or you have already installed
8.10, you can set this option by running
System > Administration > Login
Window, go to the Security tab, tick
Enable Automatic Login and select the
user you want logged in.
Gnome will start any program you tell
it to when it starts up. Go to System >
Preferences > Sessions, press Add and
type the command you want to run,
along with a name and description. To
start VMware Workstation with a
particular virtual machine, use
vmware -X /path/to/virtualmachine.vmx
Where -X tells VMware to both start the
virtual machine and switch to full-screen mode
and the rest is the path to the .vmx file for the
virtual machine. If you’re using VMplayer,
replace vmware with vmplayer in the above
command. If you use VMware-server, you
need to make sure that the server is running
at startup (use the Ubuntu session manager
for this), and then use vmrun to start the
virtual machine. GM

...and here it is converted to the format we're looking for:

==
<title>Automated virtualisation</title>

<question>On start up I want Ubuntu to start without the need for me to input a username and password and then open VMware and start a virtual machine. I know how to do all of these things manually, but is it possible for me to write a small batch file (if that’s the right expression) to do it for me automatically?</question>

<answer>There are two steps here: logging in to your user’s desktop automatically, and running a program after logging in. The first is achieved in Ubuntu 8.10 by selecting the option to automatically log your user in during installation. If you have an earlier Ubuntu or you have already installed 8.10, you can set this option by running System > Administration > Login Window, go to the Security tab, tick Enable Automatic Login and select the user you want logged in.

Gnome will start any program you tell it to when it starts up. Go to System > Preferences > Sessions, press Add and type the command you want to run, along with a name and description. To start VMware Workstation with a particular virtual machine, use

<command>
vmware -X /path/to/virtualmachine.vmx
</command>

Where -X tells VMware to both start the virtual machine and switch to full-screen mode and the rest is the path to the .vmx file for the virtual machine. If you’re using VMplayer, replace vmware with vmplayer in the above command. If you use VMware-server, you need to make sure that the server is running at startup (use the Ubuntu session manager for this), and then use vmrun to start the virtual machine.</answer>
==

So:

  1. Remove unnecessary line breaks, but keep paragraph breaks
  2. Remove any names and the question number
  3. Put the title, question and answer in little XML markers
  4. Mark any code or command using <command>
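
The mechanical parts of those four steps could be sketched in Python roughly as follows. This is only an illustration, not a finished converter: the function names `unwrap` and `to_record` are made up here, and it assumes a volunteer has already removed the Q/A markers, the asker's name and the author's initials by hand, has put a blank line wherever a paragraph break belongs (the raw PDF text does not mark them), and supplies the list of command lines to be tagged. It does no escaping of special characters.

```python
import re


def unwrap(text, commands=()):
    """Join hard-wrapped lines into paragraphs.

    Blank lines are treated as paragraph breaks. Any line that exactly
    matches an entry in `commands` becomes its own <command> block.
    """
    blocks, current = [], []

    def flush():
        # Close the paragraph collected so far, if there is one.
        if current:
            blocks.append(" ".join(current))
            current.clear()

    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            flush()
        elif line in commands:
            flush()
            blocks.append("<command>\n" + line + "\n</command>")
        else:
            current.append(line)
    flush()
    return "\n\n".join(blocks)


def to_record(title_line, question, answer, commands=()):
    """Build one record in the ==, <title>, <question>, <answer> layout."""
    # Step 2: drop the leading question number, e.g. "5 ".
    title = re.sub(r"^\d+\s+", "", title_line.strip())
    return "\n".join([
        "==",
        "<title>" + title + "</title>",
        "",
        "<question>" + unwrap(question) + "</question>",
        "",
        "<answer>" + unwrap(answer, commands) + "</answer>",
        "==",
    ])
```

Calling `to_record("5 Automated virtualisation", question_text, answer_text, ["vmware -X /path/to/virtualmachine.vmx"])` with the hand-trimmed text from the example would produce a record in the layout shown above. Deciding what counts as a command, and where the paragraph breaks go, is still left to a human.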

Once we have it all in XML, it's a cinch for us to get that into a database and get it all online. The only problem is, do you think we'll get enough people volunteering to help? With about 50 issues of PDFs to go through, each one taking about 10 minutes to do, it's no easy task.
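
To give a feel for the database step, here is a minimal sketch that reads records in the layout above and stores them in SQLite. It is an assumption-laden illustration rather than a description of how the site would actually do it: the table name `answers`, its columns and the function `load_records` are invented for this example, and the records are picked out with a regular expression instead of a real XML parser because the snippets have no single root element.

```python
import re
import sqlite3

# One record: a title, a question and an answer, in that order.
RECORD = re.compile(
    r"<title>(.*?)</title>\s*"
    r"<question>(.*?)</question>\s*"
    r"<answer>(.*?)</answer>",
    re.DOTALL,
)


def load_records(text, conn):
    """Insert every record found in `text`; return how many were stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS answers "
        "(title TEXT, question TEXT, answer TEXT)"
    )
    rows = [
        tuple(part.strip() for part in match.groups())
        for match in RECORD.finditer(text)
    ]
    conn.executemany("INSERT INTO answers VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

For a quick trial, `load_records(open("converted.txt").read(), sqlite3.connect(":memory:"))` would load a file of converted answers into a throwaway in-memory database; the file name `converted.txt` is hypothetical.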

What do you think?


Your comments

Yup, I'd be happy to do one or two of these. Maybe I'd god as high as ten.

Sorry. I'd "go", rather than I'd "god".

Point me at what you want a go at and I'll give it a try. I got fed up of doing the community conversion thing, but am still willing to do more on that front too if you want ;-)

I thought there was a project to convert the PDFs to Wiki format going on? Could these be combined and the data from the Wiki used?

Donoreo Says:
February 10th, 2009 at 2:49 pm

I thought there was a project to convert the PDFs to Wiki format going on?

That is the community conversion thing I was referring to ;-)

Hi all,

Excuse me for rushing into the dialog: I'm not a reader of Linux Format as you know it, but I'm the editor of the very same magazine in Russia. :)

Just wanted to say we tried something similar to what Hudzilla has suggested when we converted our PDF archive into a Wiki. It worked well, although it took some time. So my two cents are that it's worth trying.

It's worth a go. Presumably a similar method to avoid replication of work will be used as for the community conversion project.

I did a similar thing recently with an ISO standard for IT Security which was in PDF format with a short expiry.
It was possible to save the whole file to text, but the layout was crap.

After some fiddling about, I ended up with a relatively simple shell script (mainly using sed) which removed all the line-breaks, then put in new line-breaks to separate sections and then inserted formatting markup for our wiki layout.

Worked a treat and I would be happy to pass on the script and to help.

If you do get this done, how do you intend to keep it going?
Will there be a constant reliance on the same community to convert from PDF or have you a way to publish future Q&A content in this format?

If so, perhaps it would be worthwhile starting on this endeavour regardless of whether the community is able to provide the muscle for the back issue immediately?

The bit that will be a royal pain is separating code as it has no regex pattern that would separate it from other content.

vmware -X /path/to/virtualmachine.vmx

I'm also assuming that there's some hidden character separator to signify the paragraph breaks?

I would like to have a go at this. Sign me up!

regards and thanks for a super mag

Des

Where do we acquire the PDFs to convert, and how do we ensure that we are not converting files someone else has already converted?


