Linux Format forums Forum Index -> Programming
Help, discussion, magazine feedback and more
Simple Perl and C comparison

 
gch15

Joined: Thu Jun 09, 2005 5:00 pm
Posts: 39
Location: Norfolk, UK

Posted: Wed Jun 29, 2011 10:38 pm    Post subject: Simple Perl and C comparison

Hi,

I program quite a lot in Perl and find it fast enough for whatever I want to do. A few days back I thought of comparing Perl with C. Since I often read lines of text from files, I thought I would compare the speed of doing this in Perl and in C.

First, I generate a text file to read using the Bash code below.
Code:

if [[ -e stuff ]]; then
    rm stuff
fi
for x in {1..5000}; do
    echo "This is line $x" >> stuff
done

If I need a longer test file I just change the 5000 to some bigger number.

Below are a Perl script and a C program. Both do the same thing: read lines from the file (stuff, created above) and keep appending them to a string variable. When all lines have been read, the length of this string is printed. That is all.

Code:

$ gcc -o for_cmp for_cmp.c

$ time ./for_cmp

88893

real   0m1.126s
user   0m1.122s
sys   0m0.003s


$ time perl for_cmp.pl

88893

real   0m0.014s
user   0m0.006s
sys   0m0.007s


As you can see above, my C program is significantly slower than the Perl script. My C is very amateurish, so I believe there must be faster ways of doing this in C. I would greatly appreciate an example of C code that is faster than (or as fast as) the Perl script at this simple task.

Thanks.

Here is the Perl code
Code:

# begin perl script for_cmp.pl
open(IN, "<stuff");
my $growing;
while (<IN>) {
    $growing .= $_;
}
close(IN);
print(length($growing), "\n");
# end perl script


And here is the C code
Code:

/* begin C code for_cmp.c */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (int argc, char *argv[]) {
  FILE *infile;
  const size_t mem_chunk = sizeof(char) * 1000 * 500;
  size_t allocd = mem_chunk;
  char *growing = malloc(mem_chunk);
  char *moving = growing;
  size_t initsize = 10000;
  char *line = malloc(initsize);

  growing[0] = '\0';  /* malloc'd memory is uninitialised, so terminate
                         the string before the first strlen() call */

  infile = fopen("stuff", "r");
  if (infile == NULL) {
    perror("fopen");
    return 1;
  }

  while (fgets(line, initsize, infile) != NULL) {
    if (strlen(growing) + strlen(line) + 100 > allocd) {
      growing = realloc(growing, allocd + mem_chunk);
      allocd += mem_chunk;
      moving = growing + strlen(growing);
    }
    moving = mempcpy(moving, line, strlen(line));
    *moving = '\0';  /* mempcpy does not nul-terminate */
  }
  printf("%zu\n", strlen(growing));
  fclose(infile);
  free(growing);
  free(line);
  return 0;
}
/* end C code */
spaceyhase
LXF regular


Joined: Mon Jun 30, 2008 1:07 pm
Posts: 116

Posted: Wed Jul 06, 2011 10:42 pm

All the memory allocation and copying is killing the C performance, and fgets isn't helping, as it is a line-oriented read. What you should do is figure out the file size (using fseek and ftell, for instance) and allocate once. We know the length of the file at that point, so the rest is artificial, but: fill the buffer (again, a single read will do) and count its length (no need to pull in string.h then, either). The best way is to keep track of how many bytes have been read as you go - there's no need to count 'em afterwards. Or, since we know the file size and expect to read that many bytes, a read that returns 'the file size' should suffice to confirm the length of the 'string'.

And then free the memory.

You can probably do something similar in Perl to make it even faster, too.

Sorry it's all a bit vague. It shows the obvious differences between the two languages and that it isn't just a like-for-like comparison (who knows what perl's interpreter has done?; is 'while<in>' functionally the same as 'fgets'?; etc), even though the question itself is a fairly interesting one.
johnhudson
LXF regular


Joined: Wed Aug 03, 2005 2:37 pm
Posts: 870

Posted: Thu Jul 07, 2011 10:14 am

http://en.wikipedia.org/wiki/Hello_world_program_examples
Bazza
LXF regular


Joined: Sat Mar 21, 2009 11:16 am
Posts: 1474
Location: Loughborough

Posted: Thu Jul 07, 2011 3:58 pm

Hi jh...

Would be interesting to know how fast this really is:-

http://www.linuxformat.com/forums/viewtopic.php?t=11351

;o)
_________________
73...

Bazza, G0LCU...

Team AMIGA...
gch15

Posted: Fri Jul 22, 2011 1:14 pm

Thanks for the response. I had guessed some of the issues you mention, but not all of them, so I have learned something. "Who knows what Perl's interpreter has done?" - indeed, but it is good to know that whatever it is doing, it is pretty efficient!

spaceyhase wrote:
All the memory allocation and copying is killing the C performance. fgets isn't helping as it is a line-orientated read. What you should do is figure out the file size (using fseek and ftell, for instance) and allocate once.
Powered by phpBB © 2001, 2005 phpBB Group
Copyright 2011 Future Publishing, all rights reserved.