Programming Forums
Old May 28th, 2008, 5:35 PM   #1
titaniumdecoy
Quickly select random lines

I have a 100MB file with (I suspect) millions of lines of text in it. I need a way to select 100 random lines and place them in another file. I don't want to spend a lot of time writing a Python script to do this, and I'm not sure how well such a script would operate on such a large file. Does anyone know of a Unix command that could do this? If not, how would you go about this? Thanks.
Old May 28th, 2008, 5:56 PM   #2
iEngage
Re: Quickly select random lines

Well, in whatever language you are using, just get the number of lines in the file, then have your program generate a random number between 1 and the number of lines and read that line number. Repeat until you have as many distinct lines as you need.

Pseudocode:

  numberOfLines = getNumOfLines("file.txt");
  lineNumber = randomNumber(1, numberOfLines);
  read("file.txt", lineNumber);

Obviously, depending on the language, it may take more code than that; this isn't even a real language.
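
Since the original poster mentioned Python, here is a minimal sketch of the count-then-pick idea in that language. The function name `sample_lines_two_pass` and its parameters are made up for illustration. It assumes the file fits comfortably on disk but not necessarily in memory, so it reads the file twice instead of loading it whole, and it opens the file in binary mode so the encoding does not matter.

```python
import random


def sample_lines_two_pass(path, k, seed=None):
    """Return up to k distinct random lines from the file at path, in file order."""
    rng = random.Random(seed)

    # Pass 1: count the lines without holding them in memory.
    with open(path, "rb") as f:
        total = sum(1 for _ in f)

    # Choose k distinct 0-based line numbers (all of them if the file is short).
    wanted = set(rng.sample(range(total), min(k, total)))

    # Pass 2: keep only the chosen lines.
    picked = []
    with open(path, "rb") as f:
        for number, line in enumerate(f):
            if number in wanted:
                picked.append(line)
    return picked
```

Something like `open("sample.txt", "wb").writelines(sample_lines_two_pass("file.txt", 100))` would then write the selection to another file; both file names are placeholders.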
Old May 28th, 2008, 9:38 PM   #3
mbd
Re: Quickly select random lines

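I'm not sure why you couldn't do this yourself, but here you go. Pass the number of lines you want as the first argument and feed the file on standard input:

  #!/usr/bin/perl
  use strict;
  use warnings;

  my @lines = <STDIN>;                   # read all of standard input into memory
  my $count = $ARGV[0];                  # number of lines requested
  $count = @lines if $count > @lines;    # cannot return more lines than exist
  for (1 .. $count)
  {
      # rand(@lines) yields a float in [0, number of lines); int() makes it an index.
      # splice removes the chosen line, so no line is printed twice.
      print splice @lines, int(rand(@lines)), 1;
  }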
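
The Perl script reads the whole input into memory before picking anything. If that is a concern for a 100MB file, one alternative (not what the script above does) is reservoir sampling, which makes a single pass and keeps only the requested number of lines at any time. A rough Python sketch, with `reservoir_sample` as a hypothetical name:

```python
import random


def reservoir_sample(lines, k, rng=random):
    """Return k lines chosen uniformly at random from an iterable of lines.

    If the input has fewer than k lines, all of them are returned.
    """
    reservoir = []
    for index, line in enumerate(lines):
        if index < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(line)
        else:
            # Keep this line with probability k / (index + 1),
            # overwriting a uniformly chosen earlier pick.
            slot = rng.randrange(index + 1)
            if slot < k:
                reservoir[slot] = line
    return reservoir
```

Calling `reservoir_sample(sys.stdin, 100)` after `import sys` and writing the result with `sys.stdout.writelines(...)` would let it be used in a pipe the same way as the Perl version. The selected lines do not come out in file order.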
Old May 29th, 2008, 12:54 AM   #4
titaniumdecoy
Re: Quickly select random lines

Thanks. I'm impressed by how short the program is in Perl.
Old May 29th, 2008, 9:50 PM   #5
andro
Re: Quickly select random lines

cat file.txt | awk 'BEGIN {srand()} {print rand() "\t" $0}' | sort -n | cut -f2- | tail -n 100
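
The pipeline tags every line with a random number, sorts on that number, and keeps 100 lines from one end of the sorted output. For comparison, the same idea can be sketched in Python without sorting the whole file; the function name `sample_by_random_key` is made up here, and keeping the smallest keys instead of the largest is an arbitrary choice that gives an equally random selection.

```python
import heapq
import random


def sample_by_random_key(lines, k, rng=random):
    """Tag each line with a random key and keep the k lines with the smallest keys."""
    keyed = ((rng.random(), line) for line in lines)
    # nsmallest tracks only the best k candidates while it scans the input,
    # so the full file is never sorted or held in memory.
    smallest = heapq.nsmallest(k, keyed, key=lambda pair: pair[0])
    return [line for _key, line in smallest]
```

With an open file object as `lines`, for example `sample_by_random_key(open("file.txt"), 100)`, this reads the file once; the file name is a placeholder.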
DaniWeb IT Discussion Community