Programming Forums
Apr 11th, 2005, 3:05 PM   #1
day1010110
Newbie
 
Join Date: Mar 2005
Posts: 4
repeating a command in ssh

Hi!

I'm rather new to the whole programming thing. I'm working on an assignment for an information retrieval course. I need to build a word-by-document matrix, and to do that I need to count the occurrences of each word in each file. I've written the code to do the counting, but the problem is that I have to run it over more than 3000 files. Can anybody tell me whether it is possible to execute this command on all the files in a directory? For one file I use the command

$ sh count.sh file-i-want-to-count > counting

Here count.sh is the code for counting the word occurrences, file-i-want-to-count is one of the files whose words I want to count, and counting is the file where the output is saved. Please help me, 3000 files is way too many to do by hand...
Apr 11th, 2005, 3:49 PM   #2
Infinite Recursion
Programming Guru
 
 
Join Date: Jul 2004
Location: United States
Posts: 3,467
Make a list of your files to process with the 'ls' command, read the filenames in as parameters, and have your script run through them in a while loop, like so:

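#!/bin/sh

# Temporary file that holds the names of the files to process.
FILELIST='myfile.lst'

# One name per line; leave out the list itself and anything matching "unwanted_file".
ls | grep -v -F -e "unwanted_file" -e "$FILELIST" > "$FILELIST"

# Read the list line by line so that names containing spaces stay intact.
while IFS= read -r i
do
	# Skip directories and anything else that is not a regular file.
	[ -f "$i" ] || continue
	./count.sh "$i"
done < "$FILELIST"

rm "$FILELIST"

echo "Complete."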
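
As a variation on the script above, the file list can be skipped entirely by letting the shell expand a wildcard. This is only a sketch: run_on_all is a made-up helper name, and it assumes the documents sit in a directory of their own (called docs in the example) so that count.sh and the script itself are not counted.

```shell
#!/bin/sh

# run_on_all SCRIPT DIR
# Runs "sh SCRIPT FILE" once for every regular file in DIR.
# run_on_all is a hypothetical helper name, not something from the thread.
run_on_all() {
	script=$1
	dir=$2
	for f in "$dir"/*
	do
		# Skip subdirectories, and the unexpanded pattern if DIR is empty.
		[ -f "$f" ] || continue
		sh "$script" "$f"
	done
}

# Example (docs is an assumed directory name):
#   run_on_all count.sh docs
```

Because "$f" is quoted, a filename containing spaces is passed to count.sh as a single argument.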
__________________
http://jasonpowers.net

"There are a thousand hacking at the branches of evil to one who is striking at the root."
Apr 12th, 2005, 6:04 AM   #3
day1010110
Hi!

Thank you very much for your reply! I've just tried it, but is it correct that this way I create one file containing the word counts of all the files together? Is it also possible to create 3000 separate files, each with the word count of the corresponding original file? Because that's actually what I need. Kind of like csplit and split: it would execute the same command on each file automatically but save the output for each file in a separate file. Is it possible to do something like this? And if so, how? Or is that actually what your script is doing, and I used it wrong?
Apr 12th, 2005, 8:10 AM   #4
Infinite Recursion
All my code above does is take a list of the files that you want to run your count.sh script on and run through that list line by line, executing your script on the file listed on each line.

Now if you wanted to generate a word count file per entry... (I'm not sure why you would want to do this when you could just append the count to a single file)... you could do this (depending on the output of your count.sh).

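Replace the ./count.sh line inside the loop with:
./count.sh "$i" > "$i.count"

That writes each file's counts to its own output file, named after the input file with .count appended (the suffix is an arbitrary choice). A fixed name such as filename would be overwritten on every pass through the loop, leaving only the last file's counts.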
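
If the count files should not end up mixed in with the originals, one option is to write them to a separate directory. A minimal sketch, assuming the documents are in one directory and the results go to another; count_each, docs, counts and the .count suffix are all made-up names for illustration:

```shell
#!/bin/sh

# count_each SCRIPT INDIR OUTDIR
# Writes the output of "sh SCRIPT FILE" to OUTDIR/<name>.count for every
# regular file in INDIR. count_each and the .count suffix are hypothetical.
count_each() {
	script=$1
	indir=$2
	outdir=$3
	mkdir -p "$outdir" || return 1
	for f in "$indir"/*
	do
		# Skip subdirectories, and the unexpanded pattern if INDIR is empty.
		[ -f "$f" ] || continue
		# Strip the directory part, e.g. docs/a.txt becomes a.txt.
		name=${f##*/}
		sh "$script" "$f" > "$outdir/$name.count"
	done
}

# Example (docs and counts are assumed directory names):
#   count_each count.sh docs counts
```

Running it a second time simply overwrites the earlier count files, and since the results live outside the input directory they are never fed back into count.sh.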
DaniWeb IT Discussion Community
Copyright ©2007 DaniWeb® LLC