[geeks] script language advice
Nadine Miller
velociraptor at gmail.com
Fri Feb 1 21:53:09 CST 2008
What language would the collective brain recommend for a script to parse
lines of up to 7500 chars in length? I'm leaning towards shell or PHP
since I've been doing a lot of tinkering with those of late, and my Perl
is very weak.
I'm trying to sort out duplicate files from 3 computers that I've
consolidated on one. The output is from fslint, which I ran on the
command line since I was afraid the GUI would not handle the large
number of duplicates (>135K lines, which works out to be a lot more
duplicate files).
My general idea is to split the output into files based on number of
duplicates, e.g. separate files for those that have 2 duplicates, 3
duplicates, etc. I was actually surprised that the fslint run only took
about 12 hours, given that md5sums were generated for every file.
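A minimal sketch of that splitting idea, written in Python rather than
shell or PHP. The input layout is an assumption, not something confirmed
here: one path per line, with a blank line ending each group of
duplicates. If each long line instead holds a whole group, read_groups
would need to split on whatever delimiter the output uses. The function
names and the output file naming are made up for illustration. Lines are
read as raw bytes so that spaces and odd characters in paths pass
through untouched.

```python
import sys
from collections import defaultdict


def read_groups(lines):
    """Yield lists of paths; a blank line ends a group (assumed layout)."""
    group = []
    for raw in lines:
        path = raw.rstrip(b"\r\n")  # strip only the line ending, keep spaces
        if path:
            group.append(path)
        elif group:
            yield group
            group = []
    if group:  # last group may not be followed by a blank line
        yield group


def bucket_by_size(groups):
    """Map group size -> list of groups containing that many paths."""
    buckets = defaultdict(list)
    for group in groups:
        buckets[len(group)].append(group)
    return buckets


def write_buckets(buckets, prefix):
    """Write one file per group size, e.g. <prefix>2.txt, <prefix>3.txt."""
    for size, groups in buckets.items():
        with open(f"{prefix}{size}.txt", "wb") as out:
            for group in groups:
                out.write(b"\n".join(group) + b"\n\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python split_dupes.py fslint.out dupes_
    with open(sys.argv[1], "rb") as src:
        write_buckets(bucket_by_size(read_groups(src)), sys.argv[2])
```

Working in bytes avoids guessing the filename encoding, and line length
is not a concern since Python reads lines of any size.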
Aside from the line lengths, the biggest bear is that the filesystems
are FAT32, so there are a lot of unusual characters (rsync choked on "?"
for example) and spaces in the file paths.
Thanks for any suggestions--
=Nadine=