
...making Linux just a little more fun!

Talkback:151/lg.tips.html

[ In reference to "/lg.tips.html" in LG#151 ]

Thomas Bonham [thomasbonham at bonhamlinux.org]


Thu, 29 May 2008 05:02:49 -0700

Ben Okopnik wrote:

> On Tue, May 27, 2008 at 03:41:01PM -0700, Thomas Bonham wrote:
>   
>> Hi All,
>>
>> Here is a 2-cent tip: a little Perl script for looping through
>> directories.
>>     
>
> Why not just use 'File::Find'? It's included in the default Perl
> install, and is both powerful and flexible.
>
> ```
> use File::Find;
>
> find(sub { do_whatever_you_want_here }, @directories_to_search);
> '''
>
> For more info, see 'perldoc File::Find'.

Perl's File::Find didn't have everything I needed for this function. I was not just trying to find files with it, but also trying to find items that were in different directories.

When I looked around on the Internet for what I wanted to do, everything I was able to find said not to use File::Find because it wasn't powerful enough for what I was doing, so I just created that function to do some different things along the way.

Thomas




Ben Okopnik [ben at linuxgazette.net]


Thu, 29 May 2008 12:29:31 -0400

On Thu, May 29, 2008 at 05:02:49AM -0700, Thomas Bonham wrote:

>   
> Perl's File::Find didn't have everything I needed for this function. I
> was not just trying to find files with it, but also trying to find
> items that were in different directories.

This is why 'find()' takes an array as the second argument.

use File::Find;

my @dirs_to_search = qw[ /foo /bar /gribble/fark/twixen/blidge ];

sub wanted {
	my $path = "$File::Find::name\n";   # Full path, one per line
	print $path if -l;                  # Report symlinks
	print $path if -s >= 1024;          # Files of 1kB or more
	print $path if -S;                  # Sockets
	print $path if -o;                  # If owned by EUID

	# ...and so on
}

find( \&wanted, @dirs_to_search);

> When I looked around on the Internet for what I wanted to do, everything
> I was able to find said not to use File::Find because it wasn't
> powerful enough for what I was doing, so I just created that function
> to do some different things along the way.

Which illustrates the point: Read The Fine Manual before you search the Net. :) File::Find is one of the modules that are installed with Perl by default for a very, very good reason - and it's not because it's "not powerful enough". In fact, being aware of what's included by default (via 'perldoc perlmodlib') is an excellent idea for anyone who uses Perl; these modules have been selected as the "crucially necessary" kit by the folks who write Perl and determine Perl policy, which is one hell of a good recommendation.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




Ben Okopnik [ben at linuxgazette.net]


Wed, 4 Jun 2008 21:32:20 -0400

----- Forwarded message from Jan Engelhardt <jengelh@medozas.de> -----

Date: Wed, 4 Jun 2008 09:40:02 +0200 (CEST)
From: Jan Engelhardt <jengelh@medozas.de>
Sender: jengelh@sovereign.computergmbh.de
To: thomasbonham@bonhamlinux.org
cc: ben@linuxgazette.net
Subject: Re: 2-cent Tip: Perl Search Directory Function

>>Here is a 2-cent tip that is a little Perl script for looping through 
>>directories.
>
>Why not just use 'File::Find'? It's included in the default Perl 
>install, and is both powerful and flexible. 

Why not just use sh?

find . -type d -print0 | xargs -0r do_whatever_you_want.sh

(resp. xargs -0r/-0rn1 perl -e 'do whatever you want here')

----- End forwarded message -----




Ben Okopnik [ben at linuxgazette.net]


Wed, 4 Jun 2008 23:28:57 -0400

Jan Engelhardt <jengelh@medozas.de> wrote:

> On Wed, Jun 04, 2008 at 09:32:20PM -0400, Benjamin Okopnik wrote:
> 
> >Why not just use 'File::Find'? It's included in the default Perl 
> >install, and is both powerful and flexible. 
> 
> Why not just use sh?
> 
> 	find . -type d -print0 | xargs -0r do_whatever_you_want.sh
> 
> (resp. xargs -0r/-0rn1 perl -e 'do whatever you want here')

Because File::Find can do everything that 'find' can - and do it faster and with a lot more flexibility (try asking 'find' to give you just the current filename - or just the name of the directory that it's traversing.) Because File::Find can be used on multiple platforms - including Solaris and MacOS, where 'find' is Stone-Age primitive. Also, because Perl is much smarter about regexes (yes, I know about the '-regextype' option; even the 'posix-extended' argument isn't anywhere nearly as smart as Perl's regexen) - and because Thomas wanted to search multiple directories, which 'find' doesn't do, TTBOMK.

Oh, and because it doesn't need 'xargs' or any equivalent of it - since the files (or directories) are processed one at a time rather than being spit out as a (possibly) huge list.

That's just a few reasons, off the top of my head. :)
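
A minimal sketch of the points above, not taken from the original tip: inside the callback, `$_` holds the bare filename, `$File::Find::dir` the directory being traversed, and `$File::Find::name` the full path. The filter is an ordinary Perl regex using a negative lookahead, which POSIX extended syntax does not offer. The helper name `find_matching` and the directories in the usage line are hypothetical.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return [dir, filename, path] triples for plain files
# whose names end in ".txt" but do not start with "tmp".
sub find_matching {
    my @dirs = @_;
    my @hits;
    find(sub {
        return unless -f;                  # $_ is the bare filename
        return unless /^(?!tmp).*\.txt$/;  # negative lookahead: Perl syntax
        push @hits, [ $File::Find::dir, $_, $File::Find::name ];
    }, @dirs);
    return @hits;
}

# Usage (the directories are placeholders):
# printf "%s lives in %s\n", $_->[1], $_->[0] for find_matching('/foo', '/bar');
```

Each file is handled as it is found, so no 'xargs' step is involved; the list is only built here because the example chooses to return one.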

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




Neil Youngman [ny at youngman.org.uk]


Thu, 5 Jun 2008 08:01:37 +0100

On Thursday 05 June 2008 04:28, Ben Okopnik wrote:

> Thomas wanted to search
> multiple directories, which 'find' doesn't do, TTBOMK

find /path/to/directory /path/to/another/directory /path/to/athird/directory ... works for me. Or did you mean something else?

Neil




Ben Okopnik [ben at linuxgazette.net]


Thu, 5 Jun 2008 08:59:35 -0400

On Thu, Jun 05, 2008 at 08:01:37AM +0100, Neil Youngman wrote:

> On Thursday 05 June 2008 04:28, Ben Okopnik wrote:
> > Thomas wanted to search
> > multiple directories, which 'find' doesn't do, TTBOMK
> 
> find /path/to/directory /path/to/another/directory  /path/to/athird/directory ...
> works for me. Or did you mean something else?

Hence the 'TTBOMK'. :) The rest of it still holds true.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




Rick Moen [rick at linuxmafia.com]


Thu, 5 Jun 2008 00:50:24 -0700

Quoting Ben Okopnik (ben@linuxgazette.net):

> Because File::Find can do everything that 'find' can - and do it faster
> and with a lot more flexibility (try asking 'find' to give you just the
> current filename - or just the name of the directory that it's
> traversing.) Because File::Find can be used on multiple platforms -
> including Solaris and MacOS, where 'find' is Stone-Age primitive.  Also,
> because Perl is much smarter about regexes (yes, I know about the
> '-regextype' option; even the 'posix-extended' argument isn't anywhere
> nearly as smart as Perl's regexen) - and because Thomas wanted to search
> multiple directories, which 'find' doesn't do, TTBOMK.

Just for the bilingual pun value, I'm almost tempted to hack GNU find to enable it to "conscript" the File::Find module's functions for its own needs -- and make that available via option "-regextype posix-comitatus".

(Oh, OK, if y'all insist, I'll diagram the joke: http://en.wikipedia.org/wiki/Posse_comitatus_(common_law) )




Ben Okopnik [ben at linuxgazette.net]


Thu, 5 Jun 2008 09:02:58 -0400

On Thu, Jun 05, 2008 at 12:50:24AM -0700, Rick Moen wrote:

> Quoting Ben Okopnik (ben@linuxgazette.net):
> 
> > Because File::Find can do everything that 'find' can - and do it faster
> > and with a lot more flexibility (try asking 'find' to give you just the
> > current filename - or just the name of the directory that it's
> > traversing.) Because File::Find can be used on multiple platforms -
> > including Solaris and MacOS, where 'find' is Stone-Age primitive.  Also,
> > because Perl is much smarter about regexes (yes, I know about the
> > '-regextype' option; even the 'posix-extended' argument isn't anywhere
> > nearly as smart as Perl's regexen) - and because Thomas wanted to search
> > multiple directories, which 'find' doesn't do, TTBOMK.
> 
> Just for the bilingual pun value, I'm almost tempted to hack GNU find to 
> enable it to "conscript" the File::Find module's functions for its own
> needs -- and make that available via option "-regextype posix-comitatus".

[groan]

(Has anyone noticed how that rhymes with 'Moen'? :)

It wouldn't take much hacking; 'libpcre' has been available for ages. Either one more option, or replacing 'posix-extended' with 'pcre' would make for a big improvement.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




Kapil Hari Paranjape [kapil at imsc.res.in]


Thu, 5 Jun 2008 19:48:00 +0530

Hello,

On Thu, 05 Jun 2008, Ben Okopnik wrote:

> On Thu, Jun 05, 2008 at 12:50:24AM -0700, Rick Moen wrote:
> > Just for the bilingual pun value, I'm almost tempted to hack GNU find to 
> > enable it to "conscript" the File::Find module's functions for its own
> > needs -- and make that available via option "-regextype posix-comitatus".
> It wouldn't take much hacking; 'libpcre' has been available for ages.
> Either one more option, or replacing 'posix-extended' with 'pcre' would
> make for a big improvement.

It isn't clear that one needs libpcre. For example "grep" supports perl-style regexes but doesn't seem to depend on libpcre.

Kapil.




Ben Okopnik [ben at linuxgazette.net]


Thu, 5 Jun 2008 08:48:27 -0700

On Thu, Jun 05, 2008 at 07:48:00PM +0530, Kapil Hari Paranjape wrote:

> Hello,
> 
> On Thu, 05 Jun 2008, Ben Okopnik wrote:
> > On Thu, Jun 05, 2008 at 12:50:24AM -0700, Rick Moen wrote:
> > > Just for the bilingual pun value, I'm almost tempted to hack GNU find to 
> > > enable it to "conscript" the File::Find module's functions for its own
> > > needs -- and make that available via option "-regextype posix-comitatus".
> 
> > It wouldn't take much hacking; 'libpcre' has been available for ages.
> > Either one more option, or replacing 'posix-extended' with 'pcre' would
> > make for a big improvement.
> 
> It isn't clear that one needs libpcre. For example "grep" supports
> perl-style regexes but doesn't seem to depend on libpcre.

You have a point - but PCRE would be an easy way to implement it instead of hacking the whole thing. Although given what's already in there, that probably wouldn't be such a big task.

* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://linuxgazette.net *




Rick Moen [rick at linuxmafia.com]


Thu, 5 Jun 2008 09:22:29 -0700

Quoting Ben Okopnik (ben@linuxgazette.net):

> [groan]
> 
> (Has anyone noticed how that rhymes with 'Moen'? :)

Not taking offence, but, just in case anyone doesn't know, and lest they misread the above: My family doesn't pronounce "Moen" as a one-syllable word rhyming with moan, but rather as a two-syllable word rhyming with Bowen or Owen.

In Norway & Denmark (whence comes the surname), they pronounce it differently, but, hey, that's the sort of thing that happens when people emigrate and adopt an entirely different language with at best overlapping phonemes.

I believe that, there, it's a one-syllable word with a diphthong /oeu/ sound in the middle, like the Faroese word nøvn ("name") or the Norwegian øy ("island"). In Norwegian and Danish, a "moen" is a rolling field, e.g., a glacier-shaped pasture.

Some Norwegians ended up being called "Moen" because they happened to live in the "Moen" rural district[1] at the time the tax-collectors came through and said "OK, stop changing surnames every time you move to a new rural district. It's confusing the records. You say you recently used to be named Arne Karlsson Hovland, because you then lived in the Hovland gård (rural district) and your father's name is Karl? Well, you're now Arne Karlsson Moen. Remember that. Stick with it, even if you move again."

[1] http://www.moen-gard.no/




René Pfeiffer [lynx at luchs.at]


Tue, 10 Jun 2008 20:20:04 +0200

On Jun 04, 2008 at 2328 -0400, Ben Okopnik appeared and said:

> [...]
> Oh, and because it doesn't need 'xargs' or any equivalent of it - since
> the files (or directories) are processed one at a time rather than being
> spit out as a (possibly) huge list.

And what do I do when I just want this huge list (or a couple of slightly smaller huge lists)? Push the results to a global array? I had an overdose of C++ so I developed an allergy to globals. :)

Best, René.




Ben Okopnik [ben at linuxgazette.net]


Tue, 10 Jun 2008 15:11:31 -0400

On Tue, Jun 10, 2008 at 08:20:04PM +0200, René Pfeiffer wrote:

> On Jun 04, 2008 at 2328 -0400, Ben Okopnik appeared and said:
> > [...]
> > Oh, and because it doesn't need 'xargs' or any equivalent of it - since
> > the files (or directories) are processed one at a time rather than being
> > spit out as a (possibly) huge list.
> 
> And what do I do when I just want this huge list (or a couple of
> slightly smaller huge lists)? Push the results to a global array?
> I had an overdose of C++ so I developed an allergy to globals. :)

Perl has that allergy built right in - at least for smart programmers. :)

use strict;		# Disallows implicit declaration of globals

Other than that, though - yes, you'd push it onto a huge array, or slightly smaller huge lists. I suppose you could also create a hash or a scalar if you wanted to.

# Private vars
my ($very_long_line, @huge, @kinda_huge1, @kinda_huge2, %gigantor);
 
sub wanted {
	push @huge, $_;				# Save the entire list
 
	if (/^[A-Z]+$/){
		push @kinda_huge1, $_;	# ALL CAPS FILENAMES
	}
	else {
		push @kinda_huge2, $_;	# All the others
	}
 
	# Create a 'hash of lists' in which the keys are full directory
	# names and the values are list(ref)s containing the filenames
	push @{$gigantor{$File::Find::dir}}, $_;
	
	# While we're at it, count how many times each filename occurs
	$gigantor{file_counts}{$_}++;
 
	# Create that big long scalar - special for René! :)
	$very_long_line .= defined $very_long_line ? " $_" : $_;
}

I.e., options galore.
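
For anyone who shares the allergy to globals, one possible sketch (an assumption about how it could be arranged, not code from this thread) keeps the list in a lexical inside a function and lets an anonymous callback close over it, so nothing leaks out. The name `collect_files` and the directories in the usage line are hypothetical.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: gather the full paths of all plain files under @dirs.
sub collect_files {
    my @dirs = @_;
    my @found;    # lexical: visible only inside this sub and its closure
    find(sub { push @found, $File::Find::name if -f }, @dirs);
    return @found;
}

# Usage (placeholder directories):
# my @huge = collect_files('/foo', '/bar');
```

The same pattern works for the hash or the long scalar: declare it with 'my' next to the 'find()' call and return it (or a reference to it) when the traversal is done.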

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




Thomas Adam [thomas.adam22 at gmail.com]


Tue, 10 Jun 2008 20:16:22 +0100

2008/6/10 Ben Okopnik <ben@linuxgazette.net>:

> Perl has that allergy built right in - at least for smart programmers. :)
>
> ``
> use strict;             # Disallows implicit declaration of globals
> ''

Pffft. A glorified spell-checker more like. :)

-- Thomas Adam




Ben Okopnik [ben at linuxgazette.net]


Tue, 10 Jun 2008 17:17:28 -0400

On Tue, Jun 10, 2008 at 08:16:22PM +0100, Thomas Adam wrote:

> 2008/6/10 Ben Okopnik <ben@linuxgazette.net>:
> > Perl has that allergy built right in - at least for smart programmers. :)
> >
> > ``
> > use strict;             # Disallows implicit declaration of globals
> > ''
> 
> Pffft.  A glorified spell-checker more like.  :)

Thomas, perhaps you shouldn't just scream "I'M A BAD PROGRAMMER!!!" so loudly where people can hear you. :)

I suppose that 'use strict' can be described as 'a glorified spell-checker' - after all, you just have - but not by anyone who has read the documentation or actually knows what it does. On the other hand, if you class it with other 'spell checkers' such as, say, 'lint', 'valgrind', 'ddd' and so on, then perhaps you're right; it definitely belongs in that list.
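
As a small illustration of what 'use strict' catches beyond misspellings, the sketch below compiles two snippets with string 'eval' and returns whatever error results. The helper name `strict_error` is made up for this example; the first snippet trips the 'vars' check at compile time, the second trips the 'refs' check at run time.

```perl
use strict;
use warnings;

# Compile and run a snippet under 'use strict'; return the error text,
# or an empty string if it ran cleanly. (Hypothetical helper.)
sub strict_error {
    my ($code) = @_;
    my $ok = eval "use strict; $code; 1";
    return $ok ? '' : $@;
}

# strict 'vars': a variable that was never declared
print strict_error('$total = 1');

# strict 'refs': a plain string used as if it were a reference
print strict_error('my $name = "total"; $$name = 1');
```

The second case is not a typo at all: the variable is spelled correctly and declared, and the code is still refused because it would silently create and modify a global through a symbolic reference.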

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




Thomas [thomas at edulinux.homeunix.org]


Tue, 10 Jun 2008 22:21:19 +0100

On Tue, 10 Jun 2008 17:17:28 -0400 Ben Okopnik <ben@linuxgazette.net> wrote:

> Thomas, perhaps you shouldn't just scream "I'M A BAD PROGRAMMER!!!" so
> loudly where people can hear you. :)

I make no bones about that. And yes, it was a joke. ;)

-- Thomas Adam

-- 
"It was the cruelest game I've ever played and it's played inside my
head." -- "Hush The Warmth", Gorky's Zygotic Mynci.




Ben Okopnik [ben at linuxgazette.net]


Fri, 13 Jun 2008 15:10:45 -0400

On Tue, Jun 10, 2008 at 10:21:19PM +0100, Thomas wrote:

> On Tue, 10 Jun 2008 17:17:28 -0400
> Ben Okopnik <ben@linuxgazette.net> wrote:
> > Thomas, perhaps you shouldn't just scream "I'M A BAD PROGRAMMER!!!" so
> > loudly where people can hear you. :)
> 
> I make no bones about that.  And yes, it was a joke.   ;)

Note that I didn't say that you were a bad programmer. I just said you shouldn't scream it so loudly. :)

(Jokes do come in a variety of flavors, of course - and the "beauty" of one can get right into the eye of an innocent beholder...)

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




René Pfeiffer [lynx at luchs.at]


Tue, 10 Jun 2008 21:23:19 +0200

On Jun 10, 2008 at 1511 -0400, Ben Okopnik appeared and said:

> On Tue, Jun 10, 2008 at 08:20:04PM +0200, René Pfeiffer wrote:
> > On Jun 04, 2008 at 2328 -0400, Ben Okopnik appeared and said:
> > > [...]
> > > Oh, and because it doesn't need 'xargs' or any equivalent of it - since
> > > the files (or directories) are processed one at a time rather than being
> > > spit out as a (possibly) huge list.
> >
> > And what do I do when I just want this huge list (or a couple of
> > slightly smaller huge lists)? Push the results to a global array?
> > I had an overdose of C++ so I developed an allergy to globals. :)
>
> Perl has that allergy built right in - at least for smart programmers. :)

Yes, and in the second after I sent my email I knew that you were going to teach me a lesson. :)

> ``
> use strict;		# Disallows implicit declaration of globals
> ''

Reading manuals costs too much time! ;)

> Other than that, though - yes, you'd push it onto a huge array, or
> slightly smaller huge lists. I suppose you could also create a hash
> or a scalar if you wanted to. [...]

I have to try that. I basically want to do a very simple thing: create lists of places where group A or user B has access to.

> I.e., options galore.

There's more than one way to screw up. :)

Thanks, René.




Ben Okopnik [ben at linuxgazette.net]


Tue, 10 Jun 2008 19:38:27 -0400

On Tue, Jun 10, 2008 at 09:23:19PM +0200, René Pfeiffer wrote:

> On Jun 10, 2008 at 1511 -0400, Ben Okopnik appeared and said:
> > 
> > Perl has that allergy built right in - at least for smart programmers. :)
> 
> Yes, and in the second after I sent my email I knew that you were going
> to teach me a lesson. :)

It's purely in gratitude for the lessons you give me in MySQL and TCP/IP internals. :)

> > ``
> > use strict;		# Disallows implicit declaration of globals
> > ''
> 
> Reading manuals costs too much time! ;)

That's one of the reasons for using 'strict' and 'warnings'. You don't have to read anything - they'll just scream at you if you do it wrong. :)

> > Other than that, though - yes, you'd push it onto a huge array, or
> > slightly smaller huge lists. I suppose you could also create a hash
> > or a scalar if you wanted to. [...]
> 
> I have to try that. I basically want to do a very simple thing: create
> lists of places where group A or user B has access to.

If I understand what you're saying, that would be something like

$gid = getgrnam($group_A);
$uid = getpwnam($user_B);
 
# Get the perms, owner, and group of the file
($mode, $owner, $group) = (stat($File::Find::name))[2, 4, 5]
	or die "Stat failed: $!\n";
# Mask off the filetype/convert to octal
$mode = sprintf "%04o", $mode & 07777;

Hopefully, the rest is obvious.
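
For readers who would like the rest spelled out, here is one way it might look, as a sketch under stated assumptions: "access" is taken to mean the owner-read bit for user B or the group-read bit for group A, and supplementary group membership and the "other" bits are ignored. The names `can_read` and `readable_places`, and the account and directory names in the usage lines, are hypothetical. The callback stats `$_`, which works whether or not the search roots are absolute paths, because 'find()' changes into each directory by default.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: true if the file's owner is $uid and the owner-read
# bit is set, or its group is $gid and the group-read bit is set.
sub can_read {
    my ($mode, $owner, $group, $uid, $gid) = @_;
    return 1 if $owner == $uid && ($mode & 0400);
    return 1 if $group == $gid && ($mode & 0040);
    return 0;
}

# Walk @dirs and return the full paths that pass the check.
sub readable_places {
    my ($uid, $gid, @dirs) = @_;
    my @places;
    find(sub {
        my ($mode, $owner, $group) = (stat $_)[2, 4, 5];
        return unless defined $mode;    # stat failed; skip this entry
        push @places, $File::Find::name
            if can_read($mode, $owner, $group, $uid, $gid);
    }, @dirs);
    return @places;
}

# Usage (placeholder names):
# my $uid = getpwnam('user_B');
# my $gid = getgrnam('group_A');
# print "$_\n" for readable_places($uid, $gid, '/foo');
```

Keeping the permission test in its own function makes it easy to swap in a different definition of "access" (write bits, execute bits on directories, and so on) without touching the traversal.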

> > I.e., options galore.
> 
> There's more than one way to screw up. :)

That's the motto of Perl! Oh, wait... :)

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *




René Pfeiffer [lynx at luchs.at]


Wed, 11 Jun 2008 12:33:36 +0200

On Jun 10, 2008 at 1938 -0400, Ben Okopnik appeared and said:

> On Tue, Jun 10, 2008 at 09:23:19PM +0200, René Pfeiffer wrote:
> > [...]
> > Reading manuals costs too much time! ;)
> 
> That's one of the reasons for using 'strict' and 'warnings'. You don't
> have to read anything - they'll just scream at you if you do it wrong.
> :)

That's why I love to talk to GCC and Intel's CC; they tell long stories at times. :)

> > [...]
> > I have to try that. I basically want to do a very simple thing: create
> > lists of places where group A or user B has access to.
> 
> If I understand what you're saying, that would be something like
> 
> ```
> $gid = getgrnam($group_A);
> $uid = getpwnam($user_B);
> 
> # Get the perms, owner, and group of the file
> ($mode, $owner, $group) = (stat($File::Find::name))[2, 4, 5]
> 	or die "Stat failed: $!\n";
> # Mask off the filetype/convert to octal
> $mode = sprintf "%04o", $mode & 07777;
> '''
> 
> Hopefully, the rest is obvious.

Yes, it is, thanks. The problem I had yesterday was getting the code that find2perl spits out to work. "find2perl /tmp -perm 0220 -user lynx" produces code where a hash of all uid/user mappings is created first.

my (%uid, %user);
while (my ($name, $pw, $uid) = getpwent) {
        $uid{$name} = $uid{$uid} = $uid;
}

And believe it or not, this wasn't working as expected yesterday (or I was really tired). The "($uid == $uid{'lynx'})" part failed. Too bad the code I fiddled with is on my machine at home. I'll try again now with a different code, let's see.

Best, René.




Ben Okopnik [ben at linuxgazette.net]


Wed, 11 Jun 2008 07:24:05 -0400

On Wed, Jun 11, 2008 at 12:33:36PM +0200, René Pfeiffer wrote:

> On Jun 10, 2008 at 1938 -0400, Ben Okopnik appeared and said:
> > On Tue, Jun 10, 2008 at 09:23:19PM +0200, René Pfeiffer wrote:
> > > [...]
> > > Reading manuals costs too much time! ;)
> > 
> > That's one of the reasons for using 'strict' and 'warnings'. You don't
> > have to read anything - they'll just scream at you if you do it wrong.
> > :)
> 
> That's why I love to talk to GCC and Intel's CC; they tell long stories
> at times. :)

If you tell Perl to 'use diagnostics;', it will too - in a very grandmotherly fashion. "Look, I know it's not your fault; you weren't really trying to do anything bad - but this awful thing just happened..." Unlike 'use warnings' and 'use strict', though, 'use diagnostics' should be removed before putting the code into production: it slows things down. For me, the easiest way to use it is from the command line; that way, there's nothing to remove.

perl -Mdiagnostics scriptname

> And believe it or not, this wasn't working as expected yesterday (or I
> was really tired). The "($uid == $uid{'lynx'})" part failed. Too bad the
> code I fiddled with is on my machine at home. I'll try again now with a
> different code, let's see.

I just tried it with '0600' as perms - some of the files I had in '/tmp' had those - and it worked fine. Try stripping out some of the conditions in the 'wanted' block and see what happens.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *

