Glob hell

Added by Arseny Kapoulkine almost 9 years ago

I want to write a rule that, given a directory path (e.g. $(SUBDIR)), collects all source files recursively and returns a list of relative paths.
Glob scans only the given directory and does not recurse into subfolders, so I needed something else.

Initially I tried to use :W expansion, something like:

rule C.Sources PATH
{
    # default to current directory
    if ! $(PATH) {
        PATH = $(SUBDIR) ;
    }

    # retrieve all c/cpp files from path
    local EXT = cpp c ;
    local FILES = @($(PATH)/**/*.$(EXT):W) ;

    # return relative paths
    return [ Subst $(FILES) : ^$(SUBDIR)/ ] ;
}

That worked fine... until somebody put the project folder in a path that contained '+'. The '+' got parsed as part of a pattern expression instead of a literal character, so Subst failed to match and the rule returned absolute paths.
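The failure mode is easy to reproduce outside Jam. The following is a minimal Python sketch, for illustration only: it is not JamPlus code, and the function names and the example path are made up. It shows that a prefix containing '+' stops matching when it is used as a pattern, and that escaping the prefix restores the literal match.

```python
import re

def strip_prefix_regex(path, prefix):
    # Naive version: the prefix is used directly as a regular expression,
    # so a '+' in it acts as a quantifier instead of a literal character.
    return re.sub("^" + prefix + "/", "", path)

def strip_prefix_escaped(path, prefix):
    # Escaping the prefix makes every character match literally.
    return re.sub("^" + re.escape(prefix) + "/", "", path)

subdir = "C:/dev/my+project"          # hypothetical project location
full = subdir + "/src/main.cpp"

print(strip_prefix_regex(full, subdir))    # C:/dev/my+project/src/main.cpp
print(strip_prefix_escaped(full, subdir))  # src/main.cpp
```

The naive version silently returns the path unchanged, which matches the symptom described above: no error, just absolute paths where relative ones were expected.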

Then I tried to use Glob recursively, something like:

rule C.SourcesHelper PATH : RELPATH
{
    # retrieve all c/cpp files from path
    local EXT = cpp c ;
    local FILES = [ Glob $(PATH) : *.$(EXT) : 0 ] ;

    # recursively retrieve from subfolders
    local DIRS = [ Glob $(PATH) : */ : 0 ] ;
    for DIR in $(DIRS)
    {
        FILES += [ C.SourcesHelper $(PATH)/$(DIR) : $(DIR) ] ;
    }

    return $(RELPATH)$(FILES) ;
}

Well, it did not work. It seems that when we glob a path that ends with / (as every recursive call does), the returned names have their first letter chopped off, which I believe is a bug. For now I'm manually removing the trailing slash from the Glob results, but:

a. there seems to be a bug in Glob?
b. perhaps a built-in recursive matching glob-like rule is a good idea? :W syntax lacks the third Glob argument (prepend)
c. alternatively, perhaps a built-in "give me relative path from two paths" is a good idea?
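
For reference, the behavior asked for in (b) and (c) can be sketched outside Jam. This is a minimal Python illustration, not a proposal for the actual JamPlus implementation; `collect_sources` is a hypothetical name. It walks a directory recursively and returns paths relative to the starting directory, computed structurally rather than by pattern substitution.

```python
import os

def collect_sources(root, exts=("cpp", "c")):
    """Return sorted forward-slash paths, relative to root, of all files
    under root whose extension is listed in exts."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lstrip(".")
            if ext in exts:
                # relpath works on path components, so characters such as
                # '+' in root need no escaping.
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)
```

Called on a directory that contains a.cpp and ab/bc/c.cpp, this returns those two relative paths regardless of where the directory lives on disk.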


Replies (6)

RE: Glob hell - Added by Joshua Jensen almost 9 years ago

I updated the :W modifier with the ability to remove text from the beginning of the found files:

Echo @(test5/**:W=test5/) ;

I'm reading the rest of your post more carefully to determine whether there is a bug.

Josh

RE: Glob hell - Added by Joshua Jensen almost 9 years ago

First bit of information:

local EXT = cpp c ;
local FILES = @($(PATH)/**/*.$(EXT):W) ;

is slower than:

local EXT = cpp c h ;
local include = "@=*.$(EXT)" ;
Echo @($(PATH)/**$(include:J=""):W) ;

RE: Glob hell - Added by Joshua Jensen almost 9 years ago

I cannot reproduce the chopped character issue. What platform are you on?

local DIRS = [ Glob $(PATH) : */ : 0 ] ;

[S:\jamplus]jam
.git/ bin/ build/ docs/ samples/ src/ tests/
*** found 1 target(s)...
*** finished in 0.01 sec

RE: Glob hell - Added by Arseny Kapoulkine almost 9 years ago

:W= is great, thanks for that!

As for Glob, sorry, my description of the bug was unclear. The problem does not show up unless there is a double slash in the path.

For example, let's take the following hierarchy:
test/ab/bc/c.cpp
Jamfile.jam

Now I'm invoking the rule C.SourcesHelper specified above like this:

Echo [ C.SourcesHelper test : "" ] ;

I added debug output to make things clear.

path test relpath files a.cpp dirs ab/
path test/ab/ relpath ab/ files dirs bc/
path test/ab//bc/ relpath bc/ files .cpp dirs
a.cpp ab/bc/.cpp

The first run is fine, and so is the second, but the second leaves PATH ending with a slash. Because of that, PATH contains a double slash in the third run, and the first letter of the file name is removed. If I invoke the rule like this:

Echo [ C.SourcesHelper test/ : "" ] ;

The output becomes:

path test/ relpath files dirs ab/
path test//ab/ relpath ab/ files dirs c/
path test//ab//c/ relpath c/ files dirs

Here we get a double slash already on the second run, which causes the same problem.

RE: Glob hell - Added by Arseny Kapoulkine almost 9 years ago

By the way, while testing I believe I discovered another bug: Glob does not return single-letter directories for me. I originally started with the path test/a/b/c.cpp, but neither a nor b was returned until I gave them two-letter names.

I'm running on Windows 7 64-bit (though I believe the behavior will reproduce on any Windows platform).

RE: Glob hell - Added by Arseny Kapoulkine almost 9 years ago

Oh, and removing the trailing slash helped because that way $(PATH) never ends with a slash (unless it initially ended with one, but I did not run into that), so $(PATH)/$(DIR) never contains a double slash.
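
The workaround amounts to normalizing the separator at each join. Here is a minimal Python sketch of that idea, for illustration only; `join_dir` is a hypothetical helper, not a Jam or JamPlus rule. The naive join mirrors what `$(PATH)/$(DIR)` does when Glob returns directory names with a trailing slash.

```python
def naive_join(path, name):
    # Plain concatenation with a separator, like $(PATH)/$(DIR).
    return path + "/" + name

def join_dir(path, name):
    # Strip slashes at the seam so exactly one separator remains.
    if not path:
        return name
    return path.rstrip("/") + "/" + name.lstrip("/")

# Directory names as in the debug output above, each with a trailing slash.
step1 = naive_join("test", "ab/")
print(naive_join(step1, "bc/"))   # test/ab//bc/

step1 = join_dir("test", "ab/")
print(join_dir(step1, "bc/"))     # test/ab/bc/
```

With the normalizing join, starting from "test/" instead of "test" produces the same result, which covers the case where the initial path already ends with a slash.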
