Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New feature: require_glob() (similar to require() but supports globs) #804

Merged
merged 14 commits into from Aug 19, 2020

Conversation

patschi
Copy link
Contributor

@patschi patschi commented Aug 4, 2020

Let me introduce you...

  • glob()
  • eglob() (extended glob())

It's basically like any other glob() function, known from other languages. You can recursively list files, for all kind of different usage scenarios.

There are currently three arguments:

eglob(
  "path", // as string
  true,   // recursive: true|false
  ".js",  // the file extension to filter for, string. "*" to disable filter.
);

What was I'm trying to solve here?
I'm having a directory structure like:

domains/
domains/user1/
domains/user1/domain1.js
domains/user1/domain2.js
domains/user2/
domains/user2/domain1.js
domains/user2/domain2.js

So far I've been using a self-crafted bash script to merge all individual .js files into one single dnscontrol.js file. But that's boring. This is way I've decided solving that with a functionality to be able to dynamically list files and include them.

For example:

var load = glob("./domains/");
for (i = 0; i < load.length; i++) {
	console.log("Loading " + load[i] + "...")
	require(load[i]);
}

For example:

Loading domains/user1/domain1.js...
Hi, domain1 here. I got loaded!
Loading domains/user2/domain2.js...
Hi, domain2 here. I got loaded!

The major difference
However, there's one key difference between eglob() (extended glob) and glob() (wrapper in JavaScript around eglob()). While glob() is designed for being simple and used for dynamic-loading (like above), eglob() is the smarter brother and the actual Go-side function.

glob() will simply return an array, while eglob() returns an array of objects for more complex logics, if required.

Let's go for an example: (glob())

[
  "domains/user1/domain1.tld.js",
  "domains/user2/domain2.tld.js",
  "domains/test.js"
]

vs (eglob())

[
	{
		"DirPath": "domains/user1/",
		"FileName": "domain1.tld.js",
		"IsDir": false,
		"ModTime": "2020-08-03T19:37:04+0100",
		"Mode": 666,
		"Size": 406
	},
	{
		"DirPath": "domains/user2/",
		"FileName": "domain2.tld.js",
		"IsDir": false,
		"ModTime": "2020-08-03T19:37:04+0100",
		"Mode": 666,
		"Size": 446
	}
]

A few examples:

console.log(">" + JSON.stringify(glob("./domains/"), null, 2) + "<"); // default
console.log(">" + JSON.stringify(glob("./domains/", false), null, 2) + "<"); // not recursive
console.log(">" + JSON.stringify(glob("./domains/", true, "*"), null, 2) + "<"); // recursive and showing all files
console.log(">" + JSON.stringify(glob("./domains/", true), null, 2) + "<"); // recursive and only .js files (default)

One more important thing to note: eglob() is as smart as require() is. It lists files always relative to the JavaScript file where it's being used in. Let's go with an example:
dnscontrol.js:

require("domains/index.js");

domains/index.js:

console.log( JSON.stringify(glob("./user1/"), null, 2) );

This will now show files being present underneath ./domains/user1/ and NOT at below ./domains/.

Hope this makes sense.

@tlimoncelli
Copy link
Contributor

tlimoncelli commented Aug 8, 2020

Wow! That's a lot to take in! (That's a compliment... there's a lot of feature work going on here.)

This makes me wonder where we should draw the line between what we include in the language and what we should leave out. I want to keep things simple and not turn into a competitor to Terraform, Chef, or Puppet.

Like the Go language itself, I prefer to only add features that solve a pain point that exists at the time; future-proofing or including variations for completeness sake should be avoided.

Let's take a step back and look at the most common use cases: recursively and non-recursively require()ing files from a directory. For that, I'd like to propose a much more simple interface:

require_glob("domains/", false, ".js")

Since the last 2 arts are the same as the defaults, we can shorten this to:

require_glob("domains/*.js")

Or, if recursion is needed:

require_glob("domains", true)

Are there any use-cases that this wouldn't cover?

@patschi
Copy link
Contributor Author

patschi commented Aug 9, 2020

This makes me wonder where we should draw the line between what we include in the language and what we should leave out.

That's a good point. I was wondering about the same if glob() should be JavaScript or Go. However as I see it as a kinda-wrapper, I thought JavaScript might be easier to maintain or to take it as an example.

I want to keep things simple and not turn into a competitor to Terraform, Chef, or Puppet.

IMHO dnscontrol is far away from being a competitor for named examples - but already as great! :D

Are there any use-cases that this wouldn't cover?

I see your point. But my opinion and decision was: Why limiting the data retrieved by Golang's glob(), which might be useful for any kind of different usecases we both aren't aware of right now? Users always find creative ways to solve the strangest scenarios with the most elegant solutions. :) (For example loading JavaScript files based on modification dates.)

Based on that point: IMHO

  • I'm not happy to rework glob() to require_glob(), as it might be used for some other cases too.
  • But additional require_glob() wrapper in JavaScript might be a way to go.

Also to add: It's also possible to use "patterns" as described in Golang docs.

@tlimoncelli
Copy link
Contributor

Hi!

I think we can meet half way.

require_glob() is an oft-requested feature. I think your solution for that is great and I'd love to receive a PR that implements it. Exposing glob() or eglob() to the Javascript? I don't think that belongs in helpers.js. It belongs in your own dnsconfig.js or file that your require() yourself.

That's the short version. The long version is...

Following the YAGNI principle, I only add features to DnsControl when they are needed. (https://martinfowler.com/bliki/Yagni.html)

Every feature requires support: the code has to be maintained, bugs fixed, support questions answered, etc. Adding unused features for completeness or because someone might need it in the future adds to the work. dnscontrol isn't my full-time job so I want to keep that load to a minimum.

(YAGNI exists outside of tech too. In the legal system they have a term "ripe": You can't sue someone because of a hypothetical problem a law might cause... you have to find a real situation where the law is causing a problem and sue based on that. The reason is that we can always debate a hypothetical. When something is real ("ripe") the debate is limited. )

For example, if we expose eglob()'s objects to the users, then I have to support it. I don't know Javascript. Supporting questions around that would be a big problem.

There may come a day where we switch Javascript interpreters (the current one isn't getting updates) or switch to a smaller subset of javascript. The less we expose to the user the better.

So, is there a "ripe" need for require_glob()? Maybe. Why not simply add a require() every time you create a new file? Most people add domains once in a while. If you are adding them constantly, this might not be the right tool for you. A major sports network uses dnscontrol to maintain 2,000+ domains without require_glob().

There are other ways get the benefit of require_glob(). For example, generate dnsconfig.js using a Makefile. Split dnsconfig.js into a front and end file, insert all the require()'s in the middle:

dnsconfig.js: inputs/*.js
        ( cat base/dnsconfig.js-FRONT ; \
                ls inputs/*.go | awk  '{ print " require(" $$1 ")" }' ; \
                cat base/dnsconfig.js-END ) > dnsconfig.js

(That's from memory. It may not have the quoting right.)

There are reasons NOT to add require_glob(): There are security issues around accidentally requiring an unintended file. require() itself is a security issue because if you require() an untrusted file, it can do anything. It can redefine D()! I had someone tell me they're using unix file permissions and subdirectories to control who can edit what. I warned him about this and suggested git PRs are safer.

Is there a "ripe" need for eglob()? You mentioned "loading JavaScript files based on modification dates" and I assume that was a joke.

That said, if you have a project that is blocked because glob/eglob doesn't exist, please let me know more about the situation. I would gladly help engineer a solution... possibly surprise you with something even better than glob/eglob.

This is similar to how the Go project is managed. Features like Modules, Generic, etc. could have been added years ago. However, by delaying they were able to study the issue deeply and come up with a better solution.

That was more than I planned to write but I hope you read all the way here :-)

@patschi
Copy link
Contributor Author

patschi commented Aug 12, 2020

Exposing glob() or eglob() to the Javascript? I don't think that belongs in helpers.js. It belongs in your own dnsconfig.js or file that your require() yourself.

I'm not sure if I'm getting this right. Writing an own glob() function in a .js file won't work out, as it doesn't have access to the underneath filesystem to get these kind of information. So you need corresponding code in Go and to expose it to JS to make this work. But I have to admit that I don't know the JS engine, Otto, in-depth to be sure on that.

But just to make sure we align to the decision/way of implementation: Would it be fine to have glob() just returning a simple array of found files, and require_glob() as a wrapper (Go or JS?) around that? (For being able to work you need an array of files at minimum anyway.)

Following the YAGNI principle
Every feature requires support

I've never heard about YAGNI principle, to be honest. But thanks for bringing this up! Haven't read linked blog post yet, but seems to be an interesting one I've saved for reading later.

And also regarding the support - I've to admit, you're totally right on that. Sometimes I might be too excited, thinking too complex/big and basically keep throwing in stuff someone might need someday in some way - which is completely the opposite of YAGNI. (To my defense: I'm not in the software development (since longer time anymore)! :)) So I do appreciate bringing this to attention.

For example, generate dnsconfig.js using a Makefile.

Yes. That's actually what I'm doing right now. But specially during testing I've wondered why specific changes didn't apply or ended up in expected results... What the "issue" was? You might have guessed it: Forgot running the make script I wrote. So that was basically one motivation moving that part into dnscontrol itself.

That was more than I planned to write but I hope you read all the way here :-)

Yes, of course I did! I do really appreciate it! :)

@tlimoncelli
Copy link
Contributor

Exposing glob() or eglob() to the Javascript? I don't think that belongs in helpers.js. It belongs in your own dnsconfig.js or file that your require() yourself.

I'm not sure if I'm getting this right. Writing an own glob() function in a .js file won't work out, as it doesn't have access to the underneath filesystem to get these kind of information. So you need corresponding code in Go and to expose it to JS to make this work. But I have to admit that I don't know the JS engine, Otto, in-depth to be sure on that.

Good point. My thought is to include it in the Go code, implement glob() in helpers.js but don't document it. (Maybe with a name or comment that implies it isn't for general use?). require_glob() would be implemented as needed and documented.

Tom

@patschi
Copy link
Contributor Author

patschi commented Aug 13, 2020

implement glob() in helpers.js

require() depends on the glob-function to load each file, so glob() technically can't be in the helpers.js (also because it doesn't have filesystem access).

What I'd suggest instead: glob() and require_glob() in the Go code and both exposed to the JS engine, where require_glob() can Go-wise use the functionality provided by glob(). (This way glob() is technically required for require_glob() to work and "exists for a reason" and is not "useless overhead" to be maintained. If this makes sense.). So both functions basically can be maintained in the Go code and changed/removed if needed.

Sounds like a plan, doesn't it? :D

@tlimoncelli
Copy link
Contributor

That's sounds like a plan. The require_glob() function should have a documentation page but not the glob() function. That way if someone uses it, it's obviously at their own risk.
How about that?

@patschi
Copy link
Contributor Author

patschi commented Aug 14, 2020

I've just reworked it. However I've still gone with the approach having require_glob() in the JS-part. That's way easier and less complex than handling all the sanity checks and passing around the stuff within Go-code. Adjusted docs for require_glob() as well. Is this fine for you?

docs/_functions/global/require_glob.md Show resolved Hide resolved
pkg/js/js.go Show resolved Hide resolved
@tlimoncelli
Copy link
Contributor

This needs a "go generate" and it should be ready for merge. LMK

@patschi
Copy link
Contributor Author

patschi commented Aug 18, 2020

As IGNORE_TARGET() was also implemented in helper.js, therefore the merge conflict, it might be easier to merge this one and commiting a single go generate afterwards, wouldn't it?

@tlimoncelli
Copy link
Contributor

There's no right way to deal with this (one of the problems with go generate). On this project I ask each merge to slip in a go generate; git commit file1 file2 file3 as part of the merge.

Hopefully this will go away after golang/go#35950.

@patschi
Copy link
Contributor Author

patschi commented Aug 18, 2020

There's no right way to deal with this (one of the problems with go generate). On this project I ask each merge to slip in a go generate; git commit file1 file2 file3 as part of the merge.

Do you want me merging master into js-glob branch and then committing an updated static.go here?

Hopefully this will go away after golang/go#35950.

Ah, yeah. Already upvoted that apparently :D

@tlimoncelli
Copy link
Contributor

Yes. Please git merge master then commit and push.

@patschi
Copy link
Contributor Author

patschi commented Aug 18, 2020

There we Go! :)
(Apologize the bad joke.)

@tlimoncelli
Copy link
Contributor

Puns are permitted on this project. In fact, they are encouraged. :)

@tlimoncelli tlimoncelli changed the title glob() feature! New feature: require_glob() is like require() but supports globs Aug 19, 2020
@tlimoncelli tlimoncelli changed the title New feature: require_glob() is like require() but supports globs New feature: require_glob() (similar to require() but supports globs) Aug 19, 2020
@tlimoncelli tlimoncelli merged commit 576c2bd into StackExchange:master Aug 19, 2020
rblenkinsopp pushed a commit to rblenkinsopp/dnscontrol that referenced this pull request Aug 21, 2020
…StackExchange#804)

* Initial implementation of findFiles/globe/glob

* Fixed path, some small improvements

* filepath.Dir() calls Clean() automatically anyway

* Relative path support (like require()), renamed func

* Check file ext prefix, further comments, var renaming

* Updated static.go after merge

* Added doc for glob()

* Tiny adjustment of description of glob()

* Updated docs for possible pattern

* Reworked glob, added public-facing require_glob()

* Updated docs with examples

* Updated static.go

* go generate
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants