Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

User defined imports name mangling #1865

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

Conversation

infomiho
Copy link
Contributor

@infomiho infomiho commented Mar 7, 2024

I've had some issues with conflicting imports when integrating Arctic #1851 and implemented this mangling to help.

Closes #1062
Closes #284

Copy link
Contributor

@sodic sodic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already do the same thing in other places.

For example, these two:

"namespaceMiddlewareConfigFnImportAlias" .= middlewareConfigFnAlias

And possibly others.

I suggest a regex text search in the generated project). I used this command to find the ones I listed here (but didn't do a fully thorough check):

rg 'import.*\{.*as.*\}' -g '**/src/**/*' -g '**/sdk/**/*' -g '!**/dist/**/*'

Ideally, we'd have a single consistent procedure for handling all aliased imports. If not, we should at least make sure we don't apply two procedures for the same symbols (but I'd push for the "ideally" :)).

mangleImportIdentifier JI.JsImport {JI._name = JsImportField name} = prefixTheImportName name jsImport

prefixTheImportName :: String -> JsImport -> JsImport
prefixTheImportName originalName = applyJsImportAlias (Just ("__userDefined" ++ toUpperFirst originalName))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to suggest mangling our imports instead of user imports (to keep the path between user symbols and our code as clear as possible).

However, it seems that it's not our imports conflicting with user imports - it's the thing we're defining in the same file.

Anyway, I don't how to make this without changing user symbol names, but am leaving this comment in case you have an idea.

@sodic
Copy link
Contributor

sodic commented Mar 8, 2024

Btw, your PR description (as its written now) won't trigger automatic closing of both issues. I got burned by this several times.

Instead of:

Closes #123 and #234

Do

Closes #123
Closes #234 

Context: nus-cs2103-AY2021S2/forum#293

@infomiho
Copy link
Contributor Author

infomiho commented Mar 8, 2024

Okay, so we need to unify the way we mangle the imports. I found the following using your regex:

  • sdk/wasp/server/operations/actions/index.ts and sdk/wasp/server/operations/queries/index.ts
    Found in OperationsGenerator -> ("_ext" suffix using applyExtImportAlias and extImportToJsImport 🟣)

  • sdk/wasp/server/webSocket/index.ts
    Found in WebSocketGenerator using the same fn as OperationsGenerator 🟣

  • server/src/routes/apis/index.ts
    Found in ApiRoutesG (custom "wasp_" prefix 🔵)

  • server/src/middleware/globalMiddleware.ts
    Found in ServerGenerator (custom "_waspGlobalMiddlewareConfigFn" 🔵)

  • server/src/jobs/*.ts
    Found in JobGenerator (custom "_waspJobDefinition" 🔵)

🔵 manual alias
🟣 using the applyExtImportAlias and extImportToJsImport

Okay, that seems manageable. Great call to ripgrep!

@infomiho
Copy link
Contributor Author

infomiho commented Mar 11, 2024

The biggest issue I'd like to resolve to make this "proper name mangling" is to somehow keep track of aliases that were used and ensure no duplicate aliases are used in the same file (thus breaking our code!).

Two cases

Happy path (works now)

Let simplest thing we can have in a file:

  • import { foo as foo__userDefined } from ...
  • import { bar as bar__userDefined from ...

This is all fine and in the basic case of nothing goes wrong since there are no duplicate import names i.e. we have foo and bar here.

Duplicate names

What if we had a situation like this in a single template file:

  • import { foo as foo__userDefined } from ...
  • import { foo as foo__userDefined } from ...

We now have duplicate names! foo and foo are both in the same file. This gives us foo__userDefined and foo__userDefined aliases. This would cause our code to break ⚠️

Solutions

Ideally, we would do it like e.g. Rollup does it and keep a global store of aliases and then on duplicate, we would just add a number at the end.

Ideal solution (keep track at template level)

If we kept the imports unique per-file, it might look like this:

File1:

  • import { foo as foo__userDefined$0 } from ...
  • import { foo as foo__userDefined$1 } from ...

File2:

  • import { foo as foo__userDefined$0 } from ...
  • import { foo as foo__userDefined$1 } from ...

Okay solution IMHO (keep track at global level)

Same like the previous solution, but we could have a "global state" and no per-file scoping:

File1:

  • import { foo as foo__userDefined$0 } from ...
  • import { foo as foo__userDefined$1 } from ...

File2:

  • import { foo as foo__userDefined$2 } from ...
  • import { foo as foo__userDefined$3 } from ...

{=/ dbSeeds =}

const seeds = {
{=# dbSeeds =}
{= importIdentifier =},
{= name =}: {= seedFn.importIdentifier =},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we reference the import name in the CLI, we had to adjust the code a bit.

{=/ globalMiddlewareConfigFn.isDefined =}
{=^ globalMiddlewareConfigFn.isDefined =}
const {=& globalMiddlewareConfigFn.importAlias =} = (mc: MiddlewareConfig) => mc
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for a custom alias anymore 😄

{=# routeMiddlewareConfigFn.isDefined =}
{=& routeMiddlewareConfigFn.importStatement =}
{=/ routeMiddlewareConfigFn.isDefined =}
{=/ apiRoutes =}

const idFn: MiddlewareConfigFn = x => x

{=# apiRoutes =}
{=^ routeMiddlewareConfigFn.isDefined =}
const {=& routeMiddlewareConfigFn.importAlias =} = idFn
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for custom aliases anymore, rewrote this code to work with the new mangling.

@infomiho
Copy link
Contributor Author

All of the code that needed name mangling needed to come inside of the Generator monad, which wasn't too much of a difference since we used the JsImport in FileDrafts anyways.

@Martinsos
Copy link
Member

@infomiho I am curious, when does this situation where two user-defined symbols come into conflict happen? Does it happen if they have a query and action with the same name, or something like that? I am asking, because I wonder if we are even ok with allowing them to do something like that. But I am guessing there is some other use case you were considering here.

Because for me, the main thing I wanted us to fix with name mangling was ensuring the user-defined symbols don't come into conflict with the symbols that we defined.

Copy link
Member

@Martinsos Martinsos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome initiative @infomiho !
I like the general idea, although it seems to me like we should still figure out some details a bit: I left comments.

I didn't review all the single files in the Generator where names are defined, but I reviewed the files with the core name mangling logic.

Comment on lines +58 to +66
mangleImportIdentifier :: JsImport -> Generator JsImport
mangleImportIdentifier JI.JsImport {JI._name = JsImportModule name} = mangleJsImportName name jsImport
mangleImportIdentifier JI.JsImport {JI._name = JsImportField name} = mangleJsImportName name jsImport

mangleJsImportName :: String -> JsImport -> Generator JsImport
mangleJsImportName originalName originalJsImport = do
mangledName <- mangleName originalName

return $ applyJsImportAlias (Just mangledName) originalJsImport
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This split is a bit weird.

Your second function takes String + JsImport that has that same String in it, only to apply it as an alias.

I would combine these two into one function, it will be simpler to understand. Function could be defined as setMangledAlias :: JsImport -> Generator JsImport.
It also makes it clear that if there is an alias already, it will overwrite it. Which is behaviour you have right now, but I wonder if we are ok with that? Might be unexpected! If we are ok with it, I would make that clear.

And for getting the name out of JsImport -> you can have a case inside your function to do that easily:

let name = case jsImport of
  JI.JsImport { JI._name = JsImportModule name } -> name
  JI.JsImpoort { JI._name = JsImportField name } -> name

And then you can do mangledName <- mangleName name in the next line, and so o.

Or, what might be even better, is defining a standalone, top level, function called getJsImportNameSymbol :: JsImport -> String that gets you that name. Maybe there is a better more semantically correct name, you will know that better than me since this is fresher for you.

@@ -36,16 +39,28 @@ extImportNameToJsImportName :: EI.ExtImportName -> JsImportName
extImportNameToJsImportName (EI.ExtImportModule name) = JsImportModule name
extImportNameToJsImportName (EI.ExtImportField name) = JsImportField name

jsImportToImportJson :: Maybe JsImport -> Aeson.Value
jsImportToImportJson :: Maybe JsImport -> Generator Aeson.Value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so so far, this function was, given a JsImport, returning a Json object that contains information about that import that we can easily use in a template.

I see two things immediately that are a bit suspicious:

  1. Name is a bit confusing, feels like it could tell us more about what it does, or at least there could be a comment clarifying this.
  2. I see it receives a Maybe JsImport, which is a bit weird, as I would usually expect such cases to be handled by the outside code. It is basically two functions in one -> returns Json for no import, or json for import. Which by clean code you might want to handle as two functions then. THat said, if it is really useful to us in practice, then fine, no worries, let's leave it as it is.

Ok, that said, let's focus on new changes. THis is now doing mangling also! If so, I find that a bit unexpected: I am turning jsImport into json, and now it also gets mangled. So maybe I would add that to the name, to make that obvious?
Or, what sounds like the best option to me, is to not do the mangling here, but to do it separately: either at the moment of constructing the JsImport itself, or as a separate function mangleJsImport :: JsImport -> JsImport. Separate function actually sounds the best. Then we have control over what we want to mangle vs what we don't want to mangle.

But I am guessing you might have a specific reason why you put it here? Because we want to always mangle everything?

Comment on lines +75 to +76
logGeneratorWarning w = modify $ \GeneratorState {errors = errors', warnings = warnings', mangledNames = mangledNames'} ->
GeneratorState {errors = errors', warnings = w : warnings', mangledNames = mangledNames'}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we have written this code like this, instead of doing:

logGeneratorWarning w = modify $ \genState ->
  genState { warnings = w : warnings genState }

Give it a try, if this compiles, then this is nicer solution here.

@@ -37,7 +39,8 @@ newtype Generator a = Generator

data GeneratorState = GeneratorState
{ warnings :: [GeneratorWarning],
errors :: [GeneratorError]
errors :: [GeneratorError],
mangledNames :: Data.Map.Map String Int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this, I don't know what String is here, and what Int is here.
Maybe you could write a comment explaining them?
What you could also do is add something like type Name = String, that could help.


return mangledName
where
mangleName' :: String -> Data.Map.Map String Int -> (String, Data.Map.Map String Int)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so after reading this, I now understand why you have that Map. I would love to not have to read this to be able to understand that though, but I commented that above.

So how mangling works for let's say names foo, bar, and then foo again, is like this:

  • foo__userDefined0, bar__userDefined0 foo__userDefined1.
    Why not:
  • foo__userDefined0, bar__userDefined1, foo__userDefined2?

Btw i am looking at the implementation again I think I got this wrong, I guess the first one doesn't even have a number, just __userDefined? But ok, doesn't change the argument.

It would result in much simpler implementation on our side, we need just a single number as state, and I don't see us losing anything?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok there is one thing I thought of only now -> and that is that we might want to minimize the impact of them adding a new user defined name to the codebase. If such user defined name shifts all other names for a +1, then we have a lot of files updating in generated code, while we could have easily avoided that. Is that why you used the approach you used? If so, that is a quite good reason, and it might be worthing mentioning it in the comments somewhere, in case somebody in the future gets same idea like me to simplify the code.


-- Mangles a name to avoid name clashes. This is useful when generating code that may have user-defined names.
-- It keeps the mangled names unique by appending a counter to the end of the name.
mangleName :: String -> Generator String
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So function is called mangleName, but you actually mangle it as a user defined name. I think that needs to be clear from the name. Because I might want to mangle some of our names also at some point (if not already), and I would happily use this, not knowing it assumes it is mangling a user defined name.

So I would either forget the whole idea of userdefined and just mangle names with a counter, be those user defined names or not (yeah, why do we even care?), or I would rename this function to mangleUserDefinedName.


-- This logs an error and does throw, thus short-circuiting the computation until caught.
logAndThrowGeneratorError :: GeneratorError -> Generator a
logAndThrowGeneratorError e = logGeneratorError >> throwError e
where
logGeneratorError :: Generator ()
logGeneratorError = modify $ \GeneratorState {errors = errors', warnings = warnings'} ->
GeneratorState {errors = e : errors', warnings = warnings'}
logGeneratorError = modify $ \GeneratorState {errors = errors', warnings = warnings', mangledNames = mangledNames'} ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one can also be simplified like I explained above!

. extImportToSdkJsImport

extImportToSdkJsImport :: Maybe EI.ExtImport -> Maybe JI.JsImport
extImportToSdkJsImport Nothing = Nothing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this is a code smell: taking Maybe a as a function argument, especially when it is a single argument.

Yeah when you have this, doing Nothing for Nothing, and something for Just, I dont think it makes much sense then to take Maybe as an argument, you are not doing anything really. If I want to get this behavior, I will just call this function with fmap on Maybe EI.ExtImport. But you just made it much harder for me to run it on EI.ExtImport that is not wrapped in Maybe. So you got nothing (pun intended), but lost something.

GJI.jsImportToImportJson
. extImportToSdkJsImport

extImportToSdkJsImport :: Maybe EI.ExtImport -> Maybe JI.JsImport
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain to me a bit why we have this function, how is it that we have it now but didn't have it before?

getImportJsonForJobDefinition jobName =
GJI.jsImportToImportJson $
Just $
JI.JsImport
{ JI._path = JI.ModuleImportPath $ makeSdkImportPath [relfileP|server/jobs|],
JI._name = JI.JsImportField jobName,
-- NOTE: We are using alias to avoid name conflicts with user defined imports.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so we are not aliasing our names anymore, because we are aliasing their names.

I wonder how this will affect the error messages? Will they now be seeing stuff like getTasks__userDefined in the error messages? If so, it would be better that we mangle our names, and not theirs, when possible.

Also, I remember at some point in the past, I was worrying about the situation where we embed their code directly into the template (so not just importing it, but actually embedding it) -> then there was a lot of opportunity for conflicts with our names in that file. I think we don't have such situations any more though, since we don't embed their code directly -> we had that as a feature in Wasp in very early days and that has been removed.

@infomiho infomiho mentioned this pull request Apr 18, 2024
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants