Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supporte Metadata like description #373

Open
fabian-hiller opened this issue Jan 16, 2024 Discussed in #368 · 26 comments
Open

Supporte Metadata like description #373

fabian-hiller opened this issue Jan 16, 2024 Discussed in #368 · 26 comments
Assignees
Labels
enhancement New feature or request priority This has priority

Comments

@fabian-hiller
Copy link
Owner

Discussed in #368

Originally posted by xcfox January 14, 2024
Valibot is a super cool Schema Builder! I am attracted by Valibot's modular design.

I plan to use Valibot as an Entity Schema Builder for MikroORM.

So far, I am using TypeScript's class and Decorators to define MikroORM's Entity like this:

@Entity()
export class BaseEntity {
  @PrimaryKey()
  id!: number

  @Property()
  createdAt = new Date()

  @Property({ onUpdate: () => new Date() })
  updatedAt = new Date()
}

interface WithName {
  name: string
}

@Entity()
export class User extends BaseEntity implements WithName {
  @Property({ unique: true, comment: "user's name" })
  name!: string

  @index()
  @Property({ nullable: true })
  email: string

  @Property({ type: 'text' })
  password!: string
}

Because TypeScript's class doesn't support multiple inheritance, I have to define the name field repeatedly.

Maybe I can use Valibot and custom pipelines to solve this problem:

import { intersect, Output, nullable, date, string, object } from 'valibot'
import { primaryKey, property, index, unique, description } from './custom'

const baseEntity = object({
  id: string([primaryKey()]),
  createdAt: date([property()]),
  updatedAt: date([property()]),
})

const withName = object({
  name: string([property(), index(), description("user's name")]),
})

export const userEntity = intersect([
  object({
    email: nullable(string([property(), unique()])),
    password: string([property()]),
  }),
  baseEntity,
  withName,
])

export interface IUser extends Output<typeof userEntity> {}

Here I use many custom pipelines: primaryKey, property, index, unique, description. However, the current version of Valibot's pipeline only supports Validation and Transformation. I don't think these custom pipes belong to either Validation or Transformation.

Once Metadata is supported, Valibot has the ability to shine in all scenarios where a Schema needs to be defined, such as graphql, typegoose, typeorm and not just for data validation like zod. Valibot's modularity, combined with Metadata, makes for a very powerful Schema Builder! All projects using TypeScript can use Valibot to define Schemas, and the development experience will be far superior to that of class and Decorators.

@fabian-hiller fabian-hiller self-assigned this Jan 16, 2024
@fabian-hiller fabian-hiller added enhancement New feature or request priority This has priority labels Jan 16, 2024
@fabian-hiller
Copy link
Owner Author

fabian-hiller commented Jan 22, 2024

I have investigated this use case. As far as I understand my code, the current implementation of Valibot does not limit this functionality. This feature can already be added by third party libraries by providing pipeline actions like index and description. Here is an example that implements index and description:

import * as v from 'valibot';

/**
 * Index metadata type.
 */
type IndexMetadata<TInput> = {
  async: false;
  _parse(input: TInput): v.PipeActionResult<TInput>;
  type: 'index';
};

/**
 * Creates a pipeline metadata action to indicate that the field should be indexed.
 *
 * @returns A metadata action.
 */
function index<TInput>(): IndexMetadata<TInput> {
  return {
    type: 'index',
    async: false,
    _parse(input) {
      return v.actionOutput(input);
    },
  };
}

/**
 * Description metadata type.
 */
type DescriptionMetadata<TInput> = {
  async: false;
  _parse(input: TInput): v.PipeActionResult<TInput>;
  type: 'description';
  description: string;
};

/**
 * Creates a pipeline metadata action that adds a description the field.
 *
 * @param description The description to be added.
 *
 * @returns A metadata action.
 */
function description<TInput>(description: string): DescriptionMetadata<TInput> {
  return {
    type: 'description',
    description,
    async: false,
    _parse(input) {
      return v.actionOutput(input);
    },
  };
}

// Create user entity schema
const UserEntitySchema = v.object({
  name: v.string([
    index(),
    description('The name of the user.'),
    v.maxLength(30),
  ]),
  // ...
});

// Read index metadata
const shouldBeIndexed = !!UserEntitySchema.entries.name.pipe?.find(
  (action) => 'type' in action && action.type === 'index'
);

The more important thing for us as a community is to decide if this functionality belongs in the scope of Valibot, and if the library should get a first-class metadata API. If we decide to do so, we should discuss if we want to add this to our current pipeline feature as shown above, if we want to rename and reimplement the pipeline feature so that it is not necessary to include async and _parse for metadata objects, or if we want to implement this completely different, for example with a metadata method that adds the metadata directly to the schema object apart from the validation and transformation actions.

I look forward to hearing what you think.

@xcfox
Copy link
Contributor

xcfox commented Jan 25, 2024

@fabian-hiller You're absolutely right.

Over the past few days, I've built the valibot-mikro library for building mikro entities with the help of valibot.
What I'm doing now is disguising metadata like property(), index() as BaseValidation.

/**
 * Mikro property meta.
 */
export type PropertyMeta<Entity = any, TInput = any> = BaseValidation<TInput> & {
	/**
	 * The validation type.
	 */
	type: "mikro_property"

	/**
	 * The property meta.
	 */
	meta?: Partial<EntitySchemaProperty<TInput, Entity>>
}

This works fine, but it feels a little twisted.
As you say, this contains unnecessary async and _parse.

I strongly agree with rename and reimplement the pipeline feature to support pure metadata.
In addition, actually, in valibot-mikro, I've implemented some methods similar to the metadata method, that is, manyToMany, manyToOne, which are actually relation types specific to MikroORM.

In my practice, I realized that there should be no need for an abstract metadata method method, but many more explicit metadata methods, such as property, index, columnName.
By putting the metadata in the pipeline, we can clearly identify the type of the field, while the secondary metadata is in the secondary position:

const User = object({
  name: number([property(), index(), columnName('a_strange_column_name')])
})

But by using metadata as a wrapper, the code starts to get confused, metadata() is not a type, but it is in the place of the type:

const User = object({
  name: metadata(number(), [property(), index(), columnName('a_strange_column_name')])
})

Such an approach may could be even more messy:

const User = object({
  name: columnName(property(index(number())), 'a_strange_column_name')
})

As a community, could we also consider building in some generic metadata like description() so that when writing the schema, we don't have to write generic meta information for every 3rd party library. I see that the describe() method exists in zod.

@xcfox
Copy link
Contributor

xcfox commented Jan 25, 2024

I also noticed that not every schema can accept pipe arguments, such as recursive(), enum_(), picklist().
I understand that these schemas include validation by themselves and may not need an additional pipe, but when I use valibot as a schema builder, I do need to provide a way for all schemas to pass in a metadata.

Do you use a plan to complement the pipe parameter for all schemas?
Or can we add additional meta attributes to the schema? For example, for BaseSchema:

/**
 * Base schema type.
 */
export type BaseSchema<TInput = any, TOutput = TInput> = {
  /**
   * store the metadata.
   */
  meta?: BaseMetadata[];
  /**
   * Whether it's async.
   */
  async: false;
  /**
   * Parses unknown input based on its schema.
   *
   * @param input The input to be parsed.
   * @param info The parse info.
   *
   * @returns The parse result.
   *
   * @internal
   */
  _parse(input: unknown, info?: ParseInfo): SchemaResult<TOutput>;
  /**
   * Input and output type.
   *
   * @internal
   */
  _types?: { input: TInput; output: TOutput };
};

The advantage of this is that the metadate can be completely ignored when running validation for better performance.

@fabian-hiller
Copy link
Owner Author

I like the idea of putting the metadata under its own key. However, this would require that we put the metadata in a different array to separate from the pipe, or we would have to filter and process the pipeline when creating the schema to extract the metadata. Also, I think it would be nice to access the metadata directly via .metadata?.description. Instead of writing .metadata?.find((item) => item.type === 'description').

const User = object({
  // Note: This would require us to always specify the pipeline, even if it is empty
  name: string([], [index(), description('Lorem ipsum')]),
  age: number([minValue(10)], [index(), description('Lorem ipsum')]),
})

Another idea is to provide the metadata directly as an object instead of an array of functions. I suspect this will also result in the smallest bundle size.

const User = object({
  name: string([], { index: true, description: 'Lorem ipsum' }),
  age: number([minValue(10)], { index: true, description: 'Lorem ipsum' }),
})

@xcfox
Copy link
Contributor

xcfox commented Jan 26, 2024

I totally agree with the use of key-value pairs to store metadata.
For the best development experience, I think we should put the metadata in the same array as the validators and transformations:

const User = object({
  name: string([index(), description('Lorem ipsum')]),
  age: number([minValue(10), index(), description('Lorem ipsum')]),
})

To do this we need to do some extra work when creating the schema, which fortunately is fairly simple.

Also, I noticed that ErrorMessage should actually be treated as metadata, so that we don't have to leave a separate pass position for message, and for all schemas we can use rest parameters to pass the pipeline so that we don't have to explicitly build arrays:

const User = object({
  name: string(message("The name is illegal"), index(), description('Lorem ipsum')),
  age: number(minValue(10), index(), description('Lorem ipsum')),
})

This will result in a huge break change.
Another idea is to use the current location of message to accept metadata, which should accept a string or metadata object for compatibility with older code:

const User = object({
  name: string({ description:"Lorem ipsum", index: true }),
  age: number({ message:"too young", description:"Lorem ipsum", index: true },[minValue(10)]),
})

Taking into account compatibility, development experience and packaging size, I think this is a very good solution. We also need to expose the Metadata interface externally for code hinting and type checking:

// valibot
export interface Metadata {
  description?: string
  message?: string
}

Users need to add additional global type declarations when installing third-party libraries:

// env.d.ts
import { Metadata } from 'valibot'
import { MikroMetadata } from 'valibot-mikro'
declare module 'valibot' {
  export interface Metadata extends MikroMetadata {}
}

@fabian-hiller
Copy link
Owner Author

Thanks again for the details! I will look into this and get back to you with the results.

@fabian-hiller
Copy link
Owner Author

fabian-hiller commented Jan 30, 2024

For the best development experience, I think we should put the metadata in the same array as the validators and transformations

I agree that the DX is very nice this way. The downside for me is the internal implementation because it requires us to filter and process the pipeline on schema creation. This would increase the bundle size and slow down the startup performance. Since I expect that the metadata feature will only be used by a small fraction of users, I am not sure if I want to go this way.

Also, I noticed that ErrorMessage should actually be treated as metadata

I see your point. On the other hand, ErrorMessage is used as the first optional argument of every schema and validation functions and validation functions have no pipeline. The current approach makes it consistent across the library.

and for all schemas we can use rest parameters to pass the pipeline so that we don't have to explicitly build arrays

When I first implemented Valibot in July 2023, I considered this approach. In the end I decided against it because it makes it much harder to distinguish the pipe arguments and limits the API for complex schema functions like object that have other optional arguments. Also, the formatting with Prettier looks a lot more chaotic this way.

Another idea is to use the current location of message to accept metadata

This might work, and I will consider it. It might make it harder to handle and process custom error messages, which is an important feature of the library. Also, I am not sure if message should be part of the metadata.

We also need to expose the Metadata interface externally for code hinting and type checking

Great idea! I agree!


Although I have argued against your suggestions, this comment is not a final decision, and I welcome your feedback. In the end, my goal is to work with the community to create a great schema library.

If we decide to add the metadata object as the last optional argument of each schema function, I see several benefits.

  • The current API remains unchanged, there are no breaking changes
  • The optional metadata argument can be easily identified in defaultArgs
  • The metadata argument can be added to the schema object without processing
  • The metadata is easily accessible as key-value pairs via Schema.metadata
  • Users can extend Metadata to add custom properties
  • The bundle size and performance footprint is minimal
const User = object({
  name: string({ index: true, description: 'Lorem ipsum' }),
  age: number([minValue(10)], { index: true, description: 'Lorem ipsum' }),
})

@xcfox
Copy link
Contributor

xcfox commented Jan 31, 2024

I strongly agree with adding the metadata object as the last optional argument of each schema function.
This approach is not much different on DX compared to using the current location of message to accept metadata.
The only problem is that this leads to more verbose signatures for schema functions, such as for object() functions:

Click to show code
/**
 * Creates an object schema.
 *
 * @param entries The object entries.
 * @param metadata The schema metadata.
 *
 * @returns An object schema.
 */
export function object<TEntries extends ObjectEntries>(
  entries: TEntries,
  metadata?: Metadata
): ObjectSchema<TEntries>;

/**
 * Creates an object schema.
 *
 * @param entries The object entries.
 * @param pipe A validation and transformation pipe.
 * @param metadata The schema metadata.
 *
 * @returns An object schema.
 */
export function object<TEntries extends ObjectEntries>(
  entries: TEntries,
  pipe?: Pipe<ObjectOutput<TEntries, undefined>>,
  metadata?: Metadata
): ObjectSchema<TEntries>;

/**
 * Creates an object schema.
 *
 * @param entries The object entries.
 * @param message The error message.
 * @param pipe A validation and transformation pipe.
 * @param metadata The schema metadata.
 *
 * @returns An object schema.
 */
export function object<TEntries extends ObjectEntries>(
  entries: TEntries,
  message?: ErrorMessage,
  pipe?: Pipe<ObjectOutput<TEntries, undefined>>,
  metadata?: Metadata
): ObjectSchema<TEntries>;

/**
 * Creates an object schema.
 *
 * @param entries The object entries.
 * @param rest The object rest.
 * @param pipe A validation and transformation pipe.
 * @param metadata The schema metadata.
 *
 * @returns An object schema.
 */
export function object<
  TEntries extends ObjectEntries,
  TRest extends BaseSchema | undefined
>(
  entries: TEntries,
  rest: TRest,
  pipe?: Pipe<ObjectOutput<TEntries, TRest>>,
  metadata?: Metadata
): ObjectSchema<TEntries, TRest>;

/**
 * Creates an object schema.
 *
 * @param entries The object entries.
 * @param rest The object rest.
 * @param message The error message.
 * @param pipe A validation and transformation pipe.
 * @param metadata The schema metadata.
 *
 * @returns An object schema.
 */
export function object<
  TEntries extends ObjectEntries,
  TRest extends BaseSchema | undefined
>(
  entries: TEntries,
  rest: TRest,
  message?: ErrorMessage,
  pipe?: Pipe<ObjectOutput<TEntries, TRest>>,
  metadata?: Metadata
): ObjectSchema<TEntries, TRest>;

export function object<
  TEntries extends ObjectEntries,
  TRest extends BaseSchema | undefined = undefined
>(
  entries: TEntries,
  arg2?: Metadata | Pipe<ObjectOutput<TEntries, TRest>> | ErrorMessage | TRest,
  arg3?: Metadata | Pipe<ObjectOutput<TEntries, TRest>> | ErrorMessage,
  arg4?: Metadata | Pipe<ObjectOutput<TEntries, TRest>>,
  arg5?: Metadata
): ObjectSchema<TEntries, TRest>

This looks like it will come with more maintenance costs, other than that the solution is perfect!

@fabian-hiller
Copy link
Owner Author

Which metadata properties should Valibot ship by default? Can you create a list for me?

@xcfox
Copy link
Contributor

xcfox commented Jan 31, 2024

We should refer to existing popular standards such as annotations for json-schema, parameter for openapi, GraphQL or ts-doc.
I researched the above specification and came up with some generic metadata fields:

/**
 * Schema metadata type.
 */
export interface SchemaMetadata<T = any> {
  /**
   * The name of the schema.
   */
  name?: string;
  /**
   * A brief description of the schema.
   */
  description?: string;
  /**
   *  The instance value of the schema should not be used and the schema may be removed in the future.
   */
  deprecated?: boolean;

  /**
   * The `examples` is a place to provide an array of examples that validate against the schema.
   */
  examples?: T[];
}

Of course, we can also strictly follow the json-schema specification.

Then we will have the following build-in metadata fields:

/**
 * Schema metadata type.
 */
export interface SchemaMetadata<T = any> {
  /**
   * The title of the schema.
   */
  title?: string;
  /**
   * A brief description of the schema.
   */
  description?: string;

  /**
   * The `examples` is a place to provide an array of examples that validate against the schema.
   */
  examples?: T[];

  /**
   * The `readOnly` keyword indicates that the value of the instance is managed exclusively by the owning authority, and attempts by an application to modify the value of this property are expected to be ignored or rejected by that owning authority.
   */
  readOnly?: boolean;

  /**
   * The `writeOnly` keyword indicates that the value is never present when the instance is retrieved from the owning authority.
   */
  writeOnly?: boolean;

  /**
   *  The instance value of the schema should not be used and the schema may be removed in the future.
   */
  deprecated?: boolean;
}

@fabian-hiller
Copy link
Owner Author

fabian-hiller commented Jan 31, 2024

What about SQL properties like index, unique and primary key?

@xcfox
Copy link
Contributor

xcfox commented Jan 31, 2024

I don't think SQL properties should be included in the valibot package itself. This is because when dealing with a specific business, the situation is complex and varied.
In the TypeScript world, each ORM has its own unique way of defining schema:

https://sequelize.readthedocs.io/
https://orm.drizzle.team/docs/sql-schema-declaration
https://typeorm.io/entities#what-is-entity
https://typegoose.github.io/typegoose/docs/guides/quick-start-guide/#quick-overview-of-typegoose

It is quite difficult to adapt valibot to such a variety of situations or to follow ORM package updates.

In order for valibot to be a generalized schema builder, I think it needs to be used to create various community packages such as valibot-drizzle, valibot-typeorm.
And valibot itself should remain modular and extensible so that developers can easily add unique metadata:

// env.d.ts
import { SchemaMetadata } from 'valibot'
import { MikroMetadata } from 'valibot-mikro'
declare module 'valibot' {
  export interface SchemaMetadata extends MikroMetadata {}
}

@xcfox
Copy link
Contributor

xcfox commented Feb 1, 2024

I'm trying to implement this feature, and the current solution is to use metadata as the last parameter of the schema function.
But now I'm facing a tricky problem. TypeScript doesn't seem to hit the overload correctly.

Here is the array() declaration
/**
 * Creates a array schema.
 *
 * @param item The item schema.
 * @param pipe A validation and transformation pipe.
 * @param metadata The schema metadata.
 *
 * @returns A array schema.
 */
export function array<TItem extends BaseSchema>(
  item: TItem,
  pipe?: Pipe<Output<TItem>[]>,
  metadata?: SchemaMetadata<Input<TItem>>
): ArraySchema<TItem>;

/**
 * Creates a array schema.
 *
 * @param item The item schema.
 * @param message The error message.
 * @param pipe A validation and transformation pipe.
 * @param metadata The schema metadata.
 *
 * @returns A array schema.
 */
export function array<TItem extends BaseSchema>(
  item: TItem,
  message?: ErrorMessage,
  pipe?: Pipe<Output<TItem>[]>,
  metadata?: SchemaMetadata<Input<TItem>>
): ArraySchema<TItem>;

/**
 * Creates a array schema.
 *
 * @param item The item schema.
 * @param metadata The schema metadata.
 *
 * @returns A array schema.
 */
export function array<TItem extends BaseSchema>(
  item: TItem,
  metadata?: SchemaMetadata<Input<TItem>>
): ArraySchema<TItem>;

export function array<TItem extends BaseSchema>(
  item: TItem,
  arg2?: SchemaMetadata<Input<TItem>> | Pipe<Output<TItem>[]> | ErrorMessage,
  arg3?: SchemaMetadata<Input<TItem>> | Pipe<Output<TItem>[]>,
  arg4?: SchemaMetadata<Input<TItem>>
): ArraySchema<TItem>
const schema2 = array(number(), 'Error', [length(1), includes(123)]);

For the above code, TypeScript, gives the following error message:

No overload matches this call.
  Overload 1 of 3, '(item: NumberSchema<number>, pipe?: Pipe<number[]> | undefined, metadata?: SchemaMetadata<number> | undefined): ArraySchema<NumberSchema<number>, number[]>', gave the following error.
    Argument of type 'string' is not assignable to parameter of type '(BaseValidation<number[]> | BaseTransformation<number[]>)[]'.ts(2769)

But the following code works fine:

const schema2 = array(number(), 'Error', [length(1), includes(123)], {});
const schema3 = array(number(), 'Error', [length(1), includes(123)], undefined);

If you have a spare moment, check out my code implementation here

To avoid the above, I think we should use the current ErrorMessage location for the metadata. This location should accept both ErrorMessage and SchemaMetadata, so that break change can be avoided.
We also reserve a message field in metadata for cases where both ErrorMessage and SchemaMetadata are needed. We can either extract the message in defaultArgs(), or we can just make the message part of the metadata, either way it's easy.

This approach also makes our schema function much simpler:

Here we only need two overloads
/**
 * Creates a array schema.
 *
 * @param item The item schema.
 * @param pipe A validation and transformation pipe.
 *
 * @returns A array schema.
 */
export function array<TItem extends BaseSchema>(
  item: TItem,
  pipe?: Pipe<Output<TItem>[]>
): ArraySchema<TItem>;

/**
 * Creates a array schema.
 *
 * @param item The item schema.
 * @param metadata The schema metadata or error message.
 * @param pipe A validation and transformation pipe.
 *
 * @returns A array schema.
 */
export function array<TItem extends BaseSchema>(
  item: TItem,
  metadata?: ErrorMessage | SchemaMetadata<Input<TItem>>,
  pipe?: Pipe<Output<TItem>[]>
): ArraySchema<TItem>;

export function array<TItem extends BaseSchema>(
  item: TItem,
  arg2?: SchemaMetadata<Input<TItem>> | ErrorMessage | Pipe<Output<TItem>[]>,
  arg3?: Pipe<Output<TItem>[]>
): ArraySchema<TItem>
const schema2 = array(number(), 'Error', [length(1), includes(123)]);
const schema3 = array(number(), { message: 'Error', description: "Lorem ipsum" }, [length(1), includes(123)]);

Let's compare the scenarios in detail: that is, placing metadata at the last parameter and placing metadata at the ErrorMessage position:

point at the last parameter at the ErrorMessage position
no breaking changes ⚠️ ts may miss the expected function overload
easily identified in defaultArgs
can be added to the schema object without processing ⚠️ Requires additional operations on the message
easily accessible as key-value pairs via Schema.metadata
Small packing size
code complexity ⚠️ complexity function overload ✅ similar to the current
performance ✅ slightly more complex defaultArgs ✅ slightly more complex defaultArgs

@fabian-hiller
Copy link
Owner Author

We should refer to existing popular standards such as annotations for ...

I would not add properties like examples or deprecated. What use do you see for them? For example, JS Doc can be used as a comment to add such informations.

I don't think SQL properties should be included in the valibot package itself.

Weren't the SQL properties the main reason for this feature? Are the properties not standardized? I thought that a Valibot schema could then be sufficient to generate SQL commands.

In general: Where and how do you plan to use this metadata feature?

✅ slightly more complex defaultArgs (at the ErrorMessage position)

How would you implement defaultArgs in this case?

@xcfox
Copy link
Contributor

xcfox commented Feb 1, 2024

Weren't the SQL properties the main reason for this feature? Are the properties not standardized?

Yes, in fact, there is no standard.
In addition, valibot schema does not contain sufficient information to generate table-building statements SQL. Let's look at a few examples:

  1. In valibot, there is a number() schema,

    • In MySQL, there are INT, SMALLINT, BIGINT, TINYINT, FLOAT, DOUBLE, DECIMAL types.

    • In PostgreSQL, there are INT, SMALLINT, BIGINT, DECIMAL, DOUBLE PRECISION, NUMERIC, MONEY types.

  2. When we face the same SQL statement to build a table, there are many differences in the API design of different orm:

    CREATE TABLE "Post" (
      "id" SERIAL,
      "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
      "updatedAt" TIMESTAMP(3) NOT NULL,
      "title" VARCHAR(255) NOT NULL,
      "content" TEXT,
      "published" BOOLEAN NOT NULL DEFAULT false,
      "authorId" INTEGER NOT NULL,
      PRIMARY KEY ("id")
    );
    TypeORM Declaration Code
    import {
      Entity,
      PrimaryGeneratedColumn,
      Column,
      CreateDateColumn,
      UpdateDateColumn,
    } from 'typeorm';
    
    @Entity('Post')
    export class Post {
      @PrimaryGeneratedColumn()
      id: number;
    
      @CreateDateColumn({ type: 'timestamp', default: () => 'CURRENT_TIMESTAMP(3)' })
      createdAt: Date;
    
      @UpdateDateColumn({ type: 'timestamp' })
      updatedAt: Date;
    
      @Column({ type: 'varchar', length: 255 })
      title: string;
    
      @Column({ type: 'text', nullable: true })
      content: string;
    
      @Column({ type: 'boolean', default: false })
      published: boolean;
    
      @Column({ type: 'int' })
      authorId: number;
    }
    MikroORM Declaration Code
    import {
      Entity,
      PrimaryKey,
      Property,
      SerializedPrimaryKey,
    } from '@mikro-orm/core';
    
    @Entity()
    export class Post {
      @PrimaryKey()
      id: number;
    
      @Property({ type: Date, onCreate: () => new Date() })
      createdAt: Date;
    
      @Property({ type: Date, onUpdate: () => new Date() })
      updatedAt: Date;
    
      @Property()
      title: string;
    
      @Property({ nullable: true })
      content: string;
    
      @Property({ default: false })
      published: boolean;
    
      @Property()
      authorId: number;
    }

In general: Where and how do you plan to use this metadata feature?

I want to use it as the alternative of reflect-metadata

There are lots of problems when declaring schema in TypeScript's class with reflect-metadata:

  • Inheritance only, difficult to combine
  • Need to repeat the type declaration and loses type safety

Specifically, I have two use cases:

  • declare MikroORM's Entity, I've basically implemented a library valibot-mikro

  • declare GraphQL schema

The whole community has been trying to move away from reflect-metadata for a few years now.
In the SQL scenario, emerging ORMs are starting to use their own schema syntax:

This is especially true in the world of GraphQL: pothos, nexus, gqtx, grats.

As you can see, there are so many schema builders in the community. The problem is that the current schema builder ecosystem is separate and not as universal as class and reflect-metadata.

I hope that valibot will become the universal schema builder in the TypeScript world.
Once I declared mikro's entity with valibot, no extra work was needed to get json-schema, openapi graphql's ObjectType, Parameters for command line applications.

@xcfox
Copy link
Contributor

xcfox commented Feb 1, 2024

How would you implement defaultArgs in this case?

/**
 * Returns message and pipe from dynamic arguments.
 *
 * @param arg1 First argument.
 * @param arg2 Second argument.
 *
 * @returns The default arguments.
 */
export function defaultArgs<TPipe extends Pipe<any> | PipeAsync<any>>(
  arg1: ErrorMessage | SchemaMetadata | TPipe | undefined,
  arg2: TPipe | undefined
): [ErrorMessage | undefined, TPipe | undefined, SchemaMetadata | undefined] {
  if (Array.isArray(arg1)) return [undefined, arg1, undefined];

  if (typeof arg1 === 'string' || typeof arg1 === 'function')
    return [arg1, arg2, undefined];

  return [arg1?.message, arg2, arg1];
}

Compare this to the implementation when I use metadata as the last parameter:

/**
 * Returns message and pipe from dynamic arguments.
 *
 * @param args The arguments.
 *
 * @returns The default arguments.
 */
export function defaultArgs<TPipe extends Pipe<any> | PipeAsync<any>>(
  ...args: (SchemaMetadata | TPipe | ErrorMessage | undefined)[]
): [ErrorMessage | undefined, TPipe | undefined, SchemaMetadata | undefined] {
  let message: ErrorMessage | undefined;
  let pipe: TPipe | undefined;
  let metadata: SchemaMetadata | undefined;

  for (const arg of args) {
    if (typeof arg === 'string' || typeof arg === 'function') {
      message = arg;
    } else if (Array.isArray(arg)) {
      pipe = arg;
    } else if (typeof arg === 'object') {
      metadata = arg;
    }
  }

  return [message, pipe, metadata];
}

@fabian-hiller
Copy link
Owner Author

fabian-hiller commented Feb 2, 2024

Thank you very much! I see a problem. The object and tuple schema functions accept a schema as the first optional argument. Since a schema is represented as an object, we can not distinguish it from the metadata object.

@fabian-hiller
Copy link
Owner Author

fabian-hiller commented Mar 5, 2024

Hi, I may change the implementation of object and tuple. I plan to remove the rest argument and provide strictObject and objectWithRest besides object. Same for tuple. strictObject and objectWithRest simply reuse object internally. This change has two benefits. On the one hand, the implementation is more modular, which reduces the bundle size when using only object without the rest argument. On the other hand, it reduces the complexity of the implementation and the type overload definitions.

If we also remove the pipe argument and introduce the pipe function, we completely eliminate function overload definitions for schema functions. If we do this, we can consider either passing the metadata in the form of an object as an argument to schema functions (which would be the easiest), or providing metadata actions for the pipe function. However, this would require the metadata actions to be executed first, which degrades the performance of the initialization and increases the bundle size.

I would therefore prefer option 1. Nevertheless, we should also take a look at the DX.

@fabian-hiller
Copy link
Owner Author

I could imagine that the first optional argument is always the message and the second optional argument is metadata.

// With message and metadata
const Schema = pipe(
  string('Text ...', { description: 'Text ...' }),
  minLength(1)
);

// With just metadata
const Schema = pipe(
  string({ description: 'Text ...' }),
  minLength(1)
);

@xcfox
Copy link
Contributor

xcfox commented Mar 6, 2024

I prefer to use metadata as argument, it makes valibot more clean when constructing the schema:

// With the new `pipe` function
const UserSchema = pipe(
  object({
    id: pipe(string(), primaryKey(), columnName(user_id)),
    name: pipe(string(), unique()),
    bio: pipe(string(), description('Text ...')),
  }),
  table('users')
)

// With metadata argument
const UserSchema = object(
  {
    id: string({ primaryKey: true, columnName: 'user_id' }),
    name: string({ unique: true }),
    bio: string({ description: 'Text ...' }),
  },
  { tableName: 'users' }
)

Another important point is that when using valibot as a pure schema builder, the ORM itself contains solid validation, and it is not very useful to validate data on top of the ORM.
That said, when I use the medatada feature, I almost no longer need the pipeline to validate the data. Separating the pipeline from medatada seems perfect to me. This actually provides better DX, I don't have to declare pipe over and over again.

The only extra work for the developer is to declare the extra metadata fields while installing valibot, as yup and trpc do:

// env.d.ts
import { MikroSchemaMetadata } from 'valibot-mikro'
import 'valibot'

declare module 'valibot' {
  export interface SchemaMetadata extends MikroSchemaMetadata {}
}

@fabian-hiller
Copy link
Owner Author

Thank you very much for your feedback! I will get back to you as soon as I have made the proposed changes. Then we can discuss the details of the metadata feature implementation together.

@xcfox
Copy link
Contributor

xcfox commented Apr 18, 2024

I've found that having TypeScript accurately infer the metadata type is very helpful in checking the correctness of the program.

const Cat = pipe(
  object({
    id: pipe(string(), primaryKey(), columnName('user_id')),
    name: pipe(string(), unique()),
    loveFish: pipe(boolean(), description('Does the cat love fish?')),
  }),
  name('Cat')
);

expectTypeOf<typeof Cat['name']>().toEqualTypeOf<string>()
const CatEntity = toMikroEntity(Cat); // It should pass

const Dog = object({
  id: string(),
  name: string(),
  loveFish: boolean(),
})

const DogEntity = toMikroEntity(Dog); // It should fail with a type error because Dog does not have a name

This does not seem to be achievable using metadata arguments. So now I think pipe is a better design.

Another idea is to add the with method to the schema:

export interface BaseSchema {
  // ...
   with<T>(modifier: (schema: this) => T): T;
  // ...
}

function _with<T>(this: BaseSchema, modifier: (schema: this) => T): T {
  return modifier(this);
}
const Cat = object({
  id: string().with(primaryKey()).with(columnName('user_id')),
  name: string().with(unique()),
  loveFish: boolean().with(description('Does the cat love fish?')),
}).with(name('Cat'));

This is more readable than pipe, but adds a little bit of bundle size.

@fabian-hiller
Copy link
Owner Author

I don't understand what toMikroEntity is doing.

This does not seem to be achievable using metadata arguments.

Can you explain this in more detail?

@xcfox
Copy link
Contributor

xcfox commented Apr 25, 2024

I have a detailed implementation of toMikroEntity at valibot-mikro

When EntitySchema is missing a name, toEntitySchema throws an error. If TypeScript could hint at this missing name error, then we could notice the error much earlier, rather than reporting it at runtime.

In order for TypeScript to do this hint, we need to carry more detailed types on the valibot schema, For example, carrying a name:

function name(
  value: string
): <T extends object>(x: T) => T & { name: string } {
  return (x) => {
    x.name = value
    return x
  }
}

// with pipe
const Cat = pipe(
  object({
    id: pipe(string(), primaryKey(), columnName('user_id')),
    name: pipe(string(), unique()),
    loveFish: pipe(boolean(), description('Does the cat love fish?')),
  }),
  name('Cat')
);

expectTypeOf<typeof Cat['name']>().toEqualTypeOf<string>()

Using pipe, we can modify the type of schema, in this example, we've added the extra name field to Cat.
And I think it's harder to achieve this with metadata argument.

// with metadata argument
const Cat = object(
  {
    id: string(),
    name: string(),
    loveFish: boolean({ description: 'Does the cat love fish?' }),
  },
  { name: Cat }
)

// Is this possible?
expectTypeOf<(typeof Cat)['name']>().toEqualTypeOf<string>()
expectTypeOf<
  (typeof Cat)['entries']['loveFish']['description']
>().toEqualTypeOf<string>()

@fabian-hiller
Copy link
Owner Author

Thank you for the details! I am quite busy with my studies at the moment. I will probably get back to you next week.

@fabian-hiller
Copy link
Owner Author

This is still on my list. I will get back to you as soon as I have the time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request priority This has priority
Projects
None yet
Development

No branches or pull requests

2 participants