Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

provide better defaults for mimetype to extension mapping #81

Open
wants to merge 31 commits into
base: master
Choose a base branch
from

Conversation

mx1up
Copy link

@mx1up mx1up commented Jan 16, 2023

based off pr #15 , afterwards I realized in the meantime mimeFromExtension had already been added 🙄 but without defaults. I went ahead anyway since @kevmoo indicated PR's are still welcome.
So this PR mainly adds the defaults and custom error handling (orElse and nullable version), while maintaining backwards compatibility.

example default:
image/jpeg now nicely maps to "jpg" instead of "jpe"

I copied firstWhereOrNull from the collection package to avoid a dependency.

fixes #55
fixes #13

Edit: removed collection dependency

@mx1up
Copy link
Author

mx1up commented Jan 27, 2023

@kevmoo , @natebosch could you have a look please? 🙏


/// Allow for a user-specified MIME type-extension mapping that overrides the
/// default.
void addMimeType(String mimeType, String extension) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be more comfortable with this PR if it didn't include this API.

Why do we need this one?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @natebosch , thanks for having a look

this library does not know all possible file formats and I think it's neither realistic nor a goal of this lib to achieve that. The library also seems to be seeded with quite a random list of formats. So it can be quite a frequent situation in which a user of this library needs a format added. A pull request could be made, but the release cycle is not that fast and it's also a bit of overhead for maintainers.

That's why I think it may be useful to have this method so users don't immediately get stuck.

Similarly, this library already provides addExtension.

The 2nd usecase is one in which the user of the lib has a different preference than the one that is included. For example, if one prefers 'text/plain' to map to 'conf' as it currently does, one can still configure that old behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the slow response.

I'm worried about having an addMimeType that only impacts extensionFromMime and doesn't impact lookupMimeType.

We could either make this signature addMimeType(String mimeType, String extension, {List<int>? headerBytes}) and try to make this work with lookupMimeType, or we could consider an alternative name like `addExtensionPreference.

@devoncarew @lrhn WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I worry about global mutable state.
This is a kind of registry, where any code can add its own preferences, but they then also affect everybody else.

I'd prefer a structure where you have a Mime object instead of global state and functions. Then you can get your own object, modify it, and pass it around to your own code, without affecting other people's Mime objects.
That's a larger change, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we'll be able to do any refactoring of existing usage.

We could add a MimeRegistry class which can be overridden like this. The top-level APIs would continue to use the global map, the MimeRegistry would fall back on the global map, but it could also store local overrides. @lrhn WDYT about something like this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lrhn - Should I work on a PR with an implementation of a MimeRegistry, or do you think we can get by with the top level approach?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can definitely "get by" with the top-level approach, but it's inherently fragile. I prefer to avoid adding fragile features.

What I'd do, if I had all the time in the world, is to have an immutable registry, and a mutator "builder" type.
The top-level functions should then just forward to the corresponding functions on MimeRegistry.defaultRegistry.

abstract interface class MimeRegistry {
  static get MimeRegistry currentRegistry => ... (probably something that can be zone overridden) ...;

  factory MimeRegistry.empty() = ...;

  int get defaultMagicNumbersMaxLength;
  String lookup(String path, {List<int>? headerBytes});
  String extensionFor(String mimeType);

  MimeRegistryBuilder get build;
  MimeRegistry rebuild(void Function(MimeRegistryBuilder) mutator);
}

abstract interface class MimeRegistryBuilder {
  // I don't know what you'd want, but something like:

  /// Adds mime type, if it doesn't already exist. 
  /// Adds [extension] to the extensions for [mimeType].
  /// Set the header-recognizer function to [headerRecognizer].
  void addMimeType(String mimeType, {String? extension, bool Function(List<int>) headerRecognizer});
  void removeMimeType(String mimeType);
  void addExtension(String mimeType, String additionalExtension);
  void removeExtension(String mimeType, String extension);
  // Maybe some inspection functions, but it's a little weird if those are not available on the registry itself.
  Iterable<String> get allMimeTypes;
  Iterable<String> allExtensionsFor(String mimeType);
  bool Function(List<int>)? headerRecognizerFor(String mimeType);
}

The builder is building on top of an existing registry, and it can either defer to it, lazily copy from it, or eagerly clone.
I'd make the actual implementations hidden.

But I might just be over-engineering. Always a risk 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'd do, if I had all the time in the world

What do you think we should do given that we have less time than that?

Copy link
Member

@lrhn lrhn May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same thing! The worry about lack of time for something else afterwards 😁

But the concrete problem we're trying to solve here seems to be solved without the addMimeType, so I'd remove that, not add configurability in this PR, and just give access to the default file-type extensions.

Then, if we still need configurability, we can come back and look at it fresh, and not stall this PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed addMimeType

@mx1up mx1up requested a review from natebosch April 4, 2023 09:14
/// If there are multiple extensions for [mimeType], return preferred extension
/// if defined, or the first occurrence in the map.
/// If no extension is found, `null` is returned.
String? extensionFromMimeOrNull(String mimeType) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't have the "OrNull" on the name here. That's just repeating the type.
It distinguishes it from the function below, but see comment there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

'video/x-matroska': 'mkv',
};

/// Lookup file extension for a given MIME type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is a noun phrase, so treat this as a parameterized getter. That includes documenting it using a noun phrase too.
(At least the first sentence, after that it's OK to talk about returning a value.)

/// File extension for a given MIME type, if available.
///
/// Returns `null` if [mimeType] has no associated extension.
/// If [mimeType] has multiple associated extensions,
/// the preferred extension is returned.

Don't mention the map, it's not part of the public API, so a reader won't know what "the map" is.

Copy link
Author

@mx1up mx1up Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done (had to look up what a 'noun phrase' is, so not sure if I did right ;) )

?.key;
}

String extensionFromMime(String mimeType, {String? orElse}) =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs documentation.
But I'd just drop this method. The user can easily do .extensionFromMime(myMime) ?? orElse() themselves.
Or .extensionFromMime(myMime)! if they know it's there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, I now remember I did this to preserve backward compatibility..

if (_preferredExtensionsMap.containsKey(mimeTypeLC)) {
return _preferredExtensionsMap[mimeTypeLC]!;
}
return defaultExtensionMap.entries
Copy link
Member

@lrhn lrhn May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd just inline that extension method here:

  for (var entry in defaultExtensionsMap.entries) {
     if (entry.value == mimeTypeLC) return entry.key;  
  }
  return null;

It's far too much (unnecessary) syntactic overhead to add your own firstWhereOrNull extension (takes up nine lines further down) to save, basically, one line of code here. And I personally don't find it easier to read either.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@mx1up mx1up requested a review from lrhn December 13, 2023 18:45
@mx1up
Copy link
Author

mx1up commented Dec 13, 2023

@lrhn I applied all your suggestions, please have another look

@lrhn
Copy link
Member

lrhn commented Dec 13, 2023

Looks fine to me. Could probably tweak the documentation phrasing a little, but that's a never-ending search for perfection, we can always iterate on that.

@natebosch WDYT?

@mx1up
Copy link
Author

mx1up commented Dec 13, 2023

thanks!

about documentation: feel free to make suggestions, love to improve.

I just noticed I forgot to remove some part of the documentation about addMimeType, will do it now

@mx1up
Copy link
Author

mx1up commented Dec 13, 2023

removed addMimeType from documentation

@mx1up
Copy link
Author

mx1up commented Jan 30, 2024

@natebosch @lrhn any updates..?

'video/x-matroska': 'mkv',
};

/// The extension for a given MIME type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Consider changing "extension" to "file extension". Maybe it's just me having worked too much with "extension types" recently, but "extension" isn't feeling precise enough.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/// if defined, otherwise an extension chosen by the library.
/// If no extension is found, `null` is returned.
String? extensionFromMime(String mimeType) {
final mimeTypeLC = mimeType.toLowerCase();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Avoid abbreviations, so LC could be lowerCase, but I'd probably just do:

 mimeType = mimeType.toLowerCase();

and not rename.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it would be bad practice to overwrite incoming params?
anyway, it turned out the function became so trivial I didn't need the var anymore


/// Add an override for common extensions since different extensions may map
/// to the same MIME type.
final Map<String, String> _preferredExtensionsMap = <String, String>{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not possible for code to add to this map, right?
If so, it proably should be const.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, not anymore ;)

but applying the optimization made it no longer possible

return _preferredExtensionsMap[mimeTypeLC]!;
}

for (final entry in defaultExtensionMap.entries) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the point of this code is that the defaultExtensionMap has multiple extensions mapping to the same mime type, so the reverse lookup can choose the "wrong" (not preferred) extension.

Could we create a new map by reversing the defaultExtensionMap and then overriding the ones that we care about, like:

Map<String, String> _preferredExtensionsMap = {for (var entry in defaultExtensiosnMap.entries) entry.value: entry.key}..addAll({
  'application/vnd.ms-excel': 'xls',
  'application/vnd.ms-powerpoint': 'ppt',
  'image/jpeg': 'jpg',
  'image/tiff': 'tif',
  'image/svg+xml': 'svg',
  'text/calendar': 'ics',
  'text/javascript': 'js',
  'text/plain': 'txt',
  'text/sgml': 'sgml',
  'text/x-pascal': 'pas',
  'video/mp4': 'mp4',
  'video/mpeg': 'mpg',
  'video/quicktime': 'mov',
  'video/x-matroska': 'mkv',
});

Then we only need to look up in one map (and not do a linear search).

That's a possible optimization, what we have is already better than what was before, so no change required.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@mx1up
Copy link
Author

mx1up commented Feb 19, 2024

@lrhn I applied all suggestions, please have a look at all unresolved conversations 🙏

@mx1up mx1up requested a review from lrhn February 19, 2024 18:32
@kevmoo
Copy link
Member

kevmoo commented Apr 19, 2024

@lrhn ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ExtensionFromMime returns jpe for image/jpeg Add content type-to-extension conversion
5 participants