Wrong encoding in originalname containing unicode characters #962

Open
truemogician opened this issue Dec 22, 2020 · 11 comments

@truemogician

truemogician commented Dec 22, 2020

Version : 1.4.2

System : Windows 10

When uploading a file whose name contains Unicode characters, file.originalname comes back garbled, which suggests something has gone wrong with the encoding.
Maybe the problem isn't with multer, but I cannot find a way to get the encoding right. Any explanation or solution would be appreciated ❤️

@ongiao

ongiao commented Jan 6, 2021

Version : 1.4.2
System : Windows 10

I have the same problem. I am using Postman to upload a file with a Chinese name:

export const uploadHandler = multer({ storage: iTwinStorage({
  filename: (_req, file, cb) => {
    console.log("filename: ", file.originalname);
    
    cb(null, file.originalname);
  },
}) }).any();

and file.originalname gives me some garbled code (such as Л�bentley�revit!�-@r7.rvt).

@keliq

keliq commented Feb 18, 2021

This is a Postman problem, not a multer one. You can always get the correct originalname with curl or axios. For example:

curl --location --request POST 'http://localhost:8080/files' \
--header 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6Ik...' \
--form 'files=@"/Users/keliq/Pictures/截图/小白菜.jpeg"'

@erguotou520

erguotou520 commented Aug 1, 2022

I'm using curl but still get the wrong encoding...

@ElderlyBoy

You may need this:

req.files[0].originalname = Buffer.from(req.files[0].originalname, 'latin1').toString('utf-8');

@YICHUNLIN

You can try updating the multer package from 1.4.2 to ^1.4.5-lts.1; that worked for me as of Mar 14, 2023.

@MohamedClio

@ElderlyBoy I've seen a lot of people add this line and say to just put it in the multer configuration. Could you tell me exactly where I should add it? I'm fairly new to this.
req.files[0].originalname = Buffer.from(req.files[0].originalname, 'latin1').toString('utf-8');

@ElderlyBoy

@MohamedClio
multer doc
Or you can handle the file name separately in each handler:

//example
router.post('/example', (req, res) => {
  req.files[0].originalname = Buffer.from(req.files[0].originalname, 'latin1').toString('utf-8');
  //...your code
})
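
If the same fix is needed in several routes, one option is to move it into a small middleware that runs after multer. The sketch below is only an illustration: `latin1ToUtf8` and `fixOriginalNames` are hypothetical names, not part of multer, and it assumes the client really sent the file name as UTF-8.

```javascript
// Hypothetical helper: re-decode a string that was read as latin1 back into UTF-8.
function latin1ToUtf8(name) {
  return Buffer.from(name, 'latin1').toString('utf8');
}

// Hypothetical Express-style middleware; register it after the multer middleware,
// e.g. router.post('/example', upload.any(), fixOriginalNames, handler),
// where `upload` is your own multer instance.
function fixOriginalNames(req, _res, next) {
  // multer sets req.file for .single() and req.files for .array()/.any();
  // .fields() yields an object of arrays, which is flattened here.
  const files = [];
  if (req.file) files.push(req.file);
  if (Array.isArray(req.files)) files.push(...req.files);
  else if (req.files) files.push(...Object.values(req.files).flat());

  for (const file of files) {
    file.originalname = latin1ToUtf8(file.originalname);
  }
  next();
}
```

Handlers registered after this middleware then read file.originalname as usual.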

@m1h43l

m1h43l commented Apr 4, 2024

Just a note: the code above assumes that originalname is encoded in LATIN1 / ISO-8859-1. That assumption may be wrong as often as it is right; it is only an assumption. Unless you take the actual encoding into account and act accordingly, you may get a wrong result.
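
One way to make the conversion less blind is to apply it only when the result is valid UTF-8. The sketch below is a heuristic, not a real encoding detector: `decodeOriginalName` is a hypothetical helper, and a genuine latin1 name whose bytes happen to form valid UTF-8 would still be converted.

```javascript
// Hypothetical helper: convert only when the name looks like UTF-8 bytes
// that were decoded as latin1; otherwise return it unchanged.
function decodeOriginalName(name) {
  // Characters above U+00FF cannot come from a latin1 decode,
  // so the name is already proper Unicode.
  if (/[^\u0000-\u00ff]/.test(name)) return name;

  const bytes = Buffer.from(name, 'latin1');
  try {
    // With fatal: true, decode() throws on byte sequences that are not valid UTF-8.
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch {
    // Not valid UTF-8, so keep the latin1 interpretation.
    return name;
  }
}
```

Plain ASCII names pass through unchanged, because ASCII bytes decode to the same characters under both encodings.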

@Doc999tor

That's not entirely correct: by default multer decodes header values as latin1. It's an old, well-known bug in the latest stable version of multer.
If you expect the encoding to be utf8, this hack will convert the header values to utf8.

PR #1210 provides a straightforward way to handle the header encoding as you expect it to be.
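
To see why the hack works, here is a small round trip using only Node's built-in Buffer. It is an illustration of the mechanism, not multer's actual code, and the file name is just the example from earlier in this thread.

```javascript
// Illustrative round trip using only Node's built-in Buffer.
const original = '小白菜.jpeg';

// A UTF-8 client puts these bytes into the filename parameter.
const wireBytes = Buffer.from(original, 'utf8');

// Decoding those bytes as latin1 maps every byte to one character,
// so each 3-byte CJK character becomes three unrelated characters.
const garbled = wireBytes.toString('latin1');

// Re-encoding as latin1 restores the original bytes, which can then
// be decoded as UTF-8.
const recovered = Buffer.from(garbled, 'latin1').toString('utf8');

console.log(original.length, garbled.length); // 8 14
console.log(recovered === original);          // true
```

Because latin1 maps each byte to exactly one character, no bytes are lost in the wrong decode, which is why it can be reversed.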

@m1h43l

m1h43l commented Apr 5, 2024

multer uses busboy, and if I understand the busboy code correctly, it more or less supports latin1, utf8 and utf16. But there are dozens of other encodings.

https://github.com/mscdex/busboy/blob/master/lib/utils.js#L384

@Doc999tor

Multer has a bug, so it uses busboy incorrectly.
Instead of supporting these 3 encodings, multer mistakenly converts all header values to latin1.
