Issue: Parsing wikimedia image paths (with multiple paths) #419

Open
matadorhernan opened this issue Jan 26, 2021 · 1 comment
matadorhernan commented Jan 26, 2021

Test case:

https://en.wikipedia.org/wiki/Facebook

Issue

For this page, infobox.image() returns a single path that combines two files, so the current code treats them as one combined entity instead of an array of entities. You can fix that by changing lines 4 to 11, but I don't know how that would affect the rest of your code.

Fix

//src/image/Image.js

//line 4
const encodeTitle = function (path: string): string[] {
  // separate the files: mark the end of each one with split()/join(), then split on the marker
  const paths = path
    .split('.svg').join('.svg{}') // repeat for all kinds of media
    .split('{}');

  // filter() and map() return new arrays, so chain them and return the result
  return paths
    // remove empty entries (when the path contains only one file, the second entry is '')
    // I usually use lodash for this and I'm not sure which method is best without it
    .filter((file) => file != null && file !== '')
    .map((file) => {
      let title = file.replace(/^(image|file?)\:/i, '');
      // titlecase it
      title = title.charAt(0).toUpperCase() + title.substring(1);
      // spaces to underscores
      title = title.trim().replace(/ /g, '_');
      return title;
    });
};

I know you can probably do this with a regex, but I don't know how, so I'll see it whenever you do.
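For what it's worth, here is a rough sketch of how the regex version might look. It assumes the combined path is just the file names run together (with or without whitespace between them) and that every file ends in one of a known set of extensions. `splitPaths` and the `EXTENSIONS` list are hypothetical names for illustration, not part of the library.

```javascript
// Hypothetical list of extensions; extend it to cover the media types you expect.
const EXTENSIONS = ['svg', 'png', 'jpe?g', 'gif'];

// Lazily match any run of characters up to and including the first known extension.
const FILE_REG = new RegExp(`.+?\\.(?:${EXTENSIONS.join('|')})`, 'gi');

// splitPaths is a hypothetical helper: it returns one cleaned title per file found.
function splitPaths(path) {
  const matches = path.match(FILE_REG) || [];
  return matches.map((file) => {
    // drop surrounding whitespace and a leading "File:" / "Image:" prefix
    let title = file.trim().replace(/^(image|file?):/i, '').trim();
    // titlecase it
    title = title.charAt(0).toUpperCase() + title.substring(1);
    // spaces to underscores
    return title.replace(/ /g, '_');
  });
}

console.log(splitPaths('File:Facebook f logo (2019).svg File:Facebook Logo (2019).svg'));
// [ 'Facebook_f_logo_(2019).svg', 'Facebook_Logo_(2019).svg' ]
```

One caveat: a file name that contains an extension-like fragment in the middle (for example `my.gift.png`) would be split in the wrong place, so this is only as reliable as the extension list.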

@spencermountain
Owner

Thanks Miguel - yeah, good find. This is the first case I've seen of two images in one infobox property. We are grabbing the logo text without any test or validation.

For reference:

{{Infobox website
| name = Facebook
| logo = [[File:Facebook f logo (2019).svg |100px]]<br/><br/>[[File:Facebook Logo (2019).svg|196px]]
}}
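As a rough sketch of what validating that property could look like, the snippet below pulls every `[[File:...]]` or `[[Image:...]]` target out of the raw wikitext value instead of treating the whole string as one path. `findFileLinks` is a hypothetical helper written for this example; it is not the library's API, and it ignores nested templates and other edge cases.

```javascript
// Capture the target of each [[File:...]] / [[Image:...]] link, up to the first "|" or "]".
const LINK_REG = /\[\[\s*(?:file|image)\s*:\s*([^|\]]+)/gi;

// findFileLinks is a hypothetical helper: it returns an array of file names, one per link.
function findFileLinks(wikitext) {
  return Array.from(wikitext.matchAll(LINK_REG), (m) => m[1].trim());
}

const logo =
  '[[File:Facebook f logo (2019).svg |100px]]<br/><br/>[[File:Facebook Logo (2019).svg|196px]]';

console.log(findFileLinks(logo));
// [ 'Facebook f logo (2019).svg', 'Facebook Logo (2019).svg' ]
```

Working from the link syntax rather than from file extensions means the extension list never has to be maintained, at the cost of only handling values that are written as links.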

We're stuck behind a breaking change on the dev branch, but I'm happy to add this to the next release, which may be in a few weeks.
Cheers
