Do not convert all non-ASCII characters to their HTML equivalents #831

Closed
ghost opened this issue Apr 7, 2013 · 10 comments

Comments

@ghost

ghost commented Apr 7, 2013

I have just set up a blog at http://traductions.sploing.fr/index.html, and when you look at the page source, all non-ASCII characters are converted to their HTML entity equivalents. This makes the source harder to read and harder for Facebook and similar services to reuse, as with the title of this article: http://traductions.sploing.fr/novlangue/2013/04/06/peche-fiscal/ where an &nbsp; replaces a plain space. Would it be possible to leave the UTF-8 text as is? After all, that is the encoding declared in the page header here.
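To illustrate the report above, here is a minimal sketch using only the Python standard library. The title string is a hypothetical example, not taken from the blog. It shows that the entity-encoded form and the raw UTF-8 form decode to the same text, and that `&nbsp;` is a non-breaking space (U+00A0), not an ordinary space.

```python
from html import unescape

# Hypothetical title as it might appear in the generated page source.
entity_title = "P&eacute;ch&eacute;&nbsp;fiscal"

# Decoding the entities gives back the plain Unicode text.
decoded = unescape(entity_title)

# &nbsp; becomes U+00A0 (non-breaking space), which differs from " ".
has_nbsp = "\u00a0" in decoded
same_as_plain_space = decoded == "Péché fiscal"

print(decoded, has_nbsp, same_as_plain_space)
```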

@justinmayer
Member

Have a look here:

https://github.com/getpelican/pelican/blob/master/pelican/themes/notmyidea/templates/article.html#L2

Does that help address the &nbsp; in the <title> tag behavior you described?

(cc: @bbinet, since he may be better-positioned linguistically to assist)

@ghost
Author

ghost commented Apr 12, 2013

Oh, thanks a lot! I do speak English, but somehow I just used my mother tongue, probably because the docs also exist in French. I'll give it a try tonight, and thanks again for your kind reply!


@ghost
Author

ghost commented Apr 12, 2013

Your suggestion worked. You can check it with the Facebook debugger: https://developers.facebook.com/tools/debug Wonderful!

So that my pages validate as HTML5, do you also know how to make Pelican escape URLs correctly, that is, with %20 instead of spaces? For example, in https://github.com/sploinga/traductions/blob/master/bootstrap2/templates/article_infos.html it would be very nice if I could give Facebook and Twitter not only article.title, but also article.title passed through a filter that formats it correctly for a URL.

Btw, thanks a lot for this great piece of software. Simple and powerful at the same time.
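As a rough illustration of the escaping asked about above, the following sketch uses Python's standard `urllib.parse` module. The title is a hypothetical example. `quote` writes a space as `%20`, while `quote_plus` writes it as `+`, which is the form-encoding convention for query strings.

```python
from urllib.parse import quote, quote_plus

# Hypothetical article title, used only for illustration.
title = "Péché fiscal"

# quote() percent-encodes the UTF-8 bytes and writes a space as %20.
# safe="" also encodes "/" so the value can sit inside a query parameter.
percent_style = quote(title, safe="")

# quote_plus() does the same but writes a space as "+".
form_style = quote_plus(title)

print(percent_style)
print(form_style)
```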

@justinmayer
Member

You could try looking at the Jinja docs; perhaps something in the following section might work: http://jinja.pocoo.org/docs/templates/#builtin-filters

@ghost
Author

ghost commented Apr 12, 2013

OK, great, this is the doc I missed. I will look there first next time :) Thanks again!


@justinmayer
Member

No problem. Is there anything else we can do to assist regarding this issue?

@ghost
Author

ghost commented Apr 13, 2013

Nope, I think I figured it out. Jinja 1 had a urlencode function, but it was apparently dropped in Jinja 2 to stay encoding-agnostic. See pallets/jinja@37303a8 and pallets/jinja#17. I simply grabbed the code proposed there and added it locally to filters.py in my Jinja installation. This is not very safe, because it will be overwritten the next time I update Jinja, but at least it works.

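The code is simply:

from urllib.parse import quote_plus  # Python 2: from urllib import quote_plus
from markupsafe import Markup

def do_urlencode(value):
    # Undo HTML escaping first, so entities are not percent-encoded as-is.
    if isinstance(value, Markup):
        value = value.unescape()
    # Percent-encode the UTF-8 bytes; note that spaces become "+".
    return Markup(quote_plus(value.encode('utf8')))

quote_plus comes from urllib (urllib.parse on Python 3).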

Then, in my template, I apply striptags before urlencode, and poof, it works correctly according to the Facebook debugger, the W3C validator, and the HTML source code.

It might be a good idea to integrate some custom Jinja filters such as this one into Pelican, or to recommend stripping tags from the title and description for sharing websites? This filter only requires forcing UTF-8 on everyone; I don't know if you want that.

Again, thanks a lot for your very helpful and much appreciated remarks!

@avaris
Member

avaris commented Apr 13, 2013

@sploinga, you can add additional Jinja filters with Pelican. Define that function in your settings file and also add the following in settings.

JINJA_FILTERS = {'do_urlencode': do_urlencode}

Now you can use that filter and you won't lose it when you update.
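As a hedged sketch of the suggestion above, a settings file might look like the following. Only `JINJA_FILTERS` comes from this thread. The file name `pelicanconf.py`, the helper name `urlencode_utf8`, and the choice of `quote` with `%20` for spaces are assumptions made for illustration. It uses only the standard library, so it does not depend on `Markup`.

```python
# pelicanconf.py (assumed file name; a sketch, not the project's own code)
from html import unescape
from urllib.parse import quote


def urlencode_utf8(value):
    """Hypothetical filter: percent-encode text for use in a URL parameter."""
    # str() also accepts Markup objects; unescape() turns entities such as
    # &nbsp; back into real characters before they are percent-encoded.
    text = unescape(str(value))
    # safe="" encodes "/" as well; spaces are written as %20.
    return quote(text, safe="")


# Register the filter under the name used in templates.
JINJA_FILTERS = {"urlencode_utf8": urlencode_utf8}
```

In a template this could then be used as `{{ article.title|striptags|urlencode_utf8 }}`. Jinja 2.7 and later also include a built-in `urlencode` filter, which may make a custom one unnecessary.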

@ghost
Author

ghost commented Apr 13, 2013

Great, I'll do it that way soon, thanks! :)

Now that the two of you have answered everything I could hope for, I think it's safe to close the issue ;)

@justinmayer
Member

Glad to hear it. Issue closed.
