Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix/slugify-non-ascii #805

Closed
wants to merge 1 commit into from
Closed

Conversation

hbsciw
Copy link
Contributor

@hbsciw hbsciw commented Feb 26, 2024

This pull request fixes an issue where questions with full Persian titles were causing a "many redirect error" when trying to access them through the link. The issue was caused by the slugify library returning a blank string for the title of these questions. With this fix, the slugify library has been updated to return a proper string for the title, resolving the issue and allowing for smooth access to all questions, regardless of their title.

@LinkinStars
Copy link
Member

Firstly, thank you for your contribution. There are a few questions below:

  1. Could you tell us some test cases for Persian to help us reproduce the problem?
  2. If we replace the slugify library directly, it may affect the links that are already generated. I haven't actually tested the various scenarios, but it would be a lot of work. My personal suggestion would be to do post processing separately for specific issues.
  3. Do not change the maximum length of the limit, our current maximum is 150
  4. Do not include debug information in the code fmt.Println

@LinkinStars LinkinStars self-assigned this Feb 27, 2024
@hbsciw
Copy link
Contributor Author

hbsciw commented Feb 27, 2024

@LinkinStars Thank you for response.
I solved 3 and 4.

  1. A post with this title has many redirect problem.
    title= چگونه می‌توانم مساله گزارش شده را ارزبای کنم؟
    And post with this title hasn't problem.
    title = چگونه می‌توانم Bug گزارش شده را ارزیابی کنم؟

  2. I think since the slugs aren't saved, it won't be a problem. I also tested a few things and it didn't seem to be a problem. Have you any specific scenario in your mind? Can you tell us your exact solution?

@LinkinStars
Copy link
Member

@LinkinStars Thank you for response. I solved 3 and 4.

  1. A post with this title has many redirect problem.
    title= چگونه می‌توانم مساله گزارش شده را ارزبای کنم؟
    And post with this title hasn't problem.
    title = چگونه می‌توانم Bug گزارش شده را ارزیابی کنم؟
  2. I think since the slugs aren't saved, it won't be a problem. I also tested a few things and it didn't seem to be a problem. Have you any specific scenario in your mind? Can you tell us your exact solution?

Thank you for the examples. Firstly, you're right that using Persian as a title does cause problems and can lead to redirection errors. The reason is because the url address of the title is empty after formatting. After slugify.Slugify(title) the title is empty.

I have two better solutions to this problem that we can discuss.

solution 1

I don't know if you noticed that there is a convertChinese method in the UrlTitle method. Yes, in fact, Chinese has the same problem. It then relies on this method to identify and convert to English. So maybe we can write a similar method to handle this.

solution 2

You are right, the address behind this url is not saved. It is also only for SEO optimisation. If the language can't be converted to English on the URL, then it doesn't really make a lot of sense for SEO. So I would recommend using a shorter address for access to avoid this problem.

image

BTW

By the way, you don't really have to abandon slugify to achieve the capabilities you need. You can just replace the IsValidCharacterChecker method. Like in the example below.

package main

import (
	"fmt"
	"strings"
	"unicode"

	"github.com/Machiel/slugify"
)

func IsValidCharacterChecker(r rune) bool {
	// TODO: Implement your own character validation
	return true
}

func main() {
	originalTitle := " چگونه می‌توانم مساله گزارش شده را ارزبای کنم؟"
	slug := slugify.New(slugify.Configuration{IsValidCharacterChecker: IsValidCharacterChecker})
	title := slug.Slugify(originalTitle)
	fmt.Println(title)
}

@LinkinStars
Copy link
Member

Our final solution is to use a fixed description(topic) when the language cannot be parsed.

@LinkinStars LinkinStars closed this May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants