Correcting TYPOs and adding stations in Hebrew for RavKav #751

Open: wants to merge 11 commits into base: master


tzagim
Contributor

@tzagim tzagim commented May 8, 2021

Correcting typos.
Adding stations in Hebrew for RavKav. It's not perfect yet. I'm adding the files to the "IW" subdirectory, and I also added a file of place / city names by zone_id. Can you fix it?
By the way, when the app is in Hebrew, the operator name needs to be moved from right to left.

Based on https://transitfeeds.com/p/ministry-of-transport-and-road-safety/820/latest/file/stops.txt

@phcoder
Collaborator

phcoder commented Sep 27, 2021

Can you please split out the part that is ready and rewrite history to make the commits more understandable? Pay special attention to describing what actually happened, like "Add missing newline" or "transcoded files". Drop test commits. Please ping me when ready.

@micolous
Collaborator

Thanks!

The CSV-based stop database actually has dual-language support, so having a separate file isn't right. It can read the non-English stop and operator names with the local_name parameter.

However, it looks like there's GTFS data, which is probably a better way to do this. But our GTFS tooling doesn't support translations.txt yet, so if we switched to it, we'd only get Hebrew stop names.

So what would need to happen to make this work is:

  1. Extend the existing MdST-GTFS tools to support translations.txt.
  2. Migrate RavKav to use GTFS as its main data source, with a mapping file like what ./data/seq_go has.

This is quite a bit more complicated than what one can achieve in the GitHub Web UI, so I'm not sure what your appetite for this is. :)

It looks like the agency also publishes data in Arabic, which would be nice to include as well. But our whole stop database system assumes there are only two languages (English and "Local"). That would need to change to support more than two languages, though the change would also benefit other places (eg: Japan has station names in Japanese kanji, Japanese kana, Japanese romaji and English).

Fixing that is going to be a whole lot more complicated.
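
To illustrate the difference between the current two-language model and a more-than-two-language model, here is a minimal sketch. It is not Metrodroid or MdST code: the `StopNames` class, its `best` method, and the lowercase language tags are all hypothetical names chosen for the example. The sample names are the Haifa Center Rail Station translations from the agency's data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StopNames:
    """Hypothetical record: names for one stop, keyed by lowercase language tag."""
    names: Dict[str, str] = field(default_factory=dict)

    def best(self, preferred: List[str], default_lang: str = "en") -> Optional[str]:
        """Return the first available name in preference order,
        then the default language, then any name at all."""
        for lang in list(preferred) + [default_lang]:
            name = self.names.get(lang.lower())
            if name:
                return name
        # Last resort: whatever name exists, or None if there are none.
        return next(iter(self.names.values()), None)


# Sample data: three languages for one stop, instead of English + "Local".
haifa = StopNames({
    "he": "ת. רכבת חיפה מרכז",
    "en": "Haifa Center Rail Station",
    "ar": "محطة قطار حيفا مركز",
})

arabic_name = haifa.best(["ar"])      # Arabic is available, so it is used
fallback_name = haifa.best(["ja"])    # no Japanese name, falls back to English
```

Keying names by language tag rather than by two fixed fields is one way to cover cases like Japan (kanji, kana, romaji, English) without special-casing each region.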

@micolous
Copy link
Collaborator

I've had a look through this some more now. It's been tough for me, between not being able to read Hebrew, and various programs barfing on RTL text in ways that I'm sure native speakers are familiar with. 😅

I've started implementing >2 language support on an MdST level, but have yet to push this into Metrodroid itself. I've also gotten basic GTFS translation support working, but it looks like the agency's GTFS data uses the old GTFS translation specification. It's workable, but it's going to be interesting if any GTFS data uses the new spec.

There are gaps in their translation data. For example, Haifa Center has this entry in stops.txt:

23223,17016,חיפה מרכז,,32.822306,34.997177,0,,

But then they translate ת. רכבת חיפה מרכז (Haifa Center Rail Station) instead of חיפה מרכז (Haifa Center):

$ ag -Q 'חיפה מרכז' gtfs_tmp/translations.txt
77411:ת. רכבת חיפה מרכז,AR,محطة قطار حيفا مركز
77412:ת. רכבת חיפה מרכז,EN,Haifa Center Rail Station
77413:ת. רכבת חיפה מרכז,HE,ת. רכבת חיפה מרכז

There is also no alternative stop ID which matches this translation exactly:

$ ag -Q 'ת. רכבת חיפה מרכז' gtfs_tmp/stops.txt
976:975,42977,ת. רכבת חיפה מרכז, רחוב:  0 עיר: חיפה רציף:   קומה:  ,32.821689,34.996503,0,,4000
6822:6821,47575,ת. רכבת חיפה מרכז, רחוב:  0 עיר: חיפה רציף:   קומה:  ,32.820794,34.997275,0,,4000
16607:16606,42978,ת. רכבת חיפה מרכז, רחוב:  0 עיר: חיפה רציף:   קומה:  ,32.821922,34.997164,0,,4000
30268:30267,47574,ת. רכבת חיפה מרכז, רחוב:  0 עיר: חיפה רציף:   קומה:  ,32.821237,34.996554,0,,4000
31868:31867,41540,ת. רכבת חיפה מרכז, רחוב:דרך העצמאות 106 עיר: חיפה רציף:   קומה:  ,32.820918,34.996850,0,,4000

I've been able to work around some of these by using alternative IDs, but there are gaps even with existing data.
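
The gap described above can be reproduced with a small lookup sketch. This is only an illustration, not the MdST tooling: it assumes the old-style translations.txt has the header `trans_id,lang,translation` (the data rows shown above fit that layout), and the `load_translations` and `translate` helpers are hypothetical.

```python
import csv
import io

# Rows copied from the translations.txt excerpt above.
# The header line is an assumption about the old-style format.
TRANSLATIONS = """trans_id,lang,translation
ת. רכבת חיפה מרכז,AR,محطة قطار حيفا مركز
ת. רכבת חיפה מרכז,EN,Haifa Center Rail Station
ת. רכבת חיפה מרכז,HE,ת. רכבת חיפה מרכז
"""


def load_translations(text):
    """Build {trans_id: {lang: translation}} from old-style translations.txt text."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        table.setdefault(row["trans_id"], {})[row["lang"].lower()] = row["translation"]
    return table


def translate(table, stop_name, lang):
    """Look up a stop name by exact match; return None if there is no translation."""
    return table.get(stop_name, {}).get(lang.lower())


table = load_translations(TRANSLATIONS)

# The name used in stops.txt for stop code 17016 has no exact match...
missing = translate(table, "חיפה מרכז", "en")
# ...while the longer "rail station" form does.
found = translate(table, "ת. רכבת חיפה מרכז", "en")
```

Because the old format keys translations on the stop name text rather than on a stop ID, any difference in wording between stops.txt and translations.txt leaves the stop untranslated.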

As for the other data:

  • operators.csv: I've incorporated the corrected English names, and the Hebrew names.

  • stations.csv: It looks like you've copied the GTFS stop_code, presuming the IDs on the RavKav card are the same as what is on the GTFS data – they're not!

    For example, Herzliya is GTFS stop code 17034, but is RavKav code 7002. Looking at cardpeek, there is mention of a third ID for the national rail company: 3500.

  • zone_id.csv: I'm not sure how to use this data; I don't think the reader code uses it yet.
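
Since the IDs on the card differ from the GTFS stop codes, some mapping between the two is needed. The sketch below shows the idea with the Herzliya example from the stations.csv point above. The column names `reader_id` and `stop_code`, the helper functions, and the in-memory `GTFS_STOPS` stand-in are all hypothetical; this is not the format of the existing mapping files.

```python
import csv
import io

# Hypothetical mapping file: card (reader) ID -> GTFS stop code.
MAPPING_CSV = """reader_id,stop_code
7002,17034
"""

# Stand-in for stop names loaded from GTFS stops.txt, keyed by stop code.
GTFS_STOPS = {"17034": "Herzliya"}


def load_mapping(text):
    """Parse the mapping file into {card_id: gtfs_stop_code}."""
    return {int(row["reader_id"]): row["stop_code"]
            for row in csv.DictReader(io.StringIO(text))}


def stop_name_for_card_id(card_id, mapping, stops):
    """Resolve a card ID to a stop name via the mapping; None if unmapped."""
    code = mapping.get(card_id)
    return stops.get(code) if code is not None else None


mapping = load_mapping(MAPPING_CSV)
herzliya = stop_name_for_card_id(7002, mapping, GTFS_STOPS)
# Treating the GTFS stop code as if it were a card ID finds nothing.
wrong_id = stop_name_for_card_id(17034, mapping, GTFS_STOPS)
```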

micolous added a commit to micolous/metrodroid-working-copy that referenced this pull request Oct 24, 2021