trips, schedules and legs in fptf 2 #61

Open
juliuste opened this issue Oct 26, 2018 · 34 comments

@juliuste
Member

This issue is an attempt to bundle and advance the discussion about major changes to FPTF types in version 2, previously debated in #5, #33 and #42.


General consensus

There are a few points about which there seemed to be general consensus (?):

  • Introduce a trip type (mentioned in this comment)
  • Adapt the schedule type to fit together with the newly introduced trip type (mentioned in this comment)
  • Change how we handle time information in journey and stopover to distinguish between realtime and scheduled dates (mentioned in this comment)

However, we didn't agree on any specific specification changes yet, so here's my proposal, please express your opinions!


schedule, trip and leg

In my original proposal to introduce a trip type, I initially intended to remove the schedule type entirely. However, @derhuerst made the point that in some cases it might actually be useful to have another “aggregator” type like schedule, so that a large number of trips could be condensed into a smaller number of schedules (like in GTFS, for example).

What I take from this is that we should design both types so that a dataset using schedules can always be converted into a consistent dataset using only trips. The reverse doesn't need to work, since trips can contain additional data, e.g. realtime information, that doesn't belong to a schedule. With this in mind, I came up with the following specification(s):

trip

{
	type: 'trip', // required
	id: '12345', // unique, required
	line: '1234', // line id or object, optional
	route: '1234', // route id or object, optional
	mode: 'bus', // required if route is an id or if the trip mode differs from the route mode
	// subMode: reserved for future use
	stopovers: [] // required, list of stopover objects
}

One could discuss adding an optional schedule key similar to trip.route or trip.line.

schedule

schedule.starts is an object tripId -> startTime now instead of an array. This allows us both to translate a schedule into a set of single trips and to provide updated realtime information for trips that are part of a schedule (as discussed in #43).

Also, schedule.line was added to be consistent with the other objects.

{
	type: 'schedule', // required
	id: '12345', // unique, required
	route: '1234', // route id or object, required
	line: '1234', // line id or object, optional
	mode: 'bus', // see section on modes, overrides `route`/`line` mode, e.g. for replacements services
	// subMode: reserved for future use
	sequence: [
		// seconds relative to departure at first station/stop
		// in 1-to-1 relation to `route` stops
		{
			arrival: -30, // optional, when the vehicle enters the route
			// The departure at the first stop must be 0.
			departure: 0 // required
		},
		{
			arrival: 50, // optional
			departure: 70 // required
		},
		{
			arrival: 120, // The arrival at the last stop is required.
			departure: 150 // optional, when the vehicle leaves the route
		}
	],
	starts: { // object trip.id -> Unix timestamp, required
		'trip1234': 1488379661, // start time of the trip
		'trip2345': 1488379761,
		'trip3456': 1488379861,
		'trip4567': 1488379961
	}
}

We could further discuss whether to use timestamps or dates as values in the starts object; I kept the timestamps for now.
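To illustrate the "schedule to trips" direction mentioned above, here is a minimal sketch. The helper name tripsFromSchedule is made up, and the sketch assumes that schedule.route is an inlined route object with a stops array, that starts holds Unix timestamps in seconds, and that UTC is acceptable for the generated dates (the draft does not say how a timezone would be chosen).

```javascript
// Hypothetical helper, not part of FPTF: expands one schedule into trip objects.
// Assumes `schedule.route` is a route object with a `stops` array and that
// `schedule.starts` maps trip ids to Unix timestamps in seconds.
const tripsFromSchedule = (schedule) => {
	const {route, sequence, starts} = schedule
	// UTC is an assumption; the draft does not define the timezone to use.
	const toIso = (seconds) => new Date(seconds * 1000).toISOString()

	return Object.entries(starts).map(([tripId, start]) => ({
		type: 'trip',
		id: tripId,
		route: route.id,
		mode: schedule.mode,
		// one stopover per entry in `sequence`, in 1-to-1 relation to the route stops
		stopovers: sequence.map((offsets, i) => ({
			type: 'stopover',
			stop: route.stops[i],
			scheduledArrival: 'arrival' in offsets ? toIso(start + offsets.arrival) : null,
			scheduledDeparture: 'departure' in offsets ? toIso(start + offsets.departure) : null
		}))
	}))
}

const schedule = {
	type: 'schedule',
	id: '12345',
	route: {type: 'route', id: '1234', stops: ['stop-a', 'stop-b', 'stop-c']},
	mode: 'bus',
	sequence: [
		{departure: 0},
		{arrival: 50, departure: 70},
		{arrival: 120}
	],
	starts: {trip1234: 1488379661, trip2345: 1488379761}
}

const trips = tripsFromSchedule(schedule)
console.log(trips[0].stopovers[1].scheduledArrival) // 2017-03-01T14:48:31.000Z
```

Because the trip ids are the keys of starts, realtime updates for a generated trip can later be matched back to its schedule.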

leg

I suggest that we treat leg as a separate new type. This enables applications to differentiate between trips, which are tied to the movement of an actual vehicle (the trip starts where the vehicle/line starts and ends where the vehicle/line ends), and legs, which depend on the movement of the passenger.

{
	type: 'leg', // required
	id: '12345', // unique, required
	trip: '1234', // trip id or object, optional
	line: '1234', // line id or object, optional
	schedule: undefined, // removed, schedules - if used - should be referenced using the trip or trip id
	// other keys of the existing journey.leg specification
	// see proposed spec changes regarding date information below
}

scheduled / realtime data in leg and stopover

I applied the proposal from #33 to leg and stopover, giving us the following spec changes:

leg

{
	type: 'leg', // required
	id: '12345', // unique, required
	
	// other keys mentioned above
	
	schedule: undefined, // removed, see explanation above
	departure: undefined, // removed
	departureDelay: undefined, // removed
	arrival: undefined, // removed
	arrivalDelay: undefined, // removed

	scheduledDeparture: '2017-03-17T15:00:00+02:00', // ISO 8601 string (with origin timezone), required if `realtimeDeparture` is null
	realtimeDeparture: '2017-03-17T15:01:00+02:00', // ISO 8601 string (with origin timezone), required if `scheduledDeparture` is null
	scheduledArrival: '2017-03-17T15:00:00+02:00', // ISO 8601 string (with destination timezone), required if `realtimeArrival` is null
	realtimeArrival: '2017-03-17T15:01:00+02:00' // ISO 8601 string (with destination timezone), required if `scheduledArrival` is null
}

stopover

{
	type: 'stopover', // required
	// other keys defined in the stopover spec
	
	line: '1234', // line id or object, optional
	trip: '1234', // trip id or object, optional
	schedule: undefined, // removed, schedules - if used - should be referenced using the trip or trip id
	
	departure: undefined, // removed
	departureDelay: undefined, // removed
	arrival: undefined, // removed
	arrivalDelay: undefined, // removed

	scheduledDeparture: '2017-03-17T15:00:00+02:00', // ISO 8601 string (with origin timezone), required if `realtimeDeparture` is null
	realtimeDeparture: '2017-03-17T15:01:00+02:00', // ISO 8601 string (with origin timezone), required if `scheduledDeparture` is null
	scheduledArrival: '2017-03-17T15:00:00+02:00', // ISO 8601 string (with destination timezone), required if `realtimeArrival` is null
	realtimeArrival: '2017-03-17T15:01:00+02:00' // ISO 8601 string (with destination timezone), required if `scheduledArrival` is null
}
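
As a rough sketch of what the "required if the other one is null" rule means for consumers, the following checks a leg or stopover and derives the delay that the removed departureDelay/arrivalDelay fields used to carry. The helper names checkTimes and delayInSeconds are hypothetical and not part of the proposal.

```javascript
// Hypothetical helpers for illustration; they are not part of FPTF.
const hasValue = (value) => value !== null && value !== undefined

// Returns a list of violations of the "required if the other one is null" rule.
const checkTimes = (item) => {
	const errors = []
	for (const event of ['Departure', 'Arrival']) {
		if (!hasValue(item[`scheduled${event}`]) && !hasValue(item[`realtime${event}`])) {
			errors.push(`scheduled${event} or realtime${event} is required`)
		}
	}
	return errors
}

// With the delay fields removed, a delay has to be derived from the two dates.
// Returns seconds, or null if one of the two dates is missing.
const delayInSeconds = (scheduled, realtime) => {
	if (!hasValue(scheduled) || !hasValue(realtime)) return null
	return Math.round((Date.parse(realtime) - Date.parse(scheduled)) / 1000)
}

const leg = {
	type: 'leg',
	id: '12345',
	scheduledDeparture: '2017-03-17T15:00:00+02:00',
	realtimeDeparture: '2017-03-17T15:01:00+02:00',
	scheduledArrival: '2017-03-17T17:00:00+02:00',
	realtimeArrival: null
}

console.log(checkTimes(leg)) // []
console.log(delayInSeconds(leg.scheduledDeparture, leg.realtimeDeparture)) // 60
console.log(delayInSeconds(leg.scheduledArrival, leg.realtimeArrival)) // null
```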

Again, this is just a proposal based on previous discussions, so please express your opinions!

@juliuste
Member Author

Related: #27

@juliuste
Member Author

(Notification @matkoniecz @ialokim, because I can't assign you)

@juliuste juliuste self-assigned this Oct 26, 2018
@ialokim

ialokim commented Nov 1, 2018

Thanks @juliuste for pushing things towards the FPTFv2 release and for starting this discussion putting together the different ideas!

While I was thinking about the specification changes proposed above, I found it quite difficult to picture all the objects' relations. I've drawn a little schema which you can find here and which will hopefully help us to get a better overview of all the objects and fields defined in FPTF.

The schema is a first draft for FPTF v2, so it already includes your suggestions from above and the ones that I am going to add now. Changes relative to FPTF v1 should all be marked in red.

Feel free to edit the schema: it is made using the open source diagram website draw.io, so you only have to click on the little edit icon at the bottom and post a new link in the comments.


I will explain my proposals for each FPTF type separately while also commenting on @juliuste's ideas from above.

line

While playing a bit with FPTF, I found it a bit confusing that there is only a line.name and no line.number field. I would suggest to add the latter as a new optional field which would only contain the line number. As an example, for metro line 1 in Paris, line.number would only be 1 while line.name would be M1, and the product type "metro" should be expressed differently (refer also to the discussions regarding subMode and/or products; in my opinion at least one of them should make it into FPTF v2 as well).

route

I propose to add the two fields route.origin and route.destination which would be especially useful for the departures and arrivals at one specific station (see also the stopover type). Also, as many trips will be sharing the same route, I suggest a new array of trip objects (or rather references) as route.trips.

trip

I totally agree on adding a new trip type which is especially useful for third-party API clients and realtime data. I like your suggestion to keep the schedule type for static data, too, so I generally agree on your proposal about trip.

One could discuss adding an optional schedule key similar to trip.route or trip.line.

I don't think this would have any real use cases. As you've mentioned, schedule should be more of a compact way of expressing schedules for static data that could be processed to get trip objects, whereas a trip.schedule field wouldn't provide any information that the trip doesn't already contain, IMHO.

schedule

schedule.starts is an object tripId -> startTime now instead of an array.

This looked a bit strange to me at first, but thinking about it, I couldn't find any better solution. So I'm backing your suggestion, but would opt for using ISO-8601 date strings instead of Unix timestamps to keep it consistent with other fields like realtimeDeparture e.g.
I also agree on adding the optional schedule.line.

leg

Introducing the new type leg seems very reasonable for me but I would not include a leg.id field as it won't be very reproducible in most cases (at least I can't think of one where it would be necessary to refer to some walking leg of some API response).

On the other hand, I propose to add an optional leg.route field here too to keep things consistent. I quite like your proposals for expressing scheduled and realtime data, so no complaints about this part.

stopover

Same thing about realtime data here. But regarding the discussion in FPTI-JS#4, I think it would be best to keep stopover as the result type for arrivals and departures methods. In order to be useful for that purpose, I suggest to add the optional stopover.line, stopover.route and stopover.trip fields. Especially the corresponding line would contain the line number and the route the origin and destination (see above).

stop, station, location, region

I do not have any changes for these types.


I am sorry for the long comment, but I didn't want to open a lot of different issues as most of the ideas are quite related. If you feel like there is a need for a deeper discussion about one of them, feel free to open a new issue!

@juliuste
Member Author

juliuste commented Nov 1, 2018

@ialokim thank you very much for the detailed answer, will go through everything you wrote in the next few days. 🙂 Just one small thing already, before I forget about it:

Introducing the new type leg seems very reasonable for me but I would not include a leg.id field as it won't be very reproducible in most cases (at least I can't think of one where it would be necessary to refer to some walking leg of some API response).

I also thought about this but ended up reasoning this way: We already enforce ids for journeys which poses mostly the same problem (most APIs don't return this information out of the box), so a lot of packages already use a workaround where we generate ids for legs and combine all leg ids to a journey id in the end. Additionally, some APIs actually offer things like pricing requests for legs where you supply the leg id, or booking/reservation for a specific leg. Therefore - especially since we already enforce manually generating such ids in journey - I'd rather make it required, even though this info might be not too useful in some cases like walking legs.
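
For readers unfamiliar with that workaround, a minimal sketch could look like the following. The id scheme, the separators and the helper names legId and journeyId are made up for illustration; FPTF does not prescribe any of them, and real packages may do this differently.

```javascript
// Hypothetical id scheme for illustration; FPTF does not prescribe one.
// Assumes `trip`, `origin` and `destination` are given as ids, not objects.
const legId = (leg) => [
	leg.trip || 'walking', // walking legs have no trip to refer to
	leg.origin,
	leg.destination,
	leg.scheduledDeparture
].join('|')

// The journey id is just the combination of all its leg ids.
const journeyId = (legs) => legs.map(legId).join('/')

const legs = [
	{origin: 'a', destination: 'b', scheduledDeparture: '2017-03-17T15:00:00+02:00', trip: '1234'},
	{origin: 'b', destination: 'c', scheduledDeparture: '2017-03-17T15:30:00+02:00'}
]

console.log(journeyId(legs))
```

Ids built this way are reproducible across requests, which is what makes them usable for follow-up calls such as pricing or booking.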

However, I'd agree to make it optional if we also decide to make journey.id optional. 😀

@ialokim

ialokim commented Nov 3, 2018

We already enforce ids for journeys

I was not aware of this. But considering your examples given where some API returns e.g. pricing information for some specific leg (or entire journey), I would definitely opt for marking it as optional and same for journey.id then, as they are not basic attributes as stated for the ease-of-use goal for FPTF:

Only basic attributes are required, most is optional.

@juliuste
Member Author

Alright, after almost 7 months, I finally had time to go through this again 🎉

line

While playing a bit with FPTF, I found it a bit confusing that there is only a line.name and no line.number field. I would suggest to add the latter as a new optional field which would only contain the line number.

I agree, but we should probably make clear in the docs that this could also be a character, e.g. the RER A or RER B in Paris would then have a number: A or number: B, right? 😄

route

I propose to add the two fields route.origin and route.destination which would be especially useful for the departures and arrivals at one specific station (see also the stopover type). Also, as many trips will be sharing the same route, I suggest a new array of trip objects (or rather references) as route.trips.

From my understanding, a route is merely a static, ordered list of stations that always belongs to a line (which is why IMHO the line should have a list of trips instead of the route). For origin and destination, one could argue where they best fit in, but note that in "real life scenarios" you will sometimes know the origin/destination of a line without knowing the entire route.

schedule

This looked a bit strange to me at first, but thinking about it, I couldn't find any better solution. So I'm backing your suggestion, but would opt for using ISO-8601 date strings instead of Unix timestamps to keep it consistent with other fields like realtimeDeparture e.g.

You're definitely right, ISO dates make more sense 👍 I didn't even think about it as this was part of the "old" schedule definition, but now's the time to change that!

stopover

Same thing about realtime data here. But regarding the discussion in FPTI-JS#4, I think it would be best to keep stopover as the result type for arrivals and departures methods. In order to be useful for that purpose, I suggest to add the optional stopover.line, stopover.route and stopover.trip fields. Especially the corresponding line would contain the line number and the route the origin and destination (see above).

I agree for the line and trip fields, regarding the route field see my comments above.

@derhuerst
Member

derhuerst commented Aug 27, 2019

An update on the realtime/prognosed vs. planned/scheduled time data discussions:

After talking to many people who have used hafas-client or one of the transport.rest APIs, I have gotten the following impression:

  • The scheduledArrival/scheduledDeparture/scheduled*Delay/scheduled*Platform naming is not very helpful, as it is often unclear what "scheduled" refers to.
  • The fact that hafas-client currently omits the realtime/prognosed time for cancelled items prevents legitimate usage of the info.
  • People almost always get the different notions of realtime/prognosed vs. planned/scheduled time information. But, as argued before, they often don't pay attention to covering all use cases when they quickly put consuming code together.

I went ahead and created a draft PR for hafas-client that changes it to the following schema:

| field | description | fallback/default |
| --- | --- | --- |
| when/arrival/departure | realtime/prognosed date+time, as ISO 8601 | null |
| plannedWhen/plannedArrival/plannedDeparture | planned/scheduled date+time, as ISO 8601 | null |
| delay/arrivalDelay/departureDelay | The difference between realtime/prognosed and planned/scheduled date+time, in seconds | null |

When any of these items is cancelled, when would be set to null, and the following fields will be added:

| field | description |
| --- | --- |
| cancelled | true |
| prognosedWhen/prognosedArrival/prognosedDeparture | realtime/prognosed date+time, as ISO 8601 |

This proposal makes the following trade-offs:

  • It adds a redundant field "delay", potentially trading consistency for ease-of-use. – My subjective perspective is that most people would be fine with this, and one can easily leave it out when persisting the data.
  • It continues to try to make it hard to use the data in a wrong/uninformed way (e.g. with cancelled or delayed trips), which would lead to a bad UX in transit products.
  • It makes clear that often so-called "realtime data" actually is a prognosis, and allows the addition of a realtimeWhen field for systems that clearly differentiate prognosed and realtime (as in a fact) data.

What do you think? @ialokim @juliuste @matkoniecz

@matkoniecz
Contributor

When any of these items is cancelled, when would be set to null, and the following fields will be added:

Why would cancelled transit still have a "realtime/prognosed date+time"? I would expect it to have solely the scheduled time (as the prognosed time for a cancelled trip is null, if anything).

@derhuerst
Member

When any of these items is cancelled, when would be set to null, and the following fields will be added:

Why would cancelled transit still have a "realtime/prognosed date+time"? I would expect it to have solely the scheduled time (as the prognosed time for a cancelled trip is null, if anything).

So if you, as a consuming developer, just access when (and optionally plannedWhen & cancelled), you won't accidentally display realtime data for cancelled items.

But if you are interested in the last prognosis available, even though it is cancelled, you would access prognosedWhen.

Does that make sense?
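
A small consumer-side sketch of this idea, using the field names from the draft above. The function name describeDeparture and the output strings are made up for illustration.

```javascript
// Hypothetical consumer code; only the field names come from the draft above.
const describeDeparture = (dep) => {
	if (dep.cancelled) {
		// `when` is null here, so realtime data cannot be shown by accident.
		// The last prognosis is only reachable by asking for it explicitly.
		const last = dep.prognosedWhen ? `, last prognosis ${dep.prognosedWhen}` : ''
		return `cancelled (planned ${dep.plannedWhen}${last})`
	}
	if (dep.when != null) return `expected ${dep.when} (planned ${dep.plannedWhen})`
	return `planned ${dep.plannedWhen}, no realtime data`
}

console.log(describeDeparture({
	when: null,
	plannedWhen: '2019-08-27T10:00:00+02:00',
	cancelled: true,
	prognosedWhen: '2019-08-27T10:04:00+02:00'
}))
// cancelled (planned 2019-08-27T10:00:00+02:00, last prognosis 2019-08-27T10:04:00+02:00)
```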

@ialokim

ialokim commented Aug 28, 2019

But if you are interested in the last prognosis available, even though it is cancelled, you would access prognosedWhen.

Okay, I got the idea now as it was unclear to me too. But why not name it lastPrognosedWhen to make the idea clearer?

Apart from that, I quite agree with your proposals @derhuerst, but would opt for making the delay fields optional as it is only for the consumer's convenience.

@ialokim

ialokim commented Aug 31, 2019

Alright, after almost 7 months, I finally had time to go through this again 🎉

Don't worry, it took me 2 months more to answer 👍

line

While playing a bit with FPTF, I found it a bit confusing that there is only a line.name and no line.number field. I would suggest to add the latter as a new optional field which would only contain the line number.

I agree, but we should probably make clear in the docs that this could also be a character, e.g. the RER A or RER B in Paris would then have a number: A or number: B, right? 😄

Sure, if you can think of a less confusing name for the attribute, I'm open to suggestions.

route

I propose to add the two fields route.origin and route.destination which would be especially useful for the departures and arrivals at one specific station (see also the stopover type). Also, as many trips will be sharing the same route, I suggest a new array of trip objects (or rather references) as route.trips.

From my understanding, a route is merely a static, ordered list of stations that always belongs to a line (which is why IMHO the line should have a list of trips instead of the route). For origin and destination, one could argue where they best fit in, but note that in "real life scenarios" you will sometimes know the origin/destination of a line without knowing the entire route.

Perhaps we have slightly different understandings of how line, route and trip should relate to each other. I'll try to explain my point of view:

  • a line is the general service of some public transportation system with the same name, color and line number. An example would be the District line of the London Underground. While all the public transport services on a line share the same name, they do not always start and end at the same stop/station. This is where route comes into play:
  • a route is indeed a static, ordered list of stations. Each line (except circular ones) has at least two routes for both directions, but could have even more if it does not always pass by exactly the same stations. In the District line example from above, each item in the list under services should be represented by two routes, one for each direction.
  • a trip represents a single public transport service run at a specific time. It follows one specific route from origin to destination passing by all stations on the route. This could be e.g. the tube leaving Richmond at 5:31 am and arriving at Upminster at 7:00 am (see timetable here).

In my perception, it would not make much sense to add origin and destination information to the line, as it always has at least two (opposite) directions. As each trip belongs to a specific route, I would include a list of trips in the route and not in the line directly.

To address your concerns about knowing origin and destination without being aware of all the intermediate stations, you could simply add a route with both end stations, leaving the stops key out as it is optional.
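
Put into objects, the relation described above could look roughly like this. All ids are invented, route.origin, route.destination and route.trips are the fields proposed in this comment (not part of FPTF v1), and the helper tripIdsOfLine is hypothetical.

```javascript
// Illustrative objects only; the ids are made up.
const line = {type: 'line', id: 'district', name: 'District line', mode: 'train'}

const route = {
	type: 'route',
	id: 'district-richmond-upminster',
	line: line.id,
	mode: 'train',
	origin: 'richmond', // proposed field
	destination: 'upminster', // proposed field
	// `stops` is left out: the intermediate stations are unknown here
	trips: ['district-0531'] // proposed field, references to trips
}

const trip = {
	type: 'trip',
	id: 'district-0531',
	route: route.id,
	line: line.id,
	mode: 'train',
	stopovers: [] // left empty for brevity
}

// Hypothetical helper: a line's trips are reached through its routes.
const tripIdsOfLine = (lineId, routes) => routes
	.filter((r) => r.line === lineId)
	.flatMap((r) => r.trips || [])

console.log(tripIdsOfLine('district', [route])) // [ 'district-0531' ]
```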


Take your time to answer 😉

@juliuste juliuste mentioned this issue Oct 3, 2019
@juliuste
Member Author

juliuste commented Oct 3, 2019

As it gets increasingly hard to tell what we actually agree on here, I started writing a draft branch, see #63.

What I copied so far:

  • new trip type
  • new leg type
  • non-url-safe ids (from another discussion)

What's still missing:

  • scheduled/planned dates
  • line name/number/letter
  • subModes

@derhuerst
Member

But if you are interested in the last prognosis available, even though it is cancelled, you would access prognosedWhen.

Okay, I got the idea now as it was unclear to me too. But why not name it lastPrognosedWhen to make the idea clearer?

lastPrognosedWhen would communicate that more clearly, true. I prefer previousPrognosedWhen or formerPrognosedWhen, because this prognosis might not be the last ever.

@ialokim

ialokim commented Oct 26, 2019

I prefer previousPrognosedWhen or formerPrognosedWhen

I would go with the second one.

@derhuerst
Member

derhuerst commented Oct 30, 2019

re #61 (comment)

@juliuste raised the concern that accessing the realtime/prognosed time and falling back to the planned time from a different field is not as easy as it should be. This is probably the most common use case.

After discussion, we propose to let when/arrival/departure fall back to the planned time (when no realtime/prognosed time is available).

Edit: I have adapted #63 to contain the following changes.


The schema would look as follows:

| field | value | fallback/default |
| --- | --- | --- |
| when/arrival/departure | realtime/prognosed date+time, as ISO 8601 | planned/scheduled date+time, as ISO 8601 |
| plannedWhen/plannedArrival/plannedDeparture | planned/scheduled date+time, as ISO 8601 | null |
| delay/arrivalDelay/departureDelay | The difference between realtime/prognosed and planned/scheduled date+time, in seconds | null |

When the arrival/departure is cancelled, its fields would look as follows:

| field | value |
| --- | --- |
| cancelled | true |
| when/arrival/departure | null |
| plannedWhen/plannedArrival/plannedDeparture | planned/scheduled date+time, as ISO 8601 |
| prognosedWhen/prognosedArrival/prognosedDeparture | realtime/prognosed date+time (if known), as ISO 8601 |
| delay/arrivalDelay/departureDelay | null |

@derhuerst
Member

@ialokim What do you think about #61 (comment) ?

@ialokim

ialokim commented Nov 25, 2019

Thanks for pinging me, I would probably have forgotten about it for some time again. 🙈

After discussion, we propose to let when/arrival/departure fall back to the planned time (when no realtime/prognosed time is available).

This seems reasonable to me. If I got it right, *delay would be null in the case of having no realtime/prognosed time data available so that one could differentiate between the case realtime and planned coincide (delay: 0) vs. no realtime information?

prognosedWhen/prognosedArrival/prognosedDeparture

What about your idea in #61 (comment) of changing it to formerPrognosedWhen to emphasise that this data is not up-to-date any more? (Also wasPrognosedWhen came to my mind now).

@derhuerst
Member

If I got it right, *delay would be null in the case of having no realtime/prognosed time data available so that one could differentiate between the case realtime and planned coincide (delay: 0) vs. no realtime information?

Correct!
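
For consuming code, this distinction mostly means not using a plain truthiness check on the delay, since `if (!delay)` would treat 0 and null alike. A minimal sketch, with a made-up helper name and made-up labels:

```javascript
// Hypothetical display helper; `delay` is in seconds, as in the table above.
const delayLabel = (delay) => {
	// null (or a missing field) means: no realtime/prognosed data at all
	if (delay === null || delay === undefined) return 'no realtime data'
	// 0 means: realtime data exists and matches the planned time
	if (delay === 0) return 'on time'
	const minutes = Math.round(Math.abs(delay) / 60)
	return delay > 0 ? `${minutes} min late` : `${minutes} min early`
}

console.log(delayLabel(null)) // no realtime data
console.log(delayLabel(0)) // on time
console.log(delayLabel(120)) // 2 min late
```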

prognosedWhen/prognosedArrival/prognosedDeparture

What about [...] changing it to formerPrognosedWhen to emphasise that this data is not up-to-date any more? (Also wasPrognosedWhen came to my mind now).

We could do this, I see it as a trade-off between ease of use and how-hard-it-is-to-do-things-wrong. 😀

@ialokim

ialokim commented Nov 27, 2019

If I got it right, *delay would be null in the case of having no realtime/prognosed time data available so that one could differentiate between the case realtime and planned coincide (delay: 0) vs. no realtime information?

Correct!

Nice, I like your proposal then!

prognosedWhen/prognosedArrival/prognosedDeparture

What about [...] changing it to formerPrognosedWhen to emphasise that this data is not up-to-date any more? (Also wasPrognosedWhen came to my mind now).

We could do this, I see it as a trade-off between ease of use and how-hard-it-is-to-do-things-wrong. 😀

I would vote for something different than simply prognosedWhen as the name would then imply already (without further reading in any documentation) that this data is not currently prognosed, but has been before. I guess it wouldn't change much of the "ease-of-useness" whereas the key would then textually describe better the provided information.

@derhuerst
Member

derhuerst commented Mar 16, 2020

Fine with me, I prefer formerPrognosedWhen over wasPrognosedWhen (and formerPlannedWhen over wasPlannedWhen)?

@ialokim

ialokim commented Mar 23, 2020

So let's go with formerPrognosedWhen. I would keep plannedWhen though to be consistent with the case of no cancellation. Also plannedWhen would not change just because the trip is cancelled.

@derhuerst
Member

derhuerst commented Mar 25, 2020

Wrapping up, the schema would look like this:

| field | value | fallback/default |
| --- | --- | --- |
| when/arrival/departure | realtime/prognosed date+time, as ISO 8601 | planned/scheduled date+time, as ISO 8601 |
| plannedWhen/plannedArrival/plannedDeparture | planned/scheduled date+time, as ISO 8601 | null |
| delay/arrivalDelay/departureDelay | The difference between realtime/prognosed and planned/scheduled date+time, in seconds | null |

When the arrival/departure is cancelled, its fields would look as follows:

| field | value |
| --- | --- |
| cancelled | true |
| when/arrival/departure | null |
| plannedWhen/plannedArrival/plannedDeparture | planned/scheduled date+time, as ISO 8601 |
| prognosedWhen/prognosedArrival/prognosedDeparture | omitted |
| formerPrognosedWhen/formerPrognosedArrival/formerPrognosedDeparture | last known realtime/prognosed date+time (if known), as ISO 8601 |
| delay/arrivalDelay/departureDelay | null |
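
Seen from the producing side (e.g. an API client), the two tables above could be implemented roughly as follows. This is only a sketch: the function name buildWhenFields and its input shape are made up, and it only covers the when variant of the fields.

```javascript
// Hypothetical producer-side helper; only the output field names come from the tables above.
// `planned` and `prognosed` are ISO 8601 strings, `prognosed` may be null.
const buildWhenFields = ({planned, prognosed = null, cancelled = false}) => {
	if (cancelled) {
		const fields = {cancelled: true, when: null, plannedWhen: planned, delay: null}
		// the last known prognosis is kept, but under a name that signals it is outdated
		if (prognosed !== null) fields.formerPrognosedWhen = prognosed
		return fields
	}
	return {
		// falls back to the planned time if there is no realtime/prognosed data
		when: prognosed !== null ? prognosed : planned,
		plannedWhen: planned,
		// stays null without realtime data, so it can be told apart from a delay of 0
		delay: prognosed !== null
			? Math.round((Date.parse(prognosed) - Date.parse(planned)) / 1000)
			: null
	}
}

console.log(buildWhenFields({
	planned: '2020-03-25T10:00:00+01:00',
	prognosed: '2020-03-25T10:03:00+01:00'
}).delay) // 180
```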

@ialokim

ialokim commented Mar 29, 2020

Why did you explicitly state prognosedWhen/prognosedArrival/prognosedDeparture for the cancelled case? This field would never be set, would it? So it could be just omitted in the spec.

@derhuerst
Member

derhuerst commented Mar 29, 2020

I think it depends on the environment: in some languages & serialisation formats you'd specify them as an Optional, so they'd have the None value; in other languages & formats, you'd omit them entirely. In JSON, you'd probably omit them.

@ialokim

ialokim commented Mar 29, 2020

What I wanted to say is that this key should never be set, even when the departure/arrival is not cancelled (at least it does not figure in the table above). That's why it doesn't make sense to have it in the second table either, I'd guess?

@juliuste
Member Author

juliuste commented Apr 6, 2020

First of all: Quick storytime, to give some context on why I had a few changes of mind regarding our discussion here: A few months back, I had the “chance” to introduce typescript to a medium-sized JavaScript code base that interacted with an FPTF-style API client of ours (both FPTFv1 as well as hafas-client's implementation of the proposal we discuss right now). As I was going through the code, I found a lot of subtle bugs where we had expected properties to be set where they actually weren't or vice versa. Finding such bugs when adding types to an existing code base is nothing surprising per se, however it actually struck me that an outstanding amount of these bugs was related to date and delay properties on FPTF-style objects.

Now, this is of course still only a pretty small sample on how FPTF objects are used by other people, and it could be that everyone contributing to that codebase (including me) was just plain dumb, but this got me to realize that the current proposal brings a significant mental overhead associated with all non-obvious logic for our attributes that can lead to that kind of bugs. I therefore think that it should be our highest priority to make the properties as obvious and intuitive as possible. This might sound a bit abstract, so I'll try to list some (subjective) findings of mine:

  • It is generally sub-optimal to have a date field that uses some sort of fallback logic which blurs the semantic meaning of that date. In our case, that would e.g. be the when field that refers to realtime information, if available, otherwise falling back to planned data. While this can be seen as an additional field just for user convenience (as even stated by myself before in this thread), the reality is that people also tend to accidentally use that field in calculations that assume some sort of semantic meaning (e.g. assuming when would always be a planned date), without realizing that the semantic meaning is fuzzy.
  • Generally, the concept of scheduled/planned vs. realtime/prognosed information is relatively obvious to a lot of people. While not strictly applicable for newer forms of transit, e.g. on-demand, it might be possible to extend that differentiation to those new situations without messing too much with the intuition people already have for this concept and the data availability (required/optional attributes) from “classic” transport systems.
  • Making fields optional adds a significant mental burden every time people use them, so we should always try to make fields required if such a requirement would be feasible for all the different data providers we know of.
  • (just an additional point which doesn't conflict with the current proposal) There is a need to express the difference between prognosed information and realtime/actual information.

Having this in mind, I propose to adapt our current proposal in the following way (note that for simplicity I only use when here, but the same applies for departure and arrival):

| Field | Type | Explanation | Notes |
| --- | --- | --- | --- |
| scheduledWhen | ISO 8601 date, required | For “classic” public transport: the date that can be found in timetables and schedules. For flexible/on-demand transport: the date that was initially promised/announced to the passenger. | Could also be named planned…, but the word scheduled might be better suited to include that on-demand case. |
| actualWhen | ISO 8601 date, optional | The date at which the event actually occurred in real life. null or omitted if the information is unknown or the event was cancelled. If it is unclear whether data exposed by an API was only a prognosis, use prognosedWhen instead. | This data is probably not available in a lot of APIs, but it still makes sense to standardize it IMHO, also to semantically separate it from prognosed information and make API authors spend some time thinking about what the data they expose actually represents. |
| prognosedWhen | ISO 8601 date, optional | Current prognosis for when the event is expected to happen. null or omitted if the information is unknown or the event was cancelled. | |
| formerPrognosedWhen | ISO 8601 date, optional | Last known prognosis for when a cancelled event was expected to happen. null or omitted if the information is unknown or the event was not cancelled. | |
| cancelled | Boolean, required | Flag indicating if the event was cancelled. | |
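As a rough illustration, the fields above could be typed as follows. This is only a sketch: the interface name `TimedEvent` and the helper `checkEvent` are assumptions made for this example, and the checks simply restate the "null or omitted" rules from the table.

```typescript
// Sketch of the proposed fields for a single event (e.g. a departure).
interface TimedEvent {
  scheduledWhen: string;                // ISO 8601 date, required
  actualWhen?: string | null;           // optional
  prognosedWhen?: string | null;        // optional
  formerPrognosedWhen?: string | null;  // optional
  cancelled: boolean;                   // required
}

// Hypothetical helper: lists violations of the rules stated in the table.
function checkEvent(e: TimedEvent): string[] {
  const problems: string[] = [];
  if (e.cancelled && e.actualWhen != null) {
    problems.push('actualWhen must be null or omitted for a cancelled event');
  }
  if (e.cancelled && e.prognosedWhen != null) {
    problems.push('prognosedWhen must be null or omitted for a cancelled event');
  }
  if (!e.cancelled && e.formerPrognosedWhen != null) {
    problems.push('formerPrognosedWhen is only meaningful for a cancelled event');
  }
  return problems;
}

const delayedEvent: TimedEvent = {
  scheduledWhen: '2020-04-04T10:00:00+09:00',
  prognosedWhen: '2020-04-04T10:05:00+09:00',
  cancelled: false,
};

const inconsistentEvent: TimedEvent = {
  scheduledWhen: '2020-04-04T10:00:00+09:00',
  prognosedWhen: '2020-04-04T10:05:00+09:00',
  cancelled: true,
};

console.log(checkEvent(delayedEvent).length);      // 0
console.log(checkEvent(inconsistentEvent).length); // 1
```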

Furthermore, if we decide to keep the delay field(s) for user convenience (that would be actualDelay, prognosedDelay and formerPrognosedDelay regarding my proposal), I have an additional thought to consider (note, however, that this proposal is completely independent from the part above and also works with the current working draft, so we could decide on them separately):

If two attributes of an object are always available together (if a location's longitude is available, latitude is too; if longitude is not available, neither is latitude), it might make sense to group them into a separate sub-object. We did that, for example, for latitudes and longitudes by introducing the location type. The main reason behind this was that - since the availability of longitude and latitude was the same - people would have conditions like if exists(station.longitude) then … in their code, which implicitly also makes sure the latitude is set. The problem here is that this not only looks weird and requires readers of the code to know the implicitly encoded relation between the latitude and longitude properties, but in typed languages it also gets really annoying (most compilers only let you access optionals if you explicitly check them, after all, so we would also need to check that latitude was set in this case to make the compiler happy).
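The difference can be sketched in a typed language like this. All type and function names here (`FlatStation`, `StationLocation`, `GroupedStation`, `flatCoords`, `groupedCoords`) are made up for the example and only mirror the latitude/longitude situation described above.

```typescript
// Flat variant: both coordinates are optional on the station itself.
interface FlatStation {
  id: string;
  latitude?: number;
  longitude?: number;
}

// Grouped variant: the sub-object is optional, its attributes are required.
interface StationLocation {
  latitude: number;
  longitude: number;
}
interface GroupedStation {
  id: string;
  location?: StationLocation;
}

function flatCoords(s: FlatStation): [number, number] | null {
  // Both checks are needed to satisfy the compiler, even if the data source
  // guarantees that the two fields always appear together.
  if (s.latitude !== undefined && s.longitude !== undefined) {
    return [s.latitude, s.longitude];
  }
  return null;
}

function groupedCoords(s: GroupedStation): [number, number] | null {
  // One check covers both attributes.
  return s.location ? [s.location.latitude, s.location.longitude] : null;
}

console.log(groupedCoords({ id: 'a', location: { latitude: 52.5, longitude: 13.4 } }));
console.log(groupedCoords({ id: 'b' }));
```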

Now applying the same logic to the date situation, it might therefore make sense to encapsulate actual, prognosed and formerPrognosed data in sub-objects like {date: '2020-04-04T10:00:00+09:00', delay: 0}. While that whole object would be optional, the attributes on that object could then be required. A disadvantage of this is that it would leave the scheduled date looking somewhat different (just being an ISO string literal instead of such an object) from the other ones, but on the other hand, we already implicitly have this different-looking planned date in the current proposal as well (just that the difference is more subtle in that there is no plannedDelay attribute).
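A possible shape for such sub-objects, purely as a sketch: the property names `scheduled`, `actual`, `prognosed` and `formerPrognosed` and the helper `delayOf` are assumptions for this example, and the delay is assumed to be given in seconds relative to the scheduled date.

```typescript
// Sub-object with required attributes; the object as a whole is optional.
interface DateWithDelay {
  date: string;  // ISO 8601 date
  delay: number; // assumed: seconds relative to the scheduled date
}

interface GroupedEvent {
  scheduled: string;               // plain ISO 8601 string, required
  actual?: DateWithDelay;
  prognosed?: DateWithDelay;
  formerPrognosed?: DateWithDelay;
  cancelled: boolean;
}

// Hypothetical helper: one check gives access to both date and delay.
function delayOf(part?: DateWithDelay): number | null {
  return part ? part.delay : null;
}

const groupedExample: GroupedEvent = {
  scheduled: '2020-04-04T10:00:00+09:00',
  prognosed: { date: '2020-04-04T10:05:00+09:00', delay: 300 },
  cancelled: false,
};

console.log(delayOf(groupedExample.prognosed)); // 300
console.log(delayOf(groupedExample.actual));    // null
```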


Wrapping up, please let me know what you think, and of course I'm also fine if we decide not to follow up on all of these proposals.

Just for convenience and so that we don't forget about anything, I'll add a shortlist for my three proposals here which can be judged more or less independently:

  • Update the semantic meaning of our dates as specified in the table above
  • Add a way to express the difference between actual/realtime and prognosed information
  • Group dates and delays in sub-objects

@juliuste
Member Author

juliuste commented Jun 18, 2020

@derhuerst @ialokim I think it might make sense for our discussions if we set a deadline for replying (usually I'm the "culprit" who takes too long, so this is also directed at myself 😂), so I'd kindly ask you to give your opinions by the end of the month (June 30) or let me know that you need longer to reply. Please indicate if this is fine for you or not (thumbs up/down).

@derhuerst
Member

derhuerst commented Aug 16, 2020

> [...] I propose to adapt our current proposal in the following way (note that for simplicity I only use when here, but the same applies for departure and arrival):
>
> • scheduledWhen
> • actualWhen
> • prognosedWhen
> • formerPrognosedWhen
> • cancelled

I agree with your reasoning and these proposals. 👍

> Now applying the same logic to the date situation, it might therefore make sense to encapsulate actual, prognosed and formerPrognosed data in sub-objects like {date: '2020-04-04T10:00:00+09:00', delay: 0}. While that whole object would be optional, the attributes on that object could then be required.

What would the event look like if only the prognosed time is known?

@ialokim

ialokim commented Aug 16, 2020

> I agree with your reasoning and these proposals. 👍

Same here 👍

delay

As raised by @derhuerst, I think your proposal to group dates and delays, and thereby be able to require the delay field, might cause more difficulties for incomplete data sources than it helps users to get the delay information. I think I would even prefer to omit the delay fields altogether: if they are optional, the client has to implement a way to calculate the delay itself anyhow. Also, API users have to check for the different fields (scheduled, actual, prognosed and formerPrognosed) anyhow, so calculating the time offset between two of the date values should not be too difficult on the client's side.
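For reference, calculating the offset on the client side could look roughly like this. The function name `delaySeconds` is made up for the example; it relies only on the standard `Date.parse`, which accepts ISO 8601 strings with a UTC offset.

```typescript
// Difference between two ISO 8601 dates in seconds.
// Positive if `other` is later than `scheduled`.
function delaySeconds(scheduled: string, other: string): number {
  return Math.round((Date.parse(other) - Date.parse(scheduled)) / 1000);
}

// Same UTC offset on both sides.
console.log(delaySeconds('2020-04-04T10:00:00+09:00', '2020-04-04T10:05:00+09:00')); // 300

// Different notations of the offset are handled as well:
// 10:00 at +09:00 is 01:00 UTC, so 01:03 UTC is three minutes later.
console.log(delaySeconds('2020-04-04T10:00:00+09:00', '2020-04-04T01:03:00Z')); // 180
```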

@juliuste
Member Author

juliuste commented Aug 16, 2020

> What would the event look like if only the prognosed time is known?

Not 100% sure what you mean by that, because that wouldn't be valid FPTF according to my proposal, since scheduledWhen is required (and delays would - IMO - always be relative to the scheduled time). Can you elaborate?

@juliuste
Member Author

juliuste commented Oct 7, 2020

@derhuerst ping

@derhuerst
Member

> What would the event look like if only the prognosed time is known?

> [...] scheduledWhen is required (and delays would - IMO - always be relative to the scheduled time).

With various APIs, I've seen the "scheduled" date+time to be missing. It won't be possible to fulfil the spec in these cases.

@andaryjo

andaryjo commented Apr 1, 2021

Hi everybody, there's a lot for me to catch up on here. I tried to read through most of it, but please excuse me if I missed prior discussions of the points I'd like to make.

I recently finished an at least usable version of the trias-client (read more about what TRIAS is and why I think it matters here), and while I was trying to incorporate FPTF, I had to make some changes to get it working for my own current use cases. As there's already a discussion going on regarding a V2 of FPTF, I thought it would make sense to propose my ideas here instead of opening a new issue. The following is therefore based on the current state of the V2 draft.

stopover

Still not sure if I was just not able to realize how to properly use it, but I did not find any possibility to model simple departures for a stop in FPTF. Departure boards are used in nearly every public transport app and provided by every data provider, so I think this is a valid use case.

I tried to model departures using the stopover type (that is probably designed to be used for the intermediate stops of a leg), but I'm missing both line and direction of the departure.

leg

Same applies to the legs of a journey, where information regarding line and direction is missing (at least in the specs, I've seen it in use in the hafas-client).

@derhuerst
Member

> I tried to read through most of it, but please excuse me if I missed prior discussions of the points I'd like to make.

No worries, glad you're adding your perspective!

stopover

> Still not sure if I was just not able to realize how to properly use it, but I did not find any possibility to model simple departures for a stop in FPTF. Departure boards are used in nearly every public transport app and provided by every data provider, so I think this is a valid use case.

Indeed, it was meant as one stopover of a series of stopovers, which are e.g. part of a journey leg.

But its definition is, like the rest of FPTF, quite broad: "A stopover represents a vehicle stopping at a stop/station at a specific time."

So, as long as we keep all stopover fields that are purely related to intermediate stopovers (such as "is leaving the vehicle allowed?") optional in the future, it suits the departure board case, right?

> I tried to model departures using the stopover type (that is probably designed to be used for the intermediate stops of a leg), but I'm missing both line and direction of the departure.

Keep in mind that, by philosophy, FPTF doesn't specify all fields that people have use cases for. Rather, we tried to focus on common fields. So every FPTF-compatible lib is welcome to extend the spec.

I can see though how a departure board (or just a departure information in itself) is a very common case! We could specify line & direction as optional fields.
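A rough sketch of what a departure-board entry with optional line & direction fields might look like. The `DepartureStopover` shape below is simplified and hypothetical (for instance, stop and line are plain strings here), and `formatDeparture` is an illustrative helper, not something defined by FPTF.

```typescript
// Simplified, hypothetical stopover used as a departure-board entry.
interface DepartureStopover {
  type: 'stopover';
  stop: string;        // stop/station id (simplified to a string)
  departure: string;   // ISO 8601 date
  line?: string;       // optional, as suggested above
  direction?: string;  // optional, as suggested above
}

// Formats one row; optional fields are left out when they are missing.
function formatDeparture(s: DepartureStopover): string {
  // Characters 11..15 of an ISO 8601 date are the local HH:MM part.
  const parts: string[] = [s.departure.slice(11, 16)];
  if (s.line !== undefined) parts.push(s.line);
  if (s.direction !== undefined) parts.push('to ' + s.direction);
  return parts.join(' ');
}

const withLine: DepartureStopover = {
  type: 'stopover',
  stop: 'station-1',
  departure: '2020-04-04T10:00:00+09:00',
  line: 'U8',
  direction: 'Wittenau',
};

console.log(formatDeparture(withLine)); // 10:00 U8 to Wittenau
console.log(formatDeparture({ type: 'stopover', stop: 'station-1', departure: '2020-04-04T10:00:00+09:00' })); // 10:00
```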


5 participants