Matching of timetable, real-time and disruptions

Brief Description

Matching means merging timetable data, real-time data and possibly disruption information from different sources. As there is no universal cross-system reference in Swiss public transport that supports merging these data sources, matching is sometimes a delicate matter. The Swiss Journey ID (SJYID) will improve this. Until then, and for international services, matching is the only option.

Functional Description

The guiding principle is that matching must be robust enough that something meaningful is still output for certain use cases, even if only timetable data or only real-time data is available. Adhering to this principle as far as possible requires a thorough understanding of timetable planning and timetable data as well as of scheduling and real-time data.

Matching for Switzerland has already been implemented within OJP and GTFS-RT. Since matching must be configured for each business organisation, a list is kept of the transport companies (identified by business organisation number) for which matched real-time information is available.

Below, we describe how matching can be done in the following cases:

  • Target data – target data with HRDF – GTFS: This is relevant if GTFS-RT is used for real-time data but the timetables are loaded as HRDF.
  • Target data – target data in Switzerland/abroad
  • Target data – real-time data with HRDF – SIRI ET/PT
  • Target data – real-time data with NeTEx – SIRI ET/PT

Swiss Journey ID – SJYID

In future, matching will always be done via the sjyid and, if necessary, via journey relationships (see VDV 454 version 3.0, section 5.2.2.6).

It is important to note that the sjyid is only unique together with an operating day. This is why the sjyid cannot be the ID of a trip in NeTEx or GTFS (these would have to be valid for several days).
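Because the sjyid identifies a journey only in combination with an operating day, a lookup structure needs a composite key. The following Python sketch illustrates this; the class name JourneyKey, the function lookup and the sample records are illustrative assumptions and not part of any SKI+ interface.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class JourneyKey:
    """Composite key: an sjyid alone is not unique across operating days."""
    sjyid: str
    operating_day: date


# Hypothetical real-time records keyed by (sjyid, operating day).
realtime = {
    JourneyKey("ch:1:sjyid:100001:817-003", date(2024, 3, 7)): "delay +2 min",
    JourneyKey("ch:1:sjyid:100001:817-003", date(2024, 3, 8)): "on time",
}


def lookup(sjyid: str, operating_day: date):
    """Return the real-time record for this journey on this day, or None."""
    return realtime.get(JourneyKey(sjyid, operating_day))
```

The same sjyid resolves to different records on different days, which is why it cannot serve as a multi-day trip ID on its own.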

Understanding Matching

Matching can be done at different levels:

  • Matching with sjyid and journey relationships: What we want to achieve in the future.
  • Matching via trip ID and journey relationship: The identifiers of the two data sets can be converted and/or used 1:1. Unfortunately, the IDs often do not match (e.g. HRDF – GTFS).
  • Matching via train number: The train number is available in Switzerland and should be sufficient for matching. If the other rules for assigning train numbers (e.g. for shadow trains) are taken into account, this gets you quite far for rail services (see e.g. train number – Wikipedia).
  • Matching via line/direction: In some cases, the correct journey can already be identified via start time, line information, offer category and direction.
  • Matching via FahrtStartEnde: Start location, start time, arrival location, arrival time and, if applicable, transport company, offer category and line information are used to combine the journeys from the two data sets. This does not work for shortened journeys.
  • Matching route sections: The stop information of both data sets is overlaid. The journey from the second data set is assigned to the best match from the first data set. Arrival time, departure time and stops, but also other detailed information, can be used. Cases such as a reinforcement journey (Verstärkerfahrt) remain treacherous if the journey relationship cannot be used.

This is usually implemented as a fallback from top to bottom. If a journey cannot be matched at all, it may, depending on the use case, be omitted (e.g. for disruption information) or output as an additional journey (real-time data without a target journey).
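The top-to-bottom fallback can be sketched as an ordered list of matchers that are tried in turn. This is a minimal Python sketch under stated assumptions: journeys are plain dicts, all field names (sjyid, trip_id, train_number and so on) are hypothetical, and the route-section level is left out for brevity.

```python
from typing import Callable, Optional

# A journey is represented as a plain dict; all field names are illustrative.
Journey = dict
Matcher = Callable[[Journey, list], Optional[Journey]]


def by_field(*fields: str) -> Matcher:
    """Build a matcher that accepts a candidate only if it is the single
    timetable journey agreeing with the real-time journey on all fields."""
    def match(rt: Journey, timetable: list) -> Optional[Journey]:
        if any(rt.get(f) is None for f in fields):
            return None  # the real-time side lacks this information
        hits = [t for t in timetable
                if all(t.get(f) == rt[f] for f in fields)]
        return hits[0] if len(hits) == 1 else None  # ambiguous: give up
    return match


# Ordered from most to least reliable, mirroring the levels listed above.
LEVELS = [
    ("sjyid", by_field("sjyid", "operating_day")),
    ("trip_id", by_field("trip_id")),
    ("train_number", by_field("train_number", "operating_day")),
    ("line_direction", by_field("line", "direction", "start_time")),
    ("start_end", by_field("start_stop", "start_time", "end_stop", "end_time")),
]


def match_journey(rt: Journey, timetable: list):
    """Try each level in turn; return (level name, timetable journey)."""
    for name, matcher in LEVELS:
        hit = matcher(rt, timetable)
        if hit is not None:
            return name, hit
    # Caller decides: omit the journey or treat it as an additional journey.
    return "unmatched", None
```

Returning the level name alongside the match makes it possible to log how reliable each match was.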

Matching on fuzzy times

In Switzerland, timetable data is only accurate to the minute, so times from different sources may differ slightly. However, it is generally not a good idea to match on fuzzy times. The original target times should also be transmitted in the real-time data, and matching should use them.
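One way to avoid a tolerance window is to compare target times from both sources at the minute precision of the timetable. The sketch below assumes that the real-time data carries the original target time and that any mismatch comes from seconds that the timetable does not contain; both function names are illustrative.

```python
from datetime import datetime


def to_minute(ts: datetime) -> datetime:
    """Truncate a timestamp to the minute precision of the timetable."""
    return ts.replace(second=0, microsecond=0)


def same_target_time(timetable_time: datetime, rt_target_time: datetime) -> bool:
    """Compare target times exactly at minute precision (no tolerance window)."""
    return to_minute(timetable_time) == to_minute(rt_target_time)
```

Truncation is an assumption here; if a data supplier rounds instead, the helper would need to round in the same way.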

Stops and stop places

Increasingly, data is output for specific stopping points (e.g. platforms). For matching purposes, however, it may make sense to fall back to the stop, as a platform change can still happen.
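Falling back to the stop can be as simple as normalising both references before comparing them. In this sketch the mapping PARENT_STOP and all identifiers in it are hypothetical placeholders; real data would derive the parent stop from the master data.

```python
# Hypothetical mapping from stopping point (e.g. platform) to its parent stop.
PARENT_STOP = {
    "stop-A:platform-1": "stop-A",
    "stop-A:platform-2": "stop-A",
}


def stop_for_matching(stop_ref: str) -> str:
    """Return the parent stop if stop_ref is a known stopping point,
    otherwise return stop_ref unchanged."""
    return PARENT_STOP.get(stop_ref, stop_ref)


def same_stop(timetable_ref: str, realtime_ref: str) -> bool:
    """Compare two stop references at stop level, ignoring the platform."""
    return stop_for_matching(timetable_ref) == stop_for_matching(realtime_ref)
```

With this normalisation, a journey planned on platform 1 and reported on platform 2 of the same stop still matches.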

Technical Description

Relevant Identifiers HRDF

In HRDF, the relevant information is:

  • *Z line: FahrtId, administration and version
  • *I RN: the region code (for PostBus)

With these four details (FahrtId, administration, version and region code), a journey is fully defined.

Alternatively, instead of using the version, the variant that is active on the relevant day can be selected using *A VE and the BITFIELD.

The version is only assigned by us during the HRDF export. It is therefore NOT a stable part of the identifier and must be re-read with each import.

The sjyid will be exported to HRDF as follows:

  • FPLAN: *I JY <refid>
  • INFOTEXT: <refid> ch:1:sjyid:<id>
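Resolving the sjyid from these two files is a two-step lookup: the FPLAN line yields a reference ID, and the INFOTEXT file maps that reference ID to the sjyid. The sketch below takes the two layouts shown above literally and assumes whitespace-separated fields; real HRDF lines are fixed-width and may carry further columns, so the regular expressions are assumptions rather than a description of the actual format.

```python
import re

# Assumed simplified layouts, taken literally from the two lines above:
#   FPLAN:    "*I JY <refid>"
#   INFOTEXT: "<refid> ch:1:sjyid:<id>"
FPLAN_JY = re.compile(r"^\*I JY\s+(\S+)")
INFOTEXT_SJYID = re.compile(r"^(\S+)\s+(ch:1:sjyid:\S+)")


def load_infotext(lines):
    """Map refid -> sjyid for all INFOTEXT lines that carry an sjyid."""
    table = {}
    for line in lines:
        m = INFOTEXT_SJYID.match(line)
        if m:
            table[m.group(1)] = m.group(2)
    return table


def sjyids_in_fplan(fplan_lines, infotext):
    """Yield the sjyid referenced by each '*I JY' line, in file order."""
    for line in fplan_lines:
        m = FPLAN_JY.match(line)
        if m and m.group(1) in infotext:
            yield infotext[m.group(1)]
```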

Relevant Identifiers GTFS

The trip_id serves as the identifier.

In the definition of GTFS and in the internal representation in SKI+, lines (and sometimes also journeys) are grouped differently. For this reason, there is no direct equivalent to the journeys/lines in HRDF, i.e. it is not possible to convert the ID.

If possible, the route_id is kept stable during the export. Changes usually only occur at the time of the timetable change.

The sjyid will be exported in GTFS as follows: There will be a new column sjyid in trips.txt.
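Once the column exists, an index from sjyid to trip_id can be built with a standard CSV reader. The following sketch assumes the column is named sjyid as stated above; the sample rows are hypothetical, and the index keeps a list per sjyid as a precaution in case several trips share one value.

```python
import csv
import io


def index_trips_by_sjyid(trips_txt):
    """Read trips.txt (a file-like object) and map sjyid -> list of trip_ids."""
    index = {}
    for row in csv.DictReader(trips_txt):
        sjyid = (row.get("sjyid") or "").strip()
        if sjyid:  # the column may be absent or empty
            index.setdefault(sjyid, []).append(row["trip_id"])
    return index


# Hypothetical sample; real trip_id and route_id values look different.
sample = io.StringIO(
    "route_id,service_id,trip_id,sjyid\n"
    "R1,S1,T1,ch:1:sjyid:100001:817-003\n"
    "R1,S2,T2,ch:1:sjyid:100001:817-003\n"
    "R2,S1,T3,\n"
)
trips_by_sjyid = index_trips_by_sjyid(sample)
```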

! We are considering whether to always include the sjyid in GTFS-RT.

Relevant Identifiers NeTEx

As with GTFS, the internal representation of the ID serves as the basis for the ServiceJourney generated in NeTEx.

The id attribute of the ServiceJourney will continue to be directly generated and has no relation to HRDF.

The validity can be determined by evaluating the AvailabilityCondition element.

The sjyid will be stored as a KeyValue pair.

Attention: There may be several ServiceJourneys with the same SJYID (variants).

Relevant Identifiers OJP

In OJP, the sjyid is output in ojp:JourneyRef. This always includes ojp:OperatingDayRef. Both parts are contained in the ojp:Service element:

<ojp:Service>
 <ojp:OperatingDayRef>2024-03-07</ojp:OperatingDayRef>
 <ojp:JourneyRef>ch:1:sjyid:100001:817-003</ojp:JourneyRef>
 <siri:LineRef>ojp:91081:A</siri:LineRef>
 <siri:DirectionRef>H</siri:DirectionRef>
 <!-- further elements omitted -->
</ojp:Service>
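Reading the two values from such an element can be sketched with Python's standard XML parser. The excerpt is repeated here with namespace declarations so that it parses on its own; the namespace URIs bound to the ojp and siri prefixes are assumptions and should be checked against the actual response.

```python
import xml.etree.ElementTree as ET

# Assumed prefix bindings; verify against the namespaces in the real response.
NS = {
    "ojp": "http://www.vdv.de/ojp",
    "siri": "http://www.siri.org.uk/siri",
}

# Excerpt from above, wrapped so that it is well-formed on its own.
SERVICE_XML = """
<ojp:Service xmlns:ojp="http://www.vdv.de/ojp"
             xmlns:siri="http://www.siri.org.uk/siri">
  <ojp:OperatingDayRef>2024-03-07</ojp:OperatingDayRef>
  <ojp:JourneyRef>ch:1:sjyid:100001:817-003</ojp:JourneyRef>
  <siri:LineRef>ojp:91081:A</siri:LineRef>
  <siri:DirectionRef>H</siri:DirectionRef>
</ojp:Service>
"""


def journey_key(service: ET.Element):
    """Return (JourneyRef, OperatingDayRef) of an ojp:Service element."""
    journey_ref = service.findtext("ojp:JourneyRef", namespaces=NS)
    operating_day = service.findtext("ojp:OperatingDayRef", namespaces=NS)
    return journey_ref, operating_day


key = journey_key(ET.fromstring(SERVICE_XML))
```

The resulting pair can be used directly as the composite key for matching.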

Matching HRDF – GTFS

This use case is described here.

Matching to international data – Timetable

If, due to the data situation, only the Swiss part of the target journey is available, matching must be done either via the ID or via the few stops that are available. A typical example is the ICE, for which we only have the Basel – Basel Badischer Bahnhof section in the target data. These two stops must therefore be used for matching.

By default, the target journeys are always planned up to the first commercial stop abroad. If the second data set handles this in the same way, at least two stops can be matched correctly. A false match is unlikely in this case, especially for international rail services.
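Matching on the two overlapping stops can be sketched as counting the calls on which both data sets agree. This Python sketch assumes that both sides use the same stop identifiers and target times; the stop names and times in the usage note are placeholders, not real data.

```python
def shared_calls(journey_a, journey_b):
    """Return the calls on which both journeys agree in stop and target time.
    Each journey is a list of (stop_id, target_time) tuples."""
    calls_b = set(journey_b)
    return [call for call in journey_a if call in calls_b]


def is_match(journey_a, journey_b, min_shared=2):
    """Accept the pair if at least `min_shared` calls coincide."""
    return len(shared_calls(journey_a, journey_b)) >= min_shared
```

For example, a Swiss part [("basel_sbb", "10:13"), ("basel_bad_bf", "10:19")] would match a foreign journey that contains the same two calls followed by further stops abroad, but not one that shares only a single call.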

Local transport is usually fully present in either one or both systems (since only one control system is responsible).

Matching via trip ID is often not possible. The train number is one option. Unfortunately, it is possible, and even likely, that the offer category and operator differ between the data sets. These attributes should therefore not be the focus of international matching.

Matching to international data – real-time

CUS already generates a consolidated cross-border journey for direct journeys. There may be no stops abroad (except for the final stop).

If, due to the data situation, only the Swiss part of the target journey is available, matching must be done either via the ID or via the few stops that are available. A typical example is the ICE, for which we only have the Basel – Basel Badischer Bahnhof section in the target data. These two stops must therefore be used for matching.

The target journeys are always planned up to the first commercial stop abroad, i.e. two stops should always be available.

Matching for OJP (current)

From a technical point of view, various forms of matching can be carried out. In the first step, referencing is carried out using a FahrtID (journey ID). This is the case for the standard-gauge railway, for example. If this FahrtID (journey ID) is not provided, different parameters from the two data sources must be matched:

  • Stops (see also Master Data and Metadata – Overview): This is not relevant in Switzerland as the DiDok number is generally used. It is only necessary to distinguish between a stop and a stopping point. However, since the stopping point contains the DiDok number, derivation is ensured again.
  • Transport companies: Transport companies are matched indirectly via line and direction.
  • Line and direction: Trips are matched based on line and direction (see also Timetable data – overview). Since lines are distinguished based on their affiliation to a transport company, transport companies are indirectly matched with each other.
  • Within the assigned line/direction, an algorithm provided by the supplier tries to search for and match the identical journeys.
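The supplier's algorithm is not described here, so the following Python sketch only illustrates the general shape of the last two steps: journeys are first bucketed by line and direction, and within a bucket a journey is paired if exactly one candidate has the same target start time. All field names and the pairing rule are assumptions.

```python
from collections import defaultdict


def group_by_line_direction(journeys):
    """Bucket journeys by their (line, direction) pair."""
    buckets = defaultdict(list)
    for j in journeys:
        buckets[(j["line"], j["direction"])].append(j)
    return buckets


def match_within_line(rt_journeys, timetable_journeys):
    """Pair each real-time journey with the timetable journey of the same
    line/direction and target start time, if exactly one such journey exists."""
    buckets = group_by_line_direction(timetable_journeys)
    pairs = {}
    for rt in rt_journeys:
        candidates = [t for t in buckets.get((rt["line"], rt["direction"]), [])
                      if t["start_time"] == rt["start_time"]]
        if len(candidates) == 1:
            pairs[rt["id"]] = candidates[0]["id"]
    return pairs
```

Real-time journeys without a unique candidate stay unpaired and can be handled by a later fallback step.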

TODO

Planned extensions of this page:

  • Relevant identifiers SIRI ET / PT
  • Relevant identifiers SIRI SX / VDV 736
  • Matching NeTEx – SIRI