Feeds and feed versions
Transitland is built on publicly available GTFS data contributed by our user community to the Transitland Atlas. Detailed information is kept on each Feed, and updated whenever a new version of each Feed is discovered. Feed versions are archived for download (as
.zip files) and imported into the Transitland Datastore for API querying by operators, stops, routes, schedules, etc.
A Feed represents a unique GTFS data source. Each Feed has a URL to a publicly accessible GTFS archive, a mapping of GTFS
agency_id values to Transitland Operators, the geographic extent of the Feed, and the details of the Feed's license.
Feed data model
|Onestop ID||Feed Onestop ID|
|URL||Publicly accessible GTFS archive|
|Geometry||Convex hull of Stops in the Feed|
|DateTime||Last time the Feed was retrieved|
|DateTime||Last time the Feed was imported|
|String||License name, such as |
|String||Required attribution text|
|URL||URL to Feed License|
|Feed Versions||Feed Version IDs (SHA1) for this Feed|
|Feed Version||Active Feed Version ID|
|Object array||Mapping of GTFS agency_id values to Transitland Operators|
|Changesets||Changesets created from Feed|
Feed license information
To learn more about how Transitland classifies the licenses associated with a feed, see this overview of Transitland legal and licensing issues.
Approximately once per day, the URL for each Feed is checked. When a new version of the Feed is found, a Feed Version is created. The ID for each Feed Version is the SHA1 checksum of the GTFS archive.
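As a minimal sketch of how such an ID can be derived, the following computes the SHA1 checksum of a local GTFS archive. The function name feed_version_id and the chunked read are illustrative choices, not the Datastore's own code.

```python
import hashlib


def feed_version_id(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of the file at `path`.

    Hypothetical helper: the name and the chunk size are assumptions
    made for this example only.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as archive:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: archive.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the ID depends only on the bytes of the archive, two fetches of an unchanged archive produce the same ID.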
Feed versions data model
|SHA1||SHA1 checksum of the GTFS archive|
|MD5||MD5 checksum of GTFS archive|
|Onestop ID||Parent Feed Onestop ID|
|DateTime||Time the Feed Version was originally fetched|
|URL||URL when fetched|
|URL||Archived copy of Feed Version, if allowed|
|URL||Archived Google feedvalidator.py report|
|Date||First day of scheduled service|
|Date||Last day of scheduled service|
|DateTime||Last time Feed Version was imported|
|Integer||Import level (0-4)|
|Enum||Import status, such as |
|IDs||Feed Version Import IDs|
|IDs||Feed Version Info IDs|
|Changesets||Changesets created from Feed Version|
Feed versions API
Active feed version
The most recent version of a feed that has been imported into the Transitland Datastore is marked as active. The schedule API endpoint only allows querying of the trips and calendars in the active feed version.
The FeedMaintenance service within Transitland Datastore automatically decides when to import a newly fetched feed version. If no new feed version is available when existing ScheduleStopPairs are about to expire, the FeedMaintenance service will extend them into the future.
Most recent feed version
Are you looking for the most recently fetched Feed Version? This is not necessarily the active Feed Version.
To query for the most recently fetched version of a feed, filter the Feed Versions API by feed_onestop_id (replacing the value as appropriate).
To directly download a copy of the most recently fetched version of a feed, use:
https://transit.land/api/v1/feeds/f-9v6-capitalmetro/download_latest_feed_version (replacing the Onestop ID as appropriate). Note that downloading is not allowed for Feeds whose license does not permit redistribution.
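The two requests described above can be sketched as URL builders. The download path is taken from the example URL in this section. The feed_versions path and the sort_key, sort_order and per_page parameters in the query builder are assumptions made for illustration and should be checked against the API reference before use.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://transit.land/api/v1"


def latest_download_url(feed_onestop_id):
    """Build the download_latest_feed_version URL for a Feed."""
    return f"{API_BASE}/feeds/{quote(feed_onestop_id, safe='')}/download_latest_feed_version"


def latest_feed_version_query(feed_onestop_id):
    """Build a query for the most recently fetched Feed Version.

    Assumption: the endpoint path and every parameter other than
    feed_onestop_id are hypothetical and not confirmed by this page.
    """
    params = {
        "feed_onestop_id": feed_onestop_id,
        "sort_key": "fetched_at",  # assumed sort parameter
        "sort_order": "desc",      # assumed: newest first
        "per_page": 1,             # assumed: return a single record
    }
    return f"{API_BASE}/feed_versions?{urlencode(params)}"


print(latest_download_url("f-9v6-capitalmetro"))
print(latest_feed_version_query("f-9v6-capitalmetro"))
```

Neither function performs a network request, so the resulting URLs can be inspected before they are passed to an HTTP client.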
Feed version reports
The Transitland Datastore creates a number of validation and statistical reports for each Feed Version. The currently defined types of reports are:
FeedVersionInfoStatistics: General statistics
FeedVersionInfoConveyalValidation: Conveyal gtfs-lib validation results
Additionally, the HTML output of Google's feedvalidator.py is stored on the Feed Version as feedvalidator_url when available. In the future, this may instead be stored as an additional type of report.
Feed version report data model
|Feed Version||Parent Feed Version|
|Onestop ID||Parent Feed|
|JSON||JSON blob containing report data|
Feed version report API
|Onestop ID||Filter by Feed||Caltrain|
|Feed Version||Filter by Feed Version||Caltrain, single Feed Version|
|Enum||Filter by report type||Caltrain statistics|
FeedVersionInfoStatistics
This report contains details about the files in the GTFS archive and basic statistics about the CSV columns and values.
filenames: The filenames present in the directory of the archive containing the CSV files.
statistics: Data for each GTFS CSV file and column, including the total number of rows with data for that column and the unique number of values encountered.
scheduled_service: Key-value data for the number of seconds of scheduled service for each date the Feed has scheduled trips.
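As a rough illustration of working with this report, the sketch below summarizes the scheduled_service data. The top-level keys follow the list above. The date key format, the sample values, and the helper name summarize_scheduled_service are assumptions made for the example.

```python
def summarize_scheduled_service(report):
    """Summarize the scheduled_service section of a statistics report.

    Hypothetical helper: returns None when no service is scheduled.
    """
    service = report["scheduled_service"]
    if not service:
        return None
    # Convert seconds of scheduled service to hours for each date.
    hours = {day: seconds / 3600 for day, seconds in service.items()}
    busiest_day = max(hours, key=hours.get)
    return {
        "days": len(hours),
        "total_hours": sum(hours.values()),
        "busiest_day": busiest_day,
    }


# Invented sample data; real reports will contain many more entries.
sample_report = {
    "filenames": ["stops.txt", "trips.txt"],
    "statistics": {},
    "scheduled_service": {"2017-01-02": 7200, "2017-01-03": 10800},
}

summary = summarize_scheduled_service(sample_report)
print(summary)
```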
FeedVersionInfoConveyalValidation
This report contains the JSON output of Conveyal's gtfs-lib validator.
Google feedvalidator.py reports
The HTML output of Google feedvalidator.py. Currently, this is stored on the Feed Version record itself: feedvalidator_url links to a copy of the report stored on S3.
Feed version update statistics
The Feeds API also provides simple statistics characterizing when and how a Feed is updated. This includes the total number of Feed Versions, the average number of days between publication of new Feed Versions, the average number of service days in each Feed Version schedule, and the average number of days of overlapping schedule between subsequent Feed Versions. Note: because the fetched_at value is used to sort the Feed Versions, manually uploaded files are excluded from statistics.
|Onestop ID||Feed Onestop ID|
|Integer||Total number of Feed Versions|
|Integer||Total, excluding manually uploaded Feed Versions|
|Feed Versions||Filtered Feed Versions ordered by fetched_at|
|Integer||Average days between new Feed Versions|
|Float||Average number of days in each schedule|
|Float||Average number of days overlap between subsequent schedules|
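To make the definitions above concrete, here is a minimal sketch of how such statistics could be computed from a list of Feed Versions. It is not the Datastore's implementation. The field names earliest_calendar_date and latest_calendar_date, the output keys, and the convention that manually uploaded versions carry fetched_at set to None are all assumptions.

```python
from datetime import date, datetime
from statistics import mean


def update_statistics(feed_versions):
    """Compute simple update statistics for one Feed (illustrative only)."""
    # Assumption: manually uploaded versions have fetched_at set to None.
    fetched = sorted(
        (v for v in feed_versions if v["fetched_at"] is not None),
        key=lambda v: v["fetched_at"],
    )
    pairs = list(zip(fetched, fetched[1:]))
    # Days between consecutive fetches.
    intervals = [
        (b["fetched_at"] - a["fetched_at"]).total_seconds() / 86400
        for a, b in pairs
    ]
    # Schedule length as last day minus first day (not counted inclusively).
    service = [
        (v["latest_calendar_date"] - v["earliest_calendar_date"]).days
        for v in fetched
    ]
    # Overlap between one schedule and the next, never negative.
    overlaps = [
        max(0, (a["latest_calendar_date"] - b["earliest_calendar_date"]).days)
        for a, b in pairs
    ]
    return {
        "total": len(feed_versions),
        "filtered": len(fetched),
        "avg_days_between": mean(intervals) if intervals else None,
        "avg_service_days": mean(service) if service else None,
        "avg_overlap_days": mean(overlaps) if overlaps else None,
    }


# Invented sample data: two fetched versions and one manual upload.
versions = [
    {"fetched_at": datetime(2017, 1, 1), "earliest_calendar_date": date(2017, 1, 1),
     "latest_calendar_date": date(2017, 3, 31)},
    {"fetched_at": datetime(2017, 1, 31), "earliest_calendar_date": date(2017, 3, 1),
     "latest_calendar_date": date(2017, 5, 30)},
    {"fetched_at": None, "earliest_calendar_date": date(2017, 6, 1),
     "latest_calendar_date": date(2017, 8, 31)},
]

stats = update_statistics(versions)
print(stats)
```

The averages are returned as None when there are too few fetched versions to compare, which avoids reporting a misleading zero.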