Transitland Datasets: File Formats and Data Dictionary
Data Files and Formats
Each Dataset package includes data in the following standardized formats:
- stops.csv - Tabular data ready to view as a spreadsheet, or to import into GIS using the latitude and longitude columns
- stops.geojsonl - Geospatial vector data ready to load into GIS or data-science tools
- routes.csv - Tabular data ready to view as a spreadsheet (no geometries)
- routes.geojsonl - Geospatial vector data including each route's shape
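As a rough illustration of how the two formats differ when read programmatically, the sketch below parses a CSV row and a GeoJSONL line using only the Python standard library. It assumes that a .geojsonl file holds one GeoJSON Feature per line; the sample rows, the values in them, and the helper names `read_csv_rows` and `read_geojsonl` are invented for this example and are not part of the dataset.

```python
import csv
import io
import json

# Hypothetical sample content standing in for stops.csv and stops.geojsonl.
SAMPLE_CSV = "stop_id,stop_name,stop_lon,stop_lat\nS1,Main St,-122.68,45.52\n"
SAMPLE_GEOJSONL = (
    '{"type": "Feature", '
    '"geometry": {"type": "Point", "coordinates": [-122.68, 45.52]}, '
    '"properties": {"stop_id": "S1", "stop_name": "Main St"}}\n'
)


def read_csv_rows(text_stream):
    """Return every CSV row as a dict keyed by column name (values are strings)."""
    return list(csv.DictReader(text_stream))


def read_geojsonl(text_stream):
    """Return one parsed feature per non-empty line (assumed layout)."""
    return [json.loads(line) for line in text_stream if line.strip()]


csv_rows = read_csv_rows(io.StringIO(SAMPLE_CSV))
features = read_geojsonl(io.StringIO(SAMPLE_GEOJSONL))

# CSV coordinates arrive as text and need converting; GeoJSON coordinates are numbers.
print(csv_rows[0]["stop_name"], float(csv_rows[0]["stop_lon"]))
print(features[0]["geometry"]["coordinates"])
```

With real files, the `io.StringIO` objects would be replaced by files opened in text mode.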
Data Dictionary
Each Dataset zip file includes:
- Full data dictionary in human-readable Markdown format
- Frictionless DataPackage v2 as machine-readable JSON
- Complete field descriptions and examples
The latest data dictionary is also printed below for reference.
Route-Oriented Dataset Schema
The routes.csv file includes four main sets of columns and the routes.geojsonl file includes four main sets of matching properties:
- Route record and metadata - Basic route information
- Agency record and metadata - Agency serving the route
- Route headway data - Quantitative summary of route frequency
- Feed metadata - Source feed information
Route Record and Metadata
- route_onestop_id - Transitland's unique identifier for the route. Can be used to construct a link to https://www.transit.land/routes/<route_onestop_id>
- route_short_name - Rider-facing short name for the route
- route_long_name - Rider-facing longer, descriptive name
- route_desc - Rider-facing description
- route_type - Vehicle type enum (see Route Types below)
- route_id - GTFS ID used within the source feed. This will not be unique across other feeds.
- route_color - Hex RGB value for map representation
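The link pattern for route_onestop_id can be sketched as a small helper. The function name `route_url` and the sample ID are hypothetical, and percent-encoding the ID is a defensive assumption rather than a documented requirement.

```python
from urllib.parse import quote

ROUTE_BASE_URL = "https://www.transit.land/routes/"


def route_url(route_onestop_id: str) -> str:
    """Build a route page link from a route_onestop_id value.

    Assumption: unusual characters should be percent-encoded. Letters,
    digits, and the characters - . _ ~ are left unchanged by quote().
    """
    return ROUTE_BASE_URL + quote(route_onestop_id, safe="")


# "r-example-1" is an invented ID used only for illustration.
print(route_url("r-example-1"))
```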
Agency Information
- agency_id - GTFS ID from the source feed for the agency; may not be unique across different feeds
- agency_name - Rider-facing agency name
Route Headway Data
Headway data is calculated for each route direction and day-of-week category (Monday-Friday, Saturday, Sunday). A headway is the number of seconds between consecutive departing trips; headways are summarized as a mean and a median within specific time windows.
Time windows:
- 7am-9am (peak morning)
- 9am-4pm (midday)
- 4pm-6pm (peak evening)
- 6pm-7am (overnight)
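To make the window boundaries concrete, here is a minimal sketch that assigns an hour of the day to one of the four windows listed above. It assumes each window includes its start hour and excludes its end hour; the source does not state how boundary times are handled, and the function name `headway_window` is hypothetical.

```python
def headway_window(hour: int) -> str:
    """Return the time-window label for an hour of the day (0-23).

    Assumption: windows are half-open, so 9:00 counts as midday
    rather than peak morning.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if 7 <= hour < 9:
        return "peak morning"
    if 9 <= hour < 16:
        return "midday"
    if 16 <= hour < 18:
        return "peak evening"
    # Everything from 6pm through 6:59am falls in the overnight window.
    return "overnight"


for sample_hour in (3, 7, 12, 17, 22):
    print(sample_hour, headway_window(sample_hour))
```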
Columns include:
- <prefix>_selected_service_date - Transitland's algorithm selected this as a representative service date. The following data is all calculated from this service day's schedules.
- <prefix>_departure_times - Space-separated list of departure times
- <prefix>_headway_<time>_mean - Average headway in seconds
- <prefix>_headway_<time>_median - Median headway in seconds
- <prefix>_selected_stop_id - GTFS ID from the source feed of the stop used to calculate headways
- <prefix>_selected_stop_intid - Internal stop ID within Transitland data; ensured to be globally unique.
- <prefix>_selected_stop_name - Rider-facing stop name
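The relationship between the departure times and the mean and median headway columns can be illustrated with a short calculation. This is a sketch only, not Transitland's actual algorithm: it assumes departure times are formatted as HH:MM:SS, ignores the time windows, and uses invented sample values and hypothetical helper names.

```python
from statistics import mean, median


def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string (assumed format) to seconds after midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def headways(departure_times: str) -> list:
    """Return the gaps in seconds between consecutive departures."""
    seconds = sorted(to_seconds(t) for t in departure_times.split())
    return [later - earlier for earlier, later in zip(seconds, seconds[1:])]


# Hypothetical value for a <prefix>_departure_times column.
sample_departures = "07:00:00 07:10:00 07:30:00 08:00:00"
gaps = headways(sample_departures)

print(gaps)          # gaps of 10, 20, and 30 minutes, in seconds
print(mean(gaps))    # average headway in seconds
print(median(gaps))  # median headway in seconds
```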
Prefixes:
- hw_best - Most departures
- hw_dow1_dir0/1 - Monday-Friday, direction 0/1
- hw_dow6_dir0/1 - Saturday, direction 0/1
- hw_dow7_dir0/1 - Sunday, direction 0/1
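The prefixes combine with the column suffixes above to form full column names. The sketch below enumerates them, reading "dir0/1" as shorthand for two separate prefixes ending in _dir0 and _dir1; that reading, and the helper name `headway_prefixes`, are assumptions.

```python
from itertools import product

# Day-of-week categories as numbered in the prefixes above.
DAY_CATEGORIES = {1: "Monday-Friday", 6: "Saturday", 7: "Sunday"}


def headway_prefixes() -> list:
    """List every headway column prefix: hw_best plus one per day category and direction."""
    prefixes = ["hw_best"]
    for dow, direction in product(DAY_CATEGORIES, (0, 1)):
        prefixes.append(f"hw_dow{dow}_dir{direction}")
    return prefixes


prefixes = headway_prefixes()
print(prefixes)

# Example of building one full column name from a prefix.
print(f"{prefixes[1]}_selected_stop_id")
```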
Route Types
Route types are defined by the GTFS specification and extended by Transitland with additional codes (100-1700+).
Standard GTFS route types:
- 0: Tram, Streetcar, Light rail
- 1: Subway, Metro
- 2: Rail
- 3: Bus
- 4: Ferry
- 5: Cable tram
- 6: Aerial lift
- 7: Funicular
- 11: Trolleybus
- 12: Monorail
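When working with the route_type column, a lookup table for the standard codes is often convenient. The sketch below covers only the standard GTFS values listed above; extended codes are reported as unrecognized rather than guessed, and the function name `route_type_label` is hypothetical.

```python
# Standard GTFS route types, as listed above.
GTFS_ROUTE_TYPES = {
    0: "Tram, Streetcar, Light rail",
    1: "Subway, Metro",
    2: "Rail",
    3: "Bus",
    4: "Ferry",
    5: "Cable tram",
    6: "Aerial lift",
    7: "Funicular",
    11: "Trolleybus",
    12: "Monorail",
}


def route_type_label(value) -> str:
    """Map a route_type value (int or numeric string) to a readable label."""
    try:
        code = int(value)
    except (TypeError, ValueError):
        return "Unknown"
    # Extended codes are not in this table, so they are flagged instead.
    return GTFS_ROUTE_TYPES.get(code, f"Extended or unknown ({code})")


print(route_type_label("3"))
print(route_type_label(700))
```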
Stop-Oriented Dataset Schema
The stops.csv file includes five main sets of columns and the stops.geojsonl file includes five sets of properties on each feature:
- Stop record and metadata - Basic stop information
- Administrative boundaries - Geographic and political boundaries
- Route records and metadata - Up to 5 routes serving the stop
- Transit service summary - Frequency of arrivals/departures
- Feed metadata - Source feed information
Stop Record and Metadata
- stop_onestop_id - Transitland's unique identifier for the stop location
- stop_id - GTFS ID from the source feed for the stop (may not be unique across different feeds)
- stop_name - Name provided by the transit operator
- stop_desc - Optional description
- stop_lon - Longitude coordinate
- stop_lat - Latitude coordinate
Administrative Boundaries
- adm0_name - Country name (e.g., "United States of America")
- adm0_iso - Country ISO code following the ISO 3166-1 standard (e.g., "US")
- adm1_name - State/province name (e.g., "Oregon")
- adm1_iso - State/province ISO code following the ISO 3166-2 standard (e.g., "US-OR")
Route Records and Metadata
Each stop can be served by up to 5 different routes. For each route (1-5), the following information is provided:
- agency_id_N - ID of the transit operator
- agency_name_N - Name of the transit operator
- route_id_N - ID of the route
- route_short_name_N - Short name of the route
- route_long_name_N - Long name of the route
- route_type_N - Vehicle type of the route (see Route Types)
- route_color_N - Hex RGB value for map representation
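Because the route columns are repeated with a numeric suffix, it can help to fold them back into a list of route records. The sketch below assumes N is replaced by the digits 1 through 5 (for example route_id_1) and that unused slots are left blank; the sample row and the helper name `routes_for_stop` are invented for illustration.

```python
ROUTE_FIELDS = (
    "agency_id",
    "agency_name",
    "route_id",
    "route_short_name",
    "route_long_name",
    "route_type",
    "route_color",
)


def routes_for_stop(row: dict) -> list:
    """Collect the non-empty route slots (1-5) from one stop row."""
    routes = []
    for n in range(1, 6):
        route = {field: row.get(f"{field}_{n}", "") for field in ROUTE_FIELDS}
        # Assumption: a slot with every field blank means no route is present.
        if any(route.values()):
            routes.append(route)
    return routes


# Hypothetical stop row with two of the five route slots filled.
sample_row = {
    "stop_name": "Main St",
    "route_id_1": "10", "route_short_name_1": "10", "route_type_1": "3",
    "route_id_2": "BLUE", "route_long_name_2": "Blue Line", "route_type_2": "0",
    "route_id_3": "",
}
stop_routes = routes_for_stop(sample_row)
print([route["route_id"] for route in stop_routes])
```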
Transit Service Summary
Service frequency is calculated based on scheduled departures for a designated "target week" near the dataset delivery date. Includes:
- departure_count_dowN - Total trips departing on day N (both directions)
- departure_count_dowN_dir0 - Trips departing on day N (inbound)
- departure_count_dowN_dir1 - Trips departing on day N (outbound)
Where N ranges from 1 (Monday) to 7 (Sunday).
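As an illustration of the day numbering, the sketch below reads the seven departure_count_dowN columns from one stop row and totals them for the week. The sample values and the helper name `weekly_departures` are hypothetical, and treating blank cells as zero is an assumption.

```python
DAY_NAMES = {
    1: "Monday", 2: "Tuesday", 3: "Wednesday", 4: "Thursday",
    5: "Friday", 6: "Saturday", 7: "Sunday",
}


def weekly_departures(row: dict) -> dict:
    """Map each day name to its departure_count_dowN value as an integer."""
    return {
        DAY_NAMES[n]: int(row.get(f"departure_count_dow{n}") or 0)
        for n in range(1, 8)
    }


# Hypothetical CSV row; values arrive as strings and Sunday is blank.
sample_stop_row = {
    "departure_count_dow1": "120", "departure_count_dow2": "120",
    "departure_count_dow3": "120", "departure_count_dow4": "120",
    "departure_count_dow5": "118", "departure_count_dow6": "80",
    "departure_count_dow7": "",
}
by_day = weekly_departures(sample_stop_row)
print(by_day["Saturday"], sum(by_day.values()))
```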
Feed Metadata
Each stop or route record includes the following columns/properties providing information on the source lineage.
- feed_id - Transitland's Onestop ID for the source feed. Can be used to construct links to https://www.transit.land/feeds/<feed_id>
- feed_version_sha1 - Feed version identifier. Can be used to construct links to https://www.transit.land/feed-versions/<feed_version_sha1>
- feed_version_fetched_at - Timestamp for when Transitland fetched the feed version from which this stop or route came. Can be used to evaluate data freshness.
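The feed metadata columns lend themselves to two small helpers: one pair that builds the links described above, and one that turns feed_version_fetched_at into an age in days. This sketch assumes the timestamp is an ISO 8601 string and that a value without a timezone is in UTC; the function names and sample values are hypothetical.

```python
from datetime import datetime, timezone


def feed_url(feed_id: str) -> str:
    """Link to the feed page for a feed_id value."""
    return f"https://www.transit.land/feeds/{feed_id}"


def feed_version_url(feed_version_sha1: str) -> str:
    """Link to the feed version page for a feed_version_sha1 value."""
    return f"https://www.transit.land/feed-versions/{feed_version_sha1}"


def age_in_days(fetched_at: str, now: datetime) -> float:
    """Days elapsed between the fetch timestamp and `now`.

    Assumption: fetched_at is ISO 8601; a trailing "Z" is rewritten so
    that older Python versions can parse it.
    """
    fetched = datetime.fromisoformat(fetched_at.replace("Z", "+00:00"))
    if fetched.tzinfo is None:
        fetched = fetched.replace(tzinfo=timezone.utc)
    return (now - fetched).total_seconds() / 86400


# Invented values for illustration only.
reference_time = datetime(2024, 1, 11, tzinfo=timezone.utc)
print(feed_url("f-example"))
print(age_in_days("2024-01-01T00:00:00Z", reference_time))
```

In practice, `now` would usually be `datetime.now(timezone.utc)`; it is passed in explicitly here so the calculation is reproducible.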