Pedestrian Environment Index (PEI) Documentation - Spring 2025

This project implements the Pedestrian Environment Index (PEI) methodology as developed at the University of Illinois Chicago (see the research paper: https://www.sciencedirect.com/science/article/pii/S0966692314001343). The PEI provides a composite measure of the walkability of an environment, incorporating the following subindices:

Population Density Index (PDI)
Commercial Density Index (CDI)
Intersection Density Index (IDI)
Land-use Diversity Index (LDI)

1. Motivation and Introduction

The Pedestrian Environment Index (PEI) is a composite measure of walkability that combines four key subindices to evaluate pedestrian-friendly environments. This implementation of the PEI is useful for researchers aiming to:

Assess the current walkability of neighborhoods or regions.
Compare walkability across different areas.
Identify areas with potential for improvement.

2. Getting Started

Prerequisites

Python 3.x:
Ensure Python is installed and available in your system path. Check using:
```
python --version
```
Required Libraries:
Install the following Python libraries:
osmnx
pandas
numpy
matplotlib
csv (part of the Python standard library; no installation needed)
census
Census API Key:
Obtain a Census API key from the Census API Key Signup page (https://api.census.gov/data/key_signup.html).
Save the key in a text file named census_api_key.txt in the same directory as PDI_Generator.ipynb.

Installation

Install the required libraries using pip:

pip install osmnx pandas numpy matplotlib census

3. Core Subindices

Population Density Index (PDI)

Definition: Measures residential population density within defined areas.
Data Source: Population and area data are downloaded from the Missouri Census Data Center.
Calculation:

$\text{Population Density} = \frac{\text{Total Population}}{\text{Total Area (Square Miles)}}$
- PDI: Percentile rank of Population Density across all years and cities.

Commercial Density Index (CDI)

Definition: Evaluates the density of commercial establishments per block group.
Data Source: Data is sourced using the Overpass API.
Method:
Tags used include shops, restaurants, cafes, banks, schools, cinemas, parks, sports centers, and stadiums.
Area is derived from census tracts in the US Census GeoJSON files.

$\text{Commercial Density} = \frac{\text{Count of Commercial POIs}}{\text{Total Land Area (Square Miles)}}$
- CDI: Percentile rank of Commercial Density across all years and cities.

Intersection Density Index (IDI)

Definition: Quantifies the density of intersections in a given area.
Data Source: Retrieved using the Overpass API.
Method: $\text{Intersection Density} = \frac{\text{Number of Nodes Part of More than One Way (Intersections)}}{\text{Area (Square Miles)}}$
IDI: Percentile rank of Intersection Densities across all years and cities.
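
As a minimal sketch of the intersection count described above, the snippet below treats each way as a list of node IDs and counts the nodes that appear in more than one way. The `ways` input format and the function names are illustrative assumptions, not the project's actual Overpass-parsing code.

```python
from collections import Counter


def count_intersections(ways):
    """Count nodes that belong to more than one way.

    `ways` is a hypothetical list of node-ID lists, one list per OSM way.
    """
    # set(way) ensures a node repeated inside a single way (e.g. a closed
    # loop) contributes only once for that way.
    node_way_counts = Counter(node for way in ways for node in set(way))
    return sum(1 for count in node_way_counts.values() if count > 1)


def intersection_density(ways, area_sq_miles):
    """Return intersections per square mile."""
    if area_sq_miles <= 0:
        raise ValueError("area_sq_miles must be positive")
    return count_intersections(ways) / area_sq_miles


# Nodes 1, 3 and 5 are each shared by two ways; the last way is isolated.
example_ways = [[1, 2, 3], [3, 4, 5], [5, 6, 1], [7, 8]]
example_density = intersection_density(example_ways, area_sq_miles=0.5)
```

Deduplicating nodes per way is a design choice of this sketch: it prevents a looped street from being counted as an intersection with itself.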

Land-use Diversity Index (LDI)

Definition: Analyzes the diversity of land-use types within an area.
Data Source: Land-use data is retrieved from OpenStreetMap using the Overpass API with the "landuse" tag.
Method: $\text{Entropy} = -\sum \left( \frac{\text{Area of Land Use Type}}{\text{Total Area}} \cdot \ln \left( \frac{\text{Area of Land Use Type}}{\text{Total Area}} \right) \right)$ (for all land-use types with non-zero area).
- Normalized by: $\frac{\text{Entropy}}{\ln(\text{Number of Land Use Types with Non-Zero Area})}$
- LDI: Percentile rank of Entropies across all years and cities.
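
The entropy calculation can be sketched as follows. This example uses the standard Shannon entropy (with a leading minus sign, so the result is non-negative) and divides by the log of the number of non-zero land-use types. The input dictionary and the choice to return 0 when fewer than two types are present are assumptions made for illustration.

```python
import math


def normalized_entropy(area_by_land_use):
    """Return land-use entropy scaled to the range [0, 1].

    `area_by_land_use` is a hypothetical mapping of land-use type to area.
    """
    areas = [area for area in area_by_land_use.values() if area > 0]
    # With zero or one land-use type, ln(count) is undefined or 0, so this
    # sketch treats the area as having no diversity.
    if len(areas) < 2:
        return 0.0
    total = sum(areas)
    entropy = -sum((a / total) * math.log(a / total) for a in areas)
    return entropy / math.log(len(areas))


even_mix = normalized_entropy({"residential": 50.0, "commercial": 50.0})
uneven_mix = normalized_entropy({"residential": 75.0, "retail": 25.0, "farmland": 0.0})
single_use = normalized_entropy({"residential": 10.0})
```

An even split between types gives the maximum value of 1, and the value falls as one type dominates.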

4. PEI Formula

The PEI is calculated using the following formula:

\[ PEI = \frac{{(1 + PDI) \cdot (1 + IDI) \cdot (1 + LDI) \cdot (1 + CDI)}}{16} \]
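
A short worked example of the formula, assuming each subindex is already normalized to the range 0 to 1. The function name and the range check are illustrative and not taken from PEI_Generator.ipynb.

```python
def pedestrian_environment_index(pdi, idi, ldi, cdi):
    """Combine four normalized subindices into a PEI value."""
    for name, value in (("PDI", pdi), ("IDI", idi), ("LDI", ldi), ("CDI", cdi)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return (1 + pdi) * (1 + idi) * (1 + ldi) * (1 + cdi) / 16


lowest = pedestrian_environment_index(0.0, 0.0, 0.0, 0.0)
middle = pedestrian_environment_index(0.5, 0.5, 0.5, 0.5)
highest = pedestrian_environment_index(1.0, 1.0, 1.0, 1.0)
```

Because each factor lies between 1 and 2, dividing by 16 bounds the PEI between 1/16 (all subindices 0) and 1 (all subindices 1).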

5. Implementation Workflow

Step 1: Files

Download population data files from the Missouri Census Data Center (MCDC) for each required year.
Download block group and census tract files from the US Census Bureau website.

Step 2: Subindex Calculation

Run individual generator scripts (e.g., <subindex>_<city>_<year>.ipynb) for each subindex.
Outputs include CSV and GeoJSON files with fields for block group, year, and the "raw subindex" values:
Population Density
Commercial Density
Intersection Density
Entropy
Append all results to master files (all_PDI, all_CDI, all_IDI, all_LDI) for comprehensive cross-year/city comparison.

Step 3: Normalization

Normalize the raw subindex data to values between 0 and 1.
Each block group/tract is assigned its percentile rank for each "raw subindex" relative to every block group/tract in every city and every year.
Because we normalize across all cities and years, subindex values can be compared directly between any two block groups regardless of time or location.
This is possible because of the four master files (all_PDI.csv, all_CDI.csv, all_IDI.csv, all_LDI.csv), which contain the raw data for all years and cities.
Once a raw subindex is normalized, it becomes an actual subindex: for example, normalized Commercial Density becomes CDI and normalized Entropy becomes LDI.
The normalization step also updates the CSV and GeoJSON files for each subindex and city with a new field: one of CDI, LDI, PDI, or IDI.
This yields the finalized CSV and GeoJSON files for the 4 subindexes:
- <subindex>_<city>_<year>.csv/geojson
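
The pooled percentile ranking can be sketched in plain Python as shown below. The exact percentile-rank definition used by the normalizer is not stated here, so this example assumes "fraction of pooled values less than or equal to this value". The row layout and field names are hypothetical.

```python
from bisect import bisect_right


def percentile_ranks(values):
    """Return, for each value, the fraction of all values that are <= it."""
    if not values:
        return []
    ordered = sorted(values)
    count = len(ordered)
    return [bisect_right(ordered, value) / count for value in values]


# Hypothetical rows pooled from a master file covering all cities and years.
rows = [
    {"geoid": "A1", "city": "atlanta", "year": 2013, "commercial_density": 10.0},
    {"geoid": "A1", "city": "atlanta", "year": 2022, "commercial_density": 30.0},
    {"geoid": "N1", "city": "new_york", "year": 2013, "commercial_density": 20.0},
    {"geoid": "N1", "city": "new_york", "year": 2022, "commercial_density": 40.0},
]

# Rank every row against the full pool, not just its own city or year.
ranks = percentile_ranks([row["commercial_density"] for row in rows])
for row, rank in zip(rows, ranks):
    row["CDI"] = rank
```

Under this definition tied values share a rank and the largest value always maps to 1. A different convention (for example, averaging tied ranks) would shift the values slightly.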

Step 4: PEI Calculation

Combine normalized subindices using the PEI formula for each block group/tract.
Generate CSV and GeoJSON files (PEI_<city>_<year>.csv/geojson).

Step 5: Web App Integration

Upload finalized GeoJSON files to AWS S3 buckets.
Implement and visualize data on the web app.

6. Usage

Inside the Fall24 folder you will find 3 folders:

Tract_Files - we mainly used this folder for testing:
This contains the relevant ipynb files for creating the subindexes.
It also contains tracts.geojson, which has the first 10 rows of tracts in the US (useful for testing).
To obtain the full tract files, download them from https://www.census.gov/cgi-bin/geo/shapefiles/index.php, or contact abeesen3@gatech.edu.
BlockGroup_Files:
This contains the relevant ipynb files for creating the subindexes. (In the PDI file, only run code blocks after the #NEW comment).
It also contains block_groups.geojson, which has the first 10 rows of block groups in Atlanta (useful for testing).
To obtain the full files, create them using the ./Spring24/Blockgroup GeoJSON Generator folder (rename its output to block_groups.geojson), download them from https://www.census.gov/cgi-bin/geo/shapefiles/index.php, or contact abeesen3@gatech.edu.

Step 1: Data Download

Download the required data files from the following sources:
- Population Data: Obtain CSV files from the Missouri Census Data Center (MCDC).
- Census Block Group/Tract GeoJSON: As described above, download the files from https://www.census.gov/cgi-bin/geo/shapefiles/index.php, or contact abeesen3@gatech.edu for the full files.

Step 2: Update Generator Scripts

Modify the generator scripts (PDI_Generator.ipynb, CDI_Generator.ipynb, LDI_Generator.ipynb, IDI_Generator.ipynb) to include your specific file paths and input parameters. For all the subindex files, they will have a portion like the code shown below. This is the only part you must update as required:

calculate_<subindex>(
    input_geojson="path_to_your_geojson_file.geojson",  # Replace with your census tract or blockgroup GeoJSON file
    output_prefix="<tracts> or <block_groups>", # tracts or bg based on if we are analyzing tracts or blockgroups
    year=2013,
    aggregate_file="<subindex>_<tract/bg>_all.csv"  # update the <subindex> and choose tracts or bg
)

calculate_<subindex>(
    input_geojson="path_to_your_geojson_file.geojson",  # Replace with your census tract or blockgroup GeoJSON file
    output_prefix="<tracts> or <block_groups>", # tracts or bg based on if we are analyzing tracts or blockgroups
    year=2017,
    aggregate_file="<subindex>_<tract/bg>_all.csv"  # update the <subindex> and choose tracts or bg
)

calculate_<subindex>(
    input_geojson="path_to_your_geojson_file.geojson",  # Replace with your census tract or blockgroup GeoJSON file
    output_prefix="<tracts> or <block_groups>", # tracts or bg based on if we are analyzing tracts or blockgroups
    year=2022,
    aggregate_file="<subindex>_<tract/bg>_all.csv"  # update the <subindex> and choose tracts or bg
)

Step 3: Run Scripts in the Following Order

Run the Subindex Generators:
Execute the following scripts to calculate raw subindices:
CDI_Generator.ipynb
LDI_Generator.ipynb
IDI_Generator.ipynb
These can be run in any order.
Run PDI:
For small input files (a small number of tracts or GeoJSON features), run the current PDI_Generator.ipynb.
For larger files, a custom approach using CSV files from the Missouri Census Data Center (MCDC) is required. For this, please contact cnguyen369@gatech.edu.
Normalize Subindices:
Run Normalizer.ipynb to normalize the raw subindices across all years and cities.
Generate PEI:
Finally, run PEI_Generator.ipynb to calculate the Pedestrian Environment Index.

Step 4: Output

This process will output normalized subindex files and the final PEI results as CSV and GeoJSON files.
The file format will be:
- <subindex>_<city>_<year>.csv/geojson
Use Subindex_Visualizer.ipynb to visualize your GeoJSON output.

7. Challenges

The biggest challenge in our statistic generators was developing the PDI Generator.

While most of our subindexes (CDI, LDI, IDI) use the Overpass API (OSM data) to gather data, this is not possible for the PDI because OSM does not provide population data.
Because of this, we initially relied on the Census API, which had 2 main issues:
- It often returned only the latest data (i.e., 2024 data), even when we requested historical population data.
- On large GeoJSON files, where we had to make hundreds or thousands of API calls, the Census API frequently errored due to API call limits.
  - This became a considerable problem when running our files on PACE to generate census tract data for Dr. Ku. Our code would run for 10 or so hours and then fail once we ran out of API tokens.
We overcame this challenge by downloading population data by tract/block group directly from the Missouri Census Data Center (MCDC).
We could then easily calculate Population Density, and hence PDI, by block group/tract by merging this data with our block group/tract GeoJSON files, which contain an area column.
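
The merge described above can be sketched as a join on GEOID. The dictionaries below stand in for the downloaded population CSV and the area column of the GeoJSON file. Their names, the sample GEOIDs, and the numbers are made up for illustration, and the real notebooks may do this differently.

```python
def population_density(population_by_geoid, area_sq_miles_by_geoid):
    """Join population and area on GEOID and return people per square mile."""
    densities = {}
    for geoid, area in area_sq_miles_by_geoid.items():
        population = population_by_geoid.get(geoid)
        # Skip units with no matching population row or an unusable area.
        if population is None or area <= 0:
            continue
        densities[geoid] = population / area
    return densities


# GEOIDs are kept as strings so that leading zeros are preserved.
population = {"010010201001": 1200, "130890201002": 800}
area = {"010010201001": 0.5, "130890201002": 0.25, "130890201003": 1.0}
density = population_density(population, area)
```

Only units present in both inputs get a density, which makes missing population rows easy to spot by comparing the output keys with the GeoJSON GEOIDs.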

8. Spring 2025 Additions - PEI Dynamic Adjustment Documentation

The purpose of this tool is to load block groups sequentially with their individual and combined subindex scores, then change a subindex value at a specified block group to observe trends and distinguishing factors relative to adjacent block groups. This offers insight into how adjacent block groups can become more cohesive and bolster existing infrastructure to create a more pedestrian-friendly environment.

Input:
The tool takes in a GeoDataFrame of block groups identified by GEOID, along with the four subindices and their composite PEI, plus user-provided selections for block group, subindex, and new value.
Output:
The output is a live-updated GeoDataFrame where the specified subindex value is modified for the chosen block group, enabling visualization of the resulting changes to the map and overall infrastructure.
Challenges:
Key challenges included maintaining data integrity after edits, and beginning the design of the function for seamless integration with interactive visualization tools on the web app.
Future Work:
Future improvements include adding functionality to adjust multiple subindices at once, building a user-friendly, non-terminal interface in the web app to visualize theoretical metropolitan PEI changes, and integrating model-based recalculations of PEI rather than direct manual edits.
Conclusion:
The PEI slider tool provides an intuitive way to experiment with subindex values and better understand the sensitivity of walkability metrics at a granular level, paving the way for more informed urban planning discussions.
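
A minimal sketch of the adjustment step, using a plain dictionary keyed by GEOID in place of the GeoDataFrame. It assumes the PEI is recomputed from the formula in section 4 after the edit, which the actual tool may or may not do. All names here are hypothetical.

```python
import copy

SUBINDICES = ("PDI", "IDI", "LDI", "CDI")


def compute_pei(row):
    """Apply the PEI formula to one block group's subindices."""
    product = 1.0
    for name in SUBINDICES:
        product *= 1.0 + row[name]
    return product / 16.0


def adjust_subindex(block_groups, geoid, subindex, new_value):
    """Return a copy with one subindex changed and that PEI recomputed."""
    if subindex not in SUBINDICES:
        raise ValueError(f"unknown subindex: {subindex}")
    if not 0.0 <= new_value <= 1.0:
        raise ValueError("new_value must be between 0 and 1")
    if geoid not in block_groups:
        raise KeyError(geoid)
    # Deep copy so the original data stays intact for comparison.
    updated = copy.deepcopy(block_groups)
    updated[geoid][subindex] = new_value
    updated[geoid]["PEI"] = compute_pei(updated[geoid])
    return updated


original = {
    "131210001001": {"PDI": 0.5, "IDI": 0.5, "LDI": 0.5, "CDI": 0.5, "PEI": 0.31640625},
}
adjusted = adjust_subindex(original, "131210001001", "CDI", 1.0)
```

Returning a modified copy rather than editing in place is one way to address the data-integrity concern noted above, since the unedited values remain available for side-by-side comparison.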

9. Spring 2025 Additions - Public Transport Accessibility Level (PTAL)

This project also implements the Public Transport Accessibility Level (PTAL) methodology, adapted from Transport for London’s (TfL) standard practices (Reference: PTAL Methodology, Transport for London, April 2010). PTAL measures the accessibility of locations to public transit services based on walking distance and service frequency.

1. Motivation and Introduction

The PTAL score quantifies how easily a location is served by public transit. This project adapts the PTAL method for U.S. cities like Atlanta, enabling:

Quantitative assessments of transit accessibility.
Cross-region comparison of transit service quality.
Identification of underserved and well-served areas for urban planning.

2. Getting Started

Prerequisites:
Python 3.x
Libraries: geopandas, pandas, numpy, shapely, math
Public Transit Stop Data (GeoJSON) with service frequency information (data pulled from UrbanFootprint and Mobility Database).

3. Core Components

Walk Access Time:
Time taken to walk from a Point-of-Interest (POI) to a Service Access Point (SAP) like a bus stop or rail station.
Walk speed: 80 meters/minute (4.8 km/hr).
Max distances: 640m for buses, 960m for rail.
Service Frequency and Reliability:
Number of transit services per hour.
Add 2 minutes reliability penalty for buses.
Add 0.75 minutes reliability penalty for rail.
Equivalent Doorstep Frequency (EDF):
Formula:
$$ EDF = \frac{30}{\text{Walk Time} + \text{Average Waiting Time}} $$
Accessibility Index (AI):
Calculation: Sum of EDFs, with secondary routes discounted by 50% to avoid overrepresentation.
PTAL Level Assignment:
Accessibility Index values are normalized in the same way as the other subindices.
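
The EDF and Accessibility Index arithmetic can be illustrated with the sketch below. It assumes the average waiting time is half the headway (60 divided by services per hour, then halved) plus the mode's reliability penalty, following the TfL convention. Whether this project computes it identically is an assumption, and the function names are illustrative.

```python
WALK_SPEED_M_PER_MIN = 80.0
RELIABILITY_PENALTY_MIN = {"bus": 2.0, "rail": 0.75}


def equivalent_doorstep_frequency(distance_m, services_per_hour, mode):
    """Return the EDF for one route at one stop."""
    walk_time = distance_m / WALK_SPEED_M_PER_MIN
    # Half the headway plus the reliability penalty for the mode.
    average_wait = 0.5 * (60.0 / services_per_hour) + RELIABILITY_PENALTY_MIN[mode]
    return 30.0 / (walk_time + average_wait)


def accessibility_index(edfs_by_mode):
    """Sum EDFs per mode, weighting all but the largest by 0.5."""
    total = 0.0
    for edfs in edfs_by_mode.values():
        if not edfs:
            continue
        best = max(edfs)
        total += best + 0.5 * (sum(edfs) - best)
    return total


bus_edf = equivalent_doorstep_frequency(400.0, 6.0, "bus")     # 5 min walk + 7 min wait
rail_edf = equivalent_doorstep_frequency(340.0, 12.0, "rail")  # 4.25 min walk + 3.25 min wait
example_ai = accessibility_index({"bus": [bus_edf, 1.0], "rail": [rail_edf]})
```

In this example the second bus route (EDF 1.0) contributes only 0.5 to the total, while the best bus route and the only rail route count in full.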

4. PTAL Calculation Workflow

Data Preparation:
Load POIs (e.g., block groups).
Load Transit Stops with service frequency attributes.
Walk Access Calculation:
Filter stops within the walking thresholds.
Access Time and EDF Calculation:
Compute walk time and average waiting time.
Calculate EDF for each nearby route.
Accessibility Index Computation:
Sum EDFs, applying 50% discount to non-dominant routes per transport mode.
PTAL Assignment:
Assign final PTAL levels based on AI bands.
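
The walk-access filtering step might look like the following sketch. It uses straight-line (haversine) distance as a stand-in for walking distance, which is a simplifying assumption. The stop records, field names, and coordinates are hypothetical.

```python
import math

MAX_WALK_M = {"bus": 640.0, "rail": 960.0}
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Return the great-circle distance in meters between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stops_within_reach(poi, stops):
    """Keep stops whose distance is within the threshold for their mode."""
    reachable = []
    for stop in stops:
        distance = haversine_m(poi["lat"], poi["lon"], stop["lat"], stop["lon"])
        if distance <= MAX_WALK_M[stop["mode"]]:
            reachable.append({**stop, "distance_m": distance})
    return reachable


poi = {"lat": 33.749, "lon": -84.388}
stops = [
    {"id": "bus_near", "mode": "bus", "lat": 33.754, "lon": -84.388},    # about 556 m
    {"id": "bus_far", "mode": "bus", "lat": 33.756, "lon": -84.388},     # about 778 m
    {"id": "rail_mid", "mode": "rail", "lat": 33.756, "lon": -84.388},   # about 778 m
    {"id": "rail_far", "mode": "rail", "lat": 33.759, "lon": -84.388},   # about 1112 m
]
reachable_ids = [stop["id"] for stop in stops_within_reach(poi, stops)]
```

The same 778 m distance is accepted for rail but rejected for bus, which shows how the mode-specific thresholds change the set of candidate stops.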

5. Outputs

Final outputs are available as CSV and GeoJSON files with PTAL scores for each geographic unit.

6. Challenges

Limited transit stop data coverage in non-dense areas.
Incomplete GTFS feeds for certain systems.
Simplified pedestrian network modeling.

7. Limitations

Data Quality and Completeness:
PEI heavily relies on OpenStreetMap (OSM) data and census datasets, which may have missing or outdated entries, especially outside major urban centers.
Simplified Walkability:
Current PEI calculations assume basic walk access without modeling detailed pedestrian barriers (e.g., highways, rivers, unsafe crossings).
Transit Data Challenges:
PTAL relies on accurate service frequency data, which is often incomplete, inconsistent, or missing for certain cities or transit providers.

8. Future Work

Integrate complete GTFS schedule data for previous years.
Merge with walkability indices like PEI for a full mobility landscape.

10. Spring 2025 Additions - Nationwide Data Generation for Dr. Ku

We also created a nationwide dataset using the PEI methodology described above. We simply ran the relevant PEI files as described in section 6 (Usage), but with a national_tracts.geojson file.
- This dataset contains CDI, LDI, IDI, PDI, and PEI files as CSVs.
- We created the files at both the tract and county level (county-level data is new in Spring 2025).

In order to access raw files, please contact: abeesen3@gatech.edu

Nationwide dataset examples:

Boston	Chicago
Miami	New York
San Francisco	Washington, DC

11. Spring 2025 Additions - PEI Documentation

We also created an easy-to-share PDF file that summarizes our PEI methodology, to be used for exposure, funding, etc. It can be found in this repository as VIP_PEI_Documentation.pdf.

12. Overall Future Work

To improve the comprehensiveness of the Public Infrastructure Environment (PIE) framework, a major future goal is to expand the number of subindices beyond the current set.
If we do this, we also want to ensure the new subindices are properly integrated into the PEI.

Planned improvements include:

Support for New Subindices:
Add visualization options for future subindices.
Expansion to More Cities:
Make additional cities available for viewing.
Official Publication:
Publish our research officially and publicly to help advance urban sustainability.
Verification via Overpass API:
Fully integrate data checking against Overpass API to verify the accuracy of inputted data.
Ground-Truthing:
Conduct surveys and other research to validate PEI accuracy and compare with other walkability models.

13. Contributing

We welcome contributions to this project.

Steps to Contribute:

Fork the repository.
Create a feature branch:
```
git checkout -b feature/new-feature
```
Push your changes and submit a pull request.

14. License

This project is shared for research and educational purposes. Please do not redistribute for commercial use.

Web App Documentation

The web app is currently deployed at this link: https://vip-pei-app-2.onrender.com/ 😊

This deployment updates automatically, so any changes to our codebase (https://github.com/AtharvaBeesen/vip-pei-app-2) are reflected on the live site.

1. Introduction

App Name: VIP SMUR PEI Proof of Concept
Purpose: Visualize the work we have done in creating the aforementioned subindexes. We also wanted to make the data we generate available online in a visually appealing, palatable, and easy-to-download way.

Key Features:

Interactive map with GeoJSON visualization for the subindexes - PDI, IDI, CDI, LDI, PEI - across different cities and years.
Dynamic city, statistic, and year selection.
Ability to download CSV and GeoJSON files for selected data.

Technology Stack:

Frontend: React, JavaScript, HTML, CSS
Backend: All functionality contained within JavaScript
Mapping Library: Leaflet
Data Source: Amazon S3 Buckets

2. Getting Started

Prerequisites

Node.js and npm/yarn:
Ensure Node.js and npm (or yarn) are installed on your system. You can check this by running:
```
node -v
npm -v
```
If they are not installed, download them from the official Node.js site (https://nodejs.org).
Code Editor (Optional):
Install a code editor like Visual Studio Code.
Browser:
A modern browser like Chrome, Firefox, or Edge to test your app.
Git:
Install Git for cloning the repository. Check installation by running:
```
git --version
```
Leaflet Library Dependencies:
The app uses Leaflet for maps, which requires:
A working internet connection to download Leaflet assets via npm or yarn.
A browser that supports Leaflet.

Installation

Clone the repository:

git clone https://github.com/AtharvaBeesen/vip-pei-app-2.git

Install dependencies:
```
npm install
```
Ensure Leaflet is installed:
```
npm install leaflet
```
Run the application:
```
npm start
```
Access the app at http://localhost:3000.

Deployment

Already deployed! We deployed using Render:
https://vip-pei-app-2.onrender.com/

3. Features

3.1 Interactive Map

Displays GeoJSON data visualized on a Leaflet map.
Map dynamically updates based on city, statistic, and year selections.

3.2 City, Statistic, and Year Selection

Dropdown menus for users to select:
Cities: Atlanta, New York, Los Angeles.
Statistics: IDI, PDI, CDI, LDI, PEI.
Years: 2022, 2013.

3.3 File Downloads

CSV and GeoJSON files for the selected data can be downloaded with a single click.

4. Components

4.1 App.js

The main entry point for the application.
Manages state for selected city, statistic, and year.
Renders CitySelector, DownloadButton, and MapComponent.

4.2 CitySelector.js

Dropdown menus for selecting city, statistic, and year.
Capitalizes city names and statistics for user-friendly display.

4.3 MapComponent.js

Displays the Leaflet map and GeoJSON data.
Dynamically fetches data from S3 Buckets based on user selections (more on this below).
Highlights GeoJSON features with a color-coded scheme based on statistic values.

4.4 DownloadButton.js

Provides buttons to download CSV and GeoJSON files from S3.
Dynamically constructs download URLs based on user selections.

5. API Integration

The app dynamically fetches GeoJSON data from an Amazon S3 bucket.
URL format:
https://vip-censusdata.s3.us-east-2.amazonaws.com/{city}_blockgroup_{statistic}_{year}.geojson

Example:

For Atlanta, IDI, and 2022:
https://vip-censusdata.s3.us-east-2.amazonaws.com/atlanta_blockgroup_IDI_2022.geojson
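
The URL pattern can be expressed as a small helper, shown here in Python for illustration (the app itself builds these URLs in JavaScript). The lowercase city and uppercase statistic are inferred from the single Atlanta example above. How multi-word city names are written in the bucket is not shown here, so this sketch does not attempt to handle them.

```python
BASE_URL = "https://vip-censusdata.s3.us-east-2.amazonaws.com"


def geojson_url(city, statistic, year):
    """Build the S3 URL for one city, statistic, and year."""
    # Casing is an assumption based on the documented Atlanta example.
    return f"{BASE_URL}/{city.lower()}_blockgroup_{statistic.upper()}_{year}.geojson"


atlanta_idi_2022 = geojson_url("Atlanta", "idi", 2022)
```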

5. City Comparison Tool (Spring 2025)

1. Overview

We created a City Comparison Tool MVP to allow users to dynamically compare changes in walkability-related subindices (IDI, PDI, CDI, LDI, PEI) between two different years for a selected city.

This tool calculates and visualizes the percentage change in the selected statistic for each census block group, helping to identify areas that have improved or declined over time.

2. Key Features

City Selection:
Compare changes for Atlanta, New York, or Los Angeles.
Statistic Selection:
Choose one of the following subindices to analyze:
Intersection Density Index (IDI)
Population Density Index (PDI)
Commercial Density Index (CDI)
Land-use Diversity Index (LDI)
Pedestrian Environment Index (PEI)
Year Selection:
Select two different years to calculate the percent change.
Dynamic GeoJSON Visualization:
The map displays block groups color-coded by the percentage change in the selected statistic.
Interactive Tooltips:
Hovering over a block group shows its GEOID and computed percent change.
Custom Color Scale:
The visualization uses a diverging color scheme to easily distinguish between positive and negative changes.

3. Technical Workflow

Data Fetching:
The tool fetches the respective city's GeoJSON files for both selected years from the Amazon S3 bucket.
Difference Calculation:
For most subindices (IDI, PDI, CDI, PEI), block groups are matched by GEOID.
For LDI (which currently lacks GEOIDs), features are temporarily matched by their array index.
Percent Change Computation:
The percent change for each block group is calculated as:

\[ \text{Percent Change} = \frac{(\text{After Value} - \text{Before Value})}{\text{Before Value}} \times 100 \]

Visualization:
Block groups are shaded according to the magnitude of their percent change.
Tooltips display the block group ID and the exact percent change.
Robust Edge Case Handling:
If both before and after values are zero, the percent change is set to 0.
If a before value is zero and after is nonzero, the block group is highlighted accordingly without division errors.
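
The percent-change logic and its edge cases can be sketched as follows, in Python for illustration. Returning `None` when the before value is zero and the after value is not is an assumption of this sketch, since the text only says such block groups are highlighted without a division error. The function names are hypothetical.

```python
def percent_change(before, after):
    """Return the percent change, or None when it is undefined."""
    if before == 0:
        # Both zero: no change. Zero to non-zero: undefined, so the caller
        # decides how to highlight it.
        return 0.0 if after == 0 else None
    return (after - before) / before * 100.0


def changes_by_geoid(before_values, after_values):
    """Compute percent change for GEOIDs present in both years."""
    shared = before_values.keys() & after_values.keys()
    return {
        geoid: percent_change(before_values[geoid], after_values[geoid])
        for geoid in sorted(shared)
    }


before = {"A": 0.5, "B": 0.0, "C": 0.0, "D": 0.2}
after = {"A": 0.25, "B": 0.0, "C": 0.3, "E": 0.9}
changes = changes_by_geoid(before, after)
```

Block groups that appear in only one of the two years ("D" and "E" here) are dropped, which mirrors matching by GEOID rather than by position.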

4. Known Limitations

LDI currently matches features by array index due to missing GEOIDs (planned to be fixed in future data versions).
Only a few cities and years are currently available.
No smoothing or statistical aggregation (e.g., at the neighborhood level) is applied yet; results are purely at the block group level.

5. Future Work

Add more cities and historical years to expand the tool’s coverage.
Improve LDI data quality by assigning proper GEOIDs.
Integrate this comparison tool directly into the main PEI web app navigation.
Add multi-year trend graphs and regional aggregation for deeper urban insights.
Allow users to select multiple subindices at once for richer comparisons.

6. Future Work

There are 3 key goals we hope to achieve:

Increase the number of cities and years supported:
This would simply require us to continue running our subindex generators over the course of the next semester(s) in order to continue to grow the size of our database.
Seek Collaboration/Monetization Opportunities:
As we grow the site and our footprint in the space, we could seek to replicate WalkScore's monetization and collaboration business model:
Collaborate with products/sites:
Work with products or sites that require walkability statistics (e.g., Zillow and City Planning Companies) to create customized statistics at a cost.
Direct collaboration with local government:
Assist local governments in achieving their goals of improving walkability in urban areas.
Developing a for-cost API:
Create an API that allows third-party researchers to use our data.
- For Example: Collaborating with researchers like Dr. Ku, who uses our data to enhance his psychology research.
Improve the UI:
This goal is less essential and is more about improving user-experience in the case that we decide to push towards becoming a standalone software for third-parties to gather walkability data on different urban areas.

7. Contributing

Please contact abeesen3@gatech.edu before contributing.
Example of how to contribute:
Fork the repository (https://github.com/AtharvaBeesen/vip-pei-app-2)
Create a feature branch:
```
git checkout -b feature/new-feature
```
Push your changes and submit a pull request.

8. License

This project is shared for research and educational purposes. Please do not redistribute for commercial use.

Presentation

Team

| Name | Seniority | Major | Department | GitHub Handle |
| --- | --- | --- | --- | --- |
| Yao Xiao | Sophomore | Computer Science | COC | Xyrro |
| Mason DeWitt | Freshman | Electrical Engineering | ECE | Masonrd |
| Joshua Cohen | Senior | Civil Engineering | CEE | paradoxwalk |
| Nicholas Stone | Sophomore | Computer Science | COC | nstone213 |
| Atharva Beesen | Junior | Computer Science | COC | AtharvaBeesen |