Pedestrian Environment Index (PEI) Documentation
This project implements the Pedestrian Environment Index (PEI) methodology as developed at the University of Illinois Chicago (see the research paper: https://www.sciencedirect.com/science/article/pii/S0966692314001343). The PEI provides a composite measure of the walkability of an environment, incorporating the following subindices:
- Population Density Index (PDI)
- Commercial Density Index (CDI)
- Intersection Density Index (IDI)
- Land-use Diversity Index (LDI)
1. Motivation and Introduction
The Pedestrian Environment Index (PEI) is a composite measure of walkability that combines four key subindices to evaluate pedestrian-friendly environments. This implementation of the PEI is useful for researchers aiming to:
- Assess the current walkability of neighborhoods or regions.
- Compare walkability across different areas.
- Identify areas with potential for improvement.
2. Getting Started
Prerequisites
- Python 3.x: Ensure Python is installed and available in your system path. Check using: python --version
- Required Libraries: Install the following Python libraries:
  - osmnx
  - pandas
  - numpy
  - matplotlib (imported as matplotlib.pyplot)
  - census
  - csv is also used, but it ships with the Python standard library and needs no installation.
- Census API Key: Obtain a Census API key from Census API Key Signup. Save the key in a text file named census_api_key.txt in the same directory as PDI_generator.ipynb.
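As an illustration of the key file convention above, the following is a minimal sketch of how a notebook might read the key. The function name load_census_api_key is hypothetical and is not taken from the project's notebooks.

```python
from pathlib import Path


def load_census_api_key(path="census_api_key.txt"):
    """Return the Census API key stored in a plain-text file.

    Hypothetical helper: the project's notebooks may read the key differently.
    """
    key = Path(path).read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"No API key found in {path}")
    return key


# Typical use with the census package (not executed here):
#   from census import Census
#   client = Census(load_census_api_key())
```

Keeping the key in a separate file means it never has to be pasted into a notebook cell that might be committed to the repository.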
Installation
Install the required libraries using pip: pip install osmnx pandas numpy matplotlib census
3. Core Subindices
Population Density Index (PDI)
- Definition: Measures residential population density within defined areas.
- Data Source: Population and area data are downloaded from the Missouri Census Data Center.
- Calculation: \(\text{Population Density} = \frac{\text{Total Population}}{\text{Total Area (Square Miles)}}\)
- PDI: Percentile rank of Population Density across all years and cities.
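The density calculation above can be sketched with pandas as a merge of a population table with an area table. The column names GEOID, population, and area_sq_mi are assumptions made for this example; the actual MCDC and GeoJSON field names may differ.

```python
import pandas as pd


def population_density(population, areas, key="GEOID"):
    """Merge population and area tables and compute people per square mile.

    Column names ("population", "area_sq_mi") are illustrative assumptions.
    """
    merged = areas.merge(population, on=key, how="inner")
    # Drop zero-area rows to avoid division by zero.
    merged = merged[merged["area_sq_mi"] > 0].copy()
    merged["Population Density"] = merged["population"] / merged["area_sq_mi"]
    return merged


population = pd.DataFrame({"GEOID": ["A", "B"], "population": [1200, 300]})
areas = pd.DataFrame({"GEOID": ["A", "B"], "area_sq_mi": [0.5, 1.5]})
result = population_density(population, areas)
print(result[["GEOID", "Population Density"]])
```

An inner merge keeps only areas that have a matching population record, so unmatched block groups or tracts are dropped rather than given a density of zero.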
Commercial Density Index (CDI)
- Definition: Evaluates the density of commercial establishments per block group.
- Data Source: Data is sourced using the Overpass API.
- Method:
  - Tags used include shops, restaurants, cafes, banks, schools, cinemas, parks, sports centers, and stadiums.
  - Area is derived from census tracts in the US Census GeoJSON files.
  - \(\text{Commercial Density} = \frac{\text{Count of Commercial POIs}}{\text{Total Land Area (Square Miles)}}\)
- CDI: Percentile rank of Commercial Density across all years and cities.
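For illustration, a query for the POI categories listed above might be built as follows. This is a sketch, not the project's query: the OSM keys chosen here (shop, amenity, leisure), the bounding-box filter, and both function names are assumptions.

```python
def build_poi_query(south, west, north, east, timeout=60):
    """Build an Overpass QL query for commercial POIs inside a bounding box.

    The tag selection below is an assumed mapping of the categories in the
    text onto common OSM keys; the generator notebooks may use other tags.
    """
    bbox = f"({south},{west},{north},{east})"  # Overpass order: S, W, N, E
    selectors = [
        'nwr["shop"]',
        'nwr["amenity"~"^(restaurant|cafe|bank|school|cinema)$"]',
        'nwr["leisure"~"^(park|sports_centre|stadium)$"]',
    ]
    union = "".join(f"{selector}{bbox};" for selector in selectors)
    return f"[out:json][timeout:{timeout}];({union});out center;"


def commercial_density(poi_count, land_area_sq_mi):
    """Commercial POIs per square mile of land area."""
    if land_area_sq_mi <= 0:
        raise ValueError("land area must be positive")
    return poi_count / land_area_sq_mi


query = build_poi_query(33.74, -84.40, 33.78, -84.36)
print(query)
print(commercial_density(poi_count=45, land_area_sq_mi=0.5))
```

The query string would be sent to an Overpass endpoint, and the number of returned elements supplies poi_count.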
Intersection Density Index (IDI)
- Definition: Quantifies the density of intersections in a given area.
- Data Source: Retrieved using the Overpass API.
- Method: \(\text{Intersection Density} = \frac{\text{Number of Nodes Part of More than One Way (Intersections)}}{\text{Area (Square Miles)}}\)
- IDI: Percentile rank of Intersection Densities across all years and cities.
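The "node that is part of more than one way" rule can be sketched without any network access. The input format here (each way as a list of node IDs) is an assumption about how the Overpass response has been parsed, and the function names are hypothetical.

```python
from collections import Counter


def count_intersections(ways):
    """Count nodes that belong to more than one way.

    `ways` is an iterable of node-ID lists, one list per OSM way.
    """
    usage = Counter()
    for node_ids in ways:
        # Use a set so a node repeated within one way (e.g. a closed loop)
        # is counted once for that way.
        usage.update(set(node_ids))
    return sum(1 for n_ways in usage.values() if n_ways > 1)


def intersection_density(ways, area_sq_mi):
    """Intersections per square mile."""
    if area_sq_mi <= 0:
        raise ValueError("area must be positive")
    return count_intersections(ways) / area_sq_mi


# Nodes 1, 3 and 5 are each shared by two ways.
ways = [[1, 2, 3], [3, 4, 5], [5, 6, 1], [7, 8]]
print(count_intersections(ways))
print(intersection_density(ways, area_sq_mi=0.5))
```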
Land-use Diversity Index (LDI)
- Definition: Analyzes the diversity of land-use types within an area.
- Data Source: Land-use data is retrieved from OpenStreetMap using the Overpass API with the "landuse" tag.
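- Method: \(\text{Entropy} = -\sum \left( \frac{\text{Area of Land Use Type}}{\text{Total Area}} \cdot \ln \left( \frac{\text{Area of Land Use Type}}{\text{Total Area}} \right) \right)\), summed over all land-use types with non-zero area. The leading minus sign keeps the entropy non-negative, since the logarithm of a proportion is never positive.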
- Normalized by: \(\frac{\text{Entropy}}{\ln(\text{Number of Land Use Types with Non-Zero Area})}\)
- LDI: Percentile rank of Entropies across all years and cities.
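A minimal sketch of the entropy calculation, written in the standard Shannon form with a leading minus sign so that the result is non-negative. It assumes "Total Area" is the sum of the land-use areas, which may differ from the project's definition; the function names are hypothetical.

```python
import math


def land_use_entropy(areas_by_type):
    """Shannon entropy of land-use shares, ignoring types with zero area."""
    areas = [a for a in areas_by_type.values() if a > 0]
    total = sum(areas)  # assumption: total area = sum of land-use areas
    if total == 0:
        return 0.0
    return -sum((a / total) * math.log(a / total) for a in areas)


def normalized_entropy(areas_by_type):
    """Entropy divided by ln(number of land-use types with non-zero area)."""
    n_types = sum(1 for a in areas_by_type.values() if a > 0)
    if n_types < 2:
        return 0.0  # a single land use has no diversity; avoids ln(1) = 0
    return land_use_entropy(areas_by_type) / math.log(n_types)


even_mix = {"residential": 2.0, "commercial": 2.0, "industrial": 0.0}
single_use = {"residential": 5.0}
print(normalized_entropy(even_mix))
print(normalized_entropy(single_use))
```

An even split between two land uses gives the maximum normalized value, while a single land use gives zero.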
4. PEI Formula
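The PEI combines the four normalized subindices (PDI, CDI, IDI, LDI) into a single value for each block group/tract, using the formula defined in the research paper linked above.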
5. Implementation Workflow
Step 1: Files
- Download population data files from the Missouri Census Data Center (MCDC) for each required year.
- Download block group and census tract files from the US Census Bureau website.
Step 2: Subindex Calculation
- Run individual generator scripts (e.g., <subindex>_<city>_<year>.ipynb) for each subindex.
- Outputs include CSV and GeoJSON files with fields for block group, year, and the "raw subindex" values:
  - Population Density
  - Commercial Density
  - Intersection Density
  - Entropy
- Append all results to master files (all_PDI, all_CDI, all_IDI, all_LDI) for comprehensive cross-year/city comparison.
Step 3: Normalization
- Normalize the raw subindex data to values between 0 and 1.
- This normalization takes each block group/tract's percentile rank for each "raw subindex" against every other block group/tract across all cities and years.
- Because we normalize across all cities and years, our subindex values can be compared directly against any other block group, regardless of time or location.
- We are able to normalize across all cities and years thanks to the four master files (all_PDI.csv, all_CDI.csv, all_IDI.csv, all_LDI.csv), which contain raw data for all years and cities.
- Once a raw subindex is normalized, we call it an actual subindex: e.g. the normalized Commercial Density becomes CDI, the normalized Entropy becomes LDI, etc.
- The normalization step also updates the CSV and GeoJSON files for each subindex and city with a new field: one of CDI, LDI, PDI, IDI.
- This yields the finalized CSV and GeoJSON files for the 4 subindexes: <subindex>_<city>_<year>.csv/geojson
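The percentile-rank normalization can be sketched with pandas. This is an illustration under assumed column names, not the contents of Normalizer.ipynb, and the function name is hypothetical.

```python
import pandas as pd


def add_percentile_rank(master, raw_column, index_column):
    """Add a percentile-rank column computed over every row of a master table.

    Ranking over the whole table (all cities and years together) is what
    makes the resulting values comparable across cities and years.
    """
    out = master.copy()
    out[index_column] = out[raw_column].rank(pct=True)
    return out


# Toy stand-in for a master file such as all_CDI.csv (values are made up).
master = pd.DataFrame({
    "city": ["atlanta", "atlanta", "new_york", "new_york"],
    "year": [2013, 2022, 2013, 2022],
    "Commercial Density": [10.0, 40.0, 20.0, 30.0],
})
ranked = add_percentile_rank(master, "Commercial Density", "CDI")
print(ranked)
```

With pct=True the smallest value receives 1/n rather than 0, so the ranks fall in the interval (0, 1]. Tied values share their average rank.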
Step 4: PEI Calculation
- Combine normalized subindices using the PEI formula for each block group/tract.
- Generate CSV and GeoJSON files (
PEI_<city>_<year>.csv/geojson
).
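The formula itself is not reproduced in this document. The sketch below assumes the multiplicative form PEI = (1 + PDI)(1 + CDI)(1 + IDI)(1 + LDI) / 16, which is our reading of the research paper linked at the top. Treat it as an assumption and check it against the paper and PEI_Generator.ipynb before relying on it.

```python
def pedestrian_environment_index(pdi, cdi, idi, ldi):
    """Combine four normalized subindices into one score.

    ASSUMPTION: uses (1 + PDI)(1 + CDI)(1 + IDI)(1 + LDI) / 16. The formula
    is not stated in this document; verify it against the source paper.
    """
    values = {"PDI": pdi, "CDI": cdi, "IDI": idi, "LDI": ldi}
    for name, value in values.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return (1 + pdi) * (1 + cdi) * (1 + idi) * (1 + ldi) / 16


print(pedestrian_environment_index(1.0, 1.0, 1.0, 1.0))
print(pedestrian_environment_index(0.0, 0.0, 0.0, 0.0))
```

Under this assumed form, the score ranges from 1/16 (all subindices at 0) to 1 (all subindices at 1).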
Step 5: Web App Integration
- Upload finalized GeoJSON files to AWS S3 buckets.
- Implement and visualize data on the web app.
6. Usage
Inside the Fall24 folder you will find 3 folders:
- Tract_Files (we mainly used this folder for testing):
  - This contains the relevant ipynb files for creating the subindexes.
  - It also contains tracts.geojson, which has the first 10 rows of tracts in the US and is useful for testing.
  - Please download full tract files from https://www.census.gov/cgi-bin/geo/shapefiles/index.php, or contact abeesen3@gatech.edu for full files.
- BlockGroup_Files:
  - This contains the relevant ipynb files for creating the subindexes. (In the PDI file, only run code blocks after the #NEW comment.)
  - It also contains block_groups.geojson, which has the first 10 rows of block groups in Atlanta and is useful for testing.
  - Please create the full files using the ./Spring24/Blockgroup GeoJSON Generator folder (you need to rename its output to block_groups.geojson), download the full files from https://www.census.gov/cgi-bin/geo/shapefiles/index.php, or contact abeesen3@gatech.edu for full files.
Step 1: Data Download
Download the required data files from the following sources:
- Population Data: Obtain CSV files from Missouri Census Data Center (MCDC).
- Census Block Groups/Tract GeoJSON: Retrieve the required GeoJSON files from the US Census Bureau or relevant sources. As described above, download files from https://www.census.gov/cgi-bin/geo/shapefiles/index.php, or contact abeesen3@gatech.edu for full files.
Step 2: Update Generator Scripts
Modify the generator scripts (PDI_Generator.ipynb, CDI_Generator.ipynb, LDI_Generator.ipynb, IDI_Generator.ipynb) to include your specific file paths and input parameters. Each subindex file contains a portion like the code shown below. This is the only part you must update:
calculate_<subindex>(
input_geojson="path_to_your_geojson_file.geojson", # Replace with your census tract or blockgroup GeoJSON file
output_prefix="<tracts> or <block_groups>", # tracts or bg based on if we are analyzing tracts or blockgroups
year=2013,
aggregate_file="<subindex>_<tract/bg>_all.csv" # update the <subindex> and choose tracts or bg
)
calculate_<subindex>(
input_geojson="path_to_your_geojson_file.geojson", # Replace with your census tract or blockgroup GeoJSON file
output_prefix="<tracts> or <block_groups>", # tracts or bg based on if we are analyzing tracts or blockgroups
year=2017,
aggregate_file="<subindex>_<tract/bg>_all.csv" # update the <subindex> and choose tracts or bg
)
calculate_<subindex>(
input_geojson="path_to_your_geojson_file.geojson", # Replace with your census tract or blockgroup GeoJSON file
output_prefix="<tracts> or <block_groups>", # tracts or bg based on if we are analyzing tracts or blockgroups
year=2022,
aggregate_file="<subindex>_<tract/bg>_all.csv" # update the <subindex> and choose tracts or bg
)
Step 3: Run Scripts in the Following Order
- Run the Subindex Generators: Execute the following scripts to calculate raw subindices. These can be run in any order.
  - CDI_Generator.ipynb
  - LDI_Generator.ipynb
  - IDI_Generator.ipynb
- Run PDI:
  - For small input files (not many tracts or geojsons), run our current PDI_Generator.ipynb.
  - For larger files, a custom approach using CSV files from Missouri Census Data Center (MCDC) is required. For this, please contact cnguyen369@gatech.edu.
- Normalize Subindices: Run Normalizer.ipynb to normalize the raw subindices across all years and cities.
- Generate PEI: Finally, run PEI_Generator.ipynb to calculate the Pedestrian Environment Index.
Step 4: Output
- This process will output normalized subindex files and the final PEI results as CSV and GeoJSON files.
- The file format will be: <subindex>_<city>_<year>.csv/geojson
- Use the Subindex_Visualizer.ipynb file to visualize your GeoJSON output!
7. Challenges
The biggest challenge in our statistic generators was developing the PDI Generator.
- While most of our subindexes (CDI, LDI, IDI) use the Overpass API (OSM data) to gather data, this is not possible for the PDI, as population data is not provided by OSM.
- Because of this, we were forced to use the Census API, which had 2 main issues:
  - It often returned only the latest data (i.e. 2024 data), even when we requested historical population data.
  - On large GeoJSONs, where we have to make hundreds or even thousands of API calls, the Census API frequently errored due to API call limits.
    - This became a considerable problem when running our files using PACE to generate Census Tract data for Dr Ku. Our code would run for 10 or so hours and then fail, as we would run out of API tokens.
- We got over this challenge by downloading population data by tract/block group directly from the Missouri Census Data Center (MCDC).
  - We could then easily calculate Population Density, and hence PDI, by block group/tract by merging this data with our block group/tract GeoJSON files, which contain an area column.
8. Future Work - Potential Statistic Creation Enhancements
While this semester we started working on normalization across all urban areas and years, which ensures our statistics can be properly used for comparisons, we still have a few more potential enhancements:
- Logarithmic normalization: While we currently create percentile ranks to evaluate indexes, utilizing a logarithmic normalization would mean scores are less spread out and more realistic.
- Improved normalization reliability: As we add new cities to our database, which we hope to do by continuing to run our generators over the course of next semester, our scores will automatically become more reliable as we normalize with larger datasets.
- Expanding CDI tags: We aim to add more tags to the Commercial Density Index (CDI), as it is currently too narrow.
9. Contributing
We welcome contributions to this project.
Steps to Contribute:
- Fork the repository.
- Create a feature branch.
- Push your changes and submit a pull request.
10. License
This project is shared for research and educational purposes. Please do not redistribute for commercial use.
Web App Documentation
The web app is currently deployed at this link: https://vip-pei-app-2.onrender.com/ 😊
This deployment is dynamic and so any updates to our codebase (https://github.com/AtharvaBeesen/vip-pei-app-2) will automatically be displayed.
1. Introduction
- App Name: VIP SMUR PEI Proof of Concept
- Purpose: Visualize the work we have done in creating the aforementioned subindexes. We also wanted the data we generate to be available online in a visually appealing, palatable, and easy-to-download way.
Key Features:
- Interactive map with GeoJSON visualization for the subindexes - PDI, IDI, CDI, LDI, PEI - across different cities and years.
- Dynamic city, statistic, and year selection.
- Ability to download CSV and GeoJSON files for selected data.
Technology Stack:
- Frontend: React, JavaScript, HTML, CSS
- Backend: All functionality contained within JavaScript
- Mapping Library: Leaflet
- Data Source: Amazon S3 Buckets
2. Getting Started
Prerequisites
- Node.js and npm/yarn: Ensure Node.js and npm (or yarn) are installed on your system. If not installed, download them from the Node.js official site. You can check this by running: node -v and npm -v
- Code Editor (Optional): Install a code editor like Visual Studio Code.
- Browser: A modern browser like Chrome, Firefox, or Edge to test your app.
- Git: Install Git for cloning the repository. Check installation by running: git --version
- Leaflet Library Dependencies: The app uses Leaflet for maps, which requires:
  - A valid internet connection to download Leaflet assets via npm or yarn.
  - A browser that supports Leaflet.
Installation
- Clone the repository:
- Install dependencies:
- Ensure Leaflet is installed:
- Run the application:
- Access the app at http://localhost:3000.
Deployment
- Already deployed! We deployed using Render: https://vip-pei-app-2.onrender.com/
3. Features
3.1 Interactive Map
- Displays GeoJSON data visualized on a Leaflet map.
- Map dynamically updates based on city, statistic, and year selections.
3.2 City, Statistic, and Year Selection
- Dropdown menus for users to select:
- Cities: Atlanta, New York, Los Angeles.
- Statistics: IDI, PDI, CDI, LDI, PEI.
- Years: 2022, 2013.
3.3 File Downloads
- CSV and GeoJSON files for the selected data can be downloaded with a single click.
4. Components
4.1 App.js
- The main entry point for the application.
- Manages state for selected city, statistic, and year.
- Renders CitySelector, DownloadButton, and MapComponent.
4.2 CitySelector.js
- Dropdown menus for selecting city, statistic, and year.
- Capitalizes city names and statistics for user-friendly display.
4.3 MapComponent.js
- Displays the Leaflet map and GeoJSON data.
- Dynamically fetches data from S3 Buckets based on user selections (more on this below).
- Highlights GeoJSON features with a color-coded scheme based on statistic values.
4.4 DownloadButton.js
- Provides buttons to download CSV and GeoJSON files from S3.
- Dynamically constructs download URLs based on user selections.
5. API Integration
The app dynamically fetches GeoJSON data from an Amazon S3 bucket:
- URL format: https://vip-censusdata.s3.us-east-2.amazonaws.com/{city}_blockgroup_{statistic}_{year}.geojson
Example:
For Atlanta, IDI, and 2022:
https://vip-censusdata.s3.us-east-2.amazonaws.com/atlanta_blockgroup_IDI_2022.geojson
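The web app builds these URLs in JavaScript. For scripts that want to fetch the same files outside the app, the URL format can be sketched in Python as below; the function name geojson_url is hypothetical, and the city and statistic strings are passed through unchanged.

```python
BASE_URL = "https://vip-censusdata.s3.us-east-2.amazonaws.com"


def geojson_url(city, statistic, year):
    """Build the S3 URL for one city/statistic/year block-group GeoJSON file."""
    return f"{BASE_URL}/{city}_blockgroup_{statistic}_{year}.geojson"


url = geojson_url("atlanta", "IDI", 2022)
print(url)
```

Only the Atlanta spelling is confirmed by the example above; how multi-word cities such as New York are written in the file names is not stated here.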
6. Future Work
There are 3 key goals we hope to achieve:
- Increase the number of cities and years supported: This would simply require us to continue running our subindex generators over the course of the next semester(s) in order to continue to grow the size of our database.
- Seek Collaboration/Monetization Opportunities: As we grow the site and our footprint in the space, we could seek to replicate WalkScore's monetization and collaboration business model:
  - Collaborate with products/sites: Work with products or sites that require walkability statistics (e.g., Zillow and City Planning Companies) to create customized statistics at a cost.
  - Direct collaboration with local government: Assist local governments in achieving their goals of improving walkability in urban areas.
  - Developing a for-cost API: Create an API that allows third-party researchers to use our data. For example: collaborating with researchers like Dr. Ku, who uses our data to enhance his psychology research.
- Improve the UI: This goal is less essential and is more about improving user experience in case we decide to push towards becoming standalone software for third parties to gather walkability data on different urban areas.
7. Contributing
- Please contact abeesen3@gatech.edu before contributing.
- Example of how to contribute:
  - Fork the repository (https://github.com/AtharvaBeesen/vip-pei-app-2).
  - Create a feature branch.
  - Push your changes and submit a pull request.
8. License
This project is shared for research and educational purposes. Please do not redistribute for commercial use.
Presentation
Team
Name | Seniority | Major | Department | GitHub Handle |
---|---|---|---|---|
C. "Albert" Le | Sophomore | Computer Engineering | ECE | balbertle |
Chunlan Wang | Masters | Architecture (DC) | ARCH | wang-123-xi |
Yichao Shi | PhD | Architecture | ARCH | SHIyichao98 |
Atharva Beesen | Junior | Computer Science | COC | AtharvaBeesen |