Galveston Housing Unit Allocation

Housing Unit Allocation using the Galveston, Texas Housing Unit Inventory. Notebook from https://github.com/IN-CORE/pyincore/tree/develop/pyincore/analyses/housingunitallocation

import pandas as pd
import numpy as np
import sys # For displaying package versions
import os # For managing directories and file paths if drive is mounted


from pyincore import IncoreClient, Dataset, FragilityService, MappingSet, DataService
from pyincore.analyses.housingunitallocation import HousingUnitAllocation
client = IncoreClient()
# Check package versions - good practice for replication
print("Python Version ",sys.version)
print("pandas version: ", pd.__version__)
print("numpy version: ", np.__version__)
# Check working directory - good practice for relative path access
os.getcwd()

Initial Interdependent Community Description - Galveston, Texas

Explore the building inventory and social systems. Specifically, look at how the building inventory connects with the housing unit inventory through the housing unit allocation. The housing unit allocation method provides detailed demographic characteristics for the community, allocated to each structure.
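
To make the idea of linking housing units to structures concrete, here is a deliberately simplified sketch. It is not pyincore's implementation: it only assumes that housing units are paired at random with address points in the same census block, and that each address point belongs to a building. The function name `allocate_housing_units` and the toy identifiers (`hu1`, `ap1`, `bldg10`, `blockA`) are hypothetical and exist only for illustration.

```python
import random
from collections import defaultdict


def allocate_housing_units(housing_units, address_points, seed):
    """Randomly pair housing units with address points inside each census block.

    housing_units: list of (housing_unit_id, block_id) tuples
    address_points: list of (address_point_id, block_id, building_id) tuples
    Returns a dict mapping housing_unit_id -> building_id
    (None when the block has no address point left).
    """
    rng = random.Random(seed)  # a fixed seed makes the random pairing reproducible

    # Group the buildings behind each address point by census block.
    buildings_by_block = defaultdict(list)
    for _address_point_id, block_id, building_id in sorted(address_points):
        buildings_by_block[block_id].append(building_id)

    # Shuffle each block's list once, in a stable block order.
    for block_id in sorted(buildings_by_block):
        rng.shuffle(buildings_by_block[block_id])

    # Hand out the shuffled address points to housing units in the same block.
    allocation = {}
    for housing_unit_id, block_id in sorted(housing_units):
        remaining = buildings_by_block.get(block_id, [])
        allocation[housing_unit_id] = remaining.pop() if remaining else None
    return allocation


# Hypothetical toy inventories (not Galveston data)
housing_units = [("hu1", "blockA"), ("hu2", "blockA"), ("hu3", "blockB")]
address_points = [
    ("ap1", "blockA", "bldg10"),
    ("ap2", "blockA", "bldg11"),
    ("ap3", "blockB", "bldg20"),
]

first = allocate_housing_units(housing_units, address_points, seed=1238)
second = allocate_housing_units(housing_units, address_points, seed=1238)
print(first)
print(first == second)  # same seed, same pairing
```

The sketch shows why a seed matters for replication: the pairing is random, so repeating the run with the same seed is the only way to get the same household-to-building assignment back.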

# Galveston, TX Housing unit inventory
housing_unit_inv = "5fc6ab1cd2066956f49e7a03"
# Galveston, TX Address point inventory
address_point_inv = "5fc6aadcc38a0722f563392e"
# Galveston, TX Building inventory
building_inv = "60354b6c123b4036e6837ef7"

Run Housing Unit Allocation

https://github.com/IN-CORE/incore-docs/blob/main/notebooks/housingunitallocation.ipynb

Rosenheim, Nathanael, Roberto Guidotti, Paolo Gardoni & Walter Gillis Peacock. (2019). Integration of detailed household and housing unit characteristic data with critical infrastructure for post-hazard resilience modeling. Sustainable and Resilient Infrastructure. doi.org/10.1080/23789689.2019.1681821

# Create housing allocation 
hua = HousingUnitAllocation(client)

# Load input dataset
hua.load_remote_input_dataset("housing_unit_inventory", housing_unit_inv)
hua.load_remote_input_dataset("address_point_inventory", address_point_inv)
hua.load_remote_input_dataset("buildings", building_inv)

# Specify the result name
result_name = "Galveston_HUA"

seed = 1238
iterations = 1

# Set analysis parameters
hua.set_parameter("result_name", result_name)
hua.set_parameter("seed", seed)
hua.set_parameter("iterations", iterations)
# Run Housing unit allocation analysis
hua.run_analysis()
# Retrieve result dataset
result = hua.get_output_dataset("result")

# Convert dataset to Pandas DataFrame
hua_df = result.get_dataframe_from_csv(low_memory=False)

# Display top 5 rows of output data
hua_df.head()

Explore results from Housing Unit Allocation

Keep observations that are matched to a building.

# .copy() avoids a SettingWithCopyWarning when columns are added to the filtered frame
hua_df = hua_df.loc[hua_df['aphumerge'] == 'both'].copy()
# Identify Race and Ethnicity Housing Unit Characteristics.
hua_df['Race Ethnicity'] = "0 Vacant HU No Race Ethnicity Data"

hua_df.loc[(hua_df['race'] == 1) & (hua_df['hispan'] == 0),'Race Ethnicity'] = "1 White alone, Not Hispanic"
hua_df.loc[(hua_df['race'] == 2) & (hua_df['hispan'] == 0),'Race Ethnicity'] = "2 Black alone, Not Hispanic"
hua_df.loc[(hua_df['race'].isin([3,4,5,6,7])) & (hua_df['hispan'] == 0),'Race Ethnicity'] = "3 Other Race, Not Hispanic"
hua_df.loc[(hua_df['hispan'] == 1),'Race Ethnicity'] = "4 Any Race, Hispanic"
hua_df.loc[(hua_df['gqtype'] >= 1),'Race Ethnicity'] = "5 Group Quarters no Race Ethnicity Data"
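
The `aphumerge == 'both'` filter above reads like a pandas merge indicator. Assuming that is how the column behaves (an assumption, not something this notebook confirms), the toy example below shows where the values `both`, `left_only`, and `right_only` come from. The two small DataFrames and their contents are hypothetical.

```python
import pandas as pd

# Hypothetical address points: ap3 has no housing unit assigned to it
address_points = pd.DataFrame({
    "addrptid": ["ap1", "ap2", "ap3"],
    "guid": ["bldg10", "bldg11", "bldg12"],
})

# Hypothetical housing units: ap4 has no matching address point
housing_units = pd.DataFrame({
    "addrptid": ["ap1", "ap2", "ap4"],
    "numprec": [3, 0, 2],
})

# indicator="aphumerge" adds a column recording which side each row came from
merged = pd.merge(address_points, housing_units,
                  on="addrptid", how="outer", indicator="aphumerge")

# Keep only rows present in both inputs, as the notebook does for hua_df
matched = merged.loc[merged["aphumerge"] == "both"].copy()

print(merged[["addrptid", "aphumerge"]])
print(len(matched), "rows matched on both sides")
```

Rows flagged `left_only` or `right_only` exist in only one input, so dropping them leaves the records that have both a location and household characteristics.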

# Check new variable
table_title = "Confirm housing unit characteristic by Race and Ethnicity."
pd.crosstab(hua_df['Race Ethnicity'], hua_df['race'], 
            margins=True, margins_name="Total").style.set_caption(table_title)
# Check new variable
table_title = "Confirm housing unit characteristic by Race and Ethnicity."
pd.crosstab(hua_df['Race Ethnicity'], hua_df['hispan'], 
            margins=True, margins_name="Total").style.set_caption(table_title)
table_title = "Table 1. Housing Unit Characteristics by Race and Ethnicity"
table1 = pd.pivot_table(hua_df, values='numprec', index=['Race Ethnicity'],
                              margins = True, margins_name = 'Total',
                              aggfunc=[len, 'sum'],
                              fill_value=0).reset_index().rename(
                                                            columns={'len': 'Housing Unit',
                                                                     'sum' : 'Population',
                                                                     'numprec': 'Count'})

varformat = {('Housing Unit','Count'): "{:,}", ('Population','Count'): "{:,}"}
table1.style.set_caption(table_title).format(varformat).set_table_styles([
    dict(selector='th', props=[('text-align', 'center')]),])

Validate the Housing Unit Allocation has worked

The community's total population count should closely match the data collected for the 2010 Decennial Census. This can be confirmed at data.census.gov:

https://data.census.gov/cedsci/table?q=DECENNIALPL2010.P1&g=1600000US4828068,4837252&tid=DECENNIALSF12010.P1

Differences between the housing unit allocation and the Census count may be due to differences between political boundaries and the building inventory. See Rosenheim et al. (2019) for more details.
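
One way to quantify "closely" is a percent difference between the allocated population and the Census count. The sketch below is a minimal illustration: `percent_difference` is a hypothetical helper, and both population figures are placeholder values, not Galveston results or Census data.

```python
def percent_difference(allocated_total, census_total):
    """Return the signed percent difference of the allocated total from the Census total."""
    if census_total == 0:
        raise ValueError("census_total must be non-zero")
    return 100.0 * (allocated_total - census_total) / census_total


# Placeholder values for illustration only (not real counts)
allocated_population = 9_800
census_population = 10_000

diff = percent_difference(allocated_population, census_population)
print(f"Allocated population differs from the Census count by {diff:+.1f}%")
```

In this notebook the allocated total would come from `hua_df['numprec'].sum()`, and the Census total would be read from the table linked above.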

The housing unit allocation, plus the building damage results, will become the input for the dislocation model.

# Save cleaned HUA file as CSV
hua_df.to_csv(result_name+str(seed)+'_cleaned.csv')