adding permit data
commit 341a129953
1
README.md
Normal file
@@ -0,0 +1 @@
This repo contains collected public source documents.
12
not_livable_delivery/LOADER_README.txt
Normal file
@@ -0,0 +1,12 @@
not_livable_permits -- loading notes
====================================
Rows: 135,159. Files: not_livable_permits.csv / .parquet / data_dictionary.csv

PREFERRED -- parquet (dtypes embedded, no re-contamination):
import pandas as pd; df = pd.read_parquet('not_livable_permits.parquet')

CSV: read code columns as str (otherwise '06037' is parsed as int 6037 and the leading zero is lost):
STR_COLS = ['permit_id', 'city', 'state', 'source_dataset', 'tract_geoid', 'county_fips']
df = pd.read_csv('not_livable_permits.csv', dtype={c: str for c in STR_COLS})
FIPS canonical widths: state = 2, county_fips = 5, tract_geoid = 11 digits.
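The canonical widths above can be checked after loading. The following is a minimal sketch, not part of the delivery: `normalize_fips` and `fips_consistent` are hypothetical helper names, and the zero-padding step assumes that a short code lost only its leading zeros (for example through an integer or float round-trip).

```python
from typing import Optional

# Canonical widths from the loading notes: state=2, county_fips=5, tract_geoid=11.
FIPS_WIDTHS = {"state": 2, "county_fips": 5, "tract_geoid": 11}


def normalize_fips(value: Optional[str], width: int) -> Optional[str]:
    """Left-pad a digit string to `width`; return None for blank or invalid input."""
    if value is None:
        return None
    text = str(value).strip()
    if text.endswith(".0"):  # assumption: undo a float round-trip such as '6037.0'
        text = text[:-2]
    if not text or not text.isdigit() or len(text) > width:
        return None
    return text.zfill(width)


def fips_consistent(state: Optional[str],
                    county_fips: Optional[str],
                    tract_geoid: Optional[str]) -> bool:
    """True when the populated codes nest: tract starts with county, county with state."""
    if state and county_fips and not county_fips.startswith(state):
        return False
    if county_fips and tract_geoid and not tract_geoid.startswith(county_fips):
        return False
    return True


# Usage with the '06037' case from the notes and the example codes in the data dictionary.
restored_county = normalize_fips("6037", FIPS_WIDTHS["county_fips"])
nested_ok = fips_consistent("36", "36047", "36047055800")
```

Blank codes are skipped rather than failed in `fips_consistent`, because the data dictionary reports `county_fips` and `tract_geoid` as only partly populated.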
Tight headline = uninhabitable_on_census_day & residential_flag & ~excluded_any (= 32,199 rows).
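The headline mask can also be applied row by row when reading the CSV without pandas. This is a sketch under assumptions: the accepted spellings in `TRUE_STRINGS` are a guess (the data dictionary lists `residential_flag` as bool but describes its values as yes/unknown), and the sample rows and ids below are invented for illustration.

```python
import csv
import io

# Assumption: spellings that count as a set flag in the CSV export.
TRUE_STRINGS = {"true", "yes", "1"}


def as_flag(value) -> bool:
    """Interpret a CSV cell as a boolean; anything unrecognised counts as False."""
    return str(value).strip().lower() in TRUE_STRINGS


def is_tight_headline(row: dict) -> bool:
    """uninhabitable_on_census_day & residential_flag & ~excluded_any for one row."""
    return (
        as_flag(row.get("uninhabitable_on_census_day"))
        and as_flag(row.get("residential_flag"))
        and not as_flag(row.get("excluded_any"))
    )


# Invented sample rows; only the column names come from the data dictionary.
SAMPLE = """permit_id,uninhabitable_on_census_day,residential_flag,excluded_any
a1,True,yes,False
a2,True,unknown,False
a3,True,yes,True
a4,False,yes,False
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
tight_ids = [r["permit_id"] for r in rows if is_tight_headline(r)]
```

Running the same filter over the full file and comparing the count with the stated 32,199 is a quick way to confirm that the flag columns were read as intended.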
31
not_livable_delivery/data_dictionary.csv
Normal file
@@ -0,0 +1,31 @@
column,dtype,pct_populated,n_distinct,example,definition
permit_id,string,95.7,125150,340653475.0,Source permit/record id within the issuing jurisdiction.
city,string,100.0,46,nyc,Source city/jurisdiction slug.
state,string,100.0,17,36,2-digit state FIPS code.
source_dataset,string,100.0,50,nyc_construction_permits.csv,Originating open-data dataset id.
ingestion_quality,string,100.0,7,hand_mapped_v8,Ingestion grade (hand_mapped_v8/partial/narrow/...); weight screening rows by this.
address,string,94.6,84985,1621 AVENUE T,Street address as published by the source (street-only for most cities).
latitude,float,57.5,41689,40.601231,WGS84 latitude where the source provided coordinates.
longitude,float,57.4,41556,-73.955623,WGS84 longitude where the source provided coordinates.
tract_geoid,string,72.7,5976,36047055800,11-digit 2020 Census tract GEOID.
county_fips,string,75.3,32,36047,5-digit county FIPS.
geo_resolution,string,100.0,3,full_address,Best available geo precision: full_address > latlon > tract_only > none.
not_livable_type,string,100.0,3,demolition,demolition | condemned_unsafe | under_construction.
census_day_status,string,100.0,5,active,active|unconfirmed|completed_before|issued_after|unknown_dates vs 2020-04-01.
signal_strength,string,100.0,3,strong,strong (demo/condemn) | medium (active construction) | weak (unconfirmed).
match_keyword,string,100.0,33,demolition,Keyword that triggered the type classification.
match_basis,string,85.7,243,demolition|demo,"All matched keywords (pipe-delimited), for QA."
residential_flag,bool,100.0,2,unknown,"yes if residential/dwelling context detected, else unknown."
start_date,date,99.7,5544,2019-01-01,Permit issue date (the 'start').
end_date,date,56.2,3986,2019-01-02,Completion/CO date where a completion proxy exists; blank for issuance-only sources.
units,int,19.0,338,0.0,Dwelling units on the permit where reported.
use_type_raw,string,42.5,756,1-2-3 FAMILY,Verbatim source permit-type/use text.
description_raw,string,87.6,71365,INTERIOR DEMOLITION( PARTITION REMOVAL) ; NEW PARTITION; CEI,Verbatim source work-description text.
uninhabitable_18,bool,100.0,2,True,Same under an 18-month recency window (conservative alternative).
uninhabitable_24,string,100.0,2,True,
uninhabitable_on_census_day,bool,100.0,2,True,TRUE if uninhabitable on Census Day (type-aware; 24mo recency + no-rebuild for demo/condemn).
rebuilt_before_census_day,string,100.0,2,False,
excl_interior_accessory_demo,bool,100.0,2,False,TRUE if a bare-demo row is interior/accessory (not a dwelling).
excl_erect_nonresidential,bool,100.0,2,False,TRUE if an 'erect' row lacks residential context.
excl_dc_unitsfallback,bool,100.0,2,False,TRUE if a DC row entered only via the units>0 fallback.
excluded_any,bool,100.0,2,False,TRUE if any exclusion flag is set.
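Several rows in the dictionary pair a declared dtype with an example that does not parse as that dtype. A small QA pass can list them before the CSV is trusted. This is a sketch, not part of the delivery: `example_matches_dtype` and `mismatched_columns` are hypothetical helpers, and the parsing rules for each dtype label are assumptions about what the labels mean.

```python
import csv
import io
from datetime import date


def example_matches_dtype(dtype: str, example: str) -> bool:
    """Return True when `example` parses as the declared `dtype` label."""
    if dtype == "string":
        return True
    if dtype == "bool":
        return example in {"True", "False"}
    if dtype == "int":
        return example.lstrip("-").isdigit()
    if dtype == "float":
        try:
            float(example)
            return True
        except ValueError:
            return False
    if dtype == "date":
        try:
            date.fromisoformat(example)
            return True
        except ValueError:
            return False
    return False  # unknown dtype label


def mismatched_columns(dictionary_csv: str) -> list:
    """Names of columns whose example does not fit the declared dtype."""
    reader = csv.DictReader(io.StringIO(dictionary_csv))
    return [r["column"] for r in reader
            if not example_matches_dtype(r["dtype"], r["example"])]


# Four rows copied from data_dictionary.csv.
EXCERPT = """column,dtype,pct_populated,n_distinct,example,definition
residential_flag,bool,100.0,2,unknown,"yes if residential/dwelling context detected, else unknown."
start_date,date,99.7,5544,2019-01-01,Permit issue date (the 'start').
units,int,19.0,338,0.0,Dwelling units on the permit where reported.
excluded_any,bool,100.0,2,False,TRUE if any exclusion flag is set.
"""

flagged = mismatched_columns(EXCERPT)
```

On this excerpt the check flags `residential_flag` (bool with example `unknown`) and `units` (int with example `0.0`), so both columns deserve a look before they are used in a boolean mask or a sum.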
144739
not_livable_delivery/not_livable_permits.csv
Normal file
File diff suppressed because it is too large
BIN
not_livable_delivery/not_livable_permits.parquet
Normal file
Binary file not shown.
BIN
not_livable_delivery/permits_unified_v10_feed_delivery.zip
Normal file
Binary file not shown.