| column | dtype | pct_populated | n_distinct | example | definition |
|---|---|---|---|---|---|
| permit_id | string | 95.7 | 125150 | 340653475.0 | Source permit/record id within the issuing jurisdiction. |
| city | string | 100.0 | 46 | nyc | Source city/jurisdiction slug. |
| state | string | 100.0 | 17 | 36 | State identifier. Documented as a USPS state abbreviation, but the example value (36) is a state FIPS code. |
| source_dataset | string | 100.0 | 50 | nyc_construction_permits.csv | Originating open-data dataset id. |
| ingestion_quality | string | 100.0 | 7 | hand_mapped_v8 | Ingestion grade (hand_mapped_v8/partial/narrow/...); weight screening rows by this. |
| address | string | 94.6 | 84985 | 1621 AVENUE T | Street address as published by the source (street-only for most cities). |
| latitude | float | 57.5 | 41689 | 40.601231 | WGS84 latitude where the source provided coordinates. |
| longitude | float | 57.4 | 41556 | -73.955623 | WGS84 longitude where the source provided coordinates. |
| tract_geoid | string | 72.7 | 5976 | 36047055800 | 11-digit 2020 Census tract GEOID. |
| county_fips | string | 75.3 | 32 | 36047 | 5-digit county FIPS. |
| geo_resolution | string | 100.0 | 3 | full_address | Best available geo precision: full_address > latlon > tract_only > none. |
| not_livable_type | string | 100.0 | 3 | demolition | One of: demolition \| condemned_unsafe \| under_construction. |
| census_day_status | string | 100.0 | 5 | active | Status relative to Census Day (2020-04-01): active \| unconfirmed \| completed_before \| issued_after \| unknown_dates. |
| signal_strength | string | 100.0 | 3 | strong | strong (demo/condemn) \| medium (active construction) \| weak (unconfirmed). |
| match_keyword | string | 100.0 | 33 | demolition | Keyword that triggered the type classification. |
| match_basis | string | 85.7 | 243 | demolition\|demo | All matched keywords (pipe-delimited), for QA. |
| residential_flag | bool | 100.0 | 2 | unknown | yes if residential/dwelling context detected, else unknown. |
| start_date | date | 99.7 | 5544 | 2019-01-01 | Permit issue date (the 'start'). |
| end_date | date | 56.2 | 3986 | 2019-01-02 | Completion/CO date where a completion proxy exists; blank for issuance-only sources. |
| units | int | 19.0 | 338 | 0.0 | Dwelling units on the permit where reported. |
| use_type_raw | string | 42.5 | 756 | 1-2-3 FAMILY | Verbatim source permit-type/use text. |
| description_raw | string | 87.6 | 71365 | INTERIOR DEMOLITION( PARTITION REMOVAL) ; NEW PARTITION; CEI | Verbatim source work-description text. |
| uninhabitable_18 | bool | 100.0 | 2 | True | Same as uninhabitable_on_census_day, but under an 18-month recency window (conservative alternative). |
| uninhabitable_24 | string | 100.0 | 2 | True | |
| uninhabitable_on_census_day | bool | 100.0 | 2 | True | TRUE if uninhabitable on Census Day (type-aware; 24mo recency + no-rebuild for demo/condemn). |
| rebuilt_before_census_day | string | 100.0 | 2 | False | |
| excl_interior_accessory_demo | bool | 100.0 | 2 | False | TRUE if a bare-demo row is interior/accessory (not a dwelling). |
| excl_erect_nonresidential | bool | 100.0 | 2 | False | TRUE if an 'erect' row lacks residential context. |
| excl_dc_unitsfallback | bool | 100.0 | 2 | False | TRUE if a DC row entered only via the units>0 fallback. |
| excluded_any | bool | 100.0 | 2 | False | TRUE if any exclusion flag is set. |
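
The roll-up and status columns can be illustrated with a short Python sketch. Only `excluded_any` follows directly from its definition (a logical OR over the three `excl_*` flags). The `census_day_status` function is hypothetical: the dictionary lists the five labels and the 2020-04-01 reference date but not the rules that assign them, so the date comparisons and boundary handling below are assumptions, not the pipeline's actual logic.

```python
from datetime import date
from typing import Optional

# Reference date named in the census_day_status definition.
CENSUS_DAY = date(2020, 4, 1)

# The three exclusion flags listed in the dictionary.
EXCLUSION_FLAGS = (
    "excl_interior_accessory_demo",
    "excl_erect_nonresidential",
    "excl_dc_unitsfallback",
)


def excluded_any(row: dict) -> bool:
    """TRUE if any exclusion flag is set (missing flags count as False)."""
    return any(bool(row.get(flag, False)) for flag in EXCLUSION_FLAGS)


def census_day_status(start: Optional[date], end: Optional[date]) -> str:
    """Assumed mapping from permit dates to the five status labels.

    ASSUMPTION: these rules are a guess at how the labels relate to
    start_date/end_date; the source does not spell them out.
    """
    if start is None:
        return "unknown_dates"     # no issue date to compare against
    if start > CENSUS_DAY:
        return "issued_after"      # permit issued after Census Day
    if end is None:
        return "unconfirmed"       # issued on or before, no completion proxy
    if end < CENSUS_DAY:
        return "completed_before"  # work finished before Census Day
    return "active"                # issued on or before, completed on or after


row = {
    "excl_interior_accessory_demo": False,
    "excl_erect_nonresidential": True,
    "excl_dc_unitsfallback": False,
}
print(excluded_any(row))
print(census_day_status(date(2019, 1, 1), date(2019, 1, 2)))
print(census_day_status(date(2019, 6, 1), None))
```

Under these assumed rules, a row with a blank `end_date` can never be labeled `active`, which is consistent with `end_date` being blank for issuance-only sources, but the real pipeline may resolve such rows differently.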