CDBG 2020 #1348

Open
8 of 11 tasks
damonmcc opened this issue Dec 27, 2024 · 11 comments
Labels
data update Related to a data product update

Comments

@damonmcc
Member

damonmcc commented Dec 27, 2024

primary outputs:

  • csvs with details about the CDBG eligibility of 2020 census block groups, tracts, and boroughs
  • an Excel file with three sheets to align with the May 2019 output: Tract, Borough, Block Group

source data:

  • 2020 census tracts
  • PLUTO 24v4
  • 2020 HUD Low and Moderate Income Summary Data (LMISD)

source data we seem to be missing, but would only need for documentation:

  • ACS data for total population in each census tract
  • latest "Low and Moderate 4-Person Family Income Limit: $xx,xxx"

details from GDE Manual

NYC receives Community Development Block Grants from the U.S. Department of Housing and Urban Development (HUD). These grants may be used for projects in residential areas where over half the population is living in low- and moderate-income households.

A census tract is considered residential if at least 50.0% of the total built floor area is residential. This is determined by summing ResArea and BldgArea for MapPLUTO lots in the census tract. If at least 90% of a lot's area is in the tract, the entire lot is assigned to that tract. Otherwise, the building area is assigned proportionately based on the percent of the lot's area in each census tract.
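The allocation rule above can be sketched in Python. This is a minimal illustration, not the actual dbt build: the function names, the input shape (a lot's `ResArea` and `BldgArea` plus a hypothetical mapping of tract ID to the fraction of the lot's area in that tract), and the tract IDs are assumptions made for the example.

```python
from collections import defaultdict

FULL_ASSIGNMENT_THRESHOLD = 0.90  # "at least 90% of a lot's area is in the tract"
RESIDENTIAL_THRESHOLD = 0.50      # "at least 50.0% of the total built floor area"


def allocate_lot(lot, overlaps):
    """Split one lot's floor area across tracts.

    lot: dict with "ResArea" and "BldgArea" (MapPLUTO field names).
    overlaps: hypothetical {tract_id: fraction of the lot's area in that tract}.
    Returns {tract_id: (res_area, bldg_area)}.
    """
    # If one tract holds at least 90% of the lot, it gets the entire lot.
    for tract, fraction in overlaps.items():
        if fraction >= FULL_ASSIGNMENT_THRESHOLD:
            return {tract: (lot["ResArea"], lot["BldgArea"])}
    # Otherwise allocate proportionately to the lot area in each tract.
    return {
        tract: (lot["ResArea"] * fraction, lot["BldgArea"] * fraction)
        for tract, fraction in overlaps.items()
    }


def residential_tracts(lots):
    """Return {tract_id: True/False} from a list of (lot, overlaps) pairs."""
    totals = defaultdict(lambda: [0.0, 0.0])  # tract -> [res_area, bldg_area]
    for lot, overlaps in lots:
        for tract, (res, bldg) in allocate_lot(lot, overlaps).items():
            totals[tract][0] += res
            totals[tract][1] += bldg
    # A tract with no built floor area is treated as not residential (assumption).
    return {
        tract: bldg > 0 and res / bldg >= RESIDENTIAL_THRESHOLD
        for tract, (res, bldg) in totals.items()
    }


example_lots = [
    ({"ResArea": 800, "BldgArea": 1000}, {"A": 0.95, "B": 0.05}),  # all to tract A
    ({"ResArea": 200, "BldgArea": 1000}, {"A": 0.5, "B": 0.5}),    # split evenly
]
print(residential_tracts(example_lots))
```

In this example tract A ends up with 900 of 1,500 sq ft residential (60%) and tract B with 100 of 500 (20%), so only A counts as residential.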

The clipped version of MapPLUTO was used because otherwise waterfront properties on the East River in Queens and Brooklyn had a portion of their building area assigned to Manhattan. An additional check was added to verify that a lot's area was fully captured. This was a problem for waterfront lots that extended outside the tract boundaries.

new/active stuff

new source data in sharepoint

CDBG Eligibility by Census Tract: metadata, DCP website, Bytes, Open Data csv, Open Data fgdb

Status of Update

  • source data extracted
  • draft build succeeded
  • draft build passed QA
  • data packaged and distributed

dev todos

  • use >= in eligibility logic
  • use HUD's `LOWMOD_PCT` for block groups and `LOWMOD/LOWMODUNIV` for tracts (instead of any total population data)
  • remove artifacts of trying to use total population to determine eligibility
  • create a README
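A minimal Python sketch of the first two todos, assuming tract-level counts are obtained by summing the block-group `LOWMOD` and `LOWMODUNIV` values (the thread does not confirm how tracts are aggregated). The function names and the 51.0 constant placement are illustrative, not taken from the build.

```python
LOWMOD_THRESHOLD_PCT = 51.0  # "at least 51.00% of the residents are low- and moderate-income persons"


def lowmod_pct(lowmod, lowmoduniv):
    """Percent low/mod income, using HUD's LOWMODUNIV as the denominator."""
    if lowmoduniv == 0:
        return None  # no universe to measure against
    return 100.0 * lowmod / lowmoduniv


def tract_lowmod_pct(block_groups):
    """Assumed aggregation: sum (LOWMOD, LOWMODUNIV) pairs across a tract's block groups."""
    lowmod = sum(bg[0] for bg in block_groups)
    lowmoduniv = sum(bg[1] for bg in block_groups)
    return lowmod_pct(lowmod, lowmoduniv)


def is_lowmod_eligible(pct):
    """Use >= (not >) so a value of exactly 51.00% qualifies."""
    return pct is not None and pct >= LOWMOD_THRESHOLD_PCT


print(is_lowmod_eligible(tract_lowmod_pct([(30, 50), (21, 50)])))
```

With `>` instead of `>=`, the example tract (51 of 100 persons) would be excluded, which is the behavior of the old Excel sheet that the todo is meant to change.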

dev questions

  • should the eligibility logic be >= 50% and >=51%?
    • the language in the docs is "at least 50% of the total built floor area is residential" and "at least 51.00% of the residents are low- and moderate-income persons", but the excel sheet used >
    • answer from AD: use >=
  • should we use HUD's "persons with the potential for being low/mod income" or total population to calculate eligibility?
    • answer from OMB: use HUD's "persons with the potential for being low/mod income" and update documentation
  • there are 95 lots that are not entirely contained by city census blocks/tracts, so some bldg area is "lost" from pluto. Is this fine? Do we want to allocate 100% in these odd edge cases? See CDBG 2020 #1348 (comment)
    • DE idea: allocate the % of the lot that is in the city geographies because that's consistent with how they're allocated across multiple geographies and it isn't invalid that lots cross county lines
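To make the last question concrete, here is a small Python sketch comparing the two options for a lot that is only partly inside city census geographies. The function names and the fractions are hypothetical; the real overlap shares would come from intersecting the clipped MapPLUTO lots with the tract or block group polygons.

```python
def allocate_in_city_share(bldg_area, in_city_fractions):
    """DE idea: allocate only the share of the lot inside each city geography.

    in_city_fractions: hypothetical {geo_id: fraction of lot area in that geography};
    the fractions sum to less than 1 when part of the lot lies outside the city.
    Returns (allocated areas by geography, "lost" building area).
    """
    allocated = {geo: bldg_area * f for geo, f in in_city_fractions.items()}
    lost = bldg_area - sum(allocated.values())
    return allocated, lost


def allocate_full_area(bldg_area, in_city_fractions):
    """Alternative: rescale the fractions so 100% of the area is allocated."""
    covered = sum(in_city_fractions.values())
    return {geo: bldg_area * f / covered for geo, f in in_city_fractions.items()}


fractions = {"A": 0.5, "B": 0.25}  # 25% of the lot falls outside city geographies
print(allocate_in_city_share(1000, fractions))
print(allocate_full_area(1000, fractions))
```

Under the first option 250 sq ft of the example lot is not assigned to any tract; under the second, all 1,000 sq ft is assigned, with each tract's share scaled up proportionately.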

old stuff (but both are related to geocoding CDBG projects, not determining census tract eligibility)

db-cdbg repo

DE CDBG sharepoint folder

@damonmcc damonmcc added the data update Related to a data product update label Dec 27, 2024
@damonmcc damonmcc moved this to 📬 Next in Data Engineering Dec 27, 2024
@damonmcc damonmcc changed the title CDBG 2025 CDBG Eligibility 2025 Dec 29, 2024
@damonmcc damonmcc changed the title CDBG Eligibility 2025 CDBG 2025 Dec 29, 2024
@damonmcc damonmcc self-assigned this Dec 30, 2024
@damonmcc
Member Author

damonmcc commented Dec 30, 2024

will add this (new) data product to the DE repo as a dbt project

asked OMB some clarifying questions about the source data but we can proceed as-is

@damonmcc damonmcc moved this from 📬 Next to 🏗 In progress in Data Engineering Dec 30, 2024
@damonmcc
Member Author

damonmcc commented Dec 31, 2024

added a new DB to ed-data called db-cdbg per our wiki

@damonmcc
Member Author

added HUD LMISD source data sent by OMB to edm-recipes/inbox in S3

Image

@damonmcc
Member Author

the build on the dev branch looked good. after archiving the latest dcp_mappluto_clipped, ran a build here and shared with @AmandaDoyle via email

@damonmcc
Member Author

damonmcc commented Dec 31, 2024

per HUD LMISD data dictionary:

| Field | Definition |
| --- | --- |
| LOWMOD | The count of Low- and Moderate-income persons. |
| LOWMODUNIV | Persons with the potential for being deemed Low-, Moderate- and Middle-income. Use as the denominator for LOW, LOWMOD, and LMMI %'s. |
| LOWMOD_PCT | The percentage of Low- and Moderate-income persons. Calculated from LOWMOD divided by LOWMODUNIV. |

since, per the DCP website, a tract is eligible if "at least 51.00% of the residents are low- and moderate-income persons", seems like we need the total population of each 2020 tract and 2020 block group

@fvankrieken found a source dataset we already archive that may have this (I think dcp_pop_acs?)

@fvankrieken
Contributor

dcp_censusdata (and dcp_censusdata_blocks) - just a lot smaller than the acs datasets

@damonmcc
Member Author

damonmcc commented Jan 2, 2025

noting that some lots seem to extend outside of NYC. so <100% of their floor areas are assigned to their NYC census tracts/block groups

@damonmcc damonmcc changed the title CDBG 2025 CDBG 2024 Jan 2, 2025
@fvankrieken
Contributor

lots not fully in NYC census geographies issue example

lot over osm basemap
Image

lot in zola
Image

area of interest in pff (where we can see cb boundaries)
Image

@damonmcc damonmcc changed the title CDBG 2024 CDBG 2020 Jan 6, 2025
This was referenced Jan 6, 2025
@damonmcc
Member Author

damonmcc commented Jan 6, 2025

shared latest build from branch dm-cdbg-zeros with @AmandaDoyle for review

edit: noticed that, in the output shared, the borough-level % of population that is low/mod income was still using total population as the denominator

@damonmcc damonmcc mentioned this issue Jan 6, 2025
@damonmcc
Member Author

damonmcc commented Jan 7, 2025

we've been asked to determine ZAP project eligibility using this new tract eligibility data. @AmandaDoyle wants an output to review by the morning of 1/8 (tomorrow)

will create a sub-issue for this

@damonmcc damonmcc moved this from 🏗 In progress to 🔍 In review in Data Engineering Jan 8, 2025
@damonmcc
Member Author

damonmcc commented Jan 9, 2025

shared outputs for review on 1/8

Projects
Status: 🔍 In review

2 participants