---
title: "NYC Community Language Profiles"
output:
  flexdashboard::flex_dashboard:
    vertical_layout: fill
    source_code: embed
    theme: readable
    css: style.css
    self_contained: TRUE

---

```{r setup, include=FALSE}

library(tidyverse)
library(DT)
library(plotly)
library(crosstalk)
library(rgdal) # provides readOGR(), used below; note that rgdal was retired from CRAN in 2023
library(tidyr)

library(leaflet)
library(leaflet.extras)
library(sf)
library(geosphere)
library(raster)

library(scales)

# library(gbatr)

#  options(DT.fillContainer = FALSE)
#  options(DT.autoHideNavigation = FALSE)
#  options(DT.fillContainer = TRUE)

```

```{r read_csv}
 
# Notes:
# * I think tl is junk. I believe I've commented all of it out.

# tl <- read_csv("assets/top_languages0.csv")

# tl0 <- read_csv("assets/Top_Five_Languages.csv")
top5_lang_by_cd <- read_csv("output/Top_Five_Languages_2024_08_14.csv", show_col_types=FALSE)

# % of LEP/CVALEP population speaking a language in a community district 
# sums up to 100% across CD
# tl00 <- read_csv("assets/CVALEP_Languages_21.csv")
lang_by_cd <- read_csv("output/LEP_CVALEP_Languages_by_CD_2024_08_14.csv", show_col_types=FALSE)

# Total population in each CD, including population by LEP/CVALEP
# cecf > lep_by_cd
lep_by_cd <- read_csv("output/LEP_CVALEP_by_CD_2024_08_14.csv", show_col_types=FALSE)

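# Sort CD totals by community district code
lep_by_cd <- lep_by_cd[order(lep_by_cd$`boro_cd`),]
# Sort top-five languages by CD, then by descending share of the LEP population
top5_lang_by_cd <- top5_lang_by_cd[order(top5_lang_by_cd$boro_cd, -top5_lang_by_cd$perc_lang_lep),]
# Sort all languages by CD, then by descending share of the CVALEP population
lang_by_cd <- lang_by_cd[order(lang_by_cd$boro_cd, -lang_by_cd$perc_lang_cvalep),]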
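
# The note above says the language shares sum to 100% within each CD. A minimal
# sketch of how that could be checked, assuming the shares are stored as
# fractions (0-1). The helper name `shares_off_by_cd` and the 0.01 tolerance are
# hypothetical; this is not part of the original pipeline and is not called here.
shares_off_by_cd <- function(df, share_col, tol = 0.01) {
  # Sum the share column within each community district
  totals <- tapply(df[[share_col]], df$boro_cd, sum, na.rm = TRUE)
  # Keep only the districts whose total differs from 1 by more than `tol`
  totals[abs(totals - 1) > tol]
}
# Possible usage: shares_off_by_cd(lang_by_cd, "perc_lang_lep")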


```

```{r data, include=FALSE }
# Load in Data

# Load in Shapefiles for community districts
cecm <- read_sf("assets/shape/cec.shp")

# Compute the centroid of each community district polygon
centroids <- as.data.frame(st_centroid(cecm))
# Split the point geometry (which coerces to text such as "c(-73.9, 40.7)")
# into separate long/lat text columns
centroidsf <- centroids %>%
  separate(geometry, c('long', 'lat'), sep = ",")

# Strip the leftover "c(" and ")" characters and convert to numeric
centroidsf$long <- as.numeric(gsub("[\\c,(,]", "", centroidsf$long))
centroidsf$lat <- as.numeric(gsub("[\\),]", "", centroidsf$lat))
# Swap the 4th and 5th columns; the index vector assumes exactly five columns
centroidsf <- centroidsf[, c(1, 2, 3, 5, 4)]
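
# A possible alternative to the string parsing above (a sketch, not the method
# this file uses): sf::st_coordinates() returns point coordinates as a numeric
# matrix with columns X and Y, which are longitude and latitude when the layer
# uses a geographic CRS. The helper name `centroid_lon_lat` is hypothetical and
# is not called anywhere in this file.
centroid_lon_lat <- function(polygons) {
  # Centroid of each polygon, then its coordinates as a two-column matrix
  coords <- sf::st_coordinates(sf::st_centroid(sf::st_geometry(polygons)))
  data.frame(long = coords[, "X"], lat = coords[, "Y"], row.names = NULL)
}
# Possible usage: centroid_lon_lat(cecm)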

## Create lookup table: for each CD
# mapping polygons
cd_shp <- readOGR("assets/shape/cec.shp")
# Keep only CDs that have language data; this removes CDs with sparse populations (parks, cemeteries, airports)
cd_shp <- cd_shp[cd_shp$boro_cd %in% lang_by_cd$boro_cd,]

modzcta_sf <- "assets/zip/Modified Zip Code Tabulation Areas (MODZCTA).geojson" %>%
  st_read()

all_polls <- read_csv('assets/all_polls.csv', show_col_types=FALSE)


```

Column {data-width=550}
-------------------------------------
###  **Map of LEP and CVALEP Population by Community District**
```{r map_prep, include=FALSE}

# Formatted % CVALEP/LEP that will be displayed in the popup
top5_lang_by_cd$lang_popup_cvalep <- paste(sep='',
  top5_lang_by_cd$Language, ': ',
  percent(top5_lang_by_cd$perc_lang_cvalep, accuracy=.1)
)
top5_lang_by_cd$lang_popup_lep <- paste(sep='',
  top5_lang_by_cd$Language,
  ': ',
  percent(top5_lang_by_cd$perc_lang_lep, accuracy=.1)
)

# concatenate the % language CVALEP/LEP popup text
cvalep_lang_str_df <- top5_lang_by_cd[, c("boro_cd", "lang_popup_cvalep")] %>%
  group_by(boro_cd) %>%
  summarize(lang_popup_collapsed_cvalep = str_c(lang_popup_cvalep, collapse = "<br>"), .groups = 'drop')
lep_lang_str_df <- top5_lang_by_cd[, c("boro_cd", "lang_popup_lep")] %>%
  group_by(boro_cd) %>%
  summarize(lang_popup_collapsed_lep = str_c(lang_popup_lep, collapse = "<br>"), .groups = 'drop')

lep_by_cd <- merge(
  x=lep_by_cd,
  y=cvalep_lang_str_df,
  by.x='boro_cd',
  by.y='boro_cd'
)
lep_by_cd <- merge(
  x=lep_by_cd,
  y=lep_lang_str_df,
  by.x='boro_cd',
  by.y='boro_cd'
)


# cecz > lep_by_cd_shp
lep_by_cd_shp <- merge(x = cd_shp,
              y = lep_by_cd,
              by.x = "boro_cd",
              by.y = "boro_cd",
              all.x = TRUE)


lep_by_cd_shp$shape_area <- NULL
lep_by_cd_shp$shape_leng <- NULL
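
# The map labels need the district number on its own. A minimal sketch, assuming
# boro_cd is a three-digit code made of a one-digit borough code followed by a
# two-digit district number (e.g. 110 would be borough 1, district 10). That
# reading of boro_cd is an assumption, and the helper name `cd_number` is
# hypothetical; it is not called anywhere in this file.
cd_number <- function(boro_cd) {
  # The remainder after dividing by 100 keeps the last two digits as an integer
  as.integer(boro_cd) %% 100L
}
# Possible usage: cd_number(lep_by_cd_shp$boro_cd)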

## Load in CEC Poll Sites data
load(file = "assets/Poll_Sites_Geo.Rda")

# Load in ED, City Council Shapefiles
ed_sf <- read_sf(paste(Sys.getenv("cec_map"),
                       "assets/shape/geo_export_eec1d155-e38d-4c3e-85e6-9f606199d53d.shp",
                       sep = ""))

cc_sf <- read_sf('assets/shape/City_Council_Districts/geo_export_e063fae0-b31e-4b2c-9f9e-d886d63b06c5.shp')

```

```{r map, fig.alt = "An interactive map of NYC Community Districts that shades each district based on the percentage of the population that has limited English proficiency and is a citizen of voting age."}
pal <- colorNumeric(
  palette = "Blues",
  domain = lep_by_cd_shp$perc_cvalep_cd)

pal1 <- colorNumeric(
  palette = "Greens",
  domain = lep_by_cd_shp$perc_lep_cd)

pal0 <- colorFactor(c("navy", "red"), domain = c("Early Voting", "Election Day"))

lep_by_cd_shp %>%
  leaflet() %>%
  addProviderTiles("CartoDB") %>%
  setView(lat = 40.730610,
          lng = -73.935242,
          zoom = 11) %>%
  addMapPane("layer0", zIndex = 430) %>%
  addMapPane("layer1", zIndex = 420) %>%
  addMapPane("layer2", zIndex = 410) %>%
  addMapPane("layer3", zIndex = 400) %>%
  addMapPane("layer4", zIndex = 380) %>%
  addMapPane("layer5", zIndex = 360) %>%
  addPolygons(data = lep_by_cd_shp,
              fillColor = ~pal(lep_by_cd_shp$perc_cvalep_cd),
              color = "#444444", # you need to use hex colors
              weight = 1,
              smoothFactor = .5,
              options = pathOptions(pane = "layer2"),
              fillOpacity = .5,
              highlightOptions = highlightOptions(color = "white",
                                                  weight = 2,
                                                  bringToFront = TRUE),
              #dashArray = "3",
              group = "Percent CVALEP by CD",
              label = paste(sep='',
                lep_by_cd_shp$boro, ' Community District ',
                # extract the community district number: drop the borough digit,
                # then strip only a leading zero (so "10" stays "10", not "1")
                sub("^0", "", substring(lep_by_cd_shp$boro_cd, 2)),
                ': ', lep_by_cd_shp$cd_name
              ),
              popup = paste(sep='',
                lep_by_cd_shp$cd_name, "<br>",
                "<br>% LEP: ", percent(lep_by_cd_shp$perc_lep_cd, accuracy=.1), 
                "<br>% CVALEP: ", percent(lep_by_cd_shp$perc_cvalep_cd, accuracy=.1),
                "<br>2010 Population: ", format(lep_by_cd_shp$cd_pop, big.mark = ','),
                "<br><br>Top 5 CVALEP Languages:<br>", lep_by_cd_shp$lang_popup_collapsed_cvalep
              )
              ) %>%
  addPolygons(data = lep_by_cd_shp,
              fillColor = ~pal1(lep_by_cd_shp$perc_lep_cd),
              # fillColor = ~pal(lep_by_cd_shp$perc_cvalep_cd),
              color = "#444444", # you need to use hex colors
              weight = 1,
              smoothFactor = .5,
              options = pathOptions(pane = "layer2"),
              fillOpacity = .5,
              highlightOptions = highlightOptions(color = "white",
                                                  weight = 2,
                                                  bringToFront = TRUE),
              #dashArray = "3",
              group = "Percent LEP by CD",
              label = paste(sep='',
                lep_by_cd_shp$boro, ' Community District ',
                # extract the community district number: drop the borough digit,
                # then strip only a leading zero (so "10" stays "10", not "1")
                sub("^0", "", substring(lep_by_cd_shp$boro_cd, 2)),
                ': ', lep_by_cd_shp$cd_name
              ),
              popup = paste(sep='',
                lep_by_cd_shp$cd_name, "<br>",
                "<br>% LEP: ",percent(lep_by_cd_shp$perc_lep_cd, accuracy=.1),
                "<br>% CVALEP: ", percent(lep_by_cd_shp$perc_cvalep_cd, accuracy=.1),
                "<br>2010 Population: ", format(lep_by_cd_shp$cd_pop, big.mark = ','),
                "<br><br>Top 5 LEP Languages:<br>", lep_by_cd_shp$lang_popup_collapsed_lep
              )
              ) %>%
  addPolygons(
    data = cc_sf,
    fillColor = "white",
    color = "#000000", # you need to use hex colors
    weight = 2,
    options = pathOptions(pane = "layer4"),
    fillOpacity = .15,
    group = "City Council Districts",
    popup = ~paste0("Council District ", cc_sf$coun_dist)
  ) %>%
  addPolygons(
    data = modzcta_sf,
    fillColor = "white",
    color = "#000000", # you need to use hex colors
    weight = 2,
    options = pathOptions(pane = "layer4"),
    fillOpacity = .15,
    group = "Zip Code Tabulation Area",
    popup = ~paste0(
      "Zip Code Tabulation Area: ", modzcta_sf$label,
      "<br>", "Population Estimate (2010): ", modzcta_sf$pop_est
    )
  ) %>% addLegend(pal = pal,
            labFormat = labelFormat(
              suffix = "%",
              transform = function(x) 100 * x
            ),
            values = lep_by_cd_shp$perc_cvalep_cd,
            position = "bottomright",
            title = "CVALEP Population (%)",
            group = "Percent CVALEP by CD"
  ) %>% addLegend(pal = pal1,
            labFormat = labelFormat(
              suffix = "%",
              transform = function(x) 100 * x
            ),                  
            values = lep_by_cd_shp$perc_lep_cd,
            position = "bottomleft",
            title = "LEP Population (%)",
            group = "Percent LEP by CD") %>%
  addLayersControl(
    overlayGroups = c(
      "Percent LEP by CD",
      "Percent CVALEP by CD",
      "City Council Districts",
      "Zip Code Tabulation Area"
      )
    ) %>% hideGroup(c(
      "Percent LEP by CD",
      "Percent CVALEP by CD",
      "City Council Districts",
      "Zip Code Tabulation Area"
      )
    )


```

Column {data-width=450 .tabset}
-------------------------------------

### **Information about the _NYC Community Language Profiles_ tool**

 This Language Profiles tool was created by the [NYC Civic Engagement Commission](https://www1.nyc.gov/site/civicengagement/index.page) and the [NYC Office of Data Analytics](https://www.nyc.gov/content/oti/pages/data-analytics). The tool provides detailed information about the languages (other than English) that are spoken by 1.8 million city residents across each of New York's 59 community districts.

   
Community Boards can use the Community Language Profiles tool to refine outreach metrics and more effectively reach constituents who prefer to interact or receive services in a language other than English.

***



> _To access, display, and use the map tool on this page, begin by enabling a layer with the layer selector tool in the top right corner of the map pane. This will display tiles based on your selection, each with information that is displayed upon clicking (or tapping) a tile of interest._
   
> _There are also tables that display the information that is visualized in the map. These tables can be accessed, filtered or sorted, and even downloaded by clicking the headers that contain the word 'Table' above this information panel._
   

* New York City's population includes many unique and diverse groups. Many residents speak more than one language, and a number of them speak and understand a language other than English more fluently than English. This affects how these New Yorkers engage with their government, both within the voting booth and beyond. This tool explores two specific subsets of this diverse population:
  - NYC residents who have limited English proficiency, or the "LEP" population; and
  - NYC residents who are citizens of voting age and have limited English proficiency, or the "CVALEP" population.
  

* The data describing these residents comes from the 5-year [American Community Survey (ACS)](https://www.census.gov/programs-surveys/acs/microdata/access.html) data (2017-2021). For more information on how LEP and CVALEP speakers are defined in the data, please refer to the Glossary tab.
   

* New York City government bodies are mandated to provide language services in the City’s Local Law 30 eligible languages. However, this does not exclude other languages present in the City. Under NYC Executive Order 120, New Yorkers have the right to receive information in the language they feel most comfortable speaking, whatever that language is. The NYC Community Language Profiles map provides data for all languages spoken in the City by community district, and helps community boards, nonprofit groups, and other allies support the City’s efforts to encourage and facilitate civic participation in local government.
   




### **Chart: % LEP by Community District**

```{r plot}

# {r plot, fig.alt = "An interactive, horizontal bar chart that compares each district based on the percentage of the population that has limited English proficiency, based on Census data."}


plot_ly(
  data=lep_by_cd,
  y=~cd_name,
  x=~round(perc_lep_cd*100, digits=1), # digits must be a whole number; 1 keeps one decimal place
  type='bar',
  orientation='h'
) %>% config(displayModeBar = F) %>%
layout(title = "Percent of Population that is Limited in English<br>Proficiency (LEP). Source: 2021 5-year ACS",
       margin = list(l=50, r=50, b=50, t=50, pad=0),
       yaxis = list(title = "",
                    categoryorder = "total ascending",
                    categoryarray = ~perc_lep_cd,
                    dtick = 1,
                    htick = 1,
                    pad = 20),
       xaxis = list(title = "% of Total Population",
                    ticksuffix='%')
)

```

### **Table: LEP & CVALEP by CD**

```{r}

# {r, fig.alt = "A table that describes each of New York's Community Districts - including information on the population based on the 2010 Census as well as the number of residents that are limited in English proficiency or citizens of voting age."}


# NEED TO INCLUDE CD POP
datatable(
  lep_by_cd[, c('boro_cd', 'cd_name', 'cd_pop', 'lep_cd', 'perc_lep_cd', 'cvalep_cd', 'perc_cvalep_cd')],
  extensions = c('Scroller','Buttons'),
  rownames = NULL,
  colnames=c(
    'Community District', # boro_cd
    'Name', # 'cd_name' 
    '2010 Population', #'cd_pop', 
    'LEP Population Estimate', # 'lep_cd', 
    '% of Total Population that is LEP', # 'perc_lep_cd', 
    'CVALEP Population Estimate', # 'cvalep_cd', 
    '% of Total Population that is CVALEP' # 'perc_cvalep_cd'
    ),
  fillContainer = TRUE,
  options = list(
    # deferRender = TRUE,
    # scrollY = 300
    # scroller = TRUE,
    dom = 'Blfrtip',
    buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
    lengthMenu = list(c(10,25,50,-1),
                      c(10,25,50,"All"))
  )
) %>% 
  formatPercentage(c('perc_lep_cd', 'perc_cvalep_cd'), 2) %>%
  formatCurrency(
    columns=c('cd_pop', 'lep_cd', 'cvalep_cd'), 
    currency='', digits=0)
```

### **Table: LEP & CVALEP Languages by CD**

```{r}

# {r, fig.alt = "A table that describes each of the languages, including how many residents are estimated to speak a given language, that are spoken within each  Community District of New York City."}

datatable(
  lang_by_cd[, c('boro_cd', 'cd_name', 'Language', 'lep_by_lang_cd', 'perc_lang_lep', 'cvalep_by_lang_cd', 'perc_lang_cvalep')],
  rownames = NULL,
  colnames =  c(
    # 'boro_cd', 
    'Community District',
    # 'cd_name' 
    'Name', 
    'Language',
    # 'lep_by_lang_cd', 
    'LEP Population Estimate', 
    # 'perc_lang_lep', 
    '% LEP Population',
    # 'cvalep_by_lang_cd', 
    'CVALEP Population Estimate',
    # 'perc_lang_cvalep'
    '% CVALEP Population'
    ),
  extensions = c('Scroller','Buttons'),
  fillContainer = TRUE,
  options = list(
    # deferRender = TRUE,
    #scrollY = 300,
    #scroller = TRUE,
    dom = 'Blfrtip',
    buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
    lengthMenu = list(c(10,25,50,-1),
                     c(10,25,50,"All"))
  )
) %>% 
  formatPercentage(c('perc_lang_lep', 'perc_lang_cvalep'), 2) %>%
  formatCurrency(
    columns=c('lep_by_lang_cd', 'cvalep_by_lang_cd'), 
    currency='', digits=0)
```

### **Glossary of Terms**
| Term      | Description |
| ----------- | ----------- |
| American Community Survey (ACS)      | A U.S. Census Bureau survey, conducted every month of every year, that provides social, economic, housing, and population data on communities, including language(s) spoken at home and English proficiency. The ACS is sent to a random sample of addresses in the 50 states, District of Columbia, and Puerto Rico. The data is available for the nation, states, counties, and other geographic areas down to the block group level. ACS data helps local officials, community leaders, and businesses understand the changes taking place in their communities.       |
| Census tract   | Census tracts are small, relatively permanent statistical subdivisions of a county, and are defined for the purpose of taking the census.        |
| CVALEP   | Citizen of voting age with limited English proficiency.        |
| Limited English proficiency (LEP), as used by the ACS (see definition above)   | A person’s English proficiency as determined by the ACS. The ACS asks three questions to assess languages spoken by each person who lives in the home: whether the person speaks a language other than English; if yes, which language they speak; and how well they speak English. The question about English proficiency states: “How well does this person speak English?” People can answer “Very well,” “Well,” “Not well” or “Not at all.” LEP is defined as anyone who speaks a language other than English and does not answer “Very well.” |
| Community Boards   | Community boards are local representative bodies. There are 59 community boards, each representing a community district. Community boards have a formal role in the city charter to 1. improve the delivery of services; 2. plan and review land use in the community; and 3. make recommendations on the city’s budget.        |
| Community District   | One of the 59 areas of New York City, each represented by a community board. Districts vary in size and population.         |
| City Council District  | One of 51 districts throughout the five boroughs, each represented by an elected Council Member. |
| ZIP Code Tabulation Areas (ZCTAs)  | ZIP Code Tabulation Areas (ZCTAs) are generalized areal representations of United States Postal Service (USPS) ZIP Code service areas. The USPS ZIP Codes identify the individual post office or metropolitan area delivery station associated with mailing addresses. ZIP Code Tabulation Area (or ZCTA) is a trademark of the U.S. Census Bureau; ZIP Code is a trademark of the U.S. Postal Service. |