How to determine what constitutes a greater urban area of a city using open street map open data

The case of Buenos Aires Greater Urban Area

Introduction

The problem

The data

OSM typology of residential streets

Data analysis

agba = gpd.GeoDataFrame.from_file(home + ‘/Downloads/paisNoOficial/rad_paisp.shp’)#retain only those radios the belong to a 'departamento' in AGBA
agba = agba.loc[agba.DEPARTAMENTO.isin(departamentos.values()),:]
#calculate density based on population
agba['densidad'] = agba.Pob / agba.AREA
agba['densidad'] = pd.qcut(agba.densidad,10,labels=False)
agba['mixYden'] = (agba.TIPO == 'M') | (agba.densidad == 0)
agbaAltaDens = agba.loc[~agba.mixYden,:]
agbaBajaDens = agba.loc[agba.mixYden,:]
calles = gpd.read_file(home + ‘/Downloads/buenos-aires_argentina.osm2pgsql-shapefiles/buenos-aires_argentina_osm_line.shp’)
calles = calles.loc[:,[‘highway’,’geometry’]]
mask = (calles.highway == 'residential') | (calles.highway == 'living_street') | (calles.highway == 'footway') | (calles.highway == 'pedestrian')
calles = calles.loc[mask,:]
#change projection in meters for POSGAR94 Argentina 5
calles = calles.to_crs(epsg=22185)calles['largo'] = calles.geometry.map(lambda x: x.length)calles = calles.loc[calles.largo <= 2000,:]
calles['buffer'] = calles.geometry.buffer(100)
calles.drop(['geometry'],axis=1,inplace=True)
calles.columns = [u'highway', u'geometry']
intersec = gpd.overlay(radiosSel, calle, how=’intersection’)
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(1,1,1)
radiosSel.plot(ax=ax,color='green')
calle.plot(ax=ax,color='red')
intersec2 = gpd.overlay(radiosSel, calle2, how=’intersection’)
appenDF = intersec2.append(intersec)
appenDFdiss = appenDF.dissolve(by=’LINK’)
primerID = calles.calleID.unique()[0]
counter = 0
for i in calles.calleID.unique():
#for each street
subset = calles.loc[calles.calleID==i,:]
unaLinea = subset.iloc[0:1,:]
conjuntoDeRadios = agbaBajaDens.loc[agbaBajaDens.LINK.isin(subset.LINK),:]
#intercect
if i == primerID:
intercept = gpd.overlay(conjuntoDeRadios, unaLinea, how=’intersection’)
else:
tmp = gpd.overlay(conjuntoDeRadios, unaLinea, how=’intersection’)
intercept = intercept.append(tmp)
counter += 1
primerLINKid = intercept.LINK.unique()[0]
for i in intercept.LINK.unique():
if i == primerLINKid:
dissolve = intercept.loc[intercept.LINK==i,:].dissolve(by=’LINK’).reset_index()
else:
try:
tmp = intercept.loc[intercept.LINK==i,:].dissolve(by=’LINK’).reset_index()
except:
#tomo los intercepts dentro de un radio por su LINK
tmp = intercept.loc[intercept.LINK==i,:]
#creo el primer elemento de la cascade_union
geoConError = tmp.geometry.iloc[0]
for i in range(1,tmp.shape[0]):
geoConError = shapely.ops.cascaded_union([geoConError,tmp.geometry.iloc[i]])
tmp = tmp.iloc[0:1,:].reset_index().drop([‘index’],axis=1)
tmp[‘geometry’][0] = geoConError

dissolve = dissolve.append(tmp)

The results

Where it worked

Factory and sports club area excluded from original radios
Railroad, stadiums and shopping malls excluded from original radios
Country clubs on the borders of the urban area

Where it didn’t

Airports and parks excluded, but with a erroneous type for the street in the middle of the park
Elongated shapes

References

Urban Data Scientist Twitter: @lephcero. Github: /alephcero