Evaluating and Understanding the Geocoding of City Directories of Paris (1787-1914)

24 February 2026 | In the spotlight, PARIS, Publications

Data-Driven Geography of Urban Sprawl and Densification

As in other western cities, the fast-paced urban, industrial, and commercial sprawl of Paris during the 19th century provided the backdrop and driving force for the publishing phenomenon of trade directories. The authors show how these collections of millions of nominative entries associated with addresses can be turned into a serial dataset whose massive, fine-grained, and geolocated nature opens up new possibilities for quantitative and multi-scale analyses of the dynamics at play during one of the most dramatic socio-spatial transformations of the city.

They highlight the methodological conditions of such data-driven analyses and emphasize the importance of understanding source effects. The findings underscore the significance of data science in critically evaluating digital sources and adhering to best practices in the production of large historical datasets.

Excerpt

In the SoDUCo research project, we developed an automatic pipeline to extract, semantically annotate, geocode, and structure 144 directories of Paris published between 1787 and 1914. The process involved image segmentation to detect entries, OCR to extract their text, named entity recognition to annotate that text, and geocoding to assign geographic positions to the addresses in directory entries. The result is an open dataset of about 23 million records [GeoHistoricalData 2023].
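To make the later stages of such a pipeline concrete, here is a minimal Python sketch of the step from OCR text to geocoded records. It is not the SoDUCo implementation: the comma-separated entry format, the `Entry` fields, the function names, and the tiny gazetteer passed in by the caller are all assumptions made for illustration. A simple string split stands in for named entity recognition, and a dictionary lookup stands in for geocoding.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Position = Tuple[float, float]  # (longitude, latitude)


@dataclass
class Entry:
    """One directory entry (hypothetical structure, not the project's schema)."""
    name: str
    activity: str
    street: str
    number: str
    year: int
    position: Optional[Position] = None


def parse_entry(line: str, year: int) -> Optional[Entry]:
    """Stand-in for named entity recognition.

    Assumes the OCR text looks like "name, activity, street, number."
    Lines that do not have exactly four fields are rejected.
    """
    parts = [p.strip() for p in line.strip().rstrip(".").split(",")]
    if len(parts) != 4:
        return None
    name, activity, street, number = parts
    return Entry(name, activity, street, number, year)


def normalize_street(street: str) -> str:
    """Lowercase and collapse whitespace so lookups tolerate minor variation."""
    return " ".join(street.lower().split())


def geocode(entry: Entry, gazetteer: Dict[str, Position]) -> Entry:
    """Stand-in for geocoding: exact lookup of the street in a gazetteer.

    Unmatched streets keep position=None instead of receiving a guess.
    """
    entry.position = gazetteer.get(normalize_street(entry.street))
    return entry


def run_pipeline(lines: List[str], year: int,
                 gazetteer: Dict[str, Position]) -> List[Entry]:
    """Parse each OCR line and geocode the entries that parse successfully."""
    records = []
    for line in lines:
        entry = parse_entry(line, year)
        if entry is not None:
            records.append(geocode(entry, gazetteer))
    return records
```

Keeping unmatched addresses as records with no position, rather than discarding them, makes it possible to measure how geocoding success varies by year or by street, which is the kind of source effect the article asks readers to examine.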

The spatial and social dynamics of European cities in the 19th century are characterized by growth, sprawl, and socio-economic transformations resulting from industrialization. Growth and sprawl have mainly been investigated using either demographic sources, which depend on census rationale and administrative boundaries, or morphological information derived from city plans. Both types of sources are less detailed than directories in spatial and temporal scale.

For the first time, it is possible to study the dynamics of a European capital at a key moment in its history, with unprecedented spatio-temporal resolution and extent: address-level information for the whole city, roughly every year over more than a century. Adopting a data-driven geography perspective, we demonstrate that the Paris directories dataset is a valuable multi-scale and multi-granularity (spatial and temporal) research tool for analyzing the city’s urban growth throughout the 19th century. However, such a massive digital dataset can mask many biases and source effects. We show that an expert examination of the directories, and their extracted and geocoded content, helps to better understand the dynamics at work in the city’s urbanizing margins.

Inventory of address lists in directories and their processing in the pipeline chain (status in November 2023).

Gravier, J., Baciocchi, S., Cristofoli, P., Duménieu, B., Carlinet, E., Chazalon, J., Abadie, N., Tual, S., Perret, J., 2026. Evaluating and Understanding the Geocoding of City Directories of Paris (1787-1914): Data-Driven Geography of Urban Sprawl and Densification. Digital Humanities Quarterly, 19(4)

Julie Gravier is a permanent researcher at the French National Centre for Scientific Research and an associate researcher at Géographie-cités.
