The boston_pts dataset is a data frame of housing values and neighborhood characteristics for census tracts in the Boston area. It is based on the classic dataset of Harrison and Rubinfeld (1978), corrected for minor errors and augmented with the latitude and longitude of each observation. Gilley and Pace note that the MEDV variable is censored: median values at or above USD 50,000 are recorded as USD 50,000.
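
Because of this censoring, analyses of MEDV often flag or set aside the capped tracts. The following is a minimal sketch of that step. It uses a small made-up data frame (`toy`, with invented values) in place of boston_pts so that it runs on its own; only the MEDV column name and the cap of 50 (in USD 1,000s) come from the description above.

```r
# Hypothetical stand-in for boston_pts; the values are invented for illustration.
toy <- data.frame(
  TRACT = c(1L, 2L, 3L, 4L),
  MEDV  = c(24.0, 50.0, 21.6, 50.0)
)

# MEDV is recorded in USD 1,000s, so the USD 50,000 cap appears as 50.
medv_cap <- 50

# TRUE where the recorded value sits at the cap and may understate the true median.
toy$censored <- toy$MEDV >= medv_cap

# Number of capped tracts in the toy data.
n_censored <- sum(toy$censored)

# Rows that are not affected by the cap.
uncensored <- toy[!toy$censored, ]
```

With the real data, the same comparison would be applied to boston_pts$MEDV.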

Usage

data(boston_pts)

Format

A data frame with 506 observations and 20 variables:

TOWN

Town name (factor with 92 levels)

TOWNNO

Town number (integer)

TRACT

Census tract number (integer)

LON

Longitude (numeric)

LAT

Latitude (numeric)

MEDV

Median value of owner-occupied homes in USD 1,000s (numeric, censored at 50)

CMEDV

Corrected median value of owner-occupied homes in USD 1,000s (numeric)

CRIM

Per capita crime rate by town (numeric)

ZN

Proportion of residential land zoned for lots over 25,000 sq. ft. (numeric)

INDUS

Proportion of non-retail business acres per town (numeric)

CHAS

Charles River dummy variable (factor: "1" if the tract borders the river, "0" otherwise)

NOX

Nitric oxides concentration (parts per 10 million, numeric)

RM

Average number of rooms per dwelling (numeric)

AGE

Proportion of owner-occupied units built prior to 1940 (numeric)

DIS

Weighted distances to five Boston employment centers (numeric)

RAD

Index of accessibility to radial highways (integer)

TAX

Full-value property-tax rate per USD 10,000 (integer)

PTRATIO

Pupil-teacher ratio by town (numeric)

B

Transformed proportion of Black residents, computed as 1000 * (Bk - 0.63)^2, where Bk is the proportion of Black residents by town (numeric)

LSTAT

Percentage of the population classed as lower status (numeric)
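
The formula behind B may be easier to read as code. The sketch below uses a hypothetical helper, `b_index()`, that applies 1000 * (Bk - 0.63)^2 to a proportion Bk; it only illustrates the formula and is not a function from any package. Because the expression is quadratic, two proportions that lie equally far from 0.63 yield the same B, so a B value cannot be converted back to a unique Bk.

```r
# Hypothetical helper: the B formula applied to a proportion bk in [0, 1].
b_index <- function(bk) {
  stopifnot(is.numeric(bk), all(bk >= 0 & bk <= 1))
  1000 * (bk - 0.63)^2
}

# Largest possible value on [0, 1], reached when bk = 0.
b_max <- b_index(0)

# Proportions equally far from 0.63 map to the same B.
b_low  <- b_index(0.53)
b_high <- b_index(0.73)
```

This ambiguity is worth keeping in mind when interpreting coefficients on B in a regression.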

Source

Data taken from the spData package version 2.3.4

Details

The variables cover socio-economic, environmental, and housing characteristics of each tract. The LON and LAT columns give geographic coordinates for spatial analysis. Related data objects in spData include boston.utm, a matrix of tract point coordinates projected to UTM zone 19, and boston.soi, a sphere-of-influence neighbors list.
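
As a small illustration of working with the LON and LAT columns directly, the sketch below computes a great-circle distance with the haversine formula in base R. The helper `haversine_km()`, the toy coordinates, and the mean Earth radius of 6371 km are assumptions made for illustration; for real spatial work a dedicated package such as sf would usually be the better choice.

```r
# Hypothetical helper: great-circle distance in km between points
# given in decimal degrees, using the haversine formula.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1.
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Invented coordinates standing in for two rows of boston_pts.
toy <- data.frame(
  LON = c(-71.00, -71.10),
  LAT = c(42.30, 42.30)
)

# Distance between the two toy points, in km.
d_km <- haversine_km(toy$LON[1], toy$LAT[1], toy$LON[2], toy$LAT[2])
```

The function is vectorized, so it could be applied to whole LON and LAT columns to measure distances from a single reference point.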

The name boston_pts was chosen to avoid confusion with other datasets in the R ecosystem and to mark this copy as part of the lightsf package. The suffix pts indicates that the dataset includes spatial point information. The content itself is unchanged from the original.