This dataset, boston_pts, is a data frame containing information on housing values
and neighborhood characteristics in the Boston area. It is based on the classic dataset
by Harrison and Rubinfeld (1978), corrected for minor errors and augmented with the latitude
and longitude of the observations by Gilley and Pace (1996). Gilley and Pace also note
that the MEDV variable is censored, with values at or over USD 50,000 set to USD 50,000.
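Because of this censoring, it can be useful to flag the capped observations before modelling MEDV. The following is a minimal sketch that assumes only the MEDV column described here. The helper flag_censored and the toy data frame are hypothetical stand-ins, used so the snippet runs without the package installed.

```r
# Flag observations whose MEDV sits at the censoring ceiling (50, in USD 1,000s).
# flag_censored is a hypothetical helper, not part of any package.
flag_censored <- function(df, ceiling = 50) {
  df$MEDV >= ceiling
}

# Toy stand-in with the same column name; the values are illustrative only.
toy <- data.frame(MEDV = c(24.0, 50.0, 21.6, 50.0))

censored <- flag_censored(toy)
sum(censored)  # number of censored observations in the toy data
```

With the dataset loaded, the same call would be flag_censored(boston_pts).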
Usage
data(boston_pts)
Format
A data frame with 506 observations and 20 variables:
- TOWN
Town name (factor with 92 levels)
- TOWNNO
Town number (integer)
- TRACT
Census tract number (integer)
- LON
Longitude (numeric)
- LAT
Latitude (numeric)
- MEDV
Median value of owner-occupied homes in USD 1,000s (numeric, censored at 50)
- CMEDV
Corrected median value of owner-occupied homes in USD 1,000s (numeric)
- CRIM
Per capita crime rate by town (numeric)
- ZN
Proportion of residential land zoned for lots over 25,000 sq.ft. (numeric)
- INDUS
Proportion of non-retail business acres per town (numeric)
- CHAS
Charles River dummy variable (factor: "1" = tract bounds the river, "0" = otherwise)
- NOX
Nitric oxides concentration (parts per 10 million, numeric)
- RM
Average number of rooms per dwelling (numeric)
- AGE
Proportion of owner-occupied units built prior to 1940 (numeric)
- DIS
Weighted distances to five Boston employment centers (numeric)
- RAD
Index of accessibility to radial highways (integer)
- TAX
Full-value property-tax rate per USD 10,000 (integer)
- PTRATIO
Pupil-teacher ratio by town (numeric)
- B
1000(Bk - 0.63)^2, where Bk is the proportion of Black residents by town (numeric)
- LSTAT
Percentage of the population of lower status (numeric)
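The B variable is a transformation rather than a raw proportion, so a short worked example may help. This sketch takes the formula 1000(Bk - 0.63)^2 from the entry above, with Bk read as a proportion between 0 and 1. The function name b_from_bk is hypothetical.

```r
# B as defined in the Format list, where bk is a proportion between 0 and 1.
# b_from_bk is a hypothetical helper used only for illustration.
b_from_bk <- function(bk) 1000 * (bk - 0.63)^2

b_from_bk(0)     # largest value on [0, 1]: 1000 * 0.63^2 = 396.9
b_from_bk(0.63)  # minimum of the transform: 0

# The transform is not one-to-one: proportions equally far from 0.63 give the same B.
all.equal(b_from_bk(0.53), b_from_bk(0.73))
```

Because two different proportions can map to the same value, Bk cannot be recovered uniquely from B.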
Details
The dataset consists of 506 observations and 20 variables, including socio-economic,
environmental, and housing characteristics. Geographic coordinates (longitude and latitude)
are provided for spatial analysis. Related data objects include boston.utm, a matrix
of tract point coordinates projected to UTM zone 19, and boston.soi, a sphere of
influence neighbors list.
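As an illustration of how the LON and LAT columns might be used, the sketch below builds a point layer with the sf package and projects it to UTM zone 19N. It rests on several assumptions not stated in this page: that sf is installed, that the coordinates can be treated as WGS84 longitude/latitude (EPSG:4326), and that EPSG:32619 is an acceptable target. The toy data frame is a hypothetical stand-in with made-up values.

```r
# Toy stand-in with the LON, LAT and CMEDV columns; values are illustrative, not real rows.
toy <- data.frame(
  LON   = c(-71.05, -71.10, -70.95),
  LAT   = c(42.36, 42.30, 42.25),
  CMEDV = c(24.0, 21.6, 34.7)
)

if (requireNamespace("sf", quietly = TRUE)) {
  # Assumption: coordinates are WGS84 longitude/latitude (EPSG:4326).
  pts <- sf::st_as_sf(toy, coords = c("LON", "LAT"), crs = 4326)

  # Project to WGS84 / UTM zone 19N (EPSG:32619) so distances are in metres.
  pts_utm <- sf::st_transform(pts, 32619)
}
```

With the dataset loaded, toy would be replaced by boston_pts. The datum of the original coordinates should be checked before relying on the assumed CRS.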
The dataset is named boston_pts to avoid confusion with other datasets in the R
ecosystem. The name distinguishes it as part of the lightsf package, and the suffix pts
indicates that the dataset includes spatial point information.
The original content has not been modified in any way.