## Temperature
## Learning objectives
By the end of this practical you should be able to:
1. Explain and execute appropriate pre-processing steps of raster data
1. Replicate published methodologies using raster data
1. Design new R code to undertake further analysis
This week:
* [Appendix "Raster operations in R"](https://mgimond.github.io/Spatial/raster-operations-in-r.html) from Intro to GIS and Spatial Analysis by Gimond (2019)
* [Raster manipulation](https://rspatial.org/raster/spatial/8-rastermanip.html) from Spatial data science by Hijmans (2016). This last one is another tutorial --- it seems there aren't any decent free raster textbook chapters, let me know if you find one.
Remember this is just a starting point, explore the [reading list](https://rl.talis.com/3/ucl/lists/139FBAF8-DACD-60FB-8BDC-E9C9E09BA885.html?lang=en-GB&login=1), practical and lecture for more ideas.
## Introduction
Within this practical we are going to be using data from the Landsat satellite series, provided for free by the United States Geological Survey (USGS), to replicate published methods. Landsat imagery is the longest free temporal image repository of consistent medium resolution data. It collects data at each point on Earth every 16 days (temporal resolution) in a raster grid composed of 30 by 30 m cells (spatial resolution). Geographical analysis and concepts are becoming ever more entwined with remote sensing and Earth observation.
## Data
### Shapefile
The shapefile of Manchester is available from the data folder for this week on [GitHub](https://github.com/andrewmaclachlan/CASA0005repo/tree/master/prac8_data). To download this consult [How to download data and files from GitHub]; I'd use Option 1.
### Raster data (Landsat)
To download the data it's the same process we saw in week 1 on the USGS Earth Explorer website:
1. Enter Manchester in the address/place box > select the country as United Kingdom > click Show and then click on the word Manchester in the box that appears.
```{r echo=FALSE, out.width = "350pt", fig.align='center', cache=TRUE}
knitr::include_graphics('prac_7/earthexplorer.png')
```
1. Select the date range between 12/5/2019 and 14/5/2019 --- it's a US website so check the dates are correct.
1. Click dataset and select Landsat, then Landsat Collection 1 Level-1, and check Landsat 8 (Level-2 is surface reflectance --- see [Remote sensing background (optional)]).
1. Click results; there should be one image (GeoTIFF). Download it... it might take a while.
1. Landsat data comes zipped twice as a ```.tar.gz```. Use [7Zip](https://www.7-zip.org/) or another file extractor: extract it once to get a ```.tar```, then extract again and the files should appear. Alternatively, the code below will also let you extract Landsat data...
#### Alternative raster data
Occasionally the Earth Explorer website can go down for maintenance or during government shutdowns. If possible I strongly advise you to learn how to use its interface, as multiple other data providers have similar interfaces. GitHub also places a strict size limit of 100MB on files. However, to account for situations like this I've placed the zipped file on Google Drive and will demonstrate how to access it from R using the `googledrive` package.
This could be a great option for you to gain reproducibility points if you have large files that you can't upload to GitHub.
In Google Drive you need to ensure your file is shareable with others --- right click on it > Share > then copy the link. I have done this for my file in the example below, but if you try to replicate this, make sure you've done it too; otherwise it won't work when other people run your code, as they won't have access to the file on your Google Drive.
Depending on your internet speed this example might take some time...
Be sure to change the path to your practical 7 folder, make sure you include the filename within it, and set `overwrite` to `T` (or `TRUE`) if you are going to run this again.
```{r, cache=TRUE, eval=F}
library("googledrive")
o<-drive_download("https://drive.google.com/open?id=1MV7ym_LW3Pz3MxHrk-qErN1c_nR0NWXy",
path="prac_7/exampleGoogleDrivedata/LC08_L1TP_203023_20190513_20190521_01_T1.tar.gz",
overwrite=T)
```
Next we need to uncompress and unzip the file with `untar()`. First list the files that end in the extension `.gz`, then pass that to `untar()` with the pipe `%>%` --- remember the pipe basically means "after this function... then... do this other function with that data".
```{r, eval=F, cache=T}
library(tidyverse)
library(fs)
library(stringr)
library(utils)
listfiles<-dir_info(here::here("prac_7", "exampleGoogleDrivedata")) %>%
dplyr::filter(str_detect(path, ".gz")) %>%
dplyr::select(path)%>%
dplyr::pull()%>%
#print out the .gz file
print()%>%
as.character()%>%
utils::untar(exdir=here::here("prac_7", "exampleGoogleDrivedata"))
```
## Processing raster data
### Loading
Today, we are going to be using a Landsat 8 raster of Manchester. The vector shapefile for Manchester has been taken from an ESRI repository.
1. Let's load the majority of packages we will need here.
```{r message=FALSE, warning=FALSE, cache=TRUE}
## listing all possible libraries that all presenters may need following each practical
library(sp)
#library(raster)
library(rgeos)
library(rgdal)
library(rasterVis)
library(ggplot2)
library(terra)
library(sf)
```
1. Now let's list all our Landsat bands except band 8 (band 8 is the panchromatic band and we don't need it here) along with our study area shapefile. Each band is a separate ```.TIF``` file.
```{r message=FALSE, warning=FALSE, cache=TRUE}
library(stringr)
library(raster)
library(fs)
library(sf)
library(tidyverse)
# List your raster files excluding band 8 using the pattern argument
listlandsat<-dir_info(here::here("prac_7", "Lsatdata"))%>%
dplyr::filter(str_detect(path, "[B123456790].TIF")) %>%
dplyr::select(path)%>%
pull()%>%
as.character()%>%
# Load our raster layers into a stack
raster::stack()
# Load the manchester boundary
manchester_boundary <- st_read(here::here("prac_7",
"Manchester_boundary",
"manchester_boundary.shp"))
#check they have the same Coordinate Reference System (CRS)
crs(manchester_boundary)
crs(listlandsat)
```
### Clipping
1. Our raster is currently the size of the scene in which satellite data is distributed. To clip it to our study area it's best to first crop it to the extent of the shapefile and then mask it, as we have done in previous practicals...
```{r, cache=TRUE}
lsatmask <- listlandsat %>%
# now crop our temp data to the extent
terra::crop(.,manchester_boundary)%>%
terra::mask(., manchester_boundary)
```
1. If all we wanted to do was clip our data, we could now change our filenames in the raster stack and write the ```.TIFF``` files out again...
```{r, cache=TRUE}
# add mask to the filenames within the raster stack
names(lsatmask) <- names(lsatmask)%>%
str_c(.,
"mask",
sep="_")
# I need to write mine out in another location
outputfilenames <-
str_c("prac_7/Lsatdata/", "mask/", names(lsatmask) ,sep="")
```
In the first line of code I'm taking the original names of the raster layers and adding "mask" to the end of them. This is done using ```str_c()``` from the stringr package and the arguments
* ```names(lsatmask)```: original raster layer names
* ```"mask"```: what I want to add to the names
* ```sep="_"```: how the names and "mask" should be separated --- here, with an underscore
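If you haven't met ```str_c()``` before, here is a minimal sketch (the band names are made up purely for illustration):
```{r, eval=FALSE}
library(stringr)
# hypothetical layer names, just to show what str_c() does
str_c(c("B1", "B2", "B3"), "mask", sep = "_")
# returns: "B1_mask" "B2_mask" "B3_mask"
```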
As I can't upload my Landsat files to GitHub I'm storing them in a folder that is not linked (remember this is all stored on GitHub) --- so you won't find ```prac8_data/Lsatdata``` on there. If you want to store your clipped Landsat files in your project directory just use:
```{r eval=FALSE, cache=TRUE}
lsatmask %>%
terra::writeRaster(., names(lsatmask),
bylayer=TRUE,
format='raster',
overwrite=TRUE)
```
For me though it's:
```{r, cache=TRUE}
lsatmask %>%
terra::writeRaster(., outputfilenames,
bylayer=TRUE,
format='raster',
overwrite=TRUE)
```
Here I write out each raster layer individually by specifying ```bylayer=TRUE```. You can use either `format='GTiff'` or the native raster format from the `raster` package (`format='raster'`); it doesn't matter much, as most GIS software can read both.
## Data exploration
### More loading and manipulating
1. For the next stage of analysis we are only interested in bands 1-7; we can either load them back in from the files we just saved or take them directly from the original raster stack.
```{r, cache=TRUE}
# either read them back in from the saved file:
manc_files<-dir_info(here::here("prac_7", "Lsatdata", "mask")) %>%
dplyr::filter(str_detect(path, "[B1234567]_mask.grd")) %>%
dplyr::filter(str_detect(path, "B11", negate=TRUE))%>%
dplyr::select(path)%>%
pull()%>%
stack()
# or extract them from the original stack
manc<-stack(lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B1_mask,
lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B2_mask,
lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B3_mask,
lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B4_mask,
lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B5_mask,
lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B6_mask,
lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B7_mask)
# Name the bands based on where they sample the electromagnetic spectrum
names(manc) <- c('ultra-blue', 'blue', 'green', 'red', 'NIR', 'SWIR1', 'SWIR2')
```
1. If you want to extract specific information from a raster stack use:
```{r results="hide", eval=FALSE, cache=TRUE}
crs(manc) # projection
extent(manc) # extent
ncell(manc) # number of cells
dim(manc) # number of rows, columns, layers
nlayers(manc) # number of layers
res(manc) # xres, yres
```
### Plotting data
1. Let's actually have a look at our raster data, first in true colour (how humans see the world) and then as a false colour composite (using any combination of bands other than red, green and blue).
```{r, cache=TRUE}
# true colour composite
manc_rgb <- stack(manc$red, manc$green, manc$blue)
# false colour composite
manc_false <- stack(manc$NIR, manc$red, manc$green)
manc_rgb %>%
plotRGB(.,axes=TRUE, stretch="lin")
manc_false %>%
plotRGB(.,axes=TRUE, stretch="lin")
```
### Data similarity
1. What if you wanted to look at single bands and also check the similarity between bands?
```{r, cache=TRUE}
# Looking at single bands
plot(manc$SWIR2)
## How are these bands different?
#set the plot window size (2 by 2)
par(mfrow = c(2,2))
#plot the bands
plot(manc$blue, main = "Blue")
plot(manc$green, main = "Green")
plot(manc$red, main = "Red")
plot(manc$NIR, main = "NIR")
## Look at the stats of these bands
pairs(manc[[1:7]])
```
A low correlation between bands means they are sufficiently different in their wavelength reflectance to show different things in the image. We can also make this look a bit nicer with ```ggplot2``` and ```GGally```
```{r cache=FALSE, message=FALSE}
library(ggplot2)
library(GGally)
library(raster)
manc %>%
terra::as.data.frame(., na.rm=TRUE)%>%
dplyr::sample_n(., 100)%>%
ggpairs(.,axisLabels="none")
```
You can do much more using ```GGally``` have a look at the great [documentation](https://ggobi.github.io/ggally/#ggallyggpairs)
## Raster calculations
Now we will move on to raster analysis in order to compute temperature from this raster data. To do so we need to generate additional raster layers, the first of which is NDVI
### NDVI
Live green vegetation can be represented with the NIR and Red Bands through the normalised difference vegetation index (NDVI) as chlorophyll reflects in the NIR wavelength, but absorbs in the Red wavelength.
$$NDVI= \frac{NIR-Red}{NIR+Red}$$
### NDVI function
One of the great strengths of R is that it lets users define their own functions. Here we will practice writing a couple of basic functions to process some of the data we have been working with.
One of the benefits of a function is that it generalises some set of operations that can then be repeated over and again on different data... the structure of a function in R is given below:
```{r prac7_fun,eval=FALSE}
myfunction <- function(arg1, arg2, ... ){
statements
return(object)
}
```
We can use NDVI as an example...
1. Let's make a function called ```NDVIfun```
```{r, cache=TRUE}
NDVIfun <- function(NIR, Red) {
NDVI <- (NIR - Red) / (NIR + Red)
return(NDVI)
}
```
Here we have said our function needs two arguments, NIR and Red; the next line calculates NDVI based on the formula and returns it. To be able to use this function throughout our analysis, either copy it into the console or make a new R script, save it in your project, then call it within this code using the ```source()``` function e.g...
```{r eval=FALSE, cache=TRUE}
source('insert file name')
```
1. To use the function do so through...
```{r, cache=TRUE}
ndvi <- NDVIfun(manc$NIR, manc$red)
```
Here we call the function ```NDVIfun()``` and then provide the NIR and Red band.
1. Check the output
```{r, cache=TRUE}
ndvi %>%
plot(.,col = rev(terrain.colors(10)), main = "Landsat-NDVI")
# Let's look at the histogram for this dataset
ndvi %>%
hist(., breaks = 40, main = "NDVI Histogram", xlim = c(-.3,.8))
```
1. We can reclassify the raster to show us what is most likely to be vegetation based on the histogram, using the 3rd quartile --- anything above the 3rd quartile we assume is vegetation.
> Note, this is an assumption for demonstration purposes, if you were to do something similar in future analysis be sure to provide reasoning with linkage to literature (e.g. policy or academic)
```{r, cache=TRUE}
veg <- ndvi %>%
reclassify(., cbind(-Inf, 0.3, NA))
veg %>%
plot(.,main = 'Possible Veg cover')
```
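The chunk above hardcodes 0.3 as the threshold. As a sketch, you could instead derive the 3rd quartile from the data itself (the exact value will depend on your scene):
```{r, eval=FALSE}
# derive the 3rd quartile of the NDVI values rather than hardcoding 0.3
q3 <- quantile(values(ndvi), probs = 0.75, na.rm = TRUE)
veg_q3 <- ndvi %>%
  reclassify(., cbind(-Inf, q3, NA))
```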
1. Let's look at this in relation to Manchester as a whole
```{r, cache=TRUE}
manc_rgb %>%
plotRGB(.,axes = TRUE, stretch = "lin", main = "Landsat True Color Composite")
veg %>%
plot(., add=TRUE, legend=FALSE)
```
## Advanced raster calculations
The goal of this final section is to set up a mini investigation to see if there is a relationship between urban area and temperature. If our hypothesis is that there is a relationship then our null is that there is not a relationship...
### Calculating temperature from Landsat data
Here we are going to compute temperature from Landsat data --- there are many methods in the literature to do so, but we will use the one originally developed by Artis & Carnahan (1982), more recently summarised by Guha et al. (2018) and Avdan and Jovanovska (2016).
Some of the terms used are outlined in the terms section at the end of the document.
1. Calculate the Top of Atmosphere (TOA) spectral radiance from the Digital Number (DN) using:
$$\lambda= G_{rescale} \cdot QCAL + B_{rescale}$$
TOA spectral radiance is light reflected off the Earth as seen from the satellite, measured in radiance units.
In this equation $G_{rescale}$ and $B_{rescale}$ represent the gain and bias of the image, with QCAL the Digital Number (DN) --- how the raw Landsat image is captured. To go from DN to spectral radiance we use the calibration curve, created before the launch of the sensor. Bias is the spectral radiance of the sensor for a DN of 0; gain is the gradient of the slope for other values of DN. See page 84 in [RADIOMETRIC CORRECTION OF SATELLITE IMAGES: WHEN AND WHY RADIOMETRIC CORRECTION IS NECESSARY](https://www.ncl.ac.uk/tcmweb/bilko/module7/lesson3.pdf) for further information.
Grescale and Brescale are available from the ```.MTL``` file provided when you downloaded the Landsat data. Either open this file in notepad and extract the required values for band 10 gain (MULT_BAND) and bias (ADD_BAND)
...Or we can automate it using the ```readMeta()``` function within the ```RStoolbox``` package
```{r results="hide", warnings=FALSE, message=FALSE, cache=TRUE}
library(RStoolbox)
MTL<-dir_info(here::here("prac_7", "Lsatdata")) %>%
dplyr::filter(str_detect(path, "MTL.txt")) %>%
dplyr::select(path)%>%
pull()%>%
readMeta()
#To see all the attributes
head(MTL)
```
1. Now let's extract the values from the `MTL` variable for band 10... we can either use the function `getMeta()` from `RStoolbox` or just extract the values ourselves...
```{r, cache=TRUE}
offsetandgain <-MTL %>%
getMeta("B10_dn", metaData = ., what = "CALRAD")
offsetandgain
##OR
offsetandgain <- subset(MTL$CALRAD, rownames(MTL$CALRAD) == "B10_dn")
```
1. Run the calculation using the band 10 raster layer
```{r, cache=TRUE}
TOA <- offsetandgain$gain *
lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B10_mask +
offsetandgain$offset
```
1. Next convert the TOA to Brightness Temperature $T_b$ using the following equation:
$$T_b=\frac{K_2}{ln((K_1/\lambda)+1)}$$
Brightness temperature is the radiance travelling upward from the top of the atmosphere to the satellite in units of the temperature of an equivalent black body.
$K_1$ (774.8853) and $K_2$ (1321.0789) are pre-launch calibration constants provided by USGS.
Check the [handbook](https://prd-wret.s3-us-west-2.amazonaws.com/assets/palladium/production/atoms/files/LSDS-1574_L8_Data_Users_Handbook_v4.0.pdf) for these values
1. Instead of hardcoding these values...yep, you guessed it... we can extract them from our ```MTL```
```{r, cache=TRUE}
Calidata <- MTL$CALBT%>%
terra::as.data.frame()%>%
mutate(Band=rownames(.))%>%
filter(Band=="B10_dn")
# subset the columns
K1 <- Calidata %>%
dplyr::select(K1)%>%
pull()
K2 <- Calidata %>%
dplyr::select(K2)%>%
pull()
Brighttemp <- (K2 / log((K1 / TOA) + 1))
```
Earlier we calculated NDVI, let's use that to determine emissivity of each pixel.
1. First we need to calculate the fractional vegetation of each pixel, through the equation:
$$F_v= \left( \frac{NDVI - NDVI_{min}}{NDVI_{max}-NDVI_{min}} \right)^2$$
```{r, cache=TRUE}
# brackets matter here: (NDVI - NDVImin) / (NDVImax - NDVImin), all squared
facveg <- ((ndvi - 0.2) / (0.5 - 0.2))^2
```
Fractional vegetation cover is the ratio of vertically projected area of vegetation to the total surface extent.
Here, $NDVI_{min}$ is the minimum NDVI value (0.2) where pixels are considered bare earth and $NDVI_{max}$ is the value at which pixels are considered healthy vegetation (0.5)
1. Now compute the emissivity using:
$$\varepsilon = 0.004*F_v+0.986$$
```{r, cache=TRUE}
emiss <- 0.004*facveg+0.986
```
Emissivity is the ratio of absorbed radiation energy to total incoming radiation energy compared to a blackbody (which would absorb everything); it is a measure of absorptivity.
1. Great, we're nearly there... get our LST following the equation from Weng et al. 2004 (also summarised in Guha et al. (2018) and Avdan and Jovanovska (2016)):
$$LST= \frac{T_b}{1+(\lambda T_b / p)\ln\varepsilon}$$
Where:
$$p= h\frac{c}{\varrho}$$
Ok, don't freak out....let's start with calculating $p$
Here we have:
* $h$ which is Planck's constant, $6.626 \times 10^{-34}\ Js$
* $c$ which is the velocity of light in a vacuum, $2.998 \times 10^{8}\ m/s$
* $\varrho$ which is the Boltzmann constant, $1.38 \times 10^{-23}\ J/K$
```{r, cache=TRUE}
# note: in R, 1.38e-23 already means 1.38 x 10^-23
# (writing 1.38*10e-23 would give a value ten times too large)
Boltzmann <- 1.38e-23
Plank <- 6.626e-34
c <- 2.998e8
p <- Plank*(c/Boltzmann)
```
Now for the rest of the equation....we have the values for:
* $\lambda$ which is the effective wavelength of our data (10.9 µm, i.e. $1.09 \times 10^{-5}$ m, for Landsat 8 band 10)
* $\varepsilon$ emissivity
* $T_b$ Brightness Temperature
1. Run the equation with our data
```{r, cache=TRUE}
#define remaining variables
lambda <- 1.09e-5
#run the LST calculation
LST <- Brighttemp/(1 +(lambda*Brighttemp/p)*log(emiss))
# check the values
LST
```
1. Are the values very high?... That's because we are in Kelvin, not degrees Celsius... let's fix that and plot the map
```{r, cache=TRUE}
LST <- LST-273.15
plot(LST)
```
Nice, that's our temperature data sorted.
## Calculating urban area from Landsat data
How about we extract some urban area using another index and then see how our temperature data is related?
We will use the Normalized Difference Built-up Index (NDBI) algorithm for identification of built up regions, originally proposed by Zha et al. (2003), which uses the Short-wave Infrared (SWIR) and Near-Infrared (NIR) reflective bands.
It is very similar to our earlier NDVI calculation but using different bands...
$$NDBI= \frac{SWIR - NIR}{SWIR + NIR}$$
In Landsat 8 data the SWIR is band 6 and the NIR band 5
1. Let's compute this index now...
```{r, cache=TRUE}
NDBI=((lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B6_mask-
lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B5_mask)/
(lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B6_mask+
lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B5_mask))
```
But do you remember our function? ...Well, this is the same calculation we used there, just with different raster layers (or bands), so we can reuse it...
```{r, cache=TRUE}
NDBIfunexample <- NDVIfun(lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B6_mask,
lsatmask$LC08_L1TP_203023_20190513_20190521_01_T1_B5_mask)
```
## Urban area and temperature relationship
1. We could plot the variables against each other, but there are a lot of data points
```{r, cache=TRUE}
plot(values(NDBI), values(LST))
```
This is termed the overplotting problem. So, let's just take a random subset of the same pixels from both raster layers.
1. To do so we need to again stack our layers
```{r, cache=TRUE}
# stack the layers
computeddata <- LST%>%
stack(.,NDBI)%>%
terra::as.data.frame()%>%
na.omit()%>%
# take a random subset
dplyr::sample_n(., 500)%>%
dplyr::rename(Temp="layer.1", NDBI="layer.2")
# check the output
plot(computeddata$Temp, computeddata$NDBI)
```
1. Let's jazz things up, load some more packages
```{r message=FALSE, warning=FALSE, cache=TRUE}
library(plotly)
library(htmlwidgets)
```
1. Transform the data to a data.frame to work with `ggplot`, then plot
```{r, cache=TRUE}
heat<-ggplot(computeddata, aes(x = NDBI, y = Temp))+
  geom_point(alpha=1, colour = "#51A0D5")+
  # x is the urban index (NDBI) and y is temperature,
  # so the axis labels need to match
  labs(x = "Urban index (NDBI)",
       y = "Temperature",
       title = "Manchester urban and temperature relationship")+
  geom_smooth(method='lm', se=FALSE)+
  theme_classic()+
  theme(plot.title = element_text(hjust = 0.5))
# interactive plot
ggplotly(heat)
```
It's a masterpiece!
```{r echo=FALSE, out.width = "450pt", fig.align='center', cache=TRUE, fig.cap="ggplot2 masterpiece. Source: [Allison Horst data science and stats illustrations](https://github.com/allisonhorst/stats-illustrations)"}
knitr::include_graphics('allisonhorst_images/ggplot2_masterpiece.png')
```
1. How about plotting the whole dataset rather than a random subset...
```{r message=FALSE, warning=FALSE, cache=TRUE}
computeddatafull <- LST%>%
stack(.,NDBI)%>%
terra::as.data.frame()%>%
na.omit()%>%
# no subset this time --- keep every pixel
dplyr::rename(Temp="layer.1", NDBI="layer.2")
hexbins <- ggplot(computeddatafull,
aes(x=NDBI, y=Temp)) +
geom_hex(bins=100, na.rm=TRUE) +
labs(fill = "Count per bin")+
geom_smooth(method='lm', se=FALSE, size=0.6)+
theme_bw()
ggplotly(hexbins)
```
## Statistical summary
1. To see if our variables are related let's run some basic correlation
```{r, cache=TRUE}
library(rstatix)
Correlation <- computeddatafull %>%
cor_test(Temp, NDBI, use = "complete.obs", method = c("pearson"))
Correlation
```
Let's walk through the results here...
* p-value: tells us whether there is a statistically significant correlation between the datasets; if p<0.05 we can reject the null hypothesis at the 95% confidence level.
* cor: Product moment correlation coefficient
* conf.low and conf.high intervals: we are 95% confident that the population correlation coefficient is within this interval
* statistic value (or t, or test statistic)
We can work out the critical t value using:
```{r, cache=TRUE}
abs(qt(0.05/2, 198268))
```
Within this formula
* 0.05 is the significance level (for 95% confidence)
* 2 means a two-sided test
* 198268 is the degrees of freedom (df), being the number of values we have minus 2
```{r, cache=TRUE}
computeddatafull %>%
pull(Temp)%>%
length()
nrow(computeddatafull)
```
Here, as our t value is greater than the critical value, we can say that there is a relationship between the datasets. However, we would normally report the p-value...
As p<0.05, this shows that our variables have a statistically significant correlation... so as urban area (assuming the index is representative) per pixel increases, so does temperature... therefore we can reject our null hypothesis... but remember that this does not imply causation!!
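As a quick sanity check, a minimal sketch comparing the test statistic from `cor_test()` with the critical value directly (the `statistic` column comes from the `rstatix` output above):
```{r, eval=FALSE}
# TRUE if the observed t exceeds the critical t at the 5% level
abs(Correlation$statistic) > abs(qt(0.05/2, nrow(computeddatafull) - 2))
```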
If you want more information on statistics in R go and read [YaRrr! A Pirate's Guide to R](https://bookdown.org/ndphillips/YaRrr/), chapter 13 on hypothesis tests.
## LSOA/MSOA stats
This is all useful, but what if we wanted to present this analysis to local leaders in Manchester to show which areas experience the highest temperatures (remember the limitation that this is only one day of data)?
Next, we will aggregate our raster data to LSOAs, taking a mean of the pixels in each LSOA.
LSOA data: https://data-communities.opendata.arcgis.com/datasets/5e1c399d787e48c0902e5fe4fc1ccfe3
MSOA data: https://data.cambridgeshireinsight.org.uk/dataset/output-areas/resource/0e5ac3b8-de71-4123-a334-0d1506a50288
Let's prepare the new spatial data:
```{r, warning=FALSE, message=FALSE}
library(dplyr)
library(sf)
# read in LSOA data
UK_LSOA <- st_read(here::here("prac_7",
"LSOA",
"Lower_Super_Output_Area_(LSOA)_IMD_2019__(OSGB1936).shp"))
# project it to match Manchester boundary
UK_LSOA <- UK_LSOA %>%
st_transform(., 32630)
# read in MSOA and project it
MSOA <- st_read(here::here("prac_7",
"MSOA",
"Middle_Layer_Super_Output_Areas_December_2011_Generalised_Clipped_Boundaries_in_England_and_Wales.shp")) %>%
st_transform(., 32630)
#select only MSOA within boundary
manchester_MSOA <- MSOA[manchester_boundary, , op=st_within]
#select only LSOA that intersect MSOA
manchester_LSOA <- UK_LSOA[manchester_MSOA,]
```
Next, we need to do the extraction with `extract()`. The `fun` argument specifies how to summarise the pixels within the spatial unit (LSOA), `na.rm=TRUE` ignores NA values and `df=TRUE` outputs the result to a dataframe.
```{r, warning=FALSE, message=FALSE}
# extract mean LST value per LSOA
LST_per_LSOA <- terra::extract(LST, manchester_LSOA, fun=mean, na.rm=TRUE, df=TRUE)
# add the LSOA ID back
LST_per_LSOA$FID<-manchester_LSOA$FID
# join the average temp to the sf
manchester_LSOA_temp <- manchester_LSOA %>%
left_join(.,
LST_per_LSOA,
by="FID")%>%
dplyr::rename(temp=layer)
```
Now we have the temperature per LSOA, but what about the amount of land considered urban? Here, we will assume that any NDBI value above 0 means the whole pixel is considered urban. `extract()` can also be used to get all the pixels within each spatial area (LSOA)...
```{r, warning=FALSE, message=FALSE}
#define urban as NDBI greater than 0
NDBI_urban<- NDBI > 0
# Sum the pixels that are greater than 0 per LSOA
NDBI_urban_per_LSOA <- terra::extract(NDBI_urban, manchester_LSOA, na.rm=TRUE, df=TRUE, fun=sum)
# list the pixels per LSOA
NDBI_per_LSOA_cells <- terra::extract(NDBI_urban, manchester_LSOA, na.rm=TRUE, df=TRUE, cellnumbers=TRUE)
#count the pixels per LSOA
NDBI_per_LSOA2_cells<- NDBI_per_LSOA_cells %>%
count(ID)
#add the LSOA ID to the urban area
NDBI_urban_per_LSOA$FID<-manchester_LSOA$FID
#add the LSOA ID to the number of cells
NDBI_per_LSOA2_cells$FID<-manchester_LSOA$FID
#join these two
Urban_info_LSOA <- NDBI_urban_per_LSOA %>%
left_join(.,
NDBI_per_LSOA2_cells,
by="FID")
# remove what you don't need and rename
Urban_info_LSOA_core_needed <- Urban_info_LSOA %>%
dplyr::rename(urban_count=layer,
LSOA_cells=n) %>%
dplyr::select(urban_count,
LSOA_cells,
FID)%>%
dplyr::mutate(percent_urban=urban_count/LSOA_cells*100)
# join the data
# one sf with temp and % urban per LSOA
manchester_LSOA_temp_urban <- manchester_LSOA_temp %>%
left_join(.,
Urban_info_LSOA_core_needed,
by="FID")
```
## Mapping
Now, we could map both temperature and the % of urban area within an LSOA individually... In our map we want to include some place names from OpenStreetMap, so let's get those from [OSM data stored on Geofabrik](https://download.geofabrik.de/europe/great-britain/england/greater-manchester.html)
```{r, warning=FALSE, message=FALSE}
Places <- st_read(here::here("prac_7",
"OSM",
"gis_osm_places_free_1.shp")) %>%
st_transform(., 32630)
manchester_Places <- Places[manchester_boundary,]%>%
filter(fclass=="city")
```
Let's make a map with the `tmap` package, like we did earlier in CASA0005. Remember, to add a caption in R Markdown include the argument `fig.cap="caption here"` in the code chunk header.
```{r, fig.cap="Average temperature per LSOA in Manchester, calculated from Landsat imagery dated 13/5/19, following the methodology specified by Guha et al. 2018", warning=FALSE, message=FALSE}
# this first bit makes the box bigger
# so we can have a north arrow not overlapping the data
# see: https://www.jla-data.net/eng/adjusting-bounding-box-of-a-tmap-map/
bbox_new <- st_bbox(manchester_LSOA_temp_urban) # current bounding box
yrange <- bbox_new$ymax - bbox_new$ymin # range of y values
bbox_new[4] <- bbox_new[4] + (0.1 * yrange) # ymax - top
bbox_new[2] <- bbox_new[2] - (0.1 * yrange) # ymin - bottom
# the plot starts here
library(tmap)
tmap_mode("plot")
# set the new bbox
# remove bbox=bbox_new to see the difference
tm1 <- tm_shape(manchester_LSOA_temp_urban, bbox = bbox_new) +
tm_polygons("temp",
palette="OrRd",
legend.hist=TRUE,
title="Temperature")+
tm_shape(manchester_Places, bbox=bbox_new)+
tm_dots(size=0.1, col="white")+
tm_text(text="name", size=0.75, ymod=-0.5, col="white", fontface = "bold")+
#tm_legend(show=FALSE)+
tm_layout(frame=FALSE,
legend.outside=TRUE)+
tm_compass(type = "arrow", size=1, position = c("left", "top")) +
tm_scale_bar(position= c("left", "bottom"), breaks=c(0,2,4), text.size = .75)
#tm_credits("(a)", position=c(0,0.85), size=1.5)
tm1
```
This could also be repeated for urban area and plotted in one figure, perhaps with an inset map like we previously made. But we can also now plot two choropleth maps over each other; this is made much easier with the `biscale` package. However, we must now use `ggplot` as opposed to `tmap` --- you can force `tmap` to do it, but the feature is still in development.
## Bivariate mapping (optional)
Whilst bivariate maps look cool, they don't tell us anything about the actual data values; they simply divide the data into classes based on the style parameter we have seen before --- jenks, equal and so on. So here I want to produce:
* A central bivariate map of LSOAs within Manchester
* The central bivariate will have the MSOA boundaries and some place names
* Two plots showing the distribution of the data
```{r, warning=FALSE, message=FALSE}
library(biscale)
library(cowplot)
library(sysfonts)
library(extrafont)
library(showtext) # more fonts
#font_add_google("Lato", regular.wt = 300, bold.wt = 700) # I like using Lato for data viz (and everything else...). Open sans is also great for web viewing.
showtext_auto()
# create classes
data <- bi_class(manchester_LSOA_temp_urban, x = temp, y = percent_urban, style = "jenks", dim = 3)
#ggplot map
map <- ggplot() +
geom_sf(data = data, mapping = aes(fill = bi_class), color=NA, lwd = 0.1, show.legend = FALSE) +
bi_scale_fill(pal = "DkViolet", dim = 3) +
geom_sf(data = manchester_MSOA, mapping = aes(fill=NA), color="black", alpha=0, show.legend = FALSE)+
geom_sf(data=manchester_Places, mapping=aes(fill=NA), color="white", show.legend = FALSE)+
geom_sf_text(data=manchester_Places, aes(label = name, hjust = 0.5, vjust = -0.5),
nudge_x = 0, nudge_y = 0,
fontface = "bold",
color = "white",
show.legend = FALSE,
inherit.aes = TRUE)+
labs(
title = "",
x="", y=""
) +
bi_theme()
legend <- bi_legend(pal = "DkViolet",
dim = 3,
xlab = "Temperature ",
ylab = "% Urban",
size = 8)
credit<- ("Landsat derived temperature and urban area, taken 13/5/19")
# combine map with legend
finalPlot <- ggdraw() +
draw_plot(map, 0, 0, 1, 1) +
draw_plot(legend, 0.1, 0.1, 0.2, 0.2)
#draw_text(credit, 0.68, 0.1, 0.2, 0.2, size=10)
finalPlot
```
That's the main bivariate plot, now to the side plots...
```{r, message=FALSE, warning=FALSE}
urban_box<-ggplot(data, aes(x=bi_class, y=percent_urban, fill=bi_class)) +
geom_boxplot()+
scale_fill_manual(values=c("#CABED0", "#BC7C8F", "#806A8A", "#435786", "#AE3A4E", "#77324C", "#3F2949", "#3F2949"))+
labs(x="Bivariate class (temp, urban)",
y="Urban %")+
theme_light()+
theme(legend.position="none") # Remove legend
temp_violin<-ggplot(data, aes(x=bi_class, y=temp, fill=bi_class))+
geom_violin()+
scale_fill_manual(values=c("#CABED0", "#BC7C8F", "#806A8A", "#435786", "#AE3A4E", "#77324C", "#3F2949", "#3F2949"))+
labs(x="",
y="Temperature")+
guides(fill=guide_legend(title="Class"))+
theme_light()+
theme(legend.position="none") # Remove legend
```
Join them all together in two steps --- make the side plots, then join them to the main plot
```{r, message=FALSE, warning=FALSE}
side <- plot_grid(temp_violin, urban_box, labels=c("B","C"),label_size = 12, ncol=1)
all <- plot_grid(finalPlot, side, labels = c('A'), label_size = 12, ncol = 2, rel_widths = c(2, 1))
```
Now this looks a bit rubbish on the page. Usually you might want to export the map to a `.png` to use elsewhere, but you will need to resize it. I like to plot the map in the console (type `all` in the console then press enter), click Export in the Plots tab, resize the image, note down the numbers and then input them below...
```{r, echo=TRUE, message=FALSE, warning=FALSE}
all
dev.copy(device = png, filename = here::here("prac_7", "bivaraite.png"), width = 687, height = 455)
dev.off()
```
```{r echo=FALSE, out.width = "600", fig.align='center', cache=TRUE}
knitr::include_graphics(here::here("prac_7", "bivaraite.png"))
```
### Map notes
* The Manchester outline shapefile doesn't match up with either the MSOA or LSOA boundaries; this could be solved by making sure the boundary data (e.g. city outline) matches the other spatial units.
* When selecting the data (within the Manchester boundary), any MSOA shape that was within the boundary was selected
* Then any LSOA that intersected the MSOA layer was selected
* To resolve this you could clip the MSOA layer to the boundary to force a clean edge around the map --- see the sketch after this list
* A box plot was used as the data was very spread out in the urban % plot, so a violin plot didn't show much.
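A minimal sketch of that clipping fix, assuming both layers are in the same CRS (they were both transformed to EPSG 32630 earlier):
```{r, eval=FALSE}
library(sf)
# clip the MSOA polygons to the Manchester boundary so the
# outer edge of the map follows the city outline
manchester_MSOA_clipped <- manchester_MSOA %>%
  st_intersection(., manchester_boundary)
```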
## Considerations
If you wanted to explore this type of analysis further then you would need to consider the following:
* Other methods for extracting temperature from Landsat data
* Other methods for identifying urban area from Landsat data, such as image classification
* The formula used to calculate emissivity --- there are many
* The use of raw satellite data as opposed to data with the effects of the atmosphere removed. Within this practical we have only used relative spatial indexes (e.g. NDVI). However, if you were to use alternative methods it might be more appropriate to use surface reflectance data (also provided by USGS).
## References
Thanks to former CASA graduate student Dr Matt Ng for providing the outline to the start of this practical
Avdan, U. and Jovanovska, G., 2016. Algorithm for automated mapping of land surface temperature using LANDSAT 8 satellite data. Journal of Sensors, 2016.
Guha, S., Govil, H., Dey, A. and Gill, N., 2018. Analytical study of land surface temperature with NDVI and NDBI using Landsat 8 OLI and TIRS data in Florence and Naples city, Italy. European Journal of Remote Sensing, 51(1), pp.667-678.
Weng, Q., Lu, D. and Schubring, J., 2004. Estimation of land surface temperature–vegetation abundance relationship for urban heat island studies. Remote sensing of Environment, 89(4), pp.467-483.
Young, N.E., Anderson, R.S., Chignell, S.M., Vorster, A.G., Lawrence, R. and Evangelista, P.H., 2017. A survival guide to Landsat preprocessing. Ecology, 98(4), pp.920-932.
Zha, Y., Gao, J. and Ni, S., 2003. Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. International journal of remote sensing, 24(3), pp.583-594.
## Remote sensing refresher
Landsat sensors capture reflected solar energy, convert it to radiance, then rescale this into a Digital Number (DN) representing the intensity of the electromagnetic radiation per pixel. The range of possible DN values depends on the sensor's radiometric resolution. For example, Landsat Thematic Mapper 5 (TM) measures between 0 and 255 (termed 8 bit), whilst Landsat 8 OLI collects 12-bit data that is delivered rescaled to 16-bit integers (0 to 65535). These DN values can then be converted into Top of Atmosphere (TOA) radiance and TOA reflectance [through available equations and known constants](https://landsat.usgs.gov/landsat-8-l8-data-users-handbook-section-5) that are preloaded into certain software. The former is how much light the instrument sees in meaningful units, whilst the latter removes the effects of the light source. However, TOA reflectance is still influenced by atmospheric effects. These atmospheric effects can be removed through atmospheric correction, achievable in software such as ENVI and QGIS, to give surface reflectance: the ratio of the amount of light leaving a target to the amount of light striking it.
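To make the DN to radiance rescale concrete, here is a toy sketch --- the gain, bias and DN values are made-up stand-ins for the MULT_BAND and ADD_BAND values you would read from a real MTL file:
```{r, eval=FALSE}
gain <- 0.0003342  # hypothetical MULT_BAND rescaling factor
bias <- 0.1        # hypothetical ADD_BAND offset
DN   <- 25000      # a hypothetical 16-bit pixel value
TOA_radiance <- gain * DN + bias
TOA_radiance
```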
We must also consider the spectral resolution of satellite imagery. Landsat 8 (OLI and TIRS) has 11 spectral bands and as a result is a multispectral sensor. As humans we see in the visible part of the electromagnetic spectrum (red-green-blue) --- this would be three bands of satellite imagery --- however satellites can take advantage of the rest of the spectrum. Each band of Landsat measures in a certain part of the spectrum to produce a DN value. We can then combine these values to produce 'colour composites', so a 'true' colour composite is where the red, green and blue Landsat bands are displayed (the visible spectrum). Based on the differing DN values obtained, we can pick out the unique signature (the values across all spectral bands) of each land cover type, termed its spectral signature.
For more information read [Young et al. (2017) A survival guide to Landsat preprocessing](https://esajournals.onlinelibrary.wiley.com/doi/pdf/10.1002/ecy.1730)