#install.packages("tidyverse") # Instalar se necessário
library(tidyverse)
library(gt)
Aula 2 - Introdução ao tidyverse
Aula 2
Dr. Gabriel Bertolini
2º semestre 2023 - PPGGEO UFRGS
Oficina de programação
Aula introdutória sobre tidyverse.
Bibliotecas
Função install.packages pode ser usada para baixar arquivos do CRAN - pacotes da comunidade “oficiais” que estão alinhados as praticas e estilo de programação padronizadas do R. Pacotes do github pode ser instalados, utilizando uma função do pacote devtools: devtools::install.github(“link_do_github_exemplo:gabertol/ztR”)
Os pacotes só precisam ser instalados uma vez, e carregados sempre que necessário usando a função library().
1. Funções de R base úteis
<-c(1.1,4.2,2.9,9,1,4,5,0.0001,NA)
vetor
# média
mean(vetor,na.rm=TRUE) #média
[1] 3.400012
# desvio padrão
sd(vetor)
[1] NA
# minimo
min(vetor)
[1] NA
# maximo
max(vetor)
[1] NA
# extensão
length(vetor)
[1] 9
# cria sequencia de valores
<-seq(from=pi,
Ato=pi^2,
by=pi/4)
length(A)
[1] 9
# gerador de números com distribuição normal
<-rnorm(n=1000000,mean=350,sd=5)
PER
mean(PER)
[1] 349.9957
sd(PER)
[1] 5.000009
# gerador de numeros randomicos
rep(c("A","B","C"), 2)
[1] "A" "B" "C" "A" "B" "C"
2. Introdução ao tidyverse
A biblioteca guarda-chuva tidyverse inclui uma serie de pacotes, cada uma para seu fim. Por hora, vamos explorar o pacote dplyr que trata de manipulação básica de bancos de dados. Para isso, vamos primeiramente importar um banco de dados. Para importar, vamos usar a função read.csv
# Importar e criar dataframe (tibble)
<-read.csv("./data/geoquimica_granito.csv") %>%
BDas_tibble()# tibble é um tipo particular de dataframe do tidyverse, que facilita a leitura no prompt
Para observar os valores do dataframe no console, pode ser executar o nome do objeto, no caso BD. Abaixo, usamos a função gt() do pacote gt para tabela interativa.
%>%
BD gt_preview()
X.1 | X | Tectonic_setting | Location_notes | Pluton | Colour | Symbol | Size_rel | Size | SubGroup | Group | DebonPQ | TASMiddlemostPlut | Villaseca | QANOR | LaRoche | Geol_unit_notes | Rock_type_notes | Ref | Age | Latitude | Longitude | SiO2 | TiO2 | Al2O3 | FeOt | MnO | MgO | CaO | K2O | Na2O | P2O5 | H2O | H2Ot | H2O.MINUS | LOI | Li | Be | B | Sc | V | Cr | Ni | Cu | Zn | Rb | Sr | Y | Zr | Nb | Cs | Ba | La | Ce | Pr | Nd | Sm | Eu | Gd | Tb | Dy | Ho | Er | Tm | Yb | Lu | Hf | Ta | Pb | Th | U | Co | Mo | W | Ga | Ge | As | In | Sn | Sb | Cd | Q | C | Or | Ab | An | Ne | Ns | Ks | Di | Wo | Hy | Ol | Il | Tn | Pf | Ru | Ap | Sum_CIPW | Orthoclase | Albite | Anorthite | Quartz | Apatite | Ilmenite | Biotite | Amphibole | Corundum | Rest | sum_meso | P | K | Ti | Cr2O3 | A.NK | A.CNK | K2O.Na2O | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | CR03 | OROGENIC BELT | ////////Peninsula////// | Cape Peninsula | brown4 | 16 | 1.6 | 1.28 | Mafic S-types | 1- CPG & MPG | gr | quartz monzonite | h-P | 3b granite | granodiorite | ////////// | ////// | Unpub. Data, A.Villaros, G. Stevens, R. Scheepers (Villaros et al. 2009) | NA | NA | 65.40690 | 0.8270369 | 17.08490 | 4.914410 | 0.07714897 | 1.7960280 | 1.813515 | 5.131949 | 2.660096 | 0.2880228 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 269.0154 | 97.21508 | 52.24591 | 305.2263 | 25.35902 | NA | 390.4759 | 56.40085 | 112.16620 | 13.223848 | 52.34722 | 11.393739 | 1.2494124 | 10.838077 | 1.6265380 | 9.835326 | 2.024970 | 6.206267 | 0.8976863 | 6.077240 | NA | 8.747229 | NA | NA | 26.782157 | NA | NA | NA | NA | NA | NA | 24.54893 | 4.546013 | 30.32835 | 22.50899 | 7.115483 | 0 | 0 | 0 | 0 | 0 | 4.474137 | 0 | 0.16503878 | 0 | 0 | 0.7405025 | 0.6822185 | 95.10966 | 25.62016 | 22.53101 | 7.109292 | 27.20246 | 0.6797338 | 0.7856851 | 6.932014 | 0 | 4.544585 | -0.3193515 | 95.08560 | 1257.1043 | 42601.85 | 4958.086 | NA | 1.720334 | 1.291518 | 1.929235 | ||||||||
2 | 2 | BB202 | OROGENIC BELT | ////////Peninsula////// | Cape Peninsula | brown4 | 16 | 1.6 | 1.28 | Mafic S-types | 1- CPG & MPG | gr | granodiorite | h-P | 3b granite | granodiorite | ////////// | ////// | Unpub. Data, A.Villaros, G. Stevens, R. Scheepers (Villaros et al. 2009) | NA | NA | 69.64232 | 0.7087655 | 14.88713 | 4.186309 | 0.05397209 | 1.6293460 | 1.436880 | 5.018386 | 2.215911 | 0.22098 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 214.2795 | 98.96108 | 38.28057 | 260.1457 | 19.91177 | NA | 415.6909 | 44.29357 | 90.02860 | 10.382656 | 39.39793 | 7.655565 | 1.0286014 | 8.160105 | 1.1358404 | 8.071263 | 1.516150 | 3.946216 | 0.5438978 | 3.740191 | NA | 7.342101 | NA | NA | 19.492634 | NA | NA | NA | NA | NA | NA | 32.66955 | 3.726173 | 29.65722 | 18.75042 | 5.684928 | 0 | 0 | 0 | 0 | 0 | 4.058911 | 0 | 0.11545828 | 0 | 0 | 0.6482846 | 0.5234192 | 95.83437 | 25.43081 | 18.76877 | 5.680122 | 35.07460 | 0.5215128 | 0.6733272 | 6.222967 | 0 | 3.726782 | -0.2852007 | 95.81369 | 964.4893 | 41659.13 | 4249.049 | NA | 1.640003 | 1.273484 | 2.264706 | ||||||||
3 | 3 | CB02 | OROGENIC BELT | ////////Peninsula////// | Cape Peninsula | brown4 | 16 | 1.6 | 1.28 | Mafic S-types | 1- CPG & MPG | gr | granite | h-P | 3b granite | granite | ////////// | ////// | Unpub. Data, A.Villaros, G. Stevens, R. Scheepers (Villaros et al. 2009) | NA | NA | 70.61077 | 0.5705854 | 15.28393 | 2.822104 | 0.05318077 | 1.3306273 | 1.204323 | 5.597277 | 2.329983 | 0.197212 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 238.5302 | 90.38265 | 29.96433 | 186.0415 | 15.13822 | NA | 547.5218 | 32.36167 | 63.93163 | 7.248746 | 28.7628 | 6.040634 | 1.076092 | 5.3337 | 0.8584825 | 5.574644 | 1.088590 | 3.174243 | 0.4446103 | 2.755472 | NA | 5.434977 | NA | NA | 14.798949 | NA | NA | NA | NA | NA | NA | 31.63569 | 3.674628 | 33.07830 | 19.71567 | 4.686449 | 0 | 0 | 0 | 0 | 0 | 3.314764 | 0 | 0.11376549 | 0 | 0 | 0.5109352 | 0.4671216 | 97.19733 | 29.62021 | 19.73496 | 4.682256 | 33.59253 | 0.4654203 | 0.5420561 | 5.098139 | 0 | 3.676313 | -0.2339899 | 97.17789 | 860.7515 | 46464.68 | 3420.659 | NA | 1.545114 | 1.265067 | 2.402282 | ||||||||
4 | 4 | HB02 | OROGENIC BELT | ////////Peninsula////// | Cape Peninsula | brown4 | 16 | 1.6 | 1.28 | Mafic S-types | 1- CPG & MPG | gr | granite | h-P | 3b granite | granite | ////////// | ////// | Unpub. Data, A.Villaros, G. Stevens, R. Scheepers (Villaros et al. 2009) | NA | NA | 70.61724 | 0.4775028 | 15.70859 | 2.508109 | 0.04066072 | 0.9779424 | 1.311569 | 5.280680 | 2.877528 | 0.2001758 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 254.2501 | 116.29881 | 36.86868 | 181.9237 | 15.15780 | NA | 499.1298 | 34.12151 | 70.41246 | 8.06835 | 30.55275 | 6.613892 | 1.2179917 | 6.696204 | 1.0960613 | 7.270367 | 1.316706 | 3.774652 | 0.5311750 | 3.354008 | NA | 5.152241 | NA | NA | 15.483984 | NA | NA | NA | NA | NA | NA | 29.97338 | 3.353324 | 31.20731 | 24.34884 | 5.199155 | 0 | 0 | 0 | 0 | 0 | 2.436180 | 0 | 0.08698231 | 0 | 0 | 0.4319138 | 0.4741418 | 97.51123 | 28.63039 | 24.37266 | 5.194790 | 31.40372 | 0.4724149 | 0.4536277 | 3.789478 | 0 | 3.351382 | -0.1765739 | 97.49189 | 873.6873 | 43836.51 | 2862.629 | NA | 1.503237 | 1.223927 | 1.835145 | ||||||||
5 | 5 | CR01 | OROGENIC BELT | ////////Peninsula////// | Cape Peninsula | brown4 | 16 | 1.6 | 1.28 | Mafic S-types | 1- CPG & MPG | gr | granite | h-P | 3a granite | granite | ////////// | ////// | Unpub. Data, A.Villaros, G. Stevens, R. Scheepers (Villaros et al. 2009) | NA | NA | 70.70878 | 0.5322106 | 15.28031 | 2.535746 | 0.06085569 | 1.2071555 | 1.235924 | 5.569955 | 2.642243 | 0.2268257 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 233.7445 | 91.32626 | 38.96812 | 172.6349 | 14.63283 | NA | 408.2444 | 32.54389 | 62.92283 | 7.49205 | 29.71011 | 6.455299 | 1.0194147 | 6.264685 | 1.0442407 | 6.880438 | 1.272111 | 3.816830 | 0.4902676 | 3.014657 | NA | 4.650457 | NA | NA | 14.926382 | NA | NA | NA | NA | NA | NA | 30.22193 | 3.200335 | 32.91684 | 22.35792 | 4.649750 | 0 | 0 | 0 | 0 | 0 | 3.007180 | 0 | 0.13018383 | 0 | 0 | 0.4638989 | 0.5372654 | 97.48530 | 29.75506 | 22.37980 | 4.645108 | 31.99386 | 0.5353087 | 0.5056001 | 4.664697 | 0 | 3.200603 | -0.2157710 | 97.46426 | 990.0035 | 46237.87 | 3190.603 | NA | 1.472673 | 1.210504 | 2.108040 | ||||||||
6..3049 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3050 | 3050 | SGMV | OROGENIC BELT | Carajas Province, Amazonia | Velho Guilherme Suite | lightgreen | 18 | 2.0 | 1.60 | PostOro (A2) | 5- KCG | Teixeira et al. 2019 | 1860-1880 | NA | NA | 75.95000 | 0.0600000 | 12.62000 | 2.745275 | NA | 0.0100000 | 0.580000 | 4.530000 | 2.590000 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 786.0000 | 18.00000 | 170.00000 | 164.0000 | 76.00000 | NA | 28.0000 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
Abre tabela estilo excel/ selecionar o objeto com mouse e apertar F2 é a mesma coisa
view(BD)
Trata-se de um banco de dados de @bonin2020. Aqui vamos usar uma série de funções do dplyr e tidyr para operações nesse banco de dados.
Selecionar colunas com select()
# Selecionar algumas colunas
# Usar função names() para examinar quais colunas o banco tem
names(BD) # o banco tem 117 colunas, vamos selecionar algumas colunas
[1] "X.1" "X" "Tectonic_setting"
[4] "Location_notes" "Pluton" "Colour"
[7] "Symbol" "Size_rel" "Size"
[10] "SubGroup" "Group" "DebonPQ"
[13] "TASMiddlemostPlut" "Villaseca" "QANOR"
[16] "LaRoche" "Geol_unit_notes" "Rock_type_notes"
[19] "Ref" "Age" "Latitude"
[22] "Longitude" "SiO2" "TiO2"
[25] "Al2O3" "FeOt" "MnO"
[28] "MgO" "CaO" "K2O"
[31] "Na2O" "P2O5" "H2O"
[34] "H2Ot" "H2O.MINUS" "LOI"
[37] "Li" "Be" "B"
[40] "Sc" "V" "Cr"
[43] "Ni" "Cu" "Zn"
[46] "Rb" "Sr" "Y"
[49] "Zr" "Nb" "Cs"
[52] "Ba" "La" "Ce"
[55] "Pr" "Nd" "Sm"
[58] "Eu" "Gd" "Tb"
[61] "Dy" "Ho" "Er"
[64] "Tm" "Yb" "Lu"
[67] "Hf" "Ta" "Pb"
[70] "Th" "U" "Co"
[73] "Mo" "W" "Ga"
[76] "Ge" "As" "In"
[79] "Sn" "Sb" "Cd"
[82] "Q" "C" "Or"
[85] "Ab" "An" "Ne"
[88] "Ns" "Ks" "Di"
[91] "Wo" "Hy" "Ol"
[94] "Il" "Tn" "Pf"
[97] "Ru" "Ap" "Sum_CIPW"
[100] "Orthoclase" "Albite" "Anorthite"
[103] "Quartz" "Apatite" "Ilmenite"
[106] "Biotite" "Amphibole" "Corundum"
[109] "Rest" "sum_meso" "P"
[112] "K" "Ti" "Cr2O3"
[115] "A.NK" "A.CNK" "K2O.Na2O"
%>% # reparar o uso do comando pype esse codigo significa select(BD,SiO2:Cd)- as funções
BD select(SiO2:Cd) # selecionar elementos quimicos - SiO2 até Cd
# A tibble: 3,050 × 59
SiO2 TiO2 Al2O3 FeOt MnO MgO CaO K2O Na2O P2O5 H2O H2Ot
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
1 65.4 0.827 17.1 4.91 0.0771 1.80 1.81 5.13 2.66 0.2880228 NA NA
2 69.6 0.709 14.9 4.19 0.0540 1.63 1.44 5.02 2.22 0.22098 NA NA
3 70.6 0.571 15.3 2.82 0.0532 1.33 1.20 5.60 2.33 0.197212 NA NA
4 70.6 0.478 15.7 2.51 0.0407 0.978 1.31 5.28 2.88 0.2001758 NA NA
5 70.7 0.532 15.3 2.54 0.0609 1.21 1.24 5.57 2.64 0.2268257 NA NA
6 71.0 0.484 15.5 2.47 0.0396 1.01 1.48 5.33 2.56 0.1863152 NA NA
7 72.4 0.470 14.3 2.68 0.0422 1.03 1.09 5.05 2.75 0.2026448 NA NA
8 73.2 0.428 14.4 2.40 0.0605 0.769 1.06 5.04 2.46 0.2184707 NA NA
9 73.2 0.363 14.4 2.25 0.0211 0.678 1.10 5.05 2.86 0.1591542 NA NA
10 74.8 0.530 13.5 2.73 0.0472 1.18 1.43 3.28 2.30 0.1812674 NA NA
# ℹ 3,040 more rows
# ℹ 47 more variables: H2O.MINUS <dbl>, LOI <chr>, Li <dbl>, Be <dbl>, B <dbl>,
# Sc <dbl>, V <chr>, Cr <dbl>, Ni <dbl>, Cu <dbl>, Zn <dbl>, Rb <dbl>,
# Sr <dbl>, Y <dbl>, Zr <dbl>, Nb <dbl>, Cs <dbl>, Ba <dbl>, La <chr>,
# Ce <dbl>, Pr <chr>, Nd <chr>, Sm <chr>, Eu <chr>, Gd <chr>, Tb <dbl>,
# Dy <dbl>, Ho <dbl>, Er <dbl>, Tm <dbl>, Yb <dbl>, Lu <dbl>, Hf <dbl>,
# Ta <dbl>, Pb <dbl>, Th <chr>, U <dbl>, Co <dbl>, Mo <chr>, W <dbl>, …
%>%
BD select(X,Tectonic_setting,Latitude,Longitude,Albite:Corundum) %>% # selecionar colunas X, Tectonic_setting, Latitude,Longitude e minerais (Albite:Corundum)
select(-X) # usar - pra retirar a variavel
# A tibble: 3,050 × 11
Tectonic_setting Latitude Longitude Albite Anorthite Quartz Apatite Ilmenite
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 OROGENIC BELT NA NA 22.5 7.11 27.2 0.680 0.786
2 OROGENIC BELT NA NA 18.8 5.68 35.1 0.522 0.673
3 OROGENIC BELT NA NA 19.7 4.68 33.6 0.465 0.542
4 OROGENIC BELT NA NA 24.4 5.19 31.4 0.472 0.454
5 OROGENIC BELT NA NA 22.4 4.65 32.0 0.535 0.506
6 OROGENIC BELT NA NA 21.7 6.11 33.0 0.440 0.460
7 OROGENIC BELT NA NA 23.3 4.07 35.3 0.478 0.446
8 OROGENIC BELT NA NA 20.8 3.84 37.9 0.516 0.407
9 OROGENIC BELT NA NA 24.2 4.39 35.3 0.376 0.344
10 OROGENIC BELT NA NA 19.5 5.91 46.2 0.428 0.504
# ℹ 3,040 more rows
# ℹ 3 more variables: Biotite <dbl>, Amphibole <dbl>, Corundum <dbl>
Filtrar colunas com filter()
# Filtrar tectonic_setting
unique(BD$Tectonic_setting) # ver valores unicos de Tectonic_setting para saber qual variavel filtrar
[1] "OROGENIC BELT" "INTRAPLATE VOLCANICS"
[3] "ARCHAEAN CRATONS" "CONVERGENT MARGIN"
[5] "OCEANIC PLATEAU" "OPHIOLITE"
[7] "RIFT VOLCANICS" "CONTINENTAL FLOOD BASALT"
[9] "OCEAN ISLAND"
%>%
BD select(X,Tectonic_setting,Latitude,Longitude,Albite:Corundum) %>% # reparar funções encadeadas
filter(Tectonic_setting=="OCEANIC PLATEAU") # filtrando somente o Oceanic Plateau
# A tibble: 13 × 12
X Tectonic_setting Latitude Longitude Albite Anorthite Quartz Apatite
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 46143 OCEANIC PLATEAU 12.5 -70 37.3 8.51 33.3 0.135
2 46135 OCEANIC PLATEAU 12.5 -70 32.7 9.14 6.31 0.439
3 46142 OCEANIC PLATEAU 12.5 -70 45.7 15.0 22.7 0.304
4 46139 OCEANIC PLATEAU 12.5 -70 31.2 24.6 19.9 0.208
5 46140 OCEANIC PLATEAU 12.5 -70 37.4 17.8 22.7 0.283
6 46134 OCEANIC PLATEAU 12.5 -70 36.3 23.2 8.53 0.408
7 9157-ARU… OCEANIC PLATEAU 12.5 -70 39.1 16.1 21.2 0.337
8 10446-A10 OCEANIC PLATEAU -7 156 31.9 13.7 14.9 0.378
9 10446-A11 OCEANIC PLATEAU -7 156 50.6 2.22 37.5 0.260
10 10446-A6 OCEANIC PLATEAU -7 156 32.5 17.1 22.6 0.378
11 10446-A7 OCEANIC PLATEAU -7 156 31.6 17.7 15.9 0.378
12 10446-A8 OCEANIC PLATEAU -7 156 34.6 20.2 18.0 0.378
13 10446-A9 OCEANIC PLATEAU -7 156 29.1 21.1 20.0 0.401
# ℹ 4 more variables: Ilmenite <dbl>, Biotite <dbl>, Amphibole <dbl>,
# Corundum <dbl>
%>%
BD filter(Tectonic_setting=="CONTINENTAL FLOOD BASALT",
<60,
SiO2>50,
SiO2>1) # filtrar varias colunas - separadas por , TiO2
# A tibble: 13 × 117
X.1 X Tectonic_setting Location_notes Pluton Colour Symbol Size_rel
<int> <chr> <chr> <chr> <chr> <chr> <int> <dbl>
1 1952 201163 CONTINENTAL FLO… NORTH ATLANTI… Skye honey… 2 1.6
2 1953 201168 CONTINENTAL FLO… NORTH ATLANTI… Skye honey… 2 1.6
3 1955 85457 CONTINENTAL FLO… NORTH ATLANTI… Skye honey… 2 1.6
4 1957 85452 CONTINENTAL FLO… NORTH ATLANTI… Skye honey… 2 1.6
5 1958 85453 CONTINENTAL FLO… NORTH ATLANTI… Skye honey… 2 1.6
6 1959 85455 CONTINENTAL FLO… NORTH ATLANTI… Skye honey… 2 1.6
7 1960 85456 CONTINENTAL FLO… NORTH ATLANTI… Skye honey… 2 1.6
8 2055 119793 CONTINENTAL FLO… ETENDEKA PROV… Etane… honey… 2 1.6
9 2056 119795 CONTINENTAL FLO… ETENDEKA PROV… Etane… honey… 2 1.6
10 2057 99730 CONTINENTAL FLO… ETENDEKA PROV… Etane… honey… 2 1.6
11 2111 8564-MC4 CONTINENTAL FLO… ETENDEKA PROV… Messum honey… 2 1.6
12 2113 43064 CONTINENTAL FLO… ETENDEKA PROV… Messum honey… 2 1.6
13 2124 8572-MC44 CONTINENTAL FLO… ETENDEKA PROV… Messum honey… 2 1.6
# ℹ 109 more variables: Size <dbl>, SubGroup <chr>, Group <chr>, DebonPQ <chr>,
# TASMiddlemostPlut <chr>, Villaseca <chr>, QANOR <chr>, LaRoche <chr>,
# Geol_unit_notes <chr>, Rock_type_notes <chr>, Ref <chr>, Age <chr>,
# Latitude <dbl>, Longitude <dbl>, SiO2 <dbl>, TiO2 <dbl>, Al2O3 <dbl>,
# FeOt <dbl>, MnO <dbl>, MgO <dbl>, CaO <dbl>, K2O <dbl>, Na2O <dbl>,
# P2O5 <chr>, H2O <dbl>, H2Ot <dbl>, H2O.MINUS <dbl>, LOI <chr>, Li <dbl>,
# Be <dbl>, B <dbl>, Sc <dbl>, V <chr>, Cr <dbl>, Ni <dbl>, Cu <dbl>, …
%>%
BD filter(Tectonic_setting %in% c("OROGENIC BELT",
"INTRAPLATE VOLCANICS",
"ARCHAEAN CRATONS",
"CONVERGENT MARGIN")) # filtrar pelos strings usando %in% e c()
# A tibble: 2,642 × 117
X.1 X Tectonic_setting Location_notes Pluton Colour Symbol Size_rel
<int> <chr> <chr> <chr> <chr> <chr> <int> <dbl>
1 1 CR03 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
2 2 BB202 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
3 3 CB02 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
4 4 HB02 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
5 5 CR01 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
6 6 LL01 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
7 7 MP21 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
8 8 OK07 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
9 9 LG05 OROGENIC BELT ////////Langebaan… Cape … brown4 16 1.6
10 10 BB201 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
# ℹ 2,632 more rows
# ℹ 109 more variables: Size <dbl>, SubGroup <chr>, Group <chr>, DebonPQ <chr>,
# TASMiddlemostPlut <chr>, Villaseca <chr>, QANOR <chr>, LaRoche <chr>,
# Geol_unit_notes <chr>, Rock_type_notes <chr>, Ref <chr>, Age <chr>,
# Latitude <dbl>, Longitude <dbl>, SiO2 <dbl>, TiO2 <dbl>, Al2O3 <dbl>,
# FeOt <dbl>, MnO <dbl>, MgO <dbl>, CaO <dbl>, K2O <dbl>, Na2O <dbl>,
# P2O5 <chr>, H2O <dbl>, H2Ot <dbl>, H2O.MINUS <dbl>, LOI <chr>, Li <dbl>, …
%>%
BD filter(Tectonic_setting== "OROGENIC BELT" & SiO2>60) # filtrar condicionalmente
# A tibble: 1,370 × 117
X.1 X Tectonic_setting Location_notes Pluton Colour Symbol Size_rel
<int> <chr> <chr> <chr> <chr> <chr> <int> <dbl>
1 1 CR03 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
2 2 BB202 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
3 3 CB02 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
4 4 HB02 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
5 5 CR01 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
6 6 LL01 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
7 7 MP21 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
8 8 OK07 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
9 9 LG05 OROGENIC BELT ////////Langebaan… Cape … brown4 16 1.6
10 10 BB201 OROGENIC BELT ////////Peninsula… Cape … brown4 16 1.6
# ℹ 1,360 more rows
# ℹ 109 more variables: Size <dbl>, SubGroup <chr>, Group <chr>, DebonPQ <chr>,
# TASMiddlemostPlut <chr>, Villaseca <chr>, QANOR <chr>, LaRoche <chr>,
# Geol_unit_notes <chr>, Rock_type_notes <chr>, Ref <chr>, Age <chr>,
# Latitude <dbl>, Longitude <dbl>, SiO2 <dbl>, TiO2 <dbl>, Al2O3 <dbl>,
# FeOt <dbl>, MnO <dbl>, MgO <dbl>, CaO <dbl>, K2O <dbl>, Na2O <dbl>,
# P2O5 <chr>, H2O <dbl>, H2Ot <dbl>, H2O.MINUS <dbl>, LOI <chr>, Li <dbl>, …
Criar colunas com mutate(), agrupamento com group(), arredondamento com round()
%>%
BD select(X,Tectonic_setting,SiO2:MgO) %>%
mutate(l_Si=log(SiO2),
teste=log(SiO2)^2 + sqrt((log(Al2O3)))) #cria nova coluna com o log da coluna SiO2
# A tibble: 3,050 × 10
X Tectonic_setting SiO2 TiO2 Al2O3 FeOt MnO MgO l_Si teste
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 CR03 OROGENIC BELT 65.4 0.827 17.1 4.91 0.0771 1.80 4.18 19.2
2 BB202 OROGENIC BELT 69.6 0.709 14.9 4.19 0.0540 1.63 4.24 19.6
3 CB02 OROGENIC BELT 70.6 0.571 15.3 2.82 0.0532 1.33 4.26 19.8
4 HB02 OROGENIC BELT 70.6 0.478 15.7 2.51 0.0407 0.978 4.26 19.8
5 CR01 OROGENIC BELT 70.7 0.532 15.3 2.54 0.0609 1.21 4.26 19.8
6 LL01 OROGENIC BELT 71.0 0.484 15.5 2.47 0.0396 1.01 4.26 19.8
7 MP21 OROGENIC BELT 72.4 0.470 14.3 2.68 0.0422 1.03 4.28 20.0
8 OK07 OROGENIC BELT 73.2 0.428 14.4 2.40 0.0605 0.769 4.29 20.1
9 LG05 OROGENIC BELT 73.2 0.363 14.4 2.25 0.0211 0.678 4.29 20.1
10 BB201 OROGENIC BELT 74.8 0.530 13.5 2.73 0.0472 1.18 4.31 20.2
# ℹ 3,040 more rows
%>%
BD select(X,Tectonic_setting,SiO2:MgO) %>%
mutate(across(.cols=SiO2:MgO,~log(.x))) # Operações em massa
# A tibble: 3,050 × 8
X Tectonic_setting SiO2 TiO2 Al2O3 FeOt MnO MgO
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 CR03 OROGENIC BELT 4.18 -0.190 2.84 1.59 -2.56 0.586
2 BB202 OROGENIC BELT 4.24 -0.344 2.70 1.43 -2.92 0.488
3 CB02 OROGENIC BELT 4.26 -0.561 2.73 1.04 -2.93 0.286
4 HB02 OROGENIC BELT 4.26 -0.739 2.75 0.920 -3.20 -0.0223
5 CR01 OROGENIC BELT 4.26 -0.631 2.73 0.930 -2.80 0.188
6 LL01 OROGENIC BELT 4.26 -0.726 2.74 0.904 -3.23 0.00650
7 MP21 OROGENIC BELT 4.28 -0.756 2.66 0.986 -3.16 0.0276
8 OK07 OROGENIC BELT 4.29 -0.848 2.67 0.877 -2.81 -0.263
9 LG05 OROGENIC BELT 4.29 -1.01 2.66 0.809 -3.86 -0.389
10 BB201 OROGENIC BELT 4.31 -0.635 2.61 1.00 -3.05 0.165
# ℹ 3,040 more rows
%>%
BD select(X,Tectonic_setting,SiO2:MgO) %>%
group_by(Tectonic_setting) %>%
mutate(contagem=n())
# A tibble: 3,050 × 9
# Groups: Tectonic_setting [9]
X Tectonic_setting SiO2 TiO2 Al2O3 FeOt MnO MgO contagem
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
1 CR03 OROGENIC BELT 65.4 0.827 17.1 4.91 0.0771 1.80 1471
2 BB202 OROGENIC BELT 69.6 0.709 14.9 4.19 0.0540 1.63 1471
3 CB02 OROGENIC BELT 70.6 0.571 15.3 2.82 0.0532 1.33 1471
4 HB02 OROGENIC BELT 70.6 0.478 15.7 2.51 0.0407 0.978 1471
5 CR01 OROGENIC BELT 70.7 0.532 15.3 2.54 0.0609 1.21 1471
6 LL01 OROGENIC BELT 71.0 0.484 15.5 2.47 0.0396 1.01 1471
7 MP21 OROGENIC BELT 72.4 0.470 14.3 2.68 0.0422 1.03 1471
8 OK07 OROGENIC BELT 73.2 0.428 14.4 2.40 0.0605 0.769 1471
9 LG05 OROGENIC BELT 73.2 0.363 14.4 2.25 0.0211 0.678 1471
10 BB201 OROGENIC BELT 74.8 0.530 13.5 2.73 0.0472 1.18 1471
# ℹ 3,040 more rows
%>%
BD select(X,Tectonic_setting,SiO2:MgO) %>%
group_by(Tectonic_setting) %>%
count()
# A tibble: 9 × 2
# Groups: Tectonic_setting [9]
Tectonic_setting n
<chr> <int>
1 ARCHAEAN CRATONS 310
2 CONTINENTAL FLOOD BASALT 191
3 CONVERGENT MARGIN 763
4 INTRAPLATE VOLCANICS 98
5 OCEAN ISLAND 56
6 OCEANIC PLATEAU 13
7 OPHIOLITE 19
8 OROGENIC BELT 1471
9 RIFT VOLCANICS 129
%>%
BD select(X,Tectonic_setting,SiO2:MgO) %>%
group_by(Tectonic_setting) %>% # Agrupa os valores por grupos baseados nos tipos tectonicos, ou seja, qualquer ação será realizada em diferentes "compartimentos" de dados- isso pode ser muito util para médias globais e de grupo.
mutate(across(SiO2:MgO,~ifelse(.x==0,mean(.x),.x)), # substituir 0 por média por grupo (1 para cada valor de tectonic setting diferente)
SiO2=round(SiO2,digits=2)) #arredondar para dois digitos
# A tibble: 3,050 × 8
# Groups: Tectonic_setting [9]
X Tectonic_setting SiO2 TiO2 Al2O3 FeOt MnO MgO
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 CR03 OROGENIC BELT 65.4 0.827 17.1 4.91 0.0771 1.80
2 BB202 OROGENIC BELT 69.6 0.709 14.9 4.19 0.0540 1.63
3 CB02 OROGENIC BELT 70.6 0.571 15.3 2.82 0.0532 1.33
4 HB02 OROGENIC BELT 70.6 0.478 15.7 2.51 0.0407 0.978
5 CR01 OROGENIC BELT 70.7 0.532 15.3 2.54 0.0609 1.21
6 LL01 OROGENIC BELT 71.0 0.484 15.5 2.47 0.0396 1.01
7 MP21 OROGENIC BELT 72.4 0.470 14.3 2.68 0.0422 1.03
8 OK07 OROGENIC BELT 73.2 0.428 14.4 2.40 0.0605 0.769
9 LG05 OROGENIC BELT 73.2 0.363 14.4 2.25 0.0211 0.678
10 BB201 OROGENIC BELT 74.8 0.530 13.5 2.73 0.0472 1.18
# ℹ 3,040 more rows
Contar com count() ou remoldar dataframe com reframe()
%>%
BD select(X,Tectonic_setting,SiO2:MgO) %>%
group_by(Tectonic_setting) %>% # Agrupa os valores por grupos baseados nos tipos tectonicos, ou seja, qualquer ação será realizada em diferentes "compartimentos" de dados- isso pode ser muito util para médias globais e de grupo.
mutate(across(SiO2:MgO,~ifelse(.x==0, mean(.x),.x)), # substituir 0 por média por grupo (1 para cada valor de tectonic setting diferente)
across(SiO2:MgO,~round(.x,digits=2))) %>%
reframe(n=n(),
across(.cols=SiO2:MgO,~round(mean(.x,na.rm = TRUE),2)))
# A tibble: 9 × 8
Tectonic_setting n SiO2 TiO2 Al2O3 FeOt MnO MgO
<chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ARCHAEAN CRATONS 310 69.6 0.31 15.4 2.43 0.04 1.33
2 CONTINENTAL FLOOD BASALT 191 67.5 0.53 14.6 4.21 0.13 0.5
3 CONVERGENT MARGIN 763 66.4 0.52 15.7 3.92 0.08 1.7
4 INTRAPLATE VOLCANICS 98 70.3 0.32 14.1 3.35 0.08 0.49
5 OCEAN ISLAND 56 65.2 0.53 16.2 4.02 0.16 0.44
6 OCEANIC PLATEAU 13 63.6 0.52 16.1 5.05 0.1 2.59
7 OPHIOLITE 19 61.6 1.07 15.0 6.03 0.17 2.41
8 OROGENIC BELT 1471 69.8 0.41 14.4 2.85 0.07 1.31
9 RIFT VOLCANICS 129 69.3 0.43 13.8 3.66 0.1 0.49
Alterando a forma do dataframe com pivot_longer e _wider()
# Dado longo
%>%
BD select(X,Tectonic_setting,SiO2:MgO) %>%
group_by(Tectonic_setting) %>% # Agrupa os valores por grupos baseados nos tipos tectonicos, ou seja, qualquer ação será realizada em diferentes "compartimentos" de dados- isso pode ser muito util para médias globais e de grupo.
mutate(across(SiO2:MgO,~ifelse(.x==0, mean(.x),.x)), # substituir 0 por média por grupo (1 para cada valor de tectonic setting diferente)
across(SiO2:MgO,~round(.x,digits=2))) %>%
reframe(n=n(),
across(.cols=SiO2:MgO,~round(mean(.x,na.rm = TRUE),2))) %>%
pivot_longer(cols=SiO2:MgO)
# A tibble: 54 × 4
Tectonic_setting n name value
<chr> <int> <chr> <dbl>
1 ARCHAEAN CRATONS 310 SiO2 69.6
2 ARCHAEAN CRATONS 310 TiO2 0.31
3 ARCHAEAN CRATONS 310 Al2O3 15.4
4 ARCHAEAN CRATONS 310 FeOt 2.43
5 ARCHAEAN CRATONS 310 MnO 0.04
6 ARCHAEAN CRATONS 310 MgO 1.33
7 CONTINENTAL FLOOD BASALT 191 SiO2 67.5
8 CONTINENTAL FLOOD BASALT 191 TiO2 0.53
9 CONTINENTAL FLOOD BASALT 191 Al2O3 14.6
10 CONTINENTAL FLOOD BASALT 191 FeOt 4.21
# ℹ 44 more rows
#dado largo
%>%
BD select(X,Tectonic_setting,SiO2:MgO) %>%
group_by(Tectonic_setting) %>% # Agrupa os valores por grupos baseados nos tipos tectonicos, ou seja, qualquer ação será realizada em diferentes "compartimentos" de dados- isso pode ser muito util para médias globais e de grupo.
mutate(across(SiO2:MgO,~ifelse(.x==0, mean(.x),.x)), # substituir 0 por média por grupo (1 para cada valor de tectonic setting diferente)
across(SiO2:MgO,~round(.x,digits=2))) %>%
reframe(n=n(),
across(.cols=SiO2:MgO,~round(mean(.x,na.rm = TRUE),2))) %>%
pivot_longer(cols=SiO2:MgO) %>% # Alongar dataframe para forma tidy
ungroup() %>%
group_by(name) %>%
pivot_wider(id_cols=Tectonic_setting,values_from=value) # Expandir lateralmente o dataframe
# A tibble: 9 × 7
Tectonic_setting SiO2 TiO2 Al2O3 FeOt MnO MgO
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ARCHAEAN CRATONS 69.6 0.31 15.4 2.43 0.04 1.33
2 CONTINENTAL FLOOD BASALT 67.5 0.53 14.6 4.21 0.13 0.5
3 CONVERGENT MARGIN 66.4 0.52 15.7 3.92 0.08 1.7
4 INTRAPLATE VOLCANICS 70.3 0.32 14.1 3.35 0.08 0.49
5 OCEAN ISLAND 65.2 0.53 16.2 4.02 0.16 0.44
6 OCEANIC PLATEAU 63.6 0.52 16.1 5.05 0.1 2.59
7 OPHIOLITE 61.6 1.07 15.0 6.03 0.17 2.41
8 OROGENIC BELT 69.8 0.41 14.4 2.85 0.07 1.31
9 RIFT VOLCANICS 69.3 0.43 13.8 3.66 0.1 0.49
3. Estudo de caso - Vulcões do Tidytuesday
Objetivo: Revisar tidyverse e fazer exercícios. Os dados são disponibilizados no link, e contem o video para analise desse banco no youtube (https://github.com/rfordatascience/tidytuesday/tree/master).
Esse link apresenta mais infos sobre o banco de dado e fontes:
https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-05-12/readme.md
Importação de dados de um banco em excel com readxl:
# Ilustrativo
#volc<-read_excel("volcano.xlsx") %>%
# as_tibble()
Opção B, baixar diretamente do github do tidytuesday:
<- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-05-12/volcano.csv') volc
select()
%>% select(volcano_name,-country,latitude:elevation) volc
# A tibble: 958 × 4
volcano_name latitude longitude elevation
<chr> <dbl> <dbl> <dbl>
1 Abu 34.5 132. 641
2 Acamarachi -23.3 -67.6 6023
3 Acatenango 14.5 -90.9 3976
4 Acigol-Nevsehir 38.5 34.6 1683
5 Adams 46.2 -121. 3742
6 Adatarayama 37.6 140. 1728
7 Adwa 10.1 40.8 1733
8 Afdera 13.1 40.9 1250
9 Agrigan 18.8 146. 965
10 Agua 14.5 -90.7 3760
# ℹ 948 more rows
%>% select(!latitude:elevation) volc
# A tibble: 958 × 23
volcano_number volcano_name primary_volcano_type last_eruption_year country
<dbl> <chr> <chr> <chr> <chr>
1 283001 Abu Shield(s) -6850 Japan
2 355096 Acamarachi Stratovolcano Unknown Chile
3 342080 Acatenango Stratovolcano(es) 1972 Guatem…
4 213004 Acigol-Nevseh… Caldera -2080 Turkey
5 321040 Adams Stratovolcano 950 United…
6 283170 Adatarayama Stratovolcano(es) 1996 Japan
7 221170 Adwa Stratovolcano Unknown Ethiop…
8 221110 Afdera Stratovolcano Unknown Ethiop…
9 284160 Agrigan Stratovolcano 1917 United…
10 342100 Agua Stratovolcano Unknown Guatem…
# ℹ 948 more rows
# ℹ 18 more variables: region <chr>, subregion <chr>, tectonic_settings <chr>,
# evidence_category <chr>, major_rock_1 <chr>, major_rock_2 <chr>,
# major_rock_3 <chr>, major_rock_4 <chr>, major_rock_5 <chr>,
# minor_rock_1 <chr>, minor_rock_2 <chr>, minor_rock_3 <chr>,
# minor_rock_4 <chr>, minor_rock_5 <chr>, population_within_5_km <dbl>,
# population_within_10_km <dbl>, population_within_30_km <dbl>, …
%>% select(starts_with("population")) volc
# A tibble: 958 × 4
population_within_5_km population_within_10_km population_within_30_km
<dbl> <dbl> <dbl>
1 3597 9594 117805
2 0 7 294
3 4329 60730 1042836
4 127863 127863 218469
5 0 70 4019
6 428 3936 717078
7 101 485 18645
8 51 6042 8611
9 0 0 0
10 9890 114404 2530449
# ℹ 948 more rows
# ℹ 1 more variable: population_within_100_km <dbl>
%>% select(contains("within")) volc
# A tibble: 958 × 4
population_within_5_km population_within_10_km population_within_30_km
<dbl> <dbl> <dbl>
1 3597 9594 117805
2 0 7 294
3 4329 60730 1042836
4 127863 127863 218469
5 0 70 4019
6 428 3936 717078
7 101 485 18645
8 51 6042 8611
9 0 0 0
10 9890 114404 2530449
# ℹ 948 more rows
# ℹ 1 more variable: population_within_100_km <dbl>
#- **stars_with()**: colunas que começam com um prefixo
#- **ends_with()**: colunas que terminam com um sufixo
#- **contains()**: colunas que contêm uma string
#- **last_col()**: última coluna
filter()
%>% select(volcano_name,country,latitude:elevation) %>%
volc filter(country== "Antarctica")
# A tibble: 17 × 5
volcano_name country latitude longitude elevation
<chr> <chr> <dbl> <dbl> <dbl>
1 Andrus Antarctica -75.8 -132. 2978
2 Berlin Antarctica -76.0 -136 3478
3 Buckle Island Antarctica -66.8 163. 1239
4 Deception Island Antarctica -63.0 -60.7 602
5 Erebus Antarctica -77.5 167. 3794
6 Hudson Mountains Antarctica -74.3 -99.4 749
7 James Ross Island Antarctica -64.2 -57.8 1630
8 Melbourne Antarctica -74.4 165. 2732
9 Morning Antarctica -78.5 164. 2723
10 Penguin Island Antarctica -62.1 -57.9 180
11 Pleiades, The Antarctica -72.7 166. 3040
12 Royal Society Range Antarctica -78.2 163. 3000
13 Seal Nunataks Group Antarctica -65.0 -60.0 368
14 Siple Antarctica -73.4 -127. 3110
15 Takahe Antarctica -76.3 -112. 3460
16 Toney Mountain Antarctica -75.8 -116. 3595
17 Waesche Antarctica -77.2 -127. 3292
%>% select(volcano_name,country,latitude:elevation) %>%
volc filter(!country== "Chile") %>%
filter(elevation>1000,
>0,
latitude>0) longitude
# A tibble: 252 × 5
volcano_name country latitude longitude elevation
<chr> <chr> <dbl> <dbl> <dbl>
1 Acigol-Nevsehir Turkey 38.5 34.6 1683
2 Adatarayama Japan 37.6 140. 1728
3 Adwa Ethiopia 10.1 40.8 1733
4 Afdera Ethiopia 13.1 40.9 1250
5 Aira Japan 31.6 131. 1117
6 Akademia Nauk Russia 54.0 159. 1180
7 Akagisan Japan 36.6 139. 1828
8 Akan Japan 43.4 144. 1499
9 Akita-Komagatake Japan 39.8 141. 1637
10 Akita-Yakeyama Japan 40.0 141. 1366
# ℹ 242 more rows
# %in% comando logico para declarar vetores para filter
%>% select(volcano_name,country,latitude:elevation) %>%
volc filter(!country %in% c("Chile","Argentina","Japan"))
# A tibble: 809 × 5
volcano_name country latitude longitude elevation
<chr> <chr> <dbl> <dbl> <dbl>
1 Acatenango Guatemala 14.5 -90.9 3976
2 Acigol-Nevsehir Turkey 38.5 34.6 1683
3 Adams United States 46.2 -121. 3742
4 Adwa Ethiopia 10.1 40.8 1733
5 Afdera Ethiopia 13.1 40.9 1250
6 Agrigan United States 18.8 146. 965
7 Agua Guatemala 14.5 -90.7 3760
8 Agua de Pau Portugal 37.8 -25.5 947
9 Agung Indonesia -8.34 116. 2997
10 Ahyi United States 20.4 145. -75
# ℹ 799 more rows
# | ou
# & e
%>%
volc select(volcano_name, country, latitude:elevation) %>%
filter(country == "Antarctica" | latitude < 10)
# A tibble: 382 × 5
volcano_name country latitude longitude elevation
<chr> <chr> <dbl> <dbl> <dbl>
1 Acamarachi Chile -23.3 -67.6 6023
2 Aguilera Chile -50.3 -73.8 2546
3 Agung Indonesia -8.34 116. 2997
4 Alcedo Ecuador -0.43 -91.1 1130
5 Aliso Ecuador -0.53 -78 4267
6 Alutu Ethiopia 7.77 38.8 2335
7 Amasing Indonesia -0.53 127. 1030
8 Ambae Vanuatu -15.4 168. 1496
9 Ambang Indonesia 0.75 124. 1795
10 Ambitle Papua New Guinea -4.08 154. 450
# ℹ 372 more rows
%>%
volc select(volcano_name, country, latitude:elevation) %>%
filter(country == "Antarctica" & elevation > 500)
# A tibble: 15 × 5
volcano_name country latitude longitude elevation
<chr> <chr> <dbl> <dbl> <dbl>
1 Andrus Antarctica -75.8 -132. 2978
2 Berlin Antarctica -76.0 -136 3478
3 Buckle Island Antarctica -66.8 163. 1239
4 Deception Island Antarctica -63.0 -60.7 602
5 Erebus Antarctica -77.5 167. 3794
6 Hudson Mountains Antarctica -74.3 -99.4 749
7 James Ross Island Antarctica -64.2 -57.8 1630
8 Melbourne Antarctica -74.4 165. 2732
9 Morning Antarctica -78.5 164. 2723
10 Pleiades, The Antarctica -72.7 166. 3040
11 Royal Society Range Antarctica -78.2 163. 3000
12 Siple Antarctica -73.4 -127. 3110
13 Takahe Antarctica -76.3 -112. 3460
14 Toney Mountain Antarctica -75.8 -116. 3595
15 Waesche Antarctica -77.2 -127. 3292
mutate()
%>%
volc select(volcano_name, country,last_eruption_year) %>%
filter(!last_eruption_year== "Unknown") %>%
mutate(last_eruption_year=as.numeric(last_eruption_year),
ultima_erupcao=(2022-last_eruption_year),
ultima_erupcao_milhar=ultima_erupcao/1000,
ultima_erupcao_milhao=ultima_erupcao/1000000)
# A tibble: 657 × 6
volcano_name country last_eruption_year ultima_erupcao ultima_erupcao_milhar
<chr> <chr> <dbl> <dbl> <dbl>
1 Abu Japan -6850 8872 8.87
2 Acatenango Guatem… 1972 50 0.05
3 Acigol-Nevse… Turkey -2080 4102 4.10
4 Adams United… 950 1072 1.07
5 Adatarayama Japan 1996 26 0.026
6 Agrigan United… 1917 105 0.105
7 Agua de Pau Portug… 1564 458 0.458
8 Aguilera Chile -1250 3272 3.27
9 Agung Indone… 2019 3 0.003
10 Ahyi United… 2014 8 0.008
# ℹ 647 more rows
# ℹ 1 more variable: ultima_erupcao_milhao <dbl>
%>%
volc mutate(rocha=case_when(
%in% c("Andesite / Basaltic Andesite","Basalt / Picro-Basalt" ,"Trachyte / Trachydacite","Trachyandesite / Basaltic Trachyandesite")~"vulcanica basica",
major_rock_1 %in% c("Phonolite","Foidite","Trachybasalt / Tephrite Basanite","Phono-tephrite / Tephri-phonolite")~"vulcanica alcalina",
major_rock_1 TRUE~"volcanica acida")) %>%
count(rocha)
# A tibble: 3 × 2
rocha n
<chr> <int>
1 volcanica acida 108
2 vulcanica alcalina 75
3 vulcanica basica 775
unique(volc$major_rock_1)
[1] "Andesite / Basaltic Andesite"
[2] "Dacite"
[3] "Rhyolite"
[4] "Basalt / Picro-Basalt"
[5] "Trachyte / Trachydacite"
[6] "Phono-tephrite / Tephri-phonolite"
[7] "Trachyandesite / Basaltic Trachyandesite"
[8] "Trachybasalt / Tephrite Basanite"
[9] "Foidite"
[10] "Phonolite"
Count e arrange()
%>%
volc count(country) %>%
arrange(desc(n))
# A tibble: 89 × 2
country n
<chr> <int>
1 United States 99
2 Indonesia 95
3 Japan 92
4 Russia 79
5 Chile 43
6 Iceland 28
7 Mexico 28
8 Papua New Guinea 28
9 Philippines 27
10 Ecuador 26
# ℹ 79 more rows
%>%
volc count(country) %>%
arrange(n)
# A tibble: 89 × 2
country n
<chr> <int>
1 Algeria 1
2 Armenia-Azerbaijan 1
3 Burma (Myanmar) 1
4 Cape Verde 1
5 Chile-Peru 1
6 China-North Korea 1
7 Colombia-Ecuador 1
8 Comoros 1
9 Djibouti 1
10 El Salvador-Guatemala 1
# ℹ 79 more rows
Group_by ()
%>%
volc filter(!last_eruption_year=="Unknown") %>%
group_by(last_eruption_year,country) %>%
count(last_eruption_year,sort = TRUE) %>%
ungroup()
# A tibble: 591 × 3
last_eruption_year country n
<chr> <chr> <int>
1 2020 Indonesia 8
2 2020 Japan 5
3 -550 Russia 4
4 2019 Indonesia 4
5 2020 Russia 4
6 1992 United States 3
7 1996 Japan 3
8 2000 Indonesia 3
9 2008 United States 3
10 2015 Indonesia 3
# ℹ 581 more rows
Summarise ou reframe()
# Função SE
<-function(vetor,IC){
SE<-sd(vetor)
SD<-length(vetor)
N<-(SD*IC)/sqrt(N)
SE
return(
SE
)
}
# Summarise
%>%
volc group_by(country) %>%
summarise(n=n(),
media_pop_5km=mean(population_within_5_km),
sd_pop_km=sd(population_within_5_km),
se_pop_5km=SE(population_within_5_km,1),
max_pop_5km=max(population_within_5_km),
min_pop_5km=min(population_within_5_km)) %>%
arrange(desc(media_pop_5km)) %>%
filter(n>15) %>%
mutate(across(.cols=media_pop_5km:min_pop_5km,~round(.x,digits=0)))
# A tibble: 16 × 7
country n media_pop_5km sd_pop_km se_pop_5km max_pop_5km min_pop_5km
<chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Mexico 28 269105 1091616 206296 5783287 0
2 Philippines 27 102291 306707 59026 1349742 11
3 Nicaragua 16 97064 251139 62785 989888 100
4 New Zealand 16 71199 261129 65282 1049110 0
5 Ethiopia 26 69227 148454 29114 565206 0
6 Guatemala 17 63078 88080 21362 240892 1373
7 Ecuador 26 41014 117168 22979 534403 0
8 Indonesia 95 35195 129642 13301 1092929 3
9 Japan 92 6355 19149 1996 130474 0
10 Iceland 28 3496 8458 1598 33526 0
11 Papua New G… 28 2978 5215 986 24509 1
12 United Stat… 99 1745 7356 739 52801 0
13 Chile 43 459 1233 188 5951 0
14 Canada 18 444 1559 367 6635 0
15 Russia 79 24 96 11 669 0
16 Antarctica 17 0 0 0 0 0
# México
#n=28
#269105.1 ± 206295.9 habitantes(sigma 1)
# Guatemala
# n = 17
# 63078 ± 21362 habitantes
4. Exercicio
- 1) Em qual país existem mais vulcões?
%>%
volc count(country,sort=TRUE)
# A tibble: 89 × 2
country n
<chr> <int>
1 United States 99
2 Indonesia 95
3 Japan 92
4 Russia 79
5 Chile 43
6 Iceland 28
7 Mexico 28
8 Papua New Guinea 28
9 Philippines 27
10 Ecuador 26
# ℹ 79 more rows
%>%
volc group_by(country) %>%
summarise(n=n()) %>%
arrange(desc(n))
# A tibble: 89 × 2
country n
<chr> <int>
1 United States 99
2 Indonesia 95
3 Japan 92
4 Russia 79
5 Chile 43
6 Iceland 28
7 Mexico 28
8 Papua New Guinea 28
9 Philippines 27
10 Ecuador 26
# ℹ 79 more rows
- 2) Nesse país, qual o tipo de vulcão mais comum?
%>%
volc filter(country=="United States") %>%
count(primary_volcano_type,sort=TRUE)
# A tibble: 15 × 2
primary_volcano_type n
<chr> <int>
1 Stratovolcano 32
2 Shield 15
3 Volcanic field 13
4 Stratovolcano(es) 8
5 Submarine 6
6 Pyroclastic cone(s) 5
7 Shield(s) 5
8 Lava dome(s) 4
9 Caldera 3
10 Maar(s) 3
11 Caldera(s) 1
12 Complex 1
13 Pyroclastic cone 1
14 Stratovolcano? 1
15 Tuff cone(s) 1
- 3) Na América do Sul, quantos e quais vulcões entraram em erupção em 2020?
%>%
volc filter(region=="South America",
==2020) %>%
last_eruption_yearcount(volcano_name,sort=TRUE)
# A tibble: 6 × 2
volcano_name n
<chr> <int>
1 Chillan, Nevados de 1
2 Reventador 1
3 Ruiz, Nevado del 1
4 Sabancaya 1
5 Sangay 1
6 Villarrica 1
- 4) Quais os países nos quais os vulcões apresentam maior risco para a população local (até 100 km)? Filtre países em que existam mais de 20 vulcões.
%>%
volc group_by(country) %>%
summarise(n=n(),
media=mean(population_within_100_km),
se=SE(population_within_100_km,1)) %>%
filter(n>20) %>%
arrange(desc(media))
# A tibble: 11 × 4
country n media se
<chr> <int> <dbl> <dbl>
1 Philippines 27 10247171. 1948086.
2 Indonesia 95 9126614. 1148771.
3 Mexico 28 7034976. 1863798.
4 Japan 92 4057450. 537928.
5 Ethiopia 26 3318110. 683117.
6 Ecuador 26 1819886. 317923.
7 Chile 43 362838. 42331.
8 United States 99 250535. 55360.
9 Papua New Guinea 28 102845. 10647.
10 Iceland 28 65432. 16041.
11 Russia 79 60141. 20473.