From: https://github.com/ksatola
Version: 1.0.0
The data comes from the website of the Polish Chief Inspectorate of Environmental Protection (GIOS - Glowny Inspektorat Ochrony Srodowiska) and was downloaded on Feb 18th, 2020.
For the download, I used web scraping techniques. As there are different types of XLSX files for different years, with different structures and single or multiple sheets, the ETL logic is defined as follows:
The downloaded content consists of metadata files describing the emission measurement stations (their codes, locations, and measurement characteristics over time) as well as aggregated statistics. All downloaded files are ZIP archives, and the ZIP archives contain XLSX files. Measurements are organized in files by year (from 2000 to 2018), emission measurement station, and pollutant. The data covers hourly and daily averages of pollutant measurements.
There was a major change in the metadata format and naming convention in 2016. I had to take this into consideration while working on the automated ETL pipeline.
Currently, there are eight emission measurement stations in the Krakow area, taking different sets of measurements:
Three of the stations were renamed in 2016, and five others were closed between 2004 and 2018:
The first two stations in the Krakow area (MpKrakAlKras, MpKrakBulwar) were launched on Jan 1st, 2003.
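Because three stations appear under one code before 2016 and another afterwards, anyone who needs a continuous per-station series has to map the old codes onto the new ones. The sketch below shows one way to do that, using the old and new codes listed in the `ems_codes` comments later in this notebook. The names `OLD_TO_NEW_CODES` and `normalize_station_code` are hypothetical and are not part of the `prepare` module, which may handle renamed stations differently.

```python
# Hypothetical mapping of pre-2016 station codes to their current codes,
# taken from the comments in the ems_codes list in this notebook.
OLD_TO_NEW_CODES = {
    "MpKrakowWIOSAKra6117": "MpKrakAlKras",
    "MpKrakowWIOSBuja6119": "MpKrakBujaka",
    "MpKrakowWIOSBulw6118": "MpKrakBulwar",
}


def normalize_station_code(code: str) -> str:
    """Return the current code for a renamed station, or the code unchanged."""
    return OLD_TO_NEW_CODES.get(code, code)


print(normalize_station_code("MpKrakowWIOSAKra6117"))  # MpKrakAlKras
print(normalize_station_code("MpKrakTelime"))          # MpKrakTelime
```

Stations that were closed rather than renamed keep their original code, so the lookup simply falls back to the input.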
%load_ext autoreload
%autoreload 2
import sys
sys.path.insert(0, '../src')
import pandas as pd
import numpy as np
import time
import os
import random
import re
import fnmatch
from pathlib import Path
import zipfile
import csv
import requests
import urllib.request
from bs4 import BeautifulSoup
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', 1000)
from prepare import (
    get_gios_pollution_data_files,
    extract_archived_data,
    get_pollutant_measures_for_locations,
    get_files_for_name_pattern,
    build_gios_analytical_view,
)
# Set the url to the website and access the site with our requests library
url = 'http://powietrze.gios.gov.pl/pjp/archives'
response = requests.get(url)
response
<Response [200]>
# https://www.crummy.com/software/BeautifulSoup/bs4/doc/
soup = BeautifulSoup(response.text, "html.parser")
# Use the .find method to locate the <ul> element with id "archive_files"
ul = soup.find("ul", {"id": "archive_files"})
print(ul)
<ul class="list-unstyled" id="archive_files"> <li> <a href="/pjp/archives/downloadFile/102"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Kody stacji pomiarowych</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/305"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Metadane - stacje i stanowiska pomiarowe</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/304"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Statystyki z lat 2000-2018</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/223"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2000 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/224"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2001 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/225"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów 
z 2002 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/226"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2003 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/202"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2004 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/203"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2005 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/227"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2006 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/228"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2007 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/229"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2008 roku</p> 
</div> </a> </li> <li> <a href="/pjp/archives/downloadFile/230"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2009 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/231"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2010 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/232"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2011 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/233"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2012 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/234"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2013 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/302"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2014 roku</p> </div> </a> </li> 
<li> <a href="/pjp/archives/downloadFile/236"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2015 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/242"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2016 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/262"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2017 roku</p> </div> </a> </li> <li> <a href="/pjp/archives/downloadFile/303"> <div class="col-md-1 col-sm-2 col-xs-3 text-center" style="color: black;"> <div style="width: 50px; height: 52px; display: table; margin: 0 auto;"><img alt="" src="/pjp/assets-0.0.31/img/zip.png"/></div> <p class="archive_file_name">Wyniki pomiarów z 2018 roku</p> </div> </a> </li> </ul>
lis = ul.find_all('li')
resources = []
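for li in lis:
    # Human-readable archive name, e.g. 'Kody stacji pomiarowych'
    file_name = li.find("p", {"class": "archive_file_name"}).getText()
    # '/pjp/archives/downloadFile/102' -> ['', 'pjp', 'archives', 'downloadFile', '102']
    file_url = li.find("a")['href'].split('/')
    resources.append((file_name, file_url[3] + '/' + file_url[4]))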
resources
[('Kody stacji pomiarowych', 'downloadFile/102'), ('Metadane - stacje i stanowiska pomiarowe', 'downloadFile/305'), ('Statystyki z lat 2000-2018', 'downloadFile/304'), ('Wyniki pomiarów z 2000 roku', 'downloadFile/223'), ('Wyniki pomiarów z 2001 roku', 'downloadFile/224'), ('Wyniki pomiarów z 2002 roku', 'downloadFile/225'), ('Wyniki pomiarów z 2003 roku', 'downloadFile/226'), ('Wyniki pomiarów z 2004 roku', 'downloadFile/202'), ('Wyniki pomiarów z 2005 roku', 'downloadFile/203'), ('Wyniki pomiarów z 2006 roku', 'downloadFile/227'), ('Wyniki pomiarów z 2007 roku', 'downloadFile/228'), ('Wyniki pomiarów z 2008 roku', 'downloadFile/229'), ('Wyniki pomiarów z 2009 roku', 'downloadFile/230'), ('Wyniki pomiarów z 2010 roku', 'downloadFile/231'), ('Wyniki pomiarów z 2011 roku', 'downloadFile/232'), ('Wyniki pomiarów z 2012 roku', 'downloadFile/233'), ('Wyniki pomiarów z 2013 roku', 'downloadFile/234'), ('Wyniki pomiarów z 2014 roku', 'downloadFile/302'), ('Wyniki pomiarów z 2015 roku', 'downloadFile/236'), ('Wyniki pomiarów z 2016 roku', 'downloadFile/242'), ('Wyniki pomiarów z 2017 roku', 'downloadFile/262'), ('Wyniki pomiarów z 2018 roku', 'downloadFile/303')]
links = [a["href"] for a in ul.select("a[href]")]
links
['/pjp/archives/downloadFile/102', '/pjp/archives/downloadFile/305', '/pjp/archives/downloadFile/304', '/pjp/archives/downloadFile/223', '/pjp/archives/downloadFile/224', '/pjp/archives/downloadFile/225', '/pjp/archives/downloadFile/226', '/pjp/archives/downloadFile/202', '/pjp/archives/downloadFile/203', '/pjp/archives/downloadFile/227', '/pjp/archives/downloadFile/228', '/pjp/archives/downloadFile/229', '/pjp/archives/downloadFile/230', '/pjp/archives/downloadFile/231', '/pjp/archives/downloadFile/232', '/pjp/archives/downloadFile/233', '/pjp/archives/downloadFile/234', '/pjp/archives/downloadFile/302', '/pjp/archives/downloadFile/236', '/pjp/archives/downloadFile/242', '/pjp/archives/downloadFile/262', '/pjp/archives/downloadFile/303']
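The scraped `href` values are site-relative paths, so they have to be combined with the host before they can be requested. This sketch uses only the standard library and assumes the archive page URL from above as the base. The helper name `to_absolute_urls` is hypothetical, and `get_gios_pollution_data_files` may build its URLs differently.

```python
from urllib.parse import urljoin

# Page the links were scraped from (same value as `url` above).
archive_page_url = "http://powietrze.gios.gov.pl/pjp/archives"

# A few of the relative links extracted from the <ul id="archive_files"> element.
sample_links = [
    "/pjp/archives/downloadFile/102",
    "/pjp/archives/downloadFile/305",
    "/pjp/archives/downloadFile/304",
]


def to_absolute_urls(page_url, hrefs):
    """Resolve each (possibly relative) href against the page it came from."""
    return [urljoin(page_url, href) for href in hrefs]


download_urls = to_absolute_urls(archive_page_url, sample_links)
print(download_urls[0])
# http://powietrze.gios.gov.pl/pjp/archives/downloadFile/102
```

Using `urljoin` rather than string concatenation avoids duplicated path segments when the `href` already starts with `/pjp/archives`.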
%%time
download_base_url = 'http://powietrze.gios.gov.pl/pjp/archives'
path_to_save = "/Users/ksatola/Documents/git/air-polution/data/gios/etl"
get_gios_pollution_data_files(download_base_url, path_to_save)
ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/102 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/305 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/304 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/223 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/224 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/225 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/226 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/202 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/203 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/227 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/228 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/229 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/230 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/231 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/232 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/233 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/234 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/302 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/236 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/242 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/262 ok: 200 http://powietrze.gios.gov.pl/pjp/archives/downloadFile/303 CPU times: user 8.46 s, sys: 8.85 s, total: 17.3 s Wall time: 4min 41s
%%time
source_dir = '/Users/ksatola/Documents/git/air-polution/data/gios/etl'
target_dir = '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/'
file_search_pattern = '*.zip'
extract_archived_data(source_dir, target_dir, file_search_pattern)
Found directory: /Users/ksatola/Documents/git/air-polution/data/gios/etl Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2000 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2001 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Statystyki z lat 2000-2018.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2017 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2016 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2010 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2011 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Kody stacji pomiarowych.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2007 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2006 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2014 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2015 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2009 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2008 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2003 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2002 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Metadane - stacje i stanowiska pomiarowe.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2004 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2005 roku.zip Extracting: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2018 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2013 roku.zip Extracting: /Users/ksatola/Documents/git/air-polution/data/gios/etl/Wyniki pomiarów z 2012 roku.zip CPU times: user 3.01 s, sys: 1.26 s, total: 4.28 s Wall time: 4.57 s
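The source of `extract_archived_data` is not shown in this notebook. As a rough idea of what such a step involves, the sketch below extracts every archive matching a name pattern into a target folder using only the standard library. The function name `extract_zip_archives` and its return value are assumptions made for illustration.

```python
import fnmatch
import os
import zipfile


def extract_zip_archives(source_dir, target_dir, pattern="*.zip"):
    """Extract every archive in source_dir whose file name matches pattern.

    Hypothetical helper; returns the paths of the archives that were extracted.
    """
    os.makedirs(target_dir, exist_ok=True)
    extracted = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        # Skip sub-directories (such as the target folder) and non-matching files.
        if os.path.isfile(path) and fnmatch.fnmatch(name, pattern):
            with zipfile.ZipFile(path) as archive:
                archive.extractall(target_dir)
            extracted.append(path)
    return extracted
```

Because all archives are unpacked into the same folder, files with identical names in different archives would overwrite each other. The year prefix in the file names listed later suggests this does not happen with the GIOS data.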
# Emission measurement stations codes in the Krakow area
ems_codes = [
    # Active stations
    'MpKrakOsPias', # from 2016-01-01, pm25, pm10, http://powietrze.gios.gov.pl/pjp/current/station_details/info/10139
    'MpKrakWadow', # from 2017-01-01, pm25, pm10, http://powietrze.gios.gov.pl/pjp/current/station_details/info/10447
    'MpKrakSwoszo', # from 2019-01-01, pm10, http://powietrze.gios.gov.pl/pjp/current/station_details/info/11303
    'MpKrakZloRog', # from 2016-01-01, pm10, http://powietrze.gios.gov.pl/pjp/current/station_details/info/10123
    'MpKrakAlKras', # from 2003-01-01, pm25, pm10, CO, NO2, NOx, benzene, http://powietrze.gios.gov.pl/pjp/current/station_details/info/400
    'MpKrakBujaka', # from 2010-01-01, pm25, pm10, CO, NO2, NOx, benzene, SO2, O3, http://powietrze.gios.gov.pl/pjp/current/station_details/info/401
    'MpKrakBulwar', # from 2003-01-01, pm25, pm10, CO, NO2, NOx, benzene, SO2, http://powietrze.gios.gov.pl/pjp/current/station_details/info/402
    'MpKrakDietla', # from 2016-01-01, pm10, NO2, NOx, http://powietrze.gios.gov.pl/pjp/current/station_details/info/10121
    # Old codes and historical stations
    'MpKrakowWIOSAKra6117', # old code of MpKrakAlKras
    'MpKrakowWIOSBuja6119', # old code of MpKrakBujaka
    'MpKrakowWIOSBulw6118', # old code of MpKrakBulwar
    'MpKrakowWIOSPrad6115', # closed on 2010-02-28
    'MpKrakowWSSEKapi6108', # closed on 2009-12-31
    'MpKrakowWSSEPrad6102', # closed on 2004-12-31
    'MpKrakowWSSERPod6113', # closed on 2004-12-31
    'MpKrakTelime' # closed on 2018-06-01
]
source_dir = '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/'
# Years as strings, '2000' through '2019' inclusive
years = [str(year) for year in range(2000, 2020)]
%%time
# Get all hourly (1g) files from 2016-2019 inclusive
file_search_pattern = '201[6789]_*_1g.xlsx'
get_files_for_name_pattern(source_dir, file_search_pattern)
CPU times: user 1.17 ms, sys: 681 µs, total: 1.86 ms Wall time: 1.77 ms
['/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_O3_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_NO2_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_PM2.5_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_PM25_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_Hg(TGM)_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_SO2_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_Hg(TGM)_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_C6H6_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_CO_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_Hg(TGM)_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_C6H6_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_NOx_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_PM10_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_NO2_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_O3_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_SO2_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_NOx_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_PM25_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_CO_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_NO2_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_PM10_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_NOx_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_C6H6_1g.xlsx', 
'/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_SO2_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_O3_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_PM10_1g.xlsx', '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_CO_1g.xlsx']
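The search pattern above uses shell-style wildcards: `[6789]` matches exactly one of the listed digits and `*` matches any run of characters. The sketch below applies the same pattern with the standard-library `fnmatch` module to a hand-written list of names (illustrative, not read from disk) to show which ones are selected. How `get_files_for_name_pattern` walks the folder internally is not shown in this notebook.

```python
import fnmatch

# Illustrative file names only; nothing is read from disk here.
file_names = [
    "2015_PM10_1g.xlsx",   # year outside 2016-2019
    "2016_PM2.5_1g.xlsx",
    "2017_NO2_24g.xlsx",   # not an hourly (1g) file
    "2018_SO2_1g.xlsx",
]

pattern = "201[6789]_*_1g.xlsx"
matches = fnmatch.filter(file_names, pattern)
print(matches)  # ['2016_PM2.5_1g.xlsx', '2018_SO2_1g.xlsx']
```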
%%time
file = '2018_SO2_1g.xlsx'
full_path_to_file = os.path.join(source_dir, file)
# Derive the measurement (pollutant) name from the file name
measurement_name = file.split('_')[1]
measurement_name
df1 = get_pollutant_measures_for_locations(full_path_to_file, ems_codes, measurement_name, '2018')
df1.head()
/Users/ksatola/anaconda3/lib/python3.7/site-packages/numpy/lib/nanfunctions.py:1115: RuntimeWarning: All-NaN slice encountered overwrite_input=overwrite_input)
CPU times: user 11.9 s, sys: 101 ms, total: 12 s Wall time: 12 s
Datetime | SO2_mean | SO2_median | SO2_min | SO2_max | SO2_std | SO2_sum | SO2_obs_num |
---|---|---|---|---|---|---|---|
2018-01-01 01:00:00 | 8.07894 | 8.07894 | 8.07894 | 8.07894 | NaN | 8.07894 | 1 |
2018-01-01 02:00:00 | NaN | NaN | NaN | NaN | NaN | 0.00000 | 0 |
2018-01-01 03:00:00 | NaN | NaN | NaN | NaN | NaN | 0.00000 | 0 |
2018-01-01 04:00:00 | NaN | NaN | NaN | NaN | NaN | 0.00000 | 0 |
2018-01-01 05:00:00 | NaN | NaN | NaN | NaN | NaN | 0.00000 | 0 |
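The file names follow a `<year>_<pollutant>_<frequency>.xlsx` convention, which is why `file.split('_')[1]` yields the pollutant. The build log further down reports `measurement_name: PM25` for files named `..._PM2.5_1g.xlsx`, so the name seems to be normalized somewhere. This sketch parses all three parts and drops the dot; the helper `parse_measurement_file_name` is hypothetical and may differ from what `build_gios_analytical_view` does.

```python
from pathlib import Path


def parse_measurement_file_name(file_name):
    """Split '<year>_<pollutant>_<frequency>.xlsx' into its three parts.

    Hypothetical helper. The dot is removed from the pollutant so that
    'PM2.5' and 'PM25' end up under the same measurement name.
    """
    year, pollutant, frequency = Path(file_name).stem.split("_")
    return year, pollutant.replace(".", ""), frequency


print(parse_measurement_file_name("2018_SO2_1g.xlsx"))    # ('2018', 'SO2', '1g')
print(parse_measurement_file_name("2016_PM2.5_1g.xlsx"))  # ('2016', 'PM25', '1g')
```

Taking the year from the file name as well would remove the need to pass it separately as the last argument in the cells above.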
%%time
file = '2017_C6H6_1g.xlsx'
full_path_to_file = os.path.join(source_dir, file)
# Derive the measurement (pollutant) name from the file name
measurement_name = file.split('_')[1]
measurement_name
df2 = get_pollutant_measures_for_locations(full_path_to_file, ems_codes, measurement_name, '2017')
df2.head()
CPU times: user 4.52 s, sys: 27.9 ms, total: 4.55 s Wall time: 4.56 s
Datetime | C6H6_mean | C6H6_median | C6H6_min | C6H6_max | C6H6_std | C6H6_sum | C6H6_obs_num |
---|---|---|---|---|---|---|---|
2017-01-01 01:00:00 | 5.895385 | 5.895385 | 5.53153 | 6.25924 | 0.514569 | 11.79077 | 2 |
2017-01-01 02:00:00 | 6.491270 | 6.491270 | 5.64930 | 7.33324 | 1.190725 | 12.98254 | 2 |
2017-01-01 03:00:00 | 7.056075 | 7.056075 | 5.99393 | 8.11822 | 1.502100 | 14.11215 | 2 |
2017-01-01 04:00:00 | 8.039045 | 8.039045 | 6.58716 | 9.49093 | 2.053275 | 16.07809 | 2 |
2017-01-01 05:00:00 | 8.633105 | 8.633105 | 7.06201 | 10.20420 | 2.221864 | 17.26621 | 2 |
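Each row in the tables above summarizes, for one timestamp, the readings of all selected stations that reported a value. The sketch below reproduces that kind of row-wise summary with plain pandas on made-up readings for two hypothetical stations. It illustrates the statistics and is not the code of `get_pollutant_measures_for_locations`.

```python
import numpy as np
import pandas as pd

# Hourly readings for two hypothetical stations; NaN marks a missing value.
index = pd.to_datetime(["2017-01-01 01:00", "2017-01-01 02:00", "2017-01-01 03:00"])
readings = pd.DataFrame(
    {"station_a": [5.53153, 8.07894, np.nan], "station_b": [6.25924, np.nan, np.nan]},
    index=index,
)
readings.index.name = "Datetime"


def aggregate_across_stations(df, name):
    """Summarize each row (timestamp) over all station columns."""
    return pd.DataFrame(
        {
            f"{name}_mean": df.mean(axis=1),
            f"{name}_median": df.median(axis=1),
            f"{name}_min": df.min(axis=1),
            f"{name}_max": df.max(axis=1),
            f"{name}_std": df.std(axis=1),        # sample std: NaN for a single reading
            f"{name}_sum": df.sum(axis=1),        # 0.0 when every reading is missing
            f"{name}_obs_num": df.count(axis=1),  # number of non-missing readings
        }
    )


stats = aggregate_across_stations(readings, "C6H6")
print(stats)
```

This matches the pattern in the SO2 output earlier: one reporting station gives a NaN standard deviation, and none gives a sum of 0 with every other statistic NaN.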
# Merge the data frames on their datetime index
#df3 = pd.DataFrame() # the outer merge also works if one of the data frames is empty
merged = pd.merge(df1, df2, how='outer', left_index=True, right_index=True)
merged.head()
Datetime | SO2_mean | SO2_median | SO2_min | SO2_max | SO2_std | SO2_sum | SO2_obs_num | C6H6_mean | C6H6_median | C6H6_min | C6H6_max | C6H6_std | C6H6_sum | C6H6_obs_num |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2017-01-01 01:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 5.895385 | 5.895385 | 5.53153 | 6.25924 | 0.514569 | 11.79077 | 2.0 |
2017-01-01 02:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 6.491270 | 6.491270 | 5.64930 | 7.33324 | 1.190725 | 12.98254 | 2.0 |
2017-01-01 03:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 7.056075 | 7.056075 | 5.99393 | 8.11822 | 1.502100 | 14.11215 | 2.0 |
2017-01-01 04:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8.039045 | 8.039045 | 6.58716 | 9.49093 | 2.053275 | 16.07809 | 2.0 |
2017-01-01 05:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8.633105 | 8.633105 | 7.06201 | 10.20420 | 2.221864 | 17.26621 | 2.0 |
merged.tail()
Datetime | SO2_mean | SO2_median | SO2_min | SO2_max | SO2_std | SO2_sum | SO2_obs_num | C6H6_mean | C6H6_median | C6H6_min | C6H6_max | C6H6_std | C6H6_sum | C6H6_obs_num |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2018-12-31 20:00:00 | 6.531955 | 6.531955 | 5.81358 | 7.25033 | 1.015936 | 13.06391 | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-12-31 21:00:00 | 7.601315 | 7.601315 | 5.50472 | 9.69791 | 2.965033 | 15.20263 | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-12-31 22:00:00 | 8.165295 | 8.165295 | 5.41679 | 10.91380 | 3.886973 | 16.33059 | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-12-31 23:00:00 | 8.826955 | 8.826955 | 5.91481 | 11.73910 | 4.118395 | 17.65391 | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2019-01-01 00:00:00 | 9.130160 | 9.130160 | 6.26282 | 11.99750 | 4.055031 | 18.26032 | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
df1.shape
(8760, 7)
df2.shape
(8760, 7)
merged.shape
(17520, 14)
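The shapes follow directly from the outer merge. The two frames cover different years, so their datetime indexes do not overlap: the rows add up (8760 + 8760 = 17520), the columns sit side by side (7 + 7 = 14), and each frame's columns are NaN wherever only the other frame has data. A scaled-down sketch with made-up values shows the same behaviour.

```python
import pandas as pd

# Two tiny frames with non-overlapping datetime indexes (illustrative values).
so2 = pd.DataFrame(
    {"SO2_mean": [6.5, 7.6]},
    index=pd.to_datetime(["2018-12-31 20:00", "2018-12-31 21:00"]),
)
c6h6 = pd.DataFrame(
    {"C6H6_mean": [5.9, 6.5]},
    index=pd.to_datetime(["2017-01-01 01:00", "2017-01-01 02:00"]),
)

# Same call as in the cell above: union of both indexes, sorted by timestamp.
merged_small = pd.merge(so2, c6h6, how="outer", left_index=True, right_index=True)

print(merged_small.shape)  # (4, 2)
print(merged_small)
```

If the two frames covered the same year, matching timestamps would collapse into a single row, and the row count would stay at the size of the shared index.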
%%time
df_1g = build_gios_analytical_view(years=years, sampling_freq='1g', root_folder=source_dir, ems_codes=ems_codes)
Year: 2000 - df_full.shape (0, 0) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2000_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2000_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2000_O3_1g.xlsx - measurement_name: O3 ---------------------------------------- Year: 2001 - df_full.shape (8784, 14) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_SO2_1g.xlsx - measurement_name: SO2 ---------------------------------------- Year: 2002 - df_full.shape (17544, 21) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_O3_1g.xlsx - measurement_name: O3 ---------------------------------------- Year: 2003 - df_full.shape (26304, 28) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_C6H6_1g.xlsx - measurement_name: C6H6 File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_NO2_1g.xlsx - measurement_name: NO2 ---------------------------------------- Year: 2004 - df_full.shape (35064, 35) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_SO2_1g.xlsx - measurement_name: SO2 ---------------------------------------- Year: 2005 - df_full.shape (43848, 42) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_NO2_1g.xlsx - measurement_name: NO2 File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_PM10_1g.xlsx - measurement_name: PM10 ---------------------------------------- Year: 2006 - df_full.shape (52608, 42) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_PM2.5_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_CO_1g.xlsx - measurement_name: CO ---------------------------------------- Year: 2007 - df_full.shape (61368, 42) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_NOx_1g.xlsx - measurement_name: NOx File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_PM2.5_1g.xlsx - measurement_name: PM25 ---------------------------------------- Year: 2008 - df_full.shape (70128, 42) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_PM2.5_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_SO2_1g.xlsx - measurement_name: SO2 ---------------------------------------- Year: 2009 - df_full.shape (78912, 56) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_PM2.5_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_NO2_1g.xlsx - measurement_name: NO2 File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_PM10_1g.xlsx - measurement_name: PM10 ---------------------------------------- Year: 2010 - df_full.shape (87672, 56) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_PM2.5_1g.xlsx - measurement_name: PM25 ---------------------------------------- Year: 2011 - df_full.shape (96432, 56) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_PM2.5_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_SO2_1g.xlsx - measurement_name: SO2 File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_PM10_1g.xlsx - measurement_name: PM10 ---------------------------------------- Year: 2012 - df_full.shape (105192, 56) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_PM2.5_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_NOx_1g.xlsx - measurement_name: NOx ---------------------------------------- Year: 2013 - df_full.shape (113977, 56) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_PM2.5_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_O3_1g.xlsx - measurement_name: O3 File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_NO2_1g.xlsx - measurement_name: NO2 ---------------------------------------- Year: 2014 - df_full.shape (122737, 56) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_PM2.5_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_SO2_1g.xlsx - measurement_name: SO2 ---------------------------------------- Year: 2015 - df_full.shape (131497, 56) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_PM25_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_Hg(TGM)_1g.xlsx - measurement_name: Hg(TGM) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_CO_1g.xlsx - measurement_name: CO File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_NO2_1g.xlsx - measurement_name: NO2 ---------------------------------------- Year: 2016 - df_full.shape (140257, 56) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_PM2.5_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_Hg(TGM)_1g.xlsx - measurement_name: Hg(TGM) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_SO2_1g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_O3_1g.xlsx - measurement_name: O3 ---------------------------------------- Year: 2017 - df_full.shape (149041, 56) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_Hg(TGM)_1g.xlsx - measurement_name: Hg(TGM) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_SO2_1g.xlsx - measurement_name: SO2 File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_PM25_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_CO_1g.xlsx - measurement_name: CO ---------------------------------------- Year: 2018 - df_full.shape (157801, 56) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_O3_1g.xlsx - measurement_name: O3 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_NO2_1g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_PM25_1g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_Hg(TGM)_1g.xlsx - measurement_name: Hg(TGM) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_PM10_1g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_NOx_1g.xlsx - measurement_name: NOx File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_CO_1g.xlsx - measurement_name: CO File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_C6H6_1g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_SO2_1g.xlsx - measurement_name: SO2 ---------------------------------------- Year: 2019 - df_full.shape (166561, 56) ---------------------------------------- CPU times: user 13min 43s, sys: 4.54 s, total: 13min 48s Wall time: 13min 50s
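The log pairs each extracted workbook with a measurement name: `2006_PM2.5_1g.xlsx` is reported as `PM25`, while `2015_Hg(TGM)_1g.xlsx` keeps `Hg(TGM)`. Below is a minimal sketch of how such a mapping could be derived from the file name, assuming the `<year>_<pollutant>_<frequency>.xlsx` pattern visible in the log. `parse_gios_filename` is a hypothetical helper for illustration, not a function from `prepare`, whose actual logic may differ.

```python
import re
from pathlib import PurePosixPath

# Pattern assumed from the log: <year>_<pollutant>_<sampling frequency>.xlsx
FILENAME_RE = re.compile(r"^(?P<year>\d{4})_(?P<pollutant>.+)_(?P<freq>\d+g)\.xlsx$")


def parse_gios_filename(path):
    """Hypothetical helper: split a file name into (year, measurement_name, freq)."""
    name = PurePosixPath(path).name
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"Unexpected file name: {name}")
    pollutant = match.group("pollutant")
    # The log reports PM2.5 files under the name PM25; other names are kept as-is.
    if pollutant == "PM2.5":
        pollutant = "PM25"
    return int(match.group("year")), pollutant, match.group("freq")


print(parse_gios_filename("extracted/2006_PM2.5_1g.xlsx"))    # (2006, 'PM25', '1g')
print(parse_gios_filename("extracted/2015_Hg(TGM)_1g.xlsx"))  # (2015, 'Hg(TGM)', '1g')
```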
df_1g.shape
(166561, 56)
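As a quick plausibility check, the row count can be compared with the number of hourly timestamps in the covered period. The sketch below assumes one row per hour between the first and last index values that appear in the `df_1g.head()` and `df_1g.tail()` outputs (2000-01-01 01:00 and 2019-01-01 00:00).

```python
from datetime import datetime, timedelta

first = datetime(2000, 1, 1, 1)  # first index value in df_1g.head()
last = datetime(2019, 1, 1, 0)   # last index value in df_1g.tail()

# Number of hourly steps between first and last, counting both ends
expected_rows = (last - first) // timedelta(hours=1) + 1
print(expected_rows)  # 166560
```

The reported shape has 166561 rows, one more than this estimate, which suggests the index contains at least one timestamp that is duplicated or does not fall on the regular hourly grid.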
df_1g.head()
Datetime | C6H6_max | C6H6_mean | C6H6_median | C6H6_min | C6H6_obs_num | C6H6_std | C6H6_sum | CO_max | CO_mean | CO_median | CO_min | CO_obs_num | CO_std | CO_sum | NO2_max | NO2_mean | NO2_median | NO2_min | NO2_obs_num | NO2_std | NO2_sum | NOx_max | NOx_mean | NOx_median | NOx_min | NOx_obs_num | NOx_std | NOx_sum | O3_max | O3_mean | O3_median | O3_min | O3_obs_num | O3_std | O3_sum | PM10_max | PM10_mean | PM10_median | PM10_min | PM10_obs_num | PM10_std | PM10_sum | PM25_max | PM25_mean | PM25_median | PM25_min | PM25_obs_num | PM25_std | PM25_sum | SO2_max | SO2_mean | SO2_median | SO2_min | SO2_obs_num | SO2_std | SO2_sum |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2000-01-01 01:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2000-01-01 02:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 62.0 | 49.333333 | 48.0 | 38.0 | 3.0 | 12.055428 | 148.0 | 170.0 | 121.000000 | 105.0 | 88.0 | 3.0 | 43.278170 | 363.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2000-01-01 03:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 56.0 | 46.666667 | 47.0 | 37.0 | 3.0 | 9.504385 | 140.0 | 181.0 | 116.000000 | 96.0 | 71.0 | 3.0 | 57.662813 | 348.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2000-01-01 04:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 52.0 | 44.666667 | 46.0 | 36.0 | 3.0 | 8.082904 | 134.0 | 162.0 | 115.333333 | 106.0 | 78.0 | 3.0 | 42.770706 | 346.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2000-01-01 05:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 53.0 | 43.666667 | 43.0 | 35.0 | 3.0 | 9.018500 | 131.0 | 154.0 | 113.000000 | 105.0 | 80.0 | 3.0 | 37.643060 | 339.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
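The column names follow a `<pollutant>_<statistic>` pattern, and the values are consistent with per-timestamp statistics computed over several station readings: in the 02:00 row, three NO2 observations of 62, 48 and 38 reproduce the max, mean, median, min, standard deviation and sum shown. Below is a minimal sketch of such an aggregation, assuming one column per station. The station column names are hypothetical, and the actual logic in `build_gios_analytical_view` may differ.

```python
import pandas as pd

nan = float("nan")

# Hypothetical station columns holding hourly NO2 readings
no2 = pd.DataFrame(
    {
        "station_a": [nan, 62.0],
        "station_b": [nan, 48.0],
        "station_c": [nan, 38.0],
    },
    index=pd.DatetimeIndex(
        ["2000-01-01 01:00:00", "2000-01-01 02:00:00"], name="Datetime"
    ),
)

# Row-wise statistics across stations; NaN readings are skipped by default
stats = pd.DataFrame(
    {
        "NO2_max": no2.max(axis=1),
        "NO2_mean": no2.mean(axis=1),
        "NO2_median": no2.median(axis=1),
        "NO2_min": no2.min(axis=1),
        "NO2_obs_num": no2.count(axis=1).astype(float),
        "NO2_std": no2.std(axis=1),  # sample standard deviation (ddof=1)
        "NO2_sum": no2.sum(axis=1),  # a row with no readings sums to 0.0
    }
)
print(stats)
```

Under this assumption, the first row of the preview (`NO2_obs_num` and `NO2_sum` equal to 0.0, all other NO2 statistics `NaN`) is what pandas returns for a timestamp where the stations exist but none of them reported a value.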
df_1g.tail()
Datetime | C6H6_max | C6H6_mean | C6H6_median | C6H6_min | C6H6_obs_num | C6H6_std | C6H6_sum | CO_max | CO_mean | CO_median | CO_min | CO_obs_num | CO_std | CO_sum | NO2_max | NO2_mean | NO2_median | NO2_min | NO2_obs_num | NO2_std | NO2_sum | NOx_max | NOx_mean | NOx_median | NOx_min | NOx_obs_num | NOx_std | NOx_sum | O3_max | O3_mean | O3_median | O3_min | O3_obs_num | O3_std | O3_sum | PM10_max | PM10_mean | PM10_median | PM10_min | PM10_obs_num | PM10_std | PM10_sum | PM25_max | PM25_mean | PM25_median | PM25_min | PM25_obs_num | PM25_std | PM25_sum | SO2_max | SO2_mean | SO2_median | SO2_min | SO2_obs_num | SO2_std | SO2_sum |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2018-12-31 20:00:00 | 2.76298 | 1.713593 | 1.75158 | 0.62622 | 3.0 | 1.068886 | 5.14078 | 0.72661 | 0.585620 | 0.585620 | 0.44463 | 2.0 | 0.199390 | 1.17124 | 67.4538 | 45.557525 | 42.55340 | 29.6695 | 4.0 | 15.854688 | 182.2301 | 212.6990 | 97.096300 | 69.32805 | 37.0301 | 4.0 | 78.963510 | 388.3852 | 23.7920 | 23.7920 | 23.7920 | 23.7920 | 1.0 | NaN | 23.7920 | 41.8932 | 29.219671 | 28.5545 | 15.3653 | 7.0 | 9.677638 | 204.5377 | 25.1614 | 20.140967 | 23.6990 | 11.5625 | 3.0 | 7.465067 | 60.4229 | 7.25033 | 6.531955 | 6.531955 | 5.81358 | 2.0 | 1.015936 | 13.06391 |
2018-12-31 21:00:00 | 3.61236 | 2.154820 | 1.68318 | 1.16892 | 3.0 | 1.288190 | 6.46446 | 0.77990 | 0.660650 | 0.660650 | 0.54140 | 2.0 | 0.168645 | 1.32130 | 56.6802 | 41.029525 | 38.75120 | 29.9355 | 4.0 | 11.413497 | 164.1181 | 165.4850 | 81.358750 | 61.15985 | 37.6303 | 4.0 | 57.981694 | 325.4350 | 21.1737 | 21.1737 | 21.1737 | 21.1737 | 1.0 | NaN | 21.1737 | 53.3517 | 38.305571 | 37.9557 | 27.0842 | 7.0 | 9.636778 | 268.1390 | 35.7650 | 30.312100 | 32.6308 | 22.5405 | 3.0 | 6.910436 | 90.9363 | 9.69791 | 7.601315 | 7.601315 | 5.50472 | 2.0 | 2.965033 | 15.20263 |
2018-12-31 22:00:00 | 3.35900 | 2.026807 | 1.43370 | 1.28772 | 3.0 | 1.156020 | 6.08042 | 0.54587 | 0.535710 | 0.535710 | 0.52555 | 2.0 | 0.014368 | 1.07142 | 39.3984 | 32.127175 | 32.45540 | 24.1995 | 4.0 | 7.101207 | 128.5087 | 98.4181 | 55.409125 | 47.31220 | 28.5940 | 4.0 | 31.921626 | 221.6365 | 27.0917 | 27.0917 | 27.0917 | 27.0917 | 1.0 | NaN | 27.0917 | 50.7413 | 39.311457 | 37.1867 | 30.2702 | 7.0 | 7.212393 | 275.1802 | 35.1773 | 30.402933 | 31.0801 | 24.9514 | 3.0 | 5.146472 | 91.2088 | 10.91380 | 8.165295 | 8.165295 | 5.41679 | 2.0 | 3.886973 | 16.33059 |
2018-12-31 23:00:00 | 3.17358 | 2.017590 | 1.51083 | 1.36836 | 3.0 | 1.003648 | 6.05277 | 0.54440 | 0.497900 | 0.497900 | 0.45140 | 2.0 | 0.065761 | 0.99580 | 37.9001 | 28.491200 | 27.40825 | 21.2482 | 4.0 | 7.657392 | 113.9648 | 85.6241 | 46.385250 | 37.66430 | 24.5883 | 4.0 | 28.021586 | 185.5410 | 32.3864 | 32.3864 | 32.3864 | 32.3864 | 1.0 | NaN | 32.3864 | 56.5092 | 42.888271 | 44.2766 | 31.8605 | 7.0 | 7.730065 | 300.2179 | 34.8589 | 32.065400 | 33.2028 | 28.1345 | 3.0 | 3.503519 | 96.1962 | 11.73910 | 8.826955 | 8.826955 | 5.91481 | 2.0 | 4.118395 | 17.65391 |
2019-01-01 00:00:00 | 2.78365 | 1.957933 | 1.68273 | 1.40742 | 3.0 | 0.728220 | 5.87380 | 0.56017 | 0.515095 | 0.515095 | 0.47002 | 2.0 | 0.063746 | 1.03019 | 37.5347 | 27.325025 | 25.80380 | 20.1578 | 4.0 | 8.543424 | 109.3001 | 79.4643 | 43.043625 | 35.21135 | 22.2875 | 4.0 | 26.750338 | 172.1745 | 34.5747 | 34.5747 | 34.5747 | 34.5747 | 1.0 | NaN | 34.5747 | 58.9693 | 48.698329 | 49.1963 | 37.4338 | 7.0 | 7.329880 | 340.8883 | 44.9021 | 38.654567 | 36.6074 | 34.4542 | 3.0 | 5.516595 | 115.9637 | 11.99750 | 9.130160 | 9.130160 | 6.26282 | 2.0 | 4.055031 | 18.26032 |
df_1g.sample(5)
Datetime | C6H6_max | C6H6_mean | C6H6_median | C6H6_min | C6H6_obs_num | C6H6_std | C6H6_sum | CO_max | CO_mean | CO_median | CO_min | CO_obs_num | CO_std | CO_sum | NO2_max | NO2_mean | NO2_median | NO2_min | NO2_obs_num | NO2_std | NO2_sum | NOx_max | NOx_mean | NOx_median | NOx_min | NOx_obs_num | NOx_std | NOx_sum | O3_max | O3_mean | O3_median | O3_min | O3_obs_num | O3_std | O3_sum | PM10_max | PM10_mean | PM10_median | PM10_min | PM10_obs_num | PM10_std | PM10_sum | PM25_max | PM25_mean | PM25_median | PM25_min | PM25_obs_num | PM25_std | PM25_sum | SO2_max | SO2_mean | SO2_median | SO2_min | SO2_obs_num | SO2_std | SO2_sum |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2018-03-09 02:00:00 | 14.6407 | 9.556283 | 8.371350 | 5.65680 | 3.0 | 4.607675 | 28.66885 | 1.84287 | 1.65907 | 1.65907 | 1.47527 | 2.0 | 0.259932 | 3.31814 | 67.5374 | 52.612175 | 54.03655 | 34.83820 | 4.0 | 13.492997 | 210.44870 | 533.433 | 326.583000 | 286.147 | 200.605 | 4.0 | 148.347785 | 1306.332 | 2.58988 | 2.58988 | 2.58988 | 2.58988 | 1.0 | NaN | 2.58988 | 149.5000 | 114.131813 | 110.0360 | 86.5432 | 8.0 | 21.258410 | 913.0545 | 141.4940 | 101.296267 | 86.70480 | 75.69000 | 3.0 | 35.245209 | 303.8888 | 6.98110 | 6.923020 | 6.923020 | 6.86494 | 2.0 | 0.082138 | 13.84604 |
2004-06-18 23:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.00000 | 0.70000 | 0.70000 | 0.40000 | 2.0 | 0.424264 | 1.40000 | 55.0000 | 45.000000 | 44.00000 | 36.00000 | 3.0 | 9.539392 | 135.00000 | 124.000 | 72.333333 | 54.000 | 39.000 | 3.0 | 45.368859 | 217.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 54.0000 | 36.500000 | 36.5000 | 19.0000 | 2.0 | 24.748737 | 73.0000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4.00000 | 4.000000 | 4.000000 | 4.00000 | 2.0 | 0.000000 | 8.00000 |
2015-06-08 14:00:19.020000 | 0.3000 | 0.300000 | 0.300000 | 0.30000 | 1.0 | NaN | 0.30000 | 0.49916 | 0.39086 | 0.39086 | 0.28256 | 2.0 | 0.153159 | 0.78172 | 71.4505 | 30.252313 | 11.31010 | 7.99634 | 3.0 | 35.717127 | 90.75694 | 161.895 | 61.559000 | 12.100 | 10.682 | 3.0 | 86.896417 | 184.677 | 92.99680 | 92.99680 | 92.99680 | 92.99680 | 1.0 | NaN | 92.99680 | 39.4836 | 35.250333 | 36.9433 | 29.3241 | 3.0 | 5.287103 | 105.7510 | 13.8651 | 11.458133 | 11.10930 | 9.40000 | 3.0 | 2.252897 | 34.3744 | 6.76999 | 4.731125 | 4.731125 | 2.69226 | 2.0 | 2.883391 | 9.46225 |
2008-11-18 14:00:00 | 0.7000 | 0.700000 | 0.700000 | 0.70000 | 1.0 | NaN | 0.70000 | 0.16000 | 0.16000 | 0.16000 | 0.16000 | 1.0 | NaN | 0.16000 | 81.0000 | 44.333333 | 30.00000 | 22.00000 | 3.0 | 32.005208 | 133.00000 | 276.000 | 119.666667 | 52.000 | 31.000 | 3.0 | 135.795189 | 359.000 | 40.00000 | 40.00000 | 40.00000 | 40.00000 | 1.0 | NaN | 40.00000 | 59.0000 | 35.333333 | 29.0000 | 18.0000 | 3.0 | 21.221059 | 106.0000 | 15.0000 | 14.000000 | 14.00000 | 13.00000 | 2.0 | 1.414214 | 28.0000 | 26.00000 | 17.333333 | 23.000000 | 3.00000 | 3.0 | 12.503333 | 52.00000 |
2016-06-05 15:00:00 | 1.0000 | 0.918085 | 0.918085 | 0.83617 | 2.0 | 0.115845 | 1.83617 | 0.80519 | 0.55546 | 0.55546 | 0.30573 | 2.0 | 0.353172 | 1.11092 | 75.5272 | 44.189950 | 38.83565 | 23.56130 | 4.0 | 23.243374 | 176.75980 | 212.400 | 94.433333 | 38.700 | 32.200 | 3.0 | 102.213812 | 283.300 | 68.99890 | 68.99890 | 68.99890 | 68.99890 | 1.0 | NaN | 68.99890 | 33.7213 | 18.862950 | 15.1281 | 12.7374 | 6.0 | 8.253398 | 113.1777 | 16.4247 | 10.254800 | 7.68853 | 6.65117 | 3.0 | 5.368406 | 30.7644 | 2.54552 | 2.045820 | 2.045820 | 1.54612 | 2.0 | 0.706683 | 4.09164 |
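The sample includes an index value of `2015-06-08 14:00:19.020000`, which does not fall on a full hour. Below is a short sketch of how such rows could be located in an hourly frame. The small frame is made-up illustration data; in the notebook the same mask would be built from `df_1g.index`, assuming it is a `DatetimeIndex`.

```python
import pandas as pd

# Illustrative frame; only the timestamps matter here
idx = pd.DatetimeIndex(
    [
        pd.Timestamp("2015-06-08 13:00:00"),
        pd.Timestamp("2015-06-08 14:00:19.020000"),
        pd.Timestamp("2015-06-08 15:00:00"),
    ],
    name="Datetime",
)
df = pd.DataFrame({"PM10_mean": [30.0, 35.0, 33.0]}, index=idx)

# A timestamp is on the hourly grid when minute, second and microsecond are all zero
off_grid = (
    (df.index.minute != 0) | (df.index.second != 0) | (df.index.microsecond != 0)
)
print(df[off_grid])
```

Rows flagged this way could then be inspected manually or snapped to the nearest hour, depending on how the downstream analysis treats them.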
# Create the save directory if it does not exist
save_dir = '/Users/ksatola/Documents/git/air-polution/data/final'
Path(save_dir).mkdir(parents=True, exist_ok=True)
# Save the hourly view, reusing save_dir instead of repeating the full path
gios_1g_all_file = os.path.join(save_dir, 'gios_1g_all.csv')
df_1g.to_csv(gios_1g_all_file, encoding="utf-8", index=True)
# Read the file back to verify the export
df_1g_read = pd.read_csv(gios_1g_all_file, encoding='utf-8', sep=",", index_col="Datetime")
df_1g_read.head()
Datetime | C6H6_max | C6H6_mean | C6H6_median | C6H6_min | C6H6_obs_num | C6H6_std | C6H6_sum | CO_max | CO_mean | CO_median | CO_min | CO_obs_num | CO_std | CO_sum | NO2_max | NO2_mean | NO2_median | NO2_min | NO2_obs_num | NO2_std | NO2_sum | NOx_max | NOx_mean | NOx_median | NOx_min | NOx_obs_num | NOx_std | NOx_sum | O3_max | O3_mean | O3_median | O3_min | O3_obs_num | O3_std | O3_sum | PM10_max | PM10_mean | PM10_median | PM10_min | PM10_obs_num | PM10_std | PM10_sum | PM25_max | PM25_mean | PM25_median | PM25_min | PM25_obs_num | PM25_std | PM25_sum | SO2_max | SO2_mean | SO2_median | SO2_min | SO2_obs_num | SO2_std | SO2_sum |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2000-01-01 01:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2000-01-01 02:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 62.0 | 49.333333 | 48.0 | 38.0 | 3.0 | 12.055428 | 148.0 | 170.0 | 121.000000 | 105.0 | 88.0 | 3.0 | 43.278170 | 363.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2000-01-01 03:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 56.0 | 46.666667 | 47.0 | 37.0 | 3.0 | 9.504385 | 140.0 | 181.0 | 116.000000 | 96.0 | 71.0 | 3.0 | 57.662813 | 348.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2000-01-01 04:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 52.0 | 44.666667 | 46.0 | 36.0 | 3.0 | 8.082904 | 134.0 | 162.0 | 115.333333 | 106.0 | 78.0 | 3.0 | 42.770706 | 346.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2000-01-01 05:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 53.0 | 43.666667 | 43.0 | 35.0 | 3.0 | 9.018500 | 131.0 | 154.0 | 113.000000 | 105.0 | 80.0 | 3.0 | 37.643060 | 339.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
assert df_1g.shape == df_1g_read.shape
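Note that the shape check passes even if the index types differ: without `parse_dates`, `read_csv` loads the `Datetime` column as plain strings rather than timestamps. Below is a minimal sketch of reading the index back as datetimes, using a tiny made-up frame and an in-memory buffer instead of the CSV file on disk.

```python
import io

import pandas as pd

df = pd.DataFrame(
    {"NO2_mean": [49.5, 46.25]},
    index=pd.DatetimeIndex(
        ["2000-01-01 02:00:00", "2000-01-01 03:00:00"], name="Datetime"
    ),
)

buffer = io.StringIO()
df.to_csv(buffer, index=True)

# Default read: the index comes back as strings
buffer.seek(0)
plain = pd.read_csv(buffer, index_col="Datetime")

# parse_dates=True asks pandas to parse the index column as datetimes
buffer.seek(0)
parsed = pd.read_csv(buffer, index_col="Datetime", parse_dates=True)

print(type(plain.index).__name__, type(parsed.index).__name__)
assert df.shape == plain.shape == parsed.shape
```

With a parsed index, time-based operations such as resampling or slicing by date work directly on the reloaded frame.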
%%time
df_24g = build_gios_analytical_view(years=years, sampling_freq='24g', root_folder=source_dir, ems_codes=ems_codes)
Year: 2000 - df_full.shape (0, 0) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2000_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2000_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2000_PM10_24g.xlsx - measurement_name: PM10 ---------------------------------------- Year: 2001 - df_full.shape (366, 14) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2001_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) ---------------------------------------- Year: 2002 - df_full.shape (731, 14) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2002_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) ---------------------------------------- Year: 2003 - df_full.shape (1096, 21) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2003_PM2.5_24g.xlsx - measurement_name: PM25 ---------------------------------------- Year: 2004 - df_full.shape (1461, 28) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2004_PM2.5_24g.xlsx - measurement_name: PM25 ---------------------------------------- Year: 2005 - df_full.shape (1827, 42) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2005_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) ---------------------------------------- Year: 2006 - df_full.shape (2192, 42) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2006_C6H6_24g.xlsx - measurement_name: C6H6 ---------------------------------------- Year: 2007 - df_full.shape (2557, 42) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2007_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) ---------------------------------------- Year: 2008 - df_full.shape (2922, 49) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_DBahA(PM10)_24g.xlsx - measurement_name: DBahA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_As(PM10)_24g.xlsx - measurement_name: 
As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2008_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) ---------------------------------------- Year: 2009 - df_full.shape (3288, 70) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_DBahA(PM10)_24g.xlsx - measurement_name: DBahA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2009_NO2_24g.xlsx - measurement_name: NO2 ---------------------------------------- Year: 2010 - df_full.shape (3653, 105) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_DBahA(PM10)_24g.xlsx - measurement_name: DBahA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_formaldehyd_24g.xlsx - measurement_name: formaldehyd File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2010_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) ---------------------------------------- Year: 2011 - df_full.shape (4018, 105) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_Ca2+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_NH4+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_Mg2+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_DBahA(PM10)_24g.xlsx - measurement_name: DBahA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_NO3-(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_SO42_(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_EC(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_Na+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_K+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_formaldehyd_24g.xlsx - measurement_name: formaldehyd File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_OC(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2011_C6H6_24g.xlsx - measurement_name: C6H6 ---------------------------------------- Year: 2012 - df_full.shape (4383, 105) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_formaldehyd_24g.xlsx - measurement_name: formaldehyd File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_SO42_(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_NO3-(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_EC(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_DBahA(PM10)_24g.xlsx - measurement_name: DBahA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_Mg2+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_Na+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_NH4+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_OC(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_K+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2012_Ca2+(PM2.5)_24g.xlsx - measurement_name: PM25 ---------------------------------------- Year: 2013 - df_full.shape (4751, 105) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_K+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_NO3-(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_OC(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_EC(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_SO2_24g.xlsx - 
measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_SO42_(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_Na+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_Ca2+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_DBahA(PM10)_24g.xlsx - measurement_name: DBahA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_NH4+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2013_Mg2+(PM2.5)_24g.xlsx - measurement_name: PM25 ---------------------------------------- Year: 2014 - df_full.shape (5481, 105) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_Ca2+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_NH4+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_Mg2+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_SO42_(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_DBahA(PM10)_24g.xlsx - measurement_name: DBahA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_Na+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_K+(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_PM2.5_24g.xlsx - measurement_name: PM25 File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_NO3-(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_OC(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_EC(PM2.5)_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2014_formaldehyd_24g.xlsx - measurement_name: formaldehyd ---------------------------------------- Year: 2015 - df_full.shape (5846, 112) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_Jony_w_PM25_24g.xlsx - measurement_name: Jony File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_DBah(PM10)_24g.xlsx - measurement_name: DBah(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_PM10_24g.xlsx - measurement_name: PM10 File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_Hg(TGM)_24g.xlsx - measurement_name: Hg(TGM) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_PM25_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_formaldehyd_24g.xlsx - measurement_name: formaldehyd File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2015_C6H6_24g.xlsx - measurement_name: C6H6 ---------------------------------------- Year: 2016 - df_full.shape (6211, 119) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_Jony_PM2_5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_formaldehyd_24g.xlsx - measurement_name: formaldehyd File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_Hg(TGM)_24g.xlsx - measurement_name: Hg(TGM) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_DBahA(PM10)_24g.xlsx - measurement_name: DBahA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_PM2.5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2016_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) ---------------------------------------- Year: 2017 - df_full.shape (6577, 119) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_formaldehyd_24g.xlsx - measurement_name: formaldehyd File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_Jony_PM2_5_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_DBahA(PM10)_24g.xlsx - measurement_name: DBahA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_PM25_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_PM10_24g.xlsx - measurement_name: PM10 File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2017_Hg(TGM)_24g.xlsx - measurement_name: Hg(TGM) ---------------------------------------- Year: 2018 - df_full.shape (6942, 119) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_Jony_PM25_24g.xlsx - measurement_name: Jony File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_formaldehyd_24g.xlsx - measurement_name: formaldehyd File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_PM10_24g.xlsx - measurement_name: PM10 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_NO2_24g.xlsx - measurement_name: NO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_PM25_24g.xlsx - measurement_name: PM25 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_Ni(PM10)_24g.xlsx - measurement_name: Ni(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_BaA(PM10)_24g.xlsx - measurement_name: BaA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_Pb(PM10)_24g.xlsx - measurement_name: Pb(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_As(PM10)_24g.xlsx - measurement_name: As(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_DBahA(PM10)_24g.xlsx - measurement_name: DBahA(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_C6H6_24g.xlsx - measurement_name: C6H6 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_BkF(PM10)_24g.xlsx - measurement_name: BkF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_Hg(TGM)_24g.xlsx - measurement_name: Hg(TGM) File: 
/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_BjF(PM10)_24g.xlsx - measurement_name: BjF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_SO2_24g.xlsx - measurement_name: SO2 File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_BaP(PM10)_24g.xlsx - measurement_name: BaP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_IP(PM10)_24g.xlsx - measurement_name: IP(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_BbF(PM10)_24g.xlsx - measurement_name: BbF(PM10) File: /Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/2018_Cd(PM10)_24g.xlsx - measurement_name: Cd(PM10) ---------------------------------------- Year: 2019 - df_full.shape (7307, 119) ---------------------------------------- CPU times: user 44.3 s, sys: 495 ms, total: 44.8 s Wall time: 45.8 s
df_24g.shape
(7307, 119)
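The 119 columns follow a `<pollutant>_<statistic>` naming pattern: 17 pollutant names, each with 7 statistics (`max`, `mean`, `median`, `min`, `obs_num`, `std`, `sum`). The sketch below shows one way to split such a column name back into its two parts. The helper `split_column` is hypothetical and not part of the `prepare` module. It matches on the known statistic suffixes rather than splitting on the last underscore, because `obs_num` itself contains an underscore.

```python
# Statistic suffixes as they appear in the df_24g column names.
STATS = ("obs_num", "median", "mean", "max", "min", "std", "sum")


def split_column(name):
    """Split a column such as 'As(PM10)_obs_num' into (pollutant, statistic).

    Hypothetical helper for illustration; not part of the prepare module.
    """
    for stat in STATS:
        suffix = "_" + stat
        if name.endswith(suffix):
            return name[: -len(suffix)], stat
    raise ValueError(f"Unrecognised statistic suffix in column: {name!r}")


# A few column names copied from the analytical view.
sample_columns = ["As(PM10)_obs_num", "PM10_mean", "PM25_std", "SO2_sum"]
parsed = [split_column(c) for c in sample_columns]
print(parsed)

# 17 pollutant groups times 7 statistics gives the 119 columns reported above.
n_columns = 17 * len(STATS)
print(n_columns)
```

With the real data frame, the same helper could be applied to `df_24g.columns` to list the distinct pollutant names or to select all statistics for one pollutant.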
df_24g.head()
As(PM10)_max | As(PM10)_mean | As(PM10)_median | As(PM10)_min | As(PM10)_obs_num | As(PM10)_std | As(PM10)_sum | BaA(PM10)_max | BaA(PM10)_mean | BaA(PM10)_median | BaA(PM10)_min | BaA(PM10)_obs_num | BaA(PM10)_std | BaA(PM10)_sum | BaP(PM10)_max | BaP(PM10)_mean | BaP(PM10)_median | BaP(PM10)_min | BaP(PM10)_obs_num | BaP(PM10)_std | BaP(PM10)_sum | BbF(PM10)_max | BbF(PM10)_mean | BbF(PM10)_median | BbF(PM10)_min | BbF(PM10)_obs_num | BbF(PM10)_std | BbF(PM10)_sum | BjF(PM10)_max | BjF(PM10)_mean | BjF(PM10)_median | BjF(PM10)_min | BjF(PM10)_obs_num | BjF(PM10)_std | BjF(PM10)_sum | BkF(PM10)_max | BkF(PM10)_mean | BkF(PM10)_median | BkF(PM10)_min | BkF(PM10)_obs_num | BkF(PM10)_std | BkF(PM10)_sum | C6H6_max | C6H6_mean | C6H6_median | C6H6_min | C6H6_obs_num | C6H6_std | C6H6_sum | Cd(PM10)_max | Cd(PM10)_mean | Cd(PM10)_median | Cd(PM10)_min | Cd(PM10)_obs_num | Cd(PM10)_std | Cd(PM10)_sum | DBah(PM10)_max | DBah(PM10)_mean | DBah(PM10)_median | DBah(PM10)_min | DBah(PM10)_obs_num | DBah(PM10)_std | DBah(PM10)_sum | DBahA(PM10)_max | DBahA(PM10)_mean | DBahA(PM10)_median | DBahA(PM10)_min | DBahA(PM10)_obs_num | DBahA(PM10)_std | DBahA(PM10)_sum | IP(PM10)_max | IP(PM10)_mean | IP(PM10)_median | IP(PM10)_min | IP(PM10)_obs_num | IP(PM10)_std | IP(PM10)_sum | NO2_max | NO2_mean | NO2_median | NO2_min | NO2_obs_num | NO2_std | NO2_sum | Ni(PM10)_max | Ni(PM10)_mean | Ni(PM10)_median | Ni(PM10)_min | Ni(PM10)_obs_num | Ni(PM10)_std | Ni(PM10)_sum | PM10_max | PM10_mean | PM10_median | PM10_min | PM10_obs_num | PM10_std | PM10_sum | PM25_max | PM25_mean | PM25_median | PM25_min | PM25_obs_num | PM25_std | PM25_sum | Pb(PM10)_max | Pb(PM10)_mean | Pb(PM10)_median | Pb(PM10)_min | Pb(PM10)_obs_num | Pb(PM10)_std | Pb(PM10)_sum | SO2_max | SO2_mean | SO2_median | SO2_min | SO2_obs_num | SO2_std | SO2_sum | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Datetime | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2000-01-01 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 135.9 | 132.95 | 132.95 | 130.0 | 2.0 | 4.171930 | 265.9 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 106.0 | 106.0 | 106.0 | 106.0 | 1.0 | NaN | 106.0 |
2000-01-02 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 129.1 | 122.55 | 122.55 | 116.0 | 2.0 | 9.263099 | 245.1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 93.0 | 93.0 | 93.0 | 93.0 | 1.0 | NaN | 93.0 |
2000-01-03 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 41.2 | 37.10 | 37.10 | 33.0 | 2.0 | 5.798276 | 74.2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 42.0 | 42.0 | 42.0 | 42.0 | 1.0 | NaN | 42.0 |
2000-01-04 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 36.4 | 31.20 | 31.20 | 26.0 | 2.0 | 7.353911 | 62.4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 38.0 | 38.0 | 38.0 | 38.0 | 1.0 | NaN | 38.0 |
2000-01-05 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 33.9 | 28.95 | 28.95 | 24.0 | 2.0 | 7.000357 | 57.9 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 33.0 | 33.0 | 33.0 | 33.0 | 1.0 | NaN | 33.0 |
df_24g.tail()
As(PM10)_max | As(PM10)_mean | As(PM10)_median | As(PM10)_min | As(PM10)_obs_num | As(PM10)_std | As(PM10)_sum | BaA(PM10)_max | BaA(PM10)_mean | BaA(PM10)_median | BaA(PM10)_min | BaA(PM10)_obs_num | BaA(PM10)_std | BaA(PM10)_sum | BaP(PM10)_max | BaP(PM10)_mean | BaP(PM10)_median | BaP(PM10)_min | BaP(PM10)_obs_num | BaP(PM10)_std | BaP(PM10)_sum | BbF(PM10)_max | BbF(PM10)_mean | BbF(PM10)_median | BbF(PM10)_min | BbF(PM10)_obs_num | BbF(PM10)_std | BbF(PM10)_sum | BjF(PM10)_max | BjF(PM10)_mean | BjF(PM10)_median | BjF(PM10)_min | BjF(PM10)_obs_num | BjF(PM10)_std | BjF(PM10)_sum | BkF(PM10)_max | BkF(PM10)_mean | BkF(PM10)_median | BkF(PM10)_min | BkF(PM10)_obs_num | BkF(PM10)_std | BkF(PM10)_sum | C6H6_max | C6H6_mean | C6H6_median | C6H6_min | C6H6_obs_num | C6H6_std | C6H6_sum | Cd(PM10)_max | Cd(PM10)_mean | Cd(PM10)_median | Cd(PM10)_min | Cd(PM10)_obs_num | Cd(PM10)_std | Cd(PM10)_sum | DBah(PM10)_max | DBah(PM10)_mean | DBah(PM10)_median | DBah(PM10)_min | DBah(PM10)_obs_num | DBah(PM10)_std | DBah(PM10)_sum | DBahA(PM10)_max | DBahA(PM10)_mean | DBahA(PM10)_median | DBahA(PM10)_min | DBahA(PM10)_obs_num | DBahA(PM10)_std | DBahA(PM10)_sum | IP(PM10)_max | IP(PM10)_mean | IP(PM10)_median | IP(PM10)_min | IP(PM10)_obs_num | IP(PM10)_std | IP(PM10)_sum | NO2_max | NO2_mean | NO2_median | NO2_min | NO2_obs_num | NO2_std | NO2_sum | Ni(PM10)_max | Ni(PM10)_mean | Ni(PM10)_median | Ni(PM10)_min | Ni(PM10)_obs_num | Ni(PM10)_std | Ni(PM10)_sum | PM10_max | PM10_mean | PM10_median | PM10_min | PM10_obs_num | PM10_std | PM10_sum | PM25_max | PM25_mean | PM25_median | PM25_min | PM25_obs_num | PM25_std | PM25_sum | Pb(PM10)_max | Pb(PM10)_mean | Pb(PM10)_median | Pb(PM10)_min | Pb(PM10)_obs_num | Pb(PM10)_std | Pb(PM10)_sum | SO2_max | SO2_mean | SO2_median | SO2_min | SO2_obs_num | SO2_std | SO2_sum | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Datetime | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2018-12-27 00:00:00 | 1.06 | 0.780000 | 0.78 | 0.5 | 2.0 | 0.395980 | 1.56 | 6.02000 | 6.02000 | 6.02000 | 6.02000 | 1.0 | NaN | 6.02000 | 6.07 | 4.752500 | 4.745 | 3.45 | 4.0 | 1.218589 | 19.01000 | 2.69000 | 2.69000 | 2.69000 | 2.69000 | 1.0 | NaN | 2.69000 | 2.08000 | 2.08000 | 2.08000 | 2.08000 | 1.0 | NaN | 2.08000 | 2.00000 | 2.00000 | 2.00000 | 2.00000 | 1.0 | NaN | 2.00000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.37000 | 0.315000 | 0.315 | 0.26 | 2.0 | 0.077782 | 0.63000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.4700 | 0.4700 | 0.4700 | 0.4700 | 1.0 | NaN | 0.4700 | 3.99000 | 3.99000 | 3.99000 | 3.99000 | 1.0 | NaN | 3.99000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.00 | 0.625000 | 0.62500 | 0.25 | 2.0 | 0.530330 | 1.25000 | 19.36 | 18.195 | 18.87 | 15.68 | 4.0 | 1.713816 | 72.78 | 16.32 | 16.32 | 16.32 | 16.32 | 1.0 | NaN | 16.32 | 0.00969 | 0.007035 | 0.007035 | 0.00438 | 2.0 | 0.003755 | 0.01407 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-12-28 00:00:00 | 1.06 | 0.686667 | 0.50 | 0.5 | 3.0 | 0.323316 | 2.06 | 6.02000 | 6.02000 | 6.02000 | 6.02000 | 1.0 | NaN | 6.02000 | 6.07 | 4.768000 | 4.830 | 3.45 | 5.0 | 1.055898 | 23.84000 | 2.69000 | 2.69000 | 2.69000 | 2.69000 | 1.0 | NaN | 2.69000 | 2.08000 | 2.08000 | 2.08000 | 2.08000 | 1.0 | NaN | 2.08000 | 2.00000 | 2.00000 | 2.00000 | 2.00000 | 1.0 | NaN | 2.00000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.37000 | 0.306667 | 0.290 | 0.26 | 3.0 | 0.056862 | 0.92000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.4700 | 0.4700 | 0.4700 | 0.4700 | 1.0 | NaN | 0.4700 | 3.99000 | 3.99000 | 3.99000 | 3.99000 | 1.0 | NaN | 3.99000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.94 | 1.063333 | 1.00000 | 0.25 | 3.0 | 0.846778 | 3.19000 | 27.65 | 22.016 | 23.81 | 11.99 | 5.0 | 6.295084 | 110.08 | 23.42 | 23.42 | 23.42 | 23.42 | 1.0 | NaN | 23.42 | 0.01195 | 0.008673 | 0.009690 | 0.00438 | 3.0 | 0.003886 | 0.02602 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-12-29 00:00:00 | 1.06 | 0.686667 | 0.50 | 0.5 | 3.0 | 0.323316 | 2.06 | 6.02000 | 6.02000 | 6.02000 | 6.02000 | 1.0 | NaN | 6.02000 | 6.07 | 4.768000 | 4.830 | 3.45 | 5.0 | 1.055898 | 23.84000 | 2.69000 | 2.69000 | 2.69000 | 2.69000 | 1.0 | NaN | 2.69000 | 2.08000 | 2.08000 | 2.08000 | 2.08000 | 1.0 | NaN | 2.08000 | 2.00000 | 2.00000 | 2.00000 | 2.00000 | 1.0 | NaN | 2.00000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.37000 | 0.306667 | 0.290 | 0.26 | 3.0 | 0.056862 | 0.92000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.4700 | 0.4700 | 0.4700 | 0.4700 | 1.0 | NaN | 0.4700 | 3.99000 | 3.99000 | 3.99000 | 3.99000 | 1.0 | NaN | 3.99000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.94 | 1.063333 | 1.00000 | 0.25 | 3.0 | 0.846778 | 3.19000 | 23.15 | 18.350 | 17.70 | 14.50 | 5.0 | 3.357871 | 91.75 | 20.64 | 20.64 | 20.64 | 20.64 | 1.0 | NaN | 20.64 | 0.01195 | 0.008673 | 0.009690 | 0.00438 | 3.0 | 0.003886 | 0.02602 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-12-30 00:00:00 | 1.06 | 0.686667 | 0.50 | 0.5 | 3.0 | 0.323316 | 2.06 | 6.02000 | 6.02000 | 6.02000 | 6.02000 | 1.0 | NaN | 6.02000 | 6.07 | 4.768000 | 4.830 | 3.45 | 5.0 | 1.055898 | 23.84000 | 2.69000 | 2.69000 | 2.69000 | 2.69000 | 1.0 | NaN | 2.69000 | 2.08000 | 2.08000 | 2.08000 | 2.08000 | 1.0 | NaN | 2.08000 | 2.00000 | 2.00000 | 2.00000 | 2.00000 | 1.0 | NaN | 2.00000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.37000 | 0.306667 | 0.290 | 0.26 | 3.0 | 0.056862 | 0.92000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.4700 | 0.4700 | 0.4700 | 0.4700 | 1.0 | NaN | 0.4700 | 3.99000 | 3.99000 | 3.99000 | 3.99000 | 1.0 | NaN | 3.99000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.94 | 1.063333 | 1.00000 | 0.25 | 3.0 | 0.846778 | 3.19000 | 22.84 | 17.136 | 16.70 | 13.48 | 5.0 | 3.579648 | 85.68 | 19.74 | 19.74 | 19.74 | 19.74 | 1.0 | NaN | 19.74 | 0.01195 | 0.008673 | 0.009690 | 0.00438 | 3.0 | 0.003886 | 0.02602 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2018-12-31 00:00:00 | 0.50 | 0.500000 | 0.50 | 0.5 | 3.0 | 0.000000 | 1.50 | 4.21025 | 4.21025 | 4.21025 | 4.21025 | 1.0 | NaN | 4.21025 | 4.48 | 3.569046 | 3.780 | 2.36 | 5.0 | 0.773945 | 17.84523 | 1.84231 | 1.84231 | 1.84231 | 1.84231 | 1.0 | NaN | 1.84231 | 1.90216 | 1.90216 | 1.90216 | 1.90216 | 1.0 | NaN | 1.90216 | 1.35311 | 1.35311 | 1.35311 | 1.35311 | 1.0 | NaN | 1.35311 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.28623 | 0.238743 | 0.260 | 0.17 | 3.0 | 0.060961 | 0.71623 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.2472 | 0.2472 | 0.2472 | 0.2472 | 1.0 | NaN | 0.2472 | 2.46422 | 2.46422 | 2.46422 | 2.46422 | 1.0 | NaN | 2.46422 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.68 | 0.982390 | 1.01717 | 0.25 | 3.0 | 0.715634 | 2.94717 | 20.22 | 15.644 | 16.28 | 10.71 | 5.0 | 3.850303 | 78.22 | 17.83 | 17.83 | 17.83 | 17.83 | 1.0 | NaN | 17.83 | 0.00693 | 0.005540 | 0.005650 | 0.00404 | 3.0 | 0.001448 | 0.01662 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
df_24g.sample(5)
As(PM10)_max | As(PM10)_mean | As(PM10)_median | As(PM10)_min | As(PM10)_obs_num | As(PM10)_std | As(PM10)_sum | BaA(PM10)_max | BaA(PM10)_mean | BaA(PM10)_median | BaA(PM10)_min | BaA(PM10)_obs_num | BaA(PM10)_std | BaA(PM10)_sum | BaP(PM10)_max | BaP(PM10)_mean | BaP(PM10)_median | BaP(PM10)_min | BaP(PM10)_obs_num | BaP(PM10)_std | BaP(PM10)_sum | BbF(PM10)_max | BbF(PM10)_mean | BbF(PM10)_median | BbF(PM10)_min | BbF(PM10)_obs_num | BbF(PM10)_std | BbF(PM10)_sum | BjF(PM10)_max | BjF(PM10)_mean | BjF(PM10)_median | BjF(PM10)_min | BjF(PM10)_obs_num | BjF(PM10)_std | BjF(PM10)_sum | BkF(PM10)_max | BkF(PM10)_mean | BkF(PM10)_median | BkF(PM10)_min | BkF(PM10)_obs_num | BkF(PM10)_std | BkF(PM10)_sum | C6H6_max | C6H6_mean | C6H6_median | C6H6_min | C6H6_obs_num | C6H6_std | C6H6_sum | Cd(PM10)_max | Cd(PM10)_mean | Cd(PM10)_median | Cd(PM10)_min | Cd(PM10)_obs_num | Cd(PM10)_std | Cd(PM10)_sum | DBah(PM10)_max | DBah(PM10)_mean | DBah(PM10)_median | DBah(PM10)_min | DBah(PM10)_obs_num | DBah(PM10)_std | DBah(PM10)_sum | DBahA(PM10)_max | DBahA(PM10)_mean | DBahA(PM10)_median | DBahA(PM10)_min | DBahA(PM10)_obs_num | DBahA(PM10)_std | DBahA(PM10)_sum | IP(PM10)_max | IP(PM10)_mean | IP(PM10)_median | IP(PM10)_min | IP(PM10)_obs_num | IP(PM10)_std | IP(PM10)_sum | NO2_max | NO2_mean | NO2_median | NO2_min | NO2_obs_num | NO2_std | NO2_sum | Ni(PM10)_max | Ni(PM10)_mean | Ni(PM10)_median | Ni(PM10)_min | Ni(PM10)_obs_num | Ni(PM10)_std | Ni(PM10)_sum | PM10_max | PM10_mean | PM10_median | PM10_min | PM10_obs_num | PM10_std | PM10_sum | PM25_max | PM25_mean | PM25_median | PM25_min | PM25_obs_num | PM25_std | PM25_sum | Pb(PM10)_max | Pb(PM10)_mean | Pb(PM10)_median | Pb(PM10)_min | Pb(PM10)_obs_num | Pb(PM10)_std | Pb(PM10)_sum | SO2_max | SO2_mean | SO2_median | SO2_min | SO2_obs_num | SO2_std | SO2_sum | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Datetime | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2013-03-07 23:59:59 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3.100 | 2.150 | 2.150 | 1.200 | 2.0 | 1.343503 | 4.300 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2016-04-01 00:00:00 | 2.008 | 1.7545 | 1.7545 | 1.501 | 2.0 | 0.358503 | 3.509 | 4.928 | 4.928 | 4.928 | 4.928 | 1.0 | NaN | 4.928 | 5.712 | 4.974 | 4.974 | 4.236 | 2.0 | 1.043690 | 9.948 | 1.920 | 1.920 | 1.920 | 1.920 | 1.0 | NaN | 1.920 | 3.497 | 3.497 | 3.497 | 3.497 | 1.0 | NaN | 3.497 | 4.083 | 4.083 | 4.083 | 4.083 | 1.0 | NaN | 4.083 | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.0 | 1.943 | 1.2850 | 1.2850 | 0.627 | 2.0 | 0.930553 | 2.570 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.536 | 0.536 | 0.536 | 0.536 | 1.0 | NaN | 0.536 | 4.075 | 4.075 | 4.075 | 4.075 | 1.0 | NaN | 4.075 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.056 | 1.816 | 1.816 | 1.576 | 2.0 | 0.339411 | 3.632 | 41.0 | 32.525 | 30.15 | 28.8 | 4.0 | 5.697587 | 130.1 | 22.300 | 22.3000 | 22.3000 | 22.300 | 1.0 | NaN | 22.300 | 0.07153 | 0.047235 | 0.047235 | 0.02294 | 2.0 | 0.034358 | 0.09447 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2002-01-06 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 73.6 | 73.600 | 73.60 | 73.6 | 1.0 | NaN | 73.6 | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 64.625 | 64.625 | 64.625 | 64.625 | 1.0 | NaN | 64.625 |
2014-05-14 00:00:00 | 0.500 | 0.4660 | 0.4660 | 0.432 | 2.0 | 0.048083 | 0.932 | 0.376 | 0.376 | 0.376 | 0.376 | 1.0 | NaN | 0.376 | 0.985 | 0.842 | 0.842 | 0.699 | 2.0 | 0.202233 | 1.684 | 1.019 | 1.019 | 1.019 | 1.019 | 1.0 | NaN | 1.019 | 2.438 | 2.438 | 2.438 | 2.438 | 1.0 | NaN | 2.438 | 0.587 | 0.587 | 0.587 | 0.587 | 1.0 | NaN | 0.587 | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.0 | 0.283 | 0.2605 | 0.2605 | 0.238 | 2.0 | 0.031820 | 0.521 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.051 | 0.051 | 0.051 | 0.051 | 1.0 | NaN | 0.051 | 0.350 | 0.350 | 0.350 | 0.350 | 1.0 | NaN | 0.350 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.498 | 1.200 | 1.200 | 0.902 | 2.0 | 0.421436 | 2.400 | 15.0 | 14.500 | 14.50 | 14.0 | 2.0 | 0.707107 | 29.0 | 12.000 | 12.0000 | 12.0000 | 12.000 | 1.0 | NaN | 12.000 | 0.00700 | 0.006000 | 0.006000 | 0.00500 | 2.0 | 0.001414 | 0.01200 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2009-10-08 00:00:00 | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.000 | 1.300 | 1.300 | 1.300 | 1.300 | 1.0 | NaN | 1.300 | 6.600 | 6.600 | 6.600 | 6.600 | 1.0 | NaN | 6.600 | 3.500 | 3.500 | 3.500 | 3.500 | 1.0 | NaN | 3.500 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.300 | 2.300 | 2.300 | 2.300 | 1.0 | NaN | 2.300 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.900 | 0.900 | 0.900 | 0.900 | 1.0 | NaN | 0.900 | 0.200 | 0.200 | 0.200 | 0.200 | 1.0 | NaN | 0.200 | 31.0 | 31.0 | 31.0 | 31.0 | 1.0 | NaN | 31.0 | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.000 | 46.0 | 42.000 | 42.00 | 38.0 | 2.0 | 5.656854 | 84.0 | 22.458 | 22.4165 | 22.4165 | 22.375 | 2.0 | 0.05869 | 44.833 | NaN | NaN | NaN | NaN | 0.0 | NaN | 0.00000 | 10.000 | 7.024 | 7.024 | 4.048 | 2.0 | 4.2087 | 14.048 |
# Create the save directory if it does not exist
save_dir = '/Users/ksatola/Documents/git/air-polution/data/final'
Path(save_dir).mkdir(parents=True, exist_ok=True)
# Save
gios_24g_all_file = os.path.join(save_dir, 'gios_24g_all.csv')
df_24g.to_csv(gios_24g_all_file, encoding="utf-8", index=True)
# Test read
df_24g_read = pd.read_csv(gios_24g_all_file, encoding='utf-8', sep=",", index_col="Datetime")
df_24g_read.head()
As(PM10)_max | As(PM10)_mean | As(PM10)_median | As(PM10)_min | As(PM10)_obs_num | As(PM10)_std | As(PM10)_sum | BaA(PM10)_max | BaA(PM10)_mean | BaA(PM10)_median | BaA(PM10)_min | BaA(PM10)_obs_num | BaA(PM10)_std | BaA(PM10)_sum | BaP(PM10)_max | BaP(PM10)_mean | BaP(PM10)_median | BaP(PM10)_min | BaP(PM10)_obs_num | BaP(PM10)_std | BaP(PM10)_sum | BbF(PM10)_max | BbF(PM10)_mean | BbF(PM10)_median | BbF(PM10)_min | BbF(PM10)_obs_num | BbF(PM10)_std | BbF(PM10)_sum | BjF(PM10)_max | BjF(PM10)_mean | BjF(PM10)_median | BjF(PM10)_min | BjF(PM10)_obs_num | BjF(PM10)_std | BjF(PM10)_sum | BkF(PM10)_max | BkF(PM10)_mean | BkF(PM10)_median | BkF(PM10)_min | BkF(PM10)_obs_num | BkF(PM10)_std | BkF(PM10)_sum | C6H6_max | C6H6_mean | C6H6_median | C6H6_min | C6H6_obs_num | C6H6_std | C6H6_sum | Cd(PM10)_max | Cd(PM10)_mean | Cd(PM10)_median | Cd(PM10)_min | Cd(PM10)_obs_num | Cd(PM10)_std | Cd(PM10)_sum | DBah(PM10)_max | DBah(PM10)_mean | DBah(PM10)_median | DBah(PM10)_min | DBah(PM10)_obs_num | DBah(PM10)_std | DBah(PM10)_sum | DBahA(PM10)_max | DBahA(PM10)_mean | DBahA(PM10)_median | DBahA(PM10)_min | DBahA(PM10)_obs_num | DBahA(PM10)_std | DBahA(PM10)_sum | IP(PM10)_max | IP(PM10)_mean | IP(PM10)_median | IP(PM10)_min | IP(PM10)_obs_num | IP(PM10)_std | IP(PM10)_sum | NO2_max | NO2_mean | NO2_median | NO2_min | NO2_obs_num | NO2_std | NO2_sum | Ni(PM10)_max | Ni(PM10)_mean | Ni(PM10)_median | Ni(PM10)_min | Ni(PM10)_obs_num | Ni(PM10)_std | Ni(PM10)_sum | PM10_max | PM10_mean | PM10_median | PM10_min | PM10_obs_num | PM10_std | PM10_sum | PM25_max | PM25_mean | PM25_median | PM25_min | PM25_obs_num | PM25_std | PM25_sum | Pb(PM10)_max | Pb(PM10)_mean | Pb(PM10)_median | Pb(PM10)_min | Pb(PM10)_obs_num | Pb(PM10)_std | Pb(PM10)_sum | SO2_max | SO2_mean | SO2_median | SO2_min | SO2_obs_num | SO2_std | SO2_sum | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Datetime | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2000-01-01 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 135.9 | 132.95 | 132.95 | 130.0 | 2.0 | 4.171930 | 265.9 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 106.0 | 106.0 | 106.0 | 106.0 | 1.0 | NaN | 106.0 |
2000-01-02 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 129.1 | 122.55 | 122.55 | 116.0 | 2.0 | 9.263099 | 245.1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 93.0 | 93.0 | 93.0 | 93.0 | 1.0 | NaN | 93.0 |
2000-01-03 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 41.2 | 37.10 | 37.10 | 33.0 | 2.0 | 5.798276 | 74.2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 42.0 | 42.0 | 42.0 | 42.0 | 1.0 | NaN | 42.0 |
2000-01-04 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 36.4 | 31.20 | 31.20 | 26.0 | 2.0 | 7.353911 | 62.4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 38.0 | 38.0 | 38.0 | 38.0 | 1.0 | NaN | 38.0 |
2000-01-05 00:00:00 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 33.9 | 28.95 | 28.95 | 24.0 | 2.0 | 7.000357 | 57.9 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 33.0 | 33.0 | 33.0 | 33.0 | 1.0 | NaN | 33.0 |
df_24g_read.shape
(7307, 119)
assert df_24g.shape == df_24g_read.shape
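The shape assertion above only confirms that the row and column counts survive the round trip. A stricter check, sketched here on a toy frame, also compares the values and restores the datetime index on read. The frame `toy`, the in-memory `buffer`, and the values are illustrative assumptions, not the GIOS data.

```python
import io

import numpy as np
import pandas as pd

# Toy stand-in for df_24g (hypothetical values, not GIOS data)
toy = pd.DataFrame(
    {"PM10_mean": [132.95, 122.55, np.nan], "PM10_obs_num": [2.0, 2.0, 0.0]},
    index=pd.DatetimeIndex(
        ["2000-01-01", "2000-01-02", "2000-01-03"], name="Datetime"
    ),
)

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
toy.to_csv(buffer, index=True)
buffer.seek(0)

# parse_dates=True converts the index back to datetimes;
# without it the index comes back as plain strings
toy_read = pd.read_csv(buffer, index_col="Datetime", parse_dates=True)

# Compare values as well as shape; check_freq=False ignores index frequency metadata
pd.testing.assert_frame_equal(toy, toy_read, check_freq=False)
```

If the datetime index matters downstream, passing `parse_dates=True` to the test read above would be one way to get it back.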
%%time
df_full = pd.DataFrame()
for year in years:
    file_search_pattern = year+'_*_1g.xlsx'
    files = get_files_for_name_pattern(folder, file_search_pattern)
    df_for_year = pd.DataFrame()
    print(f"Year: {year} - df_full.shape {df_full.shape}")# - files: {files}")
    for file in files:
        # Take the measurement name from the file name
        measurement_name = file.split('_')[1]
        # Manual corrections to inconsistent names created by the data supplier
        file_name = os.path.basename(file)
        # Unify headers: instead of PM2.5 we should have PM25
        # (the dot is escaped so that it matches a literal '.')
        if re.search(r'PM2\.5', file_name):
            #if file_name in ['2012_PM2.5_1g.xlsx', '2016_PM2.5_1g.xlsx']:
            measurement_name = 'PM25'
        #print(measurement_name)
        print(f"File: {file} - measurement_name: {measurement_name}")
        # Gather data for a measurement
        df_measure = get_pollutant_measures_for_locations(file, ems_codes, measurement_name, year)
        print(f"df_measure: {df_measure.head(2)}")
        # Merge data frames on the datetime index (adds more columns for the specified time range)
        df_for_year = pd.merge(df_for_year, df_measure, how='outer', left_index=True, right_index=True)
        print(f"{measurement_name} - df_measure.shape {df_measure.shape} - df_for_year.shape {df_for_year.shape}")
    print(f"df_full.columns: {df_full.columns} - df_for_year.columns {df_for_year.columns}")
    # Append new rows with the new range of datetimes, keeping the appended df index intact
    # (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
    df_full = pd.concat([df_full, df_for_year], ignore_index=False, verify_integrity=True, sort=False)
df_full.shape
df_full.head()
df_full.tail()
df_full.sample(5)
# index=True keeps the datetime index in the saved file
df_full.to_csv('/Users/ksatola/Documents/git/air-polution/data/final/gios_df_full_24g_ok.csv', encoding="utf-8", index=True)
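The loop above grows the table in two directions: within a year, each pollutant adds columns through an outer join on the datetime index, and each new year adds rows. A minimal sketch of that pattern on toy frames follows. The helper `combine_year` and all values are hypothetical, and it uses `DataFrame.join` and `pd.concat` rather than the exact calls in the loop.

```python
from functools import reduce

import pandas as pd


def combine_year(frames):
    """Outer-join per-pollutant frames on their datetime index (adds columns)."""
    return reduce(lambda left, right: left.join(right, how="outer"), frames)


# Toy per-pollutant frames (hypothetical values, not GIOS data)
pm10_2000 = pd.DataFrame(
    {"PM10_mean": [1.0, 2.0]},
    index=pd.to_datetime(["2000-01-01 01:00", "2000-01-01 02:00"]),
)
so2_2000 = pd.DataFrame(
    {"SO2_mean": [3.0, 4.0]},
    index=pd.to_datetime(["2000-01-01 02:00", "2000-01-01 03:00"]),
)
pm10_2001 = pd.DataFrame(
    {"PM10_mean": [5.0]},
    index=pd.to_datetime(["2001-01-01 01:00"]),
)

yearly = [combine_year([pm10_2000, so2_2000]), combine_year([pm10_2001])]

# Stack the years as new rows; verify_integrity raises on duplicate timestamps
df_all = pd.concat(yearly, verify_integrity=True, sort=False)
```

Timestamps present for only one pollutant get NaN in the other columns, and a year that lacks a pollutant entirely gets NaN for that whole column.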
%%time
extracted_dir = '/Users/ksatola/Documents/git/air-polution/data/gios/etl/extracted/'
#file = '2017_C6H6_1g.xlsx'
#file = '2015_CO_1g.xlsx'
#file = '2014_CO_1g.xlsx'
#file = '2012_NOx_1g.xlsx'
file = '2005_NOx_1g.xlsx'
full_path_to_file = os.path.join(extracted_dir, file)
# 2016-2019
#dft = pd.read_excel(full_path_to_file, header=1) # read 2nd row as header
# 2012-2015
dft = pd.read_excel(full_path_to_file, header=0) # read 1st row as header
dft.rename(columns={dft.columns[0]: datetime_col_name}, inplace = True)
dft.head(10)
# Get the columns defined in ems_codes plus the datetime column
# (build a new list so that ems_codes itself is not modified)
cols_in_scope = list(ems_codes) + [dft.columns[0]]
dft = dft.loc[:, dft.columns.isin(cols_in_scope)] # station codes missing from the file are skipped
dft.head()
# Remove first X rows as they contain metadata
#dft = dft.iloc[4:, :] # 2016-2019
dft = dft.iloc[2:, :] # 2015, 2014
dft.head()
cols = dft.columns[1:]
# Replace decimal commas with dots (in all columns but the first one - datetime)
# for 2016-2019
# not needed for 2012-2015, so only string values are touched and numeric values pass through unchanged
dft[cols] = dft[cols].apply(lambda x: x.map(lambda v: v.replace(',', '.') if isinstance(v, str) else v))
# Not used when only the datetime column is present
if len(cols) > 0:
    # Change column types
    dft[dft.columns[0]] = pd.to_datetime(dft[dft.columns[0]])
    dft[cols] = dft[cols].apply(pd.to_numeric)
dft.head()
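The conversion steps above can be sketched on a small frame. This assumes every measurement arrives as a string with a decimal comma, as in the 2016-2019 files. The station column names and the values are hypothetical.

```python
import pandas as pd

# Toy frame with decimal commas stored as strings (hypothetical stations and values)
raw = pd.DataFrame(
    {
        "Datetime": ["2016-01-01 01:00:00", "2016-01-01 02:00:00"],
        "StationA": ["12,5", "13,25"],
        "StationB": ["7,0", None],
    }
)
value_cols = raw.columns[1:]

# Swap the decimal comma for a dot, then convert each column to numbers;
# missing entries stay missing and become NaN
for col in value_cols:
    raw[col] = pd.to_numeric(raw[col].str.replace(",", ".", regex=False))

# Parse the first column as datetimes
raw["Datetime"] = pd.to_datetime(raw["Datetime"])
```

An alternative worth considering is the `decimal=','` argument of `pd.read_csv`, although it only applies when the source is a text file rather than an Excel sheet.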
# Set datetime index
dft = dft.set_index(dft.columns[0])
dft.head()
# Calculate statistics for the measure
cols = dft.columns
df_return = pd.DataFrame(index=dft.index.copy())
# If measurements are available from at least one station
if len(cols) >= 1:
    df_return[measurement_name+'_mean'] = dft[cols].mean(axis=1, skipna=True)
    df_return[measurement_name+'_median'] = dft[cols].median(axis=1, skipna=True)
    df_return[measurement_name+'_min'] = dft[cols].min(axis=1, skipna=True)
    df_return[measurement_name+'_max'] = dft[cols].max(axis=1, skipna=True)
    df_return[measurement_name+'_std'] = dft[cols].std(axis=1, skipna=True)
    df_return[measurement_name+'_sum'] = dft[cols].sum(axis=1, skipna=True)
    df_return[measurement_name+'_obs_num'] = dft[cols].count(axis=1) # count non-null values in each row
df_return.head(10)
df_return.tail(5)
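The row-wise statistics above could be wrapped in a small helper, sketched below. The function name `summarize_across_stations` and the station column names are hypothetical. The first row reuses the two PM10 readings implied by the 2000-01-01 row of the daily table (max 135.9, min 130.0), and the second row is deliberately empty.

```python
import numpy as np
import pandas as pd


def summarize_across_stations(df, name):
    """Row-wise statistics over all station columns of one pollutant."""
    out = pd.DataFrame(index=df.index.copy())
    out[f"{name}_mean"] = df.mean(axis=1)
    out[f"{name}_median"] = df.median(axis=1)
    out[f"{name}_min"] = df.min(axis=1)
    out[f"{name}_max"] = df.max(axis=1)
    out[f"{name}_std"] = df.std(axis=1)    # sample standard deviation (ddof=1)
    out[f"{name}_sum"] = df.sum(axis=1)    # 0.0 when every station is missing
    out[f"{name}_obs_num"] = df.count(axis=1)  # number of non-null readings
    return out


# Toy input: two hypothetical stations, second timestamp has no readings
readings = pd.DataFrame(
    {"StationA": [135.9, np.nan], "StationB": [130.0, np.nan]},
    index=pd.to_datetime(["2000-01-01", "2000-01-02"]),
)
stats = summarize_across_stations(readings, "PM10")
```

With a single reading the standard deviation is NaN, and with no readings the sum is 0.0 while the other statistics are NaN, which matches the pattern visible in the tables above. Passing `min_count=1` to `sum` would return NaN instead of 0.0 for empty rows, if that is preferred.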