Extract data from NetCDF file based on stationReading NetCDF file of station data?How to convert a NetCDF file to GeoTIFF using GDALExtract single datapoint from netCDF file in R using ndcfUsing netCDF4 Python climate algorithm?Calculating Climate Suitable Days using netCDF4 and Python?Extract data from netcdf using a shapefile polygon in pythonGlobal NetCDF datasets with station time series data?Swapped axes in R raster brick plot from netcdfsaving values to a new netcdf arrayPython&ArcGIS: Combining several raster layers into netCDF
What exploit Are these user agents trying to use?
CAST throwing error when run in stored procedure but not when run as raw query
Unable to supress ligatures in headings which are set in Caps
How can I deal with my CEO asking me to hire someone with a higher salary than me, a co-founder?
Ambiguity in the definition of entropy
Is it acceptable for a professor to tell male students to not think that they are smarter than female students?
How seriously should I take size and weight limits of hand luggage?
Why is it a bad idea to hire a hitman to eliminate most corrupt politicians?
Expand and Contract
Can my sorcerer use a spellbook only to collect spells and scribe scrolls, not cast?
Apex Framework / library for consuming REST services
Arrow those variables!
How to tell a function to use the default argument values?
Why didn't Miles's spider sense work before?
Would Slavery Reparations be considered Bills of Attainder and hence Illegal?
What does the expression "A Mann!" means
What's the in-universe reasoning behind sorcerers needing material components?
Is it possible to create a QR code using text?
ssTTsSTtRrriinInnnnNNNIiinngg
What historical events would have to change in order to make 19th century "steampunk" technology possible?
I would say: "You are another teacher", but she is a woman and I am a man
How to add frame around section using titlesec?
Is there a hemisphere-neutral way of specifying a season?
Bullying boss launched a smear campaign and made me unemployable
Extract data from NetCDF file based on station
Reading NetCDF file of station data?How to convert a NetCDF file to GeoTIFF using GDALExtract single datapoint from netCDF file in R using ndcfUsing netCDF4 Python climate algorithm?Calculating Climate Suitable Days using netCDF4 and Python?Extract data from netcdf using a shapefile polygon in pythonGlobal NetCDF datasets with station time series data?Swapped axes in R raster brick plot from netcdfsaving values to a new netcdf arrayPython&ArcGIS: Combining several raster layers into netCDF
I have read other solutions for NetCDF data but my data are a little different and I do not know how to extract data from NetCDF and save them in CSV files based on stations. Data include the maximum temperature values for stations. I just need stations located in latitudes:25.74 to 49.05 and longitude: -93.44 to -116.0. The format of time is different and I just need time[7518:43947190] which include data from 1948. I want to create multiple csv files. every file have to be data for one single station that include time, tmax, and quality flag of data.
from netCDF4 import Dataset
dataset=Dataset("D:/ushcn_tmax.nc")
#### Print dimentions #####
print dataset.file_format
print dataset.dimensions.keys()
print dataset.dimensions['name_strlen']
print dataset.dimensions['obs']
print dataset.dimensions['station']
#### Print variables ####
print dataset.variables.keys()
print dataset.variables['LON']
print dataset.variables['LAT']
print dataset.variables['ELEVATION']
print dataset.variables['STATION_NAME']
print dataset.variables['STATION_INDEX']
print dataset.variables['TIME']
print dataset.variables['TMAX']
print dataset.variables['TMAX_MFLAG']
print dataset.variables['TMAX_QFLAG']
print dataset.variables['TMAX_SFLAG']
Dimension and variables of my data can bee seen here:
NETCDF3_CLASSIC
[u'name_strlen', u'obs', u'station']
<type 'netCDF4._netCDF4.Dimension'>: name = 'name_strlen', size = 50
<type 'netCDF4._netCDF4.Dimension'>: name = 'obs', size = 43947189
<type 'netCDF4._netCDF4.Dimension'>: name = 'station', size = 1218
[u'LON', u'LAT', u'ELEVATION', u'STATION_NAME', u'STATION_INDEX', u'TIME', u'TMAX', u'TMAX_MFLAG', u'TMAX_QFLAG', u'TMAX_SFLAG']
<type 'netCDF4._netCDF4.Variable'>
float32 LON(station)
standard_name: longitude
long_name: station longitude
units: degrees_east
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float32 LAT(station)
standard_name: latitude
long_name: station latitude
units: degrees_north
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float64 ELEVATION(station)
long_name: elevation above the sea level
standard_name: elevation
units: m
positive: up
axis: Z
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 STATION_NAME(station, name_strlen)
long_name: USHCN station name
cf_role: timeseries_id
unlimited dimensions:
current shape = (1218, 50)
filling off
<type 'netCDF4._netCDF4.Variable'>
int32 STATION_INDEX(obs)
long_name: which station this obs is for
instance_dimension: station
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float64 TIME(obs)
standard_name: time
long_name: Time
units: decimal day
_FillValue: -9999.0
comment: time calculeted as: year + day_of_year/days_in_year
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
int32 TMAX(obs)
standard_name: TMAX
long_name: maximum temperature
units: degrees F
coordinates: time lat lon elevation
_FillValue: -9999
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_MFLAG(obs)
standard_name: TMAX_MFLAG
long_mane: measurement flag for TMAX
flag_values: BDLT
flag_meanings: Blank = no measurement information applicable; B = precipitation total formed from two 12-hour totals; D = precipitation total formed from four six-hour totals; L = temperature appears to be lagged with respect to reported hour of OBServation; T = trace of precipitation, snowfall, or snow depth
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_QFLAG(obs)
standard_name: TMAX_QFLAG
long_mane: quality flag for TMAX
flag_values: ADGIKMNORSTWX
flag_meanings: Blank = did not fail any quality assurance check; A = failed accumulation total check; D = failed duplicate check; G = failed gap check; I = failed internal consistency check; K = failed streak/frequent-value check; M = failed megaconsistency check; N = failed naught check; O = failed climatological outlier check; R = failed lagged range check; S = failed spatial consistency check; T = failed temporal consistency check; W = temperature too warm for snow; X = failed bounds check;
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_SFLAG(obs)
standard_name: TMAX_SFLAG
long_mane: source flag for TMAX
flag_values: 0126ABFGHIMQRSX
flag_meanings: Blank = No source (i.e., data value missing); 0 = U.S. Cooperative Summary of the Day (NCDC DSI-3200); 1 = U.S. Preliminary Cooperative Summary of the Day -- Transmitted; 2 = U.S. Preliminary Cooperative Summary of the Day -- Keyed from paper forms; 6 = CDMP Cooperative Summary of the Day (NCDC DSI-3206); A = U.S. Automated Surface Observing System (ASOS) real-time data (since January 1, 2006); B = U.S. ASOS data for October 2000-December 2005 (NCDC DSI-3211); F = U.S. Fort data; G = Official Global Climate Observing System (GCOS) or other government-supplied data; H = High Plains Regional Climate Center real-time data; I = International collection (non U.S. data received through personal contacts); M = Monthly METAR Extract (additional ASOS data); Q = Data from several African countries that had been 'quarantined', that is, withheld from public release until permission was granted from the respective meteorological services; R = NCDC Reference Network Database (Climate Reference Network and Historical Climatology Network-Modernized); S = Global Summary of the Day (NCDC DSI-9618), NOTE: 'S' values are derived from hourly synoptic reports exchanged on the Global Telecommunications System (GTS).Daily values derived in this fashion may differ significantly from 'true' daily data, particularly for precipitation (i.e., use with caution); X = U.S. First-Order Summary of the Day (NCDC DSI-3210)
unlimited dimensions:
current shape = (43947189,)
filling off
I have tried to read data with :
xr.open_dataset("D:/ushcn_tmax.nc")
df=dataset.sel(lon=-99.30,lat=32.73,method='nearest')
while the mentione lat and lon belong to one station and I received error "KeyError: 'lat'".
Is there any way I can convert variables (lat, lon, and time) to dimentions to make it easier to work with? Or any way I can extract data based on station as dimension?
python csv netcdf
bumped to the homepage by Community♦ 2 days ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
add a comment |
I have read other solutions for NetCDF data but my data are a little different and I do not know how to extract data from NetCDF and save them in CSV files based on stations. Data include the maximum temperature values for stations. I just need stations located in latitudes:25.74 to 49.05 and longitude: -93.44 to -116.0. The format of time is different and I just need time[7518:43947190] which include data from 1948. I want to create multiple csv files. every file have to be data for one single station that include time, tmax, and quality flag of data.
from netCDF4 import Dataset
dataset=Dataset("D:/ushcn_tmax.nc")
#### Print dimentions #####
print dataset.file_format
print dataset.dimensions.keys()
print dataset.dimensions['name_strlen']
print dataset.dimensions['obs']
print dataset.dimensions['station']
#### Print variables ####
print dataset.variables.keys()
print dataset.variables['LON']
print dataset.variables['LAT']
print dataset.variables['ELEVATION']
print dataset.variables['STATION_NAME']
print dataset.variables['STATION_INDEX']
print dataset.variables['TIME']
print dataset.variables['TMAX']
print dataset.variables['TMAX_MFLAG']
print dataset.variables['TMAX_QFLAG']
print dataset.variables['TMAX_SFLAG']
Dimension and variables of my data can bee seen here:
NETCDF3_CLASSIC
[u'name_strlen', u'obs', u'station']
<type 'netCDF4._netCDF4.Dimension'>: name = 'name_strlen', size = 50
<type 'netCDF4._netCDF4.Dimension'>: name = 'obs', size = 43947189
<type 'netCDF4._netCDF4.Dimension'>: name = 'station', size = 1218
[u'LON', u'LAT', u'ELEVATION', u'STATION_NAME', u'STATION_INDEX', u'TIME', u'TMAX', u'TMAX_MFLAG', u'TMAX_QFLAG', u'TMAX_SFLAG']
<type 'netCDF4._netCDF4.Variable'>
float32 LON(station)
standard_name: longitude
long_name: station longitude
units: degrees_east
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float32 LAT(station)
standard_name: latitude
long_name: station latitude
units: degrees_north
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float64 ELEVATION(station)
long_name: elevation above the sea level
standard_name: elevation
units: m
positive: up
axis: Z
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 STATION_NAME(station, name_strlen)
long_name: USHCN station name
cf_role: timeseries_id
unlimited dimensions:
current shape = (1218, 50)
filling off
<type 'netCDF4._netCDF4.Variable'>
int32 STATION_INDEX(obs)
long_name: which station this obs is for
instance_dimension: station
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float64 TIME(obs)
standard_name: time
long_name: Time
units: decimal day
_FillValue: -9999.0
comment: time calculeted as: year + day_of_year/days_in_year
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
int32 TMAX(obs)
standard_name: TMAX
long_name: maximum temperature
units: degrees F
coordinates: time lat lon elevation
_FillValue: -9999
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_MFLAG(obs)
standard_name: TMAX_MFLAG
long_mane: measurement flag for TMAX
flag_values: BDLT
flag_meanings: Blank = no measurement information applicable; B = precipitation total formed from two 12-hour totals; D = precipitation total formed from four six-hour totals; L = temperature appears to be lagged with respect to reported hour of OBServation; T = trace of precipitation, snowfall, or snow depth
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_QFLAG(obs)
standard_name: TMAX_QFLAG
long_mane: quality flag for TMAX
flag_values: ADGIKMNORSTWX
flag_meanings: Blank = did not fail any quality assurance check; A = failed accumulation total check; D = failed duplicate check; G = failed gap check; I = failed internal consistency check; K = failed streak/frequent-value check; M = failed megaconsistency check; N = failed naught check; O = failed climatological outlier check; R = failed lagged range check; S = failed spatial consistency check; T = failed temporal consistency check; W = temperature too warm for snow; X = failed bounds check;
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_SFLAG(obs)
standard_name: TMAX_SFLAG
long_mane: source flag for TMAX
flag_values: 0126ABFGHIMQRSX
flag_meanings: Blank = No source (i.e., data value missing); 0 = U.S. Cooperative Summary of the Day (NCDC DSI-3200); 1 = U.S. Preliminary Cooperative Summary of the Day -- Transmitted; 2 = U.S. Preliminary Cooperative Summary of the Day -- Keyed from paper forms; 6 = CDMP Cooperative Summary of the Day (NCDC DSI-3206); A = U.S. Automated Surface Observing System (ASOS) real-time data (since January 1, 2006); B = U.S. ASOS data for October 2000-December 2005 (NCDC DSI-3211); F = U.S. Fort data; G = Official Global Climate Observing System (GCOS) or other government-supplied data; H = High Plains Regional Climate Center real-time data; I = International collection (non U.S. data received through personal contacts); M = Monthly METAR Extract (additional ASOS data); Q = Data from several African countries that had been 'quarantined', that is, withheld from public release until permission was granted from the respective meteorological services; R = NCDC Reference Network Database (Climate Reference Network and Historical Climatology Network-Modernized); S = Global Summary of the Day (NCDC DSI-9618), NOTE: 'S' values are derived from hourly synoptic reports exchanged on the Global Telecommunications System (GTS).Daily values derived in this fashion may differ significantly from 'true' daily data, particularly for precipitation (i.e., use with caution); X = U.S. First-Order Summary of the Day (NCDC DSI-3210)
unlimited dimensions:
current shape = (43947189,)
filling off
I have tried to read data with :
xr.open_dataset("D:/ushcn_tmax.nc")
df=dataset.sel(lon=-99.30,lat=32.73,method='nearest')
while the mentione lat and lon belong to one station and I received error "KeyError: 'lat'".
Is there any way I can convert variables (lat, lon, and time) to dimentions to make it easier to work with? Or any way I can extract data based on station as dimension?
python csv netcdf
bumped to the homepage by Community♦ 2 days ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Hello, can you update the question with the result ofdataset = xr.open_dataset("D:/ushcn_tmax.nc")
>>>dataset
?
– Andrea Massetti
Mar 2 at 1:05
add a comment |
I have read other solutions for NetCDF data but my data are a little different and I do not know how to extract data from NetCDF and save them in CSV files based on stations. Data include the maximum temperature values for stations. I just need stations located in latitudes:25.74 to 49.05 and longitude: -93.44 to -116.0. The format of time is different and I just need time[7518:43947190] which include data from 1948. I want to create multiple csv files. every file have to be data for one single station that include time, tmax, and quality flag of data.
from netCDF4 import Dataset
dataset=Dataset("D:/ushcn_tmax.nc")
#### Print dimentions #####
print dataset.file_format
print dataset.dimensions.keys()
print dataset.dimensions['name_strlen']
print dataset.dimensions['obs']
print dataset.dimensions['station']
#### Print variables ####
print dataset.variables.keys()
print dataset.variables['LON']
print dataset.variables['LAT']
print dataset.variables['ELEVATION']
print dataset.variables['STATION_NAME']
print dataset.variables['STATION_INDEX']
print dataset.variables['TIME']
print dataset.variables['TMAX']
print dataset.variables['TMAX_MFLAG']
print dataset.variables['TMAX_QFLAG']
print dataset.variables['TMAX_SFLAG']
Dimension and variables of my data can bee seen here:
NETCDF3_CLASSIC
[u'name_strlen', u'obs', u'station']
<type 'netCDF4._netCDF4.Dimension'>: name = 'name_strlen', size = 50
<type 'netCDF4._netCDF4.Dimension'>: name = 'obs', size = 43947189
<type 'netCDF4._netCDF4.Dimension'>: name = 'station', size = 1218
[u'LON', u'LAT', u'ELEVATION', u'STATION_NAME', u'STATION_INDEX', u'TIME', u'TMAX', u'TMAX_MFLAG', u'TMAX_QFLAG', u'TMAX_SFLAG']
<type 'netCDF4._netCDF4.Variable'>
float32 LON(station)
standard_name: longitude
long_name: station longitude
units: degrees_east
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float32 LAT(station)
standard_name: latitude
long_name: station latitude
units: degrees_north
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float64 ELEVATION(station)
long_name: elevation above the sea level
standard_name: elevation
units: m
positive: up
axis: Z
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 STATION_NAME(station, name_strlen)
long_name: USHCN station name
cf_role: timeseries_id
unlimited dimensions:
current shape = (1218, 50)
filling off
<type 'netCDF4._netCDF4.Variable'>
int32 STATION_INDEX(obs)
long_name: which station this obs is for
instance_dimension: station
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float64 TIME(obs)
standard_name: time
long_name: Time
units: decimal day
_FillValue: -9999.0
comment: time calculeted as: year + day_of_year/days_in_year
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
int32 TMAX(obs)
standard_name: TMAX
long_name: maximum temperature
units: degrees F
coordinates: time lat lon elevation
_FillValue: -9999
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_MFLAG(obs)
standard_name: TMAX_MFLAG
long_mane: measurement flag for TMAX
flag_values: BDLT
flag_meanings: Blank = no measurement information applicable; B = precipitation total formed from two 12-hour totals; D = precipitation total formed from four six-hour totals; L = temperature appears to be lagged with respect to reported hour of OBServation; T = trace of precipitation, snowfall, or snow depth
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_QFLAG(obs)
standard_name: TMAX_QFLAG
long_mane: quality flag for TMAX
flag_values: ADGIKMNORSTWX
flag_meanings: Blank = did not fail any quality assurance check; A = failed accumulation total check; D = failed duplicate check; G = failed gap check; I = failed internal consistency check; K = failed streak/frequent-value check; M = failed megaconsistency check; N = failed naught check; O = failed climatological outlier check; R = failed lagged range check; S = failed spatial consistency check; T = failed temporal consistency check; W = temperature too warm for snow; X = failed bounds check;
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_SFLAG(obs)
standard_name: TMAX_SFLAG
long_mane: source flag for TMAX
flag_values: 0126ABFGHIMQRSX
flag_meanings: Blank = No source (i.e., data value missing); 0 = U.S. Cooperative Summary of the Day (NCDC DSI-3200); 1 = U.S. Preliminary Cooperative Summary of the Day -- Transmitted; 2 = U.S. Preliminary Cooperative Summary of the Day -- Keyed from paper forms; 6 = CDMP Cooperative Summary of the Day (NCDC DSI-3206); A = U.S. Automated Surface Observing System (ASOS) real-time data (since January 1, 2006); B = U.S. ASOS data for October 2000-December 2005 (NCDC DSI-3211); F = U.S. Fort data; G = Official Global Climate Observing System (GCOS) or other government-supplied data; H = High Plains Regional Climate Center real-time data; I = International collection (non U.S. data received through personal contacts); M = Monthly METAR Extract (additional ASOS data); Q = Data from several African countries that had been 'quarantined', that is, withheld from public release until permission was granted from the respective meteorological services; R = NCDC Reference Network Database (Climate Reference Network and Historical Climatology Network-Modernized); S = Global Summary of the Day (NCDC DSI-9618), NOTE: 'S' values are derived from hourly synoptic reports exchanged on the Global Telecommunications System (GTS).Daily values derived in this fashion may differ significantly from 'true' daily data, particularly for precipitation (i.e., use with caution); X = U.S. First-Order Summary of the Day (NCDC DSI-3210)
unlimited dimensions:
current shape = (43947189,)
filling off
I have tried to read data with :
xr.open_dataset("D:/ushcn_tmax.nc")
df=dataset.sel(lon=-99.30,lat=32.73,method='nearest')
while the mentione lat and lon belong to one station and I received error "KeyError: 'lat'".
Is there any way I can convert variables (lat, lon, and time) to dimentions to make it easier to work with? Or any way I can extract data based on station as dimension?
python csv netcdf
I have read other solutions for NetCDF data but my data are a little different and I do not know how to extract data from NetCDF and save them in CSV files based on stations. Data include the maximum temperature values for stations. I just need stations located in latitudes:25.74 to 49.05 and longitude: -93.44 to -116.0. The format of time is different and I just need time[7518:43947190] which include data from 1948. I want to create multiple csv files. every file have to be data for one single station that include time, tmax, and quality flag of data.
from netCDF4 import Dataset
dataset=Dataset("D:/ushcn_tmax.nc")
#### Print dimentions #####
print dataset.file_format
print dataset.dimensions.keys()
print dataset.dimensions['name_strlen']
print dataset.dimensions['obs']
print dataset.dimensions['station']
#### Print variables ####
print dataset.variables.keys()
print dataset.variables['LON']
print dataset.variables['LAT']
print dataset.variables['ELEVATION']
print dataset.variables['STATION_NAME']
print dataset.variables['STATION_INDEX']
print dataset.variables['TIME']
print dataset.variables['TMAX']
print dataset.variables['TMAX_MFLAG']
print dataset.variables['TMAX_QFLAG']
print dataset.variables['TMAX_SFLAG']
Dimension and variables of my data can bee seen here:
NETCDF3_CLASSIC
[u'name_strlen', u'obs', u'station']
<type 'netCDF4._netCDF4.Dimension'>: name = 'name_strlen', size = 50
<type 'netCDF4._netCDF4.Dimension'>: name = 'obs', size = 43947189
<type 'netCDF4._netCDF4.Dimension'>: name = 'station', size = 1218
[u'LON', u'LAT', u'ELEVATION', u'STATION_NAME', u'STATION_INDEX', u'TIME', u'TMAX', u'TMAX_MFLAG', u'TMAX_QFLAG', u'TMAX_SFLAG']
<type 'netCDF4._netCDF4.Variable'>
float32 LON(station)
standard_name: longitude
long_name: station longitude
units: degrees_east
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float32 LAT(station)
standard_name: latitude
long_name: station latitude
units: degrees_north
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float64 ELEVATION(station)
long_name: elevation above the sea level
standard_name: elevation
units: m
positive: up
axis: Z
unlimited dimensions:
current shape = (1218,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 STATION_NAME(station, name_strlen)
long_name: USHCN station name
cf_role: timeseries_id
unlimited dimensions:
current shape = (1218, 50)
filling off
<type 'netCDF4._netCDF4.Variable'>
int32 STATION_INDEX(obs)
long_name: which station this obs is for
instance_dimension: station
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
float64 TIME(obs)
standard_name: time
long_name: Time
units: decimal day
_FillValue: -9999.0
comment: time calculeted as: year + day_of_year/days_in_year
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
int32 TMAX(obs)
standard_name: TMAX
long_name: maximum temperature
units: degrees F
coordinates: time lat lon elevation
_FillValue: -9999
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_MFLAG(obs)
standard_name: TMAX_MFLAG
long_mane: measurement flag for TMAX
flag_values: BDLT
flag_meanings: Blank = no measurement information applicable; B = precipitation total formed from two 12-hour totals; D = precipitation total formed from four six-hour totals; L = temperature appears to be lagged with respect to reported hour of OBServation; T = trace of precipitation, snowfall, or snow depth
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_QFLAG(obs)
standard_name: TMAX_QFLAG
long_mane: quality flag for TMAX
flag_values: ADGIKMNORSTWX
flag_meanings: Blank = did not fail any quality assurance check; A = failed accumulation total check; D = failed duplicate check; G = failed gap check; I = failed internal consistency check; K = failed streak/frequent-value check; M = failed megaconsistency check; N = failed naught check; O = failed climatological outlier check; R = failed lagged range check; S = failed spatial consistency check; T = failed temporal consistency check; W = temperature too warm for snow; X = failed bounds check;
unlimited dimensions:
current shape = (43947189,)
filling off
<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_SFLAG(obs)
standard_name: TMAX_SFLAG
long_mane: source flag for TMAX
flag_values: 0126ABFGHIMQRSX
flag_meanings: Blank = No source (i.e., data value missing); 0 = U.S. Cooperative Summary of the Day (NCDC DSI-3200); 1 = U.S. Preliminary Cooperative Summary of the Day -- Transmitted; 2 = U.S. Preliminary Cooperative Summary of the Day -- Keyed from paper forms; 6 = CDMP Cooperative Summary of the Day (NCDC DSI-3206); A = U.S. Automated Surface Observing System (ASOS) real-time data (since January 1, 2006); B = U.S. ASOS data for October 2000-December 2005 (NCDC DSI-3211); F = U.S. Fort data; G = Official Global Climate Observing System (GCOS) or other government-supplied data; H = High Plains Regional Climate Center real-time data; I = International collection (non U.S. data received through personal contacts); M = Monthly METAR Extract (additional ASOS data); Q = Data from several African countries that had been 'quarantined', that is, withheld from public release until permission was granted from the respective meteorological services; R = NCDC Reference Network Database (Climate Reference Network and Historical Climatology Network-Modernized); S = Global Summary of the Day (NCDC DSI-9618), NOTE: 'S' values are derived from hourly synoptic reports exchanged on the Global Telecommunications System (GTS).Daily values derived in this fashion may differ significantly from 'true' daily data, particularly for precipitation (i.e., use with caution); X = U.S. First-Order Summary of the Day (NCDC DSI-3210)
unlimited dimensions:
current shape = (43947189,)
filling off
I have tried to read data with :
xr.open_dataset("D:/ushcn_tmax.nc")
df=dataset.sel(lon=-99.30,lat=32.73,method='nearest')
while the mentione lat and lon belong to one station and I received error "KeyError: 'lat'".
Is there any way I can convert variables (lat, lon, and time) to dimentions to make it easier to work with? Or any way I can extract data based on station as dimension?
python csv netcdf
python csv netcdf
edited Feb 21 at 20:45
Vince
14.8k32849
14.8k32849
asked Feb 21 at 20:37
AmyAmy
62
62
bumped to the homepage by Community♦ 2 days ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
bumped to the homepage by Community♦ 2 days ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Hello, can you update the question with the result ofdataset = xr.open_dataset("D:/ushcn_tmax.nc")
>>>dataset
?
– Andrea Massetti
Mar 2 at 1:05
add a comment |
Hello, can you update the question with the result ofdataset = xr.open_dataset("D:/ushcn_tmax.nc")
>>>dataset
?
– Andrea Massetti
Mar 2 at 1:05
Hello, can you update the question with the result of
dataset = xr.open_dataset("D:/ushcn_tmax.nc")
>>>dataset
?– Andrea Massetti
Mar 2 at 1:05
Hello, can you update the question with the result of
dataset = xr.open_dataset("D:/ushcn_tmax.nc")
>>>dataset
?– Andrea Massetti
Mar 2 at 1:05
add a comment |
1 Answer
1
active
oldest
votes
Note, I can update the answer with the correct variables if you include in the question the variables as read by xarray (as I asked you in the comment).
You are using two different packages when xarray
alone would do the work just fine (xarray
has a netCDF4
backend, but wraps to it more human accessible, understandable and readable methods.. and much more!).
import xarray as xr
dataset = xr.open_dataset(r"C:path_to_dsdataset.nc")
Inspect the variables
Example with one file I had on pc:
dataset
>>> <xarray.Dataset>
>>> Dimensions: (time: 63, x: 4000, y: 4000)
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> * x (x) float64 1.5e+06 1.5e+06 1.5e+06 ... 1.6e+06 1.6e+06 1.6e+06
>>> * time (time) datetime64[ns] 2013-06-21T23:53:00 ... 2019-01-13T23:50:56
>>> Data variables:
>>> relative_humidity (time, y, x) float32 ...#show descriptive data about the whole dataset
Select location
Use the xarray.Dataset.sel()
method with the name of the coordinates that you found above. With my example, I will go with "x" and "y":
subset = dataset.sel(x=1.504e+06, y=-4.202e+06, method='nearest')
Note, you are assuming that the coordinates are in decimal degrees but you should check the projection attribute to be sure you are querying in the correct reference system Depending on how well documented is the metadata of the file, you can access the attribute of the coordinate variable by:
dataset.y
>>> <xarray.DataArray 'y' (y: 4000)>
>>> array([-4200012.5, -4200037.5, -4200062.5, ..., -4299937.5, -4299962.5,
-4299987.5])
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> Attributes:
>>> units: metre
>>> standard_name: projection_y_coordinate
>>> long_name: y coordinate of projection
Or you might need to see the global attribute of the file, that might contain a projection field by
dataset.attrs
Select the time
As for the time dimension, xarray
converts automatically the difficult to read "seconds past" format of netCDF4
in a more human readable np.datetime64
.
However, I prefer to select dates with the help of pandas.to_datetime()
method, that easily converts strings to datetime.datetime
(which is even more human readable than np.datetime
).
I am sure there are similar methods in xarray
, or even the exact same since xarray
inherits many functions from pandas
and numpy
, but I can never be bothered looking for them.
import pandas.to_datetime as to_datetime
all_dates = dataset.time.values
sel_dates = [date for date in all_dates if to_datetime(date) >= to_datetime("01/01/1948")] #select dates past 1st Jan 1948
result = subset.sel(time=sel_dates)
all done! just select the variable you are querying, in my case relative_humidity
result.relative_humidity.values
>>> array([0.1,...,0.2], dtype=float32)
It is a np.ndarray
of shape=(n_selected_dates, n_selected pixels)
; in this case 1 pixel as we passed only one value for x, y.
add a comment |
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "79"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fgis.stackexchange.com%2fquestions%2f313125%2fextract-data-from-netcdf-file-based-on-station%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Note, I can update the answer with the correct variables if you include in the question the variables as read by xarray (as I asked you in the comment).
You are using two different packages when xarray
alone would do the work just fine (xarray
has a netCDF4
backend, but wraps to it more human accessible, understandable and readable methods.. and much more!).
import xarray as xr
dataset = xr.open_dataset(r"C:path_to_dsdataset.nc")
Inspect the variables
Example with one file I had on pc:
dataset
>>> <xarray.Dataset>
>>> Dimensions: (time: 63, x: 4000, y: 4000)
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> * x (x) float64 1.5e+06 1.5e+06 1.5e+06 ... 1.6e+06 1.6e+06 1.6e+06
>>> * time (time) datetime64[ns] 2013-06-21T23:53:00 ... 2019-01-13T23:50:56
>>> Data variables:
>>> relative_humidity (time, y, x) float32 ...#show descriptive data about the whole dataset
Select location
Use the xarray.Dataset.sel()
method with the name of the coordinates that you found above. With my example, I will go with "x" and "y":
subset = dataset.sel(x=1.504e+06, y=-4.202e+06, method='nearest')
Note, you are assuming that the coordinates are in decimal degrees but you should check the projection attribute to be sure you are querying in the correct reference system Depending on how well documented is the metadata of the file, you can access the attribute of the coordinate variable by:
dataset.y
>>> <xarray.DataArray 'y' (y: 4000)>
>>> array([-4200012.5, -4200037.5, -4200062.5, ..., -4299937.5, -4299962.5,
-4299987.5])
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> Attributes:
>>> units: metre
>>> standard_name: projection_y_coordinate
>>> long_name: y coordinate of projection
Or you might need to see the global attribute of the file, that might contain a projection field by
dataset.attrs
Select the time
As for the time dimension, xarray
converts automatically the difficult to read "seconds past" format of netCDF4
in a more human readable np.datetime64
.
However, I prefer to select dates with the help of pandas.to_datetime()
method, that easily converts strings to datetime.datetime
(which is even more human readable than np.datetime
).
I am sure there are similar methods in xarray
, or even the exact same since xarray
inherits many functions from pandas
and numpy
, but I can never be bothered looking for them.
import pandas.to_datetime as to_datetime
all_dates = dataset.time.values
sel_dates = [date for date in all_dates if to_datetime(date) >= to_datetime("01/01/1948")] #select dates past 1st Jan 1948
result = subset.sel(time=sel_dates)
all done! just select the variable you are querying, in my case relative_humidity
result.relative_humidity.values
>>> array([0.1,...,0.2], dtype=float32)
It is a np.ndarray
of shape=(n_selected_dates, n_selected pixels)
; in this case 1 pixel as we passed only one value for x, y.
add a comment |
Note, I can update the answer with the correct variables if you include in the question the variables as read by xarray (as I asked you in the comment).
You are using two different packages when xarray
alone would do the work just fine (xarray
has a netCDF4
backend, but wraps to it more human accessible, understandable and readable methods.. and much more!).
import xarray as xr
dataset = xr.open_dataset(r"C:path_to_dsdataset.nc")
Inspect the variables
Example with one file I had on pc:
dataset
>>> <xarray.Dataset>
>>> Dimensions: (time: 63, x: 4000, y: 4000)
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> * x (x) float64 1.5e+06 1.5e+06 1.5e+06 ... 1.6e+06 1.6e+06 1.6e+06
>>> * time (time) datetime64[ns] 2013-06-21T23:53:00 ... 2019-01-13T23:50:56
>>> Data variables:
>>> relative_humidity (time, y, x) float32 ...#show descriptive data about the whole dataset
Select location
Use the xarray.Dataset.sel()
method with the name of the coordinates that you found above. With my example, I will go with "x" and "y":
subset = dataset.sel(x=1.504e+06, y=-4.202e+06, method='nearest')
Note, you are assuming that the coordinates are in decimal degrees but you should check the projection attribute to be sure you are querying in the correct reference system Depending on how well documented is the metadata of the file, you can access the attribute of the coordinate variable by:
dataset.y
>>> <xarray.DataArray 'y' (y: 4000)>
>>> array([-4200012.5, -4200037.5, -4200062.5, ..., -4299937.5, -4299962.5,
-4299987.5])
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> Attributes:
>>> units: metre
>>> standard_name: projection_y_coordinate
>>> long_name: y coordinate of projection
Or you might need to see the global attribute of the file, that might contain a projection field by
dataset.attrs
Select the time
As for the time dimension, xarray
converts automatically the difficult to read "seconds past" format of netCDF4
in a more human readable np.datetime64
.
However, I prefer to select dates with the help of pandas.to_datetime()
method, that easily converts strings to datetime.datetime
(which is even more human readable than np.datetime
).
I am sure there are similar methods in xarray
, or even the exact same since xarray
inherits many functions from pandas
and numpy
, but I can never be bothered looking for them.
import pandas.to_datetime as to_datetime
all_dates = dataset.time.values
sel_dates = [date for date in all_dates if to_datetime(date) >= to_datetime("01/01/1948")] #select dates past 1st Jan 1948
result = subset.sel(time=sel_dates)
all done! just select the variable you are querying, in my case relative_humidity
result.relative_humidity.values
>>> array([0.1,...,0.2], dtype=float32)
It is a np.ndarray
of shape=(n_selected_dates, n_selected pixels)
; in this case 1 pixel as we passed only one value for x, y.
add a comment |
Note, I can update the answer with the correct variables if you include in the question the variables as read by xarray (as I asked you in the comment).
You are using two different packages when xarray
alone would do the work just fine (xarray
has a netCDF4
backend, but wraps to it more human accessible, understandable and readable methods.. and much more!).
import xarray as xr
dataset = xr.open_dataset(r"C:path_to_dsdataset.nc")
Inspect the variables
Example with one file I had on pc:
dataset
>>> <xarray.Dataset>
>>> Dimensions: (time: 63, x: 4000, y: 4000)
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> * x (x) float64 1.5e+06 1.5e+06 1.5e+06 ... 1.6e+06 1.6e+06 1.6e+06
>>> * time (time) datetime64[ns] 2013-06-21T23:53:00 ... 2019-01-13T23:50:56
>>> Data variables:
>>> relative_humidity (time, y, x) float32 ...#show descriptive data about the whole dataset
Select location
Use the xarray.Dataset.sel()
method with the name of the coordinates that you found above. With my example, I will go with "x" and "y":
subset = dataset.sel(x=1.504e+06, y=-4.202e+06, method='nearest')
Note, you are assuming that the coordinates are in decimal degrees but you should check the projection attribute to be sure you are querying in the correct reference system Depending on how well documented is the metadata of the file, you can access the attribute of the coordinate variable by:
dataset.y
>>> <xarray.DataArray 'y' (y: 4000)>
>>> array([-4200012.5, -4200037.5, -4200062.5, ..., -4299937.5, -4299962.5,
-4299987.5])
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> Attributes:
>>> units: metre
>>> standard_name: projection_y_coordinate
>>> long_name: y coordinate of projection
Or you might need to see the global attribute of the file, that might contain a projection field by
dataset.attrs
Select the time
As for the time dimension, xarray
converts automatically the difficult to read "seconds past" format of netCDF4
in a more human readable np.datetime64
.
However, I prefer to select dates with the help of pandas.to_datetime()
method, that easily converts strings to datetime.datetime
(which is even more human readable than np.datetime
).
I am sure there are similar methods in xarray
, or even the exact same since xarray
inherits many functions from pandas
and numpy
, but I can never be bothered looking for them.
import pandas.to_datetime as to_datetime
all_dates = dataset.time.values
sel_dates = [date for date in all_dates if to_datetime(date) >= to_datetime("01/01/1948")] #select dates past 1st Jan 1948
result = subset.sel(time=sel_dates)
all done! just select the variable you are querying, in my case relative_humidity
result.relative_humidity.values
>>> array([0.1,...,0.2], dtype=float32)
It is a np.ndarray
of shape=(n_selected_dates, n_selected pixels)
; in this case 1 pixel as we passed only one value for x, y.
Note, I can update the answer with the correct variables if you include in the question the variables as read by xarray (as I asked you in the comment).
You are using two different packages when xarray
alone would do the work just fine (xarray
has a netCDF4
backend, but wraps to it more human accessible, understandable and readable methods.. and much more!).
import xarray as xr
dataset = xr.open_dataset(r"C:path_to_dsdataset.nc")
Inspect the variables
Example with one file I had on pc:
dataset
>>> <xarray.Dataset>
>>> Dimensions: (time: 63, x: 4000, y: 4000)
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> * x (x) float64 1.5e+06 1.5e+06 1.5e+06 ... 1.6e+06 1.6e+06 1.6e+06
>>> * time (time) datetime64[ns] 2013-06-21T23:53:00 ... 2019-01-13T23:50:56
>>> Data variables:
>>> relative_humidity (time, y, x) float32 ...#show descriptive data about the whole dataset
Select location
Use the xarray.Dataset.sel()
method with the name of the coordinates that you found above. With my example, I will go with "x" and "y":
subset = dataset.sel(x=1.504e+06, y=-4.202e+06, method='nearest')
Note, you are assuming that the coordinates are in decimal degrees but you should check the projection attribute to be sure you are querying in the correct reference system Depending on how well documented is the metadata of the file, you can access the attribute of the coordinate variable by:
dataset.y
>>> <xarray.DataArray 'y' (y: 4000)>
>>> array([-4200012.5, -4200037.5, -4200062.5, ..., -4299937.5, -4299962.5,
-4299987.5])
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> Attributes:
>>> units: metre
>>> standard_name: projection_y_coordinate
>>> long_name: y coordinate of projection
Or you might need to see the global attribute of the file, that might contain a projection field by
dataset.attrs
Select the time
As for the time dimension, xarray
converts automatically the difficult to read "seconds past" format of netCDF4
in a more human readable np.datetime64
.
However, I prefer to select dates with the help of pandas.to_datetime()
method, that easily converts strings to datetime.datetime
(which is even more human readable than np.datetime
).
I am sure there are similar methods in xarray
, or even the exact same since xarray
inherits many functions from pandas
and numpy
, but I can never be bothered looking for them.
import pandas.to_datetime as to_datetime
all_dates = dataset.time.values
sel_dates = [date for date in all_dates if to_datetime(date) >= to_datetime("01/01/1948")] #select dates past 1st Jan 1948
result = subset.sel(time=sel_dates)
all done! just select the variable you are querying, in my case relative_humidity
result.relative_humidity.values
>>> array([0.1,...,0.2], dtype=float32)
It is a np.ndarray
of shape=(n_selected_dates, n_selected pixels)
; in this case 1 pixel as we passed only one value for x, y.
answered Mar 2 at 2:03
Andrea MassettiAndrea Massetti
333312
333312
add a comment |
add a comment |
Thanks for contributing an answer to Geographic Information Systems Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fgis.stackexchange.com%2fquestions%2f313125%2fextract-data-from-netcdf-file-based-on-station%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Hello, can you update the question with the result of
dataset = xr.open_dataset("D:/ushcn_tmax.nc")
>>>dataset
?– Andrea Massetti
Mar 2 at 1:05