Extract data from NetCDF file based on station

I have read other solutions for NetCDF data, but my data are a little different, and I do not know how to extract data from the NetCDF file and save them to CSV files by station. The data contain maximum temperature values for stations. I only need stations located between latitudes 25.74 and 49.05 and between longitudes -93.44 and -116.0. The time format is unusual, and I only need time[7518:43947190], which covers the data from 1948. I want to create multiple CSV files, one per station, each containing the time, tmax, and quality flag for that station.



from netCDF4 import Dataset
dataset = Dataset("D:/ushcn_tmax.nc")

#### Print dimensions ####
print(dataset.file_format)
print(dataset.dimensions.keys())
print(dataset.dimensions['name_strlen'])
print(dataset.dimensions['obs'])
print(dataset.dimensions['station'])

#### Print variables ####
print(dataset.variables.keys())
print(dataset.variables['LON'])
print(dataset.variables['LAT'])
print(dataset.variables['ELEVATION'])
print(dataset.variables['STATION_NAME'])
print(dataset.variables['STATION_INDEX'])
print(dataset.variables['TIME'])
print(dataset.variables['TMAX'])
print(dataset.variables['TMAX_MFLAG'])
print(dataset.variables['TMAX_QFLAG'])
print(dataset.variables['TMAX_SFLAG'])


The dimensions and variables of my data are shown here:



NETCDF3_CLASSIC
[u'name_strlen', u'obs', u'station']
<type 'netCDF4._netCDF4.Dimension'>: name = 'name_strlen', size = 50

<type 'netCDF4._netCDF4.Dimension'>: name = 'obs', size = 43947189

<type 'netCDF4._netCDF4.Dimension'>: name = 'station', size = 1218

[u'LON', u'LAT', u'ELEVATION', u'STATION_NAME', u'STATION_INDEX', u'TIME', u'TMAX', u'TMAX_MFLAG', u'TMAX_QFLAG', u'TMAX_SFLAG']
<type 'netCDF4._netCDF4.Variable'>
float32 LON(station)
standard_name: longitude
long_name: station longitude
units: degrees_east
unlimited dimensions:
current shape = (1218,)
filling off

<type 'netCDF4._netCDF4.Variable'>
float32 LAT(station)
standard_name: latitude
long_name: station latitude
units: degrees_north
unlimited dimensions:
current shape = (1218,)
filling off

<type 'netCDF4._netCDF4.Variable'>
float64 ELEVATION(station)
long_name: elevation above the sea level
standard_name: elevation
units: m
positive: up
axis: Z
unlimited dimensions:
current shape = (1218,)
filling off

<type 'netCDF4._netCDF4.Variable'>
|S1 STATION_NAME(station, name_strlen)
long_name: USHCN station name
cf_role: timeseries_id
unlimited dimensions:
current shape = (1218, 50)
filling off

<type 'netCDF4._netCDF4.Variable'>
int32 STATION_INDEX(obs)
long_name: which station this obs is for
instance_dimension: station
unlimited dimensions:
current shape = (43947189,)
filling off

<type 'netCDF4._netCDF4.Variable'>
float64 TIME(obs)
standard_name: time
long_name: Time
units: decimal day
_FillValue: -9999.0
comment: time calculeted as: year + day_of_year/days_in_year
unlimited dimensions:
current shape = (43947189,)
filling off

<type 'netCDF4._netCDF4.Variable'>
int32 TMAX(obs)
standard_name: TMAX
long_name: maximum temperature
units: degrees F
coordinates: time lat lon elevation
_FillValue: -9999
unlimited dimensions:
current shape = (43947189,)
filling off

<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_MFLAG(obs)
standard_name: TMAX_MFLAG
long_mane: measurement flag for TMAX
flag_values: BDLT
flag_meanings: Blank = no measurement information applicable; B = precipitation total formed from two 12-hour totals; D = precipitation total formed from four six-hour totals; L = temperature appears to be lagged with respect to reported hour of OBServation; T = trace of precipitation, snowfall, or snow depth
unlimited dimensions:
current shape = (43947189,)
filling off

<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_QFLAG(obs)
standard_name: TMAX_QFLAG
long_mane: quality flag for TMAX
flag_values: ADGIKMNORSTWX
flag_meanings: Blank = did not fail any quality assurance check; A = failed accumulation total check; D = failed duplicate check; G = failed gap check; I = failed internal consistency check; K = failed streak/frequent-value check; M = failed megaconsistency check; N = failed naught check; O = failed climatological outlier check; R = failed lagged range check; S = failed spatial consistency check; T = failed temporal consistency check; W = temperature too warm for snow; X = failed bounds check;
unlimited dimensions:
current shape = (43947189,)
filling off

<type 'netCDF4._netCDF4.Variable'>
|S1 TMAX_SFLAG(obs)
standard_name: TMAX_SFLAG
long_mane: source flag for TMAX
flag_values: 0126ABFGHIMQRSX
flag_meanings: Blank = No source (i.e., data value missing); 0 = U.S. Cooperative Summary of the Day (NCDC DSI-3200); 1 = U.S. Preliminary Cooperative Summary of the Day -- Transmitted; 2 = U.S. Preliminary Cooperative Summary of the Day -- Keyed from paper forms; 6 = CDMP Cooperative Summary of the Day (NCDC DSI-3206); A = U.S. Automated Surface Observing System (ASOS) real-time data (since January 1, 2006); B = U.S. ASOS data for October 2000-December 2005 (NCDC DSI-3211); F = U.S. Fort data; G = Official Global Climate Observing System (GCOS) or other government-supplied data; H = High Plains Regional Climate Center real-time data; I = International collection (non U.S. data received through personal contacts); M = Monthly METAR Extract (additional ASOS data); Q = Data from several African countries that had been 'quarantined', that is, withheld from public release until permission was granted from the respective meteorological services; R = NCDC Reference Network Database (Climate Reference Network and Historical Climatology Network-Modernized); S = Global Summary of the Day (NCDC DSI-9618), NOTE: 'S' values are derived from hourly synoptic reports exchanged on the Global Telecommunications System (GTS).Daily values derived in this fashion may differ significantly from 'true' daily data, particularly for precipitation (i.e., use with caution); X = U.S. First-Order Summary of the Day (NCDC DSI-3210)
unlimited dimensions:
current shape = (43947189,)
filling off


I have tried to read the data with:



import xarray as xr
dataset = xr.open_dataset("D:/ushcn_tmax.nc")
df = dataset.sel(lon=-99.30, lat=32.73, method='nearest')


The given lat and lon belong to one station, but I received the error "KeyError: 'lat'".
Is there any way to convert the variables (lat, lon, and time) to dimensions to make them easier to work with? Or is there a way to extract data using station as the dimension?

























  • Hello, can you update the question with the output of running dataset = xr.open_dataset("D:/ushcn_tmax.nc") and then displaying dataset?

    – Andrea Massetti
    Mar 2 at 1:05















python csv netcdf














edited Feb 21 at 20:45









Vince











asked Feb 21 at 20:37









Amy
















1 Answer














Note: I can update this answer with the correct variable names if you add to the question the dataset summary as read by xarray (as I asked in the comment).



You are using two different packages when xarray alone would do the job just fine (xarray can use netCDF4 as a backend, but wraps it in more accessible, understandable, and readable methods, and much more).



import xarray as xr
dataset = xr.open_dataset(r"C:\path_to_ds\dataset.nc")


Inspect the variables



Example with a file I had on my PC:



dataset  # show descriptive data about the whole dataset
>>> <xarray.Dataset>
>>> Dimensions: (time: 63, x: 4000, y: 4000)
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> * x (x) float64 1.5e+06 1.5e+06 1.5e+06 ... 1.6e+06 1.6e+06 1.6e+06
>>> * time (time) datetime64[ns] 2013-06-21T23:53:00 ... 2019-01-13T23:50:56
>>> Data variables:
>>> relative_humidity (time, y, x) float32 ...


Select location



Use the xarray.Dataset.sel() method with the names of the coordinates that you found above. In my example, I will use "x" and "y":



subset = dataset.sel(x=1.504e+06, y=-4.202e+06, method='nearest')


Note: you are assuming that the coordinates are in decimal degrees, but you should check the projection attribute to be sure you are querying in the correct reference system. Depending on how well documented the file's metadata is, you can access the attributes of a coordinate variable with:



dataset.y
>>> <xarray.DataArray 'y' (y: 4000)>
>>> array([-4200012.5, -4200037.5, -4200062.5, ..., -4299937.5, -4299962.5,
-4299987.5])
>>> Coordinates:
>>> * y (y) float64 -4.2e+06 -4.2e+06 -4.2e+06 ... -4.3e+06 -4.3e+06
>>> Attributes:
>>> units: metre
>>> standard_name: projection_y_coordinate
>>> long_name: y coordinate of projection


Or you might need to look at the global attributes of the file, which might contain a projection field:



dataset.attrs
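If the file follows the CF conventions, the check can also be done in code. The sketch below is built on a made-up dataset: the helper names is_projected and grid_mapping_variables are hypothetical, and the attribute values are only examples of what a CF-compliant file may contain. Files with poor metadata will simply return nothing useful.

```python
import numpy as np
import xarray as xr

# Made-up dataset carrying CF-style metadata (all values are illustrative).
demo = xr.Dataset(
    {
        "relative_humidity": (
            ("y", "x"),
            np.zeros((2, 3), dtype="float32"),
            {"grid_mapping": "crs"},
        ),
        "crs": xr.DataArray(0, attrs={"grid_mapping_name": "albers_conical_equal_area"}),
    },
    coords={
        "y": ("y", [-4200012.5, -4200037.5],
              {"units": "metre", "standard_name": "projection_y_coordinate"}),
        "x": ("x", [1500012.5, 1500037.5, 1500062.5],
              {"units": "metre", "standard_name": "projection_x_coordinate"}),
    },
)


def is_projected(ds, name):
    """True if the coordinate's CF standard_name marks it as a projected axis."""
    return ds[name].attrs.get("standard_name", "").startswith("projection_")


def grid_mapping_variables(ds):
    """Names of variables that describe a projection (CF grid mapping)."""
    return [n for n, var in ds.variables.items() if "grid_mapping_name" in var.attrs]


print(is_projected(demo, "y"))        # projected axis, so do not query it in degrees
print(demo["y"].attrs.get("units"))   # units of the axis, if documented
print(grid_mapping_variables(demo))   # variables holding projection parameters
```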


Select the time



As for the time dimension, xarray automatically converts the hard-to-read numeric "seconds since ..." time encoding stored in the netCDF file into the more human-readable np.datetime64.
However, I prefer to select dates with the help of pandas.to_datetime(), which easily converts strings to pandas.Timestamp (a subclass of datetime.datetime, which is even more readable than np.datetime64).
I am sure there are similar methods in xarray, or even the very same ones, since xarray borrows much of its functionality from pandas and numpy, but I can never be bothered to look for them.



from pandas import to_datetime  # to_datetime is a function, not a module, so "import pandas.to_datetime" fails
all_dates = dataset.time.values
start = to_datetime("01/01/1948")
sel_dates = [date for date in all_dates if to_datetime(date) >= start]  # keep dates from 1 Jan 1948 onwards
result = subset.sel(time=sel_dates)
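For reference, xarray can also do this selection directly with a label-based slice, without building a list of dates first. The following is a minimal sketch on a synthetic time series; the dates and values are invented for the example.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series straddling 1 Jan 1948.
times = pd.date_range("1947-12-30", periods=5, freq="D")
series = xr.Dataset(
    {"relative_humidity": ("time", np.linspace(0.1, 0.5, 5).astype("float32"))},
    coords={"time": times},
)

# Label-based slice: everything from 1 Jan 1948 onwards.
# None as the stop value means "until the end of the record".
recent = series.sel(time=slice("1948-01-01", None))

# The same selection written as a boolean mask on the time coordinate.
mask = series.time >= np.datetime64("1948-01-01")
recent_mask = series.isel(time=mask.values)

print(recent.time.values)
```

The slice form relies on the time coordinate being sorted, which is normally the case for station or gridded climate records; the mask form works regardless of ordering.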


All done! Just select the variable you are querying, in my case relative_humidity:



result.relative_humidity.values 
>>> array([0.1,...,0.2], dtype=float32)


It is a np.ndarray with one value per selected date. Because we passed only one value for x and one for y, those two dimensions are dropped and the shape is (n_selected_dates,), i.e. the time series of a single pixel.
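If you have several stations, you do not need to loop over them: passing xarray.DataArray indexers that share a common dimension selects one pixel per station in a single call. This is a sketch on synthetic data; the station names and coordinates are hypothetical, and the station positions are assumed to be in the same reference system as the grid.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic grid: 3 time steps, 4 rows (y), 5 columns (x).
times = pd.date_range("2013-06-21", periods=3, freq="D")
y = np.array([-4200012.5, -4200037.5, -4200062.5, -4200087.5])
x = np.array([1500012.5, 1500037.5, 1500062.5, 1500087.5, 1500112.5])
data = np.arange(3 * 4 * 5, dtype="float32").reshape(3, 4, 5)
demo = xr.Dataset(
    {"relative_humidity": (("time", "y", "x"), data)},
    coords={"time": times, "y": y, "x": x},
)

# Hypothetical station table, already in the grid's projected coordinates.
station_ids = ["A", "B"]
station_x = xr.DataArray([1500015.0, 1500110.0], dims="station",
                         coords={"station": station_ids})
station_y = xr.DataArray([-4200010.0, -4200090.0], dims="station",
                         coords={"station": station_ids})

# Both indexers share the "station" dimension, so x and y are paired
# element by element: one nearest pixel per station, not a 2 x 2 block.
at_stations = demo.sel(x=station_x, y=station_y, method="nearest")

# Fix the dimension order explicitly before reading the values.
rh = at_stations.relative_humidity.transpose("time", "station")
print(rh.shape)  # (number of dates, number of stations)
```

Passing plain lists instead (x=[...], y=[...]) would return every combination of the listed x and y values, which is usually not what you want for station extraction.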






        answered Mar 2 at 2:03









        Andrea Massetti



























