
Commit

Merge pull request #300 from ldecicco-USGS/master
https and tz
ldecicco-USGS authored Jan 20, 2017
2 parents 094dc32 + ae0e8f7 commit 961f957
Showing 15 changed files with 84 additions and 92 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Package: dataRetrieval
Type: Package
Title: Retrieval Functions for USGS and EPA Hydrologic and Water Quality Data
-Version: 2.6.6
-Date: 2016-12-15
+Version: 2.6.7
+Date: 2017-01-20
Authors@R: c( person("Robert", "Hirsch", role = c("aut"),
email = "[email protected]"),
person("Laura", "DeCicco", role = c("aut","cre"),
6 changes: 1 addition & 5 deletions R/AAA.R
@@ -5,8 +5,4 @@ pkg.env <- new.env()
options(Access.dataRetrieval = NULL)
}

-.onAttach = function(libname, pkgname){
-  packageStartupMessage("USGS is switching from http to https:
-  Please see https://help.waterdata.usgs.gov/news/December%205%2C%202016
-  for more information.")
-}

20 changes: 9 additions & 11 deletions R/importRDB1.r
@@ -7,10 +7,11 @@
#'
#' @param obs_url character containing the url for the retrieval or a file path to the data file.
#' @param asDateTime logical, if \code{TRUE} returns date and time as POSIXct, if \code{FALSE}, Date
-#' @param tz character to set timezone attribute of datetime. Default is an empty quote, which converts the
-#' datetimes to UTC (properly accounting for daylight savings times based on the data's provided tz_cd column).
-#' Possible values to provide are "America/New_York","America/Chicago", "America/Denver","America/Los_Angeles",
-#' "America/Anchorage","America/Honolulu","America/Jamaica","America/Managua","America/Phoenix", and "America/Metlakatla"
+#' @param tz character to set timezone attribute of datetime. Default converts the datetimes to UTC
+#' (properly accounting for daylight savings times based on the data's provided tz_cd column).
+#' Recommended US values include "UTC","America/New_York","America/Chicago", "America/Denver","America/Los_Angeles",
+#' "America/Anchorage","America/Honolulu","America/Jamaica","America/Managua","America/Phoenix", and "America/Metlakatla".
+#' For a complete list, see \url{https://en.wikipedia.org/wiki/List_of_tz_database_time_zones}
#' @param convertType logical, defaults to \code{TRUE}. If \code{TRUE}, the function will convert the data to dates, datetimes,
#' numerics based on a standard algorithm. If false, everything is returned as a character
#' @return A data frame with the following columns:
@@ -83,16 +84,13 @@
#' fullPath <- file.path(filePath, fileName)
#' importUserRDB <- importRDB1(fullPath)
#'
-importRDB1 <- function(obs_url, asDateTime=TRUE, convertType = TRUE, tz=""){
+importRDB1 <- function(obs_url, asDateTime=TRUE, convertType = TRUE, tz="UTC"){

if(tz != ""){
tz <- match.arg(tz, c("America/New_York","America/Chicago",
"America/Denver","America/Los_Angeles",
"America/Anchorage","America/Honolulu",
"America/Jamaica","America/Managua",
"America/Phoenix","America/Metlakatla","UTC"))
if(tz == ""){
tz <- "UTC"
}

tz <- match.arg(tz, OlsonNames())

if(file.exists(obs_url)){
doc <- obs_url
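The tz change above drops the hard-coded whitelist of ten US timezones in favor of base R's OlsonNames(), so any IANA timezone identifier is now accepted. A minimal standalone sketch of the new validation pattern (validate_tz is a hypothetical helper for illustration, not a package function):

validate_tz <- function(tz = "UTC"){
  # An empty string falls back to UTC, preserving the old default behavior.
  if(tz == ""){
    tz <- "UTC"
  }
  # OlsonNames() lists every IANA timezone known to this R installation;
  # match.arg() errors informatively on anything not in that list.
  match.arg(tz, OlsonNames())
}

validate_tz("America/Chicago")  # "America/Chicago"
validate_tz("")                 # "UTC"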
8 changes: 3 additions & 5 deletions R/importWQP.R
@@ -47,11 +47,9 @@
importWQP <- function(obs_url, zip=FALSE, tz=""){

if(tz != ""){
tz <- match.arg(tz, c("America/New_York","America/Chicago",
"America/Denver","America/Los_Angeles",
"America/Anchorage","America/Honolulu",
"America/Jamaica","America/Managua",
"America/Phoenix","America/Metlakatla"))
tz <- match.arg(tz, OlsonNames())
} else {
tz <- "UTC"
}

if(!file.exists(obs_url)){
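Since importWQP also accepts a local file path (note the file.exists check above), the relaxed tz argument might be used like this; the file name is hypothetical:

# Sketch: import a saved WQP tab-separated result file, localizing
# datetimes to US Central time (any OlsonNames() value is now valid).
wqpFile <- "wqp_results.tsv"   # hypothetical path
if(file.exists(wqpFile)){
  wqpData <- importWQP(wqpFile, zip = FALSE, tz = "America/Chicago")
}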
29 changes: 14 additions & 15 deletions R/importWaterML1.r
@@ -5,10 +5,11 @@
#'
#' @param obs_url character or raw, containing the url for the retrieval or a file path to the data file, or raw XML.
#' @param asDateTime logical, if \code{TRUE} returns date and time as POSIXct, if \code{FALSE}, Date
-#' @param tz character to set timezone attribute of datetime. Default is an empty quote, which converts the
-#' datetimes to UTC (properly accounting for daylight savings times based on the data's provided tz_cd column).
-#' Possible values to provide are "America/New_York","America/Chicago", "America/Denver","America/Los_Angeles",
-#' "America/Anchorage","America/Honolulu","America/Jamaica","America/Managua","America/Phoenix", and "America/Metlakatla"
+#' @param tz character to set timezone attribute of datetime. Default converts the datetimes to UTC
+#' (properly accounting for daylight savings times based on the data's provided tz_cd column).
+#' Recommended US values include "UTC","America/New_York","America/Chicago", "America/Denver","America/Los_Angeles",
+#' "America/Anchorage","America/Honolulu","America/Jamaica","America/Managua","America/Phoenix", and "America/Metlakatla".
+#' For a complete list, see \url{https://en.wikipedia.org/wiki/List_of_tz_database_time_zones}
#' @return A data frame with the following columns:
#' \tabular{lll}{
#' Name \tab Type \tab Description \cr
@@ -108,7 +109,7 @@
#' importFile <- importWaterML1(fullPath,TRUE)
#'

-importWaterML1 <- function(obs_url,asDateTime=FALSE, tz=""){
+importWaterML1 <- function(obs_url,asDateTime=FALSE, tz="UTC"){
#note: obs_url is a dated name, does not have to be a url/path
raw <- FALSE
if(class(obs_url) == "character" && file.exists(obs_url)){
@@ -120,13 +121,10 @@ importWaterML1 <- function(obs_url,asDateTime=FALSE, tz=""){
returnedDoc <- xml_root(getWebServiceData(obs_url, encoding='gzip'))
}

if(tz != ""){ #check tz is valid if supplied
tz <- match.arg(tz, c("America/New_York","America/Chicago",
"America/Denver","America/Los_Angeles",
"America/Anchorage","America/Honolulu",
"America/Jamaica","America/Managua",
"America/Phoenix","America/Metlakatla"))
}else{tz <- "UTC"}
if(tz == ""){ #check tz is valid if supplied
tz <- "UTC"
}
tz <- match.arg(tz, OlsonNames())

timeSeries <- xml_find_all(returnedDoc, ".//ns1:timeSeries") #each parameter/site combo

@@ -189,9 +187,8 @@ importWaterML1 <- function(obs_url,asDateTime=FALSE, tz=""){

nObs <- length(values)
qual <- xml_attr(obs,"qualifiers")
-    if(all(is.na(qual))){
-      noQual <- TRUE
-    }else{noQual <- FALSE}
+    noQual <- all(is.na(qual))

dateTime <- xml_attr(obs,"dateTime")
if(asDateTime){
@@ -335,6 +332,8 @@ importWaterML1 <- function(obs_url,asDateTime=FALSE, tz=""){
mergedDF <- mergedDF[c(mergedNames[-tzLoc],mergedNames[tzLoc])]
mergedDF <- arrange(mergedDF,site_no, dateTime)

+  names(mergedDF) <- make.names(names(mergedDF))

#attach other site info etc as attributes of mergedDF
if(!raw){
attr(mergedDF, "url") <- obs_url
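The added make.names() call coerces merged column names into syntactically valid R names, which matters when merged WaterML series produce names containing spaces or colons. Its base R behavior, for illustration (the example names are hypothetical):

# Invalid characters become dots; already-valid names pass through unchanged.
make.names(c("agency_cd", "site no", "X_00060_00003", "USGS:01491000"))
# [1] "agency_cd"     "site.no"       "X_00060_00003" "USGS.01491000"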
20 changes: 9 additions & 11 deletions R/importWaterML2.r
@@ -5,10 +5,11 @@
#'
#' @param obs_url character or raw, containing the url for the retrieval or a path to the data file, or raw XML.
#' @param asDateTime logical, if \code{TRUE} returns date and time as POSIXct, if \code{FALSE}, character
-#' @param tz character to set timezone attribute of datetime. Default is an empty quote, which converts the
-#' datetimes to UTC (properly accounting for daylight savings times based on the data's provided tz_cd column).
-#' Possible values to provide are "America/New_York","America/Chicago", "America/Denver","America/Los_Angeles",
-#' "America/Anchorage","America/Honolulu","America/Jamaica","America/Managua","America/Phoenix", and "America/Metlakatla"
+#' @param tz character to set timezone attribute of datetime. Default converts the datetimes to UTC
+#' (properly accounting for daylight savings times based on the data's provided tz_cd column).
+#' Recommended US values include "UTC","America/New_York","America/Chicago", "America/Denver","America/Los_Angeles",
+#' "America/Anchorage","America/Honolulu","America/Jamaica","America/Managua","America/Phoenix", and "America/Metlakatla".
+#' For a complete list, see \url{https://en.wikipedia.org/wiki/List_of_tz_database_time_zones}
#' @return mergedDF a data frame time, value, description, qualifier, and identifier
#' @export
#' @importFrom xml2 read_xml
@@ -39,15 +40,12 @@
#' fullPath <- file.path(filePath, fileName)
#' UserData <- importWaterML2(fullPath)
#'
-importWaterML2 <- function(obs_url, asDateTime=FALSE, tz=""){
+importWaterML2 <- function(obs_url, asDateTime=FALSE, tz="UTC"){

-  if(tz != ""){
-    tz <- match.arg(tz, c("America/New_York","America/Chicago",
-                          "America/Denver","America/Los_Angeles",
-                          "America/Anchorage","America/Honolulu",
-                          "America/Jamaica","America/Managua",
-                          "America/Phoenix","America/Metlakatla"))
-  }else{tz = "UTC"}
+  if(tz == ""){
+    tz = "UTC"
+  }
+  tz <- match.arg(tz, OlsonNames())

raw <- FALSE
if(class(obs_url) == "character" && file.exists(obs_url)){
2 changes: 1 addition & 1 deletion R/tabbedDataRetrievals.R
@@ -87,7 +87,7 @@ NULL

#' US State Code Lookup Table
#'
-#' Data pulled from \url{http://www2.census.gov/geo/docs/reference/state.txt}
+#' Data pulled from \url{https://www2.census.gov/geo/docs/reference/state.txt}
#' on April 1, 2015.
#'
#' @name stateCd
26 changes: 13 additions & 13 deletions inst/doc/dataRetrieval.Rnw
@@ -275,7 +275,7 @@ Table \ref{tab:func} describes the functions available in the dataRetrieval pack
%------------------------------------------------------------
In this section, examples of Web retrievals document how to get raw data. This data includes site information (\ref{sec:usgsSite}), measured parameter information (\ref{sec:usgsParams}), historical daily values (\ref{sec:usgsDaily}), unit values (which include real-time data but can also include other sensor data stored at regular time intervals) (\ref{sec:usgsRT}), water quality data (\ref{sec:usgsWQP}), groundwater level data (\ref{sec:gwl}), peak flow data (\ref{sec:peak}), rating curve data (\ref{sec:rating}), and surface-water measurement data (\ref{sec:meas}). Section \ref{sec:metadata} shows instructions for getting metadata that is attached to each returned data frame.

-The USGS organizes hydrologic data in a standard structure. Streamgages are located throughout the United States, and each streamgage has a unique ID (referred in this document and throughout the dataRetrieval package as \enquote{siteNumber}). Often (but not always), these ID's are 8 digits for surface-water sites and 15 digits for groundwater sites. The first step to finding data is discovering this siteNumber. There are many ways to do this, one is the National Water Information System: Mapper \url{http://maps.waterdata.usgs.gov/mapper/index.html}.
+The USGS organizes hydrologic data in a standard structure. Streamgages are located throughout the United States, and each streamgage has a unique ID (referred in this document and throughout the dataRetrieval package as \enquote{siteNumber}). Often (but not always), these ID's are 8 digits for surface-water sites and 15 digits for groundwater sites. The first step to finding data is discovering this siteNumber. There are many ways to do this, one is the National Water Information System: Mapper \url{https://maps.waterdata.usgs.gov/mapper/index.html}.

Once the siteNumber is known, the next required input for USGS data retrievals is the \enquote{parameter code}. This is a 5-digit code that specifies the measured parameter being requested. For example, parameter code 00631 represents \enquote{Nitrate plus nitrite, water, filtered, milligrams per liter as nitrogen}, with units of \enquote{mg/l as N}.

@@ -351,7 +351,7 @@ siteINFO <- readNWISsite(siteNumbers)
@

Site information is obtained from:
-\url{http://waterservices.usgs.gov/rest/Site-Test-Tool.html}
+\url{https://waterservices.usgs.gov/rest/Site-Test-Tool.html}

Information on the returned data can be found with the \texttt{comment} function as described in section \ref{sec:metadata}.

@@ -453,7 +453,7 @@ parameterINFO <- readNWISpCode(parameterCd)
\subsection{Daily Data}
\label{sec:usgsDaily}
%------------------------------------------------------------
-To obtain daily records of USGS data, use the \texttt{readNWISdv} function. The arguments for this function are siteNumber, parameterCd, startDate, endDate, and statCd (defaults to \texttt{"}00003\texttt{"}). If you want to use the default values, you do not need to list them in the function call. Daily data is pulled from \url{http://waterservices.usgs.gov/rest/DV-Test-Tool.html}.
+To obtain daily records of USGS data, use the \texttt{readNWISdv} function. The arguments for this function are siteNumber, parameterCd, startDate, endDate, and statCd (defaults to \texttt{"}00003\texttt{"}). If you want to use the default values, you do not need to list them in the function call. Daily data is pulled from \url{https://waterservices.usgs.gov/rest/DV-Test-Tool.html}.

The dates (start and end) must be in the format \texttt{"}YYYY-MM-DD\texttt{"} (note: the user must include the quotes). Setting the start date to \texttt{"}\texttt{"} (no space) will prompt the program to ask for the earliest date, and setting the end date to \texttt{"}\texttt{"} (no space) will prompt for the latest available date.

@@ -575,7 +575,7 @@ America/Phoenix
America/Metlakatla
\end{verbatim}

-Data are retrieved from \url{http://waterservices.usgs.gov/rest/IV-Test-Tool.html}. There are occasions where NWIS values are not reported as numbers, instead a common example is \enquote{Ice.} Any value that cannot be converted to a number will be reported as NA in this package. Site information and measured parameter information is attached to the data frame as attributes. This is discused further in section \ref{sec:metadata}.
+Data are retrieved from \url{https://waterservices.usgs.gov/rest/IV-Test-Tool.html}. There are occasions where NWIS values are not reported as numbers, instead a common example is \enquote{Ice.} Any value that cannot be converted to a number will be reported as NA in this package. Site information and measured parameter information is attached to the data frame as attributes. This is discused further in section \ref{sec:metadata}.

\newpage

@@ -673,7 +673,7 @@ surfaceData <- readNWISmeas(siteNumber)
\section{Water Quality Portal Web Retrievals}
\label{sec:usgsSTORET}
%------------------------------------------------------------
-There are additional water quality data sets available from the Water Quality Data Portal (\url{http://www.waterqualitydata.us/}). These data sets can be housed in either the STORET database (data from EPA), NWIS database (data from USGS), STEWARDS database (data from USDA), and additional databases are slated to be included in the future. Because only USGS uses parameter codes, a \texttt{"}characteristic name\texttt{"} must be supplied. The \texttt{readWQPqw} function can take either a USGS parameter code, or a more general characteristic name in the parameterCd input argument. The Water Quality Data Portal includes data discovery tools and information on characteristic names. The following example retrieves specific conductance from a DNR site in Wisconsin.
+There are additional water quality data sets available from the Water Quality Data Portal (\url{https://www.waterqualitydata.us/}). These data sets can be housed in either the STORET database (data from EPA), NWIS database (data from USGS), STEWARDS database (data from USDA), and additional databases are slated to be included in the future. Because only USGS uses parameter codes, a \texttt{"}characteristic name\texttt{"} must be supplied. The \texttt{readWQPqw} function can take either a USGS parameter code, or a more general characteristic name in the parameterCd input argument. The Water Quality Data Portal includes data discovery tools and information on characteristic names. The following example retrieves specific conductance from a DNR site in Wisconsin.


<<label=getQWData, echo=TRUE, eval=FALSE>>=
@@ -683,7 +683,7 @@ specificCond <- readWQPqw('WIDNR_WQX-10032762',

A tool for finding NWIS characteristic names can be found at:

-\url{http://www.waterqualitydata.us/public_srsnames/}
+\url{https://www.waterqualitydata.us/public_srsnames/}

\FloatBarrier

@@ -699,11 +699,11 @@ The previous examples all took specific input arguments: siteNumber, parameterCd
%------------------------------------------------------------
The function \texttt{whatNWISsites} can be used to discover NWIS sites based on any query that the NWIS Site Service offers. This is done by using the \texttt{"..."} argument, which allows the user to use any arbitrary input argument. We can then use the service here:

-\url{http://waterservices.usgs.gov/rest/Site-Test-Tool.html}
+\url{https://waterservices.usgs.gov/rest/Site-Test-Tool.html}

to discover many options for searching for NWIS sites. For example, you may want to search for sites in a lat/lon bounding box, or only sites tidal streams, or sites with water quality samples, sites above a certain altitude, etc. The results of this site query generate a URL. For example, the tool provided a search within a specified bounding box, for sites that have daily discharge (parameter code = 00060) and temperature (parameter code = 00010). The generated URL is:

-\url{http://waterservices.usgs.gov/nwis/site/?format=rdb&bBox=-83.0,36.5,-81.0,38.5&parameterCd=00010,00060&hasDataTypeCd=dv}
+\url{https://waterservices.usgs.gov/nwis/site/?format=rdb&bBox=-83.0,36.5,-81.0,38.5&parameterCd=00010,00060&hasDataTypeCd=dv}

The following dataRetrieval code can be used to get those sites:

@@ -731,10 +731,10 @@ For NWIS data, the function \texttt{readNWISdata} can be used. The argument list
\multicolumn{1}{c}{\textbf{\textsf{Description}}} &
\multicolumn{1}{c}{\textbf{\textsf{Reference URL}}} \\ [0pt]
\hline
-daily values & dv & \url{http://waterservices.usgs.gov/rest/DV-Test-Tool.html}\\
-[5pt]instantaneous & iv & \url{http://waterservices.usgs.gov/rest/IV-Test-Tool.html}\\
-[5pt]groundwater levels & gwlevels & \url{http://waterservices.usgs.gov/rest/GW-Levels-Test-Tool.html}\\
-[5pt]water quality & qwdata & \url{http://nwis.waterdata.usgs.gov/nwis/qwdata}\\
+daily values & dv & \url{https://waterservices.usgs.gov/rest/DV-Test-Tool.html}\\
+[5pt]instantaneous & iv & \url{https://waterservices.usgs.gov/rest/IV-Test-Tool.html}\\
+[5pt]groundwater levels & gwlevels & \url{https://waterservices.usgs.gov/rest/GW-Levels-Test-Tool.html}\\
+[5pt]water quality & qwdata & \url{https://nwis.waterdata.usgs.gov/nwis/qwdata}\\
\hline
\end{tabular}
}
@@ -761,7 +761,7 @@ siteInfo <- attr(dischargeWI, "siteInfo")

Just as with NWIS, the Water Quality Portal (WQP) offers a variety of ways to search for sites and request data. The possible Web service arguments for WQP site searches is found here:

-\url{http://www.waterqualitydata.us/webservices_documentation.jsp}
+\url{https://www.waterqualitydata.us/webservices_documentation}

To discover available sites in the WQP in New Jersey that have measured Chloride, use the function \texttt{whatWQPsites}.

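As a worked counterpart to the vignette's site-discovery text, the bounding-box query behind the generated URL above can be reproduced with \texttt{whatNWISsites}; a sketch, with arguments taken directly from that URL:

library(dataRetrieval)
# Sites in a lat/lon bounding box with daily (dv) temperature (00010)
# and discharge (00060) data, mirroring the URL shown in the vignette.
sites <- whatNWISsites(bBox = "-83.0,36.5,-81.0,38.5",
                       parameterCd = c("00010","00060"),
                       hasDataTypeCd = "dv")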
Binary file modified inst/doc/dataRetrieval.pdf
