The IDA Near Real-Time System

Peter Davis, Jon Berger and David Chavez
University of California, San Diego

Figure 1. Schematic representation of IDA NRTS architecture. Data acquired from geophysical instruments at a field site are fed into a Solaris workstation. From there, the data are available to users on the Internet over TCP/IP connections carried by telephone circuits (analog or digital, in either dialup or continuous leased mode), satellite links, or a local area network (LAN). Once on the Internet, data can be easily accessed by individual investigators and organizations interested in monitoring seismic activity in near real-time. The system is actively used by agencies charged with monitoring earthquakes, clandestine nuclear tests, and tsunamis.

For the past seven years, the University of California, San Diego (UCSD) has used the Near Real-Time System (NRTS), a body of software developed at UCSD with funding from IRIS, to collect IRIS GSN data over the Internet.

In September 1992, NRTS was first used to telemeter data from the Kislovodsk miniarray back to a data collection center in Obninsk, Russia, and from there to San Diego. Since then, the software has undergone major revision and has matured into a robust system capable of acquiring data from a variety of stations. NRTS is now used by the US Geological Survey's National Earthquake Information Center (NEIC) and by the National Oceanic and Atmospheric Administration's (NOAA) tsunami warning centers in Hawaii and Alaska. Data requests made by the IRIS Data Management System's SPYDER® system use NRTS' AutoDRM capabilities. The two IRIS/IDA stations that are also part of the Comprehensive Test Ban Treaty's International Monitoring System use NRTS to transmit data to the US National Data Center.

From its inception, NRTS was designed to meet the many requirements for communicating with a GSN station, where "last kilometer problems" often come into play. For example, power is usually at a premium at GSN stations and can be subject to frequent interruption. Also, the bandwidth of the circuit to the station is often severely limited, and high communications costs create the need to reduce connectivity to minutes per day. With these and similar restrictions in mind, NRTS was designed around the TCP/IP protocol suite, and can thus use the Internet and its associated long-haul telecommunications infrastructure. By basing data acquisition and transmission upon the TCP/IP protocols, the task of connecting to a remote location is reduced to the task of bringing the Internet to the station, a problem for which a multitude of off-the-shelf solutions exist. Additionally, the application software on both ends of the circuit can be designed without knowledge of the details of the communications links. As a consequence, the NRTS data management framework permits robust recovery from interruptions in communications links. The problem of restricted bandwidth is alleviated by node replication at NRTS hubs located where connectivity is less bandwidth-limited. All of these features are very important at GSN stations, which tend to lie at the very periphery of the cyber universe.

Figure 2. Photo of IRIS/IDA station KDAK (Kodiak, Alaska), one of the US IMS seismic stations. In the foreground is a cover to protect the wellhead, and in back, a shed housing recording equipment. The round object in front of the shed is a tank containing propane fuel for the station's thermoelectric generator.

System Architecture
NRTS runs on any POSIX-compliant UNIX platform. It accepts a data stream as input, writes the data to a disk loop, and then services data requests from that loop. Once in the loop, data may be requested, either in segments or as continuous feeds, in miniSEED, SAC, CSS, or GSE (Alpha or Beta) formats. If a continuous feed is requested, those data are passed on with little additional latency: each packet input to the NRTS host is immediately output by the data request server.
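The disk loop described above behaves like a fixed-size circular buffer that serves both segment requests and continuous feeds. The following is a minimal in-memory sketch of that idea, not the NRTS implementation: the `Packet` and `DiskLoop` names, the packet fields, and the callback-based feed are all assumptions made for illustration, and output format conversion is omitted.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet: station code, start/end times in seconds, payload."""
    station: str
    start: float
    end: float
    payload: bytes


class DiskLoop:
    """In-memory stand-in for a disk loop: retains only the newest packets."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen discards the oldest packet once it is full.
        self._packets = deque(maxlen=capacity)
        self._subscribers = []

    def subscribe(self, callback):
        """Register a continuous feed; callback runs for each new packet."""
        self._subscribers.append(callback)

    def write(self, packet: Packet):
        """Store a packet, then pass it straight on to every continuous feed."""
        self._packets.append(packet)
        for callback in self._subscribers:
            callback(packet)

    def request(self, start: float, end: float):
        """Return retained packets overlapping the half-open window [start, end)."""
        return [p for p in self._packets if p.start < end and p.end > start]
```

With a capacity of three packets, writing five one-second packets leaves only the last three available to segment requests, while a subscribed feed still receives all five as they arrive.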

A computer running NRTS may be configured either as a station host or as a hub. A station host accepts data locally and stores them in a disk loop whose configurable length is limited only by disk size. The host's data server can satisfy requests for any data retained within that disk loop. At many UCSD stations, the loop length is set to one week. A hub accepts data feeds from one or more station hosts. The hub's data server may accept all or part of the data available from a given station host. The amount of data transferred to the hub is limited only by the bandwidth and cost of the circuit connecting host and hub. There are currently three principal NRTS hubs: one at UCSD in La Jolla, one in Obninsk, Russia, and one at IRIS in Washington, DC.
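To give a feel for how loop length relates to disk size, the arithmetic can be sketched as below. The channel count, sample rate, and sample width are purely illustrative assumptions (they are not figures from any IDA station), and the estimate ignores compression and packet headers.

```python
SECONDS_PER_DAY = 86_400


def loop_bytes(channels, samples_per_second, bytes_per_sample, days):
    """Uncompressed storage needed to retain `days` of continuous data."""
    return int(channels * samples_per_second * bytes_per_sample
               * days * SECONDS_PER_DAY)


# Hypothetical station: 3 channels at 20 samples/s, 4-byte samples, 1-week loop.
week = loop_bytes(channels=3, samples_per_second=20, bytes_per_sample=4, days=7)
print(f"{week / 1e6:.1f} MB")  # prints "145.2 MB"
```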

Data requests may be directed to either a hub or a station host. If a hub's data server cannot satisfy a request from data already transmitted to that hub, then the data server consults an ordered list of NRTS data servers known to handle data from the desired station(s). These servers may be running either on the station host or on other NRTS hubs. In cases where the circuit to a station is bandwidth-limited, it is desirable to direct data requests first to the hub rather than the station. All requests that can be satisfied at the hub are fulfilled from there, and only those data not already at the hub are requested from the station, thus avoiding duplicate transmission.
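The request handling described above amounts to finding the gaps in what the hub already holds and asking each server on the ordered list only for those gaps. The sketch below illustrates that logic under simplifying assumptions: data holdings are modeled as (start, end) time intervals for a single station, and the function names and server names are hypothetical rather than part of NRTS.

```python
def missing_intervals(request, held):
    """Return the parts of `request` (start, end) not covered by `held`."""
    start, end = request
    gaps = []
    cursor = start
    for h_start, h_end in sorted(held):
        if h_end <= cursor:
            continue  # interval lies entirely before the uncovered part
        if h_start >= end:
            break  # interval lies entirely after the request
        if h_start > cursor:
            gaps.append((cursor, h_start))
        cursor = max(cursor, h_end)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


def plan_fetches(request, hub_held, upstream):
    """Plan which intervals to fetch from which upstream server.

    `upstream` is an ordered list of (server_name, held_intervals) pairs.
    Returns (fetches, unresolved): fetches is a list of (server_name, interval)
    and unresolved lists intervals that no server could supply.
    """
    remaining = missing_intervals(request, hub_held)  # hub serves the rest
    fetches = []
    for name, held in upstream:
        if not remaining:
            break
        still_missing = []
        for gap in remaining:
            uncovered = missing_intervals(gap, held)
            # The part of the gap this server does hold is fetched from it.
            covered = missing_intervals(gap, uncovered)
            fetches.extend((name, piece) for piece in covered)
            still_missing.extend(uncovered)
        remaining = still_missing
    return fetches, remaining
```

For example, if a hub holds the interval (0, 60) of a request for (0, 100), a second hub holds (60, 80), and the station host holds everything, the plan fetches (60, 80) from the second hub and only (80, 100) over the station circuit.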

Figure 3. IRIS/IDA seismographic stations currently accessible via telemetry.

The KDAK Example
Telemetry from station KDAK (Kodiak, Alaska) is a good example of how NRTS manages data retrieval over a complicated circuit. All data recorded on site are sent via spread spectrum radio to a PC at a Coast Guard facility three kilometers away. NRTS running on the PC stores the data and retransmits a portion over a leased telephone line to a Sun workstation at the University of Alaska, Fairbanks. NRTS on that Sun stores the data and retransmits them over the Internet to a number of users around the world. The bandwidth of the telephone leg is insufficient to transfer all data recorded at KDAK. Data not routinely transmitted over this circuit may be obtained by sending AutoDRM requests to IDA's server, idahub.ucsd.edu. NRTS retrieves the requested segments and returns the data to the user via email.

Future Developments for NRTS
During the coming year, IDA expects to establish telemetry to most of the remaining stations not yet reachable. As additional circuits are put in place and the bandwidth of existing circuits is broadened, more data than ever will be available in near real-time. The architecture of NRTS is well designed to accommodate these changes. In fact, a few changes to configuration files are all that is required to convert a bandwidth-limited dialup circuit that handles only low-rate and state-of-health data into one that continuously handles the entire output of a GSN station.

As projects such as EarthScope make telemetered data even easier for end-users to access, the node-replicating capabilities of NRTS will come into full play. Data will be routinely copied to nodes that can be easily accessed, thus preventing the circuit over "the last kilometer" from being overwhelmed by servicing requests. The recipient need never know (or care) about the details of how data are retrieved from a station halfway around the world.