‘Big data’ refers to datasets too large to be managed by traditional data-processing software, and its compilation, management and sharing are increasingly becoming a key part of research in fields across the academic spectrum. Beyond university research and academia, such data is also used in governmental, economic and manufacturing research projects, and can be gathered in a multitude of ways, from mobile devices to wireless sensor networks. Analysis of such huge datasets via methods such as predictive analytics and user behaviour analytics has great potential (and indeed has already been used) to improve human lives, for example by preventing disease spread, fighting crime and predicting economic trends.
Along with the opportunities for discovery and innovation that big data has brought, there have also come problems for individuals and institutions. To deal with such gargantuan and unwieldy datasets, analysing them, sharing them with colleagues internally, and collaborating externally and globally, equally massive computing power and high-speed network capacity are required. Building a usable and effective research network has become a key strategy for research and teaching institutions to ensure that they can work and partner with other researchers and take part in larger projects, as well as making sure they remain centres that entice the most exciting researchers through their doors.
Whilst R1 universities in the United States have led the way in implementing such high-speed research networks, smaller institutions have been at risk of falling behind. The private University of St. Thomas, Minnesota, was one such centre. Without the funding to implement a high-speed research network and Science DMZ (a network configuration that is secure but without the performance limits that come with passing data through a standard firewall), researchers at St. Thomas were unable to reach their full research potential. However, all that is set to change.
Around two years ago, St. Thomas Chief Information Officer and Vice President of Innovation & Technology Services Ed Clark, and Associate Director of Research Computing Eric Tornoe, set out to ensure that the college not only implemented a new high-speed network to match those of the best research centres and major laboratories around the world, but also secured the longer-term growth of such a system. At the time, whilst the institution’s basic networking was up-to-date, requests for further research support had been increasing. With the assistance of Charles Nguyen, the IT Manager of the University of Minnesota St. Anthony Falls Laboratory, Tornoe put together St. Thomas’ first National Science Foundation (NSF) grant application for funding to develop the new network. Unfortunately, due in large part to the extremely short lead time between beginning and submitting the application, and despite some positive reviewer comments, this application was unsuccessful.
Building a usable and effective research network has become a key strategy for research and teaching institutions.
Undeterred, however, and aware of how important such a development would be for the university, when another call for grant applications was put out by the NSF a few months later, Tornoe and Nguyen tried again. This time the grant in question was better suited to St. Thomas – it had built-in time for the network design, something the college needed. The team also added Will Bear, Associate Vice President for Infrastructure and Applications at St. Thomas, whose involvement was instrumental in creating a better technical design. In turn, this allowed the crafting of a more compelling narrative about how the funding would shape the future of the institution. Tornoe and Nguyen submitted their second application, and after a long wait the good news came: the grant application had been successful. The NSF pledged almost $400,000 to St. Thomas to develop their high-speed network, allowing new faculty to continue their research at the same level as, or a higher level than, at their previous institutions, and current faculty to move forward and engage in their research using a dedicated network unconstrained by the limits of commodity internet. Student researchers also benefit from working with a state-of-the-art research network, allowing them access to real-world systems and the opportunity to develop skills that directly translate to future employment opportunities. With a two-year grant period (and the possibility of a one-year extension) as well as the commitment of St. Thomas administration to the ongoing maintenance of the network beyond the grant period, the stage was set for work to begin.
St. Thomas will enter a new era of research capability, allowing faculty and students to collaborate with top research centres and laboratories across the globe.
The Tommie Science Network
The aim of the new high-speed network development team at St. Thomas is to build a consolidated, centrally-supported environment that will provide all the tools and capacity needed for the data-intensive research requirements of researchers now and in the future. Working closely with experienced team members from the University of Minnesota, the team aims to implement the college’s long-term Cyberinfrastructure plan: the creation of the ‘Tommie Science Network’ connected to a Science DMZ, whilst guaranteeing the quality and sustainability of the college’s research infrastructure. These improvements will allow for fast transfers of large datasets between campus locations, to the Cloud and to other institutions (both national and international) via Internet2, opening the door for management of significantly larger datasets and real-time collaborations, opportunities that are not supported by the current network.
These new speeds will be achieved through a variety of mechanisms on the Tommie Science Network. Firstly, a network ‘switch’ that can handle the higher speeds will be implemented. This is the single most expensive item of the new network, costing over $100,000. Secondly, the problem of the firewall and security needs addressing; typically, a science network needs to sit outside a firewall because a modern firewall inspects every packet that passes through it, which in the case of big data causes astronomical slowdowns in research traffic. To maintain security outside of the firewall, either Access Control Lists can be applied to the switch, allowing pre-authorised traffic to bypass the firewall, or an enterprise firewall can be employed that detects the nature of incoming traffic and routes it where appropriate through the science network. The latter method maintains speed and is more secure, as the pipeline is only open during transmission rather than idly waiting for traffic. Other components such as Globus (a high-speed file transfer and management system), PerfSonar (a tool for monitoring and maintaining research network speeds) and Data Transfer Nodes (which sit in the Science DMZ and consist of very fast computers with very large, very fast storage arrays) will be employed as part of the new system. The University of Minnesota’s Research Networking Team will play a key part in the effort to design and build the new network; as well as having significant expertise in this area, the institution also has access to the Northern Lights GigaPOP, where the Tommie Science Network will hook up to Internet2.
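To make the Access Control List approach concrete, the sketch below simulates the decision an ACL on the Science DMZ switch makes: traffic from pre-authorised sources on expected ports takes the fast path around the firewall, while everything else is routed through the standard campus firewall. This is an illustrative model only, not vendor ACL syntax, and all addresses, networks and port numbers here are hypothetical examples rather than details of the actual St. Thomas configuration.

```python
import ipaddress

# Hypothetical pre-authorised collaborator networks (documentation ranges).
ALLOWED_SOURCES = [
    ipaddress.ip_network("192.0.2.0/24"),     # e.g. a partner laboratory
    ipaddress.ip_network("198.51.100.0/24"),  # e.g. an Internet2 peer site
]

# Hypothetical ports a data-transfer service might use.
ALLOWED_PORTS = {2811, 443}

def bypass_firewall(src_ip: str, dst_port: int) -> bool:
    """Return True if a flow matches the ACL and may bypass the firewall."""
    addr = ipaddress.ip_address(src_ip)
    return dst_port in ALLOWED_PORTS and any(
        addr in net for net in ALLOWED_SOURCES
    )

# Matching flows use the Science DMZ fast path; everything else goes
# through the standard (slower, deep-inspecting) campus firewall.
print(bypass_firewall("192.0.2.17", 2811))   # pre-authorised transfer: True
print(bypass_firewall("203.0.113.5", 2811))  # unknown source: False
```

The key trade-off the article describes is visible here: the ACL checks only coarse header fields (source network, port), so it adds essentially no per-packet cost, whereas a conventional firewall inspects the contents of every packet.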
A bright research future
With their successful grant application Ed Clark, Eric Tornoe, Will Bear, Charles Nguyen and the rest of the network team have begun the process of a serious research upgrade for The University of St. Thomas. With the first year of the grant period dedicated to the Tommie Science Network design, meetings considering the technical, organisational and support structures needed have already begun between the University of Minnesota research network staff and St. Thomas researchers and infrastructure personnel. The team strongly believes that once the network goes live (in two to three years’ time), St. Thomas will enter a new era of research capability, allowing faculty and students to collaborate with top research centres and laboratories across the globe with a seamless and efficient handling of large datasets that was hitherto impossible. With this new capability the university hopes it will be able to add to its already exemplary pool of faculty, attracting top researchers and students by offering them research support and resources usually only available at larger institutions. It would seem that with the Tommie Science Network, The University of St. Thomas has secured its future as a cutting-edge, innovative and forward-thinking research and education centre that will hold its place at the sharp end of scientific research for years to come.
Are there any particular projects at St. Thomas that you feel will benefit from the implementation of the Tommie Science Network?
Many of the projects that will be served by the new network were instrumental in demonstrating our need to the NSF. Dr Chih Lai, a co-PI for the grant, and his students are creating Predictive Models with Big Data Technology for Precision Agriculture. This project will benefit significantly from greater network speeds as they ingest and analyse daily 30 TB datasets from remote research locations. Dr Tommie Marrinan is performing real-time local analysis of remote large-scale simulation data, which requires streaming capacity of at least 10 Gbps from various supercomputing facilities around the world, including Argonne National Laboratory and Sandia National Laboratory.
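A quick back-of-the-envelope calculation shows why figures like a daily 30 TB ingest put such projects beyond commodity internet. The sketch below (assuming decimal terabytes and ignoring protocol overhead, which would push the real requirement somewhat higher) converts a daily data volume into the sustained rate needed to move it.

```python
def required_gbps(terabytes_per_day: float, hours: float = 24.0) -> float:
    """Sustained rate (Gbps) needed to move a daily volume in a given window."""
    bits = terabytes_per_day * 1e12 * 8  # decimal TB -> bits
    seconds = hours * 3600.0
    return bits / seconds / 1e9          # bits per second -> Gbps

# 30 TB spread over a full 24 hours still needs ~2.78 Gbps, continuously.
print(f"{required_gbps(30):.2f} Gbps")

# Compressed into an 8-hour overnight window, it needs ~8.33 Gbps.
print(f"{required_gbps(30, hours=8):.2f} Gbps")
```

Even the gentlest schedule demands multi-gigabit sustained throughput, and a shorter transfer window approaches the 10 Gbps streaming capacity cited for the simulation-analysis work, which is exactly the class of traffic the Tommie Science Network and its Science DMZ are designed to carry.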