wallingTACC edited this page May 15, 2016 · 3 revisions

Compute Resources

We have created an allocation on TACC's Wrangler data system (https://www.tacc.utexas.edu/systems/wrangler). Wrangler includes 500TB of high-speed flash storage attached to 96 compute nodes, each with 24 cores and 128GB of RAM. Optimized versions of popular data analytics tools are pre-installed, including R, Python and Hadoop.

Please see David Walling in conf1 to get access to this system.

Pre-staged Datasets

We have pre-staged datasets related to this hackathon at the following location: /data/shared/zika

c252-101.wrangler(20)# du -ksh /data/shared/zika/*
26G     /data/shared/zika/austin_aerial
23M     /data/shared/zika/github
47G     /data/shared/zika/pubmed

In addition to the data available on GitHub, we have included a collection of aerial photography images of the Austin area, as well as a download of the open-access subset from PubMed.
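To get a feel for what each staged dataset contains before loading it, it can help to tally files by extension. The following is a minimal sketch in Python, not a TACC-provided tool: the `inventory` helper and its output layout are illustrative assumptions, and only the `/data/shared/zika` path comes from this page. It reports apparent file sizes, so totals may differ somewhat from the `du` figures above.

```python
import os
from collections import Counter
from pathlib import Path

# Location of the pre-staged datasets, as given on this page.
DATA_ROOT = Path("/data/shared/zika")


def inventory(root):
    """Return (file counts, byte totals), each keyed by lowercase extension."""
    counts, sizes = Counter(), Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Files without an extension are grouped under "(none)".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                nbytes = os.path.getsize(path)
            except OSError:
                continue  # skip unreadable files and broken symlinks
            counts[ext] += 1
            sizes[ext] += nbytes
    return counts, sizes


if __name__ == "__main__":
    if DATA_ROOT.is_dir():
        for dataset in sorted(p for p in DATA_ROOT.iterdir() if p.is_dir()):
            counts, sizes = inventory(dataset)
            print(dataset)
            # Show the five extensions that occupy the most space.
            for ext, nbytes in sizes.most_common(5):
                print(f"  {ext:>8}  {counts[ext]:>8} files  {nbytes:>14} bytes")
    else:
        print(f"{DATA_ROOT} not found; run this on a Wrangler node")
```

Walking 47G of small files can take a while, so pointing `inventory` at a single subdirectory first is a reasonable way to start.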

Resource Reservations

For this hackathon, we have created a 10-node Hadoop cluster, available under the reservation ID: hadoop+Zika+1487

In order to submit jobs to this cluster, you must:

  • Create a TACC account.
  • See David Walling to get added to the project allocation.
  • SSH to Wrangler: $> ssh username@wrangler.tacc.utexas.edu
  • Start an interactive session: $> idev -r hadoop+Zika+1487
  • Interact with the cluster from the command line, e.g.: hadoop fs -ls /tmp/zika
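Once inside the interactive session, the last step can also be driven from a script. The sketch below wraps the `hadoop fs -ls` command shown above and parses its text output. It assumes the usual eight-column listing (permissions, replication, owner, group, size, date, time, path); the helper names `parse_hadoop_ls` and `hdfs_ls` are illustrative and not part of Hadoop or the Wrangler setup.

```python
import shutil
import subprocess


def parse_hadoop_ls(output):
    """Parse the text printed by `hadoop fs -ls` into a list of dicts."""
    entries = []
    for line in output.splitlines():
        # Assumed entry layout: permissions, replication, owner, group,
        # size, date, time, path. Header lines such as "Found N items"
        # have fewer fields and are skipped.
        parts = line.split(None, 7)
        if len(parts) != 8 or not parts[4].isdigit():
            continue
        perms, _replication, owner, group, size, date, time, path = parts
        entries.append({
            "is_dir": perms.startswith("d"),
            "owner": owner,
            "group": group,
            "size": int(size),
            "modified": f"{date} {time}",
            "path": path,
        })
    return entries


def hdfs_ls(path):
    """Run `hadoop fs -ls <path>` and return the parsed entries."""
    result = subprocess.run(
        ["hadoop", "fs", "-ls", path],
        capture_output=True, text=True, check=True,
    )
    return parse_hadoop_ls(result.stdout)


if __name__ == "__main__":
    if shutil.which("hadoop") is None:
        print("hadoop not on PATH; run this inside the idev session")
    else:
        try:
            for entry in hdfs_ls("/tmp/zika"):
                kind = "dir " if entry["is_dir"] else "file"
                print(kind, entry["size"], entry["path"])
        except subprocess.CalledProcessError as err:
            print("hadoop fs -ls failed:", err.stderr)
```

Keeping the parsing separate from the subprocess call means the parser can be checked against saved listings without access to the cluster.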

Interactive Consoles

RStudio, Jupyter and general VNC sessions are available on the Wrangler compute nodes through our visualization portal: http://vis.tacc.utexas.edu

After logging into the portal, select Wrangler under the 'Jobs' tab and follow the prompts for launching your sessions.