Wrangler

Compute Resources

We have created an allocation on TACC's Wrangler data system (https://www.tacc.utexas.edu/systems/wrangler). Wrangler includes 500 TB of high-speed flash storage attached to 96 compute nodes, each with 24 cores and 128 GB of RAM. Optimized versions of popular data analytics tools are pre-installed, including R, Python, and Hadoop.

Please see David Walling in conf1 to get access to this system.

Pre-staged Datasets

We have pre-staged datasets related to this hackathon at the following location: /data/shared/zika

c252-101.wrangler(20)# du -ksh /data/shared/zika/*
26G     /data/shared/zika/austin_aerial
23M     /data/shared/zika/github
47G     /data/shared/zika/pubmed

In addition to the data available on GitHub, we have included a collection of aerial photography images of the Austin area, as well as a download of the open access subset from PubMed.
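To check what is staged before starting work, the du listing above can be wrapped in a small helper. This is a minimal sketch, not part of the hackathon tooling: the function name dataset_sizes and the ZIKA_DATA variable are hypothetical, and only the default path comes from this page.

```shell
#!/usr/bin/env bash
# Sketch: report the size of each pre-staged dataset directory.
# ZIKA_DATA and dataset_sizes are hypothetical names for this example;
# the default path is the one given on this page.
ZIKA_DATA="${ZIKA_DATA:-/data/shared/zika}"

dataset_sizes() {
    local root="$1"
    if [ ! -d "$root" ]; then
        echo "dataset directory not found: $root" >&2
        return 1
    fi
    # -s: one total per argument, -h: human-readable units
    du -sh "$root"/*
}
```

On a Wrangler node, calling dataset_sizes "$ZIKA_DATA" should produce one line per dataset, much like the listing above. The helper assumes the directory is not empty; with no entries, the glob is passed to du unexpanded and du reports an error.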

Resource Reservations

For this hackathon, we have created a 10-node Hadoop cluster, available under the reservation ID: hadoop+Zika+1487

In order to submit jobs to this cluster, you must:

Create a TACC account.
See David Walling to be added to the project allocation.
SSH to Wrangler: $> ssh username@wrangler.tacc.utexas.edu
Start an interactive session on the reservation: $> idev -r hadoop+Zika+1487
Interact with the cluster from the command line, e.g.: $> hadoop fs -ls /tmp/zika
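Once inside the idev session, a typical next step is copying one of the pre-staged datasets into HDFS. The sketch below is an illustration under assumptions: that /data/shared/zika is readable from the compute node, and that you may write under /tmp/zika in HDFS. The names DRY_RUN, run, and stage_to_hdfs are hypothetical. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: intended for use inside an idev session on the reserved cluster.
# DRY_RUN, run, and stage_to_hdfs are hypothetical names for this example.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

stage_to_hdfs() {
    local src="$1" dest="$2"
    run hadoop fs -mkdir -p "$dest"    # create the HDFS target directory
    run hadoop fs -put "$src" "$dest"  # copy from the local filesystem into HDFS
    run hadoop fs -ls "$dest"          # list the target to confirm the copy
}
```

For example, stage_to_hdfs /data/shared/zika/github /tmp/zika prints the three hadoop fs commands; setting DRY_RUN=0 beforehand runs them for real. The dry-run default lets you check the paths before touching the shared cluster.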

Interactive Consoles

RStudio, Jupyter, and general VNC sessions are available on the Wrangler compute nodes through our visualization portal: http://vis.tacc.utexas.edu

After logging into the portal, select Wrangler under the 'Jobs' tab and follow the prompts for launching your sessions.
