Sherlock, a shared resource#
Sherlock is a shared compute cluster available for use by all Stanford faculty members and their research teams to support sponsored research.
Sherlock is a resource for research
Sherlock is not suitable for coursework, class assignments or general-use training sessions.
Users interested in using computing resources in such contexts are encouraged to investigate FarmShare, Stanford's community computing environment, which is primarily intended to support coursework.
It is open to the Stanford community as a computing resource to support departmental or sponsored research; thus, a faculty member's sponsorship is required for all user accounts.
Please note that your use of this system falls under the "Computer and Network Usage Policy", as described in the Stanford Administrative Guide. In particular, sharing authentication credentials is strictly prohibited. Violation of this policy will result in termination of access to Sherlock.
Sherlock has been designed, deployed, and is maintained and operated by the Stanford Research Computing Center (SRCC) staff. The SRCC is a joint effort of the Dean of Research and IT Services to build and support a comprehensive program to advance computational research at Stanford.
Sherlock was initially purchased and supported with seed funding from Stanford's Provost. It comprises a set of freely available compute nodes, a few specific resources such as large-memory machines and GPU servers, as well as the associated networking equipment and storage. These resources can be used to run computational codes and programs, and are managed through a job scheduler using a fair-share algorithm.
Data risk classification#
Low and Moderate Risk data
Sherlock is approved for computing with Low and Moderate Risk data only.
High Risk data
Sherlock is NOT HIPAA compliant and shouldn't be used to process PHI nor PII. The system is approved for computing with Low and Moderate Risk data only, and is not suitable to process High Risk data. For more information about data risk classifications, see the Information Security Risk Classification page.
What's a cluster?#
A computing cluster is a federation of multiple compute nodes (independent computers), most commonly linked together through a high-performance interconnect network.
What makes it a "super-computer" is the ability for a program to address resources (such as memory or CPU cores) located in different compute nodes, through the high-performance interconnect network.
On a computing cluster, users typically connect to login nodes using a secure remote login protocol such as SSH. Unlike in traditional interactive environments, users then need to prepare compute jobs to submit to a resource scheduler. Based on a set of rules and limits, the scheduler will try to match each job's resource requirements with available resources such as CPUs, memory, or computing accelerators such as GPUs. It will then execute the user-defined tasks on the selected resources, and generate output files in one of the storage locations available on the cluster, for the user to review and analyze.
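As an illustration of that workflow, here is a minimal sketch of a batch script, assuming a Slurm-style scheduler; the job name, resource values and file names are hypothetical, not a statement of the cluster's actual configuration or defaults:

```shell
#!/bin/bash
# Sketch of a batch script (all names and values are illustrative)
#SBATCH --job-name=my_analysis        # a label to identify the job
#SBATCH --time=01:00:00               # requested wall-clock time (1 hour)
#SBATCH --cpus-per-task=4             # requested CPU cores
#SBATCH --mem=8G                      # requested memory
#SBATCH --output=my_analysis.%j.out   # output file (%j expands to the job ID)

# the task the scheduler will execute once resources are allocated
srun ./my_program input.dat
```

The script would be submitted from a login node with a command like `sbatch my_analysis.sbatch`; the scheduler queues it until matching resources become available, then writes the program's output to the requested file.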
The condominium model#
For users who need more than casual access to a shared computing environment, SRCC also offers faculty members the possibility to purchase additional dedicated resources to augment Sherlock, by becoming Sherlock owners. Choosing from a standard set of server configurations supported by SRCC staff (known as the Sherlock catalog), principal investigators (PIs) can purchase their own servers to add to the cluster.
The vast majority of Sherlock's compute nodes are actually owner nodes, and PI purchases are the main driver behind the rapid expansion of the cluster, which went from 120 nodes in early 2014 to more than 1,000 nodes by mid-2017: an order-of-magnitude increase in about 3 years.
This model, often referred to as the Condo model, allows Sherlock owners to benefit from the scale of the cluster and gives them access to more compute nodes than their individual purchase would provide. This offers owners much greater flexibility than owning a standalone cluster.
The resource scheduler configuration works like this:
- owners and their research teams have priority use of the resources they purchase,
- when those resources are idle, other owners can use them,
- when the purchasing owner wants to use their resources, other users' jobs running on them will be killed.
This provides a way to run lower-priority jobs in the background on a larger pool of resources, while making sure that owners always get immediate access to their own nodes.
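In scheduler terms, running such preemptible background work could look like the following sketch, again assuming a Slurm-style setup; the `owners` partition name, options and script name are assumptions for illustration only:

```shell
# Hypothetical: submit a lower-priority job that can use idle owner nodes.
# If an owner reclaims their nodes, the job is preempted; --requeue asks
# the scheduler to put it back in the queue rather than lose it entirely.
sbatch --partition=owners --requeue background_job.sbatch
```

Jobs submitted this way should be resilient to being killed and restarted, for instance by periodically checkpointing intermediate results.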
Participating owners also have shared access to the original base Sherlock nodes, along with everyone else.
Benefits to owners include:
- Data center hosting, including backup power and cooling,
- Access to high-performance, large parallel scratch disk space,
- Priority access to nodes that they own,
- Background access to any owner nodes that are not in use,
- System configuration and administration,
- User support,
- Standard software stack, appropriate for a range of research needs,
- Possibility for users to install additional software applications as needed.
How to become an owner#
For administrative reasons, SRCC offers PIs the possibility to purchase Sherlock nodes on a quarterly basis. Large orders can, however, be accommodated at any time.
Please note that the minimum purchase per PI is one physical server. We cannot accommodate multiple PIs pooling funds for a single node.
If you are interested in becoming an owner, you can find the latest information about ordering Sherlock nodes on the Sherlock ordering page (SUNet ID login required). Feel free to contact us if you have any additional questions.
Cluster generations#
The research computing landscape evolves very quickly, and to accommodate both growth and technological advances, the Sherlock environment must adapt to these evolutions.
Every year or so, a new generation of processors is released, which is why, over a span of several years, multiple generations of CPUs and GPUs make their way into Sherlock. This provides users with access to the latest features and performance enhancements, but it also adds some heterogeneity to the cluster, which is important to keep in mind when compiling software and requesting resources to run them.
Another key component of Sherlock is the interconnect network that links all of Sherlock's compute nodes together and acts as a backbone for the whole cluster. This network fabric is of finite capacity: based on the characteristics of the individual network switches and on typical research computing workflows, it can accommodate up to about 850 compute nodes.
As nodes get added to Sherlock, the number of available ports decreases, and at some point, the fabric gets full and no more nodes can be added. Sherlock reached that stage for the first time in late 2016, which prompted the installation of a whole new fabric, to allow for further system expansion.
This kind of evolution is the perfect opportunity to upgrade other components as well: management software, ancillary services architecture and user applications. In January 2017, those components were completely overhauled and a new, completely separate cluster was kick-started, using a different set of hardware and software, while keeping the same storage infrastructure to ease the transition process.
After a transition period, the older Sherlock hardware, compute and login nodes, was merged into the new cluster, and from a logical perspective (connection, job scheduling and computing resources), nodes attached to each of the fabrics were reunited to form a single cluster again.
As Sherlock continues to evolve and grow, the new fabric will also approach capacity again, and the same process will happen again to start the next generation of Sherlock.
Maintenances and upgrades#
The SRCC institutes a monthly scheduled maintenance window on Sherlock, to ensure optimal operation, avoid potential issues and prepare for future expansions. This window is used to make hardware repairs, apply software and firmware updates, and perform general manufacturer-recommended maintenance on our environment.
Whenever possible, maintenance will take place on the first Tuesday of every month, from 8am to 12pm (noon), and will be announced 2 weeks in advance through the usual communication channels.
In case an exceptional amount of work is required, the maintenance window could be extended to 10 hours (from 8am to 6pm).
During these times, access to Sherlock will be unavailable: logins will be disabled and jobs won't run. A reservation will be placed in the scheduler so that running jobs can finish before the maintenance starts, and jobs that would not finish in time will be postponed until after it.
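As a sketch of how this reservation affects job submission, assuming a Slurm-style scheduler (the times and script names below are hypothetical): a job's requested time limit determines whether it can start before the maintenance.

```shell
# Hypothetical: the maintenance reservation starts in 3 hours.
# This job requests 2 hours, so it can still start and finish in time:
sbatch --time=02:00:00 short_job.sbatch
# This job requests 8 hours, so it will stay queued until after the maintenance:
sbatch --time=08:00:00 long_job.sbatch
```

Requesting a time limit close to what a job actually needs thus helps it fit into the window before a maintenance, rather than waiting behind it.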
Q: Why do maintenances at all?
A: Due to growth in our compute environment and the increasing complexity of the systems we deploy, we felt it prudent to arrange for a regular time when we could comfortably, and without pressure, fix problems or update facilities with minimal impact to our customers. Most, if not all, major HPC centers have regular maintenance schedules. We also need to enforce the Minimum Security rules set forth by the Stanford Information Security Office, which mandate deployment of security patches in a timely manner.
Q: Why Tuesdays 8am-12pm? Why not do this late at night?
A: We have observed that the least busy time for our services is at the beginning of the week, in the morning hours. Using this time period should not interrupt most of our users. In the remote possibility that a problem extends past the scheduled downtime, we would have our full staff, fresh and available, to assist in repairs and quickly restore service.
Q: I have jobs running, what will happen to them?
A: For long-running jobs, we strongly recommend checkpointing your results on a periodic basis. In addition, we will place a reservation in the scheduler for each maintenance, to prevent jobs from running past it. This means that the scheduler will only allow jobs to run if they can finish by the time the maintenance starts. If you submit a long job shortly before the maintenance, it will be delayed until after it. This ensures that no work is lost when the maintenance starts.