By Mike Jackson.
After serving as a valuable repository for UK e-Science projects for many years, the NeSCForge service, hosted by the National e-Science Centre, will be coming to an end. The service will be shut down on Monday 20th December 2010. Naturally, people with resources hosted on NeSCForge may be wondering how to retrieve their resources. This guide describes what you can retrieve and how.
For more information about the closure of NeSCForge see the announcement of 27 September 2010.
Why write this guide?
We were asked by a NeSCForge-hosted project for advice on how to retrieve source code and other resources before the NeSCForge service is shut down.
What resources can be retrieved?
The following resources can be retrieved from NeSCForge:
Project web pages
Mailing list archives
Files and releases
What resources cannot be retrieved?
Unfortunately, the software on which NeSCForge runs is bespoke. This prevents the following resources from being retrieved:
Bug, support, patch, and feature tracker content
Task manager content
What you will need
You will need a NeSCForge user name and password, and you will need administrator privileges for the project whose resources you want to retrieve. Throughout this guide, we'll use the project name myproject, and a user called myuser who has the password mypassword. Substitute your own project name, user and password where these occur.
CVS repositories, web pages and mailing list archives are stored in NeSCForge's server directories which can be accessed via a secure copy (SCP) client.
Linux and Unix
Many Linux and Unix operating systems provide an SCP client as standard.
Run the following command:
$ scp myuser@myproject.forge.nesc.ac.uk:/opt/SOURCE DESTINATION
You will then be prompted for your password.
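As a rough illustration of how the SOURCE and DESTINATION placeholders fit together, the sketch below prints the secure copy command for one directory without running it. The function name nescforge_scp_cmd is our own invention, and we assume the SCP host is the same myproject.forge.nesc.ac.uk host name used for SFTP elsewhere in this guide.

```shell
#!/bin/sh
# Hypothetical helper (not part of NeSCForge): print the scp command
# for one directory below /opt so it can be reviewed before running.
# Assumption: the SCP host is PROJECT.forge.nesc.ac.uk.
nescforge_scp_cmd() {
    user=$1      # NeSCForge user name, e.g. myuser
    project=$2   # project name, e.g. myproject
    src=$3       # path below /opt, e.g. cvsroot/myproject
    dest=$4      # local destination directory
    printf 'scp -r %s@%s.forge.nesc.ac.uk:/opt/%s %s\n' \
        "$user" "$project" "$src" "$dest"
}

# Print the command for a CVS repository copy.
nescforge_scp_cmd myuser myproject cvsroot/myproject myproject/cvs
```

Printing the command first lets you check the user, host and paths before you are asked for your password.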
On Windows, you can use the SSH Secure Shell 3.2.9 (2000-2003) SFTP client, from http://www.ssh.com. However, this can cause problems when copying symbolically-linked files in e-mail archives. In the instructions that follow you might have to copy specific sub-directories and files, rather than being able to do the all-in-one copy that is described.
The procedure for running SFTP is as follows:
Click Quick Connect
Connect to Remote Host appears
Enter Host Name: myproject.forge.nesc.ac.uk
Enter User Name: myuser
Host Identification appears
Enter Password appears
Enter Password: mypassword
In the Remote Name area, you will see a public_html directory
Right-click in the Remote Name area and select Go to folder...
Go to Remote Folder appears.
Enter Folder Name: /opt. This takes you to the root of the NeSCForge directory structure
Retrieving resources "by hand"
You can retrieve your project resources using secure copy (for web pages, mailing list archives and CVS) or an internet browser (release files and documents). Alternatively, we've provided a Linux/UNIX shell script and Java client to automatically retrieve these resources. You may want to try the script first. The following sections also explain where the resources are located in NeSCForge.
Project web pages
Project web pages are the pages available when you visit http://myproject.forge.nesc.ac.uk. In NeSCForge, these are held in /opt/projects/myproject/htdocs. If your project has such pages you can copy them using:
$ scp -r myuser@myproject.forge.nesc.ac.uk:/opt/projects/myproject/ myproject/www
NeSCForge e-mail lists (if you used them) will be stored in one of two places, depending on whether they were public or private. These places are:
/opt/mailman/archives/public
/opt/mailman/archives/private
You can copy e-mail list archives, including their attachments, using the commands:
$ scp -r myuser@myproject.forge.nesc.ac.uk:/opt/mailman/archives/public/myproject-users myproject/mail/myproject-users
$ scp -r myuser@myproject.forge.nesc.ac.uk:/opt/mailman/archives/private/myproject-developers myproject/mail/myproject-developers
Certain files won't be copied due to permissions, e.g. /opt/mailman/archives/public/myproject-users/database, but these files don't appear to be necessary.
If you browse into myproject/mail/myproject-users, for example, and open index.html in a web browser, you can explore the archive.
If an e-mail had an attachment then there will be a hyperlink of the form:
This hyperlink, as you'll notice, still cites NeSCForge. However, all attachments are available in the attachments directory of the mail archive, e.g. myproject/mail/myproject-developers/attachments/20100601/044222f0/attachment-0001.doc.
Each mail directory contains TXT files which can be used as standard mail folders in Pine, for example.
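One quick way to check that an archive copied sensibly is to count the messages in each TXT file. The sketch below assumes the files are mbox-style, with every message starting with a line that begins "From "; the function name count_archive_messages is hypothetical. Message bodies that contain unescaped lines starting with "From " would inflate the count, so treat the numbers as approximate.

```shell
#!/bin/sh
# Hypothetical helper: print "FILE COUNT" for every .txt archive in a
# copied mailing list directory.
# Assumption: each message starts with a line beginning "From ".
count_archive_messages() {
    archive_dir=$1
    for f in "$archive_dir"/*.txt; do
        [ -f "$f" ] || continue          # glob matched nothing
        n=$(grep -c '^From ' "$f")       # number of message start lines
        printf '%s %s\n' "$(basename "$f")" "$n"
    done
}

count_archive_messages myproject/mail/myproject-users
```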
In NeSCForge, a project's CVS repository is held in /opt/cvsroot/myproject.
If you have a CVS repository you can copy it using:
$ scp -r myuser@myproject.forge.nesc.ac.uk:/opt/cvsroot/myproject myproject/cvs
This copies the complete repository, not a checkout of its current state, so you will have access to the complete version history, etc. You can then use this repository as usual, for example:
$ export CVSROOT=/home/someuser/myproject/cvs
$ cvs co someDirectoryInCVS
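Before relying on the copy, it may be worth checking that it looks like a complete repository. A CVS repository root normally contains a CVSROOT administrative directory, and version histories are kept in files whose names end in ",v". The sketch below tests for both; the function name check_cvs_copy is our own, and passing the check does not prove that every file was copied.

```shell
#!/bin/sh
# Hypothetical sanity check for a copied CVS repository.
# Reports a missing CVSROOT directory, otherwise counts the ",v"
# history files found anywhere below the repository root.
check_cvs_copy() {
    repo=$1
    if [ ! -d "$repo/CVSROOT" ]; then
        echo "missing CVSROOT"
        return 1
    fi
    count=$(find "$repo" -type f -name '*,v' | wc -l | tr -d ' ')
    echo "ok: $count history files"
}

check_cvs_copy myproject/cvs
```

If the check reports a missing CVSROOT, look for a nested directory such as myproject/cvs/myproject, which scp -r creates when the destination directory already exists.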
Files and releases
Unfortunately, files and releases cannot be accessed via SCP. One way round this problem is to click the Files tab for your project and download each file in turn. You'll need to log in to access private documents.
Like files and releases, documents cannot be accessed via SCP, but can be accessed by clicking the Docs tab for your project and downloading each file in turn. Again, a login is required to download private documents.
Retrieving resources automatically
Your project resources can be retrieved automatically using a Linux/UNIX shell script and secure copy for web pages, mailing list archives and CVS, and a simple Java client for release files and documents. If you would like a copy of these scripts, please contact us.
Shell script for web pages, CVS and e-mail lists
The Software Sustainability Institute can provide you with a shell script for Linux/UNIX (called nescforgecopy.sh) that uses scp to copy NeSCForge resources. It has been tested on Linux RedHat 9 and Solaris 9. The script builds a directory structure to hold the copied resources, with the root directory being named after the project name.
Once you have a copy of the script, you'll need to make the following changes:
1. Edit the username, password and project values (lines 21, 22 and 23):
2. Set the following value to false if you have no WWW pages (line 24):
3. Set the following value to false if you have no CVS (line 25):
4. Set the following value to a list of your public e-mail list addresses (line 26). If you have more than one email list, separate the names with a space:
If you do not have any lists, just comment the line out:
5. Set the following value to be a list of your private e-mail list addresses (line 27), or comment it out:
6. Run the script:
You will have to enter your password every time a secure copy is performed. This will happen once for the web pages, once for CVS and once for each e-mail list archive.
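To make the relationship between the settings and the password prompts concrete, here is a minimal sketch that lists the copies a given configuration implies. It is not the nescforgecopy.sh script: the variable names (PROJECT, COPY_WWW, COPY_CVS, PUBLIC_LISTS, PRIVATE_LISTS) and the plan_copies function are hypothetical, and the paths are simply those used in the manual commands earlier in this guide.

```shell
#!/bin/sh
# Hypothetical settings; the real nescforgecopy.sh may use other names.
PROJECT=myproject
COPY_WWW=true                        # false if there are no web pages
COPY_CVS=true                        # false if there is no CVS
PUBLIC_LISTS="myproject-users"       # space-separated, empty if none
PRIVATE_LISTS="myproject-developers" # space-separated, empty if none

# Print one "remote -> local" line per secure copy the settings imply.
plan_copies() {
    if [ "$COPY_WWW" = true ]; then
        echo "/opt/projects/$PROJECT -> $PROJECT/www"
    fi
    if [ "$COPY_CVS" = true ]; then
        echo "/opt/cvsroot/$PROJECT -> $PROJECT/cvs"
    fi
    for list in $PUBLIC_LISTS; do
        echo "/opt/mailman/archives/public/$list -> $PROJECT/mail/$list"
    done
    for list in $PRIVATE_LISTS; do
        echo "/opt/mailman/archives/private/$list -> $PROJECT/mail/$list"
    done
}

plan_copies
# Each planned copy corresponds to one password prompt.
echo "password prompts: $(plan_copies | wc -l | tr -d ' ')"
```

With the example settings this plans four copies (web pages, CVS and two list archives), so you would expect four password prompts.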
Java client for public documents and files
We can provide a simple Java client (called nescforgedownloader.jar) that copies all publicly accessible files and documents from a project's Files and Documents pages. The source code has been compiled and run on Linux RedHat 9 and Solaris 9. The client was compiled and tested under Java 1.5.0_07 and also tested under Java 1.6.0_16.
Once you have a copy of the client, you can run it using the command:
$ java -jar nescforgedownloader.jar NN DIRECTORY
Alternatively, you can compile the source using the command:
$ javac uk/ac/software/NeSCForgeDownloader.java
and run it using:
$ java uk.ac.software.NeSCForgeDownloader NN DIRECTORY
In both cases, NN is your project ID and DIRECTORY is the local directory in which to save your files.
To get your project ID:
Visit your project page, e.g. http://forge.nesc.ac.uk/projects/myproject/
Click on the Docs link.
Look at the URL. It will be something like: http://forge.nesc.ac.uk/docman/?group_id=NN, where NN is the ID.
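If you prefer not to read the ID off the URL by eye, the following sketch extracts it with sed. The function name project_id_from_url is hypothetical; it assumes the URL contains a numeric group_id parameter, as in the example above, and prints nothing if it does not.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric group_id value from a
# NeSCForge Docs URL, or nothing if the parameter is absent.
project_id_from_url() {
    printf '%s\n' "$1" |
        sed -n 's/.*[?&]group_id=\([0-9][0-9]*\).*/\1/p'
}

# Prints 58
project_id_from_url 'http://forge.nesc.ac.uk/docman/?group_id=58'
```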
As an example, to copy the ngs project's files and documents, you'd run the command:
$ java -jar nescforgedownloader.jar 58 ngs
And if you wished to copy the ogsadai project's files and documents, you'd run:
$ java -jar nescforgedownloader.jar 12 ogsadai
This only works for publicly available files and documents. Private files and documents will have to be downloaded manually or made public first.