Retrieving project resources from NeSCForge

By Mike Jackson.

After serving as a valuable repository for UK e-Science projects for many years, the NeSCForge service, hosted by the National e-Science Centre, will be coming to an end. The service will be shut down on Monday 20th December 2010. Naturally, people with resources hosted on NeSCForge may be wondering how to retrieve their resources. This guide describes what you can retrieve and how.

For more information about the closure of NeSCForge see the announcement of 27 September 2010.

Why write this guide?

We were asked by a NeSCForge-hosted project for advice on how to retrieve source code and other resources before the NeSCForge service is shut down.

What resources can be retrieved?

The following resources can be retrieved from NeSCForge:

  • Project web pages
  • Mailing list archives
  • CVS repository
  • Files and releases
  • Uploaded documents

What resources cannot be retrieved?

Unfortunately, the software on which NeSCForge runs is bespoke. This prevents the following resources from being retrieved:

  • Bug, support, patch, and feature tracker content
  • Forum content
  • Task manager content
  • Surveys
  • News

What you will need

You will need a NeSCForge user name and password, and you will need administrator privileges for the project whose resources you want to retrieve. Throughout this guide, we'll use the project name myproject, and a user called myuser who has the password mypassword. Substitute your own project name, user and password where these occur.

CVS repositories, web pages and mailing list archives are stored in directories on the NeSCForge server, which can be accessed via a secure copy (SCP) client.

Linux and Unix

Many Linux and Unix operating systems provide an SCP client as standard.

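Run the following command, where SOURCE is the path under /opt of the resource you want to copy and DESTINATION is a local path. Add the -r option, as in the later examples, to copy a directory and its contents: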

	 $ scp myuser@myproject.forge.nesc.ac.uk:/opt/SOURCE DESTINATION 

You will then be prompted for your password.

Windows

On Windows, you can use the SSH Secure Shell 3.2.9 (2000-2003) SFTP client, from http://www.ssh.com. However, this can cause problems when copying symbolically-linked files in e-mail archives. In the instructions that follow, you might have to copy specific sub-directories and files, rather than being able to do the all-in-one copy that is described.

The procedure for running SFTP is as follows:

  1. Start SFTP
  2. Click Quick Connect
  3. Connect to Remote Host appears
  4. Enter Host Name: myproject.forge.nesc.ac.uk
  5. Enter User Name: myuser
  6. Click Connect
  7. Host Identification appears
  8. Click No
  9. Enter Password appears
  10. Enter Password: mypassword
  11. Click OK
  12. In the Remote Name area, you will see a public_html directory
  13. Right-click in the Remote Name area and select Go to folder...
  14. Go to Remote Folder appears.
  15. Enter Folder Name: /opt. This takes you to the root of the NeSCForge directory structure
  16. Click OK

Retrieving resources "by hand"

You can retrieve your project resources using secure copy (for web pages, mailing list archives and CVS) or an internet browser (for release files and documents). Alternatively, a Linux/UNIX shell script and a Java client are available to retrieve these resources automatically, and you may want to try these first. The following sections also explain where the resources are located in NeSCForge.

Project web pages

Project web pages are the pages available when you visit http://myproject.forge.nesc.ac.uk. In NeSCForge, these are held in /opt/projects/myproject/htdocs. If your project has such pages, you can copy the project directory, which includes htdocs, using:

	 $ scp -r myuser@myproject.forge.nesc.ac.uk:/opt/projects/myproject/ myproject/www 

Mailing lists

NeSCForge e-mail lists (if you used them) will be stored in one of two places, depending on whether they were public or private. These places are:

	 /opt/mailman/archives/public/myproject-users
	 /opt/mailman/archives/private/myproject-developers

You can copy e-mail list archives, including their attachments, using the commands:

	 $ scp -r myuser@myproject.forge.nesc.ac.uk:/opt/mailman/archives/public/myproject-users myproject/mail/myproject-users
	 $ scp -r myuser@myproject.forge.nesc.ac.uk:/opt/mailman/archives/private/myproject-developers myproject/mail/myproject-developers

Certain files won't be copied due to permissions, e.g. /opt/mailman/archives/public/myproject-users/database, but these files don't appear to be necessary.
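The copy commands above write into myproject/mail, and scp does not normally create missing parent directories, so it can help to create the local directory structure before copying. The following is a minimal sketch. The function name nf_prepare_dirs is our own invention for illustration, and the layout simply mirrors the destination paths used in this guide.

```shell
# Hypothetical helper (not part of NeSCForge or any supplied script):
# create the local parent directories used as scp destinations in this guide.
nf_prepare_dirs() {
    # $1 is the project name, e.g. myproject.
    # Only parent directories are created here; scp -r creates the final
    # component (www, cvs or the list name) itself if it does not exist.
    mkdir -p "$1/mail"
}
```

You would run nf_prepare_dirs myproject once, from the directory in which you want the copy to live, before running the scp commands.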

If you browse into myproject/mail/myproject-users, for example, and open index.html in a web browser, you can explore the archive.

If an e-mail had an attachment then there will be a hyperlink of the form:

http://forge.nesc.ac.uk/mailman/private/myproject-developers/attachments/20100601/044222f0/attachment-0001.doc

This hyperlink, as you'll notice, still cites NeSCForge. However, all attachments are available in the attachments directory of the mail archive, e.g. myproject/mail/myproject-developers/attachments/20100601/044222f0/attachment-0001.doc.
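If you need to find the local file behind one of these hyperlinks, the mapping described above can be sketched as a small shell function. This is an illustration only: the name nf_attachment_path is hypothetical, and the function handles just the private-list URL form shown above, since that is the only form this guide documents.

```shell
# Hypothetical helper: map a private-list attachment hyperlink to the
# matching file in a local copy of the mail archives.
nf_attachment_path() {
    # $1: hyperlink as it appears in the archive
    # $2: local directory holding the copied archives, e.g. myproject/mail
    nf_url_prefix="http://forge.nesc.ac.uk/mailman/private/"
    case "$1" in
        "$nf_url_prefix"*)
            # Strip the NeSCForge prefix, leaving LIST/attachments/...
            nf_rest=${1#"$nf_url_prefix"}
            printf '%s/%s\n' "$2" "$nf_rest"
            ;;
        *)
            # Not a URL form this sketch knows about.
            return 1
            ;;
    esac
}
```

For example, calling nf_attachment_path with the hyperlink above and myproject/mail as the second argument prints the local path given in the previous paragraph.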

Each mail directory contains TXT files which can be used as standard mail folders in Pine, for example.
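As a quick sanity check on a copied archive, you can count the messages in one of these TXT files. The sketch below assumes the files follow the usual mail folder convention in which every message starts with a line beginning "From ". The function name nf_count_messages is hypothetical, and the count is only approximate if a message body happens to contain such a line.

```shell
# Hypothetical helper: count messages in a mail-folder style TXT archive.
nf_count_messages() {
    # $1: path to a TXT archive file.
    # Counts lines that begin with "From " (the message separator).
    # grep -c prints 0, and returns a non-zero status, if none are found.
    grep -c '^From ' "$1"
}
```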

CVS repository

In NeSCForge, a project's CVS repository is held in /opt/cvsroot/myproject.

If you have a CVS repository you can copy it using:

	 $ scp -r myuser@myproject.forge.nesc.ac.uk:/opt/cvsroot/myproject myproject/cvs 

This copies the complete repository, not a checkout of its current state, so you will have access to the complete version history, etc. You can then use this repository as usual, for example:

	 $ export CVSROOT=/home/someuser/myproject/cvs
	 $ cvs co someDirectoryInCVS
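Before relying on the copy, it is worth checking that it looks like a complete CVS repository while NeSCForge is still available. A CVS repository root contains an administrative directory called CVSROOT, so a minimal check, with a hypothetical function name of our own choosing, might look like this:

```shell
# Hypothetical helper: check that a copied directory looks like a CVS
# repository root, i.e. that it contains the CVSROOT administrative directory.
nf_check_cvs_copy() {
    # $1: path to the copied repository, e.g. myproject/cvs
    if [ -d "$1/CVSROOT" ]; then
        echo "ok: $1 contains CVSROOT"
    else
        echo "warning: no CVSROOT directory in $1" >&2
        return 1
    fi
}
```

This only confirms that the top-level structure is present. Checking out a module, as shown above, is still the better test that the history is usable.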

Files and releases

Unfortunately, files and releases cannot be accessed via SCP. One way round this problem is to click the Files tab for your project and download each file in turn. You'll need to log in to access private files.

Uploaded documents

Like files and releases, documents cannot be accessed via SCP, but can be accessed by clicking the Docs tab for your project and downloading each file in turn. Again, a login is required to download private documents.

Retrieving resources automatically

Your project resources can be retrieved automatically using a Linux/UNIX shell script and secure copy for web pages, mailing list archives and CVS, and a simple Java client for release files and documents. If you would like a copy of these scripts, please contact us.

Shell script for web pages, CVS and e-mail lists

The Software Sustainability Institute can provide you with a shell script for Linux/UNIX (called nescforgecopy.sh) that uses scp to copy NeSCForge resources. It has been tested on Red Hat Linux 9 and Solaris 9. The script builds a directory structure to hold the copied resources, with the root directory named after the project.

Once you have a copy of the script, you'll need to make the following changes:

1. Edit the username, password and project values (lines 21, 22 and 23):

	 NF_USER=myuser
	 NF_PASSWORD=mypassword
	 NF_PROJECT=myproject

2. Set the following value to false if you have no WWW pages (line 24):

	 NF_HAS_WWW=true 

3. Set the following value to false if you have no CVS (line 25):

	 NF_HAS_CVS=true 

4. Set the following value to a list of your public e-mail list addresses (line 26). If you have more than one email list, separate the names with a space:

	 NF_PUBLIC_MAIL="myproject-users myproject-announce" 

or

	 NF_PUBLIC_MAIL="myproject-users" 

If you do not have any lists, just comment the line out:

	 # NF_PUBLIC_MAIL="myproject-users" 

5. Set the following value to be a list of your private e-mail list addresses (line 27), or comment it out:

	 NF_PRIVATE_MAIL="myproject-developers" 

6. Run the script:

	 $ ./nescforgecopy.sh 

You will have to enter your password every time a secure copy is performed. This will happen once for the web pages, once for CVS and once for each e-mail list archive.
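To show how these settings relate to the copy commands described earlier, here is a minimal sketch of a "dry run" that prints the scp commands implied by a given configuration. It is not the actual nescforgecopy.sh: the function nf_plan is our own illustration, and it only assumes the variable names and remote paths given in this guide.

```shell
# Example settings, using the same variable names as the steps above.
NF_USER=myuser
NF_PROJECT=myproject
NF_HAS_WWW=true
NF_HAS_CVS=true
NF_PUBLIC_MAIL="myproject-users myproject-announce"
NF_PRIVATE_MAIL="myproject-developers"

# Hypothetical helper: print (but do not run) one scp command per resource.
nf_plan() {
    nf_host="$NF_USER@$NF_PROJECT.forge.nesc.ac.uk"
    if [ "$NF_HAS_WWW" = true ]; then
        echo "scp -r $nf_host:/opt/projects/$NF_PROJECT/ $NF_PROJECT/www"
    fi
    if [ "$NF_HAS_CVS" = true ]; then
        echo "scp -r $nf_host:/opt/cvsroot/$NF_PROJECT $NF_PROJECT/cvs"
    fi
    # The list variables are deliberately unquoted so that each
    # space-separated list name is handled in turn; a commented-out
    # (unset) variable simply yields no commands.
    for nf_list in ${NF_PUBLIC_MAIL:-}; do
        echo "scp -r $nf_host:/opt/mailman/archives/public/$nf_list $NF_PROJECT/mail/$nf_list"
    done
    for nf_list in ${NF_PRIVATE_MAIL:-}; do
        echo "scp -r $nf_host:/opt/mailman/archives/private/$nf_list $NF_PROJECT/mail/$nf_list"
    done
}
```

Running nf_plan with the settings above prints five commands, one for each password prompt you should expect.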

Java client for public documents and files

We can provide a simple Java client (called nescforgedownloader.jar) that copies all publicly accessible files and documents from a project's Files and Documents pages. The client was compiled and tested under Java 1.5.0_07, and also tested under Java 1.6.0_16, on Red Hat Linux 9 and Solaris 9.

Once you have a copy of the client, you can run it using the command:

	 $ java -jar nescforgedownloader.jar NN DIRECTORY 

Alternatively, you can compile the source using the command:

	 $ javac uk/ac/software/NeSCForgeDownloader.java 

and run it using:

	 $ java uk.ac.software.NeSCForgeDownloader NN DIRECTORY 

In both cases, NN is your project ID and DIRECTORY is the local directory in which to save your files.

To get your project ID:

  • Visit your project page, e.g. http://forge.nesc.ac.uk/projects/myproject/
  • Click on the Docs link.
  • Look at the URL. It will be something like: http://forge.nesc.ac.uk/docman/?group_id=NN, where NN is the ID.
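If you prefer not to pick the ID out by eye, the last step can be sketched in shell. The function name nf_group_id is hypothetical, and the sketch assumes only the group_id=NN URL form shown above.

```shell
# Hypothetical helper: extract the numeric project ID from a NeSCForge URL
# containing a group_id=NN query parameter.
nf_group_id() {
    # $1: URL copied from the browser.
    # Prints the digits following group_id=, or nothing if there are none.
    printf '%s\n' "$1" | sed -n 's/.*[?&]group_id=\([0-9][0-9]*\).*/\1/p'
}
```

The result can be passed to the client as the NN argument described above.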

As an example, to copy the ngs project's files and documents, you'd run the command:

	 $ java -jar nescforgedownloader.jar 58 ngs 

And if you wished to copy the ogsadai project's files and documents, you'd run:

	 $ java -jar nescforgedownloader.jar 12 ogsadai 

This only works for publicly available files and documents. Private files and documents will have to be downloaded manually or made public first.