Next: CSPACK - Tutorial Up: CSPACK Previous: List of Tables

CSPACK - Overview

GLOSSARY

Software packages used in High Energy Physics

Short descriptions of the packages referred to in this document are given below.

ZEBRA - The data structure management system

The data structure management package ZEBRA was developed at CERN in order to overcome the lack of dynamic data structure facilities in FORTRAN, the favourite computer language in high energy physics. It implements the dynamic creation and modification of data structures at execution time and their transport to and from external media. ZEBRA input/output uses either a sequential or a direct access method. Two data representations, native (no data conversion when transferred to/from the external medium) and exchange (a conversion to/from an interchange format is made if necessary), allow data to be transported between computers of the same and of different architectures.

Many of the packages described below are based on Zebra.

EPIO - A machine independent input/output package

EPIO is an input/output package still in use by some experiments at CERN. CSPACK provides remote file transfer and access for EPIO files.

KUIP - The user interface package

The purpose of KUIP (Kit for a User Interface Package) is to handle the dialogue between the user and the application program. It parses the commands input into the system, verifies them for correctness and then hands over control to the relevant action routines.
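
The parse, verify and dispatch cycle described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and does not use or reproduce the KUIP interface: the command names, the command table and the function dispatch are all hypothetical.

```python
# Hypothetical sketch of a parse / verify / dispatch cycle.
# This is NOT the KUIP API; every name below is invented for illustration.

def cmd_hello(args):
    """Action routine: return a greeting for the given name."""
    return "hello " + args[0]


def cmd_add(args):
    """Action routine: add two integers passed as strings."""
    return str(int(args[0]) + int(args[1]))


# Command table: command name -> (required argument count, action routine).
COMMANDS = {
    "HELLO": (1, cmd_hello),
    "ADD": (2, cmd_add),
}


def dispatch(line):
    """Parse one command line, verify it, then call the action routine."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty command")
    name, args = tokens[0].upper(), tokens[1:]
    if name not in COMMANDS:
        raise ValueError("unknown command: " + name)
    nargs, action = COMMANDS[name]
    if len(args) != nargs:
        raise ValueError("%s expects %d argument(s)" % (name, nargs))
    return action(args)
```

With this table, dispatch("add 2 3") returns the string "5", while an unknown command or a wrong argument count is rejected before any action routine runs.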

HBOOK - The histogramming package

HBOOK provides a library of FORTRAN callable routines for the manipulation of histograms, scatter plots, tables and ntuples. These may be stored on disk files using the RZ direct access routines of the ZEBRA package.

PAW - The Physics Analysis Workstation

The PAW system is widely used by physicists to perform interactive data analysis and presentation. It uses the facilities provided by packages such as HBOOK, KUIP and of course ZEBRA.

FATMEN - A Distributed File and Tape Management System

The FATMEN system provides a fully distributed file catalogue and file access in a location, operating system and device independent manner. The ZEBRA RZ package is used to store the file catalogue information. The CSPACK facilities are also used by FATMEN for catalogue update distribution, remote file access and remote data file access.

PATCHY - The Source Code Management System

PATCHY is a source code management system which has been in use for many years. Files may be stored in a number of formats: CARD files, compact binary PAM files or in CETA format. All of the above formats may be transferred between different machines by tools in the CSPACK package.

CMZ - A Code Management system using ZEBRA

CMZ is an advanced Code Management system, backward compatible with PATCHY, that is based on ZEBRA. As with HBOOK, the ZEBRA RZ package is used to store data on disk.

Components of the CSPACK system

CZ - The ZEBRA Communications Package

The CZ package is a small set of FORTRAN callable routines used by FATMEN, PAW and other applications. It provides a simple means of starting a remote server and then exchanging character or binary data. The actual communication is performed by TCPAW, running over TCP/IP, or transparent DECnet task-to-task.

XZ - The remote I/O package

XZ is a small package built on top of CZ which permits remote I/O, such as OPEN, CLOSE, READ, WRITE etc. and remote file transfer.

TCPAW - The Networking Package

TCPAW provides the network layer on which many of the tools in the current CSPACK package are built. It consists of FORTRAN callable C routines, and is implemented on a variety of platforms, including VM/CMS, VAX/VMS, and Unix systems.

TCPAW uses the internet daemon (INETD) to start servers, except on VM/CMS, where REXEC is used.

SYSREQ - The System Service Request Facility

SYSREQ is a facility developed at RAL for generalised inter-system communications. It allows commands to be sent to, and replies received from, services running in dedicated service machines under VM/CMS. For example, all communication with the HEPVM Tape Management System (TMS), which was developed at the Rutherford Appleton Laboratory in the UK and is now running at several of the larger HEPVM sites, is via SYSREQ. At CERN, a facility has been developed to permit remote users to use the facilities of SYSREQ, by forwarding the messages and replies over TCP/IP. This system is known as SYSREQ-TCP.

TELNETG - An extended TELNET program

TELNETG is a modified version of the standard TELNET program that allows the input/output of a HIGZ based graphics session on a remote system to be displayed in a graphics window on the local workstation. TELNETG is available for Unix and VAX/VMS systems.

TAGIBM - A 3270 terminal emulator

TAGIBM is a powerful 3270 terminal emulator similar to TELNETG but with full-screen emulation for IBM systems.

INETD - the internet daemon

On all systems except VM/CMS and IBM MVS, the server for ZFTP, distributed PAW and the CZ/XZ FORTRAN routines is started using the internet daemon (INETD), except between VAX/VMS systems when the DECnet option is activated.

The inetd daemon is normally started when your system is rebooted. Once started, the inetd daemon listens for connections on certain Internet sockets specified in the /etc/inetd.conf file. When the inetd daemon receives a request on one of these sockets, it determines what service corresponds to that socket and then either handles the service request itself or invokes the appropriate server, such as ZSERV or PAWSERV.

A separate process exists for each concurrent connection to a given host.
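
As an illustration of how a socket is mapped to a server, the sketch below parses one entry in the conventional /etc/inetd.conf layout (service name, socket type, protocol, wait flag, user, server program, server arguments). The sample entry, including the path /usr/local/bin/zserv and the user name, is an assumption made for the example and is not taken from a CSPACK installation; the helper parse_inetd_line is likewise hypothetical.

```python
from collections import namedtuple

# One inetd.conf entry, field by field, in the conventional column order.
InetdEntry = namedtuple(
    "InetdEntry", "service socket_type protocol wait user program args"
)


def parse_inetd_line(line):
    """Return an InetdEntry for a configuration line, or None for a
    blank line or a comment."""
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line:
        return None
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("expected at least 6 fields, got %d" % len(fields))
    # Everything after the server program path is the server's argument list.
    return InetdEntry(*fields[:6], args=fields[6:])


# Hypothetical entry: the user and the path are assumptions for this example.
SAMPLE = "zserv stream tcp nowait root /usr/local/bin/zserv zserv"
entry = parse_inetd_line(SAMPLE)
```

Here entry.service names the service whose socket inetd listens on, and entry.program is the server that would be invoked when a request arrives. The "nowait" flag corresponds to the behaviour described above, in which a separate server process handles each concurrent connection.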

REXEC - the remote execution daemon

As INETD is not available for VM/CMS, another solution has to be used. TCPAW uses the REXEC command to start servers on VM/CMS systems. The REXEC daemon autologs the machine of the specified user, having verified the username and password. This means that the machine in question must not be in use, i.e. logged on or disconnected. Once the machine is autologged, the ZSERV or PAWSERV program is started.

If you have problems connecting to a remote VM system, first check that the account is not in use. If you still have problems, ensure that your PROFILE EXEC does not contain any statements which cause it to run a command, e.g. EXEC MAIL, either unconditionally or in DISCONNECTED mode.

Alternatives and recommendations

Many of the tools developed in this package were first written in the framework of the PAW [3] project. They were extended and enhanced for the FATMEN [9] project. The tools are based on such de-facto standards as DECnet and TCP/IP sockets. However, new standards are now emerging which, together with enhancements to HEP packages, render parts of CSPACK redundant. Some of these are described below.

The main recommendations are:

ftp transfer

Unformatted files in exchange mode can be transferred using binary ftp between different machines without problems except:

  1. On VMS systems, one cannot currently specify the format of the target file. The file can be converted on the VMS end using the CERNLIB RESIZE command, which invokes the following command file:
    $!***************************************************************
    $!*                                                             *
    $!* RESIZE.COM v1.02                                            *
    $!*                                                             *
    $!* Resize ftp-files                                            *
    $!* Author: J.Zoll 90/07/24                                     *
    $!*                                                             *
    $!* Mods       Date   Comments                                  *
    $!* MARQUINA 92/12/07 Add cosmetics for public release          *
    $!* M.Kelsey 93/10/01 Prevent clashes on simultaneous runs      *
    $!*                                                             *
    $!***************************************************************
    $ ver_proc=F$VERIFY(0)
    $ SAY   :== WRITE/SYMBOL SYS$OUTPUT
    $ If p1.eqs.""
    $ Then Say "%DCL-I-SYNT syntax: resize [-s size] input_file [output_file]"
    $      Exit
    $ Endif
    $ On ERROR     Then Goto EXIT
    $ On CONTROL_Y Then Goto EXIT
    $!
    $      ifile=p1
    $      ofile=p2
    $      size =3600
    $ If p1.eqs."-S"
    $ Then ifile=p3
    $      ofile=p4
    $      size =p2
    $ Endif
    $ If ofile.eqs."".or.ofile.eqs."-" Then ofile=ifile
    $
    $ ffile="EXCH_"+F$GETJPI("","PID")+".DAT"
    $ OPEN/WRITE OUTP 'ffile
    $ WRITE OUTP    "RECORD"
    $ WRITE OUTP    "BLOCK_SPAN              yes"
    $ WRITE OUTP    "CARRIAGE_CONTROL        none"
    $ WRITE OUTP    "FORMAT                  fixed"
    $ WRITE OUTP    "SIZE                    ''size'"
    $ CLOSE OUTP
    $
    $ Say "resize: setting record size of ",ofile," to ",size," bytes..."
    $ EXCHANGE/NETWORK 'ifile 'ofile -
            /TRANSFER=BLOCK -
            /FDL='ffile
    $!
    $ purge/nolog 'ofile'
    $ EXIT:
    $ DELETE/NOCONF/NOLOG 'ffile'.*
    $ dummy=F$VERIFY(ver_proc)
    $ Exit
  2. On Unix systems, ZEBRA FZ files should be read and written with C I/O (option L in the call to FZFILE).
  3. On VM/CMS systems, one can specify the record format of the target file using an ftp subcommand such as the following:

    bin f 3600
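
Both the RESIZE default and the VM/CMS subcommand above impose fixed-length records of 3600 bytes. As a rough illustration of what that implies for the transferred byte stream, the following Python sketch splits data into fixed-length records and rejects a stream that is not a whole number of records. The function name split_fixed_records is hypothetical; the sketch is not part of CSPACK or of ftp.

```python
RECORD_SIZE = 3600  # bytes per record, the value used in the text above


def split_fixed_records(data, size=RECORD_SIZE):
    """Split a byte string into fixed-length records of `size` bytes.

    Raises ValueError if the data is not a whole number of records,
    which would indicate a truncated or wrongly sized file.
    """
    if len(data) % size != 0:
        raise ValueError(
            "length %d is not a multiple of %d" % (len(data), size)
        )
    return [data[i:i + size] for i in range(0, len(data), size)]
```

A stream of 7200 bytes yields two records, whereas a stream of 3601 bytes is rejected.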

NFS access

The rules for NFS access are as above, with the additional proviso that C I/O should also be used on VMS systems in all cases where the record format of the data file is STREAM_LF.

ZEBRA RZ files

ZEBRA RZ files may now be written in both exchange and native data formats. For systems that use the IEEE floating point format, such as most Unix systems including Sun, Apollo, RS6000 etc., native and exchange formats are identical. It is recommended that exchange data format be used wherever possible. Such files may be transferred between different systems using the standard ftp utility and accessed at the record level using NFS. This obviates the need for the GETRZ and PUTRZ commands in ZFTP, for example.

ZEBRA FZ files

Exchange format has always existed for ZEBRA FZ files. However, due to limitations of certain FORTRAN implementations, such files have not been easily transferable to/from these systems. (FORTRAN typically writes control words at the beginning and end of each record in sequential files on most Unix systems. These control words render the file unreadable from other systems across NFS, or if the file is transferred using FTP without further conversion.) ZEBRA FZ has now been enhanced to provide I/O using the C run time library (or FORTRAN direct-access I/O). Files written with either of these options may be shared across systems using NFS or transferred using FTP without further conversion.
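
The effect of the control words can be illustrated with a short Python sketch. It assumes that each record is framed by a 4-byte little-endian length word before and after the payload; this is a common layout for FORTRAN sequential unformatted files on Unix, but the width and byte order depend on the compiler and machine, so treat both as assumptions. The function names are hypothetical and the code is not part of ZEBRA or CSPACK.

```python
import struct

MARKER = "<i"  # assumed control word: 4-byte little-endian record length


def write_sequential(records, marker=MARKER):
    """Frame each record with a leading and a trailing length word."""
    out = bytearray()
    for rec in records:
        word = struct.pack(marker, len(rec))
        out += word + rec + word
    return bytes(out)


def strip_control_words(blob, marker=MARKER):
    """Recover the raw record payloads from a framed byte string."""
    width = struct.calcsize(marker)
    records, pos = [], 0
    while pos < len(blob):
        (length,) = struct.unpack_from(marker, blob, pos)
        pos += width
        payload = blob[pos:pos + length]
        pos += length
        (trailer,) = struct.unpack_from(marker, blob, pos)
        pos += width
        # The leading and trailing control words must agree.
        if trailer != length or len(payload) != length:
            raise ValueError("inconsistent control words")
        records.append(payload)
    return records
```

Each framed record is 8 bytes longer than its payload, which is why a program that expects plain fixed-length records cannot read such a file directly. Writing through C I/O or FORTRAN direct-access I/O avoids the framing altogether.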

PATCHY files

PATCHY files may be kept in binary (PAM) or formatted (CARD) files. Card files may be shared across systems without problems, unless certain special characters are used. The use of card files removes the need for the ZFTP GETP and PUTP commands.

Introduction

Many high energy physics experiments use some or all of the following packages:

 -- PATCHY or CMZ for code management
 -- Zebra FZ and RZ packages for I/O
 -- PAW, HBOOK for histogramming

The transfer of the files used by these packages is often difficult, and network access impossible.

For example, PATCHY PAM files are normally transferred between different machines in a special interchange format, known as CETA. Network access to PAM files between different hardware platforms is not supported. The transfer of Zebra files also requires the use of an interchange format, Zebra binary (or even ASCII) exchange format. This requires a three step process to transfer a file:

 -- Convert to exchange format
 -- Transfer
 -- Convert back to native

Transfer of such files to and from Unix machines is further complicated by the fact that data records, when written by FORTRAN, contain control information which renders the file unreadable on the remote system, and so a further step is required to add or remove these control words.

CSPACK

CSPACK is designed to solve the above problems by providing network transfer of, and access to, the commonly used HEP formats with transparent, on-the-fly data conversion. This is performed through a file transfer program called ZFTP.

In addition, a FORTRAN callable interface allows users to code their own applications, or to call directly routines that provide complete file transfer of ASCII, binary, FORTRAN direct-access, Zebra FZ or RZ and PAM files. Routines for record level access also exist.

CSPACK also includes other tools and routines for the distributed computing environment, such as the TELNETG program, which permits a graphics application, such as PAW or GEANT, to be run on a remote machine utilising a graphics window on the local workstation.





goossens@cern.ch