Difference: MassSe (1 vs. 13)

Revision 13 2015-07-11 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (the GSTREAM library)

Line: 113 to 113
  -- AndrasLaszlo - 10 Jul 2009
Changed:
<
<
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1273224241" name="gstream.tar.gz" path="gstream.tar.gz" size="20879" stream="gstream.tar.gz" tmpFilename="/usr/tmp/CGItemp6887" user="AndrasLaszlo" version="7"
>
>
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1436608587" name="gstream.tar.gz" path="gstream.tar.gz" size="21576" stream="gstream.tar.gz" tmpFilename="/usr/tmp/CGItemp2323" user="AndrasLaszlo" version="8"

Revision 12 2010-05-07 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (the GSTREAM library)

Added:
>
>

Installation

 Note: The GSTREAM library is installed on the KFKI AFS. Look at that page for instructions on activation.
Added:
>
>
To install it on your computer, download GSTREAM, then run:
tar  -xzf  gstream.tar.gz
cd  gstream
./configure  [--prefix=some_installation_path]
make
make  install
 

The GSTREAM library for read/write C++ streams to Storage Elements

One generally does not want to stage out data files from SEs onto a local disk by hand before processing them, so read/write streams are recommended. The C++ library GSTREAM implements such streams: the gstream, igstream, ogstream classes mirror the usual C++ standard library fstream, ifstream, ofstream file input/output stream classes, the letter 'g' standing for 'grid'. The library simply treats a file as a normal file unless its name begins with the string /grid/. In that case, it stages out the datafile in question onto a local (or AFS) area, and then treats the local copy as a normal file. Sooner or later this solution has to be replaced with a GFAL-based C++ library.
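The name-based dispatch described above can be sketched in a few lines. This is only an illustration of the rule "names beginning with /grid/ are staged, everything else is opened directly"; the helper name is_grid_path is hypothetical and is not part of the GSTREAM API.

```cpp
#include <string>

// Hypothetical helper (not part of GSTREAM): returns true if the file name
// begins with the string "/grid/", i.e. if the file would be staged out to a
// local (or AFS) area before being opened as a normal file.
bool is_grid_path(const std::string& name) {
    const std::string prefix = "/grid/";
    // compare() on the leading prefix.size() characters; a shorter name
    // can never match the full prefix.
    return name.compare(0, prefix.size(), prefix) == 0;
}
```

A caller would use such a check to decide between staging the file first and opening it directly with an ordinary file stream.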

Line: 17 to 28
  int main(int argc, char *argv[]) {
Added:
>
>
// Set debug and checksum levels for verifying what is happening (default: 0, i.e. don't check).
gstream_debug(1);
gstream_checksum(1);
  // Open the datafile for reading.
  igstream igfile("/grid/cms/alaszlo/some_datafile.dat");
  // Extract data from your datafile with 'igstream::operator>>' or with 'igstream::read(char*, int)'.
Line: 93 to 108
  The wrapper scripts may be improved in the future. We thank Kálmán Kővári for this toolkit. There are also other useful scripts in the package. Please have a look at them.
Changed:
<
<
Also the scripts gexists.sh, gput.sh, grep.sh, gget.sh, gdel.sh may be helpful to manipulate single files.
>
>
Also the scripts gexists.sh, gput.sh, grep.sh, gget.sh, gdel.sh, gcheck.sh may be helpful to manipulate single files.
 

-- AndrasLaszlo - 10 Jul 2009

Changed:
<
<
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1273160081" name="gstream.tar.gz" path="gstream.tar.gz" size="20887" stream="gstream.tar.gz" tmpFilename="/usr/tmp/CGItemp5148" user="AndrasLaszlo" version="6"
>
>
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1273224241" name="gstream.tar.gz" path="gstream.tar.gz" size="20879" stream="gstream.tar.gz" tmpFilename="/usr/tmp/CGItemp6887" user="AndrasLaszlo" version="7"

Revision 11 2010-05-06 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (the GSTREAM library)

Line: 98 to 98
  -- AndrasLaszlo - 10 Jul 2009
Changed:
<
<
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1227461895" name="gstream.tar.gz" path="gstream.tar.gz" size="19165" stream="gstream.tar.gz" tmpFilename="/usr/tmp/CGItemp28590" user="AndrasLaszlo" version="5"
>
>
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1273160081" name="gstream.tar.gz" path="gstream.tar.gz" size="20887" stream="gstream.tar.gz" tmpFilename="/usr/tmp/CGItemp5148" user="AndrasLaszlo" version="6"

Revision 10 2009-07-10 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (the GSTREAM library)

Line: 95 to 95
  Also the scripts gexists.sh, gput.sh, grep.sh, gget.sh, gdel.sh may be helpful to manipulate single files.
Changed:
<
<
-- AndrasLaszlo - 29 Jun 2008
>
>
-- AndrasLaszlo - 10 Jul 2009
 
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1227461895" name="gstream.tar.gz" path="gstream.tar.gz" size="19165" stream="gstream.tar.gz" tmpFilename="/usr/tmp/CGItemp28590" user="AndrasLaszlo" version="5"

Revision 9 2008-11-23 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (the GSTREAM library)

Line: 97 to 97
  -- AndrasLaszlo - 29 Jun 2008
Changed:
<
<
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1227279718" name="gstream.tar.gz" path="gstream.tar.gz" size="18850" stream="gstream.tar.gz" tmpFilename="/usr/tmp/CGItemp27205" user="AndrasLaszlo" version="4"
>
>
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1227461895" name="gstream.tar.gz" path="gstream.tar.gz" size="19165" stream="gstream.tar.gz" tmpFilename="/usr/tmp/CGItemp28590" user="AndrasLaszlo" version="5"

Revision 8 2008-11-21 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (the GSTREAM library)

Line: 6 to 6
 

The GSTREAM library for read/write C++ streams to Storage Elements

Changed:
<
<
As one does generally not want to always stage out the data files from SE-s onto a local disk by hand, and then process it, it is recommended to have read/write streams. The C++ library GSTREAM implements such a library. (gstream, igstream, ogstream stream classes, like the usual C++ STL fstream, ifstream, ofstream file input/output stream classes; the letter 'g' standing for 'grid'.) It does nothing else, but treats the file as a normal file, unless its name begins with the string /grid/. In this case, it stages out the datafile in question onto a local (or AFS) area, and then treats the local file as a normal file. Sooner or later this solution has to be replaced with a GFAL based C++ library.
>
>
As one does generally not want to always stage out the data files from SE-s onto a local disk by hand, and then process it, it is recommended to have read/write streams. The C++ library GSTREAM implements such a library. ( gstream, igstream, ogstream stream classes, like the usual C++ STL fstream, ifstream, ofstream file input/output stream classes; the letter 'g' standing for 'grid'.) It does nothing else, but treats the file as a normal file, unless its name begins with the string /grid/. In this case, it stages out the datafile in question onto a local (or AFS) area, and then treats the local file as a normal file. Sooner or later this solution has to be replaced with a GFAL based C++ library.
  One commonly faces the problem that the file not only has to be processed, but it also has to be passed through a filter program. Therefore I also wrote pipe streams for grid storage (igpstream, ogpstream), which are based on the ipstream and opstream classes of the library at http://pstreams.sourceforge.net (note the LGPL license!).
Line: 54 to 53
  Before usage, one has to export the following environmental variables:
Changed:
<
<
export LCG_GFAL_VO=your_vo
(for bash), or
setenv LCG_GFAL_VO your_vo
(for tcsh), and
>
>
export LCG_GFAL_VO=your_vo
(for bash), or
setenv LCG_GFAL_VO your_vo
(for tcsh), and
 
Changed:
<
<
export DEST=your_favourite_storage_element1,your_storage_element2,...
(for bash), or
setenv DEST your_favourite_storage_element1,your_storage_element2,...
(for tcsh).
>
>
export DEST=your_favourite_storage_element1,your_storage_element2,...
(for bash), or
setenv DEST your_favourite_storage_element1,your_storage_element2,...
(for tcsh).
  Setting the environmental variable TMPDIR is optional. This specifies the local (or AFS) directory, where the datafiles are staged out (therefore, it has to have large disk space!). E.g.:
Changed:
<
<
export TMPDIR=/tmp
(for bash), or
setenv TMPDIR /tmp
(for tcsh).
>
>
export TMPDIR=/tmp
(for bash), or
setenv TMPDIR /tmp
(for tcsh).
  If not specified, the current working directory ($PWD) is used, as this is recommended for grid jobs (the working nodes have large disk spaces).
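As an illustration of the environment handling described above, the following sketch shows how a program could pick the staging directory (TMPDIR if set, otherwise the current working directory) and split a comma-separated DEST list. The function names staging_directory and split_commas are hypothetical and are not taken from the GSTREAM sources; the sketch assumes a C++17 compiler for std::filesystem.

```cpp
#include <cstdlib>
#include <filesystem>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: the directory where datafiles would be staged out.
// Uses TMPDIR when it is set and non-empty, otherwise the current working
// directory, following the rule described in the text.
std::string staging_directory() {
    const char* tmpdir = std::getenv("TMPDIR");
    if (tmpdir != nullptr && tmpdir[0] != '\0') {
        return tmpdir;
    }
    return std::filesystem::current_path().string();
}

// Hypothetical helper: splits a comma-separated list such as the value of
// DEST into its entries, skipping empty ones.
std::vector<std::string> split_commas(const std::string& list) {
    std::vector<std::string> items;
    std::stringstream stream(list);
    std::string item;
    while (std::getline(stream, item, ',')) {
        if (!item.empty()) {
            items.push_back(item);
        }
    }
    return items;
}
```

How GSTREAM itself orders or uses the entries of DEST is not shown here; the sketch only covers reading the variables.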
Line: 77 to 72
 
> lcg-cr.sh  /home/user_name/source  /grid/your_vo/user_name/destination  your_favourite_se  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Changed:
<
<
Here source may be a large directory tree structure.
>
>
Here source may be a large directory tree structure.
 
> lcg-rep.sh  /grid/your_vo/user_name/source  destination_se  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Changed:
<
<
Here source may be a large directory tree structure. destination_se is the SE where you want the replica.
>
>
Here source may be a large directory tree structure. destination_se is the SE where you want the replica.
 
> lcg-cp.sh  /grid/your_vo/user_name/source  /home/user_name/dest  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Changed:
<
<
Here source may be a large directory tree structure.
>
>
Here source may be a large directory tree structure.
 
> lcg-del.sh  /grid/your_vo/user_name/target  target_se  your_vo  > runplan.dat
Line: 100 to 93
  The wrapper scripts may be improved in the future. We thank Kálmán Kővári for this toolkit. There are also other useful scripts in the package. Please have a look at them.
Added:
>
>
Also the scripts gexists.sh, gput.sh, grep.sh, gget.sh, gdel.sh may be helpful to manipulate single files.
  -- AndrasLaszlo - 29 Jun 2008
Changed:
<
<
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1224145996" name="gstream.tar.gz" path="gstream.tar.gz" size="18743" stream="gstream.tar.gz" user="Main.AndrasLaszlo" version="1"
>
>
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1227279718" name="gstream.tar.gz" path="gstream.tar.gz" size="18850" stream="gstream.tar.gz" tmpFilename="/usr/tmp/CGItemp27205" user="AndrasLaszlo" version="4"

Revision 7 2008-11-13 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (the GSTREAM library)

Added:
>
>
Note: The GSTREAM library is installed on the KFKI AFS. Look at that page for instructions on activation.
 

The GSTREAM library for read/write C++ streams to Storage Elements

Line: 103 to 104
 -- AndrasLaszlo - 29 Jun 2008
Changed:
<
<
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1217520173" name="gstream.tar.gz" path="gstream.tar.gz" size="18723" stream="gstream.tar.gz" user="Main.AndrasLaszlo" version="3"
>
>
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1224145996" name="gstream.tar.gz" path="gstream.tar.gz" size="18743" stream="gstream.tar.gz" user="Main.AndrasLaszlo" version="1"

Revision 6 2008-07-31 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (the GSTREAM library)

The GSTREAM library for read/write C++ streams to Storage Elements

Changed:
<
<
As one does generally not want to always stage out the data files from _SE_-s onto a local disk by hand, and then process it, it is recommended to have read/write streams. The C++ library GSTREAM implements such a library. (gstream, igstream, ogstream stream classes, like the usual C++ STL fstream, ifstream, ofstream file input/output stream classes; the letter 'g' standing for 'grid'.) It does nothing else, but treats the file as a normal file, unless its name begins with the
>
>
As one does generally not want to always stage out the data files from SE-s onto a local disk by hand, and then process it, it is recommended to have read/write streams. The C++ library GSTREAM implements such a library. (gstream, igstream, ogstream stream classes, like the usual C++ STL fstream, ifstream, ofstream file input/output stream classes; the letter 'g' standing for 'grid'.) It does nothing else, but treats the file as a normal file, unless its name begins with the
 string /grid/. In this case, it stages out the datafile in question onto a local (or AFS) area, and then treats the local file as a normal file. Sooner or later this solution has to be replaced with a GFAL based C++ library.

One commonly faces the problem that the file not only has to be processed, but it also has to be passed through a filter program. Therefore I also wrote pipe streams for grid storage (igpstream, ogpstream), which are based on the ipstream and opstream classes of the library at http://pstreams.sourceforge.net (note the LGPL license!).

Line: 103 to 103
 -- AndrasLaszlo - 29 Jun 2008
Changed:
<
<
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1214763926" name="gstream.tar.gz" path="gstream.tar.gz" size="18719" stream="gstream.tar.gz" user="Main.AndrasLaszlo" version="2"
>
>
META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1217520173" name="gstream.tar.gz" path="gstream.tar.gz" size="18723" stream="gstream.tar.gz" user="Main.AndrasLaszlo" version="3"

Revision 5 2008-06-29 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<

Mass file handling on the Grid (a WMSX extension)

>
>

Mass file handling on the Grid (the GSTREAM library)

 
Changed:
<
<

A WMSX toolkit for command line mass file manipulations

>
>

The GSTREAM library for read/write C++ streams to Storage Elements

 
Changed:
<
<
The wrapper scripts lcg-cr.sh, lcg-rep.sh, lcg-cp.sh, lcg-del.sh wrap the commands lcg-cr, lcg-rep, lcg-cp, lcg-del so that whole directory trees can also be handled. Wrappers around other lcg- commands may also be written based on them (in this case, we would be grateful if you shared them with us). These scripts print the planned series of lfc- and lcg- commands as text. This text output can be passed to the script runplan.sh, which executes the command list. Every command is wrapped in a checker loop: execution is given up after 5 unsuccessful tries. If any interruption occurs (e.g. because of a failure), the runplan.sh script can be started again, and it will continue from the command where it was interrupted.

Practical examples:

> lcg-cr.sh  /home/user_name/source  /grid/your_vo/user_name/destination  your_favourite_se  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Here source may be a large directory tree structure.

> lcg-rep.sh  /grid/your_vo/user_name/source  destination_se  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Here source may be a large directory tree structure. destination_se is the SE where you want the replica.

> lcg-cp.sh  /grid/your_vo/user_name/source  /home/user_name/dest  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Here source may be a large directory tree structure.

> lcg-del.sh  /grid/your_vo/user_name/target  target_se  your_vo  > runplan.dat
> runplan.sh runplan.dat
Here target may be a large directory tree structure, and target_se is the SE from which you want to remove your copy. Setting it to all removes all replicas.

The wrapper scripts may be improved in the future. We thank Kálmán Kővári for this toolkit.

A WMSX toolkit for read/write C++ streams to Storage Elements

As one does generally not want to always stage out the data files from _SE_-s onto a local disk by hand, and then process it, it is recommended to have read/write streams. The C++ library, shipped with WMSX implements such a library. (gstream, igstream, ogstream stream classes, like the usual C++ STL fstream, ifstream, ofstream file input/output stream classes; the letter 'g' standing for 'grid'). It does nothing else, but treats the file as a normal file, unless its name begins with the

>
>
As one does generally not want to always stage out the data files from _SE_-s onto a local disk by hand, and then process it, it is recommended to have read/write streams. The C++ library GSTREAM implements such a library. (gstream, igstream, ogstream stream classes, like the usual C++ STL fstream, ifstream, ofstream file input/output stream classes; the letter 'g' standing for 'grid'.) It does nothing else, but treats the file as a normal file, unless its name begins with the
 string /grid/. In this case, it stages out the datafile in question onto a local (or AFS) area, and then treats the local file as a normal file. Sooner or later this solution has to be replaced with a GFAL based C++ library.
Changed:
<
<
One commonly faces the problem that the file not only has to be processed, but it also has to be passed through a filter programme. Therefore I also wrote pipe streams for grid storage (igpstream, ogpstream), which are based on the ipstream and opstream classes of the library at http://pstreams.sourceforge.net (note the LGPL license!).
>
>
One commonly faces the problem that the file not only has to be processed, but it also has to be passed through a filter program. Therefore I also wrote pipe streams for grid storage (igpstream, ogpstream), which are based on the ipstream and opstream classes of the library at http://pstreams.sourceforge.net (note the LGPL license!).
  Practical examples:
Line: 89 to 56
 
export LCG_GFAL_VO=your_vo
(for bash), or
setenv LCG_GFAL_VO your_vo
(for tcsh), and
Changed:
<
<
export DEST=your_favourite_storage_element
(for bash), or
setenv DEST your_favourite_storage_element
(for tcsh).
>
>
export DEST=your_favourite_storage_element1,your_storage_element2,...
(for bash), or
setenv DEST your_favourite_storage_element1,your_storage_element2,...
(for tcsh).
  Setting the environmental variable TMPDIR is optional. This specifies the local (or AFS) directory, where the datafiles are staged out (therefore, it has to have large disk space!). E.g.:
Line: 98 to 65
 
setenv TMPDIR /tmp
(for tcsh).

If not specified, the current working directory ($PWD) is used, as this is recommended for grid jobs (the working nodes have large disk spaces). \ No newline at end of file

Added:
>
>

GSTREAM toolkit extensions for command line mass file manipulations

Some helper scripts are also shipped together with the GSTREAM library. The wrapper scripts lcg-cr.sh, lcg-rep.sh, lcg-cp.sh, lcg-del.sh wrap the commands lcg-cr, lcg-rep, lcg-cp, lcg-del so that whole directory trees can also be handled. Wrappers around other lcg- commands may also be written based on them (in this case, we would be grateful if you shared them with us). These scripts print the planned series of lfc- and lcg- commands as text. This text output can be passed to the script runplan.sh, which executes the command list. Every command is wrapped in a checker loop: execution is given up after 5 unsuccessful tries. If any interruption occurs (e.g. because of a failure), the runplan.sh script can be started again, and it will continue from the command where it was interrupted.
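The checker loop and the resume behaviour can be illustrated with a small sketch. It is not the implementation of runplan.sh, which is a shell script whose internals are not shown on this page; the names run_with_retries and run_plan are hypothetical, and each planned command is modelled as a callable that returns true on success.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: tries one planned step up to max_tries times and
// returns true as soon as one attempt succeeds.
bool run_with_retries(const std::function<bool()>& step, int max_tries = 5) {
    for (int attempt = 0; attempt < max_tries; ++attempt) {
        if (step()) {
            return true;
        }
    }
    return false;  // given up after max_tries unsuccessful tries
}

// Hypothetical helper: runs the plan starting at index `start` and returns
// the index of the first step that could not be completed, or plan.size()
// if every step succeeded. Passing the returned index back in as `start`
// continues from the step where the previous run stopped.
std::size_t run_plan(const std::vector<std::function<bool()>>& plan,
                     std::size_t start = 0, int max_tries = 5) {
    for (std::size_t i = start; i < plan.size(); ++i) {
        if (!run_with_retries(plan[i], max_tries)) {
            return i;
        }
    }
    return plan.size();
}
```

Returning the index of the failed step is one simple way to make a rerun resume where it stopped; the real script may record its progress differently.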

Practical examples:

> lcg-cr.sh  /home/user_name/source  /grid/your_vo/user_name/destination  your_favourite_se  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Here source may be a large directory tree structure.

> lcg-rep.sh  /grid/your_vo/user_name/source  destination_se  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Here source may be a large directory tree structure. destination_se is the SE where you want the replica.

> lcg-cp.sh  /grid/your_vo/user_name/source  /home/user_name/dest  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Here source may be a large directory tree structure.

> lcg-del.sh  /grid/your_vo/user_name/target  target_se  your_vo  > runplan.dat
> runplan.sh runplan.dat
Here target may be a large directory tree structure, and target_se is the SE from which you want to remove your copy. Setting it to all removes all replicas.

The wrapper scripts may be improved in the future. We thank Kálmán Kővári for this toolkit. There are also other useful scripts in the package. Please have a look at them.

-- AndrasLaszlo - 29 Jun 2008

META FILEATTACHMENT attachment="gstream.tar.gz" attr="" comment="The GSTREAM library source." date="1214763926" name="gstream.tar.gz" path="gstream.tar.gz" size="18719" stream="gstream.tar.gz" user="Main.AndrasLaszlo" version="2"

Revision 4 2008-04-16 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (a WMSX extension)

Line: 55 to 55
  // Extract data from your datafile with 'igstream::operator>>' or with 'igstream::read(char*, int)'.
  // Close the datafile.
  igfile.close();
Added:
>
>
igfile.clear();
  // Open the datafile for writing.
  ogstream ogfile("/grid/cms/alaszlo/some_datafile.dat");
  // Write data to your datafile with 'ogstream::operator<<' or with 'ogstream::write(char*, int)'.
  // Close the datafile.
  ogfile.close();
Added:
>
>
ogfile.clear();
  // Open the datafile for reading, through a filter program.
  igpstream igpfile("/grid/cms/alaszlo/some_datafile.dat.gz", "gunzip --stdout %f");
  // Extract data from your datafile with 'igpstream::operator>>' or with 'igpstream::read(char*, int)'.
  // Close the datafile.
  igpfile.close();
Added:
>
>
igpfile.clear();
  // Open the datafile for writing, through a filter program.
  ogpstream ogpfile("/grid/cms/alaszlo/some_datafile.dat.gz", "gzip - > %f");
  // Write data to your datafile with 'ogpstream::operator<<' or with 'ogpstream::write(char*, int)'.
  // Close the datafile.
  ogpfile.close();
Added:
>
>
ogpfile.clear();
  return 0;
}

Revision 3 2007-11-29 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (a WMSX extension)

Line: 94 to 94
 
setenv TMPDIR /tmp
(for tcsh).

If not specified, the current working directory ($PWD) is used, as this is recommended for grid jobs (the working nodes have large disk spaces).

Deleted:
<
<

-- AndrasLaszlo - 17 Sep 2007

Revision 2 2007-09-17 - AndrasLaszlo

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (a WMSX extension)

Changed:
<
<
  • A WMSX toolkit for command line mass file manipulations.
>
>

A WMSX toolkit for command line mass file manipulations

  The wrapper scripts lcg-cr.sh, lcg-rep.sh, lcg-cp.sh, lcg-del.sh wrap the commands lcg-cr, lcg-rep, lcg-cp, lcg-del so that whole directory trees can also be handled. Wrappers around other lcg- commands may also be written based on them (in this case, we would be grateful if you shared them with us). These scripts print the planned series of lfc- and lcg- commands as text. This text output can be passed to the script runplan.sh, which executes the command list. Every command is wrapped in a checker loop: execution is given up after 5 unsuccessful tries. If any interruption occurs (e.g. because of a failure), the runplan.sh script can be started again, and it will continue from the command where it was interrupted.
Line: 33 to 33
  Here target may be a large directory tree structure, and target_se is the SE from which you want to remove your copy. Setting it to all removes all replicas.
Changed:
<
<
The wrapper scripts may be improved in the future.
>
>
The wrapper scripts may be improved in the future. We thank Kálmán Kővári for this toolkit.
 
Changed:
<
<
  • A WMSX toolkit for read/write C++ streams to Storage Elements.
>
>

A WMSX toolkit for read/write C++ streams to Storage Elements

  As one does generally not want to always stage out the data files from _SE_-s onto a local disk by hand, and then process it, it is recommended to have read/write streams. The C++ library, shipped with WMSX implements such a library. (gstream, igstream, ogstream stream classes, like the usual C++ STL fstream, ifstream, ofstream file input/output stream classes; the letter 'g' standing for 'grid'). It does nothing else, but treats the file as a normal file, unless its name begins with the string /grid/. In this case, it stages out the datafile in question onto a local (or AFS) area, and then treats the local file as a normal file. Sooner or later this solution has to be replaced with a GFAL based C++ library.
Line: 78 to 78
 }
Added:
>
>
When compiling, pass the option `gstream-config --cflags` to the compiler, and when linking, pass the `gstream-config --libs` option to the linker.
 Before usage, one has to export the following environmental variables:

export LCG_GFAL_VO=your_vo
(for bash), or
Changed:
<
<
setenv :CG_GFAL_VO your_vo
(for tcsh), and
>
>
setenv LCG_GFAL_VO your_vo
(for tcsh), and
 
export DEST=your_favourite_storage_element
(for bash), or
setenv DEST your_favourite_storage_element
(for tcsh).
Line: 92 to 94
 
setenv TMPDIR /tmp
(for tcsh).

If not specified, the current working directory ($PWD) is used, as this is recommended for grid jobs (the working nodes have large disk spaces). \ No newline at end of file

Added:
>
>

-- AndrasLaszlo - 17 Sep 2007

Revision 1 2007-09-15 - AndrasLaszlo

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WebHome"

Mass file handling on the Grid (a WMSX extension)

  • A WMSX toolkit for command line mass file manipulations.

The wrapper scripts lcg-cr.sh, lcg-rep.sh, lcg-cp.sh, lcg-del.sh wrap the commands lcg-cr, lcg-rep, lcg-cp, lcg-del so that whole directory trees can also be handled. Wrappers around other lcg- commands may also be written based on them (in this case, we would be grateful if you shared them with us). These scripts print the planned series of lfc- and lcg- commands as text. This text output can be passed to the script runplan.sh, which executes the command list. Every command is wrapped in a checker loop: execution is given up after 5 unsuccessful tries. If any interruption occurs (e.g. because of a failure), the runplan.sh script can be started again, and it will continue from the command where it was interrupted.

Practical examples:

> lcg-cr.sh  /home/user_name/source  /grid/your_vo/user_name/destination  your_favourite_se  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Here source may be a large directory tree structure.

> lcg-rep.sh  /grid/your_vo/user_name/source  destination_se  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Here source may be a large directory tree structure. destination_se is the SE where you want the replica.

> lcg-cp.sh  /grid/your_vo/user_name/source  /home/user_name/dest  your_vo  > runplan.dat
> runplan.sh  runplan.dat
Here source may be a large directory tree structure.

> lcg-del.sh  /grid/your_vo/user_name/target  target_se  your_vo  > runplan.dat
> runplan.sh runplan.dat
Here target may be a large directory tree structure, and target_se is the SE from which you want to remove your copy. Setting it to all removes all replicas.

The wrapper scripts may be improved in the future.

  • A WMSX toolkit for read/write C++ streams to Storage Elements.

As one does generally not want to always stage out the data files from _SE_-s onto a local disk by hand, and then process it, it is recommended to have read/write streams. The C++ library, shipped with WMSX implements such a library. (gstream, igstream, ogstream stream classes, like the usual C++ STL fstream, ifstream, ofstream file input/output stream classes; the letter 'g' standing for 'grid'). It does nothing else, but treats the file as a normal file, unless its name begins with the string /grid/. In this case, it stages out the datafile in question onto a local (or AFS) area, and then treats the local file as a normal file. Sooner or later this solution has to be replaced with a GFAL based C++ library.

One commonly faces the problem that the file not only has to be processed, but it also has to be passed through a filter programme. Therefore I also wrote pipe streams for grid storage (igpstream, ogpstream), which are based on the ipstream and opstream classes of the library at http://pstreams.sourceforge.net (note the LGPL license!).

Practical examples:

#include "gstream.h"

int main(int argc, char *argv[])
{
    // Open the datafile for reading.
    igstream igfile("/grid/cms/alaszlo/some_datafile.dat");
        // Extract data from your datafile with 'igstream::operator>>' or with 'igstream::read(char*, int)'.
    // Close the datafile.
    igfile.close();

    // Open the datafile for writing.
    ogstream ogfile("/grid/cms/alaszlo/some_datafile.dat");
        // Write data to your datafile with 'ogstream::operator<<' or with 'ogstream::write(char*, int)'.
    // Close the datafile.
    ogfile.close();

    // Open the datafile for reading, through a filter program.
    igpstream igpfile("/grid/cms/alaszlo/some_datafile.dat.gz", "gunzip --stdout %f");
        // Extract data from your datafile with 'igpstream::operator>>' or with 'igpstream::read(char*, int)'.
    // Close the datafile.
    igpfile.close();

    // Open the datafile for writing, through a filter program.
    ogpstream ogpfile("/grid/cms/alaszlo/some_datafile.dat.gz", "gzip - > %f");
        // Write data to your datafile with 'ogpstream::operator<<' or with 'ogpstream::write(char*, int)'.
    // Close the datafile.
    ogpfile.close();

    return 0;
}

Before usage, one has to export the following environmental variables:

export LCG_GFAL_VO=your_vo
(for bash), or
setenv :CG_GFAL_VO your_vo
(for tcsh), and

export DEST=your_favourite_storage_element
(for bash), or
setenv DEST your_favourite_storage_element
(for tcsh).

Setting the environmental variable TMPDIR is optional. This specifies the local (or AFS) directory, where the datafiles are staged out (therefore, it has to have large disk space!). E.g.:

export TMPDIR=/tmp
(for bash), or
setenv TMPDIR /tmp
(for tcsh).

If not specified, the current working directory ($PWD) is used, as this is recommended for grid jobs (the working nodes have large disk spaces).

 