MFC Programmer's SourceBook : Thinking in C++
Bruce Eckel's Thinking in C++, 2nd Ed

Iostream examples

In this section you’ll see some examples of what you can do with all the information you’ve learned in this chapter. Although many tools exist to manipulate bytes (stream editors like sed and awk from Unix are perhaps the best known, but a text editor also fits this category), they generally have some limitations. sed and awk can be slow and can only handle lines in a forward sequence, and text editors usually require human interaction, or at least learning a proprietary macro language. The programs you write with iostreams have none of these limitations: They’re fast, portable, and flexible. The iostream library is a very useful tool to have in your kit.

Code generation

The first examples concern the generation of programs that, coincidentally, fit the format used in this book. This provides a little extra speed and consistency when developing code. The first program creates a file to hold main( ) (assuming it takes no command-line arguments and uses the iostream library):

//: C18:Makemain.cpp
// Create a shell main() file
#include <fstream>
#include <strstream>
#include <cstring>
#include <cctype>
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ofstream mainfile(argv[1]);
  assure(mainfile, argv[1]);
  istrstream name(argv[1]);
  ostrstream CAPname;
  char c;
  while(name.get(c))
    CAPname << char(toupper(c));
  CAPname << ends;
  mainfile << "//" << ": " << CAPname.rdbuf()
    << " -- " << endl
    << "#include <iostream>" << endl
    << endl
    << "main() {" << endl << endl
    << "}" << endl;
} ///:~ 

The argument on the command line is used to create an istrstream, so the characters can be extracted one at a time and converted to upper case with the Standard C library function toupper( ) (which is often also implemented as a macro). This returns an int so it must be explicitly cast to a char. The upper-case name is used in the headline, followed by the remainder of the generated file.
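As a minimal sketch of the same conversion, here it is written with std::string instead of a strstream. The helper name upperCopy is hypothetical and not part of the book's code. The character is converted to unsigned char before it is handed to toupper( ), because passing a negative char value is undefined:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns an upper-case copy of s.
std::string upperCopy(const std::string& s) {
  std::string result;
  for(std::string::size_type i = 0; i < s.size(); i++) {
    // toupper() takes and returns an int; go through
    // unsigned char on the way in, back to char on the way out.
    unsigned char c = static_cast<unsigned char>(s[i]);
    result += static_cast<char>(std::toupper(c));
  }
  return result;
}
```

Calling upperCopy("Makemain.cpp") yields MAKEMAIN.CPP, which is the form the headline uses.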

Maintaining class library source

The second example performs a more complex and useful task. Generally, when you create a class you think in library terms, and make a header file Name.h for the class declaration and a file where the member functions are implemented, called Name.cpp. These files have certain requirements: a particular coding standard (the program shown here will use the coding format for this book), and in the header file the declarations are generally surrounded by some preprocessor statements to prevent multiple declarations of classes. (Multiple declarations confuse the compiler – it doesn’t know which one you want to use. They could be different, so it throws up its hands and gives an error message.)

This example allows you to create a new header-implementation pair of files, or to modify an existing pair. If the files already exist, it checks and potentially modifies the files, but if they don’t exist, it creates them using the proper format.

//: C18:Cppcheck.cpp
// Configures .H & .CPP files
// To conform to style standard.
// Tests existing files for conformance
#include <fstream>
#include <strstream>
#include <cstring>
#include <cctype>
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  const int sz = 40;  // Buffer sizes
  const int bsz = 100;
  requireArgs(argc, 1); // File set name
  enum bufs { base, header, implement,
    Hline1, guard1, guard2, guard3,
    CPPline1, include, bufnum };
  char b[bufnum][sz];
  ostrstream osarray[] = {
    ostrstream(b[base], sz),
    ostrstream(b[header], sz),
    ostrstream(b[implement], sz),
    ostrstream(b[Hline1], sz),
    ostrstream(b[guard1], sz),
    ostrstream(b[guard2], sz),
    ostrstream(b[guard3], sz),
    ostrstream(b[CPPline1], sz),
    ostrstream(b[include], sz),
  };
  osarray[base] << argv[1] << ends;
  // Find any '.' in the string using the
  // Standard C library function strchr():
  char* period = strchr(b[base], '.');
  if(period) *period = 0; // Strip extension
  // Force to upper case:
  for(int i = 0; b[base][i]; i++)
    b[base][i] = toupper(b[base][i]);
  // Create file names and internal lines:
  osarray[header] << b[base] << ".H" << ends;
  osarray[implement] << b[base] << ".CPP" << ends;
  osarray[Hline1] << "//" << ": " << b[header]
    << " -- " << ends;
  osarray[guard1] << "#ifndef " << b[base]
                  << "_H" << ends;
  osarray[guard2] << "#define " << b[base]
                  << "_H" << ends;
  osarray[guard3] << "#endif // " << b[base]
                  << "_H" << ends;
  osarray[CPPline1] << "//" << ": "
                    << b[implement]
                    << " -- " << ends;
  osarray[include] << "#include \""
                   << b[header] << "\"" <<ends;
  // First, try to open existing files:
  ifstream existh(b[header]),
           existcpp(b[implement]);
  if(!existh) { // Doesn't exist; create it
    ofstream newheader(b[header]);
    assure(newheader, b[header]);
    newheader << b[Hline1] << endl
      << b[guard1] << endl
      << b[guard2] << endl << endl
      << b[guard3] << endl;
  }
  if(!existcpp) { // Create cpp file
    ofstream newcpp(b[implement]);
    assure(newcpp, b[implement]);
    newcpp << b[CPPline1] << endl
      << b[include] << endl;
  }
  if(existh) { // Already exists; verify it
    strstream hfile; // Write & read
    ostrstream newheader; // Write
    hfile << existh.rdbuf() << ends;
    // Check that first line conforms:
    char buf[bsz];
    if(hfile.getline(buf, bsz)) {
      if(!strstr(buf, "//" ":") ||
         !strstr(buf, b[header]))
        newheader << b[Hline1] << endl;
    }
    // Ensure guard lines are in header:
    if(!strstr(hfile.str(), b[guard1]) ||
       !strstr(hfile.str(), b[guard2]) ||
       !strstr(hfile.str(), b[guard3])) {
       newheader << b[guard1] << endl
         << b[guard2] << endl
         << buf
         << hfile.rdbuf() << endl
         << b[guard3] << endl << ends;
    } else
      newheader << buf
        << hfile.rdbuf() << ends;
    // If there were changes, overwrite file:
    if(strcmp(hfile.str(),newheader.str())!=0){
      existh.close();
      ofstream newH(b[header]);
      assure(newH, b[header]);
      newH << "//@//" << endl // Change marker
        << newheader.rdbuf();
    }
    delete hfile.str();
    delete newheader.str();
  }
  if(existcpp) { // Already exists; verify it
    strstream cppfile;
    ostrstream newcpp;
    cppfile << existcpp.rdbuf() << ends;
    char buf[bsz];
    // Check that first line conforms:
    if(cppfile.getline(buf, bsz))
      if(!strstr(buf, "//" ":") ||
         !strstr(buf, b[implement]))
        newcpp << b[CPPline1] << endl;
    // Ensure header is included:
    if(!strstr(cppfile.str(), b[include]))
      newcpp << b[include] << endl;
    // Put in the rest of the file:
    newcpp << buf << endl; // First line read
    newcpp << cppfile.rdbuf() << ends;
    // If there were changes, overwrite file:
    if(strcmp(cppfile.str(),newcpp.str())!=0){
      existcpp.close();
      ofstream newCPP(b[implement]);
      assure(newCPP, b[implement]);
      newCPP << "//@//" << endl // Change marker
        << newcpp.rdbuf();
    }
    delete cppfile.str();
    delete newcpp.str();
  }
} ///:~ 

This example requires a lot of string formatting in many different buffers. Rather than creating a lot of individually named buffers and ostrstream objects, a single set of names is created in the enum bufs. Then two arrays are created: an array of character buffers and an array of ostrstream objects built from those character buffers. Note that in the definition for the two-dimensional array of char buffers b, the number of char arrays is determined by bufnum, the last enumerator in bufs. When you create an enumeration, the compiler assigns integral values to all the enum labels starting at zero, so the sole purpose of bufnum is to be a counter for the number of enumerators in bufs. The length of each string in b is sz.

The names in the enumeration are base, the capitalized base file name without extension; header, the header file name; implement, the implementation file (CPP) name; Hline1, the skeleton first line of the header file; guard1, guard2, and guard3, the “guard” lines in the header file (to prevent multiple inclusion); CPPline1, the skeleton first line of the CPP file; and include, the line in the CPP file that includes the header file.

osarray is an array of ostrstream objects created using aggregate initialization and automatic counting. Of course, this is the form of the ostrstream constructor that takes two arguments (the buffer address and buffer size), so the constructor calls must be formed accordingly inside the aggregate initializer list. Using the bufs enumerators, the appropriate array element of b is tied to the corresponding osarray object. Once the array is created, the objects in the array can be selected using the enumerators, and the effect is to fill the corresponding b element. You can see how each string is built in the lines following the ostrstream array definition.
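The enumerator-as-index technique can be sketched on its own. This version swaps in ostringstream for the fixed-size ostrstream buffers, uses a shortened enumeration, and assumes the base name is already upper case; the names Buf and buildNames are hypothetical:

```cpp
#include <sstream>
#include <string>

// The last enumerator counts the ones before it.
enum Buf { base, header, guard1, guard2, guard3, bufnum };

// Hypothetical helper: fills out[] with the strings for one
// file set. baseName is assumed to be upper case already,
// with no extension.
void buildNames(const std::string& baseName,
                std::string out[bufnum]) {
  std::ostringstream os[bufnum]; // One stream per enumerator
  os[base] << baseName;
  os[header] << baseName << ".H";
  os[guard1] << "#ifndef " << baseName << "_H";
  os[guard2] << "#define " << baseName << "_H";
  os[guard3] << "#endif // " << baseName << "_H";
  for(int i = 0; i < bufnum; i++)
    out[i] = os[i].str();
}
```

Adding a new string to the set only requires a new enumerator before bufnum; both arrays grow with it.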

Once the strings have been created, the program attempts to open existing versions of both the header and CPP file as ifstreams. If you test the object using the operator ‘!’ and the file doesn’t exist, the expression yields true, because the open failed. If the header or implementation file doesn’t exist, it is created using the appropriate lines of text built earlier.

If the files do exist, then they are verified to ensure the proper format is followed. In both cases, a strstream is created and the whole file is read in; then the first line is read and checked to make sure it follows the format by seeing if it contains both a “//:” and the name of the file. This is accomplished with the Standard C library function strstr( ). If the first line doesn’t conform, the one created earlier is inserted into an ostrstream that has been created to hold the edited file.

In the header file, the whole file is searched (again using strstr( )) to ensure it contains the three “guard” lines; if not, they are inserted. The implementation file is checked for the existence of the line that includes the header file (although the compiler effectively guarantees its existence).
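The guard check can be sketched with std::string::find( ) in place of strstr( ). This is only an illustration under the same assumptions as the program (guards of the form NAME_H); the function name ensureGuards is hypothetical:

```cpp
#include <string>

// Hypothetical helper: returns text wrapped in include guards
// for the given upper-case base name, or text unchanged if all
// three guard lines are already present.
std::string ensureGuards(const std::string& text,
                         const std::string& base) {
  const std::string g1 = "#ifndef " + base + "_H";
  const std::string g2 = "#define " + base + "_H";
  const std::string g3 = "#endif // " + base + "_H";
  if(text.find(g1) != std::string::npos &&
     text.find(g2) != std::string::npos &&
     text.find(g3) != std::string::npos)
    return text; // Already conforms
  std::string body = text;
  // Make sure the closing guard starts on its own line:
  if(!body.empty() && body[body.size() - 1] != '\n')
    body += '\n';
  return g1 + "\n" + g2 + "\n" + body + g3 + "\n";
}
```

Like the strstr( ) version, this is a plain substring search, so a guard for A_H would also be "found" inside a line that mentions A_HELPER. Applying the function to its own output returns the text unchanged.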

In both cases, the original file (in its strstream) and the edited file (in the ostrstream) are compared to see if there are any changes. If there are, the existing file is closed, and a new ofstream object is created to overwrite it. The ostrstream is output to the file after a special change marker is added at the beginning, so you can use a text search program to rapidly find any files that need reviewing to make additional changes.

Detecting compiler errors

All the code in this book is designed to compile as shown without errors. Any line of code that should generate a compile-time error is commented out with the special comment sequence “//!”. The following program will remove these special comments and append a numbered comment to the line, so that when you run your compiler it should generate error messages and you should see all the numbers appear when you compile all the files. It also appends the modified line to a special file so you can easily locate any lines that don’t generate errors:

//: C18:Showerr.cpp
// Un-comment error generators
#include <iostream>
#include <fstream>
#include <strstream>
#include <cstdio>
#include <cstring>
#include <cctype>
#include "../require.h"
using namespace std;
const char* marker = "//!";

const char* usage =
"usage: showerr filename chapnum\n"
"where filename is a C++ source file\n"
"and chapnum is the chapter name it's in.\n"
"Finds lines commented with //! and removes\n"
"comment, appending //(#) where # is unique\n"
"across all files, so you can determine\n"
"if your compiler finds the error.\n"
"showerr /r\n"
"resets the unique counter.";

// File containing error number counter:
const char* errnum = "../errnum.txt";
// File containing error lines:
const char* errfile = "../errlines.txt";
ofstream errlines(errfile,ios::app);

int main(int argc, char* argv[]) {
  if(argc < 2) { // Need a file name or a switch
    cerr << usage << endl;
    return 1;
  }
  if(argv[1][0] == '/' || argv[1][0] == '-') {
    // Allow for other switches:
    switch(argv[1][1]) {
      case 'r': case 'R':
        cout << "reset counter" << endl;
        remove(errnum); // Delete files
        remove(errfile);
        return 0;
      default:
        cerr << usage << endl;
        return 1;
    }
  }
  // Not a switch, so both arguments are required:
  requireArgs(argc, 2, usage);
  char* chapter = argv[2];
  strstream edited; // Edited file
  int counter = 0;
  {
    ifstream infile(argv[1]);
    assure(infile, argv[1]);
    ifstream count(errnum);
    // No counter file yet (first run, or after a
    // reset) just means counting starts at zero:
    if(count) count >> counter;
    int linecount = 0;
    const int sz = 255;
    char buf[sz];
    while(infile.getline(buf, sz)) {
      linecount++;
      // Eat white space:
      int i = 0;
      while(isspace(buf[i]))
        i++;
      // Find marker at start of line:
      if(strstr(&buf[i], marker) == &buf[i]) {
        // Erase marker:
        memset(&buf[i], ' ', strlen(marker));
        // Append counter & error info:
        ostrstream out(buf, sz, ios::ate);
        out << "//(" << ++counter << ") "
            << "Chapter " << chapter
            << " File: " << argv[1]
            << " Line " << linecount << endl
            << ends;
          edited << buf;
        errlines << buf; // Append error file
      } else
        edited << buf << "\n"; // Just copy
    }
  } // Closes files
  ofstream outfile(argv[1]); // Overwrites
  assure(outfile, argv[1]);
  outfile << edited.rdbuf();
  ofstream count(errnum); // Overwrites
  assure(count, errnum);
  count << counter; // Save new counter
} ///:~ 

The marker can be replaced with one of your choice.

Each file is read a line at a time, and each line is searched for the marker appearing at the head of the line; the line is modified and put into the error line list and into the strstream edited. When the whole file is processed, it is closed (by reaching the end of a scope), reopened as an output file and edited is poured into the file. Also notice the counter is saved in an external file, so the next time this program is invoked it continues to sequence the counter.
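The per-line step can be isolated as a small function. This sketch uses std::string rather than a char buffer and an ostrstream, and it appends only the counter, not the chapter and file information; the name exposeError is hypothetical:

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper: if line starts (after white space) with
// the marker, blank the marker out, append a numbered comment,
// and return true. Otherwise leave line and counter untouched.
bool exposeError(std::string& line, int& counter,
                 const std::string& marker = "//!") {
  std::string::size_type i = 0;
  while(i < line.size() &&
        std::isspace(static_cast<unsigned char>(line[i])))
    i++; // Eat white space
  if(line.compare(i, marker.size(), marker) != 0)
    return false; // Marker is not at the start of the line
  // Overwrite the marker with the same number of spaces:
  line.replace(i, marker.size(), marker.size(), ' ');
  std::ostringstream tag;
  tag << " //(" << ++counter << ")";
  line += tag.str();
  return true;
}
```

A marker that appears later in a line, after other code, is deliberately ignored, which matches the test for the marker at the head of the line in the program above.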

A simple datalogger

This example shows an approach you might take to log data to disk and later retrieve it for processing. The example is meant to produce a temperature-depth profile of the ocean at various points. To hold the data, a class is used:

//: C18:DataLogger.h
// Datalogger record layout
#ifndef DATALOG_H
#define DATALOG_H
#include <ctime>
#include <iostream>

class DataPoint {
  std::tm time; // Time & day
  static const int bsz = 10;
  // Ascii degrees (*) minutes (') seconds ("):
  char latitude[bsz], longitude[bsz];
  double depth, temperature;
public:
  std::tm getTime();
  void setTime(std::tm t);
  const char* getLatitude();
  void setLatitude(const char* l);
  const char* getLongitude();
  void setLongitude(const char* l);
  double getDepth();
  void setDepth(double d);
  double getTemperature();
  void setTemperature(double t);
  void print(std::ostream& os);
};
#endif // DATALOG_H ///:~ 

The access functions provide controlled reading and writing to each of the data members. The print( ) function formats the DataPoint in a readable form onto an ostream object (the argument to print( )). Here’s the definition file:

//: C18:Datalog.cpp {O}
// Datapoint member functions
#include "DataLogger.h"
#include <iomanip>
#include <cstring>
using namespace std;

tm DataPoint::getTime() { return time; }

void DataPoint::setTime(tm t) { time = t; }

const char* DataPoint::getLatitude() {
  return latitude;
}

void DataPoint::setLatitude(const char* l) {
  latitude[bsz - 1] = 0;
  strncpy(latitude, l, bsz - 1);
}

const char* DataPoint::getLongitude() {
  return longitude;
}

void DataPoint::setLongitude(const char* l) {
  longitude[bsz - 1] = 0;
  strncpy(longitude, l, bsz - 1);
}

double DataPoint::getDepth() { return depth; }

void DataPoint::setDepth(double d) { depth = d; }

double DataPoint::getTemperature() {
  return temperature;
}

void DataPoint::setTemperature(double t) {
  temperature = t;
}

void DataPoint::print(ostream& os) {
  os.setf(ios::fixed, ios::floatfield);
  os.precision(4);
  os.fill('0'); // Pad on left with '0'
  os << setw(2) << getTime().tm_mon << '\\'
     << setw(2) << getTime().tm_mday << '\\'
     << setw(2) << getTime().tm_year << ' '
     << setw(2) << getTime().tm_hour << ':'
     << setw(2) << getTime().tm_min << ':'
     << setw(2) << getTime().tm_sec;
  os.fill(' '); // Pad on left with ' '
  os << " Lat:" << setw(9) << getLatitude()
     << ", Long:" << setw(9) << getLongitude()
     << ", depth:" << setw(9) << getDepth()
     << ", temp:" << setw(9) << getTemperature()
     << endl;
} ///:~ 

In print( ), the call to setf( ) causes the floating-point output to be fixed-precision, and precision( ) sets the number of decimal places to four.

The default is to right-justify the data within the field. The time information consists of two digits each for the hours, minutes and seconds, so the width is set to two with setw( ) in each case. (Remember that any changes to the field width affect only the next output operation, so setw( ) must be given for each output.) But first, to put a zero in the left position if the value is less than 10, the fill character is set to ‘0’. Afterwards, it is set back to a space.
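To see the fill character and setw( ) working together in isolation, here is a small sketch that formats only a time of day into a string. The function name formatTime is hypothetical, and an ostringstream stands in for the output file:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats a time of day as hh:mm:ss.
std::string formatTime(int hour, int min, int sec) {
  std::ostringstream os;
  os.fill('0'); // Stays in effect until changed
  // setw() applies to the next output only, so it is repeated:
  os << std::setw(2) << hour << ':'
     << std::setw(2) << min << ':'
     << std::setw(2) << sec;
  return os.str();
}
```

With these settings formatTime(9, 5, 3) produces 09:05:03. Without the fill character the single digits would be padded with spaces instead.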

The latitude and longitude are zero-terminated character fields, which hold the information as degrees (here, ‘*’ denotes degrees), minutes ('), and seconds ("). You can certainly devise a more efficient storage layout for latitude and longitude if you desire.
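The bounded copy used by setLatitude( ) and setLongitude( ) can be written once as a template that deduces the array size. This is a sketch only; setField is a hypothetical name and DataPoint does not contain it:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into the fixed-size array
// dst, truncating if necessary and always zero-terminating.
template<std::size_t N>
void setField(char (&dst)[N], const char* src) {
  std::strncpy(dst, src, N - 1); // Copies at most N - 1 chars
  dst[N - 1] = 0; // strncpy() doesn't terminate on truncation
}
```

With a ten-character field, a nine-character string such as 45*20'31" fits exactly, and anything longer is cut off after nine characters.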

Generating test data

Here’s a program that creates a file of test data in binary form (using write( )) and a second file in ASCII form using DataPoint::print( ). You can also print it out to the screen but it’s easier to inspect in file form.

//: C18:Datagen.cpp
//{L} Datalog
// Test data generator
#include <fstream>
#include <cstdlib>
#include <cstring>
#include "../require.h"
#include "DataLogger.h"
using namespace std;

int main() {
  ofstream data("data.txt");
  assure(data, "data.txt");
  ofstream bindata("data.bin", ios::binary);
  assure(bindata, "data.bin");
  time_t timer;
  // Seed random number generator:
  srand(time(&timer)); 
  for(int i = 0; i < 100; i++) {
    DataPoint d;
    // Convert date/time to a structure:
    d.setTime(*localtime(&timer));
    timer += 55; // Reading each 55 seconds
    d.setLatitude("45*20'31\"");
    d.setLongitude("22*34'18\"");
    // Zero to 199 meters:
    double newdepth  = rand() % 200;
    double fraction = rand() % 100 + 1;
    newdepth += double(1) / fraction;
    d.setDepth(newdepth);
    double newtemp = 150 + rand()%200; // Kelvin
    fraction = rand() % 100 + 1;
    newtemp += (double)1 / fraction;
    d.setTemperature(newtemp);
    d.print(data);
    bindata.write((const char*)&d,
                  sizeof(d));
  }
} ///:~ 

The file DATA.TXT is created in the ordinary way as an ASCII file, but DATA.BIN has the flag ios::binary to tell the constructor to set it up as a binary file.

The Standard C library function time( ), when called with a zero argument, simply returns the current time as a time_t value; when it is given the address of a time_t, as it is here, it also stores the value there. On most systems this value is the number of seconds elapsed since 00:00:00 GMT, January 1 1970 (the dawning of the age of Aquarius?). The current time is a convenient way to seed the random number generator with the Standard C library function srand( ), as is done here.

Sometimes a more convenient way to store the time is as a tm structure, which has all the elements of the time and date broken up into their constituent parts as follows:

struct tm {
  int tm_sec; // 0-59 seconds (more for leap seconds)
  int tm_min; // 0-59 minutes
  int tm_hour; // 0-23 hours
  int tm_mday; // 1-31 day of month
  int tm_mon; // 0-11 months since January
  int tm_year; // Years since 1900
  int tm_wday; // 0-6, Sunday == 0
  int tm_yday; // 0-365 day of year
  int tm_isdst; // Daylight savings?
};

To convert from the time in seconds to the local time in the tm format, you use the Standard C library localtime( ) function, which takes the number of seconds and returns a pointer to the resulting tm. This tm, however, is a static structure inside the localtime( ) function, which is rewritten every time localtime( ) is called. To copy the contents into the tm struct inside DataPoint, you might think you must copy each element individually. However, all you must do is a structure assignment, and the compiler will take care of the rest. This means the right-hand side must be a structure, not a pointer, so the result of localtime( ) is dereferenced. The desired result is achieved with

d.setTime(*localtime(&timer));

After this, the timer is incremented by 55 seconds to give an interesting interval between readings.
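Why the copy matters can be shown with a short sketch. The function name snapshot is hypothetical, and the code assumes localtime( ) succeeds for the value it is given:

```cpp
#include <ctime>

// Hypothetical helper: returns a private copy of the
// broken-down local time for t.
std::tm snapshot(std::time_t t) {
  // localtime() returns a pointer to storage that it reuses
  // on every call; dereferencing and returning by value
  // copies every field.
  return *std::localtime(&t);
}
```

Two results obtained this way, say for t and t + 55, remain distinct objects. Had the pointers been stored instead, both would refer to the same static structure and show only the most recent conversion.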

The latitude and longitude used are fixed values to indicate a set of readings at a single location. Both the depth and the temperature are generated with the Standard C library rand( ) function, which returns a pseudorandom number between zero and the constant RAND_MAX. To put this in a desired range, use the modulus operator % and the upper end of the range. These numbers are integral; to add a fractional part, a second call to rand( ) is made, and the value is inverted after adding one (to prevent divide-by-zero errors).
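The arithmetic is easy to check when it is pulled out into a function of its own. This is a sketch of the same calculation; randomReading is a hypothetical name and range is assumed to be positive:

```cpp
#include <cstdlib>

// Hypothetical helper: a pseudorandom value with an integral
// part in [0, range) plus a fractional part of the form 1/k,
// where k is in [1, 100].
double randomReading(int range) {
  double whole = std::rand() % range;
  double k = std::rand() % 100 + 1; // Never zero
  return whole + 1.0 / k;
}
```

For a range of 200 the result always lies between 0.01 and 200. Note that the fractional part takes only one hundred distinct values, so this is adequate for test data but not a uniform distribution.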

In effect, the DATA.BIN file is being used as a container for the data in the program, even though the container exists on disk and not in RAM. To send the data out to the disk in binary form, write( ) is used. The first argument is the starting address of the source block – notice it must be cast to a character pointer because that’s what the function expects (const char* in Standard C++; some pre-Standard iostream libraries took unsigned char* instead). The second argument is the number of bytes to write, which is the size of the DataPoint object. Because no pointers are contained in DataPoint, there is no problem in writing the object to disk. If the object is more sophisticated, you must implement a scheme for serialization. (Most vendor class libraries have some sort of serialization structure built into them.)

Verifying & viewing the data

To check the validity of the data stored in binary format, it is read from the disk and put in text form in DATA2.TXT, so that file can be compared to DATA.TXT for verification. In the following program, you can see how simple this data recovery is. After the test file is created, the records are read at the command of the user.

//: C18:Datascan.cpp
//{L} Datalog
// Verify and view logged data
#include <iostream>
#include <fstream>
#include <strstream>
#include <iomanip>
#include "../require.h"
#include "DataLogger.h"
using namespace std;

int main() {
  ifstream bindata("data.bin", ios::binary);
  assure(bindata, "data.bin");
  // Create comparison file to verify data.txt:
  ofstream verify("data2.txt");
  assure(verify, "data2.txt");
  DataPoint d;
  while(bindata.read(
    (char*)&d, sizeof d))
    d.print(verify);
  bindata.clear(); // Reset state to "good"
  // Display user-selected records:
  int recnum = 0;
  // Left-align everything:
  cout.setf(ios::left, ios::adjustfield);
  // Fixed precision of 4 decimal places:
  cout.setf(ios::fixed, ios::floatfield);
  cout.precision(4);
  for(;;) {
    bindata.seekg(recnum* sizeof d, ios::beg);
    cout << "record " << recnum << endl;
    if(bindata.read(
      (char*)&d, sizeof d)) {
      // asctime() needs the address of a tm object,
      // so copy the returned value into a local:
      tm t = d.getTime();
      cout << asctime(&t);
      cout << setw(11) << "Latitude"
           << setw(11) << "Longitude"
           << setw(10) << "Depth"
           << setw(12) << "Temperature"
           << endl;
      // Put a line after the description:
      cout << setfill('-') << setw(43) << '-'
           << setfill(' ') << endl;
      cout << setw(11) << d.getLatitude()
           << setw(11) << d.getLongitude()
           << setw(10) << d.getDepth()
           << setw(12) << d.getTemperature()
           << endl;
    } else {
      cout << "invalid record number" << endl;
      bindata.clear(); // Reset state to "good"
    }
    cout << endl
      << "enter record number, x to quit:";
    char buf[10];
    cin.getline(buf, 10);
    if(buf[0] == 'x') break;
    istrstream input(buf, 10);
    input >> recnum;
  }
} ///:~ 

The ifstream bindata is created from DATA.BIN as a binary file, and assure( ) stops the program with a message if the file can’t be opened. The read( ) statement reads a single record and places it directly into the DataPoint d. (Again, if DataPoint contained pointers this would result in meaningless pointer values.) This read( ) action will set bindata’s failbit (along with eofbit) when the end of the file is reached, which will cause the while statement to fail. At this point, however, you can’t move the get pointer back and read more records because the state of the stream won’t allow further reads. So the clear( ) function is called to reset the stream state.

Once the record is read in from disk, you can do anything you want with it, such as perform calculations or make graphs. Here, it is displayed to further exercise your knowledge of iostream formatting.

The rest of the program displays a record number (represented by recnum) selected by the user. As before, the precision is fixed at four decimal places, but this time everything is left justified.

The formatting of this output looks different from before:

record 0
Tue Nov 16 18:15:49 1993
Latitude   Longitude  Depth     Temperature
-------------------------------------------
45*20'31"  22*34'18"  186.0172  269.0167     

To make sure the labels and the data columns line up, the labels are put in the same width fields as the columns, using setw( ). The line in between is generated by setting the fill character to ‘-’, the width to the desired line width, and outputting a single ‘-’.
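Both tricks, the ruled line and the padded label, can be sketched as functions that return strings. The names rule and label are hypothetical, and an ostringstream stands in for cout:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: a line of width copies of c.
std::string rule(int width, char c = '-') {
  std::ostringstream os;
  // Only one c is output; setw() pads it out to width
  // using the fill character, which is also c.
  os << std::setfill(c) << std::setw(width) << c;
  return os.str();
}

// Hypothetical helper: a left-justified label padded to width.
std::string label(const std::string& text, int width) {
  std::ostringstream os;
  os.setf(std::ios::left, std::ios::adjustfield);
  os << std::setw(width) << text;
  return os.str();
}
```

Because the labels and the data go through the same widths, changing a column only means changing one number in each of the two rows.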

If the read( ) fails, you’ll end up in the else part, which tells the user the record number was invalid. Then, because the failbit was set, it must be reset with a call to clear( ) so the next read( ) is successful (assuming it’s in the right range).

Of course, you can also open the binary data file for writing as well as reading. This way you can retrieve the records, modify them, and write them back to the same location, thus creating a flat-file database management system. In my very first programming job, I also had to create a flat-file DBMS – but using BASIC on an Apple II. It took months, while this took minutes. Of course, it might make more sense to use a packaged DBMS now, but with C++ and iostreams you can still do all the low-level operations that are necessary in a lab.

Counting editor

Often you have some editing task where you must go through and sequentially number something, but all the other text is duplicated. I encountered this problem when pasting digital photos into a Web page – I got the formatting just right, then duplicated it, then had the problem of incrementing the photo number for each one. So I replaced the photo number with XXX, duplicated that, and wrote the following program to find and replace the “XXX” with an incremented count. Notice the formatting, so the value will be “001,” “002,” etc.:

//: C18:NumberPhotos.cpp
// Find the marker "XXX" and replace it with an
// incrementing number wherever it appears. Used
// to help format a web page with photos in it
#include <fstream>
#include <sstream>
#include <iomanip>
#include <string>
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 2);
  ifstream in(argv[1]);
  assure(in, argv[1]);
  ofstream out(argv[2]);
  assure(out, argv[2]);
  string line;
  int counter = 1;
  while(getline(in, line)) {
    string::size_type xxx = line.find("XXX");
    if(xxx != string::npos) {
      ostringstream cntr;
      cntr << setfill('0') << setw(3) << counter++;
      line.replace(xxx, 3, cntr.str());
    }
    out << line << endl;
  }
} ///:~ 

Breaking up big files

This program was created to break up big files into smaller ones, in particular so they could be more easily downloaded from an Internet server (since hangups sometimes occur, this allows someone to download a file a piece at a time and then re-assemble it at the client end). You’ll note that the program also creates a reassembly batch file for DOS (where it is messier), whereas under Linux/Unix you simply say something like “cat *piece* > destination.file”.

This program reads the entire file into memory, which of course relies on having a 32-bit operating system with virtual memory for big files. It then pieces it out in chunks to the smaller files, generating the names as it goes. Of course, you can come up with a possibly more reasonable strategy that reads a chunk, creates a file, reads another chunk, etc.

Note that this program can be run on the server, so you only have to upload the big file once and then break it up once it’s on the server.

//: C18:Breakup.cpp
// Breaks a file up into smaller files for 
// easier downloads
#include <iostream>
#include <fstream>
#include <iomanip>
#include <strstream>
#include <string>
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1);
  ifstream in(argv[1], ios::binary);
  assure(in, argv[1]);
  in.seekg(0, ios::end); // End of file
  long fileSize = in.tellg(); // Size of file
  cout << "file size = " << fileSize << endl;
  in.seekg(0, ios::beg); // Start of file
  char* fbuf = new char[fileSize];
  require(fbuf != 0);
  in.read(fbuf, fileSize);
  in.close();
  string infile(argv[1]);
  string::size_type dot = infile.find('.');
  while(dot != string::npos) {
    infile.replace(dot, 1, "-");
    dot = infile.find('.');
  }
  string batchName(
    "DOSAssemble" + infile + ".bat");
  ofstream batchFile(batchName.c_str());
  batchFile << "copy /b ";
  int filecount = 0;
  const int sbufsz = 128;
  char sbuf[sbufsz];
  const long pieceSize = 1000L * 100L;
  long byteCounter = 0;
  while(byteCounter < fileSize) {
    ostrstream name(sbuf, sbufsz);
    name << argv[1] << "-part" << setfill('0') 
      << setw(2) << filecount++ << ends;
    cout << "creating " << sbuf << endl;
    if(filecount > 1) 
      batchFile << "+";
    batchFile << sbuf;
    ofstream out(sbuf, ios::out | ios::binary);
    assure(out, sbuf);
    long byteq;
    if(byteCounter + pieceSize < fileSize)
      byteq = pieceSize;
    else
      byteq = fileSize - byteCounter;
    out.write(fbuf + byteCounter, byteq);
    cout << "wrote " << byteq << " bytes, ";
    byteCounter += byteq;
    out.close();
    cout << "ByteCounter = " << byteCounter 
      << ", fileSize = " << fileSize << endl;
  }
  batchFile << " " << argv[1] << endl;
} ///:~ 

