The templates and classes described in this chapter implement an output iterator-based approach to parsing MIME documents. A MIME parser gets constructed by instantiating a sequence of template classes. Each one is an output iterator over a sequence of tokens; it modifies the sequence and passes the modified sequence to the next output iterator.
The initial output iterator,
x::mime::newline_iter,
instantiates an output iterator over char values.
Its template parameter is an output iterator class which iterates over
int values.
x::mime::newline_iter takes the
char sequence it iterates over, and promotes each
character to an int between 0 and 255.
Additionally,
x::mime::newline_iter
inserts an x::mime::newline_start into the output
sequence before each newline sequence, and
x::mime::newline_end after the newline sequence.
All other output iterators described in this chapter iterate over
an int value sequence, which consists of
the char values, from the original output
sequence that comprises the MIME document, and additional values
inserted by these output iterators.
It's important to note that
the original char output sequence does not
get modified, but gets supplemented by int
values that the output iterators insert into the output sequence,
like x::mime::newline_start and
x::mime::newline_end, which appear before and
after the LF
(or
the CRLF)
value.
The LF (or the CRLF) values
remain in the output sequence where they were, but get bracketed by
x::mime::newline_start and
x::mime::newline_end.
#include <x/mime/newlineiter.H>
#include <algorithm>
#include <vector>
#include <iterator>
#include <iostream>

int main()
{
	std::vector<int> seq;

	typedef std::back_insert_iterator<std::vector<int>> ins_iter_t;

	// Copy std::cin into a newline_iter that appends to seq.
	auto iter=std::copy(std::istreambuf_iterator<char>(std::cin),
			    std::istreambuf_iterator<char>(),
			    x::mime::newline_iter<ins_iter_t>
			    ::create(ins_iter_t(seq)));

	// Signal the end of the output sequence.
	iter.get()->eof();

	// Current value of the underlying output iterator (unused here).
	ins_iter_t value=iter.get()->iter;

	for (int c:seq)
	{
		if (x::mime::nontoken(c))
			std::cout << (char)c;
		else
			std::cout << '<' << c << '>';
	}
	std::cout << std::endl << std::flush;
	return 0;
}
Instantiating an x::mime::newline_iter results
in an output iterator, but x::mime::newline_iter
gets instantiated by create() like a
reference-counted object
(because, internally, it is).
The template parameter is an output iterator class over
ints, and the constructor
takes an instance of the template parameter class.
This example copies chars from
std::cin into the instantiated
x::mime::newline_iter,
which outputs to a
std::back_insert_iterator<std::vector<int>>.
No formal means exist to notify an output iterator of the end of the
output sequence, other than its destruction, so the MIME parsing
iterators use the following convention. The output iterator's
get() method returns a reference to the
underlying reference-counted object, which has an
eof() method that must be invoked in order to signal
the end of the output sequence.
All MIME parsing templates and classes require that
x::mime::newline_iter's
eof() gets invoked.
In addition to eof(), the
iter class member gives the current value of the
output iterator that
x::mime::newline_iter's constructor received,
via create().
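The convention can be illustrated without the library. The following is a minimal sketch and not LibCXX code: sink_state, sink_iter and sketch_eof are hypothetical names, std::shared_ptr stands in for the library's own reference-counted objects, and the value -1 is taken from the sample output. It shows why an explicit eof() is useful: every copy of the output iterator is a handle to one shared state object, so the end of the sequence gets signalled once, on the shared state, instead of being inferred from the destruction of any single copy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in; assumed to match x::mime::eof in the sample output.
constexpr int sketch_eof = -1;

// Shared state behind every copy of the output iterator.
template<typename out_iter_t>
struct sink_state {
    out_iter_t iter;     // current value of the downstream output iterator
    bool closed = false; // set once eof() has been called

    explicit sink_state(out_iter_t i) : iter(i) {}

    void put(char c)
    {
        assert(!closed); // nothing may be written after eof()
        *iter++ = static_cast<unsigned char>(c); // promote to 0..255
    }

    // Explicitly signals the end of the output sequence.
    void eof()
    {
        if (closed)
            return;
        closed = true;
        *iter++ = sketch_eof;
    }
};

// Output iterator: a cheap, copyable handle to the shared state.
template<typename out_iter_t>
class sink_iter {
    std::shared_ptr<sink_state<out_iter_t>> state;

public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    explicit sink_iter(out_iter_t i)
        : state(std::make_shared<sink_state<out_iter_t>>(i)) {}

    sink_iter &operator=(char c) { state->put(c); return *this; }
    sink_iter &operator*() { return *this; }
    sink_iter &operator++() { return *this; }
    sink_iter &operator++(int) { return *this; }

    // Analogous to get() in the text: access to the shared object.
    sink_state<out_iter_t> *get() const { return state.get(); }
};
```

In this sketch, std::copy() returns a copy of the handle it was given, and calling get()->eof() on that copy appends the end marker to the same container that the original handle was writing to.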
Sample output from the above newlineparser.C
example:
$ cat newlineparser.txt
Subject: test
test
$ ./newlineparser <newlineparser.txt
Subject: test<256>
<257><256>
<257>test<256>
<257><-1>
x::mime::newline_iter promotes the
char sequence it iterates over to
int values between 0 and 255.
x::mime::nontoken() returns
true if the given value is in this range, and
false for additional tokens. As the sample output
shows, 256 and 257
(corresponding to
x::mime::newline_start and
x::mime::newline_end) wrap each newline character.
-1 is x::mime::eof, which gets
inserted by
x::mime::newline_iter's
eof().
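The classification described above can be sketched in a few lines. This is an illustration only: is_char_value is a hypothetical stand-in for the behaviour the text attributes to x::mime::nontoken(), and render repeats the formatting loop of the example program as a function, so that it can be tried without the library.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the documented behaviour of x::mime::nontoken():
// true for promoted char values (0..255), false for inserted tokens.
inline bool is_char_value(int c)
{
    return c >= 0 && c <= 255;
}

// Formats a token sequence the way the example program prints it:
// characters verbatim, every other value as <number>.
inline std::string render(const std::vector<int> &seq)
{
    std::string out;

    for (int c : seq)
    {
        if (is_char_value(c))
            out += static_cast<char>(c);
        else
            out += '<' + std::to_string(c) + '>';
    }
    return out;
}
```

Under these assumptions, a sequence holding the character t followed by 256, LF, 257 and -1 renders as the last line of the sample output would for a one-letter document.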
One important characteristic of
x::mime::newline_iter is that when the
output sequence does not end with a newline,
x::mime::newline_iter inserts
x::mime::newline_start immediately followed by
x::mime::newline_end, without a newline in between
(this gets triggered by eof()).
In all cases
x::mime::newline_iter does not modify the
character part of the output sequence that gets forwarded to its
output iterator, but the output sequence always ends with a newline
sequence:
$ cat newlineparser.txt
Subject: test
test
$ ./newlineparser <newlineparser.txt
Subject: test<256>
<257><256>
<257>test<256><257><-1>
This is the same as the previous example, except that the original
MIME-formatted message did not end with a newline.
x::mime::newline_iter adds
x::mime::newline_start (256) immediately followed by
x::mime::newline_end (257), before the trailing
x::mime::eof.
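The token stream shown in both sample outputs can be reproduced by a small standalone function. This is a sketch under stated assumptions, not the library's implementation: annotate_lf and the tok_ constants are hypothetical names, the numeric values come from the sample output, and the function works on a complete string rather than as an output iterator. How the library treats an empty document is not stated here; the sketch assumes it counts as not ending with a newline.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Values as they appear in the sample output; the names are hypothetical
// stand-ins for x::mime::newline_start, newline_end and eof.
constexpr int tok_newline_start = 256;
constexpr int tok_newline_end = 257;
constexpr int tok_eof = -1;

// Emulates the documented LF-mode token stream for a complete document.
inline std::vector<int> annotate_lf(const std::string &doc)
{
    std::vector<int> out;
    bool ends_with_newline = false; // assumption: empty input is unterminated

    for (char ch : doc)
    {
        int c = static_cast<unsigned char>(ch); // promote to 0..255

        if (c == '\n')
        {
            out.push_back(tok_newline_start);
            out.push_back(c); // the LF itself is kept in place
            out.push_back(tok_newline_end);
            ends_with_newline = true;
        }
        else
        {
            out.push_back(c);
            ends_with_newline = false;
        }
    }

    if (!ends_with_newline)
    {
        // Implied newline: start and end tokens with nothing in between.
        out.push_back(tok_newline_start);
        out.push_back(tok_newline_end);
    }
    out.push_back(tok_eof);
    return out;
}
```

The static_cast to unsigned char matters on platforms where char is signed: without it, bytes above 127 would be promoted to negative int values and could collide with the -1 end marker.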
CRLF newline sequences

x::mime::newline_iter<ins_iter_t>::create(ins_iter_t(seq), true);
Setting the optional second parameter of
x::mime::newline_iter's
create() to true instantiates
an output iterator that recognizes the CRLF sequence as
the newline sequence instead of LF.
x::mime::newline_iter inserts
x::mime::newline_start and
x::mime::newline_end before and after
each CRLF sequence. CR and
LF by themselves are left alone.
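The CRLF rule can be sketched the same way. Again this is an illustration with hypothetical names (annotate_crlf and the tok_ constants), not library code; it covers only the bracketing and leaves out end-of-sequence handling. Because it works on a complete string it can simply look one character ahead, whereas an output iterator that receives one character at a time would have to hold back each CR until it sees the character that follows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Values as they appear in the sample output; hypothetical stand-ins for
// x::mime::newline_start and x::mime::newline_end.
constexpr int tok_newline_start = 256;
constexpr int tok_newline_end = 257;

// Emulates CRLF-mode bracketing only; eof handling is omitted.
inline std::vector<int> annotate_crlf(const std::string &doc)
{
    std::vector<int> out;

    for (std::size_t i = 0; i < doc.size(); ++i)
    {
        int c = static_cast<unsigned char>(doc[i]); // promote to 0..255

        if (c == '\r' && i + 1 < doc.size() && doc[i + 1] == '\n')
        {
            out.push_back(tok_newline_start);
            out.push_back('\r');
            out.push_back('\n');
            out.push_back(tok_newline_end);
            ++i; // the LF was consumed together with the CR
        }
        else
        {
            out.push_back(c); // a lone CR or LF passes through unbracketed
        }
    }
    return out;
}
```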
x::mime::newline_iter iterates over a
char-valued output sequence that contains
a MIME document. Its template parameter is an
output iterator class that iterates over
int values, and create()
takes an instance of the template parameter class.
The iterator passed to create() iterates over
an int sequence that consists of the
char values that
x::mime::newline_iter iterates over.
Additionally, each recognized newline sequence gets preceded by
x::mime::newline_start and followed by
x::mime::newline_end.
This includes the implied newline at the end of an output sequence
that does not end with an explicit newline sequence.
Invoking eof() on
x::mime::newline_iter's output iterator
object inserts
x::mime::eof into the output sequence.