Iterator Intervals

Implicit Iterator Intervals

The Standard C++ Library is built around the concept of iterators. On the one hand, output iterators can be incremented and assigned values, and on the other hand input iterators traverse a range between a start iterator (usually referred to as the begin iterator) and a past-the-end iterator (the end iterator). All containers follow this convention, either through member functions begin() and end() or through pointer arithmetics based on base pointer and size. Also, several other range concepts (for instance, input from an istream)  also follow this convention.

On the other hand, all standard library algorithms work on iterator ranges, which are passed implicitly by providing the two iterators independently.

The problem with implicit iterator intervals is that they don’t play nice with functions that transform intervals: for instance, a filter function would turn an interval into the elements of that interval which satisfy a certain predicate, or a sub-interval function would restrict iteration to a fixed contiguous sub-interval. In the current situation, since it’s not possible to dispatch the result of a single function call to two arguments (the expected begin and end iterators), the result of filter or sub-interval functions has to be stored as a variable, which implies that the type has to be explicitly spelled out (and it’s often too complex to write down).

Explicit Iterator Intervals

Instead of having algorithms operate on two argument iterators, I propose to combine the two iterators into a single object, called an iterator interval. The type of this iterator interval is unspecified: expression templates are used to avoid unnecessary recourse to runtime polymorphism. The only constraint is that an iterator interval must specify at least the following expressions

T::iterator is an input iterator. Then, given const T t :
t.begin() is the first iterator in the interval.
t.end() is the past-the-end iterator in the interval.

Note that this rules out the standard library containers as intervals: these will return a const_iterator instead of a normal iterator. This is because a container is not an interval, it merely has an interval that represents all its content. The mutability of a container determines the mutability of its members, whereas the mutability of an interval does not affect the mutability of the elements of the represented sequence.

For starters, a few helper functions that create intervals from most things which can have intervals:

namespace interval
{
  namespace detail
  {
    template<typename It>
    class interval_type
    {
      It begin_iterator, end_iterator;
    public:
      interval_type(It begin, It end)
        : begin_iterator(begin), end_iterator(end) {} 

      typedef It iterator; 

      iterator begin() const { return this -> begin_iterator; }
      iterator end() const { return this -> end_iterator; }
    };
  } 

  template<typename T, std::size_t N>
  detail::interval_type<T> of(T (&array)[N])
  {
    return detail::interval_type<T*>(array, array + N);
  } 

  template<typename It>
  detail::interval_type<It> of(It begin, It end)
  {
    return detail::interval_type<It>(begin,end);
  } 

  template<typename T>
  detail::interval_type< std::istream_iterator<T> > of(std::istream &in)
  {
    return detail::interval_type< std::istream_iterator<T> >
      (std::istream_iterator<T>(in),
       std::istream_iterator<T>());
  } 

  template<typename T>
  detail::interval_type< typename T::const_iterator > of(const T &t)
  {
    return detail::interval_type< typename T::const_iterator >
      (t.begin(), t.end());
  } 

  template<typename T>
  detail::interval_type< typename T::iterator > of(T &t)
  {
    return detail::interval_type< typename T::iterator >
      (t.begin(), t.end());
  }
}

So, to create an interval from a container, you can write interval::of(my_vector), to create one from an array you can write interval::of(my_array), and to create one from an input stream you can write interval::of<Type>(my_stream). Once you have obtained an interval, you can start using it.

Algorithms

For the purposes of this article, only two elementary algorithms have been rewritten. One could consider rewriting others as well, using the same principle. The key of rewriting algorithms is to have them take a single iterator interval instead of two iterators.

namespace interval
{
  template<typename I, typename It>
  void copy(const I &i, const It &o)
  {
    std::copy(i.begin(), i.end(), o);
  } 

  template<typename I, typename F>
  void for_each(const I &i, const F &f)
  {
    std::for_each(i.begin(), i.end(), f);
  }
}

So, now, it’s possible to write code such as:

interval::copy(interval::of<int>(std::cin), std::back_inserter(my_vector));

Filtering

The real benefit of using intervals is that now intervals can be transformed. An example transformation is to eliminate the elements in an interval which do not satisfy a certain predicate. This involves defining a new iterator type which wraps around the old one, but which will advance internally so that it always references a value which satisfies the predicate. The obvious limitation is that all iterators become forward iterators (the random access property, if present, cannot be kept).

namespace interval
{
  namespace detail
  {
    template<typename It, typename T>
    class filtered_interval_iterator
    {
      It internal;
      const T *owner; 

      void advance()
      {
        while (this -> internal != this -> owner -> end_iterator
               && ! this -> owner -> filter(* this -> internal))
          ++ this -> internal;
      } 

    public: 

      typedef typename std::iterator_traits<It>::value_type value_type;
      typedef typename std::iterator_traits<It>::reference reference;
      typedef typename std::iterator_traits<It>::pointer pointer;
      typedef typename std::iterator_traits<It>::difference_type difference_type;
      typedef std::input_iterator_tag iterator_category; 

      filtered_interval_iterator(It internal, const T &owner)
        : internal(internal), owner(&owner)
      { this -> advance(); } 

      filtered_interval_iterator &operator++()
      {
        ++ this -> internal;
        this -> advance();
        return *this;
      } 

      filtered_interval_iterator operator++(int)
      {
        filtered_interval_iterator temp(*this);
        ++ *this;
        return temp;
      } 

      reference operator*() const { return *this->internal; } 

      bool operator == (const filtered_interval_iterator &o)
      {
        return this -> internal == o.internal;
      } 

      bool operator != (const filtered_interval_iterator &o)
      {
        return this -> internal != o.internal;
      }
    }; 

    template<typename It, typename F>
    class filtered_interval_type
    {
      F filter;
      It begin_iterator, end_iterator; 

      friend class filtered_interval_iterator<It,filtered_interval_type>; 

    public: 

      filtered_interval_type(It begin, It end, F filter)
        : filter(filter), begin_iterator(begin), end_iterator(end) {} 

      typedef filtered_interval_iterator<It,filtered_interval_type> iterator; 

      iterator begin() const
      {
        return iterator(this -> begin_iterator, *this);
      } 

      iterator end() const
      {
        return iterator(this -> end_iterator, *this);
      }
    };
  } 

  template<typename I, typename F>
  detail::filtered_interval_type<typename I::iterator,F> filter(const I &interval, F flt)
  {
    return detail::filtered_interval_type<typename I::iterator,F>
      (interval.begin(), interval.end(), flt);
  }
}

Filtering takes an interval and a predicate, and filters the interval with the predicate, resulting in a new interval. So, to print to standard output all even numbers read from standard input:

bool even(int x) { return x%2==0; } 

interval::copy( interval::filter( interval::of<int>(std::cin), even ),
                std::ostream_iterator<int>(std::cout, "\n"));

Sub-intervals

Another functionality that can be easily implemented is the ability to select only a subset of an interval. The implementation increments the begin-iterator until it reaches the start distance, then forces any iterator which has moved the total length to equal the past-the-end iterator to mark the end of the interval.

namespace interval
{
  namespace detail
  {
    template<typename It>
    class sub_interval_iterator
    {
      It internal;
      typename std::iterator_traits<It>::difference_type left; 

    public: 

      typedef typename std::iterator_traits<It>::value_type value_type;
      typedef typename std::iterator_traits<It>::reference reference;
      typedef typename std::iterator_traits<It>::pointer pointer;
      typedef typename std::iterator_traits<It>::difference_type difference_type;
      typedef std::input_iterator_tag iterator_category; 

      sub_interval_iterator(It internal, difference_type left)
        : internal(internal), left(left)
      { } 

      sub_interval_iterator &operator++()
      {
        -- this -> left;
        ++ this -> internal;
        return *this;
      } 

      sub_interval_iterator operator++(int)
      {
        sub_interval_iterator temp(*this);
        ++ *this;
        return temp;
      } 

      reference operator*() const { return *this->internal; } 

      bool operator == (const sub_interval_iterator &o)
      {
        return this -> left == 0 && o.left == 0
          || this -> internal == o.internal;
      } 

      bool operator != (const sub_interval_iterator &o)
      {
        return this -> internal != o.internal
          && this -> left != o.left;;
      }
    }; 

    template<typename It>
    class sub_interval_type
    {
    public:
      typedef typename std::iterator_traits<It>::difference_type distance;
    private:
      distance length;
      It begin_iterator, end_iterator;
    public:
      sub_interval_type(It begin, It end, distance start, distance length)
        : length(length), begin_iterator(begin), end_iterator(end)
      {
        while (start > 0 && begin != end) { --start; ++begin; }
        begin_iterator = begin;
      } 

      typedef sub_interval_iterator<It> iterator; 

      iterator begin() const
      {
        return iterator(this -> begin_iterator, this -> length);
      } 

      iterator end() const
      {
        return iterator(this -> end_iterator, 0);
      }
    };
  } 

  template<typename I>
  detail::sub_interval_type<typename I::iterator>
  sub(const I& i, typename I::iterator::difference_type start, typename I::iterator::difference_type length)
  {
    return detail::sub_interval_type<typename I::iterator>(i.begin(), i.end(), start, length);
  }
}

With this function, one can specify the start and length of the sub-interval, just like with a typical sub-string function.

Assignment

How to assign an interval to a container? Containers have constructors and assignment functions that work on two iterators, not a single interval. However, it’s possible to work around this, in part, by using a cast function:

namespace interval
{
  namespace detail
  {
    template<typename I, typename C>
    struct assign_type
    {
      I i;
      assign_type(const I &i) : i(i) {}
      operator C() const { return C(i.begin(), i.end()); }
    };
  } 

  template<typename C, typename I>
  detail::assign_type<I,C> assign(const I &i)
  {
    return detail::assign_type<I,C>(i);
  }
}

The return value of the assignment function above can now be assigned to the appropriate container.

The final example reads even integers from standard input, stores them in a vector, then outputs all the contents of the vector between index 2 and index 12:

bool even(int i) { return i % 2 == 0; } 

int main()
{
  std::vector<int> values = interval::assign< std::vector<int> >
   ( interval::filter ( interval::of<int>(std::cin), even )); 

  interval::copy
   ( interval::sub( interval::of(values), 2, 10 ),
     std::ostream_iterator<int>(std::cout, "\n") );
}

Possible extensions

It is currently not possible to retrieve the original iterator, but this could be done easily (just extend the interval type to handle this) and would be useful (for instance when using std::find across an interval).  Also, iterator types are not conserved: filtering conserves bidirectional iterators and sub-interval extraction conserves random access iterators (plus, on random access iterators, sub-interval extraction is much easier than the above version, which has to handle forward iterators).

0 Responses to “Iterator Intervals”


  1. No Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>



1150 feed subscribers
(readers who polled a feed this week)