起因和背景

今天帮朋友做一套Python课程练习,题目要求用一个受限制(Constrained)的List为底层,去实现一部分类似Java中ArrayList的功能,题目对List做了如下限制

  • lst[I] 使用已存在且是正数的Index,来获取或设置一个值
  • len(lst) 获取lst的长度
  • lst.append(None) 一次增加一个列表长度
  • del lst[len(lst)-1] 删除列表中最后一个元素
点击以查看源代码
class ConstrainedList(list):
    """Constrains the list class so it offers only the following primitive array API:

        - `lst[i]` for getting and setting a value at an *existing, positive* index `i`
        - `len(lst)` to obtain the number of slots
        - `lst.append(None)` to grow the list by *one slot at a time*
        - `del lst[len(lst)-1]` to delete the last slot in a list

       All other operations will result in an exception being raised.
    """

    def __init__(self, *args):
        super().__init__(*args)

    def append(self, value):
        if value is not None:
            raise ValueError('Can only append None to constrained list!')
        super().append(value)

    def __getitem__(self, idx):
        if idx < 0 or idx >= len(self):
            raise ValueError('Can only use positive, valid indexes on constrained lists!')
        return super().__getitem__(idx)

    def __setitem__(self, idx, value):
        if idx < 0 or idx >= len(self):
            raise ValueError('Can only use positive, valid indexes on constrained lists!')
        super().__setitem__(idx, value)

    def __delitem__(self, idx):
        if idx != len(self) - 1:
            raise ValueError('Can only delete last item in constrained list!')
        super().__delitem__(idx)

    def __getattribute__(self, name):
        if name in ('insert', 'pop', 'remove', 'min', 'max',
                    'index', 'count', 'clear', 'copy', 'extend'):
            raise AttributeError('Method "' + name + '" not supported on constrained list!')
        else:
            return super().__getattribute__(name)

    # __getattribute__ isn't called for special methods, so the following are needed

    def __add__(self, value):
        raise AttributeError('Constrained lists do not support `+`!')

    def __contains__(self, value):
        raise AttributeError('Constrained lists do not support `in`!')

    def __eq__(self, value):
        raise AttributeError('Constrained lists do not support `==`!')

    def __iter__(self):
        raise AttributeError('Constrained lists do not support iteration!')

    def __str__(self):
        raise AttributeError('Constrained lists do not support stringification!')

    def __repr__(self):
        raise AttributeError('Constrained lists do not support stringification!')

    # for testing only! (don't use this in your ArrayList implementation)

    def _as_list(self):
        return list(super().__iter__())

在此基础上要求实现ArrayList的部分功能:

  • 基于下标的访问
  • 字符串化
  • 单元素操作
  • 断言(True/False查询)
  • 查询
  • 块操作
  • 迭代

下面是课程提供起始代码(Starter Code),有兴趣的朋友可以打开看看

点击以查看源代码
class ArrayList:
    def __init__(self):
        self.data = ConstrainedList() # don't change this line!

    
    ### subscript-based access ###
    
    def _normalize_idx(self, idx):
        nidx = idx
        if nidx < 0:
            nidx += len(self.data)
            if nidx < 0:
                nidx = 0
        return nidx
    
    def __getitem__(self, idx):
        """Implements `x = self[idx]`"""
        assert(isinstance(idx, int))
        nidx = self._normalize_idx(idx)
        if nidx >= len(self.data):
            raise IndexError
        return self.data[nidx]

    def __setitem__(self, idx, value):
        """Implements `self[idx] = x`"""
        assert(isinstance(idx, int))
        nidx = self._normalize_idx(idx)
        if nidx >= len(self.data):
            raise IndexError
        self.data[nidx] = value

    def __delitem__(self, idx):
        """Implements `del self[idx]`"""
        assert(isinstance(idx, int))
        nidx = self._normalize_idx(idx)
        if nidx >= len(self.data):
            raise IndexError
        for i in range(nidx+1, len(self.data)):
            self.data[i-1] = self.data[i]
        del self.data[len(self.data)-1]
    

    ### stringification ###
    
    def __str__(self):
        """Implements `str(self)`. Returns '[]' if the list is empty, else
        returns `str(x)` for all values `x` in this list, separated by commas
        and enclosed by square brackets. E.g., for a list containing values
        1, 2 and 3, returns '[1, 2, 3]'."""
        
        
    def __repr__(self):
        """Supports REPL inspection. (Same behavior as `str`.)"""
            


    ### single-element manipulation ###
    
    def append(self, value):
        """Appends value to the end of this list."""
        
    
    def insert(self, idx, value):
        """Inserts value at position idx, shifting the original elements down the
        list, as needed. Note that inserting a value at len(self) --- equivalent
        to appending the value --- is permitted. Raises IndexError if idx is invalid."""
        
    
    def pop(self, idx=-1):
        """Deletes and returns the element at idx (which is the last element,
        by default)."""
        
    
    def remove(self, value):
        """Removes the first (closest to the front) instance of value from the
        list. Raises a ValueError if value is not found in the list."""
        
    

    ### predicates (T/F queries) ###
    
    def __eq__(self, other):
        """Returns True if this ArrayList contains the same elements (in order) as
        other. If other is not an ArrayList, returns False."""
        

    def __contains__(self, value):
        """Implements `val in self`. Returns true if value is found in this list."""
        


    ### queries ###
    
    def __len__(self):
        """Implements `len(self)`"""
        
    
    def min(self):
        """Returns the minimum value in this list."""
        
    
    def max(self):
        """Returns the maximum value in this list."""
        
    
    def index(self, value, i=0, j=None):
        """Returns the index of the first instance of value encountered in
        this list between index i (inclusive) and j (exclusive). If j is not
        specified, search through the end of the list for value. If value
        is not in the list, raise a ValueError."""
        
    
    def count(self, value):
        """Returns the number of times value appears in this list."""
        

    
    ### bulk operations ###

    def __add__(self, other):
        """Implements `self + other_array_list`. Returns a new ArrayList
        instance that contains the values in this list followed by those 
        of other."""
        
    
    def clear(self):
        self.data = ConstrainedList() # don't change this!
        
    def copy(self):
        """Returns a new ArrayList instance (with a separate data store), that
        contains the same values as this list."""
        

    def extend(self, other):
        """Adds all elements, in order, from other --- an Iterable --- to this list."""
        

            
    ### iteration ###
    
    def __iter__(self):
        """Supports iteration (via `iter(self)`)"""

问题描述

在最后实现迭代时,遇到了一个问题,先看一下出问题那部分的测试用例:

    def test_case7(self):
        tc = TestCase()
        lst = ArrayList()

        data = [random.randrange(1000) for _ in range(100)]
        lst.data = ConstrainedList(data)
        tc.assertEqual(data, [x for x in lst])

        it1 = iter(lst)
        it2 = iter(lst)
        for x in data:
            tc.assertEqual(next(it1), x)
            tc.assertEqual(next(it2), x)

注意这部分代码的最后一段,他为同一个ArrayList lst 创建了两个迭代器,分别是 it1it2。我们要去自行实现这个迭代器,以下是我第一次写的代码:

### iteration ###
    
    def __iter__(self):
        """Supports iteration (via `iter(self)`)"""
        return self

    def __next__(self):
        if self.iter_index >= len(self):
            raise StopIteration
        else:
            self.iter_index += 1
            return self[self.iter_index - 1]

这段代码在只使用一个迭代器,也就是测试用例的前半部分时,没有任何问题,可以正常工作。但在后半部分,由于同时存在 it1it2 ,但两个迭代器使用的 iter_index 这个迭代器索引,由于 iter_index 是一个由 self 修饰的实例变量,所以两个迭代器实际上在操作同一个值;也就是说,当 it1 执行完next之后,iter_index 的值会被+1, it2 执行完next之后, iter_index的值又会被+1,实际上 iter_index 增加了2,这导致的问题显而易见。

解决方案

对于Python来说,容器不应该知道迭代器的状态,迭代器也不应该知道容器的内容,迭代器只应该知道容器对象是谁。如果混合使用容器和迭代器,就如用本例中的 self.iter_index ,不同的迭代器会共享彼此的状态,导致非预期的问题。这和为什么所有可迭代的内置类型都有一个独立的迭代器类,是同一个原因(有些甚至有独立的反向迭代器)。

所以在这里,我们也需要单独实现一个迭代器类来解决此问题:

class ArrayListIterator:
    def __init__(self, parent):
        self.index = 0
        self.parent = parent

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.parent):
            raise StopIteration
        else:
            self.index += 1
            return self.parent[self.index - 1]

### iteration ###

    def __iter__(self):
        """Supports iteration (via `iter(self)`)"""
        return ArrayListIterator(self)

问题完美解决!