Iterators, iterables and special objects


On 7/21/20 9:32 PM, Peter Slížik wrote:
> Hi list, two related questions:
> 
> 1. Why do functions used to iterate over collections or dict members return
> specialized objects like
> 
> type(dict.keys()) -> class 'dict_keys'
> type(dict.values()) -> class 'dict_values'
> type(dict.items()) -> class 'dict_items'
> type(filter(..., ...)) -> class 'filter'
> type(map(..., ...)) -> class 'map'
> type(enumerate(...)) -> class 'enumerate'
> 
> instead of returning some more general 'iterable' and 'view' objects? Are
> those returned objects really that different from one another that it makes
> sense to have individual implementations?

The key-word here is "light-weight"! (or is that two words?)

These view objects do not duplicate the keys/values in the dict (for 
example), and thus have no (real) storage requirements of their own 
(think of a dict holding thousands of values). We could think of them 
as an access channel or 'pipe' - they are a "view" of the data, if you 
follow the database-terminology.
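
To make the "view" idea concrete, here is a minimal sketch. The dict 
`data` and the variable names are made up for illustration, and 
sys.getsizeof() reports CPython's own bookkeeping, so treat the sizes 
as an implementation detail rather than a guarantee:

```python
import sys

# A dict with many entries; the view below does not copy them.
data = {n: n * n for n in range(10_000)}

keys_view = data.keys()   # a dict_keys view onto the dict
keys_copy = list(data)    # a real, separate copy of all the keys

# The view is a small object; the list grows with the number of keys.
view_size = sys.getsizeof(keys_view)
copy_size = sys.getsizeof(keys_copy)

# A view reflects later changes to the dict; the copy does not.
data["new"] = 0
view_sees_new = "new" in keys_view
copy_sees_new = "new" in keys_copy

# A view can be looped over more than once.
first_pass = sum(1 for _ in keys_view)
second_pass = sum(1 for _ in keys_view)
```

The view stays in step with the dict because it holds a reference to 
it, not a snapshot of its contents.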

Another analogy is to "generators". A generator will yield successive 
data items, but will not compute or take storage resources for all of 
the values it will return, instead performing in a 'lazy' fashion or JIT 
(Just-in-time) delivery. The objects returned by map(), filter() and 
enumerate() behave the same way: once such an object is "exhausted", it 
terminates. It cannot be re-used without being re-created. (The dict 
views differ on this point: they can be iterated over as often as you 
like.)
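
A small sketch of that lazy, use-once behaviour (the generator function 
`squares` is a hypothetical example, not anything from the standard 
library):

```python
def squares(limit):
    """Yield squares one at a time; nothing is computed up front."""
    for n in range(limit):
        yield n * n

gen = squares(3)
first = next(gen)    # computes only the first value
rest = list(gen)     # consumes whatever values remain
again = list(gen)    # the generator is exhausted: nothing left

# To start over, a new generator has to be created.
fresh = list(squares(3))
```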

For your reading pleasure: PEP 3106 -- Revamping dict.keys(), .values() 
and .items() https://www.python.org/dev/peps/pep-3106/


> 2. Why do these functions return iterators instead of iterables? First, I
> find it confusing - to me, it is the loop's job to create an iterator from
> the supplied iterable, and not the object that is being iterated over. And
> second, with this design, iterators are required to be iterables too, which
> is confusing as hell (at least for people coming from other languages in
> which the distinction is strict).

This is a great question. (even if I'm motivated to say-so because it 
puzzled me too!) However, may I caution you not to expect that just 
because one language 'thinks' in a certain manner, that 
another/every-other language must do the same, eg in English we say 
"blue ball" but in other spoken-languages it may be expressed in the 
sequence "ball, blue". Which is correct? ...more correct?
(well, 'mine', of course! Cue Frank Sinatra: "I did it my way...")

Some more reading, which also under-pins and expands the above web.ref: 
https://docs.python.org/3/library/stdtypes.html


If 'everything' in Python is an "object", then some objects will contain 
multiple values. One would expect that these multiple values would be 
"iterable" (one could "iterate" over them). However, does the object 
provide a method to perform this iteration-function? If it provides 
__iter__() (which hands back an iterator), then the object is 
"iterable". If it also provides __next__() (which hands back the next 
value), then that object is itself an "iterator"!

However, as @Terry explained, there is *no requirement* that a 
multi-valued object provide such a method.
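
As a rough illustration of that point, consider two made-up classes 
(`Bag` and `IterableBag` are hypothetical names): both hold several 
values, but only the one that defines __iter__() can be iterated over.

```python
class Bag:
    """Holds several values but defines no iteration method."""

    def __init__(self, *items):
        self._items = items


class IterableBag(Bag):
    """Same storage, plus __iter__(), which makes it iterable."""

    def __iter__(self):
        # Delegate to the iterator of the underlying tuple.
        return iter(self._items)


# iter() raises TypeError for an object with no iteration support.
try:
    iter(Bag(1, 2, 3))
    bag_is_iterable = True
except TypeError:
    bag_is_iterable = False

collected = list(IterableBag(1, 2, 3))
```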

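Per above, the "return iterators instead of iterables" decision comes 
back to 'weight'. No 'copy' of the iterable object is made, only 
iterator functionality is provided, eg to a for-loop. (Strictly 
speaking, the dict views are iterables rather than iterators, whereas 
map(), filter() and enumerate() do return iterators.) It is also an 
example of Python's "we're all adults here" laissez-faire and dynamic 
philosophy: there is no rule that iterable <==> iterator! The 
implication runs one way only: every iterator is iterable, but not 
every iterable is an iterator.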
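
One way to check which is which is to ask the abstract base classes in 
collections.abc. A minimal sketch, with an invented dict `d`:

```python
from collections.abc import Iterable, Iterator

d = {"a": 1, "b": 2}
view = d.keys()              # a dict view
mapped = map(str.upper, d)   # a map object

# The view is iterable but is not itself an iterator.
view_is_iterable = isinstance(view, Iterable)
view_is_iterator = isinstance(view, Iterator)

# The map object is an iterator, and therefore iterable too:
# calling iter() on an iterator returns the very same object.
mapped_is_iterator = isinstance(mapped, Iterator)
mapped_is_own_iterator = iter(mapped) is mapped

# Each call to iter() on the view produces a fresh iterator.
view_gives_fresh_iterators = iter(view) is not iter(view)

upper_once = list(mapped)    # consumes the map object
upper_twice = list(mapped)   # nothing left the second time
```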

A final thought for your consideration is that Python considers 
functions (and methods) "first-class objects", which means that they can 
be passed as arguments/parameters, for example. In some ways then, it 
may be helpful to think of an iterator (or generator) method as a 
function being passed to the for-loop's 'control function'.
(others may dislike this picture, so don't tell anyone I said-so...)
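
For what it's worth, here is roughly what a for-loop does with whatever 
it is handed, written out by hand. The function `manual_for` and its 
`body` parameter are invented for this sketch; the real loop is built 
into the interpreter, but it follows the same iter()/next() protocol:

```python
def manual_for(iterable, body):
    """Hand-written stand-in for: for item in iterable: body(item)."""
    iterator = iter(iterable)       # the loop asks for an iterator
    while True:
        try:
            item = next(iterator)   # ...and pulls values one by one
        except StopIteration:
            break                   # exhausted: the loop ends quietly
        body(item)

seen = []
manual_for({"a": 1, "b": 2}.items(), seen.append)
```

Because iter() returns an iterator unchanged, the same loop works 
whether it is given an iterable (a dict view) or an iterator (a map 
object), which is one practical reason for iterators being iterable.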


Further tutorials you may find helpful:
https://towardsdatascience.com/python-basics-iteration-and-looping-6ca63b30835c
https://www.w3schools.com/python/gloss_python_iterator_vs_iterable.asp
https://www.tutorialspoint.com/difference-between-python-iterable-and-iterator
https://www.python-course.eu/python3_iterable_iterator.php
-- 
Regards =dn