Nerd Notes

Python

Python: co-routines in multi-threading/processing

asyncio provides the loop.run_in_executor method, which lets a coroutine run blocking functions in a separate executor, either a thread pool or a process pool (both from the concurrent.futures module):

import asyncio
import concurrent.futures

def blocking_task(i):
    print(i)  # or some other useful (blocking) task...

async def run_blocking_tasks(executor):
    loop = asyncio.get_running_loop()
    # Each call is scheduled on the executor and returns an awaitable Future
    blocking_tasks = [
        loop.run_in_executor(executor, blocking_task, i)
        for i in range(6)
    ]

    completed, pending = await asyncio.wait(blocking_tasks)
    results = [t.result() for t in completed]
    return results

# A thread pool is assumed here; a process pool can be passed in the same way
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    asyncio.run(run_blocking_tasks(executor))

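Important: the trailing arguments to run_in_executor are *args, i.e. the positional arguments to pass to the blocking task. Calling the function yourself (loop.run_in_executor(executor, blocking_task(i))) will not work as expected: blocking_task(i) runs immediately in the event loop's thread, and its return value, rather than the function, is handed to the executor.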
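Since run_in_executor only forwards positional arguments, keyword arguments have to be bound beforehand. Here is a minimal sketch using functools.partial; the prefix parameter and the main coroutine are made up for this illustration and are not part of the snippet above:

```python
import asyncio
import concurrent.futures
import functools


def blocking_task(i, prefix="task"):
    # Hypothetical blocking function; `prefix` exists only to demo keyword args
    return f"{prefix}-{i}"


async def main():
    loop = asyncio.get_running_loop()
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        # Positional arguments are passed after the function itself
        first = await loop.run_in_executor(executor, blocking_task, 1)
        # Keyword arguments are not forwarded, so bind them with partial
        second = await loop.run_in_executor(
            executor, functools.partial(blocking_task, 2, prefix="job")
        )
    return [first, second]


results = asyncio.run(main())
print(results)  # ['task-1', 'job-2']
```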


Python: custom sorting

Python 3 lets us sort any iterable with the built-in sorted function, which accepts an optional "key function". The key function takes a single parameter, an item from the iterable, and returns a value that is compared in place of the item itself. That value is often an integer, but it can be anything comparable (a string, a tuple, ...). Items are ordered by ascending key, so an item whose key is 1 appears before an item whose key is 2.

Let's define an iterable of dicts:

listOfDicts = [
    {
        'level': 'high',
        'id': 3
    },
    {
        'level': 'low',
        'id': 2
    },
    {
        'level': 'low',
        'id': 1
    },
    {
        'level': 'medium',
        'id': 4
    }
]

We'd like dicts whose level key is high to appear first in the list, followed by medium, and then low. Let's define a suitable key function to do this:

def custom_sort_function(obj):
    if obj['level'] == 'high':
        return 1
    if obj['level'] == 'medium':
        return 2
    if obj['level'] == 'low':
        return 3
    return 4

Finally, we apply the function to our iterable using sorted. Note that wrapping custom_sort_function in a lambda is not necessary; you can pass the function itself as the key:

sorted(listOfDicts, key=lambda x : custom_sort_function(x))
# - OR -
sorted(listOfDicts, key=custom_sort_function)

Which returns the sorted list as expected:

[{'level': 'high', 'id': 3}, {'level': 'medium', 'id': 4}, {'level': 'low', 'id': 2}, {'level': 'low', 'id': 1}]
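The two low entries keep their original relative order (id 2 before id 1) because sorted is stable. As an alternative to the chain of if statements, here is a rough sketch that looks the priority up in a dict and optionally breaks ties with a tuple key. The names LEVEL_PRIORITY and records are made up for this example:

```python
# Hypothetical priority table: a lower number sorts first
LEVEL_PRIORITY = {"high": 1, "medium": 2, "low": 3}

records = [
    {"level": "high", "id": 3},
    {"level": "low", "id": 2},
    {"level": "low", "id": 1},
    {"level": "medium", "id": 4},
]

# Unknown levels fall back to 4, like the final `return 4` in custom_sort_function
by_level = sorted(records, key=lambda r: LEVEL_PRIORITY.get(r["level"], 4))

# A tuple key breaks ties: first by level priority, then by id
by_level_then_id = sorted(
    records, key=lambda r: (LEVEL_PRIORITY.get(r["level"], 4), r["id"])
)

print([r["id"] for r in by_level])          # [3, 4, 2, 1]
print([r["id"] for r in by_level_then_id])  # [3, 4, 1, 2]
```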


Python dictionary (list) comprehensions

It's possible to generate a dictionary using comprehensions:

t = ["1", "234", "45"]
a = { keyName: len(keyName) for keyName in t }
print(a)
# prints: {'1': 1, '234': 3, '45': 2}
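The same syntax works for lists, and a comprehension can also carry a condition. A small sketch along the same lines (the variable names are only illustrative):

```python
words = ["1", "234", "45"]

# List comprehension: one length per word
lengths = [len(w) for w in words]

# Dict comprehension with a condition: keep only words longer than one character
long_words = {w: len(w) for w in words if len(w) > 1}

# Inverting a mapping (safe here because the lengths are unique)
by_length = {length: w for w, length in long_words.items()}

print(lengths)     # [1, 3, 2]
print(long_words)  # {'234': 3, '45': 2}
print(by_length)   # {3: '234', 2: '45'}
```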


Python: Using "map" to update a list of dictionaries

old_list = [{
   'fieldToUpdate': 'oldValue'
}]

new_list = list( map( lambda entry: entry.update({'fieldToUpdate': 'newValue'}) or entry, old_list) )

# new_list is now: [{'fieldToUpdate': 'newValue'}]

The snippet above relies on the fact that the "update" method of a standard Python dict returns "None", so the "or" evaluates to the newly updated entry. Note that "update" modifies each dict in place, so the entries in old_list are changed as well.
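If the original dicts should stay untouched, one possible alternative is to build new dicts in a list comprehension. This is a sketch rather than a drop-in replacement, and the extra "other" key is only there to show that the remaining fields are carried over:

```python
old_list = [{"fieldToUpdate": "oldValue", "other": 1}]

# {**entry, ...} builds a new dict, so the dicts in old_list are not modified
new_list = [{**entry, "fieldToUpdate": "newValue"} for entry in old_list]

print(old_list)  # [{'fieldToUpdate': 'oldValue', 'other': 1}]
print(new_list)  # [{'fieldToUpdate': 'newValue', 'other': 1}]
```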


Saving data structures using slots

Python devs usually store data structures in dictionaries. A more efficient way is to use "slotted objects". A slotted object is an instance of a class that defines the __slots__ attribute. To save space, instances of such a class do not get the usual per-instance __dict__ attribute.

Example of a slotted object:

class DemoObject(object):   # __slots__ removes the per-instance __dict__ (subclassing "object" is only required in Python 2)
    __slots__ = ('a', 'b', 'c')

demo1 = DemoObject()
demo1.a = 1
demo1.b = 2

Slotted objects have a couple of advantages over dicts:

Faster attribute access (approx 15% speedup)
Savings on storage space especially when multiple objects are created (over 50% savings)

Caveats:

Use of multiple inheritance can be tricky
Pickling needs protocol 2 or later (the default protocol in Python 3 already satisfies this)
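To make the trade-off concrete, here is a small sketch for Python 3 comparing a slotted class with a regular one. The class names SlottedPoint and PlainPoint are invented for this example. It only shows the missing __dict__ and the fixed attribute set, not the speed or memory figures quoted above:

```python
class SlottedPoint:
    __slots__ = ("x", "y")  # the only attributes instances may have

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PlainPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


slotted = SlottedPoint(1, 2)
plain = PlainPoint(1, 2)

plain_has_dict = hasattr(plain, "__dict__")      # True
slotted_has_dict = hasattr(slotted, "__dict__")  # False

# A regular object accepts new attributes; a slotted one rejects them
plain.z = 3
try:
    slotted.z = 3
    rejected = False
except AttributeError:
    rejected = True

print(plain_has_dict, slotted_has_dict, rejected)  # True False True
```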


Generators vs Iterators

TL;DR: All generators are iterators, but not all iterators are generators

A Python generator is a special case of an iterator. Any function that contains the yield statement is a generator function, and calling it returns a generator object:

def squares(start, stop):
    for i in range(start, stop):
        yield i * i

generator = squares(a, b)

An iterator is the more general concept: any object that implements the iterator protocol. Writing one as a class gives you more flexibility (for example exposing extra state or adding a method to restart it), but it is more verbose:

class Squares(object):
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):  # named "next" in Python 2
        if self.start >= self.stop:
            raise StopIteration
        current = self.start * self.start
        self.start += 1
        return current  # must be "return": a "yield" here would turn __next__ into a generator function

iterator = Squares(a, b)
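The TL;DR can be checked with the abstract base classes in collections.abc. The sketch below assumes Python 3 and repeats both definitions so it runs on its own; the bounds 1 and 4 are arbitrary:

```python
from collections.abc import Generator, Iterator


def squares(start, stop):
    for i in range(start, stop):
        yield i * i


class Squares:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):
        if self.start >= self.stop:
            raise StopIteration
        current = self.start * self.start
        self.start += 1
        return current


gen = squares(1, 4)
it = Squares(1, 4)

gen_is_iterator = isinstance(gen, Iterator)   # True: every generator is an iterator
it_is_iterator = isinstance(it, Iterator)     # True: it has __iter__ and __next__
it_is_generator = isinstance(it, Generator)   # False: no send/throw/close methods

gen_values = list(gen)  # [1, 4, 9]
it_values = list(it)    # [1, 4, 9]
print(gen_is_iterator, it_is_iterator, it_is_generator, gen_values, it_values)
```

Both objects produce the same values; the class version simply spells out what the generator function does implicitly.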
