## Projecting Data using map

Let us go through the details of using `map` to project data.
* We can apply `map` to an iterable to get a new iterable in which every element has been transformed by the logic we supply.
* It takes the transformation logic and the iterable as arguments. The transformation logic can be passed either as a regular function or as a lambda function.
* `map` returns a special lazy iterable of type `map`. To preview the data, we have to convert it to a regular collection such as `list`, or iterate over it with a `for` loop and print the elements.
* Objects such as `filter` and `map` are exhausted once we read from them; reading the same object a second time yields no data.
* The number of elements in the `map` object is the same as the number of elements in the iterable that is passed to it.
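
The points above can be sketched on a small list. This is only an illustration: the sample values and the names `statuses` and `to_upper` are assumptions and are not part of the course data sets.

```python
# Hypothetical sample data, used only for illustration.
statuses = ['closed', 'pending_payment', 'complete']


def to_upper(status):
    """Transformation logic written as a regular function."""
    return status.upper()


# Transformation logic passed as a regular function.
upper_from_function = map(to_upper, statuses)

# Equivalent transformation logic passed as a lambda function.
upper_from_lambda = map(lambda status: status.upper(), statuses)

# A map object is lazy; convert it to a list to preview the data ...
upper_list = list(upper_from_function)
print(upper_list)

# ... or iterate over it with a for loop.
for status in upper_from_lambda:
    print(status)

# The number of output elements matches the number of input elements.
print(len(upper_list) == len(statuses))
```

Whether to use a regular function or a lambda is mostly a readability choice; a named function tends to be easier to reuse and test once the logic grows beyond a single expression.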

In [1]:
%run 02_preparing_data_sets.ipynb

In [2]:
orders[:10]

['1,2013-07-25 00:00:00.0,11599,CLOSED',
 '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
 '3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE',
 '8,2013-07-25 00:00:00.0,2911,PROCESSING',
 '9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT',
 '10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT']

In [3]:
len(orders)

68883

In [4]:
order_items[:10]

['1,1,957,1,299.98,299.98',
 '2,2,1073,1,199.99,199.99',
 '3,2,502,5,250.0,50.0',
 '4,2,403,1,129.99,129.99',
 '5,4,897,2,49.98,24.99',
 '6,4,365,5,299.95,59.99',
 '7,4,502,3,150.0,50.0',
 '8,4,1014,4,199.92,49.98',
 '9,5,957,1,299.98,299.98',
 '10,5,365,5,299.95,59.99']

In [5]:
len(order_items)

172198

* Get `order_dates` from `orders`.

In [6]:
order = '1,2013-07-25 00:00:00.0,11599,CLOSED'
order.split(',')[1]

'2013-07-25 00:00:00.0'

In [7]:
order_dates = map(
    lambda order: order.split(',')[1],
    orders
)

In [8]:
type(order_dates)

map

In [9]:
list(order_dates)[:10]

['2013-07-25 00:00:00.0',
 '2013-07-25 00:00:00.0',
 '2013-07-25 00:00:00.0',
 '2013-07-25 00:00:00.0',
 '2013-07-25 00:00:00.0',
 '2013-07-25 00:00:00.0',
 '2013-07-25 00:00:00.0',
 '2013-07-25 00:00:00.0',
 '2013-07-25 00:00:00.0',
 '2013-07-25 00:00:00.0']

In [10]:
len(orders)

68883

```{note}
The next cell returns 0 because the data in the map object `order_dates` was already consumed by the previous read.
```

In [11]:
len(list(order_dates))

0
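
When the projected values are needed more than once, one option is to convert the `map` object to a list a single time and reuse that list. The following is a minimal sketch of that idea; `sample_orders`, `sample_dates`, `first_read`, and `second_read` are hypothetical names, and the three records are copied from the preview of `orders` above.

```python
# Three records copied from the preview of orders (hypothetical variable name).
sample_orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-25 00:00:00.0,12111,COMPLETE',
]

sample_dates = map(lambda order: order.split(',')[1], sample_orders)

# The first read consumes the map object and keeps the values in a list.
first_read = list(sample_dates)

# The second read of the same map object finds nothing left to return.
second_read = list(sample_dates)

print(len(first_read))   # 3
print(len(second_read))  # 0

# The list, unlike the map object, can be read as often as needed.
print(first_read[:2])
print(len(first_read))   # still 3
```

The trade-off is memory: a list holds all projected values at once, while the `map` object produces them one at a time.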

```{note}
We create `order_dates` once again by invoking the `map` function so that we can validate the number of elements. The number of elements in `order_dates` is the same as in `orders`.
```

In [12]:
order_dates = map(
    lambda order: order.split(',')[1],
    orders
)

In [13]:
len(list(order_dates))

68883

In [19]:
order_dates = map(
    lambda order: order.split(',')[1],
    orders
)

In [20]:
set(order_dates)

{'2013-07-25 00:00:00.0',
 '2013-07-26 00:00:00.0',
 '2013-07-27 00:00:00.0',
 '2013-07-28 00:00:00.0',
 '2013-07-29 00:00:00.0',
 '2013-07-30 00:00:00.0',
 '2013-07-31 00:00:00.0',
 '2013-08-01 00:00:00.0',
 '2013-08-02 00:00:00.0',
 '2013-08-03 00:00:00.0',
 '2013-08-04 00:00:00.0',
 '2013-08-05 00:00:00.0',
 '2013-08-06 00:00:00.0',
 '2013-08-07 00:00:00.0',
 '2013-08-08 00:00:00.0',
 '2013-08-09 00:00:00.0',
 '2013-08-10 00:00:00.0',
 '2013-08-11 00:00:00.0',
 '2013-08-12 00:00:00.0',
 '2013-08-13 00:00:00.0',
 '2013-08-14 00:00:00.0',
 '2013-08-15 00:00:00.0',
 '2013-08-16 00:00:00.0',
 '2013-08-17 00:00:00.0',
 '2013-08-18 00:00:00.0',
 '2013-08-19 00:00:00.0',
 '2013-08-20 00:00:00.0',
 '2013-08-21 00:00:00.0',
 '2013-08-22 00:00:00.0',
 '2013-08-23 00:00:00.0',
 '2013-08-24 00:00:00.0',
 '2013-08-25 00:00:00.0',
 '2013-08-26 00:00:00.0',
 '2013-08-27 00:00:00.0',
 '2013-08-28 00:00:00.0',
 '2013-08-29 00:00:00.0',
 '2013-08-30 00:00:00.0',
 '2013-08-31 00:00:00.0',
 '2013-09-01

In [22]:
order_dates = map(
    lambda order: order.split(',')[1],
    orders
)

In [23]:
len(set(order_dates))

364
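
The difference between the total count and the distinct count comes from `set` dropping duplicate values. A small sketch of this, assuming a hypothetical `sample_orders` list in the same format as `orders` (the records are made up for illustration):

```python
# Hypothetical records in the same comma-separated format as orders.
sample_orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-26 00:00:00.0,12111,COMPLETE',
]

# Materialize the projection so that it can be read more than once.
sample_dates = list(map(lambda order: order.split(',')[1], sample_orders))

# set keeps only one copy of each date.
distinct_dates = set(sample_dates)

print(len(sample_dates))       # 3, one date per order
print(len(distinct_dates))     # 2, duplicates removed
print(sorted(distinct_dates))  # sets are unordered, so sort for a stable preview
```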

* Use `orders` to extract both `order_id` and `order_date` from each element in the form of a tuple. Make sure that `order_id` is of type `int`.

In [24]:
orders[:10]

['1,2013-07-25 00:00:00.0,11599,CLOSED',
 '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
 '3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE',
 '8,2013-07-25 00:00:00.0,2911,PROCESSING',
 '9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT',
 '10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT']

In [25]:
[(1, '2013-07-25 00:00:00.0'), (2, '2013-07-25 00:00:00.0')]

[(1, '2013-07-25 00:00:00.0'), (2, '2013-07-25 00:00:00.0')]

In [26]:
order = orders[0]

In [27]:
(int(order.split(',')[0]), order.split(',')[1])

(1, '2013-07-25 00:00:00.0')

In [28]:
order_tuples = map(
    lambda order: (int(order.split(',')[0]), order.split(',')[1]),
    orders
)

In [29]:
list(order_tuples)[:10]

[(1, '2013-07-25 00:00:00.0'),
 (2, '2013-07-25 00:00:00.0'),
 (3, '2013-07-25 00:00:00.0'),
 (4, '2013-07-25 00:00:00.0'),
 (5, '2013-07-25 00:00:00.0'),
 (6, '2013-07-25 00:00:00.0'),
 (7, '2013-07-25 00:00:00.0'),
 (8, '2013-07-25 00:00:00.0'),
 (9, '2013-07-25 00:00:00.0'),
 (10, '2013-07-25 00:00:00.0')]
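
The same projection can also be written with a regular function instead of a lambda, which allows each record to be split only once. This is a sketch under assumptions: `get_order_tuple`, `sample_orders`, and `sample_tuples` are hypothetical names, and the two records are copied from the preview of `orders` above.

```python
def get_order_tuple(order):
    """Return (order_id, order_date) from a comma-separated order record."""
    fields = order.split(',')  # split once and reuse the result
    return int(fields[0]), fields[1]


# Two records copied from the preview of orders (hypothetical variable name).
sample_orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
]

sample_tuples = list(map(get_order_tuple, sample_orders))
print(sample_tuples)
# [(1, '2013-07-25 00:00:00.0'), (2, '2013-07-25 00:00:00.0')]

# order_id is an int, order_date stays a string.
print(type(sample_tuples[0][0]), type(sample_tuples[0][1]))
```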